Sequence analysis
This section of MyBio covers various types of sequence analysis. It is divided into subcategories by the type of sequence or functional unit you are looking for, and it also includes sequence viewer tools.
Sequence analysis - Quick Links
Sequence analysis - Web Resources
CpG Island Scanners (commercial)
UlrichScience DataGenerator software is commonly used in three steps:
1. Design your in silico experiment. This produces a settings file that describes the new experiment and lets you run it.
2. Modify the initial experiment settings file: load a settings file, change some of its parameters, and run the experiment.
3. Analyze the generated data, either by inspecting the generated HTML and TXT files or by using the provided Viewers.
Products such as the DScanPlus software are suited to designing your own CpG island scan. All products can be modified to meet your needs.
GenomicLandscapeViewer (fairware)
The GLV (Genomic Landscape Viewer) tool lets you view various flat files. File content is presented graphically and can be scaled over a wide range, down to the sequence level. Files containing 400 000 bp and more have been tested.
Java Word Frequencies (JFreq) is a front end to Schbath's R'MES, which is available at the R'MES website. It is used to find the expected and actual frequencies of short nucleotide sequence strings, using a Markov model to account for the effects of base composition and the frequencies of shorter strings. Unusually frequent or infrequent occurrences of certain strings may indicate biological relevance. R'MES can also determine whether strings occur in overlapping positions unusually often or unusually rarely.
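The idea of comparing observed and expected word counts can be sketched as follows. This is a minimal illustration, not the R'MES implementation: it assumes the simple estimator in which the expected count of a word of length k is derived from the counts of its two (k-1)-letter sub-words and its (k-2)-letter core, which corresponds to a Markov model of order k-2. The function names are hypothetical.

```python
from collections import Counter


def count_words(seq, k):
    """Count all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count(seq, word):
    """Expected count of `word` under a Markov model of order len(word) - 2.

    Assumed estimator: N(prefix) * N(suffix) / N(core), where prefix and
    suffix are the two (k-1)-letter sub-words and core is what they share.
    """
    k = len(word)
    if k < 3:
        raise ValueError("word must have at least 3 letters")
    shorter = count_words(seq, k - 1)
    core = count_words(seq, k - 2)[word[1:-1]]
    if core == 0:
        return 0.0
    return shorter[word[:-1]] * shorter[word[1:]] / core


seq = "ACGTACGTACGA"
observed = count_words(seq, 3)["ACG"]
expected = expected_count(seq, "ACG")
print(observed, expected)
```

A word whose observed count is far from its expected count is a candidate for closer inspection; a real analysis would also need a significance statistic, which this sketch omits.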
Genome-based fingerprint scanning Java User Interface (GFS Java user interface) is a Java interface for GFS, which is available at the GFS website. GFS maps peptide mass fingerprint (PMF) data directly to raw genomic sequences, which enables rapid, low-cost identification of potential proteins in unannotated genome sequences. To scan a genome sequence of interest, an experimentally obtained PMF is entered into the program; the program then generates a theoretical mass list by translating the genome of interest in 6 reading frames. This version of GFS should only be used with viral genomes (or similarly sized DNA sequences).
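The six-frame translation step mentioned above can be sketched as follows. This is an illustration only and is not taken from GFS: it assumes the standard genetic code, marks stop codons as "*" and unrecognised codons as "X", and stops before the digestion and mass calculation that a PMF search would need. All names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; trailing bases are ignored."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to their protein translations."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```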
Viral Genome Organizer (VGO) is a Java-based interface for viewing and searching viral genome sequences. It displays information relevant to a genome of interest, including its genes, ORFs, and start/stop codons, and can be used to perform a regular expression search, a fuzzy motif search, and a masslist search. VGO can also be used to identify related genes across multiple sequences.
GraphDNA allows the user to generate graphical representations of raw DNA sequences. Currently there are 8 graphing options, and it is possible to plot any of the individual genes or genomes in any format.
Hydrophobicity Grapher graphs the hydrophobicity/hydrophilicity of a sequence of amino acids using a sliding window. The window size can be specified and several hydrophobicity scales can be used to determine the plot.
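A sliding-window hydrophobicity profile can be sketched as follows. This is a minimal illustration rather than the tool's own code: it assumes the Kyte-Doolittle scale (one of several scales in common use) and averages the per-residue values over each window. The function name and default window size are hypothetical.

```python
# Kyte-Doolittle hydropathy values; positive means more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9, scale=KYTE_DOOLITTLE):
    """Return the mean hydropathy of every full window along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [scale[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


profile = hydropathy_profile("MKTLLILAVVAAALA", window=5)
print([round(value, 2) for value in profile])
```

Passing a different dictionary as `scale` swaps in another hydrophobicity scale without changing the windowing logic.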

%GC Calculates the fractional GC content of nucleic acid sequences.
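The calculation is simple enough to sketch directly. This example assumes that ambiguous symbols such as N are excluded from the denominator; the tool itself may treat them differently. The function name is hypothetical.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T, U) that are G or C."""
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGTU")
    if unambiguous == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / unambiguous


print(gc_fraction("ATGCGC"))
```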
CpG Islands Searcher
Identifies CpG islands in a given nucleotide sequence. Clusters of CpG dinucleotides in GC-rich regions of the genome, called CpG islands, frequently occur at the 5-prime ends of genes. Methylation of CpG islands plays a role in transcriptional silencing in higher organisms in certain situations. The CpG-island-extraction algorithm has been deployed as a web service with a simple user interface that identifies CpG islands in submitted sequences of up to 50 kb and displays the results as a graphical map of CpG dinucleotide distribution and CpG island borders. A command-line version of the CpG Islands Searcher has also been developed for larger sequences.
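The kind of test such a tool applies to a candidate region can be sketched as follows. The thresholds here are an assumption: they are the commonly cited Gardiner-Garden and Frommer criteria (length of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6), and this page does not state which thresholds the CpG Islands Searcher itself uses. The function names are hypothetical.

```python
def cpg_stats(region):
    """Return (GC fraction, observed/expected CpG ratio) for a region."""
    region = region.upper()
    n = len(region)
    c = region.count("C")
    g = region.count("G")
    cpg = region.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(region, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC, and observed/expected thresholds."""
    gc_content, obs_exp = cpg_stats(region)
    return (
        len(region) >= min_len
        and gc_content >= min_gc
        and obs_exp >= min_obs_exp
    )


print(looks_like_cpg_island("CG" * 100))
print(looks_like_cpg_island("AT" * 100))
```

A full scanner would slide this test along the sequence and merge adjacent passing windows; the sketch covers only the per-region check.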
Grail (ORNL) Prediction of protein coding regions, exons, Pol II promoters, CpG islands, polyA sites and repetitive DNA elements.
NetStart 1.0 Prediction of translation start sites by neural network. The NetStart server produces neural network predictions of translation start in vertebrate and Arabidopsis thaliana nucleotide sequences. NetStart has been trained on cDNA-like sequences and will therefore presumably perform better on cDNAs and ESTs. Its performance has not been tested on genomic data, which may contain introns adjacent to the start codon.
FreqSQ Calculates the frequencies (number and percentage) of bases, dinucleotides, codons or amino acids in a nucleic acid sequence.
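A frequency table of this kind can be sketched with a single counting function. This is an illustration, not FreqSQ's code: it assumes that bases and dinucleotides are counted at every position, while codons are counted in frame from the first base. The function name and its parameters are hypothetical.

```python
from collections import Counter


def word_frequencies(seq, k, step=1):
    """Map each k-letter word to (count, percentage of all words counted)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))
    total = sum(counts.values())
    return {word: (n, 100.0 * n / total) for word, n in counts.items()}


seq = "ATGATGTAA"
print(word_frequencies(seq, 1))          # single bases
print(word_frequencies(seq, 2))          # overlapping dinucleotides
print(word_frequencies(seq, 3, step=3))  # in-frame codons
```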
Isochore Plots GC content over large sequences.
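The data behind such a plot can be sketched as a windowed GC profile. This is a minimal illustration: the window size, the step, and the function name are assumptions, not the tool's own settings.

```python
def gc_profile(seq, window=1000, step=None):
    """Return (start, GC fraction) for each full window along the sequence.

    With step=None the windows do not overlap.
    """
    if window < 1:
        raise ValueError("window must be positive")
    step = step or window
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        profile.append((start, gc))
    return profile


for start, gc in gc_profile("GGGGCCCCAAAATTTTGCGC", window=4):
    print(start, gc)
```

Plotting the second value against the first gives the GC landscape; larger windows smooth out local variation and make long GC-rich or GC-poor stretches easier to see.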
VecScreen Screens for vector-derived sequence contamination in nucleic acid sequences.
SeqVISTA A Graphical Tool for Sequence Feature Visualization and Comparison. SeqVISTA presents a holistic, graphical view of features annotated on nucleotide or protein sequences. This interactive tool highlights the residues in the sequence that correspond to features chosen by the user, and allows easy searching for sequence motifs or extraction of particular subsequences. SeqVISTA is able to display results from diverse sequence analysis tools in an integrated fashion, and aims to provide much-needed unity to the bioinformatics resources scattered around the Internet. Our viewer may be launched on a GenBank record by a single click of a button installed in the web browser. SeqVISTA allows insights to be gained by viewing the totality of sequence annotations and predictions, which may be more revealing than the sum of their parts. SeqVISTA should run on any operating system with a Java 1.4 virtual machine, and it is freely available to academic users.
QGRS Mapper G-quadruplex analysis tool. QGRS Mapper is a web-based program that reports the composition and distribution of putative Quadruplex forming G-Rich Sequences (QGRS) in nucleotide sequences and NCBI genes.
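A search for putative quadruplex-forming sequences can be sketched with a regular expression. This is not QGRS Mapper's algorithm: it assumes the widely used motif of four G-runs separated by three loops, with illustrative defaults of at least 3 guanines per run and loops of 1 to 7 bases. QGRS Mapper's own parameters and scoring may differ, and the function name is hypothetical.

```python
import re


def find_putative_quadruplexes(seq, min_g=3, max_loop=7):
    """Return (start, end, motif) for non-overlapping G4-like matches."""
    g_run = "G{%d,}" % min_g
    loop = "[ACGTN]{1,%d}" % max_loop
    pattern = re.compile(g_run + (loop + g_run) * 3)
    return [
        (m.start(), m.end(), m.group())
        for m in pattern.finditer(seq.upper())
    ]


# Human telomeric repeat flanked by two bases on each side.
hits = find_putative_quadruplexes("AAGGGTTAGGGTTAGGGTTAGGGAA")
print(hits)
```

Only the forward strand is searched here; scanning the reverse complement as well would cover C-rich regions whose opposite strand can form a quadruplex.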
