Which of the following is wrong about Genetics Computer Group Sequence Format?
a Earlier versions of the Genetics Computer Group (GCG) programs require a unique sequence format and include programs that convert other sequence formats into GCG format
b Information about the sequence in the GenBank entry is not included but the line information is carried out
c If one or more sequence characters are changed through error, a program reading the sequence will be able to determine that the change has occurred because the checksum value in the sequence entry will no longer be correct
d Lines of information are terminated by two periods, which mark the end of information and the start of the sequence on the next line
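The two-period terminator and the checksum described in the options above can be sketched in Python. This is a minimal illustration, not the GCG implementation: the function names `split_gcg` and `position_checksum` are hypothetical, and the position-weighted sum is an assumed weighting scheme chosen only to show how a changed character alters the stored check value.

```python
def split_gcg(text):
    """Split a GCG-style entry into header lines and the sequence string."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # The information lines end with the line that is terminated by "..".
        if line.rstrip().endswith(".."):
            header, body = lines[: i + 1], lines[i + 1 :]
            break
    else:
        raise ValueError("no '..' terminator found")
    # Keep letters only; position numbers and spaces are not sequence data.
    seq = "".join(ch for ln in body for ch in ln if ch.isalpha())
    return header, seq


def position_checksum(seq):
    """Position-weighted checksum (assumed scheme, for illustration only)."""
    return sum((i % 57 + 1) * ord(ch.upper()) for i, ch in enumerate(seq)) % 10000


# Made-up entry: everything after the ".." line is treated as sequence.
entry = "EXAMPLE  Length: 8  ..\n\n       1  ACGTACGT\n"
header, seq = split_gcg(entry)
print(seq, position_checksum(seq))
```

A reader that recomputes the checksum and compares it with the value stored in the entry can flag a sequence whose characters were altered.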
In the Stanford University/IntelliGenetics sequence format, a 1 is placed at the end of the sequence if the sequence is linear, and a 2 if the sequence is circular.
Which of the following is wrong about National Biomedical Research Foundation/Protein Information Resource Sequence Format?
a Sequences retrieved from the PIR database are not in this compact format, but in an expanded format with much more information about the sequence
b The NBRF format is similar to the FASTA sequence format but with significant differences
c This format is different from the PIR format
d The first line includes an initial “>” character followed by a two-letter code such as P for complete sequence or F for fragment, followed by a 1 or 2 to indicate type of sequence, then a semicolon, then a four- to six-character unique name for the entry
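The first-line layout described in the last option can be checked with a small Python sketch. The function name `parse_nbrf_header` and the example entry name are hypothetical, and the pattern simply assumes a two-character code, a semicolon, and a name with no spaces.

```python
import re

# ">" + two-character code + ";" + entry name (assumed layout, per the text above)
NBRF_HEADER = re.compile(r"^>(?P<code>[A-Za-z0-9]{2});(?P<name>\S+)$")


def parse_nbrf_header(line):
    """Return (code, name) from an NBRF-style first line."""
    match = NBRF_HEADER.match(line.strip())
    if match is None:
        raise ValueError("not an NBRF-style header: %r" % line)
    return match.group("code"), match.group("name")


print(parse_nbrf_header(">P1;CRAB"))  # made-up entry name
```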
Which of the following is wrong about FASTA Sequence Format?
a The FASTA sequence format includes a comment line identified by a “>” character in the first column followed by the name and origin of the sequence
b The FASTA sequence format includes the sequence in standard one-letter symbols
c This format provides a very convenient way to copy just the sequence part from one window to another because there are no numbers or other nonsequence characters within the sequence
d The presence of ‘*’ is not quite essential for reading the sequence correctly by some sequence analysis programs
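The layout described in these options (a “>” comment line followed by lines of one-letter symbols) can be read with a few lines of Python. This is a minimal sketch: `read_fasta` is a hypothetical name, and treating a trailing ‘*’ as an end marker to be stripped is a choice made here, not a rule of the format.

```python
def read_fasta(text):
    """Return a list of (comment, sequence) pairs from FASTA-style text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new comment line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks).rstrip("*")))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks).rstrip("*")))
    return records


print(read_fasta(">seq1 example\nACGT\nTTGA*\n"))
```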
The format of an entry in the SwissProt protein sequence database is very similar to the EMBL format.
Which of the following is wrong about European Molecular Biology Laboratory Data Library Format?
a EMBL maintains DNA and protein sequence databases
b As with GenBank entries, a large amount of information describing each sequence entry is given
c The sequence entry includes literature references and information about the function of the sequence, but not the locations of mRNAs and coding regions
d Information is organized into fields, each with an identifier, shown as the first text on each line
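Because each field identifier is the first text on its line, an entry can be grouped by identifier with a short Python sketch. The function name `group_fields`, the assumption of two-letter identifiers, and the sample entry are illustrative only.

```python
from collections import defaultdict


def group_fields(text):
    """Group the lines of an entry by their leading two-letter identifier."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if len(line) < 2 or line.startswith("//"):
            continue
        code, value = line[:2], line[2:].strip()
        # Lines that start with blanks (e.g. sequence lines) carry no identifier.
        if code.strip():
            fields[code].append(value)
    return dict(fields)


# Made-up entry for illustration.
entry = "ID   X1; DNA\nDE   example\nDE   more\nSQ   Sequence 4 BP;\n     acgt        4\n//"
print(group_fields(entry))
```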
In the organization of the GenBank database and the search procedure used by ENTREZ, each row is another sequence entry and each column is another GenBank field.
A consecutive set of three-letter words that could be codons specifying the amino acid sequence of a protein. The sequence entry is assumed by computer programs to lie between the identifiers “ORIGIN” and “//”.
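The rule that the sequence lies between “ORIGIN” and “//” translates directly into a small Python sketch. The function name `genbank_sequence` and the sample entry are hypothetical; only letters are kept, so position numbers and spaces are discarded.

```python
def genbank_sequence(text):
    """Return the sequence found between the ORIGIN line and the // line."""
    in_sequence = False
    parts = []
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_sequence = True
            continue
        if line.startswith("//"):
            break
        if in_sequence:
            # Drop position numbers and spaces, keep the bases.
            parts.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(parts)


# Made-up entry for illustration.
entry = "LOCUS       EXAMPLE\nORIGIN\n        1 acgtacgtaa ccggtt\n       17 aa\n//\n"
print(genbank_sequence(entry))
```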
Which of the following is wrong about GenBank DNA Sequence Entry?
a The information is organized into fields, each with an identifier, shown as the first text on each line
b In some entries, these identifiers may be abbreviated to two letters, e.g., RF for reference
c Some identifiers may have additional subfields
d The CDS subfield in the field FEATURES does not offer the amino acid sequence
Which of the given statements is incorrect about Database Types?
a Relational databases are more useful in the development of biological databases
b The tables in a relational database are carefully indexed and cross-referenced with each other, sometimes using additional tables, so that each item in the database has a unique set of identifying features
c The relational database orders data in tables made up of rows giving specific items in the database, and columns giving the features as attributes of those items
d The two principal types of databases are the relational and object-oriented databases
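The row-and-column layout and the cross-referencing between tables described in these options can be illustrated with Python's built-in `sqlite3` module. The table names, column names, and accession values below are invented for the example and do not come from any real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each row is one item; each column is an attribute of that item.
conn.execute("CREATE TABLE entries (accession TEXT PRIMARY KEY, organism TEXT)")
conn.execute(
    "CREATE TABLE features ("
    "accession TEXT REFERENCES entries(accession), "
    "kind TEXT, start_pos INTEGER, end_pos INTEGER)"
)

conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("X00001", "Escherichia coli"), ("X00002", "Homo sapiens")],
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?)",
    [("X00001", "CDS", 10, 400), ("X00002", "mRNA", 1, 900)],
)

# The shared accession column cross-references the two tables.
rows = conn.execute(
    "SELECT e.organism, f.kind FROM entries e "
    "JOIN features f ON e.accession = f.accession "
    "WHERE f.kind = 'CDS'"
).fetchall()
print(rows)
```

The primary key on `accession` is what gives each item a unique identifying feature; the second table reuses it to attach any number of features to an entry.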