Databases available for BLAST2 search
Please refer also to "Which BLAST program is appropriate to search against which database?". All databases
listed here are updated on a regular basis (either daily or weekly). You can run a BLAST search against a database to see its update date.
Attention: none of the listed databases is updated anymore, and this service is running as is (Thu May 14 15:47:53 MDT 2009).
Perhaps a new setup should/will be done to keep this service running with the latest software and databases. Any sponsors? Thanks. - yuan@embl.de
Peptide|Protein Sequence Databases
- nrdb95 (daily updated)
- The database "nrdb" is checked and clustered with Liisa Holm's nrdb program using a 95%-identity criterion. Only the representative sequence of each cluster remains in "nrdb95", so near-neighbour redundant sequences are removed. The reduction in sequence number is large: 209190 sequences in "nrdb95" compared with 329796 sequences in nrdb (as of 02.09.1998). This should speed up your homology search. This database is updated on a daily basis.
- swissprot (checked daily against new releases of SWISS-PROT)
- The latest major release of the SWISS-PROT protein sequence database (Rel. 39.0, May 2000)
- pdb (daily updated if any change)
- Sequences derived from the 3-dimensional structures in the Brookhaven Protein Data Bank. This database is updated daily; sequences containing only "X" and all theoretical model sequences are removed from it.
- sp_nrdb (daily updated)
- All non-redundant protein databases: SwissProt+SwissProtNew+SptremblNew+Sptrembl.
More precisely, the non-identical sequences extracted from the above databases. It has fewer sequences than nrdb (for example, on 1.4.1998 it had 72558 fewer than the more exhaustive nrdb, which had 303,844 sequences), mostly because it removes redundancy that NCBI's nrdb program does not identify. Sensitivity should therefore increase.
- nrdb (daily updated)
- All non-identical protein sequences extracted from EMBL CDS translations+PDB+SwissProt+PIR.
Although it contains more sequences than sp_nrdb and is thus more exhaustive, its redundancy is larger than that of sp_nrdb, which might cause a loss of sensitivity.
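The idea behind nrdb95 above (keep one representative per cluster of sequences that are at least 95% identical) can be illustrated with a toy sketch. This is not Liisa Holm's nrdb program or its algorithm: the function names `identity` and `cluster_representatives` are hypothetical, the greedy longest-first strategy is an assumption, and `difflib.SequenceMatcher` is used only as a crude stand-in for a real alignment-based identity measure.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude similarity score in [0, 1]; an assumed stand-in for alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_representatives(seqs: dict[str, str], threshold: float = 0.95) -> dict[str, str]:
    """Greedy redundancy removal (illustrative only).

    Sequences are visited longest first. A sequence is kept as a new
    representative only if its score against every representative kept
    so far is below `threshold`; otherwise it is dropped as redundant.
    """
    reps: dict[str, str] = {}
    for name, seq in sorted(seqs.items(), key=lambda kv: -len(kv[1])):
        if all(identity(seq, rep) < threshold for rep in reps.values()):
            reps[name] = seq
    return reps


if __name__ == "__main__":
    base = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRV"
    example = {
        "seq_a": base,
        "seq_b": base[:-1] + "A",        # differs from seq_a in one residue
        "seq_c": "GGGGSSSSGGGGSSSSGGGG",  # unrelated sequence
    }
    for name in cluster_representatives(example):
        print(name)
```

For scale, the figures quoted for nrdb95 (209190 sequences remaining out of 329796) correspond to a reduction of roughly 37%. A pairwise comparison like the one sketched here is quadratic in the number of sequences, so it is only meant to show the principle on small inputs.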
Nucleotide Sequence Databases
- nrnee (? this database has become "too big" for our search machine, which has only 512 MB; search times are very long. Any sponsor interested? :-)) Not updated since mid-2000.
- Non-redundant nucleotide databases (consisting of EMBL+GenBank+DDBJ, without ESTs or STSs)
- nrest (? this database has become "too big" for our search machine, which has only 512 MB; search times are very long.) Not updated since mid-2000.
- Non-redundant database of the EMBL+GenBank+DDBJ EST divisions