Useful subsets of PDB files


  • One or more chains have unobserved residues *in the middle*, not just at the termini

  • With modified residues *other* than MSE (selenomethionine, SeMet)
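Both criteria above can be tested per file. The following is a minimal sketch, not the procedure used to build these subsets. It assumes the fixed-column MODRES layout of the PDB file format (modified residue name in columns 13-15, chain in column 17, sequence number in columns 19-22, standard residue name in columns 25-27), and it assumes that a per-residue observed/unobserved flag has already been derived for each chain. The function names, the `1ABC` identifier, and the example records are hypothetical.

```python
def has_internal_gap(observed):
    """Return True if an unobserved residue lies between two observed ones.

    `observed` holds one boolean per residue of the full chain sequence,
    True where the residue has coordinates (assumed to be precomputed).
    """
    seen = [i for i, flag in enumerate(observed) if flag]
    if not seen:
        return False  # nothing observed, so there is no "middle"
    first, last = seen[0], seen[-1]
    return not all(observed[first:last + 1])


def non_mse_modifications(pdb_lines):
    """Return (chain, seq_num, modified_name, parent_name) tuples for every
    MODRES record whose modified residue is not MSE."""
    found = []
    for line in pdb_lines:
        if not line.startswith("MODRES"):
            continue
        res_name = line[12:15].strip()   # columns 13-15
        chain_id = line[16:17]           # column 17
        seq_num = int(line[18:22])       # columns 19-22
        std_res = line[24:27].strip()    # columns 25-27
        if res_name != "MSE":
            found.append((chain_id, seq_num, res_name, std_res))
    return found


# Hypothetical records, laid out in the assumed fixed columns.
example_records = [
    "MODRES 1ABC MSE A   10  MET  SELENOMETHIONINE",
    "MODRES 1ABC SEP A   25  SER  PHOSPHOSERINE",
]
```

A file would belong to the first subset if `has_internal_gap` is true for at least one of its chains, and to the second if `non_mse_modifications` returns a non-empty list.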

    Other useful files


  • All FASTA sequences in the PDB

    Clustering of chains with 100% sequence identity in the PDB, computed using blastclust with the flag -L 1 (a coverage threshold of 100%). The file http://resources.rcsb.org/sequence/clusters/bc-100.out uses a coverage threshold of 0.9, so sequences clustered together in that file are *not* necessarily 100% identical. See http://pdb.rcsb.org/pdb/statistics/clusterStatistics.do and http://www.ncbi.nlm.nih.gov/Web/Newsltr/Spring04/blastlab.html for further explanation.
    The FASTA sequences used do not take modified residues into account. Instead, each modified residue is represented in the FASTA sequence by its parent (standard) residue, and the modification is not considered a mutation.
  • bc-100.out
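
A cluster file of this kind can be loaded with a few lines of Python. This is a minimal sketch, assuming the usual blastclust output layout of one cluster per line with whitespace-separated chain identifiers. The identifiers shown (such as `1ABC_A`) and the function names are hypothetical.

```python
def read_clusters(lines):
    """Parse cluster lines into a list of lists of chain identifiers.

    Assumes one cluster per line, identifiers separated by whitespace.
    Blank lines are skipped.
    """
    clusters = []
    for line in lines:
        ids = line.split()
        if ids:
            clusters.append(ids)
    return clusters


def cluster_index(clusters):
    """Map each chain identifier to the number of the cluster it belongs to."""
    index = {}
    for number, ids in enumerate(clusters):
        for chain_id in ids:
            index[chain_id] = number
    return index


# Hypothetical content standing in for the lines of a cluster file.
example_lines = [
    "1ABC_A 1ABC_B 2XYZ_A\n",
    "3DEF_A\n",
]
clusters = read_clusters(example_lines)
index = cluster_index(clusters)

# Two chains have identical sequences (under the clustering criteria)
# exactly when they map to the same cluster number.
same_sequence = index["1ABC_A"] == index["2XYZ_A"]
```

To read a real file, pass an open file handle instead of the example list, for instance `read_clusters(open("bc-100.out"))`.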