PDB-Metrics (Protein Database Metrics) summarizes the protein structures deposited in the Protein Data Bank (PDB). It maintains a large collection of descriptors and offers several ways to retrieve specific PDB files, so that bioinformatics researchers can analyze the PDB's collection of protein structure descriptions from a variety of perspectives and locate specific files using alternative criteria.
PDB-Metrics employs a database system to maintain several measures extracted from the files deposited in the PDB, such as size, type, deposition date, number of chains, number of models (for structures resolved by nuclear magnetic resonance, NMR), resolution (for non-NMR files), number of atoms, and the frequency of each kind of residue in the protein, among other characteristics. The PDB-Metrics database also includes keywords, taken from the headers of the PDB files, to support information search and retrieval. The database is updated periodically (usually once a week). The time of the most recent update appears at the top of the PDB-Metrics main page. This date and time refer to the moment PDB-Metrics started the update process, using a local copy of the PDB that must have been synchronized with the PDB central repository some minutes earlier.
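As an illustration of how measures of this kind can be derived from a PDB file, the following is a minimal Python sketch. It is not the code PDB-Metrics runs: the function name summarize_pdb is hypothetical, and the choices to count atoms and residues in the first model only and to include HETATM records in the atom count are assumptions made for the example. The column positions follow the fixed-column layout of the PDB file format.

```python
from collections import Counter


def summarize_pdb(text):
    """Collect a few simple measures from the text of a PDB-format file.

    Assumption: atoms and residues are counted in the first model only,
    so multi-model (NMR) entries are not counted once per model.
    """
    chains = set()
    residues = {}   # (chain, residue number, insertion code) -> residue name
    n_models = 0
    n_atoms = 0
    for line in text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            n_models += 1
        elif record in ("ATOM", "HETATM") and n_models <= 1:
            n_atoms += 1
            if record == "ATOM":
                chain = line[21:22]                      # column 22: chain ID
                chains.add(chain)
                key = (chain, line[22:26], line[26:27])  # columns 23-27
                residues[key] = line[17:20].strip()      # columns 18-20
    return {
        "models": max(n_models, 1),   # a file without MODEL records holds one model
        "chains": len(chains),
        "atoms": n_atoms,
        "residue_counts": Counter(residues.values()),
    }
```

Calling summarize_pdb on the contents of a .PDB file returns a dictionary whose entries correspond to some of the measures listed above (number of models, chains and atoms, and residue frequencies).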
In the following, we describe the options provided by PDB-Metrics for browsing and retrieving its data.
Search by Name retrieves a specific PDB structure by its name. When you choose this option, PDB-Metrics prompts you for a PDB name (e.g., 1cho). Once you provide the name, the program lists the data files available for that protein (e.g., .PDB, .HSSP, .CON (Contacts File), .AIR (Protein Dossier File), .ANG (Dihedral Angles File)). Click on the link associated with the PDB name in a particular column to see the respective data file. The Active Site column links to the description of the active site(s) of the structure, provided by the European Bioinformatics Institute (EBI). The # of Chains column shows the number of chains of the protein and links to a view of the protein structure in the Chime plugin (make sure that Chime and all the software required by STING are properly installed on your system).
PDB files obtained by NMR lists the PDB structures obtained by nuclear magnetic resonance (NMR), with links to the description files available for each structure (e.g., .PDB, .HSSP, .CON (Contacts File), .AIR (Protein Dossier File), .ANG (Dihedral Angles File)), to its active sites, and to the protein view in Chime. Because the table is large, it can take a while to load completely. Click on Show the list of PDB names to see only the list of PDB names.
PDB files by Type lists the types of PDB structures, taken from the HEADER section of all deposited PDB structures. The types are presented in alphabetical order. For each type, the program shows the number of structures of that type in the PDB collection. Click on the link associated with a particular type to see the protein structures of that type.
PDB files by Deposition Year shows the number of PDB structures deposited each year. Click on the link associated with a particular year to see the structures deposited that year.
PDB files by Size groups the deposited PDB structures by ranges of .PDB file size. Click on the link associated with a size range to see the structures whose PDB file (.PDB) size falls in that range.
PDB files by Number of Models lists the deposited PDB structures according to their number of models. This option takes into account only the structures obtained by nuclear magnetic resonance. Click on the link associated with a particular number of models to see the NMR-resolved structures with that number of models.
PDB files by Number of Chains lists the deposited PDB structures according to their number of chains. Click on the link associated with a particular number of chains to see the respective PDB structures.
You are encouraged to supply as many criteria as you can to restrict your query. If the query is not restrictive enough, the system can take too long to retrieve all the data and send you the results. When PDB-Metrics detects that your query is not restrictive enough and could overload the server, it shows the message "Please, fill in more field(s) to further restrict you search". In this case, provide values for more fields in the form, so that your query further restricts the amount of data to be retrieved.
The fields of the Advanced Search that do not appear in the PDB-Metrics main page are described below.
Chain Type specifies the kind of PDB structures you want to retrieve. Any selects any kind of structure. Protein, nucleic acids or virus selects structures containing only amino acids and hetero-compounds. Nucleic acids selects structures containing nitrogenous bases (DNA or RNA). Complex selects either of the previous two, i.e., structures containing amino acids, hetero-compounds and/or nitrogenous bases. Other selects all kinds of structures not mentioned above.
Keyword accepts any word or expression that you want PDB-Metrics to check against the HEADER, TITLE, COMPOUND, SOURCE and KEYWORDS sections of the PDB files. PDB-Metrics retrieves the structures that contain the word or expression in at least one of those sections.
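The keyword check can be pictured with the following minimal Python sketch. It is an illustration only, not the PDB-Metrics implementation (which queries its own database): the function name matches_keyword is hypothetical, and case-insensitive substring matching is an assumption. In the PDB file format, the COMPOUND and KEYWORDS sections are stored in records named COMPND and KEYWDS, which is why those names appear in the code.

```python
# Record names of the searched sections, as they appear in columns 1-6
# of a PDB-format file.
SEARCHED_RECORDS = ("HEADER", "TITLE", "COMPND", "SOURCE", "KEYWDS")


def matches_keyword(pdb_text, keyword):
    """Return True if the keyword occurs in at least one searched section.

    Assumption: matching is a case-insensitive substring test, with runs
    of whitespace treated as a single space.
    """
    wanted = " ".join(keyword.lower().split())
    if not wanted:
        return False
    sections = {}
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in SEARCHED_RECORDS:
            # Columns 11 onward hold the text; joining continuation lines
            # lets an expression match across a line break.
            sections.setdefault(record, []).append(line[10:].strip())
    for parts in sections.values():
        section_text = " ".join(" ".join(parts).lower().split())
        if wanted in section_text:
            return True
    return False
```

Under these assumptions, an expression such as "serine protease" matches a structure whose TITLE section contains those two words, even when they fall on consecutive continuation lines.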
Group by allows you to group the retrieved protein structures according to up to three characteristics. Choose values for these fields in the combo boxes next to Group by, from left to right. Fields that you leave blank are not taken into account.
Order by allows you to order the results according to a sequence of up to three characteristics. These fields are considered only if you leave every Group by field blank. If you fill in any Group by field, PDB-Metrics uses those fields to define the ordering of the results. In either case, you can choose Ascending or Descending ordering.
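The precedence between Group by and Order by can be summarized with the following minimal Python sketch. It is an illustration under stated assumptions, not the PDB-Metrics code: the function names sort_fields and order_results and the field names used in the example are hypothetical, and results are represented as plain dictionaries.

```python
def sort_fields(group_by, order_by):
    """Return the fields that define the ordering of the results.

    Blank fields are ignored. If any Group by field is filled in, the
    Group by fields take precedence over the Order by fields.
    """
    chosen_groups = [field for field in group_by if field]
    if chosen_groups:
        return chosen_groups
    return [field for field in order_by if field]


def order_results(rows, group_by=(), order_by=(), descending=False):
    """Sort rows (dictionaries) by the effective fields, left to right."""
    fields = sort_fields(group_by, order_by)
    return sorted(
        rows,
        key=lambda row: tuple(row[field] for field in fields),
        reverse=descending,
    )


# Hypothetical result rows, for illustration only.
rows = [
    {"name": "a", "year": 1990, "chains": 2},
    {"name": "b", "year": 1988, "chains": 1},
    {"name": "c", "year": 1990, "chains": 1},
]
# Group by is filled in, so the Order by choice ("year") is not used.
grouped = order_results(rows, group_by=("", "chains", ""), order_by=("year",))
```

In this example the rows are ordered by number of chains, because a Group by field was supplied. With all Group by fields left blank, the same call would order the rows by year.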