Protein Data Bank: 100,000 structures


The Worldwide Protein Data Bank (wwPDB) today announces that it has released to the community its 100,000th macromolecular structure. Established in 1971, this central, public archive of experimentally determined protein and nucleic acid structures has reached this important milestone thanks to the efforts of structural biologists throughout the world.

Function follows form

In the 1950s, scientists had their first real look at the structures of proteins and DNA at the atomic level. The determination of these early three-dimensional structures by X-ray crystallography inspired a new era in biology. The value of archiving and sharing these data was indisputable, and in 1971 the Protein Data Bank (PDB) was established as an international collaboration with sites in the US and the UK.

Beginning with just seven entries (carboxypeptidase, chymotrypsin, cytochrome, hemoglobin (lamprey), lactate dehydrogenase, subtilisin, and trypsin inhibitor), the PDB archive provides both a home and an access point to the world’s output of biomacromolecular structures. The PDB is growing swiftly, having doubled in size since 2008 and now releasing around 200 new structures to the scientific community every week. The resource is accessed hundreds of millions of times every year by researchers, students, and educators wishing to explore how different proteins are related to one another, to clarify biological mechanisms and to develop new medicines.

This week the PDB welcomes 219 new structures into the archive. These structures join others vital to pharmacology, bioinformatics and education, and take the PDB beyond the 100,000-structure mark.

“The PDB is a critical resource for the international community of working scientists, which includes everyone from geneticists to pharmaceutical companies interested in drug targets,” says Nobel laureate Venki Ramakrishnan of the MRC Laboratory of Molecular Biology in Cambridge, UK.

A growing community

Since its inception, the PDB has been a community-driven archive, evolving into a critical international resource for biological research. Since 2003 the Worldwide PDB (wwPDB), a collaboration between the US, UK and Japan, has ensured that these valuable data continue to be stored, managed and kept freely available for the benefit of scientists worldwide. The wwPDB partner sites work closely with community experts to define deposition and annotation policies, data representation issues and validation standards. In addition, the wwPDB works to raise the profile of structural biology with increasingly broad audiences.

Each structure submitted to the archive is carefully curated by wwPDB staff before release. New depositions are checked and enhanced with value-added annotations and cross-linked with other important biological data to ensure that PDB structures are discoverable and interpretable by users with a wide range of backgrounds and interests.

Future challenges

The scientific community eagerly awaits the next 100,000 structures and the knowledge these will undoubtedly bring. However, the increasing number, size and complexity of biological data being deposited in the PDB and the emergence of hybrid methods, which use a variety of biophysical, biochemical, and modelling techniques to determine the shapes of biologically relevant molecules, all present major challenges for the management and presentation of structural data. The wwPDB will continue to work with the community to meet these challenges and to ensure that the archive maintains the highest possible standards of quality, integrity and consistency.

About the wwPDB

The wwPDB is the international partnership that manages the PDB archive. Its mission is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community. It consists of the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) at Rutgers, The State University of New Jersey, and the San Diego Supercomputer Center (SDSC) and Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego; BioMagResBank (BMRB) at the University of Wisconsin in the USA; the Protein Data Bank in Europe (PDBe) at the EMBL European Bioinformatics Institute; and the Protein Data Bank Japan (PDBj) at Osaka University.

The RCSB PDB receives funds from the NSF, NIH and DOE. The PDBe receives funding from EMBL, the Wellcome Trust, NIH, EU, BBSRC and MRC. PDBj is funded by the Japan Science and Technology Agency, and BioMagResBank by NLM.