
In a PDB file, every atom is represented by one line starting
with 'ATOM'. The ATOM label is followed, in fixed-column format,
by the atom number, atom name, residue type, [chain name],
residue number, Cartesian coordinates (x, y, z in Å),
occupancy, and B-factor:
ATOM      1  N   GLN A   3      -7.473   2.493   8.251  1.00 91.59
ATOM      2  CA  GLN A   3      -6.052   2.183   8.590  1.00 91.36
ATOM      3  C   GLN A   3      -5.650   2.944   9.849  1.00 90.93
ATOM      4  O   GLN A   3      -4.559   3.527   9.928  1.00 90.73
ATOM      5  CB  GLN A   3      -5.130   2.542   7.420  1.00 91.75
ATOM      6  N   ILE A   4      -6.557   2.930  10.824  1.00 90.45
ATOM      7  CA  ILE A   4      -6.368   3.594  12.115  1.00 90.40
ATOM      8  C   ILE A   4      -5.029   3.198  12.762  1.00 90.28
ATOM      9  O   ILE A   4      -4.109   4.023  12.872  1.00 89.61
ATOM     10  CB  ILE A   4      -7.552   3.258  13.085  1.00 90.26
ATOM     11  CG1 ILE A   4      -7.256   3.762  14.504  1.00 89.86
ATOM     12  CG2 ILE A   4      -7.855   1.754  13.068  1.00 89.61
ATOM     13  CD1 ILE A   4      -8.232   3.270  15.565  1.00 89.31
...etc...
END
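Because the ATOM record is fixed-format, each field can be read by column position rather than by splitting on whitespace. The following is a minimal Python sketch, assuming the standard PDB column layout (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, x/y/z in 31-54, occupancy in 55-60, B-factor in 61-66). The function name `parse_atom_line` is hypothetical and not part of any library.

```python
def parse_atom_line(line):
    """Split one ATOM record into fields using fixed PDB column positions."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    line = line.ljust(66)  # pad short lines so every slice is defined
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21].strip(),        # column 22 (may be blank)
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38, in Angstrom
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
        "occupancy": float(line[54:60]),  # columns 55-60
        "b_factor": float(line[60:66]),   # columns 61-66
    }


# First atom of the excerpt, written out in fixed columns.
example = "ATOM      1  N   GLN A   3      -7.473   2.493   8.251  1.00 91.59"
atom = parse_atom_line(example)
print(atom["name"], atom["res_name"], atom["chain"], atom["res_seq"])
print(atom["x"], atom["y"], atom["z"], atom["occupancy"], atom["b_factor"])
```

Slicing by column stays correct even when neighbouring fields touch (for example a large negative coordinate next to another number), which is where whitespace splitting tends to fail.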
Final PDB files also have an extensive header section with
details about the macromolecule, literature references,
cofactors, data collection, refinement, etc. This header section
usually contains more detail than the primary reference:
HEADER    ION TRANSPORT                           28-JUL-99   1C3W
TITLE     BACTERIORHODOPSIN/LIPID COMPLEX AT 1.55 A RESOLUTION
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: BACTERIORHODOPSIN (GROUND STATE WILD TYPE "BR");
COMPND   3 CHAIN: A;
COMPND   4 ENGINEERED: YES;
COMPND   5 BIOLOGICAL_UNIT: HOMOTRIMER;
COMPND   6 OTHER_DETAILS: SCHIFF BASE LINKAGE BETWEEN LYS 216 (NZ)
COMPND   7 AND RET 301 (C15) DIETHER LIPID BILAYER
SOURCE    MOL_ID: 1;
SOURCE   2 ORGANISM_SCIENTIFIC: HALOBACTERIUM SALINARUM;
SOURCE   3 CELLULAR_LOCATION: PLASMA MEMBRANE;
SOURCE   4 EXPRESSION_SYSTEM: HALOBACTERIUM SALINARUM;
SOURCE   5 EXPRESSION_SYSTEM_CELLULAR_LOCATION: CYTOPLASM;
SOURCE   6 OTHER_DETAILS: THIS SEQUENCE OCCURS NATURALLY IN H.
SOURCE   7 SALINARUM
KEYWDS    ION PUMP, MEMBRANE PROTEIN, RETINAL PROTEIN, LIPIDS,
KEYWDS   2 PHOTORECEPTOR, HALOARCHAEA, 7-TRANSMEMBRANE, SERPENTINE,
KEYWDS   3 ION TRANSPORT, MEROHEDRAL TWINNING
EXPDTA    X-RAY DIFFRACTION
AUTHOR    H.LUECKE
REVDAT   2   22-SEP-99 1C3W    1       REMARK HETNAM
REVDAT   1   15-SEP-99 1C3W    0
JRNL        AUTH   H.LUECKE,B.SCHOBERT,H.-T.RICHTER,J.-P.CARTAILLER,
JRNL        AUTH 2 J.K.LANYI
JRNL        TITL   STRUCTURE OF BACTERIORHODOPSIN AT 1.55 ANGSTROM
JRNL        TITL 2 RESOLUTION
JRNL        REF    J.MOL.BIOL.                   V. 291   899 1999
JRNL        REFN   ASTM JMOBAK  UK ISSN 0022-2836                 0070
REMARK   1
REMARK   1 REFERENCE 1
REMARK   1  AUTH   H.LUECKE,H.-T.RICHTER,J.K.LANYI
REMARK   1  TITL   PROTON TRANSFER PATHWAYS IN BACTERIORHODOPSIN AT
REMARK   1  TITL 2 2.3 ANGSTROM RESOLUTION
REMARK   1  REF    SCIENCE                       V. 280  1934 1998
REMARK   1  REFN   ASTM SCIEAS  US ISSN 0036-8075                 0038
REMARK   2
REMARK   2 RESOLUTION. 1.55 ANGSTROMS.
REMARK   3
REMARK   3 REFINEMENT.
REMARK   3   PROGRAM     : SHELXL-97
REMARK   3   AUTHORS     : G.M.SHELDRICK
REMARK   3
REMARK   3  DATA USED IN REFINEMENT.
REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) : 1.55
REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) : 12.0
REMARK   3   DATA CUTOFF            (SIGMA(F)) : 0.000
REMARK   3   COMPLETENESS FOR RANGE        (%) : 99.1
REMARK   3   CROSS-VALIDATION METHOD           : THROUGHOUT
REMARK   3   FREE R VALUE TEST SET SELECTION   : THIN RESOLUTION
REMARK   3                                       SHELLS
REMARK   3
REMARK   3  FIT TO DATA USED IN REFINEMENT (NO CUTOFF).
REMARK   3   R VALUE   (WORKING + TEST SET, NO CUTOFF) : NULL
REMARK   3   R VALUE          (WORKING SET, NO CUTOFF) : 0.158
REMARK   3   FREE R VALUE                  (NO CUTOFF) : 0.225
REMARK   3   FREE R VALUE TEST SET SIZE (%, NO CUTOFF) : 5.00
REMARK   3   FREE R VALUE TEST SET COUNT   (NO CUTOFF) : 1687
REMARK   3   TOTAL NUMBER OF REFLECTIONS   (NO CUTOFF) : 32249
REMARK   3
REMARK   3  FIT/AGREEMENT OF MODEL FOR DATA WITH F>4SIG(F).
REMARK   3   R VALUE   (WORKING + TEST SET, F>4SIG(F)) : NULL
REMARK   3   R VALUE          (WORKING SET, F>4SIG(F)) : 0.140
REMARK   3   FREE R VALUE                  (F>4SIG(F)) : 0.201
REMARK   3   FREE R VALUE TEST SET SIZE (%, F>4SIG(F)) : 5.00
REMARK   3   FREE R VALUE TEST SET COUNT   (F>4SIG(F)) : 1390
REMARK   3   TOTAL NUMBER OF REFLECTIONS   (F>4SIG(F)) : 26270
REMARK   3
REMARK   3  NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
REMARK   3   PROTEIN ATOMS      : 1720
REMARK   3   NUCLEIC ACID ATOMS : 0
REMARK   3   HETEROGEN ATOMS    : 330
REMARK   3   SOLVENT ATOMS      : 23
REMARK   3
REMARK   3  MODEL REFINEMENT.
REMARK   3   OCCUPANCY SUM OF NON-HYDROGEN ATOMS      : 2073.00
REMARK   3   OCCUPANCY SUM OF HYDROGEN ATOMS          : NULL
REMARK   3   NUMBER OF DISCRETELY DISORDERED RESIDUES : 0
REMARK   3   NUMBER OF LEAST-SQUARES PARAMETERS       : 8300
REMARK   3   NUMBER OF RESTRAINTS                     : 8209
REMARK   3
REMARK   3  RMS DEVIATIONS FROM RESTRAINT TARGET VALUES.
REMARK   3   BOND LENGTHS                         (A) : 0.010
REMARK   3   ANGLE DISTANCES                      (A) : 0.030
REMARK   3   SIMILAR DISTANCES (NO TARGET VALUES) (A) : NULL
REMARK   3   DISTANCES FROM RESTRAINT PLANES      (A) : 0.264
REMARK   3   ZERO CHIRAL VOLUMES               (A**3) : 0.072
REMARK   3   NON-ZERO CHIRAL VOLUMES           (A**3) : 0.079
REMARK   3   ANTI-BUMPING DISTANCE RESTRAINTS     (A) : 0.013
REMARK   3   RIGID-BOND ADP COMPONENTS         (A**2) : NULL
REMARK   3   SIMILAR ADP COMPONENTS            (A**2) : NULL
REMARK   3   APPROXIMATELY ISOTROPIC ADPS      (A**2) : NULL
REMARK   3
REMARK   3  BULK SOLVENT MODELING.
REMARK   3   METHOD USED: SHELXL-97 SWAT, BABINET'S PRINCIPLE
...
HELIX    1   1 GLU A    9  GLY A   31  1                                  23
HELIX    2   2 ASP A   36  GLY A   63  1                                  28
HELIX    3   3 TRP A   80  VAL A  101  1                                  22
HELIX    4   4 ASP A  104  THR A  128  1                                  25
HELIX    5   5 VAL A  130  GLY A  155  1                                  26
HELIX    6   6 ARG A  164  GLY A  192  1                                  29
HELIX    7   7 PRO A  200  SER A  226  1                                  27
HELIX    8   8 ARG A  227  PHE A  230  5                                   4
SHEET    1   A 2 LEU A  66  PHE A  71  0
SHEET    2   A 2 GLU A  74  TYR A  79 -1  N  GLU A  74   O  PHE A  71
LINK         NZ  LYS A 216                 C15 RET A 301
CRYST1   60.631   60.631  108.156  90.00  90.00 120.00 P 63
...
ATOM      1  N   THR A   5      24.150  25.374 -13.588  1.00 61.71           N
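A few of the header values shown above (ID code, deposition date, resolution, unit cell) can be pulled out with very little code. The following is a rough Python sketch that splits records on whitespace for brevity; real PDB records are fixed-column, so this is an assumption that holds for the excerpt but not for every file. The function name `summarize_header` and the dictionary keys are hypothetical.

```python
def summarize_header(lines):
    """Collect a few values from PDB header records (whitespace-based sketch)."""
    info = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "HEADER" and len(fields) >= 4:
            # Assumed layout: classification words, then date, then ID code.
            info["id_code"] = fields[-1]
            info["date"] = fields[-2]
            info["classification"] = " ".join(fields[1:-2])
        elif fields[:3] == ["REMARK", "2", "RESOLUTION."] and len(fields) > 3:
            try:
                info["resolution"] = float(fields[3])  # in Angstrom
            except ValueError:
                pass  # e.g. a non-numeric entry; leave resolution unset
        elif fields[0] == "CRYST1" and len(fields) >= 8:
            # a, b, c (Angstrom) and alpha, beta, gamma (degrees).
            info["cell"] = [float(v) for v in fields[1:7]]
            # Everything after the six cell values is treated as the space
            # group here; a complete record may carry a trailing Z value too.
            info["space_group"] = " ".join(fields[7:])
    return info


header_lines = [
    "HEADER    ION TRANSPORT                           28-JUL-99   1C3W",
    "REMARK   2 RESOLUTION. 1.55 ANGSTROMS.",
    "CRYST1   60.631   60.631  108.156  90.00  90.00 120.00 P 63",
]
info = summarize_header(header_lines)
for key, value in info.items():
    print(key, "=", value)
```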
The mmCIF format is an extension of the CIF (Crystallographic
Information File) format developed for small-molecule X-ray
structures by the IUCr (International Union of Crystallography).
The objective is to provide a powerful syntax that allows a complete
description of any macromolecular structure. mmCIF is currently used
only for the deposition of observed structure factors.
In this file excerpt, a declaration (header) portion defines the
columns of the data portion. Each data line contains the three
Miller indices (h, k, l), the observed structure factor (in this
case its square, i.e. the intensity), and the standard deviation
of the squared structure factor:
data_r403dsf
loop_
_refln.index_h
_refln.index_k
_refln.index_l
_refln.F_squared_meas
_refln.F_squared_sigma
   1   0   0     1.05    0.75
   2   0   0  3184.56   49.78
   4   0   0    79.85    7.26
   5   0   0     9.50    3.84
   6   0   0  1116.30   39.13
   7   0   0    13.63    7.03
   8   0   0   145.03    7.33
  10   0   0    68.84   12.00
  12   0   0    98.49   15.63
  14   0   0    42.32   12.84
  15   0   0    16.88   10.46
  16   0   0    38.74   12.35
   1   0   1  3905.92   48.39
   2   0   1    41.17    2.84
   3   0   1  1517.18   41.30
...
#END OF REFLECTIONS
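The relationship between the declaration portion and the data portion can be illustrated with a short Python sketch that reads the column names after `loop_` and then uses them to interpret each data line. It assumes one reflection per line and no quoted or multi-line values, which is true of the excerpt but not of mmCIF in general. The function name `parse_loop` is hypothetical.

```python
def parse_loop(lines):
    """Parse one simple mmCIF loop_ block into (column names, rows of strings)."""
    columns, rows = [], []
    in_loop = False
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "loop_":
            in_loop = True
        elif in_loop and line.startswith("_"):
            columns.append(line)  # declaration: one column name per line
        elif in_loop and columns:
            rows.append(line.split())  # data: one whitespace-separated row
    return columns, rows


sample = [
    "data_r403dsf",
    "loop_",
    "_refln.index_h",
    "_refln.index_k",
    "_refln.index_l",
    "_refln.F_squared_meas",
    "_refln.F_squared_sigma",
    "1 0 0 1.05 0.75",
    "2 0 0 3184.56 49.78",
    "4 0 0 79.85 7.26",
    "#END OF REFLECTIONS",
]
columns, rows = parse_loop(sample)
i_col = columns.index("_refln.F_squared_meas")
s_col = columns.index("_refln.F_squared_sigma")
for row in rows:
    h, k, l = (int(v) for v in row[:3])
    # Intensity divided by its standard deviation (signal-to-noise).
    ratio = float(row[i_col]) / float(row[s_col])
    print(h, k, l, round(ratio, 1))
```

Looking up columns by name rather than by position means the same code keeps working if a file declares its columns in a different order.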
ADIT (AutoDep Input Tool) is used to deposit X-ray diffraction, electron diffraction, NMR, or theoretical structures in the PDB archive. The PDB also uses it internally for data processing and validation. To deposit a structure, the user uploads the relevant coordinate and experimental data files and then adds any additional information. A session ID number is provided for depositors who wish to continue a deposition session at a later time. ADIT is a web-based tool that uses frames. It works with most browsers, although there are known problems with Netscape on Linux.

Structures deposited using ADIT are processed immediately by the PDB staff and returned to the author for final approval. In most cases, files are fully processed within nine days and are released according to the release status provided by the author:
For publication in (almost) all journals you will
need the four-character PDB code (like 1C3W below) at the galley stage:

The number of PDB entries as of June 6, 2000 is 12,474.
Growth of the database has been nearly exponential over the past 25 years.
With Structural Genomics on the horizon, this growth rate is unlikely
to change anytime soon:

Number of "new folds" (blue) and "old
folds" (red) for a given year. Note: A chain fold was considered
old if it was similar to a deposited fold according to the following
criteria:

