PROPOSED
DATA ITEMS DESCRIBING PROTEIN PRODUCTION and CRYSTALLIZATION
(Revised: 8-July-2003 J. Westbrook)
These items are intended to allow reproduction of the protein production and are aimed at the same level of detail as a J. Mol. Biol. publication.
Please email jon@strubi.ox.ac.uk with any comments/questions on these items.
Further details are available from http://www.oppf.ox.ac.uk/ (mmCIF2.htm).
01-Jul-02: Starting dictionary obtained from http://pdb.rutgers.edu/mmcif.
Original dictionary developed with significant input from Cathy Lawson (Rutgers), Rosalind Kim (Berkeley), Kim Henrick (EBI/MSD), and John Westbrook (Rutgers).
Changes
12-Jul-02: The *method items are likely to be enumerated lists rather than free text
Development stage dictionary item name has changed to _entity_src_gen.gene_src_dev_stage –
it was not a .pdbx_ item
_entity_src_gen_clone.gene_insert_method has been added and mention of
_entity_src_gen.pdbx_gene_insert_method removed
15-Jul-02: Several item name changes
ENTITY_SRC_GEN_CLONE_LIGATION_FREE has changed name to ENTITY_SRC_GEN_CLONE_RECOMBINATION.
_entity_src_gen_express.protein_location removed – covered by
_entity_src_gen.pdbx_host_org_cellular_location.
Added _entity_src_gen_lysis.time
Added _entity_src_gen_fract.protein_volume
Removed _entity_src_gen_chrom.fraction_volume
Added _entity_src_gen_refold.time
Protein characterisation split out of ENTITY_SRC_GEN_PURE into ENTITY_SRC_GEN_PURE_CHARACTER
20-Aug-02: Corrections as pointed out by Ulrich Harttig, PSF (Berlin)
Removed _entity_src_gen.pdbx_host_org_gene_fusion_details
Added _entity_src_gen_express.inducer
Added _entity_src_gen_express.inducer_concentration
Removed _entity_src_gen_express.expression_level – value unknowable at this time
Added _entity_src_gen_fract.protein_yield_method
Renamed _entity_src_gen_chrom.protein_concentration to _entity_src_gen_chrom.sample_concentration
Added _entity_src_gen_chrom.elution_buffer_id
Added _entity_src_gen_chrom.sample_conc_method
Added _entity_src_gen_chrom.yield_method
Renamed _entity_src_gen_remove_tag.tag_removal_details to _entity_src_gen_remove_tag.details
Removed _entity_src_gen_remove_tag.cleavage_site – derivable
Removed _entity_src_gen_pure_character.pure_id
Added _entity_src_gen_pure_character.prod_step_id
08-Oct-02: Several alterations arising from discussions with John Ionides, EBI (Cambridge)
Entry has been typed – this is to facilitate data exchange using mmCIF
Entity references a globally unique target – source of this is a point for discussion
The sequence-storage category is a new category, PDBX_CONSTRUCT
Sequence annotation is in PDBX_CONSTRUCT_FEATURE
The expression category has been extended to contain expression strain information – seems sensible
The OD timepoints have been split off into a sub category – previous attempt was not valid mmCIF
ENTITY_SRC_GEN_REMOVE_TAG has been renamed ENTITY_SRC_GEN_PROTEOLYSIS – improve generality
ENTITY_SRC_GEN_CHARACTER has been reworded to be applicable at any stage
A generic process step category has been added – ENTITY_SRC_GEN_PROD_OTHER +
ENTITY_SRC_GEN_PROD_OTHER_PARAMETER
The workflow-tracing item prod_step_id has been renamed next_step_id – seems a more useful item
7-July-2003 Added crystallization data items.
Known Omissions and Issues
Cell-free expression is missing.
The transformation method for expression is not explicitly mentioned. It was hoped that this could be covered by the transformation items in the cloning category, but it may be easier to simply add the requisite data items to the expression category as well.
Data Items
The *details items generally allow free text entry. Other items generally allow numeric data or selection from a predefined enumerated list. Current thinking is more clearly expressed in the web pages available from http://www.oppf.ox.ac.uk/ (mmCIF2.htm).
Further information about the dictionaries and mmCIF can be found at http://pdb.rutgers.edu/mmcif/ .
NB: Items are only detailed here where they modify or extend the pre-existing ENTRY category
|
Item name |
Description |
|
_entry.id |
The value of _entry.id identifies the data block. Note that this item need not be a number; it can be any unique identifier. In the context of a structural genomics project this identifier, when prefixed by the value of _entry.type, should be globally unique. For protein production it is envisaged that the value of _entry.id should uniquely define a product from a protein production run. This will normally be a protein sample but may also be an expression vector. |
|
_entry.type |
For exchange within a structural genomics project, the value of _entry.type identifies the type of data given in the data block. A null value for this item indicates that this is not a structural genomics exchange mmCIF but rather a complete entry that conforms to the _pdbx_style definitions. It is envisaged that both 'P' and 'E' are possible products of a protein production facility and would be identical in all aspects except that 'P' describes a sample that is (predominantly) protein and 'E' describes a sample that is a nucleic acid. |
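The prefixing rule for _entry.id and _entry.type can be illustrated with a minimal Python sketch. The function name, the direct concatenation without a separator, and the sample identifier are assumptions made for illustration; the text above only states that the identifier is prefixed by the value of _entry.type.

```python
from typing import Optional


def global_entry_id(entry_type: Optional[str], entry_id: str) -> str:
    """Prefix _entry.id with _entry.type to form a project-wide identifier.

    A null _entry.type (None here) marks a complete entry rather than a
    structural genomics exchange file, so the id is returned unchanged.
    Joining without a separator is an assumption of this sketch.
    """
    if not entry_id:
        raise ValueError("_entry.id must not be empty")
    if entry_type is None:
        return entry_id
    return f"{entry_type}{entry_id}"


# Hypothetical identifier: the same production run gives distinct global ids
# for a protein sample ('P') and an expression vector ('E').
protein_sample = global_entry_id("P", "run-0042")
expression_vector = global_entry_id("E", "run-0042")
```

Under these assumptions the two products of one run remain distinguishable even though they share the same _entry.id.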
NB: Items are only detailed here where they modify or extend the pre-existing ENTITY category
|
Item name |
Description |
|
_entity.id |
The value of _entity.id must uniquely identify a record in the ENTITY list. Note that this item need not be a number; it can be any unique identifier. |
|
_entity.target_id |
The value of _entity.target_id points to a target identifier from which this entity was generated. |
NB: Items are only detailed here where they modify or extend the pre-existing ENTITY_SRC_GEN category
|
Item name |
Description |
|
_entity_src_gen.entity_id |
This data item is a pointer to _entity.id in the ENTITY category. |
|
_entity_src_gen.host_org_common_name |
The common name of the organism that served as host for the production of the entity. Where full details of the protein production are available it would be expected that this item be derived from _entity_src_gen_express.host_org_common_name or via _entity_src_gen_express.host_org_tax_id |
|
_entity_src_gen.host_org_details |
A description of special aspects of the organism that served as host for the production of the entity. Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_express.host_org_details |
|
_entity_src_gen.host_org_strain |
The strain of the organism in which the entity was expressed. Where full details of the protein production are available it would be expected that this item be derived from _entity_src_gen_express.host_org_strain or via _entity_src_gen_express.host_org_tax_id |
|
_entity_src_gen.plasmid_details |
A description of special aspects of the plasmid that produced the entity in the host organism. Where full details of the protein production are available it would be expected that this item would be derived from _pdbx_construct.details of the construct pointed to from _entity_src_gen_express.plasmid_id. |
|
_entity_src_gen.plasmid_name |
The name of the plasmid that produced the entity in the host organism. Where full details of the protein production are available it would be expected that this item would be derived from _pdbx_construct.name of the construct pointed to from _entity_src_gen_express.plasmid_id. |
|
_entity_src_gen.pdbx_host_org_variant |
Variant of the organism used as the expression system. Where full details of the protein production are available it would be expected that this item be derived from _entity_src_gen_express.host_org_variant or via _entity_src_gen_express.host_org_tax_id |
|
_entity_src_gen.pdbx_host_org_cell_line |
A specific line of cells used as the expression system. Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_express.host_org_cell_line |
|
_entity_src_gen.pdbx_host_org_atcc |
American Type Culture Collection (ATCC) reference of the expression system. Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_express.host_org_culture_collection |
|
_entity_src_gen.pdbx_host_org_culture_collection |
Culture collection of the expression system. Where full details of the protein production are available it would be expected that this item would be derived from another item, but exactly which one is not yet clear. |
|
_entity_src_gen.pdbx_host_org_cell |
Cell type from which the gene is derived. Where _entity.target_id is provided this should be derived from details of the target. |
|
_entity_src_gen.pdbx_host_org_scientific_name |
The scientific name of the organism that served as host for the production of the entity. Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_express.host_org_scientific_name or via _entity_src_gen_express.host_org_tax_id |
|
_entity_src_gen.pdbx_host_org_tissue |
The specific tissue which expressed the molecule. Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_express.host_org_tissue |
|
_entity_src_gen.pdbx_host_org_vector |
Identifies the vector used. Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_clone.vector_name. |
|
_entity_src_gen.pdbx_host_org_vector_type |
Identifies the type of vector used (plasmid, virus, or cosmid). Where full details of the protein production are available it would be expected that this item would be derived from _entity_src_gen_express.vector_type. |
|
_entity_src_gen.expression_system_id |
A unique identifier for the expression system. This should be extracted from a local list of expression systems. |
|
_entity_src_gen.gene_src_dev_stage |
A string to indicate the life-cycle or cell development cycle in which the gene is expressed and the mature protein is active. |
|
_entity_src_gen.start_construct_id |
A pointer to _pdbx_construct.id in the PDBX_CONSTRUCT category. The identified sequence is the initial construct. |
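Several descriptions above state that an ENTITY_SRC_GEN item is expected to be derived from a named ENTITY_SRC_GEN_EXPRESS item. The following Python sketch shows one way such a derivation could be carried out. The mapping is transcribed from those descriptions; the function name, the dictionary-based row representation, and the sample values are assumptions. Items derived from PDBX_CONSTRUCT or ENTITY_SRC_GEN_CLONE, and look-ups via host_org_tax_id, are left out.

```python
# ENTITY_SRC_GEN item -> ENTITY_SRC_GEN_EXPRESS item it is derived from,
# as stated in the item descriptions.
DERIVED_FROM_EXPRESS = {
    "host_org_common_name": "host_org_common_name",
    "host_org_details": "host_org_details",
    "host_org_strain": "host_org_strain",
    "pdbx_host_org_variant": "host_org_variant",
    "pdbx_host_org_cell_line": "host_org_cell_line",
    "pdbx_host_org_atcc": "host_org_culture_collection",
    "pdbx_host_org_scientific_name": "host_org_scientific_name",
    "pdbx_host_org_tissue": "host_org_tissue",
    "pdbx_host_org_vector_type": "vector_type",
}


def derive_src_gen(express_row: dict) -> dict:
    """Copy populated express items into their ENTITY_SRC_GEN names.

    Missing values and the CIF placeholders '?' (unknown) and '.'
    (inapplicable) are skipped.
    """
    summary = {}
    for target, source in DERIVED_FROM_EXPRESS.items():
        value = express_row.get(source)
        if value not in (None, "?", "."):
            summary[target] = value
    return summary


# Hypothetical expression record, for illustration only.
express = {
    "host_org_scientific_name": "Escherichia coli",
    "host_org_strain": "BL21(DE3)",
    "vector_type": "plasmid",
    "host_org_tissue": "?",
}
summary = derive_src_gen(express)
```

Keeping the mapping in a single table makes it easy to adjust if item names change again, as they have in earlier revisions.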
This category contains details for the DIGEST steps used in the overall protein production process. The digestion is assumed to be applied to the result of the previous production step, or the gene source if this is the first production step.
|
Item name |
Description |
|
_entity_src_gen_prod_digest.entry_id |
The value of _entity_src_gen_prod_digest.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_prod_digest.entity_id |
The value of _entity_src_gen_prod_digest.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_prod_digest.step_id |
This item is the unique identifier for this digestion step. |
|
_entity_src_gen_prod_digest.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_prod_digest.end_construct_id |
This item is a pointer to _pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced nucleic acid sequence is that of the digest product. |
|
_entity_src_gen_prod_digest.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_prod_digest.date |
The date of this production step. |
|
_entity_src_gen_prod_digest.restriction_enzyme_1 |
The first enzyme used in the restriction digestion. The sites at which this cuts can be derived from the sequence. |
|
_entity_src_gen_prod_digest.restriction_enzyme_2 |
The second enzyme used in the restriction digestion. The sites at which this cuts can be derived from the sequence. |
|
_entity_src_gen_prod_digest.purification_details |
String value containing details of any purification of the product of the digestion. |
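The step_id and next_step_id items link production steps into a workflow that may have several entry points converging on one product. A minimal Python sketch of following those links is given below. The step identifiers, the use of None for "no further step", and the function name are assumptions for illustration only.

```python
def trace_workflow(next_step: dict, start: str) -> list:
    """Follow next_step_id links from `start` until a step has no successor."""
    path, seen = [], set()
    step = start
    while step is not None:
        if step in seen:
            raise ValueError(f"cycle detected at step {step!r}")
        seen.add(step)
        path.append(step)
        step = next_step.get(step)
    return path


# Hypothetical workflow: step_id -> next_step_id (None marks the final step).
next_step = {
    "pcr-1": "clone-1",
    "digest-1": "clone-1",
    "clone-1": "express-1",
    "express-1": None,
}

# Entry points are steps that no other step names as its successor.
entry_points = sorted(set(next_step) - set(next_step.values()))
paths = [trace_workflow(next_step, s) for s in entry_points]
```

Here both the PCR step and the digestion step lead into the same cloning step, so each traced path ends at the same product.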
This category contains details for the PCR steps used in the overall protein production process. The PCR is assumed to be applied to the result of the previous production step, or the gene source if this is the first production step.
|
Item name |
Description |
|
_entity_src_gen_prod_pcr.entry_id |
The value of _entity_src_gen_prod_pcr.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_prod_pcr.entity_id |
The value of _entity_src_gen_prod_pcr.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_prod_pcr.step_id |
This item is the unique identifier for this PCR step. |
|
_entity_src_gen_prod_pcr.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_prod_pcr.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced nucleic acid sequence is that of the PCR product. |
|
_entity_src_gen_prod_pcr.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. The referenced robot is the robot responsible for the PCR reaction (normally the heat cycler). |
|
_entity_src_gen_prod_pcr.date |
The date of this production step. |
|
_entity_src_gen_prod_pcr.forward_primer_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced nucleic acid sequence is that of the forward primer. |
|
_entity_src_gen_prod_pcr.reverse_primer_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced nucleic acid sequence is that of the reverse primer. |
|
_entity_src_gen_prod_pcr.reaction_details |
String value containing details of the PCR reaction. |
|
_entity_src_gen_prod_pcr.purification_details |
String value containing details of any purification of the product of the PCR reaction. |
This category contains details for the cloning steps used in the overall protein production process. Each row in ENTITY_SRC_GEN_CLONE should have an equivalent row in either ENTITY_SRC_GEN_CLONE_LIGATION or ENTITY_SRC_GEN_CLONE_RECOMBINATION.
|
Item name |
Description |
|
_entity_src_gen_clone.entry_id |
The value of _entity_src_gen_clone.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_clone.entity_id |
The value of _entity_src_gen_clone.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_clone.step_id |
This item is the unique identifier for this cloning step. |
|
_entity_src_gen_clone.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_clone.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced nucleic acid sequence is that of the cloned product. |
|
_entity_src_gen_clone.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_clone.date |
The date of this production step. |
|
_entity_src_gen_clone.gene_insert_method |
The method used to insert the gene into the vector. For 'Ligation', an ENTITY_SRC_GEN_CLONE_LIGATION entry with matching .step_id is expected. For 'Recombination', an ENTITY_SRC_GEN_CLONE_RECOMBINATION entry with matching .step_id is expected. |
|
_entity_src_gen_clone.vector_name |
The name of the vector used in this cloning step. |
|
_entity_src_gen_clone.vector_details |
Details of any modifications made to the named vector. |
|
_entity_src_gen_clone.transformation_method |
The method used to transform the expression cell line with the vector |
|
_entity_src_gen_clone.marker |
The type of marker included to allow selection of transformed cells |
|
_entity_src_gen_clone.verification_method |
The method used to verify that the incorporated gene is correct |
|
_entity_src_gen_clone.purification_details |
Details of any purification of the product. |
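The rule that each ENTITY_SRC_GEN_CLONE row should have an equivalent row in ENTITY_SRC_GEN_CLONE_LIGATION or ENTITY_SRC_GEN_CLONE_RECOMBINATION, selected by gene_insert_method, lends itself to an automated check. The Python sketch below is one possible form. The row representation and function name are assumptions, and for brevity it matches on step_id only rather than on the full entry_id, entity_id, step_id key.

```python
EXTENSION_FOR_METHOD = {
    "Ligation": "entity_src_gen_clone_ligation",
    "Recombination": "entity_src_gen_clone_recombination",
}


def missing_extensions(clone_rows, extension_rows):
    """Return (step_id, expected_category) for clone rows lacking an extension row.

    clone_rows: list of dicts with 'step_id' and 'gene_insert_method'.
    extension_rows: dict mapping category name -> list of row dicts.
    """
    problems = []
    for row in clone_rows:
        category = EXTENSION_FOR_METHOD.get(row["gene_insert_method"])
        if category is None:
            continue  # method other than the two described here; not checked
        step_ids = {r["step_id"] for r in extension_rows.get(category, [])}
        if row["step_id"] not in step_ids:
            problems.append((row["step_id"], category))
    return problems


# Hypothetical rows, for illustration only.
clones = [
    {"step_id": "clone-1", "gene_insert_method": "Ligation"},
    {"step_id": "clone-2", "gene_insert_method": "Recombination"},
]
extensions = {
    "entity_src_gen_clone_ligation": [{"step_id": "clone-1"}],
    "entity_src_gen_clone_recombination": [],
}
problems = missing_extensions(clones, extensions)
```

In this example the recombination step has no matching extension row and is reported, while the ligation step passes.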
This category contains details for the ligation-based cloning steps used in the overall protein production process. _entity_src_gen_clone_ligation.step_id in this category must point at a defined _entity_src_gen_clone.step_id. The details in ENTITY_SRC_GEN_CLONE_LIGATION extend the details in ENTITY_SRC_GEN_CLONE to cover ligation-dependent cloning steps.
|
Item name |
Description |
|
_entity_src_gen_clone_ligation.entry_id |
This item is a pointer to _entity_src_gen_clone.entry_id in the ENTITY_SRC_GEN_CLONE category. |
|
_entity_src_gen_clone_ligation.entity_id |
This item is a pointer to _entity_src_gen_clone.entity_id in the ENTITY_SRC_GEN_CLONE category. |
|
_entity_src_gen_clone_ligation.step_id |
This item is a pointer to _entity_src_gen_clone.step_id in the ENTITY_SRC_GEN_CLONE category. |
|
_entity_src_gen_clone_ligation.cleavage_enzymes |
The names of the enzymes used to cleave the vector. In addition an enzyme used to blunt the cut ends, etc., should be named here. |
|
_entity_src_gen_clone_ligation.ligation_enzymes |
The names of the enzymes used to ligate the gene into the cleaved vector. |
|
_entity_src_gen_clone_ligation.temperature |
The temperature at which the ligation experiment was performed, in degrees Celsius. |
|
_entity_src_gen_clone_ligation.time |
The duration of the ligation reaction in minutes. |
|
_entity_src_gen_clone_ligation.details |
Any details to be associated with this ligation step, e.g. the protocol. |
This category contains details for the recombination-based cloning steps used in the overall protein production process. It is assumed that these reactions will use commercially available kits. _entity_src_gen_clone_recombination.step_id in this category must point at a defined _entity_src_gen_clone.step_id. The details in ENTITY_SRC_GEN_CLONE_RECOMBINATION extend the details in ENTITY_SRC_GEN_CLONE to cover recombination-dependent cloning steps.
|
Item name |
Description |
|
_entity_src_gen_clone_recombination.entry_id |
This item is a pointer to _entity_src_gen_clone.entry_id in the ENTITY_SRC_GEN_CLONE category. |
|
_entity_src_gen_clone_recombination.entity_id |
This item is a pointer to _entity_src_gen_clone.entity_id in the ENTITY_SRC_GEN_CLONE category. |
|
_entity_src_gen_clone_recombination.step_id |
This item is a pointer to _entity_src_gen_clone.step_id in the ENTITY_SRC_GEN_CLONE category. |
|
_entity_src_gen_clone_recombination.system |
The name of the recombination system. |
|
_entity_src_gen_clone_recombination.recombination_enzymes |
The names of the enzymes used for this recombination step. |
|
_entity_src_gen_clone_recombination.details |
Any details to be associated with this recombination step, e.g. the protocol or differences from the manufacturer's specified protocol. |
This category contains details for the EXPRESSION steps used in the overall protein production process. It is hoped that this category will cover all forms of cell-based expression by reading induction as induction/transformation/transfection.
|
Item name |
Description |
|
_entity_src_gen_express.entry_id |
The value of _entity_src_gen_express.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_express.entity_id |
The value of _entity_src_gen_express.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_express.step_id |
This item is the unique identifier for this expression step. |
|
_entity_src_gen_express.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_express.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced sequence is expected to be the amino acid sequence of the expressed product. |
|
_entity_src_gen_express.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_express.date |
The date of this production step. |
|
_entity_src_gen_express.promoter_type |
The nature of the promoter controlling expression of the gene. |
|
_entity_src_gen_express.plasmid_id |
This item is a pointer to _pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced entry will contain the nucleotide sequence that is to be expressed, including tags. |
|
_entity_src_gen_express.vector_type |
Identifies the type of vector used (plasmid, virus, or cosmid) in the expression system. |
|
_entity_src_gen_express.N_terminal_seq_tag |
Any N-terminal sequence tag as a string of one letter amino acid codes. |
|
_entity_src_gen_express.C_terminal_seq_tag |
Any C-terminal sequence tag as a string of one letter amino acid codes |
|
_entity_src_gen_express.host_org_scientific_name |
The scientific name of the organism that served as host for the expression system. It is expected that either this item or _entity_src_gen_express.host_org_tax_id should be populated. |
|
_entity_src_gen_express.host_org_common_name |
The common name of the organism that served as host for the expression system. Where _entity_src_gen_express.host_org_tax_id is populated it is expected that this item may be derived by look up against the taxonomy database. |
|
_entity_src_gen_express.host_org_variant |
The variant of the organism that served as host for the expression system. Where _entity_src_gen_express.host_org_tax_id is populated it is expected that this item may be derived by a look up against the taxonomy database. |
|
_entity_src_gen_express.host_org_strain |
The strain of the organism that served as host for the expression system. Where _entity_src_gen_express.host_org_tax_id is populated it is expected that this item may be derived by a look up against the taxonomy database. |
|
_entity_src_gen_express.host_org_tissue |
The specific tissue which expressed the molecule. |
|
_entity_src_gen_express.host_org_culture_collection |
Culture collection of the expression system |
|
_entity_src_gen_express.host_org_cell_line |
A specific line of cells used as the expression system |
|
_entity_src_gen_express.host_org_tax_id |
The id for the NCBI taxonomy node corresponding to the organism that served as host for the expression system. |
|
_entity_src_gen_express.host_org_details |
A description of special aspects of the organism that served as host for the expression system. |
|
_entity_src_gen_express.culture_base_media |
The name of the base media in which the expression host was grown. |
|
_entity_src_gen_express.culture_additives |
Any additives to the base media in which the expression host was grown. |
|
_entity_src_gen_express.culture_volume |
The volume of media in millilitres in which the expression host was grown. |
|
_entity_src_gen_express.culture_time |
The time in hours for which the expression host was allowed to grow prior to induction/transformation/transfection. |
|
_entity_src_gen_express.culture_temperature |
The temperature in degrees Celsius at which the expression host was allowed to grow prior to induction/transformation/transfection. |
|
_entity_src_gen_express.inducer |
The chemical name of the inducing agent. |
|
_entity_src_gen_express.inducer_concentration |
Concentration of the inducing agent. |
|
_entity_src_gen_express.induction_details |
Details of induction/transformation/transfection. |
|
_entity_src_gen_express.multiplicity_of_infection |
The multiplicity of infection for genes introduced by transfection, e.g. for baculovirus-based expression. |
|
_entity_src_gen_express.induction_timepoint |
The time in hours after induction/transformation/transfection at which the optical density of the culture was measured. |
|
_entity_src_gen_express.induction_temperature |
The temperature in degrees Celsius at which the induced/transformed/transfected cells were grown. |
|
_entity_src_gen_express.harvesting_details |
Details of the harvesting protocol. |
|
_entity_src_gen_express.storage_details |
Details of how the harvested culture was stored. |
This category contains details for OD time series used to monitor a given EXPRESSION step used in the overall protein production process.
|
Item name |
Description |
|
_entity_src_gen_express_timepoint.entry_id |
The value of _entity_src_gen_express_timepoint.entry_id is a pointer to _entity_src_gen_express.entry_id |
|
_entity_src_gen_express_timepoint.entity_id |
The value of _entity_src_gen_express_timepoint.entity_id is a pointer to _entity_src_gen_express.entity_id |
|
_entity_src_gen_express_timepoint.step_id |
This item is a pointer to _entity_src_gen_express.step_id |
|
_entity_src_gen_express_timepoint.serial |
This item uniquely defines a timepoint within a series. |
|
_entity_src_gen_express_timepoint.OD |
The optical density of the expression culture in arbitrary units at the timepoint specified. |
|
_entity_src_gen_express_timepoint.time |
The time in hours after induction/transformation/transfection at which the optical density of the culture was measured. |
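Because the OD timepoints form a repeating sub-category, they map naturally onto an mmCIF loop_ construct. The Python sketch below formats a series in that way. The function name, the identifiers and the measurements are hypothetical, and the sketch assumes that no value contains whitespace (values with whitespace would need CIF quoting).

```python
def timepoint_loop(entry_id, entity_id, step_id, series):
    """Format (time_hours, od) pairs as an mmCIF loop_ block.

    The serial number is assigned in the order the pairs are given.
    """
    items = ("entry_id", "entity_id", "step_id", "serial", "OD", "time")
    lines = ["loop_"]
    lines += [f"_entity_src_gen_express_timepoint.{name}" for name in items]
    for serial, (hours, od) in enumerate(series, start=1):
        lines.append(
            f"{entry_id} {entity_id} {step_id} {serial} {od:.2f} {hours:.1f}"
        )
    return "\n".join(lines)


# Hypothetical measurements: (hours after induction, optical density).
block = timepoint_loop(
    "P0042", "1", "express-1", [(0.0, 0.60), (2.0, 1.35), (4.0, 2.10)]
)
print(block)
```

Each data row repeats the entry_id, entity_id and step_id of the parent expression step, which is what ties the series back to a row in ENTITY_SRC_GEN_EXPRESS.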
This category contains details for the cell lysis steps used in the overall protein production process.
|
Item name |
Description |
|
_entity_src_gen_lysis.entry_id |
The value of _entity_src_gen_lysis.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_lysis.entity_id |
The value of _entity_src_gen_lysis.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_lysis.step_id |
This item is the unique identifier for this lysis step. |
|
_entity_src_gen_lysis.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_lysis.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced sequence is expected to be the amino acid sequence of the expressed product after lysis. |
|
_entity_src_gen_lysis.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_lysis.date |
The date of this production step. |
|
_entity_src_gen_lysis.method |
The lysis method. |
|
_entity_src_gen_lysis.buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the lysis was performed. |
|
_entity_src_gen_lysis.buffer_volume |
The volume in millilitres of buffer in which the lysis was performed. |
|
_entity_src_gen_lysis.temperature |
The temperature in degrees Celsius at which the lysis was performed. |
|
_entity_src_gen_lysis.time |
The time in seconds of the lysis experiment. |
|
_entity_src_gen_lysis.details |
String value containing details of the lysis protocol. |
This category contains details for the refolding steps used in the overall protein production process.
|
Item name |
Description |
|
_entity_src_gen_refold.entry_id |
The value of _entity_src_gen_refold.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_refold.entity_id |
The value of _entity_src_gen_refold.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_refold.step_id |
This item is the unique identifier for this refolding step. |
|
_entity_src_gen_refold.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_refold.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced sequence is expected to be the amino acid sequence of the expressed product after the refolding step. |
|
_entity_src_gen_refold.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_refold.date |
The date of this production step. |
|
_entity_src_gen_refold.denature_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the protein was denatured. |
|
_entity_src_gen_refold.refold_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the protein was refolded. |
|
_entity_src_gen_refold.temperature |
The temperature in degrees Celsius at which the protein was refolded. |
|
_entity_src_gen_refold.time |
The time in hours over which the protein was refolded. |
|
_entity_src_gen_refold.storage_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the refolded protein was stored. |
|
_entity_src_gen_refold.details |
String value containing details of the refolding. |
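The duration items defined so far do not share a unit: ligation time is in minutes, culture and refolding times are in hours, and lysis time is in seconds. Software comparing steps may find it convenient to normalise them. The following Python sketch shows one possible approach; the units are taken from the item descriptions, while the function name and the choice of seconds as the common unit are assumptions.

```python
# Units as stated in the item descriptions.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600}

DURATION_UNITS = {
    "_entity_src_gen_clone_ligation.time": "minutes",
    "_entity_src_gen_express.culture_time": "hours",
    "_entity_src_gen_lysis.time": "seconds",
    "_entity_src_gen_refold.time": "hours",
}


def to_seconds(item_name: str, value: float) -> float:
    """Convert a duration item to seconds using its documented unit."""
    unit = DURATION_UNITS[item_name]  # KeyError for items not listed above
    return value * SECONDS_PER_UNIT[unit]


# Hypothetical values, for illustration only.
ligation_s = to_seconds("_entity_src_gen_clone_ligation.time", 30)
culture_s = to_seconds("_entity_src_gen_express.culture_time", 16)
lysis_s = to_seconds("_entity_src_gen_lysis.time", 90)
```

An unlisted item raises a KeyError rather than silently assuming a unit, which seems the safer behaviour while the dictionary is still changing.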
This category contains details for the protein purification tag removal steps used in the overall protein production process.
|
Item name |
Description |
|
_entity_src_gen_proteolysis.entry_id |
The value of _entity_src_gen_proteolysis.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_proteolysis.entity_id |
The value of _entity_src_gen_proteolysis.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_proteolysis.step_id |
This item is the unique identifier for this tag removal step. |
|
_entity_src_gen_proteolysis.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_proteolysis.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced sequence is expected to be the amino acid sequence of the expressed product after the proteolysis step. |
|
_entity_src_gen_proteolysis.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_proteolysis.date |
The date of this production step. |
|
_entity_src_gen_proteolysis.details |
Details of this tag removal step. |
|
_entity_src_gen_proteolysis.protease |
The name of the protease used for cleavage. |
|
_entity_src_gen_proteolysis.protein_protease_ratio |
The molar ratio of protein to protease used for the cleavage (mol protein / mol protease). |
|
_entity_src_gen_proteolysis.cleavage_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the cleavage was performed. |
|
_entity_src_gen_proteolysis.cleavage_temperature |
The temperature in degrees Celsius at which the cleavage was performed. |
|
_entity_src_gen_proteolysis.cleavage_time |
The time in minutes for the cleavage reaction. |
This category contains details for the fractionation steps used in the overall protein production process. Examples of fractionation steps are centrifugation and magnetic bead pull-down purification.
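The definition of _entity_src_gen_proteolysis.protein_protease_ratio (mol protein / mol protease) can be illustrated with a short calculation. This is only a sketch: the function name, the choice of milligrammes and daltons as input units, and the example numbers are assumptions made for illustration and are not part of the dictionary.

```python
def protein_protease_ratio(protein_mg, protein_mw_da, protease_mg, protease_mw_da):
    """Return mol protein / mol protease from masses (mg) and molecular weights (Da)."""
    values = (protein_mg, protein_mw_da, protease_mg, protease_mw_da)
    if min(values) <= 0:
        raise ValueError("masses and molecular weights must be positive")
    # Convert mg to g, then divide by g/mol to obtain moles.
    protein_mol = protein_mg / 1000.0 / protein_mw_da
    protease_mol = protease_mg / 1000.0 / protease_mw_da
    return protein_mol / protease_mol


# Hypothetical example: 10 mg of a 50 kDa target cleaved with 0.1 mg of a 25 kDa protease.
ratio = protein_protease_ratio(10.0, 50_000.0, 0.1, 25_000.0)
```

With these hypothetical numbers the molar ratio is about 50, which would be the value recorded for the item.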
|
Item name |
Description |
|
_entity_src_gen_fract.entry_id |
The value of _entity_src_gen_fract.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_fract.entity_id |
The value of _entity_src_gen_fract.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_fract.step_id |
This item is the unique identifier for this fractionation step. |
|
_entity_src_gen_fract.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_fract.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced sequence is expected to be the amino acid sequence of the expressed product after the fractionation step. |
|
_entity_src_gen_fract.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_fract.date |
The date of this production step. |
|
_entity_src_gen_fract.method |
This item describes the method of fractionation. |
|
_entity_src_gen_fract.temperature |
The temperature in degrees Celsius at which the fractionation was performed. |
|
_entity_src_gen_fract.details |
String value containing details of the fractionation. |
|
_entity_src_gen_fract.protein_location |
The fraction containing the protein of interest. |
|
_entity_src_gen_fract.protein_volume |
The volume of the fraction containing the protein. |
|
_entity_src_gen_fract.protein_yield |
The yield in milligrammes of protein from the fractionation. |
|
_entity_src_gen_fract.protein_yield_method |
The method used to determine the yield. |
This category contains details for the chromatographic steps used in the purification of the protein.
|
Item name |
Description |
|
_entity_src_gen_chrom.entry_id |
The value of _entity_src_gen_chrom.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_chrom.entity_id |
The value of _entity_src_gen_chrom.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_chrom.step_id |
This item is the unique identifier for this chromatography step. |
|
_entity_src_gen_chrom.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_chrom.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced sequence is expected to be the amino acid sequence of the expressed product after the chromatography step. |
|
_entity_src_gen_chrom.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_chrom.date |
The date of this production step. |
|
_entity_src_gen_chrom.column_type |
The type of column used in this step. |
|
_entity_src_gen_chrom.column_volume |
The volume of the column used in this step. |
|
_entity_src_gen_chrom.column_temperature |
The temperature in degrees Celsius at which this column was run. |
|
_entity_src_gen_chrom.equilibration_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the column was equilibrated. |
|
_entity_src_gen_chrom.flow_rate |
The rate at which the equilibration buffer flowed through the column. |
|
_entity_src_gen_chrom.elution_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that with which the protein was eluted. |
|
_entity_src_gen_chrom.elution_protocol |
Details of the elution protocol. |
|
_entity_src_gen_chrom.sample_prep_details |
Details of the sample preparation prior to running the column. |
|
_entity_src_gen_chrom.sample_volume |
The volume of protein solution run on the column. |
|
_entity_src_gen_chrom.sample_concentration |
The concentration of the protein solution put onto the column. |
|
_entity_src_gen_chrom.sample_conc_method |
The method used to determine the concentration of the protein solution put onto the column. |
|
_entity_src_gen_chrom.volume_pooled_fractions |
The total volume of all the fractions pooled to give the purified protein solution. |
|
_entity_src_gen_chrom.yield_pooled_fractions |
The yield in milligrammes of protein recovered in the pooled fractions. |
|
_entity_src_gen_chrom.yield_method |
The method used to determine the yield. |
|
_entity_src_gen_chrom.post_treatment |
Details of any post-chromatographic treatment of the protein sample. |
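The step_id and next_step_id items in the step categories above link individual steps into a workflow that may have several entry points converging on one product. A minimal sketch of how such links could be followed is given here. The row layout, the example step identifiers, the assumption that step identifiers are shared across categories for one entity, and the use of None to mark the last step are all illustrative assumptions rather than part of the dictionary.

```python
# Hypothetical rows: (category, step_id, next_step_id). None marks the final step.
steps = [
    ("entity_src_gen_fract", "1", "3"),
    ("entity_src_gen_refold", "2", "3"),
    ("entity_src_gen_chrom", "3", "4"),
    ("entity_src_gen_proteolysis", "4", None),
]


def entry_points(rows):
    """Return step_ids that no other step points to, i.e. workflow entry points."""
    targets = {nxt for _, _, nxt in rows if nxt is not None}
    return [step_id for _, step_id, _ in rows if step_id not in targets]


def path_from(rows, start):
    """Follow next_step_id links from `start` until the final step is reached."""
    next_of = {step_id: nxt for _, step_id, nxt in rows}
    path, seen = [], set()
    current = start
    while current is not None:
        if current not in next_of:
            raise KeyError(f"unknown step_id {current!r}")
        if current in seen:
            raise ValueError(f"cycle detected at step_id {current!r}")
        seen.add(current)
        path.append(current)
        current = next_of[current]
    return path
```

In this hypothetical data set, steps 1 and 2 are both entry points, and both paths pass through the chromatography step 3 before ending at the proteolysis step 4.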
This category contains details for the final purified protein product. Note that this category does not contain the amino acid sequence of the protein. The sequence will be found in the ENTITY_POLY_SEQ entry with matching entity_id. Only one ENTITY_SRC_GEN_PURE category is allowed per entity, hence there is no step_id for this category.
|
Item name |
Description |
|
_entity_src_gen_pure.entry_id |
The value of _entity_src_gen_pure.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_pure.entity_id |
The value of _entity_src_gen_pure.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_pure.step_id |
This item is the unique identifier for the production step. |
|
_entity_src_gen_pure.product_id |
When present, this item should be a globally unique identifier that identifies the final product. It is envisaged that this should be the same as any product code associated with the sample and would provide the key by which information about the production process may be extracted from the protein production facility. For files describing the protein production process (i.e. where _entity.type is 'P' or 'E') this should have the same value as _entry.id. |
|
_entity_src_gen_pure.date |
The date of this production step. |
|
_entity_src_gen_pure.conc_device_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_pure.conc_details |
Details of the protein concentration procedure. |
|
_entity_src_gen_pure.conc_assay_method |
The method used to measure the protein concentration. |
|
_entity_src_gen_pure.protein_concentration |
The final concentration of the protein. |
|
_entity_src_gen_pure.protein_yield |
The yield of protein in milligrammes. |
|
_entity_src_gen_pure.protein_purity |
The purity of the protein. |
|
_entity_src_gen_pure.protein_oligomeric_state |
The oligomeric state of the protein. Monomeric is 1, dimeric 2, etc. |
|
_entity_src_gen_pure.storage_buffer_id |
This item is a pointer to pdbx_buffer.id in the PDBX_BUFFER category. The referenced buffer is that in which the protein was stored. |
|
_entity_src_gen_pure.storage_temperature |
The temperature in degrees Celsius at which the protein was stored. |
This category contains details of protein characterisation. It refers to the characterisation of the product of a specific step.
|
Item name |
Description |
|
_entity_src_gen_character.entry_id |
The value of _entity_src_gen_character.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_character.entity_id |
The value of _entity_src_gen_character.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_character.step_id |
This item is the unique identifier for the step whose product has been characterised. |
|
_entity_src_gen_character.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. |
|
_entity_src_gen_character.date |
The date of this characterisation step. |
|
_entity_src_gen_character.method |
The method used for protein characterisation. |
|
_entity_src_gen_character.result |
The result from this method of protein characterisation. |
|
_entity_src_gen_character.details |
Any details associated with this method of protein characterisation. |
This category contains details for process steps that are not explicitly catered for elsewhere. It provides some basic details as well as placeholders for a list of parameters and values (the category ENTITY_SRC_GEN_PROD_OTHER_PARAMETER). Note that processes that have been modelled explicitly should not be represented using this category.
|
Item name |
Description |
|
_entity_src_gen_prod_other.entry_id |
The value of _entity_src_gen_prod_other.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_entity_src_gen_prod_other.entity_id |
The value of _entity_src_gen_prod_other.entity_id uniquely identifies each protein contained in the project target protein complex whose structure is to be determined. This data item is a pointer to _entity.id in the ENTITY category. This item may be a site dependent bar code. |
|
_entity_src_gen_prod_other.step_id |
This item is the unique identifier for this process step. |
|
_entity_src_gen_prod_other.next_step_id |
This item is the unique identifier for the next production step. This allows a workflow to have multiple entry points leading to a single product. |
|
_entity_src_gen_prod_other.end_construct_id |
This item is a pointer to pdbx_construct.id in the PDBX_CONSTRUCT category. The referenced nucleic acid sequence is that of the product of the process step. |
|
_entity_src_gen_prod_other.robot_id |
This data item is a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category. The referenced robot is the robot responsible for the process step. |
|
_entity_src_gen_prod_other.date |
The date of this process step. |
|
_entity_src_gen_prod_other.process_name |
Name of this process step. |
|
_entity_src_gen_prod_other.details |
Additional details of this process step. |
This category contains parameters and values required to capture information about a particular process step.
|
Item name |
Description |
|
_entity_src_gen_prod_other_parameter.entry_id |
The value of _entity_src_gen_prod_other_parameter.entry_id is a pointer to _entity_src_gen_prod_other.entry_id. |
|
_entity_src_gen_prod_other_parameter.entity_id |
The value of _entity_src_gen_prod_other_parameter.entity_id is a pointer to _entity_src_gen_prod_other.entity_id |
|
_entity_src_gen_prod_other_parameter.step_id |
This item is a pointer to _entity_src_gen_prod_other.step_id |
|
_entity_src_gen_prod_other_parameter.parameter |
The name of the parameter associated with the process step |
|
_entity_src_gen_prod_other_parameter.value |
The value of the parameter |
|
_entity_src_gen_prod_other_parameter.details |
Additional details about the parameter |
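The relationship between ENTITY_SRC_GEN_PROD_OTHER and its parameter list can be pictured as a join on entry_id, entity_id and step_id. The following sketch shows one way of attaching the parameter and value pairs to their parent step. The dictionary-based row representation, the "dialysis" step and the parameter names are hypothetical examples, not values defined by the dictionary.

```python
from collections import defaultdict

# Hypothetical rows from ENTITY_SRC_GEN_PROD_OTHER and its parameter category.
other_steps = [
    {"entry_id": "E1", "entity_id": "1", "step_id": "5", "process_name": "dialysis"},
]
parameters = [
    {"entry_id": "E1", "entity_id": "1", "step_id": "5",
     "parameter": "membrane_cutoff_kDa", "value": "10"},
    {"entry_id": "E1", "entity_id": "1", "step_id": "5",
     "parameter": "duration_hours", "value": "16"},
]


def attach_parameters(steps, params):
    """Return copies of the step rows with their parameter/value pairs attached."""
    by_key = defaultdict(dict)
    for p in params:
        key = (p["entry_id"], p["entity_id"], p["step_id"])
        by_key[key][p["parameter"]] = p["value"]
    result = []
    for s in steps:
        key = (s["entry_id"], s["entity_id"], s["step_id"])
        result.append({**s, "parameters": dict(by_key.get(key, {}))})
    return result


joined = attach_parameters(other_steps, parameters)
```

Values are kept as strings here because the dictionary does not constrain the type of _entity_src_gen_prod_other_parameter.value in the text above.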
Data items in the PDBX_BUFFER category record details of the sample buffer.
|
Item name |
Description |
|
_pdbx_buffer.id |
The value of _pdbx_buffer.id must uniquely identify the sample buffer. |
|
_pdbx_buffer.name |
The name of each buffer. |
|
_pdbx_buffer.details |
Any additional details to do with buffer. |
Data items in the PDBX_BUFFER_COMPONENTS category record the constituents of the sample buffer.
|
Item name |
Description |
|
_pdbx_buffer_components.id |
The value of _pdbx_buffer_components.id must uniquely identify a component of the buffer. |
|
_pdbx_buffer_components.buffer_id |
This data item is a pointer to _pdbx_buffer.id in the PDBX_BUFFER category. |
|
_pdbx_buffer_components.name |
The name of each buffer component. |
|
_pdbx_buffer_components.volume |
The volume of buffer component. |
|
_pdbx_buffer_components.conc |
The millimolar concentration of buffer component. |
|
_pdbx_buffer_components.details |
Any additional details to do with buffer composition. |
|
_pdbx_buffer_components.conc_units |
The concentration units of the component. |
|
_pdbx_buffer_components.isotopic_labeling |
The isotopic composition of each component, including the % labeling level, if known. For example: 1. Uniform (random) labeling with 15N: U-15N 2. Uniform (random) labeling with 13C, 15N at known labeling levels: U-95% 13C;U-98% 15N 3. Residue selective labeling: U-95% 15N-Thymine 4. Site specific labeling: 95% 13C-Ala18 5. Natural abundance labeling in an otherwise uniformly labeled biomolecule is designated by NA: U-13C; NA-K,H |
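Several step categories refer to a buffer through a *_buffer_id pointer, and the composition of that buffer is held in the component rows that share the same buffer_id. A small sketch of resolving such a pointer into a readable recipe is shown here. The row representation, the buffer name and the component values are hypothetical and serve only to illustrate the pointer relationship.

```python
# Hypothetical rows from PDBX_BUFFER and the buffer components category.
buffers = [{"id": "B1", "name": "lysis buffer"}]
components = [
    {"id": "1", "buffer_id": "B1", "name": "Tris-HCl", "conc": "50", "conc_units": "mM"},
    {"id": "2", "buffer_id": "B1", "name": "NaCl", "conc": "300", "conc_units": "mM"},
]


def describe_buffer(buffer_id, buffer_rows, component_rows):
    """Resolve a buffer_id pointer into the buffer name plus its component list."""
    names = {b["id"]: b["name"] for b in buffer_rows}
    if buffer_id not in names:
        raise KeyError(f"no buffer with id {buffer_id!r}")
    parts = [
        f'{c["conc"]} {c["conc_units"]} {c["name"]}'
        for c in component_rows
        if c["buffer_id"] == buffer_id
    ]
    return f"{names[buffer_id]}: " + ", ".join(parts)


recipe = describe_buffer("B1", buffers, components)
```

The same lookup would apply to any pointer item such as _entity_src_gen_chrom.elution_buffer_id or _entity_src_gen_pure.storage_buffer_id.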
Data items in the PDBX_CONSTRUCT category specify a sequence of nucleic acids or amino acids. It is a catch-all that may be used to provide details of sequences known to be relevant to the project as well as primers, plasmids, proteins and such like that are either used or produced during the protein production process. Molecules described here are not necessarily complete, so for instance it would be possible to include either a complete plasmid or just its insert. This category may be considered as an abbreviated form of _entity where the molecules described are not required to appear in the final co-ordinates. Note that the details provided here all pertain to a single entry as defined at deposition. It is anticipated that _pdbx_construct.id would also be composed of a sequence that is unique within a given site prefixed by a code that identifies that site and would, therefore, be GLOBALLY unique. Thus this category could also be used locally to store details about the different constructs used during protein production without reference to the entry_id (which only becomes a meaningful concept during deposition).
|
Item name |
Description |
|
_pdbx_construct.entry_id |
The value of _pdbx_construct.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_pdbx_construct.id |
The value of _pdbx_construct.id must uniquely identify a record in the PDBX_CONSTRUCT list and should be arranged so that it is composed of a site-specific prefix combined with a value that is unique within a given site. Note that this item need not be a number; it can be any unique identifier. |
|
_pdbx_construct.name |
_pdbx_construct.name provides a placeholder for the local name of the construct, for example the plasmid name if this category is used to list plasmids. |
|
_pdbx_construct.organisation |
_pdbx_construct.organisation describes the organisation in which the _pdbx_construct.id is unique. This will normally be the lab in which the construct originated. It is envisaged that this item will permit a globally unique identifier to be constructed in cases where this is not possible from the _pdbx_construct.id alone. |
|
_pdbx_construct.entity_id |
In cases where the construct IS found in the co-ordinates then this item provides a pointer to _entity.id in the ENTITY category for the corresponding molecule. |
|
_pdbx_construct.robot_id |
In cases where the sequence has been determined by a robot, this data item provides a pointer to pdbx_robot_system.id in the PDBX_ROBOT_SYSTEM category for the robot responsible. |
|
_pdbx_construct.date |
The date that the sequence was determined. |
|
_pdbx_construct.details |
Additional details about the construct that cannot be represented in the category _pdbx_construct_feature. |
|
_pdbx_construct.class |
The primary function of the construct. This should be considered as a guideline only. |
|
_pdbx_construct.type |
The type of nucleic acid sequence in the construct. Note that to find all the DNA molecules it is necessary to search for DNA + cDNA and for RNA, RNA + mRNA + tRNA. |
|
_pdbx_construct.seq |
The sequence expressed as a string of one-letter base codes or one-letter amino acid codes. Unusual residues may be represented either using the appropriate one-letter wild card codes or by the three-letter code in parentheses. |
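The convention for _pdbx_construct.seq, one-letter codes with unusual residues optionally given as a three-letter code in parentheses, can be parsed into individual residues as sketched here. The function name is hypothetical, and the assumptions that a parenthesised code is exactly three alphanumeric characters and that whitespace in the string is insignificant are illustrative choices not stated in the dictionary.

```python
import re

# Either "(XXX)" with exactly three alphanumeric characters, or one letter.
_TOKEN = re.compile(r"\(([A-Za-z0-9]{3})\)|([A-Za-z])")


def tokenize_sequence(seq):
    """Split a construct sequence into one-letter and three-letter residue codes."""
    compact = "".join(seq.split())  # assumption: whitespace is not significant
    tokens = []
    pos = 0
    while pos < len(compact):
        match = _TOKEN.match(compact, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos + 1}")
        tokens.append(match.group(1) or match.group(2))
        pos = match.end()
    return tokens


# Hypothetical sequence with selenomethionine written as (MSE).
residues = tokenize_sequence("MK(MSE)LV")
```

Counting the resulting tokens rather than the characters of the raw string gives the residue count when parenthesised codes are present.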
Data items in the PDBX_CONSTRUCT_FEATURE category may be used to specify various properties of a nucleic acid sequence used during protein production.
|
Item name |
Description |
|
_pdbx_construct_feature.id |
The value of _pdbx_construct_feature.id must uniquely identify a record in the PDBX_CONSTRUCT_FEATURE list. Note that this item need not be a number; it can be any unique identifier. |
|
_pdbx_construct_feature.construct_id |
The value of _pdbx_construct_feature.construct_id uniquely identifies the construct with which the feature is associated. This is a pointer to _pdbx_construct.id. This item may be a site dependent bar code. |
|
_pdbx_construct_feature.entry_id |
The value of _pdbx_construct_feature.entry_id uniquely identifies a sample consisting of one or more proteins whose structure is to be determined. This is a pointer to _entry.id. This item may be a site dependent bar code. |
|
_pdbx_construct_feature.start_seq |
The sequence position at which the feature begins |
|
_pdbx_construct_feature.end_seq |
The sequence position at which the feature ends |
|
_pdbx_construct_feature.type |
The type of the feature |
|
_pdbx_construct_feature.details |
Details that describe the feature |
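A feature is located on its parent construct by _pdbx_construct_feature.start_seq and _pdbx_construct_feature.end_seq. The sketch below extracts the corresponding part of a sequence. It assumes that positions are 1-based and inclusive and that the sequence uses plain one-letter codes only. Neither assumption is stated in the dictionary, and the example sequence is hypothetical.

```python
def feature_subsequence(seq, start_seq, end_seq):
    """Return the part of `seq` covered by a feature (assumed 1-based, inclusive)."""
    if not (1 <= start_seq <= end_seq <= len(seq)):
        raise ValueError("feature positions fall outside the construct sequence")
    return seq[start_seq - 1:end_seq]


# Hypothetical insert: start codon, six histidine codons, one glycine codon.
construct_seq = "ATGCATCACCATCACCATCACGGT"
his_tag = feature_subsequence(construct_seq, 4, 21)
```

If a site numbers positions from zero or treats the end position as exclusive, the slice bounds would need to be adjusted accordingly.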
The details about each robotic system used to collect data for this project.
|
Item name |
Description |
|
_pdbx_robot_system.id |
Assign a numerical ID to each instrument. |
|
_pdbx_robot_system.model |
The model of the robotic system. |
|
_pdbx_robot_system.type |
The type of robotic system used in the production pathway. |
|
_pdbx_robot_system.manufacturer |
The name of the manufacturer of the robotic system. |
EXPTL_CRYSTAL, EXPTL_CRYSTAL_GROW, EXPTL_CRYSTAL_GROW_COMP, CELL and SYMMETRY
The details about crystallization and the characterization of any produced crystals.
|
Dictionary Item Name |
Description |
|
_exptl_crystal_grow.method |
Crystallization method |
|
_exptl_crystal_grow.apparatus |
Apparatus |
|
_exptl_crystal_grow.temp _exptl_crystal_grow.temp_details |
Temperature |
|
_exptl_crystal_grow.pH _exptl_crystal_grow.pdbx_pH_range |
pH |
|
Tabulated in mmCIF category exptl_crystal_grow_comp |
Crystallization solutions compositions |
|
_exptl_crystal.preparation |
Additional treatments (e.g. soaking, time in drop, annealing, cryoprotectant, etc.) |
|
_exptl_crystal.pdbx_crystal_image_url _exptl_crystal.pdbx_crystal_image_format |
Image of crystal |
|
_exptl_crystal.size_* |
Crystal size |
|
_cell.length_a _cell.length_b _cell.length_c _cell.angle_alpha _cell.angle_beta _cell.angle_gamma |
Cell constants |
|
_cell.length_a_esd _cell.length_b_esd _cell.length_c_esd _cell.angle_alpha_esd _cell.angle_beta_esd _cell.angle_gamma_esd |
ESD of Cell constants |
|
_symmetry.space_group_name_H-M |
Space Group |
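As a consistency check on the six cell constants listed above, the unit cell volume can be computed with the standard triclinic formula V = abc * sqrt(1 - cos²α - cos²β - cos²γ + 2 cosα cosβ cosγ). This is a minimal sketch. The function name is hypothetical, and it assumes lengths in ångströms and angles in degrees.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Return the unit cell volume from cell lengths and angles (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    term = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if term <= 0.0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(term)


# Hypothetical cubic cell with 10 Å edges.
volume = unit_cell_volume(10.0, 10.0, 10.0, 90.0, 90.0, 90.0)
```

For the hypothetical cubic cell the result is close to 1000 cubic ångströms, subject to floating-point rounding.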