RCSB PDB Protein Data Bank A Member of the wwPDB
An Information Portal to Biological Macromolecular Structures

Item _array_data.data


Description

             The value of '_array_data.data' contains the array data
              encapsulated in a STAR string.

              The representation used is a variant on the
              Multipurpose Internet Mail Extensions (MIME) specified
              in RFC 2045-2049 by N. Freed et al.  The boundary
              delimiter used in writing an imgCIF or CBF is
              "--CIF-BINARY-FORMAT-SECTION--" (including the
              required initial "--").

              The Content-Type may be any of the discrete types permitted
              in RFC 2045; "application/octet-stream" is recommended.
              If an octet stream was compressed, the compression should
              be specified by the parameter 'conversions="x-CBF_PACKED"'
              or the parameter 'conversions="x-CBF_CANONICAL"'.

              The Content-Transfer-Encoding may be "BASE64",
              "Quoted-Printable", "X-BASE8", "X-BASE10", or
              "X-BASE16" for an imgCIF or "BINARY" for a CBF.  The
              octal, decimal and hexadecimal transfer encodings are
              for convenience in debugging, and are not recommended
              for archiving and data interchange.

              In an imgCIF file, the encoded binary data begins after
              the empty line terminating the header.  In a CBF, the
              raw binary data begins after an empty line terminating
              the header and after the sequence:

              Octet   Hex   Decimal  Purpose
                0     0C       12    (Ctrl-L) Page break
                1     1A       26    (Ctrl-Z) Stop listings in MS-DOS
                2     04        4    (Ctrl-D) Stop listings in UNIX
                3     D5      213    Binary section begins

              None of these octets are included in the calculation of
              the message size, nor in the calculation of the
              message digest.
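
              As an illustration, the start of the raw data in a CBF
              might be located as in the Python sketch below.  It assumes
              the whole section (header plus data) is already held in
              memory as bytes; the names BINARY_START and
              binary_data_offset are hypothetical and do not come from
              any CBF library.

```python
# The four-octet sequence that precedes the raw binary data in a CBF:
# Ctrl-L, Ctrl-Z, Ctrl-D, then 0xD5 ("binary section begins").
BINARY_START = bytes([0x0C, 0x1A, 0x04, 0xD5])


def binary_data_offset(section: bytes) -> int:
    """Return the offset of the first raw data octet, or -1 if the
    start sequence is not present.

    The header is ASCII text, so the first occurrence of the sequence
    (which ends in the non-ASCII octet 0xD5) is taken to be the marker.
    """
    marker_at = section.find(BINARY_START)
    if marker_at < 0:
        return -1
    # The data starts immediately after the four marker octets, which
    # are counted neither in the size nor in the message digest.
    return marker_at + len(BINARY_START)
```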

              The X-Binary-Size header specifies the size of the
              equivalent binary data in octets.  If compression was
              used, this size is the size after compression, including
              any book-keeping fields.  An adjustment is made for
              the deprecated binary formats in which 8 bytes of binary
              header are used for the compression type.  In that case,
              the 8 bytes used for the compression type are subtracted
              from the size, so that the same size is reported as when
              the compression type is supplied in the MIME header.
              Use of the MIME header is the recommended way to
              supply the compression type.  In general, no portion of
              the binary header is included in the calculation of the size.

              The X-Binary-Element-Type header specifies the type of
              binary data in the octets, using the same descriptive
              phrases as in '_array_structure.encoding_type'.  The default
              value is "unsigned 32-bit integer".

              An MD5 message digest may, optionally, be used. The "RSA Data
              Security, Inc. MD5 Message-Digest Algorithm" should be used.
              No portion of the header is included in the calculation of the
              message digest.
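
              A minimal Python sketch of the digest calculation follows.
              It assumes the digest is carried base64-encoded, as in the
              MIME Content-MD5 header; this description does not itself
              say how the digest is written, so that encoding and the
              function name content_md5 are assumptions.

```python
import base64
import hashlib


def content_md5(binary_data: bytes) -> str:
    """Return the MD5 digest of the binary data, base64-encoded.

    Only the data octets are passed in: no part of the header and none
    of the four octets of the start sequence take part in the digest.
    """
    digest = hashlib.md5(binary_data).digest()  # 16 raw digest octets
    return base64.b64encode(digest).decode("ascii")
```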

              If the Transfer Encoding is "X-BASE8", "X-BASE10", or
              "X-BASE16", the data is presented as octal, decimal or
              hexadecimal data organized into lines or words.  Each word
              is created by composing octets of data in fixed groups of
              2, 3, 4, 6 or 8 octets, either in the order ...4321
              ("big-endian") or 1234... ("little-endian").  If there are fewer
              than the specified number of octets to fill the last word,
              then the missing octets are presented as "==" for each
              missing octet.  Exactly two equal signs are used for each
              missing octet even for octal and decimal encoding.
              The format of lines is:

              rnd xxxxxx xxxxxx xxxxxx

              where r is "H", "O", or "D" for hexadecimal, octal or
              decimal, n is the number of octets per word, and d is "<"
              or ">" for the "...4321" and "1234..." octet orderings
              respectively.  The "==" padding for the last word should
              be on the appropriate side to correspond to the missing
              octets, e.g.

              H4< FFFFFFFF FFFFFFFF 07FFFFFF ====0000

              or

              H3> FF0700 00====

              For these hexadecimal, octal and decimal formats only,
              comments beginning with "#" are permitted to improve
              readability.
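
              The line format can be sketched in Python for the
              hexadecimal case, as below.  The sketch is limited to
              "X-BASE16" because the digit layout of octal and decimal
              words is not spelled out here; the function name
              xbase16_line is hypothetical, and all words are placed
              on a single line for simplicity.

```python
def xbase16_line(data: bytes, octets_per_word: int, order: str) -> str:
    """Format octets as one X-BASE16 line: 'Hn<' or 'Hn>' then the words.

    order '<' writes each word in the ...4321 octet order (last octet
    first); order '>' writes it in the 1234... order.
    """
    if octets_per_word not in (2, 3, 4, 6, 8):
        raise ValueError("octets per word must be 2, 3, 4, 6 or 8")
    if order not in ("<", ">"):
        raise ValueError("order must be '<' or '>'")

    words = []
    for start in range(0, len(data), octets_per_word):
        group = data[start:start + octets_per_word]
        tokens = [f"{octet:02X}" for octet in group]
        # Exactly two equal signs stand in for each missing octet.
        tokens += ["=="] * (octets_per_word - len(group))
        if order == "<":
            # Reversing the word also moves the padding to the left.
            tokens.reverse()
        words.append("".join(tokens))
    return f"H{octets_per_word}{order} " + " ".join(words)
```

              Called with the 14 octets FF FF FF FF FF FF FF FF FF FF FF
              07 00 00, a word size of 4 and order "<", this reproduces
              the first example line above; called with FF 07 00 00, a
              word size of 3 and order ">", it reproduces the second.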

              BASE64 encoding follows MIME conventions.  Octets are
              in groups of three, c1, c2, c3.  The resulting 24 bits
              are broken into four 6-bit quantities, starting with
              the high-order six bits (c1 >> 2) of the first octet, then
              the low-order two bits of the first octet followed by the
              high-order 4 bits of the second octet ((c1 & 3)<<4 | (c2>>4)),
              then the bottom 4 bits of the second octet followed by the
              high-order two bits of the last octet ((c2 & 15)<<2 | (c3>>6)),
              then the bottom six bits of the last octet (c3 & 63).  Each
              of these four quantities is translated into an ASCII character
              using the mapping:

                        1         2         3         4         5         6
              0123456789012345678901234567890123456789012345678901234567890123
              |         |         |         |         |         |         |
              ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

              Short groups of octets are padded on the right with one "="
              if c3 is missing, and with "==" if both c2 and c3 are missing.
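
              The bit manipulation described above can be sketched in
              Python as follows.  The function names are illustrative,
              and the sketch emits one unbroken string, leaving out the
              line breaks that MIME places in long BASE64 output.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(group: bytes) -> str:
    """Encode one group of 1 to 3 octets as four BASE64 characters."""
    if not 1 <= len(group) <= 3:
        raise ValueError("a group holds 1, 2 or 3 octets")
    c1 = group[0]
    c2 = group[1] if len(group) > 1 else 0  # missing octets count as 0
    c3 = group[2] if len(group) > 2 else 0
    quantities = [
        c1 >> 2,                      # high-order six bits of c1
        (c1 & 3) << 4 | (c2 >> 4),    # low 2 bits of c1, high 4 of c2
        (c2 & 15) << 2 | (c3 >> 6),   # low 4 bits of c2, high 2 of c3
        c3 & 63,                      # low six bits of c3
    ]
    chars = [ALPHABET[q] for q in quantities]
    if len(group) == 1:
        chars[2:] = ["=", "="]        # c2 and c3 both missing
    elif len(group) == 2:
        chars[3] = "="                # only c3 missing
    return "".join(chars)


def base64_encode(data: bytes) -> str:
    """Encode all octets, three at a time, without line breaks."""
    return "".join(encode_group(data[i:i + 3]) for i in range(0, len(data), 3))
```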

              QUOTED-PRINTABLE encoding also follows MIME conventions, copying
              octets without translation if their ASCII values are 32..38,
              42, 48..57, 59..60, 62, 64..126 and the octet is not a ";"
              in column 1.  All other characters are translated to =nn, where
              nn is the hexadecimal encoding of the octet.  All lines are
              "wrapped" with a terminating "=" (i.e. the MIME conventions
              for an implicit line terminator are never used).
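
              A Python sketch of these rules is given below.  The
              maximum line length is not stated in this description, so
              the default of 76 characters (the usual MIME limit) is an
              assumption, as are the names PASSTHROUGH and qp_encode.

```python
# Octets copied without translation: 32..38, 42, 48..57, 59..60, 62, 64..126.
PASSTHROUGH = (
    set(range(32, 39)) | {42} | set(range(48, 58)) | {59, 60, 62}
    | set(range(64, 127))
)


def qp_encode(data: bytes, max_line: int = 76) -> str:
    """Encode octets as quoted-printable lines, each ending in '='."""
    lines = []
    current = ""
    for octet in data:
        at_column_1 = current == ""
        if octet in PASSTHROUGH and not (octet == 59 and at_column_1):
            token = chr(octet)
        else:
            token = f"={octet:02X}"
        # Keep one character free for the terminating "=".
        if len(current) + len(token) > max_line - 1:
            lines.append(current + "=")
            current = ""
            if octet == 59:
                token = "=3B"  # ";" has now landed in column 1
        current += token
    lines.append(current + "=")  # the last line is terminated as well
    return "\n".join(lines)
```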

Category

array_data

Mandatory Code

yes

Data Type Code

binary

 
