Grammar, Syntax, and Vocabulary of the IGBA Card-Image File
Contents
INTRODUCTION
ANALYSIS DATA FILE
1. The Identification Field
2. Record Preface Images
2.1. The Record Title Card
2.2. The Record Reference and Location Card
3. Specimen Card 'A': the name and unit card
4. Specimen Card 'B': the essential oxide card
5. Other Specimen Cards: 'C', 'D', . . ., 'Z'
6. Encoding Rules for Optional Information
6.1. Punctuation
6.2. The Status List
6.3. The Trace-Element List
6.4. The Age List
6.4.1. The stratigraphic age field and its subfields
6.4.2. The physical age field and its subfields
6.5. The Petrographic Descriptor List
6.6. The Mineral Assemblage List
6.7. The Additional Information List
THE BIBLIOGRAPHY DATA FILE
INTRODUCTION
During the development of IGCP Projects 163 and 239 and the
first stage of this Subcommission, a special structure was
prepared for the successive IGBADAT versions.
The data base was built in two steps. Chayes (1985) uses the
term "data capture" for the (non-electronic) movement of data
from publications to project coding sheets and "data transfer"
for the movement of data from coding sheets to machine-readable
form. The coding sheet has been described in various papers and
internal notes, for instance Chayes (1985).
IGBADAT is composed of two files: the first holds the analysis
data and the second the bibliography data. A five-digit reference
number links them.
ANALYSIS DATA FILE
The Analysis Data File consists of physical records 80 characters
long, referred to in this note as "card image", "card" or
"image". The first six characters of each image are used for
identification, the remaining 74 for text.
The data file is divided into logical records, or groups. The
first two card images of a logical record, the record preface,
contain information common to the group.
The other cards in the record contain information about
individual specimens. Each specimen is described on 3 or more
card images: the first two have fixed formats, the rest have
variable formats. The information in variable-format cards is
always sequential, consisting of the following parts: Status List,
Trace Element List, Age List, Petrographic Descriptors, Mineral
Assemblage, Additional Information (see section 6).
The separation of logical records, analyses and parts of
analyses is made in the identification field.
1. The Identification Field
The identification field, the first six characters of each image
in a data file, is divided into three subfields:
Column(s)   Contents of subfield
1-3         The record identifier; an alphabetic symbol of one to
            three letters, right justified. All cards of a
            particular logical record contain the same record
            identifier. (Letters only. No digits.)
4-5         The specimen identifier; an alphabetic symbol of one or
            two letters, right justified. All cards of a
            particular specimen description carry the same specimen
            identifier. (Letters only. No digits.)
6           The card-sequence symbol; a one-character symbol which
            may be either of the numerals 1 or 2, or any letter of
            the alphabet. The within-specimen order of
            card-sequence symbols is the same for all specimens.
The record preface consists of two cards, the first bearing the
card-sequence symbol '1', the second the card-sequence symbol '2'. The
first card of a specimen description contains the sequence symbol 'A',
the second 'B', the third 'C', etc., for as many cards as may be
necessary. Specimen identifiers may occur in any order, but
card-sequence symbols follow a fixed order in every logical record,
viz. 1, 2, A, B, ..., k1, A, B, ..., k2, A, B, ..., k3, etc., where ki
is the terminal card image of the i-th specimen description.
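The split of the identification field can be illustrated with a
short sketch. It is not part of any IGBA software; the names CardId
and parse_identification are hypothetical, and the rule that specimen
cards must carry a non-blank specimen identifier is an assumption
drawn from the description above.

```python
from typing import NamedTuple


class CardId(NamedTuple):
    record: str    # columns 1-3, letters, right justified
    specimen: str  # columns 4-5, letters (blank on preface cards)
    sequence: str  # column 6: '1', '2', or a letter


def parse_identification(card: str) -> CardId:
    """Split the first six columns of a card image into its subfields."""
    card = card.ljust(80)
    record = card[0:3].strip()
    specimen = card[3:5].strip()
    sequence = card[5]
    if not record.isalpha():
        raise ValueError(f"bad record identifier: {card[0:3]!r}")
    if sequence in ("1", "2"):
        # Preface cards leave columns 4-5 blank.
        if specimen:
            raise ValueError("preface card with a specimen identifier")
    elif sequence.isalpha():
        # Assumption: every specimen card names its specimen.
        if not specimen.isalpha():
            raise ValueError(f"bad specimen identifier: {card[3:5]!r}")
    else:
        raise ValueError(f"bad card-sequence symbol: {sequence!r}")
    return CardId(record, specimen, sequence)
```

For instance, parse_identification("ABC XA") yields record 'ABC',
specimen 'X' and sequence symbol 'A'.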
2. Record Preface Images
The record preface consists of a title card followed by a
reference and location card. Definitions and positions of the
variables they contain are as follows:
2.1 The Record Title Card
Column(s)   Variable   Definition
1-3         RS         Record identifier
4-5                    Blank
6           '1'        Card-sequence symbol
7-80        TITL       Up to 74 characters, free field
2.2 The Record Reference and Location Card
Column(s)   Variable   Definition
1-3         RS         Record identifier
4-5                    Blank
6           '2'        Card-sequence symbol
7-10                   Not used
11-13       GLAT       Latitude, to nearest degree north of
                       northernmost specimen locality
14          GLA        Either 'N' or 'S'
15-17       GLON       Longitude, to nearest degree east of
                       easternmost specimen locality
18          GLO        Either 'E' or 'W'
19-30       KTRB       Contributor's surname, initial(s)
31-35       NREF(1)    Index no. of 1st source reference listed
                       on group title sheet
36-40       NREF(2)    Index no. of 2nd reference
  .           .        (Use only as many as needed. Leave the
  .           .        rest blank.)
76-80       NREF(10)   Index no. of 10th reference
'GLAT' and 'GLON' are defined initially for records in which no
descriptions are of specimens distant from each other by more than 5°
of either latitude or longitude. Occasional references contain new
data from widely divergent sites. If the sites are clustered, a record
is created for each cluster. If there is no clustering of sites within
5° limits, 'GLAT' is set to 90°N and 'GLON' to 0°W. (It is possible,
though no examples have so far been encountered, that some of the sites
of specimens described in a source reference are clustered and others
are not. In such a case, a record should be created for each cluster,
and another, with 'GLAT' = 90°N, 'GLON' = 0°W, for the unclustered
material.)
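The column layout of the reference and location card lends itself
to fixed-width slicing. The sketch below is illustrative only; the
function name parse_reference_card and the dictionary keys are
hypothetical. It assumes the reference vector is filled from the
left, so reading stops at the first blank NREF slot.

```python
def parse_reference_card(card: str) -> dict:
    """Decode card '2' of a record preface by column position."""
    card = card.ljust(80)
    if card[5] != "2":
        raise ValueError("not a reference and location card")
    refs = []
    for i in range(10):
        # NREF(i+1) occupies five columns starting at column 31 + 5*i.
        field = card[30 + 5 * i:35 + 5 * i].strip()
        if not field:
            break  # assumption: unused slots are trailing blanks
        refs.append(int(field))
    return {
        "record": card[0:3].strip(),
        "glat": int(card[10:13]),
        "gla": card[13],
        "glon": int(card[14:17]),
        "glo": card[17],
        "contributor": card[18:30].strip(),
        "nref": refs,
    }
```

The list returned under "nref" keeps the order of the card, so the
n-th reference of the vector is nref[n - 1].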
3. Specimen Card 'A': the name and unit card
The first image of each specimen description, identified by
card-sequence symbol 'A', contains the specimen location, the literal
source name of the rock, and a brief (36-character) identification of
the geologic, stratigraphic, or geographic unit from which it was
collected, as given in the source reference. Information is stored in
image 'A' as shown on the following table:
Column(s)   Variable   Definition
1-3         RS         Record identifier symbol
4-5         IS         Specimen identifier symbol
6           'A'        Card-sequence symbol
7-12        SLAT       1000 x (latitude to nearest decimal part
                       of degree, as available), right justified
13          SLA        'N' or 'S'
14-19       SLON       1000 x (longitude to nearest decimal part
                       of degree, as available), right justified
20          SLO        'E' or 'W'
21-44       LTNA       Name of rock, as given in source reference
45-80       GLUN       Geologic unit from which specimen was
                       collected, as specified in source reference
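A minimal sketch of decoding card 'A' follows. The function name
parse_name_card is hypothetical, and the convention of returning
southern latitudes and western longitudes as negative numbers is an
assumption made here for convenience, not an IGBA rule.

```python
def parse_name_card(card: str) -> dict:
    """Decode specimen card 'A' by column position."""
    card = card.ljust(80)
    if card[5] != "A":
        raise ValueError("not an 'A' card")

    def degrees(field: str, hemisphere: str, negative: str):
        # The stored integer is 1000 x the coordinate in degrees.
        field = field.strip()
        if not field:
            return None
        value = int(field) / 1000
        return -value if hemisphere == negative else value

    return {
        "record": card[0:3].strip(),
        "specimen": card[3:5].strip(),
        "latitude": degrees(card[6:12], card[12], "S"),
        "longitude": degrees(card[13:19], card[19], "W"),
        "rock_name": card[20:44].strip(),
        "unit": card[44:80].strip(),
    }
```

A stored SLAT of 45123 with SLA = 'N' thus decodes to 45.123
degrees, and a stored SLON of 120500 with SLO = 'W' to -120.5.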
4. Specimen Card 'B': the essential oxide card
Image 'B' of a specimen description contains the 'major' or
'essential' oxide analysis of the specimen. The value for each oxide
is multiplied by 100 to eliminate decimals, and stored in columns 11-71
inclusive, in the following sequence:
Column(s)   Variable   Definition
1-3         RS         Record identifier symbol
4-5         IS         Specimen identifier symbol
6           'B'        Card-sequence symbol
7-9         NOREF      Sequence no. of reference in vector NREF
                       of card '2'
10                     Not used
11-14       NWT(1)     SIO2  x 100, right justified
15-18       NWT(2)     TIO2  x 100, right justified
19-22       NWT(3)     AL2O3 x 100, right justified
23-26       NWT(4)     FE2O3 x 100, right justified
27-30       NWT(5)     FEO   x 100, right justified
31-34       NWT(6)     MNO   x 100, right justified
35-38       NWT(7)     MGO   x 100, right justified
39-42       NWT(8)     CAO   x 100, right justified
43-46       NWT(9)     NA2O  x 100, right justified
47-50       NWT(10)    K2O   x 100, right justified
51-54       NWT(11)    P2O5  x 100, right justified
55-58       NWT(12)    CO2   x 100, right justified
59-62       NWT(13)    H2O+  x 100, right justified
63-66       NWT(14)    H2O-  x 100, right justified
67-71       NWT(15)    Author's total
72-76       RKNUM      System no. of rock name, from table 1,
                       right justified
NB: Trailing blanks are retained in the images of the oxide amounts.
For example, 2.10% TIO2 is entered as 210 in columns 16-18 but 2.1%
TIO2 is entered as 21 in columns 16-17.
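One way to read the note on trailing blanks is that the decimal
point is implied two columns from the right edge of each field, and
that a trailing blank stands for an unreported digit. The sketch
below follows that reading; it is an interpretation, not a statement
of how IGBA software decodes the card. The names OXIDES,
decode_amount and parse_oxide_card are hypothetical, and applying the
same scaling to the author's total is likewise an assumption.

```python
OXIDES = ["SIO2", "TIO2", "AL2O3", "FE2O3", "FEO", "MNO", "MGO",
          "CAO", "NA2O", "K2O", "P2O5", "CO2", "H2O+", "H2O-"]


def decode_amount(field: str):
    """Decode a fixed-width amount with two implied decimals."""
    digits = field.rstrip()
    if not digits.strip():
        return None  # blank field: no value reported
    # Trailing blanks are read as zeros so the decimal point stays put.
    return int(digits.ljust(len(field), "0")) / 100


def parse_oxide_card(card: str) -> dict:
    """Decode specimen card 'B' by column position."""
    card = card.ljust(80)
    if card[5] != "B":
        raise ValueError("not a 'B' card")
    amounts = {}
    for i, name in enumerate(OXIDES):
        start = 10 + 4 * i  # NWT(1) begins in column 11
        amounts[name] = decode_amount(card[start:start + 4])
    noref = card[6:9].strip()
    rknum = card[71:76].strip()
    return {
        "noref": int(noref) if noref else None,
        "oxides": amounts,
        "total": decode_amount(card[66:71]),  # assumed scaled like oxides
        "rknum": int(rknum) if rknum else None,
    }
```

Under this reading both ' 210' and ' 21 ' in columns 15-18 decode to
2.1 per cent TIO2; the difference between them records only how many
decimals the source reported.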
5. Other Specimen Description Images: cards 'C', 'D' . . . .
Specimen description cards with sequence symbols > B contain
optional information of variable length. This information is
encoded as a single character string separated into lists, fields
and subfields by punctuation characters. From the identification
field, cols 1-6 of every card, the card processor 'knows' what
record and item it is scanning, and from the sequence of
punctuation characters bounding fields and subfields of the
character string it 'determines' the nature of the information
currently awaiting interpretation. Conventions governing this
operation are described in section 6.
6. Encoding Rules for Optional Information
6.1 Punctuation
In this section the characters used to bound lists, fields,
and subfields are described.
6.1.1 The list separator - The colon separates and identifies
lists. Every list ends with a colon. The sequence of lists is:
1. the status symbol list, see Appendix Table A2
2. the trace element list
3. the geological age list, see Appendix Table A3
4. the petrographic descriptor list, see Appendix Table A4
5. the mineral association list, see Appendix Table A5
6. the additional information list, see Appendix Table A6
The processor keeps track of its position by a count of
colons. Only if it has counted 3 colons, for instance, will it
properly interpret and test a field of petrographic descriptors.
If one of the leading colons is missing, the petrographic
descriptors will be considered geologic age symbols, and will be
rejected as 'unknown'. If two of these colons are missing, the
processor will try to identify the petrographic descriptors as
trace elements, with the same disastrous result. If all
subsequent lists are empty, however, only the terminal colon of
the last non-empty list need be used. If, for instance, only the
petrographic list was used in a particular description, the
sequence ':::PET. DESC-SYMBOLS:' would be sufficient.
All blanks are optional in these lists; only those in list 6
are retained in the base.
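The colon-counting rule can be sketched as follows. The code is
illustrative only; LIST_NAMES and split_lists are hypothetical names,
and the input is assumed to be the text fields of cards 'C', 'D',
etc. of one specimen, already concatenated into a single string.

```python
LIST_NAMES = ["status", "trace_elements", "age",
              "petrographic", "minerals", "additional"]


def split_lists(text: str) -> dict:
    """Assign each colon-terminated list to its slot by position."""
    parts = text.split(":")
    trailing = parts.pop()  # whatever follows the last colon
    if trailing.strip():
        raise ValueError("text after the last colon")
    if len(parts) > len(LIST_NAMES):
        raise ValueError("too many colons")
    # Lists after the last non-empty one may be omitted entirely.
    parts += [""] * (len(LIST_NAMES) - len(parts))
    result = {}
    for name, body in zip(LIST_NAMES, parts):
        if name != "additional":
            body = body.replace(" ", "")  # blanks matter only in list 6
        result[name] = body
    return result
```

With the string ':::AY, BV:' the symbols land in the petrographic
slot; with one leading colon dropped ('::AY, BV:') the same symbols
land in the age slot, which is the failure described above.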
6.1.2 The field separator - The semicolon partitions certain
lists into fields. In some lists this is unnecessary, as, for
example, in the petrographic descriptor list. In the trace element
list, on the other hand, each trace element has a separate field
containing 3 subfields (of which one may be implicit).
6.1.3 Sub-field separators - Hyphens, commas, slashes or
relational operators ('>', '=', '<') may be used to partition
fields into sub-fields. In context, the choice of sub-field
separator is usually self-evident; sub-field separators are
described below in the discussions of the lists in which they are
used.
6.2 The status list
This list lies between the identification field and the first
colon on a "C" card. It consists of a series of 2-character
symbols separated by commas. In each, the first character is a
digit, the second a letter.
Example:
4A,1D,2B,3C:
is a valid status list. Order of mention of symbols in the list
is immaterial. The currently recognized status symbols and their
meanings are shown in Appendix Table A2.
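The shape of a status symbol (one digit followed by one letter) is
simple enough to check mechanically. The sketch below tests shape
only; it does not consult the table of recognized symbols. The names
STATUS_SYMBOL and parse_status_list are hypothetical.

```python
import re

# One digit followed by one letter, e.g. '4A'.
STATUS_SYMBOL = re.compile(r"[0-9][A-Za-z]")


def parse_status_list(body: str) -> set:
    """Return the set of status symbols in a status list body."""
    body = body.replace(" ", "")  # blanks are optional in this list
    if not body:
        return set()
    symbols = body.split(",")
    for symbol in symbols:
        if not STATUS_SYMBOL.fullmatch(symbol):
            raise ValueError(f"malformed status symbol: {symbol!r}")
    # A set is returned because the order of mention is immaterial.
    return set(symbols)
```

The body passed in is the text before the first colon, without the
colon itself.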
6.3 The Trace Element List
The trace element list lies between the first and second
colons of a specimen description. The first entry in each field
of the list is the standard literal symbol of some element in the
periodic table; one letter symbols are left justified. The
element symbol is separated by a relational operator ('>', '=',
'<') from the determined amount. The amount is given as an
integer, the scaling symbol 'P' (an acronym for 'parts per'), and
the exponent of the decimal scale, also an integer. The exponent
is usually separated by a comma from a third integer, the
position of the source reference key in the reference vector, as
given in image '2' of the record preface. The reference subfield
is optional; it provides for the contingency that a trace element
determination is drawn from a reference other than that which
provided the essential oxide analysis.
The trace element field is terminated by a colon if it is, and
by a semi-colon if it is not, the last field in the list.
Example
:BA >0P?;SR = 75P6, 3; RB = 1P5, 3; CL = 15P7:
A trace of barium and 1.5 ppm of chlorine are given in the
source from which the essential oxide analysis was drawn; the
reference indexed by the contents of NREF(3) reports 75 ppm of
strontium and 10 ppm of rubidium. (Note that the treatment of
trailing zeroes in trace element amounts is at the option of the
contributor, but the amount itself is always given as an
integer).
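The amount notation converts to parts per million by simple
arithmetic: nPk means n parts per 10^k, i.e. n x 10^(6-k) ppm. The
following sketch parses well-formed fields of a trace element list
on that basis. It is illustrative only; the names FIELD and
parse_trace_list are hypothetical, and fields that do not carry a
complete integer amount and exponent are rejected rather than
interpreted.

```python
import re

# symbol, relational operator, integer amount, 'P', exponent, optional ref
FIELD = re.compile(r"([A-Z]{1,2})([<=>])(\d+)P(\d+)(?:,(\d+))?")


def parse_trace_list(body: str) -> list:
    """Parse the body of a trace element list (colons already removed)."""
    results = []
    for field in body.replace(" ", "").split(";"):
        if not field:
            continue
        match = FIELD.fullmatch(field)
        if match is None:
            raise ValueError(f"malformed trace element field: {field!r}")
        symbol, relation, amount, exponent, ref = match.groups()
        # n parts per 10**k, rescaled to parts per million.
        ppm = int(amount) * 10**6 / 10**int(exponent)
        results.append({
            "element": symbol,
            "relation": relation,
            "ppm": ppm,
            "ref": int(ref) if ref else None,  # None: same source as oxides
        })
    return results
```

Applied to 'SR = 75P6, 3; RB = 1P5, 3; CL = 15P7', this gives 75 ppm
Sr and 10 ppm Rb, both from reference 3, and 1.5 ppm Cl with no
explicit reference.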
6.4 The age list
An age list lies between the second and third colons of the
specimen description. If no information about age is to be
entered in the base, no non-blank characters occur between these
colons. A non-blank age list is partitioned into fields by
semi-colons; a field may be partitioned into sub-fields by any of
the characters '-', '/', or ','.
6.4.1. The stratigraphic age field and its sub-fields. The
stratigraphic age field is always the first field of an age list.
If no stratigraphic age is given, its absence is recorded by an
empty field unless no age data at all are available; in the
latter case the whole list is blank, e.g., if the left colon is
the origin of the list, the sequence ': :' signifies an empty
list, while ':;KKKK;LLLL:' signifies a list containing 2 physical
or radiochemical ages but no stratigraphic age.
The first subfield of the stratigraphic age field contains
either a noun or an adjective; in the latter case the second
subfield is a noun, and the two are separated by a hyphen. The
subfield that contains the age noun is bounded on the right by a
slash, a comma, a semicolon or a colon. A slash is used if the
succeeding sub-field contains a second stratigraphic noun or
adjective-noun pair defining a range, a comma if the succeeding
sub-field is a reference number, a semi-colon if the reference is
implicit and the current sub-field is the last in the field, a
colon if, in addition, the field is the last in the list. The
stratigraphic age vocabulary of IGBA is shown in Appendix Table
A3.
Calendar ages are considered stratigraphic. If no date or age
is given for a historic flow, its age field is just 'HISTORIC',
followed, if necessary, by a reference subfield. The date of a
flow may be given as either 'HISTORIC/KKKKII' or simply 'KKKKII',
where KKKK is an integer and II is either 'AD' or 'BC'. Either
form may be followed by a reference subfield. If KKKK is an age
rather than a date, II is 'BP'.
Examples
1) :MIDDLE-CAMBRIAN/SILURIAN, 5;
2) :LOWE-PALZ;
3) :1920 AD,8:
4) :HISTORIC:
Example 1 records an age range, example 2 a single
stratigraphic age assignment; examples 3 and 4 are calendar ages
of dated flows.
In examples 1 and 3, specific references (5,8) are cited. In
examples 2 and 4 the reference assignment is implicit.
Terminal punctuation indicates that in examples 1 and 2 the
specimen age list contains at least one more field, but that no
further dating is available for examples 3 and 4.
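The sub-field rules of the stratigraphic age field can be sketched
as below. The function name parse_stratigraphic_field is
hypothetical. The sketch takes the text of one field with its
terminal semicolon or colon already removed, and it does not check
terms against the age vocabulary.

```python
def parse_stratigraphic_field(field: str):
    """Split a stratigraphic age field into terms and a reference."""
    field = field.replace(" ", "")  # blanks are optional in this list
    if not field:
        return None  # empty field: no stratigraphic age given
    ref = None
    if "," in field:
        # A comma introduces the optional reference subfield.
        field, ref_text = field.rsplit(",", 1)
        ref = int(ref_text)
    terms = []
    for part in field.split("/"):  # a slash separates the ends of a range
        adjective, hyphen, noun = part.partition("-")
        if hyphen:
            terms.append((adjective, noun))
        else:
            terms.append((None, part))
    return {"terms": terms, "ref": ref}
```

A field with two terms records a range; a field with one term
records a single age assignment or a calendar age.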
6.4.2. The physical age field and its sub-fields. For
convenience, all non-stratigraphic, non-calendar age
determinations are referred to here as "physical ages". With
exception of the magnetic and fission track procedures, those
currently recognized are in fact based on radioactive decay
schemes. The physical-age part of the age list lies between the
semicolon that terminates the stratigraphic age field and the
colon that terminates the list. Fields in this part of the age
list are partitioned into sub-fields by commas, hyphens or
slashes (no field contains more than one of each, and only the
hyphen is compulsory.)
The first subfield of a physical age field contains the age in
abbreviated scientific notation, with the decimal point implied
between the number and the scaling symbol 'E', i.e., an age of 10
million years is entered as 1E7 or 10E6, one of 10.5 million
years as 105E5, one of 3.8 billion as 38E8 or 3800E6.
The age subfield is separated by a hyphen from the method
subfield, which contains a 3- or 4-character symbol denoting the
method by which the age was determined. The physical age method
symbols of IGBA are shown in Appendix Table A3.
The method subfield is separated by a slash from the materials
subfield, which contains a 2-letter symbol identifying the
material on which the determination was made. Age materials are
denoted by symbols drawn from Appendix Table A5.
The materials subfield is separated from the reference
subfield, if there is one, by a comma. The reference subfield is
optional; for rules governing its content and use, see the
section on the trace element list.
Only one stratigraphic field is permitted. There may be as
many as 5 physical ages per specimen description.
Examples
:; 1053E6 - UPB/TI, 2:
:; 280E6 - RBSR/WR;
Example 1 records that reference 2 contains a lead-uranium age
determination of 1053 million years made on zircon from the
analyzed specimen. Example 2 states that the source reference
from which the essential oxide analysis was drawn reports a whole
rock rubidium-strontium age determination of 280 million years
for the specimen. Terminal punctuation indicates that example 1
is, and example 2 is not, the last field of an age list.
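A physical age field can be decoded with one pattern, as in the
sketch below. It is illustrative only: the names PHYSICAL_AGE and
parse_physical_age are hypothetical, the method symbol is assumed to
consist of upper-case letters or digits, and symbols are not checked
against the appendix tables. The field is passed without its terminal
semicolon or colon.

```python
import re

# age 'E' exponent '-' method ['/' material] [',' reference]
PHYSICAL_AGE = re.compile(
    r"(\d+)E(\d+)-([A-Z0-9]{3,4})(?:/([A-Z]{2}))?(?:,(\d+))?"
)


def parse_physical_age(field: str) -> dict:
    """Decode one physical age field."""
    match = PHYSICAL_AGE.fullmatch(field.replace(" ", ""))
    if match is None:
        raise ValueError(f"malformed physical age field: {field!r}")
    number, exponent, method, material, ref = match.groups()
    return {
        # The decimal point is implied between the number and 'E'.
        "years": int(number) * 10**int(exponent),
        "method": method,
        "material": material,              # None if not given
        "ref": int(ref) if ref else None,  # None: same source as oxides
    }
```

Thus '105E5 - RBSR' decodes to 10,500,000 years and '38E8 - RBSR' to
3,800,000,000 years.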
6.5 The Petrographic Descriptor List
This list lies between the 3d and 4th colons and consists of a
single field partitioned by commas into as many as 15 sub-fields,
one for each of the circled symbols in Block E of the coding
form.
Examples
:AY, BV, DR, EG, GA, IB, HY:
and
:BH:
are valid petrographic descriptor lists.
Symbols used in the two examples are taken from Appendix
Table A4. The first example records that the terms lava,
subaerial, amygdular, fine, vesicular and fresh occur in the
source reference description of the analyzed specimen. The
description also contains texture-structure terms not included in
the system glossary, (HY); these are noted later in the
additional information list. In the second example, evidently
the only source reference term clearly applicable to the analyzed
specimen is pillow lava. The petrographic descriptor list must
not contain more than 15 subfields.
6.6. The Mineral Assemblage List (Block F)
The mineral assemblage list, lying between the 4th and 5th
colons, consists of a single field divided by commas into as many
sub-fields as there are minerals found during the data capture.
The mineral vocabulary of IGBA is summarized in Appendix Table
A5.
The first two non-blank characters in each sub-field will be
interpreted as a mineral symbol drawn from table A5a. Any
additional non-blank characters in the sub-field will be
interpreted as mineral information flags drawn from Table A5b.
EXAMPLE
:NJ374,OG34,PE,RT:
The first sub-field of the example records the presence of
euhedral sanidine in phenocrysts and groundmass; the second records the
presence of euhedral groundmass nepheline; the third and fourth
sub-fields record the presence of phlogopite and aegirine in the
specimen. The mineral assemblage list may contain up to 15 subfields,
none of which may contain more than 15 information flags.
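The split of each sub-field into a mineral symbol and its flags can
be sketched as follows. The function name parse_mineral_list is
hypothetical, and treating every flag as a single character is an
assumption suggested by the example above, not a stated rule.

```python
def parse_mineral_list(body: str) -> list:
    """Split a mineral assemblage list into (symbol, flags) pairs."""
    body = body.replace(" ", "")  # blanks are optional in this list
    if not body:
        return []
    subfields = body.split(",")
    if len(subfields) > 15:
        raise ValueError("more than 15 mineral sub-fields")
    minerals = []
    for subfield in subfields:
        if len(subfield) < 2:
            raise ValueError(f"malformed mineral sub-field: {subfield!r}")
        symbol = subfield[:2]      # first two non-blank characters
        flags = list(subfield[2:])  # assumption: one character per flag
        if len(flags) > 15:
            raise ValueError("more than 15 information flags")
        minerals.append((symbol, flags))
    return minerals
```

For the example list, 'NJ374' yields the symbol 'NJ' with flags
'3', '7' and '4', while 'PE' and 'RT' yield symbols with no flags.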
6.7 The Additional Information List
This list lies between the 5th and 6th colons. It is designed to
provide maximum freedom for recording information not included in
the previous lists. It may contain up to 500 characters, and all
characters except the colon are legal in it. Material in the
additional information block is uncoded either because it is so
uncommon that it would be wasteful to encumber the system with
grammatical conventions concerning it, or because it has so far
successfully resisted standardization of the variables concerned
or of the symbols used to denote them.
Routine machine sorting of information of this kind is not
feasible but it would often be useful to be able to determine
whether information of a certain type is present, or to retrieve
it without listing out the contents of the whole description. (A
person planning to visit a well studied area, for instance, might
want detailed locality information about analyzed specimens from
the area, or someone interested in comparing modal and chemical
analyses might wish to restrict his retrieval of the latter to
specimen descriptions for which the former were also available.)
Accordingly, each item in the additional information block is
framed and tagged. The leading frame is a double left
parenthesis, '((', the trailing frame a double right
parenthesis,'))'.
The first two non-blank characters to the right of a leading
frame are interpreted as a tag, identifying the type of
information contained in the frame. Any 2-letter symbol from
Appendix Tables A4 and A5 may be used as a tag. Appendix Table
A6 is a list of currently recognized IGBA tags for other kinds of
information. The frames in an additional information list are
indexed by a non-repetitive set of tags drawn from Tables A4, A5
and A6, in any order.
Tags will be added as necessary, both to accommodate types of
information so far overlooked and to permit expansion of the
information content of the system as the subject itself expands.
And although IGBA currently specifies neither vocabulary nor
syntax for material referenced by any tag, such specifications
could be established whenever desirable, either for existing tags
or for new ones; the system may thus grow without extensive
revision of operating software.
Example
:((XL - specimen collected in R.R. cut at E end of town))
((XM - Quartz = 38, Plag = 45, K-spar = 15, MGT = 2))((TQ
sphene mantles about magnetite)):
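Retrieval by tag, as motivated above, can be sketched with a single
pattern over the framed items. The code is illustrative only; FRAME
and parse_additional_info are hypothetical names. It assumes the two
tag characters are adjacent, and it drops a leading hyphen after the
tag as a presentation choice, since the example uses one in some
frames and not in others.

```python
import re

# Text between a leading '((' and the next trailing '))'.
FRAME = re.compile(r"\(\((.*?)\)\)", re.DOTALL)


def parse_additional_info(body: str) -> dict:
    """Map each tag to the text of its frame, in order of appearance."""
    items = {}
    for match in FRAME.finditer(body):
        content = match.group(1).strip()
        tag = content[:2]
        if tag in items:
            raise ValueError(f"tag {tag!r} used more than once")
        # Drop blanks and an optional hyphen between tag and text.
        items[tag] = content[2:].lstrip(" -").rstrip()
    return items
```

A request for modal analyses, for example, reduces to testing
whether the relevant tag is a key of the returned dictionary.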
THE BIBLIOGRAPHY DATA FILE
This file consists of bibliography records of variable
length. Physically it is also composed of card images 80
characters long. Characters 1-5 are the identification field,
identical for all cards belonging to one bibliography record;
they contain the reference number given in the Record Reference
and Location Card, see 2.2 above. Characters 6-80 contain the
text of the citation.
The reference record is broken into 'author', 'title', and
'publication' fields. Within each field, the sequence is that currently
used in the Bulletin of the Geological Society of America. A slash
terminates the first and second fields, and a double slash the third.
The first word of the author field is the surname of the senior
author. A '$' symbol precedes the surname of each other author. The
date of publication is the last information in the author block.
Abbreviations may be used in the second field unless it contains a
book title, and for words like 'page' or 'volume' in the third. The name
of the publication, however, is to be spelled out in full, with upper and
lower case characters as in normal publication.
Example
2222 Washington, H. S., 1917/ Chemical Analyses of Igneous Rocks/
2222United States Geological Survey Professional Paper 99, 1201pp.//
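The field structure of a bibliography record can be sketched as
below. The code is illustrative only and rests on several
assumptions: the function name parse_bibliography_record is
hypothetical, the reference number is taken to occupy columns 1-5
with any padding blanks, words are taken not to be split across
cards, and the date is taken to follow the last comma of the author
field.

```python
def parse_bibliography_record(cards: list) -> dict:
    """Decode the card images of one bibliography record."""
    ids = {card[:5].strip() for card in cards}
    if len(ids) != 1:
        raise ValueError("cards do not share one reference number")
    # Columns 6-80 of successive cards form the citation text.
    text = " ".join(card[5:80].strip() for card in cards).strip()
    if not text.endswith("//"):
        raise ValueError("record is not terminated by '//'")
    # A single slash ends the author field and the title field.
    author, title, publication = (
        part.strip() for part in text[:-2].split("/", 2)
    )
    names, _, date = author.rpartition(",")
    return {
        "ref": int(ids.pop()),
        # '$' precedes the surname of every author after the first.
        "authors": [name.strip() for name in names.split("$")],
        "date": date.strip(),
        "title": title,
        "publication": publication,
    }
```

Applied to the two cards of the example, with the reference number
in columns 1-5, this returns the single author 'Washington, H. S.',
the date '1917', and the title and publication fields as written.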