Help for Data Users
By following the links at
the left you will come to a PDS archive
containing your data of interest. All PDS archives have a similar structure. Understanding this
structure will help you get the most from the data.
Data
archived from NASA
missions that were
confirmed for flight
after November 1,
2011, are required
to use the PDS4
standard (see
About PDS4).
Missions that got
their start before
that date continue
to use the PDS3
standard. Individual
data providers are
also required to use
PDS4 now. Eventually
the PDS3 archives
will be "migrated"
to PDS4 (more about
that
below),
but for now you'll
find both PDS3 and
PDS4 archives here
at the Geosciences
Node.
Bundles, collections, products, and LIDs
These are the
building blocks of
PDS4 archives.
Data
products, document
products, and
several other kinds
of products are the
basic components of an
archive. A typical
product consists of
one file with the
contents of an
observation,
document, etc., and
another file with
its metadata, its
PDS4 label. Both
files have the same
base name, but the
PDS4 label is
written in XML and
has the file name extension
".xml".
A
collection is a
group of related
products, such as a
group of documents
or a group of
calibrated data
products from a
particular
instrument.
A
bundle is a group of
related collections,
such as the raw data
products, calibrated
data products, and
documents from a
particular
instrument.
Each
product has a
Logical Identifier
(LID) that is
guaranteed to be
unique PDS-wide.
Collections and
bundles are actually
special kinds of
products, and they
also have unique
LIDs.
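Because a product's data file and its PDS4 label share a base name, the label's location can usually be derived from the data file's path. The following is a minimal sketch of that rule only; the function name and the example file name are hypothetical, and it assumes the label sits in the same directory as the data file.

```python
from pathlib import Path


def label_path_for(data_file):
    """Return the path where the PDS4 label for data_file would be expected."""
    # Same directory and base name, with the extension replaced by ".xml".
    return Path(data_file).with_suffix(".xml")


# Hypothetical file name, used only for illustration.
print(label_path_for("data_calibrated/obs_0001.tab"))
```

A product may consist of more than a single data file, so treat the result as a first guess and confirm that the label file actually exists.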
Where to look first
In
the top-level
directory you'll
find the bundle
label, a file named
"bundle_<something>.xml".
This label contains
overall information
about the bundle,
references to
publications, and a
list of the
collections that
make up the bundle.
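The list of collections in a bundle label can also be pulled out with a short script. The sketch below uses Python's standard XML parser and assumes that each collection is recorded in a Bundle_Member_Entry element holding a lid_reference or lidvid_reference, as in the PDS4 information model. The sample label is a heavily simplified, hypothetical fragment, so check the element names against a real bundle label before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified bundle label used only for illustration.
SAMPLE_BUNDLE_LABEL = """<Product_Bundle xmlns="http://pds.nasa.gov/pds4/pds/v1">
  <Bundle_Member_Entry>
    <lid_reference>urn:nasa:pds:example_bundle:document</lid_reference>
    <member_status>Primary</member_status>
  </Bundle_Member_Entry>
  <Bundle_Member_Entry>
    <lidvid_reference>urn:nasa:pds:example_bundle:data_raw::1.0</lidvid_reference>
    <member_status>Primary</member_status>
  </Bundle_Member_Entry>
</Product_Bundle>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def list_bundle_members(label_xml):
    """Return the collection identifiers referenced by a bundle label."""
    root = ET.fromstring(label_xml)
    members = []
    for elem in root.iter():
        if local_name(elem.tag) != "Bundle_Member_Entry":
            continue
        # Map each child element's local name to its text content.
        fields = {local_name(c.tag): (c.text or "").strip() for c in elem}
        members.append(fields.get("lidvid_reference") or fields.get("lid_reference"))
    return members


for member in list_bundle_members(SAMPLE_BUNDLE_LABEL):
    print(member)
```

Matching on local names avoids hard-coding the XML namespace, which keeps the sketch short at the cost of ignoring which namespace an element belongs to.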
Most
bundles have a file
named "readme.txt"
in the top-level
directory that
explains the
contents of the
bundle.
Most
bundles have a
document collection,
usually in a
subdirectory named
"document". For
mission-acquired
data, look for a
document with "SIS"
(Software Interface
Specification) in
the name; this will
be the primary
documentation.
Where to find the data products
In
the top-level
directory you'll
find one or more
subdirectories whose
names start with
"data", usually one
for each data
collection in the
bundle. (Not always;
it's possible to
have more than one
collection in the
same subdirectory.)
They may be further
subdivided if there
are many products in
the collection.
Where is the list of data products? (i.e., what happened to the old PDS3 index table?)
In
each collection
you'll see a
file named
"collection_<something>_inventory.csv"
and its label,
"collection_<something>.xml".
The inventory file
is a
comma-separated-value
text file that lists
the LIDs of all the
products in the
collection. Each LID
is preceded by a "P"
or an "S" indicating
whether the product
is a primary or a
secondary member of
the collection.
Primary members
reside in the
collection directory
or one of its
subdirectories.
Secondary members
reside in some other
PDS collection where
they are primary
members, and they
may or may not be
duplicated in this
collection.
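An inventory file with the layout described above can be read using any CSV library. Here is a minimal sketch in Python that separates primary from secondary members. It assumes two columns per row, the "P" or "S" flag followed by the LID. The function name and sample LIDs are hypothetical.

```python
import csv
import io

# Hypothetical inventory contents, used only for illustration.
SAMPLE_INVENTORY = (
    "P,urn:nasa:pds:example_bundle:data_raw:obs_0001::1.0\r\n"
    "P,urn:nasa:pds:example_bundle:data_raw:obs_0002::1.0\r\n"
    "S,urn:nasa:pds:other_bundle:data_raw:obs_0100::1.0\r\n"
)


def read_inventory(text):
    """Split inventory rows into (primary, secondary) lists of identifiers."""
    primary, secondary = [], []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        status, lid = row[0].strip().upper(), row[1].strip()
        if status == "P":
            primary.append(lid)
        elif status == "S":
            secondary.append(lid)
    return primary, secondary


primary, secondary = read_inventory(SAMPLE_INVENTORY)
print(len(primary), "primary members;", len(secondary), "secondary members")
```

To read a real inventory, pass in the contents of the "collection_<something>_inventory.csv" file in place of the sample string.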
How to read PDS4 labels
PDS4
labels are written
using
XML (Extensible
Markup Language).
They are text files
that are both human-
and
machine-readable.
Labels are best
viewed in a text
editor that can
display XML with
formatting that
makes them easier to
read. (Notepad++ and
UltraEdit are two
such editors.) The
labels may also be
used with software
that can manipulate
XML documents.
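As one example of handling a label with software, the sketch below uses Python's standard library to print an indented outline of a label's elements and values. The sample label is a hypothetical, heavily simplified fragment; a real label contains many more elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified product label used only for illustration.
SAMPLE_LABEL = """<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1">
  <Identification_Area>
    <logical_identifier>urn:nasa:pds:example_bundle:data_raw:obs_0001</logical_identifier>
    <title>Example observation</title>
  </Identification_Area>
</Product_Observational>"""


def outline(label_xml):
    """Return one indented line per element: the name, plus its text if any."""
    lines = []

    def walk(elem, depth):
        name = elem.tag.rsplit("}", 1)[-1]  # drop the '{namespace}' prefix
        text = (elem.text or "").strip()
        lines.append("  " * depth + name + (": " + text if text else ""))
        for child in elem:
            walk(child, depth + 1)

    walk(ET.fromstring(label_xml), 0)
    return lines


print("\n".join(outline(SAMPLE_LABEL)))
```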
How to read PDS4 data
Tools
are available in the
PDS Tool
Registry
for working with
PDS4 archives. The
registry includes
the
PDS4 Viewer,
a program
that displays text
tables, binary
tables, images and
arrays by reading
their PDS4 labels.
The
Python
library
from which the
PDS4 Viewer is built
is also available.
An example
Jupyter Notebook
that uses Python
to view a PDS
image and save
it as a browse
image can be
found
here.
What You'll Find on a PDS3 Archive Volume
Under the PDS3 standard, data products
are grouped into data sets along with
related documents and other material. Data sets are
stored on volumes, a term that goes
back to the days before the World-Wide Web when users
could request data on physical media such as CDs.
Typically one data set is stored on one volume, but
there may be multiple data sets on a volume. A large
data set accumulated over time may be stored on multiple
volumes.
General Hints
All files with extensions
.txt, .lbl, .cat, .tab, and
.asc are ASCII text files. Files with other
extensions are binary files that may not be viewable in
your
web browser.
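The extension rule above is easy to apply in a script, for example to decide which files can safely be opened as text. This is a minimal sketch: the function name is hypothetical, and comparing extensions without regard to case is an assumption, made because file names on some volumes may be uppercase.

```python
from pathlib import Path

# Extensions that the text above identifies as ASCII text files.
ASCII_EXTENSIONS = {".txt", ".lbl", ".cat", ".tab", ".asc"}


def is_ascii_text(file_name):
    """Return True if the file's extension marks it as ASCII text."""
    # Lowercasing is an assumption; see the note above.
    return Path(file_name).suffix.lower() in ASCII_EXTENSIONS


print(is_ascii_text("AAREADME.TXT"))
print(is_ascii_text("image_0001.img"))
```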
Each subdirectory
contains a file with a name ending in info.txt
that describes the contents of that directory.
Top-level directory
- aareadme.txt - An introduction to the data volume. Read this first.
- errata.txt - An optional file containing notes and errata about the volume.
- voldesc.cat - A short text file that serves as a PDS label for the volume.
Catalog directory
Don't overlook this
valuable source of information about the data set. These
files are copies of the PDS Catalog descriptions of the
data set, the instrument that collected the data, the
spacecraft, the mission, personnel involved in making
the archive, and a list of references to published
literature.
Document directory
This directory contains
documentation such as the data product Software
Interface Specification that most missions are required
to provide.
Index directory
The file index.tab
(or a file with a similar name) lists every data product
on the archive volume, its directory and file name, and
other information that varies depending on the data set.
Its contents are described by the associated PDS label
index.lbl. If the data set is spread over more
than one volume, there will also be a cumulative index
table (cumindex.tab) that lists the data products
on all volumes created to date. See the last volume in
the data set for the complete cumulative index.
Data directories
The data products
themselves are under a directory that may be named
"data" or may have a data-set-specific name. There may
be more than one data directory. For instance, on some
archive volumes the data are organized by year under the
directories 2002, 2003, etc.
Every data product is
described by a PDS label. The label may be either
embedded (attached) at the beginning of the data file
or stored in a separate (detached) file with the same base name and
the extension .lbl. PDS labels are ASCII text in a
keyword = value format that can be read both by humans
and by software. In some cases a data product is stored
in multiple files which are described by a single label.
The thing to remember is when you download a data file,
download the label too.
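Because PDS3 labels use a keyword = value layout, simple values can be extracted with a few lines of code. The sketch below is deliberately minimal. It handles only single-line values, stops at the END statement, and does not deal with multi-line values, nested objects, or repeated keywords. The function name and sample label are hypothetical.

```python
# Hypothetical, simplified PDS3 label used only for illustration.
SAMPLE_PDS3_LABEL = """PDS_VERSION_ID = PDS3
RECORD_TYPE    = FIXED_LENGTH
RECORD_BYTES   = 80
^TABLE         = "OBS0001.TAB"
END
"""


def parse_simple_pds3_label(text):
    """Collect single-line keyword = value pairs into a dictionary."""
    values = {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "END":
            break  # END marks the end of the label
        if "=" not in stripped:
            continue  # skip blank lines and continuation lines
        key, value = stripped.split("=", 1)
        # A repeated keyword overwrites its earlier value in this sketch.
        values[key.strip()] = value.strip().strip('"')
    return values


label = parse_simple_pds3_label(SAMPLE_PDS3_LABEL)
print(label["RECORD_BYTES"], label["^TABLE"])
```

All values are returned as strings. Converting them to numbers or resolving file pointers such as ^TABLE is left to the caller.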
Other directories
Other optional
directories include Software, Calib
(calibration), Extras, Geometry, Browse,
and others. Read the info.txt file in each
directory for more information.
Migration of archives from PDS3 to PDS4
The
Geosciences Node has
a plan for the
migration of most of
its PDS3 archives to
PDS4 over the next
few years. (The plan
excludes currently
active missions that
are still delivering
PDS3 data. Those data
will be migrated
soon after each mission is
complete.) The
approach is to leave
the existing data
products and PDS3
labels untouched,
and to add PDS4
labels in the same
data directories, so
that each product
has both a PDS3 and
a PDS4 label. In
cases where the data
product is not in a
PDS4-compliant
format (the rules
are stricter in
PDS4), then the data
product itself may
be converted to a
compliant format,
but both old and new
versions will remain
in the archive.
For
examples of archives
that have been
migrated from PDS3
to PDS4, see the
MESSENGER archives.
If you have questions about Geosciences Node data sets
One place to look for help is the
Frequently Asked Questions page.
If you don't see your question there, check the
Geosciences Node
Forums
to see if it's been discussed there. There are
forums for announcements, data users, data providers,
Analyst's Notebook users, and Orbital Data Explorer
users. You may post your question in one of the forums
(after signing up for a free account).
If you still need help, you may email
your question to
geosci@wunder.wustl.edu.