THREDDS Inventory Catalogs are designed to organize and describe collections of data. A dataset is a container for associated metadata and other datasets. Each dataset is either a collection dataset (i.e., contains other datasets) or an atomic dataset (i.e., has an access method).
An atomic dataset has no nested datasets,
and has an access URL with service type not Resolver or QueryCapability.
To find out more about this Dataset, one must use a non-THREDDS
protocol, which we call crossing the protocol boundary.
A collection dataset may be of the following types:
<xsd:element name="catalog" type="cat:catalogType">
<!-- Enforce dataset ID references:
1) Each dataset ID must be unique in the document.
2) Each dataset alias must reference a dataset ID in the document.
-->
<xsd:unique name="datasetID">
<xsd:selector xpath=".//cat:dataset"/>
<xsd:field xpath="@ID"/>
</xsd:unique>
<xsd:keyref name="datasetAlias" refer="cat:datasetID">
<xsd:selector xpath=".//cat:dataset"/>
<xsd:field xpath="@alias"/>
</xsd:keyref>
<!-- Enforce references to services:
1) Each service name must be unique and is required.
2) Each dataset that references a service (i.e., has a serviceName
attribute) must reference a service that exists.
3) Each access that references a service (i.e., has a serviceName
attribute) must reference a service that exists.
@todo Do we want unique service names. Currently, don't need to be unique.
@todo This does not enforce the current scoping of service elements.
-->
<xsd:key name="serviceNameKey">
<xsd:selector xpath=".//cat:service" />
<xsd:field xpath="@name" />
</xsd:key>
<xsd:keyref name="datasetServiceName" refer="cat:serviceNameKey">
<xsd:selector xpath=".//cat:dataset" />
<xsd:field xpath="@serviceName" />
</xsd:keyref>
<xsd:keyref name="accessServiceName" refer="cat:serviceNameKey">
<xsd:selector xpath=".//cat:access" />
<xsd:field xpath="@serviceName" />
</xsd:keyref>
</xsd:element>
<xsd:complexType name="catalogType">
<xsd:sequence>
<xsd:element ref="cat:dataset" minOccurs="1" maxOccurs="1" />
</xsd:sequence>
<!--xsd:attribute name="name" type="xsd:string" use="required"/-->
<xsd:attribute name="version" type="xsd:token" default="0.7"/>
</xsd:complexType>
The catalog element is the top-level element and must contain exactly one top-level dataset. The version attribute allows schema migration and should be set to "0.7". The name of the top-level dataset is considered the name of the catalog and should be displayed to the user when selecting from catalogs. Here is an example catalog with a top-level dataset:
<?xml version="1.0" encoding="UTF-8"?>
<catalog version="0.7"
xmlns ="http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.0.7.xsd"
xmlns:xlink="http://www.w3.org/1999/xlink">
<dataset name="My data collection" >
...
</dataset>
</catalog>
Several uniqueness and reference rules for other elements are
enforced by parts of the schema snippet above. See the "dataset Element" section for more
details on dataset elements referencing
other dataset elements as well as service
elements. See the "access Element" section
for more details on access elements referencing service
elements.
<xsd:element name="dataset" type="cat:datasetType" />
<xsd:complexType name="datasetType">
<xsd:sequence>
<xsd:element ref="cat:service" minOccurs="0" maxOccurs="unbounded" />
<xsd:element ref="cat:documentation" minOccurs="0" maxOccurs="unbounded" />
<xsd:choice minOccurs="0" maxOccurs="unbounded" >
<xsd:element ref="cat:metadata" />
<xsd:element ref="cat:property" />
</xsd:choice>
<xsd:element ref="cat:access" minOccurs="0" maxOccurs="unbounded" />
<xsd:choice minOccurs="0" maxOccurs="unbounded">
<xsd:element ref="cat:dataset" />
<xsd:element ref="cat:catalogRef" />
</xsd:choice>
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required" />
<xsd:attribute name="dataType" type="cat:dataTypeEnum" />
<xsd:attribute name="authority" type="xsd:string" />
<xsd:attribute name="ID" type="xsd:token" />
<xsd:attribute name="alias" type="xsd:token" />
<xsd:attribute name="serviceName" type="xsd:token" />
<xsd:attribute name="urlPath" type="xsd:token" />
</xsd:complexType>
A dataset element represents a named logical set of data at a level of granularity appropriate for presentation to a user. The name of the dataset element (i.e., the value of the name attribute) should be a human-readable name that will be displayed to users. A dataset is considered an atomic dataset if it defines at least one access method; otherwise it is just a container for nested datasets. [If an atomic dataset is selected by a user, an event is sent to the client software. Should we separate out the object/library/widget actions from the communication layer? Content vs. presentation?] Multiple access methods specify different services for accessing the data. Choices among these different services should be filtered by client software or presented to the user for selection. There are a variety of ways to define an access method in a dataset element; they are described in detail in the "Constructing an Access Method" section below.
A dataset element contains 0 or more service elements followed by 0 or more documentation, metadata, or property elements in any order, followed by 0 or more access elements, followed by 0 or more nested dataset or catalogRef elements. The data represented by a nested dataset element should be a subset, a specialization or in some other sense "contained" within the data represented by its parent dataset element.
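As an illustration of this ordering, here is a minimal sketch of a collection dataset and a nested atomic dataset whose children appear in the required sequence. All names, URLs, and property values are hypothetical, and the plain-text content of the documentation elements is an assumption based on the documentation content model given elsewhere in this document:
<dataset name="Example collection">
  <!-- service elements come first -->
  <service name="s1" serviceType="DODS" base="http://s1/dods" />
  <!-- then documentation and property elements -->
  <documentation>Hypothetical description of this collection.</documentation>
  <property name="exampleProperty" value="exampleValue" />
  <!-- nested dataset (or catalogRef) elements come last -->
  <dataset name="Example atomic dataset">
    <documentation>Hypothetical description of this dataset.</documentation>
    <!-- access elements follow documentation, metadata, and property elements -->
    <access serviceName="s1" urlPath="example.nc" />
  </dataset>
</dataset>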
A dataset may have a dataType, specified within itself or in a containing collection, whose value comes from a controlled vocabulary.
If a dataset has an alias attribute, the value of the attribute must be an ID of another dataset within the same catalog. Note it may not refer to a dataset in another catalog referred to by a catalogRef element. In this case, any other properties of the dataset are ignored, and the dataset to which the alias refers is used in its place.
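A minimal sketch of an alias, assuming hypothetical dataset names, IDs, and URLs. The second nested dataset carries only the required name attribute plus the alias, since its other properties would be ignored:
<dataset name="All data">
  <service name="s1" serviceType="DODS" base="http://s1/dods" />
  <dataset name="Model output" ID="model1" serviceName="s1" urlPath="model1.nc" />
  <dataset name="Favorites">
    <!-- The dataset with ID "model1" is used in place of this element -->
    <dataset name="Model output (alias)" alias="model1" />
  </dataset>
</dataset>
The alias value must match an ID in the same catalog document; an ID that exists only in a catalog reached through a catalogRef would not satisfy the reference.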
A dataset may have an authority specified within itself or in a containing collection. If a dataset has an ID and an authority attribute, then the combination of the two should be globally unique for all time. If the same dataset is specified in multiple catalogs, then its authority - ID should be identical if possible.
Many of the properties of a dataset become the default for contained datasets. This includes property elements and the dataType, authority, and serviceName attributes. Any documentation elements are displayed at the dataset itself when presenting the catalog to the user. Any metadata elements apply to all contained datasets.
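The following sketch illustrates these defaults. The names, IDs, authority string, and property are hypothetical, and the dataType values Grid and Image are assumed to be part of the controlled vocabulary (they appear in the older DataType list in this document):
<dataset name="d1" dataType="Grid" authority="example.org" serviceName="s1">
  <service name="s1" serviceType="DODS" base="http://s1/dods" />
  <property name="project" value="demo" />
  <!-- Takes dataType, authority, serviceName, and the property from d1 -->
  <dataset name="d1.1" ID="d1.1" urlPath="d1.1.nc" />
  <!-- Sets its own dataType; the other defaults still come from d1 -->
  <dataset name="d1.2" ID="d1.2" dataType="Image" urlPath="d1.2.nc" />
</dataset>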
A dataset element can reference another dataset element; the ID attribute (if one is given) must be unique within the XML document and the alias attribute must reference an existing ID attribute.
<xsd:element name="service" type="cat:serviceElemType" />
<xsd:complexType name="serviceElemType">
<xsd:sequence>
<xsd:element ref="cat:property" minOccurs="0" maxOccurs="unbounded" />
<xsd:element ref="cat:service" minOccurs="0" maxOccurs="unbounded" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required" />
<xsd:attribute name="serviceType" type="cat:serviceTypeEnum" use="required" />
<!-- @todo What does "base" mean for a compound service? null value? -->
<xsd:attribute name="base" type="xsd:string" use="required" />
<xsd:attribute name="suffix" type="xsd:string" />
</xsd:complexType>
The service element ...
<xsd:element name="access" type="cat:accessType" />
<xsd:complexType name="accessType">
<xsd:attribute name="urlPath" type="xsd:token" use="required" />
<!-- @todo How can I restrict to serviceName OR serviceType, not both? -->
<xsd:attribute name="serviceName" type="xsd:string" />
<xsd:attribute name="serviceType" type="cat:serviceTypeEnum" />
</xsd:complexType>
An access element describes one
method for accessing the data that the parent dataset
represents. The access method accessing the data object that the dataset
A documentation
element ...
A metadata
element ...
A catalogRef
element ...
<xsd:element name="property" type="cat:propertyType" />
<xsd:complexType name="propertyType">
<xsd:attribute name="name" type="xsd:string" />
<xsd:attribute name="value" type="xsd:string" />
</xsd:complexType>
A property
element ...
There are a variety of ways to build an access method for a given
dataset:
1) The access method can be defined as a combination of the urlPath and serviceName
attributes of the given dataset element.
For example:
<dataset name="d1">
<service name="s1" serviceType="DODS" base="http://s1/dods" />
<service name="s2" serviceType="DODS" base="http://s2/dods" />
<!-- This dataset's URL is "http://s1/dods/d1.1.nc" -->
<dataset name="d1.1" serviceName="s1" urlPath="d1.1.nc" />
<!-- This dataset's URL is "http://s2/dods/d1.2.nc" -->
<dataset name="d1.2" serviceName="s2" urlPath="d1.2.nc" />
</dataset>
2) The access method can be defined as a combination of the urlPath
attribute of the given dataset element and the serviceName attribute of an ancestor
dataset
(i.e., the
service name value is inherited from or scoped within ancestor datasets).
This is convenient when all (or most) of the datasets in a parent
dataset have the same service. For example:
<dataset name="d1" serviceName="s1">
<service name="s1" serviceType="DODS" base="http://s1/dods" />
<service name="s2" serviceType="DODS" base="http://s2/dods" />
<dataset name="d1.1" urlPath="d1.1.nc" /> <!-- URL: "http://s1/dods/d1.1.nc" -->
<dataset name="d1.2" urlPath="d1.2.nc" /> <!-- URL: "http://s1/dods/d1.2.nc" -->
<dataset name="d1.3" urlPath="d1.3.nc" /> <!-- URL: "http://s1/dods/d1.3.nc" -->
<dataset name="d1.4" urlPath="d1.4.nc" /> <!-- URL: "http://s1/dods/d1.4.nc" -->
<dataset name="d1.5" serviceName="s2" urlPath="d1.5.nc" /> <!-- URL: "http://s2/dods/d1.5.nc" -->
</dataset>
3) The access method can be defined by an access element that is the child of the given dataset element. Each access element defines one access method. An access element can define an access method in two ways. First, an access method can be defined by a combination of the serviceType attribute and the urlPath attribute of the access element. In this case, the value of the urlPath attribute must be an absolute URL. Second, an access method can be defined as a combination of the serviceName attribute and the urlPath attribute of the access element. In this case, the URL given in the urlPath attribute is a relative URL, relative to the base URL of the service element referenced by the serviceName attribute.
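A sketch of both forms of the access element follows. The host names and paths are hypothetical, the FTP service type is assumed to be part of the service type vocabulary (it appears in the older ServiceType list in this document), and the resulting URL in the first comment follows the pattern of the examples in 1) and 2):
<dataset name="d1">
  <service name="s1" serviceType="DODS" base="http://s1/dods" />
  <dataset name="d1.1">
    <!-- serviceName with a relative urlPath; URL: "http://s1/dods/d1.1.nc" -->
    <access serviceName="s1" urlPath="d1.1.nc" />
    <!-- serviceType with an absolute urlPath; no service element is referenced -->
    <access serviceType="FTP" urlPath="ftp://s3/pub/d1.1.nc" />
  </dataset>
</dataset>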
An access method defined by the dataset element's urlPath attribute (1 and 2 above) is considered the default access method. The default access method should be the preferred access method when no filtering or user choice is possible.
OLD STUFF TO BE REVIEWED
<!ELEMENT access EMPTY>
An access element specifies how a dataset can be accessed through a data service. It is typically used when there is more than one service available for a dataset.
<!ATTLIST access
urlPath CDATA #REQUIRED
serviceName CDATA #IMPLIED
serviceType (%ServiceType;) #IMPLIED
>
Typically a serviceName is specified, which is the name of a service element in a parent element of the same catalog. Note it may not refer to a service in another catalog referred to by a catalogRef element. The dataset URL is then formed from the service base and the access urlPath, and optionally the service suffix (see forming URLs).
If a serviceName is not specified, a serviceType must be specified, which creates an "anonymous service" of that type. In this case the urlPath must be absolute.
<!ELEMENT catalog (dataset) >
This is the top-level element. A catalog element contains exactly one top-level dataset. The name of the catalog should be displayed to the user when selecting among catalogs. The version allows DTD migration and should be set to "0.6".
<!ATTLIST catalog
name CDATA #REQUIRED
version CDATA #REQUIRED
xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"
xmlns CDATA #FIXED "http://www.unidata.ucar.edu/thredds"
>
The XLink and default namespaces are declared here, so technically they do not have to be declared in the catalog XML itself. However, Internet Explorer cannot deal with namespaces declared in the DTD, so you should add the same two namespace declarations to the catalog element in the XML document itself (see example). This allows you to view the catalog in the IE browser. Netscape Navigator cannot yet view XML files (as of version 6.2.1).
<!ELEMENT catalogRef EMPTY>
A catalogRef element refers to another catalog that becomes a dataset inside this catalog. This is used to maintain catalogs separately and to break up large catalogs. The referenced catalog should not be read until the user explicitly requests it, so that very large dataset collections can be represented with catalogRef elements without large delays in presenting them to the user. The referenced catalog is not textually substituted into the containing catalog, but remains a self-contained object. The referenced catalog must be a valid THREDDS catalog, but it does not have to match versions with the containing catalog.
<!ATTLIST catalogRef
xlink:type (simple) #FIXED "simple"
xlink:href CDATA #REQUIRED
xlink:title CDATA #REQUIRED
>
The value of xlink:href is the URL of the referenced catalog. It may be absolute or relative to the catalog URL. The value of xlink:title is displayed as the name of the dataset that the user can click on to follow the XLink. Note that the XLink has a fixed type of "simple" that is part of the DTD, so it does not have to be specified in the catalog XML.
The dataset chooser software should seamlessly present a catalogRef to the user, for example by eliminating the referenced catalog's top-level dataset in its presentation of the catalog when its name matches the value of the catalogRef's xlink:title attribute.
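As a sketch, a collection that delegates two sub-collections to separately maintained catalogs might look like this. The URLs and titles are hypothetical, and the xlink prefix is assumed to be declared on the enclosing catalog element:
<dataset name="Top collection">
  <!-- Relative href: resolved against the URL of this catalog -->
  <catalogRef xlink:href="models/catalog.xml" xlink:title="Model output" />
  <!-- Absolute href: the referenced catalog may live on another server -->
  <catalogRef xlink:href="http://example.org/thredds/obs.xml" xlink:title="Observations" />
</dataset>
Neither referenced catalog would be read until the user selects "Model output" or "Observations".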
<!ENTITY % DataType "Grid | Image | Station">
<!ELEMENT dataset (service*, (documentation | metadata | property)*, access*, (dataset | catalogRef)*)>
A dataset element represents a logical set of data at a level of granularity appropriate for presentation to a user. A dataset is selectable if it contains at least one access path, otherwise it is just a container for nested datasets. If selectable, upon selection, an event is sent to the client software.
<!ATTLIST dataset
name CDATA #REQUIRED
dataType (%DataType;) #IMPLIED
authority CDATA #IMPLIED
ID ID #IMPLIED
alias IDREF #IMPLIED
serviceName CDATA #IMPLIED
urlPath CDATA #IMPLIED
>
A dataset element contains 0 or more service elements followed by 0 or more documentation, metadata, or property elements in any order, followed by 0 or more access elements, followed by 0 or more nested dataset or catalogRef elements. The data represented by a nested dataset element should be a subset, a specialization or in some other sense "contained" within the data represented by its parent dataset element.
A dataset must have one or more access paths, specified implicitly through a urlPath attribute, or explicitly in contained access elements. An access path should be thought of as a URL, but it is actually information from which a protocol-aware layer can construct URLs. When there is only one URL, this is typically specified in the dataset element itself. When there are multiple URLs, these may be specified in the dataset element and/or in contained access elements. Multiple URLs specify different services for accessing the dataset. Choices among these different services should be filtered by client software or presented to the user for selection. A URL specified in the dataset element itself is the default URL, which should be the preferred URL when no filtering or user choice is possible. Also see forming URLs.
A dataset may have a dataType, specified within itself or in a containing collection, whose value comes from a controlled vocabulary.
If a dataset has an alias attribute, the value of the attribute must be an ID of another dataset within the same catalog. Note it may not refer to a dataset in another catalog referred to by a catalogRef element. In this case, any other properties of the dataset are ignored, and the dataset to which the alias refers is used in its place.
A dataset may have an authority specified within itself or in a containing collection. If a dataset has an ID and an authority attribute, then the combination of the two should be globally unique for all time. If the same dataset is specified in multiple catalogs, then its authority - ID should be identical if possible.
Many of the properties of a dataset become the default for contained datasets. This includes property elements and the dataType, authority, and serviceName attributes. Any documentation elements are displayed at the dataset itself when presenting the catalog to the user. Any metadata elements apply to all contained datasets.
<!ELEMENT documentation (#PCDATA)>
A documentation element contains or refers to content that should be displayed to an end-user when making selections from the catalog. The content may be HTML or plain text. We call this kind of content "human readable" information.
<!ATTLIST documentation
xlink:type (simple) #FIXED "simple"
xlink:href CDATA #IMPLIED
xlink:title CDATA #IMPLIED
xlink:show (new | replace | embed) "new"
>
The documentation element may contain arbitrary plain text content, which should be displayed inline at the position of the collection or the dataset element that contains it.
The documentation element may also contain an XLink to an HTML or plain text web page. This text should be either shown inline or displayed when the user activates the XLink, depending on the value of the xlink:show attribute, whose default is new. If the value of xlink:show is new, then the content of the XLink should be displayed in a new window when the user selects it. If the value of xlink:show is embed, then the content should be displayed inline, as if it were text content in the documentation element. If the value of xlink:show is replace, the content should replace the existing window. The value of xlink:title is used for new and replace, and should be displayed as the name that the user can click on to follow the XLink. The values of xlink:show and xlink:title are heuristics for the dataset choosing widget, which may not be able to fully implement them. These heuristics are intended to follow the XLink specification as closely as possible. Note that the XLink has a fixed type of "simple" that is part of the DTD, so it does not have to be specified in the XML.
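A sketch of the three display behaviours, plus inline text, is given below. The URLs, titles, and text are hypothetical, and the xlink prefix is assumed to be declared on the enclosing catalog element:
<dataset name="d1">
  <!-- Plain text content, displayed inline -->
  <documentation>Daily model output, updated every morning.</documentation>
  <!-- xlink:show defaults to "new": the title is shown as a link that opens a new window -->
  <documentation xlink:href="http://example.org/docs/d1.html" xlink:title="About this dataset" />
  <!-- "embed": the linked content is displayed inline, as if it were text content -->
  <documentation xlink:href="http://example.org/docs/d1-notes.txt" xlink:show="embed" />
  <!-- "replace": the title is shown as a link whose target replaces the existing window -->
  <documentation xlink:href="http://example.org/docs/d1-full.html" xlink:title="Full description" xlink:show="replace" />
</dataset>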
<!ENTITY % MetadataType "THREDDS | ADN | Aggregation | DublinCore | DIF | FGDC | LAS | Other">
<!ELEMENT metadata ANY>
A metadata element contains or refers to structured information about datasets, which is used by client programs to properly display or search for the dataset. Typically, metadata is not displayed to an end-user when making selections from the catalog, although it may be useful to make it optionally available. We call this kind of content "machine readable" information.
<!ATTLIST metadata
xlink:type (simple) #FIXED "simple"
xlink:href CDATA #IMPLIED
metadataType (%MetadataType;) #REQUIRED
>
The metadata element must contain a metadataType attribute whose value comes from a controlled vocabulary. The types and formats of the metadata are still being developed, and the current list should be considered experimental. Most are currently not operational.
<!ELEMENT property EMPTY>
Property elements are arbitrary name/value pairs to associate with dataset, collection, or service elements. They will be used to create extended semantics, and should be available to client applications, but not typically displayed during dataset selection. Currently they have no specified semantics.
<!ATTLIST property
name CDATA #REQUIRED
value CDATA #REQUIRED
>
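For example, a dataset might carry properties like the following. The property names and values are purely hypothetical, since no semantics are specified for them:
<dataset name="d1">
  <property name="updateFrequency" value="daily" />
  <property name="contact" value="data-manager" />
</dataset>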
<!ENTITY % ServiceType "DODS | ADDE | NetCDF | Catalog | FTP | WMS | WFS | WCS | WSDL | Compound | Other">
<!ELEMENT service (property*, service*)>
<!ATTLIST service
name CDATA #REQUIRED
serviceType (%ServiceType;) #REQUIRED
base CDATA #REQUIRED
suffix CDATA #IMPLIED
>
A service element represents a data service. It must have a name attribute and a serviceType attribute whose value comes from a controlled vocabulary. The name must be unique within the catalog (note that catalogs referenced by a catalogRef contain their own ID namespaces). It must have a base attribute and may have an optional suffix attribute, which are used to construct the dataset URL (see constructing URLs). The base may be an absolute URL or relative to the catalog URL.
A service element may contain 0 or more property elements. These property elements are made available to the application when a dataset is selected, but are not otherwise used.
The scope of a service element is its sibling elements and their descendants, excluding catalogs referenced by catalogRef elements. The service name should be unique within its scope.
A service element with serviceType="Compound" must have nested service elements, and services with type other than Compound may not have nested service elements. Nested service elements may be used directly by dataset or access elements. They are at the same scoping level as their parent service.
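A sketch of a compound service with nested services used directly by datasets is shown below. The service names and URLs are hypothetical. The base attribute is required, but this document does not define its meaning for a Compound service, so the empty value here is only an assumption:
<dataset name="d1">
  <!-- base="" is an assumption; its meaning for Compound services is not defined -->
  <service name="all" serviceType="Compound" base="">
    <service name="dods" serviceType="DODS" base="http://s1/dods" />
    <service name="ftp" serviceType="FTP" base="ftp://s1/pub" />
  </service>
  <!-- Nested services are at the same scoping level as "all", so they can be referenced by name -->
  <dataset name="d1.1" serviceName="dods" urlPath="d1.1.nc" />
  <dataset name="d1.2" serviceName="ftp" urlPath="d1.2.nc" />
</dataset>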
Each dataset element must refer to one or more service
elements that appear in a parent collection. Since typically there will
be only a few service elements in a catalog but many dataset
elements, a service element factors out the common
properties of the data service for efficient representation within the
catalog.