DCAT-US - Version 3.0

Data Catalog Application Profile for the United States of America Candidate Recommendation Snapshot

The DCAT-US 3.0 Profile (DCAT-US 3.0) is an updated specification designed to facilitate data cataloging, discovery, and interoperability among US government agencies. Leveraging the strong foundation laid by the Project Open Data (POD) 1.1 standard (also known as [[DCAT-US-1.1]]), this profile seamlessly aligns with the emerging Data Catalog Vocabulary (DCAT) - Version 3 (DCAT 3) [[VOCAB-DCAT-3]] recommendations approved by the World Wide Web Consortium (W3C), all while upholding the essential FAIR principles. Moreover, it emphasizes maintaining compatibility with the existing POD 1.1 standard, ensuring a fluid transition. The result ensures data's Findability, Accessibility, Interoperability, and Reusability (FAIR).

The predominant significance of the DCAT-US 3.0 lies in its role as a bridge between the well-established DCAT-US 1.1 and the forward-looking DCAT 3, uniting them under a single, standardized approach for articulating and exchanging datasets. By harmonizing the most significant attributes of both standards, this profile also addresses the distinctive metadata prerequisites inherent to the US context. It goes above and beyond by encompassing specialized properties to address geospatial and statistical datasets, effectively harnessing established vocabularies to elevate the process of data sharing and subsequent reuse.

Distinguished by its usage of the Shapes Constraint Language (SHACL) [[?SHACL]] for structural and semantic validation, the DCAT-US 3.0 introduces a highly refined, interoperable, and future-proof framework for describing and validating dataset metadata. In essence, it is not just a specification but an advanced stride towards achieving a data-centric landscape where precise metadata description empowers the efficient flow of information while laying the groundwork for sustained innovation.

Background

The FAIRness Project is introducing a draft update to the Data Catalog (DCAT) standard for the United States. This update, “DCAT-US 3.0 Schema,” builds upon the requirements we received from agencies as well as data creators, providers, and users, Data Inventory statutory requirements, and the lessons learned over ten years of successful implementation of the Project Open Data Metadata Standard (DCAT-US v1.1) used by Data.gov.

We need your help to review and comment on this draft so that it meets agencies' data inventory needs and those of cross-government programs like Data.gov, GeoPlatform, and the Standard Application Process Portal.

Once approved and implemented, the update will improve the FAIRness, or Findability, Accessibility, Interoperability, and Reusability of all types of federal data. DCAT-US 3.0 will provide a *single* metadata standard able to support most requirements for documentation of business, technical, statistical, and geospatial data consistently.

The DCAT-US 3.0 Schema introduces the following key enhancements:

DCAT-US 3.0 represents a "profile" of the W3C DCAT standard, specifically aligned with [[VOCAB-DCAT-3]], rather than a new standard. It aims to tailor the DCAT specification to meet specific use cases and requirements within the United States.
It ensures backward compatibility with DCAT-US v1.1 metadata, facilitating seamless integration. While DCAT-US 3.0 introduces additional metadata elements to address emerging needs, it preserves existing elements (except where it corrects errors introduced in DCAT-US 1.1), so existing metadata does not need to be translated.
The schema extends support for enriched and updated controlled vocabularies, enhancing the precision in naming conventions for federal agencies, file formats, and units of measurement, thereby promoting uniformity across datasets.
DCAT-US 3.0 resolves the constraints encountered with DCAT-US v1.1 in documenting geospatial data. This removes the need for a separate federal standard dedicated to geospatial datasets and streamlines documentation.
Aligning with practices akin to the European Data Catalogue Application Profile (DCAT-AP), DCAT-US 3.0 has garnered vendor support, with ongoing efforts to broaden this support base, indicating a commitment to interoperability and standard adoption.

Please review the documentation below and provide feedback to help make this standard as useful as possible to you and the broader federal data user community.

Please follow the instructions found here to submit your comments and issues with the current draft schema specification.

Overview

The DCAT-US 3.0 Profile is a comprehensive update to the Project Open Data (POD) 1.1 standard, designed to meet the evolving needs of data exchange and interoperability among US government agencies. This profile builds on the foundation laid by POD 1.1 and is aligned with the latest DCAT 3 standard from the World Wide Web Consortium (W3C). In addition, the profile aims to embody the FAIR principles, ensuring that data is Findable, Accessible, Interoperable, and Reusable. This introduction will provide an overview of the purpose of this profile, highlight the gaps between POD 1.1 and DCAT 3, and elaborate on the differences and enhancements offered by the DCAT-US 3.0 profile.

Purpose and Evolution

The purpose of the DCAT-US 3.0 Profile is to improve data discoverability, accessibility, and interoperability among US government agencies. By adhering to the FAIR principles, the profile promotes more effective data sharing and reuse. The FAIR principles emphasize that data should be:

Findable: Easy to discover by humans and machines.
Accessible: Easily retrieved, with well-defined access protocols.
Interoperable: Ready for use with other data sources and applications.
Reusable: Clearly documented and licensed, facilitating reuse by others.

The DCAT-US 3.0 Profile bridges the gap between the POD 1.1 and DCAT 3 standards by incorporating the best features of both while also addressing specific metadata requirements unique to the US context. It offers a standardized approach for describing and exchanging datasets, thereby enabling more efficient data sharing and reuse.

Data Structure

The Application Profile specified in this document is based on the specification of the Data Catalog Vocabulary Version 3 (DCAT 3) [[VOCAB-DCAT-3]] developed under the responsibility of the W3C Dataset Exchange Working Group (DXWG). DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. Additional classes and properties from other well-known vocabularies are re-used where necessary.

The DCAT vocabulary consists of classes and properties.

Classes are things on the internet: Not all of them have URIs, but it is recommended to provide a URI for them. They are complex things like a person, an organization, a dataset, a website or a downloadable data file.
Classes have properties: The properties are the attributes describing these things. Some properties occur in more than one class, a title for example is a common attribute. Other properties are very specialized such as a file format that only makes sense for a data file.
Properties can be simple or complex: The value of some properties is itself an instance of a class. For example, an organization can have a website, or a dataset can have a data publisher. In general, a class can be distinguished from a property by its spelling: a property name starts with a lowercase letter, such as dcat:dataset, while a class name starts with a capital letter, such as dcat:Dataset.

Classes and properties are used to deliver the metadata in a structured way.
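The spelling convention described above can be checked mechanically. The following is a minimal sketch, assuming terms are written as compact `prefix:localName` strings; the helper name `kind_of_term` is hypothetical and not part of DCAT or DCAT-US, and the rule is only the general heuristic stated above.

```python
def kind_of_term(curie: str) -> str:
    """Classify a compact term such as 'dcat:Dataset' by the case of its local name."""
    prefix, sep, local = curie.partition(":")
    if not sep or not local:
        raise ValueError(f"expected 'prefix:localName', got {curie!r}")
    # Convention: class names start with an uppercase letter,
    # property names start with a lowercase letter.
    return "class" if local[0].isupper() else "property"


for term in ("dcat:Dataset", "dcat:dataset", "dcterms:title"):
    print(term, "->", kind_of_term(term))
```

Because the convention is a naming habit rather than a formal rule, a check like this is useful for spotting typos, not for deciding what a term means.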

Application Areas

The DCAT Application Profile for data portals in the United States (DCAT-US) is an Application Profile of the DCAT vocabulary.

The “Data Catalog Vocabulary” (DCAT) is a semantic language for describing data by means of an RDF vocabulary. It allows for a decentralized and interoperable approach for publishing and finding data by use of a common language for describing data.
DCAT is a generic language that can be applied in various contexts. An Application Profile applies the DCAT vocabulary to a specific domain, context, or application, with the goal of facilitating data discovery, access, and sharing. The DCAT-US application profile addresses specific requirements of the U.S. Government (laws, regulations, and policies) and other needs of U.S. data producers and consumers.
The DCAT-US Application Profile provides the guidance needed for data publishers to specify their data catalogs and for data portal managers to process data catalogs. DCAT-US specifies the schema for metadata -- data describing data that can be validated for correctness and completeness. Metadata is by definition secondary information about the data: when and by whom were they published, which usage conditions apply, how often are they updated, whom to contact about them, and where and how can they be accessed.

Gaps with DCAT-US 1.1

The DCAT-US 1.1 standard, while effective for its time, had some limitations that the DCAT 3 standard has addressed. The key differences between the two standards include:

Increased expressiveness: DCAT 3 offers a richer set of classes and properties, enabling more detailed descriptions of datasets and their relationships.
Improved support for geospatial data: DCAT 3 provides better support for describing geospatial datasets, including coordinate reference systems and geometry.
Enhanced handling of statistical data: DCAT 3 and RDF Data Cube (QB) specification introduce new vocabulary terms to describe statistical datasets and their dimensions more effectively.
Refined vocabularies: DCAT 3 benefits from updated reference controlled vocabularies to ensure better interoperability between agencies and systems.

DCAT-US Features

The DCAT-US 3.0 Profile not only incorporates the enhancements provided by DCAT 3 but also maintains the US-specific metadata requirements defined in POD 1.1. This profile offers a harmonized approach to data cataloging that accounts for the unique needs of US agencies.

One of the key features of this profile is its use of reference controlled vocabularies. These vocabularies enable better interoperability between US agencies by providing a common language for describing datasets. The profile also introduces new properties to handle geospatial data and statistical datasets, leveraging established vocabularies in these domains.

The DCAT-US specification introduces several key features designed to enhance the accessibility, interoperability, and effectiveness of data cataloging practices. Below, we outline the advantages of adopting DCAT-US over traditional document-centric metadata standards, such as ISO 19115, and how it meets the needs of modern data ecosystems.

Enhanced Interoperability and Integration: DCAT-US is engineered for maximum interoperability with web technologies and data catalogs, facilitating the sharing and discovery of datasets across diverse platforms. This level of integration surpasses the capabilities of document-centric standards such as ISO 19115, enabling broader visibility and usability of data assets.
Flexibility and Extensibility: Offering a flexible and extensible framework, DCAT-US adapts to the changing requirements of data publishers and consumers. Its ability to incorporate new metadata types ensures that the specification remains relevant in the face of technological advancements, a critical advantage over the more static metadata standards.
Web-Friendly and User-Centric: With its modern web-oriented design, DCAT-US enhances data accessibility and usability through support of Linked Data formats such as Turtle and JSON-LD, making datasets more discoverable and consumable for a wide range of users. This approach marks a significant improvement over the document-centric standards, prioritizing ease of use and efficiency.
Alignment with Global Standards: By aligning with the W3C's Data Catalog Vocabulary (DCAT), DCAT-US adheres to globally recognized standards, facilitating international data sharing and collaboration. This global perspective helps overcome the limitations of more narrowly focused standards that are defined only by a syntactic, structural schema.
Cost-Effective and Efficient Data Management: The adoption of DCAT-US promotes cost-effective and efficient data management practices. Its emphasis on standardization and interoperability reduces the complexities and costs associated with data integration, leveraging web infrastructure for data publication and consumption.

In conclusion, DCAT-US represents a forward-looking solution that significantly advances beyond traditional rigid document-centric metadata standard silos. Its design and features cater to the demands of contemporary data management and publishing, ensuring that data assets are more visible, accessible, and valuable to users across the data ecosystem.

Profile Encoding

The encoding of the DCAT-US profile involves the technical aspects of how data is represented and exchanged, addressing questions about data format and interoperability. While the DCAT-US 3.0 conformance does not strictly mandate the use of RDF serialization for data exchange, it emphasizes the importance of ensuring that the exchanged format can be unambiguously transformed into RDF. This flexibility allows for interoperability while accommodating various data exchange requirements.

One prevalent format for data exchange between systems is JSON (JavaScript Object Notation), which is widely used due to its simplicity and human-readable nature. To facilitate data exchange in JSON while adhering to the DCAT-US profile, a dedicated mechanism is provided: the JSON-LD context file. JSON-LD 1.1 (JSON for Linked Data) is a W3C Recommendation [[?JSON-LD]] that establishes a standardized approach for interpreting JSON structures as RDF, enhancing the potential for semantic integration and interoperability.

The DCAT-US profile offers a [[?JSON-LD]] context file that implementers can utilize as a foundation for their data exchange processes. By incorporating this JSON-LD context file, implementers can ensure that their data adheres to the DCAT-US standards while being exchanged in a JSON format. This allows for a coherent and consistent representation of the data that aligns with the RDF model, promoting interoperability among different systems and tools.

It's important to note that the provided JSON-LD context file is not normative, indicating that other JSON-LD contexts can also be used to establish a conformant DCAT-US data exchange. This flexibility caters to various implementation scenarios and data requirements, while still adhering to the overarching principles of the DCAT-US profile. Overall, the encoding of the DCAT-US profile acknowledges the significance of data format and interchange methods, leveraging JSON-LD and related mechanisms to facilitate seamless and interoperable data exchange within the context of the DCAT-US specification.
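To make the role of a context file concrete, here is a minimal sketch of the idea: a context maps plain JSON keys to property IRIs so that a JSON record can be read as RDF-style triples. The `CONTEXT` dictionary and the `to_triples` helper are hypothetical simplifications, not the context file published with DCAT-US and not a full JSON-LD processor.

```python
import json

# Hypothetical, heavily simplified context: plain JSON keys mapped to property IRIs.
# It is NOT the context file published with DCAT-US.
CONTEXT = {
    "title": "http://purl.org/dc/terms/title",
    "description": "http://purl.org/dc/terms/description",
    "keyword": "http://www.w3.org/ns/dcat#keyword",
}


def to_triples(subject: str, record: dict, context: dict) -> list:
    """Turn a flat JSON object into (subject, predicate, value) triples.

    Keys absent from the context are skipped, which mirrors how JSON-LD
    drops terms it cannot map to an IRI.
    """
    triples = []
    for key, value in record.items():
        predicate = context.get(key)
        if predicate is None:
            continue
        values = value if isinstance(value, list) else [value]
        triples.extend((subject, predicate, v) for v in values)
    return triples


record = json.loads(
    '{"title": "Air quality", "keyword": ["air", "health"], "internalId": 7}'
)
for triple in to_triples("https://example.gov/dataset/1", record, CONTEXT):
    print(triple)
```

In practice an implementer would rely on a JSON-LD library and a complete context; the sketch only shows why a JSON exchange format can still be transformed into RDF unambiguously.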

Profile Validation

While the JSON Schema approach used in POD 1.1 was effective in certain scenarios, it has limitations when compared to using SHACL for defining data models and constraints:

Expressiveness: JSON Schema primarily targets validation of JSON data structures, which can be less expressive than RDF data models used with SHACL. RDF enables a more flexible and semantically rich representation of data, while SHACL is designed to provide constraints and validation for RDF data.
Linked Data Compatibility: JSON Schema is not specifically designed for Linked Data or semantic web applications. SHACL, on the other hand, is tailored for RDF and Linked Data, making it a more natural fit for data models that need to interoperate with other semantic web resources and standards.
Inferencing Support: SHACL can be used in conjunction with RDF reasoners to validate inferred triples, offering advanced inferencing capabilities beyond the scope of JSON Schema. This feature enables more powerful and intelligent data validation.
Extensibility: SHACL is designed to be extensible, allowing users to define custom constraint components, which can be reused across multiple shapes and datasets. In contrast, JSON Schema has a fixed set of validation keywords, and extending it may require the introduction of non-standard or custom keywords, potentially affecting interoperability.

Considering these limitations, the DCAT-US 3.0 Profile has chosen SHACL as the foundation for its data modeling and validation, ensuring a more expressive, interoperable, and future-proof framework for defining dataset metadata.

The DCAT-US 3.0 Profile is defined using the Shapes Constraint Language (SHACL), which offers several advantages over previous approaches:

Expressive and flexible: SHACL allows for the precise definition of constraints on RDF data, making it easier to validate and verify datasets.
Scalable and efficient: SHACL is designed for efficient validation, making it suitable for use with large datasets and complex data models.
Widely adopted: As a W3C recommendation, SHACL enjoys broad support among data management tools and libraries, facilitating interoperability and reuse.

By using [[?SHACL]], the DCAT-US 3.0 Profile ensures a robust and extensible foundation for future updates, as well as compatibility with a wide range of data processing tools and applications.
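The kind of constraint SHACL expresses can be illustrated with a small sketch. Each entry below mimics a property shape with a path and the equivalent of `sh:minCount` and `sh:maxCount`. The shape contents, the `DATASET_SHAPE` name, and the `validate` helper are assumptions made for illustration; they are not copied from the DCAT-US SHACL files, and a real deployment would use a SHACL engine rather than hand-written checks.

```python
# Each entry mimics a SHACL property shape: a path plus minimum and maximum counts.
# The values are illustrative and are not taken from the DCAT-US SHACL files.
DATASET_SHAPE = [
    {"path": "dcterms:title", "min_count": 1, "max_count": None},
    {"path": "dcterms:description", "min_count": 1, "max_count": None},
    {"path": "dcterms:accrualPeriodicity", "min_count": 0, "max_count": 1},
]


def validate(node: dict, shape: list) -> list:
    """Return one message per violated cardinality constraint.

    `node` maps each property path to the list of values it carries.
    """
    violations = []
    for constraint in shape:
        path = constraint["path"]
        count = len(node.get(path, []))
        if count < constraint["min_count"]:
            violations.append(
                f"{path}: expected at least {constraint['min_count']}, found {count}"
            )
        max_count = constraint["max_count"]
        if max_count is not None and count > max_count:
            violations.append(f"{path}: expected at most {max_count}, found {count}")
    return violations


node = {
    "dcterms:title": ["Air quality"],
    "dcterms:accrualPeriodicity": ["R/P1Y", "R/P1M"],
}
for message in validate(node, DATASET_SHAPE):
    print(message)
```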

Document Status

Candidate Recommendation Snapshot

Data Provider requirements

In order to conform to this Application Profile, an application that provides metadata MUST:

Provide a description of the Catalog, including at least the mandatory properties specified for the class Catalog.
Provide information for the mandatory properties for [Catalog Records](#properties-for-catalog-record), if descriptions of Catalog Records are provided. The provision of descriptions of Catalog Records is optional.
Provide descriptions of Datasets in the Catalog, including at least the mandatory properties for the class Dataset.
Provide descriptions of Distributions, if any, of Datasets in the Catalog, including at least the mandatory properties for the class Distribution.
Provide descriptions of Data Services, if any, of Datasets in the Catalog, including at least the mandatory properties for the class Data Service.
Provide descriptions of all organizations involved in the descriptions of Catalog and Datasets, including at least the mandatory properties for the class Agent.
Apply the publication requirements for the controlled vocabularies as mentioned in section [[[#controlled-vocabularies]]].

For the properties listed in the table in section [[[#controlled-vocabularies]]], the associated controlled vocabularies MUST be used. Additional controlled vocabularies MAY be used. In addition to the mandatory properties, any of the recommended and optional properties defined in each class description MAY be provided.

Receiver requirements

In order to conform to this Application Profile, an application that receives metadata MUST be able to:

Process RDF catalogs that conform to DCAT-US.
Process information for all classes and properties specified in the class descriptions.
Process information for all controlled vocabularies specified in section [[[#controlled-vocabularies]]].
Properties not mentioned in this specification MAY be used if they are defined in DCAT 3 and their usage conforms to DCAT 3.

"Processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).

DCAT-US Classes

This section displays the classes for the DCAT-US 3 profile. We distinguish core classes, which represent the primary business entities that the application profile is concerned with, from supporting classes, which are used to provide additional context, metadata, or structure to the core classes.

The following table summarizes the change types used to describe how class definitions have evolved in the DCAT-US 3.0 Application Profile, from new classes designed specifically for DCAT-US 3.0 to classes adopted from the broader DCAT specifications (DCAT 1, DCAT 2, and DCAT 3). Understanding these change types helps data practitioners see how the profile aligns with the various DCAT versions.

| Change Type | Description |
| --- | --- |
| New! | New DCAT-US 3.0 specific class that is not referred in DCAT specifications |
| Aligned | Class introduced in DCAT specifications that does not exist in DCAT-US 1.1 |

Core Classes

The DCAT-US Application Profile (“DCAT-US”) is structured around the following main classes:

| Class name | Usage note for the Application Profile | URI and Reference | Changes from DCAT-US 1.1 |
| --- | --- | --- | --- |
| Catalog | A catalog or repository that hosts the Datasets or Data Services being described. | dcat:Catalog | Aligned |
| Catalog Record | A record in a catalog, describing the registration of a single dcat:Resource | dcat:CatalogRecord | Aligned |
| Dataset | A conceptual entity that represents the information published. | dcat:Dataset | Aligned |
| Distribution | A physical embodiment of the Dataset in a particular format. | dcat:Distribution | Aligned |
| Data Service | A collection of operations that provides access to one or more datasets or data processing functions. | dcat:DataService | Aligned |
| Dataset Series | A collection of datasets that are published separately, but share some characteristics that group them. | dcat:DatasetSeries | Aligned |

Figure: UML Model for Core Classes of DCAT-US 3.0

Supporting Classes

| Class name | Usage note for the Application Profile | URI and Reference | Changes from DCAT-US 1.1 |
| --- | --- | --- | --- |
| AccessRestriction | The "AccessRestriction" class used by NARA represents limitations placed on accessing specific records or information within their archives, ensuring controlled and responsible access based on legal, ethical, or security considerations. | dcat-us:AccessRestriction | New! |
| Activity | An activity carried out by an Agent over an entity, according to a plan, and generating another entity. | prov:Activity | Aligned |
| Address (Location) | A postal address for a Location. | locn:Address | New! |
| Address (Contact Point) | A postal address for a Contact Point. | vcard:Address | Aligned |
| Agent | An entity (e.g., an individual or an organization) that is associated with Catalogs, Catalog Records, Data Services, or Datasets. If the Agent is an organization, the use of the Organization Ontology [[VOCAB-ORG]] is recommended. | foaf:Agent | Aligned |
| Attribution | A responsibility of an Agent for a resource. | prov:Attribution | Aligned |
| Concept | A subject of a Catalog, Dataset, or Data Service. | skos:Concept | Aligned |
| Concept scheme | A concept collection (e.g. controlled vocabulary) in which the Concept is defined. | skos:ConceptScheme | Aligned |
| Checksum | A value that allows the contents of a file to be authenticated. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented. | spdx:Checksum | Aligned |
| Contact | A description following the [[VCARD-RDF]] specification, e.g. to provide telephone number and e-mail address for a contact point using vcard:Kind. | vcard:Kind | Aligned |
| CUI Restriction | Controlled Unclassified Information (CUI) is information that requires safeguarding or dissemination controls pursuant to and consistent with applicable law, regulations, and government-wide policies but is not classified. | dcat-us:CuiRestriction | New! |
| Document | A textual resource intended for human consumption that contains information, e.g., a Web page about a Dataset, a publication, a book chapter, a technical report, but also a blog post. | foaf:Document | Aligned |
| Geographic Bounding Box | GeographicBoundingBox describes the spatial extent of the domain of application of a resource and is standardized in the WGS 84 Lat/Long coordinate system. | dcat-us:GeographicBoundingBox | New! |
| Identifier | An identifier in a particular context, consisting of the string that is the identifier; an optional identifier for the identifier scheme; an optional identifier for the version of the identifier scheme; an optional identifier for the agency that manages the identifier scheme. | adms:Identifier | New! |
| LiabilityStatement | A formal declaration accompanying a dataset which outlines the responsibilities and limitations of the data provider in terms of the accuracy, completeness, and potential use of the data. It often serves to limit the legal exposure of the data provider by defining the scope of allowed uses and disclaiming warranties or guarantees. | dcat-us:LiabilityStatement | New! |
| License Document | A legal document giving official permission to do something with a resource. | dcterms:LicenseDocument | Aligned |
| Location | A spatial region or named place. It can be represented using a controlled vocabulary or with geographic coordinates. | dcterms:Location | Aligned |
| Media type | A media type, e.g. the format of a computer file. | dcterms:MediaType | Aligned |
| Metric | Represents a standard to measure a quality dimension. An observation (instance of dqv:QualityMeasurement) assigns a value in a given unit to a Metric. In DCAT-US, this class is used to define individuals corresponding to the different types of spatial resolution. | dqv:Metric | Aligned |
| Organization | Represents a collection of people organized together into a community or other social, commercial or political structure. The group has some common purpose or reason for existence which goes beyond the set of people belonging to it and can act as an Agent. Organizations are often decomposable into hierarchical structures. | org:Organization | Aligned |
| Period of time | An interval of time that is named or defined by its start and end dates. | dcterms:PeriodOfTime | Aligned |
| Person | This class represents an individual human being or a person. It can be used to provide information about individuals, such as their name, email address, homepage URL, and other personal details. | foaf:Person | Aligned |
| Provenance Statement | A statement of any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation. | dcterms:ProvenanceStatement | New! |
| Quality Measurement | Represents the evaluation of a given resource (as a Data Service, Dataset, or Distribution) against a specific quality metric. | dqv:QualityMeasurement | Aligned |
| Relationship | An association class for attaching additional information to a relationship between DCAT Resources. | dcat:Relationship | Aligned |
| Rights statement | A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights. | dcterms:RightsStatement | Aligned |
| Role | A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships. Note it is a subclass of skos:Concept. | dcat:Role | Aligned |
| Standard | A standard or other specification to which a Catalog, Catalog Record, Data Service, Dataset, or Distribution conforms. | dcterms:Standard | Aligned |
| Use Restriction | A UseRestriction is a set of rules, guidelines, or legal provisions that dictate how a particular resource, asset, information, or object can be utilized. Use restrictions may encompass limitations on access, distribution, reproduction, modification, or sharing, and they are often put in place to protect privacy, intellectual property rights, security, or compliance with legal or ethical standards. | dcat-us:UseRestriction | New! |

Figure: UML Model for Supporting Classes of DCAT-US 3.0

Properties per Class

Overview

Requirement levels

DCAT-US defines four requirement levels for data receivers and senders:

Mandatory property: a receiver MUST be able to process the information for that property; a sender MUST provide the information for that property.
Recommended property: a receiver MUST be able to process the information for that property; a sender SHOULD provide the information for that property if it is available.
Optional property: a receiver MUST be able to process the information for that property; a sender MAY provide the information for that property but is not obliged to do so.
Deprecated property: a receiver SHOULD be able to process information about instances of that property; a sender SHOULD NOT provide the information about instances of that property.

The terms MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this section and in the following sections are to be interpreted as defined in RFC 2119.

In the given context, the term "processing" means that receivers MUST accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).
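The four requirement levels can be written down as a lookup table and applied to a sender's metadata. The sketch below is illustrative: the `OBLIGATIONS` table transcribes the list above, while the `check_sender` helper, the mapping of MUST to "error" and SHOULD or SHOULD NOT to "warning", and the `ex:legacyFlag` property are assumptions introduced for the example. The property levels are those listed for the AccessRestriction class.

```python
# Obligations per requirement level, transcribed from the list above.
OBLIGATIONS = {
    "mandatory": {"receiver": "MUST", "sender": "MUST"},
    "recommended": {"receiver": "MUST", "sender": "SHOULD"},
    "optional": {"receiver": "MUST", "sender": "MAY"},
    "deprecated": {"receiver": "SHOULD", "sender": "SHOULD NOT"},
}

# Assumption of this sketch: a broken MUST is an error,
# a broken SHOULD or SHOULD NOT is a warning.
SEVERITY = {"MUST": "error", "SHOULD": "warning", "SHOULD NOT": "warning"}


def check_sender(levels: dict, provided: set) -> list:
    """Return (severity, property, keyword) tuples for a sender's metadata.

    `levels` maps each property to its requirement level; `provided` is the
    set of properties the sender actually supplied.
    """
    findings = []
    for prop, level in levels.items():
        keyword = OBLIGATIONS[level]["sender"]
        present = prop in provided
        if keyword in ("MUST", "SHOULD") and not present:
            findings.append((SEVERITY[keyword], prop, keyword))
        elif keyword == "SHOULD NOT" and present:
            findings.append((SEVERITY[keyword], prop, keyword))
    return findings


# Levels for the AccessRestriction properties, plus one made-up deprecated property.
levels = {
    "dcat-us:restrictionStatus": "mandatory",
    "dcat-us:specificRestriction": "recommended",
    "dcat-us:restrictionNote": "optional",
    "ex:legacyFlag": "deprecated",  # hypothetical, for illustration only
}
for finding in check_sender(levels, {"dcat-us:restrictionNote", "ex:legacyFlag"}):
    print(finding)
```

A missing recommended property is reported as a warning because the check cannot know whether the information was available to the sender.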

Notations

Property: denotes the name that the property is given in DCAT-US.
URI: denotes the property URI.
Range: specifies the range of values that is expected for the property.
ReqLevel (“Requirement level”): denotes whether the class / property is mandatory, recommended or optional.
Card (“Cardinality”): specifies the minimum number of values that MUST be provided for that property and the maximum number of values that MAY be provided.
Usage note: specifies custom usage instructions and provides background information.
CV (“Controlled Vocabulary”): defines which controlled vocabulary SHOULD be used.
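The Card notation can be turned into a minimum and maximum that software can test against. This is a minimal sketch: the `parse_cardinality` and `within` helpers are hypothetical, and treating `n` or `*` as "unbounded" is an assumption of the sketch, not something the notation list above states.

```python
import re
from typing import Optional, Tuple

# Accepts "1", "0..1", "1..1", and (by assumption) "1..n" or "0..*" for "unbounded".
_CARD = re.compile(r"^(\d+)(?:\.\.(\d+|n|\*))?$")


def parse_cardinality(card: str) -> Tuple[int, Optional[int]]:
    """Parse a Card value into (minimum, maximum); maximum is None when unbounded."""
    match = _CARD.match(card.strip())
    if match is None:
        raise ValueError(f"unrecognized cardinality: {card!r}")
    lower = int(match.group(1))
    upper_text = match.group(2)
    if upper_text is None:
        # A single number such as "1" means exactly that many values.
        return lower, lower
    if upper_text in ("n", "*"):
        return lower, None
    upper = int(upper_text)
    if upper < lower:
        raise ValueError(f"maximum below minimum in {card!r}")
    return lower, upper


def within(card: str, count: int) -> bool:
    """Check whether `count` values satisfy the cardinality `card`."""
    lower, upper = parse_cardinality(card)
    return count >= lower and (upper is None or count <= upper)


for card in ("1..1", "0..1", "1", "1..n"):
    print(card, "->", parse_cardinality(card))
```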

Property Evolution in DCAT-US 3.0

The following table provides an overview of the various types of changes and updates within the DCAT-US specifications, shedding light on the evolution and adaptation of the data catalog standard. Each change type is categorized, and its significance is explained, ranging from the introduction of new properties to updates that align with the latest DCAT specifications. Understanding these changes is essential for data practitioners and stakeholders seeking to keep pace with the evolving landscape of data cataloging and data sharing standards.

| Change Type | Description |
| --- | --- |
| New! | New DCAT-US 3.0 specific property that is not referred in DCAT specifications |
| Aligned | Property aligned with latest DCAT-3 specification that does not exist in DCAT-US 1.1 |
| Fixed | Fixed property that is inconsistent with DCAT specification |
| No Change | No change from DCAT-US 1.1 profile |
| Multilingual Support | Extension of DCAT-US property to support multilingual values |

AccessRestriction

The class "AccessRestriction" used by the National Archives and Records Administration (NARA) refers to a categorization or specification that denotes limitations or conditions imposed on the accessibility of certain records, documents, or information within their archival holdings. Access restrictions are employed to regulate and control access to sensitive or confidential content based on legal, ethical, security, or other relevant considerations. These restrictions may pertain to who can access the information, the purposes for which it can be accessed, and the conditions under which it can be utilized. The "AccessRestriction" class provides a structured framework for classifying and managing these access limitations within NARA's archival context, contributing to the proper governance and responsible dissemination of historical records and data.

RDF Class:dcat-us:AccessRestrictionDefinition:The "AccessRestriction" class used by NARA represents limitations placed on accessing specific records or information within their archives, ensuring controlled and responsible access based on legal, ethical, or security considerations.Usage note

The "AccessRestriction" class serves as a valuable tool within NARA's archival framework, enabling the organization to effectively manage and communicate access limitations for specific records or information. By employing this class, NARA can categorize and enforce controlled access to sensitive content, safeguarding confidentiality, adhering to legal requirements, and preserving the integrity of historical data. Researchers, archivists, and authorized users can rely on "AccessRestriction" to navigate and understand the accessibility parameters associated with archived materials, facilitating responsible information dissemination and usage.

Rationale: The "AccessRestriction" class in the DCAT-US application profile is essential for categorizing and managing access restrictions according to NARA standards, ensuring responsible access to sensitive historical records. It enhances transparency, aiding researchers and authorized users in understanding and navigating access parameters for archived materials.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| restriction status | `dcat-us:restrictionStatus` | `skos:Concept` | M | 1..1 |
| specific restriction | `dcat-us:specificRestriction` | `skos:Concept` | R | 0..1 |
| restriction note | `dcat-us:restrictionNote` | `rdfs:Literal` | O | 0..1 |

Mandatory Properties

Property: restriction status

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| restriction status | Mandatory | 1 | `dcat-us:restrictionStatus` | `skos:Concept` | The indication of whether or not there are access restrictions on the data. |

Recommended Properties

Property: specific restriction

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| specific restriction | Recommended | 0..1 | `dcat-us:specificRestriction` | `skos:Concept` | The authority of the restriction. |

Optional Properties

Property: restriction note

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| restriction note | Optional | 0..1 | `dcat-us:restrictionNote` | `rdfs:Literal` | A note related to the access restriction. |

Example
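As an illustration of the cardinalities listed for this class, the following minimal Python sketch builds an AccessRestriction node as a JSON-LD-style dictionary and checks it against the summary table. The concept URIs under `example.org`, the note text, and the helper name `check_access_restriction` are assumptions made for this sketch and are not defined by DCAT-US; a real deployment would more likely rely on the SHACL shapes.

```python
# Hypothetical concept URIs, used only for illustration.
RESTRICTED = "https://example.org/vocab/restriction-status/restricted"
STATUTE = "https://example.org/vocab/specific-restriction/statute"

access_restriction = {
    "@type": "dcat-us:AccessRestriction",
    "dcat-us:restrictionStatus": {"@id": RESTRICTED},   # mandatory, exactly one
    "dcat-us:specificRestriction": {"@id": STATUTE},    # recommended, at most one
    "dcat-us:restrictionNote": "Access requires prior approval.",  # optional
}

# Cardinalities copied from the Properties Summary table.
LIMITS = {
    "dcat-us:restrictionStatus": (1, 1),
    "dcat-us:specificRestriction": (0, 1),
    "dcat-us:restrictionNote": (0, 1),
}


def check_access_restriction(node):
    """Return a list of cardinality problems found in the node."""
    problems = []
    for prop, (low, high) in LIMITS.items():
        value = node.get(prop, [])
        # A bare value counts as one occurrence; a list counts its items.
        count = len(value) if isinstance(value, list) else 1
        if not low <= count <= high:
            problems.append(f"{prop}: expected {low}..{high}, found {count}")
    return problems


print(check_access_restriction(access_restriction))
```

An empty list means the node satisfies the three cardinalities; omitting `dcat-us:restrictionStatus` would be reported as a problem.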

Activity

RDF Class: `prov:Activity`

Definition: An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.

Usage note: The activity associated with the generation of a dataset will typically be an initiative, project, mission, survey, or ongoing activity ("business as usual"). Multiple `prov:wasGeneratedBy` properties can be used to indicate the dataset production context at various levels of granularity. Details about how to describe the activity that generated a dataset are out of scope for this application profile. `prov:Activity` provides minimum basic properties for labeling and classification of activities.

Rationale:

Integrating prov:Activity into the DCAT-US schema offers a streamlined, generic class to represent a myriad of operations, such as initiatives, projects, and ongoing activities, without the complexity of managing numerous specialized classes. This inclusion not only simplifies the representation of varied activities under a unified semantic framework but also enhances data provenance tracking and interoperability across diverse systems and domains. Consequently, it provides a flexible, future-proof approach to accommodate evolving types of activities without necessitating continual schema modifications.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| label | `rdfs:label` | `xsd:string` | M | 1..n |
| category | `dcterms:type` | `skos:Concept` | O | 0..1 |

Mandatory Properties

Property: label

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| label | Mandatory | 1..n | `rdfs:label` | `xsd:string` |

Usage note: This property is used to give a human-readable label for the activity.

Optional Properties

Property: category

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| category | Optional | 0..n | `dcterms:type` | `skos:Concept` |

Example

Address (Contact Point)

RDF Class: `vcard:Address`

Obligation: Optional

Definition: Specifies the components of the delivery address for a contact point.

Usage note: This class is used only to associate an address with a contact point. When incorporating [[VCARD-RDF]] `vcard:Address` within DCAT-US, use its properties, such as `vcard:street-address`, `vcard:locality`, and `vcard:country-name`, to provide comprehensive and accurate address details for entities like organizations or publishers. Always adhere to consistent formatting across the catalog, be mindful of privacy considerations, especially for individual addresses, and validate the data regularly to maintain its accuracy and reliability.

Rationale: Integrating [[VCARD-RDF]]'s contact point address into DCAT-US ensures a standardized, interoperable format for presenting address data.

Properties Summary

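| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| administrative area | `vcard:region` | `rdfs:Literal` | R | 0..1 |
| city | `vcard:locality` | `rdfs:Literal` | R | 0..1 |
| country name | `vcard:country-name` | `rdfs:Literal` | R | 0..1 |
| postal code | `vcard:postal-code` | `rdfs:Literal` | R | 0..1 |
| street address | `vcard:street-address` | `rdfs:Literal` | R | 0..1 |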

Recommended Properties

Property: administrative area

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| administrative area | Recommended | 0..1 | `vcard:region` | `rdfs:Literal` |

Usage note: The administrative area of an Address of the Kind. Depending on the country, this corresponds to a province, a county, a region, or a state.

Property: city

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| city | Recommended | 0..1 | `vcard:locality` | `rdfs:Literal` |

Usage note: The city of an Address of the Kind.

Property: country

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| country | Recommended | 0..1 | `vcard:country-name` | `rdfs:Literal` |

Usage note: The country of an Address of the Kind.

Property: postal code

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| postal code | Recommended | 0..1 | `vcard:postal-code` | `rdfs:Literal` |

Usage note: The postal code of an Address of the Kind.

Property: street address

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| street address | Recommended | 0..1 | `vcard:street-address` | `rdfs:Literal` |

Usage note: The street name and civic number of an Address of the Kind.

Example
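The five recommended address properties can be assembled as in the following Python sketch, which builds a `vcard:Address` node as a JSON-LD-style dictionary and attaches it to a contact point. The helper `make_address`, the sample address, and the use of `vcard:fn` and `vcard:hasAddress` on the contact point are assumptions for illustration; they come from the vCard ontology rather than from this section.

```python
# Mapping from friendly argument names to the vCard properties in the table.
FIELDS = {
    "street_address": "vcard:street-address",
    "city": "vcard:locality",
    "administrative_area": "vcard:region",
    "postal_code": "vcard:postal-code",
    "country_name": "vcard:country-name",
}


def make_address(**parts):
    """Build a vcard:Address node, trimming values and skipping empty ones."""
    unknown = set(parts) - set(FIELDS)
    if unknown:
        raise ValueError(f"unknown address parts: {sorted(unknown)}")
    node = {"@type": "vcard:Address"}
    for key, prop in FIELDS.items():
        value = (parts.get(key) or "").strip()
        if value:  # every property is 0..1, so missing parts are simply omitted
            node[prop] = value
    return node


# Hypothetical contact point; the name and address are invented.
contact_point = {
    "@type": "vcard:Kind",
    "vcard:fn": "Example Agency Open Data Desk",
    "vcard:hasAddress": make_address(
        street_address="123 Example Ave",
        city="Springfield",
        administrative_area="IL",
        postal_code="62701",
        country_name="United States",
    ),
}
```

Trimming whitespace in one place is a small step toward the consistent formatting that the usage note asks for.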

Address (Location)

RDF Class: `locn:Address`

Definition: The address of a location.

Usage note: This class is used to describe a location defined by an address. It should be used only with the property `dcterms:spatial`, not the contact point address property.

Rationale: Incorporating `locn:Address` from the W3C Location ontology [[LOCN]] into DCAT-US provides a standardized, structured, and extensible format to represent physical addresses, facilitating consistent, interoperable, and precise sharing of location information across various datasets and digital platforms.

Properties Summary

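| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| administrative area | `locn:adminUnitL2` | `rdfs:Literal` | R | 0..1 |
| city | `locn:postName` | `rdfs:Literal` | R | 0..1 |
| country | `locn:adminUnitL1` | `rdfs:Literal` | R | 0..1 |
| postal code | `locn:postCode` | `rdfs:Literal` | R | 0..1 |
| street address | `locn:thoroughfare` | `rdfs:Literal` | R | 0..1 |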

Recommended Properties

Property: administrative area

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| administrative area | Recommended | 0..1 | `locn:adminUnitL2` | `rdfs:Literal` |

Usage note: The administrative area of an Address of the Agent. Depending on the country, this corresponds to a province, a county, a region, or a state.

Property: city

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| city | Recommended | 0..1 | `locn:postName` | `rdfs:Literal` |

Usage note: The city of an Address of the Agent.

Property: country

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| country | Recommended | 0..1 | `locn:adminUnitL1` | `rdfs:Literal` |

Usage note: The country of an Address of the Agent.

Property: postal code

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| postal code | Recommended | 0..1 | `locn:postCode` | `rdfs:Literal` |

Usage note: The postal code of an Address of the Agent.

Property: street address

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| street address | Recommended | 0..1 | `locn:thoroughfare` | `rdfs:Literal` |

Usage note: The street name and civic number of an Address of the Agent.

Agent

RDF Class: `foaf:Agent`

Definition: An entity that acts on something (e.g., a person, group, software, or physical artifact).

Usage note:

Use this class when referring to a software agent that is associated with Catalogs, Catalog Records, Data Services, or Datasets.
If the Agent is an organization, the use of org:Organization is recommended.
If the Agent is a person, the use of foaf:Person is recommended.

Rationale: The addition of the foaf:Agent class in DCAT-US 3.0 serves a dual purpose. Firstly, it allows for the representation of software agents, aligning with modern data automation needs. Secondly, it acts as an abstract class for both foaf:Person and org:Organization, promoting consistency and interoperability while simplifying resource descriptions within the dataset catalog.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| name | `foaf:name` | `xsd:string` | M | 1..1 |
| type | `dcterms:type` | `skos:Concept` | R | 0..1 |

Mandatory Properties

Property: name

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| name | Mandatory | 1 | `foaf:name` | `xsd:string` | The name of the software agent. |

Recommended Properties

Property: type

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| type | Recommended | 0..1 | `dcterms:type` | `skos:Concept` | This property refers to a type of the Agent that makes the Catalog, Catalog Record, Data Service, or Dataset available. |

Example
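The usage note for this class amounts to a choice between three classes. The Python sketch below shows one way to encode that choice when generating JSON-LD-style nodes; the function `make_agent`, its `kind` argument, and the sample names and URI are hypothetical and only serve to illustrate the rule.

```python
# Class to emit for each kind of agent, following the usage note.
AGENT_CLASSES = {
    "software": "foaf:Agent",
    "organization": "org:Organization",
    "person": "foaf:Person",
}


def make_agent(name, kind="software", type_uri=None):
    """Build an agent node with the mandatory name and an optional type."""
    if kind not in AGENT_CLASSES:
        raise ValueError(f"unknown agent kind: {kind!r}")
    if not name or not name.strip():
        raise ValueError("foaf:name is mandatory")
    node = {"@type": AGENT_CLASSES[kind], "foaf:name": name.strip()}
    if type_uri is not None:
        # dcterms:type is recommended and points at a skos:Concept.
        node["dcterms:type"] = {"@id": type_uri}
    return node


harvester = make_agent(
    "Example Metadata Harvester",
    type_uri="https://example.org/vocab/agent-type/harvester",  # hypothetical
)
agency = make_agent("Example Agency", kind="organization")
```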

Attribution

RDF Class: `prov:Attribution`

Definition: A responsibility of an Agent for a resource.

Usage note: Used to link to an Agent where the nature of the relationship is known but does not match one of the standard [[DCTERMS]] properties (`dcterms:creator`, `dcterms:contributor`, `dcterms:rightsHolder`, and `dcterms:publisher`). Use `dcat:hadRole` on the `prov:Attribution` to capture the responsibility of the Agent with respect to the Resource.

Rationale

The inclusion of prov:Attribution in the DCAT profile enables clear data source attribution, promoting responsible data sharing and proper citation practices. It aligns the profile with data provenance best practices for accurate attribution in data sharing.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| agent | `prov:agent` | `foaf:Agent` | M | 1 |
| role | `dcat:hadRole` | `dcat:Role` | M | 1 |

Mandatory Properties

Property: agent

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| agent | Mandatory | 1 | `prov:agent` | `foaf:Agent` | The `prov:agent` property references an Agent that plays a role in the resource. |

Property: role

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| role | Mandatory | 1 | `dcat:hadRole` | `dcat:Role` | The function of an entity or agent with respect to another entity or resource. |

Example
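The usage note distinguishes relationships covered by a standard [[DCTERMS]] property from those that need a qualified attribution. A minimal Python sketch of that decision, using JSON-LD-style dictionaries, might look as follows. The function `attribute`, the short role keys, and the `example.org` URIs are assumptions for illustration; values are collected in lists for simplicity, so the cardinality limits of the individual properties are not enforced here.

```python
# Relationships that already have a dedicated DCTERMS property.
STANDARD_ROLES = {
    "creator": "dcterms:creator",
    "contributor": "dcterms:contributor",
    "rightsHolder": "dcterms:rightsHolder",
    "publisher": "dcterms:publisher",
}


def attribute(resource, agent, role):
    """Link an agent to a resource.

    `role` is either a key of STANDARD_ROLES or the URI of a dcat:Role.
    """
    if role in STANDARD_ROLES:
        resource.setdefault(STANDARD_ROLES[role], []).append(agent)
    else:
        # No standard property fits: use a prov:Attribution with a role.
        resource.setdefault("prov:qualifiedAttribution", []).append({
            "@type": "prov:Attribution",
            "prov:agent": agent,
            "dcat:hadRole": {"@id": role},
        })
    return resource


dataset = {"@id": "https://example.org/dataset/1", "@type": "dcat:Dataset"}
attribute(dataset, {"@id": "https://example.org/agency"}, "publisher")
attribute(dataset, {"@id": "https://example.org/lab"},
          "https://example.org/vocab/role/custodian")  # hypothetical role URI
```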

Catalog

A Catalog or repository that hosts the Datasets or Data Services being described.

DCAT-US allows Catalogs that contain only Datasets as well as Catalogs that contain only Data Services.

RDF Class: `dcat:Catalog`

Definition: A curated collection of metadata about resources (e.g., datasets and data services in the context of a data catalog).

Sub-class of: `dcat:Dataset`

Usage note:

A Web-based data catalog is typically represented as a single instance of this class.
Populate metadata within the dcat:Catalog to facilitate resource discovery, including title, description, classifiers and other relevant information.
Specify the resources hosted within the catalog by linking them as dcat:dataset or dcat:service.

Rationale: The update of the dcat:Catalog class aligns with the generalization of catalog scope in DCAT-US 3.0, accommodating catalogs of datasets, data services, or a mixture of both. It reflects the evolving landscape of data publication and discovery, allowing data publishers to describe and share their resources effectively. Additionally, by making Catalog a subclass of Dataset, DCAT-US promotes consistency in metadata representation and enables catalogs to be composed of other catalogs, promoting modularity and extensibility in the data catalog ecosystem.

| Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1 |
|---|---|---|---|---|---|
| title | `dcterms:title` | `rdfs:Literal` | M | 1..n | Multilingual support |
| description | `dcterms:description` | `rdfs:Literal` | M | 1..n | Multilingual support |
| publisher | `dcterms:publisher` | `foaf:Agent` | M | 1..1 | Aligned |
| dataset | `dcat:dataset` | `dcat:Dataset` | M | 1..n | No Change |
| homepage | `foaf:homepage` | `foaf:Document` | R | 0..1 | Aligned |
| language | `dcterms:language` | `dcterms:LinguisticSystem` | R | 0..n | Aligned |
| license | `dcterms:license` | `dcterms:LicenseDocument` | R | 0..1 | Aligned |
| release date | `dcterms:issued` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | R | 0..1 | Aligned |
| rights | `dcterms:rights` | `dcterms:RightsStatement` | R | 0..n | Aligned |
| spatial/geographic coverage | `dcterms:spatial` | `dcterms:Location` | R | 0..n | Aligned |
| themes | `dcat:themeTaxonomy` | `skos:ConceptScheme` | R | 0..n | Aligned |
| update/modification date | `dcterms:modified` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | R | 0..1 | Aligned |
| schema version | `dcterms:conformsTo` | `dcterms:Standard` | R | 0..1 | No Change |
| creator | `dcterms:creator` | `dcterms:Agent` | O | 0..n | Aligned |
| access rights | `dcterms:accessRights` | `dcterms:RightsStatement` | O | 0..1 | Aligned |
| catalog | `dcat:catalog` | `dcat:Catalog` | O | 0..n | Aligned |
| contact point | `dcat:contactPoint` | `vcard:Kind` | O | 0..n | Aligned |
| keyword/tag | `dcat:keyword` | `rdfs:Literal` | O | 0..n | Aligned |
| has part | `dcterms:hasPart` | `dcat:Catalog` | O | 0..n | Aligned |
| catalog record | `dcat:record` | `dcat:CatalogRecord` | O | 0..n | Aligned |
| service | `dcat:service` | `dcat:DataService` | O | 0..n | Aligned |
| theme/category | `dcat:theme` | `skos:Concept` | O | 0..n | Aligned |
| identifier | `dcterms:identifier` | `rdfs:Literal` | O | 0..n | Aligned |
| rights holder | `dcterms:rightsHolder` | `org:Organization` | O | 0..1 | New |
| subject | `dcterms:subject` | `skos:Concept` | O | 0..n | New |
| temporal coverage | `dcterms:temporal` | `dcterms:PeriodOfTime` | O | 0..n | Aligned |
| qualified attribution | `prov:qualifiedAttribution` | `prov:Attribution` | O | 0..n | Aligned |
| category | `dcterms:type` | `skos:Concept` | O | 0..1 | New! |

Mandatory Properties

Property: title

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| title | Mandatory | 1..n | `dcterms:title` | `rdfs:Literal` |

Usage note:

The title of the catalog in the indicated language.
This property can be repeated for parallel language versions of the title.

Property: description

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| description | Mandatory | 1..n | `dcterms:description` | `rdfs:Literal` | Free-text description of the catalog (in the language indicated in the attribute). |

Usage note:

This property contains a free-text account of the data Catalog (in the language indicated in the attribute).
This property can be repeated for parallel language versions of the description.

Property: publisher

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| publisher | Mandatory | 1..1 | `dcterms:publisher` | `foaf:Agent` | Entity responsible for making the catalog available. |

Usage note:

This property refers to an entity (organization) responsible for making the Catalog available.

Property: dataset

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| dataset | Mandatory | 1..n | `dcat:dataset` | `dcat:Dataset` | Dataset that is part of the catalog. |

Usage note:

This property links the Catalog with a Dataset that is part of the Catalog.
As empty Catalogs are usually indications of problems, this property SHOULD be combined with the property service to implement an empty Catalog check.

Recommended Properties

Property: homepage

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| homepage | Recommended | 0..1 | `foaf:homepage` | `foaf:Document` |

Usage note:

This property refers to a web page that acts as the main page for the Catalog.
For instance, data.gov would be the homepage of the Data Catalog exported to [[?DATA-GOV]].

Property: language

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| language | Recommended | 0..n | `dcterms:language` | `dcterms:LinguisticSystem` | A language used in the textual metadata describing titles, descriptions, etc. of the Datasets in the Catalog. |

Usage note:

Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
The value(s) provided for members of a catalog (i.e., dataset or service) override the value(s) provided for the catalog if they conflict.
This property can be repeated if the resources of the catalog are provided in multiple languages.

Property: license

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| license | Recommended | 0..1 | `dcterms:license` | `dcterms:LicenseDocument` |

Usage note:

This property refers to the license under which the Catalog can be used or reused.
CV to be used: [[?DATA-GOV-LICENSE]]

Property: release date

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| release date | Recommended | 0..1 | `dcterms:issued` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) |

Usage note:

This property contains the date of formal issuance (e.g., for publication of the Catalog).

Property: rights

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| rights | Recommended | 0..n | `dcterms:rights` | `dcterms:RightsStatement` |

Usage note:

This property refers to a statement that specifies rights associated with the Catalog.

Property: spatial/geographic coverage

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| spatial/geographic coverage | Recommended | 0..n | `dcterms:spatial` | `dcterms:Location` |

Usage note:

This property refers to a geographical area covered by the Catalog.
Conventions to be used: The Vocabularies Name Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[?GEONAMES]]

Property: themes

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| themes | Recommended | 0..n | `dcat:themeTaxonomy` | `skos:ConceptScheme` |

Usage note:

This property refers to a knowledge organization system used to classify the Catalog's Datasets.
CV to be used: http://TBD/resource/dataset/data-theme

Property: update/modification date

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| update/modification date | Recommended | 0..1 | `dcterms:modified` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) |

Usage note:

This property contains the most recent date on which the Catalog was modified.
The value of this property indicates a change to the actual resource, not a change to the catalog record. An absent value MAY indicate that the resource has never changed after its initial publication, or that the date of last modification is not known, or that the resource is continuously updated.

Property: schema version

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| schema version | Recommended | 0..1 | `dcterms:conformsTo` | `dcterms:Standard` | An established standard to which the described catalog conforms. |

Usage note:

This property SHOULD be used to indicate the model, schema, ontology, view or profile that the cataloged resource content conforms to.

Optional Properties

Property: creator

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| creator | Optional | 0..n | `dcterms:creator` | `dcterms:Agent` | The entity responsible for producing the resource. |

Usage note:

Resources of type foaf:Agent are recommended as values for this property.

Property: access rights

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| access rights | Optional | 0..1 | `dcterms:accessRights` | `dcterms:RightsStatement` |

Usage note:

This property refers to information that indicates whether the Catalog is open data, has access restrictions or is not public.
CV to be used: [[?DATA-GOV-AR]].

Property: catalog

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| catalog | Optional | 0..n | `dcat:catalog` | `dcat:Catalog` |

Usage note:

This property refers to a catalog whose contents are of interest in the context of this catalog.

Property: contact point

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| contact point | Optional | 0..n | `dcat:contactPoint` | `vcard:Kind` |

Usage note:

Relevant contact information for the cataloged resource. Use of vCard is recommended.

Property: keyword/tag

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| keyword/tag | Optional | 0..n | `dcat:keyword` | `rdfs:Literal` |

Usage note:

A keyword or tag describing the resource.

Property: has part

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| has part | Optional | 0..n | `dcterms:hasPart` | `dcat:Catalog` |

Usage note:

This property refers to a related catalog that is part of the described catalog.

Property: catalog record

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| catalog record | Optional | 0..n | `dcat:record` | `dcat:CatalogRecord` | A record describing the registration of a single resource (e.g., a dataset, a data service) that is part of the catalog. |

Property: service

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| service | Optional | 0..n | `dcat:service` | `dcat:DataService` |

Usage note:

This property refers to a site or end-point (Data Service) that is listed in the Catalog.
As empty Catalogs are usually indications of problems, this property SHOULD be combined with the property Dataset to implement an empty Catalog check.

Property: theme/category

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| theme/category | Optional | 0..n | `dcat:theme` | `skos:Concept` |

Usage note:

This property refers to a category of the Catalog. A Catalog may be associated with multiple themes.
CV to be used: [[?DATA-GOV-THEME]]

Property: identifier

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| identifier | Optional | 0..n | `dcterms:identifier` | `rdfs:Literal` |

Usage note:

This property contains the main identifier for the Catalog, e.g. the URI or other unique identifier.

Property: rights holder

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| rights holder | Optional | 0..n | `dcterms:rightsHolder` | `org:Organization` |

Usage note:

This property refers to an organization holding rights on the Catalog.

Property: subject

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| subject | Optional | 0..n | `dcterms:subject` | `skos:Concept` |

Property: temporal coverage

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| temporal coverage | Optional | 0..n | `dcterms:temporal` | `dcterms:PeriodOfTime` |

Usage note:

This property refers to a temporal period that the Catalog covers.

Property: qualified attribution

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| qualified attribution | Optional | 0..n | `prov:qualifiedAttribution` | `prov:Attribution` |

Usage note:

This property refers to a link to an Agent having some form of responsibility for the Catalog.

Property: category

| Property | Requirement level | Cardinality | URI | Range |
|---|---|---|---|---|
| category | Optional | 0..1 | `dcterms:type` | `skos:Concept` |

Usage note:

The category of the Catalog

Example
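The usage notes for dataset and service suggest combining both properties to detect an empty Catalog. The following Python sketch shows one possible reading of that check on a JSON-LD-style dictionary, together with a test for the four mandatory properties. The helper names, the sample titles, and the `example.org` URIs are assumptions for illustration, and the `@context` that would bind the prefixes is omitted.

```python
def as_list(value):
    """Normalize a missing, single, or repeated value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def is_empty_catalog(catalog):
    """True when the catalog lists neither datasets nor data services."""
    members = as_list(catalog.get("dcat:dataset")) + as_list(catalog.get("dcat:service"))
    return len(members) == 0


def missing_mandatory(catalog):
    """Return the mandatory Catalog properties that have no value."""
    mandatory = ["dcterms:title", "dcterms:description",
                 "dcterms:publisher", "dcat:dataset"]
    return [prop for prop in mandatory if not as_list(catalog.get(prop))]


# Hypothetical catalog with parallel-language titles.
catalog = {
    "@id": "https://example.org/catalog",
    "@type": "dcat:Catalog",
    "dcterms:title": [
        {"@value": "Example Agency Data Catalog", "@language": "en"},
        {"@value": "Catálogo de datos de la agencia de ejemplo", "@language": "es"},
    ],
    "dcterms:description": {"@value": "Datasets published by the Example Agency.",
                            "@language": "en"},
    "dcterms:publisher": {"@id": "https://example.org/agency"},
    "dcat:dataset": [{"@id": "https://example.org/dataset/1"}],
}

print(is_empty_catalog(catalog), missing_mandatory(catalog))
```

A catalog that lists only data services passes the empty-catalog check in this sketch but is still reported by `missing_mandatory`, because the summary table marks dataset as mandatory.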

Catalog Record

RDF Class: `dcat:CatalogRecord`

Definition: A record in a catalog, describing the registration of a single `dcat:Resource`.

Usage note: This class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset or service and metadata about the entry in the catalog about the dataset or service. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [[PROV-O]] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset or its registration.

Rationale: While its use is not mandatory, the incorporation of dcat:CatalogRecord into DCAT-US 3.0 holds significant value. It enables catalogs to distinguish between metadata describing datasets or services and the actual catalog entries. This differentiation proves especially advantageous for ensuring adherence to application profiles that demand specific metadata for catalog records. Furthermore, it streamlines resource lifecycle management, empowering catalogs to monitor alterations and revisions to their entries, ultimately bolstering data governance and quality assurance protocols.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| application profile | `dcterms:conformsTo` | `dcterms:Standard` | R | 0..1 |
| change type | `adms:status` | `skos:Concept` | R | 0..1 |
| description | `dcterms:description` | `rdfs:Literal` | O | 0..n |
| language | `dcterms:language` | `dcterms:LinguisticSystem` | O | 0..n |
| listing date | `dcterms:issued` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | O | 0..n |
| update/modification date | `dcterms:modified` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | M | 1..1 |
| primary topic | `foaf:primaryTopic` | `dcat:Resource` | M | 1..1 |
| source metadata | `dcterms:source` | `dcat:Resource` | O | 0..1 |
| title | `dcterms:title` | `rdfs:Literal` | O | 0..n |

Mandatory Properties

Property: update/modification date

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| update/modification date | Mandatory | 1..1 | `dcterms:modified` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | The most recent date on which the Catalog Record's entry was changed or modified. |

Property: primary topic

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| primary topic | Mandatory | 1..1 | `foaf:primaryTopic` | `dcat:Resource` | A link to the Dataset, Data Service or Catalog described in the Catalog Record. |

Usage note: A catalog record will refer to one entity in a catalog. This can be either a Dataset or a Data Service. To ensure an unambiguous reading of the cardinality, the range is set to Cataloged Resource. However, it is not the intent of this range to require the explicit use of the class Cataloged Resource. Because it is an abstract class, a subclass should be used.

Recommended Properties

Property: application profile

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| application profile | Recommended | 0..1 | `dcterms:conformsTo` | `dcterms:Standard` | An Application Profile that the Catalog Record's metadata conforms to. |

Property: change type

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| change type | Recommended | 0..1 | `adms:status` | `skos:Concept` | The status of the catalog record in the context of editorial flow of the dataset and data service descriptions. |

Optional Properties

Property: description

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| description | Optional | 0..n | `dcterms:description` | `rdfs:Literal` | A free-text account of the Catalog Record. This property can be repeated for parallel language versions of the description. |

Property: language

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| language | Optional | 0..n | `dcterms:language` | `dcterms:LinguisticSystem` | A language used in the textual metadata describing titles, descriptions, etc. of the members of the catalog. |

Usage note:

Resources defined by the Library of Congress [[ISO 639-1]] SHOULD be used.
This property can be repeated if the metadata is provided in multiple languages.

Property: listing date

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| listing date | Optional | 0..n | `dcterms:issued` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | The date on which the description of the Dataset was included in the Catalog. |

Property: source metadata

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| source metadata | Optional | 0..1 | `dcterms:source` | `dcat:Resource` | The original metadata that was used in creating metadata for the datasets, data services, or catalogs in the Catalog Record. |

Property: title

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| title | Optional | 0..n | `dcterms:title` | `rdfs:Literal` | A name given to the Catalog Record. |

Usage note: This property can be repeated for parallel language versions of the name.

Example
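To illustrate the distinction between the dates of a dataset and the dates of its catalog entry, the Python sketch below builds a Catalog Record as a JSON-LD-style dictionary with the two mandatory properties and an optional listing date. The function `make_catalog_record`, the dataset URI, and all dates are invented for this sketch.

```python
from datetime import date


def make_catalog_record(resource_id, modified, issued=None):
    """Build a dcat:CatalogRecord for one cataloged resource.

    `modified` is mandatory (1..1); `issued` is the optional listing date.
    Both are emitted as xsd:date literals in this sketch.
    """
    record = {
        "@type": "dcat:CatalogRecord",
        "foaf:primaryTopic": {"@id": resource_id},
        "dcterms:modified": {"@value": modified.isoformat(), "@type": "xsd:date"},
    }
    if issued is not None:
        record["dcterms:issued"] = {"@value": issued.isoformat(), "@type": "xsd:date"}
    return record


# The dataset was published by the agency in 2019 ...
dataset = {
    "@id": "https://example.org/dataset/1",
    "@type": "dcat:Dataset",
    "dcterms:issued": {"@value": "2019-06-30", "@type": "xsd:date"},
}

# ... but was only added to this catalog in 2023 and its entry edited in 2024.
record = make_catalog_record(
    dataset["@id"],
    modified=date(2024, 3, 1),
    issued=date(2023, 11, 15),
)
```

Keeping the two `dcterms:issued` values on separate nodes is what lets a consumer tell the agency's publication date from the catalog's listing date.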

Checksum

RDF Class: `spdx:Checksum`

Definition: A Checksum is a value that allows the integrity of the contents of a file to be checked. Even small changes to the content of the file will change its checksum. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented [[SPDX]].

Usage note:

The Checksum includes the algorithm ( spdx:algorithm) and value ( spdx:checksumValue) that allows the integrity of a file to be verified to ensure no errors occurred in transmission or storage.

Rationale: Introducing the spdx:Checksum class in DCAT-US bolsters data integrity and trust by ensuring datasets remain unaltered during transfers. This standardized approach promotes consistency across catalogs, facilitates error detection, and adapts to evolving cryptographic needs, enhancing the utility of automated tools.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| algorithm | `spdx:algorithm` | `spdx:ChecksumAlgorithm` | M | 1..1 |
| checksum value | `spdx:checksumValue` | `xsd:hexBinary` | M | 1..1 |

Mandatory Properties

Property: algorithm

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| algorithm | Mandatory | 1..1 | `spdx:algorithm` | `spdx:ChecksumAlgorithm` | The algorithm used to produce the subject Checksum. |

Property: checksum value

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| checksum value | Mandatory | 1..1 | `spdx:checksumValue` | `xsd:hexBinary` | A lower case hexadecimal encoded digest value produced using a specific algorithm. |

Example
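As a sketch of how a publisher might produce and later verify a Checksum, the Python code below computes a SHA-256 digest with the standard library and wraps it in a JSON-LD-style dictionary. The choice of SHA-256, the `spdx:checksumAlgorithm_sha256` identifier for the algorithm, and the helper names are assumptions made for this example; the profile itself does not prescribe an algorithm in this section.

```python
import hashlib


def make_checksum(data: bytes) -> dict:
    """Describe the SHA-256 digest of `data` as an spdx:Checksum node."""
    digest = hashlib.sha256(data).hexdigest()  # hexdigest() yields lowercase hex
    return {
        "@type": "spdx:Checksum",
        # Assumed identifier for SHA-256 in the SPDX vocabulary.
        "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
        "spdx:checksumValue": {"@value": digest, "@type": "xsd:hexBinary"},
    }


def verify(data: bytes, checksum: dict) -> bool:
    """Recompute the digest and compare it with the recorded value."""
    expected = checksum["spdx:checksumValue"]["@value"]
    return hashlib.sha256(data).hexdigest() == expected.lower()


payload = b"station,date,value\nA1,2024-03-01,12.5\n"  # invented file content
checksum = make_checksum(payload)
print(verify(payload, checksum))
```

Changing even one byte of `payload` after the checksum was recorded makes `verify` return `False`, which is the integrity check the usage note describes.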

Concept

RDF Class: `skos:Concept`

Definition: A controlled vocabulary term used to classify a Catalog, Dataset, or Data Service.

Usage note:

Following FAIR Vocabulary principles, Concept URIs should be made resolvable and accessible using SKOS encoding and provided in a Linked Data format (RDF/XML, Turtle, JSON-LD, N-Triples).
Ensure FAIR Resolvability: Make Concept URIs resolvable using FAIR principles, allowing them to be Findable, Accessible, Interoperable, and Reusable. This ensures that skos:Concept instances can be easily discovered, accessed, integrated with other resources, and reused across the DCAT-US ecosystem, promoting data interoperability and accessibility.
To enhance data interoperability and consistency, it is advisable to reuse established controlled vocabularies such as Global Change Master Directory (GCMD) [[?GCMD]], Agrovoc, and NAICS for data description.

Rationale: The inclusion of skos:Concept in DCAT-US 3.0 enhances semantic search in catalogs, enabling more accurate discovery of Catalogs, Datasets, and Data Services. It improves user experience, promotes data discoverability, and supports better resource utilization. Additionally, it aligns with international standards like SKOS, ensuring compatibility and adherence to recognized controlled vocabulary practices.

Properties Summary

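| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| alternate label | `skos:altLabel` | `rdfs:Literal` | O | 0..n |
| definition | `skos:definition` | `rdfs:Literal` | R | 0..n |
| in scheme | `skos:inScheme` | `skos:ConceptScheme` | M | 1..1 |
| notation | `skos:notation` | `xsd:string` | O | 0..n |
| preferred label | `skos:prefLabel` | `rdfs:Literal` | M | 1..n |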

Mandatory Properties

Property: preferred label

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| preferred label | Mandatory | 1..n | `skos:prefLabel` | `rdfs:Literal` | Preferred label for the controlled vocabulary term (one per language). |

Property: in scheme

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| in scheme | Mandatory | 1 | `skos:inScheme` | `skos:ConceptScheme` | Concept scheme defining the concept. |

Recommended Properties

Property: definition

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| definition | Recommended | 0..n | `skos:definition` | `rdfs:Literal` | Definition of the controlled vocabulary term. |

Optional Properties

Property: alternate label

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| alternate label | Optional | 0..n | `skos:altLabel` | `rdfs:Literal` | Alternative labels for a concept. |

Property: notation

| Property | Requirement level | Cardinality | URI | Range | Definition |
|---|---|---|---|---|---|
| notation | Optional | 0..n | `skos:notation` | `xsd:string` | Abbreviations or codes from code lists for an organization. |

Example
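The definition of preferred label allows one value per language. The Python sketch below shows a Concept as a JSON-LD-style dictionary and a small check for repeated languages. The vocabulary under `example.org`, the labels, and the helper name are hypothetical, and the check assumes labels are given in the expanded `{"@value": ..., "@language": ...}` form.

```python
from collections import Counter

# Hypothetical concept in a hypothetical theme vocabulary.
concept = {
    "@id": "https://example.org/vocab/theme/hydrology",
    "@type": "skos:Concept",
    "skos:inScheme": {"@id": "https://example.org/vocab/theme"},  # mandatory
    "skos:prefLabel": [                                          # one per language
        {"@value": "Hydrology", "@language": "en"},
        {"@value": "Hidrología", "@language": "es"},
    ],
    "skos:altLabel": [{"@value": "Water science", "@language": "en"}],
    "skos:notation": "HYD",
}


def repeated_pref_label_languages(node):
    """Return the language tags that carry more than one skos:prefLabel."""
    labels = node.get("skos:prefLabel", [])
    if isinstance(labels, dict):
        labels = [labels]
    # Labels without a language tag are grouped under the empty string.
    counts = Counter(label.get("@language", "") for label in labels)
    return sorted(lang for lang, n in counts.items() if n > 1)


print(repeated_pref_label_languages(concept))
```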

Concept Scheme

RDF Class: `skos:ConceptScheme`

Definition: A concept collection (e.g., a controlled vocabulary) in which a concept is defined.

Usage note:

Following FAIR Vocabulary principles, the Concept Scheme URI should be made resolvable and accessible using SKOS encoding and provided in a Linked Data format (RDF/XML, Turtle, JSON-LD, N-Triples).
To enhance data interoperability and consistency, it is advisable to reuse established controlled vocabularies such as Global Change Master Directory (GCMD) [[?GCMD]], Agrovoc, and NAICS for data description.

Rationale: The introduction of skos:ConceptScheme in DCAT-US 3.0 enhances data resource organization, categorization, and accessibility. It provides a structured framework for controlled vocabularies, aligning with FAIR Vocabulary principles for improved data interoperability and discoverability.

Properties

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| title | `dcterms:title` | `rdfs:Literal` | M | 1..n |
| description | `dcterms:description` | `rdfs:Literal` | R | 0..n |
| creation date | `dcterms:created` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | O | 0..1 |
| publication date | `dcterms:issued` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | O | 0..1 |
| update/modification date | `dcterms:modified` | `rdfs:Literal` (typed as `xsd:date`, `xsd:dateTime`, `xsd:gYear` or `xsd:gYearMonth`) | O | 0..1 |
| version info | `dcat:version` | `xsd:string` | O | 0..1 |

Mandatory Properties

Property: title

| Property | title |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..n |
| URI | dcterms:title |
| Range | rdfs:Literal |
| Definition | The title of the concept scheme in the indicated language. |
| Usage note | Only one title per language. |
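
The two constraints on dcterms:title (at least one value, and at most one per language) lend themselves to a mechanical check. The following is a minimal Python sketch and not part of the profile: the function name `check_titles` and the representation of a title as a `(text, language_tag)` pair are assumptions made for illustration, as is the choice to treat two untagged titles as a duplicate.

```python
from collections import Counter


def check_titles(titles):
    """Return a list of problems for a list of (text, language_tag) pairs."""
    problems = []
    if not titles:
        problems.append("dcterms:title is mandatory (cardinality 1..n)")
    # Language tags are case-insensitive, so "EN" and "en" count as one language.
    # Assumption: titles without a language tag are grouped together as well.
    per_language = Counter((lang or "").lower() for _, lang in titles)
    for lang, count in sorted(per_language.items()):
        if count > 1:
            label = lang or "(no language tag)"
            problems.append(f"{count} titles for language {label}; only one is allowed")
    return problems


# Hypothetical sample values.
print(check_titles([("Science Keywords", "en"), ("Palabras clave", "es")]))
print(check_titles([("Science Keywords", "en"), ("Keywords", "EN")]))
```

In a production setting this kind of rule would more likely be expressed as a SHACL shape; the sketch only shows the logic of the constraint.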

Recommended Properties

Property: description

| Property | description |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcterms:description |
| Range | rdfs:Literal |
| Definition | This property contains a description of the Concept Scheme. |
| Usage note | May be repeated for translations in different languages. |

Optional Properties

Property: creation date

| Property | creation date |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:created |
| Range | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) |
| Definition | This property contains the date on which the Concept Scheme was first created. |

Property: publication date

| Property | publication date |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:issued |
| Range | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) |
| Definition | This property contains the date of formal issuance (e.g., publication) of the Concept Scheme. |

Property: update/modification date

| Property | update/modification date |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:modified |
| Range | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) |
| Definition | This property contains the most recent date on which the Concept Scheme was changed or modified. |

Property: version info

| Property | version info |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcat:version |
| Range | xsd:string |
| Definition | This property contains a version number or other version designation of the Concept Scheme. |

Examples

Contact

| RDF Class | vcard:Kind |
|---|---|
| Definition | Point of Contact information. |
| Rationale | The introduction of vcard:Kind in DCAT-US 3.0 is driven by the need for standardized, reliable, and interoperable Point of Contact information, ultimately improving the accessibility and usability of data resources within the DCAT-US ecosystem. |

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| formatted name | vcard:fn | xsd:string | M | 1 |
| email | vcard:hasEmail | rdfs:Resource | M | 1 |
| telephone | vcard:tel | rdfs:Resource | O | 0..1 |
| organization name | vcard:organization-name | xsd:string | O | 0..1 |
| family name | vcard:family-name | xsd:string | O | 0..1 |
| given name | vcard:given-name | xsd:string | O | 0..1 |
| position title | vcard:title | xsd:string | O | 0..1 |
| has uid | vcard:hasUID | xsd:string | O | 0..1 |
| address | vcard:address | vcard:Address | O | 0..n |
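
The cardinalities in the Contact properties summary can be checked mechanically. The sketch below is illustrative only: it assumes a contact is held as a plain Python dictionary mapping each property URI to a list of values, and the names `CONTACT_CARDINALITY` and `check_cardinality` are hypothetical, not part of DCAT-US.

```python
# (minimum, maximum) occurrences per property, copied from the Contact
# properties summary; None stands for an unbounded maximum ("n").
CONTACT_CARDINALITY = {
    "vcard:fn": (1, 1),
    "vcard:hasEmail": (1, 1),
    "vcard:tel": (0, 1),
    "vcard:organization-name": (0, 1),
    "vcard:family-name": (0, 1),
    "vcard:given-name": (0, 1),
    "vcard:title": (0, 1),
    "vcard:hasUID": (0, 1),
    "vcard:address": (0, None),
}


def check_cardinality(record, rules=CONTACT_CARDINALITY):
    """Return a list of cardinality problems for a {property: [values]} record."""
    problems = []
    for prop, (minimum, maximum) in rules.items():
        count = len(record.get(prop, []))
        if count < minimum:
            problems.append(f"{prop}: expected at least {minimum}, found {count}")
        elif maximum is not None and count > maximum:
            problems.append(f"{prop}: expected at most {maximum}, found {count}")
    return problems


# Hypothetical contact used only to exercise the check.
contact = {
    "vcard:fn": ["Data Support Team"],
    "vcard:hasEmail": ["mailto:support@example.gov"],
}
print(check_cardinality(contact))
```

The same function could be reused for other classes by passing a different rules table built from the corresponding summary.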

Mandatory Properties

Property: formatted name

| Property | formatted name |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1 |
| URI | vcard:fn |
| Range | xsd:string |
| Definition | The formatted text corresponding to the name of the contact. |

Property: email

| Property | email |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1 |
| URI | vcard:hasEmail |
| Range | rdfs:Resource |
| Definition | The email address of the contact. |
| Usage note | Use an email address named for a function rather than an individual (e.g. support). The email address should be formatted as a URL using the "mailto:" scheme. |
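
As a hedged illustration of the "mailto:" convention, the sketch below normalizes a plain address into a mailto URL. The function name `to_mailto` is hypothetical, and the address check is deliberately shallow (it only requires a non-empty local part and domain); it is not a full email validator.

```python
def to_mailto(address):
    """Return the address as a mailto: URL, adding the scheme when it is missing."""
    address = address.strip()
    if address.lower().startswith("mailto:"):
        address = address[len("mailto:"):]
    local, separator, domain = address.partition("@")
    # Shallow sanity check only; real validation is out of scope for this sketch.
    if not (local and separator and domain):
        raise ValueError(f"not an email address: {address!r}")
    return "mailto:" + address


# Hypothetical functional mailbox, as suggested by the usage note.
print(to_mailto("support@example.gov"))
```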

Optional Properties

Property: telephone

| Property | telephone |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | vcard:tel |
| Range | rdfs:Resource |
| Definition | This property specifies the telephone number for telephony communication with the person or organization. |

Property: organization name

| Property | organization name |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | vcard:organization-name |
| Range | xsd:string |
| Definition | This property specifies the name of the organization to contact. |

Property: family name

| Property | family name |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | vcard:family-name |
| Range | xsd:string |
| Definition | This property specifies the family name of the person to contact. |

Property: given name

| Property | given name |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | vcard:given-name |
| Range | xsd:string |
| Definition | This property specifies the given name of the person to contact. |

Property: position title

| Property | position title |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | vcard:title |
| Range | xsd:string |
| Definition | This property specifies the position or role of the person to contact. |

Property: has UID

| Property | has UID |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | vcard:hasUID |
| Range | xsd:string |
| Definition | This property specifies a value that represents a globally unique identifier corresponding to the contact (it could also be used as the URI component of the contact). |
| Usage note | The hasUID property assigns a unique identifier to a contact associated with a dataset or catalog. This identifier, which is optional and should be a string, ensures that each contact can be distinctly recognized and referenced. It is particularly useful where contacts need to be identified unambiguously across different datasets or catalogs. It can also serve as part of a URI for a contact, providing a consistent and resolvable identifier. Implementers are encouraged to use a globally unique string value, such as an ORCID or a URI that is guaranteed to be unique. |

Property: address

| Property | address |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | vcard:address |
| Range | vcard:Address |
| Definition | This property specifies the address of the contact. |

Example

CUI Restriction

| RDF Class | dcat-us:CuiRestriction |
|---|---|
| Definition | Represents Controlled Unclassified Information (CUI), which is information that requires safeguarding or dissemination controls in accordance with applicable laws, regulations, and government-wide policies but is not classified as confidential. |
| Usage note | The CUI Restriction class is designed to capture information related to Controlled Unclassified Information (CUI) in accordance with NARA guidelines. |
| Usage note | Users of this class must provide the mandatory properties, i.e., the CUI banner marking and the designation indicator, to accurately describe the CUI status of a resource. |
| Usage note | The optional property, "required indicator per authority," allows for additional information or context about CUI restrictions, providing flexibility for specific use cases. |
| Rationale | The introduction of the dcat-us:CuiRestriction class in DCAT-US 3.0 is driven by the need for compliance with National Archives and Records Administration (NARA) guidelines regarding Controlled Unclassified Information (CUI). This addition ensures that DCAT-US aligns with NARA's standards, promotes transparency, facilitates compliance audits, and supports efficient resource management. Ultimately, it enhances data interoperability and security within the government data ecosystem. |

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| CUI banner marking | dcat-us:cuiBannerMarking | xsd:string | M | 1..1 |
| CUI designation indicator | dcat-us:designationIndicator | xsd:string | M | 1..1 |
| required indicator per authority | dcat-us:requiredIndicatorPerAuthority | xsd:string | O | 0..n |

Mandatory Properties

Property: CUI designation indicator

| Property | CUI designation indicator |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1 |
| URI | dcat-us:designationIndicator |
| Range | xsd:string |
| Definition | The designation indicator shows which agency made the document CUI. |
| Usage note | Free text per the NARA Marking Guidebook and DODI 5200.48 (should have at least "Controlled by:"). |
| Usage note | It is best practice to include contact information. |
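
The "Controlled by:" expectation can be checked with a simple text test. This is a minimal sketch under stated assumptions: the function name `check_designation_indicator` is hypothetical, the comparison is case-insensitive by choice, and the sample value is made up rather than taken from the NARA Marking Guidebook.

```python
def check_designation_indicator(value):
    """Return a list of problems for a dcat-us:designationIndicator value."""
    problems = []
    if not value.strip():
        problems.append("designation indicator is mandatory and must not be empty")
    elif "controlled by:" not in value.lower():
        problems.append('designation indicator should contain at least "Controlled by:"')
    return problems


# Hypothetical value; the agency and mailbox are placeholders.
sample = "Controlled by: Example Agency, Office of Data (support@example.gov)"
print(check_designation_indicator(sample))
```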

Optional Properties

Property: required indicator per authority

| Property | required indicator per authority |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcat-us:requiredIndicatorPerAuthority |
| Range | xsd:string |
| Definition | Free text (e.g., text of the category description or the distribution statement). |

Example

Data Service

| RDF Class | dcat:DataService |
|---|---|
| Definition | A collection of operations that provides access to one or more datasets or data processing functions. |
| Sub-class of | dcat:Resource |
| Sub-class of | dctype:Service |
| Usage note | If a dcat:DataService is bound to one or more specified Datasets, they are indicated by the dcat:servesDataset property. |
| Usage note | The kind of service can be indicated using the dcterms:type property. Its value may be taken from a controlled vocabulary such as the Data.GOV spatial data service type code list [[?DATA-GOV-SDST]]. |
| Rationale | Introducing dcat:DataService is essential as it clarifies the representation of data services, addressing the confusion caused by using dcat:Distribution to describe services in DCAT 1. This addition promotes clear communication of service-related information, improving discoverability, and facilitating seamless integration and usage by data consumers and applications. |

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| endpoint URL | dcat:endpointURL | rdfs:Resource | M | 1..n |
| contact point | dcat:contactPoint | vcard:Kind | M | 1..n |
| publisher | dcterms:publisher | foaf:Agent | M | 1..1 |
| title | dcterms:title | rdfs:Literal | M | 1..n |
| endpoint description | dcat:endpointDescription | rdfs:Resource | R | 0..n |
| license | dcterms:license | dcterms:LicenseDocument | R | 0..1 |
| serves dataset | dcat:servesDataset | dcat:Dataset | R | 0..n |
| keyword/tag | dcat:keyword | rdfs:Literal | O | 0..n |
| spatial resolution in meters | dcat:spatialResolutionInMeters | rdfs:Literal typed as xsd:decimal | O | 0..n |
| temporal resolution | dcat:temporalResolution | rdfs:Literal typed as xsd:duration | O | 0..n |
| theme/category | dcat:theme | skos:Concept | O | 0..n |
| access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n |
| creation date | dcterms:created | rdfs:Literal typed as xsd:date or xsd:dateTime | O | 0..1 |
| creator | dcterms:creator | dcterms:Agent | O | 0..n |
| description | dcterms:description | rdfs:Literal | O | 0..n |
| identifier | dcterms:identifier | rdfs:Literal | O | 0..n |
| language | dcterms:language | dcterms:LinguisticSystem | O | 0..n |
| update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 |
| rights | dcterms:rights | dcterms:RightsStatement | O | 0..n |
| rights holder | dcterms:rightsHolder | org:Organization | O | 0..n |
| spatial/geographic coverage | dcterms:spatial | dcterms:Location | O | 0..n |
| status | adms:status | skos:Concept | O | 0..1 |
| temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | O | 0..n |
| category | dcterms:type | skos:Concept | O | 0..1 |
| quality measurement | dqv:hasQualityMeasurement | dqv:QualityMeasurement | O | 0..n |
| qualified attribution | prov:qualifiedAttribution | prov:Attribution | O | 0..n |
| was used by | prov:wasUsedBy | prov:Activity | O | 0..n |
| geographic bounding box | dcat-us:geographicBoundingBox | dcat-us:GeographicBoundingBox | O | 0..n |

Mandatory Properties

Property: endpoint URL

| RDF Property | dcat:endpointURL |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..n |
| URI | dcat:endpointURL |
| Range | rdfs:Resource |
| Usage note | The root location or primary endpoint of the service (a Web-resolvable IRI). |

Property: contact point

| Property | contact point |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..n |
| URI | dcat:contactPoint |
| Range | vcard:Kind |
| Definition | This property contains contact information that can be used for sending comments about the Data Service. |
| Usage note | This property MUST contain an email address that is continuously monitored by the data publisher. |
| Usage note | If there are several contributors involved in the publication of the Data Service, the property can be used multiple times. |

Property: publisher

| Property | publisher |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..1 |
| URI | dcterms:publisher |
| Range | foaf:Agent |
| Definition | This property refers to an entity (organization) responsible for making the Data Service available. |

Property: title

| Property | title |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..n |
| URI | dcterms:title |
| Range | rdfs:Literal |
| Usage note | The title of the Data Service in the indicated language. |
| Usage note | This property can be repeated for parallel language versions of the title (see Multilingualism). |

Recommended Properties

Property: endpoint description

| Property | endpoint description |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:endpointDescription |
| Definition | A description of the services available via the end-points, including their operations, parameters, etc. |
| Domain | dcat:DataService |
| Range | rdfs:Resource |
| Usage note | The endpoint description gives specific details of the actual endpoint instances, while dcterms:conformsTo is used to indicate the general standard or specification that the endpoints implement. |
| Usage note | An endpoint description may be expressed in a machine-readable form, such as an OpenAPI (Swagger) description [[?OpenAPI]], an OGC GetCapabilities response [[?WFS]], [[?ISO-19142]], [[?WMS]], [[?ISO-19128]], a SPARQL Service Description [[?SPARQL11-SERVICE-DESCRIPTION]], an [[?OpenSearch]] or [[?WSDL20]] document, or a Hydra API description [[?HYDRA]], or else in text or some other informal mode if a formal representation is not possible. |

Property: license

| Property | license |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..1 |
| URI | dcterms:license |
| Range | dcterms:LicenseDocument |
| Definition | This property refers to the license under which the Data Service is made available. |
| Usage note | CV to be used: [[?DATA-GOV-LICENSE]] |

Property: serves dataset

| Property | serves dataset |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:servesDataset |
| Range | dcat:Dataset |
| Definition | The Dataset that is served by this data service. |
| Usage note | This property refers to a collection of data that this data service can distribute. |

Optional Properties

Property: keyword/tag

| Property | keyword/tag |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcat:keyword |
| Range | rdfs:Literal |
| Definition | This property contains a keyword or tag describing the Data Service. |

Property: spatial resolution in meters

| Property | spatial resolution in meters |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcat:spatialResolutionInMeters |
| Range | rdfs:Literal typed as xsd:decimal |
| Definition | This property refers to the minimum spatial separation resolvable in a Data Service, measured in meters. |

Property: temporal resolution

| Property | temporal resolution |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcat:temporalResolution |
| Range | rdfs:Literal typed as xsd:duration |
| Definition | The minimum time period resolvable by the Data Service. |
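
Because dcat:temporalResolution values are typed as xsd:duration, a lexical pre-check can catch malformed values early. The sketch below is a simplified approximation of the xsd:duration lexical form (for example `P1D` or `PT15M`) written for illustration; the name `is_xsd_duration` is hypothetical, and a schema-aware RDF or XSD library would be the more robust choice.

```python
import re

# Simplified xsd:duration pattern: optional sign, "P", then at least one of
# years/months/days or a "T" part holding hours/minutes/seconds.
_DURATION = re.compile(
    r"-?P(?=\d|T\d)"
    r"(\d+Y)?(\d+M)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_xsd_duration(value):
    """Return True when value looks like an xsd:duration literal."""
    return _DURATION.fullmatch(value) is not None


for candidate in ("P1D", "PT15M", "15 minutes"):
    print(candidate, is_xsd_duration(candidate))
```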

Property: theme/category

| Property | theme/category |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcat:theme |
| Range | skos:Concept |
| Definition | This property refers to a theme of the Data Service. A Data Service may be associated with multiple themes. |
| Usage note | CV to be used: [[?DATA-GOV-THEME]] |

Property: access rights

| Property | access rights |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:accessRights |
| Range | dcterms:RightsStatement |
| Definition | This property MAY include information regarding access or restrictions based on privacy, security, or other policies. |
| Usage note | CV that must be used: [[?DATA-GOV-AR]] |

Property: conforms to

| Property | conforms to |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:conformsTo |
| Range | dcterms:Standard |
| Definition | This property is used to indicate the general standard or specification that the Data Service endpoints implement. |

Property: creation date

| Property | creation date |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:created |
| Range | rdfs:Literal typed as xsd:date or xsd:dateTime |
| Definition | This property contains the date on which the Data Service was first created. |

Property: creator

| Property | creator |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:creator |
| Range | foaf:Agent |
| Usage note | This property refers to the Agent primarily responsible for producing the Data Service. |

Property: description

| Property | description |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:description |
| Range | rdfs:Literal |
| Definition | This property contains a free-text account of the Data Service. |
| Usage note | This property can be repeated for parallel language versions of the description (see Multilingualism). On the user interface of data portals, the content of the element whose language corresponds to the display language selected by the user is displayed. |

Property: identifier

| Property | identifier |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:identifier |
| Range | rdfs:Literal |
| Definition | This property contains the main identifier for the Data Service, e.g. the URI or other unique identifier in the context of the Catalog. |

Property: language

| Property | language |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:language |
| Range | dcterms:LinguisticSystem |
| Definition | This property refers to a language supported by the Data Service. |
| Usage note | Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used. |
| Usage note | This property can be repeated if the service is provided in multiple languages (e.g. a map service rendering maps in Spanish or English). |

Property: update/modification date

| Property | update/modification date |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:modified |
| Range | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) |
| Definition | This property contains the most recent date on which the Data Service was changed or modified. |

Property: rights

| Property | rights |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:rights |
| Range | dcterms:RightsStatement |
| Definition | A statement that concerns all rights for the Data Service not addressed with dcterms:license or dcterms:accessRights, such as copyright statements. |

Property: rights holder

| Property | rights holder |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:rightsHolder |
| Range | org:Organization |
| Definition | This property refers to an Agent (organization) holding rights on the Data Service. |

Property: spatial/geographic coverage

| Property | spatial/geographic coverage |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:spatial |
| Range | dcterms:Location |
| Definition | This property refers to a geographic region that is covered by the Data Service. |
| Usage note | TO DISCUSS: Conventions to be used: The Vocabularies Name Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]] |

Property: temporal coverage

| Property | temporal coverage |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:temporal |
| Range | dcterms:PeriodOfTime |
| Definition | This property refers to a temporal period that the Data Service covers. |

Property: category

| Property | category |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:type |
| Range | skos:Concept |
| Definition | Category of the data service. |
| Usage note | This property SHOULD take as its value the URI of a concept defined in a service type taxonomy or code list. |

Property: quality measurement

| Property | quality measurement |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dqv:hasQualityMeasurement |
| Range | dqv:QualityMeasurement |
| Definition | Refers to the performed quality measurements. It represents the evaluation of a given dataset against a specific quality metric. |
| Usage note | Use for quality measurements of data services (availability, response time, reliability). |

Property: qualified attribution

| Property | qualified attribution |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | prov:qualifiedAttribution |
| Range | prov:Attribution |
| Definition | This property refers to a link to an Agent having some form of responsibility for the Data Service. |

Property: status

| Property | status |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | adms:status |
| Range | skos:Concept |
| Usage note | This property refers to the maturity of the Data Service. It MUST take one of the values Completed, Deprecated, Under Development, or Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary. |

Property: was used by

| Property | was used by |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | prov:wasUsedBy |
| Range | prov:Activity |
| Definition | This property refers to an Activity that used the Data Service. |
| Usage note | This property MAY be used to specify a testing Activity over a Data Service, against a given Standard, producing as output a conformance degree. |

Property: geographic bounding box

| Property | geographic bounding box |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcat-us:geographicBoundingBox |
| Range | dcat-us:GeographicBoundingBox |
| Definition | This property describes the spatial extent of the domain of application of a data service and is standardized in the WGS 84 Lat/Long coordinate system. |

Example

Dataset

A Dataset is a collection of data, published or curated by a single source and related by a common idea or concept. In contrast to a Data Service, a Dataset is expected to be a collection of data that is available for access or download in one or more formats, as Distributions. Distributions belonging to the same Dataset should not differ with regard to the idea of the data that they represent. They may differ with regard to the physical representation of the data, such as format or resolution, or they may split the data of the Dataset into portions of comparable size, such as data per time period or location.

DCAT 3 provides guidelines about the usage of Data Services and Distributions in relation to Datasets [[VOCAB-DCAT-3]].

| RDF Class | dcat:Dataset |
|---|---|
| Definition | A collection of data, published or curated by a single agent, and available for access or download in one or more representations. |
| Subclass Of | dcat:Resource |
| Usage note | This class describes the conceptual dataset. One or more representations might be available, with differing schematic layouts and formats or serializations. |
| Usage note | This class describes the actual dataset as published by the dataset provider. In cases where a distinction between the actual dataset and its entry in the catalog is necessary (because metadata such as modification date might differ), the dcat:CatalogRecord class can be used for the latter. |
| Usage note | The notion of dataset in DCAT is broad and inclusive, with the intention of accommodating resource types arising from all communities. Data comes in many forms including numbers, text, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset. |
| Rationale | The update of dcat:Dataset is crucial as it aligns the DCAT profile with international standards, offering a standardized and widely recognized way to describe datasets. This alignment enhances data interoperability and discoverability, enabling data publishers to provide structured metadata, improving data sharing, and facilitating seamless integration for users and applications. |

| Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1 |
|---|---|---|---|---|---|
| title | dcterms:title | rdfs:Literal | M | 1..n | Multilingual support |
| description | dcterms:description | rdfs:Literal | M | 1..n | Multilingual support |
| contact point | dcat:contactPoint | vcard:Kind | R | 0..n | No Change |
| data dictionary | dcat-us:describedBy | dcat:Distribution | R | 0..1 | Fixed |
| dataset distribution | dcat:distribution | dcat:Distribution | R | 0..n | No Change |
| identifier | dcterms:identifier | rdfs:Literal | R | 0..n | Fixed |
| spatial/geographic coverage | dcterms:spatial | dcterms:Location | R | 0..n | Fixed |
| keyword/tag | dcat:keyword | rdfs:Literal | R | 0..n | No Change |
| landing page | dcat:landingPage | foaf:Document | R | 0..n | No Change |
| update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | No Change |
| publisher | dcterms:publisher | foaf:Agent | R | 0..1 | No Change |
| geographic bounding box | dcat-us:geographicBoundingBox | dcat-us:GeographicBoundingBox | R | 0..n | New |
| temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | R | 0..n | Fixed |
| theme/category | dcat:theme | skos:Concept | R | 0..n | Fixed |
| access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 | Aligned |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n | No Change |
| contributor | dcterms:contributor | dcterms:Agent | O | 0..n | New |
| creator | dcterms:creator | dcterms:Agent | O | 0..n | Aligned |
| documentation | foaf:page | foaf:Document | O | 0..n | New |
| frequency | dcterms:accrualPeriodicity | dcterms:Frequency | O | 0..1 | Fixed |
| has version | dcat:hasVersion | dcat:Dataset | O | 0..n | Aligned |
| image | schema:image | schema:url or schema:ImageObject | O | 0..n | New |
| inSeries | dcat:inSeries | dcat:DatasetSeries | O | 0..n | Aligned |
| is referenced by | dcterms:isReferencedBy | rdfs:Resource | O | 0..n | Aligned |
| language | dcterms:language | dcterms:LinguisticSystem | O | 0..n | Fixed |
| liability statement | dcat-us:liabilityStatement | dcat-us:LiabilityStatement | O | 0..1 | New |
| metadata distribution | dcat-us:metadataDistribution | dcat:Distribution | O | 0..n | New |
| next | dcat:next | dcat:Dataset | O | 0..1 | Aligned |
| other identifier | adms:identifier | adms:Identifier | O | 0..n | New |
| purpose | dcat-us:purpose | rdfs:Literal | O | 0..n | New |
| prev | dcat:prev | dcat:Dataset | O | 0..1 | Aligned |
| provenance | dcterms:provenance | dcterms:ProvenanceStatement | O | 0..n | New |
| qualified attribution | prov:qualifiedAttribution | prov:Attribution | O | 0..n | Aligned |
| qualified relation | dcat:qualifiedRelation | dcat:Relationship | O | 0..n | Aligned |
| related resource | dcterms:relation | rdfs:Resource | O | 0..n | Aligned |
| release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 | No Change |
| rights | dcterms:rights | dcterms:RightsStatement | O | 0..n | Fixed |
| sample | adms:sample | dcat:Distribution | O | 0..n | New |
| scope note | skos:scopeNote | rdfs:Literal | O | 0..n | New |
| source | dcterms:source | dcat:Dataset | O | 0..n | New |
| status | adms:status | skos:Concept | O | 0..1 | Aligned |
| subject | dcterms:subject | skos:Concept | O | 0..n | New |
| quality measurement | dqv:hasQualityMeasurement | dqv:QualityMeasurement | O | 0..n | Aligned |
| spatial resolution in meters | dcat:spatialResolutionInMeters | rdfs:Literal (typed as xsd:decimal) | O | 0..n | Aligned |
| temporal resolution | dcat:temporalResolution | rdfs:Literal (typed as xsd:duration) | O | 0..n | Aligned |
| category | dcterms:type | skos:Concept | O | 0..1 | Aligned |
| version | dcat:version | rdfs:Literal | O | 0..n | Aligned |
| version notes | adms:versionNotes | rdfs:Literal | O | 0..n | New |
| was generated by | prov:wasGeneratedBy | prov:Activity | O | 0..n | New |

Mandatory Properties

Property: title

| Property | title |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..n |
| URI | dcterms:title |
| Range | rdfs:Literal |
| Definition | This property contains a name given to the Dataset. |
| Usage note | This property can be repeated for parallel language versions of the title (see Multilingualism). |

Property: description

| Property | description |
|---|---|
| Requirement level | Mandatory |
| Cardinality | 1..n |
| URI | dcterms:description |
| Range | rdfs:Literal |
| Definition | This property contains a free-text account of the Dataset. |
| Usage note | This property can be repeated for parallel language versions of the description (see Multilingualism). On the user interface of data portals, the content of the element whose language corresponds to the display language selected by the user is displayed. |

Recommended Properties

Property: contact point

| Property | contact point |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:contactPoint |
| Range | vcard:Kind |
| Usage note | This property contains contact information that can be used for sending comments about the Dataset. |
| Usage note | This property MUST contain an email address that is continuously monitored by the data publisher. |
| Usage note | If there are several contributors involved in the publication of the Dataset, the property can be used multiple times. |

Property: dataset distribution

| Property | dataset distribution |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:distribution |
| Range | dcat:Distribution |
| Usage note | This property links the Dataset to an available Distribution. |
| Usage note | In exceptional cases, a Dataset for which no distribution form exists (yet) can be described in the Catalog. In this case, the element dcat:distribution may be omitted. |

Property: identifier

| Property | identifier |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcterms:identifier |
| Range | rdfs:Literal |
| Usage note | This property contains a unique identifier for the Dataset, e.g. a URI or other unique identifier in the context of the Catalog. |

Property: spatial/geographic coverage

| Property | spatial/geographic coverage |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcterms:spatial |
| Range | dcterms:Location |
| Usage note | This property refers to a geographic region that is covered by the Dataset. |
| Usage note | CV to be used: The Vocabularies Name Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]] |

Property: keyword/tag

| Property | keyword/tag |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:keyword |
| Range | rdfs:Literal |
| Usage note | This property contains a keyword or tag describing the Dataset. |
| Usage note | Good practice: mark the language of the keywords with the [[ISO 639-1]] language code such as "geodata"@en. |
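
To illustrate the language-tagging practice, the sketch below renders a keyword as a Turtle-style literal with an optional language tag, producing the same `"geodata"@en` form shown in the usage note. The helper name `turtle_literal` is hypothetical, and the escaping covers only backslashes, double quotes, and line breaks; an RDF library would normally handle serialization.

```python
def turtle_literal(text, lang=None):
    """Render text as a double-quoted Turtle literal, optionally language-tagged."""
    escaped = (
        text.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    literal = f'"{escaped}"'
    return f"{literal}@{lang}" if lang else literal


# Hypothetical keywords for one dataset, each with an ISO 639-1 code.
for keyword, code in [("geodata", "en"), ("geodatos", "es")]:
    print("dcat:keyword", turtle_literal(keyword, code))
```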

Property: landing page

| Property | landing page |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:landingPage |
| Range | foaf:Document |
| Usage note | This property refers to a web page that provides access to the Dataset, its Distributions and/or additional information. |
| Usage note | It is intended to point to a landing page at the original data provider, not to a page on a site of a third party, such as an aggregator. |

Property: update/modification date

| Property | update/modification date |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..1 |
| URI | dcterms:modified |
| Range | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) |
| Usage note | This property contains the most recent date on which the Dataset was changed or modified. |
| Usage note | No value may indicate that the Dataset has never changed after its initial publication, or that the date of the last modification is not known, or that the Dataset is continuously updated. |
| Usage note | This property MUST only be set if the distributions (the actual data) that the Dataset describes have been updated after it has been issued. In this case the property MUST contain the date of the last update. That way a person or institution using the data for an analysis or application will know when to update the report or application on their side. |
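
Since date values may be typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth, a publisher often needs to decide which datatype fits a given string. The following sketch picks a datatype from the lexical form. It is an illustration under simplifying assumptions: the name `infer_xsd_date_type` is hypothetical, only four-digit years are handled, time zones are accepted on xsd:dateTime only, and days are not checked against the month length.

```python
import re

_MONTH = r"(0[1-9]|1[0-2])"
_DAY = r"(0[1-9]|[12]\d|3[01])"
_TIME = r"([01]\d|2[0-3]):[0-5]\d:[0-5]\d(\.\d+)?(Z|[+-]\d{2}:\d{2})?"

# Ordered from least to most specific lexical form.
_DATE_PATTERNS = [
    ("xsd:gYear", r"\d{4}"),
    ("xsd:gYearMonth", rf"\d{{4}}-{_MONTH}"),
    ("xsd:date", rf"\d{{4}}-{_MONTH}-{_DAY}"),
    ("xsd:dateTime", rf"\d{{4}}-{_MONTH}-{_DAY}T{_TIME}"),
]


def infer_xsd_date_type(value):
    """Return the matching XSD datatype name, or None if no form matches."""
    for datatype, pattern in _DATE_PATTERNS:
        if re.fullmatch(pattern, value):
            return datatype
    return None


for sample in ("2024", "2024-03", "2024-03-15", "2024-03-15T10:30:00Z"):
    print(sample, infer_xsd_date_type(sample))
```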

Property: publisher

| Property | publisher |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..1 |
| URI | dcterms:publisher |
| Range | foaf:Agent |
| Usage note | This property refers to an entity (organization) responsible for making the Dataset available. |

Property: geographic bounding box

| Property | geographic bounding box |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat-us:geographicBoundingBox |
| Range | dcat-us:GeographicBoundingBox |
| Definition | A geographic bounding box in the WGS 84 coordinate system (Lat/Long) that describes the spatial extent of the dataset. |
| Usage note | A dataset can have multiple geographic bounding boxes (for example, the continental US and Alaska). The goal of the geographic bounding box is to provide a common coordinate reference system to describe the spatial extent of the dataset. |
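
A dataset with several bounding boxes can be sanity-checked against the WGS 84 latitude and longitude ranges. The sketch below is illustrative only: the `BoundingBox` class and its field names (`west`, `south`, `east`, `north`) are hypothetical and are not the property names of dcat-us:GeographicBoundingBox, and the coordinates are rough placeholders rather than authoritative extents. West is not required to be less than east, so that boxes crossing the antimeridian are not rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    # Hypothetical field names, in decimal degrees (WGS 84).
    west: float
    south: float
    east: float
    north: float

    def problems(self):
        """Return a list of range problems; an empty list means the box is plausible."""
        found = []
        for name in ("west", "east"):
            if not -180.0 <= getattr(self, name) <= 180.0:
                found.append(f"{name} longitude outside [-180, 180]")
        for name in ("south", "north"):
            if not -90.0 <= getattr(self, name) <= 90.0:
                found.append(f"{name} latitude outside [-90, 90]")
        if self.south > self.north:
            found.append("south latitude is greater than north latitude")
        return found


# Rough, illustrative extents only.
extents = [
    BoundingBox(west=-125.0, south=24.0, east=-66.0, north=50.0),   # continental US
    BoundingBox(west=-170.0, south=51.0, east=-130.0, north=72.0),  # Alaska
]
for box in extents:
    print(box, box.problems())
```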

Property: temporal coverage

| Property | temporal coverage |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcterms:temporal |
| Range | dcterms:PeriodOfTime |
| Usage note | The temporal coverage of a dataset may be encoded as an instance of dcterms:PeriodOfTime, or may be indicated using an IRI reference (link) to a resource describing a time period. |

Property: theme/category

| Property | theme/category |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..n |
| URI | dcat:theme |
| Range | skos:Concept |
| Usage note | This property refers to a category of the Dataset. A Dataset may be associated with multiple themes. |
| Usage note | CV to be used: [[?DATA-GOV-THEME]] |

Optional Properties

Property: access rights

| Property | access rights |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:accessRights |
| Range | dcterms:RightsStatement |
| Usage note | This property refers to information that indicates whether the Dataset is open data, has access restrictions or is not public. |
| Usage note | CV to be used: [[?DATA-GOV-AR]]. |

Property: conforms to

| Property | conforms to |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:conformsTo |
| Range | dcterms:Standard |
| Usage note | This property refers to an implementing rule or other specification. |
| Usage note | This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a Dataset conforms to. This is (generally) a complementary concern to the media-type or format. |

Property: contributor

| Property | contributor |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | dcterms:contributor |
| Range | foaf:Agent |
| Usage note | This property refers to an agent contributing to the Dataset. |

Property: creator

| Property | creator |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:creator |
| Range | foaf:Agent |
| Usage note | This property refers to an entity responsible for producing the dataset. |

Property: data dictionary

| Property | data dictionary |
|---|---|
| Requirement level | Recommended |
| Cardinality | 0..1 |
| URI | dcat-us:describedBy |
| Range | dcat:Distribution |
| Usage note | This is used to specify a data dictionary or schema that defines fields (variables, dimensions, measures, attributes) in the dataset. |

Property: documentation

| Property | documentation |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..n |
| URI | foaf:page |
| Range | foaf:Document |
| Usage note | This property refers to a page or document about this Dataset. |

Property: frequency

| Property | frequency |
|---|---|
| Requirement level | Optional |
| Cardinality | 0..1 |
| URI | dcterms:accrualPeriodicity |
| Range | dcterms:Frequency |
| Usage note | This property refers to the frequency at which the Dataset is updated. |
| Usage note | CV to be used: [[CLD-FREQ]]. |

Property: quality measurement

Property: quality measurement
Requirement level: Optional
Cardinality: 0..n
URI: dqv:hasQualityMeasurement
Range: dqv:QualityMeasurement
Definition: Refers to the performed quality measurements. It represents the evaluation of a given dataset against a specific quality metric.
Usage note: Use for quality measurements other than spatial resolution in meters (for that, use dcat:spatialResolutionInMeters). Examples of quality measurements include completeness, accuracy, timeliness, and granularity.

Property: has version

URI: dcat:hasVersion
Definition: This resource has a more specific, versioned resource [[?PAV]].
Equivalent property: pav:hasVersion
Sub-property of: dcterms:hasVersion
Sub-property of: prov:generalizationOf
Usage note:

A related Dataset that is a version, edition, or adaptation of the described Dataset.

Property: inSeries

Property: inSeries
Requirement level: Optional
Cardinality: 0..n
URI: dcat:inSeries
Range: dcat:DatasetSeries
Definition: A dataset series of which the dataset is part.
Usage note: Datasets are linked to a dataset series by using the property dcat:inSeries. Note that a dataset series can also be hierarchical: a dataset series can be a member of another dataset series.

Property: is referenced by

Property: is referenced by
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:isReferencedBy
Range: rdfs:Resource
Usage note: This property is about a related resource, such as a publication, that references, cites, or otherwise points to the Dataset.

Property: language

Property: language
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:language
Range: dcterms:LinguisticSystem
Definition: A language of the dataset. This refers to the natural language used for textual metadata (i.e., titles, descriptions, etc.) of a dataset.
Usage note: Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
Usage note: The value(s) provided for members of a catalog (i.e., dataset or service) override the value(s) provided for the catalog if they conflict.
Usage note: If representations of a dataset are available for each language separately, define an instance of dcat:Distribution for each language and describe the specific language of each distribution using dcterms:language (i.e., the dataset will have multiple dcterms:language values and each distribution will have just one as the value of its dcterms:language property). In case of multilingual distributions, the distributions will have multiple dcterms:language values.

Property: next

Property: next
Requirement level: Optional
Cardinality: 0..1
URI: dcat:next
Range: dcat:Dataset
Definition: The following resource (after the current one) in an ordered collection or series of resources.

Property: other identifier

Property: other identifier
Requirement level: Optional
Cardinality: 0..n
URI: adms:identifier
Range: adms:Identifier
Usage note: A secondary identifier of the Dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID.

Property: prev

Property: prev
Requirement level: Optional
Cardinality: 0..1
URI: dcat:prev
Range: dcat:Dataset
Definition: The previous resource (before the current one) in an ordered collection or series of resources.
Usage note: Unless the dataset is the last in the chain, a dataset in a collection must have a previous one.

Property: provenance

Property: provenance
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:provenance
Range: dcterms:ProvenanceStatement
Definition:

A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation.

Usage note: This property contains a statement about the lineage of a Dataset.

Property: qualified attribution

Property: qualified attribution
Requirement level: Optional
Cardinality: 0..n
URI: prov:qualifiedAttribution
Range: prov:Attribution
Usage note: This property refers to a link to an Agent having some form of responsibility for the resource.

Property: qualified relation

Property: qualified relation
Requirement level: Optional
Cardinality: 0..n
URI: dcat:qualifiedRelation
Range: dcat:Relationship
Usage note:

This property provides a link to a description of a relationship with another resource and it is especially meant for relationships between Datasets.
It replaces the property rdfs:seeAlso of DCAT-US v1.
See here for examples on how to use it: dcat:qualifiedRelation.

Property: release date

Property: release date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note:

This property contains the date of formal issuance (e.g., first publication) of the Dataset.
If this date is not known, the date of the first referencing of the data collection in the Catalog can be entered.
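
The Range above allows four XSD datatypes for the release date. As a minimal sketch, assuming the date arrives as a plain string, the datatype could be chosen from the lexical shape of the value. The function names `xsd_type_for` and `typed_literal` are hypothetical and not part of the profile; the patterns check shape only (for example, month 13 is not rejected) and do not handle negative years or time-zone suffixes on the shorter forms.

```python
import re

# Lexical patterns for the four datatypes named in the Range, most specific first.
# Simplified on purpose: shape only, no calendar validation.
_PATTERNS = [
    ("xsd:dateTime",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def xsd_type_for(value: str):
    """Return the matching XSD datatype name, or None if no pattern fits."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None


def typed_literal(value: str) -> str:
    """Render a Turtle-style typed literal for a dcterms:issued value."""
    datatype = xsd_type_for(value)
    if datatype is None:
        raise ValueError(f"not a supported date form: {value!r}")
    return f'"{value}"^^{datatype}'
```

For instance, `typed_literal("2020-04")` yields a literal typed as xsd:gYearMonth, while a free-text value such as "April 2020" is rejected.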

Property: rights

Property: rights
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:rights
Range: dcterms:RightsStatement
Usage note: This property refers to a statement that specifies copyrights associated with the Dataset.

Property: sample

Property: sample
Requirement level: Optional
Cardinality: 0..n
URI: adms:sample
Range: dcat:Distribution
Definition:

Links to a sample of a Dataset, which is a dcat:Distribution.

Property: usage note

Property: usage note
Requirement level: Optional
Cardinality: 0..n
URI: skos:scopeNote

Property: source

Property: source
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:source
Range: dcat:Dataset
Usage note: A related Dataset from which the described Dataset is derived.

Property: subject

Property: subject
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:subject
Range: skos:Concept
Definition: Primary subjects of the Dataset.
Usage note: Primary subjects of the Dataset, defined in a controlled vocabulary. Subjects are typically narrower in meaning than dcat:theme.

Property: status

Property: status
Requirement level: Optional
Cardinality: 0..1
URI: adms:status
Range: skos:Concept
Usage note: This property refers to the maturity of the Dataset. It MUST take one of the values Completed, Deprecated, Under Development, or Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.

Property: spatial resolution in meters

Property: spatial resolution in meters
Requirement level: Optional
Cardinality: 0..n
URI: dcat:spatialResolutionInMeters
Range: rdfs:Literal (typed as xsd:decimal)
Usage note:

If the dataset is an image or grid this should correspond to the spacing of items. For other kinds of spatial datasets, this property will usually indicate the smallest distance between items in the dataset.
The range of this property is a decimal number representing a length in meters. This is intended to provide a summary indication of the spatial resolution of the data as a single number. More complex descriptions of various aspects of spatial precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [[VOCAB-DQV]].

Property: temporal resolution

Property: temporal resolution
Requirement level: Optional
Cardinality: 0..n
URI: dcat:temporalResolution
Range: rdfs:Literal (typed as xsd:duration)
Usage note:

If the dataset is a time series, this should correspond to the spacing of items in the series. For other kinds of dataset, this property will usually indicate the smallest time difference between items in the dataset.
This is intended to provide a summary indication of the temporal resolution of the dataset as a single value. More complex descriptions of various aspects of temporal precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [[VOCAB-DQV]].
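
Because the Range is a literal typed as xsd:duration, a publisher holding the spacing as a Python `timedelta` needs to serialize it in the `PnDTnHnMnS` form. The following is a sketch only: the helper name `to_xsd_duration` is hypothetical, and it covers non-negative whole-second durations expressed in days, hours, minutes and seconds. A `timedelta` cannot represent calendar months or years, so values such as `P1M` or `P1Y` would have to be written by hand.

```python
from datetime import timedelta


def to_xsd_duration(delta: timedelta) -> str:
    """Format a non-negative, whole-second timedelta as an xsd:duration string."""
    total = int(delta.total_seconds())
    if total < 0 or delta.microseconds:
        raise ValueError("only non-negative whole-second durations are handled")
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{n}{unit}"
        for n, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if n
    )
    if not date_part and not time_part:
        return "PT0S"  # zero-length duration
    # The "T" separator appears only when there is a time component.
    return "P" + date_part + ("T" + time_part if time_part else "")
```

A daily time series would then carry `to_xsd_duration(timedelta(days=1))`, and a sensor feed sampled every 15 minutes would carry `to_xsd_duration(timedelta(minutes=15))`.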

Property: category

Property: category
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Usage note:

A type of the Dataset.
A recommended controlled vocabulary data-type is foreseen.

Property: version

Property: version
Requirement level: Optional
Cardinality: 0..n
URI: dcat:version
Range: rdfs:Literal
Usage note: The version indicator (name or identifier) of a resource.

Property: version notes

Property: version notes
Requirement level: Optional
Cardinality: 0..n
URI: adms:versionNotes
Range: rdfs:Literal
Usage note:

A description of the differences between this version and a previous version of the Dataset.
This property can be repeated for parallel language versions of the version notes.

Property: was generated by

Property: was generated by
Requirement level: Optional
Cardinality: 0..n
URI: prov:wasGeneratedBy
Range: prov:Activity
Usage note: An activity that generated, or provides the business context for, the creation of the dataset.

Example

Property: metadata distribution

Property: metadata distribution
Requirement level: Optional
Cardinality: 0..n
URI: dcat-us:metadataDistribution
Range: dcat:Distribution
Definition: Property referring to a distribution of the metadata document from which this dataset is derived.
Usage note: Distribution of the "original" metadata document from which the dataset is derived.

Property: liability statement

Property: liability statement
Requirement level: Optional
Cardinality: 0..1
URI: dcat-us:liabilityStatement
Range: dcat-us:LiabilityStatement
Usage note: A liability statement about the dataset.

Property: purpose

Property: purpose
Requirement level: Optional
Cardinality: 0..n
URI: dcat-us:purpose
Range: rdfs:Literal
Usage note: The purpose of the dataset.

Property: image

Property: image
Requirement level: Optional
Cardinality: 0..3
URI: schema:image
Range: schema:url or schema:ImageObject
Definition: A thumbnail picture illustrating the content of the dataset.
Usage note:

A thumbnail picture illustrating the content of the Dataset.
For distributions that consist of visual content (photographs, videos, maps, etc.) it makes sense to add a limited number of thumbnails to the metadata.
It’s a DCAT-US Custom Class

Dataset Series

The DatasetSeries concept in the DCAT-US specification serves a dual purpose. Primarily, it represents a collection of related datasets that share common characteristics and are published as a series, facilitating the organization and discovery of datasets that evolve over time or are updated regularly. Beyond this, DatasetSeries also provides a mechanism for grouping datasets into thematic collections, regardless of whether these collections form a temporal series. This flexibility enhances the specification's utility by supporting a wider range of data publication practices, enabling users to effectively discover and understand datasets grouped by series or thematic similarity.

RDF Class: dcat:DatasetSeries
Definition: A collection of datasets that are published separately, but share some characteristics that group them.
Subclass Of: dcat:Dataset
Usage note:

Dataset series can also be soft-typed via the property dcterms:type, as in the approach used in [[?GeoDCAT-AP]].
Common scenarios for dataset series include: time series composed of periodically released subsets; map-series composed of items of the same type or theme but with differing spatial footprints.

Rationale: Incorporating dcat:DatasetSeries is essential to enable the structured grouping and presentation of related datasets, ensuring that data publishers can convey meaningful collections of data. This facilitates efficient data organization and discovery for users, aligning the DCAT profile with international standards for dataset series representation.

Property | URI | Range | ReqLevel | Card
title | dcterms:title | rdfs:Literal | M | 1..n
description | dcterms:description | rdfs:Literal | M | 1..n
contact point | dcat:contactPoint | vcard:Kind | R | 0..n
first | dcat:first | dcat:Dataset | R | 0..1
geographic bounding box | dcat-us:geographicBoundingBox | dcat-us:GeographicBoundingBox | R | 0..n
spatial/geographic coverage | dcterms:spatial | dcterms:Location | R | 0..n
last | dcat:last | dcat:Dataset | R | 0..1
update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1
publisher | dcterms:publisher | foaf:Agent | R | 0..1
series member | dcat:seriesMember | dcat:Dataset | R | 0..1
temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | R | 0..n
frequency | dcterms:accrualPeriodicity | dcterms:Frequency | O | 0..1
release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1
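
The ReqLevel and Card columns can be read as simple count constraints. The profile itself relies on SHACL for validation; the sketch below is not SHACL and only illustrates how a cardinality such as `1..n` or `0..1` translates into a minimum and maximum count. The names `SERIES_RULES`, `parse_cardinality` and `check_cardinalities` are hypothetical, the record layout (property name mapped to a list of values) is an assumption, and only four rows of the table are transcribed.

```python
# A few rows transcribed from the Dataset Series property table:
# property -> (requirement level, cardinality).
SERIES_RULES = {
    "dcterms:title": ("M", "1..n"),
    "dcterms:description": ("M", "1..n"),
    "dcterms:publisher": ("R", "0..1"),
    "dcterms:accrualPeriodicity": ("O", "0..1"),
}


def parse_cardinality(card: str):
    """Split 'min..max' into (min, max); 'n' means unbounded and becomes None."""
    low, high = card.split("..")
    return int(low), (None if high == "n" else int(high))


def check_cardinalities(record: dict, rules: dict) -> list:
    """Return readable violations; record maps a property to its list of values."""
    problems = []
    for prop, (_level, card) in rules.items():
        low, high = parse_cardinality(card)
        count = len(record.get(prop, []))
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        if high is not None and count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems
```

A record with a title, no description and two publishers would produce two violations: one for the missing mandatory description and one for exceeding the 0..1 limit on publisher.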

Mandatory Properties

Property: title

Property: title
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:title
Range: rdfs:Literal
Usage note:

This property contains a name given to the Dataset Series.
This property can be repeated for parallel language versions of the name (see Multilingualism).

Property: description

Property: description
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:description
Range: rdfs:Literal
Usage note:

This property contains a free-text account of the Dataset Series.
This property can be repeated for parallel language versions of the description (see Multilingualism). It is recommended to indicate the dimensions along which the Dataset Series evolves.

Recommended Properties

Property: contact point

Property: contact point
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:contactPoint
Range: vcard:Kind
Usage note: Contact information that can be used for sending comments about the Dataset Series.

Property: first

Property: first
Requirement level: Recommended
Cardinality: 0..1
URI: dcat:first
Range: dcat:Dataset
Usage note: The first resource in an ordered collection or series of resources, to which the current resource belongs.

Property: geographic bounding box

Property: geographic bounding box
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:geographicBoundingBox
Range: dcat-us:GeographicBoundingBox
Definition: A geographic bounding box in the WGS 84 coordinate system (Lat/Long) that describes the spatial extent of the dataset series.
Usage note: A dataset series can have multiple geographic bounding boxes (for example, the continental US and Alaska). The goal of the geographic bounding box is to provide a common coordinate reference system for describing the spatial extent of the dataset series.

Property: spatial/geographic coverage

Property: spatial/geographic coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:spatial
Range: dcterms:Location
Usage note:

This property refers to a geographic region that is covered by the Dataset Series.
When spatial coverage is a dimension in the dataset series, the spatial coverage of each dataset in the collection should be part of the spatial coverage of the series. In that case, an open-ended value is recommended, e.g. EU or a broad bounding box covering the expected values.
CV to be used: The Named Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of those lists, GeoNames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]]

Property: last

Property: last
Requirement level: Recommended
Cardinality: 0..1
URI: dcat:last
Range: dcat:Dataset
Usage note: The last resource in an ordered collection or series of resources.

Property: update/modification date

Property: update/modification date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note:

This property contains the most recent date on which the Dataset Series was changed or modified.
No value may indicate that the Dataset Series has never changed after its initial publication, that the date of the last modification is not known, or that the Dataset Series is continuously updated.
This is not the same as the modification date of the most recently modified dataset in the series.

Property: publisher

Property: publisher
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:publisher
Range: foaf:Agent
Usage note:

This property refers to an entity (organization) responsible for ensuring the coherency of the Dataset Series.
The publisher of the dataset series may not be the publisher of all datasets. E.g. a digital archive could take over the publishing of older datasets in the series.

Property: series member

Property: series member
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:seriesMember
Range: dcat:Dataset
Usage note: A member of the Dataset Series.

Property: temporal coverage

Property: temporal coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:temporal
Range: dcterms:PeriodOfTime
Usage note:

A temporal period that the Dataset Series covers.
When temporal coverage is a dimension in the dataset series, the temporal coverage of each dataset in the collection should be part of the temporal coverage of the series. In that case, an open-ended value is recommended, e.g. after 2012.
The temporal coverage of a dataset series may be encoded as an instance of dcterms:PeriodOfTime, or may be indicated using an IRI reference (link) to a resource describing a time period.

Optional Properties

Property: frequency

Property: frequency
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accrualPeriodicity
Range: dcterms:Frequency
Usage note:

This property refers to the frequency at which the Dataset Series is updated.
The frequency of a dataset series is not the same as the frequency of the individual datasets in the collection.
CV to be used: [[CLD-FREQ]].

Property: release date

Property: release date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note:

This property contains the date of formal issuance (e.g., publication) of the Dataset Series.
This is the moment when the dataset series was established as a managed resource. It is not the same as the release date of the oldest dataset in the series.

Example

In this example, ex:populationCensus represents a series of datasets related to the US Population Census Data, which is issued every 10 years (decennial). Individual datasets for specific years (e.g., ex:populationCensus-1950) are also defined, each pointing to the next dataset in the series using dcat:next.
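
The linking pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the profile's own example: the helper names `build_series` and `to_turtle` are hypothetical, the years other than 1950 are invented for the demonstration, the `ex:` namespace IRI is not given in the text so prefix declarations are omitted, and the output is Turtle-like text rather than a validated serialization.

```python
def build_series(series_id: str, years: list) -> dict:
    """Return {subject: {predicate: object}} for a series and its ordered members."""
    if not years:
        raise ValueError("a series needs at least one member year")
    members = [f"{series_id}-{year}" for year in sorted(years)]
    graph = {
        series_id: {
            "a": "dcat:DatasetSeries",
            "dcat:first": members[0],
            "dcat:last": members[-1],
        }
    }
    for i, member in enumerate(members):
        node = {"a": "dcat:Dataset", "dcat:inSeries": series_id}
        if i > 0:
            node["dcat:prev"] = members[i - 1]  # earlier release
        if i < len(members) - 1:
            node["dcat:next"] = members[i + 1]  # later release
        graph[member] = node
    return graph


def to_turtle(graph: dict) -> str:
    """Serialize the nested dict as Turtle-like text (prefix declarations omitted)."""
    blocks = []
    for subject, props in graph.items():
        lines = [f"    {pred} {obj}" for pred, obj in props.items()]
        blocks.append(subject + "\n" + " ;\n".join(lines) + " .")
    return "\n\n".join(blocks)
```

Calling `print(to_turtle(build_series("ex:populationCensus", [1950, 1960, 1970])))` prints the series node followed by one node per census year, each linked to its neighbours with dcat:prev and dcat:next.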

Distribution

In the context of the DCAT-US profile, a metadata entry of this class serves to characterize a distribution of data, which constitutes a specific representation of a Dataset. Datasets within this profile may offer multiple serializations, each potentially differing in various aspects, including natural language, media type or format, schematic organization, temporal and spatial resolution, level of detail, or profiles that specify any combination of these attributes.

A distribution may encompass the entirety of the Dataset's data or only a subset thereof. For example, it could encompass all data related to the population in the United States or focus exclusively on a specific year, such as 2020. Alternatively, it might provide the data in an alternate format, such as a graphical representation covering the years 2010 through 2020.

Within the DCAT-US profile, various relationships between Datasets and their distributions are represented. The most straightforward relationship involves aggregating different physical representations of data, referred to as "Distributions," into a single Dataset. An example of such a Dataset is a time series, where each distribution corresponds to one year of data, and the Dataset spans multiple years.

In the DCAT vocabulary, dcat:Distribution is employed to characterize the diverse representations and formats in which a dataset is disseminated, facilitating the description of different versions or media types of the same data, and often includes properties like dcat:downloadURL for direct download links. On the other hand, dcat:DataService serves the purpose of detailing data access services, such as APIs and endpoints, enabling programmatic or interactive data retrieval, with key properties like dcat:endpointURL specifying service endpoints and dcat:serviceType indicating the type of service, thus distinguishing between the description of data formats and the specification of data access services within the DCAT framework.

RDF Class: dcat:Distribution
Definition: A specific representation of a dataset. A dataset might be available in multiple serializations that may differ in various ways, including natural language, media-type or format, schematic organization, temporal and spatial resolution, level of detail or profiles (which might specify any or all of the above).
Subclass Of: dcat:Resource
Usage note:

This represents a general availability of a dataset. It implies no information about the actual access method of the data, i.e., whether by direct download, API, or through a Web page. The use of dcat:downloadURL property indicates directly downloadable distributions.

Rationale: The update to DCAT 3 dcat:Distribution is of paramount significance as it greatly enhances data accessibility. It introduces a more comprehensive and structured approach to describing data distributions, ensuring that data consumers can easily understand and access the data in the format that best suits their needs, ultimately fostering greater data utilization and dissemination.

Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1
license | dcterms:license | dcterms:LicenseDocument | M | 1..1 | Aligned
access URL | dcat:accessURL | rdfs:Resource | R | 0..1 | No Change
format | dcterms:format | dcterms:MediaType | R | 0..1 | Fixed
rights | dcterms:rights | dcterms:RightsStatement | R | 0..n | Aligned
access restriction | dcat-us:accessRestriction | dcat-us:AccessRestriction | R | 0..n | New
usage restriction | dcat-us:useRestriction | dcat-us:UseRestriction | R | 0..n | New
cui restriction | dcat-us:cuiRestriction | dcat-us:CuiRestriction | R | 0..1 | New
data dictionary | dcat-us:describedBy | dcat:Distribution | R | 0..1 | Fixed
title | dcterms:title | rdfs:Literal | R | 0..n | Multilingual support
update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | Aligned
representation technique | adms:representationTechnique | skos:Concept | O | 0..1 | New
status | adms:status | skos:Concept | O | 0..1 | Aligned
character encoding | cnt:characterEncoding | rdfs:Literal | O | 0..n | New
compression format | dcat:compressFormat | dcterms:MediaType | O | 0..1 | Aligned
spatial resolution in meters | dcat:spatialResolutionInMeters | xsd:decimal | O | 0..1 | Aligned
quality measurement | dqv:hasQualityMeasurement | dqv:QualityMeasurement | O | 0..n | Aligned
access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 | Aligned
access service | dcat:accessService | dcat:DataService | O | 0..n | Aligned
byte size | dcat:byteSize | xsd:nonNegativeInteger | O | 0..1 | Aligned
checksum | spdx:checksum | spdx:Checksum | O | 0..1 | Aligned
documentation | foaf:page | foaf:Document | O | 0..n | New
download URL | dcat:downloadURL | rdfs:Resource | O | 0..1 | No Change
identifier | dcterms:identifier | rdfs:Literal | O | 0..1 | Aligned
image | schema:image | schema:url or schema:ImageObject | O | 0..3 | New
language | dcterms:language | dcterms:LinguisticSystem | O | 0..n | Aligned
conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n | No Change
media type | dcat:mediaType | dcterms:MediaType | O | 0..1 | Fixed
packaging format | dcat:packageFormat | dcterms:MediaType | O | 0..1 | Aligned
release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | Aligned
temporal resolution | dcat:temporalResolution | xsd:duration | R | 0..1 | Aligned

Recommended Properties

Property: access restriction

Property: access restriction
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:accessRestriction
Range: dcat-us:AccessRestriction
Usage note:

This property refers to a statement that specifies access restriction associated with the Distribution.

Property: access URL

Property: access URL
Requirement level: Recommended
Cardinality: 0..1
URI: dcat:accessURL
Range: rdfs:Resource
Usage note:

This should be the URL for an indirect means of accessing the data, such as API documentation, a 'wizard' or other graphical interface which is used to generate a download, feed, or a request form for the data. When the access is restricted but the dataset is available online indirectly, this field should be the URL that provides indirect access. This should not be a direct download URL. It is usually assumed that accessURL is an HTML webpage. This property contains a URL that gives access to a Distribution of the Dataset (typically from a service). The resource at the access URL may contain information about how to get the Dataset.
If the distribution(s) are accessible only through a landing page (i.e., direct download URLs are not known), then the landing page URL associated with the dcat:Dataset SHOULD be duplicated as the access URL on a distribution.
dcat:accessURL should match the property dcat:endpointURL of the dcat:DataService associated with the distribution.

Property: description

Property: description
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Usage note:

This property contains a free-text account of the Distribution.
The description MAY be provided if the distribution contains only part of the data offered by the Dataset.
This property can be repeated for parallel language versions of the description (see Multilingualism).

Property: data dictionary

Property: data dictionary
Requirement level: Recommended
Cardinality: 0..1
URI: dcat-us:describedBy
Range: dcat:Distribution
Usage note:

This is used to specify a data dictionary or schema that defines fields (variables, dimensions, measures, attributes) in the distribution.

Property: format

Property: format
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:format
Range: dcterms:MediaType
Usage note:

This property refers to the file format of the Distribution.
CV to be used: [[?DATA-GOV-FT]]
If a format is not available, use media type ([[IANA-MEDIA-TYPES]]) if applicable.

Property: license

Property: license
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:license
Range: dcterms:LicenseDocument
Usage note:

This property refers to the license under which the Distribution is made available.
CV to be used: [[?DATA-GOV-LICENSE]]

Property: rights

Property: use restriction

Property: use restriction
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:useRestriction
Range: dcat-us:UseRestriction
Usage note: This property refers to a statement that specifies a use restriction associated with the Distribution.

Property: cui restriction

Property: cui restriction
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:cuiRestriction
Range: dcat-us:CuiRestriction
Usage note: This property refers to a statement that specifies a CUI restriction associated with the Distribution.

Property: title

Property: title
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:title
Range: rdfs:Literal
Usage note:

This property contains a name given to the Distribution. This property can be repeated for parallel language versions of the title (see Multilingualism).
The title MUST be given if the distribution contains only part of the data offered by the Dataset.
The title can be given in several languages. In multilingual data portals, the title in the language selected by a user will usually be shown as title for the distribution.

Property: update/modification date

Property: update/modification date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note: This property contains the most recent date on which the Distribution was changed or modified.

Optional Properties

Property: representation technique

Property: representation technique
Requirement level: Optional
Cardinality: 0..1
URI: adms:representationTechnique
Range: skos:Concept
Definition: More information about the format in which a Distribution is released. This is different from the file format as, for example, an XML file (file format) could contain an XML schema (representation technique).
Usage note: adms:representationTechnique details the specific schema, standard, or method used to structure data within a dataset, such as RFC 4180 for CSV, GeoJSON for JSON, OWL for RDF, or XML Schema for XML. This contrasts with dcterms:format, which broadly identifies the file format (e.g., CSV, JSON, RDF, XML) and so gives a general idea of the data's structure and syntax. dcat:mediaType complements these by defining the MIME type, such as 'text/csv' or 'application/json', which software needs for processing and data transmission. adms:representationTechnique therefore serves users who need to know the dataset's internal organization and interpretation beyond what dcterms:format and dcat:mediaType indicate.

Property: status

Property: status
Requirement level: Optional
Cardinality: 0..1
URI: adms:status
Range: skos:Concept
Usage note: This property refers to the maturity of the Distribution. It MUST take one of the values Completed, Deprecated, Under Development, or Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.

Property: character encoding

Property: character encoding
Requirement level: Optional
Cardinality: 0..n
URI: cnt:characterEncoding
Range: rdfs:Literal
Usage note: This property SHOULD be used to specify the character encoding of the Distribution, by using as value the character set names in the IANA register [[IANA-CHARSETS]].

Property: compression format

Property: compression format
Requirement level: Optional
Cardinality: 0..1
URI: dcat:compressFormat
Range: dcterms:MediaType
Usage note: This property refers to the format of the file in which the data is contained in a compressed form, e.g., to reduce the size of the downloadable file. It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA [[IANA-MEDIA-TYPES]].

Property: spatial resolution in meters

Property: spatial resolution in meters
Requirement level: Optional
Cardinality: 0..n
URI: dcat:spatialResolutionInMeters
Range: xsd:decimal
Usage note:

This property refers to the minimum spatial separation resolvable in a Distribution, measured in meters.

Property: quality measurement

Property: quality measurement
Requirement level: Optional
Cardinality: 0..n
URI: dqv:hasQualityMeasurement
Range: dqv:QualityMeasurement
Definition: Refers to the performed quality measurements on a distribution. It represents the evaluation of a given distribution against a specific quality metric.
Usage note: Use for quality measurements other than dcat:spatialResolutionInMeters or dcat:temporalResolution. Examples of quality measurements include completeness, accuracy, timeliness, and granularity.

Property: access rights

Property: access rights
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accessRights
Range: dcterms:RightsStatement
Usage note:

This property MAY include information regarding access or restrictions based on privacy, security, or other policies.

Property: access service

Property: access service
Requirement level: Optional
Cardinality: 0..n
URI: dcat:accessService
Range: dcat:DataService
Usage note: This property refers to a data service that gives access to the distribution of the Dataset.

Property: byte size

Property: byte size
Requirement level: Optional
Cardinality: 0..1
URI: dcat:byteSize
Range: xsd:nonNegativeInteger
Definition: The size of a distribution in bytes.
Usage note: The size in bytes can be approximated (as a non-negative integer) when the precise size is not known.

Property: checksum

Property: checksum
Requirement level: Optional
Cardinality: 0..1
URI: spdx:checksum
Range: spdx:Checksum
Usage note:

This property provides a mechanism that can be used to verify that the contents of a distribution have not changed.
The checksum is related to the downloadURL.
Property added in [[VOCAB-DCAT-3]]: spdx:checksum
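
As a rough illustration of how the byte size and checksum of a downloadable file might be derived together, the sketch below hashes the file content with SHA-256. The function names `describe_payload` and `is_unchanged` are hypothetical, the dictionary layout is an assumption, and the use of `spdx:algorithm`, `spdx:checksumValue` and `spdx:checksumAlgorithm_sha256` assumes the profile reuses the SPDX terms as they appear in DCAT 3. SHA-256 is one possible choice of algorithm, not one mandated by the text.

```python
import hashlib


def describe_payload(data: bytes) -> dict:
    """Return byte size and a SHA-256 checksum description for the given content."""
    return {
        "dcat:byteSize": len(data),
        "spdx:checksum": {
            # Assumption: SPDX terms reused as in DCAT 3.
            "spdx:algorithm": "spdx:checksumAlgorithm_sha256",
            "spdx:checksumValue": hashlib.sha256(data).hexdigest(),
        },
    }


def is_unchanged(data: bytes, recorded: dict) -> bool:
    """True when size and checksum recomputed from data match the recorded values."""
    return describe_payload(data) == recorded
```

A consumer that has downloaded the file behind dcat:downloadURL can pass its bytes and the published values to `is_unchanged` to confirm that the content has not changed since the metadata was written.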

Property: coverage

Property: coverage
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:coverage
Range: dcterms:LocationPeriodOrJurisdiction
Usage note:

If a dataset contains distributions that differ regarding their content beyond just differences in format or resolution this property can be used to specify temporal or spatial coverage of the data that the distribution contains.

Property: documentation

Property: documentation
Requirement level: Optional
Cardinality: 0..n
URI: foaf:page
Range: foaf:Document
Usage note: This property refers to a page or document about this Distribution.

Property: download URL

Property: download URL
Requirement level: Optional
Cardinality: 0..1
URI: dcat:downloadURL
Range: rdfs:Resource
Usage note: This must be the direct download URL. Other means of accessing the dataset should be expressed using accessURL. This should always be accompanied by mediaType.
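
The two usage notes on access URL and download URL imply checks that a publisher could run before submitting metadata. The following is a minimal sketch: the function name `distribution_warnings` and the flat dictionary layout are assumptions, and comparing the two URLs for equality is only a rough proxy for the rule that the access URL should not be a direct download URL.

```python
def distribution_warnings(dist: dict) -> list:
    """Return advisory messages for a distribution described as a flat dict."""
    warnings = []
    download = dist.get("dcat:downloadURL")
    if download and not dist.get("dcat:mediaType"):
        # A download URL should always be accompanied by a media type.
        warnings.append("dcat:downloadURL is present without dcat:mediaType")
    if download and download == dist.get("dcat:accessURL"):
        # The access URL is meant for indirect access, not the file itself.
        warnings.append("dcat:accessURL should not be a direct download URL")
    return warnings
```

A distribution that carries only an access URL produces no warnings, since direct download is optional.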

Property: identifier

Property: identifier
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:identifier
Range: rdfs:Literal
Usage note: An identifier for the distribution that identifies it as a resource, mainly for the organization publishing the data.

Property: image

- Requirement level: Optional
- Cardinality: 0..3
- URI: schema:image
- Range: schema:ImageObject
- Usage note: This property associates thumbnail images that visually represent the Distribution's content, which is especially useful for visual content such as photographs, videos, and maps. Thumbnails should illustrate or summarize the content. Typically only URLs pointing directly to downloadable images are allowed; for a more detailed representation, additional fields from schema:ImageObject, such as schema:caption, can be used to provide further context or descriptions.

Property: language

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:language
- Range: rdfs:Literal
- Definition: A language of the resource. This refers to the natural language used for textual metadata (i.e., titles, descriptions, etc.) or for the textual values of a dataset distribution.
- Usage note: Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
- Usage note: For datasets available in separate languages, create a dcat:Distribution instance for each language version. Assign a unique dcterms:language value to each distribution to specify its language. Distributions with multiple languages should list several dcterms:language values.

Property: conforms to

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:conformsTo
- Range: dcterms:Standard (A basis for comparison; a reference point against which other things can be evaluated.)
- Definition: An established standard to which the distribution conforms.
- Usage note: This is used to identify a standardized specification the distribution conforms to. It's recommended that this be a URI that serves as a unique identifier for the standard. The URI may or may not also be a URL that provides documentation of the specification. This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a dataset conforms to. This is (generally) a complementary concern to the media-type or format.

Property: media type

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcat:mediaType
- Range: dcterms:MediaType
- Definition: This property refers to the media type of the Distribution as defined in the official register of media types managed by IANA [[IANA-MEDIA-TYPES]].
- Usage note: The mediaType property specifies the media type (MIME type) of the distribution. It should be used when the distribution's format corresponds to a standard media type registered with IANA [[IANA-MEDIA-TYPES]]. This property provides a precise technical descriptor of the data format (e.g., application/json, text/csv).
- Usage note: The JSON-LD encoding allows a media type to be given without the full URL (e.g. text/csv). The JSON-LD context processor automatically expands it to the full URI in RDF using the base URI https://www.iana.org/assignments/media-types/. This preserves backward compatibility with DCAT-US 1.1.
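
The effect of that expansion can be illustrated with a few lines of Python. This is only a sketch of the resulting mapping, not a JSON-LD processor, and the function name `expand_media_type` is hypothetical; the base URI is the one quoted in the usage note.

```python
IANA_MEDIA_TYPE_BASE = "https://www.iana.org/assignments/media-types/"


def expand_media_type(value: str) -> str:
    """Expand a bare media type such as "text/csv" to its full IANA URI."""
    if value.startswith(("http://", "https://")):
        return value  # already a full URI; leave it untouched
    return IANA_MEDIA_TYPE_BASE + value


print(expand_media_type("text/csv"))
# prints https://www.iana.org/assignments/media-types/text/csv
```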

Property: packaging format

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcat:packageFormat
- Range: dcterms:MediaType
- Usage note: This property refers to the format of the file in which one or more data files are grouped together, e.g. to enable a set of related files to be downloaded together. It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA.

Property: release date

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcterms:issued
- Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
- Usage note: This property contains the date of formal issuance (e.g., publication) of the Distribution, that is, the date on which the distribution was first issued.

Property: temporal resolution

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcat:temporalResolution
- Range: xsd:duration
- Usage note: This property refers to the minimum time period resolvable in the Dataset distribution.

Example
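
As an illustration only, a distribution using several of the properties above can be written as a JSON-LD-style Python dictionary and checked against the maximum cardinalities from the property tables. The keys are the compact URIs listed in this section; the official DCAT-US JSON-LD context may define different key names, and the URLs, values, and the helper `cardinality_violations` are hypothetical.

```python
# Upper bounds taken from the property tables above (0..1 -> 1, 0..3 -> 3).
MAX_CARDINALITY = {
    "dcat:byteSize": 1,
    "spdx:checksum": 1,
    "dcat:downloadURL": 1,
    "dcterms:identifier": 1,
    "schema:image": 3,
    "dcat:mediaType": 1,
    "dcat:packageFormat": 1,
    "dcterms:issued": 1,
    "dcat:temporalResolution": 1,
}


def cardinality_violations(node: dict) -> list:
    """Return the properties of `node` that carry more values than allowed."""
    violations = []
    for prop, limit in MAX_CARDINALITY.items():
        value = node.get(prop)
        if value is None:
            continue
        count = len(value) if isinstance(value, list) else 1
        if count > limit:
            violations.append(prop)
    return violations


distribution = {
    "@type": "dcat:Distribution",
    "dcat:downloadURL": {"@id": "https://example.gov/data/sample.csv"},
    "dcat:mediaType": "text/csv",
    "dcat:byteSize": 10240,
    "dcterms:issued": "2024-01-15",
    "dcat:temporalResolution": "P1D",  # xsd:duration: one day
    "dcterms:language": ["en", "es"],  # 0..n, so a list is fine
}
```

Properties with cardinality 0..n, such as dcterms:language, are deliberately absent from the table of limits because any number of values is allowed.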

Document

- RDF Class: foaf:Document
- Obligation: Optional
- Definition: A publication, such as a scientific paper, a technical report, a book, or a book chapter, but also a blog post.
- Usage note: Depending on whether or not a catalog supports publications as first-class citizens, a publication can be fully described or simply denoted by its URI.
- Rationale: The introduction of foaf:Document significantly improves the representation of documents within the DCAT-US profile. It ensures that metadata about documents, such as title, format, language, and access options, are clearly defined and standardized. This alignment with global data standards fosters interoperability and eases document integration into various data ecosystems, benefiting both publishers and consumers.
- Reference: § Class: foaf:Document [FOAF]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| title | dcterms:title | rdfs:Literal | M | 1..n |
| individual author | dcterms:creator | foaf:Person | R | 0..n |
| corporate author | dcterms:creator | org:Organization | R | 0..n |
| author(s) as literal | dc:creator | rdfs:Literal | R | 0..n |
| publisher organization | dcterms:publisher | org:Organization | R | 0..1 |
| publisher(s) as literal | dc:publisher | rdfs:Literal | R | 0..n |
| identifier | dcterms:identifier | rdfs:Literal | R | 0..1 |
| publication date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |
| bibliographic citation | dcterms:bibliographicCitation | rdfs:Literal | R | 0..1 |
| document type | dcterms:type | skos:Concept | R | 0..1 |
| abstract | dcterms:abstract | rdfs:Literal | O | 0..n |
| description | dcterms:description | rdfs:Literal | O | 0..n |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n |
| media type | dcterms:mediaType | dcterms:MediaType | O | 0..n |

Mandatory Properties

Property: title

- Requirement level: Mandatory
- Cardinality: 1..n
- URI: dcterms:title
- Range: rdfs:Literal

Recommended Properties

Property: bibliographic citation

- Requirement level: Recommended
- Cardinality: 0..1
- URI: dcterms:bibliographicCitation
- Range: rdfs:Literal

Property: description

- Requirement level: Recommended
- Cardinality: 0..n
- URI: dcterms:description
- Range: rdfs:Literal
- Definition: Description of the document.

Property: author(s) as literal

- Requirement level: Recommended
- Cardinality: 0..n
- URI: dc:creator
- Range: rdfs:Literal
- Usage note: Use this property to represent creators as literal strings. This is useful when the creator is not described as a structured object.

Property: identifier

- Requirement level: Recommended
- Cardinality: 0..1
- URI: dcterms:identifier
- Range: rdfs:Literal

Property: publication date

- Requirement level: Recommended
- Cardinality: 0..n
- URI: dcterms:issued
- Range: rdfs:Literal

Property: publisher organization

- Requirement level: Recommended
- Cardinality: 0..n
- URI: dcterms:publisher
- Range: org:Organization

Property: publisher(s) as literal

- Requirement level: Recommended
- Cardinality: 0..n
- URI: dc:publisher
- Range: rdfs:Literal
- Usage note: Use this property to represent a publisher as a literal string rather than as structured data.

Property: document type

- Requirement level: Recommended
- Cardinality: 0..1
- URI: dcterms:type
- Range: skos:Concept

Optional Properties

Property: abstract

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:abstract
- Range: rdfs:Literal

Property: individual author

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:creator
- Range: foaf:Person

Property: corporate author

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:creator
- Range: org:Organization

Property: conforms to

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:conformsTo
- Range: dcterms:Standard
- Usage note: An implementing rule or other specification.

Property: media type

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:mediaType
- Range: dcterms:MediaType
- Usage note: The media type of the document, e.g. the format of a computer file.

Example

Geographic Bounding Box

GeographicBoundingBox describes the spatial extent of the domain of application of a resource and is standardized in the WGS 84 Lat/Long coordinate system.

- RDF Class: dcat-us:GeographicBoundingBox
- Definition: GeographicBoundingBox describes the spatial extent of the domain of application of a resource and is standardized in the WGS 84 Lat/Long coordinate system.
- Usage note: Strongly recommended for geospatial data.
- Rationale: There is no consensus or common vocabulary in the community for describing a spatial bounding box. GML Envelope was proposed, but it is too cumbersome to process. We introduce four separate fields, one for each bound (west, east, north and south), which removes any ambiguity and makes the bounding box easy to index and query.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| west bounding longitude | dcat-us:westBoundingLongitude | xsd:decimal | M | 1 |
| east bounding longitude | dcat-us:eastBoundingLongitude | xsd:decimal | M | 1 |
| south bounding latitude | dcat-us:southBoundingLatitude | xsd:decimal | M | 1 |
| north bounding latitude | dcat-us:northBoundingLatitude | xsd:decimal | M | 1 |

Mandatory Properties

Property: west bounding longitude

- Requirement level: Mandatory
- Cardinality: 1
- URI: dcat-us:westBoundingLongitude
- Range: xsd:decimal
- Definition: West bound longitude in decimal degrees

Property: east bounding longitude

- Requirement level: Mandatory
- Cardinality: 1
- URI: dcat-us:eastBoundingLongitude
- Range: xsd:decimal
- Definition: East bound longitude in decimal degrees

Property: south bounding latitude

- Requirement level: Mandatory
- Cardinality: 1
- URI: dcat-us:southBoundingLatitude
- Range: xsd:decimal
- Definition: South bound latitude in decimal degrees

Property: north bounding latitude

- Requirement level: Mandatory
- Cardinality: 1
- URI: dcat-us:northBoundingLatitude
- Range: xsd:decimal
- Definition: North bound latitude in decimal degrees

Example
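
A minimal Python sketch of the four-field bounding box follows. The property names come from the tables above; the range checks (longitude within ±180, latitude within ±90, south not above north) are assumptions that follow from WGS 84 decimal degrees rather than rules stated by the profile, and the coordinates are approximate, illustrative values. West greater than east is not flagged, on the assumption that a box may cross the antimeridian.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class GeographicBoundingBox:
    """Four bounds in decimal degrees, one per dcat-us bounding property."""

    west: Decimal
    east: Decimal
    south: Decimal
    north: Decimal

    def problems(self) -> list:
        """Return readable problems; an empty list means the box looks plausible."""
        found = []
        for name, value in (("west", self.west), ("east", self.east)):
            if not Decimal(-180) <= value <= Decimal(180):
                found.append(f"{name} longitude out of range")
        for name, value in (("south", self.south), ("north", self.north)):
            if not Decimal(-90) <= value <= Decimal(90):
                found.append(f"{name} latitude out of range")
        if self.south > self.north:
            found.append("south latitude is greater than north latitude")
        return found

    def to_node(self) -> dict:
        """Serialize to a JSON-LD-style dictionary keyed by compact URIs."""
        return {
            "@type": "dcat-us:GeographicBoundingBox",
            "dcat-us:westBoundingLongitude": str(self.west),
            "dcat-us:eastBoundingLongitude": str(self.east),
            "dcat-us:southBoundingLatitude": str(self.south),
            "dcat-us:northBoundingLatitude": str(self.north),
        }


# Approximate, illustrative box around the contiguous United States.
conus = GeographicBoundingBox(
    west=Decimal("-124.77"),
    east=Decimal("-66.95"),
    south=Decimal("24.52"),
    north=Decimal("49.38"),
)
```

Keeping each bound in its own field means an index or query engine can filter on, say, `north` alone without parsing a geometry string.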

Identifier

- RDF Class: adms:Identifier
- Obligation: Optional
- Definition: This is based on the UN/CEFACT Identifier class.
- Usage note: An identifier in a particular context, consisting of:
  - the content string that is the identifier;
  - an optional identifier for the identifier scheme;
  - an optional identifier for the version of the identifier scheme;
  - an optional identifier for the agency that manages the identifier scheme.
- Reference: § Term name: Identifier [ADMS]
- Rationale: Incorporating adms:Identifier in the DCAT-US profile fosters a culture of data governance and trust by transparently documenting the authority behind each identifier. This enhances data reliability and credibility, boosting confidence for DCAT-US users. Additionally, it enables versatile data access using multiple identifiers, enhancing overall data accessibility and usability for diverse stakeholders.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| notation | skos:notation | xsd:string | R | 0..1 |
| creator | dcterms:creator | dcterms:Agent | O | 0..1 |
| schema agency | adms:schemaAgency | rdfs:Literal | O | 0..1 |
| version | dcat:version | rdfs:Literal | O | 0..1 |
| issued | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 |

Recommended Properties

Property: notation

- Requirement level: Recommended
- Cardinality: 0..1
- URI: skos:notation
- Range: xsd:string

Optional Properties

Property: creator

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcterms:creator
- Range: dcterms:Agent

Property: schema agency

- Requirement level: Optional
- Cardinality: 0..1
- URI: adms:schemaAgency
- Range: rdfs:Literal

Property: version

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcat:version
- Range: rdfs:Literal

Property: issued

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcterms:issued
- Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)

Example
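
The sketch below builds an adms:Identifier node in Python and leaves out any optional property that was not supplied. It is illustrative only: the builder `make_identifier`, the identifier string, and the agency name are hypothetical, and the dictionary keys are simply the compact URIs from the summary table.

```python
def make_identifier(notation=None, creator=None, schema_agency=None,
                    version=None, issued=None) -> dict:
    """Build an adms:Identifier node, omitting properties that were not given."""
    candidates = {
        "skos:notation": notation,           # the content string of the identifier
        "dcterms:creator": creator,
        "adms:schemaAgency": schema_agency,  # agency managing the scheme
        "dcat:version": version,             # version of the identifier scheme
        "dcterms:issued": issued,
    }
    node = {"@type": "adms:Identifier"}
    node.update({key: value for key, value in candidates.items() if value is not None})
    return node


# Hypothetical identifier and agency, used for illustration only.
example_identifier = make_identifier(
    notation="10.0000/example-dataset",
    schema_agency="Example Registration Agency",
    issued="2024",
)
```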

Liability Statement

- RDF Class: dcat-us:LiabilityStatement
- Definition: A formal declaration accompanying a dataset intended to limit the legal exposure of the data provider by disclaiming warranties or guarantees.

Usage note

This statement often includes information on the following aspects:
- Limitation of Responsibility: Clarifying that the publisher or provider is not responsible for any errors in the data, and any consequences resulting from its use.
- No Guarantee of Validity: Indicating that there is no guarantee of the accuracy, reliability, or completeness of the data provided.
- Absence of Endorsement: Stating that inclusion of the data in the catalog does not imply endorsement by the publisher or provider.
- Use at Own Risk: Advising users that they use the data at their own risk and are responsible for ensuring its appropriateness for their intended purposes.
The statement may be provided as a literal text or as a URL pointing to a detailed liability statement.
Utilizing the LiabilityStatement helps in setting clear expectations for consumers of the dataset and limits potential legal exposures of the data provider.

Rationale

Introducing dcat-us:LiabilityStatement in DCAT-US clarifies data provider responsibilities and limitations, reducing legal risks by defining acceptable uses and disclaiming warranties. This ensures transparency and legal compliance in data sharing within the United States.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| liability statement text | rdfs:label | rdfs:Literal | O | 0..n |

Optional Properties

Property: liability statement text

- Requirement level: Optional
- Cardinality: 0..n
- URI: rdfs:label
- Range: rdfs:Literal
- Definition: Full text of the liability statement.
- Usage note: Property rdfs:label MAY only be used to specify the text of the liability statement. This property can be repeated for parallel language versions of the statement text.

Example
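
To illustrate repeating rdfs:label for parallel language versions, the Python sketch below stores each version as a JSON-LD value object (`@value` plus `@language`) and picks one by language. The statement texts and the helper `label_for` are hypothetical, and the fallback to English is an assumption made for the example.

```python
liability_statement = {
    "@type": "dcat-us:LiabilityStatement",
    "rdfs:label": [
        {"@value": "The data are provided as is, without warranty of any kind.",
         "@language": "en"},
        {"@value": "Los datos se proporcionan tal cual, sin garantía de ningún tipo.",
         "@language": "es"},
    ],
}


def label_for(node: dict, language: str, fallback: str = "en"):
    """Pick the rdfs:label in `language`, else in `fallback`, else None."""
    labels = node.get("rdfs:label", [])
    by_language = {label.get("@language"): label["@value"] for label in labels}
    return by_language.get(language, by_language.get(fallback))
```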

LicenseDocument

- RDF Class: dcterms:LicenseDocument
- Obligation: Optional
- Definition: A legal document giving official permission to do something with a resource.
- Usage note: A license document SHOULD be specified only with URIs from an endorsed Data.gov registry. Property spdx:licenseText MAY only be used to specify license information in legacy metadata records that are not compliant with a standard license from an endorsed Data.gov registry.
- Rationale: The introduction of dcterms:LicenseDocument in the DCAT profile enables the customization of license text. This flexibility empowers data publishers to tailor license terms to specific dataset requirements, facilitating clear communication of licensing conditions and promoting responsible data sharing and usage while adhering to established international standards.
- Reference: § Term name: LicenseDocument [DCTERMS]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| license text | spdx:licenseText | rdfs:Literal | O | 0..n |

Optional Properties

Property: license text

- Requirement level: Optional
- Cardinality: 0..n
- URI: spdx:licenseText
- Range: rdfs:Literal
- Definition: Full text of the license.
- Usage note: Property spdx:licenseText MAY only be used to specify license information in legacy metadata records that are not compliant with a standard license from an endorsed registry. This property can be repeated for parallel language versions of the license text.

Example

Location

A spatial region or named place.

- RDF Class: dcterms:Location
- Definition: A spatial region or named place. It can be represented using a controlled vocabulary or with geographic coordinates.
- Usage note:
  - For an extensive geometry (i.e., a set of coordinates denoting the vertices of the relevant geographic area), the property locn:geometry [[LOCN]] SHOULD be used.
  - For a geographic bounding box delimiting a spatial area, the property dcat:bbox SHOULD be used.
  - For the geographic center of a spatial area, or another characteristic point, the property dcat:centroid SHOULD be used.
- Rationale: The introduction of dcterms:Location in DCAT-US 3.0 is driven by the need to restore compatibility with the DCAT standard. DCAT-US 1.1 had deviated from the standard by using strings for location in the dcterms:spatial property, which was incompatible. This addition aligns DCAT-US with recognized geospatial standards (e.g., GeoSPARQL, WKT, GeoJSON, W3C Location) for representing geometries, addresses, and location names, ensuring data compatibility, discoverability, and integration while adhering to international data management practices.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| bounding box | dcat:bbox | rdfs:Literal typed as gsp:wktLiteral (preferred) or gsp:gmlLiteral or gsp:geoJSONLiteral | R | 0..1 |
| centroid | dcat:centroid | rdfs:Literal typed as gsp:wktLiteral or gsp:gmlLiteral | O | 0..1 |
| geographic identifier | dcterms:identifier | rdfs:Literal | O | 0..n |
| geometry | locn:geometry | locn:Geometry typed as gsp:wktLiteral (preferred) or gsp:gmlLiteral or gsp:geoJSONLiteral | O | 0..1 |
| gazetteer | skos:inScheme | skos:ConceptScheme | O | 0..1 |
| geographic name | skos:prefLabel | rdfs:Literal | R | 0..n |
| alternate geographic name | skos:altLabel | rdfs:Literal | O | 0..n |

Recommended Properties

Property: bounding box

- Requirement level: Recommended
- Cardinality: 0..1
- URI: dcat:bbox
- Range: rdfs:Literal typed as geosparql:wktLiteral or geosparql:gmlLiteral
- Definition: The geographic bounding box of a spatial thing
- Usage note:
  - The range of this property (rdfs:Literal) is intentionally generic, with the purpose of allowing different geometry literal encodings. E.g., the geometry could be encoded as a WKT literal (geosparql:wktLiteral).
  - Use the most specific geospatial relationship by preference. E.g., if the spatial description is a bounding box, use dcat:bbox; otherwise use locn:geometry.
  - The WKT encoding supports geospatial positions expressed in coordinate reference systems other than WGS84.

Property: geographic name

- RDF Property: skos:prefLabel
- Requirement level: Recommended
- Cardinality: 0..n
- Range: rdfs:Literal
- Definition: Preferred toponym for the location
- Usage note: This property contains a preferred label of the Location. This property can be repeated for parallel language versions of the label.

Optional Properties

Property: alternate geographic name

- RDF Property: skos:altLabel
- Requirement level: Optional
- Cardinality: 0..n
- Range: rdfs:Literal
- Definition: Alternate toponyms for the location
- Usage note: This property contains an alternate label of the Location. This property can be repeated for parallel language versions of the label.

Property: centroid

- Requirement level: Optional
- Cardinality: 0..1
- URI: dcat:centroid
- Range: rdfs:Literal typed as geosparql:wktLiteral or geosparql:gmlLiteral
- Usage note:
  - The range of this property (rdfs:Literal) is intentionally generic, with the purpose of allowing different geometry literal encodings. E.g., the geometry could be encoded as a WKT literal (geosparql:wktLiteral).
  - Use the most specific geospatial relationship by preference. E.g., if the spatial description is a bounding box, use dcat:bbox; otherwise use locn:geometry.
  - The WKT encoding supports geospatial positions expressed in coordinate reference systems other than WGS84.

Property: geographic identifier

- Requirement level: Optional
- Cardinality: 0..n
- URI: dcterms:identifier
- Range: rdfs:Literal
- Usage note: This property contains the geographic identifier for the Location, e.g., the URI or other unique identifier in the context of the relevant gazetteer.

Property: geometry

- Requirement level: Optional
- Cardinality: 0..1
- URI: locn:geometry
- Range: locn:Geometry
- Definition: Associates a spatial thing [[?SDW-BP]] with a corresponding geometry.
- Usage note: The range of this property (locn:Geometry) allows for any type of geometry specification. E.g., the geometry could be encoded by a literal, as WKT (geosparql:wktLiteral [[GeoSPARQL]]), or represented by a class, as geosparql:Geometry (or any of its subclasses) [[GeoSPARQL]].

Property: gazetteer

- RDF Property: skos:inScheme
- Requirement level: Optional
- Cardinality: 0..1
- Range: skos:ConceptScheme
- Usage note: This property MAY be used to specify the gazetteer to which the Location belongs.

Example
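
As a hedged illustration, the Python sketch below encodes a bounding box as a WKT polygon and attaches it to a dcterms:Location through dcat:bbox. It assumes longitude-latitude axis order (as in the GeoSPARQL default CRS84), no explicit CRS prefix, and a box that does not cross the antimeridian. The place name, coordinates, and the helper `bbox_to_wkt` are hypothetical.

```python
def bbox_to_wkt(west: float, south: float, east: float, north: float) -> str:
    """Return a closed WKT polygon ring for the box, in longitude-latitude order."""
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


location = {
    "@type": "dcterms:Location",
    "skos:prefLabel": "Example County",
    "dcat:bbox": {
        "@type": "geosparql:wktLiteral",
        "@value": bbox_to_wkt(-105.5, 39.5, -104.5, 40.5),
    },
}

print(location["dcat:bbox"]["@value"])
# prints POLYGON((-105.5 39.5, -104.5 39.5, -104.5 40.5, -105.5 40.5, -105.5 39.5))
```

The first and last coordinate pairs are identical because a WKT polygon ring must be closed.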

MediaType

- RDF Class: dcterms:MediaType
- Obligation: Optional
- Definition: A media type, e.g. the format of a computer file.
- Usage note: Data publishers should consider using well-established IANA [[IANA-MEDIA-TYPES]] URLs for media types whenever possible to enhance compatibility and interoperability. However, the ability to create custom media types using labels provides flexibility for unique data requirements. When creating custom media types, it's advisable to provide clear and concise definitions to ensure transparency and understanding for data consumers. Striking a balance between standardized and custom media types optimizes data sharing within the DCAT-US framework.
- Rationale: Incorporating dcterms:MediaType in DCAT-US combines the use of established IANA [[IANA-MEDIA-TYPES]] URLs for standardized media types with the flexibility to create custom types using labels. This dual approach ensures compatibility with recognized media types while allowing adaptability to specific needs, promoting both data interoperability and flexibility in data sharing and dissemination.
- Reference: § Term name: MediaType [DCTERMS]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| label | rdfs:label | xsd:string | R | 0..1 |

Recommended Properties

Property: label

- RDF Property: rdfs:label
- Requirement level: Recommended
- Cardinality: 0..1
- Range: xsd:string
- Usage note: This property contains the denomination of the Media Type.

Example

Metric

- RDF Class: dqv:Metric
- Obligation: Optional
- Definition: Represents a standard to measure a quality dimension. An observation (instance of dqv:QualityMeasurement) assigns a value in a given unit to a Metric.
- Usage note: The concept of a metric is used to define and measure specific aspects or dimensions of data quality within a given context, providing a standardized and quantifiable way to assess the quality of data. It allows for the comparison and evaluation of data quality across different resources and enables the development of consistent quality assessment frameworks and methodologies.
- Rationale: Introducing dqv:Metric in the DCAT-US profile enhances dataset quality assessment and management by aligning with international data quality standards. It allows data publishers to systematically define and communicate dataset quality characteristics, promoting transparency and informed data utilization, fostering trust, and supporting responsible data sharing within the DCAT-US ecosystem.
- Reference: [§ 4.1 Class: Metric](https://www.w3.org/TR/vocab-dqv/#dqv:Metric) [VOCAB-DQV]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| in dimension | dqv:inDimension | dqv:Dimension | M | 1 |
| expected datatype | dqv:expectedDataType | xsd:anySimpleType | M | 1 |
| definition | skos:definition | rdfs:Literal | R | 0..n |

Mandatory Properties

Property: in dimension

- Requirement level: Mandatory
- Cardinality: 1
- URI: dqv:inDimension
- Range: dqv:Dimension
- Definition: Represents the dimension that a quality metric, certificate, or annotation allows a measurement of.

Property: expected datatype

- Requirement level: Mandatory
- Cardinality: 1
- URI: dqv:expectedDataType
- Range: xsd:anySimpleType
- Definition: Represents the expected data type for the metric's observed value (e.g., xsd:boolean, xsd:double, etc.)

Recommended Properties

Property: definition

- Requirement level: Recommended
- Cardinality: 0..n
- URI: skos:definition
- Range: rdfs:Literal
- Definition: Definition of the metric

Example
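
The following Python sketch shows a dqv:Metric node with its two mandatory properties and a small check for missing ones. The metric, its URIs, and the helper `missing_mandatory` are hypothetical and only illustrate the cardinalities listed above.

```python
# Mandatory properties of dqv:Metric according to the summary table above.
MANDATORY = ("dqv:inDimension", "dqv:expectedDataType")


def missing_mandatory(metric: dict) -> list:
    """List the mandatory dqv:Metric properties that are absent from the node."""
    return [prop for prop in MANDATORY if prop not in metric]


completeness_metric = {
    "@id": "https://example.gov/metrics/row-completeness",  # hypothetical URI
    "@type": "dqv:Metric",
    "dqv:inDimension": {"@id": "https://example.gov/dimensions/completeness"},
    "dqv:expectedDataType": {"@id": "xsd:double"},
    "skos:definition": "Share of rows with no missing values.",
}
```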

Organization

- RDF Class: org:Organization
- Definition: Represents a collection of people organized together into a community or other social, commercial or political structure. The group has some common purpose or reason for existence which goes beyond the set of people belonging to it and can act as an Agent. Organizations are often decomposable into hierarchical structures.
- Subclass Of: foaf:Agent
- Usage note: When utilizing the org:Organization class in DCAT-US 3.0, data publishers are encouraged to provide the preferred label (skos:prefLabel) for the organization, along with any relevant alternative labels (skos:altLabel) and abbreviations (skos:notation). This usage is consistent with the W3C Organization Recommendation [[VOCAB-ORG]]. This practice supports comprehensive and flexible organization identification, improving data discoverability and search accuracy. Data publishers should strive to maintain consistency in naming conventions while considering variations and common aliases used to refer to organizations.
- Rationale: Supporting a preferred label, alternative labels, and abbreviations on the org:Organization class in DCAT-US 3.0 improves organization representation. It accommodates variations in organization naming, promotes data interoperability, and improves discoverability within datasets.

Properties Summary

| Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1 |
| --- | --- | --- | --- | --- | --- |
| name | foaf:name | xsd:string | M | 1..1 | No Change |
| preferred label | skos:prefLabel | xsd:string | O | 0..1 | Aligned |
| alternative label | skos:altLabel | xsd:string | O | 0..n | Aligned |
| notation | skos:notation | xsd:string | O | 0..n | Aligned |
| subOrganizationOf | org:subOrganizationOf | org:Organization | O | 0..1 | No Change |

Mandatory Properties

Property: name

- Requirement level: Mandatory
- Cardinality: 1
- URI: foaf:name
- Range: xsd:string
- Definition: The name of the Organization

Optional Properties

Property: preferred label

- Requirement level: Optional
- Cardinality: 0..1
- URI: skos:prefLabel
- Range: xsd:string
- Definition: The legal name or preferred name of the Organization

Property: alternate label

- Requirement level: Optional
- Cardinality: 0..n
- URI: skos:altLabel
- Range: xsd:string
- Definition: Alternative names (trading names, colloquial names) for an organization

Property: notation

- Requirement level: Optional
- Cardinality: 0..n
- URI: skos:notation
- Range: xsd:string
- Definition: Abbreviations or codes from code lists for an organization (e.g. DOI, DOD)

Property: suborganization of

- Requirement level: Optional
- Cardinality: 0..n
- URI: org:subOrganizationOf
- Range: org:Organization
- Definition: Represents hierarchical containment of Organizations or OrganizationalUnits; indicates an Organization which contains this Organization.

Example
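
To show how preferred labels, alternative labels, and notations can help discovery, the Python sketch below matches a search term against every name recorded for an organization. The matching rule (case-insensitive exact match) and the helper `matches` are assumptions made for illustration, as are the specific label values.

```python
def matches(organization: dict, query: str) -> bool:
    """Return True when `query` equals any recorded name, ignoring case."""
    names = [organization.get("foaf:name"), organization.get("skos:prefLabel")]
    names += organization.get("skos:altLabel", [])
    names += organization.get("skos:notation", [])
    wanted = query.casefold()
    return any(name is not None and name.casefold() == wanted for name in names)


interior = {
    "@type": "org:Organization",
    "foaf:name": "Department of the Interior",
    "skos:prefLabel": "U.S. Department of the Interior",
    "skos:altLabel": ["Interior Department"],
    "skos:notation": ["DOI"],
}
```

With only foaf:name recorded, a search for "DOI" would miss this organization; the extra labels make the abbreviation and the colloquial name findable.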

Period of Time

PeriodOfTime represents a period of time with a start date and an end date.

- RDF Class: dcterms:PeriodOfTime
- Definition: PeriodOfTime represents a period of time with a start date and an end date.
- Usage note: The start and end of the interval SHOULD be given by using properties dcat:startDate and dcat:endDate, respectively. The interval can also be open, i.e., it can have just a start or just an end.
- Rationale: The introduction of dcterms:PeriodOfTime in DCAT-US 3.0 harmonizes the profile with international standards and rectifies an inconsistency with DCAT 1. In DCAT-US 1.1, [[ISO8601-1]] was used for interval representation in dcterms:temporal, diverging from DCAT 1's requirement of dcterms:PeriodOfTime. Aligning with DCAT 3 in DCAT-US 3.0 resolves this discrepancy and simplifies parsing and indexing of time intervals, which improves interoperability in handling time-related data.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| start date | dcat:startDate | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |
| end date | dcat:endDate | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |

Recommended Properties

Property: start date

- Requirement level: Recommended
- Cardinality: 0..1
- URI: dcat:startDate
- Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
- Definition: The start date of the period of time

Property: end date

- Requirement level: Recommended
- Cardinality: 0..1
- URI: dcat:endDate
- Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
- Definition: The end date of the period of time

Example
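
Because start and end dates may be typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth, a producer has to pick the datatype that matches each value. The Python sketch below infers it from the lexical form. The patterns are deliberately simplified assumptions (no negative years, no time zone on plain dates, no calendar validation), and the helpers `xsd_type_of` and `typed` are hypothetical.

```python
import re

# Simplified lexical patterns, checked from most to least specific.
XSD_PATTERNS = [
    ("xsd:dateTime",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def xsd_type_of(value: str):
    """Return the matching XSD datatype name, or None if no pattern fits."""
    for datatype, pattern in XSD_PATTERNS:
        if pattern.fullmatch(value):
            return datatype
    return None


def typed(value: str) -> dict:
    """Wrap a date string as a JSON-LD typed value object."""
    datatype = xsd_type_of(value)
    if datatype is None:
        raise ValueError(f"unsupported date form: {value!r}")
    return {"@type": datatype, "@value": value}


# An open-ended interval: dcat:endDate is simply omitted.
period = {
    "@type": "dcterms:PeriodOfTime",
    "dcat:startDate": typed("2020-01"),
}
```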

Person

- RDF Class: foaf:Person
- Definition: This class represents an individual human being or a person. It can be used to provide information about individuals, such as their name, email address, homepage URL, and other personal details.
- Subclass Of: foaf:Agent
- Rationale: The rationale for enhancing the foaf:Person class in DCAT-US 3.0 is to provide a more comprehensive and standardized representation of individuals within datasets. In earlier versions, like DCAT-US 1.1, only a single "name" property was available for describing persons, limiting the richness of personal data representation. By introducing properties like "firstName," "givenName," and "affiliation," DCAT-US 3.0 allows data publishers to provide more detailed information about individuals and their affiliations with organizations.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| name | foaf:name | xsd:string | M | 1..1 |
| given name | foaf:givenName | xsd:string | O | 0..1 |
| first name | foaf:firstName | xsd:string | R | 0..1 |
| member of | org:memberOf | org:Organization | O | 0..n |

Mandatory Properties

Property: name

- Requirement level: Mandatory
- Cardinality: 1
- URI: foaf:name
- Range: xsd:string
- Definition: The full name of the Person

Recommended Properties

Property: given name

- Requirement level: Recommended
- Cardinality: 0..1
- URI: foaf:givenName
- Range: xsd:string
- Definition: The given name of the Person

Property: first name

- Requirement level: Recommended
- Cardinality: 0..1
- URI: foaf:firstName
- Range: xsd:string
- Definition: The first name of the Person

Optional Properties

Property: affiliation

- Requirement level: Optional
- Cardinality: 0..n
- URI: org:memberOf
- Range: org:Organization
- Definition: This property MAY be used to specify the affiliation of the Person to an organization.

Example

Provenance Statement

- RDF Class: dcterms:ProvenanceStatement
- Obligation: Optional
- Definition: Any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation.
- Usage note: The dcterms:ProvenanceStatement in DCAT-US 3.0 offers flexibility in how it can be referenced. It can either be referred to by a URL or included in-line by using a label. This allows data publishers to choose the most suitable method for providing information about significant changes in ownership and custody.
- Rationale: Introducing dcterms:ProvenanceStatement in DCAT-US 3.0 enhances dataset transparency and trustworthiness. It allows data publishers to provide structured information about significant changes in ownership and custody, aligning with international data quality and provenance standards. This supports greater confidence in dataset authenticity and interpretation, promoting responsible data usage within DCAT-US.
- Reference: § Term name: ProvenanceStatement [DCTERMS]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| provenance statement text | rdfs:label | xsd:string | R | 0..n |

Recommended Properties

Property: provenance statement text

- Requirement level: Recommended
- Cardinality: 0..n
- URI: rdfs:label
- Range: rdfs:Literal
- Definition: This property contains the text of the Provenance Statement. This property can be repeated for parallel language versions of the statement.

Role

- RDF Class: dcat:Role
- Obligation: Optional
- Definition: A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships.
- Usage note: Used in a qualified attribution to specify the role of an Agent with respect to an Entity. It is recommended that the values be managed as a controlled vocabulary of agent roles, such as [[?ISO-19115-1]] CI_RoleCode.
- Rationale: Integrating dcat:Role within dcat:Relationship in DCAT-US enriches data networks by providing clear, navigable, and semantically transparent relationships among datasets, thereby enhancing data discoverability, usability, and integration across various applications and use cases by precisely depicting complex data dependencies and hierarchies.

Properties Summary

Mandatory Properties

Property: preferred label

Property: preferred label
Requirement level: Mandatory
Cardinality: 0..n
URI: skos:prefLabel
Range: rdfs:Literal
Definition: Preferred label for the controlled vocabulary term (one per language).

Property: concept scheme

Property: in scheme
Requirement level: Mandatory
Cardinality: 1
URI: skos:inScheme
Range: skos:ConceptScheme
Definition: Concept scheme defining the role.

Recommended Properties

Property: definition

Property: definition
Requirement level: Recommended
Cardinality: 0..n
URI: skos:definition
Range: rdfs:Literal
Definition: Definition of the controlled vocabulary.

Optional Properties

Property: alternate label

Property: alternate label
Requirement level: Optional
Cardinality: 0..n
URI: skos:altLabel
Range: rdfs:Literal
Definition: Alternative labels for a role.

Property: notation

Property: notation
Requirement level: Optional
Cardinality: 0..n
URI: skos:notation
Range: xsd:string
Definition: Abbreviations or codes for the role.

Quality Measurement

RDF Class: dqv:QualityMeasurement
Obligation: Optional
Definition: Represents the evaluation of a given dataset (or dataset distribution) against a specific quality metric.
Usage note: Represents the evaluation of a given resource (as a Data Service, Dataset, or Distribution) against a specific quality metric, such as spatial resolution in scale, angle or metric.
Rationale: The inclusion of dqv:QualityMeasurement in DCAT-US assists end-users in better evaluating the fitness of use of resources. This optional class enhances data quality assessment, aligns with international standards (DQV), and enables more precise evaluation against specific quality metrics, ultimately improving data usability and adherence to recognized quality assessment practices.
Reference: § 4.1 Class: Quality Measurement [VOCAB-DQV]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| is measurement of | dqv:isMeasurementOf | dqv:Metric | M | 1 |
| value | dqv:value | rdfs:Literal | M | 1 |
| unit of measure | sdmx-attribute:unitMeasure | rdfs:Resource | O | 0..1 |

Mandatory Properties

Property: is measurement of

Property: is measurement of
Requirement level: Mandatory
Cardinality: 1
URI: dqv:isMeasurementOf
Range: dqv:Metric
Definition: Indicates the metric being observed.

Property: value

Property: value
Requirement level: Mandatory
Cardinality: 1
URI: dqv:value
Range: rdfs:Literal
Definition: Refers to values computed by metric.

Optional Properties

Property: unit of measure

Property: unit of measure
Requirement level: Optional
Cardinality: 0..1
URI: sdmx-attribute:unitMeasure
Range: rdfs:Resource
Definition: Unit of measure associated with the value.

Example
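The following is a minimal sketch of a quality measurement that uses only the properties listed above. The ex: namespace, the dataset, and the metric are hypothetical; attaching the measurement to the dataset with dqv:hasQualityMeasurement and using a QUDT unit IRI are assumptions based on [VOCAB-DQV] and common practice, not requirements stated in this section.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dqv: <http://www.w3.org/ns/dqv#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

# Hypothetical dataset that carries one quality measurement
ex:dataset-1 a dcat:Dataset ;
    dqv:hasQualityMeasurement ex:measurement-1 .

# The measurement: which metric was observed, the value, and its unit
ex:measurement-1 a dqv:QualityMeasurement ;
    dqv:isMeasurementOf ex:spatialResolutionAsDistance ;   # hypothetical metric
    dqv:value "1000"^^xsd:decimal ;
    sdmx-attribute:unitMeasure <http://qudt.org/vocab/unit/M> .   # metre

ex:spatialResolutionAsDistance a dqv:Metric .
```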

Relationship

RDF Class: dcat:Relationship
Definition: An association class for attaching additional information to a relationship between DCAT Resources.
Usage note: Use to characterize a relationship between datasets, and potentially other resources, where the nature of the relationship is known but is not adequately characterized by the standard [[?DCTERMS]] properties (dcterms:hasPart, dcterms:isPartOf, dcterms:conformsTo, dcterms:isFormatOf, dcterms:hasFormat, dcterms:isVersionOf, dcterms:hasVersion, dcterms:replaces, dcterms:isReplacedBy, dcterms:references, dcterms:isReferencedBy, dcterms:requires, dcterms:isRequiredBy) or [[PROV-O]] properties (prov:wasDerivedFrom, prov:wasInfluencedBy, prov:wasQuotedFrom, prov:wasRevisionOf, prov:hadPrimarySource, prov:alternateOf, prov:specializationOf).
Rationale: The introduction of dcat:Relationship in DCAT-US serves to enhance the representation and description of relationships between datasets and other resources. This class allows for the attachment of additional information to relationships that are not adequately characterized by standard properties, promoting a more comprehensive understanding of dataset connections. By accommodating nuanced relationship types beyond existing standards like [[DCTERMS]] and [[PROV-O]] properties, DCAT-US ensures greater flexibility and precision in documenting dataset relationships, facilitating more informed data discovery and utilization.

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| relation | dcterms:relation | dcat:Resource | M | 1 |
| role | dcat:hadRole | dcat:Role | M | 1 |

Mandatory Properties

Property: relation

Property: relation
Requirement level: Mandatory
Cardinality: 1
URI: dcterms:relation
Range: dcat:Resource
Definition: The resource related to the source resource.

Property: role

Property: role
Requirement level: Mandatory
Cardinality: 1
URI: dcat:hadRole
Range: dcat:Role
Definition: The function of an entity or agent with respect to another entity or resource.

Example
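The following sketch shows one way a qualified relationship between two datasets could be expressed with the two mandatory properties above. All ex: resources are hypothetical, and the use of dcat:qualifiedRelation to attach the relationship to the source dataset is taken from [[VOCAB-DCAT-3]] rather than from this section.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

# The source dataset points to a relationship node instead of using a plain property
ex:dataset-2023 a dcat:Dataset ;
    dcat:qualifiedRelation [
        a dcat:Relationship ;
        dcterms:relation ex:dataset-2022 ;    # the related resource
        dcat:hadRole ex:role-supplement       # the role the related resource plays
    ] .

ex:dataset-2022 a dcat:Dataset .

# Hypothetical role, described with the mandatory Role properties
ex:role-supplement a dcat:Role ;
    skos:prefLabel "supplement"@en ;
    skos:inScheme ex:relationship-roles .

ex:relationship-roles a skos:ConceptScheme .
```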

RightsStatement

RDF Class: dcterms:RightsStatement
Obligation: Optional
Definition: A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights.
Usage note: Information about rights SHOULD be provided on the level of Distribution. Information about rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
Rationale: The introduction of dcterms:RightsStatement in DCAT-US is vital for standardizing the conveyance of intellectual property rights (IPR) and access permissions. This optional class accommodates URL references and custom rights statements via attribution text, promoting transparency and compliance. By encouraging consistent rights information at the Distribution and optional Dataset levels, DCAT-US enhances data sharing while reducing legal conflict risks.
Reference: § Term name: RightsStatement [DCTERMS]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| label | rdfs:label | rdfs:Literal | R | 0..n |
| attribution text | odrs:attributionText | rdfs:Literal | R | 0..n |

Recommended Properties

Property: label

Property: label
Requirement level: Recommended
Cardinality: 0..n
URI: rdfs:label
Range: rdfs:Literal
Definition: This property contains the text of the Rights Statement. This property can be repeated for parallel language versions of the name - see § 4.3. Multilingualism section.

Property: attribution text

Property: attribution text
Requirement level: Recommended
Cardinality: 0..n
URI: odrs:attributionText
Range: rdfs:Literal
Definition:

Example
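A minimal sketch of an in-line rights statement attached to a Distribution, as the usage note recommends. The ex: namespace, the statement text, and the attribution text are hypothetical; linking the statement with dcterms:rights is an assumption based on common DCAT practice.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix odrs: <http://schema.theodi.org/odrs#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

# Rights are stated on the Distribution, not only on the Dataset
ex:distribution-1 a dcat:Distribution ;
    dcterms:rights [
        a dcterms:RightsStatement ;
        rdfs:label "Free to reuse with attribution to the publishing agency."@en ;
        odrs:attributionText "Source: Example Agency"@en
    ] .
```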

Standard

RDF Class: dcterms:Standard
Obligation: Optional
Definition: A standard or other specification to which a Dataset or Distribution conforms.
Usage note: A standard or other specification to which a Catalog, Catalog Record, Data Service, Dataset, or Distribution conforms.
Rationale: The inclusion of dcterms:Standard in DCAT-US accommodates standard references through URLs or custom, detailed descriptions when specific standards are not available, promoting flexibility and completeness in resource metadata.
Reference: § Term name: Standard [DCTERMS]

Properties Summary

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| description | dcterms:description | rdfs:Literal | R | 0..n |
| identifier | dcterms:identifier | xsd:string | R | 0..n |
| issued | dcterms:issued | xsd:date | R | 0..1 |
| title | dcterms:title | rdfs:Literal | R | 0..n |
| type | dcterms:type | skos:Concept | R | 0..n |
| version | dcat:version | xsd:string | R | 0..1 |
| in scheme | skos:inScheme | skos:ConceptScheme | O | 0..1 |
| creation date | dcterms:created | xsd:date | O | 0..1 |
| update/modification date | dcterms:modified | xsd:date | O | 0..1 |

Recommended Properties

Property: title

Property: title
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:title
Range: xsd:string
Definition: This property contains a name given to the Standard. This property can be repeated for parallel language versions of the name.

Property: description

Property: description
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Definition: This property contains a free-text account of the Standard. This property can be repeated for parallel language versions of the description - see Multilingualism.

Property: identifier

Property: identifier
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:identifier
Range: xsd:string
Definition: This property contains the main identifier for the Standard, e.g. the URI or other unique identifier in the context of the Catalog, or of a reference register.

Property: issued

Property: issued
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: This property contains the date of formal issuance (e.g., publication) of the Standard.

Property: type

Property: type
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Definition: This property refers to the type of the Standard. A controlled vocabulary for the values has not been established.

Property: version

Property: version
Requirement level: Optional
Cardinality: 0..1
URI: dcat:version
Range: xsd:string
Definition: This property contains a version number or other version designation of the Standard.

Property: in scheme

Property: in scheme
Requirement level: Recommended
Cardinality: 0..1
URI: skos:inScheme
Range: skos:ConceptScheme
Definition: This property MAY be used to specify the reference register to which the Standard belongs.

Optional Properties

Property: creation date

Property: creation date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:created
Range: xsd:date
Definition: This property contains the date on which the Standard has been first created.

Property: update/modification date

Property: update/modification date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:modified
Range: xsd:date
Definition: This property contains the most recent date on which the Standard was changed or modified.

Examples
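The sketch below describes a standard in-line and references it from a dataset with dcterms:conformsTo. The standard itself, its identifier, its dates, and the ex: namespace are all hypothetical and serve only to show the shape of the description.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

ex:dataset-1 a dcat:Dataset ;
    dcterms:conformsTo ex:example-metadata-standard .

# Hypothetical standard described with the recommended properties
ex:example-metadata-standard a dcterms:Standard ;
    dcterms:title "Example Agency Metadata Standard"@en ;
    dcterms:description "A hypothetical specification used here for illustration."@en ;
    dcterms:identifier "EAMS-2.1" ;
    dcterms:issued "2021-06-01"^^xsd:date ;
    dcat:version "2.1" .
```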

UseRestriction

RDF Class: dcat-us:UseRestriction
Definition: A UseRestriction is a set of rules, guidelines, or legal provisions that dictate how a particular resource, asset, information, or object can be utilized. Use restrictions may encompass limitations on access, distribution, reproduction, modification, or sharing, and they are often put in place to protect privacy, intellectual property rights, security, or compliance with legal or ethical standards.
Usage note: When utilizing the dcat-us:UseRestriction class, data publishers are encouraged to provide comprehensive and precise details regarding the specific use restrictions applied to a resource. This may include information on access limitations, distribution rules, reproduction guidelines, modification constraints, and any other pertinent restrictions. Adherence to NARA guidelines and standards should be a priority when defining use restrictions, ensuring that data resources align with archival and preservation practices. By offering clear and concise use restriction information, data consumers can make informed decisions about the utilization of these resources while complying with NARA's requirements.
Rationale: The introduction of dcat-us:UseRestriction in DCAT-US 3.0, aligned with NARA (National Archives and Records Administration) guidelines, enhances compliance and interoperability with NARA-specific use restriction standards. This enables organizations to accurately convey NARA-specific restrictions on data resources, ensuring adherence to archival and data preservation requirements, and promoting consistent data management practices within the DCAT-US framework.

Properties Summary

Mandatory Properties

Property: restriction status

Property: restriction status
Requirement level: Mandatory
Cardinality: 1
URI: dcat-us:restrictionStatus
Range: skos:Concept
Definition: Indication of whether or not there are use restrictions on the archival materials.

Recommended Properties

Property: specific restriction

Property: specific restriction
Requirement level: Recommended
Cardinality: 0..1
URI: dcat-us:specificRestriction
Range: skos:Concept
Definition: The identification of the type of use restrictions, based on copyright, donor, or statutory provisions, on the archival materials.

Optional Properties

Property: restriction note

Property: restriction note
Requirement level: Optional
Cardinality: 0..1
URI: dcat-us:restrictionNote
Range: rdfs:Literal
Definition: Significant information pertaining to the use or reproduction of the data.

Example
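A rough sketch of a use restriction built from the three properties above. Several things here are assumptions: the dcat-us namespace IRI is a placeholder (substitute the IRI published with DCAT-US 3.0), and the two concepts and their labels are hypothetical stand-ins for values from a controlled vocabulary.

```turtle
# ASSUMPTION: placeholder namespace IRI, not the official dcat-us namespace
@prefix dcat-us: <http://example.org/dcat-us#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

ex:use-restriction-1 a dcat-us:UseRestriction ;
    dcat-us:restrictionStatus ex:restricted-partly ;      # mandatory, exactly one
    dcat-us:specificRestriction ex:copyright ;            # recommended
    dcat-us:restrictionNote "Some images in this collection are protected by copyright."@en .

# Hypothetical concepts standing in for controlled vocabulary terms
ex:restricted-partly a skos:Concept ;
    skos:prefLabel "Restricted - Partly"@en .

ex:copyright a skos:Concept ;
    skos:prefLabel "Copyright"@en .
```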

Usage Guidelines

Dereferenceable identifiers

The FAIR principles, under the Findability and the Accessibility chapters respectively, state that:

F1. (Meta)data are assigned a globally unique and persistent identifier
A1. (Meta)data are retrievable by their identifier using a standardized communications protocol

The ability to unambiguously identify and access resources is foundational to digital data management. Guided by the FAIR principles, this section covers how to generate resolvable URLs, why URI resolution matters, the roles of various identifier resolution services, and the distinctions between alternate identifier properties, so that data is both uniquely identifiable and consistently accessible.

Generating Resolvable URLs

In the context of FAIR data, resources on the web must have unique, persistent, and resolvable identifiers. To support persistence, resource identifiers must comply with the RFC 3986 IETF standard for URIs (and IRIs, which are URIs extended to support Unicode). This means that an identifier must comprise the following components:

a scheme: http or https
an authority: www.example.com
optionally a path: /dataset-name/
a local identifier (such as a database accession number, for example P12133 from UniProt) or a globally unique identifier (such as a UUID or hash code).
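As an illustration, the components listed above can be assembled into a single identifier and used as the subject of a dataset description. This is only a sketch: the authority and path are the placeholder values from the list, and the title is hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# scheme:           https
# authority:        www.example.com
# path:             /dataset-name/
# local identifier: P12133
<https://www.example.com/dataset-name/P12133> a dcat:Dataset ;
    dcterms:identifier "P12133" ;
    dcterms:title "Example dataset"@en .   # hypothetical title
```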

Identifier Resolution

URI resolution is a fundamental process that involves directing requests to the appropriate identified entity. The standard approach typically entails resolving an HTTP GET request through content negotiation, enabling the selection of different representations of the desired resource.

A PURL, or persistent URL, serves as a permanent address for accessing web resources. To grasp the concept of PURLs, it's essential to first understand the concept of URL indirection (also known as URL redirect or URL forwarding). This practice involves providing a stable and fixed web address/URL that is configured to point to different content, which might undergo periodic modifications.

When a user accesses a PURL, they are automatically redirected to the current location of the resource. This means that when an author decides to relocate a page, they can easily update the PURL to direct it to the new location.

Indirection is beneficial because it provides a consistent URL for resources that are prone to change, for example because of version updates or ownership changes.

A concrete example of this practice is the use of PURLs to identify OBO Foundry resources. For instance, the URL http://purl.obolibrary.org/obo/stato.owl redirects to the latest release of the file, which can be found at https://raw.githubusercontent.com/ISA-tools/stato/dev/releases/latest_release/stato.owl.

PURLs sharing a common prefix are organized into domains, each managed by a single maintainer. The maintainer has the authority to add new PURLs to the domain and make modifications to existing PURLs within that domain.

According to FAIR Principle A1, (meta)data must be retrievable using their identifier. When the identifier itself is not a resolvable URL, Identifier Resolution Services are required. These services map an Internationalized Resource Identifier (IRI) to a specific location where the corresponding data can be accessed.

Identifier Resolution Services

Identifier Resolution Services support consistent and persistent access to resources by providing unique and persistent identifiers for digital objects and entities. This section describes several prominent services and their functions. It is not an exhaustive list but a selection of popular examples intended to illustrate the diversity of such services.

purl.org
The PURL system is a service of the Internet Archive, which provides an interface to administer PURL domains. For more information about the service, visit https://archive.org/services/purl/help

w3id.org
w3id.org provides persistent identifiers for Linked Data resources. These identifiers can be used in DCAT to uniquely identify datasets and data services, which helps to improve their discoverability and interoperability. To request a redirect, send a message to the public-perma-id@w3.org mailing list. Include the URL that you want on w3id.org, the URL that you want to redirect to, and the HTTP code to use when redirecting. An administrator will then create the redirect for you.

doi.org
DOI.org is a digital identifier system that assigns unique and persistent identifiers to digital objects. These identifiers can be used to cite, share, and track digital objects across different platforms and systems. DOI.org identifiers can be used in DCAT to uniquely identify datasets and data services, which helps to improve their discoverability and interoperability.

orcid.org
ORCID (Open Researcher and Contributor ID) is a global, non-profit organization that provides a unique and persistent identifier for researchers. ORCID IDs are used to link researchers to their professional activities, such as publications, grants, and affiliations. This helps to ensure that researchers are properly credited for their work and that their work is more easily discoverable.

arxiv.org
ArXiv identifiers are globally unique identifiers (GUIDs) assigned to scholarly articles submitted to the arXiv preprint server. These identifiers can be used in DCAT (Data Catalog Vocabulary) to uniquely identify authors and their publications, which helps to improve the discoverability and interoperability of research data.

Identifiers.org
The Identifiers.org Resolution Service provides consistent access to life science data using Compact Identifiers. Compact Identifiers consist of an assigned unique prefix and a local provider designated accession number (prefix:accession). The resolving location of Compact Identifiers is determined using information that is stored in the Identifiers.org Registry.
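The sketch below shows how identifiers from some of these services might appear together in a dataset description. Every specific value is a placeholder: the w3id.org path, the DOI, and the ORCID iD are not real assignments, and the choice of dcterms:creator and dcterms:references to carry them is an assumption for illustration. The Compact Identifier reuses the UniProt accession mentioned earlier, in the prefix:accession form.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# Persistent URL for the dataset itself (placeholder w3id.org path)
<https://w3id.org/example-agency/dataset/water-quality> a dcat:Dataset ;
    dcterms:title "Water quality measurements"@en ;
    # DOI expressed as a resolvable URL (placeholder DOI)
    dcterms:identifier "https://doi.org/10.5555/example-dataset" ;
    # Researcher identified by an ORCID iD (placeholder iD)
    dcterms:creator <https://orcid.org/0000-0000-0000-0000> ;
    # Compact Identifier (prefix:accession) resolved through Identifiers.org
    dcterms:references <https://identifiers.org/uniprot:P12133> .
```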

Alternate identifiers

In data cataloging, identifiers ensure the uniqueness, traceability, and interoperability of resources. Different namespaces and vocabularies offer distinct properties to denote identifiers. This section discusses three such properties, dcterms:identifier, adms:identifier, and skos:notation, and explains how their usage differs.

dcterms:identifier

Originating from the Dublin Core Metadata Terms (DCTERMS), dcterms:identifier is a broad and general property used to denote a unique reference for a resource. It does not impose any constraint on the format or nature of the identifier. In essence, it's a flexible property that can be employed across various domains and for diverse types of resources, be they digital documents, physical artifacts, or abstract concepts.

adms:identifier

The Asset Description Metadata Schema ([[VOCAB-ADMS]]) introduces adms:identifier. Unlike the more generic dcterms:identifier, this property is more structured. It's designed to link a resource to its identifier, which is itself described using further properties. This allows for a richer description of the identifier, such as specifying its type (e.g., ISBN, DOI), its status, and its version. It's particularly useful in contexts where there's a need to provide additional metadata about the identifier itself, beyond just its value.

skos:notation

skos:notation is a property from the Simple Knowledge Organization System ([[SKOS-REFERENCE]]) vocabulary. It's used to provide a symbolic string notation for a concept. While it can function similarly to an identifier, its primary intention is to give a machine-readable, often standardized, symbolic name to a concept, especially when such a notation exists in a legacy or external system. For example, in a controlled vocabulary, each concept might have a notation that denotes its code in a classification scheme.
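To make the distinction concrete, the following sketch uses all three properties side by side. The ex: namespace, the identifier values, and the theme concept are hypothetical; the structured identifier carries only skos:notation here, although further properties describing the identifier could be added to the same node.

```turtle
@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

ex:dataset-1 a dcat:Dataset ;
    # dcterms:identifier: a plain value with no constraint on its format
    dcterms:identifier "dataset-1" ;
    # adms:identifier: links to a node that describes the identifier itself
    adms:identifier [
        a adms:Identifier ;
        skos:notation "10.5555/example-dataset"   # placeholder DOI
    ] ;
    dcat:theme ex:theme-water .

# skos:notation: the code of a concept within a classification scheme
ex:theme-water a skos:Concept ;
    skos:prefLabel "Water"@en ;
    skos:notation "W01" .   # hypothetical code
```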

Multilingualism

From a technical perspective multilingualism SHOULD be handled as follows:

Multilingual literals: Properties of Range rdfs:Literal can be provided in multiple languages by adding so-called language-encoded strings. These add the language as an [[ISO 639-1]] two-letter code after the string, for example "Water quality"@en and "Calidad del agua"@es.
Content negotiation: Properties of Range rdfs:Resource SHOULD be URIs. It is important to use URIs that are language independent, so that the data publisher can use content negotiation when dispatching these URIs.
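Both strategies can be sketched in a single dataset description. The ex: namespace, the titles, the keywords, and the landing page URL are hypothetical; the point is that literals carry a language tag while the landing page is one language-independent URI.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

ex:dataset-1 a dcat:Dataset ;
    # Language-encoded strings: one literal per language
    dcterms:title "Water quality measurements"@en ,
                  "Mediciones de la calidad del agua"@es ;
    dcat:keyword "water"@en , "agua"@es ;
    # Language-independent URI; the server can select the language by content negotiation
    dcat:landingPage <https://www.example.com/dataset-name/water-quality> .
```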

The table lists multilingual properties of DCAT-US and the translation strategies that apply to them:

| Property | RDF property | Range | Multilingual Support |
|---|---|---|---|
| Catalog title | dcterms:title | rdfs:Literal | Language encoded string |
| Catalog description | dcterms:description | rdfs:Literal | Language encoded string |
| Dataset title | dcterms:title | rdfs:Literal | Language encoded string |
| Dataset description | dcterms:description | rdfs:Literal | Language encoded string |
| Dataset keyword | dcat:keyword | rdfs:Literal | Language encoded string |
| Catalog homepage | foaf:homepage | foaf:Document | Content negotiation |
| Dataset landing Page | dcat:landingPage | foaf:Document | Content negotiation |
| Catalog publisher | dcterms:publisher | foaf:Agent | Content negotiation for the URI and language encoded string for the name |
| Dataset publisher | dcterms:publisher | foaf:Agent | Content negotiation for the URI and language encoded string for the name |

Stakeholders

Understanding the entities involved in the creation, curation, and maintenance of datasets is essential in data cataloging and management. This section describes these entities and categorizes them into distinct classes such as "Agent," "Person," and "Organization." Each class provides a structured way to represent stakeholders, from individuals to software agents and organizations, so that data provenance is transparent and traceable. The section covers the properties, roles, and significance of these agent representations within the DCAT-US 3.0 context.

Globally unique, resolvable URLs and Persistent Identifiers (PIDs) strengthen the integrity and usability of data ecosystems, especially when identifying agents. Using a single, stable identifier per agent gives unambiguous identification, avoids the duplication and inconsistency that arise from multiple URIs, and improves data discoverability and accessibility. It also guards against data misinterpretation and supports a coherent, traceable data lineage, which strengthens provenance and trust across platforms and datasets. Using reference registries such as ORCID or the Research Organization Registry (ROR) aligns with global data management standards and facilitates collaborative research and data sharing. More details about identifiers are provided in the Dereferenceable identifiers section.

Agent

The foaf:Agent class in the Friend of a Friend [[FOAF]] ontology serves a dual-purpose role, particularly in the context of data cataloging and management.

Firstly, it acts as an abstract class for both org:Organization and foaf:Person, providing a generalized representation that encompasses various entities involved in dataset production and management. This abstraction facilitates the encapsulation of common properties and behaviors, enabling a unified approach to handling different entity types in data documentation and interoperability.

Secondly, foaf:Agent is utilized as a class to represent autonomous software agents, which are self-operating software entities capable of performing tasks and making decisions without direct human intervention.

This dual functionality of foaf:Agent not only streamlines the representation of human and non-human actors in data management processes but also provides a flexible and semantically rich framework to describe and interlink various entities within the DCAT-US schema, thereby enhancing data discoverability and usability.

Person

A person agent represents an individual involved in producing or managing datasets. It provides information about the person and their associated contact details.

In DCAT-US 3.0, the foaf:Person class is used to represent individuals who are associated with or responsible for the datasets or resources described within the DCAT profile.

Let's break down the specific properties associated with foaf:Person and their significance within this context:

foaf:name: This property represents the full name of a person. It can be used to specify the full name of individuals associated with datasets. For example, if a person's full name is "John Smith," you can use this property to provide their complete name. This is the only mandatory property for describing a person and is typically used for display.
foaf:firstName: This optional property represents the first name of a person. It can be used to provide structured information about the first name of individuals associated with or responsible for resources in the DCAT-US profile.
foaf:givenName: This optional property represents the given name of a person. It can be used to provide structured information about an individual's name.
org:memberOf: While not a FOAF property, this property is typically used to indicate the organization or group with which a person is affiliated. In the context of a DCAT-US profile, it can be used to specify the organization or entity with which an individual is affiliated in relation to the described resources. For instance, if a person is a member of the Department of the Interior, you can use this property to link them to that organization, identified by http://www.doi.gov.
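The properties above could be combined as in the following sketch. The person and the ex: namespace are hypothetical; the organization URI is the one mentioned in the text.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix ex: <http://example.org/> .   # hypothetical namespace

# Hypothetical person; only foaf:name is mandatory
ex:john-smith a foaf:Person ;
    foaf:name "John Smith" ;
    foaf:firstName "John" ;
    foaf:givenName "John" ;
    org:memberOf <http://www.doi.gov> .

<http://www.doi.gov> a org:Organization ;
    foaf:name "Department of the Interior"@en .
```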

Organization

The Organization agent represents an organization or institution that is instrumental in the production or management of a resource. It carries information about the organization accountable for the resource, along with its contact details, and so acts as an Agent. It can also be decomposed hierarchically into sub-organizations, offering a structured view of the organizational layers.

When using org:Organization within DCAT-US 3.0, follow these guidelines:

Use Recognized URL Identifiers: It is strongly recommended to utilize well-known URL identifiers for organizations that are centrally managed by a government registry. This practice ensures the unambiguous identification of organizations and fosters consistency and reliability in organizational referencing across cataloged resources.
Ensure Consistency: Employ foaf:name to furnish a consistent and recognizable name for the organization, thereby maintaining a uniform identity across various platforms.
Enhance Discoverability: Leverage skos:prefLabel to designate the preferred label, ensuring that the organization is effortlessly discoverable and identifiable across diverse search scenarios.
Accommodate Variations: Utilize skos:altLabel to incorporate alternative names, acronyms, or aliases, thereby enhancing searchability and augmenting user-friendliness by accommodating various naming conventions.
Provide Abbreviations: Employ skos:notation to document any abbreviations or short forms that are commonly associated with the organization, facilitating users in recognizing and associating the organization with its widely-used abbreviations.
Represent Hierarchy: Optionally, utilize org:subOrganizationOf to depict hierarchical relationships, providing a structured and layered view of the organization and its sub-entities.
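The guidelines above can be sketched as follows. Both organizations, their names, their abbreviations, and the ex: namespace are hypothetical; in practice the URIs would come from a centrally managed government registry, as the first guideline recommends.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <http://example.org/org/> .   # hypothetical namespace

# Hypothetical parent organization
ex:example-department a org:Organization ;
    foaf:name "Example Department"@en ;
    skos:prefLabel "Example Department"@en ;
    skos:altLabel "Department of Examples"@en ;
    skos:notation "EXD" .

# Hypothetical sub-organization linked to its parent
ex:example-bureau a org:Organization ;
    foaf:name "Example Bureau"@en ;
    skos:prefLabel "Example Bureau"@en ;
    skos:notation "EXB" ;
    org:subOrganizationOf ex:example-department .
```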

Contact Point

The Contact Point serves as a crucial element in data cataloging, providing a reference for users to seek additional information, clarifications, or support regarding a resource published in a catalog. In the DCAT-US profile, contact point information is encoded using the widely used [[VCARD-RDF]] vocabulary, ensuring standardized representation and interoperability of contact details across various platforms and applications.

A contact point may refer to an individual, a team, or an organization responsible for the resource (dcat:Dataset, dcat:DataService, dcat:DatasetSeries, dcat:Catalog) and is typically characterized by properties such as name, email, and telephone number. The inclusion of address details, role or title, and associated organizational details further enriches the contact information, providing users with multiple avenues to facilitate communication.

It is imperative to ensure that the contact point information is accurate, up-to-date, and reliable to foster trust and facilitate efficient communication between data providers and consumers. The following sub-sections provide detailed guidance on encoding contact point information, defining associated address details, and linking the contact point to the resources in the DCAT-US profile.

Encoding Contact Information

The contact information is encoded using the vcard:Kind class. If the contact information is reused in many resources, it is recommended to identify it with a URI to avoid duplicate entries. The vcard:fn (formatted name) and vcard:email (email address) properties are mandatory to ensure basic contactability. Additional properties like vcard:tel (telephone number) and vcard:title (role or title) can be used to provide more details about the contact point. If the contact is a person, the properties vcard:given-name and vcard:family-name can also be used.

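              @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
              @prefix : <http://example.org/> .  # placeholder namespace for illustration

              # Contact point with a URI, so that it can be reused across resources
              :vcard123 a vcard:Kind ;
                  vcard:fn "John Doe" ;
                  vcard:email <mailto:john.doe@example.com> ;
                  vcard:tel <tel:+123456789> ;
                  vcard:family-name "Doe" ;
                  vcard:given-name "John" ;
                  vcard:title "Data Manager" ;
                  vcard:hasAddress :address456 ;
              .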

Defining Address Details

Address details, when applicable, are encoded using the vcard:Address class and linked to the contact point using the vcard:hasAddress property. The address does not need a URI if it is not reused anywhere else in the catalog. The address class can include properties like vcard:street-address, vcard:locality, vcard:region, vcard:postal-code, and vcard:country-name to provide detailed location information about the contact point.

The following example illustrates how to define and encode address details, ensuring clarity and usability for data consumers.

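            @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
            @prefix : <http://example.org/> .  # placeholder namespace for illustration

            # Address referenced from the contact point through vcard:hasAddress
            :address456 a vcard:Address ;
                vcard:street-address "123 Main Street" ;
                vcard:locality "Anytown" ;
                vcard:region "CA" ;
                vcard:postal-code "12345" ;
                vcard:country-name "USA" ;
            .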

Linking Contact Point to Resource

The contact point is associated with the dataset using the dcat:contactPoint property. This linkage ensures that users can easily identify and communicate with the responsible entity for additional information, support, or inquiries regarding the dataset.

The following example illustrates how to link the defined contact point to the dataset, ensuring clarity and facilitating user navigation and communication.

            @prefix dcat: <http://www.w3.org/ns/dcat#> .
            @prefix dcterms: <http://purl.org/dc/terms/> .

            :MyDataset a dcat:Dataset ;
                dcterms:title "My Example Dataset" ;
                dcterms:description "This dataset includes example data for demonstration purposes." ;
                dcat:contactPoint :vcard123 ;
            .

Resource Attributions

Attribution in data catalogs pertains to the systematic association of a resource (such as a dataset or service) with a responsible entity, termed an "agent". Agents, which can be individuals, organizations, or services, may contribute to, create, publish, or interact significantly with the data. The roles of agents, such as contributor, creator, publisher, funder, distributor, custodian, or editor, are crucial in understanding the lineage and responsibility of data management.

Attributions hold paramount importance in data catalog searches for several pivotal reasons:

Provenance and Trustworthiness: Understanding the entities (agents) that have created or interacted with the data can significantly inform assessments of its quality and trustworthiness. Data originating from or managed by reputable and trusted organizations or individuals may be deemed more reliable and credible.
Credit and Accountability: Proper attributions ensure that all contributing individuals or organizations are aptly acknowledged for their work or data. This practice not only adheres to ethical guidelines and potentially legal requirements but also fosters a culture of recognition and accountability in data management and sharing.
Search and Discovery: Attributions serve as a valuable criterion in data search and discovery processes. Users may seek datasets created, managed, or contributed to by specific researchers, organizations, or other agents, thereby making attributions a vital component in filtering and locating data resources.
Collaboration and Networking: Identifying and acknowledging the agents associated with datasets can pave the way for new collaborative opportunities. It enables users and researchers to identify and connect with individuals or organizations possessing relevant expertise or shared research interests.
Issue Resolution: When users encounter issues or have queries about a dataset, attributions provide a clear pathway to seek clarifications, report issues, or obtain additional information. This ensures that data reliability and integrity are maintained through active resolution of issues and continuous improvement.

Standard Attributions and Roles

Employing standard properties such as dcterms:creator, dcterms:contributor, dcterms:rightsHolder, and dcterms:publisher, along with the generic prov:wasAttributedTo from [[!PROV-O]], facilitates the basic associations of responsible agents with a cataloged resource, ensuring clarity and standardization in data attribution.

Extended Attributions and Diverse Roles

Numerous other roles are significant in relation to cataloged resources, such as funder, distributor, custodian, and editor. Some of these roles are enumerated in the CI_RoleCode values from [[?ISO-19115-1]], in the [[?DataCite]] metadata schema, and in the MARC relators.

Utilizing a generalized method for assigning an agent to a resource with a specified role is facilitated by prov:qualifiedAttribution from [[PROV-O]]. This method is particularly useful when the nature of the relationship is known but does not correspond with one of the standard attribution property roles.

The range of prov:qualifiedAttribution is prov:Attribution. The relevant agent is specified via the property prov:agent, whereas the role is specified with the property dcat:hadRole, which takes as its value a skos:Concept describing that role, such as those included in the relevant code list operated by a US Government-controlled Registry.

The prov:qualifiedAttribution property is utilized to provide more detailed and structured information about the attribution of a resource, allowing for the specification of additional attributes, such as the role or position of the attributed entity, the date of attribution, or other relevant details.
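The structure described above can be sketched in Python as a JSON-LD document built from plain dictionaries. This is only an illustrative sketch: the helper name qualified_attribution and all example.org IRIs are hypothetical placeholders, and an inline @context is used here so that no term names from the DCAT-US context are assumed. In practice the role IRI would come from the relevant code list.

```python
import json


def qualified_attribution(agent_iri: str, role_iri: str) -> dict:
    """Build a prov:Attribution node linking an agent to a role."""
    return {
        "@type": "prov:Attribution",
        "prov:agent": {"@id": agent_iri},
        "dcat:hadRole": {"@id": role_iri},
    }


dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@id": "https://example.org/dataset/1",  # hypothetical dataset IRI
    "@type": "dcat:Dataset",
    "prov:qualifiedAttribution": [
        qualified_attribution(
            "https://example.org/agency/42",   # hypothetical agent IRI
            "https://example.org/roles/funder",  # hypothetical role concept
        ),
    ],
}

print(json.dumps(dataset, indent=2))
```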

provides an illustration of the usage of attribution properties:

Resource Classification

Controlled vocabularies, including taxonomies and thesauri, dramatically enhance data searchability. Utilizing these vocabularies allows datasets to be systematically classified, tagged, and described with standardized terms, aiding users in retrieving relevant datasets, even when using varied terms or synonyms.

Employing controlled vocabularies enables semantic search, which comprehends the context and relationships behind search terms. This approach enhances search results, for example, linking "automobiles" with related terms like "cars" or "vehicles".

This enriched search experience is crucial for navigating vast, diverse datasets, ensuring comprehensive and relevant results, and bridging the gap between user intent and dataset content.

The DCAT-US profile utilizes properties from the DCAT 3 framework for resource classification, providing flexibility in the choice of controlled vocabularies to meet the specific needs of various communities or agencies.

dcterms:type: This property specifies the category or genre of the content in a resource. It's applicable to dcat:Dataset, dcat:DataService, and dcat:DatasetSeries. For dcat:DataService, types might include "Web Map Service" (WMS) for services providing geographical data in a map format, "Web Feature Service" (WFS) for services allowing users to access geospatial features, or "RESTful API" for services using REST API protocols. For datasets, types can be "Geospatial Dataset", "Image", "Statistical Dataset", or "Map". The Dublin Core Type Vocabulary is a popular choice for providing standardized descriptors.
dcat:keyword: This property allows for the tagging of datasets with relevant keywords, facilitating easier discovery and categorization. It is suitable for use with dcat:Dataset, dcat:DataService, dcat:DatasetSeries, and dcat:Catalog. Employing keywords from established vocabularies such as AGROVOC (for agricultural terms), Global Change Master Directory (GCMD) [[?GCMD]] (for Earth science), or NAICS (for industry classifications) ensures consistency and enhances the discoverability of datasets within the US context.
dcat:theme: This property provides thematic categorization for resources, specifically for dcat:Dataset and dcat:DatasetSeries. Utilizing a unified thematic taxonomy, such as the Data Theme Taxonomy from Data.gov or the FGDC (Federal Geographic Data Committee) Controlled Vocabularies like the ISO 19115 Topic CodeList, ensures a cohesive approach to categorizing datasets. This thematic classification aids users in navigating and identifying datasets relevant to particular subjects or sectors.
dcterms:subject: Aimed at providing detailed insight into the primary subject matter of a dataset, this property is crucial for dcat:Dataset and dcat:DatasetSeries. Adoption of controlled vocabularies like Global Change Master Directory (GCMD) [[?GCMD]] for Earth science topics, FAO Agrovoc for agricultural subjects, ITIS for taxonomic information, NAICS for industry classifications, or LCSH (Library of Congress Subject Headings) enhances the clarity and searchability of datasets, particularly in the context of US Government data. These vocabularies enable precise and comprehensive subject classification, facilitating more effective data discovery and use.

Spatial Metadata

Spatial metadata play a vital role in the context of geospatial data within the US Government by providing essential information about data quality, facilitating data discovery and interoperability, and ensuring responsible data governance. They describe the characteristics, source, and limitations of geospatial datasets, enabling informed decision-making based on data credibility. Spatial metadata support efficient data discovery, retrieval, and sharing, reducing duplication and promoting collaboration. They also promote interoperability by adhering to standardized metadata schemas and facilitate compliance with legal and regulatory requirements, ensuring accountable data stewardship. Spatial metadata are essential for maximizing the value and effective utilization of geospatial data within the US Government.

The Data Catalog Vocabulary (DCAT) specification provides a standardized way to represent metadata about datasets and services, including information about their spatial properties. In the context of DCAT-US, which is a profile tailored specifically for the United States, several spatial properties are relevant for describing resources. This section provides an overview of these spatial properties and their usage within the DCAT-US framework.

Geographic Bounding Box

A bounding box represents the minimum and maximum coordinates that enclose a specific geographic area. In DCAT-US, the dcat-us:geographicBoundingBox property and the class dcat-us:GeographicBoundingBox are introduced and utilized to define the spatial extent of a resource. This class consists of four numerical properties: the west (dcat-us:westBoundingLongitude) and east longitude (dcat-us:eastBoundingLongitude), followed by the north (dcat-us:northBoundingLatitude) and south latitude (dcat-us:southBoundingLatitude), which are based on the WGS84 coordinate system.

By specifying a bounding box, datasets can be associated with a particular geographic region. If the west bounding longitude is greater than the east bounding longitude, then the box spans the antimeridian.

Figure: Geographic bounding box crossing the antimeridian.
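The antimeridian rule above can be sketched in Python as follows. The class below is a hypothetical in-memory stand-in for dcat-us:GeographicBoundingBox, with shortened field names chosen for this example; the sample coordinates are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeographicBoundingBox:
    """WGS84 bounding box; field names are shortened for this sketch."""
    west: float   # dcat-us:westBoundingLongitude
    east: float   # dcat-us:eastBoundingLongitude
    south: float  # dcat-us:southBoundingLatitude
    north: float  # dcat-us:northBoundingLatitude

    def crosses_antimeridian(self) -> bool:
        # West greater than east means the box wraps around +/-180 degrees.
        return self.west > self.east

    def contains(self, lon: float, lat: float) -> bool:
        if not (self.south <= lat <= self.north):
            return False
        if self.crosses_antimeridian():
            # The box covers [west, 180] and [-180, east].
            return lon >= self.west or lon <= self.east
        return self.west <= lon <= self.east


# Illustrative box that wraps across the antimeridian.
box = GeographicBoundingBox(west=177.0, east=-178.0, south=-20.0, north=-12.0)
print(box.crosses_antimeridian())  # True
print(box.contains(179.5, -17.0))  # True
print(box.contains(0.0, -17.0))    # False
```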

Defining a common reference system is of utmost importance when searching for geospatial data. Geospatial datasets are typically represented using different coordinate systems, projections, and datums, which can lead to challenges in interoperability and data integration. A common reference system ensures that data from diverse sources can be accurately aligned and combined, enabling effective analysis, visualization, and decision-making.

The introduction of the dcat-us:geographicBoundingBox property in DCAT-US profile addresses this challenge by providing a standardized way to express the spatial extent of a resource. Unlike using a Polygon, which requires explicit geometric coordinates, the dcat-us:geographicBoundingBox offers a simpler and more interoperable approach. Here are a few reasons why the dcat-us:geographicBoundingBox is advantageous:

Consistent Spatial Representation: Geospatial datasets can be represented in various coordinate systems, projections, and datums. Without a common reference system, it becomes difficult to align and compare datasets accurately. By establishing a common reference system, data publishers and consumers can ensure consistent spatial representation, enabling seamless integration and analysis of geospatial data from different sources.
Interoperability and Integration: The use of a common reference system enhances interoperability among geospatial datasets and systems. It enables data from diverse sources to be combined and used together seamlessly, facilitating cross-domain analysis and decision-making. With a common reference system, data publishers can provide metadata that adheres to a standard, making it easier for data consumers to understand and utilize the data.
Simplified Search and Discovery: The dcat-us:geographicBoundingBox property simplifies the search and discovery process for geospatial data. Instead of relying on complex geometric representations like polygons, users can specify a bounding box by defining the minimum and maximum values of latitude and longitude. Filtering geospatial data using a bounding box involves numeric comparisons, where the latitude and longitude values of data points are compared to the minimum and maximum values of the bounding box. This approach efficiently eliminates data points outside the specified spatial extent by performing simple numeric operations. It leverages the inherent numerical properties of latitude and longitude values, making it computationally efficient and compatible with spatial indexing and query optimization techniques. By using numeric comparison, geospatial data can be filtered and retrieved faster, optimizing the search process in various geospatial applications. This makes it easier for users to define their area of interest and retrieve relevant datasets that intersect with that spatial extent.
Query Efficiency and Performance: The use of dcat-us:GeographicBoundingBox enables efficient spatial querying of datasets. Data consumers can quickly filter and retrieve resources based on their spatial extent, reducing the need to process unnecessary data. This improves search performance and query efficiency, particularly when dealing with large-scale geospatial data collections.
Compatibility with Existing Tools and Standards: The adoption of the dcat-us:geographicBoundingBox property aligns with [...] metropolitan statistical areas, employing multiple bounding boxes for each area helps retrieve data specific to each metropolitan region, ensuring more accurate and focused results. Furthermore, non-contiguous states like Alaska and Hawaii require separate bounding boxes to accurately capture their unique spatial coverage. The inclusion of multiple bounding boxes in geospatial searches improves the accuracy and relevance of the retrieved datasets, facilitating more effective decision-making and analysis in various applications and domains.
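To make the numeric-comparison filtering and the use of multiple bounding boxes more concrete, the following Python sketch tests whether any of a dataset's boxes intersects a query box. The tuple layout, the function names, and the approximate coordinates for Alaska and Hawaii are assumptions made for illustration; they are not values defined by the profile.

```python
# A box is (west, south, east, north) in WGS84 degrees (layout assumed here).

def split_at_antimeridian(box):
    """Split a box that wraps the antimeridian into two ordinary boxes."""
    west, south, east, north = box
    if west > east:
        return [(west, south, 180.0, north), (-180.0, south, east, north)]
    return [box]


def boxes_intersect(a, b):
    """Return True if two boxes overlap, using only numeric comparisons."""
    for aw, a_s, ae, an in split_at_antimeridian(a):
        for bw, b_s, be, bn in split_at_antimeridian(b):
            if aw <= be and bw <= ae and a_s <= bn and b_s <= an:
                return True
    return False


def dataset_matches(dataset_boxes, query_box):
    """A dataset matches if any of its bounding boxes intersects the query."""
    return any(boxes_intersect(box, query_box) for box in dataset_boxes)


# Rough, illustrative extents only.
alaska = (172.0, 51.0, -130.0, 71.5)   # wraps the antimeridian
hawaii = (-160.5, 18.5, -154.5, 22.5)
dataset_boxes = [alaska, hawaii]

print(dataset_matches(dataset_boxes, (-158.5, 21.0, -157.5, 22.0)))  # True
print(dataset_matches(dataset_boxes, (-102.0, 37.0, -94.5, 40.0)))   # False
```

Keeping one box per non-contiguous area avoids a single oversized box that would also match queries over the ocean or territory in between.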

Spatial Coverage

In DCAT 3, the use of the dcterms:spatial property is intended to provide information about the spatial coverage or location of a resource. This property allows for the description of the spatial aspect of a dataset, dataset distribution, or data service in a standardized manner.

The dcterms:spatial property can be used to represent spatial coverage using various spatial representations, such as coordinates, polygons, or place names. This flexibility allows data publishers to express the spatial extent of their resources in a way that is most appropriate for the given context.

For example, the dcterms:spatial property can be used to indicate the geographic bounding box that represents the extent of a dataset. This can be expressed using minimum and maximum latitude and longitude values, providing a rectangular approximation of the resource’s coverage area. Alternatively, a more precise polygon can be used to describe complex or irregularly shaped spatial extents.

By including the dcterms:spatial property in DCAT 3, datasets can provide explicit information about their spatial coverage. This enables data consumers and applications to understand the geographic scope of a resource and determine its relevance for their specific use cases. It supports efficient searching, discovery, and integration of geospatial datasets across different platforms and systems.

Furthermore, the use of standardized properties like dcterms:spatial enhances interoperability and data exchange among different data catalogs and applications. By conforming to the DCAT 3 specification, data publishers ensure that spatial information is consistently represented and interpreted, facilitating seamless data integration and interoperability within the geospatial community.

Spatial Resolution

Spatial resolution is a characteristic of geospatial datasets that describes the level of detail or granularity in the spatial representation. In DCAT 3, the dcat:spatialResolutionInMeters property is used to specify the spatial resolution of a resource, measured in meters. This property helps users understand the level of detail provided by the dataset and assess its suitability for their specific needs. Applications benefit from this property in various ways. For instance, in remote sensing and satellite imagery, users can determine if the dataset captures the required level of detail for their analysis. In cartography and mapping, spatial resolution influences the clarity and accuracy of displayed features. Environmental modeling relies on appropriate resolution for accurate simulations, and emergency management requires datasets that support informed decision-making. The dcat:spatialResolutionInMeters property supports data integration, ensuring compatibility between datasets with different resolutions. Overall, this property enhances the usability and effectiveness of geospatial datasets across diverse domains.

Handling Map Projections and Coordinate Systems

Geographic datasets in DCAT are commonly referenced using latitude and longitude coordinates based on the WGS84 datum. This is facilitated by the recommended use of the dcat-us:geographicBoundingBox property and the corresponding class dcat-us:GeographicBoundingBox to establish a uniform reference system for searching and indexing. However, the diverse nature of geographic data often necessitates the use of various map projections and coordinate systems.

The dcterms:conformsTo property in DCAT is integral in specifying the Coordinate Reference System (CRS) utilized by a dataset or a distribution. Accurately defining the CRS is essential for understanding the spatial context, enabling precise geographic analysis and ensuring data interoperability.

Additionally, the dcterms:type property is employed alongside dcterms:conformsTo to delineate the type of reference system, be it spatial or temporal. For spatial datasets, dcterms:type typically points to a spatial reference system, as defined by URIs like http://resources.data.gov/categories/SpatialReferenceSystem.

Utilizing URIs to reference EPSG standards ensures a clear and unambiguous specification of the CRS. For example, the URI http://www.opengis.net/def/crs/EPSG/0/4269 explicitly denotes adherence to the NAD 83 CRS. Standardized references like these enhance data consistency and facilitate interoperability across various platforms and applications.

The reference system identifier SHOULD preferably be represented with an HTTP URI. In particular, spatial reference systems should be specified by using the corresponding URIs from the “EPSG coordinate reference systems” register operated by the Open Geospatial Consortium [[?OGC-EPSG]]. This registry is crucial for the precise identification of CRSs, thereby ensuring that spatial data referenced in DCAT are compatible and functional across a multitude of geospatial applications.
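As a small illustration of the URI pattern discussed above, the following Python sketch converts between an EPSG code and its URI. The URI prefix is taken from the NAD 83 example in the text; the function names are hypothetical and the sketch does not check that a code actually exists in the register.

```python
import re
from typing import Optional

# Prefix taken from the example URI above.
EPSG_URI_PREFIX = "http://www.opengis.net/def/crs/EPSG/0/"


def epsg_uri(code: int) -> str:
    """Build the CRS URI for a numeric EPSG code."""
    if code <= 0:
        raise ValueError("EPSG codes are positive integers")
    return f"{EPSG_URI_PREFIX}{code}"


def epsg_code(uri: str) -> Optional[int]:
    """Extract the EPSG code from a CRS URI, or None if it does not match."""
    match = re.fullmatch(re.escape(EPSG_URI_PREFIX) + r"(\d+)", uri)
    return int(match.group(1)) if match else None


print(epsg_uri(4269))  # http://www.opengis.net/def/crs/EPSG/0/4269
print(epsg_code("http://www.opengis.net/def/crs/EPSG/0/4326"))  # 4326
print(epsg_code("http://example.com/not-a-crs"))  # None
```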

Example: Specifying a CRS using an EPSG code for a geographic dataset

Clearly defining the CRS is paramount for the effective use and integration of DCAT datasets, facilitating their application in a broad spectrum of spatial data uses.

Temporal Metadata

Temporal metadata is crucial for understanding and utilizing datasets effectively. This section is divided into three main categories to cover key aspects: Lifecycle Temporal Properties, Temporal Coverage, and Temporal Resolution. Additionally, we provide insights into handling these temporal aspects in JSON-LD format. Accurate temporal metadata ensures datasets are relevant and reliable, especially for time-sensitive analyses.

Supporting multiple formats for temporal metadata, such as xsd:date, xsd:dateTime, xsd:gYear, and xsd:gYearMonth, is essential. These formats let publishers state temporal values at the precision that is actually known, which supports interoperability and accommodates the differing levels of detail that datasets require.

Lifecycle Temporal Properties

Lifecycle temporal properties document the timeline of the dataset's creation, updates, and publication. These properties are crucial for understanding the dataset's history and currency.

Release Time (dcterms:issued): Indicates the date when the dataset was first made available. Formats include:
- xsd:date (e.g., 2023-11-30)
- xsd:dateTime (e.g., 2023-11-30T15:00:00)
- xsd:gYear (e.g., 2023)
- xsd:gYearMonth (e.g., 2023-11)
Revision/Update Time (dcterms:modified): Shows when the dataset was last updated, using the same formats as the release time.
Update Schedule (dcterms:accrualPeriodicity): Describes the frequency of dataset updates. Terms are taken from the [Dublin Core Collection Description Frequency Vocabulary](http://www.dublincore.org/specifications/dublin-core/collection-description/frequency/). Multiple formats allow for precise scheduling, whether regular or irregular. The Frequency Coding Guide section provides a guide to coding various standard frequencies as per ISO 19115, ISO-8601, and the Dublin Core standards.
Record Creation Time (dcterms:created): Specifies the date when the catalog record itself was created, separate from the dataset it catalogs. This property uses the xsd:dateTime format.

Temporal Coverage of the Dataset

Temporal coverage refers to the time period the data within a dataset covers or relates to, as opposed to lifecycle properties like creation or update dates. This concept is central in data management for understanding the relevance and applicability of the dataset's content.

In DCAT, temporal coverage is defined using the property dcterms:temporal associated with the dcterms:PeriodOfTime class. This class allows for a clear specification of the coverage period through defined start and end dates. For detailed representation, formats such as xsd:date, xsd:dateTime, xsd:gYear, or xsd:gYearMonth can be used. For instance, a dataset on a year-long project might use "2023" (xsd:gYear), whereas a dataset with specific event dates might use "2023-03-15T13:00:00" (xsd:dateTime).

Marking these timeframes is typically done using dcat:startDate and dcat:endDate, offering flexibility for either fixed or open-ended periods. For example, a dataset about historical weather patterns might span from "1950-01-01" to "2000-12-31".

Adopting dcterms:PeriodOfTime in DCAT-US 3.0 aligns it with international DCAT 3 standards, improving data interoperability and ensuring consistent handling of time-related data. This alignment rectifies previous inconsistencies and enhances the usability and exchange of data.

Temporal Resolution in Datasets and Distributions

The property dcat:temporalResolution, when used on a dcat:Dataset or dcat:Distribution, refers to the smallest time interval that can be discerned in the data. This property is essential for understanding the granularity and frequency of data recording within the dataset or its specific distributions.

Temporal resolution is particularly relevant in datasets where time plays a crucial role, such as time-series data. It indicates the level of detail at which changes or updates in the data are recorded and presented. For instance, a dataset with daily weather observations might have a temporal resolution of one day, represented as "P1D" in XML Schema duration format.

In the context of dcat:Dataset, specifying dcat:temporalResolution helps users understand the overall temporal granularity of the dataset. Conversely, when applied to dcat:Distribution, it provides resolution details specific to each distribution format, acknowledging that different formats might be updated at different frequencies.

This distinction is important for datasets available in multiple formats or distributions, as each might have different temporal characteristics. For example, a high-resolution version of a dataset updated every minute would be suitable for detailed, time-sensitive analyses, while a lower-resolution version updated annually might be better suited for long-term trend analyses.

Examples of encoding durations in XML Schema duration format:

Daily Resolution: A dataset with daily observations would use "P1D", indicating a resolution of one day.
Hourly Resolution: For hourly data, such as in a traffic flow dataset, the encoding would be "PT1H", representing a resolution of one hour.

The dcat:temporalResolution can be specified using various time units such as seconds, minutes, hours, days, or years, depending on the nature of the dataset or distribution. This specification aids in aligning the dataset or distribution with user expectations and analytical requirements.
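For consumers that need to compare resolutions numerically, a duration string can be converted to a fixed length of time. The Python sketch below is a simplified parser written for this illustration: it handles only the day, hour, minute, and second components and deliberately rejects years and months, since their length in days varies. The function name is hypothetical.

```python
import re
from datetime import timedelta

# Simplified subset of the XML Schema duration syntax: PnDTnHnMnS.
_DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)


def resolution_to_timedelta(value: str) -> timedelta:
    """Convert a duration such as 'P1D' or 'PT1H' to a timedelta."""
    match = _DURATION.fullmatch(value)
    parts = {}
    if match:
        parts = {k: float(v) for k, v in match.groupdict().items() if v}
    # Reject empty durations ('P', 'PT') and a dangling 'T' ('P1DT').
    if not parts or value.endswith("T"):
        raise ValueError(f"unsupported or invalid duration: {value!r}")
    return timedelta(**parts)


print(resolution_to_timedelta("P1D"))      # 1 day, 0:00:00
print(resolution_to_timedelta("PT1H"))     # 1:00:00
print(resolution_to_timedelta("P1DT12H"))  # 1 day, 12:00:00
```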

Frequency Coding Guide

The following table provides a guide to coding various standard frequencies as per ISO 19115, [[ISO8601-1]], and the Dublin Core standards.

| ISO 19115 - MD_MaintenanceFrequencyCode | ISO-8601 | Dublin Core Collection Description Frequency Vocabulary [[CLD-FREQ]] |
| --- | --- | --- |
| continual | R/PT1S | continuous |
| daily | R/P1D | daily |
| weekly | R/P1W | weekly |
| fortnightly | R/P2W or R/P0.5W | biweekly |
| monthly | R/P1M | monthly |
| quarterly | R/P3M | quarterly |
| biannually | R/P6M | semiannual |
| annually | R/P1Y | annual |
| asNeeded | - | - |
| Irregular | - | irregular |
| notPlanned | - | - |
| unknown | - | - |
| - | R/P3Y | triennial |
| - | R/P2Y | biennial |
| - | R/P4M | threeTimesAYear |
| - | R/P2M or R/P0.5M | bimonthly |
| - | R/P0.5M | semimonthly |
| - | R/P0.33M | threeTimesAMonth |
| - | R/P1W | semiweekly |
| - | R/P3.5D | threeTimesAWeek |
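When converting existing ISO 19115 metadata, the guide above can be applied as a simple lookup. The Python sketch below covers only the entries of the guide that have a value in all three columns; the dictionary and function names are hypothetical, and codes without a listed equivalent return None so that they can be handled case by case.

```python
# ISO 19115 code -> (ISO-8601 repeating interval, Dublin Core frequency term),
# transcribed from the guide above.
FREQUENCY_GUIDE = {
    "continual": ("R/PT1S", "continuous"),
    "daily": ("R/P1D", "daily"),
    "weekly": ("R/P1W", "weekly"),
    "fortnightly": ("R/P2W", "biweekly"),
    "monthly": ("R/P1M", "monthly"),
    "quarterly": ("R/P3M", "quarterly"),
    "biannually": ("R/P6M", "semiannual"),
    "annually": ("R/P1Y", "annual"),
}


def dublin_core_frequency(iso19115_code: str):
    """Return the Dublin Core frequency term for an ISO 19115 code, if any."""
    entry = FREQUENCY_GUIDE.get(iso19115_code)
    return entry[1] if entry else None


print(dublin_core_frequency("biannually"))  # semiannual
print(dublin_core_frequency("notPlanned"))  # None
```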

Handling Temporal Formats in JSON-LD for DCAT Datasets

When dealing with a dcat:Dataset in JSON-LD, different temporal formats can be effectively represented using the JSON-LD @type attribute. This ensures that each temporal aspect of the dataset is accurately interpreted.

Example using xsd:date for a dataset's last update time (dcterms:modified):

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Annual Financial Report",
            "modified": {
              "@value": "2023-03-31",
              "@type": "xsd:date"
            }
          }

Example using xsd:dateTime for a dataset's precise creation time:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Real-Time Traffic Data",
            "created": {
              "@value": "2023-03-31T15:00:00",
              "@type": "xsd:dateTime"
            }
          }

Example using xsd:gYear for a dataset's publication year (dcterms:issued):

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Decadal 2020 Census Data",
            "issued": {
              "@value": "2020",
              "@type": "xsd:gYear"
            }
          }

Example using xsd:gYearMonth for representing the temporal coverage of a dataset:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Quarterly Weather Observations",
            "temporal": {
              "@type": "dcterms:PeriodOfTime",
              "startDate": {
                "@value": "2023-01",
                "@type": "xsd:gYearMonth"
              },
              "endDate": {
                "@value": "2023-03",
                "@type": "xsd:gYearMonth"
              }
            }
          }

These examples illustrate the flexible use of @type to accurately represent various temporal aspects within a dcat:Dataset. By specifying the datatype, datasets can convey precise temporal information, enhancing data usability and interpretation.
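When generating such records programmatically, the datatype can be inferred from the lexical form of the value. The Python sketch below is a simplified illustration: the patterns cover only the common forms shown above (no negative years, and time zones only on xsd:dateTime), and the function name is hypothetical.

```python
import re

# Ordered from most to least specific; simplified lexical forms only.
_TEMPORAL_PATTERNS = [
    ("xsd:dateTime",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def typed_temporal_literal(value: str) -> dict:
    """Wrap a temporal string as a JSON-LD value object with its datatype."""
    for datatype, pattern in _TEMPORAL_PATTERNS:
        if pattern.fullmatch(value):
            return {"@value": value, "@type": datatype}
    raise ValueError(f"unrecognized temporal value: {value!r}")


print(typed_temporal_literal("2023-03-31"))
# {'@value': '2023-03-31', '@type': 'xsd:date'}
print(typed_temporal_literal("2020"))
# {'@value': '2020', '@type': 'xsd:gYear'}
```

The patterns check only the shape of the string, so a value such as "2023-13" would still be labeled xsd:gYearMonth; a production implementation would also need to validate the ranges.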

Additionally, specifying the dcat:temporalResolution in JSON-LD is straightforward. Since the @type for dcat:temporalResolution is predefined in the JSON-LD context, it's not necessary to explicitly declare it.

Here's an example of how dcat:temporalResolution might be used in a JSON-LD representation of a dataset with daily updates:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Daily Temperature Observations",
            "temporalResolution": "P1D"
          }

In this example, the dataset is defined with a temporal resolution of one day, indicated by "P1D". This notation follows the XML Schema duration format and is understood in the JSON-LD context without requiring an additional @type declaration for the resolution.

The dcterms:accrualPeriodicity property in JSON-LD specifies the frequency at which a dataset is updated or new data is added. This property is vital for users to understand how often the dataset's information is refreshed.

Example of a dataset updated daily:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Daily Air Quality Index",
            "accrualPeriodicity": "daily"
          }

Example of a dataset updated monthly:

          {
            "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
            "@type": "dcat:Dataset",
            "title": "Monthly Employment Statistics",
            "accrualPeriodicity": "monthly"
          }

These examples illustrate the use of dcterms:accrualPeriodicity in JSON-LD to clearly represent the update frequency of datasets. By specifying this property, users can easily determine the refresh rate of the dataset's data, which is crucial for its application and relevance.

Legal Metadata

The DCAT-US specification emphasizes clear and accurate legal information provision for datasets, data series, distributions, and data services. Aligning with the FAIR data principles, this approach is essential for informed data use and ensuring compliance with relevant laws and policies. Contemporary laws increasingly mandate clear legal guidelines, highlighting the need for explicit reuse standard disclosures. The DCAT-US strategy aims to articulate legal information at the most specific sharing level to prevent conflicts and discrepancies, ensuring clarity and consistency across different data organization levels. Engaging legal experts is crucial in developing precise and understandable guidelines for each data publisher.

Legal information within DCAT-US is categorized into distinct types, each addressing different legal aspects for digital resources. This categorization helps in tailoring the guidelines to the specific nature of each resource, including those governed by the National Archives and Records Administration (NARA).

License Declarations: Provide clear guidelines on resource usage, redistribution, or modification.
Access Rights: Detail who can access the resource and under what conditions.
Usage Rights: Concern the rights associated with resource use, often detailing scope and limitations.
Liability Statements: Outline responsibilities and liabilities associated with resource usage.
Copyright Information: Clarify the copyright status of the resource.
Controlled Unclassified Information (CUI): Indicate special handling or dissemination controls for CUI, especially relevant to NARA.
Privacy and Confidentiality: Address personal data, privacy, and confidentiality issues.
Intellectual Property Rights (IPR): Deal with intellectual property aspects like patents and trademarks.

The DCAT-US profile addresses unique requirements for digital resources managed by the National Archives and Records Administration (NARA). It employs a structured approach to define access, use and CUI restrictions using well-defined code lists, enhancing the precision of legal information management, which is critical for datasets involving NARA-governed materials. These materials, governed by NARA's guidelines, involve a range of legal and ethical considerations, including content sensitivity, legal compliance, donor stipulations, and preservation needs.

Adherence to these NARA-specific guidelines is vital for the responsible management of datasets with historical, legal, or cultural significance, ensuring they align with legal frameworks. Below are the key NARA-specific restrictions:

Access Restrictions: Detailing who can access NARA-governed digital assets and under what conditions, in accordance with the federal records legal framework. This includes guidelines on public access and accessing restricted records.
Use Restrictions: Outlining the rights and limitations associated with the use of NARA-governed assets, focusing on specific legal constraints.
Controlled Unclassified Information (CUI) Restrictions: Specifying the handling and dissemination controls for NARA assets that involve CUI, as mandated by law.

Access Rights

The dcterms:accessRights property in the DCAT-US specification plays a key role in defining the accessibility of a dataset, distribution, or data service. This property provides information about the access restrictions placed on a digital resource, including any limitations or permissions required for accessing the data.

Accurately defining access rights is crucial for data providers to ensure that users are aware of any conditions or limitations on accessing datasets. This transparency aids users in understanding and complying with access policies, facilitating responsible and legal use of data.

The dcterms:accessRights property is used to describe the general conditions under which a dataset is accessible. This may include information about:

Public Access: Indicating whether a dataset is publicly accessible or restricted to certain groups.
Restrictions: Detailing any specific conditions or limitations on who can access the dataset.
Requirements: Outlining any prerequisites for accessing the dataset, such as membership, subscriptions, or compliance with certain terms.

Access rights are defined as an instance of dcterms:RightsStatement. This class accommodates URL references as well as custom rights statements expressed as literals via odrs:attributionText.

Additionally, access rights apply at both the dataset and distribution levels. While a dataset's access rights define the overall terms for the entire dataset, each distribution can have its own specific access rights, potentially different from the dataset's. This flexibility allows for tailored access rights for different forms or versions of the dataset, enhancing the dataset's usability and compliance with varied user needs.

This property does not typically include the access URL or mechanism, which is covered by the dcat:accessURL property. Instead, dcterms:accessRights focuses on the legal or policy constraints that govern the availability of the dataset.
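As a minimal sketch of how dataset-level and distribution-level access rights might sit side by side, the Python below builds a JSON-LD-style record from plain dictionaries. The example.gov identifiers and the helper name `effective_access_rights` are hypothetical, and the fallback from distribution to dataset is an illustrative assumption rather than a rule stated by the profile.

```python
import json

# Hypothetical rights statements; the URI is a placeholder, not a real code-list entry.
PUBLIC = {"@id": "https://example.gov/rights/public", "@type": "dcterms:RightsStatement"}
RESTRICTED = {
    "@type": "dcterms:RightsStatement",
    # Custom statement given as a literal, as described in the text.
    "odrs:attributionText": "Access limited to registered agency researchers.",
}

dataset = {
    "@id": "https://example.gov/dataset/1",
    "@type": "dcat:Dataset",
    "dcterms:accessRights": PUBLIC,
    "dcat:distribution": [
        # No access rights of its own.
        {"@id": "https://example.gov/dataset/1/csv", "@type": "dcat:Distribution"},
        # Carries its own, stricter access rights.
        {
            "@id": "https://example.gov/dataset/1/raw",
            "@type": "dcat:Distribution",
            "dcterms:accessRights": RESTRICTED,
        },
    ],
}


def effective_access_rights(dataset, distribution):
    """Return the distribution's own access rights, else the dataset's (assumed fallback)."""
    return distribution.get("dcterms:accessRights", dataset.get("dcterms:accessRights"))


if __name__ == "__main__":
    for dist in dataset["dcat:distribution"]:
        print(dist["@id"], json.dumps(effective_access_rights(dataset, dist)))
```

Note that dcat:accessURL is deliberately absent here: the sketch only models the policy constraint, not the access mechanism.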

License Document

The dcat-us:LicenseDocument is an RDF Class in the DCAT-US specification designed to specify the licensing terms under which a dataset is made available. This class plays a vital role in clearly communicating the legal permissions and restrictions associated with the use of a dataset. Its focus is broader, covering various aspects of dataset usage and redistribution in a digital environment, typically for datasets published by or for government agencies.

The License Document typically includes information such as:

Terms of Use: Detailed conditions under which the dataset can be used, modified, shared, or distributed.
Restrictions: Any limitations or prohibitions regarding the use of the dataset.
Rights Granted: Specific rights that are granted to users of the dataset, such as the right to use, reproduce, or distribute.
Attribution Requirements: Requirements for acknowledging the source of the dataset in derivative works or publications.

In dataset metadata, the dcat-us:LicenseDocument can be referenced using the property dcterms:license. This ensures that users are aware of and can easily access the licensing terms associated with a dataset.

Implementing dcat-us:LicenseDocument is crucial for data providers to ensure legal clarity and compliance, and it assists users in understanding their rights and responsibilities when using the dataset.

The property spdx:licenseText MAY be used only to specify license information in legacy metadata records that are not compliant with a standard license from an endorsed registry. This property can be repeated for parallel language versions of the description.

Additionally, licensing applies both at the dataset and distribution levels. While a dataset license defines the overall terms for the entire dataset, each distribution can have its own specific license, potentially different from the dataset's license. This flexibility allows for tailored legal terms for different forms or versions of the dataset, enhancing the dataset's usability and compliance with varied user needs.

It is important for data providers to ensure consistency between the dataset license and distribution licenses to avoid confusion. Clear and accessible licensing information for both datasets and their distributions is essential for users to understand their legal rights and obligations.

Concerning license vocabularies, implementers are encouraged to consult legal experts to develop precise and understandable guidelines.
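The consistency concern between dataset and distribution licenses can be illustrated with a small check. This is a sketch under stated assumptions: the `ex:` identifiers and the function name `license_mismatches` are hypothetical, licenses are compared only by URI, and a distribution without a license is treated as inheriting the dataset's, which the profile does not mandate.

```python
# Illustrative record; the ex: identifiers are placeholders.
CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"
CC_BY = "https://creativecommons.org/licenses/by/4.0/"

dataset = {
    "@type": "dcat:Dataset",
    "dcterms:license": {"@id": CC0},
    "dcat:distribution": [
        {"@id": "ex:csv", "dcterms:license": {"@id": CC0}},
        {"@id": "ex:api", "dcterms:license": {"@id": CC_BY}},
        {"@id": "ex:zip"},  # no license of its own
    ],
}


def license_mismatches(dataset):
    """List ids of distributions whose license URI differs from the dataset's."""
    dataset_license = dataset.get("dcterms:license", {}).get("@id")
    mismatched = []
    for dist in dataset.get("dcat:distribution", []):
        dist_license = dist.get("dcterms:license", {}).get("@id")
        # Assumption: a missing distribution license is not reported as a mismatch.
        if dist_license is not None and dist_license != dataset_license:
            mismatched.append(dist["@id"])
    return mismatched


if __name__ == "__main__":
    print(license_mismatches(dataset))
```

A mismatch is not necessarily an error, since distributions may legitimately carry different terms; a report like this only flags places where a publisher may want to confirm the difference is intended.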

Liability Statement

The dcat-us:LiabilityStatement class, introduced in the DCAT-US 3.0 profile, represents a formal declaration by data providers aimed at limiting legal exposure related to the dataset. It disclaims warranties and guarantees and sets clear user expectations for dataset usage, thereby enhancing legal compliance and transparency.

Key aspects typically covered in the Liability Statement include:

Limitation of Responsibility: Disclaiming provider responsibility for data errors and consequential impacts.
No Guarantee of Validity: No assurance of data accuracy, reliability, or completeness.
Absence of Endorsement: Clarifying that catalog inclusion does not imply provider endorsement.
Use at Own Risk: Users are responsible for assessing data suitability for their purposes.

The Liability Statement can be conveyed as literal text or a URL in the dataset metadata, offering flexibility in communication based on the dataset's legal context.

Concerning liability statements, implementers are encouraged to consult legal experts to develop precise and understandable guidelines.

Example
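The following is a minimal sketch, in Python, of how a liability statement might be conveyed either as a URL or as literal text. The linking property name `dcat-us:liabilityStatement`, the use of `rdfs:label` for the literal, the helper name, and the example.gov URL are all assumptions made for illustration; consult the profile's class definitions for the normative property names.

```python
def liability_statement(value):
    """Build a liability statement node from either a URL or literal text."""
    if value.startswith(("http://", "https://")):
        # Referenced by URL.
        return {"@id": value, "@type": "dcat-us:LiabilityStatement"}
    # Assumption: literal text is carried in rdfs:label.
    return {"@type": "dcat-us:LiabilityStatement", "rdfs:label": value}


dataset = {
    "@type": "dcat:Dataset",
    # Assumption: hypothetical name for the property linking to the statement.
    "dcat-us:liabilityStatement": liability_statement(
        "Data are provided as is, without warranty of accuracy or completeness."
    ),
}

if __name__ == "__main__":
    print(dataset["dcat-us:liabilityStatement"])
    print(liability_statement("https://example.gov/disclaimer"))
```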

Copyrights

Copyrights are legal rights granted to the creators of original works, including literary, dramatic, musical, artistic, and certain other intellectual creations. These rights provide creators with exclusive control over the use and distribution of their works for a specific period.

Key aspects of copyright law include:

Exclusive Rights: Copyright holders have exclusive rights to reproduce, distribute, perform, display, and create derivative works from their creations.
Duration: The duration of copyright protection varies by jurisdiction.
Public Domain: After copyright protection expires, works enter the public domain and can be used freely by anyone without permission or payment.
Fair Use: Certain uses of copyrighted material may be considered fair use, such as for criticism, comment, news reporting, teaching, scholarship, and research, without the need for permission from or payment to the copyright holder.

Copyright is defined as an instance of dcterms:RightsStatement. This class accommodates URL references as well as custom copyright statements expressed as literals via odrs:attributionText.

Additionally, copyright applies at both the dataset and distribution levels. While a dataset's copyright statement defines the overall terms for the entire dataset, each distribution can have its own specific copyright statement, potentially different from the dataset's. This flexibility allows for tailored copyright terms for different forms or versions of the dataset, enhancing the dataset's usability and compliance with varied user needs.

NARA Access Restrictions

The National Archives and Records Administration (NARA) enforces Access Restrictions to ethically and legally manage a wide array of historical records. These restrictions protect sensitive information, comply with legal standards, honor donor agreements, and preserve physical and digital integrity.

Access Restrictions are tailored to each archival collection, balancing historical significance with legal, ethical, and preservation considerations. This ensures appropriate access while respecting archival practices.

The dcat-us:AccessRestriction class in the DCAT-US application profile represents NARA's access restrictions, providing a framework to define and communicate limitations on record accessibility. This class supports NARA in managing sensitive content, maintaining confidentiality, and upholding data integrity, aiding stakeholders in understanding accessibility constraints. The following properties are used to describe access restrictions:

Restriction Status: This mandatory dcat-us:restrictionStatus property is linked to the NARA [Access Restriction Status Authority List](https://www.archives.gov/research/catalog/lcdrg/authority_lists/accesslist.html) and indicates whether or not there are access restrictions on the archival materials.
Specific Restriction: This recommended dcat-us:specificRestriction property refers to the NARA [Specific Access Restriction Authority List](https://www.archives.gov/research/catalog/lcdrg/authority_lists/specificaccesslist.html) for detailed information on restrictions. It defines specific access restrictions to the archival materials, based on national security considerations, donor restrictions, court orders, and other statutory or regulatory provisions.
Restriction Note: This optional dcat-us:restrictionNote property can include additional context or annotations about restrictions. Its use depends on the dcat-us:restrictionStatus property, and certain terms from the authority lists may require its application. It clarifies complex access restrictions, explains multiple levels of security classifications, identifies restricting statutes, or explains access restrictions not included in the [Specific Access Restriction Authority List](https://www.archives.gov/research/catalog/lcdrg/elements/specificaccess.html) or [Security Classification Authority List](https://www.archives.gov/research/catalog/lcdrg/authority_lists/securitylist.html).

These properties and their corresponding code lists are essential for encoding and interpreting restriction data accurately within the NARA guidelines and DCAT-US framework. The SKOS vocabulary for these lists is available in the [DCAT-US repository](https://github.com/DOI-DO/dcat-us/blob/main/vocabularies/nara-restrictions.ttl), with JSON-LD URIs abbreviated for space. The following is the code list for the dcat-us:restrictionStatus property:

| URI (abbr.) | Label | Description |
|---|---|---|
| no-limitations | No limitations | Anybody can directly and anonymously access the data, without being required to register or authenticate. |
| registration-required | Registration required | Anybody can access the data, but they have to register first. |
| authorisation-required | Authorisation required | Only some users can access the resource. |
| unknown | Unknown | Access restrictions are unknown. |
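To illustrate how the three access restriction properties relate, here is a minimal validation sketch in Python. The status codes are the abbreviated values from the code list above; the function name `validate_access_restriction` and the specific checks are assumptions for illustration and do not reproduce the profile's SHACL shapes.

```python
# Abbreviated status codes copied from the code list above.
RESTRICTION_STATUS_CODES = {
    "no-limitations",
    "registration-required",
    "authorisation-required",
    "unknown",
}


def validate_access_restriction(node):
    """Return a list of problems found in a dcat-us:AccessRestriction node."""
    problems = []
    status = node.get("dcat-us:restrictionStatus")
    if status is None:
        # The status property is described as mandatory.
        problems.append("dcat-us:restrictionStatus is mandatory")
    elif status not in RESTRICTION_STATUS_CODES:
        problems.append(f"unknown restriction status: {status}")
    # The note is optional, but its use depends on the status being present.
    if node.get("dcat-us:restrictionNote") is not None and status is None:
        problems.append("dcat-us:restrictionNote depends on dcat-us:restrictionStatus")
    return problems


if __name__ == "__main__":
    restriction = {
        "@type": "dcat-us:AccessRestriction",
        "dcat-us:restrictionStatus": "registration-required",
        "dcat-us:restrictionNote": "Registration is handled by the holding agency.",
    }
    print(validate_access_restriction(restriction))
```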

NARA Use Restrictions

The dcat-us:UseRestriction class in DCAT-US 3.0 is used by the National Archives and Records Administration (NARA) to define use restrictions on resources. This includes limitations for legal and ethical compliance, like protecting privacy and intellectual property rights. The following properties are used to describe use restrictions:

Restriction Status: This mandatory dcat-us:restrictionStatus property is linked to the NARA [Use Restriction Status Authority List](https://www.archives.gov/research/catalog/lcdrg/elements/use.html) and specifies the type of use restrictions.
Specific Restriction: This recommended dcat-us:specificRestriction property refers to the NARA [Specific Restriction Authority List](https://www.archives.gov/research/catalog/lcdrg/elements/specificuse.html) for detailed information on restrictions.
Restriction Note: This optional dcat-us:restrictionNote property can include additional context or annotations about restrictions. Its use depends on the dcat-us:restrictionStatus property, and certain terms from the authority lists may require its application.

Controlled Unclassified Information (CUI) Restrictions

Controlled Unclassified Information (CUI) in NARA archives needs protection due to its sensitive nature. This safeguarding is essential for national security, privacy, intellectual property, and confidentiality of sensitive data. You can learn more about CUI on the NARA website.

The dcat-us:CuiRestriction class in DCAT-US 3.0 is tailored for managing CUI, requiring specific safeguarding in line with legal and policy requirements. It aligns with NARA guidelines to ensure compliance, enhance data security, and improve data interoperability, transparency, and efficient resource management in government data systems.

The dcat-us:CuiRestriction class is instrumental in managing Controlled Unclassified Information (CUI) within the DCAT-US framework. It mandates certain properties for effective representation and management of CUI data:

CUI Banner Marking: The dcat-us:cuiBannerMarking property is essential for marking CUI.
CUI Designation Indicator: Detailed in the dcat-us:designationIndicator, this free-text property follows guidelines from the NARA Marking Guidebook and DODI 5200.48, typically including "Controlled by:" and contact information.
Required Indicator Per Authority: The optional dcat-us:requiredIndicatorPerAuthority property allows for additional context and indicators as required by the authority.

These properties collectively ensure that the CUI status is accurately represented, complying with relevant standards and providing necessary flexibility for specific cases.
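A CUI restriction built from these properties might look like the sketch below. It assumes that the banner marking and the designation indicator are the required properties (the text marks only dcat-us:requiredIndicatorPerAuthority as optional); the agency name, contact address, and helper name `missing_cui_properties` are hypothetical.

```python
# Assumption: these two properties are treated as required in this sketch.
REQUIRED_CUI_PROPERTIES = ("dcat-us:cuiBannerMarking", "dcat-us:designationIndicator")

cui_restriction = {
    "@type": "dcat-us:CuiRestriction",
    "dcat-us:cuiBannerMarking": "CUI",
    # Free text, typically including "Controlled by:" and contact information.
    "dcat-us:designationIndicator": "Controlled by: Example Agency, Office of Data; data@example.gov",
    # dcat-us:requiredIndicatorPerAuthority is optional and omitted here.
}


def missing_cui_properties(node):
    """Return the required CUI properties that are absent or blank."""
    return [
        name
        for name in REQUIRED_CUI_PROPERTIES
        if not (node.get(name) or "").strip()
    ]


if __name__ == "__main__":
    print(missing_cui_properties(cui_restriction))
    print(missing_cui_properties({"dcat-us:cuiBannerMarking": "CUI"}))
```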

Provenance Metadata

Provenance and data lineage are crucial aspects of data management and transparency, ensuring that data consumers understand the origins, transformations, and utility of the data. In DCAT-US, leveraging [[DCTERMS]] (Dublin Core Terms) and [[PROV-O]] (W3C PROV Ontology) properties can effectively represent these aspects. This section outlines best practices for utilizing these properties to detail the provenance and data lineage within DCAT datasets.

Basic Provenance Metadata

Within DCAT-US, the Dublin Core Terms [[DCTERMS]] vocabulary offers properties that allow data publishers to articulate basic provenance information effectively. Particularly, dcterms:source and dcterms:provenance are pivotal in this context.

dcterms:source

dcterms:source is used in the following contexts:

Property source metadata (dcterms:source), optional, non-repeatable property for Catalog Record, that refers to the original metadata that was used in creating metadata for the Dataset.
Property source (dcterms:source), optional, repeatable property for Dataset, that refers to a related Dataset from which the described Dataset is derived.

The dcterms:source property is utilized to denote the original source from which the current dataset is derived. It can be a URI directly pointing to the original dataset or, in the absence of a URI, a descriptive reference that sufficiently identifies the original source. It's imperative to ensure that the source referenced is the most immediate or direct source from which the data was derived and to utilize persistent URIs when available to ensure stable and long-term linkage to the source.

dcterms:provenance

On the other hand, dcterms:provenance provides a mechanism to describe the history or lineage of the dataset. This property allows publishers to detail the dataset's historical context and sequence of events or processes that have influenced its formation or transformation. The provenance statement should be concise yet comprehensive, providing a clear and adequate understanding of the dataset's history and lineage. Employing standardized nomenclature and terminologies ensures clarity and consistency across provenance statements.

In the context of data cataloging and transparency, embedding provenance information is vital to elucidate the origin and historical context of a dataset. The dcterms:ProvenanceStatement from the Dublin Core Terms (DCTerms) vocabulary provides a structured way to incorporate this information within the DCAT-US framework.

The dcterms:ProvenanceStatement is designed to convey a human-readable explanation or record of the history or lineage of a dataset. It can be utilized to describe the dataset's origins, transformations, ownership, and any other changes it might have undergone, thereby providing a clear and comprehensive historical record.

Property provenance (dcterms:provenance), optional, repeatable property for Dataset, that contains a statement about the lineage of a Dataset.

This property can be expressed in two primary ways within DCAT-US:

By URI: A Uniform Resource Identifier (URI) can be used to refer to a dcterms:ProvenanceStatement that is hosted externally. This method is beneficial when the provenance information is extensive or when it is standardized and used across multiple datasets.
Using Free Text with rdfs:label: Alternatively, a dcterms:ProvenanceStatement can be expressed as free text using the rdfs:label property. This approach is suitable for providing concise, readable provenance information directly within the dataset's metadata.
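The two ways of expressing provenance, together with dcterms:source, can be sketched as a JSON-LD-style record in Python. The example.gov URIs, the label text, and the helper name `provenance_summary` are hypothetical and serve only to show the shape of the metadata.

```python
dataset = {
    "@id": "https://example.gov/dataset/derived",
    "@type": "dcat:Dataset",
    # The most immediate dataset from which this one is derived.
    "dcterms:source": [{"@id": "https://example.gov/dataset/original"}],
    "dcterms:provenance": [
        # By URI: an externally hosted provenance statement.
        {"@id": "https://example.gov/provenance/standard-pipeline"},
        # As free text: an inline statement using rdfs:label.
        {
            "@type": "dcterms:ProvenanceStatement",
            "rdfs:label": "Compiled from quarterly agency submissions and deduplicated.",
        },
    ],
}


def provenance_summary(dataset):
    """Return the label of each provenance statement, or its URI if it has no label."""
    return [
        statement.get("rdfs:label") or statement["@id"]
        for statement in dataset.get("dcterms:provenance", [])
    ]


if __name__ == "__main__":
    for line in provenance_summary(dataset):
        print(line)
```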

Detailed Data Lineage

Data lineage, which traces the discrete steps involving data as it moves through the various stages of a workflow, is crucial for understanding data's origins and transformations. The W3C PROV Ontology [[PROV-O]] provides a rich set of properties to describe detailed data lineage in a standardized manner, ensuring interoperability and clarity in data documentation.

Key PROV-O properties include:

prov:wasDerivedFrom: Establishes a derivation relationship between two entities.
prov:wasGeneratedBy: Links an entity to the activity that generated it.
prov:wasInfluencedBy: Expresses a generic influence of one entity, activity, or agent on another; the more specific properties in this list are preferred when they apply.
prov:wasAttributedTo: Signifies the relationship between a dataset and an agent responsible for its creation.

The prov:Activity class in the PROV-O ontology plays a pivotal role in representing processes or actions taken upon or with entities, thereby providing a structured framework to document the transformations, analyses, or other actions that data undergoes. An instance of prov:Activity is utilized to describe a particular occurrence of an action or process, which can involve the consumption, production, or transformation of entities. By associating activities with entities through properties such as prov:wasGeneratedBy, a detailed account of the data's journey, from its origin through various transformations to its current state, can be articulated. This not only enhances the transparency of the data but also provides a robust mechanism to trace back through the steps involved in data creation and processing, thereby contributing to verifiable and trustworthy data lineage. Furthermore, prov:Activity can be associated with prov:Agent through properties like prov:wasAssociatedWith, offering insights into the roles of different agents (e.g., organizations, people, or software) in data processing activities, thereby enriching the data provenance and lineage documentation.

By adhering to these practices and effectively utilizing [[PROV-O]] properties, data publishers can enhance transparency and facilitate informed data usage among consumers by providing a clear view of data sourcing, processing, and transformation.
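As an illustration of how these PROV-O properties chain together, the sketch below stores a small graph as Python dictionaries and walks prov:wasDerivedFrom links back to the original source. All `ex:` identifiers and the function name `trace_lineage` are hypothetical; a production system would more likely query an RDF store.

```python
# A tiny lineage graph keyed by (hypothetical) identifiers.
graph = {
    "ex:raw": {"@type": "dcat:Dataset"},
    "ex:cleaned": {
        "@type": "dcat:Dataset",
        "prov:wasDerivedFrom": "ex:raw",
        "prov:wasGeneratedBy": "ex:cleaning-run",
    },
    "ex:published": {
        "@type": "dcat:Dataset",
        "prov:wasDerivedFrom": "ex:cleaned",
        "prov:wasGeneratedBy": "ex:aggregation-run",
        "prov:wasAttributedTo": "ex:agency",
    },
    "ex:cleaning-run": {"@type": "prov:Activity", "prov:wasAssociatedWith": "ex:etl-tool"},
    "ex:aggregation-run": {"@type": "prov:Activity", "prov:wasAssociatedWith": "ex:agency"},
}


def trace_lineage(graph, start):
    """Follow prov:wasDerivedFrom from start; return (entity, activity, agent) steps."""
    chain, seen, current = [], set(), start
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        node = graph.get(current, {})
        activity = node.get("prov:wasGeneratedBy")
        agent = graph.get(activity, {}).get("prov:wasAssociatedWith") if activity else None
        chain.append((current, activity, agent))
        current = node.get("prov:wasDerivedFrom")
    return chain


if __name__ == "__main__":
    for entity, activity, agent in trace_lineage(graph, "ex:published"):
        print(entity, "generated by", activity, "associated with", agent)
```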

Distribution Metadata

In the realm of data sharing and management, dcat:Distribution plays a pivotal role as the tangible representation of datasets. A Distribution within the DCAT framework is more than just a link to a dataset; it is the embodiment of the dataset in a practical, accessible format, adhering to the W3C standards. It is the dataset manifested in a specific format, ranging from CSV files to complex databases, inherently tied to its parent dataset. This relationship underscores the fact that a Distribution does not exist in isolation but as a practical form of the dataset, prepared and published by data providers for end users. The core attributes of a Distribution focus on its file-centric properties like download URLs, media types, file formats, byte sizes, character encodings, and checksums, emphasizing its primary function: efficient and reliable data delivery.

Guidelines for Creating DCAT Distributions

The following guidelines are designed to help determine the most effective way to structure DCAT distributions, whether as a single file, a multi-file package, or multiple distributions. The choice depends on the dataset's characteristics, user needs, and the data's intended use. Consider these guidelines to ensure your distributions are user-friendly, accessible, and align with best practices in data management.

Single-File Distribution: Ideal for datasets that are cohesive and standalone, typically encapsulated in a single format like CSV or XML. This approach is beneficial for smaller or comprehensive datasets, simplifying access and use. The key is to choose a file format that effectively represents all necessary data.
Multi-File Packaged Distribution: Essential for complex datasets, such as ArcGIS shapefiles, which require multiple interdependent files. Packaging related files together is useful for large or component-rich datasets. It's crucial to include all essential components and ensure the package facilitates easy download and usage.
Multiple Distributions in a Dataset: Suitable for datasets that can be logically segmented or offered in different formats. This method allows targeted access to specific data parts and enables selective updating. Clear documentation of each distribution is important for user navigation.

When selecting a distribution format, it is important to consider factors such as the interdependence of files, the ease of user accessibility, the size and downloadability of the data, the frequency of updates, and the diversity of formats required. A thoughtful approach to these criteria will help in creating a distribution strategy that is both practical for data providers and beneficial for end-users, enhancing the overall effectiveness of data sharing and utilization.
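The three options above can be read as a rough decision rule. The sketch below encodes one possible reading in Python; the function name, its parameters, and the order of the checks are assumptions for illustration, and real decisions will also weigh size, update frequency, and user needs.

```python
def suggest_layout(interdependent_files: bool, format_count: int, segmented: bool) -> str:
    """Suggest a distribution layout from a few dataset characteristics."""
    if interdependent_files:
        # Files that only work together (e.g. a shapefile's parts) are packaged.
        return "multi-file packaged distribution"
    if format_count > 1 or segmented:
        # Several formats or logical segments become separate distributions.
        return "multiple distributions"
    # A cohesive, standalone dataset in one format.
    return "single-file distribution"


if __name__ == "__main__":
    print(suggest_layout(interdependent_files=True, format_count=1, segmented=False))
    print(suggest_layout(interdependent_files=False, format_count=2, segmented=False))
    print(suggest_layout(interdependent_files=False, format_count=1, segmented=False))
```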

File-centric Properties

This section focuses on the properties central to the file-centric aspects of dcat:Distribution. These properties are crucial for ensuring datasets are accessible and usable in their practical forms, addressing the aspects of data encoding, structure, packaging, presentation, media type, and language.

dcat:downloadURL: This property is preferred for direct links to downloadable resources. It is the most straightforward way to provide access to a distribution, allowing users to directly download the dataset in its entirety without any intermediate steps or interactions.
dcat:accessURL: This property should be used for the URL of a service or location that provides access to the distribution, typically through a web form, query, or API call. It is ideal for scenarios where the distribution is accessed via an interactive mechanism rather than direct download. For example, when accessing datasets that require specific queries or are provided through a web service.
dcat:mediaType: This property specifies the Internet Media Type (also known as MIME type) of the distribution, which are standardized identifiers for labeling the format of documents, files, or data transmitted via the Internet. It is particularly useful in scenarios where the distribution format aligns with media types registered by the Internet Assigned Numbers Authority (IANA) [[IANA-MEDIA-TYPES]], ensuring standardization and facilitating automated processing.
dcterms:format: This property is applicable in scenarios not covered by dcat:mediaType, particularly when aligning with file formats recognized by central authorities. The role of dcterms:format is to offer a detailed description of the distribution's file format or physical medium. For instance, in the geospatial domain, this could include formats like “Shapefile” or “GeoJSON”. These descriptions are crucial for providing human-readable information about the distribution's format, enhancing user understanding and aiding in the effective presentation within data catalogs.
dcterms:conformsTo: This property indicates the standards or specifications to which the distribution conforms. Allowing for multiple standards acknowledges that datasets may adhere to more than one set of specifications, either due to the nature of the data or to meet various user needs and compliance requirements. For instance, a dataset might conform to both an industry-specific standard and a general data format standard. Documenting each applicable standard enhances the dataset's interoperability and usability, making it clear to users what to expect in terms of data structure and quality.
dcat:compressFormat: This property is to be used when the files in the distribution are compressed, e.g., in a ZIP file. The format SHOULD be expressed using a media type as defined by IANA [[IANA-MEDIA-TYPES]], if available.
dcat:packageFormat: This property should be employed when the files within a distribution are packaged together, such as in formats like TAR, ZIP, Frictionless Data Package, or BagIt files. The format SHOULD be expressed using an appropriate media type as defined by IANA [[IANA-MEDIA-TYPES]], if available, to ensure standardization and broader recognition of the format.
dcat:byteSize: Indicates the size of the distribution, important for understanding download requirements and storage planning. The size SHOULD be given as an integer.
spdx:checksum: This optional property is used to provide a spdx:Checksum instance for ensuring data integrity during transfer. It serves as a mechanism to verify that the contents of a file or package have not been altered. The checksum should be specified using the spdx:checksumValue property. To indicate the algorithm used for generating the checksum, use the property spdx:algorithm with URIs defined in the SPDX specification, such as spdx:checksumAlgorithm_sha1, spdx:checksumAlgorithm_sha256, or spdx:checksumAlgorithm_sha512, depending on the algorithm employed.
adms:representationTechnique: This property can be used to specify the technique or method by which the data is represented in the distribution. This is different from the file format as, for example, a ZIP file (file format) could contain an XML schema (representation technique). It can help users understand the underlying structure or visualization method of the dataset. For example, for spatial datasets, this property SHOULD be used to express the spatial representation type (grid, vector, tin), by using the URIs from a code list managed in a registry.
cnt:characterEncoding: This property SHOULD be used to specify the character encoding of the Distribution, using as its value a character set name from the IANA Character Sets register [[IANA-CHARSETS]]. Character encoding in [[?ISO-19115-1]] metadata is specified with a code list that can be mapped to the corresponding codes in [[IANA-CHARSETS]], as shown in the following table (entries with 1-to-many mappings are in italic).

| ISO 19115 - MD_CharacterSetCode | Description | IANA |
|---|---|---|
| ucs2 | 16-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-2 |
| ucs4 | 32-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-4 |
| utf7 | 7-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-7 |
| utf8 | 8-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-8 |
| utf16 | 16-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-16 |
| 8859part1 | ISO/IEC 8859-1, Information technology - 8-bit single byte coded graphic character sets - Part 1 : Latin alphabet No.1 | ISO-8859-1 |
| 8859part2 | ISO/IEC 8859-2, Information technology - 8-bit single byte coded graphic character sets - Part 2 : Latin alphabet No.2 | ISO-8859-2 |
| 8859part3 | ISO/IEC 8859-3, Information technology - 8-bit single byte coded graphic character sets - Part 3 : Latin alphabet No.3 | ISO-8859-3 |
| 8859part4 | ISO/IEC 8859-4, Information technology - 8-bit single byte coded graphic character sets - Part 4 : Latin alphabet No.4 | ISO-8859-4 |
| 8859part5 | ISO/IEC 8859-5, Information technology - 8-bit single byte coded graphic character sets - Part 5 : Latin/Cyrillic alphabet | ISO-8859-5 |
| 8859part6 | ISO/IEC 8859-6, Information technology - 8-bit single byte coded graphic character sets - Part 6 : Latin/Arabic alphabet | ISO-8859-6 |
| 8859part7 | ISO/IEC 8859-7, Information technology - 8-bit single byte coded graphic character sets - Part 7 : Latin/Greek alphabet | ISO-8859-7 |
| 8859part8 | ISO/IEC 8859-8, Information technology - 8-bit single byte coded graphic character sets - Part 8 : Latin/Hebrew alphabet | ISO-8859-8 |
| 8859part9 | ISO/IEC 8859-9, Information technology - 8-bit single byte coded graphic character sets - Part 9 : Latin alphabet No.5 | ISO-8859-9 |
| 8859part10 | ISO/IEC 8859-10, Information technology - 8-bit single byte coded graphic character sets - Part 10 : Latin alphabet No.6 | ISO-8859-10 |
| 8859part11 | ISO/IEC 8859-11, Information technology - 8-bit single byte coded graphic character sets - Part 11 : Latin/Thai alphabet | ISO-8859-11 |
| 8859part13 | ISO/IEC 8859-13, Information technology - 8-bit single byte coded graphic character sets - Part 13 : Latin alphabet No.7 | ISO-8859-13 |
| 8859part14 | ISO/IEC 8859-14, Information technology - 8-bit single byte coded graphic character sets - Part 14 : Latin alphabet No.8 (Celtic) | ISO-8859-14 |
| 8859part15 | ISO/IEC 8859-15, Information technology - 8-bit single byte coded graphic character sets - Part 15 : Latin alphabet No.9 | ISO-8859-15 |
| 8859part16 | ISO/IEC 8859-16, Information technology - 8-bit single byte coded graphic character sets - Part 16 : Latin alphabet No.10 | ISO-8859-16 |
| *jis* | *japanese code set used for electronic transmission* | *JIS_Encoding* |
| shiftJIS | japanese code set used on MS-DOS machines | Shift_JIS |
| eucJP | japanese code set used on UNIX based machines | EUC-JP |
| usAscii | United States ASCII code set (ISO 646 US) | US-ASCII |
| *ebcdic* | *IBM mainframe code set* | *IBM037* |
| eucKR | Korean code set | EUC-KR |
| big5 | traditional Chinese code set used in Taiwan, Hong Kong of China and other areas | Big5 |
| GB2312 | simplified Chinese code set | GB2312 |
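When converting ISO 19115 records, the mapping in the table above can be applied with a simple lookup. The sketch below is one way to do that in Python; the names `ISO_TO_IANA`, `ONE_TO_MANY`, and `to_iana_charset` are hypothetical, and for the two 1-to-many entries it returns only the IANA name shown in the table while flagging that other choices may exist.

```python
# ISO 19115 MD_CharacterSetCode values mapped to the IANA names from the table.
ISO_TO_IANA = {
    "ucs2": "ISO-10646-UCS-2",
    "ucs4": "ISO-10646-UCS-4",
    "utf7": "UTF-7",
    "utf8": "UTF-8",
    "utf16": "UTF-16",
    "jis": "JIS_Encoding",
    "shiftJIS": "Shift_JIS",
    "eucJP": "EUC-JP",
    "usAscii": "US-ASCII",
    "ebcdic": "IBM037",
    "eucKR": "EUC-KR",
    "big5": "Big5",
    "GB2312": "GB2312",
}
# ISO/IEC 8859 parts listed in the table (it has no part 12).
for part in [*range(1, 12), 13, 14, 15, 16]:
    ISO_TO_IANA[f"8859part{part}"] = f"ISO-8859-{part}"

# Entries the table marks as 1-to-many mappings.
ONE_TO_MANY = {"jis", "ebcdic"}


def to_iana_charset(iso_code):
    """Return (IANA name, is_one_to_many) for an ISO 19115 character set code."""
    return ISO_TO_IANA[iso_code], iso_code in ONE_TO_MANY


if __name__ == "__main__":
    print(to_iana_charset("utf8"))
    print(to_iana_charset("ebcdic"))
```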

Effective utilization of these properties enhances data discoverability, interoperability, and the overall user experience in accessing and working with datasets.
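Several of the file-centric properties can be derived directly from the file itself. The following Python sketch computes dcat:byteSize and a SHA-256 spdx:checksum with the standard library and assembles a JSON-LD-style distribution record. The function name `describe_distribution`, the example.gov URL, and the use of an IANA registry URL pattern for the media type are illustrative assumptions.

```python
import hashlib
import json


def describe_distribution(payload: bytes, download_url: str, media_type: str) -> dict:
    """Build a distribution record with size and checksum computed from the payload."""
    return {
        "@type": "dcat:Distribution",
        "dcat:downloadURL": {"@id": download_url},
        # Assumption: media type expressed as a URL under the IANA registry.
        "dcat:mediaType": {"@id": f"https://www.iana.org/assignments/media-types/{media_type}"},
        "dcat:byteSize": len(payload),  # size in bytes, as an integer
        "spdx:checksum": {
            "@type": "spdx:Checksum",
            "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
            "spdx:checksumValue": hashlib.sha256(payload).hexdigest(),
        },
    }


if __name__ == "__main__":
    csv_bytes = b"id,name\n1,alpha\n2,beta\n"
    record = describe_distribution(csv_bytes, "https://example.gov/data/sample.csv", "text/csv")
    print(json.dumps(record, indent=2))
```

A consumer can verify integrity by recomputing the digest of the downloaded bytes and comparing it with spdx:checksumValue.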

Data Quality

The quality of a dataset plays a pivotal role in shaping trust, reusability, and the overall performance of applications that rely on it. As a result, it is imperative to integrate data quality information seamlessly into both the data publishing and consumption processes. This inclusion allows for a thorough evaluation of a dataset's quality, thereby determining its suitability for a particular application.

Thorough documentation of data quality significantly streamlines the dataset selection process, enhancing the likelihood of reuse. Regardless of domain-specific nuances, documenting data quality and explicitly stating known quality issues in metadata are fundamental practices. Typically, assessing quality involves multiple dimensions, each encapsulating characteristics of importance to both data publishers and consumers.

The Data Quality Vocabulary (DQV) defines machine-readable concepts such as measurements and criteria to assess quality across various dimensions [[VOCAB-DQV]]. Tailored heuristics designed for specific assessment scenarios rely on quality indicators, which encompass data content, metadata, and human ratings. These indicators offer valuable insights into the dataset's suitability for its intended purpose.

In the context of integrating data quality information into DCAT resources (Dataset, Distribution, Data Service, Dataset Series), the Data Quality Vocabulary [[VOCAB-DQV]] provides a structured and standardized way to represent and assess quality information for fitness of use. The key components of DQV relevant to this discussion are dqv:QualityMeasurement, dqv:Metric, dqv:Dimension, and the property dqv:hasQualityMeasurement. Each of these elements is used as follows:

dqv:QualityMeasurement: This class represents a specific measurement or assessment of quality. It's a quantifiable value that indicates how well a dataset performs against a particular quality metric. A dqv:QualityMeasurement instance is associated with a specific dataset and linked to the metric it measures.
dqv:Metric: The dqv:Metric class represents the standard or criterion used to assess a particular aspect of quality. Metrics are the yardsticks against which quality is evaluated. Each metric is typically associated with a quality dimension. For example, a metric could measure the accuracy of data, its timeliness, or its completeness.
dqv:Dimension: The dqv:Dimension class represents the various dimensions or categories of data quality, such as accuracy, timeliness, or completeness. Quality dimensions help categorize different aspects of data quality, providing a framework for comprehensive assessment.
dqv:hasQualityMeasurement: The dqv:hasQualityMeasurement property links a resource to a dqv:QualityMeasurement. It indicates that the resource has been evaluated in terms of quality and specifies the measurement. This linkage conveys the results of quality assessments to data consumers, enabling them to understand which quality aspects have been measured and what the outcomes were.
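The relationships among these four elements can be sketched as a small JSON-LD graph built in Python. This is a minimal illustration, not a normative pattern: the `https://example.org/` IRIs, the metric and dimension names, and the helper `build_quality_graph` are hypothetical. The sketch also uses dqv:computedOn, dqv:isMeasurementOf, dqv:inDimension, and dqv:value from [[VOCAB-DQV]] to connect the measurement to its resource, metric, dimension, and value.

```python
import json

# Namespace IRIs of the published vocabularies used below.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dqv": "http://www.w3.org/ns/dqv#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

# Hypothetical base IRI for the illustrative resources.
EX = "https://example.org/"


def build_quality_graph(dataset_slug, completeness):
    """Return a JSON-LD document linking a dataset to one quality measurement."""
    dataset = EX + "dataset/" + dataset_slug
    measurement = EX + "measurement/" + dataset_slug + "-completeness"
    metric = EX + "metric/populated-cells-ratio"  # hypothetical metric
    dimension = EX + "dimension/completeness"     # hypothetical dimension
    return {
        "@context": CONTEXT,
        "@graph": [
            {
                "@id": dataset,
                "@type": "dcat:Dataset",
                "dqv:hasQualityMeasurement": {"@id": measurement},
            },
            {
                "@id": measurement,
                "@type": "dqv:QualityMeasurement",
                "dqv:computedOn": {"@id": dataset},
                "dqv:isMeasurementOf": {"@id": metric},
                "dqv:value": {"@value": str(completeness), "@type": "xsd:double"},
            },
            {
                "@id": metric,
                "@type": "dqv:Metric",
                "dqv:inDimension": {"@id": dimension},
            },
            {"@id": dimension, "@type": "dqv:Dimension"},
        ],
    }


doc = build_quality_graph("bus-stops", 0.97)
print(json.dumps(doc, indent=2))
```

In practice the metric and dimension IRIs would come from a shared controlled vocabulary rather than being minted per dataset.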

Using these DQV elements, data publishers can document the quality of their datasets in a structured and meaningful way. This documentation includes specific measurements of quality, the criteria used for these assessments, and the quality dimensions they relate to. The use of DQV thus enhances transparency and helps data consumers make informed decisions about the suitability of a dataset for their specific needs.

The use of shareable controlled vocabularies for dqv:Metric and dqv:Dimension is highly encouraged within communities. These standardized vocabularies facilitate consistent and precise communication of data quality aspects across different datasets and applications. By adopting such vocabularies, communities can ensure that their data quality metrics and dimensions are universally understood, enhancing interoperability and the effective use of data across diverse systems and contexts.

Versioning

Versioning is a concept used to describe the relationship between an original resource and its variations, updates, or translations. In this section, we explore how versions resulting from updates or modifications throughout a resource's lifecycle are handled in the DCAT-US 3.0 profile.

DCAT-US 3.0 relies on established vocabularies, including the versioning section of the PAV ontology and terms from [[?PAV]], [[DCTERMS]], [[OWL2-OVERVIEW]], and [[VOCAB-ADMS]].

It's essential to recognize that versioning is applicable to all primary DCAT resources, such as Catalogs, Catalog Records, Datasets, and Distributions. This versioning capability extends across these resource types.

The versioning methodology detailed in DCAT-US 3.0 is designed to complement existing versioning practices, both those specific to certain resource types (for instance, versioning properties for ontologies are detailed in [[OWL2-OVERVIEW]]) and those customary in various domains and communities. Refer to section 11.4 for an analysis of how DCAT's versioning approach aligns with other vocabularies.

Handling Dataset Changes

Web-based datasets are inherently dynamic, with some undergoing scheduled updates and others evolving due to advancements in data collection techniques. To address these varying changes, the creation of new dataset versions is often necessary. The decision to classify changes as a new dataset or a new version of an existing dataset, however, is not universally agreed upon. The following examples illustrate typical scenarios where a new version is generally warranted:

Scenario 1: Adding a new bus stop requires its inclusion in the dataset.
Scenario 2: Eliminating an existing bus stop necessitates its removal from the dataset.
Scenario 3: Correcting a mistake related to a bus stop currently in the dataset.

It's important to note that datasets representing time or spatial series (like annual regional data or weekly weather forecasts) are usually considered separate datasets, each capturing unique observations.

While Scenarios 1 and 2 might lead to significant version updates, Scenario 3 typically results in a minor update. The key is not the scale of the change, but the clarity in marking these changes through version numbering. Keeping a detailed version history is crucial for the integrity of the dataset, especially considering its potential ongoing use by various stakeholders. Publishers are advised to inform users proactively about new versions, particularly for datasets undergoing real-time updates, where automated timestamps can aid in version identification. Ultimately, maintaining a systematic and transparent versioning approach, including the use of semantic versioning, is vital for enabling users to navigate and utilize these evolving datasets effectively.

Version Information

The DCAT-US profile recognizes the importance of associating versioned resources with further details. These details can include aspects like the differences from the original resource (referred to as the version "delta"), the version's name or identifier, and its release date.

To accommodate these details, the DCAT-US 3.0 profile employs several specific properties:

dcat:version (parallel to pav:version [[?PAV]]) - This property is used for denoting the version name or identifier.
dcterms:issued [[DCTERMS]] - Indicates the release date of a particular version.
adms:versionNotes [[VOCAB-ADMS]] - Provides a textual summary of the changes in the version, highlighting any issues of backward compatibility with the previous version of the resource.

Dataset Versions

The versioning of datasets is an essential aspect of data management, facilitating the tracking of changes and updates over time. In DCAT-US 3.0, dataset versioning is primarily managed through the use of properties that identify and describe different versions of a dataset including:

dcterms:hasVersion - Links to a more recent version of the dataset.
dcterms:isVersionOf - Indicates the dataset is a version of another dataset.
dcat:version - Provides the version number or identifier of the dataset.
adms:versionNotes - Describes changes between this version and the previous version of the dataset.

These properties ensure users can easily track dataset evolutions, access different versions, and understand the changes made across versions. Implementing these versioning properties in the DCAT-US profile enhances data discoverability and usability, aligning with best practices in data management.

Version Chains and Hierarchies

The DCAT-US 3.0 profile facilitates the management of version histories and hierarchies through specific properties. These properties help in establishing and navigating the relationships between different versions of a dataset.

The key properties for defining version chains and hierarchies include:

dcat:previousVersion - This property creates a backward navigable chain from a given version to the first one, allowing for the tracking of a dataset's version history.
dcat:hasVersion - Utilized for outlining a version hierarchy by linking an abstract resource to its different versions.
dcat:hasCurrentVersion (a subproperty of dcat:hasVersion) - This property is used to connect an abstract resource to the snapshot representing the current version of its content.

Additionally, the dcat:isVersionOf property (inverse of dcat:hasVersion) can be used to provide a backward link from a version to its abstract resource. The utilization of these properties depends on the specific requirements of the use case.

It's important to note that the essential properties for specifying a version chain and hierarchy are dcat:previousVersion and dcat:hasVersion. The choice to use additional properties is determined by the needs of the relevant use case.

For further guidance on specifying a resource's status, refer to the Resource Life-Cycle section.

The following example, adapted from § 8.6 Data Versioning of [[?DWBP]] demonstrates how to specify a version chain and hierarchy for a bus stops dataset using the properties described in this section.
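As a minimal sketch of such a chain and hierarchy, the Python snippet below builds JSON-LD descriptions for three versions of a bus stops dataset and one abstract resource. All IRIs, version numbers, dates, and notes are hypothetical, and the helpers `version_node` and `walk_back` are illustrative only. The properties are the ones described in this section and under Version Information.

```python
import json

CONTEXT = {
    "adms": "http://www.w3.org/ns/adms#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

BASE = "https://example.org/dataset/"  # hypothetical base IRI
ABSTRACT = BASE + "bus-stops"          # the abstract (version-independent) resource


def version_node(slug, version, issued, notes, previous=None):
    """Describe one version, optionally chained to its predecessor."""
    node = {
        "@id": BASE + slug,
        "@type": "dcat:Dataset",
        "dcat:version": version,
        "dcterms:issued": {"@value": issued, "@type": "xsd:date"},
        "adms:versionNotes": notes,
        "dcat:isVersionOf": {"@id": ABSTRACT},
    }
    if previous is not None:
        node["dcat:previousVersion"] = {"@id": BASE + previous}
    return node


# Hypothetical version history, oldest first.
versions = [
    version_node("bus-stops-1.0", "1.0", "2024-01-15", "Initial release."),
    version_node("bus-stops-1.1", "1.1", "2024-03-01",
                 "Corrected one stop name.", "bus-stops-1.0"),
    version_node("bus-stops-2.0", "2.0", "2024-06-10",
                 "Added and removed stops.", "bus-stops-1.1"),
]

# The hierarchy: the abstract resource points at every version and the current one.
abstract = {
    "@id": ABSTRACT,
    "@type": "dcat:Dataset",
    "dcat:hasVersion": [{"@id": v["@id"]} for v in versions],
    "dcat:hasCurrentVersion": {"@id": versions[-1]["@id"]},
}


def walk_back(start_id, nodes):
    """Follow dcat:previousVersion links from start_id back to the first version."""
    by_id = {n["@id"]: n for n in nodes}
    chain = []
    current = start_id
    while current is not None:
        chain.append(current)
        prev = by_id[current].get("dcat:previousVersion")
        current = prev["@id"] if prev else None
    return chain


history = walk_back(abstract["dcat:hasCurrentVersion"]["@id"], versions)
print(json.dumps({"@context": CONTEXT, "@graph": [abstract] + versions}, indent=2))
print(history)
```

Starting from the current version, `walk_back` recovers the full history using only dcat:previousVersion, which is why that property and dcat:hasVersion suffice for a basic chain and hierarchy.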

Versions Replaced by Other Ones

In the DCAT-US 3.0 profile, a significant type of relationship is the one where a given version replaces or supersedes another. To represent this, DCAT adopts the relevant properties from [[DCTERMS]]:

dcterms:replaces - This property is used when a version supersedes another one.
dcterms:isReplacedBy - Its inverse, this property provides a back link to the newer version that replaces the current one.

It's important to note that these properties do not necessarily indicate a version chain. That is, a version does not automatically replace its immediate predecessor.

To illustrate how these properties can be applied in DCAT-US 3.0, the following example reuses the description of the MyCity bus stop dataset to show how replaced versions can be specified in DCAT.
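A minimal sketch of this pattern in Python, using JSON-LD-style dictionaries: the IRIs and version numbers are hypothetical, and the helper `add_is_replaced_by` is illustrative only. Version 1.1 follows 1.0 in the chain without replacing it, while version 2.0 explicitly replaces 1.1, showing that dcterms:replaces is stated independently of dcat:previousVersion.

```python
BASE = "https://example.org/dataset/"  # hypothetical base IRI

versions = [
    {"@id": BASE + "bus-stops-1.0"},
    {
        "@id": BASE + "bus-stops-1.1",
        # Chained to 1.0, but not declared as its replacement.
        "dcat:previousVersion": {"@id": BASE + "bus-stops-1.0"},
    },
    {
        "@id": BASE + "bus-stops-2.0",
        "dcat:previousVersion": {"@id": BASE + "bus-stops-1.1"},
        # Supersedes 1.1 explicitly.
        "dcterms:replaces": [{"@id": BASE + "bus-stops-1.1"}],
    },
]


def add_is_replaced_by(nodes):
    """Add the inverse dcterms:isReplacedBy link for every dcterms:replaces link."""
    by_id = {n["@id"]: n for n in nodes}
    for node in nodes:
        for target in node.get("dcterms:replaces", []):
            back_links = by_id[target["@id"]].setdefault("dcterms:isReplacedBy", [])
            link = {"@id": node["@id"]}
            if link not in back_links:  # keep the operation idempotent
                back_links.append(link)
    return nodes


add_is_replaced_by(versions)
for node in versions:
    print(node["@id"], node.get("dcterms:isReplacedBy", []))
```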

Resource Life-Cycle

The life-cycle of a resource, while distinct from versioning, is often closely related to it. The evolution of a resource through its life-cycle stages—conception, creation, publication—may lead to new versions, though not invariably (e.g., resources passing through an approval workflow without revisions). Conversely, creating a new version does not always signify a life-cycle status change, such as in cases of minor updates or resources still under development.

The life-cycle status of a resource holds significant value, informing data consumers about its developmental stage, deprecation, or withdrawal, and indicating whether a new version is available. For data providers, marking a resource with its life-cycle status is crucial for managing data workflows, such as ensuring a resource is stable and appropriately flagged before publication.

Resource life-cycle management varies depending on community practices, data management policies, and workflows. This variation extends to different resource types (e.g., datasets vs. catalog records), which may follow distinct life-cycle statuses.

DCAT utilizes the adms:status property [[VOCAB-ADMS]] to specify life-cycle statuses, supplemented by relevant [[DCTERMS]] time-related properties (e.g., dcterms:created, dcterms:dateSubmitted). However, the DCAT-US profile does not mandate specific life-cycle statuses, instead deferring to standards and practices suitable for each application scenario and community of practice.

Dataset Series

A Dataset Series is a collection of related datasets that share common characteristics, making them part of a cohesive group. This section provides guidance on the effective use of Dataset Series within data catalogs, emphasizing the benefits and considerations for publishers and users alike.

A Dataset Series is a way for publishers to convey that a dataset is evolving across specific dimensions and is available as a set of related datasets. However, choosing to group datasets this way depends on the use case. Since it demands extra metadata management from the publisher, it's optional. For instance, a dataset updated frequently via an API may not require individual records for each yearly snapshot unless the publisher wishes to share each snapshot's lifecycle.

Why Use Dataset Series?

Implementing Dataset Series offers several advantages:

Organizational Clarity: Helps categorize and group datasets, making it easier for users to find and navigate related sets of data.
Efficient Data Management: Streamlines the management of multiple datasets, providing a structured approach for updates and maintenance.
User Experience: Enhances data discoverability and understanding, as users can perceive the broader context of individual datasets within a collective series.

Guidelines for Implementing Dataset Series

When using Dataset Series, consider the following best practices:

Initiate a Dataset Series exclusively for managing multiple, interconnected datasets, ensuring each dataset is significant independently and contributes to the series' overall narrative.
Maintain up-to-date metadata for the Dataset Series, reflecting any addition or removal of datasets. Consider discontinuing the series if it no longer contains any datasets, particularly when persistent identifiers are employed.
Refrain from categorizing a single, frequently updated dataset as a Dataset Series, and avoid associating distributions directly with a series. Distributions pertain to individual datasets within the series.
Ensure a coherent and strong thematic or contextual connection among the members of a Dataset Series, defined by shared attributes such as topic, time frame, or publisher, among others.
Uphold high-quality metadata standards for both individual datasets and the Dataset Series, with specific series guidelines superseding general practices where necessary.

Expressing Relationships and Connections

Articulating the interconnections between datasets in a series is crucial for user understanding and data management:

Employ consistent metadata descriptors to clarify the relationships and commonalities within the series.
Utilize versioning for datasets that evolve or expand over time, helping users track changes and understand the dataset's history.
Highlight the distinct features of each dataset, ensuring its standalone value is clear, while also emphasizing its role in the broader series.
For more complex relationships, especially in automated or tightly interconnected collections, leverage specific DCAT properties (e.g., dcat:inSeries, dcat:first, dcat:prev, dcat:next, dcat:last) to express the nuanced connections. Refer to the DCAT versioning guidelines for detailed practices.

Impact on Metadata

Being part of a Dataset Series may necessitate specific metadata considerations:

Adjust metadata to emphasize the unique aspects of each dataset within the series, such as different time periods, geographical areas, or methodologies.
Ensure that metadata reflects the cohesive nature of the series, helping users understand the context and relationship between individual datasets.

How to specify dataset series

The DCAT-US profile makes dataset series first-class citizens of data catalogs by using the new [[VOCAB-DCAT-3]] class dcat:DatasetSeries, defined as a subclass of dcat:Dataset. Datasets are linked to their dataset series by using the property dcat:inSeries.

Note that a dataset series can also be hierarchical, and a dataset series can be a member of another dataset series.

Dataset series may evolve over time by acquiring new datasets. For example, a dataset series about yearly budget data will acquire a new child dataset every year. In such cases, it might be important to link the yearly releases with relationships specifying the first, previous, and latest ones, for which DCAT provides the properties dcat:first, dcat:prev, and dcat:last, respectively. A link to the next release corresponds to the inverse of dcat:prev.
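The yearly budget scenario can be sketched in Python as JSON-LD-style dictionaries. This is an illustration under assumptions: the IRIs and the helper `build_series` are hypothetical, and placing dcat:first and dcat:last on the series node is one reasonable reading of how those properties are applied.

```python
import json

BASE = "https://example.org/dataset/"  # hypothetical base IRI


def build_series(series_slug, member_slugs):
    """Return the series node plus member nodes; member_slugs is ordered oldest first
    and must contain at least one entry."""
    series_id = BASE + series_slug
    member_ids = [BASE + slug for slug in member_slugs]
    series = {
        "@id": series_id,
        "@type": "dcat:DatasetSeries",
        "dcat:first": {"@id": member_ids[0]},
        "dcat:last": {"@id": member_ids[-1]},
    }
    members = []
    for index, member_id in enumerate(member_ids):
        node = {
            "@id": member_id,
            "@type": "dcat:Dataset",
            "dcat:inSeries": {"@id": series_id},
        }
        if index > 0:
            # Every release except the first points back to its predecessor.
            node["dcat:prev"] = {"@id": member_ids[index - 1]}
        members.append(node)
    return [series] + members


graph = build_series("budget", ["budget-2021", "budget-2022", "budget-2023"])
print(json.dumps(graph, indent=2))
```

When a new yearly release is added, only the new member node and the series node's dcat:last value need to change; existing members keep their dcat:prev links.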

Controlled Vocabularies

Importance of Controlled Vocabularies

Controlled vocabularies are predetermined sets of terms that have been carefully curated to ensure consistency, accuracy, and standardized representation of concepts within a specific domain. In the context of DCAT-US, controlled vocabularies are used to define and constrain the values of specific metadata elements. These vocabularies enable the creation of a common language for describing datasets, facilitating data integration and harmonization across different repositories.

The use of controlled vocabularies in DCAT-US offers several key benefits:

Consistency: By providing a predefined list of terms, controlled vocabularies ensure consistent representation and labeling of metadata elements. This consistency promotes data interoperability and simplifies data integration efforts, as different datasets can be mapped to a shared set of controlled terms.
Enhanced search and discovery: Controlled vocabularies enable more effective search and discovery of datasets. By aligning metadata elements with standardized terms, users can easily navigate and explore datasets based on their specific domain knowledge. Furthermore, controlled vocabularies facilitate the development of advanced search capabilities, such as faceted search, which allows users to refine search results based on predefined categories or facets.
Data harmonization: In a diverse data landscape where multiple agencies and organizations produce and manage datasets, controlled vocabularies help in harmonizing the data representation. By agreeing on a set of controlled terms, data publishers can ensure that similar concepts are represented consistently across different datasets. This harmonization promotes data integration and interoperability, enabling meaningful analysis and comparison of data from various sources.

Requirements for controlled vocabularies

The following is a list of requirements that were identified for the controlled vocabularies to be recommended in this Application Profile.

Controlled vocabularies SHOULD:

Be published under an open license.
Be operated and/or maintained by an agency of the US Government, by a recognised standards organization or another trusted organization.
Be properly documented.
Have labels in English, and optionally in Spanish.
Contain a relatively small number of terms (e.g. 10-25) that are general enough to enable a wide range of resources to be classified.
Have terms that are identified by URIs with each URI resolving to documentation about the term.
Have associated persistence and versioning policies.

These criteria do not intend to define a set of requirements for controlled vocabularies in general; they are only intended to be used for the selection of the controlled vocabularies that are proposed for this Application Profile.

Controlled vocabularies to be used

In the table below, a number of properties are listed with controlled vocabularies that MUST be used for the listed properties. The declaration of the following controlled vocabularies as mandatory ensures a minimum level of interoperability.

Compared with [[?DCAT-AP-20200608]], DCAT-US makes use of additional controlled vocabularies mandated by [[?DATA-GOV-REG]] and operated by the Data.gov Registry, with the sole exception of the coordinate reference systems register maintained by OGC [[?OGC-EPSG]].

For two of these controlled vocabularies, namely the NGDA spatial data themes [[?NGDA-THEMES]] and the ISO topic categories [[?ISO-19115-1]], the DCAT-US Working Group has defined a set of harmonised mappings to the Data.gov Vocabularies Data Themes [[?DATA-GOV-THEME]] (TBD), in order to facilitate the identification of the relevant theme in [[?DATA-GOV-THEME]] for geospatial/statistical metadata.

Other controlled vocabularies

In addition to the proposed common vocabularies, which are mandatory to ensure minimal interoperability, implementers are encouraged to publish and to use further region- or domain-specific vocabularies that are available online. While those may not be recognised by general implementations of the Application Profile, they may serve to increase interoperability across applications in the same region or domain. Examples are the full set of concepts in the Global Change Master Directory (GCMD) [[?GCMD]] and numerous other schemes.

For geospatial metadata, the working group has identified the following additional vocabularies:

Geographic identifiers:
- For marine regions:
  - Marine Regions http://www.marineregions.org/
  - SeaVoX salt and fresh water body gazetteer - https://www.bodc.ac.uk/data/codes_and_formats/seavox/
- General:
  - DBpedia for Geographic Placenames - http://dbpedia.org/about
  - National gazetteer vocabularies where feasible
  - SeaVoX salt and fresh water body gazetteer for ‘marine geonames’ - https://www.bodc.ac.uk/data/codes_and_formats/seavox/
Keywords (with controlled vocabularies):

JSON-LD context file

One common technical question is the format in which the data is being exchanged. For DCAT-US 3.0 conformance, it is not mandatory that this happens in an RDF serialisation, but the exchanged format SHOULD be unambiguously transformable into RDF. For JSON, a popular format for exchanging data between systems, the DCAT-US profile provides a [JSON-LD context file](https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld). JSON-LD is a W3C Recommendation [[[json-ld11]]] that provides a standard approach to interpreting JSON structures as RDF. Implementers can base their data exchange on the provided JSON-LD context file and so create a DCAT-US conformant data exchange. This JSON-LD context is not normative, i.e. other JSON-LD contexts may be used to create a conformant DCAT-US data exchange. The JSON-LD context file can be downloaded from the link above.
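To show how a context lets plain JSON keys be read as RDF property IRIs, the sketch below applies a deliberately simplified expansion in Python. It is not a JSON-LD processor and does not reflect the contents of the DCAT-US context file: the inline context, its `title` term, and the helpers `expand_iri` and `expand_keys` are hypothetical, and only flat string mappings are handled.

```python
# Hypothetical inline context: two prefixes and one short term.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
}


def expand_iri(term, context):
    """Expand a short term or prefixed name to a full IRI (simplified)."""
    if term.startswith("@"):
        return term  # JSON-LD keywords such as @id are left untouched
    mapped = context.get(term, term)
    prefix, sep, suffix = mapped.partition(":")
    # Treat "prefix:suffix" as a compact IRI only when the prefix is declared
    # and the value is not already an absolute IRI such as "http://...".
    if sep and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return mapped


def expand_keys(doc, context):
    """Return a copy of doc with every top-level key expanded."""
    return {expand_iri(key, context): value
            for key, value in doc.items() if key != "@context"}


record = {
    "@context": CONTEXT,
    "@id": "https://example.org/dataset/bus-stops",
    "title": "Bus stops",
    "dcat:version": "1.1",
}

expanded = expand_keys(record, CONTEXT)
for key, value in expanded.items():
    print(key, "=>", value)
```

A conformant exchange would rely on a full JSON-LD 1.1 implementation, which also handles nested contexts, typed values, and language maps that this sketch ignores.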

JSON Schemas

For JSON, which is a widely adopted format for data exchange between systems, the DCAT-US profile offers an informative JSON Schema. This schema aids in understanding the structure expected for DCAT-US compliant data exchanges in JSON format.

JSON Schema offers a compact way to describe and validate the structure and content of JSON data, ensuring specific formatting and value constraints. However, it's more limited than JSON-LD context and RDF serialization due to its focus on structure over meaning.

JSON Schema's focus on structural validation forms a contrast with JSON-LD and RDF's capabilities. JSON-LD and RDF go beyond just validation, allowing the creation of a graph of interconnected entities that can be easily integrated and reused across various contexts. This interconnectedness is fundamental to the concept of the semantic web, where data is not only readable but also comprehensible to machines.

Specifically, JSON-LD facilitates the representation of data as a graph, making it suitable for more complex, interlinked data representations, which is a cornerstone of linked data systems. This graph-based approach stands in contrast to the tree-like structures that JSON Schema is confined to, limiting its utility in scenarios requiring extensive data interconnectivity and reusability.

Implementers can use the provided JSON Schema for their data exchanges, aligning with DCAT-US standards. However, it's non-normative, meaning alternatives creating compliant exchanges are also valid. Download the current JSON Schema here.

SHACL Validation

In order to verify whether a catalog adheres to the stipulated constraints in this Application Profile, the constraints are articulated utilizing SHACL [[?SHACL]]. All constraints in this specification that were amenable to SHACL expression translation have been incorporated. Consequently, this set of SHACL expressions can be employed to construct a validation check for data exchange between two systems, a common scenario being one catalog being harvested into another.

For example, it may be recognized that the data being exchanged doesn't include the organizations' details since they are uniquely identified by a dereferenceable URI. In this scenario, enforcing rules about the mandatory presence of a name for each organization may not be pertinent. Rigorously applying the DCAT-US SHACL expressions would trigger errors, even though the data is accessible via an alternative route. In this context, it's acceptable to omit this check during the validation phase.

This example underscores that to achieve an optimal user experience during a validation process, it's crucial to consider the actual data transferred between systems and apply only the constraints relevant to the data exchange. To facilitate this, the SHACL expressions are organized into separate files, aligning with common validation configurations.
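The idea of applying only the relevant constraints can be illustrated with a plain-Python stand-in. This sketch does not use SHACL or any SHACL engine: the two checks, their names, and the use of foaf:Organization and foaf:name are assumptions made for illustration, standing in for shape files that a validation configuration would include or leave out.

```python
def organization_has_name(node):
    """Stand-in for a constraint requiring a name on every organization."""
    if node.get("@type") == "foaf:Organization" and "foaf:name" not in node:
        return ["organization %s has no name" % node["@id"]]
    return []


def dataset_has_title(node):
    """Stand-in for a constraint requiring a title on every dataset."""
    if node.get("@type") == "dcat:Dataset" and "dcterms:title" not in node:
        return ["dataset %s has no title" % node["@id"]]
    return []


# Hypothetical registry of named checks, analogous to separate shape files.
CHECKS = {
    "organization-name": organization_has_name,
    "dataset-title": dataset_has_title,
}


def validate(nodes, skip=()):
    """Run every check whose name is not listed in skip."""
    messages = []
    for name, check in CHECKS.items():
        if name in skip:
            continue
        for node in nodes:
            messages.extend(check(node))
    return messages


# The publisher is referenced only by its dereferenceable URI.
harvested = [
    {
        "@id": "https://example.org/dataset/bus-stops",
        "@type": "dcat:Dataset",
        "dcterms:title": "Bus stops",
        "dcterms:publisher": {"@id": "https://example.org/org/transit"},
    },
    {"@id": "https://example.org/org/transit", "@type": "foaf:Organization"},
]

strict = validate(harvested)
relaxed = validate(harvested, skip={"organization-name"})
print(strict)
print(relaxed)
```

The strict run reports the missing organization name, while the relaxed run, configured for an exchange in which organization details are resolved elsewhere, reports nothing.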

The SHACL application profile for DCAT-US can be found here

Namespaces

Namespaces and prefixes used in normative parts of this recommendation are shown in the following table:

| Prefix | Namespace IRI | Source |
| --- | --- | --- |
| `adms` | `http://www.w3.org/ns/adms#` | [[VOCAB-ADMS]] |
| `cnt` | `http://www.w3.org/2011/content#` | [[Content-in-RDF10]] |
| `dcat` | `http://www.w3.org/ns/dcat#` | [[VOCAB-DCAT]] |
| `dcat-us` | `http://resources.data.gov/ontology/dcat-us#` | [[DCAT-US]] |
| `dct` | `http://purl.org/dc/terms/` | [[DCTERMS]] |
| `dqv` | `http://www.w3.org/ns/dqv#` | [[VOCAB-DQV]] |
| `foaf` | `http://xmlns.com/foaf/0.1/` | [[FOAF]] |
| `gsp` | `http://www.opengis.net/ont/geosparql#` | [[GeoSPARQL]] |
| `locn` | `http://www.w3.org/ns/locn#` | [[LOCN]] |
| `org` | `http://www.w3.org/ns/org#` | [[VOCAB-ORG]] |
| `prov` | `http://www.w3.org/ns/prov#` | [[PROV]] |
| `rdf` | `http://www.w3.org/1999/02/22-rdf-syntax-ns#` | [[RDF-SYNTAX-GRAMMAR]] |
| `rdfs` | `http://www.w3.org/2000/01/rdf-schema#` | [[RDF-SCHEMA]] |
| `schema` | `http://schema.org/` | [[schema-org]] |
| `sdmx-attribute` | `http://purl.org/linked-data/sdmx/2009/attribute#` | [[?SDMX-ATTRIBUTE]] |
| `skos` | `http://www.w3.org/2004/02/skos/core#` | [[SKOS-REFERENCE]] |
| `spdx` | `http://spdx.org/rdf/terms#` | [[SPDX]] |
| `vcard` | `http://www.w3.org/2006/vcard/ns#` | [[VCARD-RDF]] |
| `xsd` | `http://www.w3.org/2001/XMLSchema#` | [[XMLSCHEMA11-2]] |