DCAT-US - Version 3.0
Data Catalog Application Profile for the United States of America
Candidate Recommendation Snapshot
The DCAT-US 3.0 Profile (DCAT-US 3.0) is an updated specification designed to facilitate data cataloging, discovery, and interoperability among US government agencies. Building on the Project Open Data (POD) 1.1 standard (also known as [[DCAT-US-1.1]]), this profile aligns with the Data Catalog Vocabulary (DCAT) - Version 3 (DCAT 3) [[VOCAB-DCAT-3]] recommendation of the World Wide Web Consortium (W3C) and maintains compatibility with the existing POD 1.1 standard to ease the transition. It also upholds the FAIR principles: Findability, Accessibility, Interoperability, and Reusability.
DCAT-US 3.0 serves as a bridge between the well-established DCAT-US 1.1 and DCAT 3, uniting them under a single, standardized approach for describing and exchanging datasets. By harmonizing the most significant attributes of both standards, this profile also addresses the metadata requirements specific to the US context. It adds specialized properties for geospatial and statistical datasets, reusing established vocabularies to improve data sharing and subsequent reuse.
DCAT-US 3.0 uses the Shapes Constraint Language (SHACL) [[?SHACL]] for structural and semantic validation, providing an interoperable framework for describing and validating dataset metadata. Precise metadata description supports the efficient flow of information and lays the groundwork for sustained innovation.
Background
The FAIRness Project is introducing a draft update to the Data Catalog (DCAT) standard for the United States. This update, “DCAT-US 3.0 Schema,” builds upon the requirements we received from agencies as well as data creators, providers, and users, Data Inventory statutory requirements, and the lessons learned over ten years of successful implementation of the Project Open Data Metadata Standard (DCAT-US v1.1) used by Data.gov.
We need your help to review and comment on this draft so that it meets agencies' data inventory needs and those of cross-government programs like Data.gov, GeoPlatform, and the Standard Application Process Portal.
Once approved and implemented, the update will improve the FAIRness, or Findability, Accessibility, Interoperability, and Reusability of all types of federal data. DCAT-US 3.0 will provide a *single* metadata standard able to support most requirements for documentation of business, technical, statistical, and geospatial data consistently.
The DCAT-US 3.0 Schema introduces the following key enhancements:
- DCAT-US 3.0 represents a "profile" of the W3C DCAT standard, specifically aligned with [[VOCAB-DCAT-3]], rather than a new standard. It aims to tailor the DCAT specification to meet specific use cases and requirements within the United States.
- It ensures backward compatibility with DCAT-US v1.1 metadata, facilitating seamless integration. While DCAT-US 3.0 introduces additional metadata elements to address emerging needs, it preserves existing elements (apart from correcting errors introduced in DCAT-US 1.1), so existing metadata does not need to be translated.
- The schema extends support for enriched and updated controlled vocabularies, enhancing the precision in naming conventions for federal agencies, file formats, and units of measurement, thereby promoting uniformity across datasets.
- DCAT-US 3.0 resolves the constraints encountered with DCAT-US v1.1 in documenting geospatial data. This removes the need for a separate federal standard dedicated to geospatial datasets and streamlines documentation.
- Aligning with practices akin to the European Data Catalogue Application Profile (DCAT-AP), DCAT-US 3.0 has garnered vendor support, with ongoing efforts to broaden this support base, indicating a commitment to interoperability and standard adoption.
Please review the documentation below and provide feedback to help make this standard as useful as possible to you and the broader federal data user community.
Please follow the instructions found here to submit your comments and issues with the current draft schema specification.
Overview
The DCAT-US 3.0 Profile is a comprehensive update to the Project Open Data (POD) 1.1 standard, designed to meet the evolving needs of data exchange and interoperability among US government agencies. This profile builds on the foundation laid by POD 1.1 and is aligned with the latest DCAT 3 standard from the World Wide Web Consortium (W3C). In addition, the profile aims to embody the FAIR principles, ensuring that data is Findable, Accessible, Interoperable, and Reusable. This introduction will provide an overview of the purpose of this profile, highlight the gaps between POD 1.1 and DCAT 3, and elaborate on the differences and enhancements offered by the DCAT-US 3.0 profile.
Purpose and Evolution
The purpose of the DCAT-US 3.0 Profile is to improve data discoverability, accessibility, and interoperability among US government agencies. By adhering to the FAIR principles, the profile promotes more effective data sharing and reuse. The FAIR principles emphasize that data should be:
- Findable: Easy to discover by humans and machines.
- Accessible: Easily retrieved, with well-defined access protocols.
- Interoperable: Ready for use with other data sources and applications.
- Reusable: Clearly documented and licensed, facilitating reuse by others.
The DCAT-US 3.0 Profile bridges the gap between the POD 1.1 and DCAT 3 standards by incorporating the best features of both while also addressing specific metadata requirements unique to the US context. It offers a standardized approach for describing and exchanging datasets, thereby enabling more efficient data sharing and reuse.
Data Structure
The Application Profile specified in this document is based on the specification of the Data Catalog Vocabulary Version 3 (DCAT 3) [[VOCAB-DCAT-3]] developed under the responsibility of the W3C Dataset Exchange Working Group (DXWG). DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. Additional classes and properties from other well-known vocabularies are re-used where necessary.
The DCAT vocabulary consists of classes and properties.
Classes are things on the internet: Not all of them have URIs, but it is recommended to provide a URI for them. They are complex things like a person, an organization, a dataset, a website or a downloadable data file.
Classes have properties: The properties are the attributes describing these things. Some properties occur in more than one class, a title for example is a common attribute. Other properties are very specialized such as a file format that only makes sense for a data file.
Properties can be simple or complex: Some properties take an instance of a class as their value. For example, an organization can have a website, or a dataset can have a data publisher. In general, a class can be recognized by its spelling: a property name starts with a lowercase letter, such as dcat:dataset, while a class name starts with a capital letter, such as dcat:Dataset.
Classes and properties are used to deliver the metadata in a structured way.
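The naming convention described above can be sketched in a few lines of Python. This is only an illustrative heuristic based on the spelling rule, not part of the profile: the function name `classify_term` is hypothetical, and a term's actual type is determined by its vocabulary definition rather than by its capitalization.

```python
def classify_term(curie: str) -> str:
    """Guess whether a prefixed name is a class or a property from its spelling.

    Hypothetical helper: it applies only the convention that class names start
    with a capital letter and property names with a lowercase letter.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or not local:
        raise ValueError(f"not a prefixed name: {curie!r}")
    return "class" if local[0].isupper() else "property"


# dcat:Dataset is the class; dcat:dataset is the property linking a catalog to a dataset.
for term in ["dcat:Dataset", "dcat:dataset", "dcat:Distribution", "dcat:distribution"]:
    print(term, "->", classify_term(term))
```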
Application Areas
The DCAT Application Profile for data portals in the United States (DCAT-US) is an Application Profile of the DCAT vocabulary.
The “Data Catalog Vocabulary” (DCAT) is a semantic language for describing data by means of an RDF vocabulary. It allows for a decentralized and interoperable approach for publishing and finding data by use of a common language for describing data.
DCAT is a generic language that can be applied in various contexts. An Application Profile applies the DCAT vocabulary to a specific domain, context, or application, with the goal of facilitating data discovery, access, and sharing. The DCAT-US application profile addresses specific requirements of the U.S. Government (laws, regulations, and policies) and other needs of U.S. data producers and consumers.
The DCAT-US Application Profile provides the guidance needed for data publishers to specify their data catalogs and for data portal managers to process data catalogs. DCAT-US specifies the schema for metadata -- data describing data that can be validated for correctness and completeness. Metadata is by definition secondary information about the data: when and by whom were they published, which usage conditions apply, how often are they updated, whom to contact about them, and where and how can they be accessed.
Gaps with DCAT-US 1.1
The DCAT-US 1.1 standard, while effective for its time, had some limitations that the DCAT 3 standard has addressed. The key differences between the two standards include:
- Increased expressiveness: DCAT 3 offers a richer set of classes and properties, enabling more detailed descriptions of datasets and their relationships.
- Improved support for geospatial data: DCAT 3 provides better support for describing geospatial datasets, including coordinate reference systems and geometry.
- Enhanced handling of statistical data: DCAT 3 and the RDF Data Cube (QB) specification introduce new vocabulary terms to describe statistical datasets and their dimensions more effectively.
- Refined vocabularies: DCAT 3 benefits from updated reference controlled vocabularies to ensure better interoperability between agencies and systems.
DCAT-US Features
The DCAT-US 3.0 Profile not only incorporates the enhancements provided by DCAT 3 but also maintains the US-specific metadata requirements defined in POD 1.1. This profile offers a harmonized approach to data cataloging that accounts for the unique needs of US agencies.
One of the key features of this profile is its use of reference controlled vocabularies. These vocabularies enable better interoperability between US agencies by providing a common language for describing datasets. The profile also introduces new properties to handle geospatial data and statistical datasets, leveraging established vocabularies in these domains.
The DCAT-US specification introduces several key features designed to enhance the accessibility, interoperability, and effectiveness of data cataloging practices. Below, we outline the advantages of adopting DCAT-US over traditional document-centric metadata standards, such as ISO 19115.
- Enhanced Interoperability and Integration: DCAT-US is engineered for maximum interoperability with web technologies and data catalogs, facilitating the sharing and discovery of datasets across diverse platforms. This level of integration surpasses the capabilities of document-centric standards such as ISO 19115, enabling broader visibility and usability of data assets.
- Flexibility and Extensibility: Offering a flexible and extensible framework, DCAT-US adapts to the changing requirements of data publishers and consumers. Its ability to incorporate new metadata types ensures that the specification remains relevant in the face of technological advancements, a critical advantage over the more static metadata standards.
- Web-Friendly and User-Centric: With its modern web-oriented design, DCAT-US enhances data accessibility and usability through support of Linked Data formats such as Turtle and JSON-LD, making datasets more discoverable and consumable for a wide range of users. This approach marks a significant improvement over the document-centric standards, prioritizing ease of use and efficiency.
- Alignment with Global Standards: By aligning with the W3C's Data Catalog Vocabulary (DCAT), DCAT-US adheres to globally recognized standards, facilitating international data sharing and collaboration. This global perspective is essential for transcending the limitations of more narrowly focused, schema-based standards that define only syntax and structure.
- Cost-Effective and Efficient Data Management: The adoption of DCAT-US promotes cost-effective and efficient data management practices. Its emphasis on standardization and interoperability reduces the complexities and costs associated with data integration, leveraging web infrastructure for data publication and consumption.
In conclusion, DCAT-US represents a forward-looking solution that advances beyond traditional, rigid, document-centric metadata standards. Its design and features cater to the demands of contemporary data management and publishing, ensuring that data assets are more visible, accessible, and valuable to users across the data ecosystem.
Profile Encoding
The encoding of the DCAT-US profile involves the technical aspects of how data is represented and exchanged, addressing questions about data format and interoperability. While DCAT-US 3.0 conformance does not strictly mandate the use of an RDF serialization for data exchange, it does require that the exchanged format can be unambiguously transformed into RDF. This flexibility allows for interoperability while accommodating various data exchange requirements.
One prevalent format for data exchange between systems is JSON (JavaScript Object Notation), which is widely used due to its simplicity and human-readable nature. To facilitate data exchange in JSON while adhering to the DCAT-US profile, a dedicated mechanism is provided: the JSON-LD context file. JSON-LD 1.1 (JSON for Linked Data) is a W3C Recommendation [[?JSON-LD]] that establishes a standardized approach for interpreting JSON structures as RDF, enhancing the potential for semantic integration and interoperability.
The DCAT-US profile offers a [[?JSON-LD]] context file that implementers can utilize as a foundation for their data exchange processes. By incorporating this JSON-LD context file, implementers can ensure that their data adheres to the DCAT-US standards while being exchanged in a JSON format. This allows for a coherent and consistent representation of the data that aligns with the RDF model, promoting interoperability among different systems and tools.
It's important to note that the provided JSON-LD context file is not normative, indicating that other JSON-LD contexts can also be used to establish a conformant DCAT-US data exchange. This flexibility caters to various implementation scenarios and data requirements, while still adhering to the overarching principles of the DCAT-US profile. Overall, the encoding of the DCAT-US profile acknowledges the significance of data format and interchange methods, leveraging JSON-LD and related mechanisms to facilitate seamless and interoperable data exchange within the context of the DCAT-US specification.
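To illustrate how a JSON-LD context lets plain JSON keys be read as RDF terms, the following minimal Python sketch builds a small record and resolves its keys to full IRIs. The context shown is a hypothetical stand-in written for this example and is not the context file published with DCAT-US; the record values and the helper `expand_term` are likewise assumptions. Only the two namespace IRIs (for `dcat` and `dcterms`) are the standard ones.

```python
import json

# Hypothetical, minimal context for illustration only;
# NOT the JSON-LD context file provided with the DCAT-US profile.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "description": "dcterms:description",
    "Dataset": "dcat:Dataset",
}

# Placeholder record: ordinary JSON keys plus the JSON-LD keywords @context, @id, @type.
dataset = {
    "@context": CONTEXT,
    "@id": "https://example.org/dataset/1",
    "@type": "Dataset",
    "title": "Example dataset",
    "description": "A placeholder record used only for this sketch.",
}


def expand_term(term: str, context: dict) -> str:
    """Resolve a JSON key to a full IRI using the context's simple string mappings.

    Simplified stand-in for JSON-LD expansion: it handles only term -> compact IRI
    -> prefix lookups, which is all this example context uses.
    """
    value = context.get(term, term)
    prefix, sep, local = value.partition(":")
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return value


print(json.dumps(dataset, indent=2))
for key in ("title", "description", "Dataset"):
    print(key, "->", expand_term(key, CONTEXT))
```

In practice a JSON-LD processor performs this expansion (and much more); the sketch only shows why a record exchanged as JSON can still be interpreted unambiguously as RDF once a context is attached.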
Profile Validation
While the JSON Schema approach used in POD 1.1 was effective in certain scenarios, it has limitations when compared to using SHACL for defining data models and constraints:
- Expressiveness: JSON Schema primarily targets validation of JSON data structures, which can be less expressive than RDF data models used with SHACL. RDF enables a more flexible and semantically rich representation of data, while SHACL is designed to provide constraints and validation for RDF data.
- Linked Data Compatibility: JSON Schema is not specifically designed for Linked Data or semantic web applications. SHACL, on the other hand, is tailored for RDF and Linked Data, making it a more natural fit for data models that need to interoperate with other semantic web resources and standards.
- Inferencing Support: SHACL can be used in conjunction with RDF reasoners to validate inferred triples, offering advanced inferencing capabilities beyond the scope of JSON Schema. This feature enables more powerful and intelligent data validation.
- Extensibility: SHACL is designed to be extensible, allowing users to define custom constraint components, which can be reused across multiple shapes and datasets. In contrast, JSON Schema has a fixed set of validation keywords, and extending it may require the introduction of non-standard or custom keywords, potentially affecting interoperability.
Considering these limitations, the DCAT-US 3.0 Profile has chosen SHACL as the foundation for its data modeling and validation, ensuring a more expressive, interoperable, and future-proof framework for defining dataset metadata.
The DCAT-US 3.0 Profile is defined using the Shapes Constraint Language (SHACL), which offers several advantages over previous approaches:
- Expressive and flexible: SHACL allows for the precise definition of constraints on RDF data, making it easier to validate and verify datasets.
- Scalable and efficient: SHACL is designed for efficient validation, making it suitable for use with large datasets and complex data models.
- Widely adopted: As a W3C Recommendation, SHACL enjoys broad support among data management tools and libraries, facilitating interoperability and reuse.
By using [[?SHACL]], the DCAT-US 3.0 Profile ensures a robust and extensible foundation for future updates, as well as compatibility with a wide range of data processing tools and applications.
Document Status
Candidate Recommendation Snapshot
Data Provider requirements
In order to conform to this Application Profile, an application that provides metadata MUST:
- Provide a description of the Catalog, including at least the mandatory properties specified for the class Catalog.
- Provide information for the mandatory properties for [Catalog Records](#properties-for-catalog-record), if descriptions of Catalog Records are provided. Note that providing descriptions of Catalog Records is optional.
- Provide descriptions of Datasets in the Catalog, including at least the mandatory properties for the class Dataset.
- Provide descriptions of Distributions, if any, of Datasets in the Catalog, including at least the mandatory properties for the class Distribution.
- Provide descriptions of Data Services, if any, of Datasets in the Catalog, including at least the mandatory properties for the class Data Service.
- Provide descriptions of all organizations involved in the descriptions of the Catalog and Datasets, including at least the mandatory properties for the class Agent.
- Apply the publication requirements for the controlled vocabularies as mentioned in section [[[#controlled-vocabularies]]].
For the properties listed in the table in section [[[#controlled-vocabularies]]], the associated controlled vocabularies MUST be used. Additional controlled vocabularies MAY be used. In addition to the mandatory properties, any of the recommended and optional properties defined in each class description MAY be provided.
Receiver requirements
In order to conform to this Application Profile, an application that receives metadata MUST be able to:
- Process RDF catalogs that conform to DCAT-US.
- Process information for all classes and properties specified in the class descriptions.
- Process information for all controlled vocabularies specified in section [[[#controlled-vocabularies]]].
Properties not mentioned in this specification MAY be used if they are included in DCAT 3 and their usage conforms to DCAT 3.
"Processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).
DCAT-US Classes
This section displays the classes for the DCAT-US 3 profile. We distinguish core classes, which represent the primary business entities that the application profile is concerned with, from supporting classes, which are used to provide additional context, metadata, or structure to the core classes.
The following table summarizes the change types used to describe how class definitions in the DCAT-US 3.0 Application Profile have evolved, from new classes introduced specifically for DCAT-US 3.0 to classes adopted from the broader DCAT specifications (DCAT 1, DCAT 2, and DCAT 3). Understanding these change types helps data practitioners see how the profile aligns with the various DCAT versions.
| Change Type | Description |
| --- | --- |
| New! | New DCAT-US 3.0 specific class that is not referred in DCAT specifications |
| Aligned | Class introduced in DCAT specifications that does not exist in DCAT-US 1.1 |
Core Classes
The DCAT-US Application Profile (“DCAT-US”) is structured around the following main classes:
| Class name | Usage note for the Application Profile | URI and Reference | Changes from DCAT-US 1.1 |
| --- | --- | --- | --- |
| Catalog | A catalog or repository that hosts the Datasets or Data Services being described. | dcat:Catalog | Aligned |
| Catalog Record | A record in a catalog, describing the registration of a single dcat:Resource | dcat:CatalogRecord | Aligned |
| Dataset | A conceptual entity that represents the information published. | dcat:Dataset | Aligned |
| Distribution | A physical embodiment of the Dataset in a particular format. | dcat:Distribution | Aligned |
| Data Service | A collection of operations that provides access to one or more datasets or data processing functions. | dcat:DataService | Aligned |
| Dataset Series | A collection of datasets that are published separately, but share some characteristics that group them. | dcat:DatasetSeries | Aligned |
UML Model for Core Classes of DCAT-US 3.0
Supporting Classes
| Class name | Usage note for the Application Profile | URI and Reference | Changes from DCAT-US 1.1 |
| --- | --- | --- | --- |
| AccessRestriction | The "AccessRestriction" class used by NARA represents limitations placed on accessing specific records or information within their archives, ensuring controlled and responsible access based on legal, ethical, or security considerations. | dcat-us:AccessRestriction | New! |
| Activity | An activity carried out by an Agent over an entity, according to a plan, and generating another entity. | prov:Activity | Aligned |
|  | A postal address for a Location. |  | New! |
|  | A postal address for Contact Point. |  | Aligned |
| Agent | An entity (e.g., an individual or an organization) that is associated with Catalogs, Catalog Records, Data Services, or Datasets. If the Agent is an organization, the use of the Organization Ontology [[VOCAB-ORG]] is recommended. |  | Aligned |
|  | A responsibility of an Agent for a resource. |  | Aligned |
| Concept | A subject of a Catalog, Dataset, or Data Service. | skos:Concept | Aligned |
| Concept scheme | A concept collection (e.g. controlled vocabulary) in which the Concept is defined. |  | Aligned |
| Checksum | A value that allows the contents of a file to be authenticated. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented. |  | Aligned |
| Contact | A description following the [[VCARD-RDF]] specification, e.g. to provide telephone number and e-mail address for a contact point using vcard:Kind. |  | Aligned |
|  | Controlled Unclassified Information (CUI) is information that requires safeguarding or dissemination controls pursuant to and consistent with applicable law, regulations, and government-wide policies but is not classified. |  | New! |
|  | A textual resource intended for human consumption that contains information, e.g., a Web page about a Dataset, a publication, a chapter book, a technical report, but also a blog post. |  | Aligned |
| GeographicBoundingBox | GeographicBoundingBox describes the spatial extent or domain of application of a resource and is standardized in the WGS 84 Lat/Long coordinate system. |  | New! |
|  | An identifier in a particular context, consisting of the string that is the identifier; an optional identifier for the identifier scheme; an optional identifier for the version of the identifier scheme; an optional identifier for the agency that manages the identifier scheme. |  | New! |
|  | A formal declaration accompanying a dataset which outlines the responsibilities and limitations of the data provider in terms of the accuracy, completeness, and potential use of the data. It often serves to limit the legal exposure of the data provider by defining the scope of allowed uses and disclaiming warranties or guarantees. |  | New! |
|  | A legal document giving official permission to do something with a resource. |  | Aligned |
| Location | A spatial region or named place. It can be represented using a controlled vocabulary or with geographic coordinates. |  | Aligned |
|  | A media type, e.g. the format of a computer file. |  | Aligned |
| Metric | Represents a standard to measure a quality dimension. An observation (instance of dqv:QualityMeasurement) assigns a value in a given unit to a Metric. In DCAT-US, this class is used to define individuals corresponding to the different types of spatial resolution. |  | Aligned |
| Organization | Represents a collection of people organized together into a community or other social, commercial or political structure. The group has some common purpose or reason for existence which goes beyond the set of people belonging to it and can act as an Agent. Organizations are often decomposable into hierarchical structures. | org:Organization | Aligned |
| Period of time | An interval of time that is named or defined by its start and end dates. |  | Aligned |
| Person | This class represents an individual human being or a person. It can be used to provide information about individuals, such as their name, email address, homepage URL, and other personal details. | foaf:Person | Aligned |
| Provenance Statement | A statement of any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation. |  | New! |
|  | Represents the evaluation of a given resource (as a Data Service, Dataset, or Distribution) against a specific quality metric. |  | Aligned |
| Relationship | An association class for attaching additional information to a relationship between DCAT Resources. |  | Aligned |
| Rights statement | A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights. |  | Aligned |
| Role | A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships. Note it is a subclass of skos:Concept. |  | Aligned |
| Standard | A standard or other specification to which a Catalog, Catalog Record, Data Service, Dataset, or Distribution conforms. |  | Aligned |
| UseRestriction | A UseRestriction is a set of rules, guidelines, or legal provisions that dictate how a particular resource, asset, information, or object can be utilized. Use restrictions may encompass limitations on access, distribution, reproduction, modification, or sharing, and they are often put in place to protect privacy, intellectual property rights, security, or compliance with legal or ethical standards. |  | New! |
UML Model for Supporting Classes of DCAT-US 3.0
Properties per Class
Overview
Requirement levels
DCAT-US defines four requirement levels for data receivers and senders:
- Mandatory property: a receiver MUST be able to process the information for that property; a sender MUST provide the information for that property.
- Recommended property: a receiver MUST be able to process the information for that property; a sender SHOULD provide the information for that property if it is available.
- Optional property: a receiver MUST be able to process the information for that property; a sender MAY provide the information for that property but is not obliged to do so.
- Deprecated property: a receiver SHOULD be able to process information about instances of that property; a sender SHOULD NOT provide the information about instances of that property.
The terms MUST, MUST NOT, SHOULD and MAY in this section and in the following sections are to be interpreted as defined in RFC 2119.
In the given context, the term "processing" means that receivers MUST accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).
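The four requirement levels can be summarized as a small lookup table. The Python sketch below is one possible encoding, not something defined by the profile: the names `ReqLevel`, `OBLIGATIONS`, and `sender_must_provide` are hypothetical, and the single-letter code `D` for deprecated properties is an assumption (only `M`, `R`, and `O` appear in the property tables of this document).

```python
from enum import Enum


class ReqLevel(Enum):
    MANDATORY = "M"
    RECOMMENDED = "R"
    OPTIONAL = "O"
    DEPRECATED = "D"  # assumed code; not used in the property tables


# (receiver obligation to process, sender obligation to provide), per the list above.
OBLIGATIONS = {
    ReqLevel.MANDATORY: ("MUST", "MUST"),
    ReqLevel.RECOMMENDED: ("MUST", "SHOULD"),
    ReqLevel.OPTIONAL: ("MUST", "MAY"),
    ReqLevel.DEPRECATED: ("SHOULD", "SHOULD NOT"),
}


def sender_must_provide(level: ReqLevel) -> bool:
    """True only when a sender is strictly required to supply the property."""
    return OBLIGATIONS[level][1] == "MUST"


for level in ReqLevel:
    receiver, sender = OBLIGATIONS[level]
    print(f"{level.name}: receiver {receiver} process, sender {sender} provide")
```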
Notations
- Property: denotes the name that the class or property is given in DCAT-US.
- URI: denotes the property URI.
- Range: specifies the range of values that is expected for the property.
- ReqLevel (“Requirement level”): denotes whether the class / property is mandatory, recommended or optional.
- Card (“Cardinality”): specifies the minimum number of values that MUST be provided for that property and the maximum number of values that MAY be provided.
- Usage note: specifies custom usage instructions and provides background information.
- CV (“Controlled Vocabulary”): defines which controlled vocabulary SHOULD be used.
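As a rough illustration of the Card notation, the following Python sketch parses a cardinality string into a minimum and a maximum and checks a value count against it. The helper names are hypothetical. The forms `1..1`, `0..1`, and a bare `1` appear in this document; treating `n` or `*` as "unbounded" is an assumption made for this sketch.

```python
from typing import Optional, Tuple


def parse_cardinality(card: str) -> Tuple[int, Optional[int]]:
    """Parse 'min..max' (or a single number) into (minimum, maximum).

    A maximum of None means unbounded; 'n' and '*' are assumed spellings for that.
    """
    card = card.strip()
    if ".." in card:
        low, high = card.split("..", 1)
    else:
        low = high = card  # a bare "1" is read as exactly one value
    minimum = int(low)
    maximum = None if high in ("n", "*") else int(high)
    return minimum, maximum


def within_cardinality(card: str, count: int) -> bool:
    """Check whether a number of provided values satisfies the cardinality."""
    minimum, maximum = parse_cardinality(card)
    return count >= minimum and (maximum is None or count <= maximum)


for card, count in [("1..1", 1), ("0..1", 2), ("1", 0), ("0..n", 5)]:
    print(card, count, within_cardinality(card, count))
```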
Property Evolution in DCAT-US 3.0
The following table provides an overview of the various types of changes and updates within the DCAT-US specifications, shedding light on the evolution and adaptation of the data catalog standard. Each change type is categorized, and its significance is explained, ranging from the introduction of new properties to updates that align with the latest DCAT specifications. Understanding these changes is essential for data practitioners and stakeholders seeking to keep pace with the evolving landscape of data cataloging and data sharing standards.
| Change Type | Description |
| --- | --- |
| New! | New DCAT-US 3.0 specific property that is not referred in DCAT specifications |
| Aligned | Property aligned with latest DCAT-3 specification that does not exist in DCAT-US 1.1 |
| Fixed | Fixed property that is inconsistent with DCAT specification |
| No Change | No change from DCAT-US 1.1 profile |
| Multilingual Support | Extension of DCAT-US property to support multilingual values |
AccessRestriction
The class "AccessRestriction" used by the National Archives and Records Administration (NARA) refers to a categorization or specification that denotes limitations or conditions imposed on the accessibility of certain records, documents, or information within their archival holdings. Access restrictions are employed to regulate and control access to sensitive or confidential content based on legal, ethical, security, or other relevant considerations. These restrictions may pertain to who can access the information, the purposes for which it can be accessed, and the conditions under which it can be utilized. The "AccessRestriction" class provides a structured framework for classifying and managing these access limitations within NARA's archival context, contributing to the proper governance and responsible dissemination of historical records and data.
RDF Class: dcat-us:AccessRestriction
Definition: The "AccessRestriction" class used by NARA represents limitations placed on accessing specific records or information within their archives, ensuring controlled and responsible access based on legal, ethical, or security considerations.
Usage note: The "AccessRestriction" class serves as a valuable tool within NARA's archival framework, enabling the organization to effectively manage and communicate access limitations for specific records or information. By employing this class, NARA can categorize and enforce controlled access to sensitive content, safeguarding confidentiality, adhering to legal requirements, and preserving the integrity of historical data. Researchers, archivists, and authorized users can rely on "AccessRestriction" to navigate and understand the accessibility parameters associated with archived materials, facilitating responsible information dissemination and usage.
Rationale: The "AccessRestriction" class in the DCAT-US application profile is essential for categorizing and managing access restrictions according to NARA standards, ensuring responsible access to sensitive historical records. It enhances transparency, aiding researchers and authorized users in understanding and navigating access parameters for archived materials.
Properties Summary
Property | URI | Range | ReqLevel | Card
restriction status | dcat-us:restrictionStatus | skos:Concept | M | 1..1
specific restriction | dcat-us:specificRestriction | skos:Concept | R | 0..1
restriction note | dcat-us:restrictionNote | rdfs:Literal | O | 0..1
Mandatory Properties
Property: restriction status
Requirement level: Mandatory
Cardinality: 1
URI: dcat-us:restrictionStatus
Range: skos:Concept
Definition: The indication of whether or not there are access restrictions on the data.
Recommended Properties
Property: specific restriction
Requirement level: Recommended
Cardinality: 0..1
URI: dcat-us:specificRestriction
Range: skos:Concept
Definition: The authority of the restriction.
Optional Properties
Property: restriction note
Requirement level: Optional
Cardinality: 0..1
URI: dcat-us:restrictionNote
Range: rdfs:Literal
Definition: A note related to the access restriction.
Example
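The following is an illustrative sketch rather than a normative example. It shows how the cardinalities listed for dcat-us:AccessRestriction could be checked in plain Python. The profile itself relies on SHACL for validation, so the `check_cardinality` helper, the dictionary layout, and the example concept URI are all assumptions made for illustration.

```python
# Cardinality bounds copied from the Properties Summary for dcat-us:AccessRestriction.
# None as an upper bound would mean "unbounded" (n).
ACCESS_RESTRICTION_CARDINALITY = {
    "dcat-us:restrictionStatus": (1, 1),    # Mandatory
    "dcat-us:specificRestriction": (0, 1),  # Recommended
    "dcat-us:restrictionNote": (0, 1),      # Optional
}


def check_cardinality(resource, constraints):
    """Return a list of violations; the list is empty when the resource conforms."""
    problems = []
    for prop, (low, high) in constraints.items():
        count = len(resource.get(prop, []))
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        if high is not None and count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems


# Hypothetical values: the concept URI below is a placeholder, not a real vocabulary term.
restriction = {
    "dcat-us:restrictionStatus": ["https://example.gov/vocab/restriction-status/restricted"],
    "dcat-us:restrictionNote": ["Contains personally identifiable information."],
}

print(check_cardinality(restriction, ACCESS_RESTRICTION_CARDINALITY))
print(check_cardinality({}, ACCESS_RESTRICTION_CARDINALITY))
```

Each property maps to a list of values so that a single code path covers both single-valued and repeatable properties.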
Activity
RDF Class: prov:Activity
Definition: An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.
Usage note
The activity associated with generation of a dataset will typically be an initiative, project, mission, survey, or on-going activity ("business as usual"). Multiple prov:wasGeneratedBy properties can be used to indicate the dataset production context at various levels of granularity.
Details about how to describe the activity that generated a dataset are out of scope for this application profile. prov:Activity provides minimal basic properties for labeling and classifying activities.
Rationale:
Integrating prov:Activity into the DCAT-US schema offers a streamlined, generic class to represent a myriad of operations, such as initiatives, projects, and ongoing activities, without the complexity of managing numerous specialized classes. This inclusion not only simplifies the representation of varied activities under a unified semantic framework but also enhances data provenance tracking and interoperability across diverse systems and domains. Consequently, it provides a flexible, future-proof approach to accommodate evolving types of activities without necessitating continual schema modifications.
Properties Summary
Property | URI | Range | ReqLevel | Card
label | rdfs:label | xsd:string | M | 1..n
category | dcterms:type | skos:Concept | O | 0..1
Mandatory Properties
Property: label
Requirement level: Mandatory
Cardinality: 1..n
URI: rdfs:label
Range: xsd:string
Usage note: This property is used to give a human-readable label for the activity.
Optional Properties
Property: category
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:type
Range: skos:Concept
Usage note
Example
Address (Contact Point)
RDF Class: vcard:Address
Obligation: Optional
Definition: Specify the components of the delivery address for a contact point.
Usage note: This class is used only to associate an address with a contact point. When incorporating the [[VCARD-RDF]] vcard:Address class within DCAT-US, use its properties, such as vcard:street-address, vcard:locality, and vcard:country-name, to provide comprehensive and accurate address details for entities like organizations or publishers. Always adhere to consistent formatting across the catalog, be mindful of privacy considerations, especially for individual addresses, and validate the data regularly to maintain its accuracy and reliability.
Rationale: Integrating [[VCARD-RDF]]'s contact point address into DCAT-US ensures a standardized, interoperable format for presenting address data.
Properties Summary
Property | URI | Range | ReqLevel | Card
administrative area | vcard:region | rdfs:Literal | R | 0..1
city | vcard:locality | rdfs:Literal | R | 0..1
country name | vcard:country-name | rdfs:Literal | R | 0..1
postal code | vcard:postal-code | rdfs:Literal | R | 0..1
street address | vcard:street-address | rdfs:Literal | R | 0..1
Recommended Properties
Property: administrative area
Requirement level: Recommended
Cardinality: 0..1
URI: vcard:region
Range: rdfs:Literal
Usage note: The administrative area of an Address of the Kind. Depending on the country, this corresponds to a province, a county, a region, or a state.
Property: city
Requirement level: Recommended
Cardinality: 0..1
URI: vcard:locality
Range: rdfs:Literal
Usage note: The city of an Address of the Kind.
Property: country
Requirement level: Recommended
Cardinality: 0..1
URI: vcard:country-name
Range: rdfs:Literal
Usage note: The country of an Address of the Kind.
Property: postal code
Requirement level: Recommended
Cardinality: 0..1
URI: vcard:postal-code
Range: rdfs:Literal
Usage note: The postal code of an Address of the Kind.
Property: street address
Requirement level: Recommended
Cardinality: 0..1
URI: vcard:street-address
Range: rdfs:Literal
Usage note: The street name and civic number of an Address of the Kind.
Example
Address (Location)
RDF Class: locn:Address
Definition: The address of a location.
Usage note: This class is used to describe a location by its address. It should be used only with the property dcterms:spatial, not the contact point address property.
Rationale: Incorporating locn:Address from the W3C Location ontology [[LOCN]] into DCAT-US provides a standardized, structured, and extensible format to represent physical addresses, facilitating consistent, interoperable, and precise sharing of location information across various datasets and digital platforms.
Properties Summary
Property | URI | Range | ReqLevel | Card
administrative area | locn:adminUnitL2 | rdfs:Literal | R | 0..1
city | locn:postName | rdfs:Literal | R | 0..1
country | locn:adminUnitL1 | rdfs:Literal | R | 0..1
postal code | locn:postCode | rdfs:Literal | R | 0..1
street address | locn:thoroughfare | rdfs:Literal | R | 0..1
Recommended Properties
Property: administrative area
Requirement level: Recommended
Cardinality: 0..1
URI: locn:adminUnitL2
Range: rdfs:Literal
Usage note: The administrative area of an Address of the Agent. Depending on the country, this corresponds to a province, a county, a region, or a state.
Property: city
Requirement level: Recommended
Cardinality: 0..1
URI: locn:postName
Range: rdfs:Literal
Usage note: The city of an Address of the Agent.
Property: country
Requirement level: Recommended
Cardinality: 0..1
URI: locn:adminUnitL1
Range: rdfs:Literal
Usage note: The country of an Address of the Agent.
Property: postal code
Requirement level: Recommended
Cardinality: 0..1
URI: locn:postCode
Range: rdfs:Literal
Usage note: The postal code of an Address of the Agent.
Property: street address
Requirement level: Recommended
Cardinality: 0..1
URI: locn:thoroughfare
Range: rdfs:Literal
Usage note: The street name and civic number of an Address of the Agent.
Agent
RDF Class: foaf:Agent
Definition: An entity that acts on something (e.g., person, group, software or physical artifact).
Usage note
Use this class when referring to a software agent that is associated with Catalogs, Catalog Records, Data Services, or Datasets.
If the Agent is an organization, the use of org:Organization is recommended. If the Agent is a person, the use of foaf:Person is recommended.
Rationale: The addition of the foaf:Agent class in DCAT-US 3.0 serves a dual purpose. First, it allows for the representation of software agents, aligning with modern data automation needs. Second, it acts as an abstract class for both foaf:Person and org:Organization, promoting consistency and interoperability while simplifying resource descriptions within the dataset catalog.
Properties Summary
Property | URI | Range | ReqLevel | Card
name | foaf:name | xsd:string | M | 1..1
type | dcterms:type | skos:Concept | R | 0..1
Mandatory Properties
Property: name
Requirement level: Mandatory
Cardinality: 1
URI: foaf:name
Range: xsd:string
Definition: The name of the software agent.
Recommended Properties
Property: type
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Definition: This property refers to a type of the Agent that makes the Catalog, Catalog Record, Data Service, or Dataset available.
Example
Attribution
RDF Class: prov:Attribution
Definition: A responsibility of an Agent for a resource.
Usage note
Used to link to an Agent where the nature of the relationship is known but does not match one of the standard [[DCTERMS]] properties (dcterms:creator, dcterms:contributor, dcterms:rightsHolder, and dcterms:publisher). Use dcat:hadRole on the prov:Attribution to capture the responsibility of the Agent with respect to the Resource.
Rationale
The inclusion of prov:Attribution in the DCAT profile enables clear data source attribution, promoting responsible data sharing and proper citation practices. It aligns the profile with data provenance best practices for accurate attribution in data sharing.
Properties Summary
Property | URI | Range | ReqLevel | Card
agent | prov:agent | foaf:Agent | M | 1
role | dcat:hadRole | dcat:Role | M | 1
Mandatory Properties
Property: agent
Requirement level: Mandatory
Cardinality: 1
URI: prov:agent
Range: foaf:Agent
Definition: The prov:agent property references an Agent that plays a role in the resource.
Property: role
Requirement level: Mandatory
Cardinality: 1
URI: dcat:hadRole
Range: dcat:Role
Definition: The function of an entity or agent with respect to another entity or resource.
Example
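The following is an illustrative sketch rather than a normative example. It shows one way a prov:Attribution node, carrying both mandatory properties (prov:agent and dcat:hadRole), could be attached to a dataset through prov:qualifiedAttribution. The output is JSON-LD-shaped but omits the `@context`, and every IRI, including the "funder" role, is a hypothetical placeholder.

```python
import json


def qualified_attribution(agent_iri, role_iri):
    """Build a prov:Attribution node with its two mandatory properties."""
    return {
        "@type": "prov:Attribution",
        "prov:agent": {"@id": agent_iri},    # who is responsible
        "dcat:hadRole": {"@id": role_iri},   # in what capacity
    }


# Hypothetical dataset, agent, and role IRIs, used only for illustration.
dataset = {
    "@id": "https://example.gov/dataset/42",
    "@type": "dcat:Dataset",
    "prov:qualifiedAttribution": [
        qualified_attribution(
            "https://example.gov/agency/stats-office",
            "https://example.gov/vocab/role/funder",
        )
    ],
}

print(json.dumps(dataset, indent=2))
```

A role such as "funder" has no dedicated [[DCTERMS]] property, which is the situation the usage note describes for choosing a qualified attribution.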
Catalog
A Catalog or repository that hosts the Datasets or Data Services being described.
DCAT-US allows Catalogs of only Datasets, but also Catalogs of only Data Services.
RDF Class: dcat:Catalog
Definition: A curated collection of metadata about resources (e.g., datasets and data services in the context of a data catalog).
Sub-class of: dcat:Dataset
Usage note
- A Web-based data catalog is typically represented as a single instance of this class.
- Populate metadata within the dcat:Catalog to facilitate resource discovery, including title, description, classifiers and other relevant information.
- Specify the resources hosted within the catalog by linking them using dcat:dataset or dcat:service.
Rationale
The update of the dcat:Catalog class aligns with the generalization of catalog scope in DCAT-US 3.0, accommodating catalogs of datasets, data services, or a mixture of both. It reflects the evolving landscape of data publication and discovery, allowing data publishers to describe and share their resources effectively. Additionally, by making Catalog a subclass of Dataset, DCAT-US promotes consistency in metadata representation and enables catalogs to be composed of other catalogs, promoting modularity and extensibility in the data catalog ecosystem.
Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1
title | dcterms:title | rdfs:Literal | M | 1..n | Multilingual support
description | dcterms:description | rdfs:Literal | M | 1..n | Multilingual support
publisher | dcterms:publisher | foaf:Agent | M | 1..1 | Aligned
dataset | dcat:dataset | dcat:Dataset | M | 1..n | No Change
homepage | foaf:homepage | foaf:Document | R | 0..1 | Aligned
language | dcterms:language | dcterms:LinguisticSystem | R | 0..n | Aligned
license | dcterms:license | dcterms:LicenseDocument | R | 0..1 | Aligned
release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | Aligned
rights | dcterms:rights | dcterms:RightsStatement | R | 0..n | Aligned
spatial/geographic coverage | dcterms:spatial | dcterms:Location | R | 0..n | Aligned
themes | dcat:themeTaxonomy | skos:ConceptScheme | R | 0..n | Aligned
update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | Aligned
schema version | dcterms:conformsTo | dcterms:Standard | R | 0..1 | No Change
creator | dcterms:creator | dcterms:Agent | O | 0..n | Aligned
access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 | Aligned
catalog | dcat:catalog | dcat:Catalog | O | 0..n | Aligned
contact point | dcat:contactPoint | vcard:Kind | O | 0..n | Aligned
keyword/tag | dcat:keyword | rdfs:Literal | O | 0..n | Aligned
has part | dcterms:hasPart | dcat:Catalog | O | 0..n | Aligned
catalog record | dcat:record | dcat:CatalogRecord | O | 0..n | Aligned
service | dcat:service | dcat:DataService | O | 0..n | Aligned
theme/category | dcat:theme | skos:Concept | O | 0..n | Aligned
identifier | dcterms:identifier | rdfs:Literal | O | 0..n | Aligned
rights holder | dcterms:rightsHolder | org:Organization | O | 0..1 | New
subject | dcterms:subject | skos:Concept | O | 0..n | New
temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | O | 0..n | Aligned
qualified attribution | prov:qualifiedAttribution | prov:Attribution | O | 0..n | Aligned
category | dcterms:type | skos:Concept | O | 0..1 | New
Mandatory Properties
Property: title
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:title
Range: rdfs:Literal
Usage note
- The title of the catalog in the indicated language.
- This property can be repeated for parallel language versions of the title.
Property: description
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:description
Range: rdfs:Literal
Definition: Free-text description of the catalog (in the language indicated in the attribute).
Usage note
This property contains a free-text account of the data Catalog (in the language indicated in the attribute).
This property can be repeated for parallel language versions of the description.
Property: publisher
Requirement level: Mandatory
Cardinality: 1..1
URI: dcterms:publisher
Range: foaf:Agent
Definition: Entity responsible for making the catalog available.
Usage note
- This property refers to an entity (organization) responsible for making the Catalog available.
Property: dataset
Requirement level: Mandatory
Cardinality: 1..n
URI: dcat:dataset
Range: dcat:Dataset
Definition: Dataset that is part of the catalog.
Usage note
- This property links the Catalog with a Dataset that is part of the Catalog.
- As empty Catalogs are usually indications of problems, this property SHOULD be combined with the property service to implement an empty Catalog check.
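The empty Catalog check mentioned in the usage note can be sketched as follows. This is a minimal illustration that assumes the catalog is held as a Python dictionary keyed by prefixed property names; the `is_empty_catalog` helper and the example IRIs are hypothetical and not part of the profile.

```python
def is_empty_catalog(catalog):
    """Return True when the catalog lists neither datasets nor data services."""
    datasets = catalog.get("dcat:dataset", [])
    services = catalog.get("dcat:service", [])
    return len(datasets) == 0 and len(services) == 0


# Hypothetical catalogs used only to exercise the check.
services_only = {"dcat:service": ["https://example.gov/api/v1"]}
datasets_only = {"dcat:dataset": ["https://example.gov/dataset/42"]}
nothing_listed = {"dcterms:title": ["Stub catalog"]}

print(is_empty_catalog(services_only))   # False
print(is_empty_catalog(datasets_only))   # False
print(is_empty_catalog(nothing_listed))  # True
```

Checking both properties together avoids reporting a catalog of only Data Services as empty.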
Recommended Properties
Property: homepage
Requirement level: Recommended
Cardinality: 0..1
URI: foaf:homepage
Range: foaf:Document
Usage note
- This property refers to a web page that acts as the main page for the Catalog.
- For instance, data.gov would be the homepage of the Data Catalog exported to [[?DATA-GOV]].
Property: language
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:language
Range: dcterms:LinguisticSystem
Definition: A language used in the textual metadata describing titles, descriptions, etc. of the Datasets in the Catalog.
Usage note: Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
Usage note: The value(s) provided for members of a catalog (i.e., dataset or service) override the value(s) provided for the catalog if they conflict.
Usage note: This property can be repeated if the resources of the catalog are provided in multiple languages.
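The override rule for language values can be read as a simple fallback. The sketch below is one possible interpretation, not a normative algorithm: a member that declares its own languages uses them, and a member that declares none inherits the catalog's. The `effective_languages` helper is hypothetical, and the Library of Congress IRIs are shown for illustration only.

```python
def effective_languages(catalog_languages, member_languages):
    """Member-level values take precedence; otherwise fall back to the catalog's."""
    if member_languages:
        return list(member_languages)
    return list(catalog_languages)


# Illustrative language IRIs in the Library of Congress ISO 639-1 style.
EN = "http://id.loc.gov/vocabulary/iso639-1/en"
ES = "http://id.loc.gov/vocabulary/iso639-1/es"

print(effective_languages([EN], [ES]))  # the dataset's own value is used
print(effective_languages([EN], []))    # the dataset inherits the catalog value
```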
Property: license
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:license
Range: dcterms:LicenseDocument
Usage note
This property refers to the license under which the Catalog can be used or reused.
CV to be used: [[?DATA-GOV-LICENSE]]
Property: release date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
- This property contains the date of formal issuance (e.g., for publication of the Catalog).
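Because date properties accept four XSD datatypes, a publisher has to pick the datatype that matches each value. The sketch below infers the datatype from the shape of the lexical form. It is a simplified illustration: the `infer_date_datatype` helper is hypothetical, time zone suffixes on dates and years and negative years are not handled, and only the shape is checked, not whether the date exists on the calendar.

```python
import re

# Ordered from most to least specific. Patterns are simplified (see the note above).
_PATTERNS = [
    ("xsd:dateTime", re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?")),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def infer_date_datatype(lexical):
    """Return the matching XSD datatype name, or None if no pattern fits."""
    for datatype, pattern in _PATTERNS:
        if pattern.fullmatch(lexical):
            return datatype
    return None


for value in ["2024-03-15T10:30:00Z", "2024-03-15", "2024-03", "2024", "March 2024"]:
    print(value, "->", infer_date_datatype(value))
```

Using `fullmatch` ensures that a value such as "2024-03-15" is not mistaken for an xsd:gYear because of its leading four digits.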
Property: rights
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:rights
Range: dcterms:RightsStatement
Usage note
- This property refers to a statement that specifies rights associated with the Catalog.
Property: spatial/geographic coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:spatial
Range: dcterms:Location
Usage note
- This property refers to a geographical area covered by the Catalog.
- Conventions to be used: The Vocabularies Name Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[?GEONAMES]]
Property: themes
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:themeTaxonomy
Range: skos:ConceptScheme
Usage note
This property refers to a knowledge organization system used to classify the Catalog's Datasets.
CV to be used: http://TBD/resource/dataset/data-theme
Property: update/modification date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
- This property contains the most recent date on which the Catalog was modified.
- The value of this property indicates a change to the actual resource, not a change to the catalog record. An absent value MAY indicate that the resource has never changed after its initial publication, or that the date of last modification is not known, or that the resource is continuously updated.
Property: schema version
Definition: An established standard to which the described catalog conforms.
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:conformsTo
Range: dcterms:Standard
Usage note
- This property SHOULD be used to indicate the model, schema, ontology, view or profile that the cataloged resource content conforms to.
Optional Properties
Property: creator
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:creator
Range: dcterms:Agent
Definition: The entity responsible for producing the resource.
Usage note
- Resources of type foaf:Agent are recommended as values for this property.
Property: access rights
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accessRights
Range: dcterms:RightsStatement
Usage note
This property refers to information that indicates whether the Catalog is open data, has access restrictions or is not public.
CV to be used: [[?DATA-GOV-AR]].
Property: catalog
Requirement level: Optional
Cardinality: 0..n
URI: dcat:catalog
Range: dcat:Catalog
Usage note
- This property refers to a catalog whose contents are of interest in the context of this catalog.
Property: contact point
Requirement level: Optional
Cardinality: 0..n
URI: dcat:contactPoint
Range: vcard:Kind
Usage note
- Relevant contact information for the cataloged resource. Use of vCard is recommended.
Property: keyword/tag
Requirement level: Optional
Cardinality: 0..n
URI: dcat:keyword
Range: rdfs:Literal
Usage note
- A keyword or tag describing the resource.
Property: has part
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:hasPart
Range: dcat:Catalog
Usage note
- This property refers to a related catalog that is part of the described catalog.
Property: catalog record
Definition: A record describing the registration of a single resource (e.g., a dataset, a data service) that is part of the catalog.
Requirement level: Optional
Cardinality: 0..n
URI: dcat:record
Range: dcat:CatalogRecord
Property: service
Requirement level: Optional
Cardinality: 0..n
URI: dcat:service
Range: dcat:DataService
Usage note
- This property refers to a site or end-point (Data Service) that is listed in the Catalog.
- As empty Catalogs are usually indications of problems, this property SHOULD be combined with the property Dataset to implement an empty Catalog check.
Property: theme/category
Requirement level: Optional
Cardinality: 0..n
URI: dcat:theme
Range: skos:Concept
Usage note
This property refers to a category of the Catalog. A Catalog may be associated with multiple themes.
CV to be used: [[?DATA-GOV-THEME]]
Property: identifier
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:identifier
Range: rdfs:Literal
Usage note
- This property contains the main identifier for the Catalog, e.g. the URI or other unique identifier.
Property: rights holder
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:rightsHolder
Range: org:Organization
Usage note
- This property refers to an organization holding rights on the Catalog.
Property: subject
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:subject
Range: skos:Concept
Property: temporal coverage
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:temporal
Range: dcterms:PeriodOfTime
Usage note
- This property refers to a temporal period that the Catalog covers.
Property: qualified attribution
Requirement level: Optional
Cardinality: 0..n
URI: prov:qualifiedAttribution
Range: prov:Attribution
Usage note
- This property refers to a link to an Agent having some form of responsibility for the Catalog.
Property: category
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Usage note
- The category of the Catalog.
Example
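The following is an illustrative sketch rather than a normative example. It assembles a Catalog description carrying only the four mandatory properties, with the title repeated for a parallel language version, and then lists any mandatory property that is absent. The structure is JSON-LD-shaped but omits the `@context`, and all IRIs and text values are hypothetical.

```python
import json

# Mandatory Catalog properties, as listed in the summary table.
MANDATORY = ["dcterms:title", "dcterms:description", "dcterms:publisher", "dcat:dataset"]

# Hypothetical catalog used only for illustration.
catalog = {
    "@id": "https://example.gov/catalog",
    "@type": "dcat:Catalog",
    "dcterms:title": [
        {"@value": "Example Agency Data Catalog", "@language": "en"},
        {"@value": "Catálogo de datos de la agencia de ejemplo", "@language": "es"},
    ],
    "dcterms:description": [
        {"@value": "Datasets published by the Example Agency.", "@language": "en"},
    ],
    "dcterms:publisher": {"@id": "https://example.gov/agency"},
    "dcat:dataset": [{"@id": "https://example.gov/dataset/42"}],
}

# A property counts as missing when it is absent or has an empty value.
missing = [prop for prop in MANDATORY if not catalog.get(prop)]

print(json.dumps(catalog, indent=2, ensure_ascii=False))
print("missing mandatory properties:", missing)
```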
Catalog Record
RDF Class: dcat:CatalogRecord
Definition: A record in a catalog, describing the registration of a single dcat:Resource.
Usage note
This class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset or service and metadata about the entry in the catalog about the dataset or service. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [[PROV-O]] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset or its registration.
Rationale
While its use is not mandatory, the incorporation of dcat:CatalogRecord into DCAT-US 3.0 holds significant value. It enables catalogs to distinguish between metadata describing datasets or services and the actual catalog entries. This differentiation proves especially advantageous for ensuring adherence to application profiles that demand specific metadata for catalog records. Furthermore, it streamlines resource lifecycle management, empowering catalogs to monitor alterations and revisions to their entries, ultimately bolstering data governance and quality assurance protocols.
Properties Summary
Property | URI | Range | ReqLevel | Card
application profile | dcterms:conformsTo | dcterms:Standard | R | 0..1
change type | adms:status | skos:Concept | R | 0..1
description | dcterms:description | rdfs:Literal | O | 0..n
language | dcterms:language | dcterms:LinguisticSystem | O | 0..n
listing date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..n
update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | M | 1..1
primary topic | foaf:primaryTopic | dcat:Resource | M | 1..1
source metadata | dcterms:source | dcat:Resource | O | 0..1
title | dcterms:title | rdfs:Literal | O | 0..n
Mandatory Properties
Property: update/modification date
Requirement level: Mandatory
Cardinality: 1..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: The most recent date on which the Catalog Record's entry was changed or modified.
Property: primary topic
Requirement level: Mandatory
Cardinality: 1..1
URI: foaf:primaryTopic
Range: dcat:Resource
Definition: A link to the Dataset, Data service or Catalog described in the Catalog Record.
Usage note: A catalog record will refer to one entity in a catalog. This can be either a Dataset or a Data Service. To ensure an unambiguous reading of the cardinality, the range is set to Cataloged Resource. However, it is not the intent of this range to require the explicit use of the class Cataloged Resource. As it is an abstract class, a subclass should be used.
Recommended Properties
Property: application profile
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:conformsTo
Range: dcterms:Standard
Definition: An Application Profile that the Catalog Record's metadata conforms to.
Property: change type
Requirement level: Recommended
Cardinality: 0..1
URI: adms:status
Range: skos:Concept
Definition: The status of the catalog record in the context of editorial flow of the dataset and data service descriptions.
Optional Properties
Property: description
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Definition: A free-text account of the Catalog Record. This property can be repeated for parallel language versions of the description.
Property: language
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:language
Range: dcterms:LinguisticSystem
Definition: A language used in the textual metadata describing titles, descriptions, etc. of the members of the catalog.
Usage note: Resources defined by the Library of Congress [[ISO 639-1]] SHOULD be used.
Usage note: This property can be repeated if the metadata is provided in multiple languages.
Property: listing date
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: The date on which the description of the Dataset was included in the Catalog.
Property: source metadata
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:source
Range: dcat:Resource
Definition: The original metadata that was used in creating metadata for the datasets, data services, or catalogs in the Catalog Record.
Property: title
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:title
Range: rdfs:Literal
Definition: A name given to the Catalog Record.
Usage note: This property can be repeated for parallel language versions of the name.
Example
Checksum
RDF Class: spdx:Checksum
Definition: A Checksum is a value that allows the integrity of the contents of a file to be checked. Even small changes to the content of the file will change its checksum. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented [[SPDX]].
Usage note
- The Checksum includes the algorithm (spdx:algorithm) and value (spdx:checksumValue) that allow the integrity of a file to be verified to ensure no errors occurred in transmission or storage.
Rationale: Introducing the spdx:Checksum class in DCAT-US bolsters data integrity and trust by ensuring datasets remain unaltered during transfers. This standardized approach promotes consistency across catalogs, facilitates error detection, and adapts to evolving cryptographic needs, enhancing the utility of automated tools.
Properties Summary
Property | URI | Range | ReqLevel | Card
algorithm | spdx:algorithm | spdx:ChecksumAlgorithm | M | 1..1
checksum value | spdx:checksumValue | xsd:hexBinary | M | 1..1
Mandatory Properties
Property: algorithm
Requirement level: Mandatory
Cardinality: 1..1
URI: spdx:algorithm
Range: spdx:ChecksumAlgorithm
Definition: The algorithm used to produce the subject Checksum.
Property: checksum value
Requirement level: Mandatory
Cardinality: 1..1
URI: spdx:checksumValue
Range: xsd:hexBinary
Definition: A lower case hexadecimal encoded digest value produced using a specific algorithm.
Example
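The following is an illustrative sketch rather than a normative example. It computes a SHA-256 digest with Python's standard `hashlib` module, stores it as a lower-case hexadecimal value alongside the algorithm, and then verifies file content against it. The choice of SHA-256, the `make_checksum` and `verify` helpers, and the `spdx:checksumAlgorithm_sha256` identifier are assumptions made for illustration.

```python
import hashlib


def make_checksum(data: bytes) -> dict:
    """Describe the SHA-256 digest of `data` as an spdx:Checksum-shaped node."""
    digest = hashlib.sha256(data).hexdigest()  # hexdigest() returns lower-case hex
    return {
        "@type": "spdx:Checksum",
        "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},  # assumed identifier
        "spdx:checksumValue": {"@value": digest, "@type": "xsd:hexBinary"},
    }


def verify(data: bytes, checksum: dict) -> bool:
    """Recompute the digest and compare it with the recorded value."""
    recorded = checksum["spdx:checksumValue"]["@value"].lower()
    return hashlib.sha256(data).hexdigest() == recorded


original = b"year,value\n2024,42\n"
checksum = make_checksum(original)

print(checksum["spdx:checksumValue"]["@value"])
print(verify(original, checksum))                   # True
print(verify(b"year,value\n2024,43\n", checksum))   # False: one character differs
```

The last call shows the property stated in the definition: a one-character change to the content yields a different digest.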
Concept
RDF Class: skos:Concept
Definition: A controlled vocabulary term used to classify Catalog, Dataset, or Data Service.
Usage note
Following FAIR Vocabulary principles, Concept URIs should be made resolvable and accessible using SKOS encoding and provided in Linked Data format (RDF/XML, TTL, JSON-LD, NTriples).
Ensure FAIR Resolvability: Make Concept URIs resolvable using FAIR principles, allowing them to be Findable, Accessible, Interoperable, and Reusable. This ensures that skos:Concept instances can be easily discovered, accessed, integrated with other resources, and reused across the DCAT-US ecosystem, promoting data interoperability and accessibility.
To enhance data interoperability and consistency, it is advisable to reuse established controlled vocabularies such as Global Change Master Directory (GCMD) [[?GCMD]], Agrovoc, and NAICS for data description.
Rationale:
The inclusion of skos:Concept in DCAT-US 3.0 enhances semantic search in catalogs, enabling more accurate discovery of Catalogs, Datasets, and Data Services. It improves user experience, promotes data discoverability, and supports better resource utilization. Additionally, it aligns with international standards like SKOS, ensuring compatibility and adherence to recognized controlled vocabulary practices.
Properties Summary
Property | URI | Range | ReqLevel | Card
alternate label | skos:altLabel | rdfs:Literal | O | 0..n
definition | skos:definition | rdfs:Literal | R | 0..n
in scheme | skos:inScheme | skos:ConceptScheme | M | 1..1
notation | skos:notation | xsd:string | O | 0..n
preferred label | skos:prefLabel | rdfs:Literal | M | 1..n
Mandatory Properties
Property: preferred label
Requirement level: Mandatory
Cardinality: 1..n
URI: skos:prefLabel
Range: rdfs:Literal
Definition: Preferred label for the controlled vocabulary term (one per language).
Property: in scheme
Requirement level: Mandatory
Cardinality: 1
URI: skos:inScheme
Range: skos:ConceptScheme
Definition: Concept scheme defining the concept.
Recommended Properties
Property: definition
Requirement level: Recommended
Cardinality: 0..n
URI: skos:definition
Range: rdfs:Literal
Definition: Definition of the controlled vocabulary term.
Optional Properties
Property: alternate label
Requirement level: Optional
Cardinality: 0..n
URI: skos:altLabel
Range: rdfs:Literal
Definition: Alternative labels for a concept.
Property: notation
Requirement level: Optional
Cardinality: 0..n
URI: skos:notation
Range: xsd:string
Definition: Abbreviations or codes from code lists for an organization.
Example
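The following is an illustrative sketch rather than a normative example. The preferred label allows several values but only one per language, a constraint that a plain count cannot express. The sketch reports languages that carry more than one skos:prefLabel. The `duplicate_pref_label_languages` helper and the labels are hypothetical, and a label without a language tag is grouped under the empty string.

```python
from collections import Counter


def duplicate_pref_label_languages(pref_labels):
    """Return the sorted language tags that appear on more than one preferred label."""
    counts = Counter(label.get("@language", "") for label in pref_labels)
    return sorted(lang for lang, n in counts.items() if n > 1)


# Hypothetical labels for a single concept.
valid_labels = [
    {"@value": "Agriculture", "@language": "en"},
    {"@value": "Agricultura", "@language": "es"},
]
conflicting_labels = valid_labels + [{"@value": "Farming", "@language": "en"}]

print(duplicate_pref_label_languages(valid_labels))        # []
print(duplicate_pref_label_languages(conflicting_labels))  # ['en']
```

In the conflicting case, "Farming" would normally be moved to skos:altLabel, which has no per-language limit.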
Concept Scheme
RDF Class: skos:ConceptScheme
Definition: A concept collection (e.g. controlled vocabulary) in which a concept is defined.
Usage note
Following FAIR Vocabulary principles, Concept Scheme URIs should be made resolvable and accessible using SKOS encoding and provided in Linked Data format (RDF/XML, TTL, JSON-LD, NTriples).
To enhance data interoperability and consistency, it is advisable to reuse established controlled vocabularies such as Global Change Master Directory (GCMD) [[?GCMD]], Agrovoc, and NAICS for data description.
Rationale: The introduction of skos:ConceptScheme in DCAT-US 3.0 enhances data resource organization, categorization, and accessibility. It provides a structured framework for controlled vocabularies, aligning with FAIR Vocabulary principles for improved data interoperability and discoverability.
Properties
Property | URI | Range | ReqLevel | Card
title | dcterms:title | rdfs:Literal | M | 1..n
description | dcterms:description | rdfs:Literal | R | 0..n
creation date | dcterms:created | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1
publication date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1
update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1
version info | dcat:version | xsd:string | O | 0..1
Mandatory Properties
Property: title
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:title
Range: rdfs:Literal
Definition: The title of the concept scheme in the indicated language.
Usage note: Only one title per language.
Recommended Properties
Property: description
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Definition: This property contains a description of the Concept Scheme.
Usage note: May be repeated for translations in different languages.
Optional Properties
Property: creation date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:created
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: This property contains the date on which the Concept Scheme was first created.
Property: publication date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: This property contains the date of formal issuance (e.g., publication) of the Concept Scheme.
Property: update/modification date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: This property contains the most recent date on which the Concept Scheme was changed or modified.
Property: version info
Requirement level: Optional
Cardinality: 0..1
URI: dcat:version
Range: xsd:string
Definition: This property contains a version number or other version designation of the Concept Scheme.
Examples
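The following is a minimal sketch, not a normative example, of how a concept scheme description could be held as a JSON-LD-style Python dictionary and checked against the title rules above (mandatory, and only one title per language). The scheme URI, titles, dates, and the helper names `duplicate_title_languages` and `check_concept_scheme` are illustrative assumptions, not part of the profile.

```python
import json
from collections import Counter

# Hypothetical concept scheme; the @id, titles and dates are made up for illustration.
concept_scheme = {
    "@context": {
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "dcterms": "http://purl.org/dc/terms/",
        "dcat": "http://www.w3.org/ns/dcat#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
    },
    "@id": "https://example.gov/vocab/themes",
    "@type": "skos:ConceptScheme",
    "dcterms:title": [
        {"@value": "Example Theme Vocabulary", "@language": "en"},
        {"@value": "Vocabulario de temas de ejemplo", "@language": "es"},
    ],
    "dcterms:description": [
        {"@value": "Themes used to categorize datasets.", "@language": "en"},
    ],
    "dcterms:created": {"@value": "2023-05-01", "@type": "xsd:date"},
    "dcat:version": "1.0.0",
}


def duplicate_title_languages(scheme):
    """Return the language tags that carry more than one dcterms:title."""
    titles = scheme.get("dcterms:title", [])
    counts = Counter(t.get("@language", "") for t in titles)
    return sorted(lang for lang, n in counts.items() if n > 1)


def check_concept_scheme(scheme):
    """Collect violations of the title rules (1..n, one title per language)."""
    problems = []
    if not scheme.get("dcterms:title"):
        problems.append("dcterms:title is mandatory (1..n)")
    for lang in duplicate_title_languages(scheme):
        problems.append(f"more than one dcterms:title for language '{lang}'")
    return problems


print(json.dumps(concept_scheme, indent=2, ensure_ascii=False))
print(check_concept_scheme(concept_scheme))
```

A production pipeline would more likely express these constraints as SHACL shapes; the dictionary check is only meant to make the cardinality and language rules concrete.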
Contact
RDF Class: vcard:Kind
Definition: Point of Contact information
Rationale: The introduction of vcard:Kind in DCAT-US 3.0 is driven by the need for standardized, reliable, and interoperable Point of Contact information, ultimately improving the accessibility and usability of data resources within the DCAT-US ecosystem.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| formatted name | vcard:fn | xsd:string | M | 1 |
| email | vcard:hasEmail | rdfs:Resource | M | 1 |
| telephone | vcard:tel | rdfs:Resource | O | 0..1 |
| organization name | vcard:organization-name | xsd:string | O | 0..1 |
| family name | vcard:family-name | xsd:string | O | 0..1 |
| given name | vcard:given-name | xsd:string | O | 0..1 |
| position title | vcard:title | xsd:string | O | 0..1 |
| has uid | vcard:hasUID | xsd:string | O | 0..1 |
| address | vcard:address | vcard:Address | O | 0..n |
Mandatory Properties
Property: formatted name
Requirement level: Mandatory
Cardinality: 1
URI: vcard:fn
Range: xsd:string
Definition: The formatted text corresponding to the name of the contact.
Property: email
Requirement level: Mandatory
Cardinality: 1
URI: vcard:hasEmail
Range: rdfs:Resource
Definition: The email address of the contact.
Usage note: Use an email address tied to a function rather than to an individual (e.g., support). The email address should be formatted as a URL using the "mailto:" scheme.
Optional Properties
Property: telephone
Requirement level: Optional
Cardinality: 0..1
URI: vcard:tel
Range: rdfs:Resource
Definition: This property specifies the telephone number for telephony communication with the person or organization.
Property: organization name
Requirement level: Optional
Cardinality: 0..1
URI: vcard:organization-name
Range: xsd:string
Definition: This property specifies the name of the organization to contact.
Property: family name
Requirement level: Optional
Cardinality: 0..1
URI: vcard:family-name
Range: xsd:string
Definition: This property specifies the family name of the person to contact.
Property: given name
Requirement level: Optional
Cardinality: 0..1
URI: vcard:given-name
Range: xsd:string
Definition: This property specifies the given name of the person to contact.
Property: position title
Requirement level: Optional
Cardinality: 0..1
URI: vcard:title
Range: xsd:string
Definition: This property specifies the position or role of the person to contact.
Property: has UID
Requirement level: Optional
Cardinality: 0..1
URI: vcard:hasUID
Range: xsd:string
Definition: This property specifies a value that represents a globally unique identifier corresponding to the contact (it could also be used as a URI component of the contact).
Usage note: The hasUID property is used to assign a unique identifier to a contact associated with a dataset or catalog. This identifier, which is optional and should be a string, ensures that each contact can be distinctly recognized and referenced. The utility of this property is particularly evident in scenarios where contacts need to be uniquely identified across different datasets or catalogs, preventing any ambiguity. It can also serve as part of a URI for a contact, providing a consistent and resolvable identifier. Implementers are encouraged to use a globally unique string value, such as an ORCID or a URI that is guaranteed to be unique, to facilitate unambiguous identification and referencing of contacts.
Property: address
Requirement level: Optional
Cardinality: 0..n
URI: vcard:address
Range: vcard:Address
Definition: This property specifies the address of the contact.
Example
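As a minimal sketch under stated assumptions, the contact below is represented as a JSON-LD-style Python dictionary and checked for the two mandatory properties, including the "mailto:" form recommended for vcard:hasEmail. The contact values and the `check_contact` helper are hypothetical and only illustrate the rules listed above.

```python
from urllib.parse import urlparse

# Hypothetical contact; the name, address and organization are illustrative only.
contact = {
    "@type": "vcard:Kind",
    "vcard:fn": "Open Data Support Team",
    "vcard:hasEmail": {"@id": "mailto:support@example.gov"},
    "vcard:organization-name": "Example Agency",
}


def check_contact(c):
    """Return a list of problems with the mandatory contact properties."""
    problems = []

    # vcard:fn is mandatory (cardinality 1) and is a plain string.
    fn = c.get("vcard:fn")
    if not isinstance(fn, str) or not fn.strip():
        problems.append("vcard:fn is mandatory and must be a non-empty string")

    # vcard:hasEmail is mandatory and should be a resource with a mailto: IRI.
    email = c.get("vcard:hasEmail")
    iri = email.get("@id", "") if isinstance(email, dict) else ""
    parsed = urlparse(iri)
    if parsed.scheme != "mailto" or "@" not in parsed.path:
        problems.append("vcard:hasEmail should be an IRI using the mailto: scheme")

    return problems


print(check_contact(contact))
```

Modeling the email as a resource (`{"@id": ...}`) rather than a string reflects the rdfs:Resource range given in the summary table.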
CUI Restriction
Controlled Unclassified Information (CUI) is information that requires safeguarding or dissemination controls pursuant to and consistent with applicable law, regulations, and government-wide policies but is not classified.
RDF Class: dcat-us:CuiRestriction
Definition: Represents Controlled Unclassified Information (CUI), which is information that requires safeguarding or dissemination controls in accordance with applicable laws, regulations, and government-wide policies but is not classified as confidential.
Usage note
- The CUI Restriction class is designed to capture information related to Controlled Unclassified Information (CUI) in accordance with NARA guidelines.
- Users of this class must provide the mandatory properties, i.e., the CUI banner marking and designation indicator, to accurately describe the CUI status of a resource.
- The optional property, "required indicator per authority," allows for additional information or context about CUI restrictions, providing flexibility for specific use cases.
Rationale: The introduction of the dcat-us:CuiRestriction class in DCAT-US 3.0 is driven by the need for compliance with National Archives and Records Administration (NARA) guidelines regarding Controlled Unclassified Information (CUI). This addition ensures that DCAT-US aligns with NARA's standards, promotes transparency, facilitates compliance audits, and supports efficient resource management. Ultimately, it enhances data interoperability and security within the government data ecosystem.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| CUI banner marking | dcat-us:cuiBannerMarking | xsd:string | M | 1..1 |
| CUI designation indicator | dcat-us:designationIndicator | xsd:string | M | 1..1 |
| required indicator per authority | dcat-us:requiredIndicatorPerAuthority | xsd:string | O | 0..n |
Mandatory Properties
Property: CUI designation indicator
Requirement level: Mandatory
Cardinality: 1
URI: dcat-us:designationIndicator
Range: xsd:string
Definition: The designation indicator shows which agency designated the document as CUI.
Usage note: Free text per the NARA Marking Guidebook and DODI 5200.48 (should have at least "Controlled by:"). It is best practice to include contact information.
Optional Properties
Property: required indicator per authority
Requirement level: Optional
Cardinality: 0..n
URI: dcat-us:requiredIndicatorPerAuthority
Range: xsd:string
Definition: Free text (e.g., text of the category description or the distribution statement).
Example
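The sketch below is a hedged illustration of a CUI restriction held as a Python dictionary, with a check that both mandatory properties are present and a softer warning when the designation indicator lacks the "Controlled by:" text mentioned in the usage note. All values, including the banner marking string, and the `check_cui_restriction` helper are assumptions for illustration, not authoritative markings.

```python
# Hypothetical CUI restriction; every value below is illustrative only.
cui_restriction = {
    "@type": "dcat-us:CuiRestriction",
    "dcat-us:cuiBannerMarking": "CUI",
    "dcat-us:designationIndicator": (
        "Controlled by: Example Agency, Office of Data (data@example.gov)"
    ),
    "dcat-us:requiredIndicatorPerAuthority": [
        "Hypothetical distribution statement text",
    ],
}

# The two properties listed as mandatory in the summary table.
MANDATORY = ("dcat-us:cuiBannerMarking", "dcat-us:designationIndicator")


def check_cui_restriction(restriction):
    """Return (errors, warnings) for a CUI restriction description."""
    errors, warnings = [], []

    for prop in MANDATORY:
        value = restriction.get(prop)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{prop} is mandatory and must be a non-empty string")

    # The usage note says the indicator *should* contain "Controlled by:",
    # so a missing phrase is reported as a warning rather than an error.
    indicator = restriction.get("dcat-us:designationIndicator")
    if isinstance(indicator, str) and "Controlled by:" not in indicator:
        warnings.append('designation indicator should include "Controlled by:"')

    return errors, warnings


print(check_cui_restriction(cui_restriction))
```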
Data Service
RDF Class: dcat:DataService
Definition: A collection of operations that provides access to one or more datasets or data processing functions.
Sub-class of: dcat:Resource
Sub-class of: dctype:Service
Usage note: If a dcat:DataService is bound to one or more specified Datasets, they are indicated by the dcat:servesDataset property. The kind of service can be indicated using the dcterms:type property. Its value may be taken from a controlled vocabulary such as the Data.GOV spatial data service type code list [[?DATA-GOV-SDST]].
Rationale: Introducing dcat:DataService is essential as it clarifies the representation of data services, addressing the confusion caused by using dcat:Distribution to describe services in DCAT 1. This addition promotes clear communication of service-related information, improving discoverability, and facilitating seamless integration and usage by data consumers and applications.
| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| endpoint URL | dcat:endpointURL | rdfs:Resource | M | 1..n |
| contact point | dcat:contactPoint | vcard:Kind | M | 1..n |
| publisher | dcterms:publisher | foaf:Agent | M | 1..1 |
| title | dcterms:title | rdfs:Literal | M | 1..n |
| endpoint description | dcat:endpointDescription | rdfs:Resource | R | 0..n |
| license | dcterms:license | dcterms:LicenseDocument | R | 0..1 |
| serves dataset | dcat:servesDataset | dcat:Dataset | R | 0..n |
| keyword/tag | dcat:keyword | rdfs:Literal | O | 0..n |
| spatial resolution in meters | dcat:spatialResolutionInMeters | rdfs:Literal typed as xsd:decimal | O | 0..n |
| temporal resolution | dcat:temporalResolution | rdfs:Literal typed as xsd:duration | O | 0..n |
| theme/category | dcat:theme | skos:Concept | O | 0..n |
| access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n |
| creation date | dcterms:created | rdfs:Literal typed as xsd:date or xsd:dateTime | O | 0..1 |
| creator | dcterms:creator | dcterms:Agent | O | 0..n |
| description | dcterms:description | rdfs:Literal | O | 0..n |
| identifier | dcterms:identifier | rdfs:Literal | O | 0..n |
| language | dcterms:language | dcterms:LinguisticSystem | O | 0..n |
| update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 |
| rights | dcterms:rights | dcterms:RightsStatement | O | 0..n |
| rights holder | dcterms:rightsHolder | org:Organization | O | 0..n |
| spatial/geographic coverage | dcterms:spatial | dcterms:Location | O | 0..n |
| status | adms:status | skos:Concept | O | 0..1 |
| temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | O | 0..n |
| category | dcterms:type | skos:Concept | O | 0..1 |
| quality measurement | dqv:hasQualityMeasurement | dqv:QualityMeasurement | O | 0..n |
| qualified attribution | prov:qualifiedAttribution | prov:Attribution | O | 0..n |
| was used by | prov:wasUsedBy | prov:Activity | O | 0..n |
| geographic bounding box | dcat-us:geographicBoundingBox | dcat-us:GeographicBoundingBox | O | 0..n |
Mandatory Properties
Property: endpoint URL
Requirement level: Mandatory
Cardinality: 1..n
URI: dcat:endpointURL
Range: rdfs:Resource
Usage note: The root location or primary endpoint of the service (a Web-resolvable IRI).
Property: contact point
Requirement level: Mandatory
Cardinality: 1..n
URI: dcat:contactPoint
Range: vcard:Kind
Definition: This property contains contact information that can be used for sending comments about the Data Service.
Usage note
- This property MUST contain an email address that is continuously monitored by the data publisher.
- If there are several contributors involved in the publication of the Data Service, the property can be used multiple times.
Property: publisher
Requirement level: Mandatory
Cardinality: 1..1
URI: dcterms:publisher
Range: foaf:Agent
Definition: This property refers to an entity (organization) responsible for making the Data Service available.
Property: title
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:title
Range: rdfs:Literal
Usage note
- The title of the Data Service in the indicated language.
- This property can be repeated for parallel language versions of the title (see Multilingualism).
Recommended Properties
Property: endpoint description
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:endpointDescription
Definition: A description of the services available via the end-points, including their operations, parameters etc.
Domain: dcat:DataService
Range: rdfs:Resource
Usage note: The endpoint description gives specific details of the actual endpoint instances, while dcterms:conformsTo is used to indicate the general standard or specification that the endpoints implement. An endpoint description may be expressed in a machine-readable form, such as an OpenAPI (Swagger) description [[?OpenAPI]], an OGC GetCapabilities response [[?WFS]], [[?ISO-19142]], [[?WMS]], [[?ISO-19128]], a SPARQL Service Description [[?SPARQL11-SERVICE-DESCRIPTION]], an [[?OpenSearch]] or [[?WSDL20]] document, a Hydra API description [[?HYDRA]], or else in text or some other informal mode if a formal representation is not possible.
Property: license
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:license
Range: dcterms:LicenseDocument
Definition: This property refers to the license under which the Data Service is made available.
Usage note: CV to be used: [[?DATA-GOV-LICENSE]]
Property: serves dataset
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:servesDataset
Range: dcat:Dataset
Definition: The Dataset that is served by this data service.
Usage note: This property refers to a collection of data that this data service can distribute.
Optional Properties
Property: keyword/tag
Requirement level: Optional
Cardinality: 0..n
URI: dcat:keyword
Range: rdfs:Literal
Definition: This property contains a keyword or tag describing the Data Service.
Property: spatial resolution in meters
Requirement level: Optional
Cardinality: 0..n
URI: dcat:spatialResolutionInMeters
Range: rdfs:Literal typed as xsd:decimal
Definition: This property refers to the minimum spatial separation resolvable in a Data Service, measured in meters.
Property: temporal resolution
Requirement level: Optional
Cardinality: 0..n
URI: dcat:temporalResolution
Range: rdfs:Literal typed as xsd:duration
Definition: The minimum time period resolvable by the Data Service.
Property: theme/category
Requirement level: Optional
Cardinality: 0..n
URI: dcat:theme
Range: skos:Concept
Definition: This property refers to a theme of the Data Service. A Data Service may be associated with multiple themes.
Usage note: CV to be used: [[?DATA-GOV-THEME]]
Property: access rights
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accessRights
Range: dcterms:RightsStatement
Definition: This property MAY include information regarding access or restrictions based on privacy, security, or other policies.
Usage note: CV must be used: [[?DATA-GOV-AR]]
Property: conforms to
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:conformsTo
Range: dcterms:Standard
Definition: This property is used to indicate the general standard or specification that the Data Service endpoints implement.
Property: creation date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:created
Range: rdfs:Literal typed as xsd:date or xsd:dateTime
Definition: This property contains the date on which the Data Service was first created.
Property: creator
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:creator
Range: foaf:Agent
Usage note: This property refers to the Agent primarily responsible for producing the Data Service.
Property: description
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Definition: This property contains a free-text account of the Data Service.
Usage note: This property can be repeated for parallel language versions of the description (see Multilingualism). On the user interface of data portals, the content of the element whose language corresponds to the display language selected by the user is displayed.
Property: identifier
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:identifier
Range: rdfs:Literal
Definition: This property contains the main identifier for the Data Service, e.g. the URI or other unique identifier in the context of the Catalog.
Property: language
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:language
Range: dcterms:LinguisticSystem
Definition: This property refers to a language supported by the Data Service. This property can be repeated if multiple languages are supported in the Data Service.
Usage note
- Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
- This property can be repeated if the service is provided in multiple languages (e.g., a map service rendering maps in Spanish or English).
Property: update/modification date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: This property contains the most recent date on which the Data Service was changed or modified.
Property: rights
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:rights
Range: dcterms:RightsStatement
Definition: A statement that concerns all rights for the Data Service not addressed with dcterms:license or dcterms:accessRights, such as copyright statements.
Property: rights holder
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:rightsHolder
Range: org:Organization
Definition: This property refers to an Agent (organization) holding rights on the Data Service.
Property: spatial/geographic coverage
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:spatial
Range: dcterms:Location
Definition: This property refers to a geographic region that is covered by the Data Service.
Usage note: TO DISCUSS: Conventions to be used: The Vocabularies Named Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]]
Property: temporal coverage
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:temporal
Range: dcterms:PeriodOfTime
Definition: This property refers to a temporal period that the Data Service covers.
Property: category
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Definition: Category of the data service.
Usage note: This property SHOULD take as value one of the URIs of a concept defined in a service type taxonomy or code list.
Property: quality measurement
Requirement level: Optional
Cardinality: 0..n
URI: dqv:hasQualityMeasurement
Range: dqv:QualityMeasurement
Definition: Refers to the performed quality measurements. It represents the evaluation of a given dataset against a specific quality metric.
Usage note: Use for quality measurements of data services (availability, response time, reliability).
Property: qualified attribution
Requirement level: Optional
Cardinality: 0..n
URI: prov:qualifiedAttribution
Range: prov:Attribution
Definition: This property refers to a link to an Agent having some form of responsibility for the Data Service.
Property: status
Requirement level: Optional
Cardinality: 0..1
URI: adms:status
Range: skos:Concept
Usage note: This property refers to the maturity of the Data Service. It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.
Property: was used by
Requirement level: Optional
Cardinality: 0..n
URI: prov:wasUsedBy
Range: prov:Activity
Definition: This property refers to an Activity that used the Data Service.
Usage note: This property MAY be used to specify a testing Activity over a Data Service, against a given Standard, producing as output a conformance degree.
Property: geographic bounding box
Requirement level: Optional
Cardinality: 0..n
URI: dcat-us:geographicBoundingBox
Range: dcat-us:GeographicBoundingBox
Definition: This property describes the spatial extent of the domain of application of a data service, standardized in the WGS 84 Lat/Long coordinate system.
Example
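As a rough sketch, the cardinalities from the Data Service summary table can be encoded as (minimum, maximum) pairs and checked against a JSON-LD-style dictionary. The service description, the subset of rules chosen, and the helpers `as_list` and `check_cardinality` are assumptions made for illustration; the profile itself relies on SHACL for validation.

```python
# Hypothetical Data Service description; all URLs and names are illustrative.
data_service = {
    "@type": "dcat:DataService",
    "dcterms:title": [{"@value": "Example Feature Service", "@language": "en"}],
    "dcat:endpointURL": [{"@id": "https://api.example.gov/features"}],
    "dcat:contactPoint": [
        {
            "@type": "vcard:Kind",
            "vcard:fn": "API Support",
            "vcard:hasEmail": {"@id": "mailto:api@example.gov"},
        }
    ],
    "dcterms:publisher": {"@id": "https://example.gov/agency"},
}

# (min, max) per property, taken from the summary table; None means unbounded (n).
RULES = {
    "dcat:endpointURL": (1, None),
    "dcat:contactPoint": (1, None),
    "dcterms:publisher": (1, 1),
    "dcterms:title": (1, None),
    "dcterms:license": (0, 1),
    "dcterms:accessRights": (0, 1),
}


def as_list(value):
    """Normalize a missing, single, or repeated property value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_cardinality(resource, rules):
    """Return one message per property whose value count is out of range."""
    problems = []
    for prop, (lo, hi) in rules.items():
        n = len(as_list(resource.get(prop)))
        if n < lo:
            problems.append(f"{prop}: expected at least {lo}, found {n}")
        elif hi is not None and n > hi:
            problems.append(f"{prop}: expected at most {hi}, found {n}")
    return problems


print(check_cardinality(data_service, RULES))
```

Only a few properties are listed in `RULES` to keep the sketch short; the remaining rows of the table would follow the same pattern.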
Dataset
A Dataset is a collection of data, published or curated by a single source and related by a common idea or concept. In contrast to a Data Service, a Dataset is expected to be a collection of data that is available for access or download in one or more formats, as Distributions. Distributions belonging to the same Dataset should not differ with regard to the idea of the data that they represent. They may differ with regard to the physical representation of the data, such as format or resolution, or they may split the data of the dataset into portions of comparable size, such as data per time period or location.
DCAT 3 provides guidelines about the usage of Data Services and Distributions in relation to Datasets [[VOCAB-DCAT-3]].
RDF Class: dcat:Dataset
Definition: A collection of data, published or curated by a single agent, and available for access or download in one or more representations.
Subclass Of: dcat:Resource
Usage note
- This class describes the conceptual dataset. One or more representations might be available, with differing schematic layouts and formats or serializations.
- This class describes the actual dataset as published by the dataset provider. In cases where a distinction between the actual dataset and its entry in the catalog is necessary (because metadata such as modification date might differ), the dcat:CatalogRecord class can be used for the latter.
- The notion of dataset in DCAT is broad and inclusive, with the intention of accommodating resource types arising from all communities. Data comes in many forms including numbers, text, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset.
Rationale: The update of dcat:Dataset is crucial as it aligns the DCAT profile with international standards, offering a standardized and widely recognized way to describe datasets. This alignment enhances data interoperability and discoverability, enabling data publishers to provide structured metadata, improving data sharing, and facilitating seamless integration for users and applications.

| Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1 |
|---|---|---|---|---|---|
| title | dcterms:title | rdfs:Literal | M | 1..n | Multilingual support |
| description | dcterms:description | rdfs:Literal | M | 1..n | Multilingual support |
| contact point | dcat:contactPoint | vcard:Kind | R | 0..n | No Change |
| data dictionary | dcat-us:describedBy | dcat:Distribution | R | 0..1 | Fixed |
| dataset distribution | dcat:distribution | dcat:Distribution | R | 0..n | No Change |
| identifier | dcterms:identifier | rdfs:Literal | R | 0..n | Fixed |
| spatial/geographic coverage | dcterms:spatial | dcterms:Location | R | 0..n | Fixed |
| keyword/tag | dcat:keyword | rdfs:Literal | R | 0..n | No Change |
| landing page | dcat:landingPage | foaf:Document | R | 0..n | No Change |
| update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | No Change |
| publisher | dcterms:publisher | foaf:Agent | R | 0..1 | No Change |
| geographic bounding box | dcat-us:geographicBoundingBox | dcat-us:GeographicBoundingBox | R | 0..n | New |
| temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | R | 0..n | Fixed |
| theme/category | dcat:theme | skos:Concept | R | 0..n | Fixed |
| access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 | Aligned |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n | No Change |
| contributor | dcterms:contributor | dcterms:Agent | O | 0..n | New |
| creator | dcterms:creator | dcterms:Agent | O | 0..n | Aligned |
| documentation | foaf:page | foaf:Document | O | 0..n | New |
| frequency | dcterms:accrualPeriodicity | dcterms:Frequency | O | 0..1 | Fixed |
| has version | dcat:hasVersion | dcat:Dataset | O | 0..n | Aligned |
| image | schema:image | schema:url or schema:ImageObject | O | 0..n | New |
| inSeries | dcat:inSeries | dcat:DatasetSeries | O | 0..n | Aligned |
| is referenced by | dcterms:isReferencedBy | rdfs:Resource | O | 0..n | Aligned |
| language | dcterms:language | dcterms:LinguisticSystem | O | 0..n | Fixed |
| liability statement | dcat-us:liabilityStatement | dcat-us:LiabilityStatement | O | 0..1 | New |
| | dcat-us:metadataDistribution | dcat:Distribution | O | 0..n | New |
| next | dcat:next | dcat:Dataset | O | 0..1 | Aligned |
| other identifier | adms:identifier | adms:Identifier | O | 0..n | New |
| | dcat-us:purpose | rdfs:Literal | O | 0..n | New |
| | dcat:prev | dcat:Dataset | O | 0..1 | Aligned |
| provenance | dcterms:provenance | dcterms:ProvenanceStatement | O | 0..n | New |
| qualified attribution | prov:qualifiedAttribution | prov:Attribution | O | 0..n | Aligned |
| qualified relation | dcat:qualifiedRelation | dcat:Relationship | O | 0..n | Aligned |
| related resource | dcterms:relation | rdfs:Resource | O | 0..n | Aligned |
| release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 | No Change |
| rights | dcterms:rights | dcterms:RightsStatement | O | 0..n | Fixed |
| sample | adms:sample | dcat:Distribution | O | 0..n | New |
| | skos:scopeNote | rdfs:Literal | O | 0..n | New |
| | dcterms:source | dcat:Dataset | O | 0..n | New |
| status | adms:status | skos:Concept | O | 0..1 | Aligned |
| subject | dcterms:subject | skos:Concept | O | 0..n | New |
| quality measurement | dqv:hasQualityMeasurement | dqv:QualityMeasurement | O | 0..n | Aligned |
| spatial resolution in meters | dcat:spatialResolutionInMeters | rdfs:Literal (typed as xsd:decimal) | O | 0..n | Aligned |
| temporal resolution | dcat:temporalResolution | rdfs:Literal (typed as xsd:duration) | O | 0..n | Aligned |
| category | dcterms:type | skos:Concept | O | 0..1 | Aligned |
| version | dcat:version | rdfs:Literal | O | 0..n | Aligned |
| version notes | adms:versionNotes | rdfs:Literal | O | 0..n | New |
| | prov:wasGeneratedBy | prov:Activity | O | 0..n | New |
Mandatory Properties
Property: title
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:title
Range: rdfs:Literal
Definition: This property contains a name given to the Dataset.
Usage note: This property can be repeated for parallel language versions of the title (see Multilingualism).
Property: description
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:description
Range: rdfs:Literal
Definition: This property contains a free-text account of the Dataset.
Usage note: This property can be repeated for parallel language versions of the description (see Multilingualism). On the user interface of data portals, the content of the element whose language corresponds to the display language selected by the user is displayed.
Recommended Properties
Property: contact point
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:contactPoint
Range: vcard:Kind
Usage note
- This property contains contact information that can be used for sending comments about the Dataset.
- This property MUST contain an email address that is continuously monitored by the data publisher.
- If there are several contributors involved in the publication of the Dataset, the property can be used multiple times.
Property: dataset distribution
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:distribution
Range: dcat:Distribution
Usage note
- This property links the Dataset to an available Distribution.
- In exceptional cases, a Dataset for which no distribution form exists (yet) can be described in the Catalog. In this case, the element dcat:distribution may be omitted.
Property: identifier
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:identifier
Range: rdfs:Literal
Usage note
- This property contains a unique identifier for the Dataset, e.g. a URI or other unique identifier in the context of the Catalog.
Property: spatial/geographic coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:spatial
Range: dcterms:Location
Usage note
- This property refers to a geographic region that is covered by the Dataset.
- CV to be used: The Vocabularies Named Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]]
Property: keyword/tag
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:keyword
Range: rdfs:Literal
Usage note
- This property contains a keyword or tag describing the Dataset.
- Good practice: mark the language of the keywords with the [[ISO 639-1]] language code such as "geodata"@en.
Property: landing page
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:landingPage
Range: foaf:Document
Usage note
- This property refers to a web page that provides access to the Dataset, its Distributions and/or additional information.
- It is intended to point to a landing page at the original data provider, not to a page on a site of a third party, such as an aggregator.
Property: update/modification date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
- This property contains the most recent date on which the Dataset was changed or modified.
- No value may indicate that the Dataset has never changed after its initial publication, or that the date of the last modification is not known, or that the Dataset is continuously updated.
- This property MUST only be set if the distributions (the actual data) that the Dataset describes have been updated after it has been issued. In this case the property MUST contain the date of the last update. That way a person or institution using the data for an analysis or application will know when to update the report or application on their side.
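Because date properties such as dcterms:modified may be typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth, a publisher needs to pick the datatype that matches the value's lexical form. The sketch below is a simplified, assumption-laden illustration: the regular expressions cover only the common lexical shapes (no negative years, no range checks on months or days), and `xsd_type_for` is a hypothetical helper name.

```python
import re

# Optional timezone suffix shared by all four datatypes.
_TZ = r"(Z|[+-]\d{2}:\d{2})?"

# Ordered from most to least specific; only the lexical shape is checked,
# not whether the month, day or time values are in range.
PATTERNS = [
    ("xsd:dateTime", re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ)),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}" + _TZ)),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}" + _TZ)),
    ("xsd:gYear", re.compile(r"\d{4}" + _TZ)),
]


def xsd_type_for(value):
    """Return the matching xsd datatype name for a date string, or None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None


def typed_literal(value):
    """Build a JSON-LD-style typed literal, or raise if the form is unsupported."""
    datatype = xsd_type_for(value)
    if datatype is None:
        raise ValueError(f"unsupported date form: {value!r}")
    return {"@value": value, "@type": datatype}


for sample in ["2024", "2024-03", "2024-03-15", "2024-03-15T10:30:00Z"]:
    print(typed_literal(sample))
```

A value such as `03/15/2024` matches none of the patterns and would need to be normalized before publication.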
Property: publisher
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:publisher
Range: foaf:Agent
Usage note: This property refers to an entity (organization) responsible for making the Dataset available.
Property: geographic bounding box
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:geographicBoundingBox
Range: dcat-us:GeographicBoundingBox
Definition: A geographic bounding box in the WGS 84 coordinate system (Lat/Long) that describes the spatial extent of the dataset.
Usage note: A dataset can have multiple geographic bounding boxes (for example, the continental US and Alaska). The goal of having a geographic bounding box is to provide a common coordinate reference system to describe the spatial extent of the dataset.
Property: temporal coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:temporal
Range: dcterms:PeriodOfTime
Usage note: The temporal coverage of a dataset may be encoded as an instance of dcterms:PeriodOfTime, or may be indicated using an IRI reference (link) to a resource describing a time period.
Property: theme/category
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:theme
Range: skos:Concept
Usage note
- This property refers to a category of the Dataset. A Dataset may be associated with multiple themes.
- CV to be used: [[?DATA-GOV-THEME]]
Optional Properties
Property: access rights
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accessRights
Range: dcterms:RightsStatement
Usage note
- This property refers to information that indicates whether the Dataset is open data, has access restrictions or is not public.
- CV to be used: [[?DATA-GOV-AR]].
Property: conforms to
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:conformsTo
Range: dcterms:Standard
Usage note
- This property refers to an implementing rule or other specification.
- This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a Dataset conforms to. This is (generally) a complementary concern to the media-type or format.
Property: contributor
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:contributor
Range: foaf:Agent
Usage note: This property refers to an agent contributing to the Dataset.
Property: creator
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:creator
Range: foaf:Agent
Usage note: This property refers to an entity responsible for producing the dataset.
Property: data dictionary
Requirement level: Recommended
Cardinality: 0..1
URI: dcat-us:describedBy
Range: dcat:Distribution
Usage note: This is used to specify a data dictionary or schema that defines fields (variables, dimensions, measures, attributes) in the dataset.
Property: documentation
Requirement level: Optional
Cardinality: 0..n
URI: foaf:page
Range: foaf:Document
Usage note: This property refers to a page or document about this Dataset.
Property: frequency
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accrualPeriodicity
Range: dcterms:Frequency
Usage note
- This property refers to the frequency at which the Dataset is updated.
- CV to be used: [[CLD-FREQ]].
Property: quality measurement
Requirement level: Optional
Cardinality: 0..n
URI: dqv:hasQualityMeasurement
Range: dqv:QualityMeasurement
Definition: Refers to the performed quality measurements. It represents the evaluation of a given dataset against a specific quality metric.
Usage note: Use for quality measurements other than spatial resolution in meters (use dcat:spatialResolutionInMeters for that). Examples of quality measurements include completeness, accuracy, timeliness, and granularity.
Property: has version
URIdcat:hasVersion
Definition:
This resource has a more specific, versioned resource [[?PAV]].
Equivalent property:pav:hasVersion
Sub-property of:dcterms:hasVersion
Sub-property of:prov:generalizationOf
Usage note
A related Dataset that is a version, edition, or adaptation of the described Dataset.
Property: inSeries
Requirement level: Optional
Cardinality: 0..n
URI: dcat:inSeries
Range: dcat:DatasetSeries
Definition
A dataset series of which the dataset is part.
Usage note
Datasets are linked to a dataset series using the property dcat:inSeries. Note that a dataset series can also be hierarchical: a dataset series can be a member of another dataset series.
Property: is referenced by
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:isReferencedBy
Range: rdfs:Resource
Usage note
This property is about a related resource, such as a publication, that references,
cites, or otherwise points to the Dataset.
Property: language
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:language
Range: dcterms:LinguisticSystem
Definition
A language of the dataset. This refers to the natural language used for textual metadata (i.e., titles, descriptions, etc.) of a dataset.
Usage note
- Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
- The value(s) provided for members of a catalog (i.e., dataset or service) override the value(s) provided for the catalog if they conflict.
- If representations of a dataset are available for each language separately, define an instance of dcat:Distribution for each language and describe the specific language of each distribution using dcterms:language (i.e., the dataset will have multiple dcterms:language values and each distribution will have just one as the value of its dcterms:language property). In case of multilingual distributions, the distributions will have multiple dcterms:language values.
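The relationship between distribution-level and dataset-level language values can be sketched as follows. This is a minimal illustration, assuming distributions are held as plain Python dicts keyed by prefixed property names; the helper name `dataset_languages`, the titles, and the example language URIs are illustrative assumptions, not part of DCAT-US.

```python
def dataset_languages(distributions):
    """Collect the distinct dcterms:language values across distributions.

    Each distribution is a dict; its dcterms:language entry may be a single
    value or a list (for a multilingual distribution). Order of first
    appearance is preserved.
    """
    seen = []
    for dist in distributions:
        values = dist.get("dcterms:language", [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            if value not in seen:
                seen.append(value)
    return seen


# Illustrative language URIs (assumed form of the Library of Congress list).
EN = "http://id.loc.gov/vocabulary/iso639-1/en"
ES = "http://id.loc.gov/vocabulary/iso639-1/es"

distributions = [
    {"dcterms:title": "Report (English)", "dcterms:language": EN},
    {"dcterms:title": "Informe (español)", "dcterms:language": ES},
    {"dcterms:title": "Bilingual table", "dcterms:language": [EN, ES]},
]

# The dataset carries every language used by at least one distribution.
dataset = {"dcterms:language": dataset_languages(distributions)}
```

Deriving the dataset-level list from the distributions keeps the two levels from drifting apart when a new language version is added.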
Property: next
Requirement level: Optional
Cardinality: 0..1
URI: dcat:next
Range: dcat:Dataset
Definition
The following resource (after the current one) in an ordered collection or series of resources.
Property: other identifier
Requirement level: Optional
Cardinality: 0..n
URI: adms:identifier
Range: adms:Identifier
Usage note
A secondary identifier of the Dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID.
Property: prev
Requirement level: Optional
Cardinality: 0..1
URI: dcat:prev
Range: dcat:Dataset
Definition
The previous resource (before the current one) in an ordered collection or series of resources.
Usage note
Unless the dataset is the first in the chain, a dataset in a collection must have a previous one.
Property: provenance
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:provenance
Range: dcterms:ProvenanceStatement
Definition
- A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation.
Usage note
This property contains a statement about the lineage of a Dataset.
Property: qualified attribution
Requirement level: Optional
Cardinality: 0..n
URI: prov:qualifiedAttribution
Range: prov:Attribution
Usage note
This property refers to a link to an Agent having some form of responsibility for the
resource.
Property: qualified relation
Requirement level: Optional
Cardinality: 0..n
URI: dcat:qualifiedRelation
Range: dcat:Relationship
Usage note
This property provides a link to a description of a relationship with another resource; it is especially meant for relationships between Datasets.
It replaces the property rdfs:seeAlso of DCAT-US v1.
For examples of how to use it, see dcat:qualifiedRelation.
Property: release date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
This property contains the date of formal issuance (e.g., first publication) of the Dataset.
If this date is not known, the date of the first referencing of the data collection in the Catalog can be entered.
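As a minimal sketch of how a publisher might pick among the four permitted datatypes, the hypothetical helper below matches a lexical value against simplified patterns. The name `xsd_date_datatype` and the patterns are assumptions made for illustration; they are not defined by DCAT-US, and they do not check month or day ranges.

```python
import re

# Simplified lexical patterns. The full XSD grammar also allows negative
# years and years with more than four digits, which this sketch ignores.
_TZ = r"(Z|[+-]\d{2}:\d{2})?"
_PATTERNS = [
    ("xsd:dateTime",
     re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ)),
    ("xsd:date", re.compile(r"\d{4}-\d{2}-\d{2}" + _TZ)),
    ("xsd:gYearMonth", re.compile(r"\d{4}-\d{2}" + _TZ)),
    ("xsd:gYear", re.compile(r"\d{4}" + _TZ)),
]


def xsd_date_datatype(value):
    """Return the matching XSD datatype name, or None if nothing matches."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return name
    return None
```

A value that returns `None` (for instance a free-text date such as "April 2020") would need to be normalized before it is published as dcterms:issued.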
Property: rights
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:rights
Range: dcterms:RightsStatement
Usage note
This property refers to a statement that specifies copyrights associated with the Dataset.
Property: sample
Requirement level: Optional
Cardinality: 0..n
URI: adms:sample
Range: dcat:Distribution
Definition
- Links to a sample of a Dataset, which is a dcat:Distribution.
Property: usage note
Requirement level: Optional
Cardinality: 0..n
URI: skos:scopeNote
Property: source
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:source
Range: dcat:Dataset
Usage note
A related Dataset from which the described Dataset is derived.
Property: subject
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:subject
Range: skos:Concept
Definition
Primary subjects of the Dataset.
Usage note
Primary subjects of the Dataset, defined in a controlled vocabulary. Subjects are typically narrower in meaning than dcat:theme.
Property: status
Requirement level: Optional
Cardinality: 0..1
URI: adms:status
Range: skos:Concept
Usage note
This property refers to the maturity of the Dataset. It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.
Property: spatial resolution in meters
Requirement level: Optional
Cardinality: 0..n
URI: dcat:spatialResolutionInMeters
Range: rdfs:Literal (typed as xsd:decimal)
Usage note
- If the dataset is an image or grid this should correspond to the spacing of items. For other kinds of spatial datasets, this property will usually indicate the smallest distance between items in the dataset.
- The range of this property is a decimal number representing a length in meters. This is intended to provide a summary indication of the spatial resolution of the data as a single number. More complex descriptions of various aspects of spatial precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [VOCAB-DQV].
Property: temporal resolution
Requirement level: Optional
Cardinality: 0..n
URI: dcat:temporalResolution
Range: rdfs:Literal (typed as xsd:duration)
Usage note
- If the dataset is a time-series this should correspond to the spacing of items in the series. For other kinds of dataset, this property will usually indicate the smallest time difference between items in the dataset.
- This is intended to provide a summary indication of the temporal resolution of the dataset as a single value. More complex descriptions of various aspects of temporal precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [VOCAB-DQV].
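Because the value is typed as xsd:duration, a publisher may want to check the lexical form before emitting it. The following is a hedged sketch: the function name `is_xsd_duration` is hypothetical, and the regular expression is a simplified reading of the xsd:duration lexical form (sign, `P`, optional date components, optional `T` followed by time components), not a normative validator.

```python
import re

# -?P, then optional years/months/days, then an optional T section with
# hours/minutes/seconds. The lookaheads require at least one component
# overall, and at least one time component after T.
_DURATION = re.compile(
    r"-?P(?=\d|T\d)"
    r"(\d+Y)?(\d+M)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)


def is_xsd_duration(value):
    """Return True if value looks like a valid xsd:duration literal."""
    return _DURATION.fullmatch(value) is not None
```

Under this reading, a daily time series would carry `P1D` and a 15-minute sensor feed `PT15M`; note that `P1M` means one month while `PT1M` means one minute.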
Property: category
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Usage note
- A type of the Dataset.
- A recommended controlled vocabulary data-type is foreseen.
Property: version
Requirement level: Optional
Cardinality: 0..n
URI: dcat:version
Range: rdfs:Literal
Usage note
The version indicator (name or identifier) of a resource.
Property: version notes
Requirement level: Optional
Cardinality: 0..n
URI: adms:versionNotes
Range: rdfs:Literal
Usage note
A description of the differences between this version and a previous version of the Dataset.
This property can be repeated for parallel language versions of the version notes.
Property: was generated by
Requirement level: Optional
Cardinality: 0..n
URI: prov:wasGeneratedBy
Range: prov:Activity
Usage note
An activity that generated, or provides the business context for, the creation of the dataset.
Property: metadata distribution
Requirement level: Optional
Cardinality: 0..n
URI: dcat-us:metadataDistribution
Range: dcat:Distribution
Definition
A metadata document distribution from which this dataset is derived.
Usage note
A distribution of the original metadata document from which the dataset is derived.
Property: liability statement
Requirement level: Optional
Cardinality: 0..1
URI: dcat-us:liabilityStatement
Range: dcat-us:LiabilityStatement
Usage note
A liability statement about the dataset.
Property: purpose
Requirement level: Optional
Cardinality: 0..n
URI: dcat-us:purpose
Range: rdfs:Literal
Usage note
The purpose of the dataset.
Property: image
Requirement level: Optional
Cardinality: 0..3
URI: schema:image
Range: schema:url or schema:ImageObject
Definition
A thumbnail picture illustrating the content of the dataset.
Usage note
For distributions that consist of visual content (photographs, videos, maps, etc.) it makes sense to add a limited number of thumbnails to the metadata.
It’s a DCAT-US Custom Class
Dataset Series
The DatasetSeries concept in the DCAT-US specification serves a dual purpose. Primarily, it represents a collection of related datasets that share common characteristics and are published as a series, facilitating the organization and discovery of datasets that evolve over time or are updated regularly. Beyond this, DatasetSeries also provides a mechanism for grouping datasets into thematic collections, regardless of whether these collections form a temporal series. This flexibility enhances the specification's utility by supporting a wider range of data publication practices, enabling users to effectively discover and understand datasets grouped by series or thematic similarity.
RDF Class: dcat:DatasetSeries
Definition: A collection of datasets that are published separately, but share some characteristics that group them.
Subclass Of: dcat:Dataset
Usage note
- Dataset series can be also soft-typed via property dcterms:type as in the approach used in [[?GeoDCAT-AP]]
- Common scenarios for dataset series include: time series composed of periodically released subsets; map-series composed of items of the same type or theme but with differing spatial footprints.
Rationale: Incorporating dcat:DatasetSeries is essential to enable the structured grouping and presentation of related datasets, ensuring that data publishers can convey meaningful collections of data. This facilitates efficient data organization and discovery for users, aligning the DCAT profile with international standards for dataset series representation.

| Property | URI | Range | ReqLevel | Card |
|---|---|---|---|---|
| title | dcterms:title | rdfs:Literal | M | 1..n |
| description | dcterms:description | rdfs:Literal | M | 1..n |
| contact point | dcat:contactPoint | vcard:Kind | R | 0..n |
| first | dcat:first | dcat:Dataset | R | 0..1 |
| geographic bounding box | dcat-us:geographicBoundingBox | dcat-us:GeographicBoundingBox | R | 0..n |
| spatial/geographic coverage | dcterms:spatial | dcterms:Location | R | 0..n |
| last | dcat:last | dcat:Dataset | R | 0..1 |
| update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |
| publisher | dcterms:publisher | foaf:Agent | R | 0..1 |
| series member | dcat:seriesMember | dcat:Dataset | R | 0..1 |
| temporal coverage | dcterms:temporal | dcterms:PeriodOfTime | R | 0..n |
| frequency | dcterms:accrualPeriodicity | dcterms:Frequency | O | 0..1 |
| release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 |
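To show how the ReqLevel and Card columns of the property table can be read, the sketch below checks a metadata record against a few of its rows. It assumes records are plain Python dicts keyed by prefixed property names; `SERIES_CARDINALITY` and `check_cardinality` are hypothetical names, and only a subset of the rows is included. The profile itself relies on SHACL for validation, so this only illustrates the idea.

```python
# (min, max) per property; max=None means unbounded ("n").
# Only a subset of the table rows is included in this sketch.
SERIES_CARDINALITY = {
    "dcterms:title": (1, None),
    "dcterms:description": (1, None),
    "dcat:first": (0, 1),
    "dcat:last": (0, 1),
    "dcterms:accrualPeriodicity": (0, 1),
}


def check_cardinality(record, rules=SERIES_CARDINALITY):
    """Return a list of human-readable cardinality violations."""
    problems = []
    for prop, (low, high) in rules.items():
        values = record.get(prop, [])
        if not isinstance(values, list):
            values = [values]  # a single value counts as one occurrence
        count = len(values)
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems
```

A record with a title but no description would yield a single message naming dcterms:description, which mirrors the mandatory (M, 1..n) row in the table.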
Mandatory Properties
Property: title
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:title
Range: rdfs:Literal
Usage note
- This property contains a name given to the Dataset Series.
- This property can be repeated for parallel language versions of the name (see Multilingualism).
Property: description
Requirement level: Mandatory
Cardinality: 1..n
URI: dcterms:description
Range: rdfs:Literal
Usage note
- This property contains a free-text account of the Dataset Series.
- This property can be repeated for parallel language versions of the description (see Multilingualism). It is recommended to indicate the dimensions along which the Dataset Series evolves.
Recommended Properties
Property: contact point
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:contactPoint
Range: vcard:Kind
Usage note
Contact information that can be used for sending comments about the Dataset Series.
Property: first
Requirement level: Recommended
Cardinality: 0..1
URI: dcat:first
Range: dcat:Dataset
Usage note
The first resource in an ordered collection or series of resources, to which the current
resource belongs.
Property: geographic bounding box
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:geographicBoundingBox
Range: dcat-us:GeographicBoundingBox
Definition
A geographic bounding box in the WGS 84 coordinate system (latitude/longitude) that describes the spatial extent of the dataset series.
Usage note
A dataset series can have multiple geographic bounding boxes (for example, the continental US and Alaska). The goal of the geographic bounding box is to provide a common coordinate reference system for describing the spatial extent of the dataset series.
Property: spatial/geographic coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:spatial
Range: dcterms:Location
Usage note
This property refers to a geographic region that is covered by the Dataset Series.
When spatial coverage is a dimension in the dataset series then the spatial coverage of each dataset in the collection should be part of the spatial coverage. In that case, an open ended value is recommended, e.g. EU or a broad bounding box covering the expected values.
CV to be used: The Vocabularies Name Authority Lists MUST be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs MUST be used: [[?DATA-GOV-CONT]], [[?DATA-GOV-COUNTRY]], [[?DATA-GOV-PLACE]], [[GEONAMES]]
Property: last
Requirement level: Recommended
Cardinality: 0..1
URI: dcat:last
Range: dcat:Dataset
Usage note
The last resource in an ordered collection or series of resources.
Property: update/modification date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
This property contains the most recent date on which the Dataset Series was changed or modified.
No value may indicate that the Dataset Series has never changed after its initial publication, or that the date of the last modification is not known, or that the Dataset Series is continuously updated.
This date is not necessarily equal to the modification date of the most recently modified dataset in the series.
Property: publisher
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:publisher
Range: foaf:Agent
Usage note
This property refers to an entity (organization) responsible for ensuring the coherency of the Dataset Series.
The publisher of the dataset series may not be the publisher of all datasets. E.g. a digital archive could take over the publishing of older datasets in the series.
Property: series member
Requirement level: Recommended
Cardinality: 0..n
URI: dcat:seriesMember
Range: dcat:Dataset
Usage note
A member of the Dataset Series.
Property: temporal coverage
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:temporal
Range: dcterms:PeriodOfTime
Usage note
- A temporal period that the Dataset Series covers.
- When temporal coverage is a dimension in the dataset series then the temporal coverage of each dataset in the collection should be part of the temporal coverage. In that case, an open ended value is recommended, e.g. after 2012.
- The temporal coverage of a dataset series may be encoded as an instance of dcterms:PeriodOfTime, or may be indicated using an IRI reference (link) to a resource describing a time period.
Optional Properties
Property: frequency
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accrualPeriodicity
Range: dcterms:Frequency
Usage note
This property refers to the frequency at which the Dataset Series is updated.
The frequency of a dataset series is not equal to the frequency of the dataset in the collection.
CV to be used: [[CLD-FREQ]].
Property: release date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
This property contains the date of formal issuance (e.g., publication) of the Dataset Series.
This is the moment when the Dataset Series was established as a managed resource. It is not equal to the release date of the oldest dataset in the collection of the dataset series.
Example
In this example, ex:populationCensus represents a series of datasets related to the US Population Census Data, which is issued every 10 years (decennial). Individual datasets for specific years (e.g., ex:populationCensus-1950) are also defined, each pointing to the next dataset in the series using dcat:next.
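A minimal sketch of that linking pattern, assuming the series and its members are held as JSON-LD-like Python dicts: the helper name `link_series` and the 1960 and 1970 identifiers are hypothetical, and the prefixes (`ex:`, `dcat:`) are assumed to be declared in a context elsewhere.

```python
def link_series(series_id, member_ids):
    """Build a series node and member nodes with ordering links.

    member_ids must already be in series order (oldest first).
    """
    series = {
        "@id": series_id,
        "@type": "dcat:DatasetSeries",
        "dcat:seriesMember": [{"@id": m} for m in member_ids],
    }
    if member_ids:
        series["dcat:first"] = {"@id": member_ids[0]}
        series["dcat:last"] = {"@id": member_ids[-1]}

    members = []
    for i, member_id in enumerate(member_ids):
        node = {
            "@id": member_id,
            "@type": "dcat:Dataset",
            "dcat:inSeries": {"@id": series_id},
        }
        if i > 0:  # every member except the first has a predecessor
            node["dcat:prev"] = {"@id": member_ids[i - 1]}
        if i < len(member_ids) - 1:  # every member except the last has a successor
            node["dcat:next"] = {"@id": member_ids[i + 1]}
        members.append(node)
    return series, members


series, members = link_series(
    "ex:populationCensus",
    ["ex:populationCensus-1950",
     "ex:populationCensus-1960",   # hypothetical identifier
     "ex:populationCensus-1970"],  # hypothetical identifier
)
```

Generating the links from one ordered list keeps dcat:first, dcat:last, dcat:prev and dcat:next consistent with each other when a new decennial dataset is appended.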
Distribution
In the context of the DCAT-US profile, a metadata entry of this class serves to characterize a distribution of data, which constitutes a specific representation of a Dataset. Datasets within this profile may offer multiple serializations, each potentially differing in various aspects, including natural language, media type or format, schematic organization, temporal and spatial resolution, level of detail, or profiles that specify any combination of these attributes.
A distribution may encompass the entirety of the Dataset's data or only a subset thereof. For example, it could encompass all data related to the population in the United States or focus exclusively on a specific year, such as 2020. Alternatively, it might provide the data in an alternate format, such as a graphical representation covering the years 2010 through 2020.
Within the DCAT-US profile, various relationships between Datasets and their distributions are represented. The most straightforward relationship involves aggregating different physical representations of data, referred to as "Distributions," into a single Dataset. An example of such a Dataset is a time series, where each distribution corresponds to one year of data, and the Dataset spans multiple years.
In the DCAT vocabulary, dcat:Distribution is employed to characterize the diverse representations and formats in which a dataset is disseminated. It facilitates the description of different versions or media types of the same data, and often includes properties like dcat:downloadURL for direct download links. By contrast, dcat:DataService details data access services, such as APIs and endpoints, that enable programmatic or interactive data retrieval; its key properties include dcat:endpointURL, which specifies service endpoints, and dcat:serviceType, which indicates the type of service. DCAT thus distinguishes between the description of data formats and the specification of data access services.
RDF Class: dcat:Distribution
Definition: A specific representation of a dataset. A dataset might be available in multiple serializations that may differ in various ways, including natural language, media-type or format, schematic organization, temporal and spatial resolution, level of detail or profiles (which might specify any or all of the above).
Subclass Of: dcat:Resource
Usage note
- This represents a general availability of a dataset. It implies no information about the actual access method of the data, i.e., whether by direct download, API, or through a Web page. The use of dcat:downloadURL property indicates directly downloadable distributions.
Rationale: The update to DCAT 3 dcat:Distribution is of paramount significance as it greatly enhances data accessibility. It introduces a more comprehensive and structured approach to describing data distributions, ensuring that data consumers can easily understand and access the data in the format that best suits their needs, ultimately fostering greater data utilization and dissemination.

| Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1 |
|---|---|---|---|---|---|
| license | dcterms:license | dcterms:LicenseDocument | M | 1..1 | Aligned |
| access URL | dcat:accessURL | rdfs:Resource | R | 0..1 | No Change |
| format | dcterms:format | dcterms:MediaType | R | 0..1 | Fixed |
| rights | dcterms:rights | dcterms:RightsStatement | R | 0..n | Aligned |
| access restriction | dcat-us:accessRestriction | dcat-us:AccessRestriction | R | 0..n | New |
| use restriction | dcat-us:useRestriction | dcat-us:UseRestriction | R | 0..n | New |
| cui restriction | dcat-us:cuiRestriction | dcat-us:CuiRestriction | R | 0..1 | New |
| data dictionary | dcat-us:describedBy | dcat:Distribution | R | 0..1 | Fixed |
| title | dcterms:title | rdfs:Literal | R | 0..n | Multilingual support |
| update/modification date | dcterms:modified | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | Aligned |
| representation technique | adms:representationTechnique | skos:Concept | O | 0..1 | New |
| status | adms:status | skos:Concept | O | 0..1 | Aligned |
| character encoding | cnt:characterEncoding | rdfs:Literal | O | 0..n | New |
| compression format | dcat:compressFormat | dcterms:MediaType | O | 0..1 | Aligned |
| spatial resolution in meters | dcat:spatialResolutionInMeters | xsd:decimal | O | 0..1 | Aligned |
| quality measurement | dqv:hasQualityMeasurement | dqv:QualityMeasurement | O | 0..n | Aligned |
| access rights | dcterms:accessRights | dcterms:RightsStatement | O | 0..1 | Aligned |
| access service | dcat:accessService | dcat:DataService | O | 0..n | Aligned |
| byte size | dcat:byteSize | xsd:nonNegativeInteger | O | 0..1 | Aligned |
| checksum | spdx:checksum | spdx:Checksum | O | 0..1 | Aligned |
| documentation | foaf:page | foaf:Document | O | 0..n | New |
| download URL | dcat:downloadURL | rdfs:Resource | O | 0..1 | No Change |
| identifier | dcterms:identifier | rdfs:Literal | O | 0..1 | Aligned |
| image | schema:image | schema:url or schema:ImageObject | O | 0..3 | New |
| language | dcterms:language | dcterms:LinguisticSystem | O | 0..n | Aligned |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n | No Change |
| media type | dcat:mediaType | dcterms:MediaType | O | 0..1 | Fixed |
| packaging format | dcat:packageFormat | dcterms:MediaType | O | 0..1 | Aligned |
| release date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 | Aligned |
| temporal resolution | dcat:temporalResolution | xsd:duration | R | 0..1 | Aligned |
Recommended Properties
Property: access restriction
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:accessRestriction
Range: dcat-us:AccessRestriction
Usage note
This property refers to a statement that specifies access restriction associated with the Distribution.
Property: access URL
Requirement level: Recommended
Cardinality: 0..1
URI: dcat:accessURL
Range: rdfs:Resource
Usage note
This should be the URL for an indirect means of accessing the data, such as API documentation, a 'wizard' or other graphical interface which is used to generate a download, feed, or a request form for the data. When the access is restricted but the dataset is available online indirectly, this field should be the URL that provides indirect access. This should not be a direct download URL. It is usually assumed that accessURL is an HTML webpage. This property contains a URL that gives access to a Distribution of the Dataset (typically from a service). The resource at the access URL may contain information about how to get the Dataset.
If the distribution(s) are accessible only through a landing page (i.e., direct download URLs are not known), then the landing page URL associated with the dcat:Dataset SHOULD be duplicated as the access URL on the distribution.
dcat:accessURL should match the property dcat:endpointURL of the dcat:DataService associated with the distribution.
Property: description
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Usage note
This property contains a free-text account of the Distribution.
The description MAY be provided if the distribution contains only part of the data offered by the Dataset.
This property can be repeated for parallel language versions of the description (see Multilingualism).
Property: data dictionary
Requirement level: Recommended
Cardinality: 0..1
URI: dcat-us:describedBy
Range: dcat:Distribution
Usage note
This is used to specify a data dictionary or schema that defines fields (variables, dimensions, measures, attributes) in the distribution.
Property: format
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:format
Range: dcterms:MediaType
Usage note
- This property refers to the file format of the Distribution.
- CV to be used: [[?DATA-GOV-FT]]
- If a format is not available, use media type ([[IANA-MEDIA-TYPES]]) if applicable.
Property: license
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:license
Range: dcterms:LicenseDocument
Usage note
- This property refers to the license under which the Distribution is made available.
- CV to be used: [[?DATA-GOV-LICENSE]]
Property: rights
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:rights
Range: dcterms:RightsStatement
Usage note
This property refers to a statement that specifies rights associated with the Distribution.
Property: use restriction
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:useRestriction
Range: dcat-us:UseRestriction
Usage note
This property refers to a statement that specifies a use restriction associated with the Distribution.
Property: cui restriction
Requirement level: Recommended
Cardinality: 0..n
URI: dcat-us:cuiRestriction
Range: dcat-us:CuiRestriction
Usage note
This property refers to a statement that specifies a CUI restriction associated with the Distribution.
Property: title
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:title
Range: rdfs:Literal
Usage note
This property contains a name given to the Distribution. This property can be repeated for parallel language versions of the title (see Multilingualism).
The title MUST be given if the distribution contains only part of the data offered by the Dataset.
The title can be given in several languages. In multilingual data portals, the title in the language selected by a user will usually be shown as title for the distribution.
Property: update/modification date
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:modified
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
This property contains the most recent date on which the Distribution was changed or modified.
Optional Properties
Property: representation technique
Requirement level: Optional
Cardinality: 0..1
URI: adms:representationTechnique
Range: skos:Concept
Definition
More information about the format in which a Distribution is released. This is different from the file format as, for example, an XML file (file format) could contain an XML schema (representation technique).
Usage note
adms:representationTechnique details the specific schema, standard, or method used to structure data within a dataset, such as RFC 4180 for CSV, GeoJSON for JSON, OWL for RDF, or XML Schema for XML. This contrasts with dcterms:format, which broadly identifies the file format (e.g., CSV, JSON, RDF, XML) and gives a general idea of the data's structure and syntax. dcat:mediaType complements these by defining the MIME type, such as 'text/csv' or 'application/json', which software needs for processing and data transmission. adms:representationTechnique therefore serves users who need knowledge of the dataset's internal organization and interpretation beyond the basic format or MIME type indicated by dcterms:format and dcat:mediaType.
Property: status
Requirement level: Optional
Cardinality: 0..1
URI: adms:status
Range: skos:Concept
Usage note
This property refers to the maturity of the Distribution. It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn from the ADMS status [[VOCAB-ADMS-SKOS]] vocabulary.
Property: character encoding
Requirement level: Optional
Cardinality: 0..n
URI: cnt:characterEncoding
Range: rdfs:Literal
Usage note
This property SHOULD be used to specify the character encoding of the Distribution, using as its value one of the character set names in the IANA register [[IANA-CHARSETS]].
Property: compression format
Requirement level: Optional
Cardinality: 0..1
URI: dcat:compressFormat
Range: dcterms:MediaType
Usage note
This property refers to the format of the file in which the data is contained in a compressed form, e.g., to reduce the size of the downloadable file. It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA [[IANA-MEDIA-TYPES]].
Property: spatial resolution in meters
Requirement level: Optional
Cardinality: 0..n
URI: dcat:spatialResolutionInMeters
Range: xsd:decimal
Usage note
- This property refers to the minimum spatial separation resolvable in a Distribution, measured in meters.
Property: quality measurement
Requirement level: Optional
Cardinality: 0..n
URI: dqv:hasQualityMeasurement
Range: dqv:QualityMeasurement
Definition
Refers to the quality measurements performed on a distribution. Each measurement represents the evaluation of a given distribution against a specific quality metric.
Usage note
Use for quality measurements other than dcat:spatialResolutionInMeters or dcat:temporalResolution.
Examples of quality measurements include completeness, accuracy, timeliness, and granularity.
Property: access rights
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:accessRights
Range: dcterms:RightsStatement
Usage note
- This property MAY include information regarding access or restrictions based on privacy, security, or other policies.
Property: access service
Requirement level: Optional
Cardinality: 0..n
URI: dcat:accessService
Range: dcat:DataService
Usage note
This property refers to a data service that gives access to the distribution of the Dataset.
Property: byte size
Requirement level: Optional
Cardinality: 0..1
URI: dcat:byteSize
Range: xsd:nonNegativeInteger
Definition
The size of a distribution in bytes.
Usage note
The size in bytes can be approximated (as a non-negative integer) when the precise size is not known.
Property: checksum
Requirement level: Optional
Cardinality: 0..1
URI: spdx:checksum
Range: spdx:Checksum
Usage note
This property provides a mechanism that can be used to verify that the contents of a distribution have not changed.
The checksum is related to the downloadURL.
Property added in [[VOCAB-DCAT-3]]: spdx:checksum
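As an illustration of how the byte size and checksum values might be produced for a downloadable file, the sketch below hashes the file content with SHA-256. The helper name `describe_bytes` is hypothetical, and the nested `spdx:algorithm` and `spdx:checksumValue` keys follow the SPDX vocabulary as an assumption; this section does not prescribe an algorithm or a serialization.

```python
import hashlib


def describe_bytes(payload: bytes) -> dict:
    """Return dcat:byteSize and an spdx:Checksum node for the given content."""
    digest = hashlib.sha256(payload).hexdigest()
    return {
        "dcat:byteSize": len(payload),
        "spdx:checksum": {
            "@type": "spdx:Checksum",
            # Assumed SPDX terms; adjust to the algorithm actually used.
            "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
            "spdx:checksumValue": digest,
        },
    }


# Example with in-memory content standing in for the downloaded file.
metadata = describe_bytes(b"id,value\n1,42\n")
```

A consumer can recompute the digest over the bytes fetched from the download URL and compare it with the published value to detect changes.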
Property: coverage
Requirement level: Optional
Cardinality: 0..n
URI: dcterms:coverage
Range: dcterms:LocationPeriodOrJurisdiction
Usage note
- If a dataset contains distributions that differ regarding their content beyond just differences in format or resolution this property can be used to specify temporal or spatial coverage of the data that the distribution contains.
Property: documentation
Requirement level: Optional
Cardinality: 0..n
URI: foaf:page
Range: foaf:Document
Usage note
This property refers to a page or document about this Distribution.
Property: download URL
Requirement level: Optional
Cardinality: 0..1
URI: dcat:downloadURL
Range: rdfs:Resource
Usage note
This must be the direct download URL. Other means of accessing the dataset should be expressed using accessURL. This should always be accompanied by mediaType.
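The guidance on access and download URLs can be turned into a simple lint check. This is a sketch under stated assumptions: distributions are plain Python dicts keyed by prefixed property names, `lint_distribution_urls` is a hypothetical helper, and the example URLs are placeholders. It covers only two of the rules stated in the usage notes.

```python
def lint_distribution_urls(dist):
    """Return a list of warnings for a distribution given as a plain dict."""
    warnings = []
    download = dist.get("dcat:downloadURL")
    access = dist.get("dcat:accessURL")

    # A download URL should always be accompanied by a media type.
    if download and not dist.get("dcat:mediaType"):
        warnings.append("dcat:downloadURL should be accompanied by dcat:mediaType")

    # The access URL is for indirect access and should not be the download itself.
    if download and access and download == access:
        warnings.append("dcat:accessURL should not be a direct download URL")

    return warnings


# Placeholder URLs for illustration only.
good = {
    "dcat:accessURL": "https://example.gov/data/census/landing",
    "dcat:downloadURL": "https://example.gov/data/census/2020.csv",
    "dcat:mediaType": "text/csv",
}
bad = {
    "dcat:accessURL": "https://example.gov/data/census/2020.csv",
    "dcat:downloadURL": "https://example.gov/data/census/2020.csv",
}
```

Here `good` produces no warnings, while `bad` triggers both checks because it repeats the download link as the access URL and omits the media type.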
Property: identifier
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:identifier
Range: rdfs:Literal
Usage note
An identifier for the distribution that identifies it as a resource, mainly for the organization publishing the data.
Property: image
Requirement level: Optional
Cardinality: 0..3
URI: schema:image
Range: schema:ImageObject
Usage note
This property associates thumbnail images that visually represent the Distribution's content, which is especially beneficial for visual content like photographs, videos, maps, etc. Thumbnails should illustrate or summarize the content. While typically only URLs pointing directly to downloadable images are allowed, additional fields from schema:ImageObject, such as schema:caption, can be used to provide further context or descriptions. In this way the image property aids content identification and enriches the user's understanding of the metadata.
Property: language
PropertyLanguageRequirement levelOptionalCardinality0..nURIdcterms:language
Rangerdfs:Literal
Definition
A language of the resource. This refers to the natural language used for textual metadata (i.e., titles, descriptions, etc.) or for the textual values of a dataset distribution.
Usage note
Resources defined by the Library of Congress ([[ISO 639-1]]) SHOULD be used.
Usage note
For datasets available in separate languages, create a dcat:Distribution instance for each language version. Assign a unique dcterms:language value to each distribution to specify its language. Distributions with multiple languages should list several dcterms:language values.
Property: conforms to
Propertyconforms toRequirement levelOptionalCardinality0..nURIdcterms:conformsTo
Rangedcterms:Standard (A basis for comparison; a reference point against which other things can be evaluated.)
Definition
An established standard to which the distribution conforms.
Usage note
This is used to identify a standardized specification the distribution conforms to. It's recommended that this be a URI that serves as a unique identifier for the standard. The URI may or may not also be a URL that provides documentation of the specification. This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a dataset conforms to. This is (generally) a complementary concern to the media-type or format.
Property: media type
Propertymedia typeRequirement levelOptionalCardinality0..1URIdcat:mediaType
Rangedcterms:MediaType
Definition
This property refers to the media type of the Distribution as defined in the official register of media types managed by IANA [[IANA-MEDIA-TYPES]].
Usage note
The mediaType property specifies the media type (MIME type) of the distribution. It should be used when the distribution's format corresponds to a standard media type registered with IANA [[IANA-MEDIA-TYPES]]. This property provides a precise technical descriptor of the data format (e.g., application/json, text/csv).
Usage note
The JSON-LD encoding allows a media type to be given without the full URL (e.g. text/csv). The JSON-LD context processor automatically expands it to the full URI in RDF using the base URI https://www.iana.org/assignments/media-types/. This preserves backward compatibility with DCAT-US 1.1.
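The expansion of a short media type to a full URI can be illustrated with a small sketch. It only mimics the effect of the JSON-LD context on this one property; it is not a JSON-LD processor, and the function names are assumptions made for illustration.

```python
# Base URI stated in the usage note above.
IANA_BASE = "https://www.iana.org/assignments/media-types/"


def expand_media_type(value: str) -> str:
    """Expand a short media type such as 'text/csv' to a full IANA URI."""
    if value.startswith(("http://", "https://")):
        return value  # already a full URI, leave untouched
    return IANA_BASE + value


def compact_media_type(uri: str) -> str:
    """Reverse of expand_media_type for URIs under the IANA base."""
    return uri[len(IANA_BASE):] if uri.startswith(IANA_BASE) else uri


print(expand_media_type("text/csv"))
print(compact_media_type(expand_media_type("application/json")))
```

A DCAT-US 1.1 record that carries `text/csv` as a plain string would thus map to the same RDF resource as a record that spells out the full URI.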
Property: packaging format
Propertypackaging formatRequirement levelOptionalCardinality0..1URIdcat:packageFormat
Rangedcterms:MediaType
Usage note
This property refers to the format of the file in which one or more data files are grouped together, e.g. to enable a set of related files to be downloaded together.
It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA.
Property: release date
Propertyrelease dateRequirement levelOptionalCardinality0..1URIdcterms:issued
Rangerdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Usage note
This property contains the date of formal issuance (e.g., publication) of the Distribution, that is, the date on which the distribution was first issued.
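Since the range allows four XSD datatypes, a publisher may need to decide which one a given lexical value corresponds to. The sketch below uses deliberately simplified patterns (no time zones on dates or years, no day-of-month validation) and a hypothetical helper name; it is an illustration, not a full XSD validator.

```python
import re

# Simplified lexical patterns, ordered from least to most specific.
PATTERNS = [
    ("xsd:gYear", re.compile(r"^-?\d{4,}$")),
    ("xsd:gYearMonth", re.compile(r"^-?\d{4,}-(0[1-9]|1[0-2])$")),
    ("xsd:date", re.compile(r"^-?\d{4,}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")),
    (
        "xsd:dateTime",
        re.compile(
            r"^-?\d{4,}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"
            r"T([01]\d|2[0-3]):[0-5]\d:[0-5]\d(\.\d+)?(Z|[+-]\d{2}:\d{2})?$"
        ),
    ),
]


def guess_xsd_type(lexical: str):
    """Return the name of the first matching datatype, or None if nothing matches."""
    for name, pattern in PATTERNS:
        if pattern.match(lexical):
            return name
    return None


for sample in ["2024", "2024-03", "2024-03-15", "2024-03-15T10:30:00Z", "March 2024"]:
    print(sample, "->", guess_xsd_type(sample))
```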
Property: temporal resolution
Propertytemporal resolutionRequirement levelOptionalCardinality0..1URIdcat:temporalResolution
Rangexsd:duration
Usage note
This property refers to the minimum time period resolvable in the Dataset distribution.
Example
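The following is a hedged sketch of a Distribution description together with a check of the upper cardinality bounds listed for the properties above. The keys are written as prefixed names rather than the profile's actual JSON-LD terms, and the URL, the values, and the helper name are hypothetical.

```python
import json

# Upper bounds copied from the Cardinality cells of the properties described above.
MAX_VALUES = {
    "dcat:downloadURL": 1,
    "dcat:mediaType": 1,
    "dcat:packageFormat": 1,
    "dcat:temporalResolution": 1,
    "dcterms:issued": 1,
    "schema:image": 3,
    "spdx:checksum": 1,
}


def cardinality_errors(node: dict) -> list:
    """List every property that carries more values than its cardinality allows."""
    errors = []
    for prop, maximum in MAX_VALUES.items():
        if prop not in node:
            continue  # every property listed here is optional
        value = node[prop]
        values = value if isinstance(value, list) else [value]
        if len(values) > maximum:
            errors.append(f"{prop}: {len(values)} values, at most {maximum} allowed")
    return errors


# Hypothetical distribution; the URL does not point at a real file.
distribution = {
    "@type": "dcat:Distribution",
    "dcat:downloadURL": "https://data.example.gov/files/observations.csv",
    "dcat:mediaType": "text/csv",
    "dcterms:issued": "2024-03-15",
    "dcat:temporalResolution": "P1D",
    "dcterms:language": ["en", "es"],
}
print(json.dumps(distribution, indent=2))
print(cardinality_errors(distribution))
```

In practice this kind of check is what the profile's SHACL shapes are meant to perform; the Python version only makes the rule explicit.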
Document
RDF Class:foaf:Document
ObligationOptional
Definition: A publication, such as a scientific paper, a technical report, a book, or a book chapter, but also a blog post.
Usage note
Depending on whether or not a catalog supports publications as first-class citizens, a publication can be fully described or simply denoted by its URI.
Rationale: The introduction of foaf:Document significantly improves the representation of documents within the DCAT-US profile. It ensures that metadata about documents, such as title, format, language, and access options, are clearly defined and standardized. This alignment with global data standards fosters interoperability and eases document integration into various data ecosystems, benefiting both publishers and consumers.
Reference
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| title | dcterms:title | rdfs:Literal | M | 1..n |
| individual author | dcterms:creator | foaf:Person | R | 0..n |
| corporate author | dcterms:creator | org:Organization | R | 0..n |
| author(s) as literal | dc:creator | rdfs:Literal | R | 0..n |
| publisher organization | dcterms:publisher | org:Organization | R | 0..1 |
| publisher(s) as literal | dc:publisher | rdfs:Literal | R | 0..n |
| identifier | dcterms:identifier | rdfs:Literal | R | 0..1 |
| publication date | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |
| bibliographic citation | dcterms:bibliographicCitation | rdfs:Literal | R | 0..1 |
| document type | dcterms:type | skos:Concept | R | 0..1 |
| abstract | dcterms:abstract | rdfs:Literal | O | 0..n |
| description | dcterms:description | rdfs:Literal | O | 0..n |
| conforms to | dcterms:conformsTo | dcterms:Standard | O | 0..n |
| media type | dcterms:mediaType | dcterms:MediaType | O | 0..n |
Mandatory Properties
Property: title
PropertytitleRequirement levelMandatoryCardinality1..nURIdcterms:title
Rangerdfs:Literal
Usage note
Recommended Properties
Property: bibliographic citation
Propertybibliographic citationRequirement levelRecommendedCardinality0..1URIdcterms:bibliographicCitation
Rangerdfs:Literal
Usage note
Property: description
PropertydescriptionRequirement levelRecommendedCardinality0..nURIdcterms:description
Rangerdfs:Literal
DefinitionDescription of the document.
Property: author(s) as literal
Propertyauthor(s) as literalRequirement levelRecommendedCardinality0..nURIdc:creator
Rangerdfs:Literal
Usage note
Use this property to represent creators as literal strings. This is useful when the creator is not available as a structured object.
Property: identifier
PropertyidentifierRequirement levelRecommendedCardinality0..1URIdcterms:identifier
Rangerdfs:Literal
Usage note
Property: publication date
Propertypublication dateRequirement levelRecommendedCardinality0..nURIdcterms:issued
Rangerdfs:Literal
Usage note
Property: publisher organization
Propertypublisher organizationRequirement levelRecommendedCardinality0..nURIdcterms:publisher
Rangeorg:Organization
Usage note
Property: publisher(s) as literal
Propertypublisher(s) as literalRequirement levelRecommendedCardinality0..nURIdc:publisher
Rangerdfs:Literal
Usage note
Use this property to represent publisher as literal string and not structured data.
Property: document type
Propertydocument typeRequirement levelRecommendedCardinality0..1URIdcterms:type
Rangeskos:Concept
Usage note
Optional Properties
Property: abstract
PropertyabstractRequirement levelOptionalCardinality0..nURIdcterms:abstract
Rangerdfs:Literal
Usage note
Property: individual author
Propertyindividual authorRequirement levelOptionalCardinality0..nURIdcterms:creator
Rangefoaf:Person
Usage note
Property: corporate author
Propertycorporate authorRequirement levelOptionalCardinality0..nURIdcterms:creator
Rangeorg:Organization
Usage note
Property: conforms to
Propertyconforms toRequirement levelOptionalCardinality0..nURIdcterms:conformsTo
Rangedcterms:Standard
Usage note
An implementing rule or other specification.
Property: media type
Propertymedia typeRequirement levelOptionalCardinality0..nURIdcterms:mediaType
Rangedcterms:MediaType
Usage note
An implementing rule or other specification.
Example
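A minimal sketch of a Document description follows, with a check for the single mandatory property, dcterms:title. The keys are prefixed names rather than confirmed JSON-LD terms, and the title, author, and helper name are invented for illustration.

```python
def missing_mandatory(document: dict) -> list:
    """Return the mandatory Document properties that are absent or empty."""
    titles = document.get("dcterms:title", [])
    if isinstance(titles, str):
        titles = [titles]
    # title has cardinality 1..n, so at least one non-blank value is needed
    return [] if any(t.strip() for t in titles) else ["dcterms:title"]


# Hypothetical document record.
document = {
    "@type": "foaf:Document",
    "dcterms:title": "Data Collection Methodology Report",
    "dc:creator": ["Jane Doe"],
    "dcterms:issued": "2023",
}
print(missing_mandatory(document))
```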
Geographic Bounding Box
GeographicBoundingBox describes the spatial extent of the domain of application of a resource and is standardized in the WGS 84 Lat/Long coordinate system.
RDF Class:dcat-us:GeographicBoundingBox
Definition: GeographicBoundingBox describes the spatial extent of the domain of application of a resource and is standardized in the WGS 84 Lat/Long coordinate system.
Usage note
Strongly recommended for geospatial data.
Rationale
There is no consensus or common vocabulary in the community for describing a spatial bounding box. GML Envelope was proposed, but it is too cumbersome to process. We introduce four separate fields, one for each bound (west, east, north and south), which removes any ambiguity and makes the box easy to index and query.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| west bounding longitude | dcat-us:westBoundingLongitude | xsd:decimal | M | 1 |
| east bounding longitude | dcat-us:eastBoundingLongitude | xsd:decimal | M | 1 |
| south bounding latitude | dcat-us:southBoundingLatitude | xsd:decimal | M | 1 |
| north bounding latitude | dcat-us:northBoundingLatitude | xsd:decimal | M | 1 |
Mandatory Properties
Property: west bounding longitude
Propertywest bounding longitudeRequirement levelMandatoryCardinality1URIdcat-us:westBoundingLongitude
Rangexsd:decimal
DefinitionWest bound longitude in decimal degrees
Property: east bounding longitude
Propertyeast bounding longitudeRequirement levelMandatoryCardinality1URIdcat-us:eastBoundingLongitude
Rangexsd:decimal
DefinitionEast bound longitude in decimal degrees
Property: south bounding latitude
Propertysouth bounding latitudeRequirement levelMandatoryCardinality1URIdcat-us:southBoundingLatitude
Rangexsd:decimal
DefinitionSouth bound latitude in decimal degrees
Property: north bounding latitude
Propertynorth bounding latitudeRequirement levelMandatoryCardinality1URIdcat-us:northBoundingLatitude
Rangexsd:decimal
DefinitionNorth bound latitude in decimal degrees
Example
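The sketch below shows one way to check a GeographicBoundingBox: all four bounds present, each within the WGS 84 value range, and south not above north. It does not require west to be less than east, because a box crossing the antimeridian legitimately has west greater than east. The helper name and the sample coordinates (a rough box around the contiguous United States) are illustrative assumptions.

```python
from decimal import Decimal

# Absolute limit, in decimal degrees, for each mandatory bound.
LIMITS = {
    "dcat-us:westBoundingLongitude": 180,
    "dcat-us:eastBoundingLongitude": 180,
    "dcat-us:southBoundingLatitude": 90,
    "dcat-us:northBoundingLatitude": 90,
}


def bbox_errors(box: dict) -> list:
    """Return a list of problems found in a bounding box description."""
    errors = []
    for prop, limit in LIMITS.items():
        if prop not in box:
            errors.append(f"missing {prop}")
        elif not -limit <= Decimal(str(box[prop])) <= limit:
            errors.append(f"{prop} out of range")
    if not errors:
        south = Decimal(str(box["dcat-us:southBoundingLatitude"]))
        north = Decimal(str(box["dcat-us:northBoundingLatitude"]))
        if south > north:
            errors.append("south bound is north of north bound")
    return errors


# Approximate, illustrative values only.
box = {
    "@type": "dcat-us:GeographicBoundingBox",
    "dcat-us:westBoundingLongitude": "-125.0",
    "dcat-us:eastBoundingLongitude": "-66.0",
    "dcat-us:southBoundingLatitude": "24.0",
    "dcat-us:northBoundingLatitude": "50.0",
}
print(bbox_errors(box))
```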
Identifier
RDF Class:adms:Identifier
ObligationOptional
Definition: This is based on the UN/CEFACT Identifier class.
Usage note
An identifier in a particular context, consisting of the
- content string that is the identifier;
- an optional identifier for the identifier scheme;
- an optional identifier for the version of the identifier scheme;
- an optional identifier for the agency that manages the identifier scheme.
Reference
§ Term name: Identifier [ADMS]
Rationale
Incorporating adms:Identifier in the DCAT-US profile fosters a culture of data governance and trust by transparently documenting the authority behind each identifier. This enhances data reliability and credibility, boosting confidence for DCAT-US users. Additionally, it enables versatile data access using multiple identifiers, enhancing overall data accessibility and usability for diverse stakeholders.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| notation | skos:notation | xsd:string | R | 0..1 |
| creator | dcterms:creator | dcterms:Agent | O | 0..1 |
| schema agency | adms:schemaAgency | rdfs:Literal | O | 0..1 |
| version | dcat:version | rdfs:Literal | O | 0..1 |
| issued | dcterms:issued | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | O | 0..1 |
Recommended Properties
Property: notation
PropertynotationRequirement levelRecommendedCardinality0..1URIskos:notation
Rangexsd:string
Optional Properties
Property: creator
PropertycreatorRequirement levelOptionalCardinality0..1URIdcterms:creator
Rangedcterms:Agent
Property: schema agency
Propertyschema agencyRequirement levelOptionalCardinality0..1URIadms:schemaAgency
Rangerdfs:Literal
Property: version
PropertyversionRequirement levelOptionalCardinality0..1URIdcat:version
Rangerdfs:Literal
Property: issued
PropertyissuedRequirement levelOptionalCardinality0..1URIdcterms:issued
Rangerdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Example
Liability Statement
RDF Class:dcat-us:LiabilityStatement
Definition:A formal declaration accompanying a dataset intended to limit the legal exposure of the data provider by
disclaiming warranties or guarantees.
Usage note
- This statement often includes information on the following aspects:
  - Limitation of Responsibility: Clarifying that the publisher or provider is not responsible for any errors in the data, and any consequences resulting from its use.
  - No Guarantee of Validity: Indicating that there is no guarantee of the accuracy, reliability, or completeness of the data provided.
  - Absence of Endorsement: Stating that inclusion of the data in the catalog does not imply endorsement by the publisher or provider.
  - Use at Own Risk: Advising users that they use the data at their own risk and are responsible for ensuring its appropriateness for their intended purposes.
- The statement may be provided as a literal text or as a URL pointing to a detailed liability statement.
- Utilizing the LiabilityStatement helps in setting clear expectations for consumers of the dataset and limits potential legal exposures of the data provider.
Rationale
Introducing dcat-us:LiabilityStatement in DCAT-US clarifies data provider responsibilities and limitations, reducing legal risks by defining acceptable uses and disclaiming warranties. This ensures transparency and legal compliance in data sharing within the United States.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| liability statement text | rdfs:label | rdfs:Literal | O | 0..n |
Optional Properties
Property: liability statement text
Propertyliability statement textRequirement levelOptionalCardinality0..nURIrdfs:label
Rangerdfs:Literal
Definition
Full text of the liability statement.
Usage note
Property rdfs:label MAY only be used to specify the text of the liability statement. This property can be repeated for parallel language versions of the statement.
Example
LicenseDocument
RDF Class:dcterms:LicenseDocument
ObligationOptional
Definition: A legal document giving official permission to do something with a resource.
Usage note
License document SHOULD be specified only with URIs from an endorsed Data.gov registry. Property spdx:licenseText MAY only be used to specify license information in legacy metadata records that do not comply with a standard license from an endorsed Data.gov registry.
Rationale: The introduction of dcterms:LicenseDocument in the DCAT profile enables the customization of license text. This flexibility empowers data publishers to tailor license terms to specific dataset requirements, facilitating clear communication of licensing conditions and promoting responsible data sharing and usage while adhering to established international standards.
Reference
§ Term name: LicenseDocument [DCTERMS]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| license text | spdx:licenseText | rdfs:Literal | O | 0..n |
Optional Properties
Property: license text
Propertylicense textRequirement levelOptionalCardinality0..nURIspdx:licenseText
Rangerdfs:Literal
Definition
Full text of the license.
Usage note
Property spdx:licenseText MAY only be used to specify license information in legacy metadata records that do not comply with a standard license from an endorsed registry. This property can be repeated for parallel language versions of the license text.
Example
Location
A spatial region or named place.
RDF Class:dcterms:Location
Definition: A spatial region or named place. It can be represented using a controlled vocabulary or with geographic coordinates.
Usage note
- For an extensive geometry (i.e., a set of coordinates denoting the vertices of the relevant geographic area), the property locn:geometry [[LOCN]] SHOULD be used.
- For a geographic bounding box delimiting a spatial area, the property dcat:bbox SHOULD be used.
- For the geographic center of a spatial area, or another characteristic point, the property dcat:centroid SHOULD be used.
Rationale: The introduction of dcterms:Location in DCAT-US 3.0 is driven by the need to restore compatibility with the DCAT standard. DCAT-US 1.1 had deviated from the standard by using strings for location in the dcterms:spatial property, which was incompatible. This addition aligns DCAT-US with recognized geospatial standards (e.g., GeoSPARQL, WKT, GeoJSON, W3C Location) for representing geometries, addresses, and location names, ensuring data compatibility, discoverability, and integration while adhering to international data management practices.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| bounding box | dcat:bbox | rdfs:Literal typed as gsp:wktLiteral (preferred) or gsp:gmlLiteral or gsp:geoJSONLiteral | R | 0..1 |
| centroid | dcat:centroid | rdfs:Literal typed as gsp:wktLiteral or gsp:gmlLiteral | O | 0..1 |
| geographic identifier | dcterms:identifier | rdfs:Literal | O | 0..n |
| geometry | locn:geometry | locn:Geometry typed as gsp:wktLiteral (preferred) or gsp:gmlLiteral or gsp:geoJSONLiteral | O | 0..1 |
| gazetteer | skos:inScheme | skos:ConceptScheme | O | 0..1 |
| geographic name | skos:prefLabel | rdfs:Literal | R | 0..n |
| alternate geographic name | skos:altLabel | rdfs:Literal | O | 0..n |
Recommended Properties
Property: bounding box
Propertybounding boxRequirement levelRecommendedCardinality0..1URIdcat:bbox
Rangerdfs:Literal typed as geosparql:wktLiteral or geosparql:gmlLiteral
Definition
The geographic bounding box of a spatial thing.
Usage note
The range of this property (rdfs:Literal) is intentionally generic, with the purpose of allowing different geometry literal encodings. E.g., the geometry could be encoded as a WKT literal (geosparql:wktLiteral).
Please note that the order of usage is as follows: use the most specific geospatial relationship by preference. E.g., if the spatial description is a bbox, use dcat:bbox; otherwise use locn:geometry.
The WKT encoding supports geospatial positions expressed in coordinate reference systems other than WGS84.
Property: geographic name
RDF Propertyskos:prefLabel
Requirement levelRecommendedCardinality0..nRangerdfs:Literal
DefinitionPreferred toponym for the locationUsage note
This property contains a preferred label of the Location. This property can be repeated for
parallel language versions of the label.
Optional Properties
Property: alternate geographic name
RDF Propertyskos:altLabel
Requirement levelOptionalCardinality0..nRangerdfs:Literal
DefinitionAlternate toponyms for the locationUsage note
This property contains alternate labels of the Location. This property can be repeated for
parallel language versions of the label.
Property: centroid
PropertycentroidRequirement levelOptionalCardinality0..1URIdcat:centroid
Rangerdfs:Literal typed as geosparql:wktLiteral or geosparql:gmlLiteral
Definition
The geographic center (centroid) of a spatial thing.
Usage note
The range of this property (rdfs:Literal) is intentionally generic, with the purpose of allowing different geometry literal encodings. E.g., the geometry could be encoded as a WKT literal (geosparql:wktLiteral)
Please note that the order of usage is as follows: use the most specific geospatial relationship by preference. E.g. if the spatial description is a bbox, use dcat:bbox, otherwise use locn:geometry
The WKT encoding supports geospatial positions expressed in coordinate reference systems other than WGS84.
Property: geographic identifier
Propertygeographic identifierRequirement levelOptionalCardinality0..nURIdcterms:identifier
Rangerdfs:Literal
Usage note
This property contains the geographic identifier for the Location, e.g., the URI or other
unique identifier in the context of the relevant gazetteer.
Property: geometry
PropertygeometryRequirement levelOptionalCardinality0..1URIlocn:geometry
Rangelocn:Geometry
Definition:
Associates a spatial thing [[?SDW-BP]] with a corresponding
geometry.
Usage note
The range of this property (locn:Geometry) allows for any type of geometry specification. E.g., the geometry could be encoded by a literal, as WKT (geosparql:wktLiteral [[GeoSPARQL]]), or represented by a class, as geosparql:Geometry (or any of its subclasses) [[GeoSPARQL]].
Property: gazetteer
RDF Propertyskos:inScheme
Requirement levelOptionalCardinality0..1Rangeskos:ConceptScheme
Usage note:
This property MAY be used to specify the gazetteer to which the Location belongs.
Example
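As an illustration of the preferred WKT encoding for dcat:bbox, the sketch below turns four bounds into a closed polygon ring and attaches it to a Location description. Coordinates are written in longitude-latitude order, which is the default axis order for a geosparql:wktLiteral without an explicit CRS URI. The function name, the prefixed-name keys, and the sample values are assumptions for illustration.

```python
def bbox_to_wkt(west, south, east, north) -> str:
    """Build a closed WKT polygon ring (lon lat order) from four bounds."""
    ring = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({coords}))"


# Hypothetical location using rough, illustrative bounds.
location = {
    "@type": "dcterms:Location",
    "skos:prefLabel": "Contiguous United States",
    "dcat:bbox": {
        "@type": "geosparql:wktLiteral",
        "@value": bbox_to_wkt(-125, 24, -66, 50),
    },
}
print(location["dcat:bbox"]["@value"])
```

The ring repeats its first point at the end, which WKT requires for a polygon to be closed.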
MediaType
RDF Class:dcterms:MediaType
ObligationOptional
Definition: A media type, e.g. the format of a computer file.
Usage note
Data publishers should consider using well-established IANA [[IANA-MEDIA-TYPES]] URLs for media types whenever possible to enhance compatibility and interoperability. However, the ability to create custom media types using labels provides flexibility for unique data requirements. When creating custom media types, it's advisable to provide clear and concise definitions to ensure transparency and understanding for data consumers. Striking a balance between standardized and custom media types optimizes data sharing within the DCAT-US framework.
Rationale: Incorporating dcterms:MediaType in DCAT-US combines the use of established IANA [[IANA-MEDIA-TYPES]] URLs for standardized media types with the flexibility to create custom types using labels. This dual approach ensures compatibility with recognized media types while allowing adaptability to specific needs, promoting both data interoperability and flexibility in data sharing and dissemination.
Reference
§ Term name: MediaType [DCTERMS]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| label | rdfs:label | xsd:string | R | 0..1 |
Recommended Properties
Property: label
RDF Propertyrdfs:label
Requirement levelRecommendedCardinality0..1Rangexsd:string
Usage noteThis property contains the denomination of the Media Type.
Example
Metric
RDF Class:dqv:Metric
ObligationOptional
Definition: Represents a standard to measure a quality dimension. An observation (instance of dqv:QualityMeasurement) assigns a value in a given unit to a Metric.
Usage note
The concept of a metric is used to define and measure specific aspects or dimensions of data quality within a given context, providing a standardized and quantifiable way to assess the quality of data. It allows for the comparison and evaluation of data quality across different resources and enables the development of consistent quality assessment frameworks and methodologies.
Rationale: Introducing dqv:Metric in the DCAT-US profile enhances dataset quality assessment and management by aligning with international data quality standards. It allows data publishers to systematically define and communicate dataset quality characteristics, promoting transparency and informed data utilization, fostering trust, and supporting responsible data sharing within the DCAT-US ecosystem.
Reference
[§ 4.1 Class: Metric](https://www.w3.org/TR/vocab-dqv/#dqv:Metric) [VOCAB-DQV]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| in dimension | dqv:inDimension | dqv:Dimension | M | 1 |
| expected datatype | dqv:expectedDataType | xsd:anySimpleType | M | 1 |
| definition | skos:definition | rdfs:Literal | R | 0..n |
Mandatory Properties
Property: in dimension
Propertyin dimensionRequirement levelMandatoryCardinality1URIdqv:inDimension
Rangedqv:Dimension
DefinitionRepresents the dimensions a quality metric, certificate and annotation allow a measurement of.
Property: expected datatype
Propertyexpected datatypeRequirement levelMandatoryCardinality1URIdqv:expectedDataType
Rangexsd:anySimpleType
DefinitionRepresents the expected data type for the metric's observed value (e.g., xsd:boolean, xsd:double, etc.)
Recommended Properties
Property: definition
PropertydefinitionRequirement levelRecommendedCardinality0..nURIskos:definition
Rangerdfs:Literal
Definitiondefinition of the metric
Example
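The sketch below describes a Metric with its two mandatory properties and checks whether an observed value is compatible with the expected datatype. The `ex:` identifiers, the mapping from XSD names to Python types, and the helper name are hypothetical choices made for this illustration.

```python
# Assumed mapping from a few XSD datatypes to Python types.
PYTHON_TYPES = {
    "xsd:boolean": bool,
    "xsd:double": float,
    "xsd:integer": int,
    "xsd:string": str,
}


def value_matches_metric(metric: dict, value) -> bool:
    """Check an observed value against the metric's dqv:expectedDataType."""
    expected = PYTHON_TYPES[metric["dqv:expectedDataType"]]
    # bool is a subclass of int in Python, so reject it for non-boolean metrics
    if expected is not bool and isinstance(value, bool):
        return False
    return isinstance(value, expected)


# Hypothetical metric; ex:completeness stands in for a dqv:Dimension.
metric = {
    "@id": "ex:completenessRatio",
    "@type": "dqv:Metric",
    "dqv:inDimension": "ex:completeness",
    "dqv:expectedDataType": "xsd:double",
    "skos:definition": "Share of records with no missing mandatory field.",
}
print(value_matches_metric(metric, 0.97))
print(value_matches_metric(metric, "high"))
```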
Organization
RDF Class:org:Organization
Definition: Represents a collection of people organized together into a community or other social, commercial or political structure. The group has some common purpose or reason for existence which goes beyond the set of people belonging to it and can act as an Agent. Organizations are often decomposable into hierarchical structures.
Subclass Of: foaf:Agent
Usage note
When utilizing the org:Organization class in DCAT-US 3.0, data publishers are encouraged to provide the preferred label (skos:prefLabel) for the organization, along with any relevant alternative labels (skos:altLabel) and abbreviations (skos:notation). This usage is consistent with the W3C Organization Recommendation standard [[VOCAB-ORG]]. This practice ensures comprehensive and flexible organization identification, improving data discoverability and search accuracy. Data publishers should strive to maintain consistency in naming conventions while considering variations and common aliases used to refer to organizations. By providing a well-rounded representation of organizations, DCAT-US 3.0 enhances data usability and transparency, facilitating efficient data search and retrieval.
Rationale: Improving the org:Organization class in DCAT-US 3.0 by supporting prefLabel, alternative labels, and abbreviations is essential to enhance organization representation. This enhancement accommodates variations in organization naming, promotes data interoperability, and improves discoverability within datasets. By incorporating these features, DCAT-US 3.0 aligns with best practices in data representation, enhances data search and transparency, and optimizes the overall usability of data resources.
Properties Summary
| Property | URI | Range | ReqLevel | Card | Changes from DCAT-US 1.1 |
| --- | --- | --- | --- | --- | --- |
| name | foaf:name | xsd:string | M | 1..1 | No Change |
| preferred label | skos:prefLabel | xsd:string | O | 0..1 | Aligned |
| alternative label | skos:altLabel | xsd:string | O | 0..n | Aligned |
| notation | skos:notation | xsd:string | O | 0..n | Aligned |
| subOrganizationOf | org:subOrganizationOf | org:Organization | O | 0..1 | No Change |
Mandatory Properties
Property: name
PropertynameRequirement levelMandatoryCardinality1URIfoaf:name
Rangexsd:string
DefinitionThe name of the Organization
Optional Properties
Property: preferred label
Propertypreferred labelRequirement levelOptionalCardinality0..1URIskos:prefLabel
Rangexsd:string
DefinitionThe legal name or preferred name of the Organization
Property: alternate label
Propertyalternate labelRequirement levelOptionalCardinality0..nURIskos:altLabel
Rangexsd:string
Definitionalternative names (trading names, colloquial names) for an organization
Property: notation
PropertynotationRequirement levelOptionalCardinality0..nURIskos:notation
Rangexsd:string
Definitionabbreviations or codes from code lists for an organization (e.g. DOI, DOD)
Property: suborganization of
Propertysub organization ofRequirement levelOptionalCardinality0..nURIorg:subOrganizationOf
Rangeorg:Organization
DefinitionRepresents hierarchical containment of Organizations or OrganizationalUnits; indicates an
Organization which contains this Organization.
Example
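To illustrate how org:subOrganizationOf expresses hierarchical containment, the sketch below stores two organizations in a dictionary and walks from a sub-organization up to its parents. The `ex:` identifiers and the helper name are hypothetical; the property names come from the tables above.

```python
def ancestors(org_id: str, orgs: dict) -> list:
    """Return the names of the organizations that contain org_id, nearest first."""
    chain = []
    seen = {org_id}  # guards against accidental cycles in the data
    current = orgs[org_id].get("org:subOrganizationOf")
    while current is not None and current not in seen:
        chain.append(orgs[current]["foaf:name"])
        seen.add(current)
        current = orgs[current].get("org:subOrganizationOf")
    return chain


# Identifiers are illustrative; names and abbreviations are real agencies.
orgs = {
    "ex:doi": {
        "@type": "org:Organization",
        "foaf:name": "U.S. Department of the Interior",
        "skos:notation": ["DOI"],
    },
    "ex:usgs": {
        "@type": "org:Organization",
        "foaf:name": "U.S. Geological Survey",
        "skos:notation": ["USGS"],
        "org:subOrganizationOf": "ex:doi",
    },
}
print(ancestors("ex:usgs", orgs))
```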
Period of Time
PeriodOfTime represents a period of time with a start date and an end.
RDF Class:dcterms:PeriodOfTime
Definition: PeriodOfTime represents a period of time with a start date and an end.
Usage note
The start and end of the interval SHOULD be given by using properties dcat:startDate and dcat:endDate, respectively. The interval can also be open, i.e., it can have just a start or just an end.
Rationale: The introduction of dcterms:PeriodOfTime in DCAT-US 3.0 is pivotal for harmonizing with international standards and rectifying the inconsistency with DCAT 1. In DCAT-US 1.1, [[ISO8601-1]] was used for interval representation in dcterms:temporal, diverging from DCAT 1's requirement of dcterms:PeriodOfTime. This alignment with DCAT 3 standards in DCAT-US 3.0 not only resolves discrepancies but also streamlines data processing, simplifying parsing and indexing of time intervals. By adopting dcterms:PeriodOfTime, DCAT-US 3.0 promotes ease of implementation, ensuring uniformity, flexibility, accuracy, and enhanced interoperability in handling time-related data, ultimately benefiting data usability and exchange.
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| start date | dcat:startDate | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |
| end date | dcat:endDate | rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth) | R | 0..1 |
Recommended Properties
Property: start date
Propertystart dateRequirement levelRecommendedCardinality0..1URIdcat:startDate
Rangerdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
DefinitionThe start date of the period of time
Property: end date
Propertyend dateRequirement levelRecommendedCardinality0..1URIdcat:endDate
Rangerdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
DefinitionThe end date of the period of time
Example
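The following sketch builds a PeriodOfTime description and checks that a closed interval does not end before it starts, while accepting intervals that are open on one side. Treating a period with neither bound as invalid is an assumption of this sketch, as are the helper names and the restriction to xsd:date values.

```python
from datetime import date
from typing import Optional


def period_is_valid(start: Optional[date], end: Optional[date]) -> bool:
    """Accept open intervals; reject empty ones and those ending before they start."""
    if start is None and end is None:
        return False  # assumption: a period with no bound carries no information
    if start is None or end is None:
        return True
    return start <= end


def period_node(start: Optional[date], end: Optional[date]) -> dict:
    """Build a dcterms:PeriodOfTime description with xsd:date typed literals."""
    node = {"@type": "dcterms:PeriodOfTime"}
    if start is not None:
        node["dcat:startDate"] = {"@type": "xsd:date", "@value": start.isoformat()}
    if end is not None:
        node["dcat:endDate"] = {"@type": "xsd:date", "@value": end.isoformat()}
    return node


print(period_node(date(2020, 1, 1), date(2020, 12, 31)))
print(period_is_valid(date(2020, 1, 1), None))
```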
Person
RDF Class:foaf:Person
Definition: This class represents an individual human being. It can be used to provide information about individuals, such as their name, email address, homepage URL, and other personal details.
Subclass Of: foaf:Agent
Usage note
Rationale: The rationale for enhancing the foaf:Person class in DCAT-US 3.0 is to provide a more comprehensive and standardized representation of individuals within datasets. In earlier versions, like DCAT-US 1.1, only a single "name" property was available for describing persons, limiting the richness of personal data representation. By introducing properties like "firstName," "givenName," and "affiliation," DCAT-US 3.0 aligns with best practices in data representation, allowing data publishers to provide more detailed information about individuals and their affiliations with organizations. This improves data usability and transparency.
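| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| name | foaf:name | xsd:string | M | 1..1 |
| given name | foaf:givenName | xsd:string | O | 0..1 |
| first name | foaf:firstname | xsd:string | R | 0..1 |
| member of | org:memberOf | org:Organization | O | 0..n |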
Mandatory Properties
Property: name
PropertynameRequirement levelMandatoryCardinality1URIfoaf:name
Rangexsd:string
DefinitionThe full name of the Person
Recommended Properties
Property: given name
Propertygiven nameRequirement levelRecommendedCardinality0..1URIfoaf:givenName
Rangexsd:string
DefinitionThe given name of the Person
Property: first name
Propertyfirst nameRequirement levelRecommendedCardinality0..1URIfoaf:firstname
Rangexsd:string
DefinitionThe first name of the Person
Optional Properties
Property: affiliation
PropertyaffiliationRequirement levelOptionalCardinality0..nURIorg:memberOf
Rangeorg:Organization
DefinitionThis property MAY be used to specify the affiliation of the Person to an organization.
Example
Provenance Statement
RDF Class:dcterms:ProvenanceStatement
ObligationOptional
Definition: Any changes in ownership and custody of a resource since its creation that are significant for its authenticity, integrity, and interpretation.
Usage note
The dcterms:ProvenanceStatement in DCAT-US 3.0 offers flexibility in how it can be referenced. It can either be referred to by a URL or included in-line by using a label. This versatility allows data publishers to choose the most suitable method for providing information about significant changes in ownership and custody, enhancing the accessibility and usability of provenance details within datasets.
Rationale: Introducing dcterms:ProvenanceStatement in DCAT-US 3.0 enhances dataset transparency and trustworthiness. It allows data publishers to provide structured information about significant changes in ownership and custody, aligning with international data quality and provenance standards. This flexibility ensures greater confidence in dataset authenticity and interpretation, promoting responsible data usage within DCAT-US.
Reference
§ Term name: ProvenanceStatement [DCTERMS]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| provenance statement text | rdfs:label | xsd:string | R | 0..n |
Recommended Properties
Property: provenance statement text
Propertyprovenance statement textRequirement levelRecommendedCardinality0..nURIrdfs:label
Rangerdfs:Literal
DefinitionThis property contains the text of the Provenance Statement. This property can be repeated for parallel language versions of the statement.
Role
RDF Class:dcat:Role
ObligationOptional
Definition: A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships.
Usage note
Used in a qualified-attribution to specify the role of an Agent with respect to an Entity. It is recommended that the values be managed as a controlled vocabulary of agent roles, such as [[?ISO-19115-1]] CI_RoleCode.
Rationale: Integrating dcat:Role within dcat:Relationship in DCAT-US enriches data networks by providing clear, navigable, and semantically transparent relationships among datasets, thereby enhancing data discoverability, usability, and integration across various applications and use cases by precisely depicting complex data dependencies and hierarchies.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| alternate label | skos:altLabel | rdfs:Literal | O | 0..n |
| definition | skos:definition | rdfs:Literal | R | 0..n |
| in scheme | skos:inScheme | skos:ConceptScheme | M | 1..1 |
| notation | skos:notation | xsd:string | O | 0..n |
| preferred label | skos:prefLabel | rdfs:Literal | M | 1..n |
Mandatory Properties
Property: preferred label
Requirement level: Mandatory
Cardinality: 1..n
URI: skos:prefLabel
Range: rdfs:Literal
Definition: Preferred label for the controlled vocabulary term (one per language).
Property: in scheme
Requirement level: Mandatory
Cardinality: 1
URI: skos:inScheme
Range: skos:ConceptScheme
Definition: Concept scheme defining the role.
Recommended Properties
Property: definition
Requirement level: Recommended
Cardinality: 0..n
URI: skos:definition
Range: rdfs:Literal
Definition: Definition of the controlled vocabulary term.
Optional Properties
Property: alternate label
Requirement level: Optional
Cardinality: 0..n
URI: skos:altLabel
Range: rdfs:Literal
Definition: Alternative labels for a role.
Property: notation
Requirement level: Optional
Cardinality: 0..n
URI: skos:notation
Range: xsd:string
Definition: Abbreviations or codes for the role.
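As a minimal sketch of how these properties fit together, the snippet below defines one role in a concept scheme and uses it in a qualified attribution. The `:` namespace, the scheme, the role, its labels and definition, and the agent are hypothetical; a real catalog would draw its roles from a managed vocabulary such as the one suggested in the usage note.

```turtle
@prefix :     <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Hypothetical controlled vocabulary of agent roles
:agent-roles a skos:ConceptScheme ;
    skos:prefLabel "Agent roles"@en .

:funder a dcat:Role ;
    skos:prefLabel  "funder"@en ;            # mandatory, one per language
    skos:inScheme   :agent-roles ;           # mandatory
    skos:definition "Party that provides monetary support for the resource."@en ;
    skos:altLabel   "sponsor"@en ;
    skos:notation   "funder" .

# The role qualifies the attribution of a dataset to an agent
:dataset-001 a dcat:Dataset ;
    prov:qualifiedAttribution [
        a prov:Attribution ;
        prov:agent   :agency-a ;
        dcat:hadRole :funder
    ] .
```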
Quality Measurement
RDF Class: dqv:QualityMeasurement
Obligation: Optional
Definition: Represents the evaluation of a given dataset (or dataset distribution) against a specific quality metric.
Usage note: Represents the evaluation of a given resource (such as a Data Service, Dataset, or Distribution) against a specific quality metric, such as spatial resolution in scale, angle, or metric.
Rationale: The inclusion of dqv:QualityMeasurement in DCAT-US assists end-users in better evaluating the fitness for use of resources. This optional class enhances data quality assessment, aligns with international standards (DQV), and enables more precise evaluation against specific quality metrics, ultimately improving data usability and adherence to recognized quality assessment practices.
Reference
§ 4.1 Class: Quality Measurement [VOCAB-DQV]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| is measurement of | dqv:isMeasurementOf | dqv:Metric | M | 1 |
| value | dqv:value | rdfs:Literal | M | 1 |
| unit of measure | sdmx-attribute:unitMeasure | rdfs:Resource | O | 0..1 |
Mandatory Properties
Property: is measurement of
Requirement level: Mandatory
Cardinality: 1
URI: dqv:isMeasurementOf
Range: dqv:Metric
Definition: Indicates the metric being observed.
Property: value
Requirement level: Mandatory
Cardinality: 1
URI: dqv:value
Range: rdfs:Literal
Definition: Refers to values computed by metric.
Optional Properties
Property: unit of measure
Requirement level: Optional
Cardinality: 0..1
URI: sdmx-attribute:unitMeasure
Range: rdfs:Resource
Definition: Unit of measure associated with the value.
Example
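A minimal sketch of a quality measurement attached to a dataset, assuming a spatial-resolution metric. The `:` namespace, the dataset, the metric, and the measured value are hypothetical. The link property dqv:hasQualityMeasurement comes from DQV and is not listed in the tables above, and the QUDT unit IRI is used only as an illustrative unit resource.

```turtle
@prefix :               <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat:           <http://www.w3.org/ns/dcat#> .
@prefix dqv:            <http://www.w3.org/ns/dqv#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix xsd:            <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical metric: spatial resolution expressed as a distance
:spatialResolutionAsDistance a dqv:Metric .

:dataset-001 a dcat:Dataset ;
    dqv:hasQualityMeasurement :measurement-001 .

:measurement-001 a dqv:QualityMeasurement ;
    dqv:isMeasurementOf :spatialResolutionAsDistance ;   # mandatory
    dqv:value "1000"^^xsd:decimal ;                      # mandatory
    sdmx-attribute:unitMeasure <http://qudt.org/vocab/unit/M> .   # optional; metre, illustrative
```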
Relationship
RDF Class: dcat:Relationship
Definition: An association class for attaching additional information to a relationship between DCAT Resources.
Usage note: Use to characterize a relationship between datasets, and potentially other resources, where the nature of the relationship is known but is not adequately characterized by the standard [[?DCTERMS]] properties (dcterms:hasPart, dcterms:isPartOf, dcterms:conformsTo, dcterms:isFormatOf, dcterms:hasFormat, dcterms:isVersionOf, dcterms:hasVersion, dcterms:replaces, dcterms:isReplacedBy, dcterms:references, dcterms:isReferencedBy, dcterms:requires, dcterms:isRequiredBy) or [[PROV-O]] properties (prov:wasDerivedFrom, prov:wasInfluencedBy, prov:wasQuotedFrom, prov:wasRevisionOf, prov:hadPrimarySource, prov:alternateOf, prov:specializationOf).
Rationale: The introduction of dcat:Relationship in DCAT-US serves to enhance the representation and description of relationships between datasets and other resources. This class allows for the attachment of additional information to relationships that are not adequately characterized by standard properties, promoting a more comprehensive understanding of dataset connections. By accommodating nuanced relationship types beyond existing standards like [[DCTERMS]] and [[PROV-O]] properties, DCAT-US ensures greater flexibility and precision in documenting dataset relationships, facilitating more informed data discovery and utilization.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| relation | dcterms:relation | dcat:Resource | M | 1 |
| role | dcat:hadRole | dcat:Role | M | 1 |
Mandatory Properties
Property: relation
Requirement level: Mandatory
Cardinality: 1
URI: dcterms:relation
Range: dcat:Resource
Definition: The resource related to the source resource.
Property: role
Requirement level: Mandatory
Cardinality: 1
URI: dcat:hadRole
Range: dcat:Role
Definition: The function of an entity or agent with respect to another entity or resource.
Example
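A minimal sketch of a qualified relationship between two datasets. The `:` namespace, both datasets, the role, and the role scheme are hypothetical; dcat:qualifiedRelation is the DCAT property that links a resource to a dcat:Relationship and is not listed in the tables above.

```turtle
@prefix :        <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .

# Hypothetical role describing a relationship that the standard
# DCTERMS and PROV-O properties do not express
:supplement a dcat:Role ;
    skos:prefLabel "is supplement to"@en ;
    skos:inScheme  :relationship-roles .

:relationship-roles a skos:ConceptScheme .

:dataset-002 a dcat:Dataset ;
    dcat:qualifiedRelation [
        a dcat:Relationship ;
        dcterms:relation :dataset-001 ;   # mandatory: the related resource
        dcat:hadRole     :supplement      # mandatory: the nature of the relationship
    ] .
```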
RightsStatement
RDF Class: dcterms:RightsStatement
Obligation: Optional
Definition: A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights.
Usage note: Information about rights SHOULD be provided on the level of Distribution. Information about rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
Rationale: The introduction of dcterms:RightsStatement in DCAT-US is vital for standardizing the conveyance of intellectual property rights (IPR) and access permissions. This optional class accommodates URL references and custom rights statements via attribution text, promoting transparency and compliance. By encouraging consistent rights information at the Distribution and optional Dataset levels, DCAT-US enhances data sharing while reducing legal conflict risks.
Reference
§ Term name: RightsStatement [DCTERMS]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| label | rdfs:label | rdfs:Literal | R | 0..n |
| attribution text | odrs:attributionText | rdfs:Literal | R | 0..n |
Recommended Properties
Property: label
Requirement level: Recommended
Cardinality: 0..n
URI: rdfs:label
Range: rdfs:Literal
Definition: This property contains the text of the Rights Statement. This property can be repeated for parallel language versions of the name - see § 4.3. Multilingualism section.
Property: attribution text
Requirement level: Recommended
Cardinality: 0..n
URI: odrs:attributionText
Range: rdfs:Literal
Definition:
Example
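A minimal sketch of rights information on a Distribution, shown both as a URL reference and as an in-line statement. The `:` namespace, the distributions, the rights URL, and the statement and attribution texts are hypothetical placeholders.

```turtle
@prefix :        <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix odrs:    <http://schema.theodi.org/odrs#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

# Rights statement referenced by URL (hypothetical URL)
:distribution-001 a dcat:Distribution ;
    dcterms:rights <https://example.org/rights/terms-of-use> .

# Custom rights statement with label and attribution text
:distribution-002 a dcat:Distribution ;
    dcterms:rights [
        a dcterms:RightsStatement ;
        rdfs:label "Reuse is permitted provided the source is acknowledged."@en ;
        odrs:attributionText "Source: Example Agency"@en
    ] .
```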
Standard
RDF Class: dcterms:Standard
Obligation: Optional
Definition: A standard or other specification to which a Dataset or Distribution conforms.
Usage note: A standard or other specification to which a Catalog, Catalog Record, Data Service, Dataset, or Distribution conforms.
Rationale: The inclusion of dcterms:Standard in DCAT-US accommodates standard references through URLs or custom, detailed descriptions when specific standards are not available, promoting flexibility and completeness in resource metadata.
Reference
§ Term name: Standard [DCTERMS]
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| description | dcterms:description | rdfs:Literal | R | 0..n |
| identifier | dcterms:identifier | xsd:string | R | 0..n |
| issued | dcterms:issued | xsd:date | R | 0..1 |
| title | dcterms:title | rdfs:Literal | R | 0..n |
| type | dcterms:type | skos:Concept | R | 0..n |
| version | dcat:version | xsd:string | R | 0..1 |
| in scheme | skos:inScheme | skos:ConceptScheme | O | 0..1 |
| creation date | dcterms:created | xsd:date | O | 0..1 |
| update/modification date | dcterms:modified | xsd:date | O | 0..1 |
Recommended Properties
Property: title
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:title
Range: xsd:string
Definition: This property contains a name given to the Standard. This property can be repeated for parallel language versions of the name.
Property: description
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:description
Range: rdfs:Literal
Definition: This property contains a free-text account of the Standard. This property can be repeated for parallel language versions of the description - see Multilingualism.
Property: identifier
Requirement level: Recommended
Cardinality: 0..n
URI: dcterms:identifier
Range: xsd:string
Definition: This property contains the main identifier for the Standard, e.g. the URI or other unique identifier in the context of the Catalog, or of a reference register.
Property: issued
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:issued
Range: rdfs:Literal (typed as xsd:date, xsd:dateTime, xsd:gYear or xsd:gYearMonth)
Definition: This property contains the date of formal issuance (e.g., publication) of the Standard.
Property: type
Requirement level: Recommended
Cardinality: 0..1
URI: dcterms:type
Range: skos:Concept
Definition: This property refers to the type of the Standard. A controlled vocabulary for the values has not been established.
Property: version
Requirement level: Optional
Cardinality: 0..1
URI: dcat:version
Range: xsd:string
Definition: This property contains a version number or other version designation of the Standard.
Property: in scheme
Requirement level: Recommended
Cardinality: 0..1
URI: skos:inScheme
Range: skos:ConceptScheme
Definition: This property MAY be used to specify the reference register to which the Standard belongs.
Optional Properties
Property: creation date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:created
Range: xsd:date
Definition: This property contains the date on which the Standard was first created.
Property: update/modification date
Requirement level: Optional
Cardinality: 0..1
URI: dcterms:modified
Range: xsd:date
Definition: This property contains the most recent date on which the Standard was changed or modified.
Examples
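A minimal sketch of a dataset that conforms to a standard described in-line. The `:` namespace, the dataset, and every detail of the standard (title, identifier, dates, version) are invented for illustration and do not describe a real specification.

```turtle
@prefix :        <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

:dataset-001 a dcat:Dataset ;
    dcterms:conformsTo :metadata-guideline .

# Hypothetical standard
:metadata-guideline a dcterms:Standard ;
    dcterms:title       "Example Agency Metadata Guideline"@en ;
    dcterms:description "Rules for describing datasets published by Example Agency."@en ;
    dcterms:identifier  "EA-MG-2" ;
    dcterms:issued      "2020-01-15"^^xsd:date ;
    dcat:version        "2.0" ;
    dcterms:created     "2019-06-01"^^xsd:date ;
    dcterms:modified    "2021-03-10"^^xsd:date .
```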
UseRestriction
RDF Class: dcat-us:UseRestriction
Definition: A UseRestriction is a set of rules, guidelines, or legal provisions that dictate how a particular resource, asset, information, or object can be utilized. Use restrictions may encompass limitations on access, distribution, reproduction, modification, or sharing, and they are often put in place to protect privacy, intellectual property rights, security, or compliance with legal or ethical standards.
Usage note: When utilizing the dcat-us:UseRestriction class, data publishers are encouraged to provide comprehensive and precise details regarding the specific use restrictions applied to a resource. This may include information on access limitations, distribution rules, reproduction guidelines, modification constraints, and any other pertinent restrictions. Adherence to NARA guidelines and standards should be a priority when defining use restrictions, ensuring that data resources align with archival and preservation practices. By offering clear and concise use restriction information, data consumers can make informed decisions about the utilization of these resources while complying with NARA's requirements.
Rationale: The introduction of dcat-us:UseRestriction in DCAT-US 3.0, aligned with NARA (National Archives and Records Administration) guidelines, enhances compliance and interoperability with NARA-specific use restriction standards. This enables organizations to accurately convey NARA-specific restrictions on data resources, ensuring adherence to archival and data preservation requirements, and promoting consistent data management practices within the DCAT-US framework.
Properties Summary
| Property | URI | Range | ReqLevel | Card |
| --- | --- | --- | --- | --- |
| restriction status | dcat-us:restrictionStatus | skos:Concept | M | 1..1 |
| specific restriction | dcat-us:specificRestriction | skos:Concept | R | 0..1 |
| restriction note | dcat-us:restrictionNote | rdfs:Literal | O | 0..1 |
Mandatory Properties
Property: restriction status
Requirement level: Mandatory
Cardinality: 1
URI: dcat-us:restrictionStatus
Range: skos:Concept
Definition: Indication of whether or not there are use restrictions on the archival materials.
Recommended Properties
Property: specific restriction
Requirement level: Recommended
Cardinality: 0..1
URI: dcat-us:specificRestriction
Range: skos:Concept
Definition: The identification of the type of use restrictions, based on copyright, donor, or statutory provisions, on the archival materials.
Optional Properties
Property: restriction note
Requirement level: Optional
Cardinality: 0..1
URI: dcat-us:restrictionNote
Range: rdfs:Literal
Definition: Significant information pertaining to the use or reproduction of the data.
Example
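A minimal sketch of a use restriction node. Several things here are assumptions rather than facts taken from this profile: the namespace IRI bound to the dcat-us prefix, the `:` namespace, the two concept IRIs and their labels, and the note text. The property that links a resource to its use restriction is not shown in this section, so the sketch describes the restriction on its own.

```turtle
@prefix :        <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat-us: <http://resources.data.gov/ontology/dcat-us#> .   # assumed namespace IRI
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .

:use-restriction-001 a dcat-us:UseRestriction ;
    dcat-us:restrictionStatus   :status-restricted-partly ;   # mandatory
    dcat-us:specificRestriction :restriction-copyright ;      # recommended
    dcat-us:restrictionNote     "Some photographs may be subject to copyright held by third parties."@en .

# Hypothetical controlled-vocabulary concepts
:status-restricted-partly a skos:Concept ;
    skos:prefLabel "Restricted - Partly"@en .

:restriction-copyright a skos:Concept ;
    skos:prefLabel "Copyright"@en .
```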
Usage Guidelines
Dereferenceable identifiers
The FAIR principles, under the Findability and the Accessibility chapters respectively, state that:
- F1. (Meta)data are assigned a globally unique and persistent identifier
- A1. (Meta)data are retrievable by their identifier using a standardized communications protocol
The ability to unambiguously identify and access resources is foundational to digital data management. Guided by the FAIR principles, this section covers how to generate resolvable URLs, why URI resolution matters, the roles of various identifier resolution services, and the distinctions between alternate identifier properties, so that data is both uniquely identifiable and consistently accessible.
Generating Resolvable URLs
In the context of FAIR data, resources on the web must have unique, persistent, and resolvable identifiers. To achieve persistence, resource identifiers must comply with the IETF RFC 3986 standard for URIs (and IRIs, which are URIs extended to cope with Unicode). This means that an identifier must comprise the following components:
- a scheme: http or https
- an authority: www.example.com
- optionally a path: /dataset-name/
- a local identifier (such as a database accession number, for example P12133 from UniProt) or a globally unique identifier (such as a UUID or hash code).
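The components listed above can be seen in a single dataset identifier. This is a minimal sketch: the host data.example.gov, the path, and the UUID are placeholders chosen for illustration.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# scheme:    https
# authority: data.example.gov   (hypothetical host)
# path:      /dataset/
# local id:  123e4567-e89b-12d3-a456-426614174000   (a UUID)
<https://data.example.gov/dataset/123e4567-e89b-12d3-a456-426614174000>
    a dcat:Dataset ;
    dcterms:identifier "123e4567-e89b-12d3-a456-426614174000" .
```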
Identifier Resolution
URI resolution is a fundamental process that involves directing requests to the appropriate identified entity. The standard approach typically entails resolving an HTTP GET request through content negotiation, enabling the selection of different representations of the desired resource.
A PURL, or persistent URL, serves as a permanent address for accessing web resources. To grasp the concept of PURLs, it's essential to first understand the concept of URL indirection (also known as URL redirect or URL forwarding). This practice involves providing a stable and fixed web address/URL that is configured to point to different content, which might undergo periodic modifications.
When a user accesses a PURL, they are automatically redirected to the current location of the resource. This means that when an author decides to relocate a page, they can easily update the PURL to direct it to the new location.
The practice of indirection proves beneficial as it ensures a consistent URL address for resources that are prone to change, such as due to version updates or ownership changes.
A concrete example of this practice is the use of PURLs to identify OBO Foundry resources. For instance, the URL http://purl.obolibrary.org/obo/stato.owl redirects to the latest release of the file, which can be found at https://raw.githubusercontent.com/ISA-tools/stato/dev/releases/latest_release/stato.owl.
PURLs sharing a common prefix are organized into domains, each managed by a single maintainer. The maintainer has the authority to add new PURLs to the domain and make modifications to existing PURLs within that domain.
According to FAIR Principle A1, it is essential for (meta)data to be retrievable using its identifier. When the identifier itself is not a resolvable URL, Identifier Resolution Services are required. These services possess the capability to map an Internationalized Resource Identifier (IRI) to a specific location where the corresponding data can be accessed.
Identifier Resolution services
Ensuring consistent and persistent access to resources is paramount. Identifier Resolution Services play a crucial role in achieving this by providing unique and persistent identifiers for various digital objects and entities. This section describes several prominent services, detailing their functions and significance in the broader digital ecosystem. Please note that this is not an exhaustive list but rather a selection of popular examples intended to illustrate the diversity and importance of such services.
purl.org
The PURL system is a service of the Internet Archive, which provides an interface to administer domains. For more information about the service, visit https://archive.org/services/purl/help
w3ids
W3IDs.org provides persistent identifiers for Linked Data resources. These identifiers can be used in DCAT to uniquely identify datasets and data services. This can help to improve the discoverability and interoperability of datasets and data services. W3IDs.org is an important part of the Linked Data ecosystem and plays a key role in making data more discoverable and interoperable.
To register an identifier, send a request to add a redirect to the public-perma-id@w3.org mailing list. Make sure to include the URL that you want on w3id.org, the URL that you want to redirect to, and the HTTP code that you want to use when redirecting. An administrator will then create the redirect for you.
doi.org
DOI.org is a digital identifier system that assigns unique and persistent identifiers to digital objects. These identifiers can be used to cite, share, and track digital objects across different platforms and systems. DOI.org identifiers can be used in DCAT to uniquely identify datasets and data services. This can help to improve the discoverability and interoperability of datasets and data services.
orcid.org
ORCID (Open Researcher and Contributor ID) is a global, non-profit organization that provides a unique and persistent identifier for researchers. ORCID IDs are used to link researchers to their professional activities, such as publications, grants, and affiliations. This helps to ensure that researchers are properly credited for their work and that their work is more easily discoverable. ORCID is a valuable tool for researchers, and it is becoming increasingly important as the research landscape becomes more complex.
arxiv.org
ArXiv identifiers are globally unique identifiers (GUIDs) assigned to scholarly articles submitted to the arXiv preprint server. These identifiers can be used in DCAT (Data Catalog Vocabulary) to uniquely identify authors and their publications. This can help to improve the discoverability and interoperability of research data.
Identifiers.org
The Identifiers.org Resolution Service provides consistent access to life science data using Compact Identifiers. Compact Identifiers consist of an assigned unique prefix and a local provider designated accession number (prefix:accession). The resolving location of Compact Identifiers is determined using information that is stored in the Identifiers.org Registry.
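As a minimal sketch, identifiers from several of these services can appear together in one dataset description. The w3id.org path and the DOI are placeholders invented for this sketch, the ORCID iD is the sample identifier ORCID uses in its own documentation, and the Compact Identifier reuses the UniProt accession mentioned earlier in this section. The choice of dcterms:references for the last link is illustrative only.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# Dataset identified by a persistent w3id.org URL (hypothetical redirect)
<https://w3id.org/example-agency/dataset/001>
    a dcat:Dataset ;
    # placeholder DOI, resolvable through doi.org once registered
    dcterms:identifier "https://doi.org/10.5555/12345678"^^xsd:anyURI ;
    # creator identified by an ORCID iD (ORCID's documented sample iD)
    dcterms:creator <https://orcid.org/0000-0002-1825-0097> ;
    # Compact Identifier (prefix:accession) resolved by Identifiers.org
    dcterms:references <https://identifiers.org/uniprot:P12133> .
```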
Alternate identifiers
In the realm of data cataloging, identifiers play a pivotal role in ensuring the uniqueness, traceability, and interoperability of resources. Different namespaces and vocabularies offer distinct properties to denote identifiers. Here, we discuss three such properties: dcterms:identifier, adms:identifier, and skos:notation, shedding light on their distinct usages and nuances.
dcterms:identifier
Originating from the Dublin Core Metadata Terms (DCTERMS), dcterms:identifier is a broad and general property used to denote a unique reference for a resource. It does not impose any constraint on the format or nature of the identifier. In essence, it's a flexible property that can be employed across various domains and for diverse types of resources, be they digital documents, physical artifacts, or abstract concepts.
adms:identifier
The Asset Description Metadata Schema ([[VOCAB-ADMS]]) introduces adms:identifier. Unlike the more generic dcterms:identifier, this property is more structured. It's designed to link a resource to its identifier, which is itself described using further properties. This allows for a richer description of the identifier, such as specifying its type (e.g., ISBN, DOI), its status, and its version. It's particularly useful in contexts where there's a need to provide additional metadata about the identifier itself, beyond just its value.
skos:notation
skos:notation is a property from the Simple Knowledge Organization System ([[SKOS-REFERENCE]]) vocabulary. It's used to provide a symbolic string notation for a concept. While it can function similarly to an identifier, its primary intention is to give a machine-readable, often standardized, symbolic name to a concept, especially when such a notation exists in a legacy or external system. For example, in a controlled vocabulary, each concept might have a notation that denotes its code in a classification scheme.
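The difference between the three properties can be sketched side by side. All values below are hypothetical: the `:` namespace, the dataset, the identifier strings, the agency name, and the concept with its code.

```turtle
@prefix :        <http://example.org/> .   # placeholder namespace (assumption)
@prefix adms:    <http://www.w3.org/ns/adms#> .
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .

:dataset-001 a dcat:Dataset ;
    # dcterms:identifier: a plain identifier value with no further structure
    dcterms:identifier "dataset-001" ;
    # adms:identifier: points to a node that describes the identifier itself
    adms:identifier [
        a adms:Identifier ;
        skos:notation     "10.5555/12345678" ;             # the identifier value
        adms:schemaAgency "International DOI Foundation"   # who manages the scheme
    ] .

# skos:notation on a concept: its code in a classification scheme
:theme-health a skos:Concept ;
    skos:prefLabel "Health"@en ;
    skos:notation  "HLTH" .
```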
Multilingualism
From a technical perspective multilingualism SHOULD be handled as follows:
Multilingual literals: Properties of Range rdfs:Literal can be provided in multiple languages by adding so-called language encoded strings: these add the language as an [[ISO 639-1]] two-letter code after the string in the way that is shown in the example below:
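A minimal sketch of language encoded strings in Turtle, where the language code follows the string after an `@` sign. The `:` namespace, the dataset, and its titles and keywords are hypothetical.

```turtle
@prefix :        <http://example.org/> .   # placeholder namespace (assumption)
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

:dataset-001 a dcat:Dataset ;
    dcterms:title   "Air quality measurements"@en ,
                    "Mediciones de la calidad del aire"@es ;
    dcat:keyword    "air quality"@en ,
                    "calidad del aire"@es .
```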
Content negotiation: Properties of Range rdfs:Resource SHOULD be URIs. It is important to use URIs that are language independent. Then the data publisher in the process of dispatching these URIs can use content negotiation.
The table lists multilingual properties of DCAT-US and the translation strategies that apply to them:
| Property | RDF property | Range | Multilingual Support |
| --- | --- | --- | --- |
| Catalog title | dcterms:title | rdfs:Literal | Language encoded string |
| Catalog description | dcterms:description | rdfs:Literal | Language encoded string |
| Dataset title | dcterms:title | rdfs:Literal | Language encoded string |
| Dataset description | dcterms:description | rdfs:Literal | Language encoded string |
| Dataset keyword | dcat:keyword | rdfs:Literal | Language encoded string |
| Catalog homepage | foaf:homepage | foaf:Document | Content negotiation |
| Dataset landing page | dcat:landingPage | foaf:Document | Content negotiation |
| Catalog publisher | dcterms:publisher | foaf:Agent | Content negotiation for the URI and language encoded string for the name |
| Dataset publisher | dcterms:publisher | foaf:Agent | Content negotiation for the URI and language encoded string for the name |
Stakeholders
Understanding the entities involved in the creation, curation, and maintenance of datasets is essential to data cataloging and management. This section describes these entities, categorized into the classes "Agent," "Person," and "Organization." Each class provides a structured way to represent stakeholders, from individuals to software agents and organizations, so that data provenance is transparent and traceable. The section covers the properties, roles, and significance of these agent representations within DCAT-US 3.0.
Using globally unique, resolvable URLs and Persistent Identifiers (PIDs) to identify agents strengthens the integrity and usability of data ecosystems. A single, stable identifier per agent gives unambiguous identification, avoids the duplication and inconsistency that arise when multiple URIs refer to the same agent, and improves data discoverability and accessibility. It also guards against misinterpretation and supports a coherent, traceable data lineage, which strengthens provenance and trust across platforms and datasets. Using reference registries such as ORCID or the Research Organization Registry (ROR) aligns with global data management practice and supports collaborative research and data sharing. More details about identifiers are provided in the Dereferenceable identifiers section.
Agent
The foaf:Agent class in the Friend of a Friend [[FOAF]] ontology serves a dual-purpose role, particularly in the context of data cataloging and management.
Firstly, it acts as an abstract class for both org:Organization and foaf:Person, providing a generalized representation that encompasses various entities involved in dataset production and management. This abstraction facilitates the encapsulation of common properties and behaviors, enabling a unified approach to handling different entity types in data documentation and interoperability.
Secondly, foaf:Agent is utilized as a class to represent autonomous software agents, which are self-operating software entities capable of performing tasks and making decisions without direct human intervention.
This dual functionality of foaf:Agent not only streamlines the representation of human and non-human actors in data management processes but also provides a flexible and semantically rich framework to describe and interlink various entities within the DCAT-US schema, thereby enhancing data discoverability and usability.
Person
A person agent represents an individual involved in producing or managing datasets. It provides information about the person and their associated contact details.
In the context of DCAT-US 3.0, the foaf:Person class plays a crucial role. It is used to represent individuals who are associated with or responsible for the datasets or resources described within the DCAT profile.
Let's break down the specific properties associated with foaf:Person and their significance within this context:
- foaf:name: This property represents the full name of a person. It can be used to specify the full name of individuals associated with datasets. For example, if a person's full name is "John Smith," you can use this property to provide their complete name. This property is the only property mandatory for describing a person and is typically used for display.
- foaf:firstName: This optional property represents the first name of a person. It is used to provide the first name of individuals associated with or responsible for resources in the DCAT-US profile. It can be used to provide structured information about an individual's first name.
- foaf:givenName: This optional property represents the given name of a person. It can be used to provide structured information about an individual's name.
- org:memberOf: While not a FOAF property, this property is typically used to indicate the organization or group with which a person is affiliated. In the context of a DCAT-US profile, it can be used to specify the organization or entity with which an individual is affiliated in relation to the described resources. For instance, if a person is a member of the Department of the Interior, you can use this property to link them to that organization, identified by http://www.doi.gov.
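The properties above can be combined as in the following minimal sketch. The `:` namespace and the person are hypothetical; the organization IRI is the one used in the org:memberOf description above.

```turtle
@prefix :     <http://example.org/> .   # placeholder namespace (assumption)
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org:  <http://www.w3.org/ns/org#> .

:person-john-smith a foaf:Person ;
    foaf:name      "John Smith" ;          # mandatory, used for display
    foaf:givenName "John" ;                # optional
    org:memberOf   <http://www.doi.gov> .  # affiliation

<http://www.doi.gov> a org:Organization ;
    foaf:name "Department of the Interior" .
```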
Organization
The Organization agent plays a pivotal role in representing an organization or institution that is instrumental in the production or management of a resource. It encapsulates information about the organization accountable for the resource, along with its pertinent contact details, thereby acting proficiently as an Agent. Furthermore, it can be hierarchically decomposed into sub-organizations, offering a structured view of the organizational layers.
When employing org:Organization within DCAT-US 3.0, adherence to the following guidelines is imperative:
- Use Recognized URL Identifiers: It is strongly recommended to utilize well-known URL identifiers for organizations that are centrally managed by a government registry. This practice ensures the unambiguous identification of organizations and fosters consistency and reliability in organizational referencing across cataloged resources.
- Ensure Consistency: Employ foaf:name to furnish a consistent and recognizable name for the organization, thereby maintaining a uniform identity across various platforms.
- Enhance Discoverability: Leverage skos:prefLabel to designate the preferred label, ensuring that the organization is effortlessly discoverable and identifiable across diverse search scenarios.
- Accommodate Variations: Utilize skos:altLabel to incorporate alternative names, acronyms, or aliases, thereby enhancing searchability and augmenting user-friendliness by accommodating various naming conventions.
- Provide Abbreviations: Employ skos:notation to document any abbreviations or short forms that are commonly associated with the organization, facilitating users in recognizing and associating the organization with its widely-used abbreviations.
- Represent Hierarchy: Optionally, utilize org:subOrganizationOf to depict hierarchical relationships, providing a structured and layered view of the organization and its sub-entities.
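The guidelines above can be sketched as follows. The `:` namespace, the IRI chosen for the sub-organization, and the alternative label are assumptions made for illustration; the parent organization IRI is the one used earlier in this section.

```turtle
@prefix :     <http://example.org/> .   # placeholder namespace (assumption)
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org:  <http://www.w3.org/ns/org#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://www.doi.gov> a org:Organization ;
    foaf:name      "Department of the Interior" ;
    skos:prefLabel "Department of the Interior"@en ;
    skos:altLabel  "Interior Department"@en ;
    skos:notation  "DOI" .

# Sub-organization linked to its parent (hypothetical IRI)
:usgs a org:Organization ;
    foaf:name      "U.S. Geological Survey" ;
    skos:prefLabel "U.S. Geological Survey"@en ;
    skos:notation  "USGS" ;
    org:subOrganizationOf <http://www.doi.gov> .
```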
Contact Point
The Contact Point serves as a crucial element in data cataloging, providing a reference for users to seek additional information, clarifications, or support regarding a resource published in a catalog. In the DCAT-US profile, contact point information is encoded using the widely used [[VCARD-RDF]] vocabulary, ensuring standardized representation and interoperability of contact details across various platforms and applications.
A contact point may refer to an individual, a team, or an organization responsible for the resource (dcat:Dataset, dcat:DataService, dcat:DatasetSeries, dcat:Catalog) and is typically characterized by properties such as name, email, and telephone number. The inclusion of address details, role or title, and associated organizational details further enriches the contact information, providing users with multiple avenues to facilitate communication.
It is imperative to ensure that the contact point information is accurate, up-to-date, and reliable to foster trust and facilitate efficient communication between data providers and consumers. The following sub-sections provide detailed guidance on encoding contact point information, defining associated address details, and linking the contact point to the resources in the DCAT-US profile.
Encoding Contact Information
The contact information is encoded using the vcard:Kind class. If the contact information is reused in many resources, it is recommended to identify it with a URI to avoid duplicate entries.
The vcard:fn (formatted name) and vcard:email (email address) properties are mandatory to ensure basic contactability. Additional properties like vcard:tel (telephone number) and vcard:title (role or title) can be utilized to provide comprehensive details about the contact point. If the contact is a person, the properties vcard:given-name and vcard:family-name can be used.
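@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix :      <http://example.org/> .   # placeholder namespace for the example

:vcard123 a vcard:Kind ;
    vcard:fn "John Doe" ;
    vcard:email <mailto:john.doe@example.com> ;
    vcard:tel <tel:+123456789> ;
    vcard:family-name "Doe" ;
    vcard:given-name "John" ;
    vcard:title "Data Manager" ;
    vcard:hasAddress :address456 ;
    .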
Defining Address Details
Address details, when applicable, are encoded using the vcard:Address class and linked to the contact point using the vcard:hasAddress property. The address does not need a URI if it is not reused anywhere else in the catalog. The address class can include properties like vcard:street-address, vcard:locality, vcard:region, vcard:postal-code, and vcard:country-name to provide detailed location information about the contact point.
The following example illustrates how to define and encode address details, ensuring clarity and usability for data consumers.
:address456 a vcard:Address ;
  vcard:street-address "123 Main Street" ;
  vcard:locality "Anytown" ;
  vcard:region "CA" ;
  vcard:postal-code "12345" ;
  vcard:country-name "USA" ;
.
Linking Contact Point to Resource
The contact point is associated with the dataset using the dcat:contactPoint property. This linkage ensures that users can easily identify and communicate with the responsible entity for additional information, support, or inquiries regarding the dataset.
The following example illustrates how to link the defined contact point to the dataset, ensuring clarity and facilitating user navigation and communication.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
:MyDataset a dcat:Dataset ;
  dcterms:title "My Example Dataset" ;
  dcterms:description "This dataset includes example data for demonstration purposes." ;
  dcat:contactPoint :vcard123 ;
.
Resource Attributions
Attribution in data catalogs pertains to the systematic association of a resource (such as a dataset or service) with a responsible entity, termed an "agent". Agents, which can be individuals, organizations, or services, may contribute to, create, publish, or interact significantly with the data. The roles of agents, such as contributor, creator, publisher, funder, distributor, custodian, or editor, are crucial in understanding the lineage and responsibility of data management.
Attributions hold paramount importance in data catalog searches for several pivotal reasons:
Provenance and Trustworthiness: Understanding the entities (agents) that have created or interacted with the data can significantly inform assessments of its quality and trustworthiness. Data originating from or managed by reputable and trusted organizations or individuals may be deemed more reliable and credible.
Credit and Accountability: Proper attributions ensure that all contributing individuals or organizations are aptly acknowledged for their work or data. This practice not only adheres to ethical guidelines and potentially legal requirements but also fosters a culture of recognition and accountability in data management and sharing.
Search and Discovery: Attributions serve as a valuable criterion in data search and discovery processes. Users may seek datasets created, managed, or contributed to by specific researchers, organizations, or other agents, thereby making attributions a vital component in filtering and locating data resources.
Collaboration and Networking: Identifying and acknowledging the agents associated with datasets can pave the way for new collaborative opportunities. It enables users and researchers to identify and connect with individuals or organizations possessing relevant expertise or shared research interests.
Issue Resolution: When users encounter issues or have queries about a dataset, attributions provide a clear pathway to seek clarifications, report issues, or obtain additional information. This ensures that data reliability and integrity are maintained through active resolution of issues and continuous improvement.
Standard Attributions and Roles
Employing standard properties such as dcterms:creator, dcterms:contributor, dcterms:rightsHolder, and dcterms:publisher, along with the generic prov:wasAttributedTo from [[!PROV-O]], facilitates the basic associations of responsible agents with a cataloged resource, ensuring clarity and standardization in data attribution.
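As a minimal sketch of these standard properties, the Turtle below attaches a creator, a publisher, and a generic attribution to a dataset. The ex: namespace, the resource names, and the agent labels are hypothetical and serve only as illustration; typing the agents as foaf:Agent is likewise an assumption.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

ex:StreamflowDataset a dcat:Dataset ;
  dcterms:title "Example Streamflow Measurements" ;
  dcterms:creator ex:HydrologyTeam ;        # agent that produced the data
  dcterms:publisher ex:ExampleAgency ;      # agent that makes the data available
  prov:wasAttributedTo ex:ExampleAgency .   # generic attribution, no role stated

# Hypothetical agents, typed as foaf:Agent for illustration
ex:HydrologyTeam a foaf:Agent ;
  foaf:name "Hydrology Team" .

ex:ExampleAgency a foaf:Agent ;
  foaf:name "Example Agency" .
```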
Extended Attributions and Diverse Roles
Many other roles are significant in relation to cataloged resources, such as funder, distributor, custodian, and editor. Some of these roles are enumerated in the CI_RoleCode values from [[?ISO-19115-1]], in the [[?DataCite]] metadata schema, and included within the MARC relators.
A generalized method for assigning an agent to a resource with a specified role is provided by prov:qualifiedAttribution from [[PROV-O]]. This method is particularly useful when the nature of the relationship is known but does not correspond with one of the standard attribution property roles.
The range of prov:qualifiedAttribution is prov:Attribution. The relevant agent is specified via the property prov:agent, whereas the role is specified with the property dcat:hadRole, which takes as its value a skos:Concept describing that role, such as those included in the relevant code list operated by a US Government-controlled registry.
The prov:qualifiedAttribution property is utilized to provide more detailed and structured information about the attribution of a resource, allowing for the specification of additional attributes, such as the role or position of the attributed entity, the date of attribution, or other relevant details.
provides an illustration of the usage of attribution properties:
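A qualified attribution could be sketched in Turtle as follows. The pattern (prov:qualifiedAttribution pointing to a prov:Attribution that carries prov:agent and dcat:hadRole) follows the description above; the ex: namespace, the funder concept, and the agent are hypothetical stand-ins for entries in an actual role code list.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:StreamflowDataset a dcat:Dataset ;
  prov:qualifiedAttribution [
    a prov:Attribution ;
    prov:agent ex:ExampleFoundation ;   # who is attributed
    dcat:hadRole ex:role-funder         # in which role
  ] .

ex:ExampleFoundation a foaf:Agent ;
  foaf:name "Example Foundation" .

# Hypothetical role concept; a real catalog would reference a published code list
ex:role-funder a skos:Concept ;
  skos:prefLabel "funder"@en .
```

Because the role is a skos:Concept rather than a fixed property, the same structure covers funders, custodians, editors, and any other role the code list defines.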
Resource Classification
Controlled vocabularies, including taxonomies and thesauri, dramatically enhance data searchability. Utilizing these vocabularies allows datasets to be systematically classified, tagged, and described with standardized terms, aiding users in retrieving relevant datasets, even when using varied terms or synonyms.
Employing controlled vocabularies enables semantic search, which comprehends the context and relationships behind search terms. This approach enhances search results, for example, linking "automobiles" with related terms like "cars" or "vehicles".
This enriched search experience is crucial for navigating vast, diverse datasets, ensuring comprehensive and relevant results, and bridging the gap between user intent and dataset content.
The DCAT-US profile utilizes properties from the DCAT 3 framework for resource classification, providing flexibility in the choice of controlled vocabularies to meet the specific needs of various communities or agencies.
dcterms:type: This property specifies the category or genre of the content in a resource. It's applicable to dcat:Dataset, dcat:DataService, and dcat:DatasetSeries. For dcat:DataService, types might include "Web Map Service" (WMS) for services providing geographical data in a map format, "Web Feature Service" (WFS) for services allowing users to access geospatial features, or "RESTful API" for services using REST API protocols. For datasets, types can be "Geospatial Dataset", "Image", "Statistical Dataset", or "Map". The Dublin Core Type Vocabulary is a popular choice for providing standardized descriptors.
dcat:keyword: This property allows for the tagging of datasets with relevant keywords, facilitating easier discovery and categorization. It is suitable for use with dcat:Dataset, dcat:DataService, dcat:DatasetSeries, and dcat:Catalog. Employing keywords from established vocabularies such as AGROVOC (for agricultural terms), Global Change Master Directory (GCMD) [[?GCMD]] (for Earth science), or NAICS (for industry classifications) ensures consistency and enhances the discoverability of datasets within the US context.
dcat:theme: This property provides thematic categorization for resources, specifically for dcat:Dataset and dcat:DatasetSeries. Utilizing a unified thematic taxonomy, such as the Data Theme Taxonomy from Data.gov or the FGDC (Federal Geographic Data Committee) Controlled Vocabularies like the ISO 19115 Topic CodeList, ensures a cohesive approach to categorizing datasets. This thematic classification aids users in navigating and identifying datasets relevant to particular subjects or sectors.
dcterms:subject: Aimed at providing detailed insight into the primary subject matter of a dataset, this property is crucial for dcat:Dataset and dcat:DatasetSeries. Adoption of controlled vocabularies like Global Change Master Directory (GCMD) [[?GCMD]] for Earth science topics, FAO Agrovoc for agricultural subjects, ITIS for taxonomic information, NAICS for industry classifications, or LCSH (Library of Congress Subject Headings) enhances the clarity and searchability of datasets, particularly in the context of US Government data. These vocabularies enable precise and comprehensive subject classification, facilitating more effective data discovery and use.
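The four classification properties can be combined on a single resource. The sketch below is illustrative only: the ex: namespace and the theme and subject concepts are hypothetical placeholders for terms from whichever controlled vocabulary an agency adopts, while the dcterms:type value is the "Dataset" term of the Dublin Core Type Vocabulary.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

ex:LandCoverDataset a dcat:Dataset ;
  dcterms:title "Example Land Cover Map" ;
  dcterms:type <http://purl.org/dc/dcmitype/Dataset> ;   # Dublin Core Type Vocabulary
  dcat:keyword "land cover"@en , "remote sensing"@en ;   # free-text or vocabulary-derived tags
  dcat:theme ex:theme-environment ;                      # broad thematic category
  dcterms:subject ex:subject-land-use .                  # detailed subject term

# Hypothetical concepts standing in for entries of real controlled vocabularies
ex:theme-environment a skos:Concept ;
  skos:prefLabel "Environment"@en .

ex:subject-land-use a skos:Concept ;
  skos:prefLabel "Land use"@en .
```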
Spatial Metadata
Spatial metadata play a vital role in the context of geospatial data within the US Government by providing essential information about data quality, facilitating data discovery and interoperability, and ensuring responsible data governance. They describe the characteristics, source, and limitations of geospatial datasets, enabling informed decision-making based on data credibility. Spatial metadata support efficient data discovery, retrieval, and sharing, reducing duplication and promoting collaboration. They also promote interoperability by adhering to standardized metadata schemas and facilitate compliance with legal and regulatory requirements, ensuring accountable data stewardship. Spatial metadata are essential for maximizing the value and effective utilization of geospatial data within the US Government.
The Data Catalog Vocabulary (DCAT) specification provides a standardized way to represent metadata about datasets and services, including information about their spatial properties. In the context of DCAT-US, which is a profile tailored specifically for the United States, several spatial properties are relevant for describing resources. This section provides an overview of these spatial properties and their usage within the DCAT-US framework.
Geographic Bounding Box
A bounding box represents the minimum and maximum coordinates that enclose a specific geographic area. In DCAT-US, the dcat-us:geographicBoundingBox property and the class dcat-us:GeographicBoundingBox are introduced and utilized to define the spatial extent of a resource. This class consists of four numerical properties: the west (dcat-us:westBoundingLongitude) and east (dcat-us:eastBoundingLongitude) longitudes, followed by the north (dcat-us:northBoundingLatitude) and south (dcat-us:southBoundingLatitude) latitudes, which are based on the WGS84 coordinate system.
By specifying a bounding box, datasets can be associated with a particular geographic region. If the west bounding longitude is greater than the east bounding longitude, then the box spans the antimeridian.
Figure: Geographic Bounding Box crossing the antimeridian
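The following sketch shows two bounding boxes, one conventional and one that crosses the antimeridian. The dcat-us: namespace URI, the xsd:decimal datatype, the ex: resources, and the approximate coordinates are assumptions made for illustration and should be checked against the published vocabulary.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcat-us: <http://resources.data.gov/ontology/dcat-us#> .   # assumed namespace URI
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <http://example.org/> .                           # hypothetical namespace

# Conventional box: west longitude is less than east longitude
ex:ColoradoWells a dcat:Dataset ;
  dcat-us:geographicBoundingBox [
    a dcat-us:GeographicBoundingBox ;
    dcat-us:westBoundingLongitude "-109.06"^^xsd:decimal ;
    dcat-us:eastBoundingLongitude "-102.04"^^xsd:decimal ;
    dcat-us:northBoundingLatitude "41.00"^^xsd:decimal ;
    dcat-us:southBoundingLatitude "36.99"^^xsd:decimal
  ] .

# Box crossing the antimeridian: west (172 E) is greater than east (163 W)
ex:AleutianSurvey a dcat:Dataset ;
  dcat-us:geographicBoundingBox [
    a dcat-us:GeographicBoundingBox ;
    dcat-us:westBoundingLongitude "172.0"^^xsd:decimal ;
    dcat-us:eastBoundingLongitude "-163.0"^^xsd:decimal ;
    dcat-us:northBoundingLatitude "55.0"^^xsd:decimal ;
    dcat-us:southBoundingLatitude "51.0"^^xsd:decimal
  ] .
```

A consumer filtering by longitude would treat the second box as the union of the ranges 172 to 180 and -180 to -163.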
Defining a common reference system is of utmost importance when searching for geospatial data. Geospatial datasets are typically represented using different coordinate systems, projections, and datums, which can lead to challenges in interoperability and data integration. A common reference system ensures that data from diverse sources can be accurately aligned and combined, enabling effective analysis, visualization, and decision-making.
The introduction of the dcat-us:geographicBoundingBox property in the DCAT-US profile addresses this challenge by providing a standardized way to express the spatial extent of a resource. Unlike using a Polygon, which requires explicit geometric coordinates, the dcat-us:geographicBoundingBox offers a simpler and more interoperable approach. Here are a few reasons why the dcat-us:geographicBoundingBox is advantageous:
- Consistent Spatial Representation: Geospatial datasets can be represented in various coordinate systems, projections, and datums. Without a common reference system, it becomes difficult to align and compare datasets accurately. By establishing a common reference system, data publishers and consumers can ensure consistent spatial representation, enabling seamless integration and analysis of geospatial data from different sources.
- Interoperability and Integration: The use of a common reference system enhances interoperability among geospatial datasets and systems. It enables data from diverse sources to be combined and used together seamlessly, facilitating cross-domain analysis and decision-making. With a common reference system, data publishers can provide metadata that adheres to a standard, making it easier for data consumers to understand and utilize the data.
- Simplified Search and Discovery: The dcat-us:geographicBoundingBox property simplifies the search and discovery process for geospatial data. Instead of relying on complex geometric representations like polygons, users can specify a bounding box by defining the minimum and maximum values of latitude and longitude. Filtering geospatial data using a bounding box involves numeric comparisons, where the latitude and longitude values of data points are compared to the minimum and maximum values of the bounding box. This approach efficiently eliminates data points outside the specified spatial extent by performing simple numeric operations. It leverages the inherent numerical properties of latitude and longitude values, making it computationally efficient and compatible with spatial indexing and query optimization techniques. This makes it easier for users to define their area of interest and retrieve relevant datasets that intersect with that spatial extent.
- Query Efficiency and Performance: The use of dcat-us:GeographicBoundingBox enables efficient spatial querying of datasets. Data consumers can quickly filter and retrieve resources based on their spatial extent, reducing the need to process unnecessary data. This improves search performance and query efficiency, particularly when dealing with large-scale geospatial data collections.
- Compatibility with Existing Tools and Standards: The adoption of the dcat-us:geographicBoundingBox property aligns with […]
[…] metropolitan statistical areas, employing multiple bounding boxes for each area helps retrieve data specific to each metropolitan region, ensuring more accurate and focused results. Furthermore, non-contiguous states like Alaska and Hawaii require separate bounding boxes to accurately capture their unique spatial coverage. The inclusion of multiple bounding boxes in geospatial searches improves the accuracy and relevance of the retrieved datasets, facilitating more effective decision-making and analysis in various applications and domains.
Spatial Coverage
In DCAT 3, the dcterms:spatial property is intended to provide information about the spatial coverage or location of a resource. This property allows for the description of the spatial aspect of a dataset, dataset distribution, or data service in a standardized manner.
The dcterms:spatial property can be used to represent spatial coverage in various forms, such as coordinates, polygons, or place names. This flexibility allows data publishers to express the spatial extent of their resources in a way that is most appropriate for the given context.
For example, the dcterms:spatial property can be used to indicate the geographic bounding box that represents the extent of a dataset. This can be expressed using minimum and maximum latitude and longitude values, providing a rectangular approximation of the resource’s coverage area. Alternatively, a more precise polygon can be used to describe complex or irregularly shaped spatial extents.
By including the dcterms:spatial property, datasets can provide explicit information about their spatial coverage. This enables data consumers and applications to understand the geographic scope of a resource and determine its relevance for their specific use cases. It supports efficient searching, discovery, and integration of geospatial datasets across different platforms and systems.
Furthermore, the use of standardized properties like dcterms:spatial enhances interoperability and data exchange among different data catalogs and applications. By conforming to the DCAT 3 specification, data publishers ensure that spatial information is consistently represented and interpreted, facilitating seamless data integration and interoperability within the geospatial community.
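As an illustration of the forms mentioned above, the sketch below gives a dataset two dcterms:spatial values: a named place and an anonymous dcterms:Location whose extent is a WKT polygon attached with dcat:bbox, the pattern used in DCAT 3. The ex: namespace, the place URI, and the approximate coordinates are hypothetical.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix gsp:     <http://www.opengis.net/ont/geosparql#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

ex:ColoradoWells a dcat:Dataset ;
  # Coverage as a named place (hypothetical URI; a gazetteer URI would be used in practice)
  dcterms:spatial ex:place-colorado ;
  # Coverage as a rectangular extent, written as WKT in longitude-latitude order
  dcterms:spatial [
    a dcterms:Location ;
    dcat:bbox "POLYGON((-109.06 36.99, -102.04 36.99, -102.04 41.00, -109.06 41.00, -109.06 36.99))"^^gsp:wktLiteral
  ] .

ex:place-colorado a dcterms:Location ;
  skos:prefLabel "Colorado"@en .
```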
Spatial Resolution
Spatial resolution is a characteristic of geospatial datasets that describes the level of detail or granularity in the spatial representation. In DCAT 3, the dcat:spatialResolutionInMeters property is used to specify the spatial resolution of a resource, measured in meters. This property helps users understand the level of detail provided by the dataset and assess its suitability for their specific needs.
Applications benefit from this property in various ways. For instance, in remote sensing and satellite imagery, users can determine if the dataset captures the required level of detail for their analysis. In cartography and mapping, spatial resolution influences the clarity and accuracy of displayed features. Environmental modeling relies on appropriate resolution for accurate simulations, and emergency management requires datasets that support informed decision-making. The dcat:spatialResolutionInMeters property supports data integration, ensuring compatibility between datasets with different resolutions. Overall, this property enhances the usability and effectiveness of geospatial datasets across diverse domains.
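A short sketch of how the property might be stated, assuming a raster dataset with a 30-meter native resolution and a coarser overview distribution. The ex: resources and the values are hypothetical; xsd:decimal is the datatype DCAT 3 defines for this property.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:LandCoverDataset a dcat:Dataset ;
  dcat:spatialResolutionInMeters "30.0"^^xsd:decimal ;    # native resolution
  dcat:distribution ex:LandCoverOverview .

ex:LandCoverOverview a dcat:Distribution ;
  dcat:spatialResolutionInMeters "250.0"^^xsd:decimal .   # resampled overview
```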
Handling Map Projections and Coordinate Systems
Geographic datasets in DCAT are commonly referenced using latitude and longitude coordinates based on the WGS84 datum. This is facilitated by the recommended use of the dcat-us:geographicBoundingBox property and the corresponding class dcat-us:GeographicBoundingBox to establish a uniform reference system for searching and indexing. However, the diverse nature of geographic data often necessitates the use of various map projections and coordinate systems.
The dcterms:conformsTo property in DCAT is integral in specifying the Coordinate Reference System (CRS) utilized by a dataset or a distribution. Accurately defining the CRS is essential for understanding the spatial context, enabling precise geographic analysis and ensuring data interoperability.
Additionally, the dcterms:type property is employed alongside dcterms:conformsTo to delineate the type of reference system, be it spatial or temporal. For spatial datasets, dcterms:type typically points to a spatial reference system, as defined by URIs like http://resources.data.gov/categories/SpatialReferenceSystem.
Utilizing URIs to reference EPSG standards ensures a clear and unambiguous specification of the CRS. For example, the URI http://www.opengis.net/def/crs/EPSG/0/4269 explicitly denotes adherence to the NAD 83 CRS. Standardized references like these enhance data consistency and facilitate interoperability across various platforms and applications.
The reference system identifier SHOULD preferably be represented with an HTTP URI. In particular, spatial reference systems should be specified by using the corresponding URIs from the “EPSG coordinate reference systems” register operated by the Open Geospatial Consortium [[?OGC-EPSG]]. This registry is crucial for the precise identification of CRSs, thereby ensuring that spatial data referenced in DCAT are compatible and functional across a multitude of geospatial applications.
Example: Specifying a CRS using an EPSG code for a geographic dataset
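A declaration of this kind could be sketched as follows, using the NAD 83 URI and the reference-system type URI quoted in this section. The ex: dataset, the title literal, and the typing of the CRS as dcterms:Standard are illustrative assumptions.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

ex:ParcelBoundaries a dcat:Dataset ;
  dcterms:title "Example Parcel Boundaries" ;
  # The CRS the coordinates are expressed in (EPSG:4269, NAD 83)
  dcterms:conformsTo <http://www.opengis.net/def/crs/EPSG/0/4269> .

# Describe the referenced CRS and mark it as a spatial reference system
<http://www.opengis.net/def/crs/EPSG/0/4269> a dcterms:Standard ;
  dcterms:title "NAD 83 (EPSG:4269)"@en ;
  dcterms:type <http://resources.data.gov/categories/SpatialReferenceSystem> .
```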
Clearly defining the CRS is paramount for the effective use and integration of DCAT datasets, facilitating their application in a broad spectrum of spatial data uses.
Temporal Metadata
Temporal metadata is crucial for understanding and utilizing datasets effectively. This section is divided into three main categories to cover key aspects: Lifecycle Temporal Properties, Temporal Coverage, and Temporal Resolution. Additionally, we provide insights into handling these temporal aspects in JSON-LD format. Accurate temporal metadata ensures datasets are relevant and reliable, especially for time-sensitive analyses.
Temporal values may be expressed with several datatypes, such as xsd:date, xsd:dateTime, xsd:gYear, and xsd:gYearMonth. Supporting all of them lets publishers state each value at the precision that is actually known, which keeps the metadata accurate and interoperable while accommodating datasets with different requirements for detail.
Lifecycle Temporal Properties
Lifecycle temporal properties document the timeline of the dataset's creation, updates, and publication. These properties are crucial for understanding the dataset's history and currency.
Release Time (dcterms:issued): Indicates the date when the dataset was first made available. Formats include xsd:date (e.g., 2023-11-30), xsd:dateTime (e.g., 2023-11-30T15:00:00), xsd:gYear (e.g., 2023), and xsd:gYearMonth (e.g., 2023-11).
Revision/Update Time (dcterms:modified): Shows when the dataset was last updated, using the same formats as the release time.
Update Schedule (dcterms:accrualPeriodicity): Describes the frequency of dataset updates. Terms are taken from the Dublin Core Collection Description Frequency Vocabulary (http://www.dublincore.org/specifications/dublin-core/collection-description/frequency/). Multiple formats allow for precise scheduling, whether regular or irregular. The Frequency Coding Guide section provides a guide to coding various standard frequencies as per ISO 19115, ISO-8601, and the Dublin Core standards.
Record Creation Time (dcterms:created): Specifies the date when the catalog record itself was created, separate from the dataset it catalogs. This property uses the xsd:dateTime format.
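The release, revision, and update-schedule properties might be combined as in the sketch below. The ex: dataset and the dates are hypothetical, and the frequency is given as a URI in the http://purl.org/cld/freq/ namespace of the Dublin Core frequency vocabulary, which is one possible encoding.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

ex:EmploymentStats a dcat:Dataset ;
  dcterms:title "Example Monthly Employment Statistics" ;
  dcterms:issued "2023-11"^^xsd:gYearMonth ;                    # only year and month are known
  dcterms:modified "2023-11-30T15:00:00"^^xsd:dateTime ;        # exact time of last revision
  dcterms:accrualPeriodicity <http://purl.org/cld/freq/monthly> .
```

Each value carries its own datatype, so a coarse release date and a precise revision time can coexist on the same resource.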
Temporal Coverage of the Dataset
Temporal coverage refers to the time period the data within a dataset covers or relates to, as opposed to lifecycle properties like creation or update dates. This concept is central in data management for understanding the relevance and applicability of the dataset's content.
In DCAT, temporal coverage is defined using the property dcterms:temporal associated with the dcterms:PeriodOfTime class. This class allows for a clear specification of the coverage period through defined start and end dates. For detailed representation, formats such as xsd:date, xsd:dateTime, xsd:gYear, or xsd:gYearMonth can be used. For instance, a dataset on a year-long project might use "2023" (xsd:gYear), whereas a dataset with specific event dates might use "2023-03-15T13:00:00" (xsd:dateTime).
Marking these timeframes is typically done using dcat:startDate and dcat:endDate, offering flexibility for either fixed or open-ended periods. For example, a dataset about historical weather patterns might span from "1950-01-01" to "2000-12-31".
Adopting dcterms:PeriodOfTime in DCAT-US 3.0 aligns it with international DCAT 3 standards, improving data interoperability and ensuring consistent handling of time-related data. This alignment rectifies previous inconsistencies and enhances the usability and exchange of data.
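The weather example above, together with an open-ended period, could be written as in this sketch. The ex: resources are hypothetical.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

# Fixed period: both start and end are stated
ex:HistoricalWeather a dcat:Dataset ;
  dcterms:temporal [
    a dcterms:PeriodOfTime ;
    dcat:startDate "1950-01-01"^^xsd:date ;
    dcat:endDate   "2000-12-31"^^xsd:date
  ] .

# Open-ended period: only the start is stated, at year precision
ex:OngoingObservations a dcat:Dataset ;
  dcterms:temporal [
    a dcterms:PeriodOfTime ;
    dcat:startDate "2023"^^xsd:gYear
  ] .
```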
Temporal Resolution in Datasets and Distributions
The property dcat:temporalResolution in a dcat:Dataset or dcat:Distribution refers to the smallest time interval that can be discerned in the data. This property is essential for understanding the granularity and frequency of data recording within the dataset or its specific distributions.
Temporal resolution is particularly relevant in datasets where time plays a crucial role, such as time-series data. It indicates the level of detail at which changes or updates in the data are recorded and presented. For instance, a dataset with daily weather observations might have a temporal resolution of one day, represented as "P1D" in XML Schema duration format.
In the context of dcat:Dataset, specifying dcat:temporalResolution helps users understand the overall temporal granularity of the dataset. Conversely, when applied to dcat:Distribution, it provides resolution details specific to each distribution, acknowledging that different distributions might provide the data at different granularities.
This distinction is important for datasets available in multiple formats or distributions, as each might have different temporal characteristics. For example, a distribution with one-minute resolution would be suitable for detailed, time-sensitive analyses, while a distribution aggregated to annual values might be better suited for long-term trend analyses.
Examples of encoding durations in XML Schema duration format:
Daily Resolution: A dataset of daily observations would use "P1D", indicating that data points are one day apart.
Hourly Resolution: For hourly data, such as in a traffic flow dataset, the encoding would be "PT1H", indicating that data points are one hour apart.
The dcat:temporalResolution can be specified using various time units such as seconds, minutes, hours, days, or years, depending on the nature of the dataset or distribution. This specification aids in aligning the dataset or distribution with user expectations and analytical requirements.
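In Turtle, the resolution is stated as an xsd:duration literal. The sketch below assumes a hypothetical traffic dataset whose primary data is hourly and which also offers a daily aggregate distribution.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:TrafficFlow a dcat:Dataset ;
  dcat:temporalResolution "PT1H"^^xsd:duration ;   # one value per hour
  dcat:distribution ex:TrafficFlowDaily .

ex:TrafficFlowDaily a dcat:Distribution ;
  dcat:temporalResolution "P1D"^^xsd:duration .    # aggregated to one value per day
```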
Frequency Coding Guide
The following table provides a guide to coding various standard frequencies as per ISO 19115, [[ISO8601-1]], and the Dublin Core standards.
| ISO 19115 - MD_MaintenanceFrequencyCode | ISO-8601 | Dublin Core Collection Description Frequency Vocabulary [[CLD-FREQ]] |
| --- | --- | --- |
| continual | R/PT1S | continuous |
| daily | R/P1D | daily |
| weekly | R/P1W | weekly |
| fortnightly | R/P2W or R/P0.5W | biweekly |
| monthly | R/P1M | monthly |
| quarterly | R/P3M | quarterly |
| biannually | R/P6M | semiannual |
| annually | R/P1Y | annual |
| asNeeded | - | - |
| Irregular | - | irregular |
| notPlanned | - | - |
| unknown | - | - |
| - | R/P3Y | triennial |
| - | R/P2Y | biennial |
| - | R/P4M | threeTimesAYear |
| - | R/P2M or R/P0.5M | bimonthly |
| - | R/P0.5M | semimonthly |
| - | R/P0.33M | threeTimesAMonth |
| - | R/P1W | semiweekly |
| - | R/P3.5D | threeTimesAWeek |
Handling Temporal Formats in JSON-LD for DCAT Datasets
When dealing with a dcat:Dataset in JSON-LD, different temporal formats can be effectively represented using the JSON-LD @type attribute. This ensures that each temporal aspect of the dataset is accurately interpreted.
Example using xsd:date for a dataset's last update time (dcterms:modified):
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Annual Financial Report",
  "modified": {
    "@value": "2023-03-31",
    "@type": "xsd:date"
  }
}
Example using xsd:dateTime for a dataset's precise creation time:
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Real-Time Traffic Data",
  "created": {
    "@value": "2023-03-31T15:00:00",
    "@type": "xsd:dateTime"
  }
}
Example using xsd:gYear for a dataset's publication year (dcterms:issued):
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Decadal 2020 Census Data",
  "issued": {
    "@value": "2020",
    "@type": "xsd:gYear"
  }
}
Example using xsd:gYearMonth for representing the temporal coverage of a dataset:
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Quarterly Weather Observations",
  "temporal": {
    "@type": "dcterms:PeriodOfTime",
    "startDate": {
      "@value": "2023-01",
      "@type": "xsd:gYearMonth"
    },
    "endDate": {
      "@value": "2023-03",
      "@type": "xsd:gYearMonth"
    }
  }
}
These examples illustrate the flexible use of @type to accurately represent various temporal aspects within a dcat:Dataset. By specifying the datatype, datasets can convey precise temporal information, enhancing data usability and interpretation.
Additionally, specifying the dcat:temporalResolution in JSON-LD is straightforward. Since the @type for dcat:temporalResolution is predefined in the JSON-LD context, it's not necessary to explicitly declare it.
Here's an example of how dcat:temporalResolution might be used in a JSON-LD representation of a dataset with daily observations:
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Daily Temperature Observations",
  "temporalResolution": "P1D"
}
In this example, the dataset is defined with a temporal resolution of one day, indicated by "P1D". This notation follows the XML Schema duration format and is understood in the JSON-LD context without requiring an additional @type declaration for the resolution.
The dcterms:accrualPeriodicity property in JSON-LD specifies the frequency at which a dataset is updated or new data is added. This property is vital for users to understand how often the dataset's information is refreshed.
Example of a dataset updated daily:
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Daily Air Quality Index",
  "accrualPeriodicity": "daily"
}
Example of a dataset updated monthly:
{
  "@context": "https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld",
  "@type": "dcat:Dataset",
  "title": "Monthly Employment Statistics",
  "accrualPeriodicity": "monthly"
}
These examples illustrate the use of dcterms:accrualPeriodicity in JSON-LD to clearly represent the update frequency of datasets. By specifying this property, users can easily determine the refresh rate of the dataset's data, which is crucial for its application and relevance.
Legal Metadata
The DCAT-US specification emphasizes clear and accurate legal information provision for datasets, data series, distributions, and data services. Aligning with the FAIR data principles, this approach is essential for informed data use and ensuring compliance with relevant laws and policies. Contemporary laws increasingly mandate clear legal guidelines, highlighting the need for explicit reuse standard disclosures. The DCAT-US strategy aims to articulate legal information at the most specific sharing level to prevent conflicts and discrepancies, ensuring clarity and consistency across different data organization levels. Engaging legal experts is crucial in developing precise and understandable guidelines for each data publisher.
Legal information within DCAT-US is categorized into distinct types, each addressing different legal aspects for digital resources. This categorization helps in tailoring the guidelines to the specific nature of each resource, including those governed by the National Archives and Records Administration (NARA).
License Declarations: Provide clear guidelines on resource usage, redistribution, or modification.
Access Rights: Detail who can access the resource and under what conditions.
Usage Rights: Concern the rights associated with resource use, often detailing scope and limitations.
Liability Statements: Outline responsibilities and liabilities associated with resource usage.
Copyright Information: Clarify the copyright status of the resource.
Controlled Unclassified Information (CUI): Indicate special handling or dissemination controls for CUI, especially relevant to NARA.
Privacy and Confidentiality: Address personal data, privacy, and confidentiality issues.
Intellectual Property Rights (IPR): Deal with intellectual property aspects like patents and trademarks.
The DCAT-US profile addresses unique requirements for digital resources managed by the National Archives and Records Administration (NARA). It employs a structured approach to define access, use and CUI restrictions using well-defined code lists, enhancing the precision of legal information management, which is critical for datasets involving NARA-governed materials. These materials, governed by NARA's guidelines, involve a range of legal and ethical considerations, including content sensitivity, legal compliance, donor stipulations, and preservation needs.
Adherence to these NARA-specific guidelines is vital for the responsible management of datasets with historical, legal, or cultural significance, ensuring they align with legal frameworks. Below are the key NARA-specific restrictions:
- Access Restrictions: Detailing who can access NARA-governed digital assets and under what conditions, in accordance with the federal records legal framework. This includes guidelines on public access and accessing restricted records.
- Use Restrictions: Outlining the rights and limitations associated with the use of NARA-governed assets, focusing on specific legal constraints.
- Controlled Unclassified Information (CUI) Restrictions: Specifying the handling and dissemination controls for NARA assets that involve CUI, as mandated by law.
Access Rights
The dcterms:accessRights property in the DCAT-US specification plays a key role in defining the accessibility of a dataset, distribution, or data service. This property provides information about the access restrictions placed on a digital resource, including any limitations or permissions required for accessing the data.
Accurately defining access rights is crucial for data providers to ensure that users are aware of any conditions or limitations on accessing datasets. This transparency aids users in understanding and complying with access policies, facilitating responsible and legal use of data.
The dcterms:accessRights property is used to describe the general conditions under which a dataset is accessible. This may include information about:
Public Access: Indicating whether a dataset is publicly accessible or restricted to certain groups.
Restrictions: Detailing any specific conditions or limitations on who can access the dataset.
Requirements: Outlining any prerequisites for accessing the dataset, such as membership, subscriptions, or compliance with certain terms.
Access rights are expressed as an instance of dcterms:RightsStatement. This class accommodates URL references as well as custom rights statements given as literals via odrs:attributionText.
Additionally, access rights apply at both the dataset and distribution levels. While the dataset's access rights define the overall terms for the entire dataset, each distribution can have its own specific access rights, potentially different from the dataset's. This flexibility allows access rights to be tailored to different forms or versions of the dataset, enhancing the dataset's usability and compliance with varied user needs.
This property does not typically include the access URL or mechanism, which is covered by the dcat:accessURL property. Instead, dcterms:accessRights focuses on the legal or policy constraints that govern the availability of the dataset.
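The split between dataset-level and distribution-level access rights can be sketched as a JSON-LD-like structure built in Python. This is a minimal illustration rather than a normative serialization: the helper name `rights_statement`, the example.gov URLs, and the statement text are hypothetical, and the dictionary keys simply reuse the prefixed names used in this section.

```python
import json

def rights_statement(text=None, url=None):
    """Build a dcterms:RightsStatement node from a URL reference, literal text, or both."""
    if text is None and url is None:
        raise ValueError("a rights statement needs a URL or a text")
    node = {"@type": "dcterms:RightsStatement"}
    if url is not None:
        node["@id"] = url  # URL reference to an externally published statement
    if text is not None:
        node["odrs:attributionText"] = text  # custom statement as a literal
    return node

# Hypothetical dataset: public at the dataset level, with one distribution
# that carries its own, different access rights.
dataset = {
    "@id": "https://example.gov/dataset/permits",
    "@type": "dcat:Dataset",
    "dcterms:accessRights": rights_statement(text="Public; no registration required."),
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcterms:accessRights": rights_statement(
                url="https://example.gov/policies/restricted-access"
            ),
            # The access mechanism is kept separate from the access rights.
            "dcat:accessURL": {"@id": "https://example.gov/api/permits"},
        }
    ],
}

print(json.dumps(dataset, indent=2))
```

Keeping dcat:accessURL outside the rights statement mirrors the separation described above: one property says how to reach the data, the other says under which conditions it may be reached.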
License Document
The dcat-us:LicenseDocument is an RDF Class in the DCAT-US specification designed to specify the licensing terms under which a dataset is made available. This class plays a vital role in clearly communicating the legal permissions and restrictions associated with the use of a dataset. Its focus is broader, covering various aspects of dataset usage and redistribution in a digital environment, typically for datasets published by or for government agencies.
The License Document typically includes information such as:
Terms of Use: Detailed conditions under which the dataset can be used, modified, shared, or distributed.
Restrictions: Any limitations or prohibitions regarding the use of the dataset.
Rights Granted: Specific rights that are granted to users of the dataset, such as the right to use, reproduce, or distribute.
Attribution Requirements: Requirements for acknowledging the source of the dataset in derivative works or publications.
In dataset metadata, the dcat-us:LicenseDocument can be referenced using the property dcterms:license. This ensures that users are aware of and can easily access the licensing terms associated with a dataset.
Implementing dcat-us:LicenseDocument is crucial for data providers to ensure legal clarity and compliance, and it assists users in understanding their rights and responsibilities when using the dataset.
The spdx:licenseText property MAY only be used to specify license information in legacy metadata records that do not use a standard license from an endorsed registry. This property can be repeated for parallel language versions of the description.
Additionally, licensing applies both at the dataset and distribution levels. While a dataset license defines the overall terms for the entire dataset, each distribution can have its own specific license, potentially different from the dataset's license. This flexibility allows for tailored legal terms for different forms or versions of the dataset, enhancing the dataset's usability and compliance with varied user needs.
It is important for data providers to ensure consistency between the dataset license and distribution licenses to avoid confusion. Clear and accessible licensing information for both datasets and their distributions is essential for users to understand their legal rights and obligations.
Concerning license vocabularies, implementers are encouraged to consult legal experts to develop precise and understandable guidelines.
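A publisher who wants to review consistency between dataset and distribution licenses could start from a check like the one below. It is only a sketch under simplifying assumptions: licenses are compared as plain URI strings, the function name `license_differences` and the example.gov identifiers are hypothetical, and a reported difference is a prompt for review, not an error, since distributions are allowed to carry their own license.

```python
def license_differences(dataset):
    """Return (distribution id, license) pairs whose license differs from the dataset's."""
    dataset_license = dataset.get("dcterms:license")
    differences = []
    for dist in dataset.get("dcat:distribution", []):
        dist_license = dist.get("dcterms:license")
        # A distribution without its own license is not reported.
        if dist_license is not None and dist_license != dataset_license:
            differences.append((dist.get("@id"), dist_license))
    return differences

CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"
CC_BY = "https://creativecommons.org/licenses/by/4.0/"

dataset = {
    "@id": "https://example.gov/dataset/roads",
    "dcterms:license": CC0,
    "dcat:distribution": [
        {"@id": "https://example.gov/dataset/roads/csv", "dcterms:license": CC0},
        {"@id": "https://example.gov/dataset/roads/shp", "dcterms:license": CC_BY},
        {"@id": "https://example.gov/dataset/roads/api"},
    ],
}

for dist_id, dist_license in license_differences(dataset):
    print(f"{dist_id} uses {dist_license}; confirm this is intentional")
```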
Liability Statement
dcat-us:LiabilityStatement, introduced in the DCAT-US 3.0 profile, is a formal declaration by data providers aimed at limiting legal exposure related to the dataset. It disclaims warranties and guarantees and sets clear user expectations for dataset usage, thereby enhancing legal compliance and transparency.
Key aspects typically covered in the Liability Statement include:
Limitation of Responsibility: Disclaiming provider responsibility for data errors and consequential impacts.
No Guarantee of Validity: No assurance of data accuracy, reliability, or completeness.
Absence of Endorsement: Clarifying that catalog inclusion does not imply provider endorsement.
Use at Own Risk: Users are responsible for assessing data suitability for their purposes.
The Liability Statement can be conveyed as literal text or a URL in the dataset metadata, offering flexibility in communication based on the dataset's legal context.
Concerning liability statements, implementers are encouraged to consult legal experts to develop precise and understandable guidelines.
Copyrights
Copyrights are legal rights granted to the creators of original works, including literary, dramatic, musical, artistic, and certain other intellectual creations. These rights provide creators with exclusive control over the use and distribution of their works for a specific period.
Key aspects of copyright law include:
Exclusive Rights: Copyright holders have exclusive rights to reproduce, distribute, perform, display, and create derivative works from their creations.
Duration: The duration of copyright protection varies by jurisdiction.
Public Domain: After copyright protection expires, works enter the public domain and can be used freely by anyone without permission or payment.
Fair Use: Certain uses of copyrighted material may be considered fair use, such as for criticism, comment, news reporting, teaching, scholarship, and research, without the need for permission from or payment to the copyright holder.
Copyright is expressed as an instance of dcterms:RightsStatement. This class accommodates URL references as well as custom copyright statements given as literals via odrs:attributionText.
Additionally, copyright applies at both the dataset and distribution levels. While the dataset's copyright statement defines the overall terms for the entire dataset, each distribution can have its own copyright statement, potentially different from the dataset's. This flexibility allows copyright terms to be tailored to different forms or versions of the dataset, enhancing the dataset's usability and compliance with varied user needs.
NARA Access Restrictions
The National Archives and Records Administration (NARA) enforces Access Restrictions to ethically and legally manage a wide array of historical records. These restrictions protect sensitive information, comply with legal standards, honor donor agreements, and preserve physical and digital integrity.
Access Restrictions are tailored to each archival collection, balancing historical significance with legal, ethical, and preservation considerations. This ensures appropriate access while respecting archival practices.
The dcat-us:AccessRestriction class in the DCAT-US application profile represents NARA's access restrictions, providing a framework to define and communicate limitations on record accessibility. This class supports NARA in managing sensitive content, maintaining confidentiality, and upholding data integrity, aiding stakeholders in understanding accessibility constraints.
The following properties are used to describe access restrictions:
Restriction Status: This mandatory dcat-us:restrictionStatus property is linked to the NARA [Access Restriction Status Authority List](https://www.archives.gov/research/catalog/lcdrg/authority_lists/accesslist.html) and indicates whether or not there are access restrictions on the archival materials.
Specific Restriction: This recommended dcat-us:specificRestriction property refers to the NARA [Specific Access Restriction Authority List](https://www.archives.gov/research/catalog/lcdrg/authority_lists/specificaccesslist.html) for detailed information on restrictions. It defines specific access restrictions to the archival materials, based on national security considerations, donor restrictions, court orders, and other statutory or regulatory provisions.
Restriction Note: This optional dcat-us:restrictionNote property can include additional context or annotations about restrictions. Its use depends on the dcat-us:restrictionStatus property, and certain terms from the authority lists may require its application. It clarifies complex access restrictions, explains multiple levels of security classifications, identifies restricting statutes, or explains access restrictions not included in the [Specific Access Restriction Authority List](https://www.archives.gov/research/catalog/lcdrg/elements/specificaccess.html) or [Security Classification Authority List](https://www.archives.gov/research/catalog/lcdrg/authority_lists/securitylist.html).
These properties and their corresponding code lists are essential for encoding and interpreting restriction data accurately within the NARA guidelines and DCAT-US framework. The SKOS vocabulary for these lists is available in the [DCAT-US repository](https://github.com/DOI-DO/dcat-us/blob/main/vocabularies/nara-restrictions.ttl), with JSON-LD URIs abbreviated for space. The following is the code list for the dcat-us:restrictionStatus property:
| URI (abbr.) | Label | Description |
| --- | --- | --- |
| no-limitations | No limitations | Anybody can directly and anonymously access the data, without being required to register or authenticate. |
| registration-required | Registration required | Anybody can access the data, but they have to register first. |
| authorisation-required | Authorisation required | Only some users can access the resource. |
| unknown | Unknown | Access restrictions are unknown. |
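The cardinality rules and the code list above lend themselves to a simple pre-publication check. The following sketch assumes a restriction is held as a Python dictionary keyed by the prefixed property names and that the status is stored as the abbreviated code; real records would use the full URIs from the SKOS vocabulary. The function name `check_access_restriction` is hypothetical.

```python
# Abbreviated codes copied from the dcat-us:restrictionStatus code list.
RESTRICTION_STATUS_CODES = {
    "no-limitations",
    "registration-required",
    "authorisation-required",
    "unknown",
}

def check_access_restriction(node):
    """Report problems in a dcat-us:AccessRestriction node as errors and warnings."""
    errors, warnings = [], []
    status = node.get("dcat-us:restrictionStatus")
    if status is None:
        errors.append("dcat-us:restrictionStatus is mandatory")
    elif status not in RESTRICTION_STATUS_CODES:
        errors.append(f"unrecognized restriction status: {status!r}")
    if "dcat-us:specificRestriction" not in node:
        # Recommended, so its absence is only a warning.
        warnings.append("dcat-us:specificRestriction is recommended")
    return {"errors": errors, "warnings": warnings}

complete = {
    "@type": "dcat-us:AccessRestriction",
    "dcat-us:restrictionStatus": "registration-required",
    "dcat-us:specificRestriction": "hypothetical-specific-restriction-code",
    "dcat-us:restrictionNote": "Registration is handled by the publishing agency.",
}
incomplete = {"@type": "dcat-us:AccessRestriction"}

print(check_access_restriction(complete))
print(check_access_restriction(incomplete))
```

A SHACL validator remains the authoritative check for the profile; a lightweight function like this one is only useful for catching obvious omissions early.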
NARA Use Restrictions
The dcat-us:UseRestriction class in DCAT-US 3.0 is used by the National Archives and Records Administration (NARA) to define use restrictions on resources. This includes limitations for legal and ethical compliance, like protecting privacy and intellectual property rights.
The following properties are used to describe use restrictions:
Restriction Status: This mandatory dcat-us:restrictionStatus property is linked to the NARA [Use Restriction Status Authority List](https://www.archives.gov/research/catalog/lcdrg/elements/use.html) and specifies the type of use restrictions.
Specific Restriction: This recommended dcat-us:specificRestriction property refers to the NARA [Specific Restriction Authority List](https://www.archives.gov/research/catalog/lcdrg/elements/specificuse.html) for detailed information on restrictions.
Restriction Note: This optional dcat-us:restrictionNote property can include additional context or annotations about restrictions. Its use depends on the dcat-us:restrictionStatus property, and certain terms from the authority lists may require its application.
These properties and their corresponding code lists are essential for encoding and interpreting restriction data accurately within the NARA guidelines and DCAT-US framework. The SKOS vocabulary for these lists is available in the [DCAT-US repository](https://github.com/DOI-DO/dcat-us/blob/main/vocabularies/nara-restrictions.ttl), with JSON-LD URIs abbreviated for space.
Controlled Unclassified Information (CUI) Restrictions
Controlled Unclassified Information (CUI) in NARA archives needs protection due to its sensitive nature. This safeguarding is essential for national security, privacy, intellectual property, and confidentiality of sensitive data. You can learn more about CUI on the NARA website.
The dcat-us:CuiRestriction class in DCAT-US 3.0 is tailored for managing CUI, requiring specific safeguarding in line with legal and policy requirements. It aligns with NARA guidelines to ensure compliance, enhance data security, and improve data interoperability, transparency, and efficient resource management in government data systems.
The dcat-us:CuiRestriction class is instrumental in managing Controlled Unclassified Information (CUI) within the DCAT-US framework. It mandates certain properties for effective representation and management of CUI data:
CUI Banner Marking: The dcat-us:cuiBannerMarking property is essential for marking CUI.
CUI Designation Indicator: The free-text dcat-us:designationIndicator property follows guidelines from the NARA Marking Guidebook and DODI 5200.48, typically including "Controlled by:" and contact information.
Required Indicator Per Authority: The optional dcat-us:requiredIndicatorPerAuthority property allows for additional context and indicators as required by the authority.
These properties collectively ensure that the CUI status is accurately represented, complying with relevant standards and providing necessary flexibility for specific cases.
Provenance Metadata
Provenance and data lineage are crucial aspects of data management and transparency, ensuring that data consumers understand the origins, transformations, and utility of the data. In DCAT-US, leveraging [[DCTERMS]] (Dublin Core Terms) and [[PROV-O]] (W3C PROV Ontology) properties can effectively represent these aspects. This section outlines best practices for utilizing these properties to detail the provenance and data lineage within DCAT datasets.
Basic Provenance Metadata
Within DCAT-US, the Dublin Core Terms [[DCTERMS]] vocabulary offers properties that allow data publishers to articulate basic provenance information effectively. Particularly, dcterms:source and dcterms:provenance are pivotal in this context.
dcterms:source
dcterms:source is used in the following contexts:
Property source metadata (dcterms:source), optional, non-repeatable property for Catalog Record, that refers to the original metadata that was used in creating metadata for the Dataset.
Property source (dcterms:source), optional, repeatable property for Dataset, that refers to a related Dataset from which the described Dataset is derived.
The dcterms:source property is utilized to denote the original source from which the current dataset is derived. It can be a URI directly pointing to the original dataset or, in the absence of a URI, a descriptive reference that sufficiently identifies the original source. It's imperative to ensure that the source referenced is the most immediate or direct source from which the data was derived and to utilize persistent URIs when available to ensure stable and long-term linkage to the source.
dcterms:provenance
On the other hand, dcterms:provenance provides a mechanism to describe the history or lineage of the dataset. This property allows publishers to detail the dataset's historical context and sequence of events or processes that have influenced its formation or transformation. The provenance statement should be concise yet comprehensive, providing a clear and adequate understanding of the dataset's history and lineage. Employing standardized nomenclature and terminologies ensures clarity and consistency across provenance statements.
In the context of data cataloging and transparency, embedding provenance information is vital to elucidate the origin and historical context of a dataset. The dcterms:ProvenanceStatement from the Dublin Core Terms (DCTerms) vocabulary provides a structured way to incorporate this information within the DCAT-US framework.
The dcterms:ProvenanceStatement is designed to convey a human-readable explanation or record of the history or lineage of a dataset. It can be utilized to describe the dataset's origins, transformations, ownership, and any other changes it might have undergone, thereby providing a clear and comprehensive historical record.
Property provenance (dcterms:provenance), optional, repeatable property for Dataset, that contains a statement about the lineage of a Dataset.
This property can be expressed in two primary ways within DCAT-US:
By URI: A Uniform Resource Identifier (URI) can be used to refer to a dcterms:ProvenanceStatement that is hosted externally. This method is beneficial when the provenance information is extensive or when it is standardized and used across multiple datasets.
Using Free Text with rdfs:label: Alternatively, a dcterms:ProvenanceStatement can be expressed as free text using the rdfs:label property. This approach is suitable for providing concise, readable provenance information directly within the dataset's metadata.
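The two ways of expressing provenance, together with dcterms:source, can be illustrated with a small JSON-LD-like structure in Python. This is a sketch only: the helper name `provenance_statement`, the example.gov identifiers, and the statement text are hypothetical.

```python
import json

def provenance_statement(text=None, uri=None):
    """Express a dcterms:ProvenanceStatement either by URI or as free text via rdfs:label."""
    if (text is None) == (uri is None):
        raise ValueError("provide exactly one of text or uri")
    if uri is not None:
        # Reference to an externally hosted provenance statement.
        return {"@id": uri}
    return {"@type": "dcterms:ProvenanceStatement", "rdfs:label": text}

dataset = {
    "@id": "https://example.gov/dataset/air-quality-2023",
    "@type": "dcat:Dataset",
    # The most immediate dataset this one was derived from.
    "dcterms:source": [{"@id": "https://example.gov/dataset/air-quality-raw"}],
    # dcterms:provenance is repeatable, so both forms can appear together.
    "dcterms:provenance": [
        provenance_statement(text="Compiled from hourly sensor readings; outliers removed."),
        provenance_statement(uri="https://example.gov/provenance/air-quality-method"),
    ],
}

print(json.dumps(dataset, indent=2))
```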
Detailed Data Lineage
Data lineage, which traces the discrete steps involving data as it moves through the various stages of a workflow, is crucial for understanding data's origins and transformations. The W3C PROV Ontology [[PROV-O]] provides a rich set of properties to describe detailed data lineage in a standardized manner, ensuring interoperability and clarity in data documentation.
Key PROV-O properties include:
prov:wasDerivedFrom: Establishes a derivation relationship between two entities.
prov:wasGeneratedBy: Links an entity to the activity that generated it.
prov:wasInfluencedBy: Expresses a generic influence between any two provenance elements (entities, activities, or agents).
prov:wasAttributedTo: Signifies the relationship between a dataset and an agent responsible for its creation.
The prov:Activity class in the PROV-O ontology plays a pivotal role in representing processes or actions taken upon or with entities, thereby providing a structured framework to document the transformations, analyses, or other actions that data undergoes. An instance of prov:Activity is utilized to describe a particular occurrence of an action or process, which can involve the consumption, production, or transformation of entities. By associating activities with entities through properties such as prov:wasGeneratedBy, a detailed account of the data's journey, from its origin through various transformations to its current state, can be articulated. This not only enhances the transparency of the data but also provides a robust mechanism to trace back through the steps involved in data creation and processing, thereby contributing to verifiable and trustworthy data lineage. Furthermore, prov:Activity can be associated with prov:Agent through properties like prov:wasAssociatedWith, offering insights into the roles of different agents (e.g., organizations, people, or software) in data processing activities, thereby enriching the data provenance and lineage documentation.
By adhering to these practices and effectively utilizing [[PROV-O]] properties, data publishers can enhance transparency and facilitate informed data usage among consumers by providing a clear view of data sourcing, processing, and transformation.
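Tracing back through the steps of a lineage description amounts to following prov:wasDerivedFrom links until no further source is recorded. The sketch below assumes the metadata has already been loaded into a dictionary keyed by resource identifier; the `ex:` identifiers and the function name `trace_lineage` are hypothetical.

```python
def trace_lineage(graph, start):
    """Follow prov:wasDerivedFrom links from `start` back to its recorded sources."""
    order, seen, stack = [], set(), [start]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and repeated sources
        seen.add(current)
        order.append(current)
        stack.extend(graph.get(current, {}).get("prov:wasDerivedFrom", []))
    return order

graph = {
    "ex:summary": {
        "prov:wasDerivedFrom": ["ex:cleaned"],
        "prov:wasGeneratedBy": "ex:aggregation-run",
    },
    "ex:cleaned": {
        "prov:wasDerivedFrom": ["ex:raw"],
        "prov:wasGeneratedBy": "ex:cleaning-run",
    },
    "ex:raw": {"prov:wasAttributedTo": "ex:field-office"},
    "ex:cleaning-run": {
        "@type": "prov:Activity",
        "prov:wasAssociatedWith": "ex:data-team",
    },
    "ex:aggregation-run": {
        "@type": "prov:Activity",
        "prov:wasAssociatedWith": "ex:data-team",
    },
}

for step in trace_lineage(graph, "ex:summary"):
    activity = graph.get(step, {}).get("prov:wasGeneratedBy", "no recorded activity")
    print(f"{step} (generated by: {activity})")
```

Each step can then be enriched by looking up the generating activity and, through prov:wasAssociatedWith, the agent involved.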
Distribution Metadata
In the realm of data sharing and management, dcat:Distribution plays a pivotal role as the tangible representation of datasets. A Distribution within the DCAT framework is more than just a link to a dataset; it is the embodiment of the dataset in a practical, accessible format, adhering to the W3C standards. It is the dataset manifested in a specific format, ranging from CSV files to complex databases, inherently tied to its parent dataset. This relationship underscores the fact that a Distribution does not exist in isolation but as a practical form of the dataset, prepared and published by data providers for end users. The core attributes of a Distribution focus on its file-centric properties like download URLs, media types, file formats, byte sizes, character encodings, and checksums, emphasizing its primary function: efficient and reliable data delivery.
Guidelines for Creating DCAT Distributions
The following guidelines are designed to help determine the most effective way to structure DCAT distributions, whether as a single file, a multi-file package, or multiple distributions. The choice depends on the dataset's characteristics, user needs, and the data's intended use. Consider these guidelines to ensure your distributions are user-friendly, accessible, and align with best practices in data management.
Single-File Distribution: Ideal for datasets that are cohesive and standalone, typically encapsulated in a single format like CSV or XML. This approach is beneficial for smaller or comprehensive datasets, simplifying access and use. The key is to choose a file format that effectively represents all necessary data.
Multi-File Packaged Distribution: Essential for complex datasets, such as ArcGIS shapefiles, which require multiple interdependent files. Packaging related files together is useful for large or component-rich datasets. It's crucial to include all essential components and ensure the package facilitates easy download and usage.
Multiple Distributions in a Dataset: Suitable for datasets that can be logically segmented or offered in different formats. This method allows targeted access to specific data parts and enables selective updating. Clear documentation of each distribution is important for user navigation.
When selecting a distribution format, it is important to consider factors such as the interdependence of files, the ease of user accessibility, the size and downloadability of the data, the frequency of updates, and the diversity of formats required. A thoughtful approach to these criteria will help in creating a distribution strategy that is both practical for data providers and beneficial for end-users, enhancing the overall effectiveness of data sharing and utilization.
File-centric Properties
This section focuses on the properties central to the file-centric aspects of dcat:Distribution. These properties are crucial for ensuring datasets are accessible and usable in their practical forms, addressing the aspects of data encoding, structure, packaging, presentation, media type, and language.
dcat:downloadURL: This property is preferred for direct links to downloadable resources. It is the most straightforward way to provide access to a distribution, allowing users to directly download the dataset in its entirety without any intermediate steps or interactions.
dcat:accessURL: This property should be used for the URL of a service or location that provides access to the distribution, typically through a web form, query, or API call. It is ideal for scenarios where the distribution is accessed via an interactive mechanism rather than direct download. For example, when accessing datasets that require specific queries or are provided through a web service.
dcat:mediaType: This property specifies the Internet Media Type (also known as MIME type) of the distribution. Media types are standardized identifiers for labeling the format of documents, files, or data transmitted via the Internet. The property is particularly useful in scenarios where the distribution format aligns with media types registered by the Internet Assigned Numbers Authority (IANA) [[IANA-MEDIA-TYPES]], ensuring standardization and facilitating automated processing.
dcterms:format: This property is applicable in scenarios not covered by dcat:mediaType, particularly when aligning with file formats recognized by central authorities. The role of dcterms:format is to offer a detailed description of the distribution's file format or physical medium. For instance, in the geospatial domain, this could include formats like “Shapefile” or “GeoJSON”. These descriptions are crucial for providing human-readable information about the distribution's format, enhancing user understanding and aiding in the effective presentation within data catalogs.
dcterms:conformsTo: This property indicates the standards or specifications to which the distribution conforms. Allowing for multiple standards acknowledges that datasets may adhere to more than one set of specifications, either due to the nature of the data or to meet various user needs and compliance requirements. For instance, a dataset might conform to both an industry-specific standard and a general data format standard. Documenting each applicable standard enhances the dataset's interoperability and usability, making it clear to users what to expect in terms of data structure and quality.
dcat:compressFormat: This property is to be used when the files in the distribution are compressed, e.g., in a ZIP file. The format SHOULD be expressed using a media type as defined by IANA [[IANA-MEDIA-TYPES]] if available.
dcat:packageFormat: This property should be employed when the files within a distribution are packaged together, such as in formats like TAR, ZIP, Frictionless Data Package, or Bagit files. The format SHOULD be expressed using an appropriate media type as defined by IANA [[IANA-MEDIA-TYPES]] if available, to ensure standardization and broader recognition of the format.
dcat:byteSize: Indicates the size of the distribution, important for understanding download requirements and storage planning. The size SHOULD be given as an integer.
spdx:checksum: This optional property is used to provide a spdx:Checksum instance for ensuring data integrity during transfer. It serves as a mechanism to verify that the contents of a file or package have not been altered. The checksum should be specified using the spdx:checksumValue property. To indicate the algorithm used for generating the checksum, use the property spdx:algorithm with URIs defined in the SPDX specification, such as spdx:checksumAlgorithm_sha1, spdx:checksumAlgorithm_sha256, or spdx:checksumAlgorithm_sha512, depending on the algorithm employed.
adms:representationTechnique: This property can be used to specify the technique or method by which the data is represented in the distribution. This is different from the file format as, for example, a ZIP file (file format) could contain an XML schema (representation technique). It can help users understand the underlying structure or visualization method of the dataset. For example, for spatial datasets, this property SHOULD be used to express the spatial representation type (grid, vector, tin), by using the URIs from a code list managed in a registry.
cnt:characterEncoding: This property SHOULD be used to specify the character encoding of the Distribution, using as its value a character set name from the IANA Character Set names register [[IANA-CHARSETS]]. Character encoding in [[?ISO-19115-1]] metadata is specified with a code list that can be mapped to the corresponding codes in [[IANA-CHARSETS]], as shown in the following table (entries with 1-to-many mappings are in italic).
| ISO 19115 - MD_CharacterSetCode | Description | IANA |
| --- | --- | --- |
| ucs2 | 16-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-2 |
| ucs4 | 32-bit fixed size Universal Character Set, based on ISO/IEC 10646 | ISO-10646-UCS-4 |
| utf7 | 7-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-7 |
| utf8 | 8-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-8 |
| utf16 | 16-bit variable size UCS Transfer Format, based on ISO/IEC 10646 | UTF-16 |
| 8859part1 | ISO/IEC 8859-1, Information technology - 8-bit single byte coded graphic character sets - Part 1 : Latin alphabet No.1 | ISO-8859-1 |
| 8859part2 | ISO/IEC 8859-2, Information technology - 8-bit single byte coded graphic character sets - Part 2 : Latin alphabet No.2 | ISO-8859-2 |
| 8859part3 | ISO/IEC 8859-3, Information technology - 8-bit single byte coded graphic character sets - Part 3 : Latin alphabet No.3 | ISO-8859-3 |
| 8859part4 | ISO/IEC 8859-4, Information technology - 8-bit single byte coded graphic character sets - Part 4 : Latin alphabet No.4 | ISO-8859-4 |
| 8859part5 | ISO/IEC 8859-5, Information technology - 8-bit single byte coded graphic character sets - Part 5 : Latin/Cyrillic alphabet | ISO-8859-5 |
| 8859part6 | ISO/IEC 8859-6, Information technology - 8-bit single byte coded graphic character sets - Part 6 : Latin/Arabic alphabet | ISO-8859-6 |
| 8859part7 | ISO/IEC 8859-7, Information technology - 8-bit single byte coded graphic character sets - Part 7 : Latin/Greek alphabet | ISO-8859-7 |
| 8859part8 | ISO/IEC 8859-8, Information technology - 8-bit single byte coded graphic character sets - Part 8 : Latin/Hebrew alphabet | ISO-8859-8 |
| 8859part9 | ISO/IEC 8859-9, Information technology - 8-bit single byte coded graphic character sets - Part 9 : Latin alphabet No.5 | ISO-8859-9 |
| 8859part10 | ISO/IEC 8859-10, Information technology - 8-bit single byte coded graphic character sets - Part 10 : Latin alphabet No.6 | ISO-8859-10 |
| 8859part11 | ISO/IEC 8859-11, Information technology - 8-bit single byte coded graphic character sets - Part 11 : Latin/Thai alphabet | ISO-8859-11 |
| 8859part13 | ISO/IEC 8859-13, Information technology - 8-bit single byte coded graphic character sets - Part 13 : Latin alphabet No.7 | ISO-8859-13 |
| 8859part14 | ISO/IEC 8859-14, Information technology - 8-bit single byte coded graphic character sets - Part 14 : Latin alphabet No.8 (Celtic) | ISO-8859-14 |
| 8859part15 | ISO/IEC 8859-15, Information technology - 8-bit single byte coded graphic character sets - Part 15 : Latin alphabet No.9 | ISO-8859-15 |
| 8859part16 | ISO/IEC 8859-16, Information technology - 8-bit single byte coded graphic character sets - Part 16 : Latin alphabet No.10 | ISO-8859-16 |
| _jis_ | _japanese code set used for electronic transmission_ | _JIS_Encoding_ |
| shiftJIS | japanese code set used on MS-DOS machines | Shift_JIS |
| eucJP | japanese code set used on UNIX based machines | EUC-JP |
| usAscii | United States ASCII code set (ISO 646 US) | US-ASCII |
| _ebcdic_ | _IBM mainframe code set_ | _IBM037_ |
| eucKR | Korean code set | EUC-KR |
| big5 | traditional Chinese code set used in Taiwan, Hong Kong of China and other areas | Big5 |
| GB2312 | simplified Chinese code set | GB2312 |
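When converting ISO 19115 records, the mapping in the table can be held as a lookup. The sketch below transcribes the table into a Python dictionary and flags the two 1-to-many entries so that a converter can ask for human review; the names `ISO19115_TO_IANA`, `ONE_TO_MANY`, and `iana_charset` are hypothetical.

```python
# ISO 19115 MD_CharacterSetCode -> IANA character set name, as listed in the table.
ISO19115_TO_IANA = {
    "ucs2": "ISO-10646-UCS-2",
    "ucs4": "ISO-10646-UCS-4",
    "utf7": "UTF-7",
    "utf8": "UTF-8",
    "utf16": "UTF-16",
    "jis": "JIS_Encoding",
    "shiftJIS": "Shift_JIS",
    "eucJP": "EUC-JP",
    "usAscii": "US-ASCII",
    "ebcdic": "IBM037",
    "eucKR": "EUC-KR",
    "big5": "Big5",
    "GB2312": "GB2312",
}
# The ISO/IEC 8859 parts follow a regular pattern; the table has no Part 12.
for part in list(range(1, 12)) + [13, 14, 15, 16]:
    ISO19115_TO_IANA[f"8859part{part}"] = f"ISO-8859-{part}"

# Entries shown in italic in the table: the IANA name is one of several candidates.
ONE_TO_MANY = {"jis", "ebcdic"}

def iana_charset(code):
    """Return (IANA name, is_exact) for an ISO 19115 character set code."""
    name = ISO19115_TO_IANA[code]  # raises KeyError for codes outside the table
    return name, code not in ONE_TO_MANY

print(iana_charset("utf8"))
print(iana_charset("ebcdic"))
```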
Effective utilization of these properties enhances data discoverability, interoperability, and the overall user experience in accessing and working with datasets.
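Several of the file-centric properties can be derived directly from the file being published. The following sketch computes dcat:byteSize and a SHA-256 spdx:checksum from an in-memory payload using Python's standard hashlib module. The function names, the example.gov URL, and the sample payload are hypothetical, and expressing the media type as an IANA registry URL is one common convention rather than a requirement stated here.

```python
import hashlib

def describe_distribution(download_url, payload, media_type_url):
    """Build a dcat:Distribution node with size and checksum derived from the bytes."""
    return {
        "@type": "dcat:Distribution",
        "dcat:downloadURL": {"@id": download_url},
        "dcat:mediaType": {"@id": media_type_url},
        "dcat:byteSize": len(payload),  # size as an integer number of bytes
        "spdx:checksum": {
            "@type": "spdx:Checksum",
            "spdx:algorithm": {"@id": "spdx:checksumAlgorithm_sha256"},
            "spdx:checksumValue": hashlib.sha256(payload).hexdigest(),
        },
    }

def matches_checksum(distribution, payload):
    """Check downloaded bytes against the published size and SHA-256 checksum."""
    checksum = distribution["spdx:checksum"]
    if checksum["spdx:algorithm"]["@id"] != "spdx:checksumAlgorithm_sha256":
        raise ValueError("this sketch only handles SHA-256")
    return (
        len(payload) == distribution["dcat:byteSize"]
        and hashlib.sha256(payload).hexdigest() == checksum["spdx:checksumValue"]
    )

payload = b"id,name\n1,Example\n"
distribution = describe_distribution(
    "https://example.gov/files/sample.csv",
    payload,
    "https://www.iana.org/assignments/media-types/text/csv",
)
print(distribution["dcat:byteSize"], matches_checksum(distribution, payload))
```

A consumer can run the same comparison after download to confirm that the file was not altered in transit.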
Data Quality
The quality of a dataset plays a pivotal role in shaping trust, reusability, and the overall performance of applications that rely on it. As a result, it is imperative to integrate data quality information seamlessly into both the data publishing and consumption processes. This inclusion allows for a thorough evaluation of a dataset's quality, thereby determining its suitability for a particular application.
Thorough documentation of data quality significantly streamlines the dataset selection process, enhancing the likelihood of reuse. Regardless of domain-specific nuances, documenting data quality and explicitly stating known quality issues in metadata are fundamental practices. Typically, assessing quality involves multiple dimensions, each encapsulating characteristics of importance to both data publishers and consumers.
The Data Quality Vocabulary (DQV) defines machine-readable concepts such as measurements and criteria to assess quality across various dimensions [[VOCAB-DQV]]. Tailored heuristics designed for specific assessment scenarios rely on quality indicators, which encompass data content, metadata, and human ratings. These indicators offer valuable insights into the dataset's suitability for its intended purpose.
In the context of integrating data quality information into DCAT resources (Dataset, Distribution, Data Service, Dataset Series), the Data Quality Vocabulary [[VOCAB-DQV]] provides a structured and standardized way to represent and assess quality information for fitness of use. The key components of DQV relevant to this discussion are dqv:QualityMeasurement, dqv:Metric, dqv:Dimension, and the property dqv:hasQualityMeasurement. Here's how each of these elements is used:
- dqv:QualityMeasurement: This class represents a specific measurement or assessment of quality. It is a quantifiable value that indicates how well a dataset performs against a particular quality metric. A dqv:QualityMeasurement instance is associated with a specific dataset and linked to the metric it measures.
- dqv:Metric: The dqv:Metric class represents the standard or criterion used to assess a particular aspect of quality. Metrics are the yardsticks against which quality is evaluated. Each metric is typically associated with a quality dimension. For example, a metric could measure the accuracy of data, its timeliness, or its completeness.
- dqv:Dimension: The dqv:Dimension class represents the various dimensions or categories of data quality, such as accuracy, timeliness, or completeness. Quality dimensions help categorize different aspects of data quality, providing a framework for comprehensive assessment.
- Property dqv:hasQualityMeasurement: The dqv:hasQualityMeasurement property is used to link a resource to a dqv:QualityMeasurement. It indicates that the resource has been evaluated in terms of quality and specifies the measurement. This linkage is crucial for conveying the results of quality assessments to data consumers, enabling them to understand the quality aspects that have been measured and the outcomes of those measurements.
Using these DQV elements, data publishers can document the quality of their datasets in a structured and meaningful way. This documentation includes specific measurements of quality, the criteria used for these assessments, and the quality dimensions they relate to. The use of DQV thus enhances transparency and helps data consumers make informed decisions about the suitability of a dataset for their specific needs.
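As a minimal sketch of how these four elements fit together, the Python below assembles a JSON-LD document as plain dictionaries. Everything under https://example.gov/ is hypothetical, the prefix bindings are written inline rather than taken from the DCAT-US context file, and the properties dqv:computedOn, dqv:isMeasurementOf, dqv:inDimension, and dqv:value come from [[VOCAB-DQV]] rather than from the text above.

```python
import json

# Inline prefix bindings keep the sketch self-contained; a real exchange
# would more likely reference a shared JSON-LD context file.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dqv": "http://www.w3.org/ns/dqv#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

# Hypothetical base IRI for all identifiers in this example.
BASE = "https://example.gov/"

# The quality dimension (category) being assessed.
dimension = {"@id": BASE + "quality/completeness", "@type": "dqv:Dimension"}

# The metric (yardstick), tied to its dimension.
metric = {
    "@id": BASE + "quality/populatedFieldsRatio",
    "@type": "dqv:Metric",
    "dqv:inDimension": {"@id": dimension["@id"]},
}

# One concrete measurement of that metric on one dataset.
measurement = {
    "@id": BASE + "quality/measurement/1",
    "@type": "dqv:QualityMeasurement",
    "dqv:computedOn": {"@id": BASE + "dataset/bus-stops"},
    "dqv:isMeasurementOf": {"@id": metric["@id"]},
    "dqv:value": {"@type": "xsd:double", "@value": "0.95"},
}

# The dataset points at its measurement.
dataset = {
    "@id": BASE + "dataset/bus-stops",
    "@type": "dcat:Dataset",
    "dqv:hasQualityMeasurement": {"@id": measurement["@id"]},
}

document = {"@context": CONTEXT, "@graph": [dataset, measurement, metric, dimension]}
print(json.dumps(document, indent=2))
```

Keeping the metric and dimension as separate nodes is what allows a community to publish them once as a shared vocabulary and reference them from many datasets.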
The use of shareable controlled vocabularies for dqv:Metric and dqv:Dimension is highly encouraged within communities. These standardized vocabularies facilitate consistent and precise communication of data quality aspects across different datasets and applications. By adopting such vocabularies, communities can ensure that their data quality metrics and dimensions are universally understood, enhancing interoperability and the effective use of data across diverse systems and contexts.
Versioning
Versioning is a concept used to describe the relationship between an original resource and its variations, updates, or translations. This section explores how versions resulting from updates or modifications throughout a resource's lifecycle are handled in the DCAT-US 3.0 profile.
DCAT-US 3.0 relies on established vocabularies, including the versioning terms of the PAV ontology [[?PAV]] and terms from [[DCTERMS]], [[OWL2-OVERVIEW]], and [[VOCAB-ADMS]].
Versioning applies to all primary DCAT resources, such as Catalogs, Catalog Records, Datasets, and Distributions.
The versioning methodology detailed in DCAT-US 3.0 is designed to complement existing versioning practices, both those specific to certain resource types (for instance, the versioning properties for ontologies detailed in [[OWL2-OVERVIEW]]) and those customary in various domains and communities. Refer to section 11.4 for an analysis of how DCAT's versioning approach aligns with other vocabularies.
Handling Dataset Changes
Web-based datasets are inherently dynamic, with some undergoing scheduled updates and others evolving due to advancements in data collection techniques. To address these varying changes, the creation of new dataset versions is often necessary. The decision to classify changes as a new dataset or a new version of an existing dataset, however, is not universally agreed upon. The following examples illustrate typical scenarios where a new version is generally warranted:
Scenario 1: Adding a new bus stop requires its inclusion in the dataset.
Scenario 2: Eliminating an existing bus stop necessitates its removal from the dataset.
Scenario 3: Correcting a mistake related to a bus stop currently in the dataset.
It's important to note that datasets representing time or spatial series (like annual regional data or weekly weather forecasts) are usually considered separate datasets, each capturing unique observations.
While Scenarios 1 and 2 might lead to significant version updates, Scenario 3 typically results in a minor update. The key is not the scale of the change, but the clarity in marking these changes through version numbering. Keeping a detailed version history is crucial for the integrity of the dataset, especially considering its potential ongoing use by various stakeholders. Publishers are advised to inform users proactively about new versions, particularly for datasets undergoing real-time updates, where automated timestamps can aid in version identification. Ultimately, maintaining a systematic and transparent versioning approach, including the use of semantic versioning, is vital for enabling users to navigate and utilize these evolving datasets effectively.
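A minimal sketch of one way to make such a policy explicit, assuming MAJOR.MINOR.PATCH version strings. The scenario names and the mapping in `POLICY` are hypothetical; they encode one possible reading of the scenarios above, not a rule set by this profile.

```python
# Hypothetical policy: which kind of change triggers which kind of bump.
POLICY = {
    "add_stop": "major",      # Scenario 1
    "remove_stop": "major",   # Scenario 2
    "correct_stop": "minor",  # Scenario 3
}


def bump(version, level):
    """Return the next MAJOR.MINOR.PATCH string for the given bump level."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown level: {level}")


def next_version(current, change):
    """Look up the bump level for a change and apply it."""
    return bump(current, POLICY[change])


print(next_version("1.4.2", "add_stop"))      # 2.0.0
print(next_version("1.4.2", "correct_stop"))  # 1.5.0
```

Writing the policy down as data, rather than deciding case by case, is what keeps version numbers predictable for downstream users.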
Version Information
The DCAT-US profile recognizes the importance of associating versioned resources with further details. These details can include aspects like the differences from the original resource (referred to as the version "delta"), the version's name or identifier, and its release date.
To accommodate these details, the DCAT-US 3.0 profile employs several specific properties:
- dcat:version (parallel to pav:version [[?PAV]]) - This property is used for denoting the version name or identifier.
- dcterms:issued [[DCTERMS]] - Indicates the release date of a particular version.
- adms:versionNotes [[VOCAB-ADMS]] - Provides a textual summary of the changes in the version, highlighting any issues of backward compatibility with the previous version of the resource.
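The three properties can be illustrated with a short Python sketch that builds a JSON-LD node. The helper name `describe_version`, the example IRI, and the literal values are all hypothetical, and the inline prefix bindings are an assumption rather than the published DCAT-US context.

```python
import json

# Inline prefix bindings (assumed; not the published DCAT-US context file).
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "adms": "http://www.w3.org/ns/adms#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def describe_version(dataset_iri, version, issued, notes):
    """Return a JSON-LD node carrying the three version-information properties."""
    return {
        "@context": CONTEXT,
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dcat:version": version,                                   # name or identifier
        "dcterms:issued": {"@type": "xsd:date", "@value": issued},  # release date
        "adms:versionNotes": {"@language": "en", "@value": notes},  # the "delta"
    }


node = describe_version(
    "https://example.gov/dataset/bus-stops/1.1",  # hypothetical IRI
    "1.1",
    "2024-03-01",
    "Corrected the coordinates of one stop; backward compatible with 1.0.",
)
print(json.dumps(node, indent=2))
```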
Dataset Versions
The versioning of datasets is an essential aspect of data management, facilitating the tracking of changes and updates over time. In DCAT-US 3.0, dataset versioning is primarily managed through the use of properties that identify and describe different versions of a dataset including:
- dcat:hasVersion - Links the dataset to one of its versions.
- dcat:isVersionOf - Indicates the dataset is a version of another dataset.
- dcat:version - Provides the version number or identifier of the dataset.
- adms:versionNotes - Describes changes between this version and the previous version of the dataset.
These properties ensure users can easily track dataset evolutions, access different versions, and understand the changes made across versions. Implementing these versioning properties in the DCAT-US profile enhances data discoverability and usability, aligning with best practices in data management.
Version Chains and Hierarchies
The DCAT-US 3.0 profile facilitates the management of version histories and hierarchies through specific properties. These properties help in establishing and navigating the relationships between different versions of a dataset.
The key properties for defining version chains and hierarchies include:
- dcat:previousVersion - This property creates a backward navigable chain from a given version to the first one, allowing for the tracking of a dataset's version history.
- dcat:hasVersion - Utilized for outlining a version hierarchy by linking an abstract resource to its different versions.
- dcat:hasCurrentVersion (a subproperty of dcat:hasVersion) - This property is used to connect an abstract resource to the snapshot representing the current version of its content.
Additionally, the dcat:isVersionOf property (inverse of dcat:hasVersion) can be used to provide a backward link from a version to its abstract resource. The utilization of these properties depends on the specific requirements of the use case.
It's important to note that the essential properties for specifying a version chain and hierarchy are dcat:previousVersion and dcat:hasVersion. The choice to use additional properties is determined by the needs of the relevant use case.
For further guidance on specifying a resource's status, refer to the Resource Life-Cycle section.
The following example, adapted from § 8.6 Data Versioning of [[?DWBP]], demonstrates how to specify a version chain and hierarchy for a bus stops dataset using the properties described in this section.
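As a minimal sketch of such a chain and hierarchy (an illustration, not the profile's normative example), the Python below builds an abstract bus stops dataset with three versions and then walks the dcat:previousVersion links backwards. The IRIs, the version numbers, and the helper names `ref`, `version_iri`, and `history` are hypothetical.

```python
import json

CONTEXT = {"dcat": "http://www.w3.org/ns/dcat#"}
BASE = "https://example.gov/dataset/"  # hypothetical base IRI
VERSIONS = ["1.0", "1.1", "2.0"]       # hypothetical, oldest first


def ref(iri):
    """Wrap an IRI as a JSON-LD node reference."""
    return {"@id": iri}


def version_iri(version):
    return f"{BASE}bus-stops/{version}"


# The abstract resource links to every version and flags the current one.
abstract = {
    "@id": BASE + "bus-stops",
    "@type": "dcat:Dataset",
    "dcat:hasVersion": [ref(version_iri(v)) for v in VERSIONS],
    "dcat:hasCurrentVersion": ref(version_iri(VERSIONS[-1])),
}

# Each version except the first points back to its predecessor.
nodes = [abstract]
for index, version in enumerate(VERSIONS):
    node = {"@id": version_iri(version), "@type": "dcat:Dataset", "dcat:version": version}
    if index > 0:
        node["dcat:previousVersion"] = ref(version_iri(VERSIONS[index - 1]))
    nodes.append(node)

document = {"@context": CONTEXT, "@graph": nodes}


def history(graph_nodes, start_iri):
    """Follow dcat:previousVersion links from start_iri back to the first version."""
    by_id = {n["@id"]: n for n in graph_nodes}
    chain = []
    current = start_iri
    while current is not None:
        chain.append(by_id[current]["dcat:version"])
        previous = by_id[current].get("dcat:previousVersion")
        current = previous["@id"] if previous else None
    return chain


print(history(nodes, abstract["dcat:hasCurrentVersion"]["@id"]))  # ['2.0', '1.1', '1.0']
```

The chain (dcat:previousVersion) and the hierarchy (dcat:hasVersion) are independent: a consumer can list all versions from the abstract resource, or reconstruct their order from any single version.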
Versions Replaced by Other Ones
In the DCAT-US 3.0 profile, a significant type of relationship is the one where a given version replaces or supersedes another. To represent this, DCAT adopts the relevant properties from [[DCTERMS]]:
- dcterms:replaces - This property is used when a version supersedes another one.
- dcterms:isReplacedBy - Its inverse, this property provides a back link to the newer version that replaces the current one.
It's important to note that these properties do not necessarily indicate a version chain. That is, a version does not automatically replace its immediate predecessor.
To illustrate how these properties can be applied in DCAT-US 3.0, the following example reuses the description of the MyCity bus stop dataset to show how replaced versions can be specified in DCAT.
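A minimal Python sketch of replacement links, kept separate from the version chain. The IRIs and version numbers are hypothetical, as is the scenario: version 1.1 supersedes 1.0, while 2.0 follows 1.1 in the chain without replacing it. The helper `link_replacement` is an illustrative name, not part of any library.

```python
import json

CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
}
BASE = "https://example.gov/dataset/bus-stops/"  # hypothetical base IRI


def link_replacement(graph, newer_iri, older_iri):
    """Record that newer_iri supersedes older_iri, in both directions."""
    graph[newer_iri].setdefault("dcterms:replaces", []).append({"@id": older_iri})
    graph[older_iri].setdefault("dcterms:isReplacedBy", []).append({"@id": newer_iri})


# Three versions, keyed by IRI for easy lookup.
graph = {
    BASE + v: {"@id": BASE + v, "@type": "dcat:Dataset", "dcat:version": v}
    for v in ("1.0", "1.1", "2.0")
}

# The version chain is complete...
graph[BASE + "1.1"]["dcat:previousVersion"] = {"@id": BASE + "1.0"}
graph[BASE + "2.0"]["dcat:previousVersion"] = {"@id": BASE + "1.1"}

# ...but only 1.1 replaces its predecessor; 2.0 does not supersede 1.1 here.
link_replacement(graph, BASE + "1.1", BASE + "1.0")

document = {"@context": CONTEXT, "@graph": list(graph.values())}
print(json.dumps(document, indent=2))
```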
Resource Life-Cycle
The life-cycle of a resource, while distinct from versioning, is often closely related to it. The evolution of a resource through its life-cycle stages—conception, creation, publication—may lead to new versions, though not invariably (e.g., resources passing through an approval workflow without revisions). Conversely, creating a new version does not always signify a life-cycle status change, such as in cases of minor updates or resources still under development.
The life-cycle status of a resource holds significant value, informing data consumers about its developmental stage, deprecation, or withdrawal, and indicating whether a new version is available. For data providers, marking a resource with its life-cycle status is crucial for managing data workflows, such as ensuring a resource is stable and appropriately flagged before publication.
Resource life-cycle management varies depending on community practices, data management policies, and workflows. This variation extends to different resource types (e.g., datasets vs. catalog records), which may follow distinct life-cycle statuses.
DCAT utilizes the adms:status property [[VOCAB-ADMS]] to specify life-cycle statuses, supplemented by relevant [[DCTERMS]] time-related properties (e.g., dcterms:created, dcterms:dateSubmitted). However, the DCAT-US profile does not mandate specific life-cycle statuses, instead deferring to standards and practices suitable for each application scenario and community of practice.
Dataset Series
A Dataset Series is a collection of related datasets that share common characteristics, making them part of a cohesive group. This section provides guidance on the effective use of Dataset Series within data catalogs, emphasizing the benefits and considerations for publishers and users alike.
A Dataset Series is a way for publishers to convey that a dataset is evolving across specific dimensions and is available as a set of related datasets. However, choosing to group datasets this way depends on the use case. Since it demands extra metadata management from the publisher, it's optional. For instance, a dataset updated frequently via an API may not require individual records for each yearly snapshot unless the publisher wishes to share each snapshot's lifecycle.
Why Use Dataset Series?
Implementing Dataset Series offers several advantages:
- Organizational Clarity: Helps categorize and group datasets, making it easier for users to find and navigate related sets of data.
- Efficient Data Management: Streamlines the management of multiple datasets, providing a structured approach for updates and maintenance.
- User Experience: Enhances data discoverability and understanding, as users can perceive the broader context of individual datasets within a collective series.
Guidelines for Implementing Dataset Series
When using Dataset Series, consider the following best practices:
- Initiate a Dataset Series exclusively for managing multiple, interconnected datasets, ensuring each dataset is significant independently and contributes to the series' overall narrative.
- Maintain up-to-date metadata for the Dataset Series, reflecting any addition or removal of datasets. Consider discontinuing the series if it no longer contains any datasets, particularly when persistent identifiers are employed.
- Refrain from categorizing a single, frequently updated dataset as a Dataset Series, and avoid associating distributions directly with a series. Distributions pertain to individual datasets within the series.
- Ensure a coherent and strong thematic or contextual connection among the members of a Dataset Series, defined by shared attributes such as topic, time frame, or publisher, among others.
- Uphold high-quality metadata standards for both individual datasets and the Dataset Series, with specific series guidelines superseding general practices where necessary.
Expressing Relationships and Connections
Articulating the interconnections between datasets in a series is crucial for user understanding and data management:
- Employ consistent metadata descriptors to clarify the relationships and commonalities within the series.
- Utilize versioning for datasets that evolve or expand over time, helping users track changes and understand the dataset's history.
- Highlight the distinct features of each dataset, ensuring its standalone value is clear, while also emphasizing its role in the broader series.
- For more complex relationships, especially in automated or tightly interconnected collections, leverage specific DCAT properties (e.g., next, prev, inSeries, last) to express the nuanced connections. Refer to the DCAT versioning guidelines for detailed practices.
Impact on Metadata
Being part of a Dataset Series may necessitate specific metadata considerations:
- Adjust metadata to emphasize the unique aspects of each dataset within the series, such as different time periods, geographical areas, or methodologies.
- Ensure that metadata reflects the cohesive nature of the series, helping users understand the context and relationship between individual datasets.
How to specify dataset series
The DCAT-US profile makes dataset series first-class citizens of data catalogs by using the new class dcat:DatasetSeries from [[VOCAB-DCAT-3]], defined as a subclass of dcat:Dataset.
The datasets are linked to the dataset series by using the property dcat:inSeries.
Note that a dataset series can also be hierarchical, and a dataset series can be a member of another dataset series.
Dataset series may evolve over time by acquiring new datasets. E.g., a dataset series about yearly budget data will acquire a new child dataset every year. In such cases, it might be important to link the yearly releases with relationships specifying the first, previous, and latest ones. In such a scenario, DCAT makes use of properties dcat:first, dcat:prev, and dcat:last, respectively; the next release can be reached by following dcat:prev in the inverse direction.
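A minimal sketch of a yearly budget series, written as Python that builds a JSON-LD graph. The IRIs and the helper `build_series` are hypothetical. Placing dcat:first and dcat:last on the series node follows the pattern used in [[VOCAB-DCAT-3]] and is an assumption of this sketch rather than something stated above.

```python
import json

CONTEXT = {"dcat": "http://www.w3.org/ns/dcat#"}
BASE = "https://example.gov/"           # hypothetical base IRI
SERIES_IRI = BASE + "series/budget"


def build_series(years):
    """Build a series node plus one member dataset per year, oldest first."""
    if not years:
        raise ValueError("a dataset series needs at least one member")
    ordered = sorted(years)

    def member_iri(year):
        return f"{BASE}dataset/budget-{year}"

    series = {
        "@id": SERIES_IRI,
        "@type": "dcat:DatasetSeries",
        "dcat:first": {"@id": member_iri(ordered[0])},
        "dcat:last": {"@id": member_iri(ordered[-1])},
    }

    members = []
    for index, year in enumerate(ordered):
        member = {
            "@id": member_iri(year),
            "@type": "dcat:Dataset",
            "dcat:inSeries": {"@id": SERIES_IRI},
        }
        if index > 0:
            # Every release except the first points to the one before it.
            member["dcat:prev"] = {"@id": member_iri(ordered[index - 1])}
        members.append(member)

    return {"@context": CONTEXT, "@graph": [series] + members}


document = build_series([2020, 2018, 2019])
print(json.dumps(document, indent=2))
```

When a new yearly release is published, only two things change: the new member gains dcat:inSeries and dcat:prev, and the series node's dcat:last is updated.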
Controlled Vocabularies
Importance of Controlled Vocabularies
Controlled vocabularies are predetermined sets of terms that have been carefully curated to ensure consistency, accuracy, and standardized representation of concepts within a specific domain. In the context of DCAT-US, controlled vocabularies are used to define and constrain the values of specific metadata elements. These vocabularies enable the creation of a common language for describing datasets, facilitating data integration and harmonization across different repositories.
The use of controlled vocabularies in DCAT-US offers several key benefits:
- Consistency: By providing a predefined list of terms, controlled vocabularies ensure consistent representation and labeling of metadata elements. This consistency promotes data interoperability and simplifies data integration efforts, as different datasets can be mapped to a shared set of controlled terms.
- Enhanced search and discovery: Controlled vocabularies enable more effective search and discovery of datasets. By aligning metadata elements with standardized terms, users can easily navigate and explore datasets based on their specific domain knowledge. Furthermore, controlled vocabularies facilitate the development of advanced search capabilities, such as faceted search, which allows users to refine search results based on predefined categories or facets.
- Data harmonization: In a diverse data landscape where multiple agencies and organizations produce and manage datasets, controlled vocabularies help in harmonizing the data representation. By agreeing on a set of controlled terms, data publishers can ensure that similar concepts are represented consistently across different datasets. This harmonization promotes data integration and interoperability, enabling meaningful analysis and comparison of data from various sources.
Requirements for controlled vocabularies
The following is a list of requirements that were identified for the controlled vocabularies to be recommended in this Application Profile.
Controlled vocabularies SHOULD:
- Be published under an open license.
- Be operated and/or maintained by an agency of the US Government, by a recognised standards organization or another trusted organization.
- Be properly documented.
- Have labels in English, and optionally in Spanish.
- Contain a relatively small number of terms (e.g. 10-25) that are general enough to enable a wide range of resources to be classified.
- Have terms that are identified by URIs with each URI resolving to documentation about the term.
- Have associated persistence and versioning policies.
These criteria do not intend to define a set of requirements for controlled vocabularies in general; they are only intended to be used for the selection of the controlled vocabularies that are proposed for this Application Profile.
Controlled vocabularies to be used
In the table below, a number of properties are listed with controlled vocabularies that MUST be used for the listed properties. The declaration of the following controlled vocabularies as mandatory ensures a minimum level of interoperability.
Compared with [[?DCAT-AP-20200608]], DCAT-US makes use of additional controlled vocabularies mandated by [[?DATA-GOV-REG]] and operated by the Data.gov Registry, with the sole exception of the coordinate reference systems register maintained by OGC [[?OGC-EPSG]].
For two of these controlled vocabularies, namely the NGDA spatial data themes [[?NGDA-THEMES]] and the ISO topic categories [[?ISO-19115-1]], the DCAT-US Working Group has defined a set of harmonised mappings to the Data.gov Vocabularies Data Themes [[?DATA-GOV-THEME]] (TBD), in order to facilitate the identification of the relevant theme in [[?DATA-GOV-THEME]] for geospatial/statistical metadata.
Other controlled vocabularies
In addition to the proposed common vocabularies, which are mandatory to ensure minimal interoperability, implementers are encouraged to publish and to use further region or domain-specific vocabularies that are available online. While those may not be recognised by general implementations of the Application Profile, they may serve to increase interoperability across applications in the same region or domain. Examples are the full set of concepts in Global Change Master Directory (GCMD) [[?GCMD]], and numerous other schemes.
For geospatial metadata, the working group has identified the following additional vocabularies:
- Geographic identifiers:
  - For marine regions:
    - Marine Regions - http://www.marineregions.org/
    - SeaVoX salt and fresh water body gazetteer - https://www.bodc.ac.uk/data/codes_and_formats/seavox/
  - General:
    - DBpedia for Geographic Placenames - http://dbpedia.org/about
    - National gazetteer vocabularies where feasible
    - SeaVoX salt and fresh water body gazetteer for ‘marine geonames’ - https://www.bodc.ac.uk/data/codes_and_formats/seavox/
- Keywords (with controlled vocabularies):
JSON-LD context file
One common technical question is the format in which the data is being exchanged. For DCAT-US 3.0 conformance, it is not mandatory that this happens in an RDF serialisation, but the exchanged format SHOULD be unambiguously transformable into RDF.
For JSON, a popular format to exchange data between systems, the DCAT-US profile provides a [JSON-LD context file](https://raw.githubusercontent.com/DOI-DO/dcat-us/main/context/dcat-us-3.0.jsonld). JSON-LD is a W3C Recommendation [[[json-ld11]]] that provides a standard approach to interpret JSON structures as RDF. Implementers can base their data exchange on the provided JSON-LD context file and so create a DCAT-US conformant data exchange. This JSON-LD context is not normative, i.e. other JSON-LD contexts are allowed to create a conformant DCAT-US data exchange.
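To make the role of a context concrete, the sketch below shows how a context lets plain JSON keys be read as IRIs. It is a toy illustration in Python, not a JSON-LD processor: it handles only flat term and prefix definitions, and the term mappings in `CONTEXT` are hypothetical rather than copied from the published DCAT-US context file.

```python
# Toy context: two prefixes and two short terms (hypothetical mappings).
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "keyword": "dcat:keyword",
}


def expand_term(term, context):
    """Resolve a JSON key to a full IRI using terms and prefixes only."""
    # A known term is first replaced by its definition.
    value = context.get(term, term)
    # A compact IRI such as "dcterms:title" is then expanded via its prefix.
    prefix, separator, suffix = value.partition(":")
    if separator and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return value


def expand_keys(record, context):
    """Return a copy of record whose keys are full IRIs."""
    return {expand_term(key, context): value for key, value in record.items()}


record = {"title": "Bus stops", "keyword": ["transit"]}
print(expand_keys(record, CONTEXT))
# {'http://purl.org/dc/terms/title': 'Bus stops', 'http://www.w3.org/ns/dcat#keyword': ['transit']}
```

Two systems that use different JSON key names can still exchange equivalent data, provided their contexts map those keys to the same IRIs.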
JSON Schemas
One common technical question is the format in which the data is being exchanged. For DCAT-US 3.0 conformance, it is not mandatory that this happens in an RDF serialisation, but the exchanged format SHOULD be unambiguously transformable into RDF.
For JSON, which is a widely adopted format for data exchange between systems, the DCAT-US profile offers an informative JSON Schema. This schema aids in understanding the structure expected for DCAT-US compliant data exchanges in JSON format.
JSON Schema offers a compact way to describe and validate the structure and content of JSON data, ensuring specific formatting and value constraints. However, it's more limited than JSON-LD context and RDF serialization due to its focus on structure over meaning.
JSON Schema's focus on structural validation forms a contrast with JSON-LD and RDF's capabilities. JSON-LD and RDF go beyond just validation, allowing the creation of a graph of interconnected entities that can be easily integrated and reused across various contexts. This interconnectedness is fundamental to the concept of the semantic web, where data is not only readable but also comprehensible to machines.
Specifically, JSON-LD facilitates the representation of data as a graph, making it suitable for more complex, interlinked data representations, which is a cornerstone of linked data systems. This graph-based approach stands in contrast to the tree-like structures that JSON Schema is confined to, limiting its utility in scenarios requiring extensive data interconnectivity and reusability.
Implementers can use the provided JSON Schema for their data exchanges, aligning with DCAT-US standards. However, it's non-normative, meaning alternatives creating compliant exchanges are also valid. Download the current JSON Schema here.
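The structural nature of JSON Schema validation can be sketched with a toy validator in Python. It supports only the `type`, `required`, and `properties` keywords, so it is not a JSON Schema implementation, and the `SCHEMA` fragment is hypothetical rather than taken from the published DCAT-US JSON Schema.

```python
# Hypothetical schema fragment for a dataset record.
SCHEMA = {
    "type": "object",
    "required": ["title", "description"],
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "keyword": {"type": "array"},
    },
}

# Mapping from JSON Schema type names to Python types (subset only).
JSON_TYPES = {"object": dict, "array": list, "string": str}


def validate(instance, schema):
    """Return a list of error messages for the supported keywords."""
    expected = schema.get("type")
    if expected and not isinstance(instance, JSON_TYPES[expected]):
        return [f"expected {expected}"]
    errors = []
    if isinstance(instance, dict):
        # Required properties must be present.
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"missing required property: {name}")
        # Present properties are checked against their own sub-schema.
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(f"{name}: {e}" for e in validate(instance[name], subschema))
    return errors


print(validate({"title": "Bus stops", "keyword": "transit"}, SCHEMA))
# ['missing required property: description', 'keyword: expected array']
```

The validator can confirm that `title` is a string, but nothing in the schema says that `title` means dcterms:title; that mapping is what a JSON-LD context adds.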
SHACL Validation
To verify whether a catalog adheres to the constraints of this Application Profile, the constraints are expressed in SHACL [[?SHACL]]. All constraints in this specification that could be expressed in SHACL have been included. Consequently, this set of SHACL expressions can be used to build a validation check for data exchange between two systems, a common scenario being one catalog being harvested into another.
For example, it may be recognized that the data being exchanged doesn't include the organizations' details since they are uniquely identified by a dereferenceable URI. In this scenario, enforcing rules about the mandatory presence of a name for each organization may not be pertinent. Rigorously applying the DCAT-US SHACL expressions would trigger errors, even though the data is accessible via an alternative route. In this context, it's acceptable to omit this check during the validation phase.
This example underscores that to achieve an optimal user experience during a validation process, it's crucial to consider the actual data transferred between systems and apply only the constraints relevant to the data exchange. To facilitate this, the SHACL expressions are organized into separate files, aligning with common validation configurations.
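The idea of applying only the relevant constraint groups can be sketched in plain Python, with each group standing in for one SHACL file. This is an analogy rather than SHACL: the group names, the two checks, and the use of foaf:Organization and foaf:name are hypothetical choices for illustration.

```python
def datasets_have_titles(nodes):
    """Report every dataset node that lacks a title."""
    return [
        f"{n['@id']}: dataset without title"
        for n in nodes
        if n.get("@type") == "dcat:Dataset" and "dcterms:title" not in n
    ]


def organizations_have_names(nodes):
    """Report every organization node that lacks a name."""
    return [
        f"{n['@id']}: organization without name"
        for n in nodes
        if n.get("@type") == "foaf:Organization" and "foaf:name" not in n
    ]


# Each key plays the role of one separately loadable constraint file.
CONSTRAINT_GROUPS = {
    "dataset": [datasets_have_titles],
    "organization": [organizations_have_names],
}


def validate(nodes, groups):
    """Run only the checks belonging to the selected groups."""
    errors = []
    for group in groups:
        for check in CONSTRAINT_GROUPS[group]:
            errors.extend(check(nodes))
    return errors


# Hypothetical harvested data: the organization is referenced by URI only.
harvested = [
    {"@id": "https://example.gov/dataset/bus-stops", "@type": "dcat:Dataset",
     "dcterms:title": "Bus stops"},
    {"@id": "https://example.gov/org/transit", "@type": "foaf:Organization"},
]

print(validate(harvested, ["dataset", "organization"]))
# ['https://example.gov/org/transit: organization without name']
print(validate(harvested, ["dataset"]))
# []
```

Dropping the organization group for this exchange removes an error that says nothing about the quality of the harvested data itself.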
The SHACL application profile for DCAT-US can be found here
Namespaces
Namespaces and prefixes used in normative parts of this recommendation are shown in the following table:
| Prefix | Namespace IRI | Source |
| --- | --- | --- |
| adms | `http://www.w3.org/ns/adms#` | [[VOCAB-ADMS]] |
| cnt | `http://www.w3.org/2011/content#` | [[Content-in-RDF10]] |
| dcat | `http://www.w3.org/ns/dcat#` | [[VOCAB-DCAT]] |
| dcat-us | `http://resources.data.gov/ontology/dcat-us#` | [[DCAT-US]] |
| dct | `http://purl.org/dc/terms/` | [[DCTERMS]] |
| dqv | `http://www.w3.org/ns/dqv#` | [[VOCAB-DQV]] |
| foaf | `http://xmlns.com/foaf/0.1/` | [[FOAF]] |
| gsp | `http://www.opengis.net/ont/geosparql#` | [[GeoSPARQL]] |
| locn | `http://www.w3.org/ns/locn#` | [[LOCN]] |
| org | `http://www.w3.org/ns/org#` | [[VOCAB-ORG]] |
| prov | `http://www.w3.org/ns/prov#` | [[PROV]] |
| rdf | `http://www.w3.org/1999/02/22-rdf-syntax-ns#` | [[RDF-SYNTAX-GRAMMAR]] |
| rdfs | `http://www.w3.org/2000/01/rdf-schema#` | [[RDF-SCHEMA]] |
| schema | `http://schema.org/` | [[schema-org]] |
| sdmx-attribute | `http://purl.org/linked-data/sdmx/2009/attribute#` | [[?SDMX-ATTRIBUTE]] |
| skos | `http://www.w3.org/2004/02/skos/core#` | [[SKOS-REFERENCE]] |
| spdx | `http://spdx.org/rdf/terms#` | [[SPDX]] |
| vcard | `http://www.w3.org/2006/vcard/ns#` | [[VCARD-RDF]] |
| xsd | `http://www.w3.org/2001/XMLSchema#` | [[XMLSCHEMA11-2]] |