dans-schema
DANS XML schemas
DESCRIPTION
The DANS BagIt Profile v1 requires several XML metadata files. This project includes the XML schema files to validate those XML files. These schemas will be published on a public website. Alternatively, you can clone and build this project, which packages the site both as a JAR file and as an RPM.
DANS Schema vs EASY Schema
DANS Schema contains the schemas for the Data Stations. It is forked from the legacy project easy-schema, which will be phased out along with EASY itself.
BUILDING FROM SOURCE
Prerequisites:
- Java 11 or higher
- Maven 3.3.3 or higher
- RPM
Steps:
git clone https://github.com/DANS-KNAW/dans-schema.git
cd dans-schema
mvn clean install
If the rpm executable is found at /usr/local/bin/rpm, the build profile that includes the RPM packaging will be activated. If rpm is available, but at a different path, then activate it by using Maven's -P switch: mvn -Prpm install.
Example Resources
The example resources directory lib/src/test/resources has a number of XML files that can be used to become more familiar with the XML formats used by the DANS services.
The DANS bag documentation provides best practices for namespace prefixes. Note that actual practice also uses dct for dcterms.
The table maps formats to files in a bag.
XSD | file in DANS V1 bag |
---|---|
ddm | metadata/dataset.xml |
files | metadata/files.xml |
Examples for XSDs that are not in the table above may be standalone,
or embedded within examples for XSDs from the table.
The namespace URI of the root element in the examples should match the targetNamespace of the schema to validate the example.
├── bag/metadata
│   ├── agreements/agreements
│   └── files/files
├── dcx
│   ├── dcx-dai
│   └── dcx-gml
└── md/ddm
    ├── v1/ddm
    └── v2/ddm
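The requirement that an example's root namespace match the schema's targetNamespace can be checked before running a full validation. The following is a minimal sketch, not part of this project: the class name NamespaceCheck, its helper methods, and the urn:example:ns namespace are all hypothetical, and it assumes both documents are available as streams.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class NamespaceCheck {

    /** Parses the stream with namespace awareness and returns the root element. */
    static Element root(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, getNamespaceURI() returns null
        return factory.newDocumentBuilder().parse(in).getDocumentElement();
    }

    /** True if the XML root element's namespace equals the schema's targetNamespace. */
    static boolean namespaceMatches(InputStream xml, InputStream xsd) throws Exception {
        String rootNamespace = root(xml).getNamespaceURI();
        // getAttribute returns an empty string when targetNamespace is absent.
        String targetNamespace = root(xsd).getAttribute("targetNamespace");
        return targetNamespace.equals(rootNamespace == null ? "" : rootNamespace);
    }

    /** Helper for the demo: wraps a string as a UTF-8 stream. */
    static InputStream stream(String content) {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical schema and document, for illustration only.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
                + " targetNamespace='urn:example:ns'/>";
        String xml = "<root xmlns='urn:example:ns'/>";
        System.out.println(namespaceMatches(stream(xml), stream(xsd)));
    }
}
```

A mismatch reported by such a check usually explains a validation error about an undeclared root element faster than the validator's own message does.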
Validation
Unit tests
- ValidateXsdFilesTest shows that the provided XSDs are valid.
- ValidateXmlFilesTest shows that the provided examples are valid. Directories in lib/src/test/resources should match with XSD files in the lib/src/main/resources directory.
- DdmVersionChangesTest documents the effect of the differences between V1 and V2 of the DDM schemas.
DIY
Various online and offline tools can validate an XML file against an XSD schema. Some of them silently ignore failures to load third-party schemas referenced by the main schema, which can cause error messages like:
Cannot resolve the name 'xml:lang' to a(n) 'element declaration' component
Some loading problems might be caused by services refusing the default User-Agent header that is sent with the request to fetch the XSD.
A system property for Java applications can override the default value, for example with a command line argument:
-Dhttp.agent=something/1.0
Below is an example of how to write your own validator in Java. It silently ignores problems with loading third-party schemas.
import org.xml.sax.SAXException;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

import static javax.xml.validation.SchemaFactory.newInstance;

public class Validate {
    public static void main(String[] args) throws IOException, SAXException {
        // Override the default User-Agent, which some servers refuse when fetching XSDs.
        System.setProperty("http.agent", "Test");
        String xsd = "https://easy.dans.knaw.nl/schemas/dcx/2020/03/dcx-dai.xsd";
        String xml = "src/main/resources/examples/dcx-dai/example2.xml";
        // Close the input stream whether validation succeeds or throws.
        try (InputStream in = new FileInputStream(xml)) {
            newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new URL(xsd))
                .newValidator()
                .validate(new StreamSource(in)); // throws SAXException if invalid
        }
    }
}
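To make schema-loading problems visible instead of silently ignored, an error handler can be registered on both the schema factory and the validator. The sketch below is one possible approach, not this project's code: StrictValidate and CollectingHandler are hypothetical names, the inline schema and document are made up for illustration, and it assumes the XML library reports an unreadable third-party schema as a warning, which is worth verifying for the implementation you use.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class StrictValidate {

    /** Hypothetical handler: records warnings and rethrows errors. */
    static class CollectingHandler implements ErrorHandler {
        final List<String> warnings = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            warnings.add(e.getMessage());
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Validates xml against xsd and returns every warning raised along the way. */
    static List<String> validate(StreamSource xsd, StreamSource xml)
            throws SAXException, IOException {
        CollectingHandler handler = new CollectingHandler();
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setErrorHandler(handler); // warnings raised while compiling the schema
        Validator validator = factory.newSchema(xsd).newValidator();
        validator.setErrorHandler(handler); // warnings raised while validating
        validator.validate(xml);
        return handler.warnings;
    }

    public static void main(String[] args) throws Exception {
        // Made-up schema and document, for illustration only.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:element name='note' type='xs:string'/></xs:schema>";
        String xml = "<note>hello</note>";
        List<String> warnings = validate(
                new StreamSource(new StringReader(xsd)),
                new StreamSource(new StringReader(xml)));
        System.out.println("warnings: " + warnings);
    }
}
```

Treating a non-empty warning list as a failure is a stricter policy than the default, so a caller may prefer to log the warnings rather than abort.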