DANS BagPack Profile v1.0.0

Introduction

Version

  • Document version: 1.0.0
  • Publication date: N/A

Status

The status of this document is DRAFT.

Scope

This document specifies what constitutes an acceptable DANS BagPack. This includes all the requirements for a bag to be successfully processed by the DANS Data Vault ingest workflow.

Overview and Conventions

Keywords

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

The keyword "SHOULD" is also used for requirements that are impossible or impractical for the archival organization (i.e., DANS) to check. The client should do its best to meet these requirements, but must not rely on the archival organization validating them.

Subdivisions

The requirements are subdivided into the following sections:

  • RDA BagPack Related - requirements that refer back to the RDA BagPack specifications. If a bag only needs to comply with the RDA BagPack specifications, then it should be sufficient to only check this section.
  • Extra Requirements for DANS BagPack - requirements that are specific to the DANS BagPack Profile, and which are in addition to the RDA BagPack requirements.

The sections are numbered and may have numbered subsections. The requirements themselves are stated as numbered rules. Rules may have parts that are labeled with letters: (a), (b), (c), etc. To uniquely identify a specific rule, use the notation

<section-nr>[.<subsection-nr>].<rule-nr> [(<letter>)]

Example: 2.3.4 (e) means part e of the fourth rule in subsection 3 of section 2.
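
The notation above can be parsed mechanically, for example when a validation report needs to point at a specific rule. The following is a minimal sketch in Python; the names RULE_REF, RuleRef and parse_rule_ref are hypothetical and not part of this profile, and the pattern assumes lowercase part letters and at most one subsection level, as in the notation shown.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern mirroring <section-nr>[.<subsection-nr>].<rule-nr> [(<letter>)]
RULE_REF = re.compile(r"^(\d+)(?:\.(\d+))?\.(\d+)(?:\s*\(([a-z])\))?$")


class RuleRef(NamedTuple):
    section: int
    subsection: Optional[int]  # None when the section has no subsections
    rule: int
    part: Optional[str]        # None when no lettered part is referenced


def parse_rule_ref(text: str) -> RuleRef:
    """Split a rule reference such as '2.3.4 (e)' into its components."""
    match = RULE_REF.match(text.strip())
    if match is None:
        raise ValueError(f"not a rule reference: {text!r}")
    section, subsection, rule, part = match.groups()
    return RuleRef(
        int(section),
        int(subsection) if subsection is not None else None,
        int(rule),
        part,
    )
```

With this sketch, "2.3.4 (e)" yields section 2, subsection 3, rule 4, part "e", while "2.5" yields section 2, no subsection, rule 5.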

Requirements

1. RDA BagPack Related

The following items are required by the RDA BagPack specifications:

  1. A DANS BagPack MUST be valid according to BagIt v1.0.
  2. A DANS BagPack MUST contain a file metadata/datacite.xml: (a) this file MUST be valid according to the DataCite schema version 4.0 or later, except that the schema's requirement for a DOI does not apply: a DOI is not required for a DANS BagPack; (b) DataCite's recommended properties SHOULD be present.
  3. Other files besides datacite.xml MAY be present in the metadata folder.
  4. The files in the metadata folder MUST be mentioned in the tag-manifest (this is optional in BagIt, but required by RDA BagPack).
  5. BagIt-Profile-Identifier MUST be provided.
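
Rule 4 lends itself to a simple mechanical check. The sketch below, in Python, lists the files under the metadata folder that no tag manifest mentions. It is an illustration under stated assumptions, not the DANS validator: the function name is hypothetical, tag manifests are assumed to be named tagmanifest-<algorithm>.txt in the bag root with one "<checksum> <path>" entry per line (as in BagIt v1.0), the entries of all tag manifests are pooled, and percent-encoded paths are not decoded.

```python
from pathlib import Path


def metadata_files_missing_from_tag_manifests(bag_dir: Path) -> set[str]:
    """Return bag-relative paths under metadata/ that no tag manifest lists."""
    listed: set[str] = set()
    for manifest in bag_dir.glob("tagmanifest-*.txt"):
        for line in manifest.read_text(encoding="utf-8").splitlines():
            parts = line.split(maxsplit=1)  # checksum, then the path
            if len(parts) == 2:
                listed.add(parts[1].strip())

    metadata_dir = bag_dir / "metadata"
    present = {
        path.relative_to(bag_dir).as_posix()
        for path in metadata_dir.rglob("*")
        if path.is_file()
    }
    return present - listed
```

An empty result means every file in the metadata folder is covered; checksum verification itself is left to a full BagIt validation (rule 1).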

2. Extra Requirements for DANS BagPack

The following items are required by the DANS BagPack Profile, in addition to the requirements of RDA BagPack:

  1. BagIt-Profile-Identifier MUST contain https://doi.org/10.17026/e948-0r32.
  2. The bag MUST be valid according to the DANS BagPack BagIt Profile.
  3. There MUST be a file called metadata/pid-mapping.txt: each row of this file MUST have the form <identifier> <referenced object>, where <identifier> is a unique URI and <referenced object> is the path of the referenced file relative to the root of the bag; the two fields are separated by one or more spaces.
  4. (a) There MUST be a file called metadata/oai-ore.jsonld; (b) this file MUST be well-formed JSON.
  5. There MUST be a one-to-one mapping between the files in the data folder and the files described in the Aggregation contained in the oai-ore.jsonld file: (a) all identifiers mentioned in oai-ore.jsonld that refer to files in the data folder MUST be present in pid-mapping.txt; (b) all file objects mentioned in pid-mapping.txt MUST be present in oai-ore.jsonld.
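
Rules 3 to 5 can be sketched in Python as follows. This is a hedged illustration, not the DANS implementation: all function names are hypothetical, blank lines in pid-mapping.txt are tolerated as an assumption, and only mapping rows whose path starts with data/ are treated as file objects. Because this profile does not spell out the JSON-LD layout of the Aggregation, the sketch takes the set of file identifiers found in oai-ore.jsonld as an input rather than extracting it.

```python
import json
import re

# One row: <identifier>, one or more spaces, <referenced object>.
ROW = re.compile(r"^(\S+) +(.+)$")


def parse_pid_mapping(text: str) -> dict[str, str]:
    """Parse pid-mapping.txt content into {identifier: bag-relative path}."""
    mapping: dict[str, str] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        match = ROW.match(line.rstrip())
        if match is None:
            raise ValueError(f"line {number}: expected '<identifier> <referenced object>'")
        identifier, path = match.groups()
        if identifier in mapping:
            raise ValueError(f"line {number}: duplicate identifier {identifier}")
        mapping[identifier] = path
    return mapping


def is_well_formed_json(text: str) -> bool:
    """Rule 4 (b): the file content must parse as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def compare_identifiers(
    ore_ids: set[str], mapping: dict[str, str]
) -> tuple[set[str], set[str]]:
    """Return (ORE ids missing from the mapping, data-file ids missing from the ORE)."""
    data_ids = {i for i, path in mapping.items() if path.startswith("data/")}
    return ore_ids - set(mapping), data_ids - ore_ids
```

Both returned sets being empty corresponds to rules 5 (a) and 5 (b) holding for the identifiers supplied; comparing the mapped paths against the files actually present in the data folder would be a further step.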