Development¶
This page contains information for developers about how to contribute to this project.
Set-up¶
This project can be used in combination with dans-dev-tools. Before you can start it as a service, some dependencies must be started first:
HSQL database¶
Open a separate terminal tab:
start-hsqldb-server.sh
Dataverse instance¶
The service needs a Dataverse instance to talk to. For this you can use, for example, dev_archaeology (only accessible to DANS developers):
start-preprovisioned-box.py -s
After start-up:
vagrant ssh
curl -X PUT -d s3kretKey http://localhost:8080/api/admin/settings/:BlockedApiKey
curl -X PUT -d unblock-key http://localhost:8080/api/admin/settings/:BlockedApiPolicy
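With these settings in place, admin calls from the host carry the key as a query parameter. A sketch (the endpoint shown is just one example of an admin call; it assumes the box is running and the keys above were set):

```shell
# With :BlockedApiPolicy set to unblock-key, admin endpoints can be called
# from outside the box by appending the key as a query parameter.
UNBLOCK_KEY=s3kretKey
URL="http://localhost:8080/api/admin/settings/:BlockedApiPolicy?unblock-key=$UNBLOCK_KEY"
echo "$URL"
# curl "$URL"   # uncomment when the vagrant box is up
```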
Setting these is necessary to allow calls to admin API endpoints from outside the box, but it breaks access to the admin API from within the box. To roll back to the original situation:
curl -X PUT -d localhost-only http://localhost:8080/api/admin/settings/:BlockedApiPolicy?unblock-key=s3kretKey
dd-validate-dans-bag¶
dd-ingest-flow uses dd-validate-dans-bag to validate the bag in the deposit. The validation service must be run outside the vagrant box, because it needs disk access to the deposit.
Open a separate terminal tab:
start-env.sh # only first time
Configure the correct API key in etc/config.yml, and set validation.baseFolder to the absolute path of the data directory of dd-ingest-flow.
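The relevant fragment of etc/config.yml might look roughly like this (a sketch; only validation.baseFolder is taken from this page, the surrounding structure and the path are assumptions to adjust to your checkout):

```yaml
validation:
  baseFolder: /home/developer/git/dd-ingest-flow/data  # absolute path to dd-ingest-flow's data directory
```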
Now you can start the service:
start-service.sh
dd-ingest-flow¶
Open both projects in separate terminal tabs and do the following for each:
start-env.sh # only first time
Configure the correct API keys (apiKey and unblockKey) in etc/config.yml. Note that apiKey can be overridden per ingest area.
Now you can start the service:
start-service.sh
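As a sketch, the keys might appear in etc/config.yml along these lines (the exact nesting is an assumption; only the names apiKey and unblockKey come from this page):

```yaml
dataverse:
  apiKey: changeme        # API key of the Dataverse account used for ingest
  unblockKey: s3kretKey   # must match the :BlockedApiKey set earlier
ingestFlow:
  autoIngest:
    apiKey: changeme      # optional per-ingest-area override
```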
Prepare and start a deposit¶
Once the dependencies and services are started, you can ingest a single deposit by moving (not copying) it into data/auto-ingest/inbox, or whatever directory is configured as ingestFlow:autoIngest:inbox in dd-ingest-flow/etc/config.yml.
Note that a migration bag contains more data than is valid for this process. The validator will inform you about what to remove and how to fix the checksums.
The dans-datastation-tools project has commands to copy/move your data into an ingest area (auto-ingest/import/migration), but these require the user group deposits. When running locally you don't have such a group, so you can't use these commands. Make sure to have the following structure:
dd-ingest-flow
├── data
│ ├── auto-ingest
│ │ ├── inbox
│ │ │ └── <UUID>
│ │ │ ├── bag
│ │ │ │ └── *
│ │ │ └── deposit.properties
│ │ └── out
│ └── tmp
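Staging a deposit into the inbox can be sketched as follows (the UUID and paths are examples; a real deposit would contain an actual bag and a filled-in deposit.properties):

```shell
#!/bin/sh
set -e

# Hypothetical deposit UUID and staging location; adjust to your set-up.
UUID=0b34b5e6-3dd7-4b8a-b661-1f0e2a3c4d5e
INBOX=dd-ingest-flow/data/auto-ingest/inbox

# Prepare a dummy deposit outside the inbox first...
mkdir -p "staging/$UUID/bag"
touch "staging/$UUID/deposit.properties"

# ...then move it in one step: mv (not cp) is atomic within a filesystem,
# so the ingest flow never picks up a half-copied deposit.
mkdir -p "$INBOX"
mv "staging/$UUID" "$INBOX/"
```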
Alternatively, you can prepare batches in one of the other ingest areas and start as follows. Configure the ingest_flow and dataverse sections of .dans-datastation-tools.yml, which is a copy of src/datastation/example-dans-datastation-tools.yml.
- The service_baseurl should refer to localhost.
- The ingest_areas should refer to the same folders as the ingestFlow section of dd-ingest-flow/etc/config.yml. Replace the default /var/opt/dans.knaw.nl/tmp in the latter with data.
- Set the apiKey.
- To repeat a test you'll need the dv-dataset-destroy script, which needs safety_latch: OFF (the default is ON).
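Putting the points above together, .dans-datastation-tools.yml might look roughly like this (a sketch; only the key names mentioned on this page come from the docs, the nesting and values are assumptions to check against the example file):

```yaml
ingest_flow:
  service_baseurl: http://localhost:20300   # port is an example; use your local dd-ingest-flow port
  ingest_areas:
    auto_ingest: ../dd-ingest-flow/data/auto-ingest
    migration: ../dd-ingest-flow/data/migration
dataverse:
  api_key: changeme
safety_latch: ON    # set to OFF to allow dv-dataset-destroy when repeating tests
```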
Assuming dans-datastation-tools and dd-ingest-flow are in the same directory:
cd ~/git/service/data-station/dans-datastation-tools
poetry run ingest-flow start-migration -s ../dd-ingest-flow/data/migration/inbox/<SOME-DIR>/<UUID>