A tool for diagnosing and repairing OCFL-based Fedora repositories.
Requires Java 11.
Usage: fcrepo-doctor [-dhvV] [--disable-auto-versioning] [-f=<problemsFile>]
                     [-o=<outputDir>] [-p=<parallelism>] -r=<ocflRoot>
                     [--s3-access-key=<s3AccessKey>] [--s3-bucket=<s3Bucket>]
                     [--s3-endpoint=<s3Endpoint>] [--s3-profile=<s3Profile>]
                     [--s3-region=<s3Region>] [--s3-secret-key=<s3SecretKey>]
                     -t=<tempDir>
A tool for diagnosing and repairing OCFL-based Fedora repositories.
  -d, --debug                Enable stack traces
      --disable-auto-versioning
                             Disables auto-versioning, which means that
                               applying fixes will NOT automatically result in
                               a new version of the affected resources.
  -f, --fix=<problemsFile>   Path to a file that contains a list of files to
                               fix. This file is generated by first running the
                               command without this option.
  -h, --help                 Show this help message and exit.
  -o, --output=<outputDir>   Path to a directory to write output files into.
                               Default: current directory
  -p, --parallelism=<parallelism>
                             Number of threads to use. Default: number of cores
                               minus one
  -r, --ocfl-root=<ocflRoot> Path to Fedora's OCFL storage root. When using S3,
                               this is the prefix within the bucket that the
                               storage root is located in, and it should be an
                               empty string if no prefix is used.
      --s3-access-key=<s3AccessKey>
                             S3 access key. If provided, a secret key must also
                               be specified
      --s3-bucket=<s3Bucket> S3 bucket the OCFL repository is in
      --s3-endpoint=<s3Endpoint>
                             URL to the S3 endpoint
      --s3-profile=<s3Profile>
                             S3 profile to use.
      --s3-region=<s3Region> S3 region
      --s3-secret-key=<s3SecretKey>
                             S3 secret key. If provided, an access key must
                               also be specified
  -t, --temp=<tempDir>       Path to a directory on the same filesystem as the
                               OCFL root to use for temporary files
  -v, --verbose              Enable more verbose logging
  -V, --version              Print version information and exit.
fcrepo-doctor is intended to be run in two phases. The first phase scans a repository and writes a
fcrepo-doctor-problems.json file that lists every resource found to have problems. This first phase can be run
while Fedora is operating normally.
In the second phase, the fcrepo-doctor-problems.json file is passed back to fcrepo-doctor, which attempts to
resolve all of the problems that were identified in the first phase. For this phase, Fedora MUST be shut down, or
at least not be actively written to.
Resources that are not fixed successfully are rolled back and logged to fcrepo-doctor-failed-problems.json.
It is safe to stop the utility while it's running. It will gracefully terminate and write
fcrepo-doctor-incomplete-problems.json, which contains all of the problems that have yet to be fixed. This file
can then be passed back into the utility to resume where you left off.
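Resuming an interrupted fix run might look like the sketch below. It assumes the incomplete-problems file was written to the current directory (the default for --output) and reuses the example paths from this README. The existence check and the `result` variable are illustrative additions, not part of the tool.

```shell
# File left behind when a fix run is stopped early (name taken from the text above).
# Assumption: it is in the current directory, the default --output location.
INCOMPLETE=fcrepo-doctor-incomplete-problems.json

if [ -f "$INCOMPLETE" ]; then
  # Pass the incomplete file back in place of the original problems file.
  java -jar fcrepo-doctor.jar \
    --fix "$INCOMPLETE" \
    --ocfl-root fcrepo/data/ocfl-root \
    --temp fcrepo/data/ocfl-temp
  result="resume attempted"
else
  result="nothing to resume"
fi

echo "$result"
```

As with any fix run, Fedora should be shut down or not actively written to while this runs.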
# Phase 1: Analyze the repository
java -jar fcrepo-doctor.jar \
--ocfl-root fcrepo/data/ocfl-root \
--temp fcrepo/data/ocfl-temp
# Phase 2: Fix any problems found
# You may want to tee these logs to a file
java -jar fcrepo-doctor.jar \
--fix fcrepo-doctor-problems.json \
--ocfl-root fcrepo/data/ocfl-root \
--temp fcrepo/data/ocfl-temp
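For a repository stored in S3, a phase 1 scan could be assembled as in the sketch below. The flags come from the help text above, but the bucket, region, prefix, temp directory, and log file name are all hypothetical placeholders, and whether a local temp directory is appropriate for an S3-backed repository is an assumption. Credentials are left out; the help text lists --s3-profile, or --s3-access-key together with --s3-secret-key, for supplying them.

```shell
# Hypothetical values; replace with your own.
S3_BUCKET=example-fedora-bucket
S3_REGION=us-east-1
OCFL_PREFIX=fcrepo/ocfl-root   # prefix within the bucket, per the --ocfl-root help text
TEMP_DIR=/tmp/fcrepo-doctor    # assumed local scratch directory

# Collect the arguments first so they can be reviewed before anything runs.
set -- \
  --ocfl-root "$OCFL_PREFIX" \
  --temp "$TEMP_DIR" \
  --s3-bucket "$S3_BUCKET" \
  --s3-region "$S3_REGION"

if [ -f fcrepo-doctor.jar ]; then
  # Run the scan and keep a copy of the log output with tee.
  java -jar fcrepo-doctor.jar "$@" 2>&1 | tee fcrepo-doctor-scan.log
else
  # Dry run: show the command that would be executed.
  echo "would run: java -jar fcrepo-doctor.jar $*"
fi
```

Building the argument list with `set --` keeps each value quoted as a single argument, so the same list can be printed for review and then passed to the tool unchanged.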
Versions of Fedora 6 prior to 6.2.0 allowed triples to be added to a binary description with the binary
description's resource ID as the subject rather than the binary's, and serialized that subject directly instead of
translating it to the binary's ID. This was a bug: the ID should have been translated. fcrepo-doctor can identify
binary descriptions that contain invalid subjects and correct them.
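To make the subject problem concrete, the sketch below rewrites a single N-Triples line so that its subject is the binary's ID rather than the binary description's ID. It is an illustration only and not how fcrepo-doctor implements the fix. It assumes a description's ID is the binary's ID followed by /fcr:metadata, and the resource ID and helper name `fix_subject` are hypothetical.

```shell
# Illustration only: drop a trailing /fcr:metadata from the subject
# (the first IRI on the line), leaving predicates and objects untouched.
fix_subject() {
  sed 's|^<\([^>]*\)/fcr:metadata>|<\1>|'
}

# A triple whose subject is the binary description rather than the binary.
printf '%s\n' \
  '<info:fedora/example/binary/fcr:metadata> <http://purl.org/dc/elements/1.1/title> "Example" .' \
  | fix_subject
```

The pattern is anchored to the start of the line, so an object IRI that happens to end in /fcr:metadata is not rewritten.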