Originally Published: 12/23/2015
Last Published: 12/23/2015
The current BagIt specification at the time of this writing is version 0.97, expiring Dec 25, 2015. This document refers to it as the “BagIt specification”, or simply “BagIt”.
This is version 1.0 of the Data Conservancy BagIt profile. Conforming Bags will have the following metadata name and value in bag-info.txt:
This profile specification is incompatible with previous versions of this profile. This profile supersedes version 0.9 of the Data Conservancy BagIt profile, identified by the following metadata name and value in bag-info.txt:
Bags conforming to this profile must be valid as defined by the BagIt specification.
Therefore, a Bag conforming to Data Conservancy BagIt Profile 1.0 will be a complete and valid bag per the BagIt specification.
Section 7 of the BagIt specification discusses practical concerns of Bag interoperability between platforms. While BagIt §7 is informative, this profile applies the following constraints to address platform interoperability.
Informed by BagIt §7.2.2, the following characters are illegal and must not be used in path segments or file names:
The maximum allowed path length within the bag is 1024 bytes, and the maximum allowed file name length is 255 bytes.
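The length limits above can be checked mechanically. The following is a minimal sketch, not part of the profile: the function name `check_path_lengths` is hypothetical, and it assumes that paths are measured relative to the bag base directory, that byte length is counted in UTF-8, and that the 255-byte limit applies to the final path segment only.

```python
MAX_PATH_BYTES = 1024  # maximum path length within the bag, per this profile
MAX_NAME_BYTES = 255   # maximum file name length, per this profile


def check_path_lengths(path: str, encoding: str = "utf-8") -> list[str]:
    """Return a list of length problems for one '/'-separated path in the bag.

    Assumption: lengths are counted in bytes of the given encoding (UTF-8 here).
    """
    problems = []
    if len(path.encode(encoding)) > MAX_PATH_BYTES:
        problems.append(f"path exceeds {MAX_PATH_BYTES} bytes")
    # Assumption: the file name is the last '/'-separated segment.
    name = path.rsplit("/", 1)[-1]
    if len(name.encode(encoding)) > MAX_NAME_BYTES:
        problems.append(f"file name exceeds {MAX_NAME_BYTES} bytes")
    return problems
```

Note that the limits are expressed in bytes, so a name made of multi-byte characters can exceed 255 bytes while containing far fewer than 255 characters.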
In addition to the restrictions presented above, path segments (demarcated by ‘/’) in a manifest must not be equal to any of the following byte sequences:
Bags conforming to this profile must be valid (and therefore complete) per BagIt §3. This profile does not support fetch.txt (BagIt §2.2.3); bags with a non-empty fetch file are therefore not conformant with this profile.
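A check for the fetch constraint might look like the sketch below. The function name `has_nonempty_fetch` is hypothetical, and the sketch assumes that a fetch.txt containing only whitespace counts as empty, which the profile does not state.

```python
from pathlib import Path


def has_nonempty_fetch(bag_dir: str) -> bool:
    """Return True if the bag base directory contains a non-empty fetch.txt."""
    fetch = Path(bag_dir) / "fetch.txt"
    if not fetch.is_file():
        return False
    # Assumption: a file holding only whitespace is treated as empty.
    return fetch.read_text(encoding="utf-8").strip() != ""
```

A bag for which this returns True would not conform to the profile.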
This profile adds cardinality constraints on the reserved metadata names in BagIt §2.2.2:
Source-Organization: minOccurs=0, maxOccurs=unlimited
Organization-Address: minOccurs=0, maxOccurs=unlimited
Contact-Name: minOccurs=0, maxOccurs=unlimited
Contact-Phone: minOccurs=0, maxOccurs=unlimited
Contact-Email: minOccurs=0, maxOccurs=unlimited
External-Description: minOccurs=0, maxOccurs=1
Bagging-Date: minOccurs=0, maxOccurs=1
External-Identifier: minOccurs=0, maxOccurs=unlimited
Bag-Size: minOccurs=0, maxOccurs=1
Payload-Oxum: minOccurs=0, maxOccurs=1
Bag-Group-Identifier: minOccurs=0, maxOccurs=1
Bag-Count: minOccurs=0, maxOccurs=1
Internal-Sender-Identifier: minOccurs=0, maxOccurs=unlimited
Internal-Sender-Description: minOccurs=0, maxOccurs=1
This profile adds the following additional reserved metadata names:
BagIt-Profile-Identifier: minOccurs=1, maxOccurs=1
Resource-Manifest: minOccurs=1, maxOccurs=1
Metadata names with minOccurs=0 should be considered optional, whereas names with minOccurs=1 are required.
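The cardinality constraints above lend themselves to a simple count-and-compare check over bag-info.txt. The sketch below is illustrative only: the names `CARDINALITY`, `count_labels` and `check_cardinality` are hypothetical, labels are compared case-insensitively as an assumption, and lines beginning with whitespace are treated as continuations of the previous value. Names with minOccurs=0 and maxOccurs=unlimited are omitted from the table because they impose no constraint.

```python
from collections import Counter

# (minOccurs, maxOccurs) for every name this profile actually constrains.
CARDINALITY = {
    "BagIt-Profile-Identifier": (1, 1),
    "Resource-Manifest": (1, 1),
    "External-Description": (0, 1),
    "Bagging-Date": (0, 1),
    "Bag-Size": (0, 1),
    "Payload-Oxum": (0, 1),
    "Bag-Group-Identifier": (0, 1),
    "Bag-Count": (0, 1),
    "Internal-Sender-Description": (0, 1),
}


def count_labels(bag_info_text: str) -> Counter:
    """Count how often each metadata label occurs in bag-info.txt content."""
    counts = Counter()
    for line in bag_info_text.splitlines():
        # Skip blank lines and continuation lines (leading whitespace).
        if not line.strip() or line[0] in " \t":
            continue
        label, sep, _ = line.partition(":")
        if sep:
            # Assumption: labels are matched case-insensitively.
            counts[label.strip().lower()] += 1
    return counts


def check_cardinality(bag_info_text: str) -> list[str]:
    """Return one message per violated minOccurs/maxOccurs constraint."""
    counts = count_labels(bag_info_text)
    problems = []
    for name, (lo, hi) in CARDINALITY.items():
        n = counts[name.lower()]
        if n < lo:
            problems.append(f"{name}: found {n}, need at least {lo}")
        elif n > hi:
            problems.append(f"{name}: found {n}, at most {hi} allowed")
    return problems
```

An empty result means no cardinality constraint was violated; it says nothing about whether the values themselves are well formed.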
Any directories or filenames reserved by this profile are considered “Other Tag Files” per BagIt §2.2.4. Tag files defined by this profile SHOULD have entries in the tag manifests as specified by BagIt §2.2.1.
This profile reserves the directory META-INF/org.dataconservancy.packaging, its subdirectories, and any contained files, where the META-INF directory is a sibling of the data directory.
The directory META-INF/org.dataconservancy.packaging/ONT is reserved specifically to distribute ontologies that are referenced by the package description or bag payload.
The directory META-INF/org.dataconservancy.packaging/PKG-INFO is reserved specifically to distribute any files (such as ORE Resource Map(s)) that describe the structure of the bag.
The nature of the files and structure under the PKG-INFO and ONT directories will be part of another specification.
Bags conforming to this profile MUST name the single-file form of the bag (i.e. tar archive or zipped form) after the base directory of the bag.
This changes the SHOULD in BagIt §4 item #2 to a MUST.
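The naming rule can be sketched as a comparison between the archive file name, minus its extension, and the name of the bag base directory. The function name `archive_matches_base` and the list of recognized extensions are assumptions made for illustration; the profile names only tar and zipped forms, and `.tar.gz` and `.tgz` are included here as plausible variants. The comparison is assumed to be case-sensitive.

```python
# Assumption: these extensions identify the single-file form of a bag.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar", ".zip")


def archive_matches_base(archive_name: str, base_dir: str) -> bool:
    """Return True if the archive is named after the bag base directory."""
    for suffix in ARCHIVE_SUFFIXES:
        if archive_name.endswith(suffix):
            # Strip the extension and compare the remainder exactly.
            return archive_name[: -len(suffix)] == base_dir
    # No recognized archive extension.
    return False
```

Under these assumptions, a bag whose base directory is `mybag` would be serialized as `mybag.tar` or `mybag.zip`, and an archive named `other.zip` would fail the check.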