BDBDataFormat
cowtowncoder edited this page Feb 15, 2013 · 9 revisions
- All values are in Big-Endian format, that is, starting with the Most Significant Bytes (and bits)
- Variable-length integers are only used for lengths, and thus only need to support non-negative integers (removing the need for Zigzag encoding). Every byte carries 7 data bits; the sign bit (most significant bit) of a byte is used to denote the last byte.
- Hash codes are calculated using Murmur3/32 hash, with seed value of 0. Hash value of 0 must be masked as `1`, as `0` is used as the marker for "not available"
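
The conventions above can be sketched in a few lines of Python. This is an illustration only, not StoreMate's own code: the function names are hypothetical, the sketch assumes the last byte of a variable-length integer is the one with the sign bit (`0x80`) set, and it assumes "Murmur3/32" means the standard 32-bit (x86_32) MurmurHash3 variant. Hash values are shown as unsigned 32-bit integers.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def encode_vint(value: int) -> bytes:
    """Encode a non-negative integer, most significant 7-bit group first."""
    if value < 0:
        raise ValueError("only non-negative values are supported")
    # Assumption: the LAST byte is the one with the sign bit (0x80) set.
    groups = [0x80 | (value & 0x7F)]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    return bytes(reversed(groups))


def decode_vint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset just past the last byte of the integer)."""
    value = 0
    while True:
        b = buf[offset]
        offset += 1
        value = (value << 7) | (b & 0x7F)
        if b & 0x80:  # sign bit set: this was the last byte
            return value, offset


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def _mix_block(k: int) -> int:
    k = (k * 0xCC9E2D51) & MASK32
    k = _rotl32(k, 15)
    return (k * 0x1B873593) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Standard MurmurHash3 x86_32, returned as an unsigned 32-bit value."""
    h = seed & MASK32
    full = len(data) & ~3
    for i in range(0, full, 4):
        h ^= _mix_block(int.from_bytes(data[i:i + 4], "little"))
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[full:]
    if tail:
        h ^= _mix_block(int.from_bytes(tail, "little"))
    h ^= len(data)
    # Final avalanche step.
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def masked_hash(data: bytes) -> int:
    """Hash with seed 0; a result of 0 is stored as 1 (0 means 'not available')."""
    h = murmur3_32(data, 0)
    return h if h != 0 else 1
```

With seed 0, empty input hashes to 0 in this algorithm, so `masked_hash(b"")` returns 1; that is exactly the case the masking rule exists for.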
Entry metadata is stored in "raw" format with a simple structure.
The structure can be thought of as consisting of multiple sections.
This section has fixed offsets and is unlikely to change between data format versions.
- #0-#7: `long` "lastMod"; last modified timestamp (used for secondary index)
- #8-#11: Status section
  - #8: `byte` "version"; data entry version, hard-coded to 0x11 for the current version (other values reserved for future compatibility needs)
  - #9: `byte` "status"; entry status flags:
    - 0x01: soft-deleted? (active vs tombstone)
    - 0x02: is-replicated? (0->primary, not replicated; 1->secondary, created by replication)
    - others (0x04 - 0x80) reserved for future use; should be left 0
  - #10: `byte` "compression"; compression method, with allowed values of:
    - `0`: no compression ("identity")
    - `1`: LZF (https://github.com/ning/compress)
    - `2`: GZIP (or to be precise, "deflate")
  - #11: `byte` "externalPathLength"; 8-bit unsigned length of external storage path; 0 for inlined storage
- #12-#15: `int` "contentHash"; hash code over uncompressed content
Currently this section only contains data if entry is compressed:
- #16-#19: `int` "compressedHash"; hash code over compressed data -- only included if compression is used (i.e. compression value is NOT `0`)
- #20...: `vlong` "originalLength"; original (uncompressed) length of data -- only included if compression is used
This section contains metadata used by the application that uses `StoreMate`: it is simply stored and exposed as-is, without modifications or semantics for StoreMate itself.
- #?: `vint` "metadataLength"; length in bytes of opaque metadata
- #?: `byte[metadataLength]` opaque metadata itself
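
The length-prefixed layout above could be written and read back roughly as follows. The helper names are hypothetical, and the sketch assumes the last byte of the `vint` is the one with the sign bit set.

```python
from typing import Tuple


def encode_vint(value: int) -> bytes:
    """7 data bits per byte, most significant group first, 0x80 on last byte."""
    if value < 0:
        raise ValueError("lengths cannot be negative")
    groups = [0x80 | (value & 0x7F)]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    return bytes(reversed(groups))


def decode_vint(buf: bytes, offset: int) -> Tuple[int, int]:
    value = 0
    while True:
        b = buf[offset]
        offset += 1
        value = (value << 7) | (b & 0x7F)
        if b & 0x80:
            return value, offset


def write_metadata_section(metadata: bytes) -> bytes:
    """metadataLength followed by the opaque bytes, unchanged."""
    return encode_vint(len(metadata)) + metadata


def read_metadata_section(buf: bytes, offset: int) -> Tuple[bytes, int]:
    """Return (opaque metadata, offset of the next section)."""
    length, offset = decode_vint(buf, offset)
    end = offset + length
    if end > len(buf):
        raise ValueError("metadata section is truncated")
    return buf[offset:end], end


section = write_metadata_section(b"app-data")
metadata, next_offset = read_metadata_section(section, 0)
```

An entry without application metadata still carries the length prefix; in this sketch it is the single byte `0x80`.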
This section contains either:
- Inlined entry data (for small entries; threshold configurable), OR
- External path (ASCII String) for larger entries
Either way, it starts with:
- #?: `vlong` "storageLength"; length of stored data, in bytes; either length of storage file, or number of inlined bytes.

It then continues, depending on the value of "externalPathLength":

if "externalPathLength" is 0 ('no external data'):
- #?: `byte[storageLength]` "inlinedData"; actual inlined data

if "externalPathLength" is greater than 0:
- N+x: `byte[externalPathLength]` "externalPath"; relative filename (ASCII chars only) of the data file that contains payload bytes