BDBDataFormat
cowtowncoder edited this page Aug 29, 2012 · 9 revisions
- All values are in Big-Endian format, that is, starting with the Most Significant Bytes (and bits)
- Variable-length integers (VInts) are used only for lengths, so they only need to represent non-negative values (which removes the need for Zigzag encoding). Every byte carries 7 data bits, and the sign bit (the most significant bit of a byte) is used to denote the last byte.
- Hash codes are calculated using the Murmur3/32 hash, with a seed value of 0. A hash value of 0 must be masked as `1`, as `0` is used as the marker for "not available"
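
The VInt and hash-masking rules above can be sketched as follows. This is a minimal illustration, not StoreMate's actual code: the class and method names are hypothetical, and it assumes that the sign bit is *set* (rather than cleared) on the last byte, which the text implies but does not state outright.

```java
final class VIntSketch {
    /**
     * Encodes a non-negative int using 7 data bits per byte, most significant
     * group first. ASSUMPTION: the high ("sign") bit is set on the last byte
     * only and is clear on all preceding bytes.
     */
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("only non-negative values are supported");
        }
        // Count how many 7-bit groups are needed (always at least one).
        int groups = 1;
        for (int v = value >>> 7; v != 0; v >>>= 7) {
            groups++;
        }
        byte[] out = new byte[groups];
        // Fill from the end so that the most significant group comes first.
        for (int i = groups - 1; i >= 0; i--) {
            out[i] = (byte) (value & 0x7F);
            value >>>= 7;
        }
        out[groups - 1] |= (byte) 0x80; // mark the last byte
        return out;
    }

    /** Decodes a VInt starting at the given offset (no overflow checking). */
    static int decode(byte[] buf, int offset) {
        int result = 0;
        for (int i = offset; i < buf.length; i++) {
            int b = buf[i]; // sign-extended, so b < 0 means the high bit is set
            result = (result << 7) | (b & 0x7F);
            if (b < 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("truncated VInt");
    }

    /** Replaces a computed hash of 0 with 1, since 0 means "not available". */
    static int maskHash(int hash) {
        return (hash == 0) ? 1 : hash;
    }
}
```

Under these assumptions, a length of 128 would be written as the two bytes `0x01 0x80`, and any length below 128 fits in a single byte.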
Entry metadata is stored in "raw" format with a simple structure, which can be thought of as consisting of multiple sections.
- 0-7: `long` Last-modified timestamp (used for the secondary index)
- 8-11: Status section
  - 8: Version number: hard-coded to 0x11 for the current version; reserved for future compatibility needs
  - 9: Entry status, with allowed values of:
    - `0`: active entry
    - `1`: soft-deleted entry ("tombstone")
  - 10: Compression method, with allowed values of:
    - `0`: no compression ("identity")
    - `1`: LZF (https://github.com/ning/compress)
    - `2`: GZIP (or, to be precise, "deflate")
  - 11: 8-bit unsigned length of the external storage path, or 0 for inlined storage
- 12-15: `int` Hash code calculated over uncompressed content
- 16-19: `int` Hash code over compressed data; only included if compression is used (i.e. the compression value is NOT `0`)
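
To make the fixed layout above concrete, here is a small sketch that reads these fields with `java.nio.ByteBuffer`, which is big-endian by default. The class and field names are hypothetical, not taken from StoreMate. As a further assumption, `compressedHash` is reported as 0 (the "not available" marker) when the entry is not compressed.

```java
import java.nio.ByteBuffer;

final class EntryHeaderSketch {
    final long lastModified;      // bytes 0-7
    final int version;            // byte 8, expected to be 0x11
    final int status;             // byte 9: 0 = active, 1 = soft-deleted
    final int compression;        // byte 10: 0 = none, 1 = LZF, 2 = GZIP/deflate
    final int externalPathLength; // byte 11: 0 means inlined storage
    final int contentHash;        // bytes 12-15
    final int compressedHash;     // bytes 16-19, only present when compressed
    final int length;             // 16 or 20: offset where the next section starts

    private EntryHeaderSketch(ByteBuffer buf) {
        lastModified = buf.getLong();
        version = buf.get() & 0xFF;
        status = buf.get() & 0xFF;
        compression = buf.get() & 0xFF;
        externalPathLength = buf.get() & 0xFF; // 8-bit unsigned
        contentHash = buf.getInt();
        // ASSUMPTION: use 0 ("not available") when the field is absent.
        compressedHash = (compression != 0) ? buf.getInt() : 0;
        length = buf.position();
    }

    /** Parses the fixed part of the entry metadata from the start of the array. */
    static EntryHeaderSketch parse(byte[] raw) {
        return new EntryHeaderSketch(ByteBuffer.wrap(raw));
    }
}
```

The `length` field then tells a caller whether the next section begins at offset 16 or at offset 20.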
This section contains metadata used by the application that uses StoreMate: it is simply stored and exposed as-is, without modifications, and has no semantics for StoreMate itself.
- 16/20 (depending on preceding field(s)): `vint` Length indicator ("metadataLength")
- N: `byte[metadataLength]` Metadata
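
Reading this length-prefixed section could look roughly like the sketch below. The names are hypothetical, and the inline VInt decoding assumes that the sign bit is set on the last length byte, with 7 data bits per byte and the most significant group first.

```java
import java.util.Arrays;

final class CustomMetadataSketch {
    final byte[] metadata; // opaque application metadata, returned as-is
    final int endOffset;   // offset of the first byte after this section

    private CustomMetadataSketch(byte[] metadata, int endOffset) {
        this.metadata = metadata;
        this.endOffset = endOffset;
    }

    /**
     * Reads the section starting at the given offset (16 or 20, depending on
     * whether the compressed-data hash is present).
     */
    static CustomMetadataSketch read(byte[] raw, int offset) {
        // Decode "metadataLength". ASSUMPTION: high bit set marks the last byte.
        int len = 0;
        int b;
        do {
            b = raw[offset++];
            len = (len << 7) | (b & 0x7F);
        } while (b >= 0);
        if (offset + len > raw.length) {
            throw new IllegalArgumentException("metadata extends past end of entry");
        }
        byte[] md = Arrays.copyOfRange(raw, offset, offset + len);
        return new CustomMetadataSketch(md, offset + len);
    }
}
```

The returned `endOffset` is where the final section (inlined data or external path) would begin.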
This section contains either:
- Inlined entry data (for small entries; threshold configurable), OR
- External path (ASCII String) for larger entries
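
A reader could distinguish the two cases using the external path length from byte 11 of the status section, as in the sketch below. The names are hypothetical. The sketch also assumes that inlined data simply runs to the end of the stored record, which this page does not state explicitly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PayloadSketch {
    /** A path length of 0 (byte 11 of the status section) means inlined storage. */
    static boolean isInlined(int externalPathLength) {
        return externalPathLength == 0;
    }

    /** Decodes the external path as ASCII, using the length from byte 11. */
    static String externalPath(byte[] raw, int payloadOffset, int externalPathLength) {
        if (externalPathLength == 0) {
            throw new IllegalStateException("entry is inlined; no external path");
        }
        return new String(raw, payloadOffset, externalPathLength, StandardCharsets.US_ASCII);
    }

    /** ASSUMPTION: inlined data occupies everything from payloadOffset to the end. */
    static byte[] inlinedData(byte[] raw, int payloadOffset) {
        return Arrays.copyOfRange(raw, payloadOffset, raw.length);
    }
}
```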