Home
Storage implementation consists of two parts:
- A local database, exposed as StorableStore (and implemented by StorableStoreImpl), which delegates to a backend implementation (StoreBackend). All entry metadata is stored here, along with small inlined data.
  - Currently two backend implementations exist:
    - BDB-JE (see the storemate-backend-bdb-je sub-module)
    - LevelDB/Java (see the storemate-backend-leveldb sub-module). NOTE: it should be simple to plug in JNI-accessible native storage too, but I haven't tested this setup.
- File system, where larger data entries ("blobs") are stored.
  - Directory structure is handled by FileManager; the amount of code is minimal and mostly consists of arranging files in balanced directories, using timestamps to create new directories as necessary.
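To make the directory-balancing idea concrete, here is a minimal sketch of timestamp-based directory allocation. It is not StoreMate's FileManager: the class name BlobPathSketch, the per-directory file limit, and the directory naming pattern are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch of balanced, timestamp-named blob directories (not the real FileManager). */
class BlobPathSketch {
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("HH-mm-ss").withZone(ZoneOffset.UTC);

    private final int maxFilesPerDir; // assumed balancing limit
    private String currentDir;        // relative path of the directory being filled
    private int filesInCurrentDir;
    private int dirCount;             // running suffix, keeps directory names unique

    BlobPathSketch(int maxFilesPerDir) {
        this.maxFilesPerDir = maxFilesPerDir;
    }

    /** Returns a relative path for the next blob, starting a new directory when needed. */
    synchronized String nextPath(Instant now, String fileName) {
        String day = DAY.format(now);
        boolean needNewDir = currentDir == null
                || !currentDir.startsWith(day)           // day changed
                || filesInCurrentDir >= maxFilesPerDir;  // current directory is full
        if (needNewDir) {
            // New directory is named after the time at which it was created
            currentDir = day + "/" + TIME.format(now) + "-" + (dirCount++);
            filesInCurrentDir = 0;
        }
        filesInCurrentDir++;
        return currentDir + "/" + fileName;
    }
}
```

The sketch only computes relative paths; creating the directories and writing the files is left out.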
Data to store is divided into three parts:
- Opaque key: no semantics are implied at the StoreMate level, and sorting is based on raw byte collation; additional semantics are usually defined by higher-level systems.
- Entry metadata
  - Standard entry metadata contains minimal state information, such as an is-soft-deleted flag, compression indicator, checksum for payload, and last-modified time.
  - Optional custom metadata is an opaque byte sequence (byte[]) exposed to higher-level systems: StoreMate simply stores it along with other metadata without using it for anything.
- Payload: actual data to store; small payloads are inlined in the database (size threshold is configurable, typically something like 2kB), while larger ones ("blobs") are stored on disk.
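The three-part layout above can be pictured with a small value class. This is only an illustration: EntrySketch, its field names, the fixed 2kB threshold, and the "at or below threshold means inlined" rule are assumptions, not StoreMate's actual classes or behavior.

```java
/** Hypothetical illustration of the key / metadata / payload split (not StoreMate's classes). */
final class EntrySketch {
    /** Assumed threshold, following the "something like 2kB" figure; configurable in the real store. */
    static final int INLINE_THRESHOLD_BYTES = 2 * 1024;

    // Part 1: opaque key, compared as raw bytes
    final byte[] key;
    // Part 2a: standard metadata
    final boolean softDeleted;
    final boolean compressed;
    final int payloadChecksum;
    final long lastModifiedMillis;
    // Part 2b: optional custom metadata, stored but never interpreted (may be null)
    final byte[] customMetadata;
    // Part 3: payload, either inlined in the database or referenced by file path
    final byte[] inlinedPayload; // null when stored as a blob
    final String blobPath;       // null when inlined

    private EntrySketch(byte[] key, boolean softDeleted, boolean compressed, int payloadChecksum,
                        long lastModifiedMillis, byte[] customMetadata,
                        byte[] inlinedPayload, String blobPath) {
        this.key = key;
        this.softDeleted = softDeleted;
        this.compressed = compressed;
        this.payloadChecksum = payloadChecksum;
        this.lastModifiedMillis = lastModifiedMillis;
        this.customMetadata = customMetadata;
        this.inlinedPayload = inlinedPayload;
        this.blobPath = blobPath;
    }

    /** Builds an entry, inlining the payload only if it fits under the assumed threshold. */
    static EntrySketch of(byte[] key, byte[] payload, int checksum, long lastModifiedMillis,
                          byte[] customMetadata, String blobPathIfLarge) {
        boolean inline = payload.length <= INLINE_THRESHOLD_BYTES;
        return new EntrySketch(key, false, false, checksum, lastModifiedMillis, customMetadata,
                inline ? payload : null,
                inline ? null : blobPathIfLarge);
    }

    boolean isInlined() {
        return inlinedPayload != null;
    }
}
```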
Payload is automatically compressed when stored (unless it is detected to be already compressed, or the client explicitly instructs not to compress), and automatically uncompressed when read unless the client indicates that it accepts compressed data. This is meant to allow convenient but customizable handling of compression over protocols like HTTP.
Currently two compression formats are supported: GZIP and LZF. GZIP is used for smaller entries, due to its higher CPU overhead; LZF is used for larger entries. LZF also supports efficient content skipping, which is important if Content Range (partial payload data access) is to be supported at a higher level.
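The compression rules described above might be sketched as follows. This is a hedged illustration, not StoreMate's code: CompressionSketch, the 16kB GZIP/LZF cutoff, and the magic-byte check for "already compressed" are assumptions. Only GZIP is actually performed here (via the JDK); LZF appears as a selection result only, since it needs a third-party library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of the store-time and read-time compression decisions. */
class CompressionSketch {
    enum Compression { NONE, GZIP, LZF }

    /** Assumed cutoff: entries at or below this size get GZIP, larger ones LZF. */
    static final int GZIP_MAX_BYTES = 16 * 1024;

    /** Detects the two-byte GZIP magic header (0x1F 0x8B). */
    static boolean looksGzipped(byte[] data) {
        return data.length >= 2 && (data[0] & 0xFF) == 0x1F && (data[1] & 0xFF) == 0x8B;
    }

    /** Picks which compression to apply on store; NONE means "store bytes as received". */
    static Compression choose(byte[] payload, boolean clientForbidsCompression) {
        if (clientForbidsCompression || looksGzipped(payload)) {
            return Compression.NONE;
        }
        return payload.length <= GZIP_MAX_BYTES ? Compression.GZIP : Compression.LZF;
    }

    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Read path: hand back stored bytes untouched if the client accepts GZIP, else uncompress. */
    static byte[] read(byte[] stored, Compression storedAs, boolean clientAcceptsGzip) {
        if (storedAs == Compression.GZIP && !clientAcceptsGzip) {
            return gunzip(stored);
        }
        return stored;
    }
}
```

Over HTTP, the clientAcceptsGzip flag would typically be derived from the request's Accept-Encoding header, which lets compressed entries be served without an uncompress/recompress round trip.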
Checksums are automatically calculated over payload, and used to guard against data corruption both on uploads (assuming client provides checksums to compare against) and when offering data for synchronization.
All checksums are calculated using the 32-bit Murmur3 (Murmur3/32) algorithm.
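For reference, a compact sketch of the 32-bit Murmur3 hash (the x86_32 variant of MurmurHash3) and an upload check built on it. The class and method names are hypothetical, and the seed StoreMate uses is not stated here, so the zero seed in matchesClientChecksum is an assumption.

```java
/** Sketch of MurmurHash3 x86_32; names and the default seed are assumptions. */
class Murmur3Sketch {
    static int hash32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int len = data.length;
        int blocks = len & ~3; // number of bytes covered by full 4-byte blocks

        // Body: mix each little-endian 4-byte block into the hash
        for (int i = 0; i < blocks; i += 4) {
            int k = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: remaining 1 to 3 bytes (cases intentionally fall through)
        int k = 0;
        switch (len & 3) {
            case 3:
                k ^= (data[blocks + 2] & 0xff) << 16;
            case 2:
                k ^= (data[blocks + 1] & 0xff) << 8;
            case 1:
                k ^= (data[blocks] & 0xff);
                k *= c1;
                k = Integer.rotateLeft(k, 15);
                k *= c2;
                h ^= k;
        }

        // Finalization: fold in the length and avalanche the bits
        h ^= len;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Upload guard: compares a client-supplied checksum with one computed locally (seed 0 assumed). */
    static boolean matchesClientChecksum(byte[] payload, int clientChecksum) {
        return hash32(payload, 0) == clientChecksum;
    }
}
```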
A single secondary ("last-modified") index is maintained. It was designed to allow for reliable and efficient Change List style node-to-node synchronization of content. Backends that natively support secondary indexes (BDB-JE) use that support; others (LevelDB) simply use another table and keep it in sync separately. Regardless, the library presents a unified view of atomic CRUD operations to the using application.
Basic CRUD (create, read, update, delete) operations are supported; as well as iteration over Key and Last-Modified orders.
Last-modified order can be used for change list traversal; and key order iteration for entry-range queries.
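The interplay of the primary key order and the last-modified index can be sketched with two in-memory sorted structures. This is an illustration of the idea only, not the StorableStore API: IndexedStoreSketch and its methods are hypothetical, and a single lock stands in for the backend's atomicity guarantees.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory sketch: primary map in raw-byte key order plus a last-modified index. */
class IndexedStoreSketch {
    /** Raw byte collation: bytes compared as unsigned values, shorter prefix sorts first. */
    static final Comparator<byte[]> RAW_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    };

    static final class Entry {
        final byte[] key;
        final long lastModified;
        final byte[] payload;

        Entry(byte[] key, long lastModified, byte[] payload) {
            this.key = key;
            this.lastModified = lastModified;
            this.payload = payload;
        }
    }

    private final TreeMap<byte[], Entry> primary = new TreeMap<>(RAW_BYTES);
    // Secondary index: ordered by timestamp, ties broken by key so entries stay distinct
    private final TreeSet<Entry> byLastModified = new TreeSet<>((x, y) -> {
        int c = Long.compare(x.lastModified, y.lastModified);
        return c != 0 ? c : RAW_BYTES.compare(x.key, y.key);
    });

    /** Create or update; both structures change under one lock so readers see a consistent view. */
    synchronized void put(byte[] key, byte[] payload, long lastModified) {
        Entry fresh = new Entry(key, lastModified, payload);
        Entry old = primary.put(key, fresh);
        if (old != null) byLastModified.remove(old);
        byLastModified.add(fresh);
    }

    synchronized byte[] get(byte[] key) {
        Entry e = primary.get(key);
        return e == null ? null : e.payload;
    }

    synchronized boolean delete(byte[] key) {
        Entry old = primary.remove(key);
        if (old == null) return false;
        byLastModified.remove(old);
        return true;
    }

    /** Key-order iteration: keys in [from, toExclusive), for entry-range queries. */
    synchronized List<byte[]> keysInRange(byte[] from, byte[] toExclusive) {
        return new ArrayList<>(primary.subMap(from, toExclusive).keySet());
    }

    /** Last-modified iteration: keys of entries changed at or after the timestamp, oldest first. */
    synchronized List<byte[]> keysModifiedSince(long since) {
        List<byte[]> out = new ArrayList<>();
        // The empty key sorts before every other key, so this probe is the lowest entry at 'since'
        for (Entry e : byLastModified.tailSet(new Entry(new byte[0], since, null))) {
            out.add(e.key);
        }
        return out;
    }
}
```

A synchronizing peer would remember the last timestamp it has seen and call keysModifiedSince with it to walk the change list.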
Javadocs:
Format definitions:
Projects that use StoreMate:
- ClusterMate is a framework for building distributed systems, and it uses StoreMate as its per-node storage layer.