Tensorizer v3 should extend encryption to include header information. This needs to be accounted for in the file metadata and the patterns we use to buffer and write the metadata.
Important things to consider:
Where do we put encryption parameters in the header?
My take: Indicate in the file flags if the headers are encrypted, and then place encryption info immediately before the metadata section if so. It likely shouldn't be bundled into the metadata section, so that the length of the metadata section can also be encrypted. File flags and other meta-metadata like hashes of the metadata section can be encrypted too.
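To make the proposal above concrete, here is a minimal sketch of what "flag in the file flags, encryption info immediately before the metadata section" could look like. Everything in it is an assumption for illustration: the magic bytes, the `FLAG_ENCRYPTED_HEADERS` bit, the field widths, and the `write_preamble` / `read_preamble` helpers are hypothetical and not the actual Tensorizer layout. It also keeps the flags in plaintext, since a reader has to learn that the headers are encrypted before it can decrypt anything else.

```python
import struct
from typing import Optional, Tuple

# Hypothetical layout, NOT the real Tensorizer format:
#   magic (8 bytes) | flags (u32, little-endian)
#   | [crypt info: u16 length + bytes, only if the flag is set]
#   | metadata section (encrypted, including its own length field)
MAGIC = b"|TZR-v3|"              # placeholder magic, 8 bytes
FLAG_ENCRYPTED_HEADERS = 1 << 0  # assumed flag bit


def write_preamble(crypt_info: Optional[bytes]) -> bytes:
    """Serialize everything that precedes the metadata section."""
    flags = FLAG_ENCRYPTED_HEADERS if crypt_info is not None else 0
    out = MAGIC + struct.pack("<I", flags)
    if crypt_info is not None:
        # The encryption info sits outside the metadata section, so the
        # metadata section's length can itself be encrypted.
        out += struct.pack("<H", len(crypt_info)) + crypt_info
    return out


def read_preamble(buf: bytes) -> Tuple[Optional[bytes], int]:
    """Return (crypt info or None, offset where the metadata section starts)."""
    if buf[: len(MAGIC)] != MAGIC:
        raise ValueError("bad magic")
    offset = len(MAGIC)
    (flags,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    if not flags & FLAG_ENCRYPTED_HEADERS:
        return None, offset
    (length,) = struct.unpack_from("<H", buf, offset)
    offset += 2
    return buf[offset : offset + length], offset + length
```

With this shape, an unencrypted file pays no extra bytes, and an encrypted file exposes only the flag and the encryption parameters before the first ciphertext byte.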
How does incremental writing interact with metadata encryption?
How many nonces and MACs are used when encrypting and decrypting the metadata section?
If a single nonce and MAC are used, then the metadata section can only be updated by rewriting the entire thing, because data can't be inserted into the middle of an encrypted stream (for the non-tensor-header metadata section), only appended at the end.
This might be cheap enough that we could do it anyway, though it feels like it could be limiting in the future.
If two are used (one for metadata entries, one for tensor headers), the metadata section could actually be written as two independent streams, only rewriting their MACs on each synchronization.
If multiple are used, then we need to handle size limits for the nonce & MAC list just as we handle size limits for the rest of the metadata section. Luckily, this fits pretty well, since the size taken up by nonces and MACs is directly proportional to the number of entries being described when using one encrypted segment per metadata entry.
Ideally, information like the choice of encryption method would be stored in a tagged and extensible format like tensor CryptInfo segments, which would allow changing this in order to make it more or less rigid in the future.
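As a rough illustration of the last two points, the sketch below encodes one tagged segment per metadata entry and shows that the nonce & MAC list grows linearly with the entry count. The tag value, the `tag | length | payload` framing, and the helper names are hypothetical, not the existing CryptInfo encoding. The 24-byte nonce and 16-byte MAC are the sizes used by XSalsa20-Poly1305, assumed here as the cipher.

```python
import os
import struct
from typing import Iterator, Tuple

NONCE_SIZE = 24                # XSalsa20-Poly1305 nonce size (assumed cipher)
MAC_SIZE = 16                  # Poly1305 tag size
TAG_XSALSA20_POLY1305 = 1      # hypothetical tag value
SEGMENT_HEADER = struct.Struct("<HI")  # tag (u16) | payload length (u32)


def pack_segment(tag: int, payload: bytes) -> bytes:
    """Frame a payload as tag | length | payload so readers can skip unknown tags."""
    return SEGMENT_HEADER.pack(tag, len(payload)) + payload


def unpack_segments(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tag, payload) pairs from a run of concatenated segments."""
    offset = 0
    while offset < len(buf):
        tag, length = SEGMENT_HEADER.unpack_from(buf, offset)
        offset += SEGMENT_HEADER.size
        yield tag, buf[offset : offset + length]
        offset += length


def crypt_list_size(n_entries: int) -> int:
    """Bytes needed for one nonce + MAC segment per metadata entry."""
    return n_entries * (SEGMENT_HEADER.size + NONCE_SIZE + MAC_SIZE)


# One segment per metadata entry; random bytes stand in for real nonces and MACs.
entries = [os.urandom(NONCE_SIZE + MAC_SIZE) for _ in range(3)]
crypt_list = b"".join(pack_segment(TAG_XSALSA20_POLY1305, e) for e in entries)
```

Under these assumptions each entry costs a fixed 46 bytes, so the same size budget that bounds the number of metadata entries also bounds the nonce & MAC list. Because each segment carries its own tag and length, a later version could introduce a different encryption method under a new tag without changing the framing.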
To what extent do we attempt to protect information about model "size"?
The filesize gives away most information about that regardless of what we encrypt, but should we set out to protect information about how many tensors are contained in a file?
The number of tensors could potentially be leaked by information like the length of the metadata section, or the spacing of padding between tensor data entries. The length of the metadata section can be determined by the padding at the end of the metadata section unless we scramble the padding bytes during encryption.
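The padding leak described above can be sketched as follows. The 4096-byte alignment and the `pad` / `guess_length` helpers are assumptions for illustration, and random bytes stand in for the encrypted metadata section. Since ciphertext is indistinguishable from random data, zero padding after it marks where the section ends, while scrambled padding blends in.

```python
import os

ALIGNMENT = 4096  # assumed alignment of the metadata section


def pad(section: bytes, scramble: bool) -> bytes:
    """Pad a section up to the next multiple of ALIGNMENT."""
    pad_len = -len(section) % ALIGNMENT
    # Zero padding is distinguishable from ciphertext; random padding is not.
    filler = os.urandom(pad_len) if scramble else bytes(pad_len)
    return section + filler


def guess_length(padded: bytes) -> int:
    """What an observer can infer: strip trailing zero bytes and measure."""
    return len(padded.rstrip(b"\x00"))


# Stand-in for an encrypted metadata section (forced to end in a nonzero byte
# so the zero-padded case is recovered exactly).
ciphertext = os.urandom(999) + b"\x01"

leaky = pad(ciphertext, scramble=False)
hidden = pad(ciphertext, scramble=True)
```

Here `guess_length(leaky)` recovers the true section length exactly, whereas `guess_length(hidden)` only yields an upper bound close to the padded size. This says nothing about the spacing of padding between tensor data entries, which would need separate treatment.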
from @Eta0 in #127 (review)