The multi-attr TileDB writer currently uses a different set of domains from the one calculated by the data model. The data model provides only the 'super-domains' (the minimum set of highest-dimensionality domains that enclose the maximum number of input datasets), whereas the multi-attr writer makes one domain for each unique set of dimensions and writes every dataset that matches that set of dimensions to it.
For example, taking the following datasets:
a --> [x, y, z, t]
b --> [x, y, t]
c --> [x, y, t]
d --> [x, y, t1]
e --> [x1, y1, z, t1]
f --> [x1, y1, t1]
g --> [x, y, z, t]
This is the set of domains that would be made by the data model:
domain_0 --> a, b, c, g
domain_1 --> e, f
domain_2 --> d
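One way to reproduce the data model grouping for this example is a greedy pass: visit datasets from most to fewest dimensions and attach each one to the first existing domain whose dimensions contain its own. This is only a sketch that happens to match the example; `assign_super_domains` and `DATASETS` are hypothetical names, and treating dimensions as unordered sets is an assumption, not necessarily what the data model does.

```python
# Hypothetical input: dataset name -> list of dimension names (from the example).
DATASETS = {
    "a": ["x", "y", "z", "t"],
    "b": ["x", "y", "t"],
    "c": ["x", "y", "t"],
    "d": ["x", "y", "t1"],
    "e": ["x1", "y1", "z", "t1"],
    "f": ["x1", "y1", "t1"],
    "g": ["x", "y", "z", "t"],
}


def assign_super_domains(datasets):
    """Greedy sketch: group datasets under enclosing 'super-domains'."""
    domains = []  # list of (frozenset of dimension names, [dataset names])
    # Highest-dimensionality datasets first; sorted() is stable, so ties keep input order.
    for name, dims in sorted(datasets.items(), key=lambda kv: -len(kv[1])):
        dim_set = frozenset(dims)
        for domain_dims, members in domains:
            if dim_set <= domain_dims:  # an existing domain encloses this dataset
                members.append(name)
                break
        else:
            # No enclosing domain exists, so this dataset starts a new one.
            domains.append((dim_set, [name]))
    return {f"domain_{i}": sorted(m) for i, (_, m) in enumerate(domains)}


super_domains = assign_super_domains(DATASETS)
for domain, members in super_domains.items():
    print(domain, "-->", ", ".join(members))
```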
And this is the set of domains that would be made by the multi-attr writer:
x,y,z,t --> a, g
x,y,t --> b, c
x,y,t1 --> d
x1,y1,z,t1 --> e
x1,y1,t1 --> f
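The multi-attr grouping amounts to keying each dataset on its exact, ordered tuple of dimensions. The sketch below illustrates that idea on the example datasets; `group_by_dimensions` and `DATASETS` are hypothetical names for illustration and are not taken from the writer's code.

```python
from collections import defaultdict

# Hypothetical input: dataset name -> list of dimension names (from the example).
DATASETS = {
    "a": ["x", "y", "z", "t"],
    "b": ["x", "y", "t"],
    "c": ["x", "y", "t"],
    "d": ["x", "y", "t1"],
    "e": ["x1", "y1", "z", "t1"],
    "f": ["x1", "y1", "t1"],
    "g": ["x", "y", "z", "t"],
}


def group_by_dimensions(datasets):
    """Make one group per unique dimension tuple, holding every matching dataset."""
    groups = defaultdict(list)
    for name, dims in datasets.items():
        groups[tuple(dims)].append(name)
    return dict(groups)


multi_attr_domains = group_by_dimensions(DATASETS)
for dims, members in multi_attr_domains.items():
    print(",".join(dims), "-->", ", ".join(members))
```

Keying on a tuple rather than a set keeps dimension order significant, so `[x, y, t]` and `[t, x, y]` would land in different groups; whether that is the desired behaviour is a design choice for the writer.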
At some point we should resolve this discrepancy. Assuming that multi-attr append goes in (see #19), the multi-attr case should become the default, and the data model's domain assignment algorithm should be updated to match what the multi-attr writer does.
Here's the TODO list:
Decide on a single writing strategy, potentially preferring multi-attr, as it seems to be the best approach for storing multiple data variables.
Unify the domain algorithm between the data model and the multi-attr writer.
Consolidate to a single writer.