-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🚚 Database migrations for lamindb 1.0a1 #2323
Conversation
…ription field, re-purpose Transform.key as the central semantic identifier, and rename Transform.name to Transform.description
The conversation you were looking for could not be found. |
I don't really understand the point, as we have |
This is something for the frontend; I shared the Claude discussion with you. After the whole exploration around the block model this will likely make more sense in |
I still don't see it. |
Ok, that is clearer to me now, but i am still not sure. If we have a path in |
|
Bumps
__version__
from0.78a1
to1.0a1
.Internal instances migrated:
laminlabs/lamindata
laminlabs/lamin-site-assets
laminlabs/lamin-dev
laminlabs/arrayloader-benchmarks
laminlabs/bionty-assets
Other changes
>=3.10
and Django to>=5
because we're usingdb_default
-- discussion here on GitHubSchema changes
_source_code_artifact
from Transform, it's been deprecated since 0.75 222931a_source_code_artifact
populated, populatesource_code
Transform.name
toTransform.description
because it behaves likeArtifact.description
and it's confusing because people expect versioning to happen based onname
5d68c5e Claude chatTransform
constructor usename
to populatekey
in all cases in which onlyname
is passedkey
in casesource_code is None
via._name_field = "key"
description
field that was never exposed on the constructor; to be safe, we concatenated potential data in it on the new description field Claude chatkey=None
andname!=None
, usename
to pre-populatekey
Collection.name
toCollection.key
for consistency with Artifact & Transform and the high likelihood of users wanting to organize them hierarchically 2bef93a_branch_code
integer on every record to model pull requests 483fb37visibility
within that codevisibility=0
as_branch_code=0
as "archive".aux
json field onRecord
: Claude threadrun._status_code
that allows to writefinished_at
in clean up operations so that there is a run time also for aborted runsRun.is_consecutive
toRun._is_consecutive
_template_id
FK to store the information of the generating template (whether a record is a template is coded via _branch_code)_accessor
tootype
to publicly declare the data format assuffix, accessor
Artifact.type
toArtifact.kind
(internal Slack thread) Claude chatrun._logfile
which holds logshash
field onParamValue
andFeatureValue
to enforce uniqueness without running the danger of failure for large dictionaries 9e4b503 Claude chatArtifact._curator: dict
field 068bcf7._expect_many
toFeature
/Param
that defaults toTrue
/False
and indicates whether values for this feature/param are expected to occur a single or multiple times for every single artifact/run 1cc1ca8True
(default), the values come from an observation-level aggregation and a dtype ofdatetime
on the observation-level meanset[datetime]
on the artifact-levelFalse
it's an artifact-level value anddatetime
meansdatetime
; this is an edge case because an arbitrary artifact would always be a set of arbitrary measurements that would need to be aggregated ("one just happens to measure a single cell line in that artifact")False
(default), the values mean artifact/run-level values anddatetime
meansdatetime
True
, the values would be from an aggregation, this seems like an edge case but say when characterizing a model ensemble trained with different parameters it could be relevant.transform
foreign key from artifact and collection for consistency with all other records; introduce a property and a simple filter statement instead (context: https://github.com/laminlabs/laminhub/issues/1498) a4e0fb2TransformULabel
,RunParamValue
,ArtifactParamValue
576d858Run.parent
toRun.initiated_by_run
9fbdd6f_overwrite_versions
, which indicates whether versions are overwritten or stored separately; it defaults toFalse
for file-like artifacts and toTrue
for folder-like artifacts - see Slack threadn_objects
ton_files
for more clarity laminlabs/lamindb-setup@eba36ba Claude chatSpace
registry to lamindb with an FK on everyBasicRecord
internal Notion page a25df16.type
CharField toFeature
andParam
so that features & params can be hierarchically grouped_previous_runs
field on everything except Artifact & Collection; we can model this better with events going forwardULabel
calledis_concept
to distinguish ontological concepts from the labels themselves; for instance, you would never want to label an artifact with a ulabelProject
, you'll only want to label with actual project valuesProject 1
,Project 2
, etc. 6bd7f62TODO in another PR:
Metric
andMetricValue
registry that parallelsParam
andParamValue
up to an FK torun
Feature
toDimension
andFeatureSet
toSchema
and let people group metrics through a feature type metricMaterials
Needs:
Affects:
Triggered these issues:
Did not move forward with these changes
Record._page_md
textfield so that every entity can hold markdown contentEvent
model with fieldsRecord._public
boolean that indicates whether a record is public Claude chatfeature_sets
is a public attribute and given that some people might want to query forfeature_values
orparam_values
outside the.features
and.params
managers; we should consider making_feature_values
and_param_values
public. This will likely also help people better understand the underlying mechanics if they're ever interested. Decided against because then there would be two ways to query by a feature or param. Everything is already supported through the lamindb-native API and by that considerably simpler than doing this with Django, in particular since ✨ Enable to easily join features onto artifacts viaArtifact.df()
#2238. Also see the run docs where nested queries are exemplified:Person
otherwise