Releases: eyadgaran/SimpleML

0.14.0

14 Jul 06:43
de5a3df

0.14.0 (2022-07-13)

  • Standardized formatting with Black
  • Split up ORM into a standalone swappable backend
  • Persistables maintain weakrefs for lineage
  • Persistables are now normal Python objects
  • Hashing flag to reject non-serializable objects
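
As a rough illustration of the weakref-based lineage mentioned above, here is a minimal sketch. The `Node` class and its `parent` property are hypothetical and are not SimpleML's API; the sketch only shows how a weak reference lets a child reach its parent without keeping the parent alive.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a persistable; not SimpleML's actual class."""

    def __init__(self, name, parent=None):
        self.name = name
        # Store a weak reference so the child does not keep the parent alive.
        self._parent_ref = weakref.ref(parent) if parent is not None else None

    @property
    def parent(self):
        # Dereference the weakref; yields None once the parent is collected.
        return self._parent_ref() if self._parent_ref is not None else None


dataset = Node("dataset")
pipeline = Node("pipeline", parent=dataset)
parent_name_while_alive = pipeline.parent.name

del dataset
gc.collect()  # make collection explicit on interpreters without refcounting
parent_after_delete = pipeline.parent
```

Once the only strong reference to the parent is dropped, the lookup returns `None`, so callers of such a design would need to handle a missing parent.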

Full Changelog: 0.13.0...0.14.0

0.13.0

29 Mar 06:01
94f466e
  • Path existence check for pandas serialization

Full Changelog: 0.12.0...0.13.0

0.12.0

03 Mar 08:33
6122a52
  • Changed internal dataset structure from mixins to direct inheritance
  • Condensed all pandas dataset types into a single base class
  • Adds support for dask datasets
  • Placeholders for additional dataset libraries
  • Adds hashing support for dask dataframes
  • Refactored persistence ("save_patterns") package into standalone extensible framework
  • Adds context manager support to registries for temporary overwrite
  • Refactor pipelines into library based subclasses

BREAKING CHANGES

  • Pandas dataset now defaults the squeeze_return parameter to False (classes that expect a Series to be returned will need to be updated)
  • Numpy dataset is considered unstable and will be redesigned in a future release
  • Onedrive, Hickle, and database save patterns are removed (the functionality is still available, but a composed pattern is no longer predefined; these can be trivially added in user code if needed)
  • Changed pandas hash output to int from numpy.int64 (due to breaking change in NumpyHasher)
  • Changed primitive deterministic hash from pickle to md5
  • Extracted data iterators into utility wrappers. Pipelines no longer have flags to return iterators
  • Random split defaults are computed at runtime instead of precalculated (affects hash)
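
To show what an md5-based deterministic hash for primitives can look like, here is a minimal sketch. The `md5_hash` helper and its byte encoding are assumptions made for illustration; the changelog does not state how SimpleML encodes values before hashing.

```python
import hashlib


def md5_hash(value):
    """Hypothetical helper: stable integer hash of a primitive via md5."""
    # Include the type name so that 1 and "1" do not collide.
    payload = f"{type(value).__name__}:{value!r}".encode("utf-8")
    # Convert the hex digest to a plain Python int.
    return int(hashlib.md5(payload).hexdigest(), 16)


first = md5_hash("train")
second = md5_hash("train")
```

Unlike the built-in `hash()`, which is salted per process for strings, a digest of this kind is the same across interpreter runs, which is what makes it usable for persisted hashes.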

Full Changelog: 0.11.0...0.12.0

0.11.0

10 Oct 21:34
e85f064
  • Added support to hasher for initialized objects
  • Adds support for arbitrary dataset splits and sections
  • Dataset hooks to validate dataframe setting
  • Pipelines no longer cache dataset splits and proxy directly to dataset on every call
  • Introduces pipeline splits as reproducible projections over dataset splits
  • Database utility to recalculate hashes for existing persistables

BREAKING CHANGES

  • Hash for an uninitialized class changed from repr(cls) to "cls.__module__.cls.__name__"
  • Database migrations no longer recalculate hashes. That has to be done manually via a utility
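
The difference between the two class identifiers can be seen in a short sketch, assuming the new form is built from the standard `__module__` and `__name__` attributes. `Example` and `class_identifier` are hypothetical names, not part of SimpleML.

```python
class Example:
    """Hypothetical class used only to show the two string forms."""


def class_identifier(cls):
    # Dotted path built from the class's module and name.
    return f"{cls.__module__}.{cls.__name__}"


old_style = repr(Example)  # wrapped form, e.g. "<class '...Example'>"
new_style = class_identifier(Example)  # plain dotted path
```

Because the two strings differ, any hash derived from them differs as well, which is consistent with existing persistables needing their hashes recalculated through the utility.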

0.10.0

10 Jul 01:37
d1ad214
  • Dataset external file setter with validation hooks
  • Pandas changes to always return dataframe copies (does not extend to underlying Python objects, e.g. lists, objects, etc.)
  • Pandas Dataset Subclasses for Single and Multi label datasets
  • PersistableLoader methods do not require name as a parameter

BREAKING CHANGES

  • PandasDataset is deprecated and will be dropped in a future release. Use SingleLabelPandasDataset or MultiLabelPandasDataset instead
  • Pandas Dataset Classes require dataframe objects of type pd.DataFrame and will validate input (containers of pd.DataFrames are no longer supported)

0.9.3

04 Apr 22:20
93485d4
  • Patch release to support breaking changes in SQLAlchemy 1.4

0.9.2

27 Jan 07:31
6cbbb9a
  • Minor patches

0.9.1

28 Dec 04:50
4810e45
  • Added CLI with Alembic support

0.9.0

30 Nov 02:41
2ca057e
  • Refactored save patterns. Supports multiple concurrent save locations and arbitrary artifact declaration
  • Registry centric model for easier extension and third party contrib
  • Support for in-memory SQLite database
  • Changed database JSON mapping class and dependency to support mutability tracking
  • New import wrapper class to manage optional dependencies
  • Added dataset_id as a Metric reference. Breaking workflow change! Will raise an error if a dataset is not added and the metric depends on it
  • Dropped default Train pipeline split. Will return an empty split for split pipelines and a singleton full dataset split for NoSplitPipelines
  • Explicitly migrated to tensorflow 2 and tf.keras
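
An import wrapper for optional dependencies, as mentioned above, could be sketched as a lazy proxy. The `OptionalImport` class is hypothetical and is not SimpleML's actual wrapper; it only illustrates deferring the import until the dependency is first used.

```python
import importlib


class OptionalImport:
    """Hypothetical lazy wrapper; not SimpleML's actual import class."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None

    def __getattr__(self, attribute):
        # Only called for attributes not found on the wrapper itself, so the
        # real import is deferred until the dependency is actually used.
        if self._module is None:
            try:
                self._module = importlib.import_module(self._module_name)
            except ImportError as error:
                raise ImportError(
                    f"Optional dependency '{self._module_name}' is not installed"
                ) from error
        return getattr(self._module, attribute)


json_module = OptionalImport("json")  # stdlib module, always available
encoded = json_module.dumps({"a": 1})

missing = OptionalImport("surely_not_a_real_package_xyz")  # creating it never fails
try:
    missing.anything
    import_failed = False
except ImportError:
    import_failed = True
```

With this shape, a package can reference optional libraries at module level and only fail, with a clear message, when a code path that needs them is exercised.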

0.8.1

12 May 04:27
f5889b4
  • Minor patches for classification metrics