Releases: modin-project/modin

Modin 0.11.3

09 Nov 23:42
0.11.3
587861c
This release contains some bugfixes.

Key Features and Updates
------------------------
* Stability and bugfixes
  * DataFrame window functions: fix internal and external indices mismatch error (11d06b3)
  * read_csv and read_fwf: fix support for certain unicode encodings (7b11994)
* Dependencies
  * Pin pyparsing <=2.4.7 to fix dfsql (6755ffc)
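For the read_csv encoding fix listed above, here is a minimal sketch of the kind of call involved. It uses plain pandas so that it runs without an execution engine; with Modin installed the import would be `import modin.pandas as pd`. The helper name `read_utf16_csv`, the file contents, and the choice of UTF-16 are illustrative assumptions, since the release note does not say which encodings were affected.

```python
import os
import tempfile

import pandas as pd  # with Modin installed: import modin.pandas as pd


def read_utf16_csv(path):
    """Read a CSV file that was saved as UTF-16 (hypothetical helper)."""
    # The encoding has to be passed explicitly; the default is UTF-8.
    return pd.read_csv(path, encoding="utf-16")


# Write a small UTF-16 file containing non-ASCII text, then read it back.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "cities.csv")
with open(csv_path, "w", encoding="utf-16", newline="") as f:
    f.write("city,value\nZürich,1\n東京,2\n")

cities = read_utf16_csv(csv_path)
print(cities)
```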

Contributors this release
-------------------------

The following users contributed code to Modin since the last release:
@amyskov
@YarShev
@anmyachev
@vnlitvinov

Modin 0.11.2

21 Oct 19:58
0.11.2
18ffe07
This release contains a pandas update and a few bugfixes.

Key Features and Updates
------------------------
* Update to pandas 1.3.4 (de8a782)
* Stability and bugfixes
  * read_feather: fix relative path processing (2c106f3)
  * read_csv: fix support for squeeze=True (648a5da)
  * Series.apply: fix support for some kinds of functions (66729ec)
* Dependencies
  * Pin OmniSci to <=5.7.1 in testing (77d3588)

Contributors this release
-------------------------

The following users contributed code to Modin since the last release:
@YarShev
@dchigarev
@devin-petersohn
@naren-ponder
@prutskov
@vnlitvinov

Modin 0.11.1

06 Oct 20:16
0.11.1
9c386b0
This release contains significant improvements to the
maintainability of the code.

Key Features and Updates
------------------------
* Stability and bugfixes
  * read_feather: cast columns from Index to list (bb618df)
  * Refactor read_csv skiprows parameter processing (6a47229)
  * Fix usage of modin.HDFStore in modin.read_hdf func (15d3fba)
  * Always keep 'by' data in groupby.__getitem__ (c2b399a)
* Pandas API implementations and improvements
  * pandas.read_gbq: remove deprecated parameters (13e0af0)
* Expansion in testing
* Dependencies
  * Unpin boto3 in setup.py (aee31ba)
* OmniSci backend enhancements
  * Fix Arrow execution for empty frame (b9a22cc)

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@prutskov
@gshimansky
@ienkovich
@anmyachev
@devin-petersohn

Modin 0.11.0

28 Sep 19:58
0.11.0
c3b8d7e
This release contains significant improvements to the
maintainability of the code, along with bugfixes. It also
extends coverage of the pandas API in several places.

Key Features and Updates
------------------------
* Stability and bugfixes
  * Fix __setitem__ when key is unhashable list (57bcfc1)
  * Fix slice_shift when index has duplicates (cf47333)
* Pandas API implementations and improvements
  * Experimental Ray implementation of read_pickle and to_pickle (cdf47ac)
  * Add support for Series.str.__getitem__ (7631def)
  * Add skiprows support for read_csv (b8098bd)
  * Support local variables in query and eval (0a64275)
  * Add storage_options param for read_parquet (da2ad79)
  * Fixed 'value_counts' implementation (ebd07dd)
  * Warn user about heterogeneous data presence during read_csv (15f168c)
  * Add attribute api in modin.pandas (9865616)
* XGBoost enhancements
  * Add async execution support for Modin xgb.predict (ab58612)
  * Fix processing of evals parameter in Modin xgb (9bef861)
* Developer API enhancements
  * Add additional parameters for from_partitions (ca43e8d)
  * Add a way to get custom shapes (984e68f)
  * Use MODIN_MEMORY to specify memory for Dask engine (8a3b105)
* Expansion in testing
* Documentation improvements
* pandas 1.3.3 support (f91ee0a)
* OmniSci backend enhancements
  * Add value_counts benchmark for OmniSci backend (600fa26)
  * Get rid of Ray when using OmniSci engine (ac8e2d1)
  * Support columns renaming in Arrow execution (3079783)
  * Support logical 'and' and 'or' in filters (ecaab1b)
  * Fix dtypes for OmniSci dataframes (02a34eb)
  * Update Arrow to 3.0 for OmniSci backend (411b322)
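Two of the pandas API items above, local variables in query and Series.str.__getitem__, are easiest to see in a short example. The sketch below uses plain pandas so that it runs without an execution engine; with Modin installed the import would be `import modin.pandas as pd`. The helper name `rows_above` and the sample data are hypothetical.

```python
import pandas as pd  # with Modin installed: import modin.pandas as pd


def rows_above(df, threshold):
    """Return rows whose score exceeds a local variable (hypothetical helper)."""
    # "@threshold" refers to the Python variable named threshold in this scope.
    return df.query("score > @threshold")


df = pd.DataFrame({"name": ["alpha", "beta", "gamma"], "score": [1, 5, 9]})

high = rows_above(df, 4)
# Indexing the .str accessor picks a character (or slice) from every string.
initials = df["name"].str[0]

print(high)
print(initials.tolist())
```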

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@YarShev
@anmyachev
@dchigarev
@vnlitvinov
@gshimansky
@prutskov
@amyskov
@krfricke
@fexolm
@devin-petersohn

Modin 0.10.2

21 Aug 01:25
0.10.2
0bc409d
This release contains minor bugfixes since 0.10.1. The supported pandas version was
upgraded to the latest pandas release (1.3.2). For a detailed breakdown of the bugs
fixed, please look at the changelog.

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@YarShev
@anmyachev
@dchigarev
@vnlitvinov
@ienkovich
@prutskov
@amyskov
@Lozovskii-Aleksandr
@Garra1980
@Rubtsowa
@alexlenail
@devin-petersohn

Modin 0.10.1

13 Jul 01:38
0.10.1
50457cc
This release contains minor bugfixes since 0.10.0.

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@YarShev
@ckw017
@anmyachev
@dchigarev
@vnlitvinov
@gshimansky
@prutskov
@amyskov
@Rubtsowa
@devin-petersohn

Modin 0.10.0

10 Jun 01:15
0.10.0
cf259f1
This release contains significant improvements to the
maintainability of the code. Documentation was added for the
low-level code at many levels.

This release also adds two major interfaces: Spreadsheet and SQL.

Key Features and Updates
------------------------
* Many Documentation updates
* Stability and bugfixes
* Spreadsheet Interface
* SQL Interface
* Ray 1.4 support
* pandas 1.2.4 support
* Performance improvements
  * Improvements for XGBoost
  * Some groupby calls
  * map operations
* Metadata management improvements
* Improvements to Testing and CI
* pandas API enhancements
  * fillna
* OmniSci backend enhancements

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@YarShev
@anmyachev
@dchigarev
@krfricke
@tkeech1
@vnlitvinov
@btseytlin
@gshimansky
@prutskov
@todd-yu
@kvu35
@amyskov
@richardlin047
@igalink
@devin-petersohn

Modin 0.9.1

16 Mar 02:04
0.9.1
21707d2
Modin 0.9.1 Release Notes

This release contains a number of bugfixes. We recommend that all
users update to the latest version.

Key Bugfixes
------------
* FIX-#2798: Fix number of partitions for dataframe on a cluster (#2828)
* FIX-#2859: Fix metadata calculation on reduce operations (#2860)
* FIX-#2857: Correctly handle identical index binary operations (#2862)
* FIX-#2869: Fix setting NPartitions via put (#2870)
* Fix Pickle support for DataFrame and Series (#2835)
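As an illustration of the pickle fix listed above, the following sketch round-trips a DataFrame and a Series through the standard `pickle` module. It uses plain pandas so that it runs without an execution engine; with Modin installed the import would be `import modin.pandas as pd`. The sample data is made up for the example.

```python
import pickle

import pandas as pd  # with Modin installed: import modin.pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
series = df["a"]

# Serialize to bytes and load back; the restored objects should match the originals.
restored_df = pickle.loads(pickle.dumps(df))
restored_series = pickle.loads(pickle.dumps(series))

print(restored_df.equals(df), restored_series.equals(series))
```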

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@mGalarnyk
@YarShev
@anmyachev
@gshimansky
@prutskov
@dchigarev
@RehanSD
@devin-petersohn

Thank you!

Modin 0.9.0

04 Mar 23:06
0.9.0
be33e95
Modin 0.9.0 Release Notes

This release contained >80 commits from 18 contributors. With this
release we have initial support for a spreadsheet interface and many
improvements to performance and stability.

New Functionality
-----------------
* Spreadsheet Interface
* XGBoost Support Improvement
* Read multiple CSV files at once with `read_csv_glob`
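To show what reading several CSV files at once means in practice, here is a rough emulation in plain pandas. It is not Modin's implementation: in Modin the call is, to the best of my knowledge, `read_csv_glob` in `modin.experimental.pandas`, and that module location is an assumption not stated in these notes. The helper name `read_csv_files`, the file names, and the decision to reset the index are all choices made for this sketch.

```python
import glob
import os
import tempfile

import pandas as pd


def read_csv_files(pattern):
    """Read every CSV matching a glob pattern into one frame (hypothetical helper)."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # Resetting the index is a choice made for this sketch.
    return pd.concat([pd.read_csv(p) for p in paths], ignore_index=True)


# Create three small CSV files that share the same columns.
tmp_dir = tempfile.mkdtemp()
for i in range(3):
    with open(os.path.join(tmp_dir, f"part_{i}.csv"), "w") as f:
        f.write(f"id,value\n{i},{i * 10}\n")

combined = read_csv_files(os.path.join(tmp_dir, "part_*.csv"))
print(combined)
```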

Key Bugfixes
------------
* Parquet Metadata issue fixed: #1476

Documentation
-------------
* Documentation enhancements and improvements. More to come soon!

Dependencies
------------
* Support for pandas 1.2+

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@gshimansky
@mzjp2
@kvu35
@tirkarthi
@abykovsk
@noah-kuo
@amyskov
@RehanSD
@williamma12
@alphavector
@richardlin047
@todd-yu
@anmyachev
@dchigarev
@vnlitvinov
@YarShev
@prutskov
@devin-petersohn

Modin 0.8.3

12 Jan 14:19
0.8.3
bcab1cc
Modin 0.8.3 release notes

This release contains a number of bugfixes and testing/code quality improvements. See the details below for the updates since the last release.

Bugfixes + Pandas Concordance (🐛 + 🐼)
----------------------------------------
* FIX-#2386: add new location for import ray functions (#2387)
* FIX-#2380: don't ignore lengths parameter for dask engine (#2381)
* FIX-#2390: Fix inserting Series into DataFrame (#2391)
* FIX-#2200: Enable Calcite by default in OmniSci backend (#2385)
* FEAT-#2363: fix index name setter in OmniSci backend (#2379)
* FIX-#2406: filter dictionary aggregation keys to limit them to keys only present in current partition (#2407)
* FIX-#2473: Some configuration values should not be transformed (#2476)
* FIX-#2402: Fix read_excel when files come from older windows (#2403)
* Ensure excel reader closes file if it is passed as path (#2514)
* FIX-#2442: fixed Series assignment with different indices (#2443)
* Fix indices when reading Excel files in parallel (#2526)
* FIX-#2527: Use random name for hdf file test, clean file after testing (#2528)
* FIX-#2408: Fix read_csv and read_table args when used inside a decora… (#2486)
* Fix .loc[] assignment for Modin Series (#2555)
* FIX-#2482: improved handling non-str 'by' (#2548)
* Fix loc/iloc assignments when columns are selected (#2536)
* FIX-#2559: Ignore files from /proc/ when detecting file leaks (#2560)
* FIX-#2566: Ensure `Series.unique` does not return a scalar when there is only one unique value (#2567)
* FIX-#2543: fixed handling 'as_index' at groupby dictionary renaming aggregation (#2592)

New Functionality ✨
--------------------
* FEAT-#2375: implementation of multi-column groupby aggregation (#2461)
* FEAT-#2013: merge_asof that is a little more efficient (#2510)
* FIX-#2540: add __iter__ implementation (#2541)
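For readers unfamiliar with merge_asof, mentioned in the list above, the sketch below shows what the operation computes. It uses plain pandas so that it runs without an execution engine; with Modin installed the import would be `import modin.pandas as pd`. The `trades` and `quotes` frames are made-up sample data.

```python
import pandas as pd  # with Modin installed: import modin.pandas as pd

trades = pd.DataFrame({"time": [1, 5, 10], "qty": [100, 200, 300]})
quotes = pd.DataFrame({"time": [0, 4, 9], "price": [10.0, 11.0, 12.0]})

# Both frames must be sorted by the key. With the default direction,
# each trade is matched to the latest quote at or before its time.
matched = pd.merge_asof(trades, quotes, on="time")
print(matched)
```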

Code Quality + Testing 💯
-------------------------
* TEST-#2289: Columns, Index Locations and Names parameters of read_csv (#2319)
* REFACTOR-#2397: remove redundant assignment (#2398)
* FIX-#2450: fix CI recipe (#2449)
* FEAT-#2444: add docker file for nyc on omnisci (#2445)
* FIX-#2456: update taxi queries with .copy usage (#2457)
* FEAT-#2447: add docker file for census on omnisci (#2448)
* REFACTOR-#2467: Convert internal base dataframe objects to ABC (#2468)
* FIX-#2459: Updated TeamCity tests image to use Ray as base image (#2460)
* TEST-#2488: Increase commitlint message length limit to 88 characters from 70 (#2489)
* TEST-#2290: Cover by tests General Parsing Configuration parameters of read_csv (#2331)
* TEST-#2291: Cover by tests NA and Missing Data Handling parameters of read_csv (#2337)
* TEST-#2294: add iteration parameters for read_csv tests (#2477)
* FIX-#2463: Added test with callable functions as aggregate argument (#2503)
* TEST-#2296: Error Handling parameters of read_csv (#2501)
* TEST-#2295: Cover by tests Quoting, Compression, and File Format parameters of read_csv (#2495)
* FIX-#2374: remove extra code; add pandas way to handle duplicate values in reindex func for binary operations (#2378)
* TEST-#2297: Cover by tests Internal parameters of read_csv (#2502)
* TEST-#2509: Io tests refactoring (#2523)
* FIX-#2550: remove decorators usage for asv tested functions (#2551)

Backend enhancements + Performance 🚀
-------------------------------------
* FIX-#2453: Remove sorting indices for equal values in `Series.value_counts` (#2454)
* FIX-#2169: avoid unnecessary index access in groupby (#2469)
* FIX-#2313: improved handling non-numeric types at 'mean' when 'axis=1' (#2535)
* FEAT-#2520: add most important operations for asv benchmarks (#2539)
* FEAT-#2491: optimized groupby dictionary aggregation (#2534)
* FEAT-#2553: add ability to run microbenchmarks for old Modin version (#2554)

Documentation 📃
----------------
* DOCS-#2413: Add examples page to documentation (#2414)
* DOCS-#2415: Add comparisons section to documentation with stubs (#2416)
* DOCS-#2417: add sklearn example (#2425)
* DOCS-#2421: Fixes bad link on contributing from architecture.rst (#2427)
* DOCS-#2419: Updated CONTRIBUTING.rst (#2423)
* DOCS-#2426,DOCS-#2424: Fixed two issues (#2431)
* DOCS-#2420: Changed documentation to numpydoc style (#2429)
* DOCS-#2433: Updated README.md with modin_vs_dask.md doc (#2435)
* DOCS-#2437: Add documentation contrasting Modin and Dask (#2441)
* DOCS-#2439: Add Documentation for Modin vs. pandas (#2487)
* DOCS-#2436: Explicit local / single node backend (#2483)
* DOCS-#2518: add asv usage topic (#2549)
* Fix taxi-runner.py cluster example (#2557)
* DOCS-#2578: fix simple typo, parition -> partition (#2573)

Dependencies
------------
* FIX-#2388: Fixed requirements for omnisci binaries (#2389)
* FIX-#2458: fix 'psutil' install (#2452)
* FEAT-#2479: integrate asv (#2484)
* FIX-#2524: Update pandas version to 1.1.5 (#2525)
* FIX-#2498: Fix possible number of partitions for Dask engine (#2532)
* FEAT-#2236: Handling of space limited Ray Plasma directories (#2547)
* Switch to Ray from conda-forge (#2562)
* FIX-#2572: fixed arrow version in OmniSci dependencies (#2571)
* FIX-#0000: pin xlrd<=1.2.0 (#2594)

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@reshamas
@vfdev-5
@mohdkashif93
@abdulelahsm
@ashahba
@raphaelauv
@richardlin047
@timgates42
@ienkovich
@itamarst
@amyskov
@vnlitvinov
@dchigarev
@YarShev
@anmyachev
@gshimansky
@devin-petersohn