Modin 0.8.3

@devin-petersohn devin-petersohn released this 12 Jan 14:19
0.8.3
bcab1cc
Modin 0.8.3 release notes

This release contains a number of bugfixes and testing/code quality improvements. See the details below for updates since the last release.

Bugfixes + Pandas Concordance (🐛 + 🐼)
----------------------------------------
* FIX-#2386: add new location for import ray functions (#2387)
* FIX-#2380: don't ignore lengths parameter for dask engine (#2381)
* FIX-#2390: Fix inserting Series into DataFrame (#2391)
* FIX-#2200: Enable Calcite by default in OmniSci backend (#2385)
* FEAT-#2363: fix index name setter in OmniSci backend (#2379)
* FIX-#2406: filter dictionary aggregation keys to limit them to keys only present in current partition (#2407)
* FIX-#2473: Some configuration values should not be transformed (#2476)
* FIX-#2402: Fix read_excel when files come from older windows (#2403)
* Ensure excel reader closes file if it is passed as path (#2514)
* FIX-#2442: fixed Series assignment with different indices (#2443)
* Fix indices when reading Excel files in parallel (#2526)
* FIX-#2527: Use random name for hdf file test, clean file after testing (#2528)
* FIX-#2408: Fix read_csv and read_table args when used inside a decora… (#2486)
* Fix .loc[] assignment for Modin Series (#2555)
* FIX-#2482: improved handling non-str 'by' (#2548)
* Fix loc/iloc assignments when columns are selected (#2536)
* FIX-#2559: Ignore files from /proc/ when detecting file leaks (#2560)
* FIX-#2566: Ensure `Series.unique` does not return a scalar when there is only one unique value (#2567)
* FIX-#2543: fixed handling 'as_index' at groupby dictionary renaming aggregation (#2592)
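
To illustrate the idea behind FIX-#2406 above, here is a minimal sketch in plain Python. It is not Modin's implementation: the helper `filter_agg_keys` and the column layout are hypothetical, and only show why a dictionary aggregation has to be restricted to the columns that a given partition actually holds.

```python
def filter_agg_keys(agg_spec, partition_columns):
    """Keep only the aggregation entries whose column exists in this partition.

    Hypothetical helper for illustration; Modin's internals differ.
    """
    present = set(partition_columns)
    return {col: func for col, func in agg_spec.items() if col in present}


# Aggregation requested by the user for the whole frame.
agg_spec = {"a": "sum", "b": "mean", "c": "max"}

# Assumed column-wise split of one frame into two partitions.
partitions = [["a", "b"], ["c", "d"]]

# Each partition only receives the keys it can actually aggregate.
per_partition = [filter_agg_keys(agg_spec, cols) for cols in partitions]
print(per_partition)
```

Without this filtering, a partition holding only `c` and `d` would be asked to aggregate `a` and `b`, which it does not contain.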

New Functionality ✨
--------------------
* FEAT-#2375: implementation of multi-column groupby aggregation (#2461)
* FEAT-#2013: merge_asof that is a little more efficient (#2510)
* FIX-#2540: add __iter__ implementation (#2541)
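
As a rough usage sketch for `merge_asof` (FEAT-#2013), the example below uses the pandas API, which Modin mirrors. It is written against plain pandas so it runs without Modin; the assumption is that swapping the import for `import modin.pandas as pd` runs the same call through Modin. The sample data is made up for illustration.

```python
import pandas as pd  # with Modin installed, use: import modin.pandas as pd

# Both frames must be sorted by the key column ("time").
trades = pd.DataFrame({"time": [1, 5, 10], "price": [100.0, 101.0, 102.0]})
quotes = pd.DataFrame(
    {"time": [1, 2, 3, 6, 7], "bid": [99.0, 99.5, 99.8, 100.5, 100.7]}
)

# For each trade, take the most recent quote at or before the trade time
# (the default direction is "backward").
matched = pd.merge_asof(trades, quotes, on="time")
bids = matched["bid"].tolist()
print(matched)
```

Each trade row is paired with the latest quote whose `time` does not exceed the trade's `time`.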

Code Quality + Testing 💯
-------------------------
* TEST-#2289: Columns, Index Locations and Names parameters of read_csv (#2319)
* REFACTOR-#2397: remove redundant assignment (#2398)
* FIX-#2450: fix CI recipe (#2449)
* FEAT-#2444: add docker file for nyc on omnisci (#2445)
* FIX-#2456: update taxi queries with .copy usage (#2457)
* FEAT-#2447: add docker file for census on omnisci (#2448)
* REFACTOR-#2467: Convert internal base dataframe objects to ABC (#2468)
* FIX-#2459: Updated TeamCity tests image to use Ray as base image (#2460)
* TEST-#2488: Increase commitlint message length limit to 88 characters from 70 (#2489)
* TEST-#2290: Cover by tests General Parsing Configuration parameters of read_csv (#2331)
* TEST-#2291: Cover by tests NA and Missing Data Handling parameters of read_csv (#2337)
* TEST-#2294: add iteration parameters for read_csv tests (#2477)
* FIX-#2463: Added test with callable functions as aggregate argument (#2503)
* TEST-#2296: Error Handling parameters of read_csv (#2501)
* TEST-#2295: Cover by tests Quoting, Compression, and File Format parameters of read_csv (#2495)
* FIX-#2374: remove extra code; add pandas way to handle duplicate values in reindex func for binary operations (#2378)
* TEST-#2297: Cover by tests Internal parameters of read_csv (#2502)
* TEST-#2509: Io tests refactoring (#2523)
* FIX-#2550: remove decorators usage for asv tested functions (#2551)

Backend enhancements + Performance 🚀
-------------------------------------
* FIX-#2453: Remove sorting indices for equal values in `Series.value_counts` (#2454)
* FIX-#2169: avoid unnecessary index access in groupby (#2469)
* FIX-#2313: improved handling non-numeric types at 'mean' when 'axis=1' (#2535)
* FEAT-#2520: add most important operations for asv benchmarks (#2539)
* FEAT-#2491: optimized groupby dictionary aggregation (#2534)
* FEAT-#2553: add ability to run microbenchmarks for old Modin version (#2554)

Documentation 📃
----------------
* DOCS-#2413: Add examples page to documentation (#2414)
* DOCS-#2415: Add comparisons section to documentation with stubs (#2416)
* DOCS-#2417: add sklearn example (#2425)
* DOCS-#2421: Fixes bad link on contributing from architecture.rst (#2427)
* DOCS-#2419: Updated CONTRIBUTING.rst (#2423)
* DOCS-#2426,DOCS-#2424: Fixed two issues (#2431)
* DOCS-#2420: Changed documentation to numpydoc style (#2429)
* DOCS-#2433: Updated README.md with modin_vs_dask.md doc (#2435)
* DOCS-#2437: Add documentation contrasting Modin and Dask (#2441)
* DOCS-#2439: Add Documentation for Modin vs. pandas (#2487)
* DOCS-#2436: Explicit local / single node backend (#2483)
* DOCS-#2518: add asv usage topic (#2549)
* Fix taxi-runner.py cluster example (#2557)
* DOCS-#2578: fix simple typo, parition -> partition (#2573)

Dependencies
------------
* FIX-#2388: Fixed requirements for omnisci binaries (#2389)
* FIX-#2458: fix 'psutil' install (#2452)
* FEAT-#2479: integrate asv (#2484)
* FIX-#2524: Update pandas version to 1.1.5 (#2525)
* FIX-#2498: Fix possible number of partitions for Dask engine (#2532)
* FEAT-#2236: Handling of space limited Ray Plasma directories (#2547)
* Switch to Ray from conda-forge (#2562)
* FIX-#2572: fixed arrow version in OmniSci dependencies (#2571)
* FIX-#0000: pin xlrd<=1.2.0 (#2594)

Contributors this release
-------------------------

The following users contributed code to Modin since the last release.

@reshamas
@vfdev-5
@mohdkashif93
@abdulelahsm
@ashahba
@raphaelauv
@richardlin047
@timgates42
@ienkovich
@itamarst
@amyskov
@vnlitvinov
@dchigarev
@YarShev
@anmyachev
@gshimansky
@devin-petersohn