Modin 0.8.3
devin-petersohn released this 12 Jan 14:19
Modin 0.8.3 release notes

This release contains a number of bugfixes and testing/code quality improvements. See details below for the updates since the last release.

Bugfixes + Pandas Concordance (🐛 + 🐼)
----------------------------------------
* FIX-#2386: add new location for import ray functions (#2387)
* FIX-#2380: don't ignore lengths parameter for dask engine (#2381)
* FIX-#2390: Fix inserting Series into DataFrame (#2391)
* FIX-2200: Enable Calcite by default in OmniSci backend (#2385)
* FEAT-#2363: fix index name setter in OmniSci backend (#2379)
* FIX-#2406: filter dictionary aggregation keys to limit them to keys only present in current partition (#2407)
* FIX-#2473: Some configuration values should not be transformed (#2476)
* FIX-#2402: Fix read_excel when files come from older windows (#2403)
* Ensure excel reader closes file if it is passed as path (#2514)
* FIX-#2442: fixed Series assignment with different indices (#2443)
* Fix indices when reading Excel files in parallel (#2526)
* FIX-#2527: Use random name for hdf file test, clean file after testing (#2528)
* FIX-#2408: Fix read_csv and read_table args when used inside a decora… (#2486)
* Fix .loc[] assignment for Modin Series (#2555)
* FIX-#2482: improved handling non-str 'by' (#2548)
* Fix loc/iloc assignments when columns are selected (#2536)
* FIX-#2559: Ignore files from /proc/ when detecting file leaks (#2560)
* FIX-#2566: Ensure `Series.unique` does not return a scalar when there is only one unique value (#2567)
* FIX-#2543: fixed handling 'as_index' at groupby dictionary renaming aggregation (#2592)

New Functionality ✨
--------------------
* FEAT-#2375: implementation of multi-column groupby aggregation (#2461)
* FEAT-#2013: merge_asof that is a little more efficient (#2510)
* FIX-#2540: add __iter__ implementation (#2541)

Code Quality + Testing 💯
-------------------------
* TEST-#2289: Columns, Index Locations and Names parameters of read_csv (#2319)
* REFACTOR-#2397: remove redundant assignment (#2398)
* FIX-#2450: fix CI recipe (#2449)
* FEAT-#2444: add docker file for nyc on omnisci (#2445)
* FIX-#2456: update taxi queries with .copy usage (#2457)
* FEAT-#2447: add docker file for census on omnisci (#2448)
* REFACTOR-#2467: Convert internal base dataframe objects to ABC (#2468)
* FIX-#2459: Updated TeamCity tests image to use Ray as base image (#2460)
* TEST-#2488: Increase commitlint message length limit to 88 characters from 70 (#2489)
* TEST-#2290: Cover by tests General Parsing Configuration parameters of read_csv (#2331)
* TEST-#2291: Cover by tests NA and Missing Data Handling parameters of read_csv (#2337)
* TEST-#2294: add iteration parameters for read_csv tests (#2477)
* FIX-#2463: Added test with callable functions as aggregate argument (#2503)
* TEST-#2296: Error Handling parameters of read_csv (#2501)
* TEST-#2295: Cover by tests Quoting, Compression, and File Format parameters of read_csv (#2495)
* FIX-#2374: remove extra code; add pandas way to handle duplicate values in reindex func for binary operations (#2378)
* TEST-#2297: Cover by tests Internal parameters of read_csv (#2502)
* TEST-#2509: Io tests refactoring (#2523)
* FIX-#2550: remove decorators usage for asv tested functions (#2551)

Backend enhancements + Performance 🚀
-------------------------------------
* FIX-#2453: Remove sorting indices for equal values in `Series.value_counts` (#2454)
* FIX-#2169: avoid unnecessary index access in groupby (#2469)
* FIX-#2313: improved handling non-numeric types at 'mean' when 'axis=1' (#2535)
* FEAT-#2520: add most important operations for asv benchmarks (#2539)
* FEAT-#2491: optimized groupby dictionary aggregation (#2534)
* FEAT-#2553: add ability to run microbenchmarks for old Modin version (#2554)

Documentation 📃
----------------
* DOCS-#2413: Add examples page to documentation (#2414)
* DOCS-#2415: Add comparisons section to documentation with stubs (#2416)
* DOCS-#2417: add sklearn example (#2425)
* DOCS-#2421: Fixes bad link on contributing from architecture.rst (#2427)
* DOCS-#2419: Updated CONTRIBUTING.rst (#2423)
* DOCS-#2426,DOCS-#2424: Fixed two issues (#2431)
* DOCS-#2420: Changed documentation to numpydoc style (#2429)
* DOCS-#2433: Updated README.md with modin_vs_dask.md doc (#2435)
* DOCS-#2437: Add documentation contrasting Modin and Dask (#2441)
* DOCS-#2439: Add Documentation for Modin vs. pandas (#2487)
* DOCS-#2436: Explicit local / single node backend (#2483)
* DOCS-#2518: add asv usage topic (#2549)
* Fix taxi-runner.py cluster example (#2557)
* DOCS-#2578: fix simple typo, parition -> partition (#2573)

Dependencies
------------
* FIX-#2388: Fixed requirements for omnisci binaries (#2389)
* FIX-#2458: fix 'psutil' install (#2452)
* FEAT-#2479: integrate asv (#2484)
* FIX-#2524: Update pandas version to 1.1.5 (#2525)
* FIX-#2498: Fix possible number of partitions for Dask engine (#2532)
* FEAT-#2236: Handling of space limited Ray Plasma directories (#2547)
* Switch to Ray from conda-forge (#2562)
* FIX-#2572: fixed arrow version in OmniSci dependencies (#2571)
* FIX-#0000: pin xlrd<=1.2.0 (#2594)

Contributors this release
-------------------------
The following users contributed code to Modin since the last release.

@reshamas @vfdev-5 @mohdkashif93 @abdulelahsm @ashahba @raphaelauv @richardlin047 @timgates42 @ienkovich @itamarst @amyskov @vnlitvinov @dchigarev @YarShev @anmyachev @gshimansky @devin-petersohn