Modin 0.3.0
devin-petersohn released this 24 Jan 19:27
Modin 0.3.0 release notes
This release came with many bugfixes, new features, and performance enhancements. Notably, you can now use Modin out of core to work with larger-than-memory DataFrames through the pandas API! We have also set up an easier way to report bugs or request features: simply email [email protected] or [email protected]. Modin also now supports Dask Delayed as a backend!
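As a rough sketch of how the out-of-core mode and the backend choice might be selected, the snippet below sets environment variables before Modin is imported. The variable names `MODIN_ENGINE`, `MODIN_OUT_OF_CORE`, and `MODIN_MEMORY` are assumptions based on Modin's documentation and may differ between versions; `configure_modin` is a hypothetical helper, not part of Modin.

```python
import os


def configure_modin(engine="ray", out_of_core=False, memory_bytes=None):
    """Hypothetical helper: set Modin-related environment variables.

    Assumes Modin reads these variables when `modin.pandas` is first
    imported, so this must run before that import.
    """
    # Assumed variable selecting the execution engine ("ray" or "dask").
    os.environ["MODIN_ENGINE"] = engine
    if out_of_core:
        # Assumed flag allowing data to spill to disk when memory runs out.
        os.environ["MODIN_OUT_OF_CORE"] = "true"
    if memory_bytes is not None:
        # Assumed variable giving the memory budget in bytes.
        os.environ["MODIN_MEMORY"] = str(memory_bytes)


# Example: Ray engine, out of core, with a 200 GB budget.
configure_modin(engine="ray", out_of_core=True, memory_bytes=200 * 10**9)

# The Modin import would follow the configuration step:
# import modin.pandas as pd
```

Passing `engine="dask"` instead would, under the same assumptions, select the Dask backend.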
Bugfixes + Pandas Concordance (🐛 + 🐼)
- Various bugfixes for the python backend (#260, #287, #310)
- Fixes indexing issues with loc, iloc, and __getitem__ (#295, #294)
- __repr__ bugfix (#299)
- Fixed block lengths error (#304)
- Fix reset_index with a MultiIndex resulting in a single column (#305)
- Fix index of the result of get_dtype_counts (#308)
- Fixing bug where read_csv erroneously filtered out provided keyword arguments (#314)
- Fixes mode to work with the new backend (#313)
- Fix bug in concat where keys were improperly assigned (#317)
- Allow Modin to use the larger than target item for setitem (#319)
- Adding dtypes to the exclude parameter in the remote partition (#321)
- Adding additional check to inplace operations (#323)
- Adding modin.pandas.melt for pandas concordance (#325)
- Fixing issue where non-numeric partitions threw errors on numeric ops (#328)
- Adding SeriesView object that allows inplace operations (#326, #331, #354, #357, #373, #375, #390)
- Fix dropna when axis is a string (#329)
- Ensuring ordered retrieval of columns/rows when operating on a subset (#334)
- Adding check for lists of columns with a column not in the index (#336)
- Fixes rmod (#344)
- Fix mode when axis=1 and remove a reindex (#333)
- Fix rfloordiv error (#342)
- Removed pow dtype checking (#346)
- Fix a bug in get_indices (#348)
- Fixed sort_index (#160)
- Setting min_count default to 0 to match pandas (#352)
- Filtering default exclude values in describe based on the include passed in (#363)
- Fixing partitioning issue when doing a reindex/concat (#361)
- Adding try-except block to internal partition computation for describe (#365)
- Adding crosstab that defaults to pandas to modin.pandas (#367)
- Correctly handling empty DataFrames for selective operations (#369)
- Adding case for operations like align so we can properly convert (#371)
- Adding groupby columns and index name when necessary (#380)
- Checking for type before we check the length to avoid spurious errors (#383)
- Removing data from DataFrame.hist parameter requirements (#387)
- Adding plotting module (#389)
- Correcting dtypes after iloc issue (#391)
- Adding isnull to modin.pandas (#395)
- Insert column to a DataFrame with no index (#398)
- Support for DataFrame().loc when location (row/col) does not exist (#401)
- Adding support for inserting DataFrames with 1 column (#403)
- Removing axis from replace signature (#412)
- Align read_excel() signature with pandas 0.23.4 (#415)
- Add a fix to Groupby for aggregations by a column from the DataFrame (#413)
- Adding a way to set existing columns using setattr (#423)
- Fixes out of order columns for loc based indexing (#424)
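To make a few of the concordance items above concrete, here is a minimal sketch of the pandas behavior they target (the min_count default, melt, and dropna with a string axis). It uses plain pandas as the reference. That `import modin.pandas as pd` gives the same results is the stated goal of these fixes, not something this snippet verifies.

```python
import numpy as np
import pandas as pd  # reference behavior; modin.pandas is assumed to match

all_nan = pd.Series([np.nan, np.nan])

# With the default min_count=0, the sum of an all-NaN Series is 0.0.
default_sum = all_nan.sum()

# Requiring at least one valid value yields NaN instead.
strict_sum = all_nan.sum(min_count=1)

# melt reshapes wide data into long (variable, value) pairs.
wide = pd.DataFrame({"id": [1, 2], "a": [10, 20], "b": [30, 40]})
long = pd.melt(wide, id_vars=["id"], value_vars=["a", "b"])

# dropna accepts the axis as a string as well as an integer;
# here the column containing a NaN ("x") is dropped.
with_gap = pd.DataFrame({"x": [1.0, np.nan], "y": [3.0, 4.0]})
no_gap_cols = with_gap.dropna(axis="columns")
```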
User experience
New functionality
- Dask Backend 🎉 (#271, #281, #297)
- Out of Core 🎉 (#277)
- Use ray to parallelize read_feather with pyarrow feather API (#292)
Backend enhancements + Performance
- Using numpy arrays instead of python lists for metadata (#300)
- Improve the way that Modin filters out empty partitions (#301)
- Making Modin more efficient with smaller DataFrames (#307)
- Making drop faster for drop operations (#379)
- Converting PandasQueryCompiler.getitem_array to accept numeric indices (#386)
- Removing is_view from PandasQueryCompilerView and codepath requiring it (#393)
- Making Modin more efficient at small DataFrames (#407)
Code Refactor
- Refactor io module (#283)
- Major refactor of data_management (#290)
- Removing dead code and removing explicit warnings (#302)
- Refactor testing suite (#93)
Ray updated
Documentation
- Adding badge to README for link to discourse (#332)
- Add logo to repo (#337, #338, #339, #340)
- Update Readme
- Link to Pandas Documentation (#397)
- Adding out of core information to the documentation (#406)
Contributors this release
The following users contributed code to Modin since the last release.
@coobas (New contributor) 🔰
@pcahyna (New contributor) 🔰
@Kopurlso (Returning contributor) 🌟
@eavidan (Committer)
@osalpekar (Committer)
@williamma12 (Committer)
@devin-petersohn (Admin)
🎉🎉 Thank you! 🎉🎉