Modin 0.3.0

@devin-petersohn released this 24 Jan 19:27
· 2635 commits to master since this release
79c1113

Modin 0.3.0 release notes

This release includes many bugfixes, new features, and performance enhancements. Notably, you can now use Modin out of core to work with larger-than-memory DataFrames through the pandas API! We have also set up an easier way to report bugs or request features: simply email [email protected] or [email protected]. Modin also now supports Dask Delayed as a backend!

Bugfixes + Pandas Concordance (🐛 + 🐼)

  • Various bugfixes for the python backend (#260, #287, #310)
  • Fixes indexing issues with loc, iloc, and __getitem__ (#295, #294)
  • __repr__ bugfix (#299)
  • Fixed block lengths error (#304)
  • Fix reset_index with a MultiIndex resulting in a single column (#305)
  • Fix index of the result of get_dtype_counts (#308)
  • Fixing bug where read_csv erroneously filtered out provided keyword arguments (#314)
  • Fixes mode to work with the new backend (#313)
  • Fix bug in concat where keys were improperly assigned (#317)
  • Allow Modin to use the larger than target item for setitem (#319)
  • Adding dtypes to the exclude parameter in the remote partition (#321)
  • Adding additional check to inplace operations (#323)
  • Adding modin.pandas.melt for pandas concordance (#325)
  • Fixing issue where non-numeric partitions threw errors on numeric ops (#328)
  • Adding SeriesView object that allows inplace operations (#326, #331, #354, #357, #373, #375, #390)
  • Fix dropna when axis is a string (#329)
  • Ensuring ordered retrieval of columns/rows when operating on a subset (#334)
  • Adding check for lists of columns with a column not in the index (#336)
  • Fixes rmod (#344)
  • Fix mode when axis=1 and remove a reindex (#333)
  • Fix rfloordiv error (#342)
  • Removed pow dtype checking (#346)
  • Fix a bug in get_indices (#348)
  • Fixed sort_index (#160)
  • Setting min_count default to 0 to match pandas (#352)
  • Filtering default exclude values in describe based on the include passed in (#363)
  • Fixing partitioning issue when doing a reindex/concat (#361)
  • Adding try-except block to internal partition computation for describe (#365)
  • Adding crosstab that defaults to pandas to modin.pandas (#367)
  • Correctly handling empty DataFrames for selective operations (#369)
  • Adding case for operations like align so we can properly convert (#371)
  • Adding groupby columns and index name when necessary (#380)
  • Checking for type before we check the length to avoid spurious errors (#383)
  • Removing data from DataFrame.hist parameter requirements (#387)
  • Adding plotting module (#389)
  • Correcting dtypes after iloc issue (#391)
  • Adding isnull to modin.pandas (#395)
  • Insert column to a DataFrame with no index (#398)
  • Support for DataFrame().loc when location (row/col) does not exist (#401)
  • Adding support for inserting DataFrames with 1 column (#403)
  • Removing axis from replace signature (#412)
  • Align read_excel() signature with pandas 0.23.4 (#415)
  • Add a fix to Groupby for aggregations by a column from the DataFrame (#413)
  • Adding a way to set existing columns using setattr (#423)
  • Fixes out of order columns for loc based indexing (#424)
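
As an illustration of one concordance fix above (min_count now defaults to 0, as in pandas), here is a minimal pure-Python sketch of the semantics. The helper name sum_skipna is hypothetical and is not part of Modin or pandas; it only mirrors how a NaN-skipping sum treats min_count.

```python
import math


def sum_skipna(values, min_count=0):
    """Sum values while skipping NaN (hypothetical helper, for illustration).

    If fewer than `min_count` non-NaN values are present, the result is NaN.
    With the default of 0, an empty or all-NaN input sums to 0.
    """
    valid = [v for v in values if not (isinstance(v, float) and math.isnan(v))]
    if len(valid) < min_count:
        return math.nan
    return sum(valid)


all_missing = [math.nan, math.nan]
default_result = sum_skipna(all_missing)               # min_count=0
strict_result = sum_skipna(all_missing, min_count=1)   # requires one valid value
mixed_result = sum_skipna([1.0, math.nan, 2.0], min_count=1)
```

With the default, an all-NaN input sums to 0; passing min_count=1 yields NaN instead, which is the behavior a caller would opt into when an empty sum should be treated as missing.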

User experience

  • Add email addresses for reporting bugs/requesting implementation (#296, #426)

New functionality

  • Dask Backend 🎉 (#271, #281, #297)
  • Out of Core 🎉 (#277)
  • Use ray to parallelize read_feather with pyarrow feather API (#292)

Backend enhancements + Performance

  • Using numpy arrays instead of python lists for metadata (#300)
  • Improve the way that Modin filters out empty partitions (#301)
  • Making Modin more efficient with smaller DataFrames (#307)
  • Making drop operations faster (#379)
  • Converting PandasQueryCompiler.getitem_array to accept numeric indices (#386)
  • Removing is_view from PandasQueryCompilerView and codepath requiring it (#393)
  • Making Modin more efficient at small DataFrames (#407)

Code Refactor

  • Refactor io module (#283)
  • Major refactor of data_management (#290)
  • Removing dead code and removing explicit warnings (#302)
  • Refactor testing suite (#93)

Ray updated

  • Updated Ray dependency to 0.6.2 (#417)

Documentation

Contributors this release

The following users contributed code to Modin since the last release.

@coobas (New contributor) 🔰
@pcahyna (New contributor) 🔰
@Kopurlso (Returning contributor) 🌟
@eavidan (Committer)
@osalpekar (Committer)
@williamma12 (Committer)
@devin-petersohn (Admin)

🎉🎉 Thank you! 🎉🎉