feat: Add new index/range based selector `cs.by_index`, allow multiple indices for `nth` #16217
CodSpeed Performance Report: Merging #16217 will not alter performance.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main   #16217      +/-   ##
    ==========================================
    - Coverage   80.99%   80.83%   -0.16%
    ==========================================
      Files        1393     1394       +1
      Lines      179445   180095     +650
      Branches     2907     2913       +6
    ==========================================
    + Hits       145335   145574     +239
    - Misses      33604    34018     +414
    + Partials      506      503       -3
We have the
It's not the same -

    df.select(pl.col(*range(10)))

    df.select(
        pl.nth(0),
        pl.nth(1),
        pl.nth(2),
        pl.nth(3),
        pl.nth(4),
        pl.nth(5),
        pl.nth(6),
        pl.nth(7),
        pl.nth(8),
        pl.nth(9),
    )

Could probably move the multi-index support into Still,
Cool one! A few comments.
I'm already not super happy with We can add a See Also to
Ok, will switch it over to (Update: @stinodego - done).
This is nice functionality to have! I left a few comments for the Python side.
    @@ -646,9 +646,9 @@ def last(*columns: str) -> Expr:
         return F.col(*columns).last()


    -def nth(n: int, *columns: str) -> Expr:
    +def nth(n: int | Sequence[int], *columns: str) -> Expr:
Pity we have this `*columns` API, because passing `nth(1,2)` would feel really good. I think we should deprecate this API (in another PR) and make this a keyword argument, so we can pass `nth(1,2, from=['x', 'y', 'z'])`.
I was thinking the same thing 👌 Can follow-up with a separate improvement / deprecation; I think there are a few other functions with a similar set of params, so we should check and do all at once, if so.
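To make the call shape discussed in this thread concrete, here is a minimal sketch of argument handling only. It is not Polars' `nth`: the function name `nth_args` and the keyword name `columns` are hypothetical, chosen because `from` is a reserved word in Python and could not be used as a keyword argument as written above. The sketch accepts indices either as separate positional ints or as a sequence, and takes the column names as a keyword-only parameter.

```python
from __future__ import annotations

from collections.abc import Sequence


def nth_args(
    *n: int | Sequence[int],
    columns: Sequence[str] | None = None,  # hypothetical keyword name
) -> tuple[list[int], list[str]]:
    """Normalize the arguments only; no column or value lookup is done here."""
    indices: list[int] = []
    for item in n:
        # A bare int is one index; a sequence contributes each of its ints.
        indices.extend([item] if isinstance(item, int) else item)
    return indices, list(columns or [])


# Positional indices plus keyword-only column names.
positional = nth_args(1, 2, columns=["x", "y", "z"])

# A single sequence of indices, no column names.
as_sequence = nth_args([0, -1])

print(positional)
print(as_sequence)
```

Because the column names are keyword-only, every positional argument can be read as an index, which is what makes a call like `nth(1, 2)` unambiguous.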
All right, good to go!
Enables some interesting new column selection capabilities using indices (including negative indices) and `range` objects, as well as their union, intersection, etc. (via selector combinatorics).

- Adds a new `by_index` selector that works with indices and ranges.
- Allows `nth` to take multiple indices.

Examples
Select the "key" column and the first/last three numeric columns (indices and `range` objects can be freely mixed and matched):
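The snippet for this example is missing from the page as captured here. As a stand-in, the following is a minimal pure-Python sketch of the selection semantics described: integer indices, negative indices, and `range` objects mixed in one call. It operates on a plain list of column names rather than a Polars frame; `select_by_index` and the column layout are hypothetical and are not part of Polars.

```python
from __future__ import annotations

from collections.abc import Sequence


def select_by_index(columns: Sequence[str], *indices: int | range) -> list[str]:
    """Resolve integer indices and range objects against a list of column names."""
    picked: list[str] = []
    for item in indices:
        # A bare int selects one column; a range selects each index it yields.
        for i in ([item] if isinstance(item, int) else item):
            name = columns[i]  # negative indices count from the end
            if name not in picked:  # keep each column only once
                picked.append(name)
    return picked


# Hypothetical layout: one "key" column followed by ten numeric columns.
columns = ["key"] + [f"c{i}" for i in range(10)]

# "key", then the first three and the last three numeric columns.
result = select_by_index(columns, 0, range(1, 4), range(-3, 0))
print(result)  # ['key', 'c0', 'c1', 'c2', 'c7', 'c8', 'c9']
```

In Polars the same intent would presumably be expressed with `cs.by_index(...)` inside `df.select(...)`; the exact call from the PR description is not reproduced here because it did not survive.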
Select every 10th column:
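The original snippet is not available here. In plain Python, and assuming a hypothetical frame with 100 columns named `c00` to `c99`, a stepped `range` expresses the idea; this is a sketch of the index arithmetic, not Polars code.

```python
# Hypothetical column names; a real frame would supply its own.
columns = [f"c{i:02}" for i in range(100)]

# A range with step 10 yields the indices 0, 10, 20, ..., 90.
every_10th = [columns[i] for i in range(0, len(columns), 10)]
print(every_10th)
```

The same `range` object is the kind of argument the new selector is described as accepting.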
Select every 25th numeric column in reverse order:
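One possible reading, sketched in plain Python since the original snippet is not available: restrict to the numeric columns first, then walk them from the end with a step of 25. The schema, the column names, and the use of Python types as stand-in dtypes are all assumptions made for illustration.

```python
# Hypothetical schema: a string "key" column plus 100 float columns c00..c99.
schema = {"key": str, **{f"c{i:02}": float for i in range(100)}}

# Keep only the columns whose stand-in dtype is numeric.
numeric = [name for name, dtype in schema.items() if dtype in (int, float)]

# Negative, stepped range: the last numeric column first, then every 25th going backwards.
reverse_25th = [numeric[i] for i in range(-1, -len(numeric) - 1, -25)]
print(reverse_25th)  # ['c99', 'c74', 'c49', 'c24']
```

Filtering before indexing corresponds to intersecting a dtype-based selection with an index-based one, which is the kind of combination the description attributes to selector combinatorics.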