✨ feat: Feature extraction with an identifier #109

Merged
Changes from 68 commits
82 commits
675f359
🚧 feat: create first rough draft
NielsPraet Aug 1, 2023
aa509f2
✨ feat: rename index axis on group_by calculation
NielsPraet Aug 1, 2023
b1c6ccf
🐛 fix: solve rename issue
NielsPraet Aug 1, 2023
6cd48e9
🐛 fix: solve rename issue... again
NielsPraet Aug 1, 2023
154d94d
🐛 fix: solve df form for group_by
NielsPraet Aug 1, 2023
65c75bd
♻️ refactor: clean up group_by calculate code
NielsPraet Aug 1, 2023
ee7b32a
🎨 chore: format code
NielsPraet Aug 1, 2023
bfc59be
🚸 ux: filter out of bounds warning
NielsPraet Aug 1, 2023
f86bdb8
🔥 chore: remove useless loc
NielsPraet Aug 1, 2023
bb099e1
✅ tests: add tests for new group_by functionality
NielsPraet Aug 2, 2023
dd19d23
🎨 chore: reformat code
NielsPraet Aug 2, 2023
c9b5274
🍱 chore: add dummy test data
NielsPraet Aug 2, 2023
b77fdd4
🎨 tests: add basic group_by benchmark
NielsPraet Aug 2, 2023
5858797
📝 docs: update tsflex calculate docs
NielsPraet Aug 2, 2023
762d346
🚸 ux: warn users when parameters are not being used in group_by case
NielsPraet Aug 2, 2023
ed29c4d
🎨 chore: format code
NielsPraet Aug 2, 2023
f868c93
:dash: Update tsflex/features/feature_collection.py
jonasvdd Aug 3, 2023
fdb265c
:dash: tsflex/features/feature_collection.py
jonasvdd Aug 3, 2023
dc9d770
✅ tests: make nan test more robust
NielsPraet Aug 4, 2023
b53f003
🧪 tests: add failing test for group_by with multiple feature descriptors
NielsPraet Aug 4, 2023
ee1beae
🐛 fix: make sure group_by works properly when multiple feature decsri…
NielsPraet Aug 4, 2023
c881e25
Merge branch 'feat/identifier-feature-extraction' of https://github.c…
NielsPraet Aug 4, 2023
3434e59
🧪 tests: add failing test for feature collection with nan values
NielsPraet Aug 4, 2023
739ecd3
🐛 fix: make sure nan values appear as separate row
NielsPraet Aug 4, 2023
0abd761
📝 docs: update code documentation for _calculate_group_by
NielsPraet Aug 7, 2023
55257f9
✨ feat: add group_by_consecutive function
NielsPraet Aug 7, 2023
ece7f3e
♻️ refactor: rewrite group_by_calculate with new group_by_consecutive…
NielsPraet Aug 7, 2023
f410eb0
✅ tests: update groupby tests
NielsPraet Aug 8, 2023
b1bfe95
🐛 fix: resolve nan bug
NielsPraet Aug 8, 2023
02166ce
✅ tests: fix benchmarks
NielsPraet Aug 8, 2023
ab3fef7
🎨 chore: format code
NielsPraet Aug 8, 2023
b05f126
🚚 chore: replace csv file with parquet file
NielsPraet Aug 8, 2023
116877b
✅ tests: write some extra tests for group_by
NielsPraet Aug 8, 2023
4e71fcd
🎨 chore: format code
NielsPraet Aug 8, 2023
4a55088
✅ tests: update failing test
NielsPraet Aug 8, 2023
9925e17
📝 docs: update groupby documentation
NielsPraet Aug 8, 2023
579b8c6
:crayon: fix code rendering in docs
jonasvdd Aug 9, 2023
3e24204
Merge branch 'main' of https://github.com/predict-idlab/tsflex into f…
NielsPraet Aug 9, 2023
87b6f84
🧪 tests: add test to support numeric indices
NielsPraet Aug 9, 2023
c46ad15
✨ feat: add support for numeric indices
NielsPraet Aug 9, 2023
a012515
📝 docs: fix markdown table rendering
NielsPraet Aug 9, 2023
61f7fbe
🚸 ux: suppress useless warnings
NielsPraet Aug 9, 2023
86c9f50
📝 docs: add groupby example
NielsPraet Aug 9, 2023
7f24cdc
:mag: adding test opts to pyproject
jonasvdd Aug 10, 2023
8324d71
📝 docs: update code documentation
NielsPraet Aug 11, 2023
2c2ecc5
Merge branch 'feat/identifier-feature-extraction' of https://github.c…
NielsPraet Aug 11, 2023
01472e0
🧪 tests: add test for group_by_consecutive with series
NielsPraet Aug 11, 2023
e48981c
✅ tests: add test for function warnings
NielsPraet Aug 11, 2023
299f242
✅ tests: add tests for failing groupby execution
NielsPraet Aug 14, 2023
fcdbadf
♻️ refactor: change error throwing
NielsPraet Aug 14, 2023
7b95663
🎨 chore: format code
NielsPraet Aug 14, 2023
231b795
📝 docs: update group by documentation
NielsPraet Aug 14, 2023
00c79f0
✅ tests: update failing tests
NielsPraet Aug 14, 2023
521232c
💚 ci: enable benchmarks
NielsPraet Aug 14, 2023
e5e9e0c
✅ tests: fix test for older python versions
NielsPraet Aug 14, 2023
a46de29
:pen: review code
jvdd Oct 6, 2023
ba6ffaa
:see_no_evil: fix test
jvdd Oct 6, 2023
d837827
:construction: add groupby_all support
jvdd Oct 12, 2023
8d0d156
:broom: check logging file handlers in advance
jvdd Oct 12, 2023
80b28e6
:recycle: add group_by_all benchmark
jvdd Oct 12, 2023
b35c77b
:broom:
jvdd Oct 12, 2023
913ed6e
:tada: extend tests
jvdd Oct 13, 2023
6c61ec9
:broom: cleanup groupby nan behavior
jvdd Oct 14, 2023
1bb69cd
:heavy_check_mark: test n_jobs for group_by behavior
jvdd Oct 14, 2023
e49b71a
:pen: extend docs
jvdd Oct 14, 2023
380d180
:bug: remove file handler after feature calculation
jvdd Oct 16, 2023
511da52
:see_no_evil: manual instead of groupby logged window name
jvdd Oct 16, 2023
3fd7d10
:pen: review code
jvdd Oct 18, 2023
f3a9496
:detective: test error when multiple windows in case of custom segments
jvdd Oct 22, 2023
8dee606
:kangaroo: test groupby logging
jvdd Oct 23, 2023
2a19f41
:see_no_evil: fix tests
jvdd Oct 23, 2023
6059926
:arrow_up: upgrade pytest-codspeed and update other deps
jvdd Oct 24, 2023
d219100
Merge branch 'main' into feat/identifier-feature-extraction
jvdd Oct 25, 2023
50b91b7
:pray: temporarily disable benchmark of groupby
jvdd Oct 26, 2023
32e8089
:see_no_evil: lock pycatch dev depedency to avoid windows error
jvdd Oct 26, 2023
c148359
:pray:
jvdd Dec 24, 2023
9526a32
:mag: review README
jonasvdd Jan 3, 2024
7e2d5c6
:mag: review
jonasvdd Jan 5, 2024
d532bff
:mag; review
jonasvdd Jan 5, 2024
a538af4
:fire: use tuple
jvdd Jan 19, 2024
cc8eb8d
:broom
jvdd Jan 19, 2024
45aa8bd
:detective: code review with @jonasvdd
jvdd Jan 23, 2024
Binary file added examples/data/grouped_data.parquet
Binary file not shown.
1,225 changes: 605 additions & 620 deletions examples/verbose_example.ipynb

Large diffs are not rendered by default.

7 changes: 6 additions & 1 deletion pyproject.toml
@@ -79,9 +79,14 @@ ignore = ["E501"] # Never enforce `E501` (line length violations).
 "tests/test_stroll_factory.py" = ["F401", "F811"]
 "tests/test_utils.py" = ["F401", "F811"]

+# Testing
+[tool.pytest.ini_options]
+addopts = "--cov=tsflex --cov-report=term-missing"
+testpaths = "tests/"
+
 # Formatting
 [tool.black]
-color = true
+color = false
 line-length = 88

 [build-system]
16 changes: 15 additions & 1 deletion tests/benchmarks/test_featurecollection.py
@@ -6,7 +6,7 @@
 from tsflex.features.feature import FeatureDescriptor, MultipleFeatureDescriptors
 from tsflex.features.feature_collection import FeatureCollection

-from ..utils import dummy_data # noqa: F401
+from ..utils import dummy_data, dummy_group_data # noqa: F401

 FUNCS = [np.sum, np.min, np.max, np.mean, np.median, np.std, np.var]
 MAX_CPUS = os.cpu_count() or 2
@@ -50,3 +50,17 @@ def test_single_series_feature_collection_multiple_descriptors(
     fc = FeatureCollection(mfd)

     benchmark(fc.calculate, dummy_data, n_jobs=n_cores)
+
+
+@pytest.mark.benchmark(group="group_by collection")
+@pytest.mark.parametrize("n_cores", NB_CORES)
+@pytest.mark.parametrize("func", FUNCS)
+@pytest.mark.parametrize("group_by", ["group_by_all", "group_by_consecutive"])
+def test_single_series_feature_collection_group_by_consecutive(
+    benchmark, n_cores, func, group_by, dummy_group_data  # noqa: F811
+):
+    fd = FeatureDescriptor(function=func, series_name="number_sold")
+
+    fc = FeatureCollection(feature_descriptors=fd)
+
+    benchmark(fc.calculate, dummy_group_data, n_jobs=n_cores, **{group_by: "store"})
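The benchmark above selects the grouping mode by unpacking a one-entry dict, `**{group_by: "store"}`, so the same test body exercises both `group_by_all` and `group_by_consecutive`. The sketch below illustrates that idiom and the presumed difference between the two modes using only the standard library. It is not tsflex code: `toy_calculate` and the `ROWS` data are hypothetical stand-ins, and the semantics (pooling every row per key versus splitting on each run of adjacent equal keys) are an assumption inferred from the parameter names.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical toy records in time order (not the PR's parquet data).
ROWS = [
    {"store": 0, "number_sold": 5},
    {"store": 0, "number_sold": 7},
    {"store": 1, "number_sold": 2},
    {"store": 0, "number_sold": 4},
    {"store": 1, "number_sold": 6},
    {"store": 1, "number_sold": 1},
]


def toy_calculate(rows, func, series_name, group_by_all=None, group_by_consecutive=None):
    """Toy stand-in (NOT the tsflex API): apply `func` to `series_name` per group."""
    if (group_by_all is None) == (group_by_consecutive is None):
        raise ValueError("pass exactly one of group_by_all / group_by_consecutive")

    if group_by_all is not None:
        # Assumed semantics: pool every row sharing a key, wherever it occurs.
        pooled = defaultdict(list)
        for row in rows:
            pooled[row[group_by_all]].append(row[series_name])
        return [(key, func(values)) for key, values in pooled.items()]

    # Assumed semantics: start a new group whenever the key changes between
    # adjacent rows, so the same key can produce several groups.
    runs = groupby(rows, key=lambda row: row[group_by_consecutive])
    return [(key, func([row[series_name] for row in run])) for key, run in runs]


# Same idiom as the benchmark: the keyword name is chosen at run time.
results = {
    mode: toy_calculate(ROWS, sum, "number_sold", **{mode: "store"})
    for mode in ("group_by_all", "group_by_consecutive")
}
```

Under these assumptions, the pooled mode yields one result per store, while the consecutive mode yields one result per uninterrupted run of a store, which is why the two modes are benchmarked separately.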