This repository has been archived by the owner on Nov 16, 2023. It is now read-only.
Set of new timeseries transforms #475
Open
ganik wants to merge 44 commits into master from tsaml
44 commits
284fcd7
Use latest ML.Net dev packages from MachineLearning feed.
ad00b70
Re-enable the default nuget.org feed. It does not appear to cause
258a799
Add whitespace change to restart CI build. Linux timed out.
c542c1d
Fix build issue when using pip version >= 20.0.0
4c5bac1
Merge branch 'master' into nuget_update
5423d6a
Merge branch 'master' into nightly
actions-user 324d379
Merge branch 'master' into nightly
actions-user b3ed66b
Merge branch 'master' into nightly
actions-user 5924fdc
Merge branch 'master' into nightly
ganik 5feb56d
preview3
ganik fed9aa2
fix signing
ganik 039356a
run ep only if VerifyManifest
ganik cbe0e75
draft of timeseries transforms
ganik 6a2a913
Updated with latest changes
ganik 7ddbba5
Merge branch 'master' into tsaml
ganik 7ae2fa9
add unit tests
ganik e3196c7
Add timeseries transforms to onnx suite test.
ganik d6ae18f
Add automl ONNX tests
ganik 7244760
0.4.0 version for Featurizers
ganik f333452
Featurizer Onnx Export tests (#484)
angryjinyan ac5ce11
Add tests for DateTimeSplitter with country (#486)
angryjinyan 0d5e594
install ort-featurizers
ganik fa35f9f
Merge branch 'tsaml' of https://github.com/microsoft/NimbusML into tsaml
ganik 634597d
fix feed
ganik 7819fdb
Merge branch 'master' into tsaml
ganik f223671
update version for ort-featurizers
ganik 1e601e7
Merge branch 'tsaml' of https://github.com/microsoft/NimbusML into tsaml
ganik ca0eaa8
fix tests
ganik f234b7c
skip ts checks
ganik 9a0bec9
fix tests
ganik 82f831b
fix test
ganik 2b7accb
MLFeatur vcersion
ganik 7a7bef7
exclude test for Mac
ganik 1315d4d
do mv to save space
ganik 99b0e71
Make more space for build
ganik 7878982
more space
ganik b3797e2
more space
ganik f2e81d3
more space
ganik 9252d1e
more space
ganik b088277
fix build
ganik 2fc6a1f
more space
ganik bea3ec3
check in (#487)
angryjinyan ba376b9
Fix shape (#488)
angryjinyan e9b6b87
Merge branch 'master' into tsaml
ganik
@@ -0,0 +1,31 @@
###############################################################################
# ForecastingPivot
import pandas as pd
import numpy as np
from nimbusml import FileDataStream, Pipeline
from nimbusml.datasets import get_dataset
from nimbusml.timeseries import ForecastingPivot, RollingWindow

# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, sep=',', numeric_dtype=np.double)

# transform usage
xf = RollingWindow(columns={'age_1': 'age'},
                   grain_columns=['education'],
                   window_calculation='Mean',
                   max_window_size=1,
                   horizon=1)

xf1 = ForecastingPivot(columns_to_pivot=['age_1'])

pipe = Pipeline([xf, xf1])

# fit and transform
features = pipe.fit_transform(data)

features = features.drop(['row_num', 'education', 'parity', 'induced',
                          'case', 'spontaneous', 'stratum', 'pooled.stratum'], axis=1)

# print features
print(features.head(100))
@@ -0,0 +1,26 @@
###############################################################################
# LagLeadOperator
import pandas as pd
import numpy as np
from nimbusml import FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.timeseries import LagLeadOperator

# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, sep=',', numeric_dtype=np.double)

# transform usage
xf = LagLeadOperator(columns={'age_1': 'age'},
                     grain_columns=['education'],
                     offsets=[-3, 1],
                     horizon=1)

# fit and transform
features = xf.fit_transform(data)

features = features.drop(['row_num', 'education', 'parity', 'induced',
                          'case', 'spontaneous', 'stratum', 'pooled.stratum'], axis=1)

# print features
print(features.head(100))
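
The diff does not spell out what `offsets` means, so the following is only a rough sketch of the general lag/lead idea, not NimbusML's implementation. It assumes that a negative offset looks back and a positive offset looks ahead within the same grain, and it ignores `horizon` entirely. The helper name `lag_lead` and the toy rows are hypothetical and are not taken from the `infert` dataset.

```python
from collections import defaultdict


def lag_lead(rows, grain_key, value_key, offsets):
    """For each row, collect the values found `offset` positions away
    within the same grain (assumed semantics, see lead-in).

    Positions that fall outside the grain yield None.
    """
    # Record which row indices belong to each grain, in input order.
    positions = defaultdict(list)
    for idx, row in enumerate(rows):
        positions[row[grain_key]].append(idx)

    result = [None] * len(rows)
    for indices in positions.values():
        for pos, idx in enumerate(indices):
            shifted = []
            for offset in offsets:
                target = pos + offset
                if 0 <= target < len(indices):
                    shifted.append(rows[indices[target]][value_key])
                else:
                    shifted.append(None)
            result[idx] = shifted
    return result


# Toy rows (made up for illustration).
rows = [
    {"education": "0-5yrs", "age": 26.0},
    {"education": "6-11yrs", "age": 42.0},
    {"education": "0-5yrs", "age": 39.0},
    {"education": "0-5yrs", "age": 34.0},
    {"education": "6-11yrs", "age": 35.0},
]
shifted = lag_lead(rows, "education", "age", offsets=[-1, 1])
for row, values in zip(rows, shifted):
    print(row["education"], row["age"], values)
```

Grouping by grain first keeps one series from leaking values into another, which is presumably why the transform takes `grain_columns`.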
@@ -0,0 +1,27 @@
###############################################################################
# RollingWindow
import pandas as pd
import numpy as np
from nimbusml import FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.timeseries import RollingWindow

# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, sep=',', numeric_dtype=np.double)

# transform usage
xf = RollingWindow(columns={'age_1': 'age'},
                   grain_columns=['education'],
                   window_calculation='Mean',
                   max_window_size=2,
                   horizon=2)

# fit and transform
features = xf.fit_transform(data)

features = features.drop(['row_num', 'education', 'parity', 'induced',
                          'case', 'spontaneous', 'stratum', 'pooled.stratum'], axis=1)

# print features
print(features.head(100))
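
As a point of comparison, a per-grain rolling mean can be sketched in plain Python. This is an illustration of the general technique only: whether NimbusML's `RollingWindow` includes the current row in the window, and how `horizon` shifts the window, is not stated in this diff. The sketch assumes the window holds up to `window_size` values strictly before the current row. The helper name `rolling_mean` and the toy rows are hypothetical.

```python
from collections import defaultdict, deque


def rolling_mean(rows, grain_key, value_key, window_size):
    """Mean of up to `window_size` previous values within each grain.

    Assumption: the current row is excluded from its own window, so the
    first row of every grain has no history and yields None.
    """
    # One bounded history buffer per grain; old values fall off the left.
    history = defaultdict(lambda: deque(maxlen=window_size))
    means = []
    for row in rows:
        window = history[row[grain_key]]
        means.append(sum(window) / len(window) if window else None)
        window.append(row[value_key])
    return means


# Toy rows (made up for illustration).
rows = [
    {"education": "0-5yrs", "age": 26.0},
    {"education": "6-11yrs", "age": 42.0},
    {"education": "0-5yrs", "age": 39.0},
    {"education": "0-5yrs", "age": 34.0},
    {"education": "6-11yrs", "age": 35.0},
]
means = rolling_mean(rows, "education", "age", window_size=2)
print(means)
```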
@@ -0,0 +1,23 @@
###############################################################################
# ShortDrop
import pandas as pd
import numpy as np
from nimbusml import FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.timeseries import ShortDrop

# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, sep=',', numeric_dtype=np.double)

# transform usage
xf = ShortDrop(grain_columns=['education'], min_rows=4294967294) << 'age'

# fit and transform
features = xf.fit_transform(data)

features = features.drop(['row_num', 'education', 'parity', 'induced',
                          'case', 'spontaneous', 'stratum', 'pooled.stratum'], axis=1)

# print features
print(features.head(100))
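
The diff does not document `ShortDrop`. Judging only from its name and the `min_rows` parameter, it presumably removes grains that have fewer than `min_rows` rows; that reading is an assumption, and the sketch below illustrates it in plain Python rather than reproducing the transform. The helper name `drop_short_grains` and the toy rows are hypothetical.

```python
from collections import Counter


def drop_short_grains(rows, grain_key, min_rows):
    """Keep only rows whose grain has at least `min_rows` rows
    (assumed semantics, see lead-in)."""
    counts = Counter(row[grain_key] for row in rows)
    return [row for row in rows if counts[row[grain_key]] >= min_rows]


# Toy rows (made up for illustration): three rows in one grain, two in another.
rows = [
    {"education": "0-5yrs", "age": 26.0},
    {"education": "6-11yrs", "age": 42.0},
    {"education": "0-5yrs", "age": 39.0},
    {"education": "0-5yrs", "age": 34.0},
    {"education": "6-11yrs", "age": 35.0},
]
kept = drop_short_grains(rows, "education", min_rows=3)
none_kept = drop_short_grains(rows, "education", min_rows=10)
print(len(kept), len(none_kept))
```

If that reading is right, a threshold larger than every grain would leave no rows at all.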
55 changes: 55 additions & 0 deletions
src/python/nimbusml/internal/core/timeseries/forecastingpivot.py
@@ -0,0 +1,55 @@
# --------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# --------------------------------------------------------------------------------------------
# - Generated by tools/entrypoint_compiler.py: do not edit by hand
"""
ForecastingPivot
"""

__all__ = ["ForecastingPivot"]


from ...entrypoints.transforms_forecastingpivot import \
    transforms_forecastingpivot
from ...utils.utils import trace
from ..base_pipeline_item import BasePipelineItem, DefaultSignature


class ForecastingPivot(BasePipelineItem, DefaultSignature):
    """
    **Description**
        Pivots the input columns and drops any rows with N/A

    :param columns_to_pivot: List of columns to pivot.

    :param horizon_column_name: Name of the horizon column generated.

    :param params: Additional arguments sent to compute engine.

    """

    @trace
    def __init__(
            self,
            columns_to_pivot,
            horizon_column_name='Horizon',
            **params):
        BasePipelineItem.__init__(
            self, type='transform', **params)

        self.columns_to_pivot = columns_to_pivot
        self.horizon_column_name = horizon_column_name

    @property
    def _entrypoint(self):
        return transforms_forecastingpivot

    @trace
    def _get_node(self, **all_args):
        algo_args = dict(
            columns_to_pivot=self.columns_to_pivot,
            horizon_column_name=self.horizon_column_name)

        all_args.update(algo_args)
        return self._entrypoint(**all_args)
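
The docstring above only says that the transform pivots the input columns and drops rows with N/A. One way to picture that, as a sketch under stated assumptions and not the library's actual behavior, is a column holding one value per horizon step that is unrolled into one output row per step, with missing entries skipped. The layout of the pivoted column, the 1-based horizon numbering, the helper name `forecasting_pivot`, and the toy rows are all hypothetical.

```python
import math


def forecasting_pivot(rows, column_to_pivot, horizon_column_name="Horizon"):
    """Unroll a per-horizon list column into one row per horizon step,
    skipping missing (None or NaN) entries (assumed semantics)."""
    pivoted = []
    for row in rows:
        for horizon, value in enumerate(row[column_to_pivot], start=1):
            if value is None or math.isnan(value):
                continue  # drop entries with no usable value
            # Copy the untouched columns, then add the scalar and its horizon.
            out = {k: v for k, v in row.items() if k != column_to_pivot}
            out[column_to_pivot] = value
            out[horizon_column_name] = horizon
            pivoted.append(out)
    return pivoted


# Toy rows (made up): 'age_1' holds one value per horizon step.
nan = float("nan")
rows = [
    {"age": 26.0, "age_1": [nan, nan]},
    {"age": 39.0, "age_1": [nan, 26.0]},
    {"age": 34.0, "age_1": [26.0, 32.5]},
]
pivoted = forecasting_pivot(rows, "age_1")
for out in pivoted:
    print(out)
```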
Better name for ShortDrop