Releases: tensorflow/tfx
TFX 1.0.0-rc0
Major Features and Improvements
- Added tfx.v1 Public APIs
Breaking Changes
For Pipeline Authors
- N/A
For Component Authors
- Apache Beam support is migrated from TFX Base Components and Executors to
dedicated Beam Components and Executors. BaseExecutor will no longer embed
beam_pipeline_args. Custom executors for Beam powered components should
now extend BaseBeamExecutor instead of BaseExecutor.
Deprecations
- N/A
Bug Fixes and Other Changes
- Removed six dependency.
- Depends on apache-beam[gcp]>=2.29,<3.
- Depends on tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3.
- Depends on tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3.
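
The TensorFlow constraint above combines a lower bound, an upper bound and several wildcard exclusions, which can be hard to read at a glance. The following is a minimal sketch of how such a specifier string could be evaluated for plain numeric versions. It is a simplified illustration, not pip's resolver: the helper names `parse_version` and `satisfies` are hypothetical, only the operators that appear in these notes (`>=`, `<`, `!=`) are handled, and pre-release or post-release versions are out of scope.

```python
# Specifier copied from the release notes above.
TF_SPEC = ">=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3"


def parse_version(text):
    """Parse a plain numeric version such as '2.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def _pad(left, right):
    """Zero-pad two version tuples to the same length so that 3 == 3.0."""
    width = max(len(left), len(right))
    return (left + (0,) * (width - len(left)),
            right + (0,) * (width - len(right)))


def satisfies(version, spec):
    """Return True if a numeric version meets every clause of the specifier."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            have, want = _pad(candidate, parse_version(clause[2:]))
            if have < want:
                return False
        elif clause.startswith("!="):
            target = clause[2:]
            if target.endswith(".*"):
                # Wildcard exclusion: reject any version with this prefix.
                prefix = parse_version(target[:-2])
                have, _ = _pad(candidate, prefix)
                if have[:len(prefix)] == prefix:
                    return False
            else:
                have, want = _pad(candidate, parse_version(target))
                if have == want:
                    return False
        elif clause.startswith("<") and not clause.startswith("<="):
            have, want = _pad(candidate, parse_version(clause[1:]))
            if have >= want:
                return False
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("2.4.1", TF_SPEC))  # inside the allowed range
print(satisfies("2.5.0", TF_SPEC))  # excluded by !=2.5.*
```

For real dependency resolution, a full PEP 440 implementation should be preferred over a sketch like this one.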
Documentation Updates
- N/A
TFX 0.30.0
Major Features and Improvements
- Upgraded TFX to KFP compiler to use KFP IR schema version 2.0.0.
- InfraValidator can now produce a SavedModel with warmup requests. This feature is
enabled by setting RequestSpec.make_warmup = True. The SavedModel will be
stored in the InfraBlessing artifact (blessing output of InfraValidator).
- Pusher's model input is now optional, and infra_blessing can be used
instead to push the SavedModel with warmup requests, produced by an
InfraValidator. Note that InfraValidator does not always create a SavedModel,
and the producer InfraValidator must be configured with
RequestSpec.make_warmup = True in order to be pushed by a Pusher.
- Support is added for the JSON_VALUE artifact property type, allowing storage
of JSON-compatible objects as artifact metadata.
- Support is added for the KFP v2 artifact metadata field when executing using
the KFP v2 container entrypoint.
- InfraValidator for Kubernetes now can override Pod manifest to customize
annotations and environment variables.
- Allow Beam pipeline args to be extended by specifying beam_pipeline_args
per component.
- Support string RuntimeParameters on Airflow.
- User code specified through the module_file argument for the Evaluator,
Transform, Trainer and Tuner components is now packaged as a pip wheel for
execution. For Evaluator and Transform, these wheel packages are now
installed on remote Apache Beam workers.
Breaking Changes
For Pipeline Authors
- CLI usage with kubeflow changed significantly. You MUST use the new
--build-image flag to build a container image when updating a pipeline with
the kubeflow engine.
  - The --build-target-image flag in CLI is changed to --build-image without
    any container image argument. TFX will auto detect the image specified in
    the KubeflowDagRunnerConfig class instance. For example:
        tfx pipeline create --pipeline-path=runner.py --endpoint=xxx --build-image
        tfx pipeline update --pipeline-path=runner.py --endpoint=xxx --build-image
  - The --package-path and --skaffold_cmd flags were deleted. The compiled path
    can be specified when creating a KubeflowDagRunner class instance. TFX CLI
    doesn't depend on skaffold any more and uses the Docker SDK directly.
- Default orchestration engine of CLI was changed to local orchestrator from
beam orchestrator. You can still use beam orchestrator with the --engine=beam
flag.
- Trainer now uses GenericExecutor as default. To use the previous Estimator
based Trainer, please set custom_executor_spec to trainer.executor.Executor.
- Changed the pattern spec supported for QueryBasedDriver:
  - @span_begin_timestamp: Start of span interval, Timestamp in seconds.
  - @span_end_timestamp: End of span interval, Timestamp in seconds.
  - @span_yyyymmdd_utc: STRING with format, e.g., '20180114', corresponding
    to the span interval begin in UTC.
- Removed the already deprecated compile() method on Kubeflow V2 Dag Runner.
- Removed project_id argument from KubeflowV2DagRunnerConfig which is not used
and meaningless if not used with GCP.
- Removed config from LocalDagRunner's constructor, and dropped pipeline proto
support from LocalDagRunner's run function.
- Removed input parameter in ExampleGen constructor and external_input in
dsl_utils, which were marked as deprecated in TFX 0.23.
- Changed the storage type of span and version custom property in Examples
artifact from string to int.
- ResolverStrategy.resolve_artifacts() method signature has changed to take
ml_metadata.MetadataStore object as the first argument.
- Artifacts param is deprecated/ignored in Channel constructor.
- Removed matching_channel_name from Channel's constructor.
- Deleted all usages of instance_name, which was deprecated in version 0.25.0.
Please use .with_id() method of components.
- Removed output channel overwrite functionality from all official components.
- Transform will use the native TF2 implementation of tf.transform unless TF2
behaviors are explicitly disabled. The previous behaviour can still be
obtained by setting force_tf_compat_v1=True.
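
To make the QueryBasedDriver placeholders listed above more concrete, here is a rough sketch of how a query pattern might be expanded for one span. It rests on assumptions that these notes do not state: that a span number counts whole days since the Unix epoch in UTC, that each span covers exactly one day, and that quoting of the date string is left to the pattern. The helper `expand_span_query` and the table name `events` are hypothetical and not part of the TFX API.

```python
import datetime

# Assumption: span N starts N whole days after the Unix epoch (UTC).
_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def expand_span_query(pattern, span):
    """Substitute the three span placeholders in a query pattern."""
    begin = _EPOCH + datetime.timedelta(days=span)
    end = begin + datetime.timedelta(days=1)  # assumed one-day span interval
    replacements = {
        "@span_begin_timestamp": str(int(begin.timestamp())),
        "@span_end_timestamp": str(int(end.timestamp())),
        "@span_yyyymmdd_utc": begin.strftime("%Y%m%d"),
    }
    for placeholder, value in replacements.items():
        pattern = pattern.replace(placeholder, value)
    return pattern


# Hypothetical query against a table named `events`.
query = expand_span_query(
    "SELECT * FROM events "
    "WHERE ts >= @span_begin_timestamp AND ts < @span_end_timestamp",
    span=17545,
)
print(query)
```

Under the stated assumption, span 17545 corresponds to 14 January 2018, the date used as the example in the notes.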
For Component Authors
- N/A
Deprecations
- RuntimeParameter usage for module_file and user-defined function paths is
marked experimental.
- LatestArtifactsResolver, LatestBlessedModelResolver, SpansResolver are
renamed to LatestArtifactStrategy, LatestBlessedModelStrategy,
SpanRangeStrategy respectively.
Bug Fixes and Other Changes
- GCP compute project in BigQuery Pusher executor can be specified.
- New extra dependencies for convenience.
  - tfx[airflow] installs all Apache Airflow orchestrator dependencies.
  - tfx[kfp] installs all Kubeflow Pipelines orchestrator dependencies.
  - tfx[tf-ranking] installs packages for TensorFlow Ranking.
    NOTE: TensorFlow Ranking is only compatible with TF >= 2.0.
- Depends on 'google-cloud-bigquery>=1.28.0,<3'. (This was already installed as
a transitive dependency from the first release of TFX.)
- Depends on google-cloud-aiplatform>=0.5.0,<0.8.
- Depends on ml-metadata>=0.30.0,<0.31.0.
- Depends on portpicker>=1.3.1,<2.
- Depends on struct2tensor>=0.30.0,<0.31.0.
- Depends on tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3.
- Depends on tensorflow-data-validation>=0.30.0,<0.31.0.
- Depends on tensorflow-model-analysis>=0.30.0,<0.31.0.
- Depends on tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.5.*,<3.
- Depends on tensorflow-transform>=0.30.0,<0.31.0.
- Depends on tfx-bsl>=0.30.0,<0.31.0.
Documentation Updates
- N/A
TFX 0.26.4
Major Features and Improvements
- N/A
Breaking changes
- N/A
For pipeline authors
- N/A
For component authors
- N/A
Deprecations
- N/A
Bug fixes and other changes
- Depends on apache-beam[gcp]>=2.25,!=2.26.*,<2.29.
- Depends on tensorflow-data-validation>=0.26.1,<0.27.
Documentation updates
- N/A
TFX 0.30.0-rc0
Major Features and Improvements
- Upgraded TFX to KFP compiler to use KFP IR schema version 2.0.0.
- InfraValidator can now produce a SavedModel with warmup requests. This feature is
enabled by setting RequestSpec.make_warmup = True. The SavedModel will be
stored in the InfraBlessing artifact (blessing output of InfraValidator).
- Pusher's model input is now optional, and infra_blessing can be used
instead to push the SavedModel with warmup requests, produced by an
InfraValidator. Note that InfraValidator does not always create a SavedModel,
and the producer InfraValidator must be configured with
RequestSpec.make_warmup = True in order to be pushed by a Pusher.
- Support is added for the JSON_VALUE artifact property type, allowing storage
of JSON-compatible objects as artifact metadata.
- Support is added for the KFP v2 artifact metadata field when executing using
the KFP v2 container entrypoint.
- InfraValidator for Kubernetes now can override Pod manifest to customize
annotations and environment variables.
- Allow Beam pipeline args to be extended by specifying beam_pipeline_args
per component.
- Support string RuntimeParameters on Airflow.
Breaking Changes
For Pipeline Authors
- CLI usage with kubeflow changed significantly. You MUST use the new
--build-image flag to build a container image when updating a pipeline with
the kubeflow engine.
  - The --build-target-image flag in CLI is changed to --build-image without
    any container image argument. TFX will auto detect the image specified in
    the KubeflowDagRunnerConfig class instance. For example:
        tfx pipeline create --pipeline-path=runner.py --endpoint=xxx --build-image
        tfx pipeline update --pipeline-path=runner.py --endpoint=xxx --build-image
  - The --package-path and --skaffold_cmd flags were deleted. The compiled path
    can be specified when creating a KubeflowDagRunner class instance. TFX CLI
    doesn't depend on skaffold any more and uses the Docker SDK directly.
- Default orchestration engine of CLI was changed to local orchestrator from
beam orchestrator. You can still use beam orchestrator with the --engine=beam
flag.
- Trainer now uses GenericExecutor as default. To use the previous Estimator
based Trainer, please set custom_executor_spec to trainer.executor.Executor.
- Changed the pattern spec supported for QueryBasedDriver:
  - @span_begin_timestamp: Start of span interval, Timestamp in seconds.
  - @span_end_timestamp: End of span interval, Timestamp in seconds.
  - @span_yyyymmdd_utc: STRING with format, e.g., '20180114', corresponding
    to the span interval begin in UTC.
- Removed the already deprecated compile() method on Kubeflow V2 Dag Runner.
- Removed project_id argument from KubeflowV2DagRunnerConfig which is not used
and meaningless if not used with GCP.
- Removed config from LocalDagRunner's constructor, and dropped pipeline proto
support from LocalDagRunner's run function.
- Removed input parameter in ExampleGen constructor and external_input in
dsl_utils, which were marked as deprecated in TFX 0.23.
- Changed the storage type of span and version custom property in Examples
artifact from string to int.
- ResolverStrategy.resolve_artifacts() method signature has changed to take
ml_metadata.MetadataStore object as the first argument.
- Artifacts param is deprecated/ignored in Channel constructor.
- Removed matching_channel_name from Channel's constructor.
- Deleted all usages of instance_name, which was deprecated in version 0.25.0.
Please use .with_id() method of components.
- Removed output channel overwrite functionality from all official components.
For Component Authors
- N/A
Deprecations
- N/A
Bug Fixes and Other Changes
- New extra dependencies for convenience.
  - tfx[airflow] installs all Apache Airflow orchestrator dependencies.
  - tfx[kfp] installs all Kubeflow Pipelines orchestrator dependencies.
  - tfx[tf-ranking] installs packages for TensorFlow Ranking.
    NOTE: TensorFlow Ranking is only compatible with TF >= 2.0.
- Depends on 'google-cloud-bigquery>=1.6.0,<3'. (This was already installed as
a transitive dependency from the first release of TFX.)
- Depends on google-cloud-aiplatform>=0.5.0,<0.8.
- Depends on ml-metadata>=0.30.0,<0.31.0.
- Depends on portpicker>=1.3.1,<2.
- Depends on struct2tensor>=0.30.0,<0.31.0.
- Depends on tensorflow-data-validation>=0.30.0,<0.31.0.
- Depends on tensorflow-model-analysis>=0.30.0,<0.31.0.
- Depends on tensorflow-transform>=0.30.0,<0.31.0.
- Depends on tfx-bsl>=0.30.0,<0.31.0.
Documentation Updates
- N/A
TFX 0.29.0
Major Features and Improvements
- Added a simple query based driver that supports Span spec and static_range.
- Added e2e rolling window example/test for Span Resolver.
- Performance improvement in Transform by avoiding excess encodings and
decodings when it materializes transformed examples or generates statistics
(both enabled by default).
- Added an accessor (.data_view_decode_fn) to the decoder function wrapped in
the DataView in TrainerFnArgs.data_accessor.
Breaking Changes
- Starting in this version, following artifacts will be stored in new format,
but artifacts produced by older versions can be read in a backwards
compatible way:
  - Change split sub-folder format to 'Split-<split_name>', this applies to
    all artifacts that contain splits. Old format '<split_name>' can still
    be loaded by TFX.
  - Change Model artifact's sub-folder name to 'Format-TFMA' for eval model
    and 'Format-Serving' for serving model. Old Model artifact format
    ('eval_model_dir'/'serving_model_dir') can still be loaded by TFX.
  - Change ExampleStatistics artifact payload to binary proto
    FeatureStats.pb file. Old payload format (tfrecord stats_tfrecord file)
    can still be loaded by TFX.
  - Change ExampleAnomalies artifact payload to binary proto SchemaDiff.pb
    file. Old payload format (text proto anomalies.pbtxt file) is deprecated
    as TFX doesn't have downstream components that take ExampleAnomalies
    artifact.
For Pipeline Authors
- CLI requires Apache Airflow 1.10.14 or later. If you are using an older
version of airflow, you can still copy runner definition to the DAG
directory manually and run using airflow UIs.
For Component Authors
- N/A
Deprecations
- Deprecated input/output compatibility aliases for Transform and
StatisticsGen.
Bug Fixes and Other Changes
- The tfx_version custom property of output artifacts is now set by the
default publisher to the TFX SDK version.
- Depends on absl-py>=0.9,<0.13.
- Depends on kfp-pipeline-spec>=0.1.7,<0.2.
- Depends on ml-metadata>=0.29.0,<0.30.0.
- Depends on packaging>=20,<21.
- Depends on struct2tensor>=0.29.0,<0.30.0.
- Depends on tensorflow-data-validation>=0.29.0,<0.30.0.
- Depends on tensorflow-model-analysis>=0.29.0,<0.30.0.
- Depends on tensorflow-transform>=0.29.0,<0.30.0.
- Depends on tfx-bsl>=0.29.0,<0.30.0.
Documentation Updates
- Simplified Apache Spark and Flink example deployment scripts by using Beam's
SparkRunner and FlinkRunner classes.
- Upgraded example Apache Flink deployment to Flink 1.12.1.
- Upgraded example Apache Spark deployment to Spark 2.4.7.
- Added the "TFX Python function component" notebook tutorial.
TFX 0.29.0-rc0
Major Features and Improvements
- Added a simple query based driver that supports Span spec and static_range.
- Added e2e rolling window example/test for Span Resolver.
- Performance improvement in Transform by avoiding excess encodings and
decodings when it materializes transformed examples or generates statistics
(both enabled by default).
- Added an accessor (.data_view_decode_fn) to the decoder function wrapped in
the DataView in TrainerFnArgs.data_accessor.
Breaking Changes
- Starting in this version, following artifacts will be stored in new format,
but artifacts produced by older versions can be read in a backwards
compatible way:
  - Change split sub-folder format to 'Split-<split_name>', this applies to
    all artifacts that contain splits. Old format '<split_name>' can still
    be loaded by TFX.
  - Change Model artifact's sub-folder name to 'Format-TFMA' for eval model
    and 'Format-Serving' for serving model. Old Model artifact format
    ('eval_model_dir'/'serving_model_dir') can still be loaded by TFX.
  - Change ExampleStatistics artifact payload to binary proto
    FeatureStats.pb file. Old payload format (tfrecord stats_tfrecord file)
    can still be loaded by TFX.
  - Change ExampleAnomalies artifact payload to binary proto SchemaDiff.pb
    file. Old payload format (text proto anomalies.pbtxt file) is deprecated
    as TFX doesn't have downstream components that take ExampleAnomalies
    artifact.
For Pipeline Authors
- CLI requires Apache Airflow 1.10.14 or later. If you are using an older
version of airflow, you can still copy runner definition to the DAG
directory manually and run using airflow UIs.
For Component Authors
- N/A
Deprecations
- Deprecated input/output compatibility aliases for Transform and
StatisticsGen.
Bug Fixes and Other Changes
- The tfx_version custom property of output artifacts is now set by the
default publisher to the TFX SDK version.
- Depends on absl-py>=0.9,<0.13.
- Depends on kfp-pipeline-spec>=0.1.7,<0.2.
- Depends on ml-metadata>=0.29.0,<0.30.0.
- Depends on packaging>=20,<21.
- Depends on struct2tensor>=0.29.0,<0.30.0.
- Depends on tensorflow-data-validation>=0.29.0,<0.30.0.
- Depends on tensorflow-model-analysis>=0.29.0,<0.30.0.
- Depends on tensorflow-transform>=0.29.0,<0.30.0.
- Depends on tfx-bsl>=0.29.0,<0.30.0.
Documentation Updates
- Simplified Apache Spark and Flink example deployment scripts by using Beam's
SparkRunner and FlinkRunner classes.
- Upgraded example Apache Flink deployment to Flink 1.12.1.
- Upgraded example Apache Spark deployment to Spark 2.4.7.
- Added the "TFX Python function component" notebook tutorial.
TFX 0.26.3
Major Features and Improvements
- N/A
Breaking changes
- N/A
For pipeline authors
- N/A
For component authors
- N/A
Deprecations
- N/A
Bug fixes and other changes
- Automatic autoreload of underlying modules a single _ModuleFinder
registered per module.
Documentation updates
- N/A
TFX 0.28.0
Major Features and Improvements
- The publicly released TFX docker image in tensorflow/tfx will use
GPU-compatible TensorFlow base images from Deep Learning Containers. This
allows these images to be used with GPU out of the box.
- Added an example pipeline for a ranking model (using tensorflow_ranking) at
tfx/examples/ranking. More documentation will be available in future
releases.
- Added a spans_resolver that can resolve spans based on range_config.
Breaking Changes
For Pipeline Authors
- Custom arg key in google_cloud_ai_platform.tuner.executor is renamed to
ai_platform_tuning_args from ai_platform_training_args, to better
distinguish usage with Trainer.
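
Pipelines that still pass the old key to the Cloud AI Platform Tuner need a small update. The sketch below shows one way to migrate an existing config, assuming the custom arguments are held in a plain dict. The helper `migrate_tuner_custom_config` and the `project` entry are hypothetical and only illustrate the key rename. The helper should be applied to the Tuner's config only, since the notes say the rename is meant to distinguish it from Trainer usage.

```python
OLD_KEY = "ai_platform_training_args"
NEW_KEY = "ai_platform_tuning_args"


def migrate_tuner_custom_config(custom_config):
    """Return a copy of a Tuner custom config dict that uses the renamed key."""
    migrated = dict(custom_config)  # shallow copy; the input is left untouched
    if OLD_KEY in migrated:
        if NEW_KEY in migrated:
            # Refuse to guess which of the two values should win.
            raise ValueError(f"both {OLD_KEY!r} and {NEW_KEY!r} are set")
        migrated[NEW_KEY] = migrated.pop(OLD_KEY)
    return migrated


# Hypothetical payload; real tuning args depend on the deployment.
old_config = {"ai_platform_training_args": {"project": "my-project"}}
print(migrate_tuner_custom_config(old_config))
```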
For component authors
- N/A
Deprecations
- Deprecated input/output compatibility aliases for Transform and SchemaGen.
Bug Fixes and Other Changes
- Change Bigquery ML Pusher to publish the model to the user specified project
instead of the default project from run time context.
- Depends on apache-beam[gcp]>=2.28,<3.
- Depends on ml-metadata>=0.28.0,<0.29.0.
- Depends on kfp-pipeline-spec>=0.1.6,<0.2.
- Depends on struct2tensor>=0.28.0,<0.29.0.
- Depends on tensorflow-data-validation>=0.28.0,<0.29.0.
- Depends on tensorflow-model-analysis>=0.28.0,<0.29.0.
- Depends on tensorflow-transform>=0.28.0,<0.29.0.
- Depends on tfx-bsl>=0.28.1,<0.29.0.
Documentation Updates
- Published a migration instruction for legacy custom launcher developers.
TFX 0.27.0
Major Features and Improvements
- Supports different types of quantizations on TFLite conversion using
TFLITE_REWRITER by setting quantization_optimizations,
quantization_supported_types and quantization_enable_full_integer. Flag
definitions can be found here: Post-training quantization.
- Added automatic population of tfdv.StatsOptions.vocab_paths when computing
statistics within the Transform component.
Breaking changes
For pipeline authors
- enable_quantization from TFLITE_REWRITER is removed and setting
quantization_optimizations = [tf.lite.Optimize.DEFAULT] will perform the
same type of quantization, dynamic range quantization. Users of the
TFLITE_REWRITER who do not enable quantization should be unaffected.
- Default value for infer_feature_shape for SchemaGen changed from False to
True, as indicated in previous release log. The inferred schema might
change if you do not specify infer_feature_shape. It might lead to
changes of the type of input features in Transform and Trainer code.
For component authors
- N/A
Deprecations
- Pipeline information is no longer stored on the local filesystem when using
Kubeflow Pipelines orchestration with CLI. Instead, CLI will always use the
latest version of the pipeline in the Kubeflow Pipeline cluster. All
operations will be executed based on the information on the Kubeflow
Pipeline cluster. There might be some leftover files in ${HOME}/tfx/kubeflow
or ${HOME}/kubeflow, but those will not be used any more.
- The tfx.components.common_nodes.importer_node.ImporterNode class has been
moved to tfx.dsl.components.common.importer.Importer, with its
old module path kept as a deprecated alias, which will be removed in a
future version.
- The tfx.components.common_nodes.resolver_node.ResolverNode class has been
moved to tfx.dsl.components.common.resolver.Resolver, with its
old module path kept as a deprecated alias, which will be removed in a
future version.
- The tfx.dsl.resolvers.BaseResolver class has been
moved to tfx.dsl.components.common.resolver.ResolverStrategy, with its
old module path kept as a deprecated alias, which will be removed in a
future version.
- Deprecated input/output compatibility aliases for ExampleValidator,
Evaluator, Trainer and Pusher.
Bug fixes and other changes
- InfraValidator supports using alternative TensorFlow Serving image in case
deployed environment cannot reach the public internet (nor the docker hub).
Such alternative image should behave the same as official tensorflow/serving
image such as the same model volume path, serving port, etc.
- Executor in tfx.extensions.google_cloud_ai_platform.pusher.executor
supported regional endpoint and machine_type.
- Starting from this version, proto files which are used to generate
component-level configs are included in the tfx package directly.
- The tfx.dsl.io.fileio.NotFoundError exception unifies handling of not-found
errors across different filesystem plugin backends.
- Fixes the serialization of zero-valued default when using RuntimeParameter
on Kubeflow.
- Depends on apache-beam[gcp]>=2.27,<3.
- Depends on ml-metadata>=0.27.0,<0.28.0.
- Depends on numpy>=1.16,<1.20.
- Depends on pyarrow>=1,<3.
- Depends on kfp-pipeline-spec>=0.1.5,<0.2 in test and image.
- Depends on tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,<3.
- Depends on tensorflow-data-validation>=0.27.0,<0.28.0.
- Depends on tensorflow-model-analysis>=0.27.0,<0.28.0.
- Depends on tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,<3.
- Depends on tensorflow-transform>=0.27.0,<0.28.0.
- Depends on tfx-bsl>=0.27.0,<0.28.0.
Documentation updates
- N/A