System information
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04.5 LTS
TensorFlow Model Analysis installed from (source or binary): binary
TensorFlow Model Analysis version (use command below): 0.40.0
Python version: 3.8.10
Describe the problem
While TFMA is updating a Beam metric, an error occurs because the value has type numpy.int64 rather than int. The error log is below; the error occurs when running evaluation with the padding option (tf-ranking metrics). It appears to be raised while obtaining batch_size from the metric called num_instances.
Source code / logs
When I logged the metric, the value turned out to be a numpy int, as shown below.
Dataflow error log
Error message from worker: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 292, in _execute
    response = task()
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 365, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 624, in do_instruction
    return getattr(self, request_type)(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 663, in process_bundle
    monitoring_infos = bundle_processor.monitoring_infos()
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/bundle_processor.py", line 1198, in monitoring_infos
    op.monitoring_infos(transform_id, dict(tag_to_pcollection_id)))
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/operations.py", line 543, in monitoring_infos
    all_monitoring_infos.update(self.user_monitoring_infos(transform_id))
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/operations.py", line 584, in user_monitoring_infos
    return self.metrics_container.to_runner_api_monitoring_infos(transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/execution.py", line 309, in to_runner_api_monitoring_infos
    all_metrics = [
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/execution.py", line 310, in <listcomp>
    cell.to_runner_api_monitoring_info(key.metric_name, transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/cells.py", line 76, in to_runner_api_monitoring_info
    mi = self.to_runner_api_monitoring_info_impl(name, transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/cells.py", line 150, in to_runner_api_monitoring_info_impl
    return monitoring_infos.int64_user_counter(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/monitoring_infos.py", line 185, in int64_user_counter
    return create_monitoring_info(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/monitoring_infos.py", line 302, in create_monitoring_info
    return metrics_pb2.MonitoringInfo(
TypeError: 3367 has type numpy.int64, but expected one of: bytes
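For context, the failure appears to come down to numpy integer scalars not being instances of the built-in int, so (judging from the traceback) the counter value is never encoded to bytes before it reaches the MonitoringInfo proto. A minimal illustration in plain Python, no Beam required:

    import numpy as np

    value = np.int64(3367)

    # numpy integer scalars are not built-in ints in Python 3, so any code
    # path that special-cases `int` (such as encoding a counter value to
    # bytes before building the MonitoringInfo proto) will skip them and
    # hand the raw numpy scalar to protobuf, which raises the TypeError.
    print(isinstance(value, int))       # False
    print(isinstance(int(value), int))  # True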
Could you please provide a minimal reproducible code sample so we can reproduce the error at our end?
Please refer to Tensorflow Model Analysis Metrics and Plots for TFMA-supported metrics and ranking-based metrics.
Thank you!
@zzing0907 @singhniraj08 I am also facing a similar issue. I find that this error comes and goes, so it is not easily reproducible; however, it occurs frequently enough to be concerning. I would also like to point out that the monitoring metric referred to by the error message is not the same thing as the evaluation metrics (in the data-science sense) that TFMA computes. The monitoring metrics are created by Apache Beam, most likely to track worker progress. So I wonder whether this is actually an issue with Apache Beam. I have filed a similar issue with the Apache Beam team here: apache/beam#27469
Coming at it from the Beam metrics side, it looks like numpy.int64 values are being passed to the counter improperly somewhere. Those counters should only receive ints, as that is the only type that the Beam code will encode before passing the value to a protobuf message to be reported. I provided a little context on apache/beam#27469. If you can find where the metric is getting numpy.int64 values and convert them to Python ints in the call, that should resolve it.
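For anyone hitting this before a fix lands, here is a rough sketch of the conversion described above. This is not the actual TFMA code: the DoFn, the counter namespace, and the way batch_size is computed are all made up for illustration; the point is only the int() cast before counter.inc().

    import numpy as np
    import apache_beam as beam
    from apache_beam.metrics import Metrics


    class CountInstances(beam.DoFn):
      """Hypothetical DoFn showing the workaround; not TFMA's real code."""

      def __init__(self):
        self._num_instances = Metrics.counter('tfma', 'num_instances')

      def process(self, batch):
        # In the failing case the size arrives as a numpy scalar (e.g. the
        # result of an array reduction) rather than a built-in int.
        batch_size = np.int64(len(batch))  # stand-in for the real computation
        # Workaround: cast to a plain Python int before handing the value to
        # the Beam counter, so the monitoring info can be serialized.
        self._num_instances.inc(int(batch_size))
        yield batch

The same int(...) cast works wherever the numpy value originates; the important part is that nothing numpy-typed ever reaches counter.inc().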