only integer values should be passed to num_instances metric. #171

zzing0907 · 2023-03-30T10:28:20Z

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04.5 LTS
TensorFlow Model Analysis installed from (source or binary): binary
TensorFlow Model Analysis version (use command below): 0.40.0
Python version: 3.8.10

Describe the problem

When TFMA reports a Beam metric, an error occurs because the value's type is numpy.int64 rather than int. The error log is shown below. The error occurs when running evaluation with the padding option (tf-ranking metrics), and it appears to be raised while obtaining batch_size from the metric called num_instances.

Source code / logs

When I logged the metric, its value was reported as a NumPy integer type.

Dataflow error log

Error message from worker: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 292, in _execute
    response = task()
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 365, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 624, in do_instruction
    return getattr(self, request_type)(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 663, in process_bundle
    monitoring_infos = bundle_processor.monitoring_infos()
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/bundle_processor.py", line 1198, in monitoring_infos
    op.monitoring_infos(transform_id, dict(tag_to_pcollection_id)))
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/operations.py", line 543, in monitoring_infos
    all_monitoring_infos.update(self.user_monitoring_infos(transform_id))
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/operations.py", line 584, in user_monitoring_infos
    return self.metrics_container.to_runner_api_monitoring_infos(transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/execution.py", line 309, in to_runner_api_monitoring_infos
    all_metrics = [
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/execution.py", line 310, in <listcomp>
    cell.to_runner_api_monitoring_info(key.metric_name, transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/cells.py", line 76, in to_runner_api_monitoring_info
    mi = self.to_runner_api_monitoring_info_impl(name, transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/cells.py", line 150, in to_runner_api_monitoring_info_impl
    return monitoring_infos.int64_user_counter(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/monitoring_infos.py", line 185, in int64_user_counter
    return create_monitoring_info(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/monitoring_infos.py", line 302, in create_monitoring_info
    return metrics_pb2.MonitoringInfo(
TypeError: 3367 has type numpy.int64, but expected one of: bytes

singhniraj08 · 2023-04-03T07:24:47Z

@zzing0907,

Could you please provide minimal reproducible code so that we can reproduce the error on our end?
Please refer to "TensorFlow Model Analysis Metrics and Plots" for TFMA-supported metrics and ranking-based metrics.
Thank you!

EdwardCuiPeacock · 2023-07-12T14:31:36Z

@zzing0907 @singhniraj08 I am facing a similar issue. The error comes and goes, so it is not reliably reproducible, but it occurs often enough to be concerning. I would also like to point out that the monitoring metric referred to in the error message is not the same thing as the evaluation metrics (in the data science sense) that TFMA computes. The monitoring metrics are created by Apache Beam, likely to track the progress of the workers. So I wonder whether this is actually an issue with Apache Beam. I have filed a similar issue with the Apache Beam team here: apache/beam#27469

jrmccluskey · 2023-07-17T20:18:16Z

Coming at it from the Beam metric side, it looks like numpy.int64 values are being passed to the counter improperly somewhere. Those counters should only receive ints, as that is the only type that the Beam code will encode before passing it to a protocol buffer to be reported. I provided a little context on apache/beam#27469. If you can find where the metric is getting numpy.int64 values and convert them to Python ints in the call, that should resolve it.
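As a rough illustration of the conversion suggested above, here is a minimal sketch that coerces an integer-like value to a built-in `int` before it reaches a counter. Everything in it is hypothetical: `to_python_int`, `FakeNumpyInt64`, and `StrictCounter` are stand-ins written for this example and are not part of TFMA or Beam. `FakeNumpyInt64` only mimics the fact that NumPy integer scalars implement `__index__`, so the sketch runs without NumPy installed.

```python
import operator


def to_python_int(value):
    """Return `value` as a built-in int.

    operator.index() accepts anything implementing __index__ (built-in ints,
    NumPy integer scalars) and raises TypeError for floats and strings.
    """
    return operator.index(value)


class FakeNumpyInt64:
    """Hypothetical stand-in for numpy.int64 (integer-like, but not an int)."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


class StrictCounter:
    """Hypothetical stand-in for a counter that only accepts built-in ints."""

    def __init__(self):
        self.total = 0

    def inc(self, n=1):
        if type(n) is not int:
            raise TypeError(f"value has type {type(n).__name__}, expected int")
        self.total += n


# Assumed scenario: a batch size arrives as a NumPy-style integer.
batch_size = FakeNumpyInt64(3367)

counter = StrictCounter()
counter.inc(to_python_int(batch_size))  # convert at the call site
print(counter.total)
```

In a real pipeline the same idea would presumably be applied wherever the batch size is handed to the Beam counter's `inc` call, but where exactly that happens in TFMA is not established in this thread.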

singhniraj08 self-assigned this Mar 31, 2023

singhniraj08 added the type:bug label Mar 31, 2023

singhniraj08 added the stat:awaiting response label Apr 3, 2023

EdwardCuiPeacock mentioned this issue Jul 12, 2023

[Bug]: 16177 has type numpy.int64, but expected one of: bytes apache/beam#27469

singhniraj08 assigned mdreves and unassigned singhniraj08 Jul 17, 2023

singhniraj08 added stat:awaiting tensorflower and removed stat:awaiting response labels Jul 17, 2023
