only integer values should be passed to num_instances metric. #171

Open
zzing0907 opened this issue Mar 30, 2023 · 3 comments

Comments

zzing0907 commented Mar 30, 2023

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04.5 LTS
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.40.0
  • Python version: 3.8.10

Describe the problem

When TFMA reports a Beam metric, an error occurs because the value's type is numpy.int64 rather than int. The error log is shown below. The error occurs when running evaluation with the padding option (tf-ranking metrics). It appears to be raised while obtaining batch_size for the metric called num_instances.

Source code / logs

  • When I logged the metric, its value was a numpy int, as shown below:

image

  • Dataflow error log
Error message from worker: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 292, in _execute
    response = task()
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 365, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 624, in do_instruction
    return getattr(self, request_type)(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 663, in process_bundle
    monitoring_infos = bundle_processor.monitoring_infos()
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/bundle_processor.py", line 1198, in monitoring_infos
    op.monitoring_infos(transform_id, dict(tag_to_pcollection_id)))
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/operations.py", line 543, in monitoring_infos
    all_monitoring_infos.update(self.user_monitoring_infos(transform_id))
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/runners/worker/operations.py", line 584, in user_monitoring_infos
    return self.metrics_container.to_runner_api_monitoring_infos(transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/execution.py", line 309, in to_runner_api_monitoring_infos
    all_metrics = [
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/execution.py", line 310, in <listcomp>
    cell.to_runner_api_monitoring_info(key.metric_name, transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/cells.py", line 76, in to_runner_api_monitoring_info
    mi = self.to_runner_api_monitoring_info_impl(name, transform_id)
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/cells.py", line 150, in to_runner_api_monitoring_info_impl
    return monitoring_infos.int64_user_counter(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/monitoring_infos.py", line 185, in int64_user_counter
    return create_monitoring_info(
  File "/usr/local/lib/python3.8/dist-packages/apache_beam/metrics/monitoring_infos.py", line 302, in create_monitoring_info
    return metrics_pb2.MonitoringInfo(
TypeError: 3367 has type numpy.int64, but expected one of: bytes  

singhniraj08 self-assigned this Mar 31, 2023

@singhniraj08

@zzing0907,

Could you please provide a minimal code sample that reproduces the error on our end?
Please refer to TensorFlow Model Analysis Metrics and Plots for TFMA-supported metrics and Ranking based metrics.
Thank you!

EdwardCuiPeacock commented Jul 12, 2023

@zzing0907 @singhniraj08 I am also facing a similar issue. The error comes and goes, so it is not reliably reproducible, but it occurs frequently enough to be concerning. I would also like to point out that the monitoring metric referred to by the error message is not the same as the evaluation metrics (in the data science sense) that TFMA refers to. The monitoring metrics are created by Apache Beam, likely to track the progress of the workers. So I wonder whether this is actually an issue with Apache Beam. I have filed a similar issue with the Apache Beam team here: apache/beam#27469

@jrmccluskey

Coming at it from the Beam metric side, it looks like numpy.int64 values are being passed to the counter improperly somewhere. Those counters should only receive ints, as that is the only type that the Beam code will encode before passing it to a protobuf message to be reported. I provided a little context on apache/beam#27469. If you can find where the metric is getting numpy.int64 values and convert them to Python ints in the call, that should resolve it.
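
As a rough illustration of the conversion suggested above, here is a minimal sketch of a coercion helper. The names `to_python_int` and `FakeInt64` are hypothetical and not part of TFMA or Beam; `FakeInt64` only stands in for numpy.int64 so the snippet runs without NumPy. Where the counter is actually updated in TFMA has not been identified in this thread, so the usage in the trailing comment is an assumption.

```python
import operator


def to_python_int(value):
    """Coerce an integer-like value (e.g. numpy.int64) to a built-in int.

    operator.index() accepts any object implementing __index__ and raises
    TypeError for floats or strings, so non-integer values are rejected
    instead of being silently truncated.
    """
    return int(operator.index(value))


class FakeInt64:
    """Hypothetical stand-in for numpy.int64 (it also implements __index__)."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


batch_size = to_python_int(FakeInt64(3367))
assert type(batch_size) is int

# Assumed usage at the call site that updates the Beam counter:
#   counter = beam.metrics.Metrics.counter(namespace, "num_instances")
#   counter.inc(to_python_int(batch_size))
```

Using `operator.index` rather than a bare `int(...)` is a design choice: it keeps the "only integer values" contract from the issue title, since a float batch size would raise instead of being rounded down.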
