ToolCorrectnessMetric raises ZeroDivisionError when expected_tools is empty #1234

j-mesnil · 2024-12-18T10:24:42Z

Describe the bug
The ToolCorrectnessMetric raises a ZeroDivisionError when measure() is called with an LLMTestCase whose expected_tools is empty. In my use case, I want to check that the LLM calls a tool only when necessary, so some test cases legitimately have an empty expected_tools list.

To Reproduce

Run this minimal example:

from deepeval.metrics import ToolCorrectnessMetric
from deepeval.test_case import LLMTestCase

metric = ToolCorrectnessMetric()
test_case = LLMTestCase(
    input="What is an elephant?",
    actual_output="(...)",
    tools_called=[],
    expected_tools=[]
)

metric.measure(test_case)
print(metric.score)

metric.measure(test_case) raises ZeroDivisionError: division by zero. The cause is the line score = len(used_expected_tools) / len(self.expected_tools) in _calculate_score() of the ToolCorrectnessMetric, which divides by zero when expected_tools is empty.

Expected behavior
I expected a perfect score of 1 for the minimal example, which represents a test case where no tool call was executed and no tool call was expected.
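
To illustrate the behavior I have in mind, here is a minimal sketch of a guarded score calculation. It is a standalone function and not deepeval's actual _calculate_score(): the name tool_correctness_score and the representation of tools as plain strings are assumptions made for the example. Returning 0 when tools were called although none were expected is also only one possible choice.

```python
from typing import List


def tool_correctness_score(tools_called: List[str], expected_tools: List[str]) -> float:
    """Hypothetical helper mirroring the ratio from the report, with an empty-list guard."""
    if not expected_tools:
        # Nothing was expected: a perfect score only if nothing was called.
        # Scoring 0.0 for unexpected calls is an assumption, not library behavior.
        return 1.0 if not tools_called else 0.0

    # Same ratio as in the report: expected tools that were actually called,
    # divided by the number of expected tools.
    used_expected_tools = [tool for tool in expected_tools if tool in tools_called]
    return len(used_expected_tools) / len(expected_tools)


# The case from the minimal example: no tool expected, no tool called.
print(tool_correctness_score([], []))  # 1.0
```

With this guard, the ratio is computed only when there is at least one expected tool, so the division can no longer fail.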

Software versions:

OS: macOS 15.2 (24C101)
Python 3.12
deepeval 2.0.5

Additional context
This can be mitigated by using should_exact_match=True, but I did not find this workaround to be documented.
