Calculate DER for specified speaker identification mapping #69

Open
swm35 opened this issue Sep 11, 2024 · 0 comments

swm35 commented Sep 11, 2024

Description

Is there a way to use DiarizationErrorRate() from pyannote.metrics to calculate errors against specifically identified speakers, rather than using the built-in optimal (Hungarian) mapping or greedy mapping?

Most speaker diarization systems distinguish speakers with generic labels such as SPEAKER_00. If a speaker identification system runs on top of the diarization, it would be great to see how the DER is affected.

Example

Reference RTTM file:
SPEAKER AUDIOFILE1 1 1 5 ANN
SPEAKER AUDIOFILE1 1 6 3 BOB
SPEAKER AUDIOFILE1 1 8 2 ANN

Hypothesis RTTM file 1 from a speaker diarization system:
SPEAKER AUDIOFILE1 1 1 5 SPEAKER_00
SPEAKER AUDIOFILE1 1 6 3 SPEAKER_01
SPEAKER AUDIOFILE1 1 8 2 SPEAKER_00

DER for hypothesis RTTM file 1 is 20%, comprising 10% miss and 10% false alarm.

Hypothesis RTTM file 2 from speaker diarization followed by speaker identification:
SPEAKER AUDIOFILE1 1 1 5 BOB
SPEAKER AUDIOFILE1 1 6 3 ANN
SPEAKER AUDIOFILE1 1 8 2 BOB

DER for hypothesis RTTM file 2 is still 20%, comprising 10% miss and 10% false alarm. Because the metric remaps hypothesis labels to reference labels before scoring, it does not factor in the speaker identification error from the wrongly identified speakers.
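
To make the request concrete, here is a minimal pure-Python sketch of a DER computed with a fixed, user-supplied label mapping instead of an optimal or greedy one. It is not pyannote.metrics code: the function name `fixed_mapping_der` and the `(onset, duration, label)` tuples are assumptions made for illustration, with onset and duration read from the abbreviated RTTM lines above, and no collar is applied. pyannote.metrics also provides an `IdentificationErrorRate` class in `pyannote.metrics.identification`, which appears to score labels as given without a mapping step, so that may be worth checking against the library documentation first.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (onset, duration, speaker label), in seconds.
Segment = Tuple[float, float, str]


def fixed_mapping_der(
    reference: List[Segment],
    hypothesis: List[Segment],
    mapping: Optional[Dict[str, str]] = None,
) -> Dict[str, float]:
    """DER components with a fixed label mapping (no optimal/greedy search).

    Hypothesis labels missing from `mapping` are compared as they are.
    Assumption: no collar, and overlapped speech is scored.
    """
    mapping = mapping or {}
    hyp = [(on, dur, mapping.get(spk, spk)) for on, dur, spk in hypothesis]

    # Every segment boundary; the active speaker sets are constant in between.
    bounds = sorted({t for on, dur, _ in reference + hyp for t in (on, on + dur)})

    out = {"total": 0.0, "miss": 0.0, "false alarm": 0.0, "confusion": 0.0}
    for start, end in zip(bounds, bounds[1:]):
        mid = (start + end) / 2
        ref_spk = {s for on, dur, s in reference if on <= mid < on + dur}
        hyp_spk = {s for on, dur, s in hyp if on <= mid < on + dur}
        n_ref, n_hyp = len(ref_spk), len(hyp_spk)
        n_correct = len(ref_spk & hyp_spk)  # labels must match exactly
        length = end - start
        out["total"] += n_ref * length
        out["miss"] += max(0, n_ref - n_hyp) * length
        out["false alarm"] += max(0, n_hyp - n_ref) * length
        out["confusion"] += (min(n_ref, n_hyp) - n_correct) * length

    errors = out["miss"] + out["false alarm"] + out["confusion"]
    out["der"] = errors / out["total"] if out["total"] else 0.0
    return out


reference = [(1, 5, "ANN"), (6, 3, "BOB"), (8, 2, "ANN")]
hypothesis_1 = [(1, 5, "SPEAKER_00"), (6, 3, "SPEAKER_01"), (8, 2, "SPEAKER_00")]
hypothesis_2 = [(1, 5, "BOB"), (6, 3, "ANN"), (8, 2, "BOB")]

# Hypothesis 2 already carries identified names, so no mapping is passed.
identified = fixed_mapping_der(reference, hypothesis_2)

# Hypothesis 1 scored with an explicitly chosen mapping.
relabelled = fixed_mapping_der(
    reference, hypothesis_1, {"SPEAKER_00": "ANN", "SPEAKER_01": "BOB"}
)
```

Taking the segments above at face value, this sketch attributes 8 s of confusion to hypothesis 2 out of 10 s of reference speech, because ANN and BOB are swapped everywhere except the 1 s overlap where both are active. It has no collar handling and the timings are identical, so it will not reproduce the miss and false alarm figures quoted above.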

Apologies if I am asking something obvious. I feel there must be an easy answer out there, but I have not found it. I am aware of the speaker-attributed word error rate (SAWER) and its variants, but not of any speaker-attributed DER metric.
