Evaluator Base Classes

Abstract base classes and trimming wrappers that define the evaluator architecture.

class pathbench.evaluator.LookupEvaluator[source]

Bases: ABC

Evaluator that maps utterance/speaker IDs to pre-computed scores. Needs only the utterance ID — no audio, transcription, or reference.

abstractmethod score(utterance_id)[source]
Return type:

Optional[float]

class pathbench.evaluator.ReferenceFreeEvaluator[source]

Bases: ABC

Utterance-level evaluator that needs only audio + segment bounds. No transcription, no reference audio, no language.

abstractmethod score(utterance_id, audio_path, start_time=0.0, end_time=-1.0)[source]
Return type:

Optional[float]
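As an illustration of the contract above, here is a minimal sketch of a reference-free subclass. The `ReferenceFreeEvaluator` class in the block is a local stand-in that mirrors the documented signature so the example runs without pathbench installed. `RmsEnergyEvaluator` is a hypothetical metric. Reading a negative `end_time` as "to the end of the file" is an assumption, not documented behavior.

```python
import math
import struct
import wave
from abc import ABC, abstractmethod
from typing import Optional


class ReferenceFreeEvaluator(ABC):
    """Local stand-in mirroring the documented interface (not the pathbench class)."""

    @abstractmethod
    def score(self, utterance_id: str, audio_path: str,
              start_time: float = 0.0, end_time: float = -1.0) -> Optional[float]:
        ...


class RmsEnergyEvaluator(ReferenceFreeEvaluator):
    """Hypothetical metric: RMS amplitude of a 16-bit mono WAV segment."""

    def score(self, utterance_id, audio_path, start_time=0.0, end_time=-1.0):
        try:
            with wave.open(audio_path, "rb") as wav:
                if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
                    return None  # this sketch only handles 16-bit mono PCM
                fs = wav.getframerate()
                first = int(start_time * fs)
                # Assumption: a negative end_time means "read to the end of the file".
                last = wav.getnframes() if end_time < 0 else int(end_time * fs)
                wav.setpos(min(first, wav.getnframes()))
                raw = wav.readframes(max(last - first, 0))
        except (OSError, wave.Error, EOFError):
            return None  # report failure as None, matching Optional[float]
        samples = struct.unpack(f"<{len(raw) // 2}h", raw)
        if not samples:
            return None
        return math.sqrt(sum(s * s for s in samples) / len(samples))
```

The evaluator receives only a path and segment bounds. It takes no transcription, reference audio, or language, which is what distinguishes this base class from the others.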

class pathbench.evaluator.ReferenceTxtEvaluator[source]

Bases: ABC

Utterance-level evaluator that needs transcription + language. Used for ASR-based metrics and forced-alignment (FA) trimming wrappers.

abstractmethod score(utterance_id, audio_path, transcription, language, start_time=0.0, end_time=-1.0)[source]
Return type:

Optional[float]

class pathbench.evaluator.ReferenceAudioEvaluator[source]

Bases: ABC

Utterance-level evaluator that needs reference audio files. No transcription or language required.

abstractmethod score(utterance_id, audio_path, reference_audios, start_time=0.0, end_time=-1.0)[source]
Return type:

Optional[float]

class pathbench.evaluator.ReferenceTxtAndAudioEvaluator[source]

Bases: ABC

Utterance-level evaluator that needs both transcription + language (for FA trimming) and reference audio files (for distance computation). Used for TrimmedNADEvaluator.

abstractmethod score(utterance_id, audio_path, transcription, language, reference_audios, start_time=0.0, end_time=-1.0)[source]
Return type:

Optional[float]

pathbench.evaluator.load_audio(audio_path, start_time=0.0, end_time=-1.0, cache=None)[source]

Load a single audio file, optionally using a cache.

Returns (audio_ndarray, fs), where fs is the sampling rate, or (None, None) on failure.

Return type:

Tuple[Optional[ndarray], Optional[int]]

pathbench.evaluator.load_audios(audio_files, cache=None)[source]

Load a list of (path, start, end) tuples into (ndarray, fs) pairs.

Used by script-level dispatch before calling _score_audio_list() on plain (non-trimmed) speaker evaluators. If cache is provided, results are looked up / stored there to avoid redundant disk reads.

Return type:

List[Tuple[ndarray, int]]
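The caching behavior described above can be sketched as follows. This is an illustration and not the pathbench implementation. `load_audios_sketch` and its injectable `loader` argument are hypothetical, plain lists stand in for ndarrays, the cache is assumed to be a dict keyed by the (path, start, end) tuple, and skipping failed loads is likewise an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Segment = Tuple[str, float, float]   # (path, start, end)
Audio = Tuple[List[float], int]      # (samples, fs); stands in for (ndarray, fs)
Loader = Callable[[str, float, float], Tuple[Optional[List[float]], Optional[int]]]


def load_audios_sketch(audio_files: List[Segment], loader: Loader,
                       cache: Optional[Dict[Segment, Audio]] = None) -> List[Audio]:
    """Load each segment, reusing cached results when a cache is given."""
    loaded: List[Audio] = []
    for key in audio_files:
        if cache is not None and key in cache:
            loaded.append(cache[key])   # cache hit: no disk read
            continue
        samples, fs = loader(*key)
        if samples is None or fs is None:
            continue                    # assumption: failed loads are skipped
        if cache is not None:
            cache[key] = (samples, fs)
        loaded.append((samples, fs))
    return loaded


# Usage with a fake loader that records every simulated disk read.
reads: List[str] = []

def fake_loader(path, start, end):
    reads.append(path)
    return (None, None) if path == "missing.wav" else ([0.0, 0.1], 16000)

cache: Dict[Segment, Audio] = {}
files = [("a.wav", 0.0, -1.0), ("missing.wav", 0.0, -1.0), ("a.wav", 0.0, -1.0)]
pairs = load_audios_sketch(files, fake_loader, cache)
```

In this sketch the second request for `a.wav` is served from the cache, so the fake loader is called once for that file.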

class pathbench.evaluator.ReferenceFreeSpeakerEvaluator[source]

Bases: ABC

Speaker-level evaluator that needs only audio files + segment bounds. No transcription, no language.

Callers load audio with load_audios() and pass the result to _score_audio_list(). The trimmed wrapper (TrimmedReferenceFreeSpeakerEvaluator) does the same after FA-trimming each utterance.

class pathbench.evaluator.LanguageAwareSpeakerEvaluator[source]

Bases: ABC

Speaker-level evaluator that needs audio + language. Language is required for acoustic model parameters (e.g. vowel formant tables), not only for FA trimming.

Callers load audio with load_audios() and pass the result to _score_audio_list(). The trimmed wrapper (TrimmedLanguageAwareSpeakerEvaluator) does the same after FA-trimming each utterance.

class pathbench.evaluator.TrimmedReferenceFreeEvaluator(inner, trimmer)[source]

Bases: ReferenceTxtEvaluator

Wraps a ReferenceFreeEvaluator with FA trimming.

The inner evaluator stays reference-free — it never sees transcription or language. This wrapper is a ReferenceTxtEvaluator because the trimmer needs transcription + language to perform forced alignment.

Delegation flow:
  1. Receive (audio_path, transcription, language, start_time, end_time).
  2. If no explicit segment is given, call trimmer.trim() → trimmed ndarray.
  3. Fall back to librosa.load() if trimming fails or a segment is specified.
  4. Call inner._score_audio(audio, fs) ← inner knows nothing about text.

score(utterance_id, audio_path, transcription, language, start_time=0.0, end_time=-1.0)[source]
Return type:

Optional[float]
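The delegation flow for this wrapper can be sketched as below. This is a hedged illustration and not the actual pathbench code. It assumes `trimmer.trim(audio_path, transcription, language)` returns an `(audio, fs)` pair or `None`, and that the default bounds `(0.0, -1.0)` mean "no explicit segment". It also replaces the `librosa.load()` fallback with an injectable `load_fn` so the example has no third-party dependencies. All class and function names in the block are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Audio = Tuple[List[float], int]  # (samples, fs); stands in for (ndarray, fs)


class TrimmedReferenceFreeSketch:
    """Illustrative wrapper only; not the pathbench implementation."""

    def __init__(self, inner, trimmer,
                 load_fn: Callable[[str, float, float], Optional[Audio]]):
        self.inner = inner      # exposes _score_audio(audio, fs)
        self.trimmer = trimmer  # assumed: trim(audio_path, transcription, language)
        self.load_fn = load_fn  # plays the role of the librosa.load() fallback

    def score(self, utterance_id, audio_path, transcription, language,
              start_time=0.0, end_time=-1.0) -> Optional[float]:
        # Assumption: the default bounds mean no explicit segment was requested.
        explicit_segment = start_time > 0.0 or end_time >= 0.0
        result = None
        if not explicit_segment:
            result = self.trimmer.trim(audio_path, transcription, language)
        if result is None:  # trimming failed or a segment was specified
            result = self.load_fn(audio_path, start_time, end_time)
        if result is None:
            return None
        audio, fs = result
        # The inner evaluator never sees transcription or language.
        return self.inner._score_audio(audio, fs)


class DurationInner:
    def _score_audio(self, audio, fs):
        return len(audio) / fs            # toy score: duration in seconds


class FixedTrimmer:
    def trim(self, audio_path, transcription, language):
        return ([0.0] * 8000, 16000)      # pretend alignment kept 0.5 s


def load_whole_file(audio_path, start_time, end_time):
    return ([0.0] * 32000, 16000)         # pretend the file is 2.0 s long


wrapper = TrimmedReferenceFreeSketch(DurationInner(), FixedTrimmer(), load_whole_file)
trimmed_score = wrapper.score("utt1", "a.wav", "hello world", "en")
segment_score = wrapper.score("utt1", "a.wav", "hello world", "en", 0.0, 2.0)
```

With these fakes, the first call scores the trimmed audio and the second bypasses the trimmer because a segment is given.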

class pathbench.evaluator.TrimmedReferenceFreeSpeakerEvaluator(inner, trimmer)[source]

Bases: object

Wraps a ReferenceFreeSpeakerEvaluator with FA trimming.

Trims each utterance in the speaker’s audio list, then delegates to inner._score_audio_list() with the trimmed audio arrays.

score(audio_files, transcriptions, language)[source]
Return type:

Optional[float]

class pathbench.evaluator.TrimmedLanguageAwareSpeakerEvaluator(inner, trimmer)[source]

Bases: object

Wraps a LanguageAwareSpeakerEvaluator with FA trimming.

Same delegation as TrimmedReferenceFreeSpeakerEvaluator but passes language through to inner._score_audio_list() since the inner evaluator uses language for its own computation (e.g. VSA vowel formant tables).

score(audio_files, transcriptions, language)[source]
Return type:

Optional[float]
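The difference between the two speaker-level wrappers can be sketched as follows. This is an illustration under several assumptions, not the pathbench code: `audio_files` is taken to be a list of (path, start, end) tuples parallel to `transcriptions`, `trimmer.trim(path, text, language)` is assumed to return an `(audio, fs)` pair or `None`, utterances whose trimming fails are skipped, and the argument order of `_score_audio_list(audios, language)` is a guess. All names ending in `Sketch` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Audio = Tuple[List[float], int]  # (samples, fs); stands in for (ndarray, fs)


def trim_all(trimmer, audio_files: Sequence[Tuple[str, float, float]],
             transcriptions: Sequence[str], language: str) -> List[Audio]:
    """FA-trim every utterance of one speaker; drop the ones that fail."""
    trimmed: List[Audio] = []
    for (path, _start, _end), text in zip(audio_files, transcriptions):
        result = trimmer.trim(path, text, language)
        if result is not None:
            trimmed.append(result)
    return trimmed


class TrimmedReferenceFreeSpeakerSketch:
    def __init__(self, inner, trimmer):
        self.inner, self.trimmer = inner, trimmer

    def score(self, audio_files, transcriptions, language) -> Optional[float]:
        audios = trim_all(self.trimmer, audio_files, transcriptions, language)
        # Language was needed only for trimming; the inner evaluator never sees it.
        return self.inner._score_audio_list(audios) if audios else None


class TrimmedLanguageAwareSpeakerSketch(TrimmedReferenceFreeSpeakerSketch):
    def score(self, audio_files, transcriptions, language) -> Optional[float]:
        audios = trim_all(self.trimmer, audio_files, transcriptions, language)
        # Language is passed through because the inner evaluator uses it itself.
        return self.inner._score_audio_list(audios, language) if audios else None
```

Both wrappers trim in the same way. Only the final delegation call differs.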

class pathbench.evaluator.Evaluator[source]

Bases: object

Deprecated. Kept for backward compatibility. Use the typed ABCs instead.

class pathbench.evaluator.SpeakerEvaluator[source]

Bases: object

Deprecated. Kept for backward compatibility. Use the typed ABCs instead.

class pathbench.evaluator.Utt2ScoreEvaluator(scores)[source]

Bases: LookupEvaluator

Maps utterance IDs to pre-computed scores.

score(utterance_id)[source]
Return type:

Optional[float]

class pathbench.evaluator.Spk2ScoreEvaluator(spk2score, utt2spk)[source]

Bases: LookupEvaluator

Maps utterance IDs → speaker IDs → pre-computed speaker scores.

score(utterance_id)[source]
Return type:

Optional[float]
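The two lookup evaluators can be sketched as plain dictionary lookups. The classes below are hypothetical stand-ins, not the pathbench implementations. They assume that `scores`, `spk2score`, and `utt2spk` are dicts and that an unknown ID yields `None`, which is consistent with the `Optional[float]` return type but not confirmed by the documentation.

```python
from typing import Dict, Optional


class Utt2ScoreSketch:
    """One hop: utterance ID -> pre-computed score."""

    def __init__(self, scores: Dict[str, float]):
        self.scores = scores

    def score(self, utterance_id: str) -> Optional[float]:
        return self.scores.get(utterance_id)  # None for unknown utterances


class Spk2ScoreSketch:
    """Two hops: utterance ID -> speaker ID -> pre-computed speaker score."""

    def __init__(self, spk2score: Dict[str, float], utt2spk: Dict[str, str]):
        self.spk2score = spk2score
        self.utt2spk = utt2spk

    def score(self, utterance_id: str) -> Optional[float]:
        speaker = self.utt2spk.get(utterance_id)
        if speaker is None:
            return None                        # utterance has no known speaker
        return self.spk2score.get(speaker)     # None if the speaker has no score


utt_eval = Utt2ScoreSketch({"utt1": 3.5})
spk_eval = Spk2ScoreSketch({"spkA": 2.0}, {"utt1": "spkA", "utt2": "spkA", "utt3": "spkB"})
```

In the speaker-level variant, every utterance from the same speaker receives the same score.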