Forced Alignment / VAD

Wav2Vec2-based forced alignment trimmer for silence removal.

class pathbench.vad.FATrimmer(model_id='facebook/wav2vec2-xlsr-53-espeak-cv-ft', use_exp=False)

Bases: object

A class to trim silence from audio using forced alignment.

MAX_CACHE_SIZE = 10000

trim(audio_path, transcription, language, start_time=0.0, end_time=-1.0)

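Trims silence from the beginning and end of an audio file by force-aligning the audio against the given transcription. Returns a tuple of (trimmed_audio_array, sample_rate); because the return type is Optional, callers should also handle a None result.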

Return type:

Optional[Tuple[ndarray, int]]
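
Because trim() may return None, callers need to check the result before unpacking it. The following is a minimal sketch of that pattern, not part of pathbench itself: trimmed_duration is a hypothetical helper, the language code "en" in the comment is an assumed value, and the duration calculation assumes the returned array is indexed by sample along its first axis.

```python
from typing import Optional


def trimmed_duration(trimmer, audio_path: str, transcription: str,
                     language: str) -> Optional[float]:
    """Return the trimmed clip length in seconds, or None if trimming failed.

    `trimmer` is any object exposing the documented
    trim(audio_path, transcription, language, ...) method,
    for example a pathbench.vad.FATrimmer instance.
    """
    result = trimmer.trim(audio_path, transcription, language)
    if result is None:
        # The documented return type is Optional, so None must be handled.
        return None
    audio, sample_rate = result
    # Assumption: the first axis of `audio` counts samples.
    return len(audio) / sample_rate


# Hypothetical usage (requires pathbench and the model weights):
#   from pathbench.vad import FATrimmer
#   seconds = trimmed_duration(FATrimmer(), "clip.wav", "hello world", "en")
```

Taking the trimmer as a parameter keeps the helper independent of model loading, so it can be exercised with a stand-in object in tests.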