WhisperLiveKit

Author	SHA1	Message	Date
Quentin Fuxa	cf6c49f502	Ruff lint cleanup	2026-01-03 10:23:00 +01:00
Chingning Chen	b63f54e838	fix(whisper/tokenizer): prevent IndexError from crashing multilingual streams. This fix addresses a critical bug in the Whisper tokenizer that causes the transcription server to crash with an `IndexError: string index out of range` when streaming audio in languages that use multi-byte UTF-8 characters (e.g., Cantonese, Japanese, Mandarin). When a 3-byte character is cut off at the boundary of an audio chunk, the incomplete bytes are decoded into a single Unicode replacement character (`\ufffd`), which artificially shortens the string and breaks the offset mapping assumed by `split_tokens_on_unicode`. This ports the upstream fix from SYSTRAN/faster-whisper (PR #111), adding a strict bounds check before the string index is accessed so that incomplete bytes are caught safely and handled in the next chunk.	2026-03-02 15:31:43 +08:00
Quentin Fuxa	7f3a3df620	simulstreaming mlx & torch dedup of common base	2025-02-15 23:52:00 +01:00
Quentin Fuxa	4d9332ce7d	fixes #299	2025-12-05 17:54:14 +01:00
Quentin Fuxa	82cd24bb75	LoRa path v0 - functional	2025-11-29 17:21:10 +01:00
Quentin Fuxa	1d88ba9d69	Fixes #294. Improve model path backend detection and file extraction	2025-11-27 23:14:00 +01:00
Quentin Fuxa	7faa21f95f	alignatt: enable model sharing by removing hooks and centralizing session state. Solves #282. Co-authored-by: Emmanuel Schmidbauer <eschmidbauer@gmail.com>	2025-11-25 23:07:42 +01:00
Quentin Fuxa	870141298c	isort	2025-11-23 11:20:00 +01:00
Quentin Fuxa	4d2ffb24f8	coreml conversion	2025-11-16 19:11:43 +01:00
Quentin Fuxa	1bbbb7903c	lora loader in shared whisper core	2025-11-16 18:44:35 +01:00
Quentin Fuxa	13401ffe24	whisper core at root of wlk	2025-11-10 12:17:18 +01:00

11 commits
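
The failure described in commit b63f54e838 can be reproduced in isolation. The sketch below is not the project's tokenizer: `decode`, `split_on_unicode`, and the `bounds_check` flag are hypothetical names, and raw byte strings stand in for Whisper tokens. It only assumes what the commit message states, namely that incomplete bytes decode to `\ufffd` and that a bounds check before the index access avoids the crash.

```python
REPLACEMENT_CHAR = "\ufffd"


def decode(tokens: list[bytes]) -> str:
    # Stand-in for the tokenizer's decode step (assumption): invalid or
    # incomplete UTF-8 becomes U+FFFD instead of raising.
    return b"".join(tokens).decode("utf-8", errors="replace")


def split_on_unicode(tokens: list[bytes], bounds_check: bool = True):
    """Group tokens into units that decode to complete characters.

    Returns (words, leftover), where leftover holds trailing tokens that
    never formed a complete character and could be retried with the next chunk.
    """
    decoded_full = decode(tokens)
    words: list[str] = []
    current: list[bytes] = []
    offset = 0  # position in decoded_full where the current group starts

    for token in tokens:
        current.append(token)
        decoded = decode(current)

        if REPLACEMENT_CHAR not in decoded:
            complete = True
        else:
            idx = offset + decoded.index(REPLACEMENT_CHAR)
            if bounds_check and idx >= len(decoded_full):
                # The full string is shorter than the per-group decoding
                # implied, so these bytes are treated as incomplete.
                complete = False
            else:
                # Without the check, this index can run past the end.
                complete = decoded_full[idx] == REPLACEMENT_CHAR

        if complete:
            words.append(decoded)
            current = []
            offset += len(decoded)

    return words, current


# "你" is the three bytes E4 BD A0; here each byte arrives as its own token.
whole = [b"a", b"\xe4", b"\xbd", b"\xa0"]
truncated = [b"a", b"\xe4", b"\xbd"]  # chunk boundary cut off the last byte

print(split_on_unicode(whole))
print(split_on_unicode(truncated))
```

With the truncated input, the two stray bytes decode to one replacement character when joined but to one each when decoded separately, so the running offset overshoots the full string. Calling `split_on_unicode(truncated, bounds_check=False)` raises `IndexError`, while the checked version returns the stray byte as leftover.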