WhisperLiveKit

Author	SHA1	Message	Date
Quentin Fuxa	719e8b1a20	adapt online for mlx detection	2024-11-25 23:52:00 +01:00
Quentin Fuxa	f1b47178d8	adapt online for mlx detection	2024-11-25 23:52:00 +01:00
Quentin Fuxa	59db08e961	loader for full mlx	2024-11-25 23:52:00 +01:00
Quentin Fuxa	6fc20b9562	new dec class	2024-11-21 23:52:00 +01:00
Quentin Fuxa	fac8659161	uses native mlx function for attention	2024-11-21 23:52:00 +01:00
Quentin Fuxa	4d9332ce7d	fixes #299	2025-12-05 17:54:14 +01:00
Quentin Fuxa	62444ce746	session parameter required in OnnxWrapper	2025-12-05 15:37:18 +01:00
Quentin Fuxa	2431a6bf91	isolated VAD states per user: .onnx: share a stateless model. .jit: require duplicating the model. Co-authored-by: eschmidbauer <eschmidbauer@gmail.com>	2025-12-05 15:27:14 +01:00
Zizheng Guo	30ddd522a4	Fix local agreement backend, removing excess parameter, fixes https://github.com/QuentinFuxa/WhisperLiveKit/issues/295	2025-12-04 16:45:23 +08:00
Quentin Fuxa	82cd24bb75	LoRA path v0 - functional	2025-11-29 17:21:10 +01:00
Quentin Fuxa	d45c397c6a	simulstreaming: limit n tokens to prevent hallucinations	2025-11-28 21:41:19 +01:00
Quentin Fuxa	1d88ba9d69	Fixes #294. Improve model path backend detection and file extraction	2025-11-27 23:14:00 +01:00
Quentin Fuxa	c0965c6c31	Lines to Segments. Merging dataclasses	2025-11-27 21:54:58 +01:00
Quentin Fuxa	7faa21f95f	alignatt: enable model sharing by removing hooks and centralizing session state. Solves #282 Co-authored-by: Emmanuel Schmidbauer <eschmidbauer@gmail.com>	2025-11-25 23:07:42 +01:00
Quentin Fuxa	4e9f951551	correct silences handling when language not auto	2025-11-20 11:20:00 +01:00
Quentin Fuxa	870141298c	isort	2025-11-23 11:20:00 +01:00
Quentin Fuxa	a175d1a327	fixes silence detected but never reported by silero	2025-11-23 11:20:00 +01:00
Quentin Fuxa	6206fff118	0.2.15	2025-11-21 23:52:00 +01:00
Quentin Fuxa	b5067249c0	stt/diar/nllw alignment: internal rework 5	2025-11-20 23:52:00 +01:00
Quentin Fuxa	f4f9831d39	stt/diar/nllw alignment: internal rework 5	2025-11-20 23:52:00 +01:00
Quentin Fuxa	254faaf64c	stt/diar/nllw alignment: internal rework 5	2025-11-20 23:52:00 +01:00
Quentin Fuxa	8e7aea4fcf	internal rework 4	2025-11-20 23:45:20 +01:00
Quentin Fuxa	270faf2069	internal rework 3	2025-11-20 22:28:30 +01:00
Quentin Fuxa	b7c1cc77cc	internal rework 2	2025-11-20 22:06:38 +01:00
Quentin Fuxa	9a45ec221c	internal rework 1	2025-11-20 12:58:38 +01:00
Quentin Fuxa	b7d20a0ff0	segment attribution in result formatter	2025-11-19 21:10:28 +01:00
Quentin Fuxa	c1bb9c2bde	reduce flickering remaining_time_transcription	2025-11-19 19:09:37 +01:00
Quentin Fuxa	11e9def0b2	diarization corrections	2025-11-19 19:06:03 +01:00
Quentin Fuxa	3104f40f6e	fixes #279 #278	2025-11-19 18:17:50 +01:00
Quentin Fuxa	e9b4ceeee5	Add audio partial silence in chunks handling. bump to 0.2.14.post3	2025-11-17 22:52:00 +01:00
Quentin Fuxa	437641fb43	reduce min-chunk-size to 0.1, set default model to base	2027-04-25 23:52:00 +02:00
Quentin Fuxa	bfd60b3921	Add audio partial silence in chunks handling. bump to 0.2.14.post2	2025-11-17 22:52:00 +01:00
Quentin Fuxa	1e67bf97f0	improve buffering when use of heavy models	2027-04-25 23:52:00 +02:00
Quentin Fuxa	bbd4fd6cff	Merge branch 'improve_EOS_handling'	2025-11-16 22:30:31 +01:00
Quentin Fuxa	28985962a0	Silence handling: finish transcription even if not validated at the BEGINNING of the silence	2025-11-16 22:29:08 +01:00
Quentin Fuxa	a38c103fcd	simulstreaming coreml encoder compatibility	2025-11-16 21:24:14 +01:00
Quentin Fuxa	4d2ffb24f8	coreml conversion	2025-11-16 19:11:43 +01:00
Quentin Fuxa	1bbbb7903c	lora loader in shared whisper core	2025-11-16 18:44:35 +01:00
Quentin Fuxa	80b77998f9	Refactor backend handling	2025-11-15 19:51:41 +01:00
Quentin Fuxa	d310f7e25f	hf compatibility	2025-11-15 18:34:19 +01:00
Quentin Fuxa	8d9be88fe6	translation buffer is now displayed in frontend	2025-11-10 15:22:26 +01:00
Quentin Fuxa	16461052ed	task to direct-english-translation	2025-11-10 13:20:26 +01:00
Quentin Fuxa	5491dbd824	last_validated_token handled in state	2025-11-10 13:18:52 +01:00
Quentin Fuxa	13401ffe24	whisper core at root of wlk	2025-11-10 12:17:18 +01:00
Quentin Fuxa	7108d2ddc5	fixes https://github.com/QuentinFuxa/WhisperLiveKit/issues/269	2025-11-09 20:08:18 +01:00
Quentin Fuxa	a732e0903e	Add a script to detect alignment heads, useful for distilled whisper	2025-11-09 18:12:09 +01:00
Quentin Fuxa	0491681be4	Distilled model compatibility with HF config.json to ModelDimensions	2025-11-08 20:20:05 +01:00
Quentin Fuxa	ffe5284764	_processing_tasks_done checks task completion	2025-11-05 23:34:00 +01:00
Quentin Fuxa	06b31f51eb	exception when translation and no nllw	2025-10-30 23:30:19 +01:00
Quentin Fuxa	ece02db6a3	Use optional new separate NLLW package for translation	2025-10-30 19:36:28 +01:00
Quentin Fuxa	939a7ebf8b	Translation Local Agreement + Cache optimization v0. Not connected yet	2025-10-28 00:16:52 +01:00
Quentin Fuxa	61edb70fff	audioProcessor state variables are now uniquely in State dataclass	2025-10-26 18:54:47 +01:00
Quentin Fuxa	4e455b8aab	translation now separates validated from output buffer tokens	2025-10-26 18:51:09 +01:00
Quentin Fuxa	9434390ad3	simplify task stopping condition	2025-10-26 17:26:43 +01:00
Quentin Fuxa	65250db92c	tensor to list at the stream end	2025-10-26 16:40:12 +01:00
Quentin Fuxa	416dce7975	fixes #261 Co-authored-by: yosagi <11404771+yosagi@users.noreply.github.com>	2025-10-25 14:20:08 +02:00
Quentin Fuxa	0c5365e7c6	fixes #258	2025-10-24 20:51:16 +02:00
Quentin Fuxa	e7b05b0138	migration to silero vad v6: supports onnx	2025-10-23 23:52:00 +02:00
Quentin Fuxa	714fb3b14a	custom faster-whisper/mlx whisper encoder available	2025-10-23 20:33:17 +02:00
Quentin Fuxa	0af379c465	DOC: information about file format	2025-10-23 20:32:05 +02:00
Quentin Fuxa	1f684cdd97	fixes #251	2025-10-06 19:53:27 +02:00
Quentin Fuxa	9b1e061b32	forwarded_allow_ips in core	2025-10-04 23:04:00 +02:00
Quentin Fuxa	b4abc158b9	Merge pull request #249 from Damrod/add-ip-forwarding-support fix wss for reverse proxying	2025-10-06 10:20:05 +02:00
Alvaro Ollero	3736458503	Uvicorn exposes a configuration option to enable reverse proxying from a trusted ip. This PR exposes it downstreams to end clients	2025-10-04 22:21:06 +02:00
Quentin Fuxa	374618e050	token speakers are only reattributed for token coming after last_validated_token	2025-10-04 09:52:00 +02:00
Quentin Fuxa	543972ef38	fixes #248	2025-10-04 09:52:00 +02:00
Quentin Fuxa	a7db39d999	solves incorrect spacing in buffer diarization	2025-10-02 23:04:00 +02:00
Quentin Fuxa	a153e11fe0	update when self.diarization_before_transcription	2025-09-28 11:04:00 +02:00
Quentin Fuxa	ca6f9246cc	force language = en for .en models	2025-09-28 11:04:00 +02:00
Quentin Fuxa	d080d675a8	custom alignment heads parameter for custom models	2025-09-27 11:04:00 +02:00
Quentin Fuxa	40bff38933	Merge pull request #239 from msghik/feature/fine-tuned-model-support feat: Allow loading fine-tuned models in simulstreaming	2025-09-29 10:08:26 +02:00
Quentin Fuxa	2fe3ca0188	connect source to output destination when used as chrome extension to keep audio playing	2025-09-27 13:59:44 +02:00
Quentin Fuxa	545ea15c9a	ensure buffer size to be a multiple of the element size	2025-09-27 13:58:32 +02:00
Quentin Fuxa	8cbaeecc75	custom alignment heads parameter for custom models	2025-09-27 11:04:00 +02:00
google-labs-jules[bot]	70e854b346	feat: Allow loading fine-tuned models in simulstreaming This change modifies the `simulstreaming` backend to support loading fine-tuned Whisper models via the `--model_dir` argument. The `SimulStreamingASR` class has been updated to: - Use the `model_dir` path directly to load the model, which is the correct procedure for fine-tuned `.pt` files. - Automatically disable the `faster-whisper` and `mlx-whisper` fast encoders when `model_dir` is used, as they are not compatible with standard fine-tuned models. The call site in `core.py` already passed the `model_dir` argument, so no changes were needed there. This change makes the `simulstreaming` backend more flexible and allows users to leverage their own custom models.	2025-09-27 07:29:30 +00:00
Quentin Fuxa	d55490cd27	typo and simpler conditions	2025-09-26 20:38:26 +02:00
Quentin Fuxa	b22478c0b4	correct silences handling when language not auto	2025-09-25 23:20:00 +02:00
Quentin Fuxa	94c34efd90	chrome extension ws default to localhost	2025-09-25 23:04:00 +02:00
Quentin Fuxa	9fc6654a4a	common frontend for web/ and chrome extension	2025-09-25 23:14:25 +02:00
Quentin Fuxa	4dd5d8bf8a	translation compatible with auto and detected language	2025-09-22 11:20:00 +02:00
Quentin Fuxa	6caf3e0485	correct silence handling in translation	2025-09-27 11:58:00 +02:00
Quentin Fuxa	93f002cafb	language detection after few seconds working	2025-09-20 11:08:00 +02:00
Quentin Fuxa	c5e30c2c07	svg loaded once in javascript, no more need for StaticFiles	2025-09-20 11:06:00 +02:00
Quentin Fuxa	1c2afb8bd2	svg loaded once in javascript, no more need for StaticFiles	2025-09-20 11:06:00 +02:00
Quentin Fuxa	674b20d3af	in buffer while language not detected »	2025-09-21 11:05:00 +02:00
Quentin Fuxa	a5503308c5	O(n) to O(1) for simulstreaming timestamp determination	2025-09-21 11:04:00 +02:00
Quentin Fuxa	e61afdefa3	punctuation is now checked in timed_object	2025-09-22 22:40:39 +02:00
Quentin Fuxa	426d70a790	simulstreaming infer does not return a dictionary anymore	2025-09-21 11:03:00 +02:00
Quentin Fuxa	b03a212fbf	fixes #227, auto language detection v0.1 - simulstreaming only - when diarization and auto	2025-09-19 19:15:28 +02:00
Quentin Fuxa	1833e7c921	0.2.10	2025-09-16 23:45:00 +02:00
Quentin Fuxa	0a6e5ae9c1	ffmpeg install instruction error indicates --pcm-input alternative	2025-09-17 16:04:17 +02:00
Quentin Fuxa	ee448a37e9	when pcm-input is set, the frontend uses AudioWorklet	2025-09-17 14:55:57 +02:00
Quentin Fuxa	9c051052b0	Merge branch 'main' into ScriptProcessorNode-to-AudioWorklet	2025-09-17 11:28:36 +02:00
Quentin Fuxa	4d7c487614	replace deprecated ScriptProcessorNode with AudioWorklet	2025-09-17 10:53:53 +02:00
Quentin Fuxa	65025cc448	nllb backend can be transformers, and model size can be 1.3B	2025-09-17 10:20:31 +02:00
Quentin Fuxa	bbba1d9bb7	add nllb-backend and translation perf test in dev_notes	2025-09-16 20:45:01 +02:00
Quentin Fuxa	99dc96c644	fixes #224	2025-09-16 18:34:35 +02:00
GeorgeCaoJ	2a27d2030a	feat: support web audio 16kHz PCM input and remove ffmpeg dependency	2025-09-15 23:22:25 +08:00
Quentin Fuxa	cd160caaa1	asyncio.to_thread for transcription and translation	2025-09-15 15:23:22 +02:00
Quentin Fuxa	5aa312e437	simulstreaming warmup is done in whisperlivekit.simul_whisper.backend.load_model, not in warmup_online	2025-09-13 20:19:19 +01:00

303 commits