Quentin Fuxa
cf6c49f502
Ruff lint cleanup
2026-01-03 10:23:00 +01:00
Quentin Fuxa
451535d48f
Fix ctranslate2 encoder conversion (#345) and memory leak in TokensAlignment (#344)
- Add fallback chain for StorageView to numpy conversion
- Prune old tokens/segments after 5min to bound memory
2026-03-10 22:37:00 +01:00
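The pruning described in the commit above (dropping tokens older than 5 minutes so memory stays bounded) could look roughly like the sketch below. `TimedToken`, `prune_old`, and `RETENTION_SECONDS` are hypothetical names chosen for illustration; they are not taken from the TokensAlignment code, and the real implementation may prune differently.

```python
from dataclasses import dataclass

# Retention window from the commit message (5 minutes), expressed in seconds.
RETENTION_SECONDS = 5 * 60


@dataclass
class TimedToken:
    """Hypothetical token record: text plus its end time in stream seconds."""
    text: str
    end: float


def prune_old(tokens, now, retention=RETENTION_SECONDS):
    """Return only the tokens that ended within `retention` seconds of `now`."""
    cutoff = now - retention
    return [t for t in tokens if t.end >= cutoff]


# Usage: at stream time 400 s, a token that ended at 10 s is dropped.
history = [TimedToken("a", 10.0), TimedToken("b", 200.0), TimedToken("c", 400.0)]
history = prune_old(history, now=400.0)
```

Calling this on each update keeps the list length proportional to the retention window rather than to the total stream duration.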
Quentin Fuxa
b8d9d7d289
fix: handle numpy object_ dtype from ctranslate2 encoder (#337)
2026-02-20 20:48:28 +01:00
Quentin Fuxa
7f3a3df620
simulstreaming mlx & torch dedup of common base
2025-02-15 23:52:00 +01:00
Quentin Fuxa
8c799fa4d1
fix simulstreaming vram leak: cap cross-attn accumulation + token budget
fixes #283, fixes #275
- accumulated_cross_attns was growing unboundedly during decoding loop,
using up to ~5GB for repetition loops. now capped to rolling window of 16
- max_tokens_per_chunk was using TOKENS_PER_SECOND (mel frame rate = 50)
instead of actual text token rate (~15/s), allowing 10-40x too many
decoding steps
- removed unused torch.cat on early return path
- removed dead self.committed/last_result_tokens lists (never read)
- same fixes applied to mlx variant
2026-02-11 22:10:00 +01:00
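The two caps named in the commit above (a rolling window of 16 cross-attention entries and a per-chunk token budget based on the text token rate of about 15/s instead of the mel frame rate of 50) can be sketched as follows. This is an illustration under stated assumptions, not the project's decoding loop: `decode_chunk`, `max_tokens_for_chunk`, and the constant names are hypothetical, and `step_outputs` stands in for whatever the real decoder yields per step.

```python
from collections import deque

CROSS_ATTN_WINDOW = 16        # rolling window size stated in the commit message
MEL_FRAMES_PER_SECOND = 50    # rate the commit says was used by mistake
TEXT_TOKENS_PER_SECOND = 15   # approximate text token rate stated in the commit


def max_tokens_for_chunk(chunk_seconds, rate=TEXT_TOKENS_PER_SECOND):
    """Upper bound on decoding steps for one audio chunk (hypothetical helper)."""
    return max(1, int(chunk_seconds * rate))


def decode_chunk(step_outputs, chunk_seconds):
    """Consume (token, cross_attn) pairs while enforcing both caps.

    `step_outputs` is any iterable of (token, attention) pairs; it is a
    stand-in for the real decoder, which is not reproduced here.
    """
    # deque(maxlen=...) silently discards the oldest entry once full, so the
    # retained attention history never exceeds the window size.
    recent_attns = deque(maxlen=CROSS_ATTN_WINDOW)
    tokens = []
    budget = max_tokens_for_chunk(chunk_seconds)
    for token, attn in step_outputs:
        if len(tokens) >= budget:
            break  # stop a runaway repetition loop at the token budget
        tokens.append(token)
        recent_attns.append(attn)
    return tokens, list(recent_attns)


# Usage: a simulated repetition loop of 1000 steps over a 2-second chunk.
fake_steps = ((i, i) for i in range(1000))
tokens, attns = decode_chunk(fake_steps, chunk_seconds=2.0)
```

With the text-rate budget the loop stops after 30 tokens for a 2-second chunk, and only the 16 most recent attention entries are retained, whatever the decoder tries to emit.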
Emmanuel Schmidbauer
d59ddbaeae
Fix critical thread safety issues
2026-01-09 11:23:19 -05:00
Quentin Fuxa
d45c397c6a
simulstreaming: limit n tokens to prevent hallucinations
2025-11-28 21:41:19 +01:00
Quentin Fuxa
7faa21f95f
alignatt: enable model sharing by removing hooks and centralizing session state. Solves #282
Co-authored-by: Emmanuel Schmidbauer <eschmidbauer@gmail.com>
2025-11-25 23:07:42 +01:00
Quentin Fuxa
870141298c
isort
2025-11-23 11:20:00 +01:00
Quentin Fuxa
6206fff118
0.2.15
2025-11-21 23:52:00 +01:00
Quentin Fuxa
9a45ec221c
internal rework 1
2025-11-20 12:58:38 +01:00
Quentin Fuxa
e9b4ceeee5
Add audio partial silence in chunks handling. bump to 0.2.14.post3
2025-11-17 22:52:00 +01:00
Quentin Fuxa
437641fb43
reduce min-chunk-size to 0.1, set default model to base
2027-04-25 23:52:00 +02:00
Quentin Fuxa
a38c103fcd
simulstreaming coreml encoder compatibility
2025-11-16 21:24:14 +01:00
Quentin Fuxa
80b77998f9
Refactor backend handling
2025-11-15 19:51:41 +01:00
Quentin Fuxa
13401ffe24
whisper core at root of wlk
2025-11-10 12:17:18 +01:00
Quentin Fuxa
65250db92c
tensor to list at the stream end
2025-10-26 16:40:12 +01:00
Quentin Fuxa
416dce7975
fixes #261
Co-authored-by: yosagi <11404771+yosagi@users.noreply.github.com>
2025-10-25 14:20:08 +02:00
Quentin Fuxa
714fb3b14a
custom faster-whisper/mlx whisper encoder available
2025-10-23 20:33:17 +02:00
Quentin Fuxa
4dd5d8bf8a
translation compatible with auto and detected language
2025-09-22 11:20:00 +02:00
Quentin Fuxa
93f002cafb
language detection after few seconds working
2025-09-20 11:08:00 +02:00
Quentin Fuxa
674b20d3af
in buffer while language not detected
2025-09-21 11:05:00 +02:00
Quentin Fuxa
a5503308c5
O(n) to O(1) for simulstreaming timestamp determination
2025-09-21 11:04:00 +02:00
Quentin Fuxa
426d70a790
simulstreaming infer does not return a dictionary anymore
2025-09-21 11:03:00 +02:00
Quentin Fuxa
add7ea07ee
translator takes all the tokens from the queue
2025-09-09 19:55:39 +02:00
Quentin Fuxa
3358877054
Fix StorageView conversion for CPU/GPU compatibility
2025-09-09 15:44:16 +02:00
Quentin Fuxa
1f7798c7c1
condition on encoder_feature_ctranslate type
2025-09-09 12:16:52 +02:00
Alexander Lindberg
c7b3bb5e58
Fix regression with faster-whisper encoder_feature
2025-09-09 11:18:55 +03:00
Quentin Fuxa
334b338ab0
use platform to determine system and recommend mlx whisper
2025-09-07 15:49:11 +02:00
Quentin Fuxa
f3ad4e39e4
torch.Tensor to torch.as_tensor
2025-09-04 16:39:11 +02:00
Quentin Fuxa
e0a5cbf0e7
v0.1.0 chrome extension
2025-09-04 16:36:28 +02:00
Quentin Fuxa
953697cd86
torch.Tensor to torch.as_tensor
2025-09-04 15:25:39 +02:00
Quentin Fuxa
3bd2122eb4
0.2.8 : only the decoder of whisper is loaded in memory when a different encoder is used
2025-09-02 21:12:25 +02:00
Quentin Fuxa
d5008ed828
mlx/fasterWhisper encoders are loaded once and shared in simulstreaming
2025-09-01 12:33:19 +02:00
Quentin Fuxa
199e21b3ef
faster-whisper as an optional encoder alternative for simulstreaming
2025-08-30 23:50:16 +02:00
Quentin Fuxa
1d926f2e67
mlx-whisper used as simulstreaming encoder: improve speed for macos systems
2025-08-30 22:19:11 +02:00
Quentin Fuxa
d0e9e37ef6
simulstreaming: cumulative_time_offset to keep timestamps correct when audio > 30s
2025-08-17 09:33:47 +02:00
Quentin Fuxa
55e08474f3
recycle backend in simulstreaming thanks to new remove hooks function
2025-08-16 23:06:16 +02:00
Quentin Fuxa
d098af3185
each SimulStreamingOnlineProcessor now contains PaddedAlignAttWhisper instance. SimulStreamingASR only contains loaded whisper model
2025-08-11 08:24:14 +02:00
Quentin Fuxa
b678a55f63
remove duplicate file
2025-08-09 23:10:34 +02:00
Quentin Fuxa
00424d7ca3
latest version of simulstreaming
2025-07-31 16:44:23 +02:00
Quentin Fuxa
4b738d6f63
fix duplicate line
2025-07-31 16:29:35 +02:00
Quentin Fuxa
8a5e2adb1e
simulstreaming: fixes token handling during warm-up phase
2025-07-31 16:25:34 +02:00
Quentin Fuxa
e1d4bf7e94
modify import paths in simul whisper backend so that it works in lib mode
2025-07-01 20:34:47 +02:00