## v0.0.48

### Added

- There's now an input queue in each frame processor. When you call `FrameProcessor.push_frame()`, this will internally call `FrameProcessor.queue_frame()` on the next processor (upstream or downstream) and the frame will be internally queued (except system frames). Then, the queued frames will get processed. With this input queue it is also possible for frame processors to block processing more frames by calling `FrameProcessor.pause_processing_frames()`. The way to resume processing frames is by calling `FrameProcessor.resume_processing_frames()`.

- Added audio filter `NoisereduceFilter`.

- Introduced input transport audio filters (`BaseAudioFilter`). Audio filters can be used to remove background noise before audio is sent to VAD.

- Introduced output transport audio mixers (`BaseAudioMixer`). Output transport audio mixers can be used, for example, to add background sounds or any other audio mixing functionality before the output audio is actually written to the transport.

- Added `GatedOpenAILLMContextAggregator`. This aggregator keeps the last received OpenAI LLM context frame and doesn't let it through until the notifier is notified.

- Added `WakeNotifierFilter`. This processor expects a list of frame types and will execute a given callback predicate when a frame of any of those types is being processed. If the callback returns true, the notifier will be notified.

- Added `NullFilter`. A null filter doesn't push any frames upstream or downstream. This is usually used to disable one of the pipelines in `ParallelPipeline`.

- Added `EventNotifier`. This can be used as a very simple synchronization feature between processors.

- Added `TavusVideoService`. This is an integration for Tavus digital twins. (see https://www.tavus.io/)

- Added `DailyTransport.update_subscriptions()`. This allows you to have fine-grained control of what media subscriptions you want for each participant in a room.

- Added audio filter `KrispFilter`.
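The input-queue and pause/resume behavior described above can be sketched with a minimal, self-contained model. This is not Pipecat's actual implementation — the internal queue type, the event used for pausing, and the `link()`/`process_once()` helpers are assumptions for illustration; only the method names `push_frame`, `queue_frame`, `pause_processing_frames` and `resume_processing_frames` come from the changelog entry.

```python
import asyncio

class FrameProcessor:
    """Toy model of a frame processor with an internal input queue."""

    def __init__(self, name: str):
        self.name = name
        self._next = None
        self._queue = asyncio.Queue()      # the per-processor input queue
        self._resume = asyncio.Event()     # cleared while processing is paused
        self._resume.set()                 # processing enabled by default
        self.processed = []

    def link(self, nxt: "FrameProcessor"):
        self._next = nxt

    async def queue_frame(self, frame):
        # Frames are queued internally rather than processed inline.
        await self._queue.put(frame)

    async def push_frame(self, frame):
        # Pushing a frame queues it on the next processor in the chain.
        if self._next is not None:
            await self._next.queue_frame(frame)

    async def pause_processing_frames(self):
        self._resume.clear()

    async def resume_processing_frames(self):
        self._resume.set()

    async def process_once(self):
        # Wait until processing is allowed, then handle one queued frame.
        await self._resume.wait()
        frame = await self._queue.get()
        self.processed.append(frame)


async def main():
    a, b = FrameProcessor("a"), FrameProcessor("b")
    a.link(b)
    await a.push_frame("hello")            # queued on b
    await b.process_once()                 # processed now
    await b.pause_processing_frames()
    await a.push_frame("world")            # queued, but b is paused
    await b.resume_processing_frames()     # e.g. triggered by a notifier
    await b.process_once()
    return b.processed

print(asyncio.run(main()))  # ['hello', 'world']
```

The pausing mechanism is what lets a downstream processor apply backpressure: frames keep accumulating in its queue while it is paused, and they are drained in order once processing resumes.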
### Changed

- The following `DailyTransport` functions are now `async`, which means they need to be awaited: `start_dialout`, `stop_dialout`, `start_recording`, `stop_recording`, `capture_participant_transcription` and `capture_participant_video`.

- Changed the default output sample rate to 24000. This changes all TTS services to output at 24000 and also changes the default output transport sample rate. This improves audio quality at the cost of some extra bandwidth.
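Call sites of the functions listed above need to be updated to use `await`. The sketch below uses a stand-in class rather than the real `DailyTransport` (which lives in Pipecat and takes different arguments); it only illustrates the shape of the migration — forgetting `await` on a coroutine would silently do nothing.

```python
import asyncio

class DailyTransportStandIn:
    """Hypothetical stand-in for DailyTransport, for illustration only."""

    async def start_recording(self):
        return "recording-started"

    async def stop_recording(self):
        return "recording-stopped"


async def run_bot():
    transport = DailyTransportStandIn()
    # Before this release these were plain calls, e.g.
    # transport.start_recording(); they must now be awaited.
    started = await transport.start_recording()
    stopped = await transport.stop_recording()
    return started, stopped

print(asyncio.run(run_bot()))  # ('recording-started', 'recording-stopped')
```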
- `AzureTTSService` now uses Azure websockets instead of HTTP requests.

- The previous `AzureTTSService` HTTP implementation is now `AzureHttpTTSService`.
### Fixed

- Websocket transports (FastAPI and Websocket) now synchronize with time before sending data. This allows interruptions to just work out of the box.

- Improved bot speaking detection for all TTS services by using actual bot audio.

- Fixed an issue that was generating constant bot started/stopped speaking frames for HTTP TTS services.

- Fixed an issue that was causing stuttering with the AWS TTS service.

- Fixed an issue with `PlayHTTTSService` where the TTFB metrics were reporting very small time values.

- Fixed an issue where `AzureTTSService` wasn't initializing the specified language.
### Other

- Added the `23-bot-background-sound.py` foundational example.

- Added a new foundational example, `22-natural-conversation.py`. This example shows how to achieve a more natural conversation by detecting when the user ends a statement.