AWS Nova Sonic

Overview

AWSNovaSonicLLMService enables natural, real-time conversations with AWS Nova Sonic. It provides built-in audio transcription, voice activity detection, and context management for creating interactive AI experiences with bidirectional audio streaming, text generation, and function calling capabilities.

AWS Nova Sonic API Reference

Pipecat’s API methods for AWS Nova Sonic integration

Example Implementation

Complete AWS Nova Sonic conversation example

AWS Bedrock Documentation

Official AWS Bedrock and Nova Sonic documentation

AWS Console

Access AWS Bedrock and manage Nova Sonic models

Installation

To use AWS Nova Sonic services, install the required dependencies:

uv add "pipecat-ai[aws-nova-sonic]"

Prerequisites

AWS Account Setup

Before using AWS Nova Sonic services, you need:

AWS Account: Set up at AWS Console
Bedrock Access: Enable AWS Bedrock service in your region
Model Access: Request access to Nova Sonic models in Bedrock
IAM Credentials: Configure AWS access keys with Bedrock permissions

Required Environment Variables

AWS_SECRET_ACCESS_KEY: Your AWS secret access key
AWS_ACCESS_KEY_ID: Your AWS access key ID
AWS_REGION: AWS region where Bedrock is available
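
A missing variable otherwise tends to surface only when the service first connects, so it can help to check the three variables at startup. The sketch below is an assumption-laden convenience, not part of Pipecat: `missing_aws_vars` and `REQUIRED_VARS` are hypothetical names introduced here for illustration.

```python
import os
from typing import Mapping

# The three variables listed above; the tuple name is hypothetical.
REQUIRED_VARS = ("AWS_SECRET_ACCESS_KEY", "AWS_ACCESS_KEY_ID", "AWS_REGION")


def missing_aws_vars(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling `missing_aws_vars()` with no arguments reads the process environment; passing a plain dict makes the check easy to test. An empty list means all three variables are present.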

Key Features

Real-time Speech-to-Speech: Direct audio input to audio output processing
Built-in Transcription: Automatic speech-to-text with real-time streaming
Voice Activity Detection: Automatic detection of speech start/stop
Function Calling: Support for external function and API integration
Multiple Voices: Choose from matthew, tiffany, and amy voice options

Configuration

AWSNovaSonicLLMService

secret_access_key

str

required

AWS secret access key for authentication.

access_key_id

str

required

AWS access key ID for authentication.

session_token

str

default:"None"

AWS session token for temporary credentials (e.g., when using AWS STS).

region

str

required

AWS region where the service is hosted. Supported regions for Nova 2 Sonic (default): "us-east-1", "us-west-2", "ap-northeast-1". Supported regions for Nova Sonic (older model): "us-east-1", "ap-northeast-1".

model

str

default:"amazon.nova-2-sonic-v1:0"

deprecated

Model identifier. Use "amazon.nova-2-sonic-v1:0" for the latest model or "amazon.nova-sonic-v1:0" for the older model. Deprecated in v0.0.105. Use settings=AWSNovaSonicLLMService.Settings(model=...) instead.

voice_id

str

default:"matthew"

deprecated

Voice ID for speech synthesis. Some voices are designed for specific languages. See AWS Nova 2 Sonic voice support for available voices. Deprecated in v0.0.105. Use settings=AWSNovaSonicLLMService.Settings(voice=...) instead.

params

Params

default:"Params()"

deprecated

Model parameters for audio configuration and inference. See Params below. Deprecated in v0.0.105. Use settings=AWSNovaSonicLLMService.Settings(...) for inference settings and audio_config=AudioConfig(...) for audio configuration.

audio_config

AudioConfig

default:"None"

Audio configuration (sample rates, sample sizes, channel counts). If not provided, defaults are used (16kHz input, 24kHz output, 16-bit, mono). See AudioConfig below.

settings

AWSNovaSonicLLMService.Settings

default:"None"

Runtime-configurable settings. See Settings below.

system_instruction

str

default:"None"

deprecated

System-level instruction for the model. Deprecated in v0.0.105. Use settings=AWSNovaSonicLLMService.Settings(system_instruction=...) instead.

tools

ToolsSchema

default:"None"

Available tools/functions for the model to use.

session_continuation

SessionContinuationParams

default:"None"

Configuration for automatic session continuation. When enabled (the default), sessions are seamlessly rotated before the AWS time limit (~8 minutes) with no user-perceptible interruption. See SessionContinuationParams below.
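
The region lists given for the region parameter above differ by model, so a small lookup can catch an unsupported pairing before the service is constructed. This is a sketch built only from the regions listed in this page; `SUPPORTED_REGIONS` and `is_region_supported` are hypothetical helpers, not Pipecat APIs, and the lists may change as AWS adds regions.

```python
# Regions copied from the parameter description above; treat as a snapshot.
SUPPORTED_REGIONS = {
    "amazon.nova-2-sonic-v1:0": {"us-east-1", "us-west-2", "ap-northeast-1"},
    "amazon.nova-sonic-v1:0": {"us-east-1", "ap-northeast-1"},
}


def is_region_supported(region: str, model: str = "amazon.nova-2-sonic-v1:0") -> bool:
    """Return True if the region is listed for the given model identifier."""
    # Unknown model identifiers yield an empty set, so the result is False.
    return region in SUPPORTED_REGIONS.get(model, set())
```

For example, `is_region_supported("us-west-2")` is true for the default model, while the same region with `"amazon.nova-sonic-v1:0"` is not.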

Settings

Runtime-configurable settings passed via the settings constructor argument using AWSNovaSonicLLMService.Settings(...). These can be updated mid-conversation with LLMUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`NOT_GIVEN`	Model identifier. (Inherited from base settings.)
`system_instruction`	`str`	`NOT_GIVEN`	System instruction/prompt. (Inherited from base settings.)
`temperature`	`float`	`NOT_GIVEN`	Sampling temperature for text generation. (Inherited from base settings.)
`max_tokens`	`int`	`NOT_GIVEN`	Maximum number of tokens to generate. (Inherited from base settings.)
`top_p`	`float`	`NOT_GIVEN`	Nucleus sampling parameter. (Inherited from base settings.)
`voice`	`str`	`NOT_GIVEN`	Voice ID for speech synthesis.
`endpointing_sensitivity`	`str | None`	`NOT_GIVEN`	Controls how quickly Nova Sonic decides the user has stopped speaking. Values: `"LOW"`, `"MEDIUM"`, or `"HIGH"`. Only supported with Nova 2 Sonic (default model).

NOT_GIVEN values are omitted, letting the service use its own defaults (e.g. "amazon.nova-2-sonic-v1:0" for model, "matthew" for voice, 0.7 for temperature, 1024 for max_tokens). Only parameters that are explicitly set are included.
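
The omit-if-not-given behaviour described above can be illustrated with a stand-alone sketch. This is not Pipecat's implementation: the sentinel, `SketchSettings`, `given_values`, and `effective_settings` are all hypothetical names, and the defaults are the ones quoted in the paragraph above.

```python
from dataclasses import dataclass, fields
from typing import Any


class _NotGiven:
    """Hypothetical sentinel type standing in for NOT_GIVEN."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()

# Service-side defaults as quoted in the text above.
SERVICE_DEFAULTS = {
    "model": "amazon.nova-2-sonic-v1:0",
    "voice": "matthew",
    "temperature": 0.7,
    "max_tokens": 1024,
}


@dataclass
class SketchSettings:
    model: Any = NOT_GIVEN
    voice: Any = NOT_GIVEN
    temperature: Any = NOT_GIVEN
    max_tokens: Any = NOT_GIVEN


def given_values(settings: SketchSettings) -> dict[str, Any]:
    """Keep only fields that were explicitly set (identity check, so 0 survives)."""
    return {
        f.name: getattr(settings, f.name)
        for f in fields(settings)
        if getattr(settings, f.name) is not NOT_GIVEN
    }


def effective_settings(settings: SketchSettings) -> dict[str, Any]:
    """Overlay explicitly set values on top of the service defaults."""
    return {**SERVICE_DEFAULTS, **given_values(settings)}
```

Comparing against the sentinel by identity, rather than by truthiness, is what lets a legitimate value such as `temperature=0` pass through instead of being mistaken for "unset".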

AudioConfig

Audio configuration passed via the audio_config constructor argument.

Parameter	Type	Default	Description
`input_sample_rate`	`int`	`16000`	Audio input sample rate in Hz.
`input_sample_size`	`int`	`16`	Audio input sample size in bits.
`input_channel_count`	`int`	`1`	Number of input audio channels.
`output_sample_rate`	`int`	`24000`	Audio output sample rate in Hz.
`output_sample_size`	`int`	`16`	Audio output sample size in bits.
`output_channel_count`	`int`	`1`	Number of output audio channels.
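
To size buffers or audio chunks, it helps to translate these values into bytes. The following sketch does the arithmetic for uncompressed PCM; the helper names are hypothetical and the 20 ms chunk length in the usage note is an arbitrary illustration, not a value from Pipecat.

```python
def bytes_per_second(sample_rate: int, sample_size_bits: int, channel_count: int) -> int:
    """Raw PCM data rate: samples per second x bytes per sample x channels."""
    return sample_rate * (sample_size_bits // 8) * channel_count


def chunk_bytes(sample_rate: int, sample_size_bits: int, channel_count: int, chunk_ms: int) -> int:
    """Number of bytes in a chunk of the given duration in milliseconds."""
    return bytes_per_second(sample_rate, sample_size_bits, channel_count) * chunk_ms // 1000
```

With the defaults in the table, input audio is `bytes_per_second(16000, 16, 1)` = 32000 bytes per second and output audio is `bytes_per_second(24000, 16, 1)` = 48000 bytes per second; a 20 ms input chunk is `chunk_bytes(16000, 16, 1, 20)` = 640 bytes.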

SessionContinuationParams

Configuration for automatic session continuation, passed via the session_continuation constructor argument. Nova Sonic sessions have an AWS-imposed time limit (~8 minutes). When enabled, session continuation proactively creates a new session in the background before the limit is reached, buffers user audio during the transition, and seamlessly hands off — preserving conversation context with no user-perceptible gap.

Parameter	Type	Default	Description
`enabled`	`bool`	`True`	Whether automatic session continuation is enabled.
`transition_threshold_seconds`	`float`	`360.0`	How many seconds into a session to begin monitoring for a transition opportunity. The transition will occur when the assistant next starts speaking after this threshold.
`audio_buffer_duration_seconds`	`float`	`3.0`	Duration of the rolling audio buffer (in seconds) that captures user audio during the transition window. This audio is replayed into the new session so no user input is lost.
`audio_start_timeout_seconds`	`float`	`80.0`	Maximum time to wait for the assistant to start speaking after the threshold is reached. If no assistant audio arrives within this window, the transition is forced. Set to `0` to disable the timeout (wait indefinitely).
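
When tuning these values, a quick check is whether the latest possible forced transition still falls before the session limit. The sketch below assumes the limit is exactly 480 seconds, which is only an approximation of the "~8 minutes" stated above; the function names are hypothetical.

```python
from typing import Optional

# Assumption: "~8 minutes" is treated as exactly 480 s for this estimate.
ASSUMED_SESSION_LIMIT_SECONDS = 8 * 60


def latest_forced_transition(threshold: float = 360.0, timeout: float = 80.0) -> Optional[float]:
    """Seconds into the session at which a transition is forced at the latest.

    Returns None when the timeout is 0, since the service then waits indefinitely.
    """
    if timeout == 0:
        return None
    return threshold + timeout


def headroom_seconds(
    threshold: float = 360.0,
    timeout: float = 80.0,
    limit: float = ASSUMED_SESSION_LIMIT_SECONDS,
) -> Optional[float]:
    """Time left between the latest forced transition and the assumed limit."""
    latest = latest_forced_transition(threshold, timeout)
    return None if latest is None else limit - latest
```

With the defaults, the transition is forced no later than 360 + 80 = 440 seconds, leaving roughly 40 seconds of headroom under the assumed limit. A negative result suggests the chosen threshold and timeout could let a session run into the limit.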

Usage

Basic Setup

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region=os.getenv("AWS_REGION"),
    settings=AWSNovaSonicLLMService.Settings(
        voice="matthew",
        system_instruction="You are a helpful assistant.",
    ),
)

With Settings

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService, AudioConfig

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region="us-east-1",
    audio_config=AudioConfig(
        input_sample_rate=16000,
        output_sample_rate=24000,
    ),
    settings=AWSNovaSonicLLMService.Settings(
        model="amazon.nova-2-sonic-v1:0",
        voice="tiffany",
        system_instruction="You are a helpful assistant.",
        temperature=0.5,
        max_tokens=2048,
        endpointing_sensitivity="MEDIUM",
    ),
)

With Function Calling

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region="us-east-1",
    settings=AWSNovaSonicLLMService.Settings(
        voice="matthew",
        system_instruction="You are a helpful assistant that can check the weather.",
    ),
    tools=tools,  # ToolsSchema instance, defined elsewhere
)

@llm.function("get_weather")
async def get_weather(function_name, tool_call_id, args, llm, context, result_callback):
    location = args.get("location", "unknown")
    await result_callback({"temperature": 72, "condition": "sunny", "location": location})
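
To see what the handler above passes back through result_callback, it can be exercised outside a pipeline with a stand-in callback. This is only a test-harness sketch: `call_handler`, `run_handler`, and the collecting callback are hypothetical, the handler is re-declared without the decorator so the block is self-contained, and None is passed for the llm and context arguments because the handler does not use them.

```python
import asyncio


async def get_weather(function_name, tool_call_id, args, llm, context, result_callback):
    # Same body as the handler above, without the service decorator.
    location = args.get("location", "unknown")
    await result_callback({"temperature": 72, "condition": "sunny", "location": location})


async def call_handler(args: dict) -> list:
    """Invoke the handler with a callback that records every result it receives."""
    results = []

    async def collect(result):
        results.append(result)

    await get_weather("get_weather", "call-1", args, None, None, collect)
    return results


def run_handler(args: dict) -> list:
    """Synchronous wrapper so the handler can be tried from a plain script."""
    return asyncio.run(call_handler(args))
```

Calling `run_handler({"location": "Austin"})` returns a one-element list holding the result dictionary, and calling it with an empty dict shows the "unknown" fallback for a missing location.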

With Session Continuation

import os
from pipecat.services.aws.nova_sonic import AWSNovaSonicLLMService
from pipecat.services.aws.nova_sonic.session_continuation import SessionContinuationParams

llm = AWSNovaSonicLLMService(
    secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
    access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    region="us-east-1",
    settings=AWSNovaSonicLLMService.Settings(
        voice="tiffany",
        system_instruction="You are a helpful assistant.",
    ),
    # Session continuation is enabled by default. You can customize the behavior:
    session_continuation=SessionContinuationParams(
        enabled=True,
        transition_threshold_seconds=360,  # Start transition after 6 minutes
        audio_buffer_duration_seconds=3.0,  # Buffer 3 seconds of audio during transition
        audio_start_timeout_seconds=80.0,  # Force transition if no response within 80s
    ),
)

# To disable session continuation:
# session_continuation=SessionContinuationParams(enabled=False)

The Params / params= pattern is deprecated as of v0.0.105. Use Settings / settings= for inference settings and AudioConfig / audio_config= for audio configuration instead. See the Service Settings guide for migration details.

Notes

Model versions: Nova 2 Sonic (amazon.nova-2-sonic-v1:0) is the default and recommended model. The older Nova Sonic (amazon.nova-sonic-v1:0) has fewer features and requires an assistant response trigger mechanism.
Session continuation: Enabled by default to handle AWS’s ~8-minute session limit. The service automatically rotates sessions in the background with no user-perceptible interruption, preserving conversation context and buffering user audio during the transition. You can tune the threshold or disable it via the session_continuation parameter.
Endpointing sensitivity: Only supported with Nova 2 Sonic. Controls how quickly the model decides the user has stopped speaking — "HIGH" causes the model to respond most quickly.
Transcription frames: User speech transcription frames are always emitted upstream. Assistant text transcripts are delivered in real-time using speculative text events, providing text synchronized with audio output for responsive client UIs.
Connection resilience: If a connection error occurs while the service wants to stay connected, it automatically resets the conversation and reconnects.
System instruction precedence: The system_instruction from service settings takes precedence over an initial system message in the LLM context. A warning is logged when both are set. Tools provided in the LLM context take precedence over those provided at initialization time.
Audio format: Uses LPCM (Linear PCM) audio format for both input and output. Input defaults to 16kHz and output defaults to 24kHz.
