

Overview

NvidiaLLMService provides access to NVIDIA’s NIM language models through an OpenAI-compatible interface. It inherits from OpenAILLMService and supports streaming responses, function calling, and context management, with special handling for NVIDIA’s incremental token reporting and enterprise deployment.

NVIDIA NIM LLM API Reference

Pipecat’s API methods for NVIDIA NIM integration

Example Implementation

Complete example with function calling

NVIDIA NIM Documentation

Official NVIDIA NIM documentation and setup

NVIDIA Developer Portal

Access NIM services and manage API keys

Installation

To use NVIDIA NIM services, install the required dependencies:
uv add "pipecat-ai[nvidia]"

Prerequisites

NVIDIA NIM Setup

Before using NVIDIA NIM LLM services, you need:
  1. NVIDIA Developer Account (cloud endpoint only): Sign up at NVIDIA Developer Portal
  2. API Key (cloud endpoint only): Generate an NVIDIA API key for NIM cloud services
  3. Model Selection: Choose from available NIM-hosted models
  4. Local NIM Setup (optional): For local deployments, configure NIM on-premises and set the base_url to your local endpoint

Environment Variables

  • NVIDIA_API_KEY: Your NVIDIA API key for authentication (required for cloud endpoint, not needed for local NIM deployments)
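For the cloud endpoint, the key is typically supplied through the environment so that `os.getenv("NVIDIA_API_KEY")` can pick it up, for example (the value shown is a placeholder, not a real key):

```shell
# Export the NVIDIA API key so the service can authenticate against the
# cloud endpoint. Replace the placeholder with your actual key.
export NVIDIA_API_KEY="nvapi-your-key-here"
```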

Configuration

  • api_key (str | None, default: None): NVIDIA API key for authentication. Required when using the cloud endpoint (https://integrate.api.nvidia.com/v1). Not needed for local NIM deployments.
  • base_url (str, default: "https://integrate.api.nvidia.com/v1"): Base URL for the NIM API endpoint. Defaults to NVIDIA's cloud endpoint. For local deployments, pass the local address (e.g., http://localhost:8000/v1).
  • model (str, default: None, deprecated): Model identifier to use. Deprecated in v0.0.105. Use settings=NvidiaLLMService.Settings(model=...) instead.
  • settings (NvidiaLLMService.Settings, default: None): Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using NvidiaLLMService.Settings(...). These can be updated mid-conversation with LLMUpdateSettingsFrame. See Service Settings for details. This service uses the same settings as OpenAILLMService. See OpenAI LLM Settings for the full parameter reference.

Usage

Basic Setup

import os
from pipecat.services.nvidia import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    settings=NvidiaLLMService.Settings(
        model="nvidia/nemotron-3-nano-30b-a3b",
    ),
)

With Custom Settings

import os
from pipecat.services.nvidia import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    settings=NvidiaLLMService.Settings(
        model="nvidia/nemotron-3-nano-30b-a3b",
        temperature=0.7,
        top_p=0.9,
        max_completion_tokens=1024,
    ),
)

Notes

  • Token reporting: NVIDIA NIM uses incremental token reporting. The service accumulates token usage metrics during processing and reports the final totals at the end of each request.
  • Cloud vs. local deployment: NIM supports both cloud-hosted and on-premises deployments. For on-premises, override the base_url to point to your local NIM endpoint. API keys are only required for the cloud endpoint.
  • Reasoning content: The service automatically detects and filters reasoning content from model responses, emitting it as LLMThought*Frame objects. This applies to:
    • Models with API-level reasoning separation (e.g., Nemotron Nano models) that include a reasoning_content field
    • Models that emit reasoning inline using <think>...</think> tags (e.g., DeepSeek-R1, some Nemotron models)
    Reasoning frames are accessible to observers and logging but are not sent to TTS, keeping the spoken output clean while preserving visibility into the model’s thought process.
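As an illustration of the inline-tag case, separating <think>...</think> reasoning from spoken text can be sketched in plain Python. This is a simplified, hypothetical helper for illustration only, not Pipecat's actual implementation (which operates on streamed chunks rather than complete strings):

```python
import re

# Matches a complete inline reasoning span, as emitted by models like
# DeepSeek-R1. Hypothetical helper -- not Pipecat's actual implementation.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, list[str]]:
    """Return (spoken_text, reasoning_chunks) for a model response.

    The reasoning chunks would be surfaced as thought frames for observers
    and logging, while only the spoken text would be forwarded to TTS.
    """
    reasoning = _THINK_RE.findall(text)
    spoken = _THINK_RE.sub("", text).strip()
    return spoken, reasoning
```

For example, `split_reasoning("<think>check units</think>It is 42 km.")` yields the spoken text "It is 42 km." and the reasoning chunk "check units".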
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.
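The incremental token reporting described in the Notes above can also be sketched in plain Python. This is a simplified illustration of the accumulate-then-report pattern, not the actual service code; the class and method names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TokenUsage:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

class UsageAccumulator:
    """Accumulate per-chunk token usage; report totals once at the end.

    Simplified illustration -- not the actual NvidiaLLMService code.
    """

    def __init__(self) -> None:
        self._usage = TokenUsage()

    def add_chunk(self, prompt_tokens: int = 0, completion_tokens: int = 0) -> None:
        # NIM reports usage incrementally, so each chunk's counts are
        # added to the running totals rather than taken as final values.
        self._usage.prompt_tokens += prompt_tokens
        self._usage.completion_tokens += completion_tokens

    def finalize(self) -> TokenUsage:
        # Called once per request, after the stream completes.
        return self._usage
```

Accumulating three chunks with 12 prompt tokens and 5 + 7 completion tokens, for instance, would finalize to 24 total tokens.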