Large Language Models
Spice provides a high-performance, OpenAI API-compatible AI Gateway optimized for managing and scaling large language models (LLMs). It also offers tooling for Enterprise Retrieval-Augmented Generation (RAG), such as SQL queries across federated datasets and advanced search (see Search).
Spice supports full OpenTelemetry observability, providing detailed tracking of model tool use, recursion, data flows, and requests for full transparency and easier debugging.
Configuring Language Models
Spice supports a variety of LLMs (see Model Providers).
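As a minimal configuration sketch, a model is declared in the `spicepod.yaml`. This example assumes an OpenAI provider and a secret named `SPICE_OPENAI_API_KEY`; the model name and key are placeholders to adjust for your deployment:

```yaml
models:
  - name: assistant            # served model name, used as the `model` field in API requests
    from: openai:gpt-4o-mini   # <provider>:<model id>; see Model Providers for supported values
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
```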
Core Features
- Custom Tools: Provide models with tools to interact with the Spice runtime. See Tools.
- System Prompts: Customize system prompts and override defaults for `v1/chat/completions`. See Parameter Overrides (a configuration sketch follows this list).
- Memory: Provide LLMs with memory persistence tools to store and retrieve information across conversations. See Memory.
- Vector Search: Perform advanced vector-based searches using embeddings. See Vector Search.
- Evals: Evaluate, track, compare, and improve language model performance for specific tasks. See Evals.
- Local Models: Load and serve models locally from various sources, including local filesystems and Hugging Face. See Local Models.
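The sketch below shows how several of these features might be enabled together on a single model. The `tools` and `system_prompt` parameter names are illustrative assumptions; see the Tools, Memory, and Parameter Overrides pages for the authoritative options:

```yaml
models:
  - name: assistant
    from: openai:gpt-4o-mini
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: auto         # assumed param: expose Spice runtime tools to the model
      system_prompt: |    # assumed param: override the default system prompt
        You are a helpful analyst for the sales team.
```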
For API usage, refer to the API Documentation.
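Because the gateway is OpenAI API-compatible, any standard OpenAI client can be pointed at the Spice runtime. A minimal Python sketch, assuming the runtime's default HTTP endpoint of `http://localhost:8090` and the `assistant` model defined above:

```python
from openai import OpenAI

# Target the local Spice runtime instead of api.openai.com.
# The api_key is a placeholder; it is unused unless the endpoint enforces auth.
client = OpenAI(base_url="http://localhost:8090/v1", api_key="unused")

response = client.chat.completions.create(
    model="assistant",  # the model `name` from spicepod.yaml
    messages=[{"role": "user", "content": "Which datasets can you query?"}],
)
print(response.choices[0].message.content)
```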
📄️ Tools
Learn how LLMs interact with the Spice runtime.
📄️ Memory
Learn how to provide LLMs with memory.
📄️ Evals
Learn how Spice evaluates, tracks, compares, and improves language model performance for specific tasks.
📄️ Parameter Overrides
Learn how to override default LLM hyperparameters in Spice.
📄️ Local Models
Learn how to load and serve large language models locally.