Codapult ships with a production-ready AI layer built on the Vercel AI SDK, supporting OpenAI and Anthropic models, streaming responses, tool use, organization quotas, conversation memory, and a full RAG pipeline.
Architecture
```
src/lib/ai/
├── models.ts        # Client-safe model options (id, label, provider)
├── providers.ts     # getModel() — resolves modelId → LanguageModel
├── embeddings.ts    # Embedding adapter (OpenAI / Ollama)
├── vector-store.ts  # Vector store adapter (SQLite / memory)
├── rag.ts           # RAG pipeline (index → chunk → embed → store → retrieve)
├── conversations.ts # Conversation/message CRUD
└── chunker.ts       # Text chunking with overlap
```
Chat Endpoint
POST /api/chat accepts a JSON body with a messages array and an optional model selector:
```json
{
  "messages": [{ "role": "user", "content": "How do I deploy?" }],
  "modelId": "gpt-4o-mini"
}
```
The endpoint follows the standard API route pattern: auth check → rate limiting (30 requests per 60 seconds per user) → org quota check → Zod validation → RAG context injection → streaming response.
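The rate-limiting step can be pictured as a sliding window over recent request timestamps. A minimal in-memory sketch of the 30-requests-per-60-seconds rule (illustrative only — the actual limiter in Codapult may use a shared store and different function names):

```typescript
// Sliding-window rate limiter: 30 requests per 60 seconds per user.
// Illustrative sketch; a production limiter would typically use a shared
// store (e.g. Redis) rather than process memory.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 30;

const hits = new Map<string, number[]>(); // userId -> request timestamps

function allowRequest(userId: string, now: number = Date.now()): boolean {
  // Keep only timestamps still inside the window.
  const recent = (hits.get(userId) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_REQUESTS) {
    hits.set(userId, recent);
    return false; // over the limit -> the route responds with 429
  }
  recent.push(now);
  hits.set(userId, recent);
  return true;
}
```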
Available Models
| Model ID | Label | Provider |
| -------------------------- | --------------- | --------- |
| gpt-4o-mini | GPT-4o Mini | OpenAI |
| gpt-4o | GPT-4o | OpenAI |
| claude-sonnet-4-20250514 | Claude Sonnet 4 | Anthropic |
| claude-haiku-4-20250514 | Claude Haiku 4 | Anthropic |
Models are defined in src/lib/ai/models.ts. To add a new model, add an entry there and — if it's a new provider — add a case in src/lib/ai/providers.ts.
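The registry described above can be sketched as a plain array plus a lookup with a default fallback. Field names follow the description (id, label, provider); the exact shapes in models.ts and providers.ts may differ:

```typescript
// Illustrative shape of the model registry and resolution logic.
type Provider = "openai" | "anthropic";

interface ModelOption {
  id: string;
  label: string;
  provider: Provider;
}

const models: ModelOption[] = [
  { id: "gpt-4o-mini", label: "GPT-4o Mini", provider: "openai" },
  { id: "gpt-4o", label: "GPT-4o", provider: "openai" },
  { id: "claude-sonnet-4-20250514", label: "Claude Sonnet 4", provider: "anthropic" },
  { id: "claude-haiku-4-20250514", label: "Claude Haiku 4", provider: "anthropic" },
];

// Resolve a requested modelId, falling back to the default when unknown.
function resolveModel(modelId: string | undefined, defaultModel = "gpt-4o-mini"): ModelOption {
  return models.find((m) => m.id === modelId) ?? models.find((m) => m.id === defaultModel)!;
}
```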
Configuration
All AI settings live in src/config/app.ts under appConfig.ai:
```typescript
ai: {
  defaultModel: 'gpt-4o-mini',
  systemPrompt: 'You are a helpful AI assistant. Be concise, accurate, and helpful.',
  ragEnabled: true,
  ragMaxChunks: 3,
  ragMinScore: 0.4,
  allowedModels: [], // empty = all models from models.ts
}
```
| Setting | Description |
| --------------- | --------------------------------------------------------------------------- |
| defaultModel | Model used when the user doesn't pick one (must match an ID in models.ts) |
| systemPrompt | Prepended to every conversation |
| ragEnabled | Toggle RAG context injection in chat |
| ragMaxChunks | Maximum number of knowledge base chunks injected into the prompt |
| ragMinScore | Minimum cosine similarity score (0–1) for RAG results |
| allowedModels | Restrict the model selector; empty array enables all |
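ragMinScore and ragMaxChunks combine as a filter-then-truncate step: drop results below the similarity threshold, then keep the top N. A sketch under those assumptions (names are illustrative, not the actual rag.ts API):

```typescript
// Combine ragMinScore and ragMaxChunks: discard low-similarity chunks,
// then keep the highest-scoring N for prompt injection.
interface ScoredChunk {
  text: string;
  score: number; // cosine similarity, 0-1
}

function selectRagChunks(chunks: ScoredChunk[], maxChunks = 3, minScore = 0.4): ScoredChunk[] {
  return chunks
    .filter((c) => c.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxChunks);
}
```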
Tool Use
Chat supports function calling via the Vercel AI SDK. Tools are defined in /api/chat/route.ts:
```typescript
import { z } from 'zod';
import type { Tool } from 'ai';

const chatTools: Record<string, Tool> = {
  getWeather: {
    description: 'Get current weather for a city',
    parameters: z.object({ city: z.string() }),
    execute: async ({ city }) => {
      // fetchWeather is an app-level helper, not part of the AI SDK
      const data = await fetchWeather(city);
      return { temperature: data.temp, condition: data.condition };
    },
  },
};
```
Multi-step tool invocations are enabled with maxSteps: 3.
Organization Quotas
AI usage is tracked per organization. Each plan defines a monthly credit allowance for the aiChat resource. The quota is checked before every chat request via checkOrgQuota(). Credits reset monthly via a background cron job.
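The quota gate amounts to comparing this month's usage against the plan's allowance. A hedged sketch — the real checkOrgQuota() lives in the app's billing layer and its signature may differ:

```typescript
// Illustrative quota check: does the org have enough aiChat credits left
// this billing month to cover the request's cost?
interface OrgUsage {
  used: number;      // credits consumed this billing month
  allowance: number; // monthly credit allowance for the plan
}

function checkOrgQuota(usage: OrgUsage, cost = 1): { ok: boolean; remaining: number } {
  const remaining = usage.allowance - usage.used;
  return { ok: remaining >= cost, remaining };
}
```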
Chat Memory
Conversation history is persisted in the database via src/lib/ai/conversations.ts:
| Endpoint | Method | Description |
| ------------------------------ | ------ | -------------------------------- |
| /api/chat/conversations | GET | List user conversations |
| /api/chat/conversations | POST | Create a new conversation |
| /api/chat/conversations/[id] | GET | Get a conversation with messages |
| /api/chat/conversations/[id] | DELETE | Delete a conversation |
The Chat UI component (src/components/ai/ChatUI) connects to these endpoints and renders a full chat interface with model selection, conversation switching, and streaming responses.
RAG Pipeline
The RAG (Retrieval-Augmented Generation) pipeline lets the AI chat reference your domain-specific content — blog posts, help docs, feature requests, or any custom text.
How It Works
- Index — content is chunked (800 chars, 150 overlap), embedded, and stored in the vector store
- Retrieve — user queries are embedded and matched against stored vectors by cosine similarity
- Augment — matching chunks are injected into the system prompt with source citations
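The chunking step above (800 chars, 150 overlap) can be sketched as a sliding window. A simplified version of what chunker.ts does — the real chunker may additionally break on sentence or paragraph boundaries:

```typescript
// Fixed-size chunking with overlap: each chunk is up to 800 characters and
// shares its first 150 characters with the tail of the previous chunk.
function chunkText(text: string, chunkSize = 800, overlap = 150): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance 650 chars per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```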
Indexing Content
Use the indexDocument function or the admin API:
```typescript
import { indexDocument } from '@/lib/ai/rag';

await indexDocument({
  sourceType: 'help',
  sourceId: 'getting-started',
  title: 'Getting Started Guide',
  content: markdownContent,
});
```
For large content, use the rag-index background job:
```typescript
import { enqueue } from '@/lib/jobs';

await enqueue('rag-index', {
  sourceType: 'blog',
  sourceId: 'post-123',
  title: 'My Blog Post',
  content: markdownContent,
});
```
Admin Indexing API
POST /api/ai/index supports three actions:
| Action | Description |
| -------- | ---------------------------------------------------------------------- |
| index | Index a document (sourceType, sourceId, title, content) |
| search | Search the vector store (query, optional sourceTypes, limit, minScore) |
| delete | Delete indexed content (sourceType, optional sourceId) |
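For example, a search request body might look like this (field names follow the table above; the exact schema is defined by the route's Zod validator and may differ):

```json
{
  "action": "search",
  "query": "how do I deploy?",
  "sourceTypes": ["help", "blog"],
  "limit": 5,
  "minScore": 0.4
}
```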
Embedding Providers
Embeddings use the adapter pattern, switched via the EMBEDDING_PROVIDER env var:
| Provider | Env Value | Requirements |
| -------- | ------------------ | ------------------------------------------- |
| OpenAI | openai (default) | OPENAI_API_KEY |
| Ollama | ollama | OLLAMA_BASE_URL, OLLAMA_EMBEDDING_MODEL |
Ollama enables fully self-hosted embeddings — no external API calls. The default Ollama model is nomic-embed-text.
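The adapter switch can be sketched as a small factory keyed on the env var. Interface and function names here are illustrative (the OpenAI embedding model name below is an assumption, not confirmed by this doc); see src/lib/ai/embeddings.ts for the real implementation:

```typescript
// Adapter-pattern sketch: pick an embedding backend from EMBEDDING_PROVIDER.
interface EmbeddingAdapter {
  provider: "openai" | "ollama";
  model: string;
}

function createEmbeddingAdapter(env: Record<string, string | undefined>): EmbeddingAdapter {
  const provider = env.EMBEDDING_PROVIDER ?? "openai"; // openai is the default
  if (provider === "ollama") {
    // Self-hosted path: model defaults to nomic-embed-text.
    return { provider: "ollama", model: env.OLLAMA_EMBEDDING_MODEL ?? "nomic-embed-text" };
  }
  // "text-embedding-3-small" is an assumed default for illustration only.
  return { provider: "openai", model: "text-embedding-3-small" };
}
```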
Vector Store
Vector storage uses the adapter pattern, switched via VECTOR_STORE_PROVIDER:
| Store | Env Value | Description |
| ------ | ------------------ | --------------------------------------- |
| SQLite | sqlite (default) | Persisted in Turso alongside app data |
| Memory | memory | In-memory store for development/testing |
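The memory store amounts to a cosine-similarity scan over an array; the SQLite adapter persists the same data in Turso. A minimal sketch (names are illustrative, not the vector-store.ts API):

```typescript
// Minimal in-memory vector search: score every stored vector against the
// query by cosine similarity, filter by minScore, return the top matches.
interface StoredVector {
  id: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function search(store: StoredVector[], query: number[], limit = 3, minScore = 0.4) {
  return store
    .map((v) => ({ id: v.id, score: cosineSimilarity(v.embedding, query) }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```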
Source Types
Indexed content is categorized by source type:
| Type | Description |
| ----------------- | ------------------------------------ |
| blog | Blog posts |
| help | Help center / documentation articles |
| feature_request | Feature request descriptions |
| custom | Any custom content |
Environment Variables
| Variable | Default | Description |
| ------------------------ | ------------------------ | ------------------------------------------------- |
| OPENAI_API_KEY | — | Required for OpenAI models and default embeddings |
| ANTHROPIC_API_KEY | — | Required for Anthropic models |
| EMBEDDING_PROVIDER | openai | Embedding backend (openai or ollama) |
| VECTOR_STORE_PROVIDER | sqlite | Vector storage backend (sqlite or memory) |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| OLLAMA_EMBEDDING_MODEL | nomic-embed-text | Ollama model name for embeddings |
Removing the Module
AI Chat and the RAG Pipeline are separate removable modules. Use the setup wizard (npx @codapult/cli setup) to strip either or both. See docs/MODULES.md for manual removal steps.