Integrating AI into Laravel: A Guide to RAG, Embeddings, and Function Calling

Beyond the Chatbox: Practical AI in Laravel

Most developers treat AI as a standalone product, but the real power lies in making it a core feature of your web applications. Integrating large language models (LLMs) like those from OpenAI into the Laravel ecosystem isn't just about sending strings to an API. It requires a fundamental shift in how we handle data retrieval and program flow. By moving past simple chat interfaces, we can build tools that understand our private documentation and interact with our application's internal functions. This tutorial explores the technical bridge between PHP and AI, focusing on making responses predictable, relevant, and actionable.

Prerequisites

To follow this guide, you should be comfortable with:

  • PHP 8.2+ and the Laravel framework.
  • PostgreSQL with a basic understanding of database migrations.
  • Fundamental API concepts (REST, JSON payloads).
  • A basic understanding of the Inertia.js or Vue.js stack is helpful for front-end implementation but not strictly required for the backend logic.

Key Libraries & Tools

  • OpenAI PHP Laravel: A community-supported wrapper developed by Nuno Maduro and Sandro Gehri that provides a clean facade for API interactions.
  • pgvector: A PostgreSQL extension that allows for storing and querying vector embeddings.
  • Eloquent pgvector support: A package that adds vector support directly to Laravel’s Eloquent ORM, allowing you to treat high-dimensional data like standard attributes.

Mastering the Prompt: Identity and Instructions

Everything begins with the prompt. In a programmatic context, we use the Chat Completions API. This isn't just a single string; it's an array of messages with specific roles.

$response = OpenAI::chat()->create([
    'model' => 'gpt-4o',
    'messages' => [
        ['role' => 'system', 'content' => 'You are an energetic Laravel expert.'],
        ['role' => 'user', 'content' => 'How do I add TFA to my app?'],
    ],
    'max_tokens' => 1024,
    'temperature' => 0,
]);

The System Role acts as the director, setting the persona and behavioral boundaries. The User Role represents the actual query, while the Assistant Role is the model's response. Setting the temperature to 0 is a best practice for technical tools; it minimizes "creativity" and ensures the model provides the most likely, factual response rather than hallucinating varied answers.
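Once the request returns, the generated text and token counts are available on the response object. A minimal sketch of reading them, assuming the openai-php client's camelCase accessors:

```php
// Read the model's reply from the first choice.
$answer = $response->choices[0]->message->content;

// Token usage, useful for logging costs per request.
$promptTokens = $response->usage->promptTokens;
$completionTokens = $response->usage->completionTokens;
```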

Vector Embeddings and Semantic Search

Traditional search relies on keywords. Semantic search relies on meaning. We achieve this through Vector Embeddings—giant arrays of floating-point numbers (usually 1,536 dimensions) that represent the "essence" of a text snippet.

When a user asks a question, you first convert that question into a vector using the embeddings endpoint:

$result = OpenAI::embeddings()->create([
    'model' => 'text-embedding-3-small',
    'input' => $userQuestion,
]);

$queryVector = $result->embeddings[0]->embedding;

You then store these vectors in a database like PostgreSQL using the pgvector extension. This allows you to perform a mathematical comparison called an inner product to find documents that are "mathematically close" to the user's question. This is the foundation of Retrieval Augmented Generation (RAG).
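The storage side can be sketched as a migration plus a model cast. This assumes the pgvector-php Laravel integration and a hypothetical `sections` table; column names are illustrative:

```php
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;
use Pgvector\Laravel\HasNeighbors;
use Pgvector\Laravel\Vector;

// Migration: 1,536 dimensions matches text-embedding-3-small's default output.
Schema::create('sections', function (Blueprint $table) {
    $table->id();
    $table->text('content');
    $table->vector('embedding', dimensions: 1536);
    $table->timestamps();
});

// Model: the Vector cast lets you assign plain PHP arrays to the column,
// and HasNeighbors enables distance-based query scopes.
class Section extends \Illuminate\Database\Eloquent\Model
{
    use HasNeighbors;

    protected $casts = ['embedding' => Vector::class];
}
```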

Implementing RAG: Knowledge Augmentation

RAG allows an LLM to answer questions using data it wasn't originally trained on, such as your company's private documentation. The workflow follows a strict pattern:

  1. Embed the user's query.
  2. Query your database for snippets with similar vectors.
  3. Inject those snippets into the System Message of your next API call.

use Pgvector\Laravel\Distance;

// Querying similar documentation sections; InnerProduct maps to
// PostgreSQL's <#> (negative inner product) operator, so we order
// and limit by distance rather than filtering with a where clause.
$sections = Section::query()
    ->nearestNeighbors('embedding', $queryVector, Distance::InnerProduct)
    ->take(5)
    ->get();

$context = $sections->pluck('content')->implode("\n");

// Send context + question to the model
$finalResponse = OpenAI::chat()->create([
    'model' => 'gpt-4o',
    'messages' => [
        ['role' => 'system', 'content' => "Use this context to answer: $context"],
        ['role' => 'user', 'content' => $userQuestion],
    ],
]);
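For the retrieval step to work, your documentation must be embedded ahead of time. A hedged sketch of an ingest loop, where the splitting of documents into `$chunks` is assumed to happen elsewhere:

```php
// Embed each documentation chunk once at ingest time, so a user query
// only costs a single embeddings call at request time.
foreach ($chunks as $chunk) {
    $result = OpenAI::embeddings()->create([
        'model' => 'text-embedding-3-small',
        'input' => $chunk,
    ]);

    Section::create([
        'content' => $chunk,
        'embedding' => $result->embeddings[0]->embedding,
    ]);
}
```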

Agentic Control and Function Calling

While RAG provides information, Function Calling (or Tools) provides action. This shifts your app from linear control to Agentic Program Control. Instead of the developer defining every step, the model decides which tool to use.

You define your functions in JSON schema within the API request. If the model determines it needs a tool—like get_weather—it won't return a text answer. Instead, it returns a finish_reason of tool_calls.

'tools' => [[
    'type' => 'function',
    'function' => [
        'name' => 'get_weather',
        'description' => 'Get current weather for a location',
        'parameters' => [
            'type' => 'object',
            'properties' => [
                'location' => ['type' => 'string'],
                'unit' => ['type' => 'string', 'enum' => ['celsius', 'fahrenheit']],
            ],
            'required' => ['location'],
        ],
    ],
]]

Your Laravel application catches this request, executes the actual PHP logic (e.g., calling a weather API), and sends the results back to the model in a new message with the tool role. This loop continues until the model has enough data to provide a final response to the user.
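That loop can be sketched roughly as follows. Property names follow the openai-php client, and `getWeather()` is a hypothetical helper standing in for your real PHP logic:

```php
do {
    $response = OpenAI::chat()->create([
        'model' => 'gpt-4o',
        'messages' => $messages,
        'tools' => $tools,
    ]);

    $choice = $response->choices[0];

    if ($choice->finishReason === 'tool_calls') {
        // Echo the assistant's tool request back into the history.
        $messages[] = $choice->message->toArray();

        foreach ($choice->message->toolCalls as $toolCall) {
            $arguments = json_decode($toolCall->function->arguments, true);

            // Dispatch to real application code.
            $result = match ($toolCall->function->name) {
                'get_weather' => getWeather($arguments['location']),
            };

            // Feed the result back with the tool role and matching call ID.
            $messages[] = [
                'role' => 'tool',
                'tool_call_id' => $toolCall->id,
                'content' => json_encode($result),
            ];
        }
    }
} while ($choice->finishReason === 'tool_calls');

$finalAnswer = $choice->message->content;
```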

Syntax Notes & Best Practices

  • Role Discipline: Use the system role for instructions and the tool role for data output. Keep user input out of the system message; mixing the two invites prompt injection.
  • Token Management: Always monitor usage in the API response. High-density vectors and large context windows increase costs rapidly.
  • Identity Predictability: Give your AI a specific job title in the prompt. An "Expert Code Auditor" provides better feedback than a generic "AI Assistant."

Tips & Gotchas

  • Hallucinations: Always provide an "out" in your prompt. Tell the model: "If you do not know the answer based on the context provided, say you don't know."
  • Statelessness: Remember that each API call is independent. If you want a conversation, you must pass the entire history of messages (User, Assistant, and Tool) back to the model with every new request.
  • Distance Metrics: When using PG Vector, understand your distance operators. <#> is for negative inner product, while <-> is for Euclidean distance. For OpenAI embeddings, inner product is the recommended similarity metric.
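To make the statelessness point concrete, here is a hedged sketch of multi-turn state using Laravel's session (the `chat_history` key name is arbitrary):

```php
// Replay the stored history plus the new user message on every call.
$history = session()->get('chat_history', [
    ['role' => 'system', 'content' => 'You are an energetic Laravel expert.'],
]);

$history[] = ['role' => 'user', 'content' => $newMessage];

$response = OpenAI::chat()->create([
    'model' => 'gpt-4o',
    'messages' => $history,
]);

// Persist the assistant's turn so the next request sees it.
$history[] = ['role' => 'assistant', 'content' => $response->choices[0]->message->content];

session()->put('chat_history', $history);
```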