MCP – Research, Videos, Insights & Reviews

// AI Engineer

The Monolithic Memory Problem in Enterprise AI

Enterprise AI has hit a wall. Despite the explosion of LLM capabilities, from prompt engineering to deep agents like Replit, Jira tickets aren't moving faster. Raj Navakoti, a staff software engineer at IKEA, identifies a core structural issue: institutional knowledge is a monolith. He compares the current state of AI agents to the protagonist of the film Memento, who possesses high skill but cannot hold memory for more than 15 minutes. This creates a perpetual state of disorientation where agents excel at general tasks but fail at specific, domain-heavy requirements.

The industry currently pushes a retrieval-heavy strategy involving RAG and MCP servers. However, plugging an MCP server into a broken, monolithic knowledge base is like connecting a high-speed pipe to a dry well. Navakoti argues that roughly 40% of critical organizational knowledge is "tribal": it lives in people's heads and never hits Confluence or Slack. Another 20% is outdated, and 20% is unreliable. When agents fail, they aren't just failing to process data; they are exposing the structural gaps in how a company documents its existence.

Switching from Knowledge Push to Demand-Driven Pull

Traditional knowledge management relies on a "push" strategy: engineers attempt to document everything upfront and push it toward the agent. This is inherently inefficient. Navakoti proposes a "pull" strategy inspired by how we onboard new human employees. We don't ask a junior dev to memorize the entire company wiki before their first commit; we give them a task, and they pull the necessary information as they hit roadblocks. This demand-driven context approach turns the agent from a passive consumer into an active knowledge manager.

In this framework, the agent is intentionally given a problem it will likely fail to solve. This failure is the catalyst. Instead of giving up, the agent generates a checklist of exactly what it doesn't understand. It identifies missing API definitions, unclear business logic, and undocumented system behaviors. This "demand extraction" surfaces the tribal knowledge that documentation efforts usually miss because they don't know what they don't know.

The Agent Lifecycle and the Curation Loop

The methodology operates in a cycle similar to Test-Driven Development (TDD). In TDD, we write a failing test first to define the requirement. In demand-driven context, we provide a "failed problem." The cycle moves through four distinct phases:

1. **Problem Assignment**: An agent is tasked with a real-world issue, such as a root cause analysis for a production incident.
2. **Discovery of Gaps**: The agent attempts retrieval via RAG but identifies specific missing entities or outdated information, assigning confidence scores to its current context.
3. **Human Interjection**: A domain expert fills the specific gaps identified by the agent's checklist. This is surgical documentation rather than a broad, exhaustive effort.
4. **Curation and Storage**: The agent takes the new information, curates it into a structured format (like Markdown), and saves it to a persistent repository.

By running this cycle across multiple incidents, the agent's confidence score improves measurably. Navakoti demonstrated that over 14 incident cycles, an agent's confidence in handling tasks jumped from 1.5 to 4.4 out of 5. The agent builds its own "cache" of curated context blocks that are significantly more useful than the raw monolith of Confluence pages.

Why GitHub is the Ultimate Knowledge Repository

A controversial but practical element of Navakoti's framework is the storage layer. While many look to expensive SaaS solutions for knowledge management, he advocates for storing curated context in GitHub repositories. The reasoning is purely engineering-driven: GitHub provides built-in version control, Pull Request reviews, and conflict resolution. When multiple agents and human experts contribute to a shared knowledge base, data conflicts are inevitable. Treating knowledge as code allows teams to use the same rigorous CI/CD pipelines for their documentation as they do for their software. If an agent proposes a documentation update based on a resolved incident, a human expert can review that PR to ensure the logic is sound. Once merged, that knowledge is permanently indexed and available for all other agents in the ecosystem.

Navigating the Domain with Meta Models

Beyond raw text, the framework benefits from a Meta Model, a map of how different domain entities relate to one another. An agent needs to understand that a notification service failure affects specific business processes and relies on certain APIs. Without this map, the agent is just guessing which files to retrieve.

The Meta Model acts as a navigation layer. It allows the agent to reason about the "blast radius" of a change or the dependencies of a specific system. When the file structure of the knowledge base reflects this Meta Model, retrieval becomes deterministic rather than probabilistic. Instead of hoping the vector database finds the right chunk, the agent knows exactly which branch of the knowledge tree to explore.

Scalability and the Future of Agentic Operations

Critics of this approach often point to the human cost of answering agent questions. Navakoti acknowledges this but argues the cost is front-loaded. You are effectively performing a one-time "denial of service" on your engineers to fix a decade of documentation debt. Once the 20% of most-used knowledge is curated into context blocks, the agent becomes semi-autonomous.

The goal is to move the agent from being a pure consumer of information to a guardian of it. As context windows expand, with Claude now supporting 1 million tokens, the technical limitation is no longer memory size, but memory quality. By using failure as a scanner, enterprises can finally map their "unknown unknowns" and build a knowledge base that actually helps the AI move the Jira board.
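To make the four-phase curation loop described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Navakoti's implementation: the names `Gap`, `ContextStore`, `find_gaps`, `confidence`, and `run_cycle` are hypothetical, the confidence score is a toy coverage ratio scaled to 0..5, and the "persistent repository" is simply a directory of Markdown files that could live inside a Git working tree.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Gap:
    """One checklist item: something the agent does not understand yet."""
    topic: str
    question: str


@dataclass
class ContextStore:
    """Curated context blocks stored as Markdown files (assumed layout)."""
    root: Path

    def has(self, topic: str) -> bool:
        return (self.root / f"{topic}.md").exists()

    def save(self, topic: str, answer: str) -> Path:
        # Phase 4: curate the expert's answer into a structured Markdown block.
        path = self.root / f"{topic}.md"
        path.write_text(f"# {topic}\n\n{answer}\n", encoding="utf-8")
        return path


def find_gaps(required_topics: List[str], store: ContextStore) -> List[Gap]:
    """Phase 2: list every required topic that has no curated block yet."""
    return [
        Gap(topic, f"What should I know about '{topic}'?")
        for topic in required_topics
        if not store.has(topic)
    ]


def confidence(required_topics: List[str], store: ContextStore,
               scale: float = 5.0) -> float:
    """Toy score (an assumption): share of covered topics, scaled to 0..scale."""
    if not required_topics:
        return scale
    covered = sum(store.has(topic) for topic in required_topics)
    return round(scale * covered / len(required_topics), 1)


def run_cycle(required_topics: List[str], store: ContextStore,
              ask_expert: Callable[[Gap], str]) -> List[Gap]:
    """Phases 2 to 4: extract gaps, ask a human, persist the curated answers."""
    gaps = find_gaps(required_topics, store)
    for gap in gaps:
        store.save(gap.topic, ask_expert(gap))  # Phase 3 feeds Phase 4
    return gaps
```

In use, `run_cycle` would be called once per incident with the topics that incident touches and a callback that routes each `Gap` to a domain expert. Topics answered in earlier cycles are skipped, which is why the score rises as incidents accumulate. Committing the resulting files and opening a pull request for review is left out of the sketch.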
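The Meta Model idea described above, a map of entity relationships that lets an agent reason about "blast radius" and go straight to the right branch of the knowledge tree, can be sketched as a small dependency graph. Everything here is hypothetical and for illustration only: the entity names in `DEPENDS_ON`, the `knowledge/<type>/<entity>.md` layout, and the helper names `blast_radius` and `context_path` are assumptions, not part of the framework as presented.

```python
from collections import deque
from typing import Dict, List

# Hypothetical meta model: each entity lists the entities it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "order-confirmation": ["notification-service"],
    "delivery-updates": ["notification-service"],
    "notification-service": ["email-api", "sms-api"],
    "email-api": [],
    "sms-api": [],
}


def blast_radius(entity: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return every entity that directly or transitively depends on `entity`."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents: Dict[str, List[str]] = {}
    for source, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, []).append(source)

    # Breadth-first search over the inverted graph.
    seen = set()
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


def context_path(entity_type: str, entity: str) -> str:
    """Map an entity to a fixed file location (assumed repository layout)."""
    return f"knowledge/{entity_type}/{entity}.md"
```

With this shape, a failure in the hypothetical `email-api` resolves to a fixed set of affected entities and fixed file paths, so the agent looks up context by path instead of relying on a similarity search to surface the right chunk.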

May 5, 2026

Raj Navakoti says agent failure is the best way to fix your knowledge base
