December 29, 2025
AUTHOR Inside Practice
Knowledge discipline and architectural control now matter more than models.

For most law firms, the AI conversation has already moved on from curiosity.
The early questions of whether it can draft, summarize, and speed things up have largely been answered. What firms are grappling with now is something more difficult and more consequential: what does it mean to trust AI inside a legal operating environment?
Not “trust” in the abstract. Trust in the very practical sense that partners, risk teams, and clients care about:
- Can we rely on the output?
- Can we explain how it was produced?
- Can we defend the system that produced it?
Across recent Inside Practice discussions with firm leaders, innovation teams, and technologists, a consistent pattern is emerging. Legal AI rarely fails because the model underperforms. It fails because the surrounding system was never designed for reliability.
Two pressure points show up again and again: how firms handle knowledge, and how they control access as AI systems become more autonomous. Each exposes a different weakness. Together, they explain why so many pilots stall when firms try to scale.
Why “adding retrieval” hasn’t delivered reliability
Retrieval-Augmented Generation (RAG) is often positioned as a corrective: a way to ground AI output in firm knowledge and reduce hallucinations. In practice, many firms discover that retrieval solves one problem while revealing several others.
Outputs may be technically linked to internal documents, yet still fall short of legal standards. Answers sound plausible but miss critical nuance. Authorities are incomplete. Context is flattened. The result is work that feels risky rather than reliable.
The issue is rarely the retrieval technique itself. It is the condition of the knowledge being retrieved.
Most law firm knowledge environments were not built with machine consumption in mind. Information is spread across systems, structured inconsistently, maintained unevenly, and constrained by governance requirements that matter enormously in legal practice: ethical walls, matter boundaries, jurisdictional sensitivities.
When that reality is ignored, RAG becomes a thin layer over unresolved complexity.
What firms are learning (often the hard way) is that reliability begins upstream. Before retrieval ever takes place, knowledge must be prepared: structured so meaning is preserved, terminology aligned across practices, outdated material addressed, and context narrowed deliberately so systems are not forced to guess.
Governance here is not a policy exercise. It is an architectural one. If context is too broad, models improvise. If it is too narrow, they underperform. Designing that balance is work firms can no longer avoid.
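
To make that upstream work concrete, here is a minimal, hypothetical Python sketch of what "preparing knowledge before retrieval" can look like: governance rules (ethical walls, matter boundaries, jurisdiction) and currency checks are applied before any similarity search ever runs. The class names, fields, and thresholds are illustrative assumptions, not drawn from any particular product or firm system.

```python
# A minimal sketch (not a production pipeline) of "prepare before you retrieve":
# documents are filtered for governance and currency upstream of any retrieval.
# All names here are hypothetical.

from dataclasses import dataclass, field
from datetime import date


@dataclass
class KnowledgeDocument:
    doc_id: str
    matter_id: str
    jurisdiction: str
    last_reviewed: date
    superseded: bool
    text: str


@dataclass
class RetrievalScope:
    """Deliberately narrowed context: who is asking, about which matter, where."""
    user_id: str
    matter_id: str
    jurisdiction: str
    walled_matters: set[str] = field(default_factory=set)  # ethical-wall exclusions


def prepare_corpus(docs: list[KnowledgeDocument],
                   scope: RetrievalScope,
                   stale_after: date) -> list[KnowledgeDocument]:
    """Apply governance and currency rules before any retrieval takes place."""
    eligible = []
    for doc in docs:
        if doc.matter_id in scope.walled_matters:
            continue  # ethical wall: never retrievable for this user
        if doc.matter_id != scope.matter_id:
            continue  # matter boundary: stay inside the engagement
        if doc.jurisdiction != scope.jurisdiction:
            continue  # jurisdictional sensitivity: no cross-border mixing
        if doc.superseded or doc.last_reviewed < stale_after:
            continue  # outdated material is excluded, not merely down-ranked
        eligible.append(doc)
    return eligible


# Only the prepared, in-scope corpus is handed to whatever retriever the firm uses:
# retriever.index(prepare_corpus(all_docs, scope, stale_after=date(2024, 1, 1)))
```

The point of the sketch is the ordering, not the specific rules: the retriever only ever sees a corpus that has already been scoped, so the model is never left to guess which documents are in bounds.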
These realities are the focus of From Retrieval to Reliability: Making RAG Work in Law Firms, a live Inside Practice discussion on January 23, 2026, led by Illitch Real and Zsolt Apponyi of Rubiklab. The session is grounded not in theory, but in the operational realities firms encounter once they try to deploy RAG beyond controlled demos.
When AI stops answering questions and starts taking action
As firms move beyond retrieval and drafting, a second challenge emerges, one that raises the stakes considerably.
Agent-based systems change the nature of the risk. These systems do not simply respond to prompts. They interact with tools, touch data, and initiate actions. At that point, familiar governance questions become more urgent:
- What can the system access?
- What is it allowed to do?
- And how consistently are those rules enforced?
Many firms discover that their existing AI governance frameworks were designed for a different era, one where AI lived inside a single interface and operated largely as a passive assistant. Agents expose the limits of that model. Permissions are implemented inconsistently. Integrations proliferate. Different tools expose the same data in different ways. Control becomes fragmented, and with it, accountability.
This is where architectural questions move to the foreground. The Model Context Protocol (MCP) has attracted attention not because it promises better answers, but because it addresses a different problem: how AI systems connect to the environments they operate within.
MCP focuses on standardizing the interface between AI applications and the external resources they depend on: data sources, tools, and workflows. For law firms, its relevance lies in the possibility of reducing integration sprawl, enforcing permissions consistently, and introducing auditability into how AI systems actually function in practice.
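
As a rough illustration of that idea, the Python sketch below shows the pattern in miniature. It is not the MCP protocol or its SDK; it is a simplified, hypothetical gateway in which every tool an agent can invoke passes through one choke point that applies the same permission check and writes the same audit entry. All class, tool, and role names are assumptions made for the example.

```python
# A simplified illustration of the pattern MCP points toward, not the MCP SDK itself:
# every agent-callable tool sits behind one gateway that enforces a single permission
# model and audit trail, rather than each integration re-implementing its own rules.
# All names are hypothetical.

import logging
from typing import Any, Callable

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("ai.tool.audit")


class ToolGateway:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}
        self._permissions: dict[str, set[str]] = {}  # tool name -> roles allowed to call it

    def register(self, name: str, fn: Callable[..., Any], allowed_roles: set[str]) -> None:
        self._tools[name] = fn
        self._permissions[name] = allowed_roles

    def call(self, name: str, caller_role: str, **kwargs: Any) -> Any:
        """Single choke point: permission check and audit entry for every agent action."""
        if caller_role not in self._permissions.get(name, set()):
            audit_log.warning("DENIED tool=%s role=%s args=%s", name, caller_role, kwargs)
            raise PermissionError(f"{caller_role} may not call {name}")
        audit_log.info("ALLOWED tool=%s role=%s args=%s", name, caller_role, kwargs)
        return self._tools[name](**kwargs)


# Example: a document search tool that only matter-team roles may invoke.
gateway = ToolGateway()
gateway.register("search_matter_docs",
                 lambda query, matter_id: f"results for {query!r} in {matter_id}",
                 allowed_roles={"matter_team"})

print(gateway.call("search_matter_docs", caller_role="matter_team",
                   query="indemnity clause", matter_id="M-1042"))
```

The design choice that matters is the single choke point: permissions and audit are enforced once, in the architecture, rather than re-created tool by tool.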
Whether MCP itself becomes the dominant standard is almost beside the point. The underlying shift is more important. Firms are beginning to recognize that control cannot be bolted on after the fact. It has to be designed into the system.
These issues will be examined in Mastering AI with MCP: A Protocol for Controlling AI Data Access, taking place on January 21, 2026, with Morgan Llewellyn of HIKE2. The discussion is aimed squarely at firms that are already experimenting with advanced AI and are now confronting the operational consequences.
The uncomfortable middle ground firms are now in
What connects these two conversations is not technology, but operating reality.
Many firms are stuck in an uncomfortable middle ground. They have pilots that impress. They have proofs of concept that work under controlled conditions. But when asked whether those systems could be trusted across matters, practices, and jurisdictions, confidence drops.
The reason is structural.
- Knowledge is treated as content rather than infrastructure.
- Access is treated as configuration rather than design.
- Governance is treated as oversight rather than architecture.
The result is AI that performs well in isolation but struggles when exposed to the complexity of real legal work.
What is emerging instead is a more demanding and more realistic view of legal AI:
- Knowledge must be prepared intentionally, not simply indexed.
- Context must be constrained deliberately, not inferred.
- Access must be standardized, not recreated tool by tool.
- Control must be embedded in architecture, not managed through exception handling.
This is not glamorous work. It does not produce viral demos. But it is the work required if AI is to become something firms can rely on rather than continually manage around.
Why these conversations matter now
Neither of these discussions is about chasing the next tool. Both are about confronting decisions firms are already making, often implicitly.
- How much autonomy are we prepared to allow?
- Where do we draw boundaries, and how are they enforced?
- What does “reliable” actually mean in a legal context?
As AI capabilities accelerate, these questions are becoming harder to defer. Firms that address them deliberately will be better positioned to scale. Those that don’t will find themselves constrained not by regulation, but by their own systems.
RAG is not “just retrieval.” MCP is not “just integration.”
Both point toward the same destination: AI that is safe enough to scale, consistent enough to trust, and governed enough to defend.
We invite you to join these discussions.
Two events. One question: how do we build legal AI that can be trusted?





