What California's Verify AI Output Rule Requires

On May 5, 2026, the California State Bar's ethics committee proposed adding a comment to Rule 1.1 (Competence) requiring lawyers to independently review and verify AI output used in client work. Connecticut's Rules Committee proposed a parallel requirement weeks earlier. New York's Unified Court System Part 161 takes effect June 1 with the same independent-checking standard. Three states proposing or enforcing essentially the same duty to verify AI output within 30 days is no longer a coincidence. It is a cascade.

The unspoken question the rule creates is operational. What does 'independent verification' actually mean? Pasting a draft back into the same model's API and asking 'is this correct?' does not qualify. This post walks through the proposed text, the multi-state cascade, the enforcement record that prompted it, and what an objective standard for 'independent' looks like in practice.

What does the California Bar's 'verify every AI output' rule actually require?

The proposed comment to Rule 1.1 (Competence) reads, in the redline circulated by the California State Bar in April 2026: 'When using technology, including AI, a lawyer must independently review and verify any output used in representing a client.' This language appears in the proposed amended rules redline and was confirmed by the Newsroom of the California Courts on May 5, 2026, alongside five other AI-focused ethics changes covering disclosure, confidentiality, candor, and supervision.

Three things matter about the wording.

The duty is independent

The rule does not say 'review the AI output.' A lawyer already has a competence duty to review their work product. The new word is 'independent,' and that word is doing all the work.

The duty is affirmative

The rule does not say 'do not rely uncritically on AI.' It says 'review and verify.' Verification is an action with an artifact. A lawyer asked 'did you verify this independently?' needs an answer that does not reduce to 'I read it again.'

The duty applies to any output used in representing a client

Not 'AI-generated citations' or 'filings.' Outputs. That language scopes to research summaries, deposition prep memos, draft contract clauses, witness questions, anything that travels from the model into client work.

Bob Ambrogi's LawNext walkthrough circulated the exact 'independently review and verify' language to the bar and in-house community within 24 hours of the announcement. Every California lawyer with an AI workflow now has that language sitting on their desk.

How is 'independent verification' different from re-reading an AI draft?

This is the question the rule does not answer, and the answer is the operational core of the duty. Consider the difference between two workflows.

Workflow A

Lawyer drafts brief with Claude. Pastes the draft back into Claude. Asks: 'Are these citations accurate? Are these arguments correctly stated?' Claude responds with confidence that everything looks fine. Lawyer files.

Workflow B

Lawyer drafts brief with Claude. Pulls citations into Westlaw and confirms each one exists, says what is claimed, and remains good law. Asks GPT and Gemini to independently extract the supporting passages from the actual sources. Files only when the cross-vendor extractions agree.

Workflow A is what most lawyers describe when they say 'I checked the AI output.' Workflow B is what the proposed rule actually requires. The difference is structural. Workflow A is the same probability distribution that produced the draft grading the draft. Workflow B introduces non-correlated error sources, which is the architectural reason independence has operational meaning.
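The filing gate in Workflow B can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's product: the `Extraction` record, the vendor labels, and the case names in the usage note are all hypothetical, and exact-match comparison after whitespace and case normalization is a deliberately crude stand-in for whatever comparison a real workflow would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    vendor: str    # hypothetical label for the model provider
    citation: str  # the citation being checked
    passage: str   # supporting passage the model pulled from the actual source


def normalize(text: str) -> str:
    """Collapse whitespace and case so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def citations_in_dispute(extractions: list[Extraction], min_vendors: int = 2) -> list[str]:
    """Return citations that too few vendors checked, or where vendors disagree.

    Assumption: a citation passes only if at least `min_vendors` distinct
    vendors extracted the same normalized passage.
    """
    by_citation: dict[str, dict[str, str]] = {}
    for e in extractions:
        by_citation.setdefault(e.citation, {})[e.vendor] = normalize(e.passage)

    disputed = []
    for citation, by_vendor in by_citation.items():
        too_few = len(by_vendor) < min_vendors
        disagree = len(set(by_vendor.values())) > 1
        if too_few or disagree:
            disputed.append(citation)
    return sorted(disputed)
```

Fed two placeholder citations, one where both vendors return the same passage and one where they differ, the function returns only the second. A citation checked by a single vendor is also flagged, which is the Workflow A case. An empty result is the 'file only when the extractions agree' condition, and it still sits on top of the Westlaw check rather than replacing it.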

Why this matters for the duty: in April 2026, the Northern District of California fined a managing partner for failing to supervise a junior whose AI-fabricated citations made it into a filing (Bloomberg Law, April 29 2026). The court did not fault the AI tool. It faulted the supervision chain. Verification is now part of supervision. 'I read it again' is not a defense. 'We ran it through cross-vendor consensus' is.

A practical playbook published April 24 by Unite.ai proposes evidence-linked outputs with traceable sources, reviewer identity, timestamps, and model/version metadata. That framework is what defensible verification looks like as a paper trail. It does not specify the verification mechanism, but the implication is unavoidable: a paper trail of the same model checking itself is not evidence of independence.

Notice the pattern. Workflow B is just Workflow A with a second probability distribution checking the first. That is exactly what TrueStandard does: paste your draft, four to five models check in parallel in 60 seconds, every disagreement surfaced.

Which states are requiring lawyers to verify AI work?

Three live moves in the same 30-day window.

California (May 5, 2026)

Proposed Rule 1.1 comment requiring independent verification of AI output, plus five related ethics changes. See the Cal Bar redline PDF.

Connecticut (April 23, 2026)

Rules Committee proposed practice-book changes requiring attorneys and pro se parties to independently verify all citations and authorities produced by generative AI. Practitioner take: this fits longstanding obligations rather than imposing new ones (Day Pitney via Law360).

New York (effective June 1, 2026)

UCS Part 161 already adopted, requiring attorneys who used AI to 'carefully review papers and independently ensure no fabricated material.' Local courts are posting compliance guidance now (master rule, Kings County local).

The eDiscovery Today May 6 aggregation suggests the cascade is the story, not any individual rule. Three states acting nearly in parallel with substantively similar language is how a national standard gets written without a federal rulemaking. The next 90 days will produce more state proposals. The firms watching the cascade are already adjusting their internal QA before the rules take effect.

Why is California writing this rule now?

Because the enforcement docket already exists and is growing. The sanction record on AI citations reads like a chronology of the rule writing itself. In the same window the proposal was being drafted:

Pa. attorney

$5,000 sanction plus AI-ethics training for an AI-generated case citation. Judge 'appalled' by repeated bogus citations (Law360, April 20 2026).

Sullivan & Cromwell

Apologized to Bankruptcy Judge Martin Glenn (SDNY) for AI-introduced errors in a Chapter 15 filing. Opposing counsel flagged the errors (Reuters, April 21 2026).

Georgia Supreme Court

Sanctioned an assistant district attorney for nonexistent AI-generated citations in a murder appeal (Law360, May 5 2026).

California State Bar

Opened disciplinary charges against multiple lawyers for filings with fake AI citations. One faces probationary suspension (LA Times, April 13 2026).

Court order in deposition

Barred a pro se deponent from using ChatGPT during deposition and rejected privilege claims. The opinion: 'AI tools are not lawyers' (EDRM, April 27 2026).

Delaware Chancery $250M earn-out dispute

Cited a CEO's ChatGPT records in the judicial opinion. AI chats are now part of the evidentiary record (Alston Privacy, April 28 2026).

When courts and bar regulators across this many jurisdictions act against practitioners within 30 days for variations on the same failure, the rule writes itself. The California proposal is not novel. It is a codification of what the bench has been doing case by case.

Does asking ChatGPT to fact-check itself satisfy a verification duty?

No. The rule's language, 'independently' review and verify, is the answer. The architectural reason matters: a single LLM verifying its own output is a closed loop. The model that produced the hallucination is the same probability distribution being asked to evaluate it. Asking the same model whether its output is correct is asking the failure mode to grade itself.

The April 2026 Microsoft Research DELEGATE-52 benchmark measured this directly. Frontier models (Gemini 3.1 Pro, Claude 4.6 Opus, GPT 5.4) corrupt an average of 25% of document content over 20-step workflows. The paper finds that 'agentic tool use did not measurably reduce corruption.' Adding more steps with the same vendor amplifies the failure rather than catching it.

A contemporaneous Purdue preprint, No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models, proves a stronger statement: non-hallucinating learning is statistically impossible from training data alone, regardless of how clean the corpus is.

The structural conclusion is that single-vendor verification cannot, in principle, be the primary defense. Independent verification, in the sense the California rule uses, requires a non-correlated error source. In practice that means a model from a different provider, with a different training run and a different alignment regime. Multi-vendor consensus is the only architecture where 'independent' has an operational meaning.
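A small worked calculation shows why correlation is the variable that matters. The sketch below treats 'misses the error' as a yes/no event for each of two checkers, assumes both have the same miss rate p, and uses the standard identity for two correlated Bernoulli events: P(both miss) = p^2 + rho * p * (1 - p). The 10% miss rate in the usage note is an illustrative assumption, not a measured figure from any benchmark.

```python
def joint_miss_probability(p: float, rho: float) -> float:
    """Probability that two checkers BOTH miss the same error.

    p   -- each checker's individual miss rate (assumed equal for both)
    rho -- correlation between the two checkers' misses, from 0 to 1
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1")
    # For two Bernoulli(p) events: E[XY] = p^2 + Cov(X, Y), Cov = rho * p * (1 - p)
    return p * p + rho * p * (1.0 - p)
```

With an assumed miss rate of 0.10, a fully correlated second check (rho = 1, the same model grading itself) still misses the error with probability 0.10, so the second pass adds nothing. A fully uncorrelated check (rho = 0) brings that down to 0.01. Models from different vendors will likely sit somewhere in between, since they may share training data, which is one reason the disagreement signal still needs a human reviewer.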

Notice the pattern. Same vendor checking same vendor is the same probability distribution grading itself. That is exactly what TrueStandard does differently: paste your draft, four to five models from different labs check in parallel in 60 seconds, every disagreement surfaced.

What does a defensible AI workflow look like for a regulated industry?

Five elements, drawn from the Unite.ai defensible-LLM-outputs playbook and the supervisor-sanction precedent in NorCal.

Multi-vendor verification

Run claims through models from at least two different vendors. The verification artifact is the disagreement signal, not just the consensus.

Recorded model identity and version

The verification log captures which models were used, at what version, on what date. This is the audit trail a regulator or judge can ask to see.

Reviewer identity

A specific human signs off on the verified output. The signature is the bridge between the AI workflow and the lawyer's professional duty.

Risk-stratified gates

Not every output needs the same verification depth. Citations and statistical claims get the highest check. Style edits do not.

Privilege segregation

Public-tier consumer chatbots are not privileged. Enterprise tiers with confidentiality terms are different. The April 2026 US v. Heppner analysis shows the line being drawn case by case.
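Several of these elements can live in a single log record. The following is a hedged sketch of what such a record might look like. Every class and field name here is hypothetical, the two-vendor threshold for high-risk output is taken from the 'at least two different vendors' element above, and the choice to require only a named reviewer for low-risk output is an assumption made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class RiskTier(Enum):
    HIGH = "high"  # e.g. citations and statistical claims
    LOW = "low"    # e.g. style edits


@dataclass(frozen=True)
class ModelRun:
    vendor: str
    model: str
    version: str


@dataclass(frozen=True)
class VerificationRecord:
    claim: str
    risk_tier: RiskTier
    runs: tuple[ModelRun, ...]
    reviewer: str     # the specific human who signs off
    verified_at: str  # ISO 8601 timestamp, supplied by the caller

    def distinct_vendors(self) -> int:
        return len({run.vendor for run in self.runs})

    def meets_gate(self) -> bool:
        """High-risk output needs two or more vendors; all output needs a named reviewer."""
        required = 2 if self.risk_tier is RiskTier.HIGH else 0
        return bool(self.reviewer.strip()) and self.distinct_vendors() >= required

    def to_json(self) -> str:
        """Serialize the record so it can be stored in the matter file."""
        payload = asdict(self)
        payload["risk_tier"] = self.risk_tier.value
        return json.dumps(payload, sort_keys=True)
```

A high-risk record with runs from only one vendor fails `meets_gate()`, as does any record with a blank reviewer. Privilege segregation is not modeled here because it turns on vendor contract terms rather than on anything a log record can enforce.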

The Anthropic legal demo on April 23, 2026 drew over 20,000 registrants (Florida Bar coverage). The same week Freshfields announced a firm-wide Claude deployment. The volume of professional adoption tells you why the verification rule is being written now: lawyers are using AI faster than the standards exist to govern that use.

If California passes the rule, what changes for non-lawyers using AI in regulated work?

The rule is written for lawyers, but the reasoning generalizes immediately to any profession with a competence duty: licensed accountants, financial advisors, medical writers, regulatory affairs consultants, compliance officers, due-diligence analysts. Three predicted ripple effects in the next 6 to 12 months.

State medical and accounting boards will copy the language

The 'competence' duty exists in nearly every professional code. The verification gap is the same. Expect parallel rulemakings in 2026 and 2027.

Enterprise procurement criteria will shift

Compliance teams at regulated employers will require vendor evidence of multi-vendor verification capability before approving AI workflows. The current 'approved AI tools list' pattern at most large firms does not address verification independence. It will have to.

Insurance pricing will follow the docket

Professional liability carriers price what is in the claims data. The PA, NorCal, GA, and CA disciplinary record is now the data. Premiums will reflect verification practice within 12 to 18 months.

For independent professionals (freelance journalists, newsletter writers, solo consultants), the practical version of the rule already applies, not as a regulatory duty but as a market reality. The cost of an unverified AI claim is reputational and immediate. The verification question is the same. The architecture answer is the same.

What's the practical difference between Harvey, Legora, and Claude for compliance-grade work?

This was the most-discussed question on r/legaltech this month. The post 'Lawyer here - how are Legora and Harvey differentiated from Claude' hit 49 upvotes and 106 comments. The honest read:

Vertical legal AI tools: Harvey, Legora, Spellbook, Ivo, Wordsmith

They wrap LLMs with legal-specific UX, integrations, prompt libraries, and source citations. They are workflow products. Their verification is typically single-vendor (often Claude or GPT under the hood).

General-purpose LLM access: Claude, GPT, Gemini Word add-ins

Integrated into the documents lawyers work in. Cheaper, broader, but no legal-specific guardrails.

Multi-vendor verification (TrueStandard category)

Adds a layer that verifies claims across vendors, independent of which tool produced the draft. The verification is vendor-neutral by design.

The 'moat' question is the wrong question. The right question: which layer of the stack does the regulator's 'independent verification' duty actually live in? It cannot be the drafting layer (that is the layer producing the output). It cannot be a same-vendor self-check (that fails the independence test). It has to be a verification layer that crosses vendors. That layer does not yet exist as a default in Harvey, Legora, or Claude, which is the gap the rule is going to expose.

How does multi-vendor AI verification reduce supervisor liability?

This is what the NorCal supervision sanction actually decided. The court did not punish the junior lawyer who pasted in the AI-fabricated citation. The court punished the supervising partner for not having a process that would catch it. Three operational implications for managing partners and content team leads.

The verification step has to be in the workflow, not in the lawyer's head

'We trust our associates to check AI output' is no longer a process. A documented step that produces an artifact (verification report, model-disagreement log) is.

The verification artifact has to be vendor-independent

A log showing 'Claude was asked to fact-check the Claude draft' demonstrates the absence of independence rather than its presence.

The verification has to be reproducible

If a regulator asks 'show your work,' the answer needs to be replayable: same input, same model panel, same disagreement output.

These three together make the verification step defensible. The supervisor's duty becomes 'our firm has a documented multi-vendor verification process and the artifact is in the file.' That is a different conversation from 'we trusted the associate.'
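One way to make 'replayable' concrete is to fingerprint the three things the regulator would ask about: the input, the model panel, and the disagreement output. The sketch below uses only Python's standard `hashlib` and `json` modules. The function name and the dictionary keys (`vendor`, `model`, `version`) are hypothetical, and the scheme is an illustration rather than an established standard.

```python
import hashlib
import json


def artifact_fingerprint(
    draft_text: str,
    panel: list[dict[str, str]],
    disagreements: list[str],
) -> str:
    """Return a SHA-256 fingerprint of one verification run.

    Panel entries and disagreements are sorted first, so the same run
    always yields the same fingerprint regardless of listing order.
    """
    canonical = json.dumps(
        {
            "draft_sha256": hashlib.sha256(draft_text.encode("utf-8")).hexdigest(),
            "panel": sorted(panel, key=lambda m: (m["vendor"], m["model"], m["version"])),
            "disagreements": sorted(disagreements),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Stored in the file next to the verification report, the fingerprint lets a later replay be compared against the original run. Model outputs are not guaranteed to be identical from one run to the next, so a mismatch on replay is a prompt to inspect what changed, not proof that the original record was wrong.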

Notice the pattern. A defensible artifact is a vendor-independent, reproducible record of model disagreement. That is exactly what TrueStandard produces: paste your draft, four to five models check in parallel in 60 seconds, every disagreement surfaced with model identity, version, and timestamp logged for the file.

Frequently Asked Questions

When does the California rule take effect?

The proposal was released May 5, 2026 for public comment. Final adoption follows the comment period and Bar approval. New York's Part 161 is already adopted, effective June 1, 2026. Connecticut's proposal is pending.

Does the rule apply to lawyers in other states using AI on California matters?

Choice-of-law for ethics rules generally follows the jurisdiction where the lawyer is admitted and the matter is being handled. Multi-state firms typically follow the strictest standard. Practical answer: if California's rule passes, large firms will adopt it as a default to avoid having state-specific carve-outs in their AI workflows.

What about confidential client data? Can I use a multi-vendor verification tool with privileged content?

Privilege analysis depends on the vendor's terms. Public-tier consumer chatbots typically do not preserve privilege. Enterprise tiers with confidentiality terms are different. The April 2026 US v. Heppner ruling shows chats with the public version of Claude not being treated as privileged. TrueStandard's enterprise terms include explicit no-training-on-content commitments; see truestandard.ai/security.

Is there a federal rule equivalent to the California proposal?

Not yet. The Judicial Conference of the United States has not promulgated a verification rule, though individual federal judges have issued standing orders requiring AI-use disclosure on filings. The state-level cascade is currently the de facto standard. A federal rule may follow within 12 to 24 months.

How does this connect to the Lancet study on fabricated medical references?

The Lancet study (May 7, 2026) documented a 12-fold rise in fabricated references in biomedical papers since 2023, across 2.5M papers audited. The professional-duty implications for medical writers, regulatory affairs, and IRB-reviewing physicians are parallel to the legal duty being codified in California. Any profession with a competence duty has a verification problem the same architecture solves.

