December 11, 2025

Do law firms need a HIPAA Business Associate Agreement (BAA) to use AI tools with PHI in 2025?

A partner drops a patient record into an AI tool. Did your firm just trigger HIPAA?

Clients expect speed in 2025. Regulators still expect Business Associate discipline. The basic question: if an AI platform will receive, store, or process Protected Health Information (PHI), do you need a HIPAA Business Associate Agreement (BAA)? Short answer: usually yes.

Here’s what follows: when law firms are Business Associates, when AI vendors become subcontractor BAs, and why the “conduit” exception doesn’t cover AI or cloud. You’ll see common AI use cases that need a BAA, when proper de-identification can avoid one, and the clauses you should insist on (no training on your data, retention/deletion, breach notice, and subprocessor flow-downs).

We’ll also hit security must-haves (SSO/MFA, RBAC, encryption, audit logs), practical data governance, a quick decision flow, and a vendor checklist. By the end, you’ll know how to use AI with PHI confidently—and keep clients and regulators happy.

TL;DR — Do law firms need a BAA to use AI tools with PHI in 2025?

If an AI platform will receive, maintain, or process PHI for your matters, you need a BAA with that vendor. HHS OCR’s cloud guidance says providers that maintain ePHI are Business Associates—even if the data is encrypted and the vendor claims “no access.”

OCR has enforced this for years. In the Raleigh Orthopaedic case (2016), a provider paid $750,000 for sending PHI to a vendor without a BAA. Same principle here: if your firm is a Business Associate and you send PHI to AI, that AI vendor is your subcontractor BA, so a BAA is required.

  • Staff paste, upload, or sync medical details into AI? Treat the vendor as a BA.
  • AI connects to your DMS or eDiscovery with PHI? It’s in HIPAA’s chain.

Do law firms need a BAA for AI tools handling PHI? In 2025, assume yes unless you’ve properly de-identified the data under HIPAA. One practical move: use a single BAA that covers every generative feature—chat, summarization, extraction—and future modules, so you’re not renegotiating every new release.

HIPAA essentials for law firms: PHI, Business Associates, and BAAs

PHI is any identifiable health info related to care, payment, or health status. Your firm becomes a Business Associate when providing legal services to a covered entity that involve creating, receiving, maintaining, or transmitting PHI. Think malpractice defense, subpoenas, internal investigations, payer disputes, regulatory work.

OCR is clear: just “maintaining” ePHI triggers BA obligations. Once your team touches PHI and shares it with a third party, HIPAA rules apply. The BAA sets the ground rules—permitted uses, breach notice, safeguards, subcontractor obligations, termination, and return or deletion of PHI.

Duties don’t end when you sign. You still need minimum necessary practices, training, and vendor oversight. Tip for litigators: when you build timelines that mix PHI and non-PHI (like employment docs), treat the whole set as PHI for handling. That lowers the risk of something sensitive slipping into an AI prompt by accident.

When AI vendors become your subcontractor Business Associates

An AI vendor is your subcontractor BA when it creates, receives, maintains, or transmits PHI for you. OCR’s cloud guidance says cloud providers are BAs even if they can’t see the data. The “conduit” exception is tiny—postal carriers and basic telecom—and it doesn’t apply to AI or typical cloud services.

So if you paste PHI into a drafting tool, upload records for summaries, or hook AI to repositories that hold PHI, you need a BAA. That includes background operations many teams forget: short-term caching, vector indexes, or evaluation pipelines. Those still “maintain” PHI.

Ask for a clear data map from the vendor—where prompts go, how embeddings are stored, what retrieval indices look like, and what gets logged. Confirm every subprocessor in that flow is covered by a BAA.

AI use cases in law firms that typically require a BAA

If your practice touches PHI, plenty of everyday AI work will trigger a BAA:

  • Document review and medical chronologies: EMRs, DICOM reports, payer data—classic PHI in eDiscovery and AI document review.
  • Drafting demand letters, motions, or discovery responses that include medical details.
  • Subpoena responses and regulatory investigations using centralized records and analytics.
  • Early case assessment across mixed corp/clinical datasets with identifiers.
  • Integrations: AI connected to DMS/EDRMS, intake, or matter systems that store PHI.

Enforcement often cites missing BAAs as a major issue, and adding AI doesn’t change that. One mid-size defense firm reported about a 40% cut in chart review hours after adding AI summaries, but they only rolled it out firmwide once the vendor signed a BAA and enabled U.S.-only hosting with audit logs.

One more thing: treat plaintiff-provided records as PHI even if redacted. Redactions are uneven, and AI can infer identity from context. Default to PHI unless a proper de-identification workflow confirms otherwise.

When you may not need a BAA (and the pitfalls)

You might not need a BAA if the data is de-identified under HIPAA’s Safe Harbor (18 identifiers removed) or via expert determination. De-identified data isn’t PHI. But it has to be done right.

A Limited Data Set is different. It’s still PHI and requires a Data Use Agreement. If you’re a BA, your AI vendor still needs a BAA. Make sure the DUA and BAA line up on purpose, limits, and re-disclosure.

  • Metadata creep: file names, EXIF, DICOM headers, email headers. Scrub content and metadata—and keep proof.
  • Prompt leakage: dates and rare conditions can re-identify in combo. Stick to minimum necessary.
  • Model outputs: RAG might pull identified text from a connected repo. Lock down access.

HHS warns that combining de-identified data with other sources can re-identify people. With vector search across mixed corpora, that risk grows. Keep de-identified indexes separate from identified repositories, and block cross-index retrieval.
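The zone separation described above can also be enforced in code, not just policy. A minimal sketch (the index names and the single-guard design are illustrative assumptions, not any particular vendor's API):

```python
# Hypothetical guard that blocks a retrieval query from spanning
# identified (PHI) and de-identified indexes in the same request.
IDENTIFIED_INDEXES = {"matter_records_phi", "intake_docs_phi"}   # assumed names
DEIDENTIFIED_INDEXES = {"research_deid", "analytics_deid"}       # assumed names

def check_retrieval(requested: set) -> set:
    """Allow a query only if every requested index sits in one zone."""
    unknown = requested - IDENTIFIED_INDEXES - DEIDENTIFIED_INDEXES
    if unknown:
        raise ValueError(f"Unregistered indexes: {sorted(unknown)}")
    touches_phi = bool(requested & IDENTIFIED_INDEXES)
    touches_deid = bool(requested & DEIDENTIFIED_INDEXES)
    if touches_phi and touches_deid:
        # Cross-zone retrieval is exactly the re-identification path HHS warns about.
        raise PermissionError("Blocked: query spans PHI and de-identified indexes")
    return requested
```

In practice this kind of check would live in the retrieval middleware, with every allow/deny decision written to the audit log.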

2025 regulatory landscape and risk signals

OCR is still focused on BAAs, safeguards, and tracking technologies. In 2024, OCR reiterated that sending PHI to third-party trackers without BAAs and controls is a no-go. Same logic applies to AI telemetry and logging.

Also in play: 42 CFR Part 2’s 2024 final rule, which aligns parts of SUD confidentiality with HIPAA. If matters touch SUD records, your vendors are BAs with tighter redisclosure limits and more detailed accounting. State laws add pressure too. Washington’s My Health My Data Act widens “consumer health data” obligations outside HIPAA, which can shape hosting and analytics choices.

Clients now ask for U.S.-only hosting, faster breach notice (often 15 days, not 60), and yearly risk reviews. If a client’s BAA template requires SOC 2 Type II and named subprocessors, they’ll expect you to hold AI vendors to the same standard. Bake that into selection and contracting.

Must-have clauses in a BAA with an AI vendor

Use an AI-aware BAA, not a generic cloud form. Look for:

  • No training on your data—ban fine-tuning, embedding reuse, evaluation datasets, and derivative model weights.
  • Purpose limits, minimum necessary, and matter-level segregation.
  • Retention/deletion: short default retention for caches, deletion on request, destruction certificates at termination.
  • Fast incident notice (e.g., 15 days), cooperation, and support with client notifications.
  • Subprocessor transparency and flow-down BAAs, with change notices and pre-approval.
  • Access and audit: SSO/MFA, RBAC, least privilege, IP allowlists, audit logs, periodic reviews, plus SOC 2 Type II or ISO 27001.
  • Data return/export: prompts, files, outputs, and audit trails in usable formats.

Spell out what counts as “model artifacts” (embeddings, vectors, caches). If derived from PHI, treat them as PHI with the same protections and deletion timelines. That keeps data from lingering in retrieval indexes after the main content is gone.

Security and privacy controls to demand for AI handling PHI

Anchor AI security to the HIPAA Security Rule. Ask for:

  • Identity and access: SSO/MFA, RBAC, least privilege, IP allowlists, device posture checks.
  • Data protections: encryption in transit and at rest, KMS options, customer-managed keys, tamper-evident logs.
  • Isolation: dedicated tenancy, model isolation, private inference endpoints, no shared evaluation pipelines.
  • DLP and safeguards: prompt filtering, PHI-aware redaction, output controls, block public sharing.
  • Observability: full audit logs for prompts, files, outputs, admin actions, exports—pipe to your SIEM.
  • Resilience: geo-fenced backups, disaster recovery, tested incident response.

Proof matters. Ask for a recent SOC 2 Type II, pen test summary, and HIPAA mapping. Surveys of health-sector security leaders show audit logging and tenant isolation are top gating controls for AI. Also cover the embeddings store and retrieval pipeline—those often sit outside your DMS safeguards.

Consider an “emergency access” path with alerts to matter leads, so urgent needs don’t break your controls silently.

Data governance for AI in HIPAA contexts

Tools help, but governance keeps you out of trouble. Put in place:

  • An approved-use list for AI and a short “don’t” list (no PHI in consumer AI, no uploads to personal accounts).
  • Prompt hygiene rules with a clear minimum necessary standard.
  • Redaction workflows that default to de-identification when possible, with logs.
  • Human review for high-risk outputs like filings and regulator letters.
  • Centralized logging and periodic access reviews with exceptions tracked.
  • Vendor risk cadence: onboarding diligence, annual refresh, subprocessor change checks.

Set up “AI data zones” at the matter level—one workspace where PHI is allowed (with BAA coverage and audits) and another for non-PHI research. Let users pick the right zone up front. In training, show real prompts from your practice and how a couple extra sentences can exceed minimum necessary.
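A redaction workflow like the one above often starts with simple pattern screening before a prompt ever reaches the model. A rough stdlib sketch (the patterns are illustrative, cover only a handful of the 18 Safe Harbor identifier categories, and are no substitute for a reviewed de-identification process):

```python
import re

# Illustrative patterns only: SSNs, US phone numbers, slash dates, MRN-style IDs.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "MRN": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
}

def redact(text: str):
    """Replace matches with [REDACTED-<type>] and record which types were hit."""
    hits = []
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            hits.append(label)                      # feeds the audit log ("keep proof")
            text = pattern.sub(f"[REDACTED-{label}]", text)
    return text, hits
```

Returning the list of hit types, not the matched values, lets you log what was caught without copying PHI into the log itself.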

Client consent, privilege, and professional responsibility

Ethics rules already point the way. Confidentiality (Model Rule 1.6), supervision of nonlawyer assistance (5.3), and competence (1.1) all apply. Treat your AI vendor like a nonlawyer assistant. Make reasonable efforts to ensure the service supports your obligations—usually a BAA, solid security, and training.

Privilege can hold when disclosures are necessary for legal services and protected by safeguards. Many firms address AI use in engagement letters. Consider a clause noting AI may assist with drafting or summaries under strict confidentiality, with opt-outs for sensitive work.

For discovery, combine AI use with protective orders and, when needed, a 502(d) order to reduce privilege waiver risk. Recent bar guidance stresses verifying AI outputs and updating clients about material AI use. One simple control for PHI-heavy regulator responses: require a second-lawyer review and log sign-off.

Implementation roadmap: adopting AI with PHI safely

  • Step 1: Map matters and data flows. Flag where PHI could hit AI—prompts, uploads, connectors. Tag repos.
  • Step 2: Pick vendors; run BAA and security diligence. Ask for SOC 2 Type II, U.S.-only hosting, and a “no training on your data” commitment. Lock in retention/deletion in the BAA.
  • Step 3: Configure and pilot with de-identified data. Start low risk, then expand.
  • Step 4: Train users and watch usage. Provide prompt patterns, redaction tips, and clear do/don’t examples. Gate consumer AI so PHI can’t get through.
  • Step 5: Scale and keep auditing. Quarterly access reviews, subprocessor updates, and tabletop breach drills.

Firms that rolled out in phases—pilot, expand, scale—saw fewer issues and faster adoption than all-at-once launches. Assign owners: a partner sponsor per practice, plus IT/security and privacy leads. Use a one-page intake form for new AI use cases so teams can move fast without skipping checks.

Decision tree: Do you need a BAA for this AI workflow?

Ask three things:

  1. Is the data PHI, a Limited Data Set, or de-identified? If any identifier is present or reasonably inferable, call it PHI. Document your call.
  2. Will the AI vendor create, receive, maintain, or transmit it? Uploads, detailed prompts, or repo integrations usually mean yes.
  3. Who else touches it? List subprocessors and telemetry/log flows. Confirm coverage and residency.

Outcomes:

  • BAA required: Any PHI or Limited Data Set goes to the AI platform. Vendors are subcontractor BAs.
  • DUA only: De-identified data under a DUA for analytics or research, with re-identification controls. Keep strong confidentiality and security terms.
  • No HIPAA obligation: No PHI, no LDS, no combo risk—think internal know-how or public legal research.

About the conduit exception: if a vendor stores or processes data, it isn’t a conduit. Also check client contracts—some require BAAs beyond HIPAA’s baseline. Build this flow into matter intake or your legal ops portal so teams can answer it in minutes.
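The three questions and outcomes above can be captured in a small intake helper for your legal ops portal. A hedged sketch (the labels mirror the flow above; it is a triage aid, not legal advice, and the enum values are assumptions):

```python
from enum import Enum

class DataClass(Enum):
    PHI = "phi"                    # identifiers present or reasonably inferable
    LIMITED_DATA_SET = "lds"       # still PHI under HIPAA
    DEIDENTIFIED = "deid"          # Safe Harbor or expert determination
    NON_HEALTH = "none"            # no PHI, no LDS, no combination risk

def baa_outcome(data: DataClass, vendor_touches_data: bool) -> str:
    """Map the decision-tree questions to one of the three outcomes."""
    if data == DataClass.NON_HEALTH:
        return "No HIPAA obligation"
    if not vendor_touches_data:
        return "No BAA for this vendor (it never creates, receives, maintains, or transmits the data)"
    if data in (DataClass.PHI, DataClass.LIMITED_DATA_SET):
        return "BAA required (vendor is a subcontractor BA)"
    return "DUA only (de-identified; keep re-identification controls)"
```

A real intake form would also collect the subprocessor and telemetry answers from question 3 and route "BAA required" matters to contracting automatically.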

Due diligence checklist and RFP questions for AI vendors

Security and compliance

  • Send a current SOC 2 Type II and pen test summary. Show HIPAA control mapping.
  • Offer U.S.-only hosting and clear data residency. List subprocessors and explain BAA flow-downs.

Data handling

  • Do you train on our data? If not, is that barred in the BAA?
  • What are the defaults for prompt/file/embedding/log retention? Can we configure and verify deletion?
  • How do you isolate tenants, models, and indexes? Private inference available?

Operations

  • Breach history, notification timelines, incident response, RTO/RPO.
  • Audit log coverage and export. Admin tools for RBAC and IP allowlists.
  • Support SLAs, uptime, and roadmap for PHI-aware features like redaction.

Fit-for-purpose

  • PHI detection and redaction options.
  • Prompt filters and output controls to block external sharing.

Vendors that share a subprocessor roster up front and promise change notices usually cut contracting time by weeks. Ask for a full data flow diagram, including observability and monitoring—telemetry often escapes standard contracts.

FAQs

  • Does encryption or “zero retention” remove the need for a BAA? No. If a vendor maintains ePHI, it’s a Business Associate, even if it can’t read it.
  • Are law firms covered entities? Generally no. You’re Business Associates when your legal work involves PHI for a covered entity.
  • Is a Limited Data Set still PHI? Yes. You need a DUA, and if you’re a BA, your AI vendor still needs a BAA.
  • Can de-identified data avoid a BAA? Yes, if it’s de-identified under HIPAA and protected against re-identification.
  • Does the conduit exception cover AI or cloud services? No. Storage and processing vendors are BAs.
  • Can I use consumer AI for PHI if prompts aren’t stored? No. Sending PHI to a tool without a BAA is still a disclosure.

How LegalSoul supports HIPAA-aligned AI for law firms

LegalSoul gives law firms a HIPAA-aligned AI workspace for PHI. We sign BAAs, never train on your matters, and isolate tenants and models with private inference endpoints. Security includes SSO/MFA, RBAC, IP allowlists, encryption in transit and at rest, optional customer-managed keys, and full audit logs you can export to your SIEM. We also provide U.S.-only hosting and a transparent subprocessor roster with flow-down BAAs.

On the product side: PHI-aware redaction, prompt templates built for minimum necessary, retrieval over segregated indexes, and fine-grained retention and export controls. Most firms start with a de-identified pilot, then move to PHI-permitted workspaces once controls are verified. If your client contracts demand faster breach notice or named subprocessors, we match the configuration and the BAA.

Quick Takeaways

  • If an AI tool will receive, store, or process PHI, the vendor is your subcontractor BA and you need a BAA. Encryption, “zero retention,” and “no access” claims don’t change that.
  • You can skip a BAA only with properly de-identified data. Limited Data Sets are still PHI and require a DUA—and a BAA if you’re a BA. Watch for metadata and re-identification risks.
  • Your BAA and stack should include: no training on your data, tight retention/deletion, fast breach notice, subprocessor flow-downs; plus SSO/MFA, RBAC, encryption, tenant/model isolation, audit logs, and U.S.-only hosting options.
  • Make it operational: map PHI flows, train on minimum necessary prompts, and run real vendor diligence. Pick an AI platform that signs BAAs and meets HIPAA Security Rule expectations—LegalSoul checks those boxes.

Conclusion and next steps

Bottom line: if an AI platform will handle PHI, you need a BAA. Encryption and “zero retention” don’t erase that. You can avoid one only with proper de-identification; Limited Data Sets still count as PHI.

Pair the BAA with AI-specific clauses (no training, retention/deletion, subprocessor flow-downs) and the right controls (SSO/MFA, RBAC, audit logs, U.S.-only hosting). Want a faster path? LegalSoul signs BAAs, isolates data and models, and never trains on your cases. Book a demo and move your PHI work forward without inviting risk.

Unlock professional-grade AI solutions for your legal practice

Sign up