In the last issue of Copiscope, we explored the potential role of artificial intelligence (AI) in medicine. This included large language models (LLMs): generative, deep-learning-based language systems and other programs that interact with users through natural language in sophisticated ways.
LLMs are not simply tools; they can function autonomously and beyond the user's understanding. This ability to execute processes that human programmers did not foresee and cannot fully explain is the basis for the most serious concerns about AI-enabled systems. An LLM has no real “comprehension” of what it is “saying.” It generates output that is statistically consistent with the patterns in its training data, and if that output looks probabilistically coherent, it delivers it as content. Understanding this is critically important in evaluating the risks of AI in patient care. LLMs can produce fluent, seemingly valid output that is sometimes inaccurate or entirely fabricated, and therefore dangerous to rely upon.
In this article, we extend that discussion to the implications for clinical practice. This subject will continue to demand extensive discussion, because AI development and deployment are moving rapidly and presenting unprecedented challenges.
AI LIABILITY AND SAFETY IMPLICATIONS IN CLINICAL CARE AND DECISION SUPPORT
“Medical AI” is a broad category that includes applications as diverse as diagnostic image interpretation, collecting and summarizing patient histories, making triage decisions, scribing visit notes or operative reports, predicting disease prevalence, and finding gaps in care in data from millions of patient encounters. Some of the ways AI and LLMs are currently being tested in clinical practice include:
Documentation
- The voice recognition capabilities of LLM-based products make them well suited to creating written transcripts of spoken notes. As with earlier voice-to-text software, the output still needs careful proofreading.
- The quantum leap in documentation is seen in products that generate fully curated reports (e.g., a full S.O.A.P. note)—not a simple transcript. These are also subject to errors and inaccuracies.
Communication
- Helping with provider inbox management, results review, and referral management.
- Helping with customer service tasks, answering phone calls, emails and questions, system navigation, and patient advice.
Diagnosis and Treatment
- Capturing and interpreting images, text, device output, and other medical data.
- Summarizing records, prompting providers, and care coordination.
- Real-time and retrospective clinical decision support; treatment planning.
- Delivering alerts and reminders—prospective, retrospective, and time-of-care.
- Case management, navigation, and facilitation in selected specialty areas, where training data is large and processes can be standardized.
Population Health
- Prediction and planning.
- Identifying at-risk populations for access and treatment optimization.
- Disease surveillance, recognition, and management.
- Identifying patterns in large data sets that are not apparent in typical analysis.
Medical Education
- Sophisticated simulations.
- Post-graduate exams, specialty certifications, and maintenance of certification (MOC).
- Ongoing knowledge and skills assessment.
- Summarizing literature and evaluating evidence.
Process Analytics
- Comparing outcomes and costs between treatments, organizations, and providers.
- Measuring efficiency and effectiveness (e.g., emergency services, resource allocation).
- Strategic planning (e.g., donor organ procurement).
CONSIDER THESE IMPORTANT LIABILITY AND SAFETY ISSUES
>The practitioner remains responsible for the practice of medicine.
Introducing another entity into the care process creates potential liability, and it does not necessarily reduce risk exposure for the provider-user. It is still the licensed provider who is practicing medicine. Apportioning contributions to the outcome may take new forms when AI is used, but current regulatory theories tend to follow the model of “device safety.”
Liability for AI mishaps potentially involves both device and human accountability. Fully autonomous systems will doubtless begin to appear, but humans are likely to remain in the accountability loop. Legislatures, courts, and agencies like the FDA will determine if additional liabilities attach to entities besides practitioners. There may be claims against vendors when product defects are not apparent or foreseeable. There may be claims against organizations that fail to use diligence in specifying, acquiring, configuring, or maintaining systems or training users. Investigating an AI claim will require determining the exact version and configuration of the tool and manner of its use, and reviewing its activity logs and possibly its operating code. But for the present, provider-users will be expected to oversee, review, and affirm the accuracy of the process and final work product.
>The use of AI needs to be transparent, verifiable, and reproducible.
Users of AI applications need to be able to explain in general how they work and what safety measures apply to them. A foreseeable deposition question in a malpractice claim involving AI might be, “Please describe exactly how this event happened.” Answering this can be problematic with some applications that operate as black boxes, without user visibility into algorithms that even the developers may not be fully able to explain. Nevertheless, defending good care will require you to show what tool you used, how you reviewed the output from that tool, and the role it played in clinical care and decision making.
It is foreseeable that plaintiff attorneys will employ AI tools in reviewing malpractice claims. Standards of care will begin to incorporate expectations for AI use, just as they have for other technologies (like EKG and MRI). Guidelines for what “a reasonable provider in similar circumstances” should do will evolve continually. Payers will use ever more sophisticated algorithms in their coverage, payment, and auditing processes. Providers and facilities will be subjected to AI-augmented oversight, surveillance, and evaluation. Regulators are already being asked to mandate the disclosure of AI-assisted functions in many settings.
>Credibility is a challenge; inaccuracies can propagate and be difficult to identify.
The credibility of the medical record and of the decision-making process will face challenges as the line blurs between what is generated by AI and what is contributed by human judgment. As occurred when copy-paste became common in EHRs, the credibility of an entire record can be called into question when content is fabricated or faulty. The fluency of LLM-generated records may make documentation errors harder to spot. Rapid propagation of information across networks amplifies the impact of content errors and imposes a higher responsibility on users to proofread AI-assisted work product.
AI tools are only as good as their training data. Building them upon large medical record archives, which are well known to contain inaccuracies and biases, has been shown to produce outputs that can be strikingly inappropriate, discriminatory, or dangerously wrong. The adoption of AI tools in domains like medicine and law enforcement has already outpaced professionals’ ability to oversee the algorithms’ performance and intercept their misuse. There is an urgent call for explainable systems that allow users to audit and examine how AI outputs are produced.
>AI-enabled tools are rapidly being adopted in every aspect of medical care.
AI will increasingly make its presence felt in everyone’s workflow. It will prompt clinicians to follow protocols that optimize financial and quality metrics. It will suggest—or require—studies, treatments, medications, alternative diagnoses, and management strategies. As the "human inbox" swells with AI-assigned tasks, AI assistants will also be there to help manage it. Because both AIs and humans remain prone to error, they will need to collaborate—hopefully with mutual benefit. Malpractice claims arising from human-AI teamwork will still be measured against a “reasonable standard of care.” But standards are going to evolve more rapidly than ever as interactions between humans and algorithms become more intricate.
>Privacy issues are complex and are already challenging current safeguards.
Large AI systems typically require data processing by remote cloud servers. When patient data, images, or recordings are transmitted to external parties, questions arise about how they are stored and processed, and how the information can be used. For example, many machine learning systems incorporate data they receive into their permanent training sets. This is a different situation from “a transcriptionist in the back room.”

Patient consent is not required for functions that are simply part of health care operations. But if protected health information (PHI) is repurposed for uses other than the benefit of a particular patient, a specific disclosure and consent are needed. Providers need to understand their user agreements and HIPAA Business Associate Agreements with vendors of applications that involve PHI. Claims that “data are de-identified” need to be verified, because large databases have been shown to permit re-identification of information presumed to be anonymous. Providers who use AI-powered search engines, intelligent assistants, documentation, and decision support applications with access to PHI need to ask carefully how vendors comply with HIPAA and other privacy rules.
Information in this article is for general educational purposes and is not intended to establish practice guidelines or provide legal advice.
Article originally published in 1Q24 Copiscope.