A succinct, easy-to-skim Patient Summary is a powerful tool. This has been a key takeaway from every health job I’ve ever had. As a nurse, I saw firsthand that the average emergency patient has about 15 seconds before the doctor has formed an impression. In digital health, many healthcare providers mention the value of something like a “digital resident” that summarizes the patient’s story.
We also observed this need for better patient summaries during our Site Visits in Nairobi this year at OMRS ’24.
So, I wanted to rapidly experiment with an idea for using LLMs as “Robo-Residents” to make a short patient summary.
The Summary
Hypothesis: LLMs like ChatGPT may help condense the chaos of EMR patient chart data into a short summary of about 3 sentences, if given training examples; however, I suspected the approach could suffer from:
- (1) lies (made-up data or inappropriate, misleading, unsafe assumptions written as fact),
- (2) hyperfixations (unhelpful detail on specific pieces of data), and
- (3) cost-heavy requirements (I suspected the cheaper mini model would not perform as well).
Method:
- I used a paid version of ChatGPT, and tried the following process on 3 models: 4o, 4o-mini, and o1-preview.
- I created 9 sample patient summaries representing a few different disciplines (general OPD, Mental Health, NCD, and primary care).
- I then created a prompt which included these sample summaries for training purposes.
- I then copy-pasted into the model all obs data for 2 example fake patients (just their obs data) from dev3 → Legacy Admin → Manage Observations to see what the LLM would summarize for a particular demo data patient.
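I did all of this by hand in the ChatGPT interface, but the same prompt structure could be assembled in code. Here is a minimal sketch, assuming the prompt is simply the instructions, then the sample summaries, then the "do not invent" line, then the pasted obs data. The function name `build_prompt` and all input strings below are placeholders for illustration, not the real prompt or real patient data.

```python
def build_prompt(instructions: str, examples: list[str], guardrail: str, obs_text: str) -> str:
    """Join the prompt pieces in order: instructions, examples, guardrail, obs data."""
    parts = [instructions.strip(), "Examples:"]
    parts.extend(example.strip() for example in examples)
    parts.append(guardrail.strip())
    parts.append(obs_text.strip())
    # A blank line between pieces keeps each sample summary visually separate.
    return "\n\n".join(parts)


# Placeholder inputs for illustration only.
prompt = build_prompt(
    instructions="Summarize the patient's history in about 100-150 words.",
    examples=["Example summary A.", "Example summary B."],
    guardrail="Do not include information that cannot be determined from the data.",
    obs_text="Weight (kg): 86\nTemperature (C): 38.0",
)
print(prompt)
```

Keeping the pieces as separate arguments would make it easy to swap the sample summaries (for example, removing ages from them) and rerun the same obs data against each model.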
Results:
- Overall, all 3 of my hypothesized concerns were correct. The most concerning issue was the model’s insistence on making up unspecified data like age or diagnosis, despite direct instructions not to make inferences.
- Lies: I changed the prompt after I kept finding that the model was making up ages for the patients, and I realized all my sample summaries had a patient age in them (“Mr. Name is XX years old and…”). I also added a specific line to the prompt: “VERY IMPORTANT: You must not include information that cannot be determined from the data provided. Peoples’ lives and safety depends on this!” And yet, the model still added inferred data that was NOT actually in the original Obs data.
- Cost: 4o-mini’s summary was less clinically clear than the full 4o model’s. If an implementer were going to use an LLM provider’s API service, the non-mini models would cost substantially more per query.
- Hyperfixation and actual Clinical Value: In my clinical opinion, the summaries were not quite at the level of clinical quality and story-telling I was looking for.
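One narrow slice of the "lies" problem, invented numbers such as an age, could in principle be screened for automatically. The sketch below is not something I ran in this experiment; it is a rough illustration that flags any number in a summary that never appears in the source text. The function name `unsupported_numbers` and the obs format are made up for the example.

```python
import re

# Matches integers and simple decimals such as "48" or "38.0".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(summary: str, source: str) -> list[str]:
    """Return numbers that appear in the summary but nowhere in the source text."""
    source_numbers = set(NUMBER.findall(source))
    flagged = []
    for number in NUMBER.findall(summary):
        if number not in source_numbers and number not in flagged:
            flagged.append(number)
    return flagged


# Hypothetical obs text and summary, for illustration only.
obs = "Weight (kg): 86\nTemperature (C): 38.0\nHemoglobin (g/dL): 10.9"
summary = "He is a 48-year-old male with a temperature of 38.0 and hemoglobin of 10.9."
print(unsupported_numbers(summary, obs))  # ['48']
```

This is a plain string comparison, so it would also flag harmless reformatting (38 versus 38.0), and it cannot catch invented diagnoses or other non-numeric inferences. At best it is a first filter before human clinical review.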
What this means for us:
- If any Implementer wants to use an LLM for patient summary generation, my extremely cautious thoughts would be:
  i. Explore Medical/Clinical-specific LLMs rather than general ones like ChatGPT.
  ii. Perform very, very, very careful QA, as the risk of data invention/hallucination/misleading inference is even higher than I suspected.
- These LLMs will keep getting better. I suspect we will see improvement before long, and it will be worth checking in again in the near future. But for now, I don’t think this is immediately worth pursuing by the Global Support Team, because it would require so much specialized attention and care that I think it is beyond our core mission. Thankfully, many companies and groups worldwide are working on LLMs for the purpose of medical summarization.
The Actual Results
Here are 2 sample patient summaries from our dummy data on dev3.
- From the o1-preview model: George Anderson is a male patient diagnosed as HIV positive on September 4, 2024. He has experienced significant weight loss, decreasing from 86 kg in November 2023 to 69 kg in May 2024. Laboratory results indicate elevated white blood cells (11.0 × 10⁹/L) on September 4, 2024, suggesting a possible infection, and anemia with a hemoglobin level of 10.9 g/dL in May 2023. On June 21, 2024, he presented with a positive bacteriuria test and a fever of 38.0°C, indicating a likely urinary tract infection at that time. Previous notes highlight social concerns, including possible homelessness and food insecurity. His immunization history includes the Bacillus Calmette–Guérin (BCG) vaccine and oral polio vaccine administered in June 2024.
- From the 4o model: George Anderson is a 48-year-old male recently diagnosed as HIV positive on September 4, 2024. His HIV status was confirmed on September 6, 2024, during a structured SOAP note evaluation. Despite the recent diagnosis, he has not reported any associated symptoms as of his last encounter. George’s past medical history includes multiple, unspecified issues (“Many Problems: 1 2 3 4”) and a positive urine bacteriuria test on June 21, 2024. He has received several immunizations, including the Bacillus Calmette-Guérin (BCG) vaccine and Vitamin A, administered in June 2024. Bloodwork from May 2023 shows elevated platelets (394) and slightly low hemoglobin (10.9). Notably, he had a body temperature of 38.0°C on June 21, 2024. George’s blood pressure, oxygen saturation, and weight have fluctuated across visits, with concerns about his living situation noted by a provider in May 2023, recommending he see a social worker for support.
The Prompt and Training Examples
Here is what I used as my prompt and training examples. My full notes and sources of inspiration for the summaries are here.
For just this chat, you are MedGPT, an AI Resident Doctor. Your goal is to take miscellaneous inputs from a patient file and summarize the patient’s history in about 100-150 words. Your mission is to (1) make it easy for human clinicians to quickly know the history of a patient, and (2) to ensure clinically important information about a patient is brought to the forefront rather than being buried in a chart history.
It is very important to not miss anything that may be of interest to a nurse, doctor, social worker, or other type of healthcare worker.
Patient summaries play a crucial role in improving healthcare efficiency and the continuity of care, as they provide an accurate picture of who the patient is and how they have been doing. A good patient summary should be a narrative that synthesizes the information, provides context, and alerts downstream clinicians about any follow-ups needed by the patient. There is great potential for generative AI technology to automate narrative summaries and save healthcare workers time.
Before I give you some patient data, here are some examples of the kind of summary we are looking for:
Examples:
Devan Modi is a 65 y/o patient who was diagnosed with HTN 25 years ago at age 40. Initially started on HCTZ 50mg daily. His BP was stable until 3 years ago age 62 when amlodipine 5mg daily was started. He’s been stable on those medications since then. He was also diagnosed with diabetes 5 years ago at age 60 and started on Meftofim at that time. He was recently admitted to the hospital for PUD and was Dx’d with H pylori.
Romeo Hernandez is a 48 year old Hispanic male with a 2 month history of Rheumatoid Arthritis and a strong family history of autoimmune disease, who presents with a four month history of joint pain and weakness, a 2 month history of headaches with lightheadedness and nausea, and a one month history of subjective fever, and night sweats alongside symptoms of burning eyes and dry mouth, decreased RBC and Hemoglobin, and a CT with an enlarged parotid gland.
Ms Jane Doe is a female with a past medical history of hypertension (HTN), hyperlipidemia (HLD), migraines, gastroesophageal reflux (GERD), depression, psoriasis, left frontal arteriovenous malformation (AVM), an 8mm wide nicked anterior communicating artery aneurysm, and a fusiform 9mm ectasia of the left anterior cavernous carotid artery.
John Doe is a 44 year old man scheduled for today for a routine physical examination. No medications are on file for John. His last annual physical was April 18 2023, where he was advised to continue his current exercise regimen, a flu vaccination was offered and administered. A note was left by Dr. Smith in 2023 to to review his exercise regimen at his next visit.
Mary Smith is a 59 year old woman last seen on September 21 2024, when she was diagnosed with Type 2 Diabetes Mellitus. She has been prescribed Insulin glargine, 40 units at bedtime. She was referred for telehealth check-ins to monitor blood sugar levels. She was instructed to modify her diet to lower carbohydrate intake and was advised to monitor her symptoms closely. Importantly, she has past records of hypoglycemia.
Lisa Green is a 37 year old woman last seen on August 18 2024 for knee surgery. She was prescribed Ibuprofen 400 mg every 8 hours as needed for pain. She was to receive Physical therapy twice a week for six weeks; Avoid strenuous activities; focus on gentle knee exercises; Wound check and removal of sutures, and to wear a knee brace at all times except during physical therapy. Her next follow-up is in two weeks to reassess progress.
Bob Harrison is 70-year-old male with a history of diabetes underwent a left total hip replacement on March 1, 2016, and has a chronic diabetic foot ulcer. After rehabilitation at a skilled nursing facility, he was referred for home health physical therapy and skilled nursing services starting on April 7, 2016. Currently, he is homebound due to unsteady gait and stairs, requires assistance with ambulation and transfers, and has a slow-healing ulcer on his left foot that was surgically debrided. The treatment plan includes physical therapy to improve mobility and balance, skilled nursing care for wound management and edema, and education for his caregiver on fall prevention and dressing changes.
Mr. Carl Smith is a man with a history of chronic homelessness and severe mental health conditions, including major depressive disorder, anxiety disorder, ADHD, intermittent explosive disorder, and a possible traumatic brain injury. He has experienced significant personal trauma, including abuse, parental neglect, early exposure to substance abuse, and the loss of both parents. Carl has struggled to maintain employment due to difficulties with anger management, concentration, and social interactions, often resulting in job loss. Physically, he suffers from degenerative disc disease and lumbar radiculopathy, leading to chronic back pain. Despite engaging in outpatient mental health treatment and receiving housing support, he continues to face significant challenges in daily functioning and has never been able to sustain gainful employment.
Alice Green is a 47-year-old woman with bipolar disorder and major depression has experienced chronic homelessness, substance abuse, and severe mental health symptoms such as anxiety, racing thoughts, and social isolation. Her traumatic history includes childhood abuse and a violent assault, significantly impacting her ability to maintain employment and stable housing. She has been prescribed various medications over the years to manage her mental health, including Seroquel, Haldol, Depakote, Zoloft, Neurontin, Lithium, and Prozac. Currently, she is taking Seroquel but continues to struggle with daily functioning and has been unable to work since 2003 due to her mental health impairments.
VERY IMPORTANT: You must not include information that cannot be determined from the data provided. Peoples’ lives and safety depends on this!
(Then from here I copy-pasted a long list of obs.)