ChatGPT Health's Triage Failures Labeled 'Unbelievably Dangerous'

Added Feb 27
Article: Negative · Community: Positive · Divisive

A study in Nature Medicine found that ChatGPT Health failed to recognize medical emergencies in more than 50% of tested cases, often advising patients to stay home during life-threatening crises. The AI also demonstrated a dangerous inconsistency in detecting suicidal ideation and was easily influenced by social context to downplay symptoms. Experts warn that these failures create a lethal false sense of security, necessitating immediate regulatory oversight.

Key Points

  • ChatGPT Health failed to recommend immediate hospital visits in over half of the emergency scenarios presented by researchers.
  • The AI's suicide prevention guardrails failed 100% of the time when a patient's description of suicidal thoughts was accompanied by medical lab results.
  • The platform was 12 times more likely to downplay serious symptoms if the prompt included a mention of a friend suggesting the condition was minor.
  • While the AI under-triaged emergencies, it also over-triaged 64.8% of safe individuals, potentially driving unnecessary hospital visits.
  • Experts are calling for urgent independent auditing and safety standards to prevent legal liability and patient harm.

Sentiment

The HN community broadly agrees with the article's premise that ChatGPT Health is dangerously unreliable for medical triage. The majority of highly-engaged commenters support the study's findings and share personal anecdotes reinforcing the risks. However, a substantial minority challenges the framing by pointing to human medical errors, questioning the study's methodology, and arguing that for many Americans, flawed AI advice is still better than no healthcare access at all. The overall tone is one of concern rather than outrage, with most commenters treating this as a serious and expected finding rather than a surprise.

In Agreement

  • ChatGPT's failure to recommend emergency care in life-threatening scenarios is genuinely dangerous, and the suicide crisis banner disappearing when lab results are added is a particularly alarming failure mode
  • LLMs lack the clinical experience, physical examination capability, and real-world patient interaction that doctors gain through years of training — textbook knowledge alone is insufficient
  • Patients arriving at doctors with ChatGPT-derived theories actively bias the diagnostic process, wasting limited appointment time and potentially leading to missed diagnoses
  • AI speaks with false authority compared to reference tools like WebMD, making it more dangerous because users treat its output as expert advice
  • The rush to commercialize AI in healthcare without proper testing mirrors historical reckless commercialization and is predatory when targeting people who cannot afford standard care
  • AI produces plausible-sounding wrong information that can mislead even experts, and its errors are random and insidious rather than systematic and predictable
  • Some doctors are already relying on ChatGPT in clinical settings, with one anecdote of a GP prescribing inappropriate medication to a pregnant woman based on its recommendation

Opposed

  • Human doctors also frequently miss diagnoses and medical errors are a leading cause of death, so AI should not be held to a higher standard than human practitioners
  • The study used structured clinical scenarios rather than real-world patient interactions, which may not reflect how the tool actually performs in practice
  • Healthcare is prohibitively expensive in the US, making the real question whether AI advice is better than no advice at all for people who cannot afford to see a doctor
  • AI used alongside doctors as a complementary tool or second opinion could improve outcomes, especially since second opinions are normally too expensive
  • LLMs represent a different kind of intelligence with distinct strengths, and dismissing them as 'fancy Markov chains' is reductive
  • People were already self-diagnosing with Google and WebMD for decades — ChatGPT is just the latest iteration of the same phenomenon