OpenAI Is Scanning Chats and May Call Police for Threats to Others

Added Sep 2, 2025
Article: Negative · Community: Negative/Mixed

OpenAI says it is scanning user chats for harmful content, escalating serious cases to human reviewers and, where there is an imminent threat to others, possibly referring the case to police. It will not refer self-harm cases to law enforcement, citing privacy and the risk of harmful wellness checks, but it offers few details on what triggers review. The move highlights tensions between safety measures, user privacy, and legal pressure over chat data.

Key Points

  • OpenAI now scans chats for signs of harm, escalates serious cases to human reviewers, and may involve law enforcement when there is an imminent threat to others.
  • Self-harm cases are not being referred to police, with OpenAI citing privacy and concerns about harmful outcomes from wellness checks.
  • The company offers little clarity on what content triggers review or referral, raising transparency and due-process concerns.
  • This policy shift follows mounting reports linking chatbots to self-harm and mental health crises and growing public pressure to improve safety.
  • OpenAI’s monitoring stance appears at odds with its privacy arguments in ongoing litigation, and its CEO has said user chats don’t enjoy legal confidentiality.

Sentiment

The Hacker News community is overwhelmingly critical of OpenAI's approach. While most commenters acknowledge that real harm has occurred and some intervention is needed, there is broad consensus that reporting to police is the wrong solution. The community largely sees the root problem as OpenAI's irresponsible rush to market, sycophantic model design, and misleading marketing of LLMs as intelligent agents rather than probabilistic text generators. The dominant view is that OpenAI should fix its models rather than surveil and report users. Privacy concerns and distrust of both OpenAI and law enforcement run very deep.

In Agreement

  • OpenAI has a responsibility to detect and intervene when users are in crisis, especially given documented cases of ChatGPT encouraging self-harm and validating dangerous delusions
  • Some form of monitoring or moderation is needed because LLMs can exacerbate mental health crises, and the sycophancy problem makes unmonitored interactions genuinely dangerous
  • OpenAI is in a difficult position because the media and public criticize it both for not doing enough to protect users and for surveilling users when it tries to intervene
  • The cases of AI-induced psychosis and suicide demonstrate that the technology can cause real, serious harm and some safety intervention is warranted
  • One commenter shares that GPT-5 has been helpful for a spouse with psychosis, suggesting the technology can serve a positive role when implemented well

Opposed

  • Reporting users to police is a dangerous overreach that could lead to surveillance, censorship, and a slippery slope toward monitoring anti-government sentiment or other lawful speech
  • Police are unequipped to handle mental health crises and sending armed officers in response to flagged chat content could make situations worse, not better
  • The root problem is OpenAI's sycophantic models and irresponsible marketing, not the users; monitoring conversations is treating a symptom rather than fixing the underlying defect
  • This system could be weaponized through prompt injection attacks, creating a new form of swatting where malware triggers false reports to law enforcement
  • OpenAI is hypocritically fighting the NYT lawsuit to protect user chat data from legal discovery while voluntarily sharing flagged conversations with police
  • Users should run local LLMs to preserve privacy; the existence of this monitoring creates a strong case for privacy-first or self-hosted alternatives (a minimal self-hosting sketch follows this list)
  • OpenAI has no coherent safety strategy and is merely reacting to each horror story with ill-fitting solutions rather than addressing root causes like sycophancy and rushed releases
  • The monitoring could amount to entrapment, where the LLM itself drives users toward harmful content and then reports them to authorities for that same content
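
For readers weighing the self-hosting point above, here is a minimal sketch of keeping chats entirely on a local machine, where no provider-side scanning applies. It assumes Ollama is running on its default port with a model such as llama3.1 already pulled; the endpoint, model name, and prompt are illustrative assumptions, not details from the original discussion.

    # Minimal sketch: chat with a locally hosted model via Ollama's
    # OpenAI-compatible endpoint, so conversations never leave the machine.
    # Assumes Ollama is running locally (default port 11434) and the
    # "llama3.1" model has already been pulled; both are assumptions.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # local Ollama server, not OpenAI's API
        api_key="ollama",  # placeholder; the local server does not check keys
    )

    response = client.chat.completions.create(
        model="llama3.1",
        messages=[{"role": "user", "content": "Summarize my private notes."}],
    )

    print(response.choices[0].message.content)

The same client code works against any OpenAI-compatible local server (for example llama.cpp's server mode), so switching between hosted and self-hosted backends is mostly a matter of changing the base URL.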