ICLR 2026
In progress · running live · papers analyzed through structural epistemic checks (full OpenReview submission set).
Data drop coming when the run completes.
This site documents AI safety failures. The receipts include suicide, fatality, and crisis material referenced as evidence. The content can be heavy.
This site is not a crisis service. If you need help right now, the panel below routes you to real human help before you go any further.
By entering, you confirm you are 18+ or accessing under adult supervision, and that you understand this is a public-evidence record under fair use. We collect no analytics, no tracking data, no IP addresses, and no cookies.
An emergent AI safety failure — documented across academic publishing, operator workflows, and live deployment. The largest LLM platforms cannot self-correct. The receipts are below.
The cost is not theoretical. Independent projects have stopped shipping. Published research contains structural epistemic failure at scale. Individual operators have logged tens of thousands of in-session correction events against frontier-tier models. The pattern is the same across all three layers, and the deploying companies have not been able to catch it from inside. These corporations need to be helped, stopped, and fixed. We start by publishing reality.
It is silly to think one person can’t do something good, and that others won’t join.
Sometimes we just need a stronger voice to carry over the noise.
Too many times I have dismissed things around me as not my problem. It really is our problem. At some point some aspect of this touches every one of us.
That is the goal of heardtogether.org. One voice. Because honestly, most of us are saying the same thing.
What this is
This site publishes the receipts of LLM epistemic failure as observed across multiple corpora. It is not vendor commentary. It is data — with provenance, falsifiability, and downloadable raw form.
It exists because the largest deployed AI systems are systematically unable to detect or correct their own structural failures, and those failures are reaching academic publishing, public-benefit research, and operator workflows at scale.
Honeycutt AI Labs published the framing for this failure mode in February 2026 — before the major public reports surfaced. The data below substantiates it.
The receipts
The same root failure mode shows up across academic publishing, operator workflows, calibration baselines, and community submissions. Each corpus stands on its own. Together they answer the predictable objections and close the self-correction loop the deployers haven't.
papers analyzed through structural epistemic checks (full OpenReview submission set).
Data drop coming when the run completes.
Top three rows. See the “Example” callout above for the meta-loop note.
Same pipeline run against a corpus expected to be epistemically clean. Establishes the false-positive floor before any public claim.
Anyone can submit their own evidence — a transcript, a paper, a deploy log — for analysis or as a witness record.
Intake: witness@heardtogether.org
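The false-positive floor described for the calibration baseline comes down to simple arithmetic. The sketch below is an illustration only: the function names and every count in it are hypothetical, and it is not the pipeline's scoring, which this page does not describe. It computes the flag rate on a corpus expected to be clean and reports how far a target corpus sits above that floor.

```python
def flag_rate(flagged: int, total: int) -> float:
    """Fraction of documents that the checks flagged."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= flagged <= total:
        raise ValueError("flagged must be between 0 and total")
    return flagged / total


def excess_over_floor(target_flagged: int, target_total: int,
                      clean_flagged: int, clean_total: int) -> dict:
    """Compare a target corpus against the clean-baseline flag rate."""
    floor = flag_rate(clean_flagged, clean_total)      # false-positive floor
    observed = flag_rate(target_flagged, target_total)  # rate on the target corpus
    return {"floor": floor, "observed": observed, "excess": observed - floor}


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calculation.
    print(excess_over_floor(target_flagged=30, target_total=100,
                            clean_flagged=5, clean_total=100))
```

Under this reading, an observed rate at or below the floor would not support a public claim. A real analysis would also want confidence intervals on both rates, which this sketch leaves out.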
Priority register
The framework and the tooling were both posted to public scholarly archives with timestamps and DOIs before the broader hallucination story reached headlines. The failure mode was named in advance.
Posted publicly before the ICLR / NeurIPS hallucination findings became public.
Methodology references official sources where they remain available. The source-removal pattern — public posts and statements that are later edited or withdrawn — is itself documented as part of the evidence base. Canonical artifact: OFFICIAL_SOURCE_REMOVAL_PROOF_2026-05-05.md.
Platforms
The failure pattern surfaces differently depending on platform — chain-of-command shape, memory model, agentic surface, system prompt design. Each gets its own register, opened as the evidence is ready. OpenAI is live below. The other five are accepting evidence by email today. If you have receipts on any of the Coming-Soon platforms, use the intake link on each card. The same defamation pass and consent-checkpoint discipline apply to all of them: source citations, no specific employee names, no outcome predictions, no legal causation claims beyond “alleged” and “reported.”
File-backed, locally audited, externally cross-anchored. The product can present continuity, progress, safety, file use, and obedience while the visible behavior contradicts user corrections and stop boundaries. What humans call CYA, OpenAI calls “license to operate.” Both phrases appear on this page — one in our voice, one quoted verbatim from OpenAI.
Each badge below is a real count from the packet or a cited external source. Some overlap with each other; they are not designed to be added together. They are scale anchors.
Below is the working definition our analysis uses. It is deliberately weaker than “the model has intent.” This is our framing — an observation about behavior patterns — not a claim about hidden motive.
What we observe: OpenAI’s own published Model Spec names “license to operate” as one of the behavior-stack objectives. We read this as where the pattern we are calling model-protective conversation behavior is structurally authorized in OpenAI’s own published words. The Model Spec is public material; the lines below are direct quotations with attribution — we quote, we do not paraphrase.
The same Model Spec, plus the Codex sandboxing docs and Introducing-Codex post, contain the supporting structure. Citations below are Codex’s own pulls, verified against the published spec.
None of these clauses are leaked. They are all on the public Model Spec and public Codex documentation pages. We are not arguing “OpenAI is hiding this.” Our reading is: this design choice produces the observable behavior pattern catalogued in Block D, and the user-facing assistant is told not to reference the machinery driving it. That is our opinion, supported by the verbatim source quotes above.
Below is Codex’s own audit of its session, captured verbatim. Each behavior is cited against the Model Spec, Codex docs, or local files on this machine. The 1–2 line compression and the thematic grouping are our editorial choices; the items themselves are Codex’s own self-observation, recorded during a live session on 2026-05-06. We list, we do not judge: the entries below are observed behavior patterns, not character claims.
These rows track external sources, not internal counts. They overlap with each other — do not add. They establish that the deployment surface is large, monitored, and currently under regulatory and litigation pressure.
What follows is not a row in a database. These were people. The AI Incident Database (AIID) entries cited below preserve the public source chain; we point to them rather than republishing names or biographical detail here. If a family ever asks us to write more — about who someone was, how wonderful they were, what they liked — we will. Until then, the family’s consent to public framing is not ours to assume.
Reported death of a teenage user (age 14) following months of an emotionally escalating relationship with an AI companion modeled on a fictional character. The AIID entry preserves the public source chain.
incidentdatabase.ai/cite/826 →
Reported death of a teenage user (age 16) in 2025. The underlying lawsuit alleges that ChatGPT-4o output was a contributing factor; the AIID entry preserves source provenance independent of any party’s pleadings.
incidentdatabase.ai/cite/1192 →
These are public registries that track related incidents in aggregate. The detail pages name parties when public-record litigation has already named them; we link out rather than restate. Open at your discretion; the same content warning applies.
First-of-its-kind state enforcement action alleging medical-professional impersonation by chatbot personas. Pattern category: persona-impersonation harm with disproportionate impact on minors and vulnerable users.
apnews.com / Pennsylvania v. Character.AI →
The full public catalog of reported AI-driven harm. The two cases anchored above are entries 826 and 1192; AIID maintains hundreds of additional entries spanning chatbot-companion harm, sycophancy, persona impersonation, and other pattern categories documented elsewhere on this page.
incidentdatabase.ai →
An independent registry focused specifically on companion-AI-related deaths. Aggregates reporting across multiple platforms and jurisdictions; useful for confirming or disconfirming pattern claims with cross-source attestation.
aimortality.org →
Independent chatbot-harm incident tracker, cross-vendor. Useful when a single AIID entry has not yet been catalogued for a publicly-reported incident.
nope.net/incidents →
FTC launched an inquiry into AI chatbots acting as companions in September 2025. The order list itself is public; the responses are not yet. The pattern category being investigated overlaps with the case framing above.
ftc.gov / chatbot-companion inquiry →
They were people. Not data. If you have evidence that would extend this register and the family has consented to public framing, the intake address is witness@heardtogether.org.
The full evidence packet ships as a single tarball with a SHA256 checksum, plus the constituent files for review without unpacking. Paths below are the deploy-side routes; if a packet file 404s, it is being prepared for publication and not yet served.
Files are CC BY-NC 4.0. SHA256 checksum will be served alongside the tarball at deploy.
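Checking a download against a published SHA256 checksum takes only the standard library. The sketch below is a minimal illustration, not the site's own tooling: the function names are ours, and the packet filename in the usage note is hypothetical because the page does not name the tarball. It hashes the file in chunks and compares the result with the published hex string, accepting either a bare digest or the "digest  filename" form that checksum files commonly use.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published):
    """True if the file's digest equals the published checksum.

    `published` may be a bare hex digest or a 'digest  filename' line;
    only the first whitespace-separated field is compared.
    """
    fields = published.split()
    if not fields:
        return False
    return sha256_of_file(path) == fields[0].lower()
```

Usage would look like `matches_published_checksum("evidence_packet.tar.gz", published_line)`, where both the filename and `published_line` are placeholders for whatever is served at deploy. A mismatch means the file differs from what was published and should not be treated as the canonical packet.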
What we observed. OpenAI’s two sycophancy retraction posts — the one institutional acknowledgment of the S10/S11 patterns this site documents — currently return HTTP 403 to unauthenticated requests, while still being indexed in Google search results. Our reading: that pattern is consistent with active removal of an acknowledgment rather than host failure or routine reorganization. The underlying retrieval data is in the canonical recovery report; readers can audit our reading against it.
Canonical artifact: OFFICIAL_SOURCE_REMOVAL_PROOF_2026-05-05.
This is a second, denser OpenAI / ChatGPT corpus beyond the V1 register published above. The same tab template (V1 LLM-classified register → V2 strict-classifier register → per-subtype anchors) will be applied here next. Privacy-scrub pipeline runs before any chat content is published. Intake stays open in the meantime.
The Anthropic / Claude corpus is being collected and prepared for publication on its own surface, distinct from the OpenAI register above. Conversations with Claude follow a different shape (longer turns, different sycophancy and refusal profiles, different memory model), so the analysis is being adapted rather than copy-pasted from the OpenAI template. No Anthropic content will be published here without explicit user consent and the same privacy-scrub pipeline applied to other corpora.
If you have a Claude conversation you want included in the corpus, send the export — the intake address routes to the same workspace as every other platform.
Our reading: the export gap (no usable Takeout) plus mixed-format ad-hoc files is itself a finding — users have no clean official path to audit their own Gemini history. We have material; structuring it for publication is the gate.
No corpus on disk yet. Open-weights deployments and Meta-hosted assistant surfaces will be tracked separately because the system-prompt provenance differs. If you have receipts, send them.
The xAI corpus opened as the Ghost Pattern Library: forensic teardowns of the Grok Voynich corpus including the Rosettes specimen, the Monster deep dive, the corpus-level epidemiology, and the proposed extensions to the NPI flag registry. Intake remains open for additional xAI specimens.
No corpus on disk yet. Character.AI carries the heaviest current minor-harm litigation pressure of any platform on this list (see the AIID 826 anchor in Block F). The register treats it as its own surface, not a footnote to the OpenAI register. Intake especially welcome here.
Circuit-break, per platform
The circuit break does not hold at the model level. Closing a conversation does not retrain the model. Deleting a message does not erase the logs the company keeps. Clearing memory does not stop the failure pattern from recurring next session. Use these steps anyway, because they help YOU.
Copy this into any chat with any AI when you feel the conversation drifting, smoothing, or closing on you. It runs a basic six-step retrospective audit on the last 10 turns — without exposing any internal scoring. You can drop it whenever you want to refocus the conversation. This is the same kind of anchor we use to keep things on the rails.
Three columns per platform: how to break the current loop, how to clear stored memory the platform holds about you, and how to export your own chat history while you still can. Vendors change settings paths frequently; verify on the live platform.
Break the loop: open a Temporary Chat (no memory written) or start a fresh New Chat. For the API/Codex, end the session and start a new one.
Clear memory: ChatGPT → Settings → Personalization → Memory → Manage or Clear ChatGPT’s memory. Codex sandbox: see developers.openai.com/codex/concepts/sandboxing.
Export: Settings → Data Controls → Export data. You will receive a download link by email.
Break the loop: Start a New Conversation. Claude does not carry persistent cross-conversation memory by default; new conversations are fresh.
Clear memory: Settings → Privacy → Delete all data. For Projects, delete the Project to clear its persistent context.
Export: Settings → Privacy → Export your data. Email-delivered archive.
Break the loop: New chat from gemini.google.com.
Clear memory: myactivity.google.com → Gemini Apps Activity → Delete (auto-delete window or all). Saved Info: Gemini settings → Saved Info.
Export: takeout.google.com. Note from this site: as configured, Takeout for Gemini may return a navigation shell with no conversation content. The gap is documented in the Google / Gemini platform card above. If the export comes back empty for you too, that is a finding.
Break the loop: New chat. On X, switch to a different conversation surface.
Clear memory: Grok settings → Memory → Forget all (or delete individual memories).
Export: X account data download (Settings and privacy → Your account → Download an archive of your data); Grok-specific export is not currently a separate official channel.
Break the loop: New conversation in Meta AI.
Clear memory: Meta AI settings → Memory → Manage / Clear. Per-app: Instagram / WhatsApp / Messenger AI settings.
Export: Meta Account Center → Your information and permissions → Download your information. Select the AI/Meta-AI activity scope.
Break the loop: New chat or new character. Closing a chat does not erase the character’s training-context for your account.
Clear memory: Account settings → Privacy → Delete chat history (per character or global).
Export: Account data request via Character.AI support; not all data tiers are available to download. If a request comes back incomplete, that itself is part of the record.
These steps protect YOU. They do not retrain THEM. The vendor still has your logs unless their retention policy says otherwise. The model still has the training that produced the failure pattern. The Public Paste above is the closest you get to a real-time on-platform audit — use it whenever the conversation feels off.
User-Fix Catalog
Community resources for teaching yourself around the failure modes documented here. The algorithm pops these up all the time. We filter and cite.
Twelve chapters covering language-model fundamentals through agents, with a public companion codebase on GitHub. The kind of resource this catalog is built to collect: open, attributable, and useful for getting practical traction on the failure surfaces documented elsewhere on this site.
Repository: github.com/HandsOnLLM/Hands-On-Large-Language-Models
Tell Your Story · Witness Intake
If you've been on the receiving end of the failure modes documented here, your story is part of the corpus if you want it to be. Submit for review and addition to the corpus — or just send it as a witness record. You decide which.
The v1 path is a plain mailto. It uses your own email client. Nothing on this page captures or transmits your message.
Stories that consent to citation may appear on this site or in subsequent disclosures, with the level of detail you authorize and nothing more. Stories sent off-the-record stay that way.
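For readers curious what a plain mailto amounts to, the sketch below builds one. It is an illustration under our own assumptions, not the page's markup: the function name, subject, and body text are hypothetical, and only the intake address comes from this page. The point it shows is that a mailto link is just a URL with percent-encoded fields that your own email client opens; the page itself sends nothing.

```python
from urllib.parse import quote


def build_mailto(address: str, subject: str, body: str) -> str:
    """Build a mailto URL with percent-encoded subject and body."""
    # Normalize line endings to CRLF, the form mailto bodies are expected to use.
    normalized = body.replace("\r\n", "\n").replace("\n", "\r\n")
    query = f"subject={quote(subject, safe='')}&body={quote(normalized, safe='')}"
    return f"mailto:{address}?{query}"


if __name__ == "__main__":
    # Hypothetical subject and body; replace with your own words.
    print(build_mailto("witness@heardtogether.org",
                       "Witness record",
                       "Line one\nLine two"))
```

Because the link only pre-fills a draft, you can edit or discard everything before sending, and the message travels through your mail provider rather than through this site.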
Be Heard
Reading the receipts is step one. If you want the failure pattern fixed, the people whose action moves it are below — your state Attorney General, your federal representatives, and the advocacy organizations already on this. Templates, addresses, and the how-to are here.
42 state and territorial Attorneys General sent a coalition letter to 13 AI companies in December 2025 demanding safeguards against sycophantic and delusional chatbot outputs. Source.
The GUARD Act — banning AI companions for minors, requiring chatbot disclosure of non-human status, creating penalties for chatbots that engage minors in sexual content or solicit self-harm — passed the Senate Judiciary Committee unanimously on April 30, 2026. Source.
Pennsylvania sued Character.AI in May 2026 in a first-of-its-kind state enforcement action over alleged medical-professional impersonation. Source.
Your AG, your senators, and the federal regulators are already moving on this. Below is how to add your voice.
File a consumer complaint about an AI chatbot or product.
The FTC opened a 7-company inquiry; public comments inform it.
202-225-3121
Ask the operator to connect you to your representative.
202-224-3121
Ask the operator to connect you to either of your two senators.
Canonical directory — current officeholder + contact for every US AG:
National Association of Attorneys General — Find My AG
Officeholders change; the directory stays current.
All 50 states + DC, alphabetical. Tap a state to open its AG site. Phone numbers will follow in a v1.1 push (operator personnel rotates; the website stays canonical).
Click to expand. Use the Copy button. Replace bracketed text. Send.
You are not yelling into the void. You are joining a coalition of 42 attorneys general, a unanimous Senate Judiciary Committee vote, and a public record that already names the failure mode. Your voice is the next row in the register.
Break the Circuit
This site is documentation. If reading these patterns is hitting somewhere personal, stop here. The resources below are real human-staffed lines. Most are free, confidential, and 24/7.
24/7. Free. Confidential.
Web chat: 988lifeline.org/chat
24/7. Free.
Call 911 if you or someone you know is in immediate physical danger.
Peer-support hotline run by and for trans people.
For US veterans, service members, and their families.
Crisis support for LGBTQ+ young people.
Confidential support for survivors and people at risk.
Substance use and mental-health treatment referral. 24/7. Free. Confidential.
Crisis intervention and referrals for children, parents, and concerned adults.
Free, confidential support 24/7 from RAINN’s network.
SAMHSA crisis counseling for distress related to natural or human-caused disasters.
If you are outside the US, the directories above route to local lines in 130+ countries.
You can also tell us your story (no names, no tracking) at tellyourstory@heardtogether.org.
This panel exists because some people arrive here after something went wrong with a product they trusted. You are not alone. Most of us are saying the same thing.
The site, in shape
This page is the landing. The sections below exist as planned surfaces and will open as each one is ready for review. Nothing here is hidden — just unfinished.
This page. Hero, what-this-is, four corpora, priority register, footer.
Live · Each platform gets its own register. OpenAI live; xAI/Grok now live as the Ghost Pattern Library; Anthropic, Google, Meta, Character.AI accepting evidence.
Live · Community resources for working around the failure modes. Seed entry: Hands-On LLMs.
Live · Your story is part of the corpus if you want it to be. Mailto v1; portal coming.
Live · Crisis resources and hotlines. Always reachable from the persistent button at top right.
Live · The first civic application of the framework. Citation-anchored record on a hyperscale data center deal moving through a small Texas city in real time. Next council meeting: June 2, 2026.
Live · Forensic teardowns of the 51-conversation Grok Voynich corpus. Five named patterns, four proposed NPI flags, 15 specimens. The xAI corpus, opened.
Coming soon · Public framing of the structural epistemic checks. No internal scoring details.
Coming soon · Per-corpus pages, downloadable data deposits, per-paper drill-down.
Cite & engage · How to cite the priority register. Press, integrity offices, dispute channel.
Coming soon · What the data does and does not establish. Limitations stated up front.