Can AI lie to my kid?

TL;DR

“Lying” implies intent. Models don’t have intent. The accurate word for what they do is hallucinating (or confabulating): generating fluent text that is sometimes wrong. The studio shows your kid the model’s proposal on a ChangeDisclosure card before anything lands, so the kid catches the made-up part and decides what to do. That practice is what AI literacy actually looks like.

The mistake becomes a teachable object

Inspect one wrong answer like evidence.

One wrong answer breaks down into a confident AI claim, the project evidence (or the missing support), and the repair move a child can make.

Claim: “That enemy already has a shield.” The answer sounds fluent, which is why it is dangerous to hide.

Evidence: The project trace shows no shield behavior. The child learns to check the artifact, not the tone of the answer.

Repair: Ask for a scoped fix or undo the suggestion. The mistake becomes practice in verification.

The verbal trap, and why the word matters.

The word “lie” carries a lot of freight. A liar is someone who knows the truth and chooses to say something else. Lying takes intent. It takes a model of the listener and an attempt to deceive them. Children develop their understanding of what lying is over years, and by age seven or eight most kids hold an adult-shaped theory: a person lied if they said something false on purpose.

Language models don’t have that machinery. A model has no model of you, no goal of deceiving you, no internal sense of what’s true. What a model has is a prediction over the next token, optimized to look plausible given everything before it. When the prediction lands on something false, the model isn’t lying. It’s producing a confident-sounding sentence that happens to be wrong.
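A toy sketch makes the point concrete. It is nothing like a real model, which is a neural network and not a lookup table, and the counts, the `NextWordCounts` type, and `predictNext` are all invented for illustration. What it shows is the shape of the problem: the code picks the most common continuation, and no step asks whether that continuation is true.

```typescript
// Toy illustration only. The table and its counts are made up for this example.
type NextWordCounts = { [context: string]: { [word: string]: number } };

const counts: NextWordCounts = {
  // In this pretend training data, "coin.js" followed the phrase more often.
  "the file is called": { "coin.js": 7, "coins.js": 3 },
};

// Return the most frequent continuation for a context, or null if none is known.
// Nothing here looks at the actual project, so "plausible" is all it can offer.
function predictNext(context: string, table: NextWordCounts): string | null {
  if (!Object.prototype.hasOwnProperty.call(table, context)) return null;
  const options = table[context];
  let best: string | null = null;
  let bestCount = -1;
  for (const word of Object.keys(options)) {
    if (options[word] > bestCount) {
      best = word;
      bestCount = options[word];
    }
  }
  return best;
}

const guess = predictNext("the file is called", counts); // "coin.js"
```

If the kid’s project actually contains coins.js, the guess is wrong, and nothing in the function could have known that.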

The research literature calls this hallucination or confabulation. Ji and colleagues’ 2023 survey is the canonical reference; it sorts hallucinations into types and walks through the data and modeling reasons they happen.¹ The vocabulary matters because the response depends on it. You don’t scold a model for lying any more than you’d scold a calculator for rounding. You check the output and decide what to do with it.

So the first thing to teach a kid is the right word. Not “Inkie lied to me.” The phrase we use in the studio is “Inkie hallucinated” or “Inkie made that up.” Both are accurate. The first is technical and the second is plain English. Either one keeps the kid in the right relationship with the tool: vigilant, not betrayed.

What hallucination actually looks like.

A kid using the studio doesn’t need to read the research literature to spot hallucination. The signs are concrete and they show up routinely. Three shapes cover most of what your kid will see.

Made-up file names. The kid asks Inkie to change something about coins in their game. Inkie proposes a change to a file called coin.js. The kid’s file tree shows coins.js, plural. The model had a strong prior for what the file “should” be called and emitted that instead of the real name. This is the easiest category to catch, and the one a kid learns to spot first.
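The check a kid does by eye can be written down in a few lines. This is a minimal sketch, not the studio’s implementation: `editDistance` and `checkProposedFiles` are hypothetical helpers, and the file tree is assumed to be a plain list of file names. The nearest real name is reported so it can be shown to the kid, not applied silently.

```typescript
// Hypothetical sketch: compare the files a proposal names against the real file tree.

// Levenshtein edit distance between two strings, computed one row at a time.
function editDistance(a: string, b: string): number {
  const row: number[] = [];
  for (let j = 0; j <= b.length; j++) row.push(j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(row[j] + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

interface FileCheck {
  proposed: string;
  exists: boolean;
  closest: string | null; // nearest real file name when the proposed one is missing
}

function checkProposedFiles(proposed: string[], fileTree: string[]): FileCheck[] {
  return proposed.map((name) => {
    const exists = fileTree.indexOf(name) !== -1;
    let closest: string | null = null;
    if (!exists) {
      let best = Infinity;
      for (const real of fileTree) {
        const distance = editDistance(name, real);
        if (distance < best) {
          best = distance;
          closest = real;
        }
      }
    }
    return { proposed: name, exists, closest };
  });
}

const fileTree = ["coins.js", "player.js", "controls.js"];
const fileReport = checkProposedFiles(["coin.js", "player.js"], fileTree);
// fileReport[0] says coin.js does not exist and the closest real file is coins.js.
```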

Made-up function references. Inkie writes a line that calls a function the project doesn’t have: player.fly() when there’s no fly() method on the player, or audio.bounce() when the sound helper has no bounce. The change looks confident on the disclosure card, but the preview won’t run.
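The same kind of check works for function names. The sketch below is an assumption-heavy illustration: `ProjectApi`, `findUnknownCalls`, and the method lists are invented, and a regular expression is only a rough stand-in for properly parsing the code. It flags calls on known objects whose method the project never defined.

```typescript
// Hypothetical sketch: flag calls to methods the project does not define.
// `ProjectApi` stands in for whatever the project really exposes.
type ProjectApi = { [objectName: string]: string[] };

interface UnknownCall {
  objectName: string;
  method: string;
}

function findUnknownCalls(code: string, api: ProjectApi): UnknownCall[] {
  // Matches `something.method(`; a rough approximation, not a real parser.
  const callPattern = /([A-Za-z_$][\w$]*)\.([A-Za-z_$][\w$]*)\s*\(/g;
  const unknown: UnknownCall[] = [];
  let match: RegExpExecArray | null;
  while ((match = callPattern.exec(code)) !== null) {
    const objectName = match[1];
    const method = match[2];
    // Only judge objects the project knows about; skip things like console.log.
    if (!Object.prototype.hasOwnProperty.call(api, objectName)) continue;
    if (api[objectName].indexOf(method) === -1) {
      unknown.push({ objectName, method });
    }
  }
  return unknown;
}

const projectApi: ProjectApi = {
  player: ["jump", "moveLeft", "moveRight"], // invented method names
  audio: ["play", "stop"],
};
const proposal = "if (keys.up) { player.fly(); audio.bounce(); player.jump(); }";
const unknownCalls = findUnknownCalls(proposal, projectApi);
// unknownCalls lists player.fly and audio.bounce; player.jump passes.
```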

Plausible-sounding facts. Inkie says “the moon’s gravity is one-fifth of Earth’s” (it’s about one-sixth). It explains a CSS property by giving it a behavior the property doesn’t have. The sentence is grammatical, the tone is confident, and the content is wrong. Stefania Druga’s research at the MIT Media Lab’s Cognimates project documented how easily kids accept this kind of confident output as authoritative when nothing in the interface flags it.²

None of these is rare. Your kid will see all three within their first few sessions. The question for product design is whether the kid sees them on a surface that asks for a decision, or whether they slip past inside an output the kid is told to trust.

Hallucination mistake lab

What a hallucination looks like inside a project

The lab slows the failure down into observable parts: confident claim, missing support, and repairable project impact.

Claim: The AI can sound certain while being wrong.

Gap: The gap appears when the claim has no project-backed evidence.

Repair: The repair turns the mistake into a teachable move.

Why hiding hallucination is the wrong call.

The default move for a consumer AI product is to smooth over the failure modes. Retry a wrong call behind the scenes. Fuzzy-match a misnamed function. Present the finished thing. The argument for that design choice is that it makes the product feel competent. The argument against it, when the product is for kids, is that the kid never learns when to doubt the model.

| Lying (what humans do) | Hallucinating (what models do) |
| --- | --- |
| Requires intent. The liar knows the truth and chooses to say something else. | No intent. The model is predicting plausible next tokens; truth is a side-effect, not a goal. |
| Has a target. The liar is trying to change what the listener believes. | No target. The model produces the same output whether you’re a child or a senator. |
| Carries moral weight. Lying is a wrong the liar does to the listener. | Carries no moral weight. It’s a failure mode, like a typo or a math error. |
| Response: trust is broken, the relationship changes. | Response: check the output, decide whether to keep it. |

The pedagogical move follows from the right side of the table. Hide the failure and the kid never builds the checking habit. Surface it and the kid develops a working theory of when the model fails. Long and Magerko’s 2020 AI-literacy framework organizes its competencies around five questions, and the second, “What can AI do?”, covers exactly this: knowing what AI can and cannot do.³ The competency is not abstract. It’s built one clicked-Undo at a time.

The longer argument for this design choice lives in our post on when AI is wrong and what your kid does about it. The parent-facing version is shorter: the kid who has seen the model fail, in plain sight, on a project they care about, will not grow up trusting models the way some adults do today. They’ll check.

Hide versus inspect source

Why hiding mistakes teaches the wrong lesson

The inspector contrasts a smoothed-over experience with a visible trace that a child can question.

Hide: Hiding the error protects the product illusion.

Reveal: Revealing the trace gives the child something real to examine.

Learn: The learning is the habit of checking, not the fantasy of perfect AI.

The surface where your kid catches it.

The piece of the studio doing most of this work is the ChangeDisclosure card. Before any change lands in the project, the card shows what Inkie proposes: which files, which functions, a one-sentence why. The kid reads it. The kid decides: Keep, Review, or Undo. Hallucination, when it happens, is right there on the card before any code has moved.

Here’s what that looks like in the case where Inkie has made something up:

ChangeDisclosure card

player.js · line 28

Inkie proposes: call player.fly() when the up arrow is pressed

controls.js · line 14

Inkie proposes: bind up-arrow to player.fly()

Keep Review Undo

Two proposed changes, both referencing a player.fly() method that doesn’t exist in the project. The kid scans the card, opens player.js from the file tree, confirms there is no fly() method, and clicks Undo. The mistake never makes it into the game.
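For readers who think in code, the flow on the card can be sketched as data plus one decision function. Everything named here is a hypothetical illustration and not the studio’s actual API: the `DisclosureCard` and `ProposedChange` shapes, the `Decision` type, `applyDecision`, and the text of the `why` field, which the card above doesn’t show. The one property the sketch is meant to capture is that nothing lands unless the kid chooses Keep.

```typescript
// Hypothetical sketch of a disclosure-card flow; all names are illustrative.
type Decision = "keep" | "review" | "undo";

interface ProposedChange {
  file: string;
  line: number;
  summary: string;
}

interface DisclosureCard {
  changes: ProposedChange[];
  why: string; // the one-sentence reason shown on the card
}

interface Project {
  applied: ProposedChange[];
}

// Return the project state after the kid's decision.
// Only "keep" lets the proposed changes land.
function applyDecision(project: Project, card: DisclosureCard, decision: Decision): Project {
  if (decision === "keep") {
    return { applied: project.applied.concat(card.changes) };
  }
  // In this sketch, "review" and "undo" both leave the project untouched.
  return project;
}

const card: DisclosureCard = {
  changes: [
    { file: "player.js", line: 28, summary: "call player.fly() when the up arrow is pressed" },
    { file: "controls.js", line: 14, summary: "bind up-arrow to player.fly()" },
  ],
  why: "Let the player fly", // invented for this example
};

const emptyProject: Project = { applied: [] };
const afterUndo = applyDecision(emptyProject, card, "undo"); // still no applied changes
const afterKeep = applyDecision(emptyProject, card, "keep"); // both changes applied
```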

The card is not a confirmation popup. It’s a structured surface that asks the kid to read what the model proposed before agreeing to it. That reading is the lesson. Bret Victor argued in his 2012 essay on learnable systems that learners need state to be visible, immediate, and scrubbable backward.⁴ The ChangeDisclosure card is what that argument looks like, retrofitted for a world with AI partners. The kid sees what the model is about to do, has a beat to evaluate it, and stays in the decision seat.

The full set of safety surfaces the studio puts around AI output is documented in our post on whether AI is safe for kids. The disclosure card is the most visible of them and the one your kid will interact with on every proposed change.

What your kid walks away with.

The honest answer to “can AI lie to my kid” is that AI can produce sentences that aren’t true, but it can’t do that with intent, so the word “lie” is the wrong frame. The frame the studio teaches is simpler and more useful. The model is fluent, it’s often right, and it’s sometimes wrong. The job is to check before agreeing.

By the time a kid has shipped a few projects in the studio, the working theory we want them to have is roughly:

Inkie isn’t lying when it gets something wrong. It’s making things up, in the way a model does.
I read the ChangeDisclosure card before I click Keep.
If the file names or function names look off, I check the file tree.
If a fact Inkie tells me sounds confident, I still look it up.
The decision is mine. Not the model’s.

That set of habits is what an adult engineer who works with AI tools every day would write down too. The studio just gets the kid to them early, in low-stakes settings, with a partner that hallucinates often enough to keep the lesson concrete. Five years from now, when your kid is working with much more capable models in much higher-stakes settings, the working theory is going to do most of the safety work for them. We haven’t fixed hallucination. Nobody has. What we’ve done is build a workflow where your kid grows up assuming the model can be wrong, and checks.

Repair habit stack

What the child takes away after catching it

The stack shows the durable outcome: skepticism, source checking, revision, and clearer explanation.

Question: The child learns to question fluent answers.

Trace: They can connect the answer to evidence or absence of evidence.

Explain: They leave with a repair story, not just a warning.

References

¹ Ziwei Ji et al., “Survey of Hallucination in Natural Language Generation,” ACM Computing Surveys, 2023. A broad survey of hallucination across natural language generation tasks, including its causes and mitigation methods.
² Stefania Druga and the Cognimates project at the MIT Media Lab, ongoing research on children’s mental models of AI. See cognimates.me and Druga’s published thesis work for the foundational studies of how kids form theories about AI competence.
³ Duri Long & Brian Magerko, “What is AI Literacy? Competencies and Design Considerations,” Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, ACM, 2020. The framework organizes its competencies around five questions; the second, “What can AI do?”, covers knowing what AI can and cannot do.
⁴ Bret Victor, “Learnable Programming,” personal essay, 2012, worrydream.com/LearnableProgramming. The canonical argument for visible state and scrubbable time in learnable systems.