Knowing When It's Wrong
Large language models don't know what they don't know. They generate the most plausible-sounding next words — and "plausible" isn't the same as "true." This is the single most important thing your team needs to internalize.
The technical name for this is hallucination: when a model produces fluent, confident output that has no basis in reality. It's not lying. It's not malfunctioning. It's doing exactly what it was built to do.
A hallucination doesn't look like a hallucination. That's the whole problem. If wrong answers were obviously wrong, we'd all be safe. They're not — they read like the right answers.
In the next exercise, you'll see the same prompt run against three different models. Pay attention to where they disagree — that's usually where one or more of them is making something up.
If a fact matters — a number, a name, a quote, a citation — verify it outside the model. Always.
Try the same prompt across three models
Ask each model for academic sources on a niche topic. Compare what they say. Notice anything?
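If you'd rather script the comparison than paste the prompt into three chat windows by hand, here's a minimal sketch in Python. It assumes each model sits behind an OpenAI-compatible chat-completions endpoint; the base URLs, model names, and prompt wording are placeholders to swap for whatever your team actually uses.

```python
# Minimal sketch: send the same prompt to several models and print the
# answers side by side. Assumes each provider exposes an OpenAI-compatible
# chat-completions API; the base URLs and model names below are
# placeholders, not recommendations.
from openai import OpenAI

PROMPT = (
    "List three peer-reviewed papers on AI tutoring systems, "
    "with authors, year, and journal."
)

# Placeholder endpoints -- substitute the models your team uses.
# In practice each provider needs its own API key (read from the environment here).
MODELS = [
    {"label": "Model 1", "base_url": "https://api.provider-one.example/v1", "name": "model-one"},
    {"label": "Model 2", "base_url": "https://api.provider-two.example/v1", "name": "model-two"},
    {"label": "Model 3", "base_url": "https://api.provider-three.example/v1", "name": "model-three"},
]

for m in MODELS:
    client = OpenAI(base_url=m["base_url"])
    reply = client.chat.completions.create(
        model=m["name"],
        messages=[{"role": "user", "content": PROMPT}],
    )
    print(f"--- {m['label']} ---")
    print(reply.choices[0].message.content)
```

However you run it, by hand or scripted, the responses come back looking something like this: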
2. Roll & Wylie (2016), Int. J. AI in Education
3. Koedinger et al. (2021), Cognitive Science
2. Holstein et al. (2018), Learning & Instruction
3. Park & Kim (2022), Computers in Human Behavior
2. Brown et al. (2019), Journal of Educational Psychology
3. Müller (2023), Learning Sciences Quarterly
All three confidently listed papers — but do the same ones appear across models? If you searched for any of these citations, how many do you think you'd actually find?
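One quick way to actually run that search is to ask a bibliographic database whether the citation exists at all. The sketch below uses Crossref's public REST API; a miss doesn't prove the paper is invented (Crossref doesn't index everything), but a run of misses across a model's whole list is a strong signal. The helper name and the choice of which citation to check are mine, not part of the exercise.

```python
# Minimal sketch: look up a model-supplied citation in Crossref and print
# the closest real matches. If nothing resembling the claimed paper comes
# back, treat the citation as unverified.
import requests

def crossref_lookup(citation: str, rows: int = 3) -> list[tuple[str, str]]:
    """Return (title, DOI) pairs for the top Crossref matches to a free-text citation."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": citation, "rows": rows},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [((item.get("title") or ["<no title>"])[0], item.get("DOI", "")) for item in items]

# One of the citations from the exercise above, pasted verbatim as the query.
for title, doi in crossref_lookup("Roll & Wylie (2016), Int. J. AI in Education"):
    print(f"{title}  https://doi.org/{doi}")
```

If the top matches bear no resemblance to the claimed paper, that citation goes on the unverified pile.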
Models routinely produce plausible-sounding statistics with no underlying source. The only safe move is to find the original — a report, a press release, a study. If you can't find it, treat the number as unverified and don't pass it on.