How do large language models work?
Large language models (LLMs) work by repeatedly predicting the next token (a word or word fragment) in a sequence. Trained on vast amounts of text using the transformer architecture, they learn statistical patterns of language and then generate responses one token at a time.
Step by step
1. Text is split into tokens, each mapped to an integer ID; the model processes them through billions of learned parameters.
2. The transformer's 'attention' mechanism weighs how each token relates to the others.
3. Training adjusts the parameters so the model predicts the next token accurately across huge datasets.
4. Because LLMs predict plausible text rather than verify facts, they can 'hallucinate': produce output that sounds right but is wrong.
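The first two steps can be made concrete with a small sketch. The vocabulary, the two-dimensional embeddings, and the function names below are all made up for illustration; a real model learns its embeddings, applies separate learned projections to form queries, keys, and values, and masks out future positions. The sketch only shows the core idea: text becomes integer IDs, and attention turns similarity scores between tokens into weights that sum to 1.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Step 1: a hypothetical toy vocabulary mapping words to token IDs.
vocab = {"the": 0, "cat": 1, "sat": 2}
token_ids = [vocab[w] for w in "the cat sat".split()]

# Hypothetical 2-d embeddings (assumed values, not learned).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [embeddings[i] for i in token_ids]

# Step 2: self-attention, using the embeddings directly as Q, K, and V.
outputs, weights = attention(x, x, x)
```

Each row of `weights` says how strongly one token attends to every token in the sequence. With these assumed embeddings, "sat" puts its largest weight on itself and equal weight on "the" and "cat".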
Frequently asked questions
- How does an LLM generate an answer?
- It computes a probability for each possible next token given everything so far, picks one (often the most likely or a near-top candidate), appends it, and repeats, building the response token by token.
- What is attention in a transformer?
- A mechanism that lets the model weigh which earlier words matter most for predicting the next one, capturing context and relationships.
- Why do LLMs hallucinate?
- They optimize for plausible-sounding text based on patterns, not truth, so they can produce confident but incorrect statements.
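The token-by-token loop described in these answers can be sketched with a deliberately tiny stand-in for a language model. The bigram counter below is not a transformer, and `train_bigram`, `generate`, and the corpus are hypothetical names and data chosen for illustration. It does follow the same predict, append, repeat loop, and it shows why output can be fluent without being checked against facts: the model only knows which token tended to follow which.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens followed it in the training text."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_tokens=5):
    """Greedy decoding: repeatedly append the most frequent follower."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # nothing ever followed this token in training
        next_token = followers.most_common(1)[0][0]
        tokens.append(next_token)
    return " ".join(tokens)

# Hypothetical training text.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram(corpus)
result = generate(model, "the", max_new_tokens=3)
print(result)  # the cat sat on
```

Greedy selection is only one option. Production systems usually sample from the predicted distribution, which is why the same prompt can yield different answers.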