The “Stupid Time Flaw” in GPT Isn’t a Model Bug — It’s Missing Infrastructure
The most talked-about “stupid flaw” in GPT might not be a model bug at all. It may simply be a sign that someone never finished the basic infrastructure around it.
I recently saw a viral clip from an interview where Sam Altman mentioned a bug related to GPT’s sense of time. The internet reaction was immediate: “They still can’t fix even the simplest thing after a whole year.”
But the more I listened to later retellings of that moment, the more it sounded like something else entirely: without the right tool, the model has no reliable way to ground itself in time, so it can misread temporal cues in strange ways.
Model limitations vs. product responsibilities
This is where it’s important to separate two things that often get lumped into one.
1) An LLM is not inherently supposed to have a good sense of time
Calculating the interval between messages, understanding the current date, figuring out what happened earlier vs. later — these are not really the “job” of a language model. LLMs generate text based on patterns and context. They don’t come with a built-in, reliable clock.
Time awareness is usually the responsibility of the infrastructure around the model: the application layer that supplies timestamps, orders events, and calls tools (like a date/time function) when needed.
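To make the "date/time function" idea concrete, here is a minimal sketch of what such a tool could look like on the application side. The names `get_current_time`, `TOOLS`, and `handle_tool_call` are hypothetical and not taken from any particular vendor's API; the point is only that the clock lives in ordinary code and the model receives the result as text.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional


def get_current_time(now: Optional[datetime] = None) -> dict:
    """Hypothetical tool: report the current date and time in UTC.

    `now` can be injected for testing; by default the system clock is used.
    """
    current = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "iso_utc": current.isoformat(timespec="seconds"),
        "weekday": current.strftime("%A"),
    }


# Hypothetical registry mapping tool names to plain Python functions.
TOOLS: dict[str, Callable[[], dict]] = {"get_current_time": get_current_time}


def handle_tool_call(name: str) -> str:
    """Run the requested tool and return its result as JSON text for the model."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(TOOLS[name]())
```

In a real product, the application would call something like `handle_tool_call("get_current_time")` whenever the model requests the tool, then append the returned JSON to the conversation before asking the model to continue.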
2) Users experience one product, not a set of components
When people use ChatGPT (or any LLM-based app), they don’t think in categories like “this isn’t a model problem, it’s a missing tool-calling problem.” For them, it’s one product.
So if the product gets confused about time, then from the user’s perspective the product is unfinished in that area — regardless of whether the root cause is the model or the surrounding engineering.
What I learned building AI tools: models lack basic temporal context
We ran into the same thing when designing our own AI tools. Very quickly, you notice that models often lack extremely basic context unless you provide it explicitly:
- what day it is right now
- which message is newer
- which message is older
- how much time has passed
- in what order events actually happened
That’s a useful reality check.
Because at some point you realize many “magical” AI problems aren’t solved by a better model. They’re solved by better scaffolding around the model: a timer, metadata, sorting, logic — simple technical infrastructure that makes the system more grounded and predictable.
Without that scaffolding, the product can appear smarter than it really is in some situations, and dramatically less reliable in others.
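As a rough illustration of that scaffolding, the sketch below attaches a timestamp to each message, sorts messages chronologically, and spells out the gaps between them before anything reaches the model. The `Message` class, `describe_gap`, and `build_context` are hypothetical names, and the output format is only one possible choice, not a description of how any specific product does it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    role: str
    text: str
    sent_at: datetime  # assumed to be timezone-aware


def describe_gap(delta: timedelta) -> str:
    """Turn a time difference into a short phrase such as '2 days'."""
    seconds = max(int(delta.total_seconds()), 0)
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'' if count == 1 else 's'}"
    return f"{seconds} second{'' if seconds == 1 else 's'}"


def build_context(messages: list[Message], now: datetime) -> str:
    """Render messages oldest-first, with explicit dates and elapsed time."""
    ordered = sorted(messages, key=lambda m: m.sent_at)
    lines = [f"Current time: {now.isoformat(timespec='minutes')}"]
    previous = None
    for msg in ordered:
        if previous is not None:
            gap = describe_gap(msg.sent_at - previous.sent_at)
            lines.append(f"({gap} later)")
        stamp = msg.sent_at.isoformat(timespec="minutes")
        lines.append(f"[{stamp}] {msg.role}: {msg.text}")
        previous = msg
    if previous is not None:
        lines.append(f"(last message was {describe_gap(now - previous.sent_at)} ago)")
    return "\n".join(lines)
```

The arithmetic here is deliberately done in plain code rather than left to the model: the model only has to read statements like "2 days later", not compute them.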
The real bottleneck is often boring engineering
So for me, the lesson isn’t “Sam was wrong.” Misstatements happen to everyone.
The real lesson is this: very often, the bottleneck in an AI product is not the model itself, but the most down-to-earth engineering details around it.
If you want LLM-based products to behave dependably, treat the model as one component in a broader system, and invest accordingly in the unglamorous parts: instrumentation, state, timestamps, tool calling, and deterministic logic where it matters.

Alex Meleshko
Entrepreneur, CEO, and builder at the intersection of blockchain, AI, and startups.

