Right... because it's been trained on those news stories.
The point is that a model whose training stopped in 2021 would not produce the history of Ukraine (etc.) that a person writing in 2023 would.
The later GPTs are trained on the user-provided prompts/answers of previous GPTs, so this process (which isn't the LLM, but the activity of research staff at OpenAI) is what's inducing approximate tracking of some changes in meaning.
Whilst this works for any changes over-represented in the new training data, (1) the LLM isn't doing that, it's the researchers; (2) this process is vastly expensive and time-intensive; and (3) it only tracks changes with a high word frequency in the new data.
If you could run the months-long, 1GWh, 10s-million-USD training process every minute of the day, you would resolve the inability of the model to track major news stories... but would not resolve its ability to track, say, the user changing their clothes.
The model's sensitivity to stuff in the world arises because humans prepare the training data to bring about apparent sensitivity. Absent the activity of these humans, the whole thing drifts gradually into irrelevance.
> would not resolve its ability to track, say, the user changing their clothes.
In-context learning works fine for this (and does for the Russia/Ukraine change too).
But yes, sure. It can be outdated in the same way a person cut off from news can be.
We've never argued that a shipwrecked person who was unaware of news became less intelligent because of that, just that their knowledge is outdated.
Additionally, the whole point of machine learning is to make systems that learn so they remain useful.
It seems likely that a model will soon (one year? five years? one month? who knows..) be able to continually watch video broadcast news and videos of your home, continually updating its model.
In this case it would understand both the Ukraine issue and what you are wearing. Is it now suddenly intelligent? It's true it might be more useful, but to me that is a different thing.