OC below by @HaraldvonBlauzahn@feddit.org
What caught my attention is that assessments of AI are becoming polarized and somewhat a matter of belief.
Some people firmly believe LLMs are helpful. But programming is a logical task and LLMs can’t think - only generate statistically plausible patterns.
The author of the article explains that this creates the same psychological hazards as astrology or tarot cards, psychological traps that have been exploited by psychics for centuries - and even very intelligent people can fall prey to these.
Finally, what should cause alarm is that, on top of the fact that LLMs can’t think but people behave as if they do, there is no objective, scientifically sound examination of whether AI models can create working software any faster. Given that there are multi-billion dollar investments, and that there has been more than enough time to carry out controlled experiments, this should ring loud alarm bells.
Is the argument that LLMs are thinking because they make guesses when they don’t know things, with no quantity or quality offered to define what thinking is?
If so, I would suggest that the word “guessing” is doing a lot of heavy lifting here. The real question would be “is statistics guessing?” I would say guessing and statistics are not the same thing, and Oxford would agree. An LLM just picks tokens based on what its training data says is most likely to come next; it simply uses the statistically most likely next token or word. I don’t think picking the most likely next token counts as guessing. That feels very algorithmic and statistical to me. It is also possible I’m still missing the argument.
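To make “picking the most likely next token” concrete, here is a minimal sketch, assuming a made-up three-word vocabulary and made-up scores (none of these names or numbers come from a real model). It shows greedy selection as described above, next to temperature sampling, which is a common alternative in which the choice is drawn at random from the same probabilities.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(vocab, logits):
    """Always return the single most probable token (deterministic)."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best]

def sampled_pick(vocab, logits, temperature=1.0, rng=None):
    """Draw one token at random, weighted by its probability."""
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical vocabulary and scores, purely for illustration.
vocab = ["mat", "sofa", "moon"]
logits = [2.0, 1.0, -1.0]

print(greedy_pick(vocab, logits))
print(sampled_pick(vocab, logits, temperature=1.0, rng=random.Random(0)))
```

Either way the selection step itself is a few lines of arithmetic over a probability distribution; the disagreement in this thread is about what produces the scores, not about this step.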
No, it’s that you can’t base the argument that they don’t think on the fact that they make stuff up, because humans do that too. You could base it on the number of things they guess wrong, but that is extremely hard to measure.
Again, I’m not claiming that they think, but that we don’t know until one or the other is proven.
Right now, thinking that one or the other is true is a matter of belief.
I think you can make a strong argument that they don’t think, rooted in the idea that words should mean something and that statistics and thinking don’t mean the same thing. To me, that feels like a fairly valid argument.
So you think you need words to be able to think? Monkeys, birds, and human babies are unable to think, then?
My apologies, I was too vague. I’m saying “thinking” by definition is not “statistics”. Where monkeys, birds, and human babies all “think”, LLMs use algorithms and “statistics”. I also think that “statistics” not meaning the same thing as “thinking” is a valid argument. I would go further and say it’s important that words have meaning. That is what I was attempting to convey. I’m happy to clear up anything I was unclear about.
You are confusing how LLMs are trained with how they work.
Just because they have been trained with statistics doesn’t mean they compute, or think, using statistics.
For example, to do additions, internally LLMs do trigonometry: https://arxiv.org/abs/2502.00873
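For anyone wondering how addition can be done with trigonometry at all, here is a toy sketch, assuming numbers are placed as angles on a circle with a chosen period. It is only an illustration of the angle-addition identities, not the mechanism reported in the linked paper, and the function names are made up.

```python
import math

def to_circle(n, period):
    """Represent an integer as a point (cos, sin) on the unit circle."""
    angle = 2 * math.pi * n / period
    return math.cos(angle), math.sin(angle)

def add_on_circle(a, b, period):
    """Add a and b modulo `period` using only the angle-addition identities."""
    cos_a, sin_a = to_circle(a, period)
    cos_b, sin_b = to_circle(b, period)
    # cos(x + y) = cos x cos y - sin x sin y
    cos_sum = cos_a * cos_b - sin_a * sin_b
    # sin(x + y) = sin x cos y + cos x sin y
    sin_sum = sin_a * cos_b + cos_a * sin_b
    # Read the angle back and convert it to an integer in [0, period).
    angle = math.atan2(sin_sum, cos_sum) % (2 * math.pi)
    return round(angle * period / (2 * math.pi)) % period

print(add_on_circle(27, 45, 100))
```

Nothing in this code was told to “add”; the sum falls out of multiplying sines and cosines, and the result is only correct modulo the chosen period.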
They do probably use statistics for tons of stuff internally, but humans do too: guessing, bias, tendency, preferences.
Anthropic’s researchers found that their LLMs have “features” for concepts.
I don’t think you can disconnect how an LLM was trained from how it operates. If you train an LLM to use trigonometry to solve addition problems, I think you will find the LLM will do trigonometry to solve addition problems. If you train an LLM in only Russian, it will speak Russian. I would suggest that regardless of what you train it on it will choose the statistically most likely next token based on its training data.
I would also suggest we don’t know the exact training data being used on most LLMs, so as outsiders we can’t say one way or another how the LLM is being trained to do anything. We can try to extrapolate how the LLM was trained from posts like the one that you linked, though. In general, if that is how the LLM is coming to its next token, then the training data must be really heavily weighted in that manner.
You can; heck, the example I gave shows exactly this:
It was not trained to do trigonometry to solve addition problems; it was trained to respond to additions. Trigonometry is how the statistical part, the backpropagation, found a way to make the neurons solve additions.
You are mixing things up: the way LLMs are trained does not dictate how the neurons get organised to score better at inference.
I would point out that I think you might be overly confident about how it was trained to do addition. I’m open to being wrong here, but when you say “It was not trained to do trigonometry to solve addition problem”, that suggests to me that either you know how it was trained, or you are making assumptions about how it was trained. I would suggest that unless you work at one of these companies, you probably are not privy to their training data. This is not an accusation; I think that is probably a trade secret at this point. And if the idea is that nobody would train an LLM to do addition in this manner, I invite you to glance at the Wikipedia article on addition. Really, glance at literally any math topic on Wikipedia. I didn’t notice any trigonometry in this entry, but I did find the discussion around finding the limits of logarithmic equations in the “Related Operations” section: https://en.m.wikipedia.org/wiki/Addition. They also cite convolution as another way to add, in which they jump straight to calculus: https://en.m.wikipedia.org/wiki/Convolution.
This is all to say, I would suggest that we don’t know how they’re training LLMs. We don’t know what that training data is or how it is being used exactly. What we do know is that LLMs work on tokens and weights. The weights and statistical relevance to each of the other tokens depends on the training data, which we don’t have access to.
I know this is not the point, but up until this point I’ve been fairly pedantic and tried to use the correct terminology, so I would point out that technically LLMs have “tensors” not “neurons”. I get that tensors are designed to behave like neurons, and this is just me being pedantic. I know what you mean when you say neurons, just wanted to clarify and be consistent. No shade intended.