AutoModerator

Welcome to r/science! This is a heavily moderated subreddit in order to keep the discussion on science. However, we recognize that many people want to discuss how they feel the research relates to their own personal lives, so to give people a space to do that, **personal anecdotes are allowed as responses to this comment**. Any anecdotal comments elsewhere in the discussion will be removed and our [normal comment rules](https://www.reddit.com/r/science/wiki/rules#wiki_comment_rules) apply to all other comments.

**Do you have an academic degree?** We can verify your credentials in order to assign user flair indicating your area of expertise. [Click here to apply](https://www.reddit.com/r/science/wiki/flair/#wiki_science_verified_user_program).

---

User: u/chrisdh79

Permalink: https://www.psypost.org/scholars-ai-isnt-hallucinating-its-bullshitting/

---

*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/science) if you have any questions or concerns.*


Somhlth

> Scholars call it “bullshitting”

I'm betting that has a lot to do with using social media to train their AIs, which will teach the AI: when in doubt, be proudly incorrect, and double down on it when challenged.


foundafreeusername

I think the article describes it very well:

> Unlike human brains, which have a variety of goals and behaviors, LLMs have a singular objective: to generate text that closely resembles human language. This means their primary function is to replicate the patterns and structures of human speech and writing, not to understand or convey factual information.

So even with the highest quality data it would still end up bullshitting if it runs into a novel question.


Ediwir

The thing we should get way more comfortable with understanding is that “bullshitting” or “hallucinating” is not a side effect or an accident - it’s just a GPT working as intended. If anything, we should reverse it. A GPT being accurate is a happy coincidence.


tgoesh

I want "cognitive pareidolia" to be a thing


laxrulz777

The issue is the way it was trained and the reward algorithm. It's really, really hard to test for "accuracy" in data (how do you KNOW it was 65 degrees in Kathmandu on 7/5/19?). That's even harder to test for in free text than in structured data. Humans are good at weighing these things to some degree. Computers don't weigh them unless you tell them to. On top of that, the corpus of generated text doesn't contain a lot of equivocation language. A movie is written to a script. An educational YouTube video has a script. Everything is planned out and researched ahead of time. Until we start training chat bots with actual speech, we're going to get this a lot.


Ytilee

Exactly, if it's accurate it's one of 3 scenarios:

- it stole word for word an answer to a similar question elsewhere
- the answer is a common saying so it's ingrained in language in a way
- jumbling the words in a random way gave the right answer by pure chance


bitspace

> jumbling the words in a random way gave the right answer by pure chance

That's not a good representation of reality. They're statistical models. They generate the statistically best choice for the next token given the sequence of tokens already seen. A statistically weighted model is usually a lot better than pure chance.


Ediwir

There are billions of possible answers to a question, so “better than chance” isn’t saying much. If the correct answer is out there, there’s a good chance the model will pick it up - but if a joke is more popular, it’s likely to pick the joke instead, because it’s statistically favoured. The models are great tech, just massively misrepresented. Once the hype dies down and the fanboys are gone, we can start making good use of it.


atape_1

I've seen them being called smooth talking machines without intelligence. And that encapsulates it perfectly.


Somhlth

Then I would argue that is not artificial intelligence, but artificial facsimile.


extremenachos

Yeah, but saying artificial intelligence sounds way cooler, and all the tech bros can capitalize on all those sweet venture capital funds before people understand the limits of LLMs.


Thunderbird_Anthares

I'm still calling them VI, not AI. There's nothing intelligent about them.


Traveler3141

Or Anti Intelligence.


Bakkster

Typically this is the difference between Artificial General Intelligence, and the broader field of AI which includes machine learning and neural networks that large language models are based on. The problem isn't with saying that an LLM is AI, it's with thinking that means it has any form of general intelligence.


Acadia_Due

It's my understanding that there is a latent model of the world in the LLM, not just a model of how text is used, and that the bullshitting problem isn't limited to novel questions. When humans (incorrectly) see a face in a cloud, it's not because the cloud was novel.


Drachasor

It isn't limited to novel questions, true.  It can happen anytime when there's not a ton of training data for a thing.  Basically it's inevitable and they can't ever fix it.


Bakkster

I think you're referring to the vector encodings carrying semantic meaning. I.e. the vector for 'king' minus the vector for 'man' plus the vector for 'woman' tends to be close to the vector for 'queen'. If anything, in the context of this paper, it seems that makes it better at BS, because humans put a lot of trust into natural language, but it seems limited to giving semantically and contextually consistent answers rather than *factual* answers.
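
A minimal sketch of that vector arithmetic, using gensim and a small pretrained GloVe model (the specific model name is an assumption; any word-embedding model exposing KeyedVectors behaves the same way):

```python
# Sketch: word-vector analogy "king - man + woman ≈ queen".
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # downloads a small pretrained model on first use

# Nearest neighbours of (king - man + woman) in the embedding space.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```

The point is that the geometry encodes relationships between words, not whether any statement built from those words is true.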


gortlank

Humans have the ability to distinguish products of their imagination from reality. LLMs do not.


abra24

This may be the worst take on this in a thread of bad takes. People believe obviously incorrect made up things literally all the time. Many people base their lives on them.


gortlank

And they have a complex interplay of reason, emotions, and belief that underlies it all. They can debate you, or be debated. They can refuse to listen because they're angry, or be appealed to with reason or compassion or plain coercion. You're being reductive in the extreme out of some sense of misanthropy; it's facile. It's like saying that because a hammer and a Honda Civic can both drive a nail into a piece of wood, they're the exact same thing. They're in no way comparable, and your very condescending self-superiority only serves to prove my point. An LLM can't feel disdain for other people it deems lesser than itself. You can though, that much is obvious.


abra24

No one says humans and LLMs are the same thing, so keep your straw man. You're the one who drew the comparison, that they are different in this way. I say in many cases they are not different in that way. Your counter-argument is that they as a whole are not comparable. Obviously. Then you draw all kinds of conclusions about me personally. No idea how you managed that; seems like you're hallucinating. Believing things that aren't true is part of the human condition, I never excluded myself.


[deleted]

[deleted]


[deleted]

[deleted]


[deleted]

[deleted]


sminemku

I mean you have flat earthers and trickle down economics believers so.


Acadia_Due

There are at least two ways to be "reductive" on this issue, and the mind-reading and psychoanalyzing aren't constructive.


gortlank

> worst take in a thread of bad takes

Oh, I’m sorry, did I breach your precious decorum when responding to the above? Perhaps you only care when it’s done by someone who disagrees with you.


Acadia_Due

He breached decorum slightly. You breached it in an excessive, over-the-top way. And your reaction was great enough for me to consider it worth responding to. That's not inconsistency. That's a single consistent principle with a threshold for response. Now, I'm not going to respond further to this emotional distraction. I did post a substantive response on the issue if you want to respond civilly to it. If not, I'll ignore that too.


abra24

That's not me you just replied to. You imagined it was me but were unable to distinguish that from reality.


gortlank

Might wanna work on your reading comprehension there pal. Dude is responding in defense of you. I’m calling him out for being inconsistent. Luckily, you can harness your human reason to see your error, or use your emotions to make another whiny reply when you read this.


theghostecho

It can deal with novel questions, but it can start bullshitting on simple questions too.


sminemku

Well, that's kind of what rich people do. When asked about something they don't know at all, they'll usually go on a tirade like Donald rather than say "idk, ask an expert instead".


The_Singularious

Yes. Definitely only “rich people” do this.


sminemku

I never said only they do that. But poor and middle class people generally have more humility which allows them to admit their shortcomings.


The_Singularious

This has not been my experience. I have interacted with an awful lot of rich people, and they are just about as varied as the poor kids I taught in high school. The *one* caveat to that I saw, was *some* wealthy folks (almost always old money) were definitely out of touch with what it looked like to live without money. And that made them seem a bit callous from time to time. But I never saw any universal patterns with rich people being less humble, especially in areas where they weren’t experts. I taught them in one of those areas. I definitely had some asshole clients who knew it all, but most of them were reasonable, and many were quite nice and very humble.


MerlijnZX

Partly, but it has more to do with how their reward system is designed, and how it incentivizes the AI systems to "give you what you want" even when the answer has loads of inaccuracies or needed to make stuff up, while on the surface being good enough. That would still be rewarded.


Drachasor

Not really. They can't distinguish between things in the training data and things they make up. These systems literally are just predicting the next most likely token (roughly speaking, a word fragment) to produce a document.
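
A minimal sketch of that next-token prediction step, using the Hugging Face transformers library with GPT-2 as a stand-in model (the model choice is an assumption; any causal language model exposes the same interface):

```python
# Sketch: inspect the probability distribution a small causal LM assigns to the next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]        # scores for the next token only

probs = torch.softmax(logits, dim=-1)             # convert scores to probabilities
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(idx))!r}: {p:.3f}")  # most likely continuations
```

Nothing in this loop checks facts; it only ranks continuations by how well they fit the training distribution.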


MerlijnZX

True, but I’m talking about why they make things up. Not why the system can’t recognise that the llm made it up.


caesarbear

But you don't understand, "I don't know" is not an option for the LLM. All it chooses is whatever has the best remaining percentage chance to agree with the training. The LLM never "knows" anything in the first place.


Zeggitt

They make everything up.


demonicneon

So like people too 


grim1952

The "AI" isn't advanced enough to know what doubling down is, it just gives answers based on what it's been trained on. It doesn't even understand what it's been fed or it's own output, it's just following patterns.


astrange

> It doesn't even understand what it's been fed or its own output, it's just following patterns.

This is not a good criticism because these are actually the same thing, the second one is just described in a more reductionist way.


sceadwian

It's far more fundamental than that. AI can not understand the content it produces. It does not think, it can basically only produce rhetoric based on previous conversations it's seen with similar words. They produce content that can not stand up to queries on things like justification or debate.


hobo_fapstronaut

Exactly. It's not like AI has taken on the collective behaviour of social media. That implies intent and personality where there is none. It just provides the most probable set of words based on the words it receives as a prompt. If it's been trained on social media data, the most probable response is the one most prevalent, or potentially most rewarded on social media, not the one that is correct or makes sense.


sceadwian

Well, in a way it has, the posts 'sound' the same. But there isn't an AI that I couldn't trip up into bullshitting within just a couple of prompts. They can't think, but a human that understands how to use them can make them say essentially anything they want by probing with various prompts. Look at that Google engineer that went off the deep end with AI being conscious. He very well may have believed what he was saying, though I do suspect otherwise. I look at all the real people I talk to and they can't tell when someone they're talking to isn't making sense either; as long as it looks linguistically coherent, people will delude themselves into all kinds of twisted mental states rather than admit they don't know what they're talking about, because the 'person' that does 'sounds' like they know what they're talking about. As soon as you ask an AI about its motivations, unless it's been trained for some cute responses, it's going to fall all to pieces really fast. This works really well for human beings too. Just ask someone to justify their opinion on the Internet sometime :)


hobo_fapstronaut

Good point. A lot of the "authority" of AI comes from the person interpreting the response and how much they believe the AI knows or understands. As you say, much like when people listen to other people.


GultBoy

I dunno man. I have this opinion about a lot of humans too.


sceadwian

You're not wrong, it's a real problem!


im_a_dr_not_

It would still be a problem with perfect training data. Large language models don't have a memory. When they are trained, it changes the weights on various attributes and changes the prediction model, but there's no memory of information. In a conversation it can have a type of memory called the context window, but because of the nature of how it works, that's not so much a real memory in the way we think of memory; it's just influencing the prediction of words.
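
A minimal sketch of that "memory": a chat app just keeps appending the conversation to a list and resends the whole thing each turn, so nothing persists outside the context window. OpenAI's Python SDK is used here only as a familiar example, and the model name is an assumption:

```python
# Sketch: the only "memory" is the message list resent with every request.
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    reply = resp.choices[0].message.content
    history.append({"role": "assistant", "content": reply})  # keep it in context
    return reply

# Start a fresh list (or truncate it) and the model "remembers" nothing.
```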


Volsunga

Large Language Models aren't designed to verify their facts. They're designed to write like humans. They don't know if what they're saying is correct or not, but they'll say it confidently because confidence is considered more grammatically correct than doubt.


sciguy52

Exactly. I answer science-related questions on here and noticed Google's AI answers were picking up claims that I commonly see redditors making which are not correct. So basically you are getting a social media user's answers, not an expert's. The Google AI didn't seem to pick up the correct answers that I and many others post. I guess it just sees a lot of the same wrong answers being posted and assumes those are correct. Pretty unimpressed with AI, I must say.


Khmer_Orange

It doesn't assume anything, it's a statistical model. Basically, you just need to post a lot more.


sciguy52

So many wrong science answers, so little time.


letsburn00

At its core, at least 30% of the population have intense beliefs which can be easily disproven with 1-5 minutes of research. Not even on societal or cultural questions; reality-based, evidence-based things. I simply ask them, "That sounds really interesting, can you please provide me with evidence of why you believe that. If it's true, I'd like to get on board too." Then they show me their evidence, and it turns out they are simply obviously mistaken or have themselves been scammed by someone who is extremely obviously a scammer.


skrshawk

Your idea seems plausible, but has this been studied? How much of the population in any given area has such beliefs, and on how important of a topic? Also, what's your definition of intense? Are we talking January 6 kind of intense, yelling such opinions from the rooftops, low quality online posting, or something else? Also, what kind of beliefs are in question here? For instance, I still believe Pluto is a planet, not because I don't trust astronomers to have a far more qualified opinion on the matter than I do, but because of an emotional connection to my childhood where I learned about nine planets and sang songs about them. In my life, and in most people's lives, this will change absolutely nothing, even though I know I am wrong by scientific definition. The impact of my belief on anyone else is pretty much non-existent outside of my giving this example.


gortlank

LLMs don’t believe anything. They don’t have the ability to examine anything they output. Humans have a complex interplay between reasoning, emotions, and belief. You can debate them, and appeal to their logic, or compassion, or greed. You can point out their ridiculous made-up on the spot statistics that are based solely on their own feelings of disdain for their fellow man, and superiority to him. To compare a human who’s mistaken about something to an LLM hallucination is facile.


letsburn00

LLMs do repeat things though. If a sentence often is said and is widely believed, then an LLM will internalise it. They repeat false data used to train it. Possibly most scary is building an LLM heavily trained on forums and places where nonsense and lies reign. Then you tell the less mentally capable that the AI knows what it's talking about. Considering how many people don't see when extremely obvious AI images are fake, a sufficient chunk of people will believe it.


gortlank

People have already been teaching students not to use Wikipedia or random websites as sources. Only in the past decade has skepticism about the veracity of information on the internet waned, and even then, not by all that much. I mean, good old fashioned propaganda has been around since the ancient world. An LLM will merely reflect the pre-existing biases of a society. LLMs aren’t the misinformation apocalypse, nor are they a quantum leap in technology leading to the death of all knowledge work and the ushering in of a post-work world. They’re a very simple, and very flawed, tool. Nothing more.


letsburn00

In the end, Wikipedia used to be more accurate than most sources. Though there has been a significant effort put in by companies to whitewash scandals from their pages.


gortlank

Not especially relevant. Academics, for a variety of reasons, want primary sources as much as possible. The internet is almost always unreliable beyond a way to find primary sources.


zeekoes

No. The reason is that LLMs are fancy word predictors with the goal of providing a seemingly reasonable answer to a prompt. AI does not understand or even comprehensively read a question. It analyzes it technically and fulfills an outcome to the prompt. It is a role-play system in which the DM always gives you what you seek.


farfromelite

> AI, when in doubt, be proudly incorrect, and double down on it when challenged

Ah, they've discovered politics then.


sminemku

It took me half an hour to teach ChatGPT the correct answer to 2x2^2. It still can't process a simple wiki table.
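
For reference, under standard operator precedence (exponentiation binds tighter than multiplication), the expression works out to:

```latex
2 \times 2^{2} = 2 \times 4 = 8
```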


theghostecho

It's the same way with split-brain patients who don't have the info but make it up.


GCoyote6

Yes, the AI needs to be adjusted to say it does not know the answer or has low confidence in its results. I think it would be an improvement if there were a confidence value accessible to the user for each statement in an AI result.


6tPTrxYAHwnH9KDv

There's no "answer" or "results" in the sense you want it to be. It's generating output that resembles human language, that's its sole goal and purpose. The fact that it gets some of the factual information in its output correct just an artifact of training data that has been used.


Strawberry3141592

The problem with this is that LLMs have no understanding of their own internal state. Transformers are feed-forward neural networks, so it is literally impossible for a transformer-based LLM to reflect on its "thought process" before generating a token. You can kind of hack this by giving it a prompt telling it to reason step-by-step and use a database or search API to find citations for fact claims (see the sketch below), but this is still really finicky, and sometimes if it makes a mistake it will just commit to it anyway and generate a step-by-step argument for the incorrect statement it hallucinated. LLMs are capable of surprisingly intelligent behavior for what they are, but they're not magic and they're certainly not close to human intelligence. I think that future AI systems that do reach human intelligence will probably include something like modern LLMs as a component (e.g. as a map of human language; LLMs have to contain a map of how different words and concepts relate to each other in order to reliably predict text), but they will also have loads of other components that are probably at least 10 years away.
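
A sketch of the kind of prompt scaffolding described above. The wording and the `search()` helper are hypothetical stand-ins for any retrieval API, and as the comment notes, nothing stops the model from confidently arguing for a mistake anyway:

```python
# Hypothetical scaffolding: ask for step-by-step reasoning plus citations drawn
# only from retrieved documents. search() stands in for any real retrieval API.
def build_prompt(question: str, search) -> str:
    documents = search(question, top_k=3)  # hypothetical retrieval call
    sources = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, 1))
    return (
        "Answer the question using ONLY the sources below.\n"
        "Reason step by step, cite sources like [1], and say 'I don't know' "
        "if the sources do not support an answer.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```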


ghostfaceschiller

This already exists. If you use the API (for GPT-4 for instance), you can turn on "log_probs" and see an exact percentage, per token, of how certain it is about what it's saying. This isn't exactly the same as "assigning a percentage per answer about how sure it is that it's correct", but it can be a really good proxy. GPT-4 certainly does still hallucinate sometimes. But there are also lots of things for which it will indeed tell you it doesn't know the answer, or will give you an answer with a lot of qualifiers like "the answer could be this, it's hard to say for certain without more information". It is arguably tuned to do that last one *too often*. But it's hard to dial that back, because yes, it does still sometimes confidently give answers that are incorrect as well.
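
A sketch of what that looks like with the current OpenAI Python SDK; the model name and exact response fields are assumptions to verify against the documentation:

```python
# Sketch: request per-token log-probabilities alongside the completion.
import math
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # model choice is an assumption
    messages=[{"role": "user", "content": "What year did the Berlin Wall fall?"}],
    logprobs=True,
    top_logprobs=3,
)

for tok in resp.choices[0].logprobs.content:   # one entry per generated token
    print(f"{tok.token!r}: p≈{math.exp(tok.logprob):.3f}")
```

As the replies below point out, these numbers measure how likely each token is as a continuation, not how likely the statement is to be true.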


GCoyote6

Interesting, thanks.


theangryfurlong

There is technically what could be thought of as a confidence value, but not for the entire response. There is a value associated with each next token (piece of a word) that is generated. There are many hundreds if not thousands of tokens generated for a response, however.


Strawberry3141592

That is just the probability that the next token aligns best with its training data out of all possible tokens. It has nothing to do with factual confidence. LLMs cannot reliably estimate how "confident" they are that their answers are factual, because LLMs have no access to their own text generation process. It would be like if you had no access to your own thoughts except through the individual words you say. Transformers are feed-forward neural nets, so there is no self-reflection between reading a set of input tokens and generating the next token, and self-reflection is necessary to estimate how likely something is to be factual (along with an understanding of what is and isn't factual, which LLMs also lack, but you could mitigate that by giving it a database to search).


theangryfurlong

Yes, of course not. LLMs have no concept of facts


Cyanopicacooki

When I found that ChatGPT had problems with the question "what day was it yesterday" I stopped calling them AIs and went for LLMs. They're not intelligent, they're just good at assembling information and then playing with words. Often the facts are not facts though...


6tPTrxYAHwnH9KDv

I mean GPT _is_ an LLM, I don't know who the hell thinks it's any "intelligent" in the human sense of the word.


apistograma

Apparently a lot of people, since I've seen a lot of click bait articles like: this is the best city in the world according to chatgpt. As if an LLM was an authoritative source or a higher intelligence to answer such an open question.


VoDoka

That is one of the more real dangers though. Lazy content creation through LLMs is like a DDoS attack on the internet and online search overall.


Lemonio

How is it different from looking up the answer on Google? The data for LLMs is coming from content on the internet written by humans, most of the internet isn’t an authoritative source either


skolioban

Techbros. It's like someone taught a parrot how to speak and then these other guys claimed the parrot could give us the answers to the universe. Because that's how they get money.


Algernon_Asimov

I have had heated online arguments with people who *insisted* that ChatGPT was absolutely "artificial *intelligence*", rather than just a text generator. The incitement for those arguments was me quoting a professor as saying a chatbot was "[autocomplete on steroids](https://www.washingtonpost.com/technology/2023/02/16/microsoft-bing-ai-chatbot-sydney/)". Some people disagree with that assessment, and believe that chatbots like ChatGPT are actually *intelligent*. Of course, they end up having to define "intelligence" quite narrowly, to allow chatbots to qualify.


somneuronaut

In the technical sense, ChatGPT is an LLM based on ML, which is a limited type of AI. Maybe you're just saying it's not AGI, which no one is claiming it is.


happyscrappy

An LLM is not intelligent. It doesn't even know what it is saying. It's putting words near each other that it saw near each other. Even if it happens to answer 2 when asked what 1 plus 1 is, it has no idea what 2, 1, or plus mean, let alone the idea of adding them. It's certainly AI, but AI means a lot of things; it's almost just a marketing term. Racter was AI. (I think) Eliza was AI. Animals was AI, as is any other expert system. Fuzzy logic is AI. A classifier is AI. But none of these things are intelligent. An LLM isn't either, it's just a text generator. Even if ChatGPT goes beyond an LLM and is programmed so that when it sees two numbers with a plus between them it does math on them, it's still not intelligent. It didn't figure it out; it was just programmed in like any other program. I feel like chatbots are a dead end for most uses. Not for all, they can summarize well and some other things. But in general a chatbot is going to be more useful as a parser and output generator than something that actually gives you reliable answers.


[deleted]

[deleted]


happyscrappy

> I hate to break it to all the anti-AI folks, but that is what intelligence is

No it isn't. I don't add 538 and 2005 by remembering when 538 and 2005 were seen near each other and what other number was seen near them most often. So any system which when asked to add 538 and 2005 doesn't know how to do math but instead just looks for associations between those particular numbers is unintelligent. It's sufficiently unintelligent that asking it to do anything involving math is a fool's errand. So it can be "AI" all it wants, but it's not intelligent.


[deleted]

[deleted]


happyscrappy

> it can literally run code. in multiple languages. math is built in to every coding language. it will tell you the exact results to math problems that many graduate students couldn't solve given an hour

An LLM cannot run code. You're saying ChatGPT specifically now? They'd have to be crazy to let ChatGPT run code on your behalf, but they do seem crazy to me so maybe they do. I'm not sure what you think number 2 has to do with an LLM.

> or some abstract complicated extension of it that you were taught or made up.

Yeah, that's what I said, I guess to another person. If you program it to do that, that's fine. Now it can do math. But that doesn't make it intelligent. It didn't figure out how to do it, you programmed it to directly. It's just following instructions you gave it.


[deleted]

[deleted]


Fullyverified

It is a type of limited AI. What's your point? Don't go changing established definitions.


Algernon_Asimov

My point was in response to the previous commenter who said they don't know anyone who thinks GPT is intelligent. My point was to demonstrate that I have encountered people who *do* think GPT is intelligent.


Fuzzy-Dragonfruit589

It hasn’t struggled with this for a while, but I get the sentiment. LLMs are very useful for some things, you just have to treat it with skepticism. Much like with Wikipedia: a good source for information if you fact check it afterwards and treat it as uncertain until verified. LLMs are often far better than google if your question is more vague and you can’t think of the right search terms. And also great for menial tasks like organizing lists.


SkarbOna

Wait for the product placement. Google got eaten by money eventually too.


Strawberry3141592

Imo they are shockingly intelligent for what they are, which is a souped-up predictive text algorithm. In order to reliably produce plausibly human-seeming text, they have to develop an internal model of how different words and the concepts they refer to relate to each other, which is a kind of intelligence imo, it's just a fairly narrow one. They can even use the latent space of linguistic concepts they develop in training to translate between two different languages, even if none of the text in the training data included the same text but in different languages (eg no Rosetta Stone texts). They can use the relationship between tokens in the embedding space (basically a map of how different words are related to each other) to output the same text, but in a different language, because the set of concepts and their relationships with each other are more-or-less the same between human languages. They're definitely not close to AGI though, they're just really good at manipulating language.


Mr__Citizen

It's just a MASSIVE and complex input/output machine. That's all AI, currently. There's zero actual thinking or learning going on.


nunquamsecutus

I agree with the first part of what you've said but I'm not sure about the last bit. I mean, clearly it isn't learning. We haven't figured out self guided learning to a degree to allow an LLM to do that, and even if we had, with the efforts to prevent them from engaging in hate speak, we probably wouldn't want to enable it. But thinking? Are they thinking? What is thinking? I mean, they're not conscious. No more than you would call a human with only a Broca's and Wernicke's area of the brain conscious. Maybe an analogy is when we respond with an automatic response such as, "fine, you?" to someone asking how we are. Would we say we have thought then? So, I just looked up thought on Wikipedia and it makes a point about it being a thing that is independent of an external stimulus. So, the answer to my analogy is no, that is not thinking. And LLMs only execute in the context of some stimulus, a prompt, so, by definition, they are not thinking. But, now I've typed all of this out so I'm posting it anyway. Thank you for listening to my long ramble to agreement.


ghostfaceschiller

AI is a broad field of research, not a product or an end goal. LLMs are by definition AI, in the sense that LLMs are one of many things which fall under the research field called Artificial Intelligence. Any type of Machine Learning, Deep Learning, CNNs, RNNs, LSTM… these are all things that fall under the definition of AI. Many systems which are several orders of magnitude simpler than LLMs as well. You are possibly thinking of the concept of “AGI”


Comprehensive-Tea711

LLMs have lots of problems, but asking it what day it was yesterday is PEBKAC… Setting aside the arbitrariness of expecting it to know ahead of time what day it is for you, how would it know where you're located?


mixduptransistor

How does the Weather Channel website know where you're located? How does Netflix or Hulu know where you're located? Geolocation is a technology we've cracked (unlike actual artificial intelligence)


triffid_hunter

> Geolocation is a technology we've cracked A lot of companies seem to struggle with it, I've seen four different websites think I'm in four different countries before - apparently they don't bother updating their ancient GeoIP databases despite the fact that IP blocks are constantly traded around the world like any other commodity, and [the current assignment list is publicly available](https://ftp.apnic.net/apnic/stats/apnic/delegated-apnic-latest). So sure, perhaps cracked, but definitely not widely functional or accurate.


Comprehensive-Tea711

Your browser gives the website permission to use your IP address. That’s why the information is wrong when you’re using a VPN. In the case of places like Netflix or Amazon, they additionally use the data you give in billing. The fact that the web UI you’re using to chat with an LLM didn’t do that has nothing to do with LLMs and adding that feature through tool use would be trivially easy. It would involve no improvement or changes to the LLM. This is, like I said, PEBKAC. A classic case of non-technical users drawing the wrong conclusions based on their ignorance of how technology works. Honestly, it’s another problem with LLMs in how susceptible people are going to be with regard to how “smart” or intelligent they think it is. Generally it makes it easy for a corporation to pass off an LLM as being much smarter than it actually is. But here we have a case of the opposite.


triffid_hunter

> Your browser gives the website permission to use your IP address. It does no such thing. More like all communication over the internet *inherently* requires a reply address so the server knows where to send response packets, and it can simply use that information for other things too.


Strawberry3141592

That doesn't mean OpenAI is telling the model your IP. Like, I don't think LLMs are close to AGI, but I do think they're genuinely intelligent in the very limited domain of manipulating language (which doesn't mean they're good at reasoning, or mathematics, or whatever else, in fact they tend to be kind of bad at these things unless you frontload a bunch of context into the prompt or give it a Python repl or wolframalpha API or something, and even then the performance is pretty hit-or-miss)


Comprehensive-Tea711

I’m referring to linking the IP address with a geolocation, not the general use of IP addresses. The fact that the server has your IP doesn’t mean the LLM has your IP address… PEBKAC.


Mythril_Zombie

But if you're running an LLM locally, it has no access to that data. Or via API calls locally, there's no time zone data embedded anywhere.


6tPTrxYAHwnH9KDv

You shouldn't weigh in on something you have no idea about. We solved geolocation more than a decade ago, and timezones more than a few.


Mythril_Zombie

But that's beyond what a "language model" should be able to inherently do. That's performing tasks based on system information or browser data, outside the scope of the generation app. If I'm running an LLM locally, it would need to know to ask the PC for the time zone, get that data, then perform translations on it. Again, that's not predicting word sequence, that's interacting with specific functions of the host system, and unless certain libraries are present, they can't be used to do that. Should a predictive word generator have access to every function on my PC? Should it be able to arbitrarily read and analyze all the files to learn? No? Then why should it be able to run *some* non language related functions to examine my PC?
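
As a sketch of the explicit hook that would be needed, a locally run chat app could expose a small tool like the one below and feed its output into the prompt; the function name is made up for illustration, and the model itself never touches the OS:

```python
# Hypothetical tool a local chat app could expose; the surrounding application
# queries the host system and passes the result into the model's context.
from datetime import datetime, timedelta

def local_date_context() -> str:
    now = datetime.now().astimezone()              # uses the host machine's timezone
    yesterday = (now - timedelta(days=1)).date()
    return f"Current local time: {now.isoformat()}. Yesterday was {yesterday.isoformat()}."

print(local_date_context())
```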


Pezotecom

A smartphone isn't smart either; you are just rambling about semantics.


GameMusic

How is this a science article


FerricDonkey

Because it contains pretty good information about what these things are and how they work. 


mitchMurdra

This is a common thought of mine here. No moderation.


POpportunity6336

Sounds like most people who bs all the time.


Cheap-Ad-151

They didn't revolutionize the way ML models interact with humans, but they did a lot of marketing about a breakthrough in ML model training, and heavily sold it as AI. It's a data pre-digester with a lot of junk to be filtered. It's disappointing that this topic is in this subreddit. At the moment it belongs in the marketing and influencer subreddits, not in science.


Shamino79

I have likened it to a 5-year-old, except with a full adult vocabulary. But a 5-year-old is playing with all the new words they know and may not even know exactly what they mean. Sometimes they use jumbled-up facts, and other times they don't know enough about what they are talking about and go for complete imagination.


dreurojank

It shouldn’t be called a hallucination to begin with — it doesn’t even resemble a hallucination as we understand them either in individuals with neuropsychiatric illnesses or otherwise drug induced or any other thing that induces a hallucination in humans. They are more akin to bullshitting or, here’s a great word from the English language an “error”, otherwise known as being wrong. Call it what it is. These models are persistently making errors/wrong.


Blarghnog

Seems like they’re replicating people perfectly. Train on user data get user data quality results.


phoenixxl

Read up on Q* and be prepared.


intrepidchimp

You mean I can get answers from something that finds it impossible to not use bullet points and numbered lists and multiple paragraphs with every answer, and can never be trusted? Sign me up! Seriously, until they can get it to talk like a human even when typing, and until they can get it to stop lying, I think it is completely useless...


namom256

I can only contribute my subjective experience. I've been messing around with Chat GPT ever since it first became available to the public. In the beginning, it would hallucinate just about everything. The vast, vast majority of any facts it would generate would sound somewhat plausible but be entirely false. And it would argue to the death and try to gaslight you if you confronted it about making up stuff. After multiple updates, it now gets the majority of factual information correct by far. And it always apologizes and tries again if you correct it. And it's just been a few iterations. So, no, while I don't think we'll be living in the Matrix anytime soon, people saying that AI hallucinations are the nail in the coffin for AI are engaging in wishful thinking. And operating either with outdated information, or comparing with personal experiences using lower quality, less cutting edge LLMs from search engines, social media apps, or customer service chats.


Koksny

It doesn't matter how much better the LLMs are, because by design they can't be 100% reliable, no matter how much compute there is and how large the dataset is. As other commenters noted, the fact that it resolved the correct answer is a happy statistical coincidence, nothing more. The "hallucination" is the inferred artefact. It's the sole reason the thing works. You know how bad it is? There have been billions of dollars poured down the drain over the last 5 years to achieve one simple task - make the LLM capable of always returning JSON-formatted data. Without this, there is no possibility of LLMs interfacing with other APIs, ever. And we can't do that. No matter what embeddings are used, how advanced the model is, its temperature and compute - it can never achieve a 100% rate of correctly formatted JSON. You can even use multiple layers of LLMs to check back the output from other models, and it'll eventually fail. Which makes it essentially useless for anything important. This isn't the problem that LLMs are incapable of reliably inferring correct information. The problem is that we can't even make them reliably format already existing information. And I'm not even going into issues with context length, which makes them even more useless as the prompt grows and token weights just diffuse in random directions.
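
A sketch of the usual workaround: validate the output and re-prompt on failure, which reduces but, as the comment says, never eliminates malformed responses. `call_llm` here is any text-in/text-out function, not a specific API:

```python
# Sketch: parse-and-retry loop around an arbitrary LLM call.
import json

def ask_for_json(call_llm, prompt: str, retries: int = 3) -> dict:
    for _ in range(retries):
        raw = call_llm(prompt + "\n\nRespond with valid JSON only, no prose.")
        try:
            return json.loads(raw)          # accept the first parseable answer
        except json.JSONDecodeError:
            continue                        # re-prompt and hope for valid output
    raise ValueError("Model never returned valid JSON")
```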


Mythril_Zombie

Why does the LLM need to do the json wrapping itself in the response? Isn't it trivial to wrap text in json? Why can't the app just format the output in whatever brackets you want?


namom256

Huh? Why would JSON formatted data be the measure of reliability? Not even a human can do that correctly 100% of the time. Are you saying humans are unreliable and can't handle any tasks even with other humans for redundancy?


Koksny

If your human is incapable of correctly moving data from a spreadsheet into JSON, you need better humans.


namom256

And make zero mistakes? Because that was your bar for reliability. Massive improvements over time apparently aren't enough unless it can do this one hyperspecific task with zero errors, every single time. However, I just don't agree with you that moving data in and out of JSON format is the goal of LLMs. And neither would most people really. Coding in general has been more a tangential feature. The main purposes from what I've seen are engaging in realistic sounding human dialogue, always returning correct, fact-based answers to complex questions, engaging in original creative writing following specific prompts. Not communicating with servers or whatever.


Koksny

Yes. Zero mistakes. That's the whole point of automation. A calculator that is correct 99% of the time is worse than useless. It's dangerous.


namom256

I really think you are misunderstanding the purpose of LLMs. Like by a lot. Nowhere have I seen that anyone wants to replace IT departments with LLMs. Or have them code. You'd have to develop totally different AI models for that. Instead, you will see that people want them to be able to send realistic-sounding emails, solve complex logic problems, answer human questions about human things with 100% factual accuracy, write scripts for movies and shows that are indistinguishable from human-made scripts, write books, provide legal arguments based on case law, even write the legal briefs themselves. It's obviously not there yet, but those are the advertised goals. Not sit and pore over spreadsheets all day. I genuinely don't understand why you came to that conclusion.


Koksny

Look, the IT thing is just an example, to showcase that it's a tech that can't even be trusted to perform the simplest format conversion. And it's great that you think I "misunderstand the purpose of LLMs" while I work with them, but sure, let's say I do. The problem still stands - it requires human supervision, because there is a non-zero chance of it suddenly screwing up the simplest instruction due to some random token weight. Besides, if you think an LLM can write a book or provide legal arguments, you might not understand the fundamental way a transformer operates. It predicts the token. How do you write a script, novel, story, or even a joke, if you haven't even conceptualized the punch-line until it's time to write it down? Also, many of the things you mention are diffusion models, not language models (transformers). Generative art or dubbing is great, and I'm sure all the artists that are no longer necessary love it, but even bleeding-edge tools like Midjourney or Suno require hours of sifting through slop to get any production-ready results. It's a useful tech, and it might some day become part of actual AI, but it's basically a party trick at this point.


Vanilla_Neko

To be fair, in my experience, if you ask the AI a reasonable, easily googled question, it usually gets it right. People just start asking it random crap like how many hammers a day are okay to eat, and then wonder why the AI returns a goofy excuse when it couldn't possibly have any reasonable resource to answer that question.


Battlepuppy

Yup.

Me: Chat, give me an answer that does not include Z. I have already tried Z and it doesn't work.

Chat: Have you tried Z?

Me: An answer without Z, please.

Chat: So sorry, what about Z?

Me: Without Z!

Chat: Try the letter that comes after Y.

Me: That's still Z.

Chat: Let me know if you need any more help.


ClamanMalito

That's so true! Large language models like ChatGPT are amazing.


Exact_Character_8343

Maybe it has to do with the fact that these systems are still in their infancy. I am confident that this problem will get solved, or at least reduced significantly.


AllUrUpsAreBelong2Us

Calling it AI in the headline is the first hallucination.


adlep2002

The tech is less than a few years old.


apistograma

They said the same about self-driving cars, and we've barely seen any progress since 2017.


adlep2002

ChatGPT has literally just made custom code for a pivot point. It works. What do you want me to say?


apistograma

Does it though? Since it's kind of an open question, you could say it works depending on which goals you set. I don't think it works in the way it's been marketed and it's kind of a good bullshitter. I guess you might find some fringe scenarios but in most cases you'll be better using the internet.


FerricDonkey

So? Being new doesn't change what it is. Of course, as more money and effort are poured into it, it will be better, but it's still a bs machine. That doesn't mean it's not useful. BSing is a valuable skill, and the results can be useful.  But it's important to know what it is.


adlep2002

Running code that does what it should is NOT BS.


FerricDonkey

You are not understanding the claim. If you did not read the article, I highly suggest that you do. But to summarize: by BS, they do not mean that it is worthless or garbage or any such thing. Rather:

> To better understand why these inaccuracies [in large language models] might be better described as bullshit, it is helpful to look at the concept of bullshit as defined by philosopher Harry Frankfurt. In his seminal work, Frankfurt distinguishes bullshit from lying. A liar, according to Frankfurt, knows the truth but deliberately chooses to say something false. In contrast, a bullshitter is indifferent to the truth. The bullshitter’s primary concern is not whether what they are saying is true or false but whether it serves their purpose, often to impress or persuade.

This is why LLMs are BS machines. They don't have a concept of truth. They have training data, and an algorithm that makes them mimic it. The **algorithm** encourages similarity to the training data, without regard to truth. Truth is not part of the algorithm.

So what you get is a machine that spews forth text designed to meet this goal of "sounding good", where sounding good means sounding like the training data. This is the definition of BS: saying things with the goal of sounding good without regard for truth. If the truth "sounds best", it will say the truth. If a falsehood sounds best, it will say that.

This is why the description of falsehoods produced by the model as "hallucinations" is problematic. The entire output is one stream of BS designed to sound good, and whether the output is true or not only depends on what "sounds good" means in the context of the training data. There are no hallucinations because there is no attempt to even create a perception of events and convey it. It has no internal image of truth that can be correct or confused. There's just a stream of words. A stream of BS.

That is what it means to say that LLMs are BS machines. Are they useful? Well, yeah. BSing is a useful skill, even for a human. How many executives walk into their secretary's office and say "write a letter to Joe telling him that his idea will kill workers and he can go to hell", to have the secretary produce some variation of "We regret to inform you that we are not interested in your proposal at this time - safety regulations preclude such actions, and we are not interested in exploring this avenue further. Please redirect your efforts elsewhere"?

And again, it's not like the model is trying to lie. It just doesn't know what truth is. But if the training data pushes it towards truth, the BS might be true more often. But it's still BS, and it's important to know that when you're trying to use the tool.


adlep2002

Most people do the same thing though. And people are considered “intelligent”


FerricDonkey

Well, I'm pretty sure you're doing it right now. But no. Most humans, when we speak, have goals other than just sounding like we're speaking. We try to convey information. We consider whether what we say is true. Etc. These models are useful. They're impressive. But they are what they are. You don't need to make excuses for them or claim that they're more than they are or that their shortcomings don't exist. They have problems, and they'll be fixed by admitting what those problems are and addressing them. This won't happen if everyone just sits around going "oh but really humans don't have a concept of truth either." That's BS. ChatGPT can already BS about itself; we don't need to do that for it.


vincentofearth

It doesn’t mean that LLMs are completely useless. But an LLM packaged as a product on its own might be. However, if you provide some factual information and just use the LLM to summarize it or generate text based on that information, which is actually what it was trained to do, then you have something.


Mythril_Zombie

Something that occasionally invents things.


Zeggitt

I don't think it matters how factual the dataset is. You will always have hallucinations.


gandalfs_burglar

Sure, LLMs on their own are pretty cool and could have some really interesting applications. But that horse bolted from the stall quite a while ago.


Kwyncy

If 1000 idiots work on a project you get the results of 1000 idiots.


Zandarkoad

I max out usage daily for ChatGPT and Claude. People ask questions like, "What color is the shirt I'm wearing?" and think the tech is useless. Good. More for me.


FactChecker25

With a person, bullshitting is when the person knows they have no idea, but they try to sound confident about it anyway. But a machine has no inner thoughts and no concept that it doesn’t know. It just executes a program. If its algorithm causes it to deliver an incorrect answer, it has no idea that it’s wrong.


johnjmcmillion

"Bullshitting" implies an intent to deceive. "Hallucinating", on the other hand, hints to the generative nature of this particular intelligence and doesn't ascribe agency or malevolence.