You know how Google’s new feature called AI Overviews is prone to spitting out wildly incorrect answers to search queries? In one instance, AI Overviews told a user to use glue on pizza to make sure the cheese won’t slide off (pssst…please don’t do this.)

Well, according to an interview at The Verge with Google CEO Sundar Pichai published earlier this week, just before criticism of the outputs really took off, these “hallucinations” are an “inherent feature” of AI large language models (LLMs), which are what drive AI Overviews, and this feature “is still an unsolved problem.”

  • givesomefucks@lemmy.world
    English · 297 up, 32 down · 4 months ago

    They keep saying it’s impossible, when the truth is it’s just expensive.

    That’s why they won’t do it.

    You could only train AI with good sources (scientific literature, not social media) and then pay experts to talk with the AI for long periods of time, giving feedback directly to the AI.

    Essentially, if you want a smart AI you need to send it to college, not drop it off at the mall unsupervised for 22 years and hope for the best when you pick it back up.

    • Zarxrax@lemmy.world
      English · 31 up, 1 down · 4 months ago

      In addition to the other comment, I’ll add that just because you train the AI on good and correct sources of information, it still doesn’t necessarily mean that it will give you a correct answer all the time. It’s more likely, but not ensured.

      • RidcullyTheBrown@lemmy.world
        English · 7 up · 4 months ago

        Yes, thank you! I think this should be written in capitals somewhere so that people could understand it quicker. The answers are not wrong or right on purpose. LLMs don’t have any way of distinguishing between the two.

    • Leate_Wonceslace@lemmy.dbzer0.com
      English · 23 up · 4 months ago

      it’s just expensive

      I’m a mathematician who’s been following this stuff for about a decade or more. It’s not just expensive. Generative neural networks cannot reliably evaluate truth values; it will take time to research how to improve AI in this respect. This is a known limitation of the technology. Closely controlling the training data would certainly make the information more accurate, but that won’t stop it from hallucinating.

      The real answer is that they shouldn’t be trying to answer questions using an LLM, especially because they had a decent algorithm already.

      • Aceticon@lemmy.world
        English · 6 up · edited · 4 months ago

        Yeah, I learned neural networks way back when those things were starting out in the late 80s/early 90s, use AI (though seldom machine learning) in my job, and really dove into how LLMs are put together when it started getting important. These things operate entirely at the language level, on the probabilities of language tokens appearing in certain places given context. They do not translate from language to meaning and back at all, so there is no logic going on there, nor is there any possibility of it.

        Maybe some kind of ML can help do the transformation from the language space to a meaning space where things can be operated on by logic, and then back, but LLMs aren’t a way to do it: whatever internal representation spaces (yeah, plural) they use in their inner layers aren’t those of meaning, and we don’t really have a way to apply logic to them.

      • sudo42@lemmy.world
        English · 2 up, 1 down · 4 months ago

        It’s worse than that. “Truth” can no more reliably be found by machines than it can be by humans. We’ve spent centuries of philosophy trying to figure out what is “true”. The best we’ve gotten is some concepts we’ve been able to convince a large group of people to agree to.

        But even that is shaky. For a simple example, we mostly agree that bleach will kill “germs” in a petri dish. In a single announcement, we saw 40% of the American population accept as “true” that bleach would also cure them if injected straight into their veins.

        We’re never going to teach machines to reason for us when we meatbags constantly change truth to be what will be profitable to some at any given moment.

        • Leate_Wonceslace@lemmy.dbzer0.com
          English · 2 up, 1 down · 4 months ago

          Are you talking about epistemics in general or alethiology in particular?

          Regardless, the deep philosophical concerns aren’t really germane to the practical issue of just getting people to stop falling for obvious misinformation or people being wantonly disingenuous to score points in the most consequential game of numbers-go-up.

    • jeeva@lemmy.world
      English · 6 up, 1 down · 4 months ago

      That’s just not how LLMs work, bud. It doesn’t have understanding to improve, it just munges the most likely word next in line. It, as a technology, won’t advance past that level of accuracy until it’s a completely different approach.
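
As a rough illustration of “the most likely word next in line”, here is a toy bigram model in Python. It is only a sketch: real LLMs use neural networks over long contexts rather than word-pair counts, and the corpus and the function names (`train_bigrams`, `next_word_probs`, `most_likely_next`) are made up for this example. What it shows is that the prediction follows frequency in the training text, not truth.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into a probability distribution."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in followers.items()}


def most_likely_next(counts, prev):
    """Return the single most probable next word, or None if `prev` is unseen."""
    probs = next_word_probs(counts, prev)
    return max(probs, key=probs.get) if probs else None


# Hypothetical training text: "glue" follows "add" more often than "cheese" does.
corpus = "add glue to pizza . add glue to paper . add cheese to pizza ."
model = train_bigrams(corpus)

print(next_word_probs(model, "add"))
print(most_likely_next(model, "add"))
```

With this corpus the model continues “add” with “glue” simply because that pair occurs most often; nothing in the counts says whether the result is good advice.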

    • Canary9341@lemmy.ml
      English · 3 up, 1 down · 4 months ago

      They could also perform some additional iterations with other models on the result to verify it, or even to enrich it; but we come back to the issue of costs.
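
The “additional iterations with other models” idea could look roughly like the Python sketch below. Everything here is an assumption for illustration: `verify_by_agreement` is a hypothetical helper, and the three checker functions are trivial stand-ins for what would really be separate model calls. Each checker is one more call per answer, which is where the cost issue comes from.

```python
def verify_by_agreement(question, answer, checkers, threshold=0.5):
    """Accept `answer` only if more than `threshold` of the checkers approve it.

    Each checker is a callable (question, answer) -> bool. In a real system
    each one would be another model invocation, so cost grows with their number.
    """
    if not checkers:
        return False
    votes = [bool(checker(question, answer)) for checker in checkers]
    return sum(votes) / len(votes) > threshold


# Stand-in checkers (hypothetical): two reject any answer mentioning glue,
# one approves everything, mimicking a verifier that is itself unreliable.
def checker_a(question, answer):
    return "glue" not in answer.lower()


def checker_b(question, answer):
    return "glue" not in answer.lower()


def checker_c(question, answer):
    return True


checkers = [checker_a, checker_b, checker_c]
question = "How do I keep cheese from sliding off pizza?"

print(verify_by_agreement(question, "Add glue to the sauce.", checkers))
print(verify_by_agreement(question, "Let the pizza cool briefly.", checkers))
```

Agreement between checkers only lowers the chance of a bad answer getting through; if the checkers share the same blind spots, they can still agree on something wrong.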

    • scarabic@lemmy.world
      English · 1 up · 4 months ago

      I think you’re right that with sufficient curation and highly structured monitoring and feedback, these problems could be much improved.

      I just think that to prepare an AI, in such a way, to answer any question reliably and usefully would require more human resources than there are elementary particles in the universe. We would be better off connecting live college-educated human operators to Google search to individually assist people.

      So I don’t know how helpful it is to say “it’s just expensive” when the entire point of AI is to be lower cost than a battalion of humans.

    • thefactremains@lemmy.world
      English · 2 up, 1 down · 4 months ago

      Why not solve it before training the AI?

      Simply make it clear that this tech is experimental, then provide sources and context with every result. People can make their own assessment.

    • Excrubulent@slrpnk.net
      English · 1 up · 4 months ago

      No, he’s right that it’s unsolved. Humans aren’t great at reliably knowing truth from fiction either. If you’ve ever been in a highly active comment section you’ll notice certain “hallucinations” developing, usually because someone came along and sounded confident and everyone just believed them.

      We don’t even know how to get full people to do this, so how does a fancy Markov chain do it? It can’t. I don’t think you solve this problem without AGI, and that’s something AI evangelists don’t want to think about because then the conversation changes significantly. They’re in this for the hype bubble, not the ethical implications.

      • dustyData@lemmy.world
        English · 0 up, 1 down · 4 months ago

        We do know. It’s called critical thinking education. This is why we send people to college. Of course there are highly educated morons, but we are hedging our bets. This is why the dismantling or co-opting of education is the first thing every single authoritarian does. It makes it easier to manipulate the masses.

        • helenslunch@feddit.nl
          English · 1 up, 1 down · 3 months ago

          It’s called critical thinking education.

          Yeah, I mean, we have that, and parents are constantly trying to dismantle it. No amount of “critical thinking education” can undo decades of brainwashing from parents and local culture.

    • helenslunch@feddit.nl
      English · 1 up, 1 down · 3 months ago

      You could only train AI with good sources

      I mean yes, but also no. If you only train it with “good sources” then you miss out on a whole bunch of other valuable information.

      Just like scholar.google.com only has “good sources” but generally it’s not going to have the information that 90% of your search queries will be about.

    • redfellow@sopuli.xyz
      English · 1 up, 2 down · edited · 4 months ago

      The truth is, this is the perfect type of comment to make an LLM hallucinate. Sounds right, very confident, but completely full of bullshit. You can’t just throw money at every problem and get it solved fast. This is an inherent flaw that can only be solved by something other than an LLM and prompt voodoo.

      They will always spout nonsense. No way around it, for now. A probabilistic neural network has zero, will always have zero, and cannot have anything but zero concept of fact, only a statistically probable result for a given prompt.

      It’s a politician.

    • RBG@discuss.tchncs.de
      English · 0 up, 1 down · 4 months ago

      I’ll let you in on a secret: scientific literature has its fair share of bullshit too. The issue is, it is much harder to figure out its bullshit. Unless it’s the most blatant horseshit you’ve scientifically ever seen. So while it absolutely makes sense to say, let’s just train these on good sources, there is no source that is just that. Of course it is still better to do it like that than as they do it now.

      • givesomefucks@lemmy.world
        English · 1 up · 4 months ago

        The issue is, it is much harder to figure out its bullshit.

        Google AI suggested you put glue on your pizza because a troll said it on Reddit once…

        Not all scientific literature is perfect. Which is one of the many factors that will still make my plan expensive and time consuming.

        You can’t throw a toddler in a library and expect them to come out knowing everything in all the books.

        AI needs that guided teaching too.