The research from Purdue University, first spotted by news outlet Futurism, was presented earlier this month at the Computer-Human Interaction Conference in Hawaii and looked at 517 programming questions on Stack Overflow that were then fed to ChatGPT.

“Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose,” the new study explained. “Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style.”

Disturbingly, programmers in the study didn’t always catch the mistakes being produced by the AI chatbot.

“However, they also overlooked the misinformation in the ChatGPT answers 39% of the time,” according to the study. “This implies the need to counter misinformation in ChatGPT answers to programming questions and raise awareness of the risks associated with seemingly correct answers.”

  • NotMyOldRedditName@lemmy.world
    6 months ago

    My experience with an AI coding tool today.

    Me: Can you optimize this method.

    AI: Okay, here’s an optimized method.

    Me seeing the AI completely removed a critical conditional check.

    Me: Hey, you completely removed this check with variable xyz

    AI: oops you’re right, here you go I fixed it.

    It did this 3 times on 3 different optimization requests.

    It was 0 for 3

    Although there were some good suggestions once you got past the blatant first error.
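The failure mode described above, an "optimized" rewrite that quietly drops a conditional, is the kind of thing a small behavioral comparison can catch before the change is accepted. The following is only a minimal sketch: the function names, the discount logic, and the `is_member` check are hypothetical stand-ins for the commenter's method and its "variable xyz" check, which are not shown in the thread.

```python
def apply_discount(price, discount, is_member):
    """Original method: the membership check is the critical conditional."""
    if not is_member:
        return price
    return price - price * discount


def apply_discount_optimized(price, discount, is_member):
    """Hypothetical 'optimized' rewrite that silently dropped the check."""
    return price * (1 - discount)


def behaves_the_same(candidate):
    """Compare a rewrite against the original on inputs that hit both branches."""
    cases = [(100.0, 0.2, True), (100.0, 0.2, False)]
    return all(
        abs(candidate(*case) - apply_discount(*case)) < 1e-9
        for case in cases
    )


if __name__ == "__main__":
    # The original trivially matches itself; the rewrite fails on the
    # non-member case because the guard is gone.
    print(behaves_the_same(apply_discount))
    print(behaves_the_same(apply_discount_optimized))
```

The important design choice is that the test cases exercise both sides of the conditional. A comparison that only used member inputs would pass the broken rewrite.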

    • piecat@lemmy.world
      6 months ago

      My favorite is when I ask for something and it gets stuck in a loop, pasting the same comment over and over

  • efstajas@lemmy.world
    6 months ago

    Yeah it’s wrong a lot but as a developer, damn it’s useful. I use Gemini for asking questions and Copilot in my IDE personally, and it’s really good at doing mundane text editing bullshit quickly and writing boilerplate, which is a massive time saver. Gemini has at least pointed me in the right direction with quite obscure issues or helped pinpoint the cause of hidden bugs many times. I treat it like an intelligent rubber duck rather than expecting it to just solve everything for me outright.

    • Jimmyeatsausage@lemmy.world
      6 months ago

      Same here. It’s good for writing your basic unit tests, and the explain feature is useful for getting your head wrapped around complex syntax, especially as bad as searching for useful documentation has gotten on Google and ddg.

    • InternetPerson@lemmings.world
      6 months ago

      That’s a good way to use it. Like every technological evolution it comes with risks and downsides. But if you are aware of that and know how to use it, it can be a useful tool.
      And as always, it only gets better over time. One day we will probably rely more heavily on such AI tools, so it’s a good idea to adapt quickly.

  • zelifcam@lemmy.world
    6 months ago

    “Major new Technology still in Infancy Needs Improvements”

    – headline every fucking day

    • aname@lemmy.one
      6 months ago

      “Corporation using immature technology in productions because it’s cool”

      More news at eleven

      • capital@lemmy.world
        6 months ago

        This is scary because up to now, all software released worked exactly as intended so we need to be extra special careful here.

        • otp@sh.itjust.works
          6 months ago

          Yes, and we never have and never will put lives in the hands of software developers before!

          /s…for this comment and the above one, for anyone who needs it

  • BeatTakeshi@lemmy.world
    6 months ago

    Who would have thought that an artificial intelligence trained on human intelligence would be just as dumb

    • capital@lemmy.world
      6 months ago

      Hm. This is what I got.

      I think about 90% of the screenshots we see of LLMs failing hilariously are doctored. Lemmy users really want to believe it’s that bad though.

      Edit:

      • otp@sh.itjust.works
        6 months ago

        I’ve had lots of great experiences with ChatGPT, and I’ve also had it hallucinate things.

        I saw someone post an image of a simplified riddle, where ChatGPT tried to solve it as if it were the entire riddle, but it added extra restrictions and gave a confusing response. I tried it for myself and got an even better answer.

        Prompt (no prior context except saying I have a riddle for it):

        A man and a goat are on one side of the river. They have a boat. How can they go across?

        Response:

        The man takes the goat across the river first, then he returns alone and takes the boat across again. Finally, he brings the goat’s friend, Mr. Cabbage, across the river.

        I wish I was witty enough to make this up.

        • capital@lemmy.world
          6 months ago

          I reproduced that one and so I believe that one is true.

          I looked up the whole riddle and see how it got confused.

          It happened on 3.5 but not 4.

            • capital@lemmy.world
              6 months ago

              Evidently I didn’t save the conversation but I went ahead and entered the exact prompt above into GPT-4. It responded with:

              The man can take the goat across the river in the boat. After reaching the other side, he can leave the goat and return alone to the starting side if needed. This solution assumes the boat is capable of carrying at least the man and the goat at the same time. If there are no further constraints like a need to transport additional items or animals, this straightforward approach should work just fine!

      • AIhasUse@lemmy.world
        6 months ago

        Yesterday, someone posted a doctored one on here saying everyone eats it up even if you use a ridiculous font in your poorly doctored photo. People who want to believe are quite easy to fool.

  • Subverb@lemmy.world
    6 months ago

    ChatGPT and GitHub Copilot are great tools, but they’re like a chainsaw: if you apply them incorrectly or become too casual and careless with them, they will kick back at you and fuck your day up.

    • agelord@lemmy.world
      6 months ago

      In my experience, if you have the necessary skills to point it in the right direction, you don’t need to use it in the first place

      • andallthat@lemmy.world
        6 months ago

        it’s just a convenience, not a magic wand. Sure relying on AI blindly and exclusively is a horrible idea (that lots of people peddle and quite a few suckers buy), but there’s room for a supervised and careful use of AI, same as we started using google instead of manpages and (grudgingly, for the older of us) tolerated the addition of syntax highlighting and even some code completion to all but the most basic text editors.

    • aidan@lemmy.world
      6 months ago

      It can, it also sometimes can’t unless you ask it “could it be x answer”

  • corroded@lemmy.world
    6 months ago

    I will resort to ChatGPT for coding help every so often. I’m a fairly experienced programmer, so my questions usually tend to be somewhat complex. I’ve found that it’s extremely useful for those problems that fall into the category of “I could solve this myself in 2 hours, or I could ask AI to solve it for me in seconds.” Usually, I’ll get a working solution, but almost every single time, it’s not a good solution. It provides a great starting-off point to write my own code.

    Some of the issues I’ve found (speaking as a C++ developer) are: Variables not declared “const,” extremely inefficient use of data structures, ignoring modern language features, ignoring parallelism, using an improper data type, etc.

    ChatGPT is great for generating ideas, but it’s going to be a while before it can actually replace a human developer. Producing code that works isn’t hard; producing code that’s good requires experience.

    • interdimensionalmeme@lemmy.ml
      6 months ago

      Yes, and even if it was only right 1% of the time it would still be amazing

      Also hallucinations are not a universally bad thing.

  • reksas@sopuli.xyz
    6 months ago

    I just use it to get ideas about how to do something or ask it to write short functions for stuff I wouldn’t know that well. I tried using it to create a graphical UI for a script but that was a constant struggle to keep it on track. It managed to create something that kind of worked but it was like trying to hold 2 magnets of opposing polarity together and I had to constantly reset the conversation after it got “corrupted”.

    It’s a useful tool if you don’t rely on it, use it correctly and don’t trust it too much.

  • foremanguy@lemmy.ml
    6 months ago

    We have to wait a bit to have a useful assistant (but maybe something like Copilot or more code-focused AI is better)

  • NounsAndWords@lemmy.world
    6 months ago

    GPT-2 came out a little more than 5 years ago, it answered 0% of questions accurately and couldn’t string a sentence together.

    GPT-3 came out a little less than 4 years ago and was kind of a neat party trick, but I’m pretty sure answered ~0% of programming questions correctly.

    GPT-4 came out a little less than 2 years ago and can answer 48% of programming questions accurately.

    I’m not talking about morality, or creativity, or good/bad for humanity, but if you don’t see a trajectory here, I don’t know what to tell you.

      • otp@sh.itjust.works
        6 months ago

        I appreciate the XKCD comic, but I think you’re exaggerating that other commenter’s intent.

        The tech has been improving, and there’s no obvious reason to assume that we’ve reached the peak already. Nor is the other commenter saying we went from 0 to 1 and so now we’re going to see something 400x as good.

        • stufkes@lemmy.world
          6 months ago

          I think the one argument for the assumption that we’re near peak already is the entire issue of AI learning from AI input. I think numberphile discussed a maths paper that said that to achieve the accuracy that we want, there is simply not enough data to train it on.

          That’s of course not to say that we can’t find alternative approaches

        • 14th_cylon@lemm.ee
          6 months ago

          I appreciate the XKCD comic, but I think you’re exaggerating that other commenter’s intent.

          i don’t think so. the other commenter clearly rejects the criticism (1) and implies that the existence of an upward trajectory means it will one day overcome the problem (2).

          while (1) is well documented fact right now, (2) is just wishful thinking right now.

          hence the comic, because “the trajectory” doesn’t really mean anything.

          • otp@sh.itjust.works
            6 months ago

            In general, “The technology is young and will get better with time” is not just a reasonable argument, but almost a consistent pattern. Note that XKCD’s example is about events, not technology. The comic would be relevant if someone were talking about events happening, or something like sales, but not about technology.

            Here, I’m not saying that you’re necessarily right or they’re necessarily wrong, just that the comic you shared is not a good fit.

            • 14th_cylon@lemm.ee
              6 months ago

              In general, “The technology is young and will get better with time” is not just a reasonable argument, but almost a consistent pattern. Note that XKCD’s example is about events, not technology.

              yeah, no.

              try to compare horse speed with ford t and blindly extrapolate that into the future. look at the moore’s law. technology does not just grow upwards if you give it enough time, most of it has some kind of limit.

              and it is not out of realm of possibility that llms, having already stolen all of human knowledge from the internet, having found it is not enough and spewing out bullshit as a result of that monumental theft, have already reached it.

              that may not be the case for every machine learning tool developed for some specific purpose, but blind assumption it will just grow indiscriminately, because “there is a trend”, is overly optimistic.

              • otp@sh.itjust.works
                6 months ago

                I don’t think continuing further would be fruitful. I imagine your stance is heavily influenced by your opposition to, or dislike of, AI/LLMs

                • 14th_cylon@lemm.ee
                  6 months ago

                  oh sure. when someone says “you can’t just blindly extrapolate a curve”, there must be some conspiracy behind it, it absolutely cannot be because you can’t just blindly extrapolate a curve 😂

        • 31337@sh.itjust.works
          6 months ago

          We’re close to peak using current NN architectures and methods. All this started with the discovery of the transformer architecture in 2017. Advances in architecture and methods have been fairly small and incremental since then. The advancements in performance have mostly just come from throwing more data and compute at the models, and diminishing returns have been observed. GPT-3 cost something like $15 million to train. GPT-4 is a little better and cost something like $100 million to train. If the next model costs $1 billion to train, it will likely be a little better.

      • NounsAndWords@lemmy.world
        6 months ago

        Perhaps there is some line between assuming infinite growth and declaring that this technology that is not quite good enough right now will therefore never be good enough?

        Blindly assuming no further technological advancements seems equally as foolish to me as assuming perpetual exponential growth. Ironically, our ability to extrapolate from limited information is a huge part of human intelligence that AI hasn’t solved yet.

        • 14th_cylon@lemm.ee
          6 months ago

          will therefore never be good enough?

          no one said that. but someone did try to reject the fact it is demonstrably bad right now, because “there is a trajectory”.

    • Snot Flickerman@lemmy.blahaj.zone
      6 months ago

      https://www.reuters.com/technology/openai-ceo-altman-says-davos-future-ai-depends-energy-breakthrough-2024-01-16/

      Speaking at a Bloomberg event on the sidelines of the World Economic Forum’s annual meeting in Davos, Altman said the silver lining is that more climate-friendly sources of energy, particularly nuclear fusion or cheaper solar power and storage, are the way forward for AI.

      “There’s no way to get there without a breakthrough,” he said. “It motivates us to go invest more in fusion.”

      It’s a good trajectory, but when you have people running these companies saying that we need “energy breakthroughs” to power something that gives more accurate answers in the face of a world that’s already experiencing serious issues arising from climate change…

      It just seems foolhardy if we have to burn the planet down to get to 80% accuracy.

      I’m glad Altman is at least promoting nuclear, but at the same time, he has his fingers deep in a nuclear energy company, so it’s not like this isn’t something he might be pushing because it benefits him directly. He’s not promoting nuclear because he cares about humanity, he’s promoting nuclear because he has deep investments in nuclear energy. That seems like just one more capitalist trying to corner the market for themselves.

    • egeres@lemmy.world
      6 months ago

      Lemmy seems to be very near-sighted when it comes to the exponential curve of AI progress. I think that’s because the community is very anti-corp

      • NounsAndWords@lemmy.world
        6 months ago

        No clue? Somewhere between a few years (assuming some unexpected breakthrough) and many decades? The consensus from experts (of which I am not one) seems to be somewhere in the 2030s/40s for AGI. I’m guessing accuracy will probably be more on a topic-by-topic basis; LLMs might never even get there, or only for things they’ve been heavily trained on. If predictive text doesn’t do it then I would be betting on whatever Yann LeCun is working on.

  • exanime@lemmy.today
    6 months ago

    You have no idea how many times I mentioned this observation from my own experience and people attacked me like I called their baby ugly

    ChatGPT in its current form is good help, but nowhere near ready to actually replace anyone

    • UnderpantsWeevil@lemmy.world
      6 months ago

      A lot of firms are trying to outsource their dev work overseas to communities of non-English speakers, and then handing the result off to a tiny support team.

      ChatGPT lets the cheap low skill workers churn out miles of spaghetti code in short order, creating the illusion of efficiency for people who don’t know (or care) what they’re buying.

  • Petter1@lemm.ee
    6 months ago

    I guess it depends on the programming language… With python, I got very fast great results. But python is all about quick and dirty 😂

    • anlumo@lemmy.world
      6 months ago

      In Rust, it’s not great. It can’t do proper memory management in the language, which is pretty essential.

      • Petter1@lemm.ee
        6 months ago

        Well, if you use free ChatGPT you only have knowledge until 2022, maybe that’s the reason

  • tsonfeir@lemmy.world
    6 months ago

    If you ask the wrong questions you get the wrong results. If you don’t check the response for accuracy, you get invalid answers.

    It’s just a tool. Don’t use it wrong because you’re lazy.