• Vanth@reddthat.com
    4 months ago

    I wonder what kind of contract they went with.

    I can’t imagine this being a great long-term deal for Google. There’s minimal good new content being created on Reddit. Searching for useful information mostly brings up old posts, while new posts are heavily spam-generated or designed to support AI learning.

    I imagine buying access to historic Reddit content from creation to ~2020 would be valuable. Paying for ongoing access to new content is going to be far less valuable, and it turns into AI devolution once we get to where AI is learning from other AI and spiraling into progressively worse outputs.

    • tal@lemmy.today
      4 months ago

      I wonder what kind of contract they went with.

      https://www.reuters.com/technology/reddit-ai-content-licensing-deal-with-google-sources-say-2024-02-22/

      SAN FRANCISCO, Feb 21 (Reuters) - Social media platform Reddit has struck a deal with Google (GOOGL.O) to make its content available for training the search engine giant’s artificial intelligence models, three people familiar with the matter said.

      The contract with Alphabet-owned Google is worth about $60 million per year, according to one of the sources.

      For perspective:

      https://www.cbsnews.com/news/google-reddit-60-million-deal-ai-training/

      In documents filed with the Securities and Exchange Commission, Reddit said it reported net income of $18.5 million — its first profit in two years — in the October-December quarter on revenue of $249.8 million.

      So if you annualize that, Reddit’s seeing revenue of about $1 billion/year, and net income of about $74 million/year.
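
      A quick sketch of that annualization, assuming the simplest possible method (multiply the one quarter by four). The function name is just for illustration, and a real full-year figure could differ since quarters are not identical:

```python
def annualize_quarter(quarterly_millions: float) -> float:
    """Naive annualization: assume all four quarters look like this one."""
    return quarterly_millions * 4


# Figures from the CBS quote above, in millions of USD.
q4_revenue = 249.8
q4_net_income = 18.5

annual_revenue = annualize_quarter(q4_revenue)
annual_net_income = annualize_quarter(q4_net_income)

print(f"Revenue: ~${annual_revenue:,.1f}M/year")
print(f"Net income: ~${annual_net_income:,.1f}M/year")
```

      That lands just under $1 billion/year in revenue and at $74 million/year in net income.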

      Given that Reddit granting exclusive indexing to Google happened at about the same time, I would assume that the AI-training deal included the exclusive indexing agreement, but maybe it’s separate.

      My gut feeling is that the exclusivity is probably worth more than $60 million/year, and that Google’s probably getting a pretty good deal. Google did not buy Reddit, even though Google’s done some pretty big acquisitions, like YouTube, and that’d have been another way for Google to get exclusive access. So I’d think that this deal is probably better for Google than buying Reddit. Reddit’s market capitalization is $10 billion, so Google is maybe paying 0.6% of the value of Reddit per year to have exclusive training rights to their content and to be the only search engine indexing them; aside from Reddit users themselves running into content in subreddits, I’d guess that those two uses are probably the main ways in which one might leverage the content there.
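
      For the 0.6% figure, here is a rough sketch of the arithmetic, taking the $60 million/year from the Reuters piece and the $10 billion market cap as given. The helper name is made up for illustration:

```python
def percent_of(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return part / whole * 100


# Both amounts in millions of USD.
deal_per_year = 60.0      # reported annual value of the licensing deal
market_cap = 10_000.0     # the $10 billion market capitalization cited above

share = percent_of(deal_per_year, market_cap)
print(f"Deal is about {share:.1f}% of market cap per year")
```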

      Plus, my impression is that the idea that a number of companies have – which may or may not be valid – is that this is the beginning of the move away from search engines. Like, the idea is that down the line, the typical person doesn’t use a search engine to find a webpage somewhere that’s a primary source to find material. Instead, they just query an AI. That compiles all the data that it can see and spits out an answer. Saves some human searcher time and reduces complexity, and maybe can solve some problems if AIs can ultimately do a better job of filtering out erroneous information than humans. We definitely aren’t there yet in 2024, but if that’s where things are going, I think that it might make a lot of strategic sense for Google. If Google can lock up major sources of training data, keep Microsoft out, then it’s gonna put Microsoft in a difficult spot if Microsoft is gunning for the same thing.

      • Vanth@reddthat.com
        4 months ago

        Cool, thank you. You seem to know quite a bit about this stuff.

        If we do end up at a point without search engines, where AI does the search and summarizes an answer, what do you think their level of ability to tie back to source material will be?

        I’m thinking of cases like asking about a technical detail for a hobby: “how do I get x to work”. I don’t necessarily want a response like “connect blue wire to red”. What I really want is the forum posts where various people discuss the troubleshooting and solutions. If an AI search can’t get me to those forums, it’s of little value to me. And when I do figure out an answer that works for my application, I’m not tied into that forum to share my findings (and generate new content for the AI to index).

        Related to that, I’m thinking about these stories of lawyers relying on AI to write their briefs, and the AI cites non-existent cases as if they were real. It seems to me, not at all a programmer, that getting an AI to the point where it knows what’s real and what’s a hallucination would be a challenge. And until we get to that point, it’s hard to put full trust into an AI search.