Welcome to the Era of Experience [pdf]

(storage.googleapis.com)

104 points | by Siah 1 day ago

13 comments

  • kubb 1 day ago
    Wowzers, it’s happening imminently. Great to know that we can expect agents that learn from experience very very soon!

    When they’re here I’ll make an upvote-farming bot that learns from experience how not to get caught and unleash it on HN.

    After that I’ll make an agent that runs a SaaS company that learns from experience how to make money and I’ll finally be able to chill out and play video games.

    That last thing I’ll actually do myself, I won’t use an agent, although the experience revolution started with games. Ironic!

    But I’ll make an agent that learns from experience what kind of games I like and how to make them. This way I’ll have an endless supply.

    • tempodox 1 day ago
      > very very soon!

      If they're not careful, they'll be sued for copyright violation of the Real Soon Now™ brand.

    • shiandow 1 day ago
      We don't need an agent to do all that.

      We just need it to get better at building agents.

    • trollbridge 1 day ago
      There are days when I feel like I’m a not-so-advanced LLM, just spewing forth text in documents nobody is going to read.
  • jgbmlg 1 day ago
    It's ironic that machine intelligence is advancing during an era when human intelligence is declining.
    • chneu 1 day ago
      It's not. Every time there's a new form of media or communication there's an uptick in "bad actors". Think yellow journalism, or any of the moral panics around TV programming. Even back when the printing press was invented there was an uptick in troll behavior. One of the Green brothers posited that Martin Luther was really just a pamphlet troll.

      With social media and the Internet, stupid just got louder. I don't think people got stupid.

      • codeflo 1 day ago
        Amplifying stupid can be very deadly, though. In some sense, the printing press caused the Thirty Years' War, and radio brought us World War II. Eventually, society will adapt. I just wish we could find a way to adapt faster than the bad actors do.
        • KineticLensman 1 day ago
          > In some sense, the printing press caused the Thirty Years' War,

          Well, the Thirty Years' War definitely showed a technology-driven speed-up. Before the printing press we had wars that lasted for 100 years [0].

          [0] https://en.wikipedia.org/wiki/Hundred_Years%27_War

        • daseiner1 1 day ago
          I think it’s the consensus that yellow journalism directly led to the Spanish-American War via Hearst.
        • disqard 1 day ago
          (continuing in that vein, and taking the liberty of making a giant oversimplification):

          ...and TV brought us Ronald Reagan, and the Internet gave us Trump as POTUS.

      • sdsd 1 day ago
        Yes and no. In swaths of the world, we're actually observing a reverse Flynn effect and IQ has been dropping, in some places for decades.

        Eg: https://www.popularmechanics.com/science/a43469569/american-...

        • _Algernon_ 1 day ago
          This started long before the internet.
          • trollbridge 1 day ago
            Started in 2006 in Denmark, and seemed to start a few years ago in the U.S., coinciding with smartphones (which I think will make us even dumber).
        • jimbob45 1 day ago
          IQ tests are administration-sensitive and have changed dramatically since the beginning of such a Flynn effect study. The population makeup of many countries has changed in recent years to include many immigrants for whom the study would make exceedingly little sense to include. IQ tests do not cover and do not claim to cover a comprehensive view of human intelligence, famously lacking verbal and social components entirely. It is possible past IQ tests were simply overtuned and we’re now seeing the natural correction.
      • croes 1 day ago
        Not just louder, it got into power.
      • baxtr 1 day ago
        Yes. Exactly right.

        Also well documented. Anyone interested should read The Attention Merchants by Tim Wu.

    • dullcrisp 1 day ago
      Maybe machine intelligence only seems to be advancing from the perspective of human intelligence
      • teberl 1 day ago
        I like that thought.
    • unsnap_biceps 1 day ago
      I feel like declining human intelligence is a result of advancing machine intelligence. Computers are a force multiplier and societal pressure towards building intelligence is reduced.
      • whatnow37373 1 day ago
        So the AGI/ASI problem might solve itself: we slowly become incapable of iterating on the problem while existing AI is not nearly advanced enough to pick up the slack.

      It’s quite beautiful. Once a civilization tries to build machine intelligence, it slowly degrades its own capacity during the process, thus eventually losing all hope of ever achieving its goal - assuming it still understands that goal at that point. Maybe it’s an algorithm in the Universe to keep us from being naughty.

    • CuriouslyC 1 day ago
      The human brain optimizes for efficiency; if extra intelligence doesn't confer survival benefits, it'll be lost. I can't imagine that intelligence doesn't confer survival and reproductive benefits, though; it's more likely that the gradient of survival and reproduction between the most and least intelligent has shrunk. In a sense civilization is coddling the weak, and humanity is getting weaker for it.
    • baxtr 1 day ago
      Is there scientific evidence for this statement?
    • huijzer 1 day ago
      Maybe on average, but I think it’s probably correlated to inequality. The kid of two Oxford professors will probably be smarter than a kid that grew up in poverty. The school system is aimed at mitigating these differences, but if on average everyone gets less intelligent maybe the school system is working poorly.
      • lazide 1 day ago
        Eh, or there is a massive effort to push as many people as possible down Maslow’s hierarchy of needs - which also shows up as being less intelligent.

        Happening right out in the open, and quite blatantly.

    • anal_reactor 1 day ago
      > when human intelligence is declining

      It's not. It's just that previously we were unaware how stupid people are, and now we're starting to understand this.

    • enaaem 1 day ago
      Most people are not stupid. They react to their emotions.
    • apwell23 1 day ago
      > era when human intelligence is declining

      Is it? I am listening to the most beautiful music that was ever created. It was created in 2024.

      • daseiner1 1 day ago
          Would citing LeBron James explain away the obesity epidemic?
        • bethekidyouwant 1 day ago
            Yes, the strongest man who ever lived is alive today. The best player of every sport is alive today. It absolutely supports the theory that the smartest person ever is alive today, etc.
      • daseiner1 23 hours ago
        what music??
  • quantumHazer 1 day ago
    Is it just me, or is this yet another PR stunt masked as a serious article, with LaTeX and all the fancy things? The graph doesn’t even make sense.

    I’m burning out from all this hypester type of thing; it’s really, really tiring.

    • tempodox 1 day ago
      > I’m burning out

      The constant barrage of excrement makes critical thinking ever harder, which is by design (it has been proven that pumping out BS en masse is way easier than debunking it). Stop using your brain already and just buy what they tell you. Thinking is done by machines now. As is pumping out BS about how good machines are at thinking.

    • TheFragenTaken 1 day ago
      There should be a term for this. I semi-unironically trust content written in Computer Modern, even if I know it's insane.
    • Jensson 1 day ago
      Maybe it was generated by an LLM?
    • gwern 1 day ago
      What fancy things? There's not a single equation in the whole thing. (I don't think this is even using Computer Modern, is it? It looks like a considerably thicker serif, and some of the characters like the lower-case 'g' look different.) Or are you referring to 'having 4 short, simple footnotes' as 'fancy' now? And also the graph makes perfect sense and tracks my own impression of RL history, what are you talking about?
      • quantumHazer 19 hours ago
        Why not use a blog dot google dot com domain or whatever? Why choose a format that resembles a published article instead?

        As for the graph, it’s too generic, it doesn’t provide any real value, other than a certain pseudo-appeal reminiscent of paper-style visuals. In my humble opinion, it’s designed to mislead people who fall for hype, much like some of Google’s recent pseudo-scientific blog posts on machine learning.

        I have deep respect for Sutton and his work, but this kind of thing is a hard pass for me.

        • gwern 17 hours ago
          > Why choose a format that resembles a published article instead?

          ...It is in a format that resembles a published article because it is going to be a published article? "This is a preprint of a chapter that will appear in the book Designing an Intelligence, published by MIT Press." on the first page.

          > As for the graph, it’s too generic

          A history of RL from DQN to AlphaProof/LLM computer use in Gemini is not 'generic', and could not be.

          > it doesn’t provide any real value

          It provides value to people who were not around then and not familiar with how RL attention peaks and crests, and a similar chart about TD-Gammon and Deep Blue, say, would likewise be useful for the many people who did not actually live through those eras, and helps contextualize material from back then. (I did, and maybe you did, and so it's not useful to us, but there exist other, younger people in the world, who are not us{{citation needed}}.) And the fact that these cycles exist is something worth reflecting on - Karpathy and others have reflected on how there were expectations of DRL leading to AGI in the 2015-2020 period, which wound up being swamped by self-supervised learning and DRL relegated to a backwater (and contributed very directly to many major events like how OA and DM became like they are now - and why Sutton is at Keen rather than DM with Silver), but now suddenly becoming super-relevant again.

          • quantumHazer 10 hours ago
            > ...It is in a format that resembles a published article because it is going to be a published article? "This is a preprint of a chapter that will appear in the book Designing an Intelligence, published by MIT Press." on the first page.

            It doesn't make any difference and doesn't invalidate my critique. It appears to be a science-communication book, so it could easily have been a web page. Even as a LaTeX-ish PDF, there were multiple ways to avoid making it resemble a scientific article; there's a precise choice being made about how to communicate. The medium is the message.

            > A history of RL from DQN to AlphaProof/LLM computer use in Gemini is not 'generic', and could not be.

            History of RL is not 'generic' and is indeed really interesting, I look forward to reading Sutton's book! But the graph in the PDF is. The y-axis is ill-defined because

            1. it combines different technologies (DQN, AlphaGo, GPT models) on a single continuum implying direct comparison.

            2. it extrapolates the evergreen hypester future trajectory toward "superhuman intelligence".

            I will not comment further on the graph; it's not an interesting visualization in my opinion, and it only serves the author's purpose for the narrative of “feeling the AGI (through RL)”. There would be more interesting ways of plotting this information for a general public. I agree it's harsh of me to say it doesn't provide value. Maybe it provides value to people who want to explore RL now, but again, the medium is the message, and this format is clearly saying out loud “look at me, I’m a paper, trust me.”

            • gwern 3 hours ago
              > It doesn't make any difference and doesn't invalidate my critique. It appears to be a science communication book, so it could easily be a web page. Even if it was a LaTeXish PDF there were multiple ways to not making it a PDF that resembles a scientific article, there's a precise choice being made about how to communicate. The medium is the message.

              If this is really your response, I agree with your original comment about you being burned-out.

              • quantumHazer 1 hour ago
                Shifting from addressing the substance to referencing a personal remark, like my joke about “burning out” from AI hype, doesn't strengthen your argument. I expected more from someone in “EA/rationalism”. In retrospect, perhaps I shouldn't be surprised when a substantive critique about presentation format gets poisoned with a personal comment rather than a discussion of the actual points about how scientific information is communicated. The medium is indeed the message. Bye.
  • artninja1988 1 day ago
    Related: https://www.deepmind.com/publications/reward-is-enough. Not sure I buy the reward-is-enough hypothesis, though. An AI with a fixed reward function doesn't seem like AGI to me.
  • simianwords 1 day ago
    I want to clarify: is the "learn from experience" still done through offline RL, and not autonomously and continuously?

    I think the core idea from the paper is that, while we have already hit the ceiling of the normal kind of data, there's a new kind of data from agents acting in the real world, with users (or someone else?) providing rewards based on some ground truth.

    Somehow I misinterpreted from this paper that this kind of learning would be autonomous and continuous.

  • ChrisArchitect 1 day ago
    Richard Sutton: "Ultimately (this) will become a chapter in the book 'Designing an Intelligence' edited by George Konidaris and published by MIT Press."
  • tempodox 1 day ago
    I suspect this text was generated by an LLM.
  • 0xCE0 7 hours ago
    I hate that "scientific papers" do not have a date of publication.
  • numpad0 1 day ago
    ...yeah? It'll be great if machines can learn and adapt on-the-fly instead of just compressing the scrapes for 1 epoch over the course of a few months into a 1TB download. Making machines learn and adapt online is what AI was always about.
  • kkfx 22 hours ago
    Honestly? Well, we're entering an era (besides possible global war, famine, etc.) where knowledge application will be more and more automated, while knowledge creation will remain human.

    Meaning we need fewer strong arms and more strong brains. Not that new anyway; the "information age" already made clear that intelligent people can do pretty much anything they want to do, while less intelligent people are constrained in what they can actually do even if they want to.

    Experience essentially means automation in the chapter's terms: something we have already "solved" can be automated by some machine. To solve new things we need humans. That's it.

    Small-potatoes new knowledge, meaning knowledge that emerges merely from crossing pre-existing knowledge, like in a literature-review paper, could be a machine game; it's not really creation of new knowledge in the end.

    BUT the real point is another one: who owns the model? LLMs make one thing clear: we need open knowledge just to train them; copyright can't be sustained anymore. But once a model is created, who owns it? The current model is dramatically dangerous, since training is expensive and not very exciting, so while it could be a community procedure, in practice it is a giant-led process, and the giant owns the result while harvesting everything from everyone. The effects implied by such an evolution are much more startling than the mere automation risk in Lisanne Bainbridge's terms https://ckrybus.com/static/papers/Bainbridge_1983_Automatica... or short/mid-term job losses.

  • baq 1 day ago
    Any such article or book must be read with https://ai-2027.com/ in the back of the mind. Exponential processes are… exponentially… dependent on starting conditions, but if takeoff really happens this decade, we’ll be at the destination before The Winds of Winter.