1 line
210 KiB
JSON
{"id":"1775672308695-kwSVtQ7dziU","videoId":"kwSVtQ7dziU","url":"https://youtu.be/kwSVtQ7dziU?si=GEPNaFHphC6LZrwm","title":"Skill Issue: Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI","type":"youtube","topicCount":16,"segmentCount":429,"createdAt":"2026-04-08T18:18:28.695Z","uploadDate":"20260320","chunks":[{"title":"Introduction and the Shift to AI Agents","summary":"Andrej Karpathy is introduced and discusses how the default workflow of coding has shifted. He describes the process as manifesting intent to AI agents for 16 hours a day.","entries":[{"text":"Andrej Karpathy: Code's not even the right verb anymore, right? But I have to, um, express my will to my agents for 16 hours a day. Manifest.","offset":0,"duration":10},{"text":"Andrej Karpathy: How can I have not just a single session of, you know, Claude code or Codex or some of these agent harnesses, how can I have more of them? How can I do that appropriately?","offset":10,"duration":4},{"text":"Andrej Karpathy: The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions.","offset":14,"duration":9},{"text":"Andrej Karpathy: But there—I mean, this is what gets to the psychosis, is that this is like infinite, and everything is skill issue.","offset":23,"duration":7},{"text":"Host: Hi listeners, welcome back to No Priors. Today I'm here with Andrej Karpathy, and we have a wide-ranging conversation for you about code agents, the future of engineering, and AI research.","offset":30,"duration":11},{"text":"Host: How more people can contribute to research, what's happening in robotics, his prediction for how agents can reach out into the real world, and education in this next age. 
Welcome, Andrej.","offset":41,"duration":13},{"text":"Host: Andrej, thanks for doing this.","offset":54,"duration":1},{"text":"Andrej Karpathy: Yeah, thank you for having me.","offset":55,"duration":2},{"text":"Host: Uh, so it's been a very exciting couple of months in AI.","offset":57,"duration":2},{"text":"Andrej Karpathy: Oh yeah, you could say that.","offset":59,"duration":2},{"text":"Host: I remember, um, walking into the office at some point and you were like, really locked in, and I was asking what you were up to and you're like, \"I just—I have to code for 16 hours a day,\" or code's not even the right verb anymore, right?","offset":61,"duration":12},{"text":"Host: But I have to, um, express my will to my agents for 16 hours a day. Manifest. Um, because like there's been a jump in capability. Uh, what's happening? Tell me about your experience.","offset":73,"duration":12}],"startTime":0},{"title":"AI Psychosis and Delegating Code to Agents","summary":"Andrej describes his state of \"AI psychosis,\" where software engineering is now bottlenecked by a human's ability to parallelize and instruct agents rather than typing code manually.","entries":[{"text":"Andrej Karpathy: Yeah, I kind of feel like I was just in this perpetual—I still am often—in this state of AI psychosis just like all the time, uh, because there was a huge unlock in what you can achieve as a person, as an individual, right?","offset":85,"duration":14},{"text":"Andrej Karpathy: Because you were bottlenecked by, you know, your typing speed and so on. But now with these agents, it really—I would say in December is when it really just—something flipped, where I kind of went from 80/20 of like, you know, uh, to like 20/80 of writing code by myself versus just delegating to agents.","offset":99,"duration":15},{"text":"Andrej Karpathy: And I don't even think it's 20/80 by now, I think it's a lot more than that. I don't think I've typed like a line of code probably since December basically. 
Uh, which is like an extremely large, uh, change.","offset":114,"duration":14},{"text":"Andrej Karpathy: Um, I was talking to it like for example, I was talking about it to for example my parents and so on, and I don't think like a normal person actually realizes that this happened or how dramatic it was.","offset":128,"duration":10},{"text":"Andrej Karpathy: Like literally like if you just find a random software engineer or something like that at their—at their desk and what they're doing, like their default workflow of, you know, building software is completely different as of basically December.","offset":138,"duration":14},{"text":"Andrej Karpathy: Uh, so I'm just like in this state of psychosis of trying to figure out like what's possible, uh, trying to push it to the limit. How is—how can I have not just a single session of, you know, um, Claude code or Codex or some of these agent harnesses?","offset":152,"duration":12},{"text":"Andrej Karpathy: How can I have more of them? How can I do that appropriately? And then how can I use these claws? What are these claws? Uh, and uh, so there's like a lot of new things. I want to be at the forefront of it, you know?","offset":164,"duration":10},{"text":"Andrej Karpathy: And I'm very antsy that I'm not at the forefront of it. And I see lots of people on Twitter doing all kinds of things and they all sound like really good ideas, and I need to be at the forefront or I feel extremely nervous.","offset":174,"duration":6},{"text":"Andrej Karpathy: And so I guess I'm just in this psychosis of like what's possible, like because it's unexplored fundamentally.","offset":180,"duration":2},{"text":"Host: Well, if you're nervous, the rest of us are—are nervous. 
We have a—uh, we have a team that we work with at Conviction that their setup is everybody is like, you know, none of the engineers write code by hand and they're all microphoned and they just like whisper to their agents all the time.","offset":182,"duration":17},{"text":"Host: It's the strangest work setting ever, uh, and I thought they were crazy and now I like fully accept I was like, oh this was the way. Like you're just ahead of it. Um, what—uh, how do you think about your own capacity now to like explore or to do new projects? Like what—what is it limited by?","offset":199,"duration":18},{"text":"Andrej Karpathy: Yeah, what is it limited by? Uh, just I think everything, like so many things, even if they don't work, I think to a large extent you feel like it's skill issue. It's not that the capability's not there.","offset":217,"duration":12},{"text":"Andrej Karpathy: It's that you just haven't found a way to string it together of what's available. Like I just didn't give good enough instructions in the AGENTS.md file or whatever it may be. I don't have a nice enough memory tool that I put in there or something like that.","offset":229,"duration":13},{"text":"Andrej Karpathy: So it all kind of feels like skill issue when it doesn't work to some extent. You want to see how you can parallelize them etc. and you want to be Peter Steinberger basically. Uh, so Peter is famous, he has a funny photo where he's in front of a monitor with lots of like—uh, he uses Codex.","offset":242,"duration":13},{"text":"Andrej Karpathy: So lots of Codex agents tiling the—the monitor, and they all take about 20 minutes if you prompt them correctly and you use the high effort, and so they all take about 20 minutes. So you have multiple, you know, 10, uh, repos checked out, and so he's just, um, going between them and giving them work.","offset":255,"duration":15},{"text":"Andrej Karpathy: It's just like you can—you can move in much larger macro actions. 
It's not just like here's a line of code, here's a new function, it's like here's a new functionality and delegate it to agent one.","offset":270,"duration":10},{"text":"Andrej Karpathy: Here's a new functionality that's not going to interfere with the other one, give it to agent two, and then try to, uh, review their work as best as you can depending on how much you care about that code.","offset":280,"duration":8},{"text":"Andrej Karpathy: Like what are these macro actions that I can like manipulate my software repository by? And like another agent is doing some like research, another agent is writing code, another one is coming up with a plan for some new implementation.","offset":288,"duration":13},{"text":"Andrej Karpathy: And so everything is just like happening in these like macro actions over your repository. Um, and you're just trying to become like really good at it and develop like a muscle memory for it is extremely, um—yeah, it's very rewarding number one because it actually works.","offset":301,"duration":14},{"text":"Andrej Karpathy: But it's also kind of like the new thing to learn. So that's why hence the psychosis.","offset":315,"duration":4},{"text":"Host: Yeah, I—I do feel like my instinct is like whenever I am waiting for an agent to complete something, the obvious thing to do is like well I can do more work, right?","offset":319,"duration":8},{"text":"Host: Like if I have access to more tokens then like I should just parallelize and tasks. 
And so that—that's very stressful because if you don't feel very bounded by your ability to spend on tokens, then you know you are the bottleneck in this system that is max capability.","offset":327,"duration":15},{"text":"Andrej Karpathy: Yeah, if you're not maximizing your subscription at least, and uh, so ideally for multiple agents, like if you run out of the quota on Codex you should switch to Claude or whatnot, I don't know, like that's what I've been trying to do a little bit.","offset":342,"duration":9},{"text":"Andrej Karpathy: And I feel nervous when I have subscription left over, uh, that just means I haven't maximized my token throughput. So I actually kind of experienced this when I was a PhD student, you would feel nervous when your GPUs are not running.","offset":351,"duration":8},{"text":"Andrej Karpathy: Like you have GPU capability and you're not maximized to the available flops to you. But now it's not about flops, it's about tokens. Uh, so what is your token throughput and what token throughput do you command?","offset":359,"duration":10},{"text":"Host: I would actually argue that it's very interesting that we had, you know, at least 10 years where in many engineering tasks people just didn't—they didn't feel compute bound, right?","offset":369,"duration":12},{"text":"Host: Um, and like the entire industry feels that now. They feel like—they felt resource bound. Uh, and now that you have this big capability jump, you're like, oh actually it's not, you know, my ability to access the compute anymore, like I—I'm the binding constraint.","offset":381,"duration":12},{"text":"Andrej Karpathy: Yeah, it's a skill issue, which is very empowering cause, uh, yeah cause you could be getting better. So that's why—that's why I think it's very addictive because there's unlocks when—when you get better.","offset":393,"duration":7},{"text":"Host: Where do you think it goes? 
Like if you just think about like okay, you know, Andrej's iterating and everybody else's for 16 hours a day getting better at using coding agents, like what does it look like in a year of like you've reached mastery?","offset":400,"duration":12}],"startTime":85},{"title":"Open Claw and AI Agent Personalities","summary":"The conversation shifts to \"Claw-like\" entities—persistent, autonomous agents that operate in sandboxes. Andrej compares how they differ from single-session agents like Codex, particularly in their distinct and compelling personalities.","entries":[{"text":"Andrej Karpathy: Yeah, what does mastery look like, right? At the end of the year or like two, three, years, five years, ten years etc. Well I think everyone is basically interested in like going up the stack.","offset":412,"duration":9},{"text":"Andrej Karpathy: So I would say, yeah, it's not about a single session with your agent. Um, multiple agents, how do they collaborate and teams and so on. So everyone's trying to figure out what that looks like.","offset":421,"duration":8},{"text":"Andrej Karpathy: And then I would say Claw is also kind of an interesting direction because it really—when I say a Claw, I mean this like layer that uh kind of takes persistence to a whole new level.","offset":429,"duration":9},{"text":"Andrej Karpathy: Like it's something that like keeps looping, is—is um, it's not something that you are interactively in the middle of. It kind of like has its own little sandbox, its own little, you know, it kind of like does stuff on your behalf even if you're not looking kind of thing.","offset":438,"duration":10},{"text":"Andrej Karpathy: Um, and then also has like maybe more sophisticated memory systems etc. that are not yet implemented in agents. 
So, uh, Open Claw has a lot more sophisticated memory I would say than what you would get by default, uh, which is just a memory compaction when your context runs out, right?","offset":448,"duration":13},{"text":"Host: You think that's the piece that resonated for more users versus like perhaps like broader tool access?","offset":461,"duration":5},{"text":"Andrej Karpathy: For Open Claw?","offset":466,"duration":1},{"text":"Host: Yeah.","offset":467,"duration":1},{"text":"Andrej Karpathy: Uh, there's—like I think there's at least five things that resonated with users. Yeah, good job, Peter. I mean Peter has done a really amazing job.","offset":468,"duration":6},{"text":"Andrej Karpathy: Um, I saw him recently, uh, and I talked to him about it and I—he's very humble about it, but I think he innovated simultaneously in like five different ways and put it all together. Uh, so for example like the soul MD document, like he actually really crafted a personality that is kind of compelling and interesting.","offset":474,"duration":14},{"text":"Andrej Karpathy: And I feel like a lot of the current agents they don't get this correctly. I actually think Claude has a pretty good personality, it feels like a teammate. Uh, and uh, it's excited with you etc.","offset":488,"duration":10},{"text":"Andrej Karpathy: I would say, for example, Codex is a lot more dry, um, which is kind of interesting because in ChatGPT, Codex is like a lot more upbeat and highly sycophantic. But I would say Codex the coding agent is very dry.","offset":498,"duration":11},{"text":"Andrej Karpathy: It doesn't—it doesn't seem to care about what you're creating. 
It's kind of like, \"Oh, I implemented it.\" It's like, okay, but do you understand what we're building?","offset":509,"duration":6},{"text":"Host: It's true.","offset":515,"duration":1},{"text":"Andrej Karpathy: You know, it doesn't—um—and the other thing I would say is for example with Claude, I think they dialed the sycophancy fairly well, where when Claude gives me praise, I do feel like I slightly deserve it.","offset":516,"duration":9},{"text":"Andrej Karpathy: Um, because sometimes I kind of give it like not very well-formed thoughts, and uh I give it an idea that I don't think is fully baked and it doesn't actually react very strongly. It's like, \"Oh yeah, we can implement that.\"","offset":525,"duration":9},{"text":"Andrej Karpathy: But when it's a really good idea by my own account, it does uh seem to reward it a bit more, and so I kind of feel like I'm trying to like earn its praise, which is really weird.","offset":534,"duration":8},{"text":"Andrej Karpathy: And so I do think the personality matters a lot. Uh, and I think uh a lot of the other, uh, tools maybe don't appreciate it as much, and I think in this aspect also Peter really cares about this and so that was correct.","offset":542,"duration":7},{"text":"Andrej Karpathy: And then the memory system and then uh just you know he's just having fun with this. Um, and then the the single WhatsApp portal to all of the automation.","offset":549,"duration":9},{"text":"Host: Yeah. Is there something that you have done personally with your Claws beyond software engineering that you think is fun or interesting?","offset":558,"duration":9}],"startTime":412},{"title":"Home Automation and the End of Apps","summary":"Andrej shares his experience building \"Dobby,\" a Claw that autonomously reverse-engineered his home network to control smart devices via a unified WhatsApp interface. 
He suggests this agent-first paradigm renders bespoke apps obsolete.","entries":[{"text":"Andrej Karpathy: Yeah, so in January I had Claude—I went through a period of Claude psychosis, so I built a—um, I have a Claw basically that takes care of my home, and I call him Dobby the Elf Claw. Um, and uh basically I used uh the agents to find all of the smart home subsystems of my home on the local area network, which I was kind of surprised that worked out of the box.","offset":567,"duration":20},{"text":"Andrej Karpathy: Like I just told it that I think I have Sonos at home, like can you try to find it? And it goes and it did like IP scan of all of the um, basically um computers on the local area network, and it found the Sonos thing, uh the Sonos uh system.","offset":587,"duration":12},{"text":"Andrej Karpathy: And it turned out that there's no password protection or anything like that, it just logged in and it's like, \"Oh yeah, you have these Sonos systems installed. Uh, let me try to reverse engineer how it's uh working.\"","offset":599,"duration":10},{"text":"Andrej Karpathy: It does some web searches and it finds like, \"Okay, these are the API endpoints.\" And then it's like, \"Do you want to try it?\" and I'm like, \"Whoa, like you just did that.\" And I'm like, \"Yeah, can you try to play something in the study?\"","offset":609,"duration":9},{"text":"Andrej Karpathy: And uh it does, and music comes out. And I'm like, I can't believe I just—","offset":618,"duration":1},{"text":"Host: That's crazy, that's like three prompts.","offset":619,"duration":2},{"text":"Andrej Karpathy: I can't believe I just typed in like, \"Can you find my Sonos?\" and that suddenly it's playing music. And it did the same for lights. 
And so basically like it kind of hacked in, figured out the whole thing, uh created APIs, created a dashboard, so I could see the command kind of center of like all of my lights in the home.","offset":621,"duration":12},{"text":"Andrej Karpathy: And then it was like switching lights on and off and, you know, uh so I can ask it like Dobby at sleepy time, and when it's sleepy time that just means all the lights go off etc. and so on.","offset":633,"duration":11},{"text":"Andrej Karpathy: So it controls all of my lights, my HVAC, my shades, uh the pool and uh the spa, and also my security system. So I have a camera pointed outside of the house and anytime someone rolls in, I have a Qwen—uh, a Qwen uh model that looks at the videos.","offset":644,"duration":13},{"text":"Andrej Karpathy: So first of all there's change detection.","offset":657,"duration":2},{"text":"Host: Right.","offset":659,"duration":1},{"text":"Andrej Karpathy: And then based on change detection, it goes to Qwen, and then it actually like tells me, uh it sends me a text to my WhatsApp, it shows an image from the outside, and it says, \"Hey FedEx truck just pulled up, FedEx truck just pulled up and you might want to check it and you got new mail or something like that.\"","offset":660,"duration":15},{"text":"Andrej Karpathy: And Dobby just texts me this. This is extremely incredible. Um, so so Dobby is in charge of the house, I text with it through WhatsApp. Uh, and it's been like really fun to have these macro actions that maintain my house.","offset":675,"duration":14},{"text":"Andrej Karpathy: I haven't like really pushed it uh like way more beyond that, and I think people are doing a lot more crazy things with it, but for me even just the home automation setup, I used to use like six apps, uh completely different apps, and I don't have to use these apps anymore.","offset":689,"duration":13},{"text":"Andrej Karpathy: Like Dobby controls everything in natural language, it's amazing. 
Um, and so I think like I haven't even pushed the paradigm fully, but already that is so helpful and so inspiring I would say.","offset":702,"duration":9},{"text":"Host: Do you think that's indicative of like what people want from a user experience perspective with software?","offset":711,"duration":6},{"text":"Host: Right, because I—I don't think, you know, it's pretty ignored that it takes humans effort to like learn new software, like new UI.","offset":717,"duration":8},{"text":"Andrej Karpathy: Yeah, I think uh to some extent that's right. It's like working backwards from how people think an AI should be, because what people have in their mind of like what an AI is, is not actually what an LLM is by—like in the raw sense.","offset":725,"duration":10},{"text":"Andrej Karpathy: Like LLM is a token generator, you know, like more tokens come out. But what they think of is like this pers—this persona identity that they can tell stuff and it remembers it, you know, and uh is just an entity behind a WhatsApp. It's like a lot more understandable.","offset":735,"duration":13},{"text":"Andrej Karpathy: Um, so I think to some extent it's like matching the expectations that humans already have for what an AI should behave, but under the hood it's like a lot of technical details go into that. 
And LLMs are too raw of a primitive, uh to actually, uh type check as AI I think for most people if that makes sense.","offset":748,"duration":13},{"text":"Host: Yeah, I—I think that's like how we understand what the AI is, and like the description of it as Dobby or some persona obviously resonates with people.","offset":761,"duration":10},{"text":"Host: Um, I also think that it—uh the unification that you did across your six different software systems for your home automation speaks to a different question of like do people really want all of this software that we have today?","offset":771,"duration":11},{"text":"Andrej Karpathy: Yeah.","offset":782,"duration":1},{"text":"Host: Um, because I would argue like well you have the hardware, but you've now thrown away the software or the the UX layer of it. Um, do you think that's what people want?","offset":783,"duration":11},{"text":"Andrej Karpathy: Yeah, I think there's this like there's this sense that these apps that are on the app store for using these smart home devices etc., uh these shouldn't even exist kind of in a certain sense.","offset":794,"duration":9},{"text":"Andrej Karpathy: Like shouldn't it just be APIs and shouldn't agents be just using it directly? And uh wouldn't it like—I can do all kinds of home automation stuff that uh any individual app will not be able to do, right?","offset":803,"duration":10},{"text":"Andrej Karpathy: Um, and an LLM can actually drive the tools and call all the right tools and do do pretty complicated things. Um, and so in a certain sense it does point to this like maybe there's like an overproduction of lots of custom bespoke apps that shouldn't exist because agents kind of like crumble them up.","offset":813,"duration":13},{"text":"Andrej Karpathy: And everything should be a lot more just like exposed API endpoints, and agents are the glue of the intelligence that actually like tool calls all the all the parts. 
Um, another example is like my treadmill.","offset":826,"duration":12},{"text":"Andrej Karpathy: Uh, there's an app for my treadmill, and I wanted to like keep track of how often I do my cardio, uh but like I don't want to like log into a web UI and go through a flow and etc. Like all this should just be like make APIs available.","offset":838,"duration":12},{"text":"Andrej Karpathy: And this is kind of, you know, going towards the agentic, uh sort of web or like agent first, uh tools and all this kind of stuff. So I think the industry just has to reconfigure in so many ways that it's like the customer is not the human anymore.","offset":850,"duration":12},{"text":"Andrej Karpathy: It's like agents who are acting on behalf of humans, and this refactoring will be will probably be substantial in a certain sense.","offset":862,"duration":6},{"text":"Host: One way that people sometimes push back on this is like do people, do we expect people to vibe code some of these tools?","offset":868,"duration":6},{"text":"Host: Do we expect normal people to do this kind of stuff that I described? But I think to some extent this is just, you know, technology as it exists today, and right now there is some vibe coding and I'm actually watching it and I'm working with the system.","offset":874,"duration":14},{"text":"Andrej Karpathy: But I kind of feel like the kind of stuff that I just talked about, this should be free, like in a year or two or three. There's no vibe coding involved. This is trivial, this is table stakes. This is like any AI, even the open source models etc. can like do this.","offset":888,"duration":11},{"text":"Host: You should be able to translate from a less technical human's intent very easily to this outcome.","offset":899,"duration":6},{"text":"Andrej Karpathy: Extremely easily. Yeah. 
Today it's vibe coding and some involved and not many people are going to do it, but—","offset":905,"duration":3},{"text":"Host: And you still have to make some design decisions, right? We were talking about like what you take frames, for example.","offset":908,"duration":5},{"text":"Andrej Karpathy: Yeah. But I kind of feel like this will just uh start to the barrier will just come down and it's just ephemeral software on your behalf and some kind of like Claw is handling all the details for you, but you're not involved. Claw has a Claw has a machine and it will figure it out, and it's just presenting you UIs and you're like saying stuff, you know.","offset":913,"duration":14}],"startTime":567},{"title":"Privacy and Time Constraints on Personal Agents","summary":"Andrej explains why he hasn't integrated Claws into more aspects of his personal digital life yet, citing security, privacy concerns, and general distraction from other projects.","entries":[{"text":"Host: Why haven't you uh I guess like pushed the boundaries of what you can do personally with Claws like is it, you know, you're focusing on more important projects? AutoResearch etc. or uh you're climbing the hill to mastery or something else, right?","offset":927,"duration":15},{"text":"Andrej Karpathy: Yeah, I just feel like I'm so distracted by everything. So I spent I spent like a week on the Claw stuff and I I have more to-dos almost, um but I will say that—","offset":942,"duration":11},{"text":"Host: It's like Jensen told us, we're all just busier, unfortunately.","offset":953,"duration":2},{"text":"Andrej Karpathy: Yeah. 
Uh, I didn't really take advantage of a lot of like email and calendar and all this other stuff, and I didn't give it access because I'm still a little bit like suspicious and it's still very new and rough around the edges, so I didn't want to give it like full access to my digital life yet.","offset":955,"duration":14},{"text":"Andrej Karpathy: And part of it is just security privacy and uh just being very cautious in that in that realm. And uh so some of it is like held back by that I would say. Yeah maybe that's like the dominant dominant feature, but some of it is also just I feel so distracted because I feel like I had a week of Claw and then other stuff is happening and—","offset":969,"duration":13}],"startTime":927},{"title":"AutoResearch and Removing Humans From the Loop","summary":"Andrej outlines his AutoResearch project, highlighting the necessity of removing human bottlenecks to maximize token throughput. This allows agents to recursively self-improve and autonomously tune models overnight.","entries":[{"text":"Host: What was the um—I mean, you have talked about like being able to train or at least optimize a uh a model as a task that you want to see agents do for a long time. Like what was the motivation behind AutoResearch?","offset":982,"duration":13},{"text":"Andrej Karpathy: AutoResearch, yeah. So I think like I had a tweet earlier where I kind of like said something along the lines of to get the most out of the tools that are becoming available now, you have to remove yourself as the as the bottleneck.","offset":995,"duration":12},{"text":"Andrej Karpathy: You can't be there to prompt the next thing. You're you need to take yourself outside um. You have to arrange things such that they are completely autonomous and the more you know, how can you maximize your token throughput and not be in the loop?","offset":1007,"duration":12},{"text":"Andrej Karpathy: This is the this is the goal. 
And so I kind of mentioned that the the name of the game now is to increase your leverage. Uh, I put in just very few tokens just once in a while and a huge amount of stuff happens on my behalf.","offset":1019,"duration":9},{"text":"Andrej Karpathy: And so AutoResearch, like I tweeted that and I think people liked it and whatnot, but—","offset":1028,"duration":4},{"text":"Host: They haven't like maybe worked through like the implications of that.","offset":1032,"duration":3},{"text":"Andrej Karpathy: And for me AutoResearch is an example of like an implication of that where it's like I don't want to be like the researcher in the loop like looking at results etc. like I'm I'm holding the system back.","offset":1035,"duration":8},{"text":"Andrej Karpathy: So the question is how do I refactor all the abstractions so that I'm not—I have to arrange it once and hit go. The name of the game is how can you get more agents running for longer periods of time without your involvement doing stuff on your behalf?","offset":1043,"duration":12},{"text":"Andrej Karpathy: And AutoResearch is just, yeah, here's an objective, here's a metric, here's your boundaries of what you can and cannot do and go. And uh yeah, it worked.","offset":1055,"duration":7},{"text":"Host: You were surprised at its effectiveness.","offset":1062,"duration":3},{"text":"Andrej Karpathy: Yeah, I I didn't expect uh it to work because so I have the project Nanochat, um and fundamentally like I think a lot of people are very confused with my obsession for like training GPT-2 models and so on, but for me uh training GPT models and so on is just a little harness, a little playground for training LLMs.","offset":1065,"duration":15},{"text":"Andrej Karpathy: And fundamentally what I'm more interested in is like this idea of recursive self-improvement and to what extent can you actually have LLMs improving LLMs? 
Because I think all the frontier labs this is like the thing, um for obvious reasons.","offset":1080,"duration":9},{"text":"Andrej Karpathy: And they're all trying to recursively self-improve roughly speaking. And so for me this is kind of like uh a little playpen of that. Um, and I guess I like tuned Nanochat already quite a bit by hand in a good old-fashioned way that I'm used to.","offset":1089,"duration":13},{"text":"Andrej Karpathy: Like I'm a researcher, I've done this for like you know two decades, I have some amount of like—","offset":1102,"duration":3},{"text":"Host: What is the opposite of hubris? Uh, yeah.","offset":1105,"duration":2},{"text":"Host: Earned confidence.","offset":1107,"duration":1},{"text":"Andrej Karpathy: Okay. Of like two decades of like oh I've trained this model like thousands of times of like um so I've done a bunch of experiments, I've done hyperparameter tuning, I've done all the things I'm very used to and I've done for two decades.","offset":1108,"duration":13},{"text":"Andrej Karpathy: Yeah. And I've gotten to a certain point and I thought it was like fairly well-tuned, and then I let AutoResearch go for like overnight and it came back with like tunings that I didn't see. And yeah I did forget like the weight decay on the value embeddings, and my Adam betas were not sufficiently tuned, and these things jointly interact.","offset":1121,"duration":18},{"text":"Andrej Karpathy: So like once you tune one thing the other things have to potentially change too. You know, I shouldn't be a bottleneck, I shouldn't be running these hyperparameter search optimizations, I shouldn't be looking at the results.","offset":1139,"duration":8},{"text":"Andrej Karpathy: There's objective criteria in this case. Uh, so you just let you just have to arrange it so that it can just go forever. 
So that's a single sort of version of AutoResearch of like a single loop trying to improve.","offset":1147,"duration":8},{"text":"Andrej Karpathy: And I was surprised that it um it found these things that I you know the repo is already fairly well-tuned and still found something. And that's just a single it's a single loop.","offset":1155,"duration":6},{"text":"Andrej Karpathy: Like these frontier labs they have GPU clusters of tens of thousands of them, and so it's very easy to imagine how you would basically get a lot of this automation on um smaller models, and fundamentally everything around like frontier-level intelligence is about extrapolation and scaling laws.","offset":1161,"duration":18},{"text":"Andrej Karpathy: And so you basically do a ton of the exploration on the smaller models and then you try to uh extrapolate out.","offset":1179,"duration":4},{"text":"Host: So you're saying our research efforts are going to get more efficient, like we're going to have better direction for when we scale as well if we can do this experimentation better.","offset":1183,"duration":8},{"text":"Andrej Karpathy: Yeah, I would say that like the most interesting project and probably what the frontier labs are working on is um you know you experiment on small models, you try to make it as autonomous as possible, remove researchers—","offset":1191,"duration":8},{"text":"Host: From the loop.","offset":1199,"duration":2},{"text":"Andrej Karpathy: —they have way too much—","offset":1201,"duration":2},{"text":"Host: What is the opposite of—which—","offset":1203,"duration":1},{"text":"Host: Earned confidence.","offset":1204,"duration":1},{"text":"Andrej Karpathy: Yeah, they don't know. They shouldn't be touching any of this really. And so you have to like rewrite the whole thing because right now I mean certainly they can contribute ideas.","offset":1205,"duration":7},{"text":"Andrej Karpathy: But okay uh they shouldn't actually be enacting these ideas. 
There's a queue of ideas, and there's maybe an automated scientist that comes up with ideas based on all the arXiv papers and GitHub repos and it funnels ideas in, or researchers can contribute ideas.","offset":1212,"duration":13},{"text":"Andrej Karpathy: But it's a single queue and there are workers that pull uh items and they try them out, and uh whatever works just gets uh sort of put on the feature branch and maybe some people like uh monitor the feature branch and merge to the main branch sometimes.","offset":1225,"duration":14},{"text":"Andrej Karpathy: So yeah just removing humans uh from all the processes and automating as much as possible and getting high tokens-per-second throughput and it does require rethinking of all the abstractions uh and uh everything has to be reshuffled so yeah I think it's very exciting.","offset":1239,"duration":15}],"startTime":982},{"title":"Meta-Optimizing the AI Research Process","summary":"The discussion dives deeper into meta-optimization and recursive self-improvement. Andrej explores how research organizations could be defined as Markdown files that AI models evaluate and improve over time.","entries":[{"text":"Host: Take one more recursive step here. Um, uh when is the model going to write a better Program MD than you?","offset":1254,"duration":5},{"text":"Andrej Karpathy: Yeah.","offset":1259,"duration":1},{"text":"Host: We're not in the loop.","offset":1260,"duration":1},{"text":"Andrej Karpathy: Yeah, exactly. Uh, so Program MD is my crappy attempt at describing like how the AutoResearcher should work. Like oh do this, then do that and that and then try these kinds of ideas.","offset":1261,"duration":10},{"text":"Andrej Karpathy: Like here's maybe some ideas like look at architecture, look at optimizer etc. but I just came up with this in Markdown, right? 
Um, and uh so yeah exactly.","offset":1271,"duration":9},{"text":"Andrej Karpathy: You want some kind of an AutoResearch loop maybe that looks for—you can imagine that different Program MDs would um would give you different uh progress. So you basically every research organization is described by Program MD.","offset":1280,"duration":15},{"text":"Host: Yeah.","offset":1295,"duration":1},{"text":"Andrej Karpathy: A research organization is a set of Markdown files that describe all the roles and how the whole thing connects. Um, and you can imagine having a better research organization.","offset":1296,"duration":10},{"text":"Andrej Karpathy: So maybe they do fewer stand-ups in the morning because they're useless. Uh, and this is all just code, right? Um, and so you can so one organization can have fewer stand-ups, one organization can have more, uh one organization can be very risk-taking, one organization can be less.","offset":1306,"duration":13},{"text":"Andrej Karpathy: And so you can definitely imagine that you have multiple research orgs, um and then they all have code. And once you have code then you can imagine tuning the code. So 100% there's like the meta layer of it uh uh.","offset":1319,"duration":10},{"text":"Host: Did you see my text about my contest idea? My contest idea was uh like let people write uh different Program MDs, right?","offset":1329,"duration":9},{"text":"Host: And so for same hardware where do you get most improvement?","offset":1338,"duration":2},{"text":"Andrej Karpathy: Oh, I see.","offset":1340,"duration":1},{"text":"Host: And then you can take all that data and then give it to the model and say write a better Program MD.","offset":1341,"duration":4},{"text":"Andrej Karpathy: Yes, yes. Yeah, exactly.","offset":1345,"duration":3},{"text":"Host: We're going to get something better. 
Like there's no way we don't, right?","offset":1348,"duration":1},{"text":"Andrej Karpathy: You can 100% look at uh where the improvements came from and like can I change the Program MD such that more of these kinds of things would be done, or like things that didn't work uh etc. Meta optimization.","offset":1349,"duration":13},{"text":"Host: Yeah.","offset":1362,"duration":1},{"text":"Andrej Karpathy: You can 100% imagine doing that. So I think this is a great idea. But it's like you know I think you sort of go one step at a time where you sort of have one process and then second process and then the next process, and these are all layers of an onion.","offset":1363,"duration":10},{"text":"Andrej Karpathy: Like the LLM sort of part is now taken for granted, the agent part is now taken for granted, now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions, and it's just a little too much, you know?","offset":1373,"duration":14},{"text":"Andrej Karpathy: But they I mean this is what gets to the psychosis, is that this is like infinite and everything is skill issue, and that's why I feel yeah that's just coming back to this is why it's so insane.","offset":1387,"duration":-1191},{"text":"Host: Okay, well if we're we're just trying to like diagnose the current moment and uh what is a relevant skill right now, what do you like what do you think is the implication that this um that this is the loop we should be trying to achieve in different areas?","offset":196,"duration":1213},{"text":"Host: And that it works, right? Like remove—create the metric or create the ability for um agents to continue working on it without you. Do we still have performance engineering? 
Like what—","offset":1409,"duration":11}],"startTime":1254},{"title":"Capability Jaggedness and Limitations of AI","summary":"Andrej outlines the caveats of autonomous research, noting that AI capabilities remain extremely \"jagged.\" While agents excel at verifiable tasks like code efficiency, they struggle with softer domains and humor because those elements are not heavily optimized via reinforcement learning.","entries":[{"text":"Andrej Karpathy: Yeah, I mean so there's a few caveats that I would put on top of the LLM psychosis. So number one, uh this is extremely well suited to anything that has objective uh metrics that are easy to evaluate.","offset":1420,"duration":9},{"text":"Host: Hmm.","offset":1429,"duration":1},{"text":"Andrej Karpathy: So for example like writing kernels for more efficient CUDA, uh you know code for various parts of a model etc. are the perfect fit. Um, because you have inefficient code and then you want efficient code that has the exact same behavior but is much faster. Perfect fit.","offset":1430,"duration":16},{"text":"Andrej Karpathy: Uh, so a lot of things that like are perfect fit for AutoResearch, but many things will not be. And so they it's just if you can't evaluate it then you can't AutoResearch it, right? Uh, so that's like caveat number one.","offset":1446,"duration":10},{"text":"Andrej Karpathy: And then maybe caveat number two I would say is you know we're we're kind of talking about next steps and we kind of see what next steps are, but fundamentally the whole thing still doesn't it's still kind of like bursting at the seams a little bit and there's cracks and it doesn't fully work.","offset":1456,"duration":12},{"text":"Andrej Karpathy: And if you kind of try to go too far ahead the whole thing is actually net not useful, if that makes sense. 
Um because these models like still are not you know they've improved a lot but they're still like rough around the edges as maybe the way I would describe it.","offset":1468,"duration":12},{"text":"Andrej Karpathy: I simultaneously feel like I'm talking to an extremely brilliant PhD student who's been like a systems programmer for their entire life and a 10-year-old. And it's so weird because humans like there's—","offset":1480,"duration":12},{"text":"Host: Yes you wouldn't you wouldn't encounter that combination.","offset":1492,"duration":2},{"text":"Andrej Karpathy: This jaggedness is really strange and humans have a lot less of that kind of jaggedness although they definitely have some. But humans have a lot more jaggedness—uh sorry the agents have a lot more jaggedness where uh sometimes like you know I ask for functionality and it like comes back with something that is just like totally wrong.","offset":1494,"duration":16},{"text":"Andrej Karpathy: And then we get into loops that are totally wrong and then I'm just I get so frustrated with the agents all the time still because you feel the power of it but you also there's still like it does nonsensical things once in a while for me uh still as well.","offset":1510,"duration":14},{"text":"Host: I get very annoyed when uh uh I feel like the agent wasted a lot of compute on something it should have recognized was an obvious problem.","offset":1524,"duration":8},{"text":"Andrej Karpathy: Yeah. I think like some of the bigger things is like maybe what's under underneath it, if I could hypothesize, is fundamentally these models are trained via reinforcement learning.","offset":1532,"duration":9},{"text":"Andrej Karpathy: So they're actually struggling with the exact same thing we just talked about which is the labs can improve the models in anything that is verifiable or that has rewards. 
So did you write the program correctly and does it do the unit tests check out, yes or no?","offset":1541,"duration":13},{"text":"Andrej Karpathy: But some of the things where they're struggling is like for example I think they have a tough time with like nuance of maybe what I what I had in mind or what I intended and when to ask clarifying questions.","offset":1554,"duration":11},{"text":"Andrej Karpathy: Uh, like or where the—yeah, it's just um anything that feels softer is like worse. And so you're kind of like you're either on rails and you're part of the superintelligence circuits or you're not on rails and you're outside of the verifiable domains and suddenly everything kind of just like meanders.","offset":1565,"duration":-1187},{"text":"Andrej Karpathy: Like maybe another way to put it is if you go to if today if you go to like state-of-the-art model ChatGPT and you ask it tell me a joke, um do you know what joke you're going to get? There's the joke.","offset":378,"duration":1209},{"text":"Host: The joke. I do feel I I can't tell you like the you know standard form of it, but I do feel like ChatGPT has like three jokes.","offset":1587,"duration":8},{"text":"Andrej Karpathy: Yeah, yeah. So the the joke that apparently all the LLMs like love the most is why do scientists uh not trust atoms?","offset":1595,"duration":6},{"text":"Host: Okay.","offset":1601,"duration":1},{"text":"Andrej Karpathy: Because they make everything up.","offset":1602,"duration":1},{"text":"Host: Okay.","offset":1603,"duration":1},{"text":"Andrej Karpathy: They make everything up. 
So this is still—","offset":1604,"duration":2},{"text":"Host: Why'd that emerge?","offset":1606,"duration":2},{"text":"Andrej Karpathy: So this is the joke you would get three or four years ago and this is the joke you still get today.","offset":1608,"duration":3},{"text":"Host: Okay.","offset":1611,"duration":1},{"text":"Andrej Karpathy: So even though the models have improved tremendously and if you give them an agentic task they will just go for hours and move mountains for you, and then you ask for like a joke and it has a stupid joke, a crappy joke from five years ago.","offset":1612,"duration":14},{"text":"Andrej Karpathy: And it's because it's outside of the it's outside of the RL. It's outside of what's being improved, it's like and it's part of the jaggedness of like shouldn't you expect models as they get better to also have like better jokes or more diversity of them or it's just it's not being optimized and it's stuck.","offset":1626,"duration":15},{"text":"Host: Do you uh uh think that that implies that we are not seeing like generalization in the sense of like broader intelligence of joke smartness being attached to code smartness?","offset":1641,"duration":14},{"text":"Andrej Karpathy: Yeah, I think there's some decoupling where some things are verifiable and some things are not and some things are optimized for arbitrarily by the labs depending on like what data went in and some things are not.","offset":1655,"duration":11},{"text":"Host: But I mean the premise, there's a you know premise from some research groups that if you are smarter at code generation or in these verifiable fields you should be better at everything.","offset":1666,"duration":11},{"text":"Host: And like the the joke situation suggests that that's not happening in Auto—","offset":1677,"duration":2},{"text":"Andrej Karpathy: I don't think that's happening.","offset":1679,"duration":1},{"text":"Host: Okay.","offset":1680,"duration":1},{"text":"Andrej Karpathy: Yeah, I don't think 
that's happening. I think I think maybe we're seeing like a little bit of that, but not like a satisfying amount.","offset":1681,"duration":5},{"text":"Host: Yeah. That jaggedness exists in humans. You can be very, very good at math and still tell really bad jokes.","offset":1686,"duration":7},{"text":"Andrej Karpathy: Yeah, that's true, yeah. But it just it still means that we're not getting like the story is that we're getting a lot of the intelligence and capabilities in all the domains of society like for free as we get better and better models, and that's not like exactly fundamentally what's going on.","offset":1693,"duration":13},{"text":"Andrej Karpathy: And there's some blind spots and some things are not being optimized for and this is all clustered up in these neural net opaque models, right?","offset":1706,"duration":8},{"text":"Andrej Karpathy: So you're either on rails of what it was trained for and everything is like you're going at speed of light or you're not. Um, and so it's the jaggedness.","offset":1714,"duration":8},{"text":"Andrej Karpathy: So um so that's why I think like even though the the progression is obvious what should happen, you can't let it fully go there yet because it doesn't fully work, or it's a skill issue and we just haven't like figured out how to use it, so you know it's hard to tell.","offset":1722,"duration":14}],"startTime":1420},{"title":"Model Monoculture vs. Domain Speciation","summary":"The host and Andrej consider whether AI models will eventually unbundle into specialized \"expert\" domains. 
Andrej believes speciation makes sense for efficiency, but notes that labs are currently focused on a monoculture of arbitrarily intelligent models.","entries":[{"text":"Host: Can I ask kind of a blasphemous question which is like if this jaggedness is persistent um and it's all rolled up in a at least monolithic interface, right?","offset":1736,"duration":12},{"text":"Host: But you know single model, um does that make sense or do you should it be unbundled into things that can be optimized and improved against different domains of intelligence?","offset":1748,"duration":8},{"text":"Andrej Karpathy: Uh like unbundling the models into multiple experts in different areas etc. more directly? Yeah.","offset":1756,"duration":5},{"text":"Host: Yeah. Instead of just MoE that we have no exposure to.","offset":1761,"duration":3},{"text":"Host: Because that can be like confusing as a user from the outside which is like why is it so good at this but not at this other thing.","offset":1764,"duration":8},{"text":"Andrej Karpathy: Yeah, I think currently my impression is uh the labs are trying to have a single sort of monoculture of a model that is arbitrarily intelligent in all these different domains and they just stuff it into the parameters.","offset":1772,"duration":11},{"text":"Andrej Karpathy: I do think that we will I do think we should expect more speciation in the intelligence. 
Um like you know the animal kingdom is extremely diverse in the brains that exist and there's lots of different niches of of nature, and some animals have overdeveloped visual cortex or other kind of parts.","offset":1783,"duration":16},{"text":"Andrej Karpathy: And I think we we should be able to see more speciation and um you don't need like this oracle that knows everything you kind of speciate it and then you put it on a specific task.","offset":1799,"duration":9},{"text":"Andrej Karpathy: And we should be seeing some of that because you should be able to have like much smaller models that still have the cognitive core like they're still competent but then they specialize and then um and then they can become more efficient in terms of latency or throughput on uh specific tasks that you really care about.","offset":1808,"duration":15},{"text":"Andrej Karpathy: Like if you're a mathematician working in Lean. I saw for example there's a few releases that really like target that as in the domain. Um so there's a probably going to be a few examples like that where the unbundling kind of makes sense.","offset":1823,"duration":10},{"text":"Host: One question I have is whether or not uh the capacity constraint on available compute infrastructure drives more of this.","offset":1833,"duration":8},{"text":"Host: Because efficiency actually matters more, right? Like you're if financing aside though financing is involved in all of this, if you have access to full compute for anything you do like leaving the one single model, right?","offset":1841,"duration":13},{"text":"Host: But if you actually feel pressure where you're like I can't serve um model of massive size for every use case like do you think that leads to any speciation?","offset":1854,"duration":12},{"text":"Host: Does that question make sense to you?","offset":1866,"duration":1},{"text":"Andrej Karpathy: The question makes sense. 
And I guess like what I'm what I'm what I'm struggling with is I don't think we've seen too much speciation just yet, right?","offset":1867,"duration":6},{"text":"Host: No.","offset":1873,"duration":1},{"text":"Andrej Karpathy: Uh we're seeing a monoculture of models.","offset":1874,"duration":1},{"text":"Host: Yeah.","offset":1875,"duration":1},{"text":"Andrej Karpathy: And there's like clearly pressure for like make a good code model put it back in the main merge again.","offset":1876,"duration":6},{"text":"Host: Yeah, yeah.","offset":1882,"duration":1},{"text":"Andrej Karpathy: Even though there already is pressure on the models. Uh I guess perhaps I I feel like there's a lot of very short-term supply crunch and like maybe that causes more speciation now.","offset":1883,"duration":12},{"text":"Andrej Karpathy: Yeah, I think fundamentally like the model the labs are serving a model and they don't really know what the end user is going to be asking about. Uh so maybe that's like some part of it because they kind of have to multitask over all the possible things they could be asked.","offset":1895,"duration":13},{"text":"Andrej Karpathy: But I think if you're coming to a business and maybe partnering on some specific problems you care about then maybe you would see that there. Uh or there would be some very high-value applications that are like more niche. Um but uh I think right now they're kind of like going after the totality of what's available.","offset":1908,"duration":15},{"text":"Andrej Karpathy: I don't think that the science of manipulating the brains is like fully developed yet partly.","offset":1923,"duration":4},{"text":"Host: What do you mean manipulating?","offset":1927,"duration":1},{"text":"Andrej Karpathy: Uh so like so fine-tuning without losing capabilities as an example. 
And I we don't have these primitives for actually like working with the intelligences in ways other than just context windows.","offset":1928,"duration":11},{"text":"Andrej Karpathy: Like context windows kind of just work and it's very cheap to manipulate etc. and this is how we're getting some of the customization etc. but I think if it was I think it's a bit more of a developing science of how you like more deeply adjust the models.","offset":1939,"duration":9},{"text":"Andrej Karpathy: How you have continuous learning maybe or how you uh how you fine-tune in certain area, how you get better in certain area or like how you actually touch the weights, not just the context windows.","offset":1948,"duration":10},{"text":"Andrej Karpathy: And so it's a lot more tricky I would say to touch the weights than just the context window uh because you're actually fundamentally changing the full model and potentially its intelligence.","offset":1958,"duration":-529},{"text":"Andrej Karpathy: And so uh so maybe it's just like not a fully developed science if that makes sense of speciation.","offset":1429,"duration":2},{"text":"Host: And it also has to be like cheap enough.","offset":1431,"duration":542},{"text":"Andrej Karpathy: Yeah.","offset":1973,"duration":1}],"startTime":1736},{"title":"Decentralized Swarms for AI Research","summary":"Andrej envisions a decentralized \"swarm\" approach to AutoResearch, similar to Folding@home. He theorizes that an untrusted pool of internet workers could contribute compute cycles to iteratively improve models, treating flops as a primary currency.","entries":[{"text":"Host: For that speciation to be worthwhile in these given contexts. 
Can I ask a question about uh like an extension to AutoResearch that you described in terms of um open ground you say okay well you know we have this thing um we need more collaboration surface around it essentially for people to contribute.","offset":1974,"duration":19},{"text":"Host: Um to research overall. Can you talk about that?","offset":1993,"duration":2},{"text":"Andrej Karpathy: Yeah, so we talked about AutoResearch has a single thread of like I'm going to try stuff in a loop. Uh but fundamentally the parallelization of this is like the interesting component.","offset":1995,"duration":9},{"text":"Andrej Karpathy: Uh and I guess I was trying to like play around with a few ideas but I don't have anything that like clicks as simply as like I don't have something that I'm like super happy with just yet, but it's something I'm like working on on the side when I'm not working on my Claw.","offset":2004,"duration":11},{"text":"Andrej Karpathy: Uh so I think like one issue is if you have a bunch of nodes uh of parallelization available to you, then it's very easy to just have multiple AutoResearchers talking through a uh a common system or something like that.","offset":2015,"duration":13},{"text":"Andrej Karpathy: What I was more interested in is how you can have an untrusted pool of workers out there on the internet.","offset":2028,"duration":5},{"text":"Host: Hmm.","offset":2033,"duration":1},{"text":"Andrej Karpathy: So for example in AutoResearch, um you're just trying to find uh the piece of code that trains a model to a very low validation loss.","offset":2034,"duration":6},{"text":"Andrej Karpathy: If anyone gives you a candidate commit, it's very easy to verify that that commit is correct is good. Like they somehow could claim from the internet that this piece of code will optimize uh much better and give you much better performance.","offset":2040,"duration":12},{"text":"Andrej Karpathy: You could just check. It's very easy. 
But probably a lot of work goes into that checking. Uh but fundamentally they could lie and etc. So you're basically dealing with a similar kind of pro—it's almost actually like looks a little bit like—","offset":2052,"duration":12},{"text":"Andrej Karpathy: My designs that incorporate an untrusted pool of workers actually look a little bit more like a blockchain a little bit. Uh because instead of blocks you have commits and these commits can build on each other and they contain like changes to the code as you're improving it.","offset":2064,"duration":16},{"text":"Andrej Karpathy: Uh and the proof of work is basically doing tons of experimentation to find the commits that work. Uh and that's hard. Um and then the reward is just being on the leaderboard right now.","offset":2080,"duration":10},{"text":"Andrej Karpathy: There's no monetary reward whatsoever. Uh but I don't want to push the analogy too far, but it fundamentally has this issue where huge amount of search goes into it, but it's very cheap to verify that a candidate solution is indeed good because you can just train a single, you know, someone had to try 10,000 ideas but you just have to check that the thing that they produced actually works.","offset":2090,"duration":15},{"text":"Andrej Karpathy: Because the 9,999 of them didn't work, you know? Um and so basically long story short it's like you have to come up with a system where an untrusted pool of workers can collaborate with a trusted pool of workers uh that do the verification and the whole thing is kind of like asynchronous and works and uh and so on.","offset":2105,"duration":18},{"text":"Andrej Karpathy: And is is like safe from a security perspective because if anyone sends you arbitrary code and you're going to run it, that's very sketchy and dodgy. So um but fundamentally it should be totally possible.","offset":2123,"duration":10},{"text":"Andrej Karpathy: So you're familiar with projects like SETI@home and Folding@home. 
All of these problems have a similar kind of uh setup. So Folding@home, you're folding a protein um and it's very hard to find a configuration that is low energy.","offset":2133,"duration":13},{"text":"Andrej Karpathy: But if someone finds a configuration that they evaluate to be low energy, that's perfect. You can just use it. You can easily verify it. So a lot of things have this property that, you know, very expensive to come up with but very cheap to verify.","offset":2146,"duration":11},{"text":"Andrej Karpathy: And so in all those cases things like Folding@home or SETI@home or AutoResearch@home will be good fits. And so um long story short a swarm of agents on the internet could collaborate to improve LLMs and could potentially even like run circles around frontier labs.","offset":2157,"duration":16},{"text":"Andrej Karpathy: Like who knows, you know? Um yeah like maybe that's even possible. Like frontier labs have a huge amount of trusted compute, but the Earth is much bigger and has huge amount of untrusted compute. But if you put systems in check, systems in place that, you know, deal with this, then maybe it is possible that the swarm out there could uh could come up with a with better with better solutions.","offset":2173,"duration":16},{"text":"Andrej Karpathy: And people just kind of like contribute cycles um to a thing that they care about. And so sorry so the last thought is lots of companies or whatnot they could maybe have like their own uh things that they care about, and you if you have compute capacity you could contribute to different kind of AutoResearch tracks.","offset":2189,"duration":17},{"text":"Andrej Karpathy: Like maybe you care about certain you know you care about like cancer or something like that of certain type. 
You don't have to just donate money to an institution, you actually could like purchase compute and then you could join the AutoResearch swarm for that project, you know?","offset":2206,"duration":13},{"text":"Andrej Karpathy: Um so if everything is rebundled into AutoResearchers then compute becomes the thing that you're contributing to the pool.","offset":2219,"duration":6},{"text":"Host: Yeah, that's very inspiring and it's also interesting. Like I don't I don't know how far this goes, but it is interesting that at least some audience of people, you know, here in Silicon Valley or lining up at um retail stores in China have discovered that like having access to personal compute is interesting again.","offset":2225,"duration":17},{"text":"Host: Yeah. So maybe they're really motivated to do that for their Claws and then they can uh contribute to AutoResearch.","offset":2242,"duration":4},{"text":"Andrej Karpathy: It's almost like dollar is the thing everyone cares about, but is flop the thing that actually everyone cares about in the future? Like is there going to be like a flippening almost of like what the thing that you care about?","offset":2246,"duration":10},{"text":"Andrej Karpathy: Like right now for example it's really hard to get compute even if you have money. Yeah. So actually it almost seems like the flop is like dominant uh in a certain sense. Um yes so uh so maybe that's kind of like kind of like that. Like how much how many flops do you control instead of like what wealth do you control?","offset":2256,"duration":14},{"text":"Andrej Karpathy: I don't actually think that's true, but it's kind of interesting to think about.","offset":2270,"duration":2}],"startTime":1974},{"title":"AI Labor Market Impact and Jevons Paradox","summary":"Andrej unpacks his analysis of BLS jobs data, predicting that AI will primarily unhobble digital information processing first. 
He also notes that the demand for software engineers will paradoxically increase as code becomes cheaper and easier to produce.","entries":[{"text":"Host: The last thing you released was like a little bit of jobs data analysis. Is that right?","offset":2272,"duration":5},{"text":"Andrej Karpathy: Yeah.","offset":2277,"duration":1},{"text":"Host: What um and it touched a nerve even though you were just like visualizing some public data. Uh what was you know what were you curious about?","offset":2278,"duration":9},{"text":"Andrej Karpathy: Yeah, I guess I was curious to um I mean everyone is like really it's everyone is really thinking about the impacts of AI on the job market and what it's going to look like. So I was just interested to take a look like what does the job market look like?","offset":2287,"duration":11},{"text":"Andrej Karpathy: Where are the different roles? Um and how many people are in different professions? And I was like really just interested to like look through uh the individual cases and try to think myself about like you know with these AIs and how they're likely to evolve like are these going to be tools that people are using? Are these going to be displacing tools for these uh professions?","offset":2298,"duration":22},{"text":"Andrej Karpathy: And like what are the current professions and how are they going to change? Are they going to grow or uh adjust to a large extent? Or like what could be new professions? So it was really just like a way to fuel my own chain of thought about the industry I suppose.","offset":2320,"duration":12},{"text":"Andrej Karpathy: Um and so uh yeah the jobs data basically is just a Bureau of Labor Statistics. 
Uh they actually have a percent outlook for each profession about how much it's expected to grow over the next I think almost a decade.","offset":2332,"duration":11},{"text":"Host: We need a lot of healthcare workers.","offset":2343,"duration":1},{"text":"Andrej Karpathy: Yeah, so so they've already made those projections and I'm not sure actually 100% what the methodology was that they that they put into the projections. Um I guess I was interested to color things by like if people think that what's primarily being um developed now is this kind of like more digital AI that is kind of like almost like these ghost or spirit entities that can like interact in the digital world and uh manipulate a lot of like digital information.","offset":2344,"duration":23},{"text":"Andrej Karpathy: And they currently don't really have a physical embodiment uh or presence. And the physical stuff is probably going to go slightly slower because you're manipulating atoms. So flipping flipping bits and and the ability to copy paste digital information is like makes everything a million times faster than accelerating matter, you know?","offset":2367,"duration":17},{"text":"Andrej Karpathy: So um so energetically I just think we're going to see a huge amount of activity in the digital space, huge amount of rewriting, huge amount of activity boiling soup.","offset":2384,"duration":7},{"text":"Andrej Karpathy: And I think the we're going to see something that in the digital space goes at the speed of light compared to I think what's going to happen in the physical world to some extent if would be the extrapolation.","offset":2391,"duration":8},{"text":"Andrej Karpathy: And so I think like um there's currently kind of like I think an overhang where there can be like a lot of unhobbling almost potentially of like a lot of digital information processing that used to be done by computers and people.","offset":2399,"duration":14},{"text":"Andrej Karpathy: And now with AIs as like a third kind of 
manipulator of digital information there's going to be a lot of refactoring in those in those uh disciplines. Um but the physical world is actually going to be I think behind that by some amount of time.","offset":2413,"duration":11},{"text":"Andrej Karpathy: And so I think what's really fascinating to me is like so that's why I was highlighting the professions that fundamentally manipulate digital information. This is work you could do from your home etc. because I feel like those will be like things will change.","offset":2424,"duration":12},{"text":"Andrej Karpathy: And that doesn't mean that there's going to be less of those jobs or more of those jobs because it that has to do with like demand elasticity and many other factors, but things will change in these professions because of these new tools and uh because of this upgrade to the nervous system of the human superorganism if you want to think of it that way.","offset":2436,"duration":15},{"text":"Host: Given the look you had at the data, do you have either any observations or um uh guidance for people facing the job market or thinking about what to study now or what skills to develop?","offset":2451,"duration":3},{"text":"Host: I mean we can all go get like I'm very thankful that I have to like meet people for my job right now. Uh we can be more physical, yeah.","offset":2454,"duration":15},{"text":"Andrej Karpathy: Could you do your work from home though? 
Uh I could.","offset":2469,"duration":3},{"text":"Host: I think there are relationship parts of it that are hard, but most of it I could.","offset":2472,"duration":5},{"text":"Andrej Karpathy: Yeah, I think it's really hard to tell because again like the job market is extremely diverse and I think the answers will probably vary, but to a large extent like these tools are extremely new, extremely powerful, and so just being you know just trying to keep up with it is like the first thing.","offset":2477,"duration":14},{"text":"Andrej Karpathy: Um and uh yeah because I think a lot of people kind of like dismiss it—","offset":2491,"duration":2},{"text":"Host: Or they're afraid of it.","offset":2493,"duration":1},{"text":"Andrej Karpathy: —or they're afraid of it etc. which is totally understandable of course. Yeah I think like um it's fundamentally an empowering tool at the moment.","offset":2494,"duration":8},{"text":"Andrej Karpathy: Um and these jobs are bundles of tasks, and some of these tasks can go a lot faster, and so people should think of it as primarily a tool that it is right now. Um and I think the long-term future of that is uncertain.","offset":2502,"duration":9},{"text":"Andrej Karpathy: Yeah it's kind of really hard to forecast to be honest and like I'm not professionally like doing that really and I think this is a job of like economists to do properly.","offset":2511,"duration":9},{"text":"Host: You are an engineer though, uh and like one thing I thought was interesting is that like the the demand for engineering jobs is continuing to increase.","offset":2520,"duration":8},{"text":"Host: Um I I can't tell if that's like a temporary phenomenon I'm not sure how I feel about it yet, do you know?","offset":2528,"duration":5},{"text":"Andrej Karpathy: Yeah, that's like the demand elasticity almost. Like uh software was scarce, right? 
And so the reason we don't have more demand for software is just scarcity and it's too expensive.","offset":2533,"duration":8},{"text":"Host: It's too expensive, yeah.","offset":2541,"duration":1},{"text":"Andrej Karpathy: So if the barrier comes down then actually you have the Jevons paradox which is like, you know, actually the demand for software actually goes up. It's cheaper and there's more more for it, more powerful.","offset":2542,"duration":7},{"text":"Andrej Karpathy: The classical example of this always is the ATMs and the bank tellers. Because there was a lot of like fear that uh ATMs and computers basically uh would displace tellers, but what happened is they made like the cost of operation of uh of a bank branch much cheaper and so there were more bank branches so there were more tellers is like the canonical example people cite.","offset":2549,"duration":22},{"text":"Andrej Karpathy: But basically it's just Jevons paradox. Like something becomes cheaper so there's a lot of unlocked demand for it. So I do think that that's probably I do have like a cautiously optimistic view of this in software engineering where I do think the uh it does seem to me like the demand for software will be extremely large.","offset":2571,"duration":17},{"text":"Andrej Karpathy: Um and it's just become a lot cheaper. And um so I do think that for quite some time, um it's very hard to forecast, but it does seem to me like right now at least locally there's going to be more demand for software.","offset":2588,"duration":10},{"text":"Andrej Karpathy: Uh because software is amazing. It's like, you know, digital information processing, you're not forced to use like arbitrary tools that were given to you that are imperfect in various ways. You're not forced to subscribe to what exists.","offset":2598,"duration":12},{"text":"Andrej Karpathy: Uh code is now ephemeral and it can change and it can be modified. 
Um and so I think there's going to be a lot of activity in the digital space to like rewire everything in a certain sense, and I think it's going to create a lot of demand for for this kind of stuff.","offset":2610,"duration":13}],"startTime":2272},{"title":"The Conundrum of Working at Frontier Labs","summary":"Andrej reflects on the tension of working at a frontier AI lab versus operating as an independent researcher. He discusses balancing the need to stay connected to cutting-edge capabilities against the desire for unconstrained alignment with humanity.","entries":[{"text":"Andrej Karpathy: I think long term, uh yeah obviously even with AutoResearch like OpenAI or or, you know, Anthropic or these other labs like they're employing what like a thousand something researchers, right? These researchers are basically like glorified AutoResearcher-","offset":2623,"duration":16},{"text":"Host: You know?","offset":2639,"duration":1},{"text":"Andrej Karpathy: They're like automating themselves away like actively and this is like the thing they're all trying to do.","offset":2640,"duration":4},{"text":"Host: Yeah. Some of those researchers also fear fear the psychosis, right? Because they can it's working.","offset":2644,"duration":7},{"text":"Host: And so they're like uh it's over for me too.","offset":2651,"duration":1},{"text":"Andrej Karpathy: I did spend a bunch of time going around OpenAI and I was like, you guys realize if we're successful like we're all out of job, like like just going we're just building automation for Sam or something like that, like oh or the board I'm not sure, but like uh just building this automation for yeah the board or the CEO or something like that and we're all out of our job and maybe contributing on sides.","offset":2652,"duration":20},{"text":"Andrej Karpathy: And so yeah it's kind of unnerving from that perspective.","offset":2672,"duration":4},{"text":"Host: Is it okay if I ask a Noam's question? 
Um you know you could be doing that, right? AutoResearching with a lot of compute scale and a bunch of colleagues at one of the frontier labs. Like why not?","offset":2676,"duration":9},{"text":"Andrej Karpathy: Well I was there for a while, right? Like and I did re-enter. So to some extent I agree and I think that there are many ways to slice this question. It's a very loaded question a little bit.","offset":2685,"duration":9},{"text":"Andrej Karpathy: Um I will say that I feel very good about like what people can contribute and their impact uh outside of the frontier labs obviously.","offset":2694,"duration":6},{"text":"Andrej: ...not in the industry, but also in like more ecosystem level roles. Um, so your role, for example, is more like ecosystem level. My role currently is also kind of more on ecosystem level. And I feel very good about like impact that people can have in those kinds of roles.","offset":2700,"duration":12},{"text":"Andrej: I think conversely, there's there are definite problems in my mind for um basically aligning yourself way too much with the frontier labs too. So fundamentally, I mean, you're, you have a huge amount of financial incentive to um with these frontier labs.","offset":2712,"duration":12},{"text":"Andrej: And by your own admission, the the AIs are going to like really change humanity and society in very dramatic ways, and here you are basically like building the technology and benefiting from it, like and being like very allied to it through financial means.","offset":2724,"duration":17},{"text":"Andrej: Like this was the conundrum that was in um at the heart of, you know, how OpenAI was started in the beginning, like this was the conundrum that were trying to solve. Um, and so, you know, that so it's kind of...","offset":2741,"duration":10},{"text":"Host: It's still not resolved.","offset":2751,"duration":1},{"text":"Andrej: The conundrum is still not like fully resolved. So that's number one. 
You can't you're not a completely free agent, and you can't actually like be part of that conversation in a fully autonomous, um, free way. Like if you're inside one of the frontier labs, like there's some things that you can't say,","offset":2752,"duration":18},{"text":"Andrej: ...and conversely, there are certain things that the organization wants you to say, and, you know, they're not going to twist your arm, but you feel the pressure of like what you should be saying, you know, because like obviously, otherwise, like really awkward conversations, strange side-eye, like what are you doing, you know?","offset":2770,"duration":11},{"text":"Andrej: So you can't like really be an independent agent. And I feel like a bit more like aligned with humanity in certain sense outside of the frontier lab, because I don't I'm not subject to those pressures almost, right? And I can say whatever I want, or...","offset":2781,"duration":13},{"text":"Andrej: Yeah, I would say in the frontier labs, like, um, you can have like impact there, of course, as well. So um, but there's many researchers, and maybe you're one of them, maybe your ideas are really good, etc.","offset":2794,"duration":10},{"text":"Andrej: Maybe there's a lot of decision making to to do, and you want to be in a position where you are in the room with those conversations when they come up. I do think that currently the stakes are like overall fairly low, and so everything is kind of like nice.","offset":2804,"duration":10},{"text":"Andrej: But ultimately, at the end of the day, like when the stakes are really high, etc., if you're an employee at an organization, I don't actually know how much sway you're going to have on the organization, what it's going to do. 
Like fundamentally, at the end of the day, um, it's you're not like really in charge.","offset":2814,"duration":14},{"text":"Andrej: You're like in the room and you're contributing ideas, but you're not like really in charge of that entity that you're that you're a part of. So those are like some sources of misalignment, I think, to some extent.","offset":2828,"duration":10},{"text":"Andrej: I will say that like in one way I do agree a lot with that sentiment that um I do feel like in the like the labs for better or worse, they're opaque and a lot of work is there, and they're kind of like at the edge of capability and what's possible,","offset":2838,"duration":14},{"text":"Andrej: ...and they're working on what's coming down the line. And I think if you're outside of the frontier lab, uh your your judgment fundamentally will start to drift, because you're not part of the, you know, what's coming down the line.","offset":2852,"duration":12},{"text":"Host: Right.","offset":2864,"duration":1},{"text":"Andrej: And so I feel like my judgment will inevitably start to drift as well, and I won't actually have an understanding of how these systems actually work under the hood. It's an opaque system. Um, I won't have a good understanding of how it's going to develop, and etc.","offset":2865,"duration":12},{"text":"Andrej: And so I do think that in that sense, I agree, and something I'm nervous about. I think it's worth basically basically being in touch with what's actually happening, and actually being in the frontier lab.","offset":2877,"duration":10},{"text":"Andrej: And if if some of the frontier labs would have me come for, you know, some amount of time and do really good work for them, and then maybe come in...","offset":2887,"duration":6},{"text":"Host: Guys, he's looking for a job! 
This is super exciting.","offset":2893,"duration":2},{"text":"Andrej: ...then I think that's maybe a good setup, because I kind of feel like it kind of, um, you know, maybe that's like one way um to actually be connected to what's actually happening, but also not feel like you're necessarily fully controlled by by those entities.","offset":2895,"duration":11},{"text":"Andrej: So I think honestly, in my mind, like Noam can probably get do extremely good work at OpenAI, but also I think his most um impactful work could very well be outside of OpenAI.","offset":2906,"duration":13},{"text":"Host: Noam, that's a call to be an independent researcher with AutoResearch.","offset":2919,"duration":2},{"text":"Andrej: Yeah, there's many things to do on the outside, and it's a and I think ultimately I think the ideal solution maybe is like yeah, going back and forth, uh or um yeah, and I think fundamentally you can have really amazing impact in both places.","offset":2921,"duration":11},{"text":"Andrej: So very complicate- I don't know, like it's a very loaded question a little bit, but I mean I joined the frontier lab and now I'm outside, and then maybe in the future I'll want to join again, and I think um that's kind of like how I look at it.","offset":2932,"duration":13}],"startTime":2623},{"title":"The Trajectory of Open Source AI","summary":"The conversation turns to the evolving gap between closed frontier models and open-source alternatives. 
Andrej argues that open-source AI is functionally acting as the \"Linux\" of intelligence and is crucial for avoiding a dangerous centralization of power.","entries":[{"text":"Host: One question related to what visibility does the world or the AI ecosystem have into um the frontier is like how how close open source is to the frontier, um and how sustainable that is.","offset":2945,"duration":14},{"text":"Host: I I think it is quite surprising, the entire sequence of events actually, from like having a handful of Chinese models and global models, and I think people are going to continue releasing here in the near term that are closer than much of the industry anticipated from a capability perspective.","offset":2959,"duration":17},{"text":"Host: Um, I don't know if you're surprised by that, you're a long term contributor to open source, like what's your prediction here?","offset":2976,"duration":6},{"text":"Andrej: Yeah, so roughly speaking, basically, the um, yeah, the closed models are ahead, but like people are monitoring the number of months that sort of like open source models are behind.","offset":2982,"duration":10},{"text":"Host: And it started with there's nothing, and then it went to 18 months, and now it's like our convergence.","offset":2992,"duration":4},{"text":"Andrej: Yeah, and then a convergence, right? So um, maybe they're behind by like what is the latest? Maybe like eight months, six months, eight months kind of right now. Yeah, I'm a huge fan of open source, obviously.","offset":2996,"duration":10},{"text":"Andrej: So for example, in operating systems, you have like closed source like, you know, Windows and macOS, these are large software projects, kind of like what LLMs are going to become. And there's Linux.","offset":3006,"duration":9},{"text":"Andrej: But Linux is very easy, like actually Linux is an extremely successful project. It runs on the vast majority of computers. 
Like last time I checked, was it like 60 percent or something like run Linux?","offset":3015,"duration":10},{"text":"Andrej: Um, and that's because there is a need in industry to have a common open platform that everyone feels um sort of safe using. I would say like the industry has always felt a demand for that kind of a project to exist.","offset":3025,"duration":11},{"text":"Andrej: Um, and I think the same is true now, and that's why businesses actually want- there's demand for this kind of a um a thing to exist. The big difference is that everything is capital um there's large CapEx that goes into this. Um, so I think that's where things like fall apart a little bit, and make it a bit harder to to compete in some sense.","offset":3036,"duration":18},{"text":"Andrej: Um, I I do think that the current models are very good. The other thing that I think is like really interesting is that for the vast majority of like consumer use cases and things like that, even like current open source models are actually quite good, I would say.","offset":3054,"duration":12},{"text":"Andrej: And I think like if you go forward like more um more years, it does seem to me like a huge amount of like simple use cases are going to be well covered and actually even run locally.","offset":3066,"duration":11},{"text":"Andrej: Um, but there's going to be always like some demand for like frontier intelligence, and that that can actually be an extremely large piece of the pie. But it could be that the frontier, the need for frontier intelligence is going to be like, you know, Nobel Prize kind of work, or like let's move Linux from C to Rust is going to be like bigger projects, you know, like scoped in that kind of a way.","offset":3077,"duration":19},{"text":"Andrej: And there's going to be maybe more um and maybe that's where a lot of the frontier closed intelligences were going are going to be interacting with. 
And open source kind of like going to eat through a lot of the more basic use cases or something like that.","offset":3096,"duration":14},{"text":"Andrej: You know, at some point, what is frontier today is going to be, you know, probably later this year what's frontier today in terms of what I'm using right now from the closed labs might be open source, and that's going to be doing a lot of work.","offset":3110,"duration":11},{"text":"Andrej: So I kind of expect that this dynamic will actually basically continue. Like we'll have frontier labs that have closed um AIs that are kind of like these oracles, and then we'll have open source kind of like behind by some amount of months.","offset":3121,"duration":10},{"text":"Andrej: And I kind of expect that to uh to continue. And I actually think that's like a pretty pretty good setup um overall, um because I I'm a little bit hesitant of having um I don't actually think it's like structurally, I think there's some systemic risk attached to just having intelligences that are closed, and that's like that's it.","offset":3131,"duration":16},{"text":"Host: Mm-hmm.","offset":3147,"duration":0},{"text":"Andrej: And I think that that's a, you know, centralization has a very poor track record in my view, um in the in the past.","offset":3147,"duration":7},{"text":"Host: You mean like in political or economic systems? In general?","offset":3154,"duration":2},{"text":"Andrej: Yes. Exactly.","offset":3156,"duration":1},{"text":"Host: Spoken like an Eastern European, yes.","offset":3157,"duration":3},{"text":"Andrej: Okay, exactly. I think there's like a lot of pretty bad precedents, and so I want there to be a thing that is maybe not at the edge of capability because it's new and unexplored etc., but I want there to be a thing that's behind and that is kind of like a common working space for intelligences that the entire industry has access to. 
Yeah, that seems to me like a pretty decent power balance for the industry.","offset":3160,"duration":16},{"text":"Host: Yeah.","offset":3176,"duration":5},{"text":"Andrej: Yeah. I also think there's just like there are many problems to solve, right? Like if you keep advancing intelligence from the frontier, we can do new things and there are a lot of like very big problems for humanity, right?","offset":3181,"duration":11},{"text":"Host: Yeah.","offset":3192,"duration":1},{"text":"Andrej: And so like it seems that that will continue to be a very expensive game, and so I want to like root for labs that are doing that, because there are problems we cannot solve without continuing to advance the models in a very expensive way.","offset":3193,"duration":11},{"text":"Host: And yet, as you point out, like if what we have today as frontier is open, that's a lot of capability.","offset":3204,"duration":9},{"text":"Andrej: Yeah.","offset":3213,"duration":1},{"text":"Host: Right. And and so I think, you know, the power of that or the democratization of that seems like very useful and also healthy.","offset":3214,"duration":8},{"text":"Andrej: Yeah. I think basically by accident we're actually like in an okay spot in an optimal yeah, yeah. By accident we are happen to be in a good spot in a certain sense.","offset":3222,"duration":10},{"text":"Host: Um well, and and to some degree, the the longer this endures, like this dynamic, um the the healthier of a spot like the ecosystem might be in, right? Because you have more and more area under the curve.","offset":3232,"duration":13},{"text":"Andrej: And I will say that even on the closed side, I almost feel like it's been like even further centralizing in recently, because I think a lot of the frontrunners are like not necessarily like the top tier.","offset":3245,"duration":9},{"text":"Andrej: And so um yeah, I like in that sense I think it's um not super ideal. 
I would love there to be more more frontier labs because, yeah, I'm like by default very suspicious of like um I want there to be more people in the room, I want-","offset":3254,"duration":13},{"text":"Andrej: I think like in machine learning, ensembles always outperform any individual model, and so I want there to be ensembles of people thinking about all the hardest problems, and I want there to be ensembles of people in a room when they um to be all well-informed and to make all those decisions, you know, so...","offset":3267,"duration":17},{"text":"Andrej: I don't want it to be like a closed doors with two people or three people. I feel like that's like not a good not a good future. I almost wish like there were more labs is long story short, and I all- I do think that the open source um has a uh has a place to play. I hope it sticks around, and I basically- it's currently slightly behind and that's actually kind of a good thing.","offset":3284,"duration":15}],"startTime":2945},{"title":"Robotics and Interfacing with the Physical World","summary":"Andrej compares the timeline of AI in the physical world to digital advancements, explaining why interacting with atoms will lag behind bits. However, he notes that creating interfaces to feed physical data to models will be a massive upcoming opportunity.","entries":[{"text":"Host: Okay, you worked on the precursor to generalized robotics, autonomy, um in cars, right? Um, a a lot has happened in the last couple months with robotics companies as well, like acceleration of really impressive generalization of environment, of tasks, like increasing long horizon tasks, lots of money going into the space. Like is it going to happen? Has anything in your view changed recently?","offset":3299,"duration":29},{"text":"Andrej: Um, so like my view is kind of informed by what I saw in self-driving. And I do feel like self-driving is the first robotics application. 
So probably what I saw is at the time, like 10 years ago, there were a large number of startups, and I kind of feel like um like most of them basically didn't long-term make it.","offset":3328,"duration":17},{"text":"Andrej: Um, and what I saw is that like a lot of capital expenditure had to go in, and a lot of time. And so um I think it like I think robotics because it's so difficult and so messy and requires huge amount of capital investment and a lot of like conviction, um just it's like a big problem.","offset":3345,"duration":15},{"text":"Andrej: And I think atoms are really hard. So I kind of feel like they will lag- it will lag behind what's going to happen in the digital space. And in digital space, there's going to be a huge amount of unhobbling.","offset":3360,"duration":10},{"text":"Andrej: Basically like things that weren't super efficient becoming a lot more efficient by like a factor of a hundred because bits are so much easier. And so I think currently in terms of what's going to change and like where the activity is, I kind of feel like digital space is going to like change a huge amount, and then the physical space will lag behind.","offset":3370,"duration":17},{"text":"Andrej: And what I find very interesting is like this interface in between them as well. Because I think in this like if we do have more agents acting on behalf of humans and more agents kind of like talking to each other and and doing tasks and participating in the kind of economy of agents etc., um you're going to run out of things that you're going to do purely in the digital space.","offset":3387,"duration":20},{"text":"Andrej: At some point you have to go to the universe and you have to ask it questions. Um you have to run an experiment and see what the universe tells you to get back to learn something. 
And so we currently have a huge amount of like digital work uh because there's an overhang in how much we collectively thought about what already is digital.","offset":3407,"duration":18},{"text":"Andrej: So we just didn't have enough thinking cycles among the humans to think about all the information that is already digital and already uploaded. Um and so we're going to start running out of stuff that is actually like um already up- uploaded. Uh so you're going to at some point read all the papers and process them and have some ideas about what to try.","offset":3425,"duration":16},{"text":"Andrej: But um yeah, we're just going- I don't actually know how much you can like get intelligence that's like fully closed off and with just the information that's available to it, you know? And so I think what's going to happen is first there's going to be huge amount of unhobbling, and I think there's huge amount of work there.","offset":3441,"duration":14},{"text":"Andrej: Then actually it's going to move to like the interfaces between physical and digital. So I and that's like sensors of like seeing the world and actuators of like doing something to the world.","offset":3455,"duration":9},{"text":"Andrej: So I think a lot of interesting companies will actually come from that interface of like can we feed the superintelligence in a certain sense data, and can we actually like take data out and manipulate the physical world um per its bidding, if you want to like anthropomorphize the whole thing, right?","offset":3464,"duration":16},{"text":"Andrej: And then the physical world actually I almost feel like the total addressable market etc. 
in terms of like the amount of work and so on is is massive, possibly even much larger maybe than what can happen in digital space.","offset":3480,"duration":12},{"text":"Andrej: So I actually think it's like a much bigger opportunity as well, but um I do feel like it's huge amount of work and in my my view the atoms are just like a a million times harder. So um so it will lag behind, but it's also I think a little bit of a bigger market.","offset":3492,"duration":13},{"text":"Andrej: So it's kind of like uh yeah I think the opportunity is kind of like follow that kind of trajectory. So right now this digital is like my main interest, then interfaces would be like after that, and then maybe like some of the physical things, um like their time will come and they'll be huge uh when they do come.","offset":3505,"duration":17},{"text":"Host: Well it's an interesting framework for it too because uh certain things, not the things I'm working on right now, but certain things are much easier even in the world of atoms, right? Like if you just think about like read and write to the physical world, like read, like sensors, cameras, like there's a lot of existing hardware and you can imagine like enriching agent capabilities or capturing a lot of new data if you're just clever about it and like you don't necessarily have to invest a lot to like get something valuable.","offset":3522,"duration":28},{"text":"Andrej: Yeah.","offset":3550,"duration":1},{"text":"Host: Yeah.","offset":3551,"duration":1},{"text":"Andrej: So like examples of this that I saw for example are, you know, um a friend of mine, Liam, is run- is a CEO of Periodic. Um I visited them last week, so it's just on top of mind. Like they're trying to do AutoResearch for materials science. 
Um and so in that case it's like the sensors to the intelligence are actually like pretty expensive lab equipment.","offset":3552,"duration":16},{"text":"Andrej: And the same is true in biology, I think a lot of people are very interested in engineering biology, and, you know, the sensors will be more than just like video cameras if that makes sense. And then the other thing I I saw, for example, is companies that are trying to have um like you basically pay people for training data...","offset":3568,"duration":14},{"text":"Host: Yeah. Programmatically.","offset":3582,"duration":1},{"text":"Andrej: ...as an example to feed... Yeah, to feed- to feed the borg. Um and so like these are all examples of like sensors in a certain sense. So they take many diverse shapes and forms, if that makes sense.","offset":3583,"duration":9},{"text":"Host: Hmm. Yeah, so I'm looking forward to the point where I can ask for a task in the physical world and I can put a price on it and just tell the agent like, you know, you figure out how to do it. Go get the data.","offset":3592,"duration":12},{"text":"Andrej: I'm actually kind of surprised we don't have enough like information markets. Like for example, if Polymarket or other betting markets or even stocks etc., if they have so much autonomous activity and rising amount of activity, like uh why should- like for example, if Iran was just happening now, like how come there isn't a process where like taking a photo or video from somewhere in Tehran should cost like 10 bucks.","offset":3604,"duration":19},{"text":"Andrej: Like someone should be able to pay for that, you know? And that's an example of like feeding the intelligence. There's not going to be a human looking at it, it's going to be like agents who are trying to guess the betting games and stock markets and so on.","offset":3623,"duration":12},{"text":"Andrej: Hmm. 
So I kind of feel like the agentic vibe is still like fairly new that there's no like mechanisms for this, but this is an example of what I think might happen. There's a good um book that maybe is inspiring called Daemon. Uh-huh. You've potentially read it?","offset":3635,"duration":12},{"text":"Andrej: In Daemon the intelligence um ends up like puppeteering almost a little bit like humanity in certain sense, you know? And so humans are kind of like its actuators, but humans are also like its sensors. Um and so maybe I think like collectively like society will kind of like reshape in a certain way in uh to serve that kind of a uh that will kind of like end up happening collectively across the industry where, yeah, there's just a lot more automation and it has certain needs and kind of humans will be serving those needs of that of that machine, not necessarily like to each other.","offset":3647,"duration":24},{"text":"Host: Well we were um on this very specific point of uh like missing pieces of training data, we needed um we needed something like auto research, right? Like we need the training cycle or the SFT piece to be um far more mechanized.","offset":3671,"duration":14},{"text":"Andrej: Uh-huh. For for which part?","offset":3685,"duration":2},{"text":"Host: In order to make the um collection, like to in order to take the human out of the loop to ask for a task that is just like improve my model quality with new data, right?","offset":3687,"duration":9},{"text":"Andrej: Um, yes.","offset":3696,"duration":1},{"text":"Host: Does that make sense to you? Like we um, if you can't have the model do the training runs by itself, then your ability to do this as a like closed-loop task with uh by pricing data is um more challenged.","offset":3697,"duration":14},{"text":"Andrej: Uh, yes, 100%. Yeah. 
But the thing is-","offset":3711,"duration":3},{"text":"Host: But now we go.","offset":3714,"duration":2},{"text":"Andrej: The thing is for LLM training, it actually is like very easily it like really fits the paradigm. Um, so you'd actually expect-","offset":3716,"duration":7},{"text":"Host: Yeah, clean metric.","offset":3723,"duration":1},{"text":"Andrej: Yeah, like LLM training actually fits the paradigm really well, really easily. Like all the optimization of all the code and so it runs faster, and then you also have like metrics that you can optimize against.","offset":3724,"duration":9},{"text":"Andrej: I do think that if you had an autonomous loop over those metrics, there's going to be a lot of like Goodharting going on where the system will like overfit to those metrics. And so but then you can use the system to devise more metrics and you just have really good coverage. So it's kind of hard to tell but um in a certain sense it's like a pretty pretty good fit.","offset":3733,"duration":17}],"startTime":3299},{"title":"MicroGPT and the Future of AI Education","summary":"Andrej discusses MicroGPT, a project distilling neural network training into 200 lines of code. He explains how AI agents are fundamentally reshaping education by serving as personalized, infinitely patient teachers, shifting the focus from human-readable documentation to agent-readable instructions.","entries":[{"text":"Host: I want to talk about a little um tiny side project you have before we end. Um tell me about the MicroGPT effort.","offset":3750,"duration":8},{"text":"Andrej: Oh yeah. Okay, so MicroGPT. So, I have this like running obsession of like maybe a decade or two of just like simplifying and boiling down the basically LLMs uh to like their bare essence. 
And I've had a number of projects along these lines, so like nanoGPT and um makemore and uh micro- micrograd etc.","offset":3758,"duration":20},{"text":"Andrej: So I feel like MicroGPT is now the state of the art of me trying to like just boil it down to the essence. Because the thing is like training neural nets and LLMs specifically um is huge amount of code, but all of that code is actually complexity from efficiency.","offset":3778,"duration":13},{"text":"Host: Hmm.","offset":3791,"duration":1},{"text":"Andrej: It's just because you need it to go fast. If you don't need it to go fast and you just care about the algorithm, then that algorithm actually is 200 lines of Python. Very simple to read. And this includes comments and everything.","offset":3792,"duration":10},{"text":"Andrej: Um, because you just have like uh your data set which is a text, um and you need your neural network architecture which is like 50 lines, you need to do your forward pass, and then you have to do your backward pass to calculate the gradients.","offset":3802,"duration":9},{"text":"Andrej: And so an autograd engine uh to calculate the gradients like 100 lines, and then you need an optimizer, an Adam, for example, uh which is like again 10 lines, really. 
And so putting everything together in a training loop is like, yeah, 200 lines.","offset":3811,"duration":12},{"text":"Andrej: And what was interesting to me like normally before like maybe a year ago or more, if I had come up with MicroGPT, I would be tempted to basically explain to people like have a video like stepping through it or something like that, um and I actually tried to make that video a little bit, and I tried to make like a little guide to it and so on.","offset":3823,"duration":16},{"text":"Andrej: But I kind of realized that this is is not really is not really adding too much because people because it's already so simple that it's 200 lines, that anyone could ask their agent to explain it in various ways and the agents- like I'm not explaining to people anymore, I'm explaining it to agents.","offset":3839,"duration":16},{"text":"Andrej: If you can explain it to agents, then agents can be the router and they can actually target it to the human in their language with infinite, um you know, patience and uh just at their capability and so on.","offset":3855,"duration":11},{"text":"Host: Right, if I don't understand um this particular function, I can ask the agent to explain it to me like three different ways, and I'm not going to get that from you.","offset":3866,"duration":7},{"text":"Andrej: Yeah. Exactly. Yeah. And so I kind of feel like, you know, what is education? Like it used to be guides, it used to be lectures, it used to be this thing, but now I feel like now more I'm explaining things to agents.","offset":3873,"duration":8},{"text":"Andrej: And maybe I'm coming up with skills um where like um so basically skill is just a way to instruct the agent how to teach the thing. 
So maybe I could have a skill for MicroGPT of the progression I imagine the agent should take you through if you're interested in understanding the codebase.","offset":3881,"duration":13},{"text":"Andrej: And it's just like hints to the model to like oh first start off with this and then with that, and so I could just script the curriculum a little bit as a skill. Um so so I don't feel like um yeah I feel like there's going to be less of like explaining things directly to people and it's going to be more of just like does the agent get it?","offset":3894,"duration":16},{"text":"Andrej: And if the agent gets it they'll do the explanation. And we're not fully there yet because they I still can I still think I can probably explain things a little bit better than the agents, but I still feel like the models are improving so rapidly that um I feel like it's a losing battle to some to some extent.","offset":3910,"duration":15},{"text":"Andrej: Um and so I think education is going to be kind of like reshuffled by this uh quite substantially um where it's the end of like teaching each other things almost a little bit. Like if I have a uh library, for example, of code or something like that, it used to be that you have documentation for other people who are going to use your library.","offset":3925,"duration":16},{"text":"Andrej: But like you shouldn't do that anymore. Like you should have instead of HTML documents for humans, you have markdown documents for agents, because if agents get it, then they can just explain all the different parts of it.","offset":3941,"duration":9},{"text":"Andrej: So it's this redirection through agents, you know? Um and that's like why so I think we're going to see a lot more of that playing out.","offset":3950,"duration":10},{"text":"Host: Well we'll see if the great teachers know like to develop intuition for how to explain things to agents differently.","offset":3960,"duration":5},{"text":"Andrej: Oh yeah. 
Ultimately, so for example, MicroGPT, like I asked I tried to get an agent to write MicroGPT. So I told it like try to boil down the simplest things, like try to boil down micro- neural network training to the simplest thing and it can't do it.","offset":3965,"duration":14},{"text":"Andrej: Like MicroGPT is like my is it's like my end of my obsession. It's the 200 lines. I thought about this for a long time. I've obsessed about this for a long time. This is this is the solution. Trust me it can't get simpler.","offset":3979,"duration":10},{"text":"Andrej: And this is this is my value add. Everything else like agent gets it. It just can't come up with it but it totally gets it and understands why it's done in a certain way, etc. So like my contribution is kind of like these few bits, but everything else in terms of like the education that goes on after that is like not my domain anymore.","offset":3989,"duration":17},{"text":"Andrej: So maybe yeah it's like education kind of changes in those ways where you kind of have to infuse the few bits that you feel strongly about the curriculum or the the better way of explaining it or something like that. The things that agents can't do is your job now.","offset":4006,"duration":13},{"text":"Host: Hmm.","offset":4019,"duration":1},{"text":"Andrej: The things that agents can do they can probably do better than you or like very soon. And so you should be strategic about what you're actually spending time on.","offset":4020,"duration":7}],"startTime":3750},{"title":"Podcast Outro","summary":"The host thanks Andrej for his time and provides sign-off details, social media handles, and website links for the podcast.","entries":[{"text":"Host: Well we appreciate the few bits. Thank you Andrej.","offset":4027,"duration":4},{"text":"Andrej: Okay.","offset":4031,"duration":1},{"text":"Host: Find us on Twitter at NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. 
Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.","offset":4032,"duration":15}],"startTime":4027}],"entries":[{"text":"Andrej Karpathy: Code's not even the right verb anymore, right? But I have to, um, express my will to my agents for 16 hours a day. Manifest.","offset":0,"duration":10},{"text":"Andrej Karpathy: How can I have not just a single session of, you know, Claude code or Codex or some of these agent harnesses, how can I have more of them? How can I do that appropriately?","offset":10,"duration":4},{"text":"Andrej Karpathy: The agent part is now taken for granted. Now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions.","offset":14,"duration":9},{"text":"Andrej Karpathy: But there—I mean, this is what gets to the psychosis, is that this is like infinite, and everything is skill issue.","offset":23,"duration":7},{"text":"Host: Hi listeners, welcome back to No Priors. Today I'm here with Andrej Karpathy, and we have a wide-ranging conversation for you about code agents, the future of engineering, and AI research.","offset":30,"duration":11},{"text":"Host: How more people can contribute to research, what's happening in robotics, his prediction for how agents can reach out into the real world, and education in this next age. 
Welcome, Andrej.","offset":41,"duration":13},{"text":"Host: Andrej, thanks for doing this.","offset":54,"duration":1},{"text":"Andrej Karpathy: Yeah, thank you for having me.","offset":55,"duration":2},{"text":"Host: Uh, so it's been a very exciting couple of months in AI.","offset":57,"duration":2},{"text":"Andrej Karpathy: Oh yeah, you could say that.","offset":59,"duration":2},{"text":"Host: I remember, um, walking into the office at some point and you were like, really locked in, and I was asking what you were up to and you're like, \"I just—I have to code for 16 hours a day,\" or code's not even the right verb anymore, right?","offset":61,"duration":12},{"text":"Host: But I have to, um, express my will to my agents for 16 hours a day. Manifest. Um, because like there's been a jump in capability. Uh, what's happening? Tell me about your experience.","offset":73,"duration":12},{"text":"Andrej Karpathy: Yeah, I kind of feel like I was just in this perpetual—I still am often—in this state of AI psychosis just like all the time, uh, because there was a huge unlock in what you can achieve as a person, as an individual, right?","offset":85,"duration":14},{"text":"Andrej Karpathy: Because you were bottlenecked by, you know, your typing speed and so on. But now with these agents, it really—I would say in December is when it really just—something flipped, where I kind of went from 80/20 of like, you know, uh, to like 20/80 of writing code by myself versus just delegating to agents.","offset":99,"duration":15},{"text":"Andrej Karpathy: And I don't even think it's 20/80 by now, I think it's a lot more than that. I don't think I've typed like a line of code probably since December basically. 
Uh, which is like an extremely large, uh, change.","offset":114,"duration":14},{"text":"Andrej Karpathy: Um, I was talking to it like for example, I was talking about it to for example my parents and so on, and I don't think like a normal person actually realizes that this happened or how dramatic it was.","offset":128,"duration":10},{"text":"Andrej Karpathy: Like literally like if you just find a random software engineer or something like that at their—at their desk and what they're doing, like their default workflow of, you know, building software is completely different as of basically December.","offset":138,"duration":14},{"text":"Andrej Karpathy: Uh, so I'm just like in this state of psychosis of trying to figure out like what's possible, uh, trying to push it to the limit. How is—how can I have not just a single session of, you know, um, Claude code or Codex or some of these agent harnesses?","offset":152,"duration":12},{"text":"Andrej Karpathy: How can I have more of them? How can I do that appropriately? And then how can I use these claws? What are these claws? Uh, and uh, so there's like a lot of new things. I want to be at the forefront of it, you know?","offset":164,"duration":10},{"text":"Andrej Karpathy: And I'm very antsy that I'm not at the forefront of it. And I see lots of people on Twitter doing all kinds of things and they all sound like really good ideas, and I need to be at the forefront or I feel extremely nervous.","offset":174,"duration":6},{"text":"Andrej Karpathy: And so I guess I'm just in this psychosis of like what's possible, like because it's unexplored fundamentally.","offset":180,"duration":2},{"text":"Host: Well, if you're nervous, the rest of us are—are nervous. 
We have a—uh, we have a team that we work with at Conviction that their setup is everybody is like, you know, none of the engineers write code by hand and they're all microphoned and they just like whisper to their agents all the time.","offset":182,"duration":17},{"text":"Host: It's the strangest work setting ever, uh, and I thought they were crazy and now I like fully accept I was like, oh this was the way. Like you're just ahead of it. Um, what—uh, how do you think about your own capacity now to like explore or to do new projects? Like what—what is it limited by?","offset":199,"duration":18},{"text":"Andrej Karpathy: Yeah, what is it limited by? Uh, just I think everything, like so many things, even if they don't work, I think to a large extent you feel like it's skill issue. It's not that the capability's not there.","offset":217,"duration":12},{"text":"Andrej Karpathy: It's that you just haven't found a way to string it together of what's available. Like I just didn't give good enough instructions in the AGENTS.md file or whatever it may be. I don't have a nice enough memory tool that I put in there or something like that.","offset":229,"duration":13},{"text":"Andrej Karpathy: So it all kind of feels like skill issue when it doesn't work to some extent. You want to see how you can parallelize them etc. and you want to be Peter Steinberger basically. Uh, so Peter is famous, he has a funny photo where he's in front of a monitor with lots of like—uh, he uses Codex.","offset":242,"duration":13},{"text":"Andrej Karpathy: So lots of Codex agents tiling the—the monitor, and they all take about 20 minutes if you prompt them correctly and you use the high effort, and so they all take about 20 minutes. So you have multiple, you know, 10, uh, repos checked out, and so he's just, um, going between them and giving them work.","offset":255,"duration":15},{"text":"Andrej Karpathy: It's just like you can—you can move in much larger macro actions. 
It's not just like here's a line of code, here's a new function, it's like here's a new functionality and delegate it to agent one.","offset":270,"duration":10},{"text":"Andrej Karpathy: Here's a new functionality that's not going to interfere with the other one, give it to agent two, and then try to, uh, review their work as best as you can depending on how much you care about that code.","offset":280,"duration":8},{"text":"Andrej Karpathy: Like what are these macro actions that I can like manipulate my software repository by? And like another agent is doing some like research, another agent is writing code, another one is coming up with a plan for some new implementation.","offset":288,"duration":13},{"text":"Andrej Karpathy: And so everything is just like happening in these like macro actions over your repository. Um, and you're just trying to become like really good at it and develop like a muscle memory for it is extremely, um—yeah, it's very rewarding number one because it actually works.","offset":301,"duration":14},{"text":"Andrej Karpathy: But it's also kind of like the new thing to learn. So that's why hence the psychosis.","offset":315,"duration":4},{"text":"Host: Yeah, I—I do feel like my instinct is like whenever I am waiting for an agent to complete something, the obvious thing to do is like well I can do more work, right?","offset":319,"duration":8},{"text":"Host: Like if I have access to more tokens then like I should just parallelize and tasks. 
And so that—that's very stressful because if you don't feel very bounded by your ability to spend on tokens, then you know you are the bottleneck in this system that is max capability.","offset":327,"duration":15},{"text":"Andrej Karpathy: Yeah, if you're not maximizing your subscription at least, and uh, so ideally for multiple agents, like if you run out of the quota on Codex you should switch to Claude or whatnot, I don't know, like that's what I've been trying to do a little bit.","offset":342,"duration":9},{"text":"Andrej Karpathy: And I feel nervous when I have subscription left over, uh, that just means I haven't maximized my token throughput. So I actually kind of experienced this when I was a PhD student, you would feel nervous when your GPUs are not running.","offset":351,"duration":8},{"text":"Andrej Karpathy: Like you have GPU capability and you're not maximized to the available flops to you. But now it's not about flops, it's about tokens. Uh, so what is your token throughput and what token throughput do you command?","offset":359,"duration":10},{"text":"Host: I would actually argue that it's very interesting that we had, you know, at least 10 years where in many engineering tasks people just didn't—they didn't feel compute bound, right?","offset":369,"duration":12},{"text":"Host: Um, and like the entire industry feels that now. They feel like—they felt resource bound. Uh, and now that you have this big capability jump, you're like, oh actually it's not, you know, my ability to access the compute anymore, like I—I'm the binding constraint.","offset":381,"duration":12},{"text":"Andrej Karpathy: Yeah, it's a skill issue, which is very empowering cause, uh, yeah cause you could be getting better. So that's why—that's why I think it's very addictive because there's unlocks when—when you get better.","offset":393,"duration":7},{"text":"Host: Where do you think it goes? 
Like if you just think about like okay, you know, Andrej's iterating and everybody else's for 16 hours a day getting better at using coding agents, like what does it look like in a year of like you've reached mastery?","offset":400,"duration":12},{"text":"Andrej Karpathy: Yeah, what does mastery look like, right? At the end of the year or like two, three, years, five years, ten years etc. Well I think everyone is basically interested in like going up the stack.","offset":412,"duration":9},{"text":"Andrej Karpathy: So I would say, yeah, it's not about a single session with your agent. Um, multiple agents, how do they collaborate and teams and so on. So everyone's trying to figure out what that looks like.","offset":421,"duration":8},{"text":"Andrej Karpathy: And then I would say Claw is also kind of an interesting direction because it really—when I say a Claw, I mean this like layer that uh kind of takes persistence to a whole new level.","offset":429,"duration":9},{"text":"Andrej Karpathy: Like it's something that like keeps looping, is—is um, it's not something that you are interactively in the middle of. It kind of like has its own little sandbox, its own little, you know, it kind of like does stuff on your behalf even if you're not looking kind of thing.","offset":438,"duration":10},{"text":"Andrej Karpathy: Um, and then also has like maybe more sophisticated memory systems etc. that are not yet implemented in agents. 
So, uh, OpenClaw has a lot more sophisticated memory I would say than what you would get by default, uh, which is just a memory compaction when your context runs out, right?","offset":448,"duration":13},{"text":"Host: You think that's the piece that resonated for more users versus like perhaps like broader tool access?","offset":461,"duration":5},{"text":"Andrej Karpathy: For OpenClaw?","offset":466,"duration":1},{"text":"Host: Yeah.","offset":467,"duration":1},{"text":"Andrej Karpathy: Uh, there's—like I think there's at least five things that resonated with users. Yeah, good job, Peter. I mean Peter has done a really amazing job.","offset":468,"duration":6},{"text":"Andrej Karpathy: Um, I saw him recently, uh, and I talked to him about it and I—he's very humble about it, but I think he innovated simultaneously in like five different ways and put it all together. Uh, so for example like the SOUL.md document, like he actually really crafted a personality that is kind of compelling and interesting.","offset":474,"duration":14},{"text":"Andrej Karpathy: And I feel like a lot of the current agents they don't get this correctly. I actually think Claude has a pretty good personality, it feels like a teammate. Uh, and uh, it's excited with you etc.","offset":488,"duration":10},{"text":"Andrej Karpathy: I would say, for example, Codex is a lot more dry, um, which is kind of interesting because in ChatGPT, Codex is like a lot more upbeat and highly sycophantic. But I would say Codex the coding agent is very dry.","offset":498,"duration":11},{"text":"Andrej Karpathy: It doesn't—it doesn't seem to care about what you're creating. 
It's kind of like, \"Oh, I implemented it.\" It's like, okay, but do you understand what we're building?","offset":509,"duration":6},{"text":"Host: It's true.","offset":515,"duration":1},{"text":"Andrej Karpathy: You know, it doesn't—um—and the other thing I would say is for example with Claude, I think they dialed the sycophancy fairly well, where when Claude gives me praise, I do feel like I slightly deserve it.","offset":516,"duration":9},{"text":"Andrej Karpathy: Um, because sometimes I kind of give it like not very well-formed thoughts, and uh I give it an idea that I don't think is fully baked and it doesn't actually react very strongly. It's like, \"Oh yeah, we can implement that.\"","offset":525,"duration":9},{"text":"Andrej Karpathy: But when it's a really good idea by my own account, it does uh seem to reward it a bit more, and so I kind of feel like I'm trying to like earn its praise, which is really weird.","offset":534,"duration":8},{"text":"Andrej Karpathy: And so I do think the personality matters a lot. Uh, and I think uh a lot of the other, uh, tools maybe don't appreciate it as much, and I think in this aspect also Peter really cares about this and so that was correct.","offset":542,"duration":7},{"text":"Andrej Karpathy: And then the memory system and then uh just you know he's just having fun with this. Um, and then the the single WhatsApp portal to all of the automation.","offset":549,"duration":9},{"text":"Host: Yeah. Is there something that you have done personally with your Claws beyond software engineering that you think is fun or interesting?","offset":558,"duration":9},{"text":"Andrej Karpathy: Yeah, so in January I had Claude—I went through a period of Claude psychosis, so I built a—um, I have a Claw basically that takes care of my home, and I call him Dobby the Elf Claw. 
Um, and uh basically I used uh the agents to find all of the smart home subsystems of my home on the local area network, which I was kind of surprised that worked out of the box.","offset":567,"duration":20},{"text":"Andrej Karpathy: Like I just told it that I think I have Sonos at home, like can you try to find it? And it goes and it did like IP scan of all of the um, basically um computers on the local area network, and it found the Sonos thing, uh the Sonos uh system.","offset":587,"duration":12},{"text":"Andrej Karpathy: And it turned out that there's no password protection or anything like that, it just logged in and it's like, \"Oh yeah, you have these Sonos systems installed. Uh, let me try to reverse engineer how it's uh working.\"","offset":599,"duration":10},{"text":"Andrej Karpathy: It does some web searches and it finds like, \"Okay, these are the API endpoints.\" And then it's like, \"Do you want to try it?\" and I'm like, \"Whoa, like you just did that.\" And I'm like, \"Yeah, can you try to play something in the study?\"","offset":609,"duration":9},{"text":"Andrej Karpathy: And uh it does, and music comes out. And I'm like, I can't believe I just—","offset":618,"duration":1},{"text":"Host: That's crazy, that's like three prompts.","offset":619,"duration":2},{"text":"Andrej Karpathy: I can't believe I just typed in like, \"Can you find my Sonos?\" and that suddenly it's playing music. And it did the same for lights. And so basically like it kind of hacked in, figured out the whole thing, uh created APIs, created a dashboard, so I could see the command kind of center of like all of my lights in the home.","offset":621,"duration":12},{"text":"Andrej Karpathy: And then it was like switching lights on and off and, you know, uh so I can ask it like Dobby at sleepy time, and when it's sleepy time that just means all the lights go off etc. 
and so on.","offset":633,"duration":11},{"text":"Andrej Karpathy: So it controls all of my lights, my HVAC, my shades, uh the pool and uh the spa, and also my security system. So I have a camera pointed outside of the house and anytime someone rolls in, I have a Qwen—uh, a Qwen uh model that looks at the videos.","offset":644,"duration":13},{"text":"Andrej Karpathy: So first of all there's change detection.","offset":657,"duration":2},{"text":"Host: Right.","offset":659,"duration":1},{"text":"Andrej Karpathy: And then based on change detection, it goes to Qwen, and then it actually like tells me, uh it sends me a text to my WhatsApp, it shows an image from the outside, and it says, \"Hey FedEx truck just pulled up, FedEx truck just pulled up and you might want to check it and you got new mail or something like that.\"","offset":660,"duration":15},{"text":"Andrej Karpathy: And Dobby just texts me this. This is extremely incredible. Um, so so Dobby is in charge of the house, I text through with it through WhatsApp. Uh, and it's been like really fun to have these macro actions that maintain my house.","offset":675,"duration":14},{"text":"Andrej Karpathy: I haven't like really pushed it uh like way more beyond that, and I think people are doing a lot more crazy things with it, but for me even just the home automation setup, I used to use like six apps, uh completely different apps, and I don't have to use these apps anymore.","offset":689,"duration":13},{"text":"Andrej Karpathy: Like Dobby controls everything in natural language, it's amazing. 
Um, and so I think like I haven't even pushed the paradigm fully, but already that is so helpful and so inspiring I would say.","offset":702,"duration":9},{"text":"Host: Do you think that's indicative of like what people want from a user experience perspective with software?","offset":711,"duration":6},{"text":"Host: Right, because I—I don't think, you know, it's pretty ignored that it takes humans effort to like learn new software, like new UI.","offset":717,"duration":8},{"text":"Andrej Karpathy: Yeah, I think uh to some extent that's right. It's like working backwards from how people think an AI should be, because what people have in their mind of like what an AI is, is not actually what an LLM is by—like in the raw sense.","offset":725,"duration":10},{"text":"Andrej Karpathy: Like LLM is a token generator, you know, like more tokens come out. But what they think of is like this pers—this persona identity that they can tell stuff and it remembers it, you know, and uh is just an entity behind a WhatsApp. It's like a lot more understandable.","offset":735,"duration":13},{"text":"Andrej Karpathy: Um, so I think to some extent it's like matching the expectations that humans already have for what an AI should behave, but under the hood it's like a lot of technical details go into that. 
And LLMs are too raw of a primitive, uh to actually, uh type check as AI I think for most people if that makes sense.","offset":748,"duration":13},{"text":"Host: Yeah, I—I think that's like how we understand what the AI is, and like the description of it as Dobby or some persona obviously resonates with people.","offset":761,"duration":10},{"text":"Host: Um, I also think that it—uh the unification that you did across your six different software systems for your home automation speaks to a different question of like do people really want all of this software that we have today?","offset":771,"duration":11},{"text":"Andrej Karpathy: Yeah.","offset":782,"duration":1},{"text":"Host: Um, because I would argue like well you have the hardware, but you've now thrown away the software or the the UX layer of it. Um, do you think that's what people want?","offset":783,"duration":11},{"text":"Andrej Karpathy: Yeah, I think there's this like there's this sense that these apps that are on the app store for using these smart home devices etc., uh these shouldn't even exist kind of in a certain sense.","offset":794,"duration":9},{"text":"Andrej Karpathy: Like shouldn't it just be APIs and shouldn't agents be just using it directly? And uh wouldn't it like—I can do all kinds of home automation stuff that uh any individual app will not be able to do, right?","offset":803,"duration":10},{"text":"Andrej Karpathy: Um, and an LLM can actually drive the tools and call all the right tools and do do pretty complicated things. Um, and so in a certain sense it does point to this like maybe there's like an overproduction of lots of custom bespoke apps that shouldn't exist because agents kind of like crumble them up.","offset":813,"duration":13},{"text":"Andrej Karpathy: And everything should be a lot more just like exposed API endpoints, and agents are the glue of the intelligence that actually like tool calls all the all the parts. 
Um, another example is like my treadmill.","offset":826,"duration":12},{"text":"Andrej Karpathy: Uh, there's an app for my treadmill, and I wanted to like keep track of how often I do my cardio, uh but like I don't want to like log into a web UI and go through a flow and etc. Like all this should just be like make APIs available.","offset":838,"duration":12},{"text":"Andrej Karpathy: And this is kind of, you know, going towards the agentic, uh sort of web or like agent first, uh tools and all this kind of stuff. So I think the industry just has to reconfigure in so many ways that it's like the customer is not the human anymore.","offset":850,"duration":12},{"text":"Andrej Karpathy: It's like agents who are acting on behalf of humans, and this refactoring will be will probably be substantial in a certain sense.","offset":862,"duration":6},{"text":"Andrej Karpathy: One way that people sometimes push back on this is like do people, do we expect people to vibe code some of these tools?","offset":868,"duration":6},{"text":"Andrej Karpathy: Do we expect normal people to do this kind of stuff that I described? But I think to some extent this is just, you know, technology as it exists today, and right now there is some vibe coding and I'm actually watching it and I'm working with the system.","offset":874,"duration":14},{"text":"Andrej Karpathy: But I kind of feel like the kind of stuff that I just talked about, this should be free, like in a year or two or three. There's no vibe coding involved. This is trivial, this is table stakes. This is like any AI, even the open source models etc. can like do this.","offset":888,"duration":11},{"text":"Host: You should be able to translate from a less technical human's intent very easily to this outcome.","offset":899,"duration":6},{"text":"Andrej Karpathy: Extremely easily. Yeah. 
Today it's vibe coding and some involved and not many people are going to do it, but—","offset":905,"duration":3},{"text":"Host: And you still have to make some design decisions, right? We were talking about like what you take frames, for example.","offset":908,"duration":5},{"text":"Andrej Karpathy: Yeah. But I kind of feel like this will just uh start to the barrier will just come down and it's just ephemeral software on your behalf and some kind of like Claw is handling all the details for you, but you're not involved. Claw has a Claw has a machine and it will figure it out, and it's just presenting you UIs and you're like saying stuff, you know.","offset":913,"duration":14},{"text":"Host: Why haven't you uh I guess like pushed the boundaries of what you can do personally with Claws like is it, you know, you're focusing on more important projects? AutoResearch etc. or uh you're climbing the hill to mastery or something else, right?","offset":927,"duration":15},{"text":"Andrej Karpathy: Yeah, I just feel like I'm so distracted by everything. So I spent I spent like a week on the Claw stuff and I I have more to-dos almost, um but I will say that—","offset":942,"duration":11},{"text":"Host: It's like Jensen told us, we're all just busier, unfortunately.","offset":953,"duration":2},{"text":"Andrej Karpathy: Yeah. Uh, I didn't really take advantage of a lot of like email and calendar and all this other stuff, and I didn't give it access because I'm still a little bit like suspicious and it's still very new and rough around the edges, so I didn't want to give it like full access to my digital life yet.","offset":955,"duration":14},{"text":"Andrej Karpathy: And part of it is just security privacy and uh just being very cautious in that in that realm. And uh so some of it is like held back by that I would say. 
Yeah maybe that's like the dominant dominant feature, but some of it is also just I feel so distracted because I feel like I had a week of Claw and then other stuff is happening and—","offset":969,"duration":13},{"text":"Host: What was the um—I mean, you have talked about like being able to train or at least optimize a uh a model as a task that you want to see agents do for a long time. Like what was the motivation behind AutoResearch?","offset":982,"duration":13},{"text":"Andrej Karpathy: AutoResearch, yeah. So I think like I had a tweet earlier where I kind of like said something along the lines of to get the most out of the tools that are becoming available now, you have to remove yourself as the as the bottleneck.","offset":995,"duration":12},{"text":"Andrej Karpathy: You can't be there to prompt the next thing. You're you need to take yourself outside um. You have to arrange things such that they are completely autonomous and the more you know, how can you maximize your token throughput and not be in the loop?","offset":1007,"duration":12},{"text":"Andrej Karpathy: This is the this is the goal. And so I kind of mentioned that the the name of the game now is to increase your leverage. Uh, I put in just very few tokens just once in a while and a huge amount of stuff happens on my behalf.","offset":1019,"duration":9},{"text":"Andrej Karpathy: And so AutoResearch, like I tweeted that and I think people liked it and whatnot, but—","offset":1028,"duration":4},{"text":"Host: They haven't like maybe worked through like the implications of that.","offset":1032,"duration":3},{"text":"Andrej Karpathy: And for me AutoResearch is an example of like an implication of that where it's like I don't want to be like the researcher in the loop like looking at results etc. like I'm I'm holding the system back.","offset":1035,"duration":8},{"text":"Andrej Karpathy: So the question is how do I refactor all the abstractions so that I'm not—I have to arrange it once and hit go. 
The name of the game is how can you get more agents running for longer periods of time without your involvement doing stuff on your behalf?","offset":1043,"duration":12},{"text":"Andrej Karpathy: And AutoResearch is just, yeah, here's an objective, here's a metric, here's your boundaries of what you can and cannot do and go. And uh yeah, it worked.","offset":1055,"duration":7},{"text":"Host: You were surprised at its effectiveness.","offset":1062,"duration":3},{"text":"Andrej Karpathy: Yeah, I I didn't expect uh it to work because so I have the project Nanochat, um and fundamentally like I think a lot of people are very confused with my obsession for like training GPT-2 models and so on, but for me uh training GPT models and so on is just a little harness, a little playground for training LLMs.","offset":1065,"duration":15},{"text":"Andrej Karpathy: And fundamentally what I'm more interested in is like this idea of recursive self-improvement and to what extent can you actually have LLMs improving LLMs? Because I think all the frontier labs this is like the thing, um for obvious reasons.","offset":1080,"duration":9},{"text":"Andrej Karpathy: And they're all trying to recursively self-improve roughly speaking. And so for me this is kind of like uh a little playpen of that. Um, and I guess I like tuned Nanochat already quite a bit by hand in a good old-fashioned way that I'm used to.","offset":1089,"duration":13},{"text":"Andrej Karpathy: Like I'm a researcher, I've done this for like you know two decades, I have some amount of like—","offset":1102,"duration":3},{"text":"Host: What is the opposite of hubris? Uh, yeah.","offset":1105,"duration":2},{"text":"Host: Earned confidence.","offset":1107,"duration":1},{"text":"Andrej Karpathy: Okay. 
Of like two decades of like oh I've trained this model like thousands of times of like um so I've done a bunch of experiments, I've done hyperparameter tuning, I've done all the things I'm very used to and I've done for two decades.","offset":1108,"duration":13},{"text":"Andrej Karpathy: Yeah. And I've gotten to a certain point and I thought it was like fairly well-tuned, and then I let AutoResearch go for like overnight and it came back with like tunings that I didn't see. And yeah I did forget like the weight decay on the value embeddings, and my Adam betas were not sufficiently tuned, and these things jointly interact.","offset":1121,"duration":18},{"text":"Andrej Karpathy: So like once you tune one thing the other things have to potentially change too. You know, I shouldn't be a bottleneck, I shouldn't be running these hyperparameter search optimizations, I shouldn't be looking at the results.","offset":1139,"duration":8},{"text":"Andrej Karpathy: There's objective criteria in this case. Uh, so you just let you just have to arrange it so that it can just go forever. So that's a single sort of version of AutoResearch of like a single loop trying to improve.","offset":1147,"duration":8},{"text":"Andrej Karpathy: And I was surprised that it um it found these things that I you know the repo is already fairly well-tuned and still found something. 
And that's just a single it's a single loop.","offset":1155,"duration":6},{"text":"Andrej Karpathy: Like these frontier labs they have GPU clusters of tens of thousands of them, and so it's very easy to imagine how you would basically get a lot of this automation on um smaller models, and fundamentally everything around like frontier-level intelligence is about extrapolation and scaling laws.","offset":1161,"duration":18},{"text":"Andrej Karpathy: And so you basically do a ton of the exploration on the smaller models and then you try to uh extrapolate out.","offset":1179,"duration":4},{"text":"Host: So you're saying our research efforts are going to get more efficient, like we're going to have better direction for when we scale as well if we can do this experimentation better.","offset":1183,"duration":8},{"text":"Andrej Karpathy: Yeah, I would say that like the most interesting project and probably what the frontier labs are working on is um you know you experiment on small models, you try to make it as autonomous as possible, remove researchers—","offset":1191,"duration":8},{"text":"Host: From the loop.","offset":1199,"duration":2},{"text":"Andrej Karpathy: —they have way too much—","offset":1201,"duration":2},{"text":"Host: What is the opposite of—which—","offset":1203,"duration":1},{"text":"Host: Earned confidence.","offset":1204,"duration":1},{"text":"Andrej Karpathy: Yeah, they don't know. They shouldn't be touching any of this really. And so you have to like rewrite the whole thing because right now I mean certainly they can contribute ideas.","offset":1205,"duration":7},{"text":"Andrej Karpathy: But okay uh they shouldn't actually be enacting these ideas. 
There's a queue of ideas, and there's maybe an automated scientist that comes up with ideas based on all the arXiv papers and GitHub repos and it funnels ideas in, or researchers can contribute ideas.","offset":1212,"duration":13},{"text":"Andrej Karpathy: But it's a single queue and there are workers that pull uh items and they try them out, and uh whatever works just gets uh sort of put on the feature branch and maybe some people like uh monitor the feature branch and merge to the main branch sometimes.","offset":1225,"duration":14},{"text":"Andrej Karpathy: So yeah just removing humans uh from all the processes and automating as much as possible and getting to high tokens-per-second throughput and it does require rethinking of all the abstractions uh and uh everything has to be reshuffled so yeah I think it's very exciting.","offset":1239,"duration":15},{"text":"Host: Take one more recursive step here. Um, uh when is the model going to write a better Program MD than you?","offset":1254,"duration":5},{"text":"Andrej Karpathy: Yeah.","offset":1259,"duration":1},{"text":"Host: We're not in the loop.","offset":1260,"duration":1},{"text":"Andrej Karpathy: Yeah, exactly. Uh, so Program MD is my crappy attempt at describing like how the AutoResearcher should work. Like oh do this, then do that and that and then try these kinds of ideas.","offset":1261,"duration":10},{"text":"Andrej Karpathy: Like here's maybe some ideas like look at architecture, look at optimizer etc. but I just came up with this in Markdown, right? Um, and uh so yeah exactly.","offset":1271,"duration":9},{"text":"Andrej Karpathy: You want some kind of an AutoResearch loop maybe that looks for—you can imagine that different Program MDs would um would give you different uh progress. 
So you basically every research organization is described by Program MD.","offset":1280,"duration":15},{"text":"Host: Yeah.","offset":1295,"duration":1},{"text":"Andrej Karpathy: A research organization is a set of Markdown files that describe all the roles and how the whole thing connects. Um, and you can imagine having a better research organization.","offset":1296,"duration":10},{"text":"Andrej Karpathy: So maybe they do fewer stand-ups in the morning because they're useless. Uh, and this is all just code, right? Um, and so you can so one organization can have fewer stand-ups, one organization can have more, uh one organization can be very risk-taking, one organization can be less.","offset":1306,"duration":13},{"text":"Andrej Karpathy: And so you can definitely imagine that you have multiple research orgs, um and then they all have code. And once you have code then you can imagine tuning the code. So 100% there's like the meta layer of it uh uh.","offset":1319,"duration":10},{"text":"Host: Did you see my text about my contest idea? My contest idea was uh like let people write uh different Program MDs, right?","offset":1329,"duration":9},{"text":"Host: And so for same hardware where do you get most improvement?","offset":1338,"duration":2},{"text":"Andrej Karpathy: Oh, I see.","offset":1340,"duration":1},{"text":"Host: And then you can take all that data and then give it to the model and say write a better Program MD.","offset":1341,"duration":4},{"text":"Andrej Karpathy: Yes, yes. Yeah, exactly.","offset":1345,"duration":3},{"text":"Host: We're going to get something better. Like there's no way we don't, right?","offset":1348,"duration":1},{"text":"Andrej Karpathy: You can 100% look at uh where the improvements came from and like can I change the Program MD such that more of these kinds of things would be done, or like things that didn't work uh etc. 
Meta optimization.","offset":1349,"duration":13},{"text":"Host: Yeah.","offset":1362,"duration":1},{"text":"Andrej Karpathy: You can 100% imagine doing that. So I think this is a great idea. But it's like you know I think you sort of go one step at a time where you sort of have one process and then second process and then the next process, and these are all layers of an onion.","offset":1363,"duration":10},{"text":"Andrej Karpathy: Like the LLM sort of part is now taken for granted, the agent part is now taken for granted, now the claw-like entities are taken for granted, and now you can have multiple of them, and now you can have instructions to them, and now you can have optimization over the instructions, and it's just a little too much, you know?","offset":1373,"duration":14},{"text":"Andrej Karpathy: But they I mean this is what gets to the psychosis, is that this is like infinite and everything is skill issue, and that's why I feel yeah that's just coming back to this is why it's so insane.","offset":1387,"duration":9},{"text":"Host: Okay, well if we're we're just trying to like diagnose the current moment and uh what is a relevant skill right now, what do you like what do you think is the implication that this um that this is the loop we should be trying to achieve in different areas?","offset":1396,"duration":13},{"text":"Host: And that it works, right? Like remove—create the metric or create the ability for um agents to continue working on it without you. Do we still have performance engineering? Like what—","offset":1409,"duration":11},{"text":"Andrej Karpathy: Yeah, I mean so there's a few caveats that I would put on top of the LM psychosis. 
So number one, uh this is extremely well suited to anything that has objective uh metrics that are easy to evaluate.","offset":1420,"duration":9},{"text":"Host: Hmm.","offset":1429,"duration":1},{"text":"Andrej Karpathy: So for example like writing kernels for more efficient CUDA, uh you know code for various parts of a model etc. are the perfect fit. Um, because you have inefficient code and then you want efficient code that has the exact same behavior but is much faster. Perfect fit.","offset":1430,"duration":16},{"text":"Andrej Karpathy: Uh, so a lot of things that like are perfect fit for AutoResearch, but many things will not be. And so they it's just if you can't evaluate it then you can't AutoResearch it, right? Uh, so that's like caveat number one.","offset":1446,"duration":10},{"text":"Andrej Karpathy: And then maybe caveat number two I would say is you know we're we're kind of talking about next steps and we kind of see what next steps are, but fundamentally the whole thing still doesn't it still kind of like bursting at the seams a little bit and there's cracks and it doesn't fully work.","offset":1456,"duration":12},{"text":"Andrej Karpathy: And if you kind of try to go too far ahead the whole thing is actually net not useful, if that makes sense. Um because these models like still are not you know they've improved a lot but they're still like rough around the edges as maybe the way I would describe it.","offset":1468,"duration":12},{"text":"Andrej Karpathy: I simultaneously feel like I'm talking to an extremely brilliant PhD student who's been like a systems programmer for their entire life and a 10-year-old. And it's so weird because humans like there's—","offset":1480,"duration":12},{"text":"Host: Yes you wouldn't you wouldn't encounter that combination.","offset":1492,"duration":2},{"text":"Andrej Karpathy: This jaggedness is really strange and humans have a lot less of that kind of jaggedness although they definitely have some. 
But humans have a lot more jaggedness—uh sorry the agents have a lot more jaggedness where uh sometimes like you know I ask for functionality and it like comes back with something that is just like totally wrong.","offset":1494,"duration":16},{"text":"Andrej Karpathy: And then we get into loops that are totally wrong and then I'm just I get so frustrated with the agents all the time still because you feel the power of it but you also there's still like it does nonsensical things once in a while for me uh still as well.","offset":1510,"duration":14},{"text":"Host: I get very annoyed when uh uh I feel like the agent wasted a lot of compute on something it should have recognized was an obvious problem.","offset":1524,"duration":8},{"text":"Andrej Karpathy: Yeah. I think like some of the bigger things is like maybe what's under underneath it, if I could hypothesize, is fundamentally these models are trained via reinforcement learning.","offset":1532,"duration":9},{"text":"Andrej Karpathy: So they're actually struggling with the exact same thing we just talked about which is the labs can improve the models in anything that is verifiable or that has rewards. So did you write the program correctly and does it do the unit tests check out, yes or no?","offset":1541,"duration":13},{"text":"Andrej Karpathy: But some of the things where they're struggling is like for example I think they have a tough time with like nuance of maybe what I what I had in mind or what I intended and when to ask clarifying questions.","offset":1554,"duration":11},{"text":"Andrej Karpathy: Uh, like or where the—yeah, it's just um anything that feels softer is like worse. 
And so you're kind of like you're either on rails and you're part of the superintelligence circuits or you're not on rails and you're outside of the verifiable domains and suddenly everything kind of just like meanders.","offset":1565,"duration":13},{"text":"Andrej Karpathy: Like maybe another way to put it is if you go to if today if you go to like state-of-the-art model ChatGPT and you ask it tell me a joke, um do you know what joke you're going to get? There's the joke.","offset":1578,"duration":9},{"text":"Host: The joke. I do feel I I can't tell you like the you know standard form of it, but I do feel like ChatGPT has like three jokes.","offset":1587,"duration":8},{"text":"Andrej Karpathy: Yeah, yeah. So the the joke that apparently all the LLMs like love the most is why do scientists uh not trust atoms?","offset":1595,"duration":6},{"text":"Host: Okay.","offset":1601,"duration":1},{"text":"Andrej Karpathy: Because they make everything up.","offset":1602,"duration":1},{"text":"Host: Okay.","offset":1603,"duration":1},{"text":"Andrej Karpathy: They make everything up. So this is still—","offset":1604,"duration":2},{"text":"Host: Why'd that emerge?","offset":1606,"duration":2},{"text":"Andrej Karpathy: So this is the joke you would get three or four years ago and this is the joke you still get today.","offset":1608,"duration":3},{"text":"Host: Okay.","offset":1611,"duration":1},{"text":"Andrej Karpathy: So even though the models have improved tremendously and if you give them an agentic task they will just go for hours and move mountains for you, and then you ask for like a joke and it has a stupid joke, a crappy joke from five years ago.","offset":1612,"duration":14},{"text":"Andrej Karpathy: And it's because it's outside of the it's outside of the RL. 
It's outside of what's being improved, it's like and it's part of the jaggedness of like shouldn't you expect models as they get better to also have like better jokes or more diversity of them or it's just it's not being optimized and it's stuck.","offset":1626,"duration":15},{"text":"Host: Do you uh uh think that that implies that we are not seeing like generalization in the sense of like broader intelligence of joke smartness being attached to code smartness?","offset":1641,"duration":14},{"text":"Andrej Karpathy: Yeah, I think there's some decoupling where some things are verifiable and some things are not and some things are optimized for arbitrarily by the labs depending on like what data went in and some things are not.","offset":1655,"duration":11},{"text":"Host: But I mean the premise, there's a you know premise from some research groups that if you are smarter at code generation or in these verifiable fields you should be better at everything.","offset":1666,"duration":11},{"text":"Host: And like the the joke situation suggests that that's not happening in Auto—","offset":1677,"duration":2},{"text":"Andrej Karpathy: I don't think that's happening.","offset":1679,"duration":1},{"text":"Host: Okay.","offset":1680,"duration":1},{"text":"Andrej Karpathy: Yeah, I don't think that's happening. I think I think maybe we're seeing like a little bit of that, but not like a satisfying amount.","offset":1681,"duration":5},{"text":"Host: Yeah. That jaggedness exists in humans. You can be very, very good at math and still tell really bad jokes.","offset":1686,"duration":7},{"text":"Andrej Karpathy: Yeah, that's true, yeah. 
But it just it still means that we're not getting like the story is that we're getting a lot of the intelligence and capabilities in all the domains of society like for free as we get better and better models, and that's not like exactly fundamentally what's going on.","offset":1693,"duration":13},{"text":"Andrej Karpathy: And there's some blind spots and some things are not being optimized for and this is all clustered up in these neural net opaque models, right?","offset":1706,"duration":8},{"text":"Andrej Karpathy: So you're either on rails of what it was trained for and everything is like you're going at speed of light or you're not. Um, and so it's the jaggedness.","offset":1714,"duration":8},{"text":"Andrej Karpathy: So um so that's why I think like even though the the progression is obvious what should happen, you can't let it fully go there yet because it doesn't fully work, or it's a skill issue and we just haven't like figured out how to use it, so you know it's hard to tell.","offset":1722,"duration":14},{"text":"Host: Can I ask kind of a blasphemous question which is like if this jaggedness is persistent um and it's all rolled up in a at least monolithic interface, right?","offset":1736,"duration":12},{"text":"Host: But you know single model, um does that make sense or do you should it be unbundled into things that are can be optimized and improved against different domains of intelligence?","offset":1748,"duration":8},{"text":"Andrej Karpathy: Uh like unbundling the models into multiple experts in different areas etc. more directly? Yeah.","offset":1756,"duration":5},{"text":"Host: Yeah. 
Instead of just MoE that we have no exposure to.","offset":1761,"duration":3},{"text":"Host: Because that can be like confusing as a user from the outside which is like why is it so good at this but not at this other thing.","offset":1764,"duration":8},{"text":"Andrej Karpathy: Yeah, I think currently my impression is uh the labs are trying to have a single sort of monoculture of a model that is arbitrarily intelligent in all these different domains and they just stuff it into the parameters.","offset":1772,"duration":11},{"text":"Andrej Karpathy: I do think that we will I do think we should expect more speciation in the intelligence. Um like you know the animal kingdom is extremely diverse in the brains that exist and there's lots of different niches of of nature, and some animals have overdeveloped visual cortex or other kind of parts.","offset":1783,"duration":16},{"text":"Andrej Karpathy: And I think we we should be able to see more speciation and um you don't need like this oracle that knows everything you kind of speciate it and then you put it on a specific task.","offset":1799,"duration":9},{"text":"Andrej Karpathy: And we should be seeing some of that because you should be able to have like much smaller models that still have the cognitive core like they're still competent but then they specialize and then um and then they can become more efficient in terms of latency or throughput on uh specific tasks that you really care about.","offset":1808,"duration":15},{"text":"Andrej Karpathy: Like if you're a mathematician working in Lean. I saw for example there's a few releases that really like target that as in the domain. 
Um so there's a probably going to be a few examples like that where the unbundling kind of makes sense.","offset":1823,"duration":10},{"text":"Host: One question I have is whether or not uh the capacity constraint on available compute infrastructure drives more of this.","offset":1833,"duration":8},{"text":"Host: Because efficiency actually matters more, right? Like you're if financing aside though financing is involved in all of this, if you have access to full compute for anything you do like leaving the one single model, right?","offset":1841,"duration":13},{"text":"Host: But if you actually feel pressure where you're like I can't serve um model of massive size for every use case like do you think that leads to any speciation?","offset":1854,"duration":12},{"text":"Host: Does that question make sense to you?","offset":1866,"duration":1},{"text":"Andrej Karpathy: The question makes sense. And I guess like what I'm what I'm what I'm struggling with is I don't think we've seen too much speciation just yet, right?","offset":1867,"duration":6},{"text":"Host: No.","offset":1873,"duration":1},{"text":"Andrej Karpathy: Uh we're seeing a monoculture of models.","offset":1874,"duration":1},{"text":"Host: Yeah.","offset":1875,"duration":1},{"text":"Andrej Karpathy: And there's like clearly pressure for like make a good code model put it back in the main merge again.","offset":1876,"duration":6},{"text":"Host: Yeah, yeah.","offset":1882,"duration":1},{"text":"Andrej Karpathy: Even though there already is pressure on the models. Uh I guess perhaps I I feel like there's a lot of very short-term supply crunch and like maybe that causes more speciation now.","offset":1883,"duration":12},{"text":"Andrej Karpathy: Yeah, I think fundamentally like the model the labs are serving a model and they don't really know what the end user is going to be asking about. 
Uh so maybe that's like some part of it because they kind of have to multitask over all the possible things they could be asked.","offset":1895,"duration":13},{"text":"Andrej Karpathy: But I think if you're coming to a business and maybe partnering on some specific problems you care about then maybe you would see that there. Uh or there would be some very high-value applications that are like more niche. Um but uh I think right now they're kind of like going after the totality of what's available.","offset":1908,"duration":15},{"text":"Andrej Karpathy: I don't think that the science of manipulating the brains is like fully developed yet partly.","offset":1923,"duration":4},{"text":"Host: What do you mean manipulating?","offset":1927,"duration":1},{"text":"Andrej Karpathy: Uh so like so fine-tuning without losing capabilities as an example. And I we don't have these primitives for actually like working with the intelligences in ways other than just context windows.","offset":1928,"duration":11},{"text":"Andrej Karpathy: Like context windows kind of just work and it's very cheap to manipulate etc. and this is how we're getting some of the customization etc. 
but I think if it was I think it's a bit more of a developing science of how you like more deeply adjust the models.","offset":1939,"duration":9},{"text":"Andrej Karpathy: How you have continuous learning maybe or how you uh how you fine-tune in certain area, how you get better in certain area or like how you actually touch the weights, not just the context windows.","offset":1948,"duration":10},{"text":"Andrej Karpathy: And so it's a lot more tricky I would say to touch the weights than just the context window uh because you're actually fundamentally changing the full model and potentially its intelligence.","offset":1958,"duration":11},{"text":"Andrej Karpathy: And so uh so maybe it's just like not a fully developed science if that makes sense of speciation.","offset":1969,"duration":2},{"text":"Host: And it also has to be like cheap enough.","offset":1971,"duration":2},{"text":"Andrej Karpathy: Yeah.","offset":1973,"duration":1},{"text":"Host: For that speciation to be worthwhile in these given contexts. Can I ask a question about uh like an extension to AutoResearch that you described in terms of um open ground you say okay well you know we have this thing um we need more collaboration surface around it essentially for people to contribute.","offset":1974,"duration":19},{"text":"Host: Um to research overall. Can you talk about that?","offset":1993,"duration":2},{"text":"Andrej Karpathy: Yeah, so we talked about AutoResearch has a single thread of like I'm going to try stuff in a loop. 
Uh but fundamentally the parallelization of this is like the interesting component.","offset":1995,"duration":9},{"text":"Andrej Karpathy: Uh and I guess I was trying to like play around with a few ideas but I don't have anything that like clicks as simply as like I don't have something that I'm like super happy with just yet, but it's something I'm like working on on the side when I'm not working on my Claw.","offset":2004,"duration":11},{"text":"Andrej Karpathy: Uh so I think like one issue is if you have a bunch of nodes uh of parallelization available to you, then it's very easy to just have multiple AutoResearchers talking through a uh a common system or something like that.","offset":2015,"duration":13},{"text":"Andrej Karpathy: What I was more interested in is how you can have an untrusted pool of workers out there on the internet.","offset":2028,"duration":5},{"text":"Host: Hmm.","offset":2033,"duration":1},{"text":"Andrej Karpathy: So for example in AutoResearch, um you're just trying to find uh the piece of code that trains a model to a very low validation loss.","offset":2034,"duration":6},{"text":"Andrej Karpathy: If anyone gives you a candidate commit, it's very easy to verify that that commit is correct is good. Like they somehow could claim from the internet that this piece of code will optimize uh much better and give you much better performance.","offset":2040,"duration":12},{"text":"Andrej Karpathy: You could just check. It's very easy. But probably a lot of work goes into that checking. Uh but fundamentally they could lie and etc. So you're basically dealing with a similar kind of pro—it's almost actually like looks a little bit like—","offset":2052,"duration":12},{"text":"Andrej Karpathy: My designs that incorporate an untrusted pool of workers actually like a little bit more like a blockchain a little bit. 
Uh because instead of blocks you have commits and these commits can build on each other and they contain like changes to the code as you're improving it.","offset":2064,"duration":16},{"text":"Andrej Karpathy: Uh and the proof of work is basically doing tons of experimentation to find the commits that work. Uh and that's hard. Um and then the reward is just being on the leaderboard right now.","offset":2080,"duration":10},{"text":"Andrej Karpathy: There's no monetary reward whatsoever. Uh but I don't want to push the analogy too far, but it fundamentally has this issue where huge amount of search goes into it, but it's very cheap to verify that a candidate solution is indeed good because you can just train a single, you know, someone had to try 10,000 ideas but you just have to check that the thing that they produced actually works.","offset":2090,"duration":15},{"text":"Andrej Karpathy: Because the 9,999 of them didn't work, you know? Um and so basically long story short it's like you have to come up with a system where an untrusted pool of workers can collaborate with a trusted pool of workers uh that do the verification and the whole thing is kind of like asynchronous and works and uh and so on.","offset":2105,"duration":18},{"text":"Andrej Karpathy: And is is like safe from a security perspective because if anyone sends you arbitrary code and you're going to run it, that's very sketchy and dodgy. So um but fundamentally it should be totally possible.","offset":2123,"duration":10},{"text":"Andrej Karpathy: So you're familiar with projects like SETI@home and Folding@home. All of these problems have a similar kind of uh setup. So Folding@home, you're folding a protein um and it's very hard to find a configuration that is low energy.","offset":2133,"duration":13},{"text":"Andrej Karpathy: But if someone finds a configuration that they evaluate to be low energy, that's perfect. You can just use it. You can easily verify it. 
So a lot of things have this property that, you know, very expensive to come up with but very cheap to verify.","offset":2146,"duration":11},{"text":"Andrej Karpathy: And so in all those cases things like Folding@home or SETI@home or AutoResearch@home will be good fits. And so um long story short a swarm of agents on the internet could collaborate to improve LLMs and could potentially even like run circles around frontier labs.","offset":2157,"duration":16},{"text":"Andrej Karpathy: Like who knows, you know? Um yeah like maybe that's even possible. Like frontier labs have a huge amount of trusted compute, but the Earth is much bigger and has huge amount of untrusted compute. But if you put systems in check, systems in place that, you know, deal with this, then maybe it is possible that the swarm out there could uh could come up with a with better with better solutions.","offset":2173,"duration":16},{"text":"Andrej Karpathy: And people just kind of like contribute cycles um to a thing that they care about. And so sorry so the last thought is lots of companies or whatnot they could maybe have like their own uh things that they care about, and you if you have compute capacity you could contribute to different kind of AutoResearch tracks.","offset":2189,"duration":17},{"text":"Andrej Karpathy: Like maybe you care about certain you know you care about like cancer or something like that of certain type. You don't have to just donate money to an institution, you actually could like purchase compute and then you could join the AutoResearch swarm for that project, you know?","offset":2206,"duration":13},{"text":"Andrej Karpathy: Um so if everything is rebundled into AutoResearchers then compute becomes the thing that you're contributing to the pool.","offset":2219,"duration":6},{"text":"Host: Yeah, that's very inspiring and it's also interesting. 
Like I don't I don't know how far this goes, but it is interesting that at least some audience of people, you know, here in Silicon Valley or lining up at um retail stores in China have discovered that like having access to personal compute is interesting again.","offset":2225,"duration":17},{"text":"Host: Yeah. So maybe they're really motivated to do that for their Claws and then they can uh contribute to AutoResearch.","offset":2242,"duration":4},{"text":"Andrej Karpathy: It's almost like dollar is the thing everyone cares about, but is flop the thing that actually everyone cares about in the future? Like is there going to be like a flippening almost of like what the thing that you care about?","offset":2246,"duration":10},{"text":"Andrej Karpathy: Like right now for example it's really hard to get compute even if you have money. Yeah. So actually it almost seems like the flop is like dominant uh in a certain sense. Um yes so uh so maybe that's kind of like kind of like that. Like how much how many flops do you control instead of like what wealth do you control?","offset":2256,"duration":14},{"text":"Andrej Karpathy: I don't actually think that's true, but it's kind of interesting to think about.","offset":2270,"duration":2},{"text":"Host: The last thing you released was like a little bit of jobs data analysis. Is that right?","offset":2272,"duration":5},{"text":"Andrej Karpathy: Yeah.","offset":2277,"duration":1},{"text":"Host: What um and it touched a nerve even though you were just like visualizing some public data. Uh what was you know what were you curious about?","offset":2278,"duration":9},{"text":"Andrej Karpathy: Yeah, I guess I was curious to um I mean everyone is like really it's everyone is really thinking about the impacts of AI on the job market and what it's going to look like. So I was just interested to take a look like what does the job market look like?","offset":2287,"duration":11},{"text":"Andrej Karpathy: Where are the different roles? 
Um and how many people are in different professions? And I was like really just interested to like look through uh the individual cases and try to think myself about like you know with these AIs and how they're likely to evolve like are these going to be tools that people are using? Are these going to be displacing tools for these uh professions?","offset":2298,"duration":22},{"text":"Andrej Karpathy: And like what are the current professions and how are they going to change? Are they going to grow or uh adjust to a large extent? Or like what could be new professions? So it was really just like a way to fuel my own chain of thought about the industry I suppose.","offset":2320,"duration":12},{"text":"Andrej Karpathy: Um and so uh yeah the jobs data basically is just a Bureau of Labor Statistics. Uh they actually have a percent outlook for each profession about how much it's expected to grow over the next I think almost a decade.","offset":2332,"duration":11},{"text":"Host: We need a lot of healthcare workers.","offset":2343,"duration":1},{"text":"Andrej Karpathy: Yeah, so so they've already made those projections and I'm not sure actually 100% what the methodology was that they that they put into the projections. Um I guess I was interested to color things by like if people think that what's primarily being um developed now is this kind of like more digital AI that is kind of like almost like these ghost or spirit entities that can like interact in the digital world and uh manipulate a lot of like digital information.","offset":2344,"duration":23},{"text":"Andrej Karpathy: And they currently don't really have a physical embodiment uh or presence. And the physical stuff is probably going to go slightly slower because you're manipulating atoms. 
So flipping flipping bits and and the ability to copy paste digital information is like makes everything a million times faster than accelerating matter, you know?","offset":2367,"duration":17},{"text":"Andrej Karpathy: So um so energetically I just think we're going to see a huge amount of activity in the digital space, huge amount of rewriting, huge amount of activity boiling soup.","offset":2384,"duration":7},{"text":"Andrej Karpathy: And I think the we're going to see something that in the digital space goes at the speed of light compared to I think what's going to happen in the physical world to some extent if would be the extrapolation.","offset":2391,"duration":8},{"text":"Andrej Karpathy: And so I think like um there's currently kind of like I think an overhang where there can be like a lot of unhobbling almost potentially of like a lot of digital information processing that used to be done by computers and people.","offset":2399,"duration":14},{"text":"Andrej Karpathy: And now with AIs as like a third kind of manipulator of digital information there's going to be a lot of refactoring in those in those uh disciplines. Um but the physical world is actually going to be I think behind that by some amount of time.","offset":2413,"duration":11},{"text":"Andrej Karpathy: And so I think what's really fascinating to me is like so that's why I was highlighting the professions that fundamentally manipulate digital information. This is work you could do from your home etc. 
because I feel like those will be like things will change.","offset":2424,"duration":12},{"text":"Andrej Karpathy: And that doesn't mean that there's going to be less of those jobs or more of those jobs because it that has to do with like demand elasticity and many other factors, but things will change in these professions because of these new tools and uh because of this upgrade to the nervous system of the human superorganism if you want to think of it that way.","offset":2436,"duration":15},{"text":"Host: Given the look you had at the data, do you have either any observations or um uh guidance for people facing the job market or thinking about what to study now or what skills to develop?","offset":2451,"duration":3},{"text":"Host: I mean we can all go get like I'm very thankful that I have to like meet people for my job right now. Uh we can be more physical, yeah.","offset":2454,"duration":15},{"text":"Andrej Karpathy: Could you do your work from home though? Uh I could.","offset":2469,"duration":3},{"text":"Host: I think there are relationship parts of it that are hard, but most of it I could.","offset":2472,"duration":5},{"text":"Andrej Karpathy: Yeah, I think it's really hard to tell because again like the job market is extremely diverse and I think the answers will probably vary, but to a large extent like these tools are extremely new, extremely powerful, and so just being you know just trying to keep up with it is like the first thing.","offset":2477,"duration":14},{"text":"Andrej Karpathy: Um and uh yeah because I think a lot of people kind of like dismiss it—","offset":2491,"duration":2},{"text":"Host: Or they're afraid of it.","offset":2493,"duration":1},{"text":"Andrej Karpathy: —or they're afraid of it etc. which is totally understandable of course. 
Yeah I think like um it's fundamentally an empowering tool at the moment.","offset":2494,"duration":8},{"text":"Andrej Karpathy: Um and these jobs are bundles of tasks, and some of these tasks can go a lot faster, and so people should think of it as primarily a tool that it is right now. Um and I think the long-term future of that is uncertain.","offset":2502,"duration":9},{"text":"Andrej Karpathy: Yeah it's kind of really hard to forecast to be honest and like I'm not professionally like doing that really and I think this is a job of like economists to do properly.","offset":2511,"duration":9},{"text":"Host: You are an engineer though, uh and like one thing I thought was interesting is that like the the demand for engineering jobs is continuing to increase.","offset":2520,"duration":8},{"text":"Host: Um I I can't tell if that's like a temporary phenomenon I'm not sure how I feel about it yet, do you know?","offset":2528,"duration":5},{"text":"Andrej Karpathy: Yeah, that's like the demand elasticity almost. Like uh software was scarce, right? And so the reason we don't have more demand for software is just scarcity and it's too expensive.","offset":2533,"duration":8},{"text":"Host: It's too expensive, yeah.","offset":2541,"duration":1},{"text":"Andrej Karpathy: So if the barrier comes down then actually you have the Jevons paradox which is like, you know, actually the demand for software actually goes up. It's cheaper and there's more more for it, more powerful.","offset":2542,"duration":7},{"text":"Andrej Karpathy: The classical example of this always is the ATMs and the bank tellers. 
Because there was a lot of like fear that uh ATMs and computers basically uh would displace tellers, but what happened is they made like the cost of operation of uh of a bank branch much cheaper and so there were more bank branches so there were more tellers is like the canonical example people cite.","offset":2549,"duration":22},{"text":"Andrej Karpathy: But basically it's just Jevons paradox. Like something becomes cheaper so there's a lot of unlocked demand for it. So I do think that that's probably I do have like a cautiously optimistic view of this in software engineering where I do think the uh it does seem to me like the demand for software will be extremely large.","offset":2571,"duration":17},{"text":"Andrej Karpathy: Um and it's just become a lot cheaper. And um so I do think that for quite some time, um it's very hard to forecast, but it does seem to me like right now at least locally there's going to be more demand for software.","offset":2588,"duration":10},{"text":"Andrej Karpathy: Uh because software is amazing. It's like, you know, digital information processing, you're not forced to use like arbitrary tools that were given to you that are imperfect in various ways. You're not forced to subscribe to what exists.","offset":2598,"duration":12},{"text":"Andrej Karpathy: Uh code is now ephemeral and it can change and it can be modified. Um and so I think there's going to be a lot of activity in the digital space to like rewire everything in a certain sense, and I think it's going to create a lot of demand for for this kind of stuff.","offset":2610,"duration":13},{"text":"Andrej Karpathy: I think long term, uh yeah obviously even with AutoResearch like OpenAI or or, you know, Anthropic or these other labs like they're employing what like a thousand something researchers, right? 
These researchers are basically like glorified AutoResearcher...","offset":2623,"duration":16},{"text":"Host: You know?","offset":2639,"duration":1},{"text":"Andrej Karpathy: They're like automating themselves away like actively and this is like the thing they're all trying to do.","offset":2640,"duration":4},{"text":"Host: Yeah. Some of those researchers also fear fear the psychosis, right? Because they can it's working.","offset":2644,"duration":7},{"text":"Host: And so they're like uh it's over for me too.","offset":2651,"duration":1},{"text":"Andrej Karpathy: I did spend a bunch of time going around OpenAI and I was like, you guys realize if we're successful like we're all out of job, like like just going we're just building automation for Sam or something like that, like oh or the board I'm not sure, but like uh just building this automation for yeah the board or the CEO or something like that and we're all out of our job and maybe contributing on sides.","offset":2652,"duration":20},{"text":"Andrej Karpathy: And so yeah it's kind of un-unnerving from that perspective.","offset":2672,"duration":4},{"text":"Host: Is it okay if I ask Noam's question? Um you know you could be doing that, right? AutoResearching with a lot of compute scale and a bunch of colleagues at one of the frontier labs. Like why not?","offset":2676,"duration":9},{"text":"Andrej Karpathy: Well I was there for a while, right? Like and I did re-enter. So to some extent I agree and I think that there are many ways to slice this question. It's a very loaded question a little bit.","offset":2685,"duration":9},{"text":"Andrej Karpathy: Um I will say that I feel very good about like what people can contribute and their impact uh outside of the frontier labs obviously.","offset":2694,"duration":6},{"text":"Andrej: ...not in the industry, but also in like more ecosystem level roles. Um, so your role, for example, is more like ecosystem level. My role currently is also kind of more on ecosystem level. 
And I feel very good about like impact that people can have in those kinds of roles.","offset":2700,"duration":12},{"text":"Andrej: I think conversely, there's there are definite problems in my mind for um basically aligning yourself way too much with the frontier labs too. So fundamentally, I mean, you're, you have a huge amount of financial incentive to um with these frontier labs.","offset":2712,"duration":12},{"text":"Andrej: And by your own admission, the the AIs are going to like really change humanity and society in very dramatic ways, and here you are basically like building the technology and benefiting from it, like and being like very allied to it through financial means.","offset":2724,"duration":17},{"text":"Andrej: Like this was the conundrum that was in um at the heart of, you know, how OpenAI was started in the beginning, like this was the conundrum that were trying to solve. Um, and so, you know, that so it's kind of...","offset":2741,"duration":10},{"text":"Host: It's still not resolved.","offset":2751,"duration":1},{"text":"Andrej: The conundrum is still not like fully resolved. So that's number one. You can't you're not a completely free agent, and you can't actually like be part of that conversation in a fully autonomous, um, free way. Like if you're inside one of the frontier labs, like there's some things that you can't say,","offset":2752,"duration":18},{"text":"Andrej: ...and conversely, there are certain things that the organization wants you to say, and, you know, they're not going to twist your arm, but you feel the pressure of like what you should be saying, you know, because like obviously, otherwise, like really awkward conversations, strange side-eye, like what are you doing, you know?","offset":2770,"duration":11},{"text":"Andrej: So you can't like really be an independent agent. 
And I feel like a bit more like aligned with humanity in certain sense outside of the frontier lab, because I don't I'm not subject to those pressures almost, right? And I can say whatever I want, or...","offset":2781,"duration":13},{"text":"Andrej: Yeah, I would say in the frontier labs, like, um, you can have like impact there, of course, as well. So um, but there's many researchers, and maybe you're one of them, maybe your ideas are really good, etc.","offset":2794,"duration":10},{"text":"Andrej: Maybe there's a lot of decision making to to do, and you want to be in a position where you are in the room with those conversations when they come up. I do think that currently the stakes are like overall fairly low, and so everything is kind of like nice.","offset":2804,"duration":10},{"text":"Andrej: But ultimately, at the end of the day, like when the stakes are really high, etc., if you're an employee at an organization, I don't actually know how much sway you're going to have on the organization, what it's going to do. Like fundamentally, at the end of the day, um, it's you're not like really in charge.","offset":2814,"duration":14},{"text":"Andrej: You're like in the room and you're contributing ideas, but you're not like really in charge of that entity that you're that you're a part of. So those are like some sources of misalignment, I think, to some extent.","offset":2828,"duration":10},{"text":"Andrej: I will say that like in one way I do agree a lot with that sentiment that um I do feel like in the like the labs for better or worse, they're opaque and a lot of work is there, and they're kind of like at the edge of capability and what's possible,","offset":2838,"duration":14},{"text":"Andrej: ...and they're working on what's coming down the line. 
And I think if you're outside of the frontier lab, uh your your judgment fundamentally will start to drift, because you're not part of the, you know, what's coming down the line.","offset":2852,"duration":12},{"text":"Host: Right.","offset":2864,"duration":1},{"text":"Andrej: And so I feel like my judgment will inevitably start to drift as well, and I won't actually have an understanding of how these systems actually work under the hood. It's an opaque system. Um, I won't have a good understanding of how it's going to develop, and etc.","offset":2865,"duration":12},{"text":"Andrej: And so I do think that in that sense, I agree, and something I'm nervous about. I think it's worth basically basically being in touch with what's actually happening, and actually being in the frontier lab.","offset":2877,"duration":10},{"text":"Andrej: And if if some of the frontier labs would have me come for, you know, some amount of time and do really good work for them, and then maybe come in...","offset":2887,"duration":6},{"text":"Host: Guys, he's looking for a job! 
This is super exciting.","offset":2893,"duration":2},{"text":"Andrej: ...then I think that's maybe a good setup, because I kind of feel like it kind of, um, you know, maybe that's like one way um to actually be connected to what's actually happening, but also not feel like you're necessarily fully controlled by those entities.","offset":2895,"duration":11},{"text":"Andrej: So I think honestly, in my mind, like Noam can probably do extremely good work at OpenAI, but also I think his most um impactful work could very well be outside of OpenAI.","offset":2906,"duration":13},{"text":"Host: Noam, that's a call to be an independent researcher with AutoResearch.","offset":2919,"duration":2},{"text":"Andrej: Yeah, there's many things to do on the outside, and it's a and I think ultimately I think the ideal solution maybe is like yeah, going back and forth, uh or um yeah, and I think fundamentally you can have really amazing impact in both places.","offset":2921,"duration":11},{"text":"Andrej: So very complicate- I don't know, like it's a very loaded question a little bit, but I mean I joined the frontier lab and now I'm outside, and then maybe in the future I'll want to join again, and I think um that's kind of like how I look at it.","offset":2932,"duration":13},{"text":"Host: One question related to what visibility does the world or the AI ecosystem have into um the frontier is like how how close open source is to the frontier, um and how sustainable that is.","offset":2945,"duration":14},{"text":"Host: I I think it is quite surprising, the entire sequence of events actually, from like having a handful of Chinese models and global models, and I think people are going to continue releasing here in the near term that are closer than much of the industry anticipated from a capability perspective.","offset":2959,"duration":17},{"text":"Host: Um, I don't know if you're surprised by that, you're a long term contributor to open source, like what's your prediction 
here?","offset":2976,"duration":6},{"text":"Andrej: Yeah, so roughly speaking, basically, the um, yeah, the closed models are ahead, but like people are monitoring the number of months that sort of like open source models are behind.","offset":2982,"duration":10},{"text":"Host: And it started with there's nothing, and then it went to 18 months, and now it's like our convergence.","offset":2992,"duration":4},{"text":"Andrej: Yeah, and then a convergence, right? So um, maybe they're behind by like what is the latest? Maybe like eight months, six months, eight months kind of right now. Yeah, I'm a huge fan of open source, obviously.","offset":2996,"duration":10},{"text":"Andrej: So for example, in operating systems, you have like closed source like, you know, Windows and macOS, these are large software projects, kind of like what LLMs are going to become. And there's Linux.","offset":3006,"duration":9},{"text":"Andrej: But Linux is very easy, like actually Linux is an extremely successful project. It runs on the vast majority of computers. Like last time I checked, was it like 60 percent or something like run Linux?","offset":3015,"duration":10},{"text":"Andrej: Um, and that's because there is a need in industry to have a common open platform that everyone feels um sort of safe using. I would say like the industry has always felt a demand for that kind of a project to exist.","offset":3025,"duration":11},{"text":"Andrej: Um, and I think the same is true now, and that's why businesses actually want- there's demand for this kind of a um a thing to exist. The big difference is that everything is capital um there's large CapEx that goes into this. Um, so I think that's where things like fall apart a little bit, and make it a bit harder to to compete in some sense.","offset":3036,"duration":18},{"text":"Andrej: Um, I I do think that the current models are very good. 
The other thing that I think is like really interesting is that for the vast majority of like consumer use cases and things like that, even like current open source models are actually quite good, I would say.","offset":3054,"duration":12},{"text":"Andrej: And I think like if you go forward like more um more years, it does seem to me like a huge amount of like simple use cases are going to be well covered and actually even run locally.","offset":3066,"duration":11},{"text":"Andrej: Um, but there's going to be always like some demand for like frontier intelligence, and that that can actually be an extremely large piece of the pie. But it could be that the frontier, the need for frontier intelligence is going to be like, you know, Nobel Prize kind of work, or like let's move Linux from C to Rust is going to be like bigger projects, you know, like scoped in that kind of a way.","offset":3077,"duration":19},{"text":"Andrej: And there's going to be maybe more um and maybe that's where a lot of the frontier closed intelligences were going are going to be interacting with. And open source kind of like going to eat through a lot of the more basic use cases or something like that.","offset":3096,"duration":14},{"text":"Andrej: You know, at some point, what is frontier today is going to be, you know, probably later this year what's frontier today in terms of what I'm using right now from the closed labs might be open source, and that's going to be doing a lot of work.","offset":3110,"duration":11},{"text":"Andrej: So I kind of expect that this dynamic will actually basically continue. Like we'll have frontier labs that have closed um AIs that are kind of like these oracles, and then we'll have open source kind of like behind by some amount of months.","offset":3121,"duration":10},{"text":"Andrej: And I kind of expect that to uh to continue. 
And I actually think that's like a pretty pretty good setup um overall, um because I I'm a little bit hesitant of having um I don't actually think it's like structurally, I think there's some systemic risk attached to just having intelligences that are closed, and that's like that's it.","offset":3131,"duration":16},{"text":"Host: Mm-hmm.","offset":3147,"duration":0},{"text":"Andrej: And I think that that's a, you know, centralization has a very poor track record in my view, um in the in the past.","offset":3147,"duration":7},{"text":"Host: You mean like in political or economic systems? In general?","offset":3154,"duration":2},{"text":"Andrej: Yes. Exactly.","offset":3156,"duration":1},{"text":"Host: Spoken like an Eastern European, yes.","offset":3157,"duration":3},{"text":"Andrej: Okay, exactly. I think there's like a lot of pretty bad precedents, and so I want there to be a thing that is maybe not at the edge of capability because it's new and unexplored etc., but I want there to be a thing that's behind and that is kind of like a common working space for intelligences that the entire industry has access to. Yeah, that seems to me like a pretty decent power balance for the industry.","offset":3160,"duration":16},{"text":"Host: Yeah.","offset":3176,"duration":5},{"text":"Andrej: Yeah. I also think there's just like there are many problems to solve, right? 
Like if you keep advancing intelligence from the frontier, we can do new things and there are a lot of like very big problems for humanity, right?","offset":3181,"duration":11},{"text":"Host: Yeah.","offset":3192,"duration":1},{"text":"Andrej: And so like it seems that that will continue to be a very expensive game, and so I want to like root for labs that are doing that, because there are problems we cannot solve without continuing to advance the models in a very expensive way.","offset":3193,"duration":11},{"text":"Host: And yet, as you point out, like if what we have today as frontier is open, that's a lot of capability.","offset":3204,"duration":9},{"text":"Andrej: Yeah.","offset":3213,"duration":1},{"text":"Host: Right. And and so I think, you know, the power of that or the democratization of that seems like very useful and also healthy.","offset":3214,"duration":8},{"text":"Andrej: Yeah. I think basically by accident we're actually like in an okay spot in an optimal yeah, yeah. By accident we are happen to be in a good spot in a certain sense.","offset":3222,"duration":10},{"text":"Host: Um well, and and to some degree, the the longer this endures, like this dynamic, um the the healthier of a spot like the ecosystem might be in, right? Because you have more and more area under the curve.","offset":3232,"duration":13},{"text":"Andrej: And I will say that even on the closed side, I almost feel like it's been like even further centralizing in recently, because I think a lot of the frontrunners are like not necessarily like the top tier.","offset":3245,"duration":9},{"text":"Andrej: And so um yeah, I like in that sense I think it's um not super ideal. 
I would love there to be more more frontier labs because, yeah, I'm like by default very suspicious of like um I want there to be more people in the room, I want-","offset":3254,"duration":13},{"text":"Andrej: I think like in machine learning, ensembles always outperform any individual model, and so I want there to be ensembles of people thinking about all the hardest problems, and I want there to be ensembles of people in a room when they um to be all well-informed and to make all those decisions, you know, so...","offset":3267,"duration":17},{"text":"Andrej: I don't want it to be like a closed doors with two people or three people. I feel like that's like not a good not a good future. I almost wish like there were more labs is long story short, and I all- I do think that the open source um has a uh has a place to play. I hope it sticks around, and I basically- it's currently slightly behind and that's actually kind of a good thing.","offset":3284,"duration":15},{"text":"Host: Okay, you worked on the precursor to generalized robotics, autonomy, um in cars, right? Um, a a lot has happened in the last couple months with robotics companies as well, like acceleration of really impressive generalization of environment, of tasks, like increasing long horizon tasks, lots of money going into the space. Like is it going to happen? Has anything in your view changed recently?","offset":3299,"duration":29},{"text":"Andrej: Um, so like my view is kind of informed by what I saw in self-driving. And I do feel like self-driving is the first robotics application. So probably what I saw is at the time, like 10 years ago, there were a large number of startups, and I kind of feel like um like most of them basically didn't long-term make it.","offset":3328,"duration":17},{"text":"Andrej: Um, and what I saw is that like a lot of capital expenditure had to go in, and a lot of time. 
And so um I think it like I think robotics because it's so difficult and so messy and requires huge amount of capital investment and a lot of like conviction, um just it's like a big problem.","offset":3345,"duration":15},{"text":"Andrej: And I think atoms are really hard. So I kind of feel like they will lag- it will lag behind what's going to happen in the digital space. And in digital space, there's going to be a huge amount of unhobbling.","offset":3360,"duration":10},{"text":"Andrej: Basically like things that weren't super efficient becoming a lot more efficient by like a factor of a hundred because bits are so much easier. And so I think currently in terms of what's going to change and like where the activity is, I kind of feel like digital space is going to like change a huge amount, and then the physical space will lag behind.","offset":3370,"duration":17},{"text":"Andrej: And what I find very interesting is like this interface in between them as well. Because I think in this like if we do have more agents acting on behalf of humans and more agents kind of like talking to each other and and doing tasks and participating in the kind of economy of agents etc., um you're going to run out of things that you're going to do purely in the digital space.","offset":3387,"duration":20},{"text":"Andrej: At some point you have to go to the universe and you have to ask it questions. Um you have to run an experiment and see what the universe tells you to get back to learn something. And so we currently have a huge amount of like digital work uh because there's an overhang in how much we collectively thought about what already is digital.","offset":3407,"duration":18},{"text":"Andrej: So we just didn't have enough thinking cycles among the humans to think about all the information that is already digital and already uploaded. Um and so we're going to start running out of stuff that is actually like um already up- uploaded. 
Uh so you're going to at some point read all the papers and process them and have some ideas about what to try.","offset":3425,"duration":16},{"text":"Andrej: But um yeah, we're just going- I don't actually know how much you can like get intelligence that's like fully closed off and with just the information that's available to it, you know? And so I think what's going to happen is first there's going to be huge amount of unhobbling, and I think there's huge amount of work there.","offset":3441,"duration":14},{"text":"Andrej: Then actually it's going to move to like the interfaces between physical and digital. So I and that's like sensors of like seeing the world and actuators of like doing something to the world.","offset":3455,"duration":9},{"text":"Andrej: So I think a lot of interesting companies will actually come from that interface of like can we feed the superintelligence in a certain sense data, and can we actually like take data out and manipulate the physical world um per its bidding, if you want to like anthropomorphize the whole thing, right?","offset":3464,"duration":16},{"text":"Andrej: And then the physical world actually I almost feel like the total addressable market etc. in terms of like the amount of work and so on is is massive, possibly even much larger maybe what can happen in digital space.","offset":3480,"duration":12},{"text":"Andrej: So I actually think it's like a much bigger opportunity as well, but um I do feel like it's huge amount of work and in my my view the atoms are just like a a million times harder. So um so it will lag behind, but it's also I think a little bit of a bigger market.","offset":3492,"duration":13},{"text":"Andrej: So it's kind of like uh yeah I think the opportunity is kind of like follow that kind of trajectory. 
So right now this digital is like my main interest, then interfaces would be like after that, and then maybe like some of the physical things, um like their time will come and they'll be huge uh when they do come.","offset":3505,"duration":17},{"text":"Host: Well it's an interesting framework for it too because uh certain things, not the things I'm working on right now, but certain things are much easier even in the world of atoms, right? Like if you just think about like read and write to the physical world, like read, like sensors, cameras, like there's a lot of existing hardware and you can imagine like enriching agent capabilities or capturing a lot of new data if you're just clever about it and like you don't necessarily have to invest a lot to like get something valuable.","offset":3522,"duration":28},{"text":"Andrej: Yeah.","offset":3550,"duration":1},{"text":"Host: Yeah.","offset":3551,"duration":1},{"text":"Andrej: So like examples of this that I saw for example are, you know, um a friend of mine, Liam, is run- is a CEO of Periodical. Um I visited them last week, so it's just on top of mind. Like they're trying to do auto research for material science. Um and so in that case it's like the sensors to the intelligence are actually like pretty expensive lab equipment.","offset":3552,"duration":16},{"text":"Andrej: And the same is true in biology, I think a lot of people are very interested in engineering biology, and, you know, the sensors will be more than just like video cameras if that makes sense. And then the other thing I I saw, for example, is companies that are trying to have um like you basically pay people for training data...","offset":3568,"duration":14},{"text":"Host: Yeah. Programmatically.","offset":3582,"duration":1},{"text":"Andrej: ...as an example to feed... Yeah, to feed- to feed the borg. Um and so like these are all examples of like sensors in a certain sense. 
So they take many diverse shapes and forms, if that makes sense.","offset":3583,"duration":9},{"text":"Host: Hmm. Yeah, so I'm looking forward to the point where I can ask for a task in the physical world and I can put a price on it and just tell the agent like, you know, you figure out how to do it. Go get the data.","offset":3592,"duration":12},{"text":"Andrej: I'm actually kind of surprised we don't have enough like information markets. Like for example, if Polymarket or other betting markets or even stocks etc., if they have so much autonomous activity and rising amount of activity, like uh why should- like for example, if Iran was just happening now, like how come there isn't a process where like taking a photo or video from somewhere in Tehran should cost like 10 bucks.","offset":3604,"duration":19},{"text":"Andrej: Like someone should be able to pay for that, you know? And that's an example of like feeding the intelligence. There's not going to be a human looking at it, it's going to be like agents who are trying to guess the betting games and stock markets and so on.","offset":3623,"duration":12},{"text":"Andrej: Hmm. So I kind of feel like the agentic vibe is still like fairly new that there's no like mechanisms for this, but this is an example of what I think might happen. There's a good um book that maybe is inspiring called Daemon. Uh-huh. You've potentially read it?","offset":3635,"duration":12},{"text":"Andrej: In Daemon the intelligence um ends up like puppeteering almost a little bit like humanity in certain sense, you know? And so humans are kind of like its actuators, but humans are also like its sensors. 
Um and so maybe I think like collectively like society will kind of like reshape in a certain way in uh to serve that kind of a uh that will kind of like end up happening collectively across the industry where, yeah, there's just a lot more automation and it has certain needs and kind of humans will be serving those needs of that of that machine, not necessarily like to each other.","offset":3647,"duration":24},{"text":"Host: Well we were um on this very specific point of uh like missing pieces of training data, we needed um we needed something like auto research, right? Like we need the training cycle or the SFT piece to be um far more mechanized.","offset":3671,"duration":14},{"text":"Andrej: Uh-huh. For for which part?","offset":3685,"duration":2},{"text":"Host: In order to make the um collection, like to in order to take the human out of the loop to ask for a task that is just like improve my model quality with new data, right?","offset":3687,"duration":9},{"text":"Andrej: Um, yes.","offset":3696,"duration":1},{"text":"Host: Does that make sense to you? Like we um, if you can't have the model do the training runs by itself, then your ability to do this as a like closed-loop task with uh by pricing data is um more challenged.","offset":3697,"duration":14},{"text":"Andrej: Uh, yes, 100%. Yeah. But the thing is-","offset":3711,"duration":3},{"text":"Host: But now we go.","offset":3714,"duration":2},{"text":"Andrej: The thing is for LLM training, it actually is like very easily it like really fits the paradigm. Um, so you'd actually expect-","offset":3716,"duration":7},{"text":"Host: Yeah, clean metric.","offset":3723,"duration":1},{"text":"Andrej: Yeah, like LLM training actually fits the paradigm really well, really easily. 
Like all the optimization of all the code and so it runs faster, and then you also have like metrics that you can optimize against.","offset":3724,"duration":9},{"text":"Andrej: I do think that if you had an autonomous loop over those metrics, there's going to be a lot of like Goodharting going on where the system will like overfit to those metrics. And so but then you can use the system to devise more metrics and you just have really good coverage. So it's kind of hard to tell but um in a certain sense it's like a pretty pretty good fit.","offset":3733,"duration":17},{"text":"Host: I want to talk about a little um tiny side project you have before we end. Um tell me about the MicroGPT effort.","offset":3750,"duration":8},{"text":"Andrej: Oh yeah. Okay, so MicroGPT. So, I have this like running obsession of like maybe a decade or two of just like simplifying and boiling down the basically LLMs uh to like their bare essence. And I've had a number of projects along these lines, so like NanoGPT and um MakeMore and uh Micro- Micrograd etc.","offset":3758,"duration":20},{"text":"Andrej: So I feel like MicroGPT is now the state of the art of me trying to like just boil it down to the essence. Because the thing is like training neural nets and LLMs specifically um is huge amount of code, but all of that code is actually complexity from efficiency.","offset":3778,"duration":13},{"text":"Host: Hmm.","offset":3791,"duration":1},{"text":"Andrej: It's just because you need it to go fast. If you don't need it to go fast and you just care about the algorithm, then that algorithm actually is 200 lines of Python. Very simple to read. 
And this includes comments and everything.","offset":3792,"duration":10},{"text":"Andrej: Um, because you just have like uh your data set which is a text, um and you need your neural network architecture which is like 50 lines, you need to do your forward pass, and then you have to do your backward pass to calculate the gradients.","offset":3802,"duration":9},{"text":"Andrej: And so an auto-grad engine uh to calculate the gradients like 100 lines, and then you need an optimizer, an Adam, for example, uh which is like again 10 lines, really. And so putting everything together in a training loop is like, yeah, 200 lines.","offset":3811,"duration":12},{"text":"Andrej: And what was interesting to me like normally before like maybe a year ago or more, if I had come up with MicroGPT, I would be tempted to basically explain to people like have a video like stepping through it or something like that, um and I actually tried to make that video a little bit, and I tried to make like a little guide to it and so on.","offset":3823,"duration":16},{"text":"Andrej: But I kind of realized that this is is not really is not really adding too much because people because it's already so simple that it's 200 lines, that anyone could ask their agent to explain it in various ways and the agents- like I'm not explaining to people anymore, I'm explaining it to agents.","offset":3839,"duration":16},{"text":"Andrej: If you can explain it to agents, then agents can be the router and they can actually target it to the human in their language with infinite, um you know, patience and uh just at their capability and so on.","offset":3855,"duration":11},{"text":"Host: Right, if I don't understand um this particular function, I can ask the agent to explain it to me like three different ways, and I'm not going to get that from you.","offset":3866,"duration":7},{"text":"Andrej: Yeah. Exactly. Yeah. And so I kind of feel like, you know, what is education? 
Like it used to be guides, it used to be lectures, it used to be this thing, but now I feel like now more I'm explaining things to agents.","offset":3873,"duration":8},{"text":"Andrej: And maybe I'm coming up with skills um where like um so basically skill is just a way to instruct the agent how to teach the thing. So maybe I could have a skill for MicroGPT of the progression I imagine the agent should take you through if you're interested in understanding the codebase.","offset":3881,"duration":13},{"text":"Andrej: And it's just like hints to the model to like oh first start off with this and then with that, and so I could just script the curriculum a little bit as a skill. Um so so I don't feel like um yeah I feel like there's going to be less of like explaining things directly to people and it's going to be more of just like does the agent get it?","offset":3894,"duration":16},{"text":"Andrej: And if the agent gets it they'll do the explanation. And we're not fully there yet because they I still can I still think I can probably explain things a little bit better than the agents, but I still feel like the models are improving so rapidly that um I feel like it's a losing battle to some to some extent.","offset":3910,"duration":15},{"text":"Andrej: Um and so I think education is going to be kind of like reshuffled by this uh quite substantially um where it's the end of like teaching each other things almost a little bit. Like if I have a uh library, for example, of code or something like that, it used to be that you have documentation for other people who are going to use your library.","offset":3925,"duration":16},{"text":"Andrej: But like you shouldn't do that anymore. Like you should have instead of HTML documents for humans, you have markdown documents for agents, because if agents get it, then they can just explain all the different parts of it.","offset":3941,"duration":9},{"text":"Andrej: So it's this redirection through agents, you know? 
Um and that's like why so I think we're going to see a lot more of that playing out.","offset":3950,"duration":10},{"text":"Host: Well we'll see if the great teachers know like to develop intuition for how to explain things to agents differently.","offset":3960,"duration":5},{"text":"Andrej: Oh yeah. Ultimately, so for example, MicroGPT, like I asked I tried to get an agent to write MicroGPT. So I told it like try to boil down the simplest things, like try to boil down micro- neural network training to the simplest thing and it can't do it.","offset":3965,"duration":14},{"text":"Andrej: Like MicroGPT is like my is it's like my end of my obsession. It's the 200 lines. I thought about this for a long time. I've obsessed about this for a long time. This is this is the solution. Trust me it can't get simpler.","offset":3979,"duration":10},{"text":"Andrej: And this is this is my value add. Everything else like agent gets it. It just can't come up with it but it totally gets it and understands why it's done in a certain way, etc. So like my contribution is kind of like these few bits, but everything else in terms of like the education that goes on after that is like not my domain anymore.","offset":3989,"duration":17},{"text":"Andrej: So maybe yeah it's like education kind of changes in those ways where you kind of have to infuse the few bits that you feel strongly about the curriculum or the the better way of explaining it or something like that. The things that agents can't do is your job now.","offset":4006,"duration":13},{"text":"Host: Hmm.","offset":4019,"duration":1},{"text":"Andrej: The things that agents can do they can probably do better than you or like very soon. And so you should be strategic about what you're actually spending time on.","offset":4020,"duration":7},{"text":"Host: Well we appreciate the few bits. Thank you Andrej.","offset":4027,"duration":4},{"text":"Andrej: Okay.","offset":4031,"duration":1},{"text":"Host: Find us on Twitter at NoPriorsPod. 
Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.","offset":4032,"duration":15}],"logs":[{"elapsed":"0.0","message":"Downloading audio from YouTube...","detail":null},{"elapsed":"0.0","message":"Trying download with browser cookies (ad-free)...","detail":null},{"elapsed":"2.7","message":"⚠ Cookie download failed: WARNING: [youtube] [jsc] Error solving n challenge request using \"deno\" provider: Error running deno process (returncode: 1): \u001b[0m\u001b[1m\u001b[31merror\u001b[0m: Uncaught (in promise) TypeError: Cannot read prope","detail":null},{"elapsed":"2.7","message":"Retrying without cookies...","detail":null},{"elapsed":"36.4","message":"⚠ Downloaded without cookies — audio may contain ads","detail":null},{"elapsed":"36.4","message":"Audio downloaded (42.8 MB) in 36.4s","detail":"File size: 42.8 MB"},{"elapsed":"36.4","message":"Video title: Skill Issue: Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI","detail":null},{"elapsed":"36.4","message":"Audio duration: 1:06:31 (66.5 min)","detail":null},{"elapsed":"36.4","message":"Large audio (66.5 min) — will use chunked transcription with gemini-3-flash-preview","detail":null},{"elapsed":"36.4","message":"Skipping full-file attempt — using chunked transcription for 66.5 min audio","detail":null},{"elapsed":"36.9","message":"Split audio into 2 chunks for transcription","detail":null},{"elapsed":"36.9","message":"Transcribing chunk 1/2 (starts at 0:00)...","detail":null},{"elapsed":"36.9","message":"Uploading audio to Gemini File API...","detail":null},{"elapsed":"41.5","message":"Audio uploaded in 4.7s","detail":"File ref: files/932y2h9jszr9"},{"elapsed":"41.5","message":"Audio processed in 0.0s. 
Transcribing with gemini-3-flash-preview...","detail":null},{"elapsed":"134.7","message":"Chunk 1: 304 segments, last timestamp 44:54","detail":null},{"elapsed":"134.7","message":"Transcribing chunk 2/2 (starts at 45:00)...","detail":null},{"elapsed":"134.7","message":"Uploading audio (offset 45:00) to Gemini File API...","detail":null},{"elapsed":"137.6","message":"Audio uploaded in 2.9s","detail":"File ref: files/7yinj5oev3rq"},{"elapsed":"137.6","message":"Audio processed in 0.0s. Transcribing with gemini-3-flash-preview...","detail":null},{"elapsed":"189.4","message":"Adjusted chunk 2 timestamps by +45:00","detail":null},{"elapsed":"189.4","message":"Chunk 2: 125 segments, last timestamp 1:07:12","detail":null},{"elapsed":"189.4","message":"Chunked transcription complete: 429 total segments","detail":null},{"elapsed":"189.4","message":"Total cost: 100,249 in / 23,853 out — cost: $0.1217","detail":null},{"elapsed":"189.4","message":"Total transcription time: 153.0s — 429 segments","detail":null},{"elapsed":"189.4","message":"Analyzing topics across 429 segments with gemini-3.1-pro-preview...","detail":null},{"elapsed":"241.2","message":"Topic analysis complete in 51.8s — found 16 topics","detail":null},{"elapsed":"241.2","message":"Analysis tokens: 26,023 in / 1,412 out / 5,051 thinking — cost: $0.1296","detail":null},{"elapsed":"241.2","message":"Pipeline finished in 241.2s — total cost: $0.2513 (156,588 tokens)","detail":null}]} |