Why The Next AI Breakthroughs Will Be In Reasoning, Not Scaling
In this episode of the Lightcone, we dig into the results of a recent o1 hackathon hosted by YC to find out what can be unlocked when founders leverage a SOTA reasoning model.
Transcript
I remember about a year ago, one of these conversations around, are we gonna have AGI? What would that look like? One of the arguments for it was that, well, at some point, the AI will get good enough to just, like, design chips better than humans can, and then it will just, like, eliminate one of its bottlenecks for getting greater intelligence.
And so it feels like we're on the pathway to that in a way that we just weren't before. The last episode, we were talking about, you know, what are you gonna do with these two more orders of magnitude. Since then, Sam has told me that he actually wants to go to four orders of magnitude. It's the worst that these models are ever going to be right now, right this moment.
Week to week, you know, there are things that you couldn't do maybe a month ago that you could do really, really well right now. So that sounds like a pretty crazy moment in history. Welcome back to another episode of the Light Cone. I'm Gary. This is Jared, Harj, and Diana. And at Y Combinator, we've funded companies worth more than $600,000,000,000.
And we fund hundreds of companies every single year. So we're right there on the edge of seeing what is going to work both in startups and in AI. Recently, Sam Altman wrote this pretty wild essay that predicted that AGI and ASI are coming within thousands of days. Seeing him on Monday, he actually directly estimated, you know, between four and fifteen years. Have you guys read this essay yet?
And what do you think? Yeah. I read it. And one of the interesting places where I think we have a unique perspective is that we had a front row seat to the very beginnings of OpenAI, because OpenAI basically spun out of YC. And so what was cool to me reading this essay is that it's literally the same ideas that Sam was talking about in 2015 when he started OpenAI.
Like, he's been talking about this, like, basically since I've known the guy. And in 2015, when he said these things, he sounded kind of like a crazy person, and not that many people took him seriously. And now, ten years later, it turns out he was right, and actually we were much closer to AGI than anybody thought in 2015. And now it doesn't sound crazy at all. It sounds like totally plausible.
I mean, the essay itself is pretty much the most techno optimist thing I've read in a really long time. Some of the things that he says are coming are pretty wild. Space colonies, fixing the climate problem, your intelligence on tap, being able to solve abundant energy. Yeah.
I think he's basically ushering in this sort of Star Trek future on the back of literally human intelligence, being able to figure out all of physics. Yeah.
I remember back when he was starting OpenAI, one of the things that really motivated him to do it was he believed that when we actually had AGI, it would basically be better at doing science than humans are, and therefore would accelerate the rate of all scientific progress in every scientific field. That was part of the motivation from the very beginning.
And I think it was really connected to o one. Even when Sam came and spoke at our batch a year ago — this was long before o one was publicly released, but it was being worked on, you know, in secrecy by OpenAI — the thing that he was most excited to talk about was giving GPT more advanced reasoning capabilities. And I think this is the reason.
It's because, like, the thing that's missing from its ability to actually do science and, like, accelerate technological progress is it needs to be able to, like, think through things.
One thing that really strikes you about o one in particular is if you read one of the papers talking about it — its capabilities and potential for the future — it talks about how it does really well in chip design. And I remember about a year ago, one of these conversations around, are we going to have AGI? What would that look like?
One of the arguments for it was that, well, like, at some point, the AI will get good enough to just, like, design chips better than humans can. And then it will just eliminate one of its bottlenecks for, like, getting greater intelligence. And so it feels like that's already kind of — like, we're on the pathway to that in a way that we just weren't before.
Diana's gonna show a cool demo of doing exactly that. It's fun because we ran this hackathon with OpenAI, and Sam came over and judged the winners. And one of the participants was actually doing chip design. The company is called Diode Computer. I think we mentioned them earlier. What they're building is basically an AI designer for circuit design.
And their previous product — if you think about PCB design, there are four major steps, and the big, expensive part is that all of these need a lot of expertise. So, the system design: how do you really put together the architecture of it? How do you select all the components, like the resistors they need, the sensors, the specific processing units?
Then you need to go do the layout with schematics, placing, then doing the routing. And routing is known to be an NP-complete problem, because as you have different layers in circuit boards, there's interference. And this is why companies like NVIDIA, Intel, Apple have a gazillion electrical engineers, because this is an NP-complete problem.
Up to GPT four, which is what this company had built on, it actually put some constraints in and was able to automate a lot of the schematic design. Like, you, as a human, had to decide what components needed to go on the design. And to some extent the routing, if it was simple — which is still pretty cool up to that point. So they were able to automate all that.
But the thing that they demonstrated now with o one was it was actually able to do the system design and component selection, which is crazy. So it would be able to read all the data sheets and select the right components. So the way the product would work, you could say: I want to build a wearable heart rate monitor with an accelerometer and a microcontroller — very high level.
And given this constraint and looking at the database, it would be able to match the specific accelerometer and microcontroller and heart rate monitor sensor and connect it and just output the end result. What we are trying to build today is a wearable heart rate monitor, something like you would see in a Whoop, for example. o one is amazing, but one of the downsides is that it's a bit slow.
So we actually cached a pregenerated, like, system diagram that o one was able to generate. It's pretty good. It has a USB-C connector, an IMU, like we requested, a heart rate sensor, and, like, this is a microcontroller. So I'm gonna show you how you can go from this and, like, build a PCB. So we are gonna, like, build the project. The output of this is code.
We actually use atopile, which is an electronics-as-code language. And you can see that it took all the blocks in the block diagram and stitched them together exactly how we want. And the second step is it's actually going to generate a layout for the board. And so now, like, we can directly open it, and here you go. Here's the board. It's quite nice.
There's still, like, a couple of, like, fine tuning steps required. For example, we could, like, move, like, this USB type c connector slightly. We can, like, change the shape of the board. But but these are all the components. And then, like, thanks to the system that we built, we can, like, call the auto router on this specific board and actually get a fully working printed circuit board back.
So this is actually one of the examples in the o one paper — Yep. — that it would do EDA. But actually, they went a step forward, because in the example in the paper, they described EDA as this set of tools for circuit design. It does sort of the design of the schematic, and also the simulation and bug verification. It's easier to verify stuff than to select and write it.
So this company actually went a step further beyond the paper, because the paper did mostly the last stages of verification and simulation. I guess it's an interesting example of using different models for different tasks and in different workflows.
So in order to actually pick the correct components off the bat, you know, even before you place them on a circuit board, you've got to actually have, probably, RAG on structured — you know, taking unstructured data like PDF documentation and turning it into a structured form. It sounds like four o mini is being used to actually extract the data and then put it into a format for o one.
Yeah. I think this is actually a very common pattern that we're seeing in a lot of the interesting products built with AI. You use different kinds of models. So, yes, four o mini is for PDF extraction, and then o one for the reasoning, because it's actually very hard to select the right parts.
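The pattern described here — a cheap model for extraction, a reasoning model for selection — can be sketched in a few lines. Everything below is illustrative: the part names, fields, and stand-in functions are made up, and in a real pipeline each function would be an LLM call (a small model for extraction, a reasoning model for selection).

```python
import re

def extract_specs(datasheet_text: str) -> dict:
    """Extraction step: pull structured fields out of free-form datasheet text."""
    fields = {}
    for key, pattern in {
        "part": r"Part:\s*(\S+)",
        "interface": r"Interface:\s*(\S+)",
        "supply_v": r"Supply:\s*([\d.]+)V",
    }.items():
        m = re.search(pattern, datasheet_text)
        if m:
            fields[key] = m.group(1)
    fields["supply_v"] = float(fields["supply_v"])
    return fields

def select_components(parts: list[dict], constraints: dict) -> list[dict]:
    """Reasoning step: keep only parts compatible with the system constraints."""
    return [
        p for p in parts
        if p["interface"] == constraints["bus"]
        and p["supply_v"] <= constraints["max_supply_v"]
    ]

# Toy "datasheets" standing in for unstructured PDF text.
datasheets = [
    "Part: IMU-42 Interface: I2C Supply: 3.3V",
    "Part: HR-7 Interface: SPI Supply: 5.0V",
]
parts = [extract_specs(d) for d in datasheets]
picked = select_components(parts, {"bus": "I2C", "max_supply_v": 3.6})
print([p["part"] for p in picked])  # → ['IMU-42']
```

The point of the split is that the extraction step is cheap and parallel, while the reasoning step sees only clean, structured records.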
I know, Jared, you also work with a lot of hard tech companies, and the whole part of selecting whatever — the servos, the motors, the sensors — it takes a lot of thinking for a human. Yeah. The other thing I think is interesting about this example is, like, during the batch, before o one came out, Diode had tried to do this with GPT four o, and it just flat out didn't work.
And then they basically tried the same thing, the same prompts, but fed it to o one, and boom, all of a sudden, it worked. And so there really is this sort of, like, step function capability unlock. They were so excited when I talked to them and they showed me. They had this big smile. It's like, wow. They themselves were super impressed.
This hackathon that Diana ran, incidentally — I think it was a really interesting concept for a hackathon. Like, most hackathons are people who are just sort of building something that they plan to throw away.
And the cool thing about this hackathon is it was all actual YC funded startups that have real businesses, that have funding, that have like a real thing with real users, and they were all building actual features for their product that they plan to release to real users.
It was really cool, I think, for us to see how o one unlocked capabilities for real companies, not just, like, toy projects. Yep. There's another one that was similar in here in terms of reasoning for o one. I think, Harj, you work with camfer. Yep. So tell us what camfer does. I mean, the tagline is Devin for CAD. But basically, it lets you create CAD designs with just natural language.
You just type in something that you want to design, and it just spits out a CAD design for you. So: can you design me five airfoils optimized for 50 miles per hour with a minimum lift-to-drag ratio of 15 at a five degree angle of attack? This is very specific. Yep. Normally, this would require, actually, a mechanical engineer to be running all the simulations and solving the equations.
And what you're seeing — why it's, like, flashing — is it's running all the multiple simulations for them at the same time. So it's actually kind of like a copilot to SolidWorks. Yeah.
Initially, they were going to build this as a plug-in to SolidWorks, but they went for, like, the even harder technical approach, which is, now this is just, like, an executable that runs on your desktop, and it essentially opens up SolidWorks for you and — And then it just starts clicking around in the UI, pretending to be a person? Yep. Nice.
And you saw there what was really cool earlier: they flashed the math trace. So o one was actually able to write all of these equations, all these partial differential equations, and solve, basically, Navier-Stokes equations to actually solve the airfoil. That is really cool. The last episode, we were talking about, you know, what are you gonna do with these two more orders of magnitude?
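For a rough sense of the quantities in that prompt, the spec can be sanity-checked with thin-airfoil theory — a classic first-order approximation, nothing like the full Navier-Stokes solve the demo runs. The drag coefficient below is an illustrative assumption, not a camfer result.

```python
import math

def thin_airfoil_cl(alpha_deg: float) -> float:
    """Lift coefficient for a thin symmetric airfoil: Cl = 2*pi*alpha (alpha in radians)."""
    return 2 * math.pi * math.radians(alpha_deg)

def lift_to_drag(cl: float, cd: float) -> float:
    """Lift-to-drag ratio is just the ratio of the coefficients."""
    return cl / cd

cl = thin_airfoil_cl(5.0)                 # ~0.548 at a 5 degree angle of attack
print(round(cl, 3))                       # → 0.548
print(round(lift_to_drag(cl, 0.03), 1))   # → 18.3, with an assumed Cd of 0.03
```

So a lift-to-drag ratio of 15 at five degrees is in a physically plausible range, which is the kind of consistency check the solver is doing at much higher fidelity.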
Since then, Sam has told me that he actually wants to go to four orders of magnitude, to get to a trillion dollars in, you know, sort of spend. I mean, pretty wild. But on the other hand, like, you could see where that might go. You know, you could imagine — an airfoil is still, you know, very impressive and complex, but it's sort of what we're capable of doing today in 2024.
You could imagine abstracting that to, like, understanding the nature of physics, I suppose. Like, it would be sort of hard to see that maybe in the current version of o one. But if the scaling laws hold, it seems entirely plausible that, you know, far more difficult engineering challenges, such as, you know, room temperature fusion, are all sort of ultimately engineering problems.
There's weather prediction. There's all these complex physical phenomena that are very hard to solve — you basically need PhDs. And to Sam's essay, this is a glimpse into what AI and where o one is heading with this chain of thought and reasoning. Especially, like, Sam's essay — the vibes are sort of training intelligence and this new age of intelligence.
And then the o one paper — I think this whole idea that now you can actually give, like, feedback, not just on, like, the outputs and whether you got the correct answer, but, like, on all of the steps to get there. And, like, you're basically teaching a model how to think. Like, the camfer guys were mentioning it to you. Right?
The reasoning traces, and they can probably go back and, like, fine tune the various steps for, like, every output to make sure that the model's thinking how they want it to think.
That, again — the AGI conversations a year ago, I feel like, were all sort of in this direction of, like, what happens once you can actually start teaching the models to think better, versus just, like, spitting out the correct answers. And then the scaling laws — this is, like, even more surface area for, like, throwing compute at the problem. Yes. Right?
Like, now you can just basically put compute at the inference step and — And iteratively have something come out that, you know — you can actually spend more money and more time and have a result that iteratively gets better, similar to what you might expect from a human scientific organization. Yep. Maybe more consistently, even.
Diana, do you wanna talk about the architecture and how they actually created o one? I think a lot of it is inspired by what they've been working on for many years, since the beginning of OpenAI. I think one of the inspirations is a lot of the work they did with Dota. Yeah.
Does everyone remember when, like, before OpenAI was famous for GPT, the one thing that it was, like, kind of famous for — that at least people in the tech industry knew — was Dota, was, like, winning video game competitions. That was their first big breakthrough. And I think back then, Dota wasn't something that took the world by storm.
I mean, maybe only the research community kinda knew about it, but it wasn't anything practical. But what was impressive was it was beating a lot of the best Dota players. So Dota is this complex game of resources and planning. Right?
And they implemented a lot of kind of reinforcement learning type of techniques in there, which I think were also inspired, early days, by AlphaGo and AlphaZero as well, and how it solved Go. Yeah. It wasn't just brute forcing through it, but actually having a reward function and trying to solve towards it. And — Yep.
Even this is why there's just so much talk about Q learning, because that's sort of the fundamental algorithm behind the family of algorithms behind RL. So, yeah. So, like, because of Dota, they got really good at doing reinforcement learning. That's how they got it to work. They just had it play against itself, like, a million games. And then how does that connect to o one?
So I think this is where there's a big step function, because how do you then incorporate that into the family of GPT-type models? GPT is all generative, based on predicting the next token and patterns. And then getting those results to check that they're correct.
So I think a lot of it is you had to have a lot of data that was factually correct, fed into the model and the training, and having a reward function that gets it to reason a bit more about the output and make sure that it's correct. So they've probably done a lot of interesting techniques with that, and there's probably a lot of secret sauce on the type of data sources they've used.
Maybe one of the speculations we could do is a lot of very factually correct information. Like math problems and science problems and things like that. Yeah. And that's why it outperformed so much in those. Yes. Yeah.
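The speculation here — that verifiable domains like math suit RL-style training — can be illustrated with a toy reward function. When the answer can be checked programmatically, a reward signal comes for free. The candidate answers below stand in for model outputs; none of this reflects OpenAI's actual training setup.

```python
def reward(problem: str, answer: int) -> float:
    """Reward 1.0 iff the candidate answer matches the checked result."""
    # Ground truth computed directly, because arithmetic is verifiable.
    # (eval on a fixed literal, toy only; never eval untrusted input.)
    truth = eval(problem)
    return 1.0 if answer == truth else 0.0

# Stand-ins for sampled model answers to the problem "3 * 4".
candidates = [11, 12, 13]
scores = [reward("3 * 4", a) for a in candidates]
print(scores)  # → [0.0, 1.0, 0.0]
```

In open-ended domains like creative writing, no such checker exists, which is one plausible reason the gains concentrate in math and science.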
One of the things I think is interesting, Gary, to your point about the scaling laws is a lot of people are really focused on the next, like, scale up of the model, like the GPT five series of models, which are being trained now and people are working on them and they are gonna come out. But I think people may be underappreciating how big an unlock this other direction is.
Because there's two research directions being explored in parallel. Right? Like, one is the straightforward scale up of the underlying LLM, and then this o one direction is, like, a totally orthogonal research direction, in which you unhobble the model by having it do reinforcement learning while actually trying to do things in the real world and getting better at them.
The version that's come out so far is still only o one mini and o one preview. And, like, if you look at the performance stats they released, the full o one model, which is coming out any day now, is a huge step function above even o one preview, which is what enabled all these incredible results at the hackathon.
Sam was just telling us that, like, o two and o three are not far behind. And so, like, I think people may be underappreciating just how big an unlock we're gonna get. Yeah. And o one also is really opaque still. I mean, from a, you know, sort of business perspective, this is a new method. I think at great cost to themselves, they actually did create a new dataset to train the chain of thought.
You know, it's essentially a giant dataset of, you know, given task x, can you break it down into parts? And, you know, what's funny is this sort of rhymes with what Jake Heller figured out for Casetext — Yeah.
If a given task that you give an LLM is hallucinating or, you know, not consistently giving the output you want, you're trying to make that particular prompt do too many things. You need to break it down into steps. And so what's funny is Jake's prescription is really two parts. You know, one is break it down into steps, and then the other part is evals.
And it sounds like basically with o one, the chain of thought will replace the workflow. So you might not need to break it down into steps yourself, but the evals are still really important. Mhmm. Even, like, in the aftermath of that episode with Jake Heller, it sounds like some YC alums are reaching out and saying, that episode helped us figure out and unlock something really big. Like — Yeah.
A lot of people really were just raw dogging their prompts. Yeah. They got to — you have an example of the company you work with, Dyer, that got to 100%. Yeah. Just by doing exactly what Jake recommended, which is, like, having a really big eval set and being very careful about testing every step of your reasoning pipeline.
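The eval discipline being described can be sketched as a tiny harness: a table of (input, expected) cases checked against every step of a pipeline, so a regression in any step is caught immediately. The steps here are trivial string functions standing in for LLM prompt steps; the ticket format and step names are made up for illustration.

```python
def step_extract(ticket: str) -> str:
    """Step 1: pull the issue text out of a 'order N: <issue>' ticket."""
    return ticket.split(":", 1)[1].strip()

def step_classify(issue: str) -> str:
    """Step 2: route the issue to a category."""
    return "refund" if "refund" in issue else "other"

# The eval set: each case records the expected output of EVERY step.
EVALS = [
    ("order 9: please refund me", {"extract": "please refund me", "classify": "refund"}),
    ("order 3: where is my box", {"extract": "where is my box", "classify": "other"}),
]

def run_evals() -> dict:
    """Score each pipeline step independently across all eval cases."""
    scores = {"extract": 0, "classify": 0}
    for ticket, expected in EVALS:
        extracted = step_extract(ticket)
        scores["extract"] += extracted == expected["extract"]
        scores["classify"] += step_classify(extracted) == expected["classify"]
    return {k: v / len(EVALS) for k, v in scores.items()}

print(run_evals())  # → {'extract': 1.0, 'classify': 1.0}
```

Scoring per step, rather than only end to end, is what tells you which prompt to fix when accuracy drops.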
So one of the theories that I have now is — ultimately, like, if you superimpose that on, what is a moat? I mean, that's one of the key questions that everyone's sort of asking themselves right now. You know? Okay. Like, GPT five's coming. Two more orders, maybe four more orders of magnitude are gonna come, in terms of a trillion dollars spent on more training. That's pretty wild.
You know, if I'm a wrapper company or I'm trying to do vertical SaaS or I'm trying to, you know, build my own business, what do I do? My theory would be it's the evals.
It's, you know, write the 10,000 test cases. And the only way you get access to the test cases that are proprietary data, that are not, like, commonly available, is that you literally — you know, that's what a bunch of our companies in this current YC batch are doing. They're doing the hard work of doing enterprise sales.
They're getting embedded and sort of going quote unquote undercover into these, you know, sometimes really boring jobs, sometimes really complex or arcane jobs. You know, it's everything from, you know, I think accounts receivable all the way over to how do you do, like, financial accounting or forensic accounting. Like, it's just all kinds of things that are really not readily available.
You can almost argue that anything that is consumer and publicly available on the Internet, that's gonna be in the base model. Yep. So then your moat ultimately is for all of the other things that are not already online — whether it's, you know, for Casetext, being a lawyer, or maybe over here on science, or, you know, in terms of building airfoils.
Like, what you're trying to find is the data that is proprietary in some use case, some vertical, that allows you to build that 10,000 test case eval, and then that's actually the value. I mean, this is just a crazy theory. But — Yeah. It actually — Yeah. Might be what happens. Totally.
An interesting implication for startups, based on everything you just said, is it may be worth thinking about which of your customers — picking the ones that will pay you a lot for that final, like, 10% of accuracy and perfection.
I think, like, camfer is actually a good example of it, where there's lots of interest in this sort of text-to-CAD design amongst, like, hobbyists or people who wanna prototype things and get something up and running very, very quickly.
But there's also, like, a segment of the market where it's people who are literally designing, like, you know, airplane parts where there is no room or, like, margin for error. And o one makes it quite easy or easier now, right, to get to, like, the prototype, like, you know, 80% of the way there.
But I think the strongest technical teams have the option to go all the way and go after the segment of customers who want, like, a hundred percent accuracy and will pay a lot for it. Harj, always go all the way. Yeah. Well, I always go all the way.
But I think it's interesting, because one of the things that gets pushed is, does o one, or does AI in general, actually commoditize a lot of the tech and make it less important to be a strong technical team? And it just seems unlikely to me. It seems like, actually — It seems it's the opposite. Yeah. Yes.
Like, all of the value is probably gonna be captured by, like, the strongest technical teams who can build on top of whatever the base level of tech is and get the final 10%. Gary, I think it's the prompts, it's the evals, and it's also, like, the UI layer and the integrations that go around it.
Because, like, just the prompts themselves are not a product. For a company to actually adopt camfer, like, it needs to actually integrate into their existing tools. It needs to have a well thought through UI and — Yep. — workflow and all the tooling to sort of make the prompts useful. Yeah. Well, and then it's distribution. Right? Like, how do you actually get in front of people?
How do you establish your brand? And then a perfectly good moat is difficulty switching, actually. Once you have all your data and it's working, and they're paying $10,000 or $100,000 ACV, sometimes $1,000,000 to $10,000,000 ACV — you know, man, it's gonna be hard to switch. So all the classic moats still apply.
You know, this is still software, but, you know, you can unlock this capability — you know, this is a moment. You know? Another point to double down on the importance of evals is that that still applies in the world of o one, as founders are wondering, how am I gonna still build the best product on top of o one? Does it change?
And everything we discussed in the episode with Jake Heller applies, because GigaML is this company that Harj works with. Yep. And Gary, too. Right? Yeah. I adopted them. That's true.
Will you tell us a bit about what they do? The full backstory is we funded them for a completely different idea. They're an Indian founding team, and their original idea was something about helping Indian high school students apply to US colleges. Very, like, niche idea. But they're super cracked IIT AI engineers, researchers. Yeah.
And it just, like — it just happened. Like, we were like, this is not a great idea. AI is, you know, changing the world, and, like, your research that you've been doing at university, at college, is all aligned with, in particular, fine tuning LLM models. Originally, it wasn't even the AI version of helping Indian high school students apply — it would just do AI, actually.
So it's like a classic YC story where it was like, these are two clearly brilliant engineers. We don't like the idea at all, but we should just fund them anyway and hope something works out.
And the idea they actually pivoted to initially, which they raised the seed round for, was helping companies fine tune open source models so that they could get to, like, equivalent performance as, you know, what at the time was really only OpenAI.
But I think, in general, what we found is that those have not proven to be great businesses, because the cost of the models has just gone down and the performance of the open source models has gone up. You just haven't had to fine tune as much as people thought you would need to. Yeah. Because the models just keep getting better.
It's kinda betting in the opposite direction on AGI — let's just trust that these models are gonna keep getting better and better, which, ergo, doesn't require as much fine tuning. Yeah. And so they pivoted again into, well, like — we're really good at AI now.
We're, like, world experts in, like, fine tuning and squeezing performance out of these models. So let's just, like, find a vertical application for that. And they went into AI customer support, which is, like, competitive. But, again, I just think if you're an intensely technical team, you can still find ways to squeeze out, like, a comparative edge against other teams in the space.
And I think that's what they've done. The problem with customer support is you're dealing with very kind of squishy problems. There's just so many edge cases. The space of things that could go wrong as a customer rep is enormous. Well, it seems competitive, but the thing is, hardly any adoption has actually happened.
Like, it's not like the world has replaced all the customer support agents with AI yet. We can all see that it's going to happen, but it hasn't happened yet. And so from that standpoint, like, it's wide open.
What I found, at least when I spoke to the GigaML team last, is that part of the reason for the lack of adoption is that rules based systems work fairly well for most of the simple cases. And there's just not trust or belief that you can, like, build AI that's good enough to solve, like, the real messy stuff.
And so most companies that were pitched on, like, an AI customer support agent were, like, well, you can't actually go all the way and solve, like, the, like, hardest problems that take up most of the time, and the rules based system works, like, totally fine for everything else. And so I remember when they were first pitching this idea, people would just be like, this is just overkill.
We don't need it. Our rules based system works totally fine. But it seems like that is no longer the case. Yeah. Because they now have some really legit customers. Who? Zepto just signed up. Okay.
So last time I did office hours with them, they said that they automated 30,000 tickets per day. And, you know, I think Zepto had more than a thousand people working on those 30,000 tickets per day, which is, you know, 30 tickets per person per day.
And then the interesting thing was, you know, on the one hand, you know, this is probably one of the things that frankly everyone when they think about AI, they're a little bit worried like, you know, are these jobs gonna go away? And the interesting thing about the Zepto customer support job is that it's so not a fun job that I think the turnover rate was something like a few months.
Like, you know, most customer support agents only wanted to work there for, you know, six months or less. So this actually is an interesting case of — when something is incredibly rote, it's literally replacing butter passing. Like, these are sometimes not really actually good jobs.
And, you know, hopefully those people can go and do something way more awesome with their time and, you know, brains than, you know, these rote jobs of apologizing for Zepto orders that got misplaced. Exactly. Right.
But the crazy thing they figured out with o one is that, to your point, Harj, their previous implementation before o one was GPT plus rules and all that, and it would not be able to handle most of the cases. It would have about a 70% error rate. Now what they did is the technique like Jake Heller described — really going hardcore on the evals, plus o one.
During the hackathon, they got to only 5% error, which is an order of magnitude improvement. Yeah. The other part of this is incredible too. Right? This is what I was saying — like, the complex things, the things that are, like, very complicated, that take up lots of time and are expensive to solve — essentially, it could not do them. They were stuck. Yeah.
So, basically, it was just, like, 0%. And that's what they were encountering when they were selling this: a lot of people were — well, actually, all of the stuff that we want to automate is these, like, complicated edge cases that waste lots of time. And, like, they just couldn't actually do any of that.
But, like, now they're at, like, 15%, and that's with, like, o one preview alone. Oh — is that 15% error? So now they're at, like — They're at 85%. Yeah. So they went from 0% accuracy to 85% accuracy. Yeah. So the interesting thing here is that o one — it's not even o one yet.
It's o one preview. And then it's such a new technique that I think they're trying to protect their advantage right now. You know, if you use o one in ChatGPT, it looks like it will tell you what's really going on. But apparently, they have a fake model that just spits out things to give you the impression that it's breaking it up into steps.
And they've actually, you know, hidden it because they don't want other people to have access to that data yet. But the next step seems like it needs to be some interpretability, directability, and then for that to happen, you know, I'd be curious if o two ends up having that. Like, you want to be able to see, okay, well, show me the work. Show me the steps.
And, oh, like, that step, the third step, you know — can we rerun this, but I want this to branch in this way? Or edit it. I think this is one of the things that would be the next unlock: right now, it has the plan that it comes out with, the chain of thought, but you cannot edit it. So imagine — right now, today, o one just outputs, whatever, 15 steps to the problem that you need to solve.
And imagine now being able to edit each of the steps. Then you get into the super, super fine-grained next level of Jake Heller. So this is — it's the worst that these models are ever going to be, right now, right this moment. And, you know, literally week to week, there are things that you couldn't do maybe a month ago that you could do really, really well right now.
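The editable-plan idea can be sketched as keeping the chain of thought as an explicit list of named steps, swapping one out, and re-running from that point. The steps here are plain functions standing in for model calls; the names and operations are made up for illustration.

```python
def run_plan(steps, value):
    """Run each step in order, recording (step_name, intermediate_result)."""
    trace = []
    for name, fn in steps:
        value = fn(value)
        trace.append((name, value))
    return trace

# A two-step "plan", analogous to a visible chain of thought.
plan = [
    ("double", lambda x: x * 2),
    ("add_ten", lambda x: x + 10),
]
print(run_plan(plan, 3))  # → [('double', 6), ('add_ten', 16)]

# Edit step 2, then re-run only from the last good intermediate value (6).
plan[1] = ("add_hundred", lambda x: x + 100)
print(run_plan(plan[1:], 6))  # → [('add_hundred', 106)]
```

Keeping the trace explicit is what makes "rerun from step three, but branch this way" possible without paying for the whole plan again.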
So that sounds like a pretty crazy moment in history. So we've been talking a lot about the kinds of companies and ideas that get this wave of uplift from this model improvement, from o one. What are the kinds of ideas that are the opposite, that are not benefiting as much from o one, and where perhaps people even should pivot?
Because they might just get deprecated by the improvements of o one, o two, o three.
I wouldn't go all the way and suggest they should pivot, but I do think companies that are building AI coding agents or AI software engineers potentially have stuff to think about here, because it seems like o one in particular is, like, outperforming on just, you know, solving programming problems, essentially. And I certainly know some of the teams I worked with in the past —
like, a lot of what they've invested in is, like, the chain of thought infrastructure behind this stuff, and now o one is not actually, like, any leap forward for them. They've already, like, invested in that already. And so I think that might be a function of basically the opaque nature of what the chain of thought is.
And once you get it to be directable — that's actually, I mean, frankly, what users in codegen are struggling with even right now. Like, once it starts going down a certain path, you can't really alter things. Like, you want it to ask you: hey, do you want me to do it like this or that? And, you know, all of the systems are struggling a little bit with that right now.
I was gonna ask the inverse question, Diana, which is, like, each new model capability unlocks a new set of startup ideas. Like, a year ago, startup ideas where, like, the AI agent would talk on the phone just, like, didn't work. We had a bunch of companies that tried, and they just didn't work.
And over the summer, it really started working. And the trend from the past two batches is, like, anything around, like, phone calling is, like, blowing up right now, because the models finally work. So, like, with this new o one series of models, what are the startup ideas that, like, just became possible?
To connect to Sam's essay — it's a lot of things that are gonna make the atom world, the physical world, better, because it's really good at math and physics.
So any startup that's working around mechanical engineering, electrical engineering, chemical engineering, bioengineering — all of these things that really will make our lives better — I think is really getting an unlock, as we've seen from the demos we highlighted. That's exciting. I mean, it can't just be helping people click a little bit faster.
It's gotta be things that actually create real world abundance for everyone. And that it might just be a little bit of a race. Like, I think there's sort of the fear of AI out there in society right now, and then it's sort of up to the technologist to try to usher in this age of abundance sooner rather than later. And if we can do that, then abundance will win out over fear.
So with that, I think we're out of time for this week of the light cone. We'll see you guys next time.