
Design Experts Critique AI Interfaces

YC's Aaron Epstein and Raphael Schaad, founder and designer of Cron (now Notion Calendar), review a wide variety of AI user interfaces, from agents and adaptive UI to video prompts and realistic voice APIs, to get a glimpse of how AI is changing the future of software.

Transcript

Speaker 0:

Over the next decade, new AI user interfaces are going to emerge beyond the common chat UI that we're all familiar with. So in this episode, I will be joined by Raphael Schaad, the creator of Notion Calendar, and together, we'll review a bunch of cutting edge AI interfaces that will give us a glimpse of what the future might look like. Welcome to another episode of Design Review. Raphael, welcome.

Thank you for joining us today. Thank you so much for having me. Yeah. Talk a little bit about how you think interfaces are gonna change in the age of AI.

Speaker 1:

We'll look at some examples in a minute, but from a high level, what are the differences between kind of the, say, static web based, 2D interfaces that we're used to today with where things are going in the future? Sure.

I think a big difference is that software of today or kinda like up until this point was mostly kind of like just clear things you can point at on the screen that are, you know, kinda nouns, like text, forms, drop downs, buttons, etcetera. And with AI, what really changes is I think so much of the design of what AI does is kinda more verbs.

It's more the workflows, auto complete, auto suggest, go out and gather some information for me, etcetera. And we don't really have the tooling yet to kinda draw verbs on the screen, and so that's what's really fascinating: how, you know, the software is now emerging in this new AI world. Awesome. Well, let's take a look at some examples that were submitted by the YC community.

The first one that we're gonna take a look at here is Vapi, which is voice AI for developers.

Speaker 0:

Alright. So Vapi lets developers build, test, and deploy voice agents in minutes rather than months. And it looks like there's a microphone icon here where we can just click in and give it a try. Welcome to Vapi.

Speaker 2:

I'm an assistant that you can talk to like a person.

Speaker 1:

So how's your day going? Going great. And it looks like this demo on the website is really targeted at the developer that would actually implement this technology, not so much the user that will eventually use it.

Speaker 2:

You're spot on. Vapi is definitely more geared towards developers who want to integrate voice assistant capabilities into their apps or phone numbers.

Speaker 1:

But, hey, even if you're not a developer, feel free to give it a spin. So one of the first things that I noticed was that when I was speaking, there was no visual feedback making it clear that my voice was actually recognized by the microphone. And then similarly, when the voice was answering, there was no sort of visual indication that that's what's happening.

So for example, if our laptop was on mute, we were not sure whether the demo was broken or what's going on. Right. So important, I guess, to kind of pair multimodal cues, so not just rely on voice, in these types of scenarios where you do have a screen.

Speaker 0:

On the phone that would be a different scenario. I think one of the other interesting things to try here is when you're talking to a human, the latency is really important and also interruptions. And it felt pretty fast and pretty natural when we were conversing. I wonder what would happen if we tried to interrupt it. Would it be able to handle it? Let's try it.

And one of the things that I actually do really like here is that they always, because this is developer focused, they always rendered

Speaker 1:

kinda like a little label that shows you instantly for each answer the milliseconds of the delay, really kinda building you an intuition. You know, how many milliseconds feels natural versus kinda feels like, oh, I'm talking to a robot. That's a great point because the latency

Speaker 0:

is the interface in some ways, in that how fast it responds to you, the longer it takes, the less it feels like a natural conversation and the more it feels like you're talking to a robot. The whole point is to make it seem like you're talking to a human. Yeah. And it almost seems like we're basically talking here to kinda like the dev mode

Speaker 1:

of this technology. And so whenever you can and are in this situation, like, what are all the things that you might be able to expose, such as kind of the milliseconds of latency? Kinda like think of it as dev mode versus sort of like production mode. Yep. Let's try to interrupt her here. Welcome to Vapi. Hi. I was wondering if you could tell me what the weather is.

Like a person. So how's your day going? Alright. So two things happened. One, it didn't pause when you were interrupting. And then two, it entirely missed your question when it actually got done with its own sort of agenda.

Speaker 0:

Yep. Awesome. Well, I think my takeaway here is one, developer focused. Number two, showing the latency was actually key because that's so much of what makes this actually feel natural as a human interface.
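That latency badge is easy to prototype. Here's a minimal sketch in Python; the function names and the ~800 ms "natural" threshold are hypothetical illustrations, not Vapi's actual values:

```python
import time

# Hypothetical threshold: round trips under ~800 ms tend to feel
# conversational, slower ones start to feel robotic. Tune per product.
NATURAL_MS = 800

def latency_label(ms: float) -> str:
    """Format the per-answer latency badge a dev-mode UI might render."""
    feel = "natural" if ms <= NATURAL_MS else "robotic"
    return f"{ms:.0f} ms ({feel})"

def timed_reply(agent, prompt: str):
    """Call a voice agent (here, any callable) and attach measured latency."""
    start = time.perf_counter()
    answer = agent(prompt)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return answer, latency_label(elapsed_ms)
```

Rendering that label next to every answer is what builds the intuition the hosts describe: you see at a glance which responses crossed the "robot" line.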

Speaker 1:

And the voice felt like I was talking to a human, which was pretty incredible. Pretty incredible that, you know, startups can now have these things that previously kinda felt like a huge company

Speaker 0:

needs to build this technology. Yeah. It's amazing. It's like you would have thought this would be a hundred years away and now Yeah. It's available today. Totally. Great. Thank you, Vapi.

Okay. Next up, we've got Retell AI, and they say supercharge your call operations with voice AI. So this is another voice AI. What I thought was interesting on this one, we can actually try it with live phone calls and having their AI voice agent call us and run through some of these different scenarios that they have.

A receptionist, setting appointments, lead qualification, surveys. But I haven't been paying my debt, so we're gonna go debt collection here. Alright. And I'll put in my phone number and my name and email address. Submitted. Wow, I'm actually getting a call. My phone is ringing. Hi, is this Aaron? Yes, this is Aaron.

Speaker 3:

Hi, Aaron. This is James from Retail Debt Collection Company. The reason for my call today is to discuss the outstanding balance on your account which is currently $800. This balance is past due. Have you received our email notification about this? Oh, I'm sorry. This is not Aaron. This is Steve.

Speaker 0:

Oh, my apologies, Steve. Can you connect me with Aaron by any chance? You know, I haven't seen Aaron for a while. He might be on vacation or something.

Speaker 3:

Got it, Steve. Sorry for the mix up. We must have made a mistake. Hope you have a wonderful rest of your day.

Speaker 0:

Alright. I successfully evaded my debt collection. Awesome.

Speaker 1:

First of all, latency is an issue. That's what kind of like breaks the illusion of this being a real person. Yeah. I think one of the things that stood out to me is like,

Speaker 0:

I threw it a curveball and it totally adjusted, because when I typed in the form here, I told it that my name was Aaron. And a simple system would just keep referring to me as Aaron throughout the entire call. And halfway through, I told it, oh no, actually this is not Aaron, this is Steve. And it referred to me as Steve after that. That was really cool. Yeah.

So it actually learned from the conversation. And to, like, build on that, there's room for improvement where you could then not just say, like, oh sorry, I made a mistake, but actually follow sort of like your cue of, you know, knowing

Speaker 1:

Aaron. Right? But he's on vacation. But it kinda like just shut down, and that's when it started to feel a little bit more robotic, and there's clearly kinda like more room for improvement, but already like a great first step that it picked that up. And again, like the voice felt incredibly realistic. It felt like I was talking to a human. I agree with you.

Just the latency was the only thing that gave me a clue that maybe this was not a human. Yep.

And one of the interesting things here is that this may kind of be technology used in a first line of, sort of like, you know, defense, and then you can kind of bring a human into the loop, when, sort of like, you know, maybe 50% of these calls go through robotically, and then you can bring in the human for the rest, like for example, you. Yes. Absolutely.

Speaker 0:

And there's a transcript and a lot more information that a human could then follow up on. So this eliminates a lot of the grunt work or, you know, work that people are just kind of outsourcing

Speaker 1:

to try to get done in a more automated way. Totally. So you're hinting kinda, you know, at the back side of this product that we currently don't see. But you can imagine like really rich AI UI that then kinda shows what happened during the call to the call center operator.

Speaker 0:

That is a whole other side of this technology and this company. Well, Retell AI, this was amazing. Thank you. Now let's take a look at some AI agents. And so agents are effectively autonomous AI that can go out behind the scenes and can interface with websites. It can make phone calls. It can interact with users.

Lots of things autonomously, and then kind of bring back their findings or complete actions on behalf of the business. Right. They have a few high level instructions and then just go execute autonomously. Yep. Exactly. So the first one we should take a look at here is Gumloop. 10x your output with AI automation. No coding required.

And I think what they're alluding to here is that when you have these agents, they're executing autonomously Mhmm. And they're making decisions at each step along the way of what to do next. And it's hard, as a human who's trying to oversee this and make sure that they're doing the right things, to monitor that and make sure that they're on the right track.

So this I think is one way to do that, which is to show visual workflows, which is I think what we're gonna get from Gumloop here. Cool. Let's give it a try. Let's try starting from a template. And we've got one here for web scraping. Scraping the YC directory. Alright. Let's try this template.

And let's begin customizing the template. Okay. Cool. So it looks like what we've got here is a big open ended canvas that we can pan around and zoom in on that gives us a bunch of boxes for each step in the flow. Mhmm. And the canvas has really emerged as a really interesting kind of almost new document type Mhmm. That

Speaker 1:

seems to lend itself pretty well not just to kind of design tools or kind of brainstorming tools, but lends itself really well to modeling these kind of like AI processes.

Speaker 0:

Yeah. It's great because it gives us, the user, a visual overview of exactly what steps the agent is going to take. Mhmm. And we can control what it should do at each of these steps. So for example, the first one here is, it gets input, which it says specify the batch code like W24, and go get a coffee. So this is what it should ask the user at this step.

Then it combines text, where it basically combines it with this URL right here. And then it will take that and it will use the web agent scraper to execute the URL, to get all of the data from the website. It will then extract the data that is on that website, combine text, put it in a list,

Speaker 1:

and so you can see all the steps here. So one of the things it does pretty well is using color to show different types of nodes, kinda like input, actions, output, etcetera. I almost feel like I would want like a legend, like which color is what. And then because it is a canvas, kind of having different zoom levels showing different fidelity.

So right now, we're so zoomed out, I can't read any of the small text. Mhmm. Why not just kind of hide it and almost collapse the node into just, you know, in this case a brown block, in this case a yellow block, to kind of give different zoom levels different fidelities. I think that's kind of one way this can really go.

And then here we have, a pretty linear flow, but the canvas and modeling these kinda like AI agent decision trees gets really really powerful when it isn't something you could just kinda like linearly write in a document, like a recipe. First do this, then do this, then do this. But really the power is in sort of like the multi dimensionality and the branching. Mhmm.

And so as a starter template, to kinda like explain the power of this tool to model these processes,

Speaker 0:

I think one that is multidimensional would really showcase the power of this. Yep. That's a great point. I mean, another advantage for this too is that you can just add text blocks kinda next to these as instructions.

So here was a template that we just pulled down to get started with, and it's got big text explaining, you know, if you're gonna customize this, here are some things that you should consider, which is helpful as you're just jumping in and trying to learn a product for the first time.

But I think we're gonna see more and more of this canvas interface as we have complex workflows that are customizable by users. Ten years from now, this may be the standard way to interact with, control, and monitor a lot of these agents.

Speaker 1:

If they're truly everywhere and doing you know so much of the work that we as humans don't want to spend our time doing, this is gonna be the way that we control it. Yeah. And the interesting thing is that you know like kind of what is old is new again. So kind of flowcharts, etcetera. Probably like chip designers like fifty years ago. Yeah. They're like, oh yeah.

We used to, you know, kind of model our things like that. And so it's interesting to kinda like see this paradigm kind of getting resurfaced in the AI era. So, you know, we didn't invent this today, but we're building on a lot of legacy,

Speaker 0:

and standing on the shoulders of giants here. Yeah. And it's always historically been static, and it seems like what's new is actually making it interactive. Yeah. Awesome. Thank you, Gumloop. This is a pretty incredible product, and you can tell a lot of attention to detail was paid here. Alright.

Next up for AI agents, we're gonna take a look at AnswerGrid. AnswerGrid has answers at scale and there's a text box here where I can put in some input, but it looks like they've got some suggestions for me. And this is a really nice pattern. Right? Whenever you have a free form.

Speaker 1:

text box where basically you're expected to prompt, write text as an instruction for the LLM, kind of the AI engine, to use as an input. Having some examples, to just kinda like turn examples into buttons where with a single click you can basically fill out like a pretty reasonable example, is something that's, you know, really good to start. Yep.

Sometimes it's hard, you come here and you're like, I'm not sure what to use this for, and you're staring at the blank canvas effectively, and this gives you some suggestions to make it easy to get the value of the product. And I would take this even a level further.

I think it's not just for examples, you know, as a demo, but what if you can guess or infer from other context of your application what might be, like, a good prompt even as you're kinda using the application, and then make them single click as opposed to just being static examples.

So collapsing sort of like suggested prompts into buttons I think is like a really interesting kind of dynamic.

Speaker 0:

pattern here. Yeah. I'm about to click on AI companies in San Francisco and there's a lot that happens behind this. It seems like a simple button, but alright. So pre filled it, AI companies in San Francisco, and wow, it's pretty fast. Okay, so I get a bunch of companies that I'm sure many of the audience has heard of.

OpenAI, Anthropic, Perplexity. And it's got some information like HQ location and industry and website URLs. I think what's interesting here is to click on the plus button here and we can add something different. So let's just say funding raised.

Speaker 1:

And basically what happened here is we took a prompt as an input, and we got the spreadsheet, structured data, as an output. And in the background, it went to these websites, scraped it, assembled this spreadsheet, and now what we can do is add columns and have sort of the agent go out again,

Speaker 0:

not on a sort of like static, you know, set of columns that were predefined, but our columns. Yeah. Like things we wanna know, kinda putting the human back into the loop. Yeah. And what's cool is it's almost like every cell of the spreadsheet gets its own AI agent to get the data that we want, which is pretty incredible. It's like a spreadsheet on steroids. Mhmm. Okay.
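The one-agent-per-cell idea maps naturally onto concurrent tasks. A minimal Python sketch, with a stubbed lookup standing in for the web-scraping agent; all names and data here are illustrative, not AnswerGrid's actual API:

```python
import asyncio

# Stand-in data; a real cell agent would search the web and scrape sources.
FAKE_WEB = {
    ("OpenAI", "funding raised"): ("$6.6B", "https://example.com/openai-round"),
}

async def cell_agent(company: str, column: str) -> dict:
    """One agent per spreadsheet cell: look up a value and keep its source."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    value, source = FAKE_WEB.get((company, column), ("unknown", None))
    return {"company": company, "column": column, "value": value, "source": source}

async def run_column(companies, column):
    # Launch every cell's agent concurrently -- which is why all the cells
    # in the demo resolve at roughly the same time rather than one by one.
    return await asyncio.gather(*(cell_agent(c, column) for c in companies))

rows = asyncio.run(run_column(["OpenAI", "Anthropic"], "funding raised"))
```

Keeping the source next to each value is what enables the click-through verification discussed below in the episode.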

Let's figure out the funding that was raised for each of these companies. So I just put funding raised here. This will be a web search. I'll let them pick the format that comes back here. And it looks like they give a couple other suggestions like employees and things like that, but let's just run this and see what happens. Cool. So you can see each agent working.

We get the feedback in every single cell. Mhmm. Curious if they're gonna come back one at a time or all at the same time. Oh, all at the same time. It was an interesting choice of kind of like the run button being this arrow up.

Speaker 1:

Well, first of all, kind of a very interesting choice, kinda like, oh, send the agents up to the cloud almost. Okay. Send them off. And then maybe as a second point, maybe there's some improvement kind of to make this button a little bit clearer, that you actually want people to click this button, basically, kinda like run, play, go. Yep. Yeah.

A play button would have been good too. Right? What's interesting here is, okay, the data that we get back is 4700000736.

Speaker 0:

It feels like it's missing the unit. The unit for the condensed version, and maybe that was a setting that I should've added. You know, I told it to pick the data type that was coming back here. But I noticed as I click into each of these, it shows me the answer, 6.6. It also shows the sources in line, and you can see OpenAI raises 6,600,000,000. So that tells me like, oh, okay.

We're missing a b. But, this is actually really helpful that you can click into each of these and see where it got the information from. This is another common pattern that we see where if AI is going out and doing a thing, how do you know you can trust the results that it brings back? You know, sometimes it hallucinates, sometimes it gets the wrong thing.

Speaker 1:

And so by having a source closely attached that you can just, you know, click on each of these, right here you can see immediately where the sources came from. Helps us to be able to validate and trust the data that the AI agent is bringing back. Totally.

And you know, when you Googled kind of in the past, you just had a list of websites, a list of basically the references or the links and they were your destination.

But now that you ask kinda like a chatbot and you get the answer back, you kinda wanna have the links and the references kinda like inlined, and I believe it was maybe Perplexity that kind of did that pattern first, where you had like these little round numbered dots right in line with the answer Yep. Kinda showing you which segment or which fragment of the answer comes from where.

And this is a really nice pattern that is sort of like used here, and it could even be used in other contexts or even be inlined here. I guess in a spreadsheet, it works kind of to pull it out into its own popover. Mhmm.

When you are kind of more pressed for space and want to condense it, then sort of like the pattern of having the footnotes almost directly in the answer is a really really successful pattern. It's also interesting, you know, you mentioned before about how, you know, we always had flowcharts and these are like modern flowcharts with the canvases.

Speaker 0:

And it's interesting too that, you know, a lot of the citing sources in the footnotes is not a new thing. It's been around since the beginning of books. Yeah. But now it's actually being used in a new way to actually validate and verify information in real time that an agent brings back which is really cool.

Just like in academic papers of the past, footnotes have your references, like from which paper, which data source, you actually draw a conclusion.
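The inline numbered-footnote pattern discussed here can be sketched as a small helper; the function is hypothetical, and real products attach richer metadata than a bare URL:

```python
def cite(fragments):
    """fragments: list of (text, source_url) pairs. Returns the answer with
    inline numbered markers plus a deduplicated footnote list, so repeated
    sources share one number, the way Perplexity-style citations do."""
    sources = []  # ordered, deduplicated source URLs
    parts = []
    for text, url in fragments:
        if url not in sources:
            sources.append(url)
        parts.append(f"{text} [{sources.index(url) + 1}]")
    footnotes = [f"[{i + 1}] {url}" for i, url in enumerate(sources)]
    return " ".join(parts), footnotes
```

The key design choice is that the marker sits inside the sentence, tying each fragment of the answer to its source rather than listing links at the end.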

Speaker 1:

Now we kinda see this pattern more and more in software. Yep. Well, very cool. Thank you AnswerGrid.

Speaker 0:

This is really well done. Alright. So next up, let's take a look at another kind of AI interface that I think we've seen is pretty common, which is you put in a prompt and you get some kind of output or action that happens Mhmm. Which I think has become pretty common. First up, we've got Polymet. Polymet, an AI product designer at your service. Okay. Very interesting.

So you can design and iterate faster with AI and get the production ready code and ship faster. Alright. So let's see. My first project. Sure. Let's start there. Okay. So we get the prompt box, which is the core element of a lot of these prompt to output interfaces that we see.

It says explain what you wanna build and, similarly to one we looked at before, it gives us, it looks like, a bunch of pre built prompts that are ready to go. Mhmm. And it seems

Speaker 1:

takes multimodal input. So we see little icons for the microphone and even an image. So it looks like we might be able to upload a sketch, for example, of an interface, and then it will turn it into, like, the actual thing. Let's try an intermediate one. This says,.

Speaker 0:

create a dashboard for treasury management software with a floating, glassmorphic, collapsible sidebar. Very specific. And a super dark orange gradient in the background. Alright. That sounds on brand. So let's click our tiny little button here. At least it's orange. At least it's orange.

And okay. So we've got some animations that are trying to keep us entertained and engaged. It says at the bottom, assembling pixels with tweezers. So there's some humor in the messages that are popping up. But you know, are we gonna be waiting ten seconds or are we gonna be waiting ten minutes here? It's a little unclear. Yeah. They kind of appear and disappear pretty quickly.

I almost, for kind of a more technical audience,

Speaker 1:

we'd love to see kind of like a log to actually see kind of what's going on under the hood, and then also kinda step by step just kinda keeping it on the screen, so you kinda see the progress of the machine, in the background. Yeah. It's tough to tell. You never know,.

Speaker 0:

you know, because this is one of the main challenges with this new kind of interface: it takes a long time to generate from a prompt a very complex output, like an editable web page that can then be saved to code, or a big graphic or a video. You know, a lot of times you can go prompt to video.

And so how do you keep the user engaged if it's a short enough window where you can just wait for the output? Or if it's gonna be, you know, three minutes, five minutes, ten minutes longer than that, you know, maybe you need a way to just tell them to come back and that you'll email them or message them or something when it's ready.

And maybe some good prior art to borrow from there is today's kind of.

Speaker 1:

meta search engines for flights. They also take a while and they already show you kind of in a low resolution some early results. Yep. And you can always start to interact with the filters as kinda more results come in. Yep. Okay. Great. Wow.

This is pretty cool. It looks like we've got some.

Speaker 0:

some glass. I've never heard glassmorphic before, but like the glass kind of UI, I

Speaker 1:

would call it that. Yeah. So actually, kinda like, the prompt is entirely free form. There were examples for the prompt Mhmm. But it was basically completely free form. Yep. And I might not be familiar with some of these kind of design terms, kinda glassmorphic or skeuomorphic or flat design or gradient or whatever. Yep.

And so having maybe an interface here that gives me sort of like selection and ideas, almost like pills maybe, for these design terms, that you can, like, drag in like Lego bricks Yeah. Versus just I need to know these terms and just type them or learn them from the examples. So there's kind of an opportunity to build kind of like a richer prompt builder. Yep.

I think part of the power of these open ended prompts is that they can be anything. Mhmm. They can accept any input like glassmorphic.

Speaker 0:

and it knows what to do. Like it figured it out. It's kinda like good news, it can be anything. Bad news, it can be anything. Exactly. And I think the worry as a user when using these interfaces is that you're worried you're gonna tell it something like glassmorphic and it's not gonna know what that is.

And you're gonna wait for two minutes for it to generate it, and then it's not gonna be the thing that you're expecting, and you have to start all over again. So that would also be a really interesting kind of design challenge.

Speaker 1:

For the output generated, what are the things the machine actually respected from your prompt, and which did it sort of like ignore or struggle with?

Kind of giving that feedback back to the human and the prompt, maybe with like little squiggly lines or kind of things, or with color kind of showing, you know, what did it index on from the prompt and execute on, and what did it sort of like maybe ignore or fail.

If there could be sort of like that feedback loop, then it can help the human refine their prompt and kinda learn how to, you know, interact with the machine. And help the AI figure out what it did well Mhmm. And should keep doing, and what it didn't do well and should get more data to improve. I think what's cool here is like, okay, this seems pretty interactive. It looks like it may oh, wow.

There's even hovers here. So how do we now get feedback to this design to improve it? It looks like there's an edit button. So I guess I can click into these.

Speaker 0:

I wonder if I could give it another prompt.

Speaker 1:

It looked like when you clicked on one of the sub elements that you can sort of like prompt on a per module basis. Yeah. Let's see. Okay. So I'm editing.

Speaker 0:

What if I just took this, and what if I said okay. Explain your changes. Okay. So make this. Make the sidebar blue. Mhmm. Run revision. Okay. Okay.

Now we're waiting again. Hopefully this will be faster.

Speaker 1:

Hopefully, it's like an incremental change. Right? Yeah. Where we can sort of only submit the delta and not, sort of like, single shot, do the whole thing over again. Yeah. Not just for us kind of to wait, but also for, you know, resources, to preserve kind of the existing design that we did like and didn't wanna turn blue. Yeah. So let's see how we can sort of like deal with this.

And and for consistency too because you know especially when you're prompting to create graphics,.

Speaker 0:

one of the challenges is if you wanna change one element, like a hat on a person,

Speaker 1:

it's hard to keep the rest of it consistent. Exactly. That's a common challenge now, and so if they're able to do this, then I think that speaks pretty highly. Yeah. So any interface designer or technical team that can figure out the challenge of how to kind of add sub prompts or how to only change kind of iteratively,

Speaker 0:

that's kind of like really the frontier. It looks like it did it. Did the rest change or did it No. I believe the rest is the same here. Awesome. Well, this is very cool, Polymet. Great work. Another common pattern that we're seeing with some of these new AI interfaces is adaptive AI interfaces.

Speaker 1:

And why don't you tell the audience a little bit about what that means? Yeah. I think some of the adaptive interfaces that we see emerge, based on the content of, for example, an email or document, the interface then dynamically changes, which typically wasn't the case in, you know, static software.

And so here, it's kinda like the input is the actual content, and then the output of the AI LLM is then the UI to interact back with that content. And so a really interesting space to emerge for kinda like, you know, looking at the design challenges of that. Yeah. It's it's interesting to think about things like Microsoft Word. Right?

Where like the thing that everybody is so familiar with is a billion buttons on the top row because they're never sure which one you might need because they don't know the context of how you're editing. And with AI now,.

Speaker 0:

we don't need to show all the buttons, we can just show you buttons that are relevant. Yeah. Challenges of course are kind of predictability. People love to have, like, their, you know, billion buttons in like the exact place. And so let's have a look at how this is solved and what the challenges are. Okay.

So first up we've got Zuni and Zuni is building a smarter email app for founders where you can stay on top of your inbox with a little help from Zuni.

Speaker 4:

They've got a demo here, so we're gonna go ahead and watch the demo, and then we'll we'll, comment on that. What if you could respond to emails as fast as you make decisions? With Zuni, now you can. Zuni makes you faster and more focused by letting you decide, not draft, your responses. Here in Zuni, my tray shows me three important emails it knows I need to act on.

Let's start with the one at the top here. Looks like I missed a call and may want to reschedule. On the right hand side, I can see responses that Zuni thinks might apply. I'll hit y to confirm a time, let's say 1PM tomorrow, and hit enter to generate my email. Looks great. Let's send it and move on. Staying in control of my inbox and ahead of my day.

Speaker 0:

Okay. Well, one of the interesting things here, I think, is this interface on the side where it's pulling up the user's email and it's suggesting specific responses to that email based on the content of that email. It's almost changing what the reaction buttons are. Exactly. And it almost seems like, what's the right level of abstraction

Speaker 1:

to reply to an email in the future? So this still makes me kind of process each email, just maybe like a little faster, but I still have to go do it. If you go like higher level of abstraction, could sort of like an email inbox of the future just kinda like autopilot do that for me?

If we go all the way kinda like down the ladder of abstraction, we would talk more about sort of like AI interfaces that are more autocomplete. And this gets kinda right in the middle, sort of like of the levels of abstraction where, you know, we process each email, there is no draft yet, but there are sort of like predefined prompts that are adaptive per email that I can select.

I think a really interesting sort of like tweak on that would be, what if sort of like the best-guess draft for that email would already be sitting in the inbox for me. Mhmm. And then I can, on a higher level than sort of like text editing, prompt it to change it depending on sort of like what I as a human in the loop actually kind of feel

Speaker 0:

the answer should be. Yep. And I think it's interesting that they made these very high level, and then when you click on each one, if they need more information... like, you click dismiss, it'll just go away. No other context is needed. The first one here said confirm a call time, and the UI actually knew that we needed input of what call time we should suggest. Mhmm.

And so it knows the next step based on which response you wanna give, and it's showing you a custom UI for how to respond, not a list of five text boxes of which maybe one is the one you should fill out. Mhmm. And a little detail here is, because we already have our hands on the keyboard, basically in email mode
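The pattern being described here — stable single-letter hotkeys, adaptive labels, and a per-suggestion follow-up input like a time picker — could be sketched roughly like this. All names are hypothetical; this is not Zuni's actual API, just an illustration of the shape of the idea:

```typescript
// Hypothetical sketch of adaptive quick-reply suggestions (not Zuni's real API).
type Suggestion = {
  hotkey: string;               // single letter; stays stable across emails
  label: string;                // adaptive text, regenerated per email
  needsInput?: "time" | "text"; // extra UI to show before sending (e.g. a time picker)
};

// The suggestions themselves change per email, but the hotkeys stay fixed,
// so muscle memory keeps working even as the labels adapt.
function suggestionsFor(subject: string): Suggestion[] {
  if (/missed call|reschedule/i.test(subject)) {
    return [
      { hotkey: "y", label: "Confirm a call time", needsInput: "time" },
      { hotkey: "d", label: "Dismiss" },
    ];
  }
  return [{ hotkey: "d", label: "Dismiss" }];
}
```

The key design point is that the `needsInput` field lets the app render a custom follow-up widget for exactly the one piece of information each response needs, rather than a generic form.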

Speaker 1:

and processing the inbox and typing anyway, being able to access all these adaptive options with just a single-letter keyboard shortcut

Speaker 0:

is really on point. Yeah. You had a great point earlier around, if buttons are moving... people value consistency, remembering, oh, that button is there and I go here to click it. And that's actually interesting here, where the buttons and the responses are technically changing for every single email

Speaker 1:

but the keys that you're pressing do not. And so you can keep your hand right there and know what to expect each time. Now, one interaction design challenge with these hot keys: they don't have a modifier key like a keyboard shortcut does. Command C for copy, for example, has a modifier key. But these are just hot keys where I just press y and it does it. Right?

But what if I think that my cursor is focused for inserting text and I wanna reply yes? Then basically my first y keystroke submits a button. Right? Mhmm. And so there's always this challenge of being very clear about when an input element is focused and you're typing, versus when typing on the keyboard will just do stuff in your UI. Yep. Adaptive UIs of the future here. There we go.
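One common way to resolve that focus ambiguity is to check whether the focused element accepts text before letting a bare letter act as a hotkey. A minimal sketch (the helper name is ours, not from any specific library; `tagName` and `isContentEditable` mirror the standard DOM properties of `document.activeElement`):

```typescript
// Sketch: suppress single-letter hotkeys while the user is typing.
// The helper itself is DOM-free so it's easy to test in isolation.
function shouldHandleHotkey(tagName: string, isContentEditable: boolean): boolean {
  const typingTags = ["INPUT", "TEXTAREA", "SELECT"];
  return !isContentEditable && !typingTags.includes(tagName.toUpperCase());
}

// Browser wiring would look roughly like:
// document.addEventListener("keydown", (e) => {
//   const el = document.activeElement as HTMLElement;
//   if (e.key === "y" && shouldHandleHotkey(el.tagName, el.isContentEditable)) {
//     confirmSuggestedReply(); // hypothetical app action
//   }
// });
```

This is the same guard apps like Gmail use for their single-key shortcuts: the hotkey layer stays silent whenever an input, textarea, select, or contenteditable region has focus.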

Awesome.

Speaker 0:

Thank you, Zuni. Very cool. We've seen some voice examples. Let's take a look at a video example now. This is Arjul, arjul dot a I, and they have a product that's almost like an AI video studio to create production-quality videos. This actually has a deepfake AI version of me. Incredible.

And so what we can do here is type our own custom script right into this box and make it say whatever we want, create different blocks where we can change camera angles for each of these and other settings, and we can change the body language too. So for this one, where I say "here I am, pointing to myself," I have selected the point-to-myself example here.

And you had to select that manually? Yeah, I selected it manually. You could imagine in the future they would auto-detect it. Right. That would be really interesting. It's cool here that as you hover over these, you can see the different

Speaker 1:

samples of what will happen. Yeah. You can almost imagine how you could highlight certain parts of the script and then, from a drop-down there, choose suggested gestures, but also standard ones Yep. from this library. Kinda like, you know, just try "point to myself," or, what is it, "I'm crushing it," or "I'm crushed."

And so there's a lot of interplay with the text interface to the left. Yep.

Speaker 5:

Alright. So let's hit play and watch it. Hi. I'm Aaron, and today we're going to make a new design review video all about AI interfaces.

Speaker 0:

Here I am, pointing to myself for this point. Cool. So the voice was very good. That sounded like me, and I just typed a bunch of text to get that. And what was the training data to create your deepfake? Yeah. So they just need a few minutes of video of me, or whoever, talking

Speaker 1:

and then they can basically process it automatically in their models to create the deepfake. And you were saying something completely different than this? Totally different. That's incredible. Yep. And so if this is all AI generated, let's try to change the name here. Yeah. Let's say I'm Rafael.

Speaker 0:

And today so it's processing.

Speaker 5:

Hi. I'm Rafael, and today we're going to make a new design review video all about AI interfaces.

Speaker 0:

Here I am, pointing to myself for this point. Okay. And so the other thing that you notice here is the video is very blurry. Why is that? It's got this thing over it because the easier, or faster, part of generating this is actually creating the voice.

And the hard part is that it takes many minutes to actually process and generate the video with the right lip movement to match the text that you've entered. And so rather than showing you lips moving that are off from what you've put in, they first show you just a blurry version with the audio, so you can get a sense Yeah. of what it's gonna be like.

Then you click generate here, and the twelve minutes it says right there is how long it's gonna take. So they're trading off
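That fidelity-for-immediacy tradeoff can be sketched as two separate paths, a fast audio preview and a queued video render. Everything here is hypothetical and illustrative, not the product's actual API:

```typescript
// Hypothetical sketch of the preview-vs-render split (not the real API).
type Preview = { audioReady: true; video: "blurred-placeholder" };
type RenderJob = { id: string; etaMinutes: number };

// Voice synthesis is fast, so the script can be auditioned immediately
// behind a blurred stand-in instead of mismatched lip movement.
function previewScript(_script: string): Preview {
  return { audioReady: true, video: "blurred-placeholder" };
}

// Lip-synced video is slow, so "generate" returns a job handle with an ETA
// rather than blocking the editing loop for twelve minutes.
function queueFullRender(script: string): RenderJob {
  return { id: `render-${script.length}`, etaMinutes: 12 };
}
```

Splitting the cheap and expensive paths is what keeps the human in the loop: you iterate against the instant preview and only pay the long render once the script is right.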

Speaker 1:

basically fidelity Mhmm. for immediacy, and putting the human back in the loop. Because if it was just the generate button, right, we would wait for twelve minutes, figure out that something is not quite right, and then give the machine a new prompt and wait until it comes back.

So this is a really clever trick to create this iterative human-machine collaboration interface. And I think blurring the video is a great design

Speaker 5:

approach to do that. Here's a version that I've generated. Hello. My name is Aaron, and today I'll be sharing incredible news straight from the YC partners. We're moving YC to four batches per year. This is one of the biggest changes YC has made since it launched in 2005.

Speaker 0:

Here's the thinking behind it. Previously, if you— So there you go. You can see it looks like me. It has the different expressions, the camera cuts, the lips match the voice that was added. Pretty incredible. Yeah. I think trading off fidelity

Speaker 1:

for latency, really putting the human in the loop to iterate on the script very quickly, while the full-on generation of the video can happen later as one thing. The other thing is the opportunity to potentially bring some of the selection UI closer to the input itself, the text. I think there's a lot of room there to grow. Yeah. Very cool. Awesome. Thank you, Arjul.

Awesome. Well, these were all incredible examples. What are some of our takeaways from the interfaces we've seen today? Yeah. Well, I think when we first started to get this LLM technology, everything was sort of a chat box, and people were just prompting it. Yeah.

And now, within just a few short months, or one or two years, we see this explosion of AI interfaces and AI components that are really built AI natively. Totally different modalities for how to interact with this new technology, with the LLMs, and really just endless opportunity for iteration and

Speaker 0:

building a new world of software. Yeah. I think you said it really well at the beginning here, which is, these are all verbs. We're creating videos, we have agents going out executing tasks, and so much of it is how do you keep the user in the loop and in control while AI does its magic. And we've seen some pretty amazing interfaces

Speaker 1:

to get that level of control and make sure it's doing the right thing Yeah. that leads to incredible output that would have taken days or years, it almost feels impossible. It almost feels like back in 2010 or so, when touch devices really came on the market and everything had to be reinvented touch-first.

We're at one of those moments again where all of software, all the components that we took for granted, are really being reimagined and reshaped by the builders and startups and designers out there right now. The future's gonna be incredible. Yeah. Yeah.

Speaker 0:

Well, thank you to everybody who submitted. These are pretty groundbreaking UIs that I think are gonna continue to evolve over the next decade, and I'm really excited to see where they end up. So thank you, Rafael, for joining me for this episode, and we'll see you on another design review. Thank you so much.
