Transcript
[00:00:00] James Landay: The LLM allowed us to tell what they had written. And sometimes you get a kid who just writes almost nothing. And we were able to use the LLM to encourage those kids who didn't write much to actually write more. And that was one of the big outcomes: you were able to actually have kids write more. And the kids thought the writing was one of their favorite parts of this experience.
[00:00:30] Russ Altman: This is Stanford Engineering's The Future of Everything and I'm your host, Russ Altman. If you're enjoying the show or if it's helped you in any way, please consider rating and reviewing it. We like to get fives, but give us what we deserve. Your input is extremely important for helping us grow the show and spreading the word.
[00:00:46] Today, James Landay from Stanford University will tell us that AI is not just good at creating text and answering our questions, it can motivate us as a coach, and it can teach us as a tutor. It's the future of AI coaching.
[00:01:01] Before we get started, another reminder to rate and review the show, give us a five, it'll help spread the news.
[00:01:14] Large language models like ChatGPT and many others have started to sprout up all over, and people are using them for productivity. We're writing letters, we're editing our text, we're answering questions. There's a lot of stuff people are doing, but what we don't always think about is: can we use this AI as a coach or a tutor to help us get from where we are to where we want to be?
[00:01:36] Well, James Landay is a professor of computer science at Stanford University, and he's co-director of the Stanford Institute for Human-Centered AI. He will tell us that he's built a prototype health and fitness coach that uses AI. He's also created a tutoring system for elementary school students, where they learn about the planets, about global warming, about fires, all through a mixture of large language models, getting outside, and interacting.
[00:02:04] James, you were on the show in 2019, May of 2019. It was great. But now it seems like you're working a lot on using AI for teaching, for coaching. How has your work evolved in the last few years? And what are you really excited about?
[00:02:18] James Landay: Well, a lot of stuff has happened in the world since 2019, Russ, as you may recall. But most of my projects I see as more of ten- or twenty-year efforts. They're big problems that, if you solve them, can have a big impact on the world. So problems like how do we improve the education system, or how do we improve people's health, to me are not things we solve in three years. They're things we work on for a long time, and along the way we come up with new ideas that might improve it. So, for example, in health, when I spoke to you before, we talked about work on how you can give people better awareness of what they're doing towards their fitness goals by having what we call an ambient display.
[00:02:58] So, for example, a story or images on the lock screen of their phone. What we've been working on since then is, how can we give people more of the sense of having a personal coach, even if they can't afford one? So, you know, just like personal tutoring, personal coaching is highly effective for individuals who are trying to change their health or fitness. But many of us can't afford a personal coach because it's quite expensive, and there aren't enough coaches out there to handle all of us. So, combined with what's going on in AI, LLMs, for example, we're able to use that,
[00:03:31] Russ Altman: Large language models.
[00:03:33] James Landay: Yeah, large language models, or as we like to say here at Stanford, foundation models, because we're going to build other applications on top of those foundations. We can use that type of model to get at people's real needs for coaching.
[00:03:47] So you can think of all these fitness apps out there, whether it's Apple's Fitness app or Google Fit on Android. They tend to be very quantitative in form. So maybe you have an app where you can put some goals in quantitatively. But it's the qualitative issues that often trip people up. You know, the app's telling me, hey, go running every morning, but hey, I need to drop my kid off at preschool and I can't fit that into my schedule or,
[00:04:18] Russ Altman: You know, their user interfaces, I'm sorry to interrupt, the user interfaces are terrible. I have a Garmin watch and sometimes I do a five mile run. And at the end it says, unproductive workout. That's all it says. And I just want to Frisbee it into a brick wall. So you've already gotten me, but please continue.
[00:04:37] James Landay: What's nice about these large language models is they're really good at processing text, right? And a lot of these qualitative issues, what's worked for us in past years when we've done our fitness activities, or where we might have barriers, are things these models are actually good at processing if we have a conversation with them, like you might have with a coach. And then they can help you develop a plan for your fitness that accounts for those issues.
[00:05:12] So we've built an application called GPT Coach where we used a large language model. And you know, it's much easier said than done to actually get these things to do what you want. But essentially we can create a bunch of what we call agents that use different pieces of the model. And this way we can check what kind of things we're telling you, and we can keep the model on track, because we use a technique called motivational interviewing, which was popularized for coaching by researchers here in the School of Medicine at Stanford.
[00:05:44] And we can have this coach pretty much use that style of interviewing to find out what's worked for you in the past, what your barriers are, and then together come up with a fitness plan. And we tested this in the lab with people and they were blown away by how good it was. And in fact, when I first tested it, I said this could be a product right now. That's how well I thought it worked.
[00:06:06] Russ Altman: So, so many exciting things there. So first of all, I take it this is not out-of-the-box ChatGPT.
[00:06:12] James Landay: Um, not out of the box. We have to essentially break it up into a bunch of different agents that have different tasks they're trying to do in terms of this interviewing, and also keep it on track and check where we are in the process and what kind of information we've gotten. And we also get information out of your devices. You know, you have a Garmin watch, a lot of people have an Apple Watch or, uh, just an iPhone. We're able to actually take three months of their prior data and use that to also drive the conversation, understanding what they've been doing.
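To make that architecture concrete, here is a minimal sketch, with assumed names and a placeholder in place of any real chat-completion API, of the pattern Landay describes: specialized agents that share one model, a state tracker that keeps the motivational-interviewing session on course, and a summary of prior wearable data seeding the conversation. It is an illustration, not the GPT Coach implementation.

```python
from dataclasses import dataclass, field

def call_llm(system_prompt: str, user_text: str) -> str:
    """Placeholder for a real chat-completion call (any provider)."""
    return f"[LLM reply to: {user_text[:40]}]"

# Hypothetical stages of a motivational-interviewing session.
MI_STAGES = ["engage", "explore_barriers", "evoke_motivation", "plan"]

@dataclass
class SessionState:
    stage: int = 0
    notes: dict = field(default_factory=dict)  # goals, barriers, past successes

def summarize_wearable_data(daily_steps: list[int]) -> str:
    """Condense ~3 months of step counts into text the coach can use."""
    avg = sum(daily_steps) / len(daily_steps)
    return f"Average of {avg:.0f} steps/day over the last {len(daily_steps)} days."

def coach_turn(state: SessionState, user_text: str, data_summary: str) -> str:
    stage = MI_STAGES[state.stage]
    # "Coach" agent: generates the reply in motivational-interviewing style.
    reply = call_llm(
        f"You are a fitness coach using motivational interviewing. "
        f"Current stage: {stage}. Wearable data: {data_summary}. "
        f"Ask open questions; do not prescribe a plan before the plan stage.",
        user_text,
    )
    # "Checker" agent: decides whether this stage's goal has been met,
    # which is what keeps the model on track across the session.
    verdict = call_llm(
        f"Answer YES or NO: has the user said enough for stage '{stage}'?",
        user_text,
    )
    if verdict.strip().upper().startswith("YES") and state.stage < len(MI_STAGES) - 1:
        state.stage += 1
    return reply
```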
[00:06:44] Russ Altman: So another really intriguing thing you made a quick reference to is that you've been working, it sounds like, with School of Medicine experts in fitness and health. Tell me about that, because now you're trying to get this large language model to do things at a very kind of virtuoso level. So what is the role of the health professionals, and how do you get their knowledge into this model?
[00:07:08] James Landay: Yeah. I mean, most of my projects tend to be interdisciplinary, working with folks outside of computer science, or sometimes even in computer science but in a different subfield. Um, so, you know, as you mentioned, the last time I was on the show we talked about buildings and smart buildings, and I work with people in civil engineering, education, and sustainability on that, as well as health.
[00:07:29] Um, one of the other projects I've had going for a while, that I talked about last time, is these ambient displays for health and fitness. Now, as part of that, we collaborate quite closely with people in public health in the School of Medicine, and, you know, that team has a lot of experience training coaches and knowing what the best methods for coaching are.
[00:07:49] So, in this case, they actually created a manual. And so we were able to take that manual, understand how its information is structured, and actually feed that into the large language model as well, to help our system work in that style. And we also consult with them on this. They haven't been as close a collaborator on this project, but we've consulted back and forth, and they may become closer as we get to the next stage. Because all we've done so far is replicate, you know, that first thirty- or sixty-minute meeting you might have with a coach, where they're meeting you for the first time and trying to understand what your goals are, what your barriers are, what's worked, and then come up with a plan.
[00:08:32] But the next step of the project is, okay, now what does it mean for me to have a coach on my phone over the next six months as I'm, you know, partaking in my exercise program? And that interface is a harder design, because this first one really is a conversational interface. It's like we're having a meeting. But now I'm going to have this application running on my phone, and we don't think language is the interface you always want to use. You want to have graphical elements that you would see, and there we learn from some of that prior research I've done with an ambient display that shows some kind of visual story as you go. But you might still want to use language. For example, it might remind you, hey, Russ, you committed that you were going to go running this morning, and you might want to say, oh, but I have a big meeting with James Landay and he's really important. I can't miss it.
[00:09:23] Russ Altman: It's really, really important.
[00:09:25] James Landay: Right. Or maybe you're sick. And so you might use that, and then the coach might be able to adjust your plan. So changing your plan, or having exceptions, or being able to, you know, highlight when you're not feeling up to it, and maybe it's going to work on some other way to motivate you, or give you an easier goal that might help you get there.
[00:09:47] So we'll have a kind of combination interface now, with a traditional graphical interface and an ambient display, but also the part where you might still converse when you want to. So that's what we're designing right now. And the plan then is to do a short-term study of that, like three weeks, just to get the bugs out. But the true goal in all this kind of work is, can we run a study over three, four, six months and show that people are doing better at hitting their goals or changing their behavior than they would otherwise, compared to a control group?
[00:10:21] Russ Altman: Yeah. So this sounds very exciting. And by the way, sign me up for that. Um, but here's a question. When you have a coach, I know some people who've had coaches, and I know there's a process of interviewing them, because you're always trying to find the coach whose perspective on life and on health and wellness kind of matches yours. You know, some people like the drill sergeant. We've all seen these ridiculous videos on YouTube where they're like, go, you can do it, work harder, work harder, work until you drop. And then there's other people who are much more like, let's do whatever you want to do. Do you imagine that you're going to have to, or do you think the LLMs already can, modulate their tone based on the preferences of your users?
[00:11:03] James Landay: Yeah, that's a really great question. So in fact, as we're designing the visual version of this, we see a kind of avatar-like character that represents the coach. And so we are doing an online study to test different designs for those coaches, and also the personalities. And that's what we want to discover, whether different people have different preferences for types. And we're trying to design it in a way that we could probably swap in different, you know, personalities and such. Now, my caution to my grad students is we probably don't want to do that in the first version of this, because then that just makes the study results harder to interpret.
[00:11:46] Russ Altman: Right. Good point.
[00:11:47] James Landay: You know, was it because you had the different coach? But I think for a product, you're probably gonna want both different visual themes, like, you know, we have a space theme and a beach theme, you know, different people, you know, maybe you want a surfer theme.
[00:12:01] Russ Altman: Dude, we have to exercise, dude.
[00:12:04] James Landay: Right. So there's different themes in the visuals, but there might also be different themes in personality. Some people want a coach that's really tough and pushes them. Like I had a coach once who could literally make me cry, 'cause he could push me beyond where I could push myself. And some people are okay with that, but other people would quit if they had that. So understanding that is something that we're looking at, but we probably won't push it into the first version because, again, it just adds too many confounds for the study.
[00:12:31] But I think for a longer-term thing, yes, you'll probably have different personalities. And yes, in terms of the LLM, you can do that. But we also see the need for it in the visuals, as well as in the personality of the avatar. So we're looking at that in all aspects. We probably just won't see it in the first version of this.
[00:12:50] Russ Altman: And I'm very aware that I'm giving you all of these features and that you need to, you know, walk before you can run. But that won't stop me. My next question is, what about group activity? A lot of people get a lot of their sustenance and support, especially in health, from being part of a running group, or a rowing or workout group. Maybe not today, but is there in your vision some sort of social support from other humans?
[00:13:16] James Landay: It's not in the initial version. I've worked in this area for a long time, and I have to tell you, people think, oh yeah, social, you've got to add social. But we did some of the early studies on this, and we found that social can actually backfire and make it worse. I remember this from when I was in Seattle at the University of Washington, running a research lab for Intel. We had done one of our first studies like this, and the people were in a group, and literally one woman drove by another woman in the group who was walking up a hill in Seattle, and she said, why are you walking up the hill? You can get credit just for walking on the flat.
[00:13:50] So, you know, the group was even discouraging her from doing extra. So it is something you have to be very careful about how you design, um, because it could also cause the opposite result. So we're not looking at groups right now, but again, if it was a real product, it is something you might think about, where you would integrate that.
[00:14:08] Russ Altman: My next question is a little bit about a definition. Even in this conversation, and definitely in your writings, you've talked about this idea of ambient awareness. Could you define that for me? And let me know why it's an important thing, because that's not the kind of thing I usually attribute to a computer or even to an LLM. Usually my LLM is in a little white box on my computer.
[00:14:29] James Landay: Yeah.
[00:14:29] Russ Altman: It has no idea if I'm in a rainstorm or at a spa. So what is ambient awareness to you?
[00:14:34] James Landay: So, when I think about what ambient awareness or an ambient display is, it really comes out of this idea that a lot of times we're attending to something else in the real world. We're not staring at our phone doing something on it, right? And how do we take advantage of those glances at our phone to actually communicate information to us in the background, even if it's not the primary task?
[00:15:03] So with a lot of these fitness apps and things like that, for you to know what's going on, you need to turn on the app and look at it and go, oh yeah, I've walked this much today, or I've run this much. But only if you go check it are you going to be aware of what's going on. And so our research is based on this idea that people who are really good at sticking to their goals are just much more aware of what they've done. They're tracking it more. Either they're actively checking or they're just aware: oh, I know I parked my car over there, I've walked this far. But those of us who are less aware have a harder time, you know, understanding, are we doing well today? Do we need to do more? And so the idea of ambient awareness is, can we have a display that you might just glance at that gives you a sense of how you're doing?
[00:15:52] So we take advantage of the lock screen of the phone, or the wallpaper when you unlock, as a way of just seeing some kind of display that gives you a sense. The one example I think you see out there today is on an Apple Watch, you might see those rings if you use that display. I think, you know, they probably got that from our research, because we've been working on this for years. But it's a little too subtle. Most people are not even aware of what those rings mean, or whether that's good or bad. It's kind of small on most people's watches. And so for us, it would have to be something you really see, where with just a glance you have a sense of, hey, I'm doing pretty well today, or I'm doing well this week. Or no, I should bring my gym bag because I really need to do more. And so we want to take advantage of those glances, even if you're not running the app explicitly, to give you a sense of how well you're doing.
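As a toy illustration, with all the specifics assumed rather than taken from Landay's systems, the core of such a glanceable display is a mapping from progress to something readable at a glance, instead of a number you have to interpret:

```python
# Toy sketch: map today's step progress to a one-glance status line that
# could be rendered on a lock screen. The thresholds are assumptions.

def glanceable_status(steps_today: int, daily_goal: int) -> str:
    fraction = steps_today / daily_goal
    if fraction >= 1.0:
        return "Goal met. Nice work today."
    if fraction >= 0.6:
        return "Almost there. A short walk would do it."
    return "Pack the gym bag. You need to move more today."

print(glanceable_status(5200, 8000))  # prints "Almost there. ..."
```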
[00:16:40] Russ Altman: No, I love that idea, because in addition to my phone and my watch, literally right now I have two monitors in front of me. There's a lot of real estate not being used.
[00:16:48] James Landay: Right.
[00:16:48] Russ Altman: A little corner telling me, you know, you're not acting healthy or whatever, could be a really important thing.
[00:16:54] This is The Future of Everything with Russ Altman. More with James Landay next.
[00:17:10] Welcome back to The Future of Everything I'm Russ Altman and I'm speaking with James Landay from Stanford University.
[00:17:15] In the last segment, we talked about James's work building health and fitness coaches, and they seem to work and look very promising. But he's now taken some of the same ideas and is using them with elementary school children to help them have a richer learning experience. It involves getting them outside, interacting with the world, and using large language models, which, he will tell us, gets them to write more. Maybe large language models are not the end of writing as we know it.
[00:17:44] But I know, James, that you're also looking at education, and specifically elementary school education, which is arguably the most critical, in the same way that pediatric medicine is, that's the future. So tell me, what's happening in AI for elementary students?
[00:17:59] James Landay: Yeah, so again, one of my long-term projects is this project called the Smart Primer. And the whole idea there is, can we use narrative stories with activities embedded in the narrative as a way to get kids engaged in their education? And the high-level motivation of this is that, you know, many of us do well in the school system. It's kind of a factory school system. And anyone who's here at Stanford probably did well in that system. But there's probably a lot of talent out there that just never fit into that and didn't get really motivated and excited by school. And then, you know, in some ways they don't meet their potential in society and probably end up in careers that are less satisfying and less economically productive.
[00:18:45] And so one of my goals was, is there a way to motivate kids outside of traditional school to learn, and maybe that will carry over into their other educational outcomes over their life. So, the Smart Primer is a series of projects trying to explore the use of narrative in a personalized tutor. So we've built a variety of these over time, but now that the AI part of this is starting to work better by having these foundation models and LLMs, we can do more.
[00:19:17] So a couple of summers ago, we built this application we call Moon Story. It runs on a smartphone. And as part of this, kids learn about the environment. They learn about the planets and the Sun, the scale differences between the planets and the Sun, and the scale of the distances between them. So in fact, to use this, they do mobile AR on the phone, and we had these kids doing,
[00:19:50] Russ Altman: You said something, mobile AR.
[00:19:51] James Landay: Yeah.
[00:19:52] Russ Altman: Define what that is.
[00:19:53] James Landay: So, mobile augmented reality. So not having to wear goggles, but instead, on your phone, we can see through the camera and see objects in the physical world, but we can overlay data on top of them. So for example, we had kids come over here to Stanford. We have on our Science and Engineering Quad these huge, like, I call them big marbles. It's a big artistic installation. There's, you know, something like ten or twelve of these huge marbles.
[00:20:20] Russ Altman: I love those things. I love them. They look like planets.
[00:20:23] James Landay: They really do look like planets. So I had this idea: oh, could we map those to the planets, and the distances between a subset of them to the scaled-down distances between the planets? And what do you know, we were able to put the Sun at one end of that quad and lay out the inner planets pretty accurately at the right distances between them. And so the kids see the Sun, and then as they go to these planets, they actually see the scale of the real planets relative to the Sun, and the distances they walk, they learn, are kind of the relative distances.
[00:20:58] And then they get through all the inner planets, and then we have them go from Mars to Jupiter. So Jupiter is the first outer planet. They have to walk all the way from the Science and Engineering Quad to Stanford's Memorial Church, which is maybe a third or half mile away. And that shows how far it is from those inner planets: you were just going, you know, twenty meters, and now you're going something like four hundred or five hundred meters to get to that one.
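The arithmetic behind that layout is easy to check. Here is a back-of-the-envelope sketch, using my own assumed scale factor rather than the Moon Story app's actual numbers, showing how one scale keeps the inner planets within a quad-sized walk while pushing Jupiter several hundred meters out:

```python
# Mean orbital distances from the Sun, in astronomical units (AU).
AU_FROM_SUN = {
    "Mercury": 0.39, "Venus": 0.72, "Earth": 1.00,
    "Mars": 1.52, "Jupiter": 5.20, "Saturn": 9.54,
}

SCALE_M_PER_AU = 100  # assumed: ~100 m/AU keeps Mars within a quad-length walk

for planet, au in AU_FROM_SUN.items():
    print(f"{planet:8s} {au * SCALE_M_PER_AU:6.0f} m from the Sun")

# At this scale the inner-planet hops are tens of meters, Jupiter lands
# roughly 520 m out (the walk toward Memorial Church), and Saturn sits
# near 950 m, off past the far end of campus.
```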
[00:21:24] And if you want to go to Saturn, by the way, it's off past the other end of the campus. So, you know, that whole lesson about the planets was embedded in a story based on an ancient Chinese tale about the Moon goddess and her husband, an archer who had to shoot down these seven orbs that were making the earth too hot. And so we have this story where they learn about global warming as well as the planets, and it's all embedded in the story. And then, as part of this, the key thing the LLM let us do: they had to write about what they learned and what they were thinking about changing in their everyday lives around sustainability.
[00:22:06] The LLM allowed us to tell what they had written. And sometimes you get a kid who just writes almost nothing. And we were able to use the LLM to encourage those kids who didn't write much to actually write more. And that was one of the big outcomes: you were able to actually have kids write more. And the kids thought the writing was one of their favorite parts of this experience. 'Cause they got feedback from the Moon goddess on what they wrote.
[00:22:33] Russ Altman: Woah. So you had the characters from the story,
[00:22:36] James Landay: Yeah.
[00:22:36] Russ Altman: Embodied within the LLM?
[00:22:38] James Landay: Right. And it was all personalized to what you wrote. And that was the only place we used the LLM in this. And we got learning gains, which we tested by doing a pretest, a post-test, and a test a few weeks later. Um, but the big surprise to me, you know, there are a lot of other results, but the big one was, hey, we got kids to write more, and that's really hard.
[00:22:59] Russ Altman: Especially since people are saying LLMs are going to be the downfall of writing by humans, and you have a counterexample there. Just to understand this a little bit more, is it that they're interacting with the LLM and it's prompting them, like, hey, what did you do today? What did you think about that planet thing? Like, how does the LLM get them to write?
[00:23:17] James Landay: So in this case, the LLM asked them some specific questions about what things they might change in their lives with respect to sustainability, and about what they had learned. And then if they didn't really write much, it kind of encouraged them to write more, asked follow-up questions. And even if they had written something, it could respond relative to what they had written and also encourage a further response. So it's like you really had a person who read what you wrote and gave you feedback that was really relevant to it, rather than the canned response a computer program might have given in the past.
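As a rough sketch of that loop, under my own assumptions about thresholds and prompt wording rather than the Moon Story implementation, the in-character feedback might look something like this, with `call_llm` standing in for any chat-completion API:

```python
def call_llm(system_prompt: str, user_text: str) -> str:
    """Placeholder for a real chat-completion call (any provider)."""
    return "[personalized, in-character reply]"

MIN_WORDS = 25  # assumed cutoff for "wrote almost nothing"

def moon_goddess_feedback(reflection: str) -> str:
    persona = (
        "You are the Moon goddess from the story the child just finished. "
        "Respond warmly, referring to specific things the child wrote about "
        "sustainability and what they learned."
    )
    if len(reflection.split()) < MIN_WORDS:
        # Short answers get a nudge: praise what's there, then draw out more.
        persona += (
            " The child wrote very little. Praise what is there, then ask one "
            "concrete follow-up question to encourage more writing."
        )
    return call_llm(persona, reflection)
```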
[00:23:52] Russ Altman: The other thing that you didn't stress, but that I have to note, is that this was presumably outside. The kids were moving. The kids were not in a classroom. And this really delivers on your introduction, that the classroom environment is not where some kids excel. And you could imagine that by putting them outside, putting them in space, having them move, this engaged a whole different set of skills and interests. It just seems like you delivered on that promise.
[00:24:20] James Landay: Yeah. So one big idea of this, and this is also where the original idea came from, really comes from the science fiction author Neal Stephenson's 1995 novel, The Diamond Age. I give Neal Stephenson full credit, 'cause this is where the idea came from. And I've been thinking about it since 1995, though not really seriously till 2010, when the iPad came out and I thought, oh, that's the device he was describing.
[00:24:45] But the idea was, kids today are sitting inside on a screen. They're not outside playing, you know, stickball for you New Yorkers, baseball for us, skateboarding, doing things outside like we did when we were growing up, and parents lament that their kids are just inside. So as part of this, I didn't want to create yet another thing that was going to force you to be inside.
[00:25:06] Obviously, there are parts of it that you might do inside, like reading a book. But there are other parts where we wanted you to go out and do an activity in the real world, whether it's in your backyard, on your block, or maybe with your parents down on a trail. So we've actually done a previous one where you look at eucalyptus trees, and you take a picture of one and you smell the leaf, and you learn about the fires involving eucalyptus in the Oakland Hills.
[00:25:31] And you learn about kind of the controversy: you know, they're not native, should we get rid of them or not? And you have to kind of debate it. So that was part of that story. But since then, we've doubled down on what we can do with the LLM. So this summer we built a new system. Now this system, which we call ACORN, again has an environmental theme.
[00:25:53] Kids learn about trees again, trees that are local here to the Bay Area, because that's what we're using. But, you know, this kind of tree you could find in other places. So they learn about the California oak. They learn about oak trees, and they learn about the ecosystem of the different animals that use the acorns as well as live in the trees. And there are these other characters, these animals, that are going to teach them these things.
[00:26:15] Now, what's different about this one is that in the previous one, we had to design the whole story and write the whole story. And we worked with authors to help us. And it's all kind of written down in the code. In this one, we simply had to define the characters, define the outline of the story of what would happen, and then set some constraints on certain things you would have had to accomplish or learn before you could move to a certain part of it. Then from that, the LLM generated the whole story on the fly for each kid. And the kid could take a different path through the story, depending on what they wanted to learn.
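A minimal sketch of that pattern, with character names, scenes, and gating logic that are my own illustrative assumptions rather than ACORN's actual content, might look like this:

```python
from dataclasses import dataclass

def call_llm(system_prompt: str, user_text: str) -> str:
    """Placeholder for a real chat-completion call (any provider)."""
    return "[generated scene]"

# Assumed character definitions; ACORN's real cast may differ.
CHARACTERS = "A scrub jay and a gray squirrel who teach about California oaks."

@dataclass
class Scene:
    outline: str        # what must happen in this beat of the story
    learning_goal: str  # what the kid must show before moving on

SCENES = [
    Scene("Meet the oak tree and its acorns.", "acorns are the oak's seeds"),
    Scene("Follow the animals that cache acorns.", "animals disperse acorns"),
]

def run_scene(scene: Scene, kid_input: str) -> tuple[str, bool]:
    # The LLM improvises the scene from the outline and the kid's input,
    # so each kid gets a personalized path through the same story skeleton.
    text = call_llm(
        f"Characters: {CHARACTERS}\nScene outline: {scene.outline}\n"
        f"Continue the story, weaving in the child's choice.",
        kid_input,
    )
    # A second, checker-style call gates progression on the learning goal.
    passed = call_llm(
        f"Answer YES or NO: does the child's input show they understand "
        f"'{scene.learning_goal}'?",
        kid_input,
    ).strip().upper().startswith("YES")
    return text, passed
```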
[00:26:53] So there's two big things. And again, we have outdoor mobile augmented reality as part of this. But what was really interesting in this one is, one, writing those stories and getting them done well is hard. Two, this allowed that to be done automatically and personalized to how the kid did it. And what we found is we had learning gains bigger than in any study I've ever done on education, with large effect sizes, which are hard to get.
[00:27:21] Russ Altman: So this is working.
[00:27:24] James Landay: So it was working. Again, these are small: hey, you did this thing for an hour. It's not like a whole curriculum. But there's real potential in this last one. Because we wrote it in that way, we think we can now build a tool on top of that toolkit that would allow educators or curriculum specialists to essentially come up with the curriculum and only have to outline what they want, the learning goals, and how to tie them together, and then the system could generate it. So we really have a chance of scaling this up in a way that we didn't before. So that's the next step, to build that kind of tool on top of it.
[00:27:58] Russ Altman: Incredibly exciting. So listen, in the last minute or two, I want to ask you about something a little bit different, which is that you're a leader of an institute at Stanford devoted to human-centered AI. And I wanted to ask you, it sounds good to me, but what is human-centered AI and why is it different? Why is our current AI not human centered?
[00:28:19] James Landay: Yeah, I would say current AI in general is not human centered unless you try to make it that way. And we started the institute a little over five years ago now with this idea of human-centered AI. But after a couple of years of that, I got a little dissatisfied and felt it was just kind of an empty promise. We were just saying human centered without trying to define what it means to make something human centered. And so what's interesting about AI systems is they have more of a chance of having what I would call side effects on other parts of your community or society.
[00:28:51] So this can happen in traditional software, but in AI, it's much more common. For example, what is the impact of your system on the people who label your training data in Africa? And if you don't pay them well, or if you cut them off, there's an impact there. Or what's the impact of a medical system where you're not the user of it, but your doctor uses it and decides that you're not going to get some lifesaving care? You're impacted even though you're not a user. So much of how we think about designing software systems is about what we call user-centered design: let's involve the users and make sure their needs are met. But with these AI systems, more and more, the user is not the same person as those who are impacted.
[00:29:35] So what I've advocated is that human-centered AI means we still need to do user-centered design, but we need to go beyond that and also do community-centered design, to account for the community that surrounds the system and is impacted by it. Let's say somebody is affected by a criminal judgment of whether they should get prison or, you know, home release, or how much bail they should have. That affects more than just the judge who's using it.
[00:30:02] And then finally, if an AI system becomes ubiquitous, think about, you know, the ubiquity of our social media applications and what kind of information we see there, and you can start to have societal-level effects. So human-centered AI means we need to design at the user level, but also the community level and the society level, and think about all of those together when we're designing AI systems, if we want them to have a positive impact.
[00:30:25] Russ Altman: That is fantastic. And it sounds like you are walking the walk through the projects that you've just been telling us about. So that really is a great vision.
[00:30:34] Thanks to James Landay. That was the future of AI coaching.
[00:30:38] You've been listening to The Future of Everything. You know, we have more than 250 episodes in our back catalog, so you can listen at a moment's notice to the future of many things. Also, please remember to hit the follow icon on whatever app you're listening to, to make sure you're always alerted to our new episodes.