AI is Revolutionizing Scientific Discovery
This is something of a nice change. I've given a lot of scientific talks, and no one claps and cheers when I come on stage—not normally, anyway. It's really exciting. It's really wonderful to be here.
I guess I should start off assuming that not everyone in this cavernous hall knows who I am. Who am I? I'm someone who has done some work in AI for science, who really believes that we can use AI systems, these technologies, these ideas to change the world in a very specific way—to make science go faster, to enable new discoveries. I think it's really, really wonderful. We have the opportunity to take these tools, these ideas, and aim them toward the question of how we can build the right AI systems so that sick people can become healthy and go home from the hospital.
And it's been kind of a really wonderful and winding journey for me to end up here. I was originally trained as a physicist. I thought I was going to be a "laws of the universe" physicist. If I was very, very lucky, I could do something that would end up as one sentence in a textbook. And I did physics, and I went to actually do a PhD in physics. And then what I was working on didn't really grab me. It just didn't feel like what I wanted to do. So I dropped out. I didn't start a startup—that would have been very on point for this event—but I dropped out, and I ended up working at a company that was doing computational biology. How do we get computers to say something smart about biology? And I loved it. I loved it not just because it was fun, but because it was something that would let me do what I thought I was good at: write code, manipulate equations, think hard thoughts about the nature of the world, and use it toward this very applied purpose: at the end, we want to make medicines, or we want to enable others to make medicines.
Then I really became a biologist and a machine learner—a machine learner because I left that job and went back to grad school in biophysics and chemistry, and I no longer had access to the incredible computer hardware that I had at my previous job. In fact, they had custom ASICs for simulating how proteins—this part of your body that I'll talk about—move. And since I didn't have that anymore, but I still wanted to work on the same problems—well, I didn't want to just do the same thing with less compute. And so I started to learn, and I was getting very interested in statistics, in machine learning. We didn't call it AI back then. In fact, we didn't even call it machine learning—that was a bit disreputable. I said, "I'm working in statistical physics." But you know, how are we going to develop algorithms? How are we going to learn from data and do that instead of relying on very large compute? And it turns out that in AI, it's ideas, in addition to very large compute, that let you answer new problems.
And after this, I joined Google DeepMind, really joining a company that wanted to say, "How are we going to take these powerful technologies and all of these ideas?"—and it was becoming very, very readily apparent how powerful these technologies were, with applications especially to games, but also to things like data centers and others. "How are we going to take these technologies and use them to advance science and really push forward the scientific frontier?" And how can we do this in an industrial setting with an incredibly fast pace, working with some really smart people, working with great compute resources—and with all that, you darn well better make some progress. And it's been really, really fun, and the fact that I'm on this stage indicates that we made some progress.
And I think really the guiding principle for me has been that when we do this work, ultimately we are building tools that will enable scientists to make discoveries. And what I think is really heartening about the work we've done, and the part that still just resonates with me at my core, is that there are, I think, about 35,000 citations of AlphaFold. Within that, there are tens of thousands of examples of people using our tools to do science that I couldn't do on my own—using them to make discoveries, be it vaccines, be it drug development, be it how the body works. And I think that's really, really exciting.
And the part I want to talk to you about today, the story I want to tell you, is a bit about the problem, a bit about how we did it—and especially the role of research, machine learning research, and the fact that it isn't just off-the-shelf machine learning. And then I want to tell you a little bit about what happens when you make something great, how people use it, and what it does for the world.
So, I'll start with the world's shortest biology lesson. The cell is complex. For people who have only studied biology in high school or in college, you might have this idea that the cell is a couple parts that have labels attached to them, and it's kind of simple. But really, it looks much more like what you see on the screen. It's dense. It's complex. In terms of crowding, it's like the swimming pool on the 4th of July, and it's full of enormous complexity.
Humans have about 20,000 different types of proteins. Those are some of the blobs you see on the screen. They come together to do practically every function in your cell. That green tail is the flagellum of an E. coli—that's how it moves around. And you can see the part that looks like it turns; in fact, it turns and drives this motor. All of this is made of proteins.
When people say that DNA is the instruction manual for life, well, this is what it's telling you how to do. It's telling you how to build these tiny machines. And biology has evolved an incredible mechanism to build the machines it needs—literal nanomachines—and build them out of atoms.
And so your DNA gives you instructions that say, "Build a protein." Now you might say your DNA is a line, and so are proteins in a certain sense. It's instructions on how to attach one bead after another, where each bead is a specific kind of molecular arrangement of atoms. And you should wonder: if my DNA is a line and I am very much not one-dimensional, what happens in between?
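The one-dimensional picture above, DNA read off as a series of beads, can be sketched in a few lines. The codon table here is a tiny subset of the standard genetic code, and the DNA string is made up purely for illustration:

```python
# DNA is read three letters (one codon) at a time, and each codon names
# the next amino-acid bead to attach to the growing protein chain.
# Only a few entries of the standard genetic code are shown here.
CODON_TABLE = {
    "ATG": "M",   # methionine, the usual start bead
    "AAA": "K",   # lysine
    "GGC": "G",   # glycine
    "TTC": "F",   # phenylalanine
    "TGA": None,  # stop: the chain ends here
}

def translate(dna: str) -> str:
    """Turn a DNA string into a one-letter-per-bead protein sequence."""
    beads = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        beads.append(amino_acid)
    return "".join(beads)

print(translate("ATGAAAGGCTTCTGA"))  # → "MKGF"
```

The output string is the one-dimensional chain; the three-dimensional shape it folds into is the subject of the next section.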
And the answer is: after you make this protein and assemble it one piece at a time, it will fold up spontaneously into a shape—like you've opened your IKEA bookshelf, and instead of having to do the hard work, it simply builds itself, and you get this quite complex structure. You can see quite a typical protein, a kinase, for those of you who are biologists in the audience. You can see this very complex arrangement of atoms, and that arrangement is functional. The majority—not all—of the proteins in your body undergo this transformation, and that is what functions. And it is incredibly small: light itself has a wavelength of a few hundred nanometers, and this is a few nanometers in size—smaller than you can see in a light microscope.
And for a long time, scientists have wanted to understand this structure because they use it to predict how changes in that protein might affect disease. How does that work? How does biology work? Often, if you make a drug, it is to interrupt the function of a certain protein like this one.
Now, scientists have, through an incredible amount of cleverness, figured out the structure of lots of proteins, and it remains to this day exceptionally difficult. Right? You shouldn't imagine this as, "I want to determine the structure of a protein. So I shall open the lab protocol for protein structure determination. I shall follow the steps." It consists of cleverness, of ideas, of finding many ways. In this case, I'm describing one type of protein structure determination—experimental measurement—where you convince that big ugly molecule I just showed you to form a regular crystal, kind of like table salt. No one has an easy recipe for this. So they try many things. They have ideas, and it's exceptionally difficult and filled with failure, like many things in science.
And you're really looking at one way to get an idea of how difficult this is—just one ordinary paper that we were using. I flipped to the back, and in their protocol it said, "After more than a year, crystals began to form." Right? So not only did they do all these hard experiments, but they had to wait about a year to find out if it worked. And probably that year wasn't spent waiting—it was spent trying a thousand other things that also didn't work.
Once you do that, you can take this to a synchrotron—a modest instrument; you can see the cars ringing the outside of it for scale—so that you can shine incredibly bright X-rays on it and get what is called a diffraction pattern. You can solve that, and you can deposit it in what's called the PDB, the Protein Data Bank.
And one of the things that enabled the work we did is that scientists 50 years ago had the foresight to say, "These are important, these are hard. We should collect them all in one place." So there's a dataset that represents essentially all the academic output of protein structures in the community and available to everyone. So our work was on very public data.
About 200,000 protein structures are known, and they increase pretty regularly at about 12,000 a year. But this is much, much smaller than the need. Getting the input information—the DNA that tells you about a protein's sequence—is much, much, much easier, so billions of protein sequences have been discovered. We are learning about protein sequences about 3,000 times faster than protein structures.
Okay, that's all scientific content, but I should talk to you about the little thing we did, which has this kind of schematic diagram. We wanted to build an AI system. In fact, we didn't even care if it was an AI system. That's one of the nice things about working in AI for science—you don't care how you solve it; it could have ended up being a computer program or anything else. We wanted to find some way to get from the left, where each of those letters represents a specific building block of the protein, considered in order, put something in the middle—AlphaFold—and end up with something on the right. You'll see two structures there if you look closely: the blue is our prediction, and the green is the experimental structure that took someone a year or two of effort—on the order of $100,000, if you want to put an economic value on it. And you can see we were able to do this. And I want to tell you how.
And there were really three components to doing this, or to do any machine learning problem, and you can say you have data, and you have compute, and you have research. And I feel like we tell too many stories about the first two and not enough about the third.
In data, we had 200,000 protein structures. Everyone has the same data. In terms of compute, this isn't LLM scale. The final model itself was 128 TPU v3 cores, roughly equivalent to a GPU per core, for two weeks. This is again within the scope of, say, academic resources, but it's worth saying—when you think about how much compute you need, don't get distracted by the number for the final model. The real cost of compute is the cost of ideas that didn't work, all the things you had to do to get there. And then finally, research. And I would say it was a small group of people that ended up doing this. So really, when you look at these machine learning breakthroughs, they involve probably fewer people than you imagine, and this is really where our work was differentiated.
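As a rough sanity check on the compute claim above, here is the back-of-envelope arithmetic: 128 cores at roughly one GPU-equivalent each, running for two weeks. The dollar figure per GPU-hour is a made-up round number, included only to give a sense of scale:

```python
# Back-of-envelope compute for the final model only (not the far larger
# cost of all the ideas that didn't work, as noted in the talk).
cores = 128                      # TPU v3 cores, ~1 GPU-equivalent each
weeks = 2                        # training time for the final model
gpu_weeks = cores * weeks        # GPU-equivalent weeks
gpu_hours = gpu_weeks * 7 * 24   # GPU-equivalent hours
assumed_cost_per_gpu_hour = 2.0  # hypothetical $/GPU-hour, round number

print(gpu_weeks)                               # 256 GPU-weeks
print(gpu_hours)                               # 43008 GPU-hours
print(gpu_hours * assumed_cost_per_gpu_hour)   # ~$86k at the assumed rate
```

The point of the exercise: the headline training run lands in the tens of thousands of GPU-hours, which is within reach of a well-resourced academic group.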
We came up with a new set of ideas on how to bring machine learning to this problem. And I can say earlier systems, largely based on convolutional neural networks, did okay. They certainly made progress. If you replace that with a transformer, you're honestly about the same. If you take the ideas of a transformer, plus much experimentation and many more ideas, that's when you start to get real change.
And in almost all the AI systems you can see today, a tremendous amount of research and ideas and what I would call mid-scale ideas are involved. It isn't just about the headlines where people will say "transformers," you know, "scaling," "test time inference." These are all important, but they're one of many ingredients in a really powerful system.
And in fact, we can measure how much our research was worth. AlphaFold 2 is the system that is quite famous, the one that was quite a large improvement; AlphaFold 1 was the best in the world before it. The AlQuraishi lab did a very careful experiment where they took the AlphaFold 2 architecture and trained it on 1% of the available data, and they could show that AlphaFold 2 trained on 1% of the data was as accurate as, or more accurate than, AlphaFold 1, the previous state-of-the-art system. So there's a very clean result that says the third of these ingredients—research—was worth a hundredfold of the first of these ingredients—data.
And I think this is generally really, really important—that one of the big things, as you're all thinking, as you're all in startups or thinking about startups, think about the amount to which ideas, research, discoveries amplify data, amplify compute—they work together with it. We wouldn't want to use less data than we have. We wouldn't want to use less compute than we have available. But ideas are a core component when you're doing machine learning research, and they really helped to transform the world.
We can even go back and do ablations, and ask which parts matter. Don't focus too much on the details—we pulled this from our paper. You can see the difference compared to the baseline, and you can see that each of the ideas you might remove from our final system is a discrete, identifiable idea, some of which were incredibly popular research areas within the field. This work came out, and a part of it was equivariant, and people said, "Equivariance—that is the answer! AlphaFold is an equivariant system, and it's great. We must do more research on equivariance to get even more great systems." Well, I was very confused by this, because the sixth row there—no IPA, invariant point attention—removes all the equivariance in AlphaFold, and it hurts a bit, but only a bit. On the GDT scale that you can see on the left graph, AlphaFold 2 was about 30 GDT better than AlphaFold 1, and equivariance explains two or three points of that. It isn't about one idea. It's about many mid-scale ideas that add up to a transformative system.
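For readers unfamiliar with the GDT scale mentioned here, below is a minimal sketch of the GDT_TS score, assuming the per-residue distances between predicted and experimental C-alpha atoms are already in hand after superposition. The real metric also searches over superpositions, which this sketch omits:

```python
# GDT_TS averages, over the cutoffs 1, 2, 4, and 8 angstroms, the
# percentage of residues whose predicted C-alpha lands within that
# cutoff of the experimental position. Higher is better; 100 is perfect.
def gdt_ts(distances):
    """distances: per-residue prediction-vs-experiment gaps, in angstroms."""
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    fractions = [
        sum(d <= c for d in distances) / len(distances) for c in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)

# Four residues at 0.5, 1.5, 3.0, and 9.0 angstroms from the experiment:
print(gdt_ts([0.5, 1.5, 3.0, 9.0]))  # → 56.25
```

On this scale, a roughly 30-point jump between systems, as described above, is an enormous change in how many residues land close to the truth.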
And it's very, very important when you're building these systems to think about what we would call in this context biological relevance. We would have ideas that were better. We kind of got our system grinding 1% at a time. But what really mattered was when we crossed the accuracy that it mattered to an experimental biologist who didn't care about machine learning. And you have to get there through a lot of work and a lot of effort. And when you do, it is incredibly transformative.
And we can measure against this axis, where the dark blue bars are the other systems available at the time. And this was assessed blind. Protein structure prediction is in some ways far ahead of LLMs, or the general machine learning space, in having blind assessment. Since 1994, every two years, everyone interested in predicting the structure of proteins gets together and predicts the structures of a hundred proteins whose answer isn't known to anyone except the research group that just solved it—unpublished. And so you really do know what works. And we had about a third of the error of any other group on this assessment.
And it matters, because once you are working on problems where you don't know the answer, you get to really measure how good things are. And you find that a lot of systems don't live up to what people believe over the course of their research, because even if you have a benchmark, we all overfit our ideas to the benchmark—unless you have a held-out set. And in fact, the problems you have in the real world are almost always harder than the problems you train on, because you have to learn from a lot of data and then apply it to very important singular problems. So it is very, very important that you measure well, both as you're developing and when people are trying to decide whether they should use your system. External benchmarks are absolutely critical to figuring out what works, and that's what really helps drive the world forward.
So just some wonderful examples of this—typical performance for us. These are blind predictions. You can see they're pretty darn good.
Also important—we made it available. We thought it was good, and we had done a lot of assessment, but we decided it was very important to make it available in two ways. One is that we open-sourced the code—and we actually open-sourced the code about a week before we released a database of predictions, starting originally at 300,000 predictions and later going to 200 million—essentially every protein from an organism whose genome has been sequenced. And this made an enormous difference.
And one of the most interesting sociological things is the huge difference between when we released a piece of code that specialists could use, and we got some information, and when we made it available to the world in this database form. It was really interesting—you release something, and every day you check Twitter, or X, to find out what's going on. What we would really see is that even after that CASP assessment, I would say the structure predictors were convinced this was obviously an enormous advance that solved the problem. But general biologists, the people we wanted to reach—the people who didn't care about structure prediction, who cared about proteins to do their experiments—weren't as sure. They said, "Well, maybe CASP was easy. I don't know." And then this database came out, and people got curious, and they clicked in, and the amount to which the proof was social was extraordinary—people would look and say, "How did DeepMind get access to my unpublished structure?" That was the moment at which they really believed it: everyone either had a protein that they hadn't solved or had a friend with a protein that was unpublished, and they could compare, and that's what really made the difference.
And having this database, this accessibility, this ease led everyone to try it and figure out how it worked. Word of mouth is really how this trust is built. And you can see some of these testimonials, right? "I wrestled for three to four months trying to do this scientific task. This morning I got an AlphaFold prediction, and now it's much better. I want my time back." You really appreciate AlphaFold when you run it on a protein that for a year refused to get expressed and purified—meaning that for a year they couldn't even get the material to start experiments. These are really important. When you build the right tool, when you solve the right problem, it matters, and it changes the lives of people who are doing things not that you would do, but building on top of your work.
And I think it's just extraordinary to see these testimonials and to hear from the number of people I've talked to. The time that I really knew this tool mattered—there was a special issue of Science on the nuclear pore complex a few months after the tool came out. The special issue was all about this particular very large, several-hundred-protein system, and three out of the four papers in Science about it made extensive use of AlphaFold. I think I counted over a hundred mentions of the word "AlphaFold" in Science, and we had nothing to do with it. We didn't know it was happening. We weren't collaborating. It was just people doing new science on top of the tools we had built, and that is the greatest feeling in the world.
And in fact, users do the darnedest things. They will use tools in ways you didn't know were possible. The tweet on the left, from Yoshitaka Moriwaki, came out two days after our code was available. We had predicted the structure of individual proteins, and we were working on building a system that would predict how proteins come together. But this researcher said, "Well, I have AlphaFold. Why don't I just put two proteins together, and I'll put something in between?" You could think of this as prompt engineering, but for proteins. And suddenly they found out this was the best protein interaction prediction in the world—that when you train a really, really powerful system, it will have additional, in some sense emergent, skills on closely related problems.
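The trick described here can be sketched as plain string manipulation: join two chains with a long flexible glycine linker and hand the result to a single-chain predictor as if it were one protein. The sequences below are made-up fragments, and `join_with_linker` is a hypothetical helper for illustration, not part of any real AlphaFold API:

```python
# "Prompt engineering for proteins": a single-chain structure predictor
# only accepts one sequence, so we splice two chains into one, separated
# by a poly-glycine linker long and flexible enough that the model can
# fold the two halves against each other as if they were separate chains.
def join_with_linker(seq_a: str, seq_b: str, linker_len: int = 20) -> str:
    """Concatenate two chains with a flexible poly-glycine linker."""
    return seq_a + "G" * linker_len + seq_b

chain_a = "MKTAYIAKQR"  # hypothetical sequence fragment
chain_b = "QISFVKSHFS"  # hypothetical sequence fragment
combined = join_with_linker(chain_a, chain_b)

print(len(combined))  # 10 + 20 + 10 = 40 residues
# `combined` would then be submitted to the predictor in place of a
# single protein; the linker region is ignored when reading the result.
```

The design choice is the linker: glycine is small and flexible, so it adds few constraints of its own, which is why this improvisation worked at all.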
People started to find all sorts of problems that AlphaFold would work on that we hadn't anticipated. It was so interesting to see the field of science in real time reacting to the existence of these tools, finding their limitations, finding their possibilities, and this continues, and people do all sorts of exciting work—be it in protein design, be it in others—on top of either the ideas and often the systems we have built.
One application that I thought was really important is that people have started to learn how to use it to engineer big proteins, or to use it as part of a larger pipeline—and I want to tell this story for two reasons. One is that I think it's a really cool application, but the second is how it really changes the work of science. Often people will say, "Science is all about experiments and validation. So it's great that you have all these AlphaFold predictions. Now all we have to do is solve all the proteins the classic way so that we can tell whether your predictions are right or wrong." And they're right about one thing: science is about experiments, about doing these experiments. But they're wrong about another: science is about making hypotheses and testing them, not about the structure of one particular protein.
In this case, the question was they took this protein on the left called the contractile injection system—but that's a mouthful. They like to call it the molecular syringe. And what it does is it attaches to a cell and injects a protein into it. And the scientists at the Zhang Lab at MIT were saying, "Well, can we use this protein to do targeted drug delivery? Can we use it to get gene editors like CRISPR into the cell?" They tried over a hundred methods to figure out how to take this protein, which they didn't have a structure of—this is just kind of a rendition after the fact—and say, "How can we change what it recognizes?" I think it's originally involved in plant defense or something like that, and they didn't know how to do it.
And they ran an AlphaFold prediction. You can see the one on the left. I wouldn't even say it's a great AlphaFold prediction, but almost immediately they looked at it and said, "Wait a minute. Those legs at the bottom are how it must recognize and attach to cells. Why don't we just replace those with a designed protein?" And so almost immediately, as soon as they got the AlphaFold prediction, they re-engineered it to add the designed protein that you see in red to target a new type of cell. They take this system, and then they show, in fact, that they can choose cells within a mouse and inject proteins—in this case, fluorescent proteins, so you'll see the color—and target the cells they want within a mouse brain. And so they are using this to develop a new type of system for targeted drug delivery.
And we see many more examples. We see some in which scientists are using this tool to try thousands and thousands of interactions to figure out which ones are likely to be real—in fact, they discovered a new component of how egg and sperm come together in fertilization. Many, many of these discoveries are built on top of this.
And I like to think that our work made the whole field of what's called structural biology—biology that deals with structures—5 or 10% faster. But the amount to which that matters for the world is enormous, and we will have more of these discoveries.
And I think ultimately, structure prediction and larger AI for science should be thought of as an incredible capability to be an amplifier for the work of experimentalists—that we start from these scattered observations, these natural data. This is our equivalent of all the words on the internet. And then we train a general model that understands the rules underneath it and can fill in the rest of the picture.
And I think that we will continue to see this pattern, and it will get more general—we will find the right foundational data sources in order to do this. The other thing that has really been a pattern is that you start where you have data, but then you find what problems it can be applied to. And so we find enormous advances—enormous capability to understand interactions in the cell and other things downstream of extracting the scientific content of these predictions—and then the rules the models learn can be adapted to new purposes.
And I think this is really where we see the foundation model aspect of AlphaFold and other narrow systems. And in fact, I think we will start to see this in more general systems, be they LLMs or others—we will find more and more scientific knowledge within them, and we'll use them for important purposes.
And I think this is really where this is going. And I think the most exciting question in AI for science is: how general will it be? Will we find a couple of narrow places where we have transformative impact, or will we have very, very broad systems? And I expect it will ultimately be the latter as we figure it out.
Thank you.