[Video] Putting the AI in R&D — with Badhri Srinivasan, Tony Wood, Rosana Kapeller, Hugo Ceulemans, Saurabh Saha and Shoibal Datta

During BIO this year, I had a chance to moderate a panel among some of the top tech experts in biopharma on their real-world use of artificial intelligence in R&D. There's been a lot said about the potential of AI, but I wanted to explore more of what some of the larger players are actually doing with this technology today, and how they see it advancing in the future. It was a fascinating exchange, which you can see here. The transcript has been edited for brevity and clarity. — John Carroll

Thanks to PPD Biotech for their sponsorship of this event at #BIO2019.

John Carroll: This panel conversation originated last March when I had an interaction with somebody on Twitter, which was very interesting. It was somebody in AI who made the claim that they were in a position to do something dramatic with the way that pharma companies develop new drugs, particularly at the discovery stage. And the number that he used was more than $400 million for lead identification alone (which includes the cost of capital). This caused a significant response among a variety of people who don't have more than $400 million to do lead identification.

We exchanged some emails after that. He pointed to a study that Steve Paul did when he was at Eli Lilly, and I know some of the people who were involved; one was Bernard Munos. Bernard and I had an exchange. I said, in this study you said the hard cost of lead identification at Eli Lilly is $146 million. I mean, you could start a heck of a company for $146 million, and that's not just for lead identification. And he came back and we had this exchange: it's a huge process that involves all this money and all this work.

So I’ve been think­ing about it ever since then. And what I re­al­ly want­ed to do to­day was to bring to­geth­er a group of ex­perts who are grap­pling with this area, this tech­nol­o­gy of ar­ti­fi­cial in­tel­li­gence and ma­chine learn­ing, be­cause I think every­body’s fo­cused on it in one fash­ion or an­oth­er. And some­times it’s prac­ti­cal and some­times it’s kind of very fu­tur­is­tic. But every­body up here to­day is in a very prac­ti­cal sit­u­a­tion, of hav­ing bud­gets, of hav­ing staffs, of hav­ing plans and work­ing every­thing out so they can ap­ply AI in a re­al world way right now, as it de­vel­ops in­to some­thing else. That is the way I would like to set the stage for this con­ver­sa­tion.

With that, I’d like to in­tro­duce every­body here to­day. We have Badhri Srini­vasan who is the head of Glob­al De­vel­op­ment Op­er­a­tions at No­var­tis. Tony Wood, Se­nior Vice Pres­i­dent, Med­i­c­i­nal Sci­ence and Tech­nol­o­gy for GSK. Rosana Kapeller, for­mer­ly of Nim­bus, cur­rent­ly an en­tre­pre­neur-in-res­i­dence at GV, or a com­pa­ny that I used to call Google. Hugo Ceule­mans, the Sci­en­tif­ic Di­rec­tor of Dis­cov­ery Sci­ences at Janssen. Saurabh Sa­ha, the Se­nior Vice Pres­i­dent of R&D at Bris­tol My­ers. And Shoibal Dat­ta, Vice Pres­i­dent, Head of Prod­uct and Tech­nol­o­gy, PPD Biotech.

So Rosana, you’ve been di­rect­ly in­volved in this at Nim­bus and al­so cur­rent­ly at GV. When you hear some of the dif­fer­ent pro­jec­tions in terms of what you can do with AI to­day, what’s your re­ac­tion? Is this some­thing that is go­ing to take a while to de­vel­op? What’s ac­tu­al­ly hap­pen­ing now?

Rosana Kapeller: I'm very interested in the intersection of machine learning and drug discovery. That's one of the reasons, actually, why I joined GV: to learn more about it. And I think it's something that we all have to learn more about, because it's really going to augment what we do in drug discovery and also what we do in taking drugs further during development.

Is AI going to be sort of the panacea that is going to cure all our problems right now? The answer, I think, is no. I think that machine learning at the current stage can be applied to discrete pieces of the drug discovery and development spectrum. But it's not going to get you from "let's identify a target" to man in three years. It's not going to be that.

John Carroll: Okay, so everybody here has their own particular take on this. When you hear a number like $146 million for lead identification, where do you see the opportunities? Where are people going right now? Tony?

Tony Wood: John, the opportunity from our standpoint isn't the $146 million, it's the little-better-than-10% success rates during Phase II clinical studies. And so our focus really is on a problem where, for example, publications in genetics have shown that you can increase your success rate to around 20% if you focus on genetically inspired targets.

John Carroll: That would be 20% for everything going into the clinicals.

Tony Wood: That's right, for clinical studies. So in functional genomics, which if you like is the glue that connects genetics to the drug discovery processes that you've just been talking about, that's a place where we can generate data of the right shape to make AI/ML a worthwhile pursuit. And one thing that I really want to get across here is that, for me, it's not so much about the method, it's about two things.

It’s about which prob­lem are you try­ing to solve? Is it an im­pact­ful one, in our case sur­vival in Phase II. And it’s about what da­ta shape can you cre­ate to sup­port the use of AI/ML method­ol­o­gy. Fun­da­men­tal­ly, AI/ML needs nar­row and deep da­ta. And our prob­lem in gen­er­al, be­cause of the last 20 years of the way we’ve con­duct­ed ex­per­i­men­ta­tion, is we’ve pro­duced wide and shal­low da­ta. So the fo­cus is where can we gen­er­ate da­ta that’s go­ing to re­al­ly al­low this tech­nique to be mas­sive­ly im­pact­ful.

Rosana Kapeller: I couldn’t agree more and I think that’s vast­ly un­der­es­ti­mat­ed; the shape of the da­ta that you need to ba­si­cal­ly be uti­lized by ma­chine learn­ing.

Badhri Srinivasan: Yes, and I completely agree with both comments here. Sometimes it started off as: okay, we can just apply AI or ML and then just run off with it. And to both of your comments, the data that we have, we have spent the last so many years collecting for one purpose, which is: we get it on the CRF (the case report form), we send it in for a registration, and we're done with it. This is a different kind of application. That's going to be the hard part. How do we get that data? How do we get it in the shape we want, and what is that data, and what is the problem we're trying to solve? So I couldn't agree more. That's the hard part.

Saurabh Saha: Maybe I can key off of one comment Rosana made. AI can be used in a very discrete way if you ask the right question first. You have to ask the right question; I think Tony mentioned this. First you have to ask the question. Second, you have to have large labeled data sets. And third, you have to have the cross-functional expertise present to be able to do that analysis.

What we’ve learned in im­muno-on­col­o­gy is that one dri­ver may not an­swer all our ques­tions on whether a pa­tient is re­spond­ing or not re­spond­ing. We tend to think in bi­na­ry as hu­mans, but there are a num­ber of mark­ers where if you in­cre­men­tal­ly look at each one in ag­gre­gate, the quan­ti­ta­tive blend of those mark­ers may be much more in­for­ma­tive in terms of giv­ing us an idea of who’s re­spond­ing and not re­spond­ing. And one of the dis­crete ex­am­ples of this is an im­age analy­sis. So when we do any of our clin­i­cal tri­als now in the im­muno-on­col­o­gy space, we try to get biop­sy sam­ples at base­line dur­ing a pa­tient, when the pa­tient’s be­ing treat­ed and post-pro­gres­sion.

Now, getting that data and looking at histopathology, it's almost impossible for a pathologist, or for any human, to look at those slides and tell, just visually surveying the landscape, which patient is responding and why. But using machine learning, we're now able to work with some incredible startups in the Cambridge area where they can tag cell types. They can look at boundaries of tumor cells versus immune cells versus stromal tissue, feed in tons and tons of these slides and images from all different types of patients, and tell us: okay, what are the markers that are associated with response? And which aren't associated with response? That's a very discrete question that we can ask of machine learning at this time.

Tony Wood: That’s a re­al­ly fan­tas­tic ex­am­ple be­cause you have the case of im­age analy­sis be­ing able to gen­er­ate mas­sive amounts of da­ta and ac­tu­al­ly, quite frankly, see things that we can’t. To the hu­man eye, you look at these things, they of­ten look ex­act­ly the same. So that’s, for me, an area where here the method is do­ing some­thing that we clear­ly can­not do with any oth­er ap­proach. And the point about histopathol­o­gy ex­tends al­so in­to cell bi­ol­o­gy and the op­por­tu­ni­ty then to use cell phe­no­type on the back of func­tion­al ge­nomics to in­form changes as­so­ci­at­ed with ge­net­ic vari­ants and what have you.

And so for me, that's what we're looking for. We're looking for the intersection of measurement technologies and data sets which can allow us to see the world of biology in a different way from how we've previously perceived it.

John Carroll: So how long has Janssen been working on AI?

Hugo Ceulemans: We have been at it for a couple of years. I agree with what all the colleagues here said: the potential impact in development is huge. The challenge in development is also: what is the available data for that? There is an upside to moving a bit earlier, to moving to discovery, where cumulatively a lot of companies have a lot more data available, present, annotated and labeled, which is exactly fit for purpose if you want to go into this AI space. So having images is fantastic, but very often you do need labeled data to make sense of them. In the discovery space, this is somewhat easier, and we do see a lot of interest in putting all these questions to the test right now. Do we have the right types of data? Can we make good on this promise that is being raised by the AI methods? Can we join forces?

So yesterday we made an announcement that 10 pharma companies are actually going to join forces in a very privacy-preserving way. They're not just throwing the data at each other; in a very privacy-preserving way, they are looking at the cumulative warehouses of those pharma companies to try to see whether we do indeed have the right volume of data, and whether by working together we can prove that these methods make a difference.

Saurabh Saha: And Hugo mentions a really good point. It's not just the quality of the data, which is paramount, it's the actual volume of data. Because if you have a large sample size and a few dimensions that you're looking at, classical statistics probably work just fine; that's what we did in graduate school. But when you have the number of variables or dimensions far exceeding the amount of data that you actually have, where the sample size is much less than the number of dimensions, that's when machine learning actually falls apart. That's when you have great difficulty, and the fact of the matter is we're at that stage of infancy. We just don't have enough data to feed into the algorithms to get a meaningful output.

John Carroll: There's a variety of different perspectives here on this particular question, and Shoibal, I wanted to have you address one point that I've been interested in for a while, which is that if you're a bigger company, there seem to be a lot more opportunities to take advantage of this. You can invest in it over a period of years. You can see it bear fruit; you can try to increase your percentages from 10% to 20%, which is a big thing Hal Barron likes to talk about a lot. But is this something that's just reserved for the bigger companies? Or are there other strategies here that the other companies can get involved in?

Shoibal Datta: No, I don't think it's only for the bigger companies. It's interesting that the conversation started off with that kind of shape, density and topology of data. And right now the ability to gather large amounts of data is universal; I think it's not unique to any one company. Fundamentally, I think imaging clearly was the first to break through, with practical examples of how it could be applied, and I think you can follow healthcare: you saw it come into decision support first, and now it's going to be embraced in the mainstream.

I do think that with the pervasiveness of devices and sensors, the amount of data that will soon be available to analyze is going to be sufficient. The questions, and the ability of ML to approach those kinds of questions: that's, I think, what will help be the tipping point on this one. I do believe we are going to get into a new generation of, for example, digital biomarkers or novel endpoints based on these, and they will soon be part of exploratory studies, if they're not already, in clinical trials.

John Carroll: So Rosana, you work with people on the biotech side, on the smaller-company side as well. What's your take on this?

Rosana Kapeller: I do agree with what he's saying. I think that biotech actually is investing heavily in generating the data and creating intersections with machine learning and AI groups to be able to work on that. I totally agree with that, but the point that I actually want to make is about the shape of the data and the content of the data that you need for machine learning.

So one of my frustrations, and I always talk to my colleagues about this, is: "Oh, but there's so much data in pharma. Why can't we use all that data?" Especially in chemistry. Okay, you have all this solubility data, permeability data and so on. Why can't we put this all together and use machine learning to teach us all these different features of molecules? You can't, because the data has been generated over time, in different experiments, in different assays. There is so much variation in there that the machine cannot interpret it.

John Carroll: And isn't that why imaging is one of the areas where you're going to find the first exploitation, because the information is more pure, it's made for the machines?

Shoibal Datta: I think the quality of the data in imaging inherently lent itself to these approaches.

Tony Wood: I think that's critical. These methods are very good at finding discontinuities in data that have nothing to do with the question you're trying to solve. And when you're putting a jigsaw puzzle together from pieces that come from different puzzles, which is the sort of analogy we face in discovery data, there's a real problem associated with that.

On computational chemistry, I'm going to let you drag me into computational chemistry for a little while. We've been at this for 20 years now, and the methods that we already have in place based on quantum mechanics, based on other predictive approaches (you know this very well from your history), are already pretty good.

So whereas I've no doubt that AI/ML can solve those problems, the impact that it's going to have relative to the methods we currently have is probably going to be lesser than in areas like the one that [Hal] and I constantly talk about, because there is a problem there that we simply can't solve right now, and one which is enabled by the new data collection technologies that we've been talking about, like image analysis. We can't use any other method to get to the bottom of these massive data sets.

So for us, it's very much about: what's the right problem to solve based on impact? Where are the data sets that are going to be suitable for solving that problem likely to be created, as a consequence of technologies that are being developed, CRISPR, imaging, et cetera? And then building our focus around that, whilst at the same time putting in place a data infrastructure for everything else that will prepare the culture in our organization to put data at the center of everything we do.

Whether that’s AI/ML en­abling func­tion­al ge­nomics or it’s more typ­i­cal rou­tines, sta­tis­ti­cal or cal­cu­la­tion meth­ods en­abling the pre­dic­tion of small mol­e­cule or pro­tein struc­tures that will get to the sav­ings that your propo­si­tion that start­ed your ini­tial in­ter­est here.

Saurabh Saha: So I agree with Tony and Rosana. I will say that if you look in the chemical space, one area where we've had some success (I think the field has had some success) is asking specific questions like: is a molecule a hERG channel modulator or not? We had millions of data points internally, and others in the industry have the same. And applying ML to that space and asking that question, we've been able to get prediction rates from 70% to 95% sensitivity/specificity. So that's actually a very, very specific question.

Now that’s in a two-di­men­sion­al chem­i­cal space, what mol­e­cules look like on pa­per. But I’d love to ask Rosana — and from the Nim­bus and Schrödinger days — on three-di­men­sion­al space I think is much more dif­fi­cult when you’re try­ing to pre­dict on pa­per look­ing at the en­tropy or the in­ter­ac­tions of a mol­e­cule with the bi­o­log­i­cal prop­er­ties that you can pre­dict affin­i­ty, sol­u­bil­i­ty and oth­er things. And Jonathan Mon­tagu wrote a great blog on this on Bruce Booth’s site, which I think every­one should read. I think it’s a fan­tas­tic les­son.

Rosana Kapeller: It is hard, and that's something that we're working on right now, and I think machine learning is going to be able to augment that. I think the intersection of machine learning and physics-based approaches may be able to actually solve part of the problem right now, and we're doing a lot of work on that. There is something we call multi-parameter optimization, because, Tony, you will agree with this: the hardest thing is to find one molecule that has the ideal, desirable properties that you need to make a drug. That's why it takes so long to go from a hit to a drug that actually works in people. And if we could improve that even by 20% or 30%, that would be huge. And I think that's where we're putting a lot of effort.

John Carroll: I think that the economics of R&D is something that obviously everyone here is very involved in as well, trying to figure out if there's a more efficient way of doing it, to cut down not only the cost, but also the time that goes into this sort of thing. I was talking to Clay Siegall from Seattle Genetics a couple of days ago, and he's at the point now of having his second drug hit the market 21 years after founding the company.

And these are long timelines that we're talking about. Alnylam took about 20 years to get its first drug out into the marketplace. At the same time, though, we are also seeing that these companies are coming along now, not just the top 15, but all the rest of these companies are coming up as well. And everybody's going to have their own kind of perspectives on this. I am curious, though, Shoibal, in terms of where the most common pitfalls are right now for companies that are beginning to look at AI: where do you see most of the initial mistakes?

Shoibal Datta: I think it goes back to what I think Tony was saying: what problem am I trying to solve? What data do I have available for it? And what are the right approaches? Because there's no magical answer to that. You still have to go through that. I think every company went through this: we have 30 years of clinical trial data, and everybody has, in some way, shape or form, tried to make sense of it. And most have struggled to come up with something useful.

John Carroll: So at a company the size of Novartis, you've got a global operation, like several of the folks up here have right now. So how do you tackle this from the initial perspective? How do you get into AI in R&D for the first time?

Badhri Srinivasan: I actually want to go back to what Shoibal said, and I totally agree. It's a young science, and suddenly there seems to be a lot of promise, and everybody says throw AI at it, throw ML at it, and suddenly you should have a solution. But what does that mean? What is the problem that we're trying to solve? Do we know that we have sufficient data to actually address that problem? I think there is a lot of groundwork that we need to do. And so we try to look at it in a more disciplined way, rather than diving in just because we are the biggest, just because we've got money to spend, just because there's a lot of retrospective data there.

And this is the other point I wanted to pick up from the question you asked Shoibal as well. There is a ton of retrospective data, but it was all collected with a different purpose in mind. It has all been collected over the last 10, 15, 20 years. What AI and ML need now is a very different form of that data. So when we look at it, we say: do we have that? Should we go collect it? And is it to solve a specific problem? And second, is it one problem, one solution? Or can we scale this approach? Is it something where we can then say, "Okay, now I've done this and I can apply it across the board"?

If we take that kind of disciplined approach, we feel like we'll start to get somewhere. Not that we're there now, but at least we'll start to make progress in the right direction. As opposed to saying, "Here was a fantastic success, but I don't know what to do after that."

John Carroll: So, I am curious about where we're headed with this, which is what you were addressing right now. If you get the data right, if the data starts coming in a uniform and predictable fashion, and you can use this across a variety of areas … that's the first thing. Is the industry at the point where this is happening now, where the data coming in is pure and fairly reliable? Or is there still more work to be done on that?

Tony Wood: Oh, I think we're just right at the beginning of that process, quite frankly. There are early indications (we've talked about them so far, so I won't repeat them) where the promise is certainly becoming manifest. And you can look across every aspect, from idea generation through to the launch and marketing of medicines, and find areas where you can see the opportunity for that to happen. So, I think it will become much more prevalent. I think the bigger question then, as we think about it, is: once you get there, how do you get the culture right? Because this is an exercise in putting the data collection strategy and the analytical strategy at the beginning of the experiment, and not at the end of the experiment, when somebody gets handed a hard disk full of terabytes of data and has to find some sense in it.

So, one thing that I think is important for us to consider is how a, let's call it multilingual, team is assembled, with individuals who can, if you like, bridge the gap between advanced analytics, which comes from people with a very different background, and scientists who understand the nature of the problem that they're trying to solve. So, I see lots of opportunity developing over time. I think we will get to a point ultimately where the biggest issue we face, though, is making sure that we get the right culture, to have data-driven organizations.

Rosana Kapeller: I'm just going to agree with you in spades. I think that's the biggest issue right now: the mindset, and the lost-in-translation problem. The teams don't speak the same language, so it gets lost going across. And if you don't have these teams talking to each other, you're not going to make the best of it.

John Carroll: You talked about this, the advertising folks. Just talk about that for a second, in terms of the lost-in-translation aspect of this.

Rosana Kapeller: Yeah, it's completely lost in translation. My experience, when I was at Nimbus, was between the medicinal chemists and the computational chemists. In the beginning, it was very surprising to me: they don't speak the same language. They don't even think about how to solve problems in the same way, or how to integrate the different solutions. And the only way we actually made it successful at Nimbus was when we started putting them together and making decisions together. They had to co-lead the programs and make decisions together, to be able to speak the same language.

So, I think that is the same thing we're seeing in machine learning and AI: a lot of folks that come from machine learning don't know much biology. They don't understand how you solve problems in biology, and vice versa, so they don't have a common language. So, we have to develop that common language.

Hugo Ceulemans: One of the challenges is not just making the right type of data, the right volume of data, the right quality of data, but then matching those different data sets. And some of those will be small, because of all kinds of restrictions. And it's this multidisciplinarity that comes in in bridging those. So, how do you bridge from a set that annotates pure chemistry, versus one that already has a lot more biology, versus some clinical aspects? This is where the exchange of multidisciplinarity comes in, in bridging things. I think one of the biggest mistakes that is often made is: we generate an awful lot of data in one assay or a few assays (we're really good at that), and now we throw machine learning at it.

To predict what? To try to get to grips with what, exactly? And I completely agree: having that multidisciplinarity is absolutely key to bridging all those different aspects.

Saurabh Saha: So, in immuno-oncology, we think that ultimately multiple markers, or composite biomarkers, will lead the way in being able to tell us which combination of drugs will be most effective for a given cancer or patient. And to the point that Rosana and Hugo make, it's very difficult … it's extremely difficult to translate from discovery, through translational, through the clinic. But it's more difficult just to figure out how to translate or link genomics data, proteomics data, flow data, T cell activation data, imaging data, just within the translational space, just to come up with markers that can predict which patients would respond.

Listening to this conversation reminds me of a comment that Freeman Dyson once made: does science evolve through ideas or tools? And if you look back at the era of the steam engine, the tools of the steam engine preceded our understanding of the laws of thermodynamics. And that's not terribly different from where we are today in terms of artificial intelligence. We have the tool. But we don't necessarily understand, okay, what are all the branch points in the decision tree, and the probabilities? How are they weighted to come to a recommendation that this is what the model outputs? We really don't understand this black box.

So, I think people have compared this to genomics of 20 years ago, where we had the opposite problem. We had tons of data, but we didn't really have the tools to effectively analyze it. And now we're in the reverse situation. So, we're kind of repeating history here.

Shoibal Datta: So, sorry, but if I could ask a question: do we need to understand the black box? If you look at healthcare and clinical decision support, there is not a fundamental need to understand what the black box is. And in R&D, the goalposts have shifted as we've understood more about the systems that we study; there is increasingly a need to be able to explain why something is happening, more than there used to be before. What's the right thing to do here?

Saurabh Saha: Yeah, that's a great question. So, I fundamentally believe that coding lacks morality. So, if you're a physician and you are getting a piece of paper that says, "This patient should get this set of drugs," do you really believe there's quality in the data? Was it poor quality, good quality? Do you have transparency as to how that data came about and landed, and how it was analyzed in this black box? And third, has it been biased in any way? One of the challenges with ML is that you are only really able to predict based on your training sets, largely the training sets that you're working with, that are being fed into the model.

And if you have unmeasured or unknown data that is important for, let's say, predicting a response for a patient, that data's not being picked up. Because it's unknown, it's unmeasured. So, by definition, it hasn't gone into your training set. And if you're missing that component and you're making a recommendation with that left out, then I think that's a fundamental challenge, because you really don't know how you got there. Maybe it was Google, I don't know which company it was, where they looked at retinal scans and tried to predict MIs, myocardial infarctions. And what they ended up actually scoring for was the age of the patient, instead of actually picking up how the retina can predict that.

Tony Wood: Let me pick up on the point you made earlier here, because there's a great data integration opportunity. We can now describe the character of cells at multiple levels of phenotypical abstraction, from DNA structure all the way through to an image. And this is part of the problem with the complex molecular biomarker proposition: you need, however, to have some ground truth in order to connect that massive variability to something that you know to be true. And that ground truth, in the context of our strategy, is human genetics. So, if you go from high-confidence variants to deep characterization of cell character, then I think you're in a place where we can start to resolve some of these problems with regard to the use of molecular biomarkers and patient selection.

Because fundamentally, we're anchoring the whole exercise on something that we know to be true. And so, that's where our focus on functional genomics comes from. It's not just about target identification. If we get that right, it echoes out into a lot of the other processes that we've been talking about. One additional point around the black box that I want to bring in here, which is probably worthwhile considering: I've never yet encountered a computer that didn't produce an answer. The issue is not the answer; the issue is the cost of finding out whether the answer is correct or not. And that's a challenge, for example, in the synthetic area.

What we need to be able to do with these meth­ods, is use the fact they can in­te­grate across mul­ti­ple dif­fer­ent di­men­sions. And start not just wor­ry­ing about, can I find an ac­tive com­pound? But can I de­sign an ac­tive com­pound that I can syn­the­size at ton­nage scale lat­er on, with­out hav­ing to go through the de­vel­op­ment process op­ti­miza­tion. Then you’re at a place where you’re de­sign­ing for every­thing right from the be­gin­ning. And that’s when I think you get the huge im­pact from sav­ings, with re­gards to re­al­ly in­te­grat­ing chal­lenges across the time scales.

John Car­roll: I would like to shift a lit­tle bit away from the re­al world, what you’re do­ing to­day, and the chal­lenges to where you think this is go­ing. Be­cause where we are to­day is not where this is go­ing. And there are all sorts of dif­fer­ent ideas about this. So, where are we go­ing to be four, five years down the road? Where is this head­ed? Be­cause four or five years in biotech is noth­ing. I mean, it’s now. So, I’d like to get your ideas. Badhri, what do you think this is go­ing to be for No­var­tis, reimag­in­ing med­i­cine?

Badhri Srinivasan: So, first of all, I think where we are going is not where we are today. And that’s a really nice segue into this. I think where we will be is … the industry is maturing; you’ve heard a lot of people here say we are at a bit of an inflection point. We’re trying to understand the space. These are early days in our understanding of what to do with this data and how to focus it on a problem. Where we will be, and hopefully where we’re getting to, is using more and more data to say, “Okay, do I understand the disease state better?” Before I even start any kind of treatment, “Do I understand the disease state better? If I can understand the disease state better, can I bring medications that are appropriate to that disease state?” That depth of understanding, I think, is where we will be four, five years from now. If you’re then saying 10 years or so from now, which again is not a long timeframe in a pharma-biotech world, I think we will start to see the shift to actually using that data, to then say, “How are we shortening the timeframe? How are we improving our probability of success, et cetera?” But in that four, five year timeframe, I think it’s more of a better understanding of the insights that we have. Whether it’s patient insights or disease insights, or even molecule and compound insights, I think that’s where we will start.

John Car­roll:  Tony, I know this is a con­ver­sa­tion at GSK.

Tony Wood:  Yeah, look, we’re go­ing to be in a place where we have bet­ter tar­get iden­ti­fi­ca­tion, along the lines that I’ve de­scribed. And as a con­se­quence of that, an abil­i­ty to bet­ter iden­ti­fy pa­tients who are like­ly to ben­e­fit from our med­i­cines.

John Car­roll: That’s it?

Tony Wood: That’s our fo­cus.

John Car­roll:  Rosana?

Rosana Kapeller:  Would you say I’m an out­spo­ken per­son?

John Car­roll:  Well, I hope so.

Rosana Kapeller:  As every­one is say­ing, we’re very much in the be­gin­ning. And we’re still try­ing to fig­ure out how to put this all to­geth­er, and what’s go­ing to be the ben­e­fit of it. But I think that we right now are fac­ing what Ko­dak faced with dig­i­tal cam­eras. If we don’t em­brace it, most of us will have a Ko­dak mo­ment…

Tony Wood:  Mm-hmm (af­fir­ma­tive).

Rosana Kapeller:  Da­ta is go­ing to con­tin­ue to ac­cu­mu­late, we’ll con­tin­ue to gen­er­ate more da­ta. We are go­ing to fig­ure out how to get da­ta in the right shape, for­mat, et cetera. We will work on the cul­tur­al piece. We will make im­prove­ments for pa­tients and all of that. So, I think peo­ple should em­brace it in­stead of try­ing to push it away. That’s my per­son­al opin­ion.

John Car­roll: I think fear is dri­ving this as much as any­thing; the fear of miss­ing out. Hugo?

Hugo Ceulemans: I think one key aspect of where we’ll be going is integrating more. Because the one difference versus consumer is that in consumer there were a few monolithic data owners who have massive access to the data. I think in pharma, in clinical, that is not the case. The stakeholder landscape is way bigger. So, we will need modalities to deploy AI across multiple data owners, across multiple stakeholders, in order to achieve the dream of where to land this. And I think a lot of the newer players will combine a vision of methodology with, “This is complementary data, this is data that big pharma or the clinic does not have. This is the missing piece.”

And we bring this to­geth­er through an AI that can learn across mul­ti­ple stake­hold­ers, while re­spect­ing the IP rights and the busi­ness in­ter­ests, but al­so the pa­tient rights, the clin­i­cal as­pects of all these stake­hold­ers. And I think that will be a ma­jor chal­lenge, that is very spe­cif­ic to the health­care space. That in some of the oth­er ar­eas where AI has made ear­ly progress, was less promi­nent, the frag­men­ta­tion of the space. That’s some­thing we need to con­quer.
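One concrete pattern for learning across multiple stakeholders without pooling raw records is federated averaging: each data owner trains locally, and only model parameters are shared and averaged by a coordinator. The following is a minimal sketch of that idea on synthetic data, not a description of any specific consortium’s implementation:

```python
import random

random.seed(1)

# Three "data owners" each hold private (x, y) pairs drawn from the
# same underlying rule y = 3x + 1. Only weights leave an owner's site.
def make_site(n):
    xs = [random.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + random.gauss(0, 0.05)) for x in xs]

sites = [make_site(50) for _ in range(3)]

def local_update(w, b, data, lr=0.1, epochs=20):
    # Each owner refines the shared model on its own private data.
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

w, b = 0.0, 0.0
for _ in range(10):                                  # communication rounds
    updates = [local_update(w, b, d) for d in sites]
    w = sum(u[0] for u in updates) / len(updates)    # server averages weights
    b = sum(u[1] for u in updates) / len(updates)

print(round(w, 1), round(b, 1))  # recovers roughly 3.0 and 1.0
```

The raw (x, y) records never leave a site, which is the property that lets IP, business, and patient-rights constraints be respected; real systems layer secure aggregation and encryption on top of this skeleton.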

John Car­roll:  I saw a study done not too long ago, about a month and a half ago, where they had looked in­to the suc­cess rates of drug de­vel­op­ment. Which is kind of an ob­ses­sion in R&D … and one that I share. It was in­ter­est­ing, be­cause they looked at all the dif­fer­ent phas­es, go­ing through the whole pre­clin­i­cal, Phase I, Phase II, Phase III. And they saw a sig­nif­i­cant in­crease in Phase III suc­cess­es. Every­thing else was pret­ty much the same, and noth­ing has re­al­ly changed all that dra­mat­i­cal­ly. It’s pret­ty bad. The 10% or what­ev­er that might be, is poor. And I think that most peo­ple would agree if you could make that 20%, you’d be cel­e­brat­ed around the world.

I think one of the rea­sons why the late stage de­vel­op­ment is work­ing out bet­ter, is com­pa­nies, big­ger com­pa­nies and all com­pa­nies, are do­ing bet­ter in terms of de­cid­ing what they want to take in­to Phase III. And al­so, there’s been a dri­ve to­ward more spe­cial­iza­tion. Where you see the com­pa­nies, par­tic­u­lar­ly the larg­er com­pa­nies, are be­ing very clear about where they think they can make a dif­fer­ence. Where they can come up with the new prod­ucts that’ll have an im­pact on mar­ket. So, does AI con­tin­ue to dri­ve spe­cial­iza­tion? Does it make it more im­por­tant to un­der­stand par­tic­u­lar ar­eas of spe­cial­iza­tion? And will this be a con­tin­u­a­tion of that process?

I’m cu­ri­ous from Janssen’s per­spec­tive, be­cause you do cov­er quite a lot of ter­ri­to­ry.

Hugo Ceule­mans: You need to in­te­grate across a big­ger space. Be­cause those nich­es will get small­er and small­er. Dis­eases will be split up in sub­types. You need suf­fi­cient da­ta pow­er. But I ab­solute­ly do think if you have a very clear vi­sion about, “This is the type of dis­eases. This is the type of treat­ments we want to aim for,” this will be en­abled by AI and by in­te­gra­tion across larg­er and in­ter­con­nect­ed da­ta sets.

Tony Wood: Okay, I think whether or not dis­eases are split in­to sub­types is an in­ter­est­ing ques­tion. And our fo­cus is very much in the ear­ly phas­es, to fo­cus on the bi­ol­o­gy. And it may be that by fo­cus­ing on the bi­ol­o­gy and what we learn there, that we get to think about the con­stel­la­tion of dis­eases in a very dif­fer­ent way. So, for me I think the ju­ry’s still out on that one. We may find that what we do is just group dis­eases in a dif­fer­ent way from the pre­vi­ous ap­proach we’ve tak­en through di­ag­no­sis and pre­sen­ta­tion.

Saurabh Saha: John, to your point about success rates, I think we may be looking at the same study. I think it was a CMR study published in Nature Reviews Drug Discovery. The success rate from Phase I to launch is about 7%, 8%. From Phase II to launch is about 14%, 15%. And Phase III to launch is about 60%. So, if you look at that data, there’s inherently something very interesting about it. It says: why are we doing these long, complex Phase I studies if you’re really not gaining much benefit and success from Phase I to II, in terms of triaging your portfolio? You’re really seeing huge leaps when you go from Phase III to launch.
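Taking the cumulative figures quoted here at face value, the implied probability of surviving each individual phase falls out by division. A back-of-the-envelope sketch, using the midpoints of the quoted ranges:

```python
# Cumulative probability of reaching launch from the start of each phase,
# using the midpoints of the ranges quoted above (7-8%, 14-15%, 60%).
p_launch = {"Phase I": 0.075, "Phase II": 0.145, "Phase III": 0.60}

# Implied probability of advancing one phase:
# P(I -> II) = P(launch | I) / P(launch | II), and likewise for II -> III.
p1_to_2 = p_launch["Phase I"] / p_launch["Phase II"]    # ~52%
p2_to_3 = p_launch["Phase II"] / p_launch["Phase III"]  # ~24%
print(f"Phase I -> II: {p1_to_2:.0%}, Phase II -> III: {p2_to_3:.0%}")
```

Read that way, roughly half of Phase I entrants advance but only about a quarter of Phase II entrants do, which is exactly the triage gap being described.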

So, I think where we as a company, and I think the industry, would benefit from AI, or whatever you want to call it, is in increasing the probability of getting that Phase I study to read out much more data. Much richer data, where you have a clear proof of concept. That notion’s been around for 20, 30 years. But a real proof of concept, one that tells you: “You know what? Stop the program. Or continue, and you’re going to have a Phase III-like success rate in the Phase II setting.” One of the things that is clear, I think, is that science evolves, or there are revolutions in science, based on being able to measure things better.

If we were to measure a patient’s tumor, for example, to an infinite extent, to know everything that’s going on in a patient’s tumor, and we had the tools to analyze the consequences of the data we’re getting out of that tumor, one could predict, I think, with almost certainty what will ultimately happen to that tumor, and what, with Darwinian evolution, may end up happening based on the fundamental genetics of that cancer. So, when you say four to five years: in the last two years, we’ve seen a huge difference in the cancer space, because it’s only in the last two years that we’ve been meaningfully collecting samples at baseline, on treatment, and post-progression.

And it’s only in that two-year period that we’ve actually deployed all these new tools: looking at antigenicity by sequencing mutations, looking at inflammation by looking at PD-L1, looking at CD8 and markers like these, and analyzing the transcriptome, looking at gene expression in single-cell RNA analyses. So, all that’s happened in the last 18 months or two years. Now, prospectively, in the next four to five years? Think about where that’s going to be. We’re going to have all that data, being able to measure a tumor as precisely as we possibly can. So, there are going to be great advances in maybe the next two years.

John Car­roll:  I’d like the CRO per­spec­tive here, be­cause in a lot of dif­fer­ent re­spects, you can’t spe­cial­ize. You have to cov­er a large ter­ri­to­ry for a va­ri­ety of dif­fer­ent peo­ple. But at the same time they’re com­ing to you and ask­ing about this, so how do you ad­dress what … where you’re go­ing to be in four or five years from now?

Shoibal Datta: Sure, and to try to distill some of the conversation on this question: the industry as a whole is moving … I think we’ve gone past the plumbing issues, the computer infrastructure, curation, and ingestion issues. Everybody’s kind of figured that part out. We’re beginning to understand what the limitations are, back to the question of data quality, data depth, data density. And we really see digitally enabled trials and the pervasiveness of clinical-grade consumer devices and sensors as powering some of that. Our focus is really to be able to run those kinds of studies to the same standards of integrity and quality as we do traditional trials. So we have to prepare for that. And it’s not just a technology problem. It’s a technology process… It’s a multifunctional, cross-disciplinary exercise.

John Car­roll: Saurabh, you brought up the idea about what’s go­ing on in can­cer right now. Ob­vi­ous­ly, Bris­tol My­ers has a stake in the can­cer field, in I/O in par­tic­u­lar. And there’s a lot of mon­ey go­ing in­to that. It seems like those com­pa­nies that are deeply in­vest­ed in­to cer­tain spe­cif­ic ar­eas are go­ing to find it the most pro­duc­tive when it comes to us­ing AI as they build up ex­per­tise, and as they con­tin­ue to build up the da­ta and every­thing else with it.

I’ve seen some of the sta­tis­tics in terms of the in­vest­ment in can­cer ver­sus every­thing else. It’s al­ways way out front. There’s been these mas­sive ad­vances. You’re get­ting in­to small­er and small­er nich­es here, where you’re break­ing up can­cers in­to very spe­cif­ic sub­pop­u­la­tion groups, and so on. So you can see how AI could make a big dif­fer­ence there. You could see how that would hap­pen.

I’m cu­ri­ous from the rest of every­body’s per­spec­tive here, what are the oth­er ar­eas where you’re go­ing to find the great­est pro­duc­tiv­i­ty? [Daphne Koller] works with Gilead. Gilead knows the liv­er re­al­ly well. This is like one of those ar­eas where, “Give me the liv­er,” and then they’re go­ing to fig­ure it out. So I’m cu­ri­ous, what are your spe­cif­ic ar­eas where you find the great­est op­por­tu­ni­ties as it re­lates to the dis­eases? Which dis­eases, oth­er than can­cer per­haps, are we go­ing to find the great­est ad­vances in? Badhri?

Badhri Srinivasan: So I think oncology always has a tradition of leading, that’s for sure, as you pointed out as well. But I actually think the applications go far beyond that. I can think of ophthalmology, for example. I can think of respiratory and liver. We don’t really understand what’s going on, for example, in the immunology and hepatology space. I think the applications go far beyond and apply to other areas.

They are catch­ing up now. Used to be that on­col­o­gy was al­ways in the lead, and the oth­ers were a lit­tle bit more “tra­di­tion­al,” but I think the oth­er ar­eas are catch­ing up. Some of the im­age analy­sis that we see, etc, is ex­tend­ing to oth­er ar­eas as well. So I think it’s across the board.

Tony Wood: And where you un­der­stand the ge­net­ics of the dis­ease, and you can take that un­der­stand­ing and point to the cells in which quan­ti­ta­tive trait loci are ex­pressed, you have the ba­sic in­gre­di­ents to fol­low a func­tion­al ge­nomics ap­proach en­abled by AI/ML, to ad­dress any par­tic­u­lar dis­ease with high con­fi­dence in the way that I’ve de­scribed. So we don’t think about it so much from a point of view, is it can­cer, or is it neu­rode­gen­er­a­tion? But rather, where are the ba­sic in­gre­di­ents that al­low us to con­nect to­geth­er these tech­nolo­gies and im­prove our chances of suc­cess?

Saurabh Saha: So I would add, in our case … we’re not just a cancer company. We have a number of other disease areas — in cardiovascular, fibrosis, in immunoscience. Autoimmune disease, I think, is one where we’re really going to make some major headway. So as a company, we belong to a consortium that looks at the UK Biobank data that’s coming out. It’s about half a million patients. And to Tony’s point, there’s just an incredibly rich amount of genetic data from exome sequencing in that biobank.

Rosana Kapeller: So au­toim­mune dis­eases, I think, it’s go­ing to be big.

John Car­roll:  Hugo?

Hugo Ceule­mans: All types of dis­eases where-

John Car­roll:  And add on a ques­tion. Where are we not go­ing to find progress? I’m kind of cu­ri­ous about that too.

Hugo Ceulemans: And that’s a really tough one to answer. I think the nice thing about immunology, and the nice thing about metabolism, is that they come back in so many diseases, and that’s what AI is good at. You bring it into a new area, and it says, “Hey, wait a minute, I recognize this. I’ve seen this before somewhere else.” And it drags that information in and gives you suggestions on where to go. So that’s where it will be helpful.

Now it’s very hard to pre­dict where it will do that. So it may be that in some rare dis­ease, you’ll rec­og­nize some­thing you have seen be­fore in one of the more tra­di­tion­al ar­eas that got a lot of ex­plo­ration. And there’s a lot of serendip­i­ty there. So pre­dict­ing up­front, “Here’s where I am go­ing to rec­og­nize those pat­terns,” is tough. But I think com­mon themes are the things like im­munol­o­gy, like me­tab­o­lism, that ac­tu­al­ly play in every dis­ease.

John Car­roll:  So I did want to turn to the au­di­ence here. Alex! How did I know you were go­ing to ask a ques­tion? Go. Go.

Alex Zha­voronkov:  Right. So there is a lot of hype about AI, and nowa­days it’s ac­tu­al­ly quite tough to dis­tin­guish who is who, be­cause there are so many ar­eas where it can be ap­plied. So I’ll ask a very con­crete ques­tion, very quan­ti­ta­tive. So my ques­tion is that cur­rent­ly in your com­pa­nies, how long does it take to go from tar­get pre­sen­ta­tion to lead in an an­i­mal, and how much does it cost in your opin­ion? So those two fig­ures, how long, how quick­ly, and did you al­ready see the im­prove­ment ex­per­i­men­tal­ly with AI?

Badhri Srini­vasan:  So it’s high­ly var­ied, so I’m try­ing to think how to an­swer your ques­tion very specif­i­cal­ly, and I don’t be­lieve I can an­swer your ques­tion very specif­i­cal­ly. Have we al­ready seen the im­prove­ment? I think not yet. I think we are in the space now where we’re start­ing to ex­plore. So the an­swer of hav­ing seen the im­prove­ment, no. We’re in the process, I would say.

Tony Wood: I guess to me that’s a sort of “how long is a piece of string” question. There are so many other factors that are important in that process, which determine its speed. What’s the relative focus? What’s your confidence in picking your endpoints and in using animal models at the end of that sequence? What’s your confidence that a result in that animal model is meaningful?

There are volumes written, for example, on the difference in quality in animal models. So it can be quick if you’re focused on a great target, you have confidence in translation, and you’re picking an area where we already know how to execute very effectively. So for me, I think this is a vehicle to help us make decisions. It is not a panacea that will fix problems AI/ML simply cannot.

John Car­roll:  In most cas­es, isn’t AI be­ing ap­plied in ad­di­tion to every­thing else you’re do­ing? It’s re­al­ly an aug­men­ta­tion as op­posed to a re­place­ment of any­thing that may be go­ing on, it seems to me. Any­one else want to weigh in here?

Tony Wood: Mostly, the approach, at least in the early R&D phases, is as a hypothesis generator, augmenting something else you’re already doing.

But let me come back to the point about imag­ing, be­cause that is a very clear area. We can­not do it any oth­er way. So there are these unique pieces where it is the on­ly method that can be ap­plied.

Speak­er 2: Much has been writ­ten about the emer­gence of Chi­na as an AI su­per­pow­er, both be­cause of heavy in­vest­ment by the Chi­nese gov­ern­ment at all lev­els, and per­haps a low­er stan­dard of pri­va­cy than we have here in the Unit­ed States. From your per­spec­tives as glob­al com­pa­nies, do you see a no­table rise in Chi­na in terms of their strength in AI, rel­a­tive to the Unit­ed States? And is the US gov­ern­ment do­ing enough to both in­vest in AI, and are the pri­va­cy stan­dards in the Unit­ed States an en­cum­brance, in your opin­ions?

John Car­roll:  Is there any­one you’d like to di­rect that to in par­tic­u­lar? Okay. It’s up for grabs. Who wants to grab it? Hugo? It’s yours, Hugo.

Hugo Ceulemans: Oh, it’s mine. So first of all, I’m based in Europe, so I may not be best positioned to make statements about either the US or China. Now, one thing: patients are so central to our attention that we cannot make progress while at the same time losing focus on them. Just waiving patients’ rights and patients’ concerns is not an option; they are who we do it for. So we should not lower the bar on that one. Do we do enough in artificial intelligence in the West, to include a bit of Europe in that?

So probably we can do more. Probably we can do better at giving the right incentives. I do think we should also become stronger at working together. I think a lot of the initiatives in the West have been isolated, fragmented. I think that coming together in a biotope will give us the critical mass that will be needed in such a competition. But by no means do I think we should jeopardize patients’ rights and privacy rights to accomplish that. I do not think that is the answer.

Speak­er 3: Hugo, just to fol­low up on that ques­tion. Where­as we have a lot of star­tups that are in­ter­est­ed in AI … Alex, we’ve ac­tu­al­ly start­ed an al­liance with some of you as well, and I think that goes to my ques­tion. How are we go­ing to com­bine, col­lab­o­rate, part­ner? Be­cause a lot of peo­ple are say­ing, “Hey, the da­ta, the or­ga­ni­za­tion, the type of da­ta that you put in, is the out­come that you’re go­ing to have at the end, and how we speed up this R&D.”

So how are we go­ing to de­moc­ra­tize and speed up the R&D process by com­bin­ing nim­ble star­tups that are dis­rupt­ing, and have the right data­bas­es, the right codes, and have asked the right ques­tions? Be­cause all of us have an al­liance, be­cause each of us are ask­ing dif­fer­ent type of ques­tions. So how are we go­ing to de­moc­ra­tize, and how are we go­ing to group to­geth­er and col­lab­o­rate? So that’s my ques­tion to maybe each of you.

Tony Wood: I guess from my perspective, it goes back to: it’s about the data. And for me, where we need to be focusing is on the technologies that are going to enable the generation of appropriate and relevant data against the backdrop of some labeling that gives you a ground truth. So I worry less about the methodology development. I think what we should be much more focused on are some of the problems in data collection that you’ve heard my fellow panel members talk about: ambulatory molecular biomarker measurements, things like that. I’ll stop there. It’s about data development and data acquisition technology, not about the analytical method.

Speak­er 4:  So no mic, but I’m go­ing to speak loud enough. Build­ing off Maria’s ques­tion, let’s go one step deep­er. Let’s talk about busi­ness mod­els. So 130 plus AI in “drug dis­cov­ery” com­pa­nies, 15 top phar­ma com­pa­nies. How do you ad­ju­di­cate, eval­u­ate, and then how do you think about busi­ness mod­els from a da­ta as­set mod­el eco­nom­ics?

Tony Wood: That’s a re­al­ly good ques­tion be­cause it’s very dif­fi­cult. (Laugh­ter.)

Speak­er 4:  Hugo asked that to me last week at the Lon­don con­fer­ence. That’s his ques­tion.

Tony Wood: Yeah. Let me just put a plea out there. We need some means of assessing data on performance standards, right? This all started with the original ImageNet work, where quite frankly there was a global standard that everyone was aiming against. You could get away from this sort of current “collaborate, suck it in, and you’ll find out when you pay the cost of finding out whether or not our predictions are right.” That’s fundamentally the problem we face.

So how do we create an environment, be it in the context of this secure data philosophy that you described, Hugo, or others, where we simply have a set of ImageNet-like standards that one can create synthetic datasets against, what have you: an impartial means of judging where the best examples are? Where are there investments that are really delivering massive improvements? And where are there just many versions of the same flavor? We just don’t have that right now, and it makes it very hard.

John Car­roll: Okay.

Alex: So I just want­ed to add to that. There are a few Im­a­geNet-like projects, for ex­am­ple, gen­er­a­tive chem­istry. So there are two cur­rent­ly. So peo­ple cre­at­ed large da­ta sets, opened them up, la­beled them, cre­at­ed mod­els, and cre­at­ed a leader­board. But what is in­ter­est­ing is that most of the phar­ma com­pa­nies, when they’re part­ner­ing, they’re not look­ing at those re­sults. They’re typ­i­cal­ly … It’s kind of … I think it’s a lit­tle bit more of a crony rep­u­ta­tion play at this point in time. So peo­ple do not re­al­ly look at the bench­marks.

Tony Wood: I guess I just re­spond by say­ing it’s nec­es­sary but not suf­fi­cient in our analy­sis.

Matt Clark: Great. Hi, Matt Clark from Elsevier. One aspect of AI that we haven’t discussed much is how it changes making decisions. Right? We have all the AIs, we have all the readouts, but we actually have to make decisions that we wouldn’t otherwise have made based on this information, to change the course. For example, on the question of drug success rates at different clinical trial phases: when I was in big pharma, I read hundreds of project reports, and I never saw a drug team kill its own project.

They’d usu­al­ly go on a cou­ple years be­fore some­one out­side said, “Hey, we’ve got to stop this. You can’t save it any­more.” So I would say, what is the as­pect of think­ing in your or­ga­ni­za­tions for, now that you have this da­ta com­ing in from AI, us­ing it to change how you make de­ci­sions, and make dif­fer­ent de­ci­sions than you made in the past? How are you chang­ing the or­ga­ni­za­tion to ac­count for that, as well as just the raw sci­ence of I have a new piece of in­for­ma­tion com­ing in?

John Car­roll: Saurabh?

Saurabh Saha: Yeah, so I think your question doesn’t necessarily have to be scoped just to AI. I think just regular statistics, a p-value, should be enough to kill a project. You just have to have the mindset to be able to pull the trigger and say, “This is done.” That is traditionally very difficult for companies to do. One thing that we’ve adopted is a notion of truth-seeking over progression-seeking: the notion that from when a target is identified, all the way through to when the molecule may already be in the clinic, you never stop validating that target-asset pair. You’re constantly validating it.

Whether you’re us­ing AI, whether you’re us­ing reg­u­lar sta­tis­tics, what­ev­er the means is, ex­ter­nal da­ta, com­peti­tor da­ta, aca­d­e­m­ic da­ta, what­ev­er it is, you’re con­stant­ly eval­u­at­ing whether the tar­get as­set pair that you’re pur­su­ing at year five is still as com­pelling and ex­cit­ing as it was five years ago. And if you can mea­sure up to that stan­dard, then I think we’ll have high­er suc­cess rates.

John Car­roll:  Okay. Well, I said we’d get out here sharp, about 8:30, so it’s 8:31. I know you guys have a busy day. I want to thank every­body on our pan­el here. We’re go­ing to hear a lot more about AI. I’d like to par­tic­u­lar­ly thank PPD Biotech for spon­sor­ing to­day’s con­ver­sa­tion, and hope to see you all lat­er on. Thanks for com­ing.

Da­ta Lit­er­a­cy: The Foun­da­tion for Mod­ern Tri­al Ex­e­cu­tion

In 2016, the International Council for Harmonisation (ICH) updated their “Guidelines for Good Clinical Practice.” One key shift was a mandate to implement a risk-based quality management system throughout all stages of a clinical trial, and to take a systematic, prioritized, risk-based approach to clinical trial monitoring—on-site monitoring, remote monitoring, or any combination thereof.

Mer­ck scraps Covid-19 vac­cine pro­grams af­ter they fail to mea­sure up on ef­fi­ca­cy in an­oth­er ma­jor set­back in the glob­al fight

After turning up late to the vaccine development game in the global fight against Covid-19, Merck is now making a quick exit.

The pharma giant is reporting this morning that it’s decided to drop development of 2 vaccines — V590 and V591 — after taking a look at Phase I data that simply don’t measure up to either the natural immune response seen in people exposed to the virus or the vaccines already on or near the market.

Endpoints News

Keep reading Endpoints with a free subscription

Unlock this story instantly and join 98,700+ biopharma pros reading Endpoints daily — and it's free.

Adeno-associated virus-1 illustration; the use of AAVs resurrected the gene therapy field, but companies are now testing the limits of a 20-year-old technology (File photo, Shutterstock)

Af­ter 3 deaths rock the field, gene ther­a­py re­searchers con­tem­plate AAV's fu­ture

Nicole Paulk was scrolling through her phone in bed early one morning in June when an email from a colleague jolted her awake. It was an article: Two patients in an Audentes gene therapy trial had died, grinding the study to a halt.

Paulk, who runs a gene therapy lab at the University of California, San Francisco, had planned to spend the day listening to talks at the American Association for Cancer Research annual meeting, which was taking place that week. Instead, she skipped the conference, canceled every work call on her calendar and began phoning colleagues across academia and industry, trying to figure out what happened and why. All the while, a single name hung in the back of her head.

Endpoints Premium

Premium subscription required

Unlock this article along with other benefits by subscribing to one of our paid plans.

Jackie Fouse, Agios CEO

Agios scores its sec­ond pos­i­tive round of da­ta for its lead pipeline drug — but that won't an­swer the stub­born ques­tions that sur­round this pro­gram

Agios $AGIO bet the farm on its PKR activator drug mitapivat when it recently decided to sell off its pioneering cancer drug Tibsovo and go back to being a development-stage company — for what CEO Jackie Fouse hoped would be a short stretch before they got back into commercialization.

On Tuesday evening, the bellwether biotech flashed more positive topline data — this time from a small group of patients in a single-arm study. And the executive team plans to package this with its earlier positive results from a controlled study to make its case for a quick OK.

Endpoints News

Keep reading Endpoints with a free subscription

Unlock this story instantly and join 98,700+ biopharma pros reading Endpoints daily — and it's free.

Vir's CMO says he's sur­prised that a low dose of their he­pati­tis B drug ap­pears promis­ing in ear­ly slice of da­ta — shares soar

Initial topline data from a Phase I study of a new therapeutic for chronic hepatitis B virus was so promising that it surprised even the CMO of the company that produces it.

Vir Biotechnology on Tuesday announced that its VIR-3434 molecule reduced the level of virus surface antigens present in a blinded patient cohort after eight days of the trial with just a single 6 mg dose. Six of the eight patients in the cohort were given the molecule, and the other two a placebo—all six who received the molecule saw a mean antigen reduction of 1.3 log10 IU/mL, Vir said.

Endpoints News

Keep reading Endpoints with a free subscription

Unlock this story instantly and join 98,700+ biopharma pros reading Endpoints daily — and it's free.

Eli Lil­ly demon­strates that 2 an­ti­bod­ies beat 1 for guard­ing against se­vere Covid-19. But can that solve the first an­ti­body’s prob­lem amid slow up­take?

It seems safe to say that two antibodies are better than one.

Eli Lilly released the largest results yet on Tuesday for their Covid-19 neutralizing antibody cocktail, announcing that the combo reduced deaths and hospitalizations in coronavirus patients by 70%. Across 1,000 patients, there were 11 such events in the treatment group and 36 in the placebo group.

The breakdown for deaths alone was even starker: 10 in the placebo group and 0 in the treatment group. Lilly added that the drug hit secondary endpoints for reducing viral load and alleviating symptoms, although they did not disclose numbers.


George Yancopoulos (L) and Len Schleifer (Regeneron)

Regeneron touts positive preliminary impact of its Covid antibody cocktail, preventing symptomatic infections in high-risk group

Regeneron flipped its cards on an interim analysis of the data being collected for its Covid-19 antibody cocktail used as a safeguard against exposure to the virus. And the results are distinctly positive.

The big biotech reported Tuesday morning that their casirivimab and imdevimab combo prevented any symptomatic infections from occurring in a group of 186 people exposed to the virus through a family connection, while the placebo arm saw 8 of 223 people experience symptomatic infection. Symptomatic combined with asymptomatic infections occurred in 23 people among the 223 placebo patients compared to 10 of the 186 subjects in the cocktail arm.


Drugmakers 'inching ahead' in increasing access to drugs worldwide, with GlaxoSmithKline leading the pack

Top drug developers are “inching ahead” in improving access to much-needed drugs around the world — an issue that has been underscored by the Covid-19 pandemic. But there’s still more work to do, Access to Medicine Foundation executive director Jayasree Iyer said.

Every two years, the Access to Medicine Index ranks 20 of the world's top drugmakers leading the push for better access to medicines in low- and middle-income countries. This year's report, published Tuesday, looks at drug access in 106 countries.

News briefing: Nestlé whips up research collaboration with newly-unveiled Flagship upstart; Marianne De Backer joins Kronos board

Flagship Pioneering tapped into a variety of trendy R&D themes when it officially debuted Senda Biosciences a few months ago, most prominently its focus on the microbiome, computational biology and cellular interactions. And while it's all still in its infancy, the founders have clearly attracted high-profile attention from a major player that straddles the line between food and medicine.

Nestlé Health Science has partnered with Senda on one of its initial slate of R&D focuses, aligning itself with the biotech on metabolics, with a focus on some big targets, including obesity and glycemia.