Data Literacy: The Foundation for Modern Trial Execution

In 2016, the International Council for Harmonisation (ICH) updated its "Guidelines for Good Clinical Practice." One key shift was a mandate to implement a risk-based quality management system throughout all stages of a clinical trial, and to take a systematic, prioritized, risk-based approach to clinical trial monitoring, whether on-site monitoring, remote monitoring, or any combination thereof.

In theory, this new guidance freed researchers to take advantage of powerful new technologies that simplify remote monitoring and enable such monitoring to deliver study-wide insights that can speed answers and enhance patient safety. Yet in practice, clinical trial monitoring has remained grounded in on-site source data verification, a costly, time-consuming process that does not address risk or site performance data. It's time for that to change. Data literacy is the change agent.

Ongoing education is necessary to keep pace with technological transformation

This issue is not unique to the clinical trial industry; it has been a challenge everywhere data are being used to enhance decision making. Yet other industries have outstripped ours in training their workforces, enhancing both employees' skills and their job satisfaction.

So why not the clinical trial industry? Yes, change is hard. People are comfortable with existing, proven frameworks. Further, with no clear roadmap for the implementation of risk-based quality management, it is easier to raise objections. Two standbys:

"Our sponsors may not be comfortable with the change."

"The new processes may not satisfy regulatory requirements."

Mostly, these are mere smokescreens. Sponsors will be comfortable when they are educated on the advantages of data literacy. Further, ICH guidelines require at least considering the use of a risk-based approach and, in its absence, a solid, documented rationale for why it was not used. Certainly, technology systems are built with these regulatory requirements in mind; some platforms even streamline regulatory reporting with system-automated audit trails.

The real issue is more personal: technology has outpaced the data literacy of study team members, so they shy away from more advanced, data-driven, risk-based approaches.

Consider the pace of technological innovation and the exponential growth of data sources used in trials. If clinical trial personnel have an average tenure of 10 years but have not been developing new data-centric skill sets in that time, they are woefully behind. It is time to reskill the workforce.

Data literacy must become the new norm

Data literacy is a simple concept: looking at data related to your area of specialty, you can deduce insights, ask appropriate questions, and make clear data-based decisions. Within the clinical trial ecosphere, each role requires a different degree of skill, both with data in general and with the trial data specifically. For instance, clinical research associates only require basic knowledge. They must understand elementary mathematical principles, such as mean, median, and mode; recognize the genesis of their clinical trial data; and be able to interpret simple data visualizations such as graphs and charts.
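As a concrete illustration of that baseline skill set, the summary statistics a clinical research associate is expected to read can be computed in a few lines. The query counts below are invented for the example:

```python
from statistics import mean, median, mode

# Hypothetical data: open data queries logged at each of ten site visits.
queries_per_visit = [2, 3, 3, 5, 8, 3, 4, 2, 3, 21]

print("mean:  ", mean(queries_per_visit))    # 5.4 -- pulled upward by the outlier (21)
print("median:", median(queries_per_visit))  # 3.0 -- the typical visit
print("mode:  ", mode(queries_per_visit))    # 3   -- the most common count
```

The gap between mean and median is itself a small data-literacy lesson: a single outlier visit shifts the average, while the median stays put.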

Central Monitors need a more advanced understanding of mathematical principles and an ability to recognize both the correct interpretation of data and when data are being misinterpreted. Further, they need storytelling skills to communicate data-driven insights to those who are not themselves data literate.

Data Scientists require the most advanced skills. They need an understanding of mathematical concepts such as linear algebra, probability, and distributions; fluency in statistics; proficiency in standard and end-user-configurable visualizations; programming capabilities; and a working knowledge of machine learning algorithms.

At every level, data literacy enables people to make data-based decisions rather than experience- or intuition-based decisions. That is a key competency for risk-based quality management.

True risk-based quality management stems from data analytics

A risk-based quality management strategy sorts risks into categories of high, medium, or low. Technology then allows researchers to quickly surface and carefully monitor those risks that could actually endanger the patients or derail the study, whether or not the risks are initially anticipated.
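A minimal sketch of that sorting step, with hypothetical risk names, scores, and thresholds (real programs would take these from their own risk assessment, for example scoring each risk as impact times likelihood):

```python
def categorize(score: int) -> str:
    """Map a composite risk score (e.g. impact x likelihood) to a category.

    The thresholds here are illustrative, not from any specific RBQM tool.
    """
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"

# Invented risks, each scored as impact (1-5) x likelihood (1-5).
risks = {
    "primary-endpoint data entry": 5 * 4,
    "drug storage temperature":    4 * 2,
    "visit-window compliance":     2 * 2,
}

for name, score in risks.items():
    print(f"{name}: score {score} -> {categorize(score)}")
```

Monitoring effort then follows the categories: high-scoring risks get the closest, most frequent review.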

The Central Monitor is the air traffic controller. Viewing all aggregated data at the study level, site level, and patient level, they are able to identify missing or inconsistent data, data outliers, and data variability. They can spot protocol deviations, systematic errors, and data-integrity issues. They can analyze site characteristics and performance metrics. With this analysis, central monitors can determine the need for further remote or on-site monitoring, carefully targeting investigations to data that signal possible risk. This approach exponentially increases the likelihood of identifying and correcting issues early, when action can have a real impact on both patient safety and trial outcomes. It also eliminates unnecessary (and resource-intensive) cleaning of every piece of data, keeping the focus only on the data that are most relevant.
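One common way to surface such outliers centrally is a simple peer comparison. The sketch below (site names and rates invented) flags any site whose adverse-event reporting rate sits more than two standard deviations from the study mean, a possible under-reporting signal worth a targeted follow-up rather than a visit to every site:

```python
from statistics import mean, stdev

# Hypothetical adverse-event reports per enrolled patient, by site.
ae_rate = {"site-01": 0.42, "site-02": 0.39, "site-03": 0.45,
           "site-04": 0.08, "site-05": 0.41, "site-06": 0.44}

mu = mean(ae_rate.values())
sigma = stdev(ae_rate.values())

# Flag sites more than 2 standard deviations from the study mean.
flagged = [site for site, rate in ae_rate.items()
           if abs(rate - mu) / sigma > 2]

print(flagged)  # -> ['site-04']
```

Real central-monitoring platforms apply the same idea across many metrics at once (query rates, visit timing, data variability), but the principle is this comparison.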

Data Scientists take this analysis one step further, using machine learning and predictive analytics to track trends in comparison to other sites and patients, both similar and dissimilar. This broader analysis pinpoints specific sites that bear closer examination, once again targeting areas of suspected risk and maximizing risk-based quality management.
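The predictive piece need not be exotic. As a toy illustration (all numbers invented), even a least-squares trend line over a site's weekly query backlog can show where the backlog is heading before it becomes a finding:

```python
def linear_trend(ys):
    """Ordinary least-squares slope and intercept for y over x = 0, 1, 2, ..."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    den = sum((x - x_bar) ** 2 for x in range(n))
    slope = num / den
    return slope, y_bar - slope * x_bar

# Hypothetical backlog of open queries at one site, weeks 0 through 4.
weekly_backlog = [4, 6, 9, 11, 15]
slope, intercept = linear_trend(weekly_backlog)
forecast_week_8 = intercept + slope * 8  # extrapolate four weeks ahead

print(f"backlog growing by ~{slope:.1f} queries/week; "
      f"projected week-8 backlog: {forecast_week_8:.0f}")
```

A site whose backlog trends steeply upward while its peers stay flat is exactly the kind of site that bears closer examination.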

To modernize trials, data alone are not enough

Today, clinical researchers are drowning in data. Yet, without data literacy, the numbers mean nothing; it is impossible to determine either the quality or the implications of the data.

Data literacy allows researchers to understand both, an understanding that is crucial to risk-based quality management. Researchers must have a facility with numbers, charts, and graphs. They must have a clear grasp of the data's origin: how the data were collected, cleaned, and analyzed. Crucially, they must be able to identify the most valuable data and to ascertain their true meaning.

Without that ability, clinical trials cannot innovate, iterate, or modernize. It is really that simple.