Com­pre­hen­sive char­ac­ter­i­za­tion of plas­mid and AAV gene ther­a­py prod­ucts with Forge Bi­o­log­ics’ hy­brid se­quenc­ing ap­proach

Re­com­bi­nant ade­no-as­so­ci­at­ed vi­ral (rAAV) vec­tors re­main at the fore­front of the gene ther­a­py rev­o­lu­tion, of­fer­ing hope for treat­ing pa­tients with ge­net­ic dis­or­ders. De­vel­op­ing as­say pan­els for rAAV drug prod­ucts to eval­u­ate their po­ten­cy, pu­ri­ty, and iden­ti­ty re­li­ably is es­sen­tial to ac­cel­er­at­ing the avail­abil­i­ty of these ther­a­peu­tics to pa­tient com­mu­ni­ties. Pre­cise se­quence iden­ti­ty test­ing of the plas­mid start­ing ma­te­ri­als used in rAAV man­u­fac­tur­ing or the ge­net­ic ma­te­r­i­al en­cap­si­dat­ed with­in an rAAV vec­tor can be ac­com­plished with a mul­ti­fac­eted se­quenc­ing ap­proach, such as that im­ple­ment­ed at Forge Bi­o­log­ics.

Historically, the gold standard for DNA sequencing has been chain-termination sequencing (also known as Sanger sequencing). While this method can sequence moderately long (<1 kb) DNA fragments with reasonable accuracy, it requires a priori knowledge of the targets in order to sequence heterogeneous populations. Adapting sequencing methods for gene therapy products therefore requires alternative approaches that allow comprehensive, high-quality characterization of viral vectors and plasmid DNA while maintaining high throughput.

Forge has developed a state-of-the-art hybrid sequencing approach for comprehensive analysis of rAAV vectors and plasmid DNA, overcoming typical platform limitations such as gaps in coverage due to high GC content or low-complexity sequences. Interpreting sequencing data as a quality measure for raw materials and gene therapy products requires specialized analysis. Forge’s in-house analytical team provides a multi-step bioinformatics analysis for each gene therapy product, ensuring that sequencing reads are correctly assigned to their source genomes or sequences with a high degree of confidence. The developed methods allow for efficient sequencing and analysis, cutting the time to results by 75% compared with traditional outsourced sequencing assays.

Lim­i­ta­tions of cur­rent se­quenc­ing tech­nolo­gies in cell and gene ther­a­py

Ad­vances in se­quenc­ing method­ol­o­gy over the past decade have al­lowed for ap­pli­ca­tions of ge­nom­ic char­ac­ter­i­za­tion in cell and gene ther­a­py de­vel­op­ment, man­u­fac­tur­ing, and prod­uct re­lease test­ing. Specif­i­cal­ly, se­quence iden­ti­ty test­ing is now of­ten per­formed as part of char­ac­ter­i­za­tion test­ing for both rAAV vec­tors and high-qual­i­ty plas­mid DNA used as raw ma­te­ri­als for gene ther­a­py man­u­fac­tur­ing.

When characterizing rAAV vectors and plasmid DNA, both short- and long-read sequencing methods offer unique advantages and limitations. Short-read sequencing, also known as next-generation sequencing (NGS), involves reading short fragments of DNA, typically around 100 to 600 base pairs in length. This method is known for its high coverage depth, low error rates, and relatively low cost per base. As a result, short-read sequencing is suitable for identifying single nucleotide variants and small insertions or deletions in the rAAV or plasmid genome. The high coverage depth of this method also makes it helpful for detecting low-frequency variants in AAV or plasmid populations. However, one of the main limitations of short-read sequencing is the difficulty of resolving repetitive regions, such as inverted terminal repeats (ITRs), and complex structural variations, because the reads are too short to span them.
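To make the link between coverage depth and low-frequency variant detection concrete, the following Python sketch computes the probability of seeing a variant in at least a minimum number of reads under a simple binomial model. The function name, the threshold of five supporting reads, and the example depths are illustrative assumptions; the model ignores sequencing error and alignment bias, so it should be read as an upper bound rather than a validated detection limit.

```python
from math import comb


def detection_probability(depth: int, variant_fraction: float, min_reads: int = 5) -> float:
    """Probability of observing at least `min_reads` variant-supporting reads.

    Assumes each of the `depth` reads covering a position carries the variant
    independently with probability `variant_fraction` (a binomial model).
    `min_reads` is a hypothetical calling threshold, not a published cutoff.
    """
    # Probability of seeing fewer than min_reads supporting reads.
    p_fewer = sum(
        comb(depth, k) * variant_fraction**k * (1 - variant_fraction) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Example: a variant present in 1% of molecules at three coverage depths.
    for depth in (100, 1_000, 10_000):
        print(depth, round(detection_probability(depth, 0.01), 4))
```

Under these assumptions, the detection probability rises steeply with depth, which is why deep short-read coverage is useful for rare variants.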

On the other hand, long-read sequencing, also known as third-generation sequencing, enables the sequencing of a single molecule of DNA, often thousands to tens of thousands of base pairs in length. These modern platforms can overcome limitations found in other sequencing methods, such as short reads, reading errors, or artifacts introduced during fragmentation, amplification, or bench work for sample preparation. Long-read sequencing is beneficial for resolving complex genomic regions, including repetitive sequences or structural variations. It can facilitate the readout of full-length rAAV genomes, including ITRs and flanking sequences [1,2]. However, long-read sequencing methods may have higher error rates than short-read sequencing, and the cost per base can be higher.

Gene therapy developers and manufacturers often choose between the two sequencing methods by balancing factors such as accuracy, cost, read length, the complexity of the genomic regions to be sequenced, and the material to be sequenced (e.g., plasmid DNA vs. rAAV).

Comprehensive characterization of rAAV vectors through a hybrid short- and long-read sequencing approach

Em­ploy­ing se­quenc­ing-based ap­proach­es for rAAV vec­tor char­ac­ter­i­za­tion can pro­vide valu­able in­sights dur­ing process de­vel­op­ment and in­to clin­i­cal and com­mer­cial man­u­fac­tur­ing. Com­plete and ac­cu­rate se­quenc­ing can en­hance vec­tor de­sign, un­cov­er pre­vi­ous­ly hid­den pack­ag­ing chal­lenges, and pro­vide cru­cial qual­i­ty and safe­ty da­ta as part of rAAV re­lease test­ing.

Forge’s optimized hybrid short- and long-read DNA sequencing approach leverages the strengths of each platform to provide comprehensive and highly accurate sequencing data for gene therapy products (Figure 1A). Utilizing both short- and long-read sequencing provides orthogonal validation of sequencing data, allowing for consistent coverage throughout the genome, even in challenging regions (Figure 1B). While short reads can capture areas of genetic variation, long-read sequencing allows for the detection of complex structural variants, such as inversions, duplications, deletions, or translocations. In addition, short-read sequencing is limited in its ability to fully characterize the rAAV genome, whose identity is a critical quality attribute of the gene therapy product. Conversely, long-read sequencing enables quantitative and qualitative characterization of full, partial, and empty rAAV genomes (Figure 1C). In fact, empties always have some genomic content: long-read sequencing has revealed that capsids previously thought to be empty contain short, ITR-bearing DNA fragments.

Fig­ure 1. An in­no­v­a­tive hy­brid short- and long-read se­quenc­ing ap­proach for rAAV char­ac­ter­i­za­tion. A) Graph­i­cal schemat­ic of Forge Bi­o­log­ics’ hy­brid short- and long-read ap­proach to rAAV se­quenc­ing. B) A graph­i­cal rep­re­sen­ta­tion of se­quenc­ing cov­er­age of an rAAV prod­uct. The top track cor­re­sponds to the GC con­tent of the vec­tor, where­as the bot­tom four tracks rep­re­sent three short-read se­quenc­ing ap­proach­es and long-read se­quenc­ing of the vec­tor. C) Vi­su­al­iza­tions of the dif­fer­ences be­tween “par­tial” and “full” bands of an rAAV prod­uct, with the par­tial band ex­hibit­ing a high pro­por­tion of short, ITR-ad­ja­cent frag­ments.
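As a rough illustration of how long-read data can separate full-length from partial genomes, the sketch below bins read lengths relative to an expected ITR-to-ITR genome length. The 4,700-base expected length, the 90% full-length cutoff, the 1,000-base short-fragment cutoff, and the category names are all hypothetical choices for illustration, not Forge's actual criteria.

```python
from collections import Counter


def classify_read(length: int, expected_len: int,
                  full_fraction: float = 0.9, short_cutoff: int = 1000) -> str:
    """Assign one read to a size category (all thresholds are illustrative)."""
    if length >= full_fraction * expected_len:
        return "full"
    if length < short_cutoff:
        return "short_fragment"
    return "partial"


def summarize(read_lengths, expected_len: int) -> dict:
    """Return the percentage of reads falling into each size category."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    counts = Counter(classify_read(n, expected_len) for n in read_lengths)
    total = len(read_lengths)
    return {cat: 100.0 * counts[cat] / total
            for cat in ("full", "partial", "short_fragment")}


if __name__ == "__main__":
    # Toy read lengths standing in for a long-read run of one rAAV sample.
    lengths = [4700, 4600, 2500, 300]
    print(summarize(lengths, expected_len=4700))
```

A real analysis would also use alignment coordinates, for example to check whether short fragments are ITR-adjacent, rather than length alone.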

In addition to confirming genome sequence identity of rAAV vectors, a hybrid short- and long-read sequencing method can screen for non-target sequences (process-related impurities), such as residual plasmid DNA and residual mammalian host cell DNA (Figure 2A & 2B). Sequence analysis of rAAV vectors can further ensure that oncogenic element sequences (e.g., E1A/E1B and SV40 large tumor antigen) are not present in the product (Figure 2B). Forge’s hybrid sequencing approach has high sensitivity, with the ability to detect residual contaminants down to 0.01% in the final product (as demonstrated by spike-in experiments, Figure 2C).

Fig­ure 2. Analy­sis of im­pu­ri­ties in rAAV prod­ucts us­ing a se­quenc­ing-based ap­proach. A) Num­ber and length of reads in rAAV sam­ple align­ing with rAAV vec­tor genome (red), resid­ual plas­mid (or­ange), or oth­er im­pu­ri­ties (blue). B) A rep­re­sen­ta­tion of the genome com­po­si­tion of an rAAV prod­uct lot, quan­ti­fy­ing the per­cent­age of vec­tor genomes and resid­ual im­pu­ri­ties in the sam­ple. C) A spike-in ex­per­i­ment us­ing a main plas­mid and a set of six “con­t­a­m­i­nant” vec­tors, show­ing de­tec­tion down to < 0.01%.
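The composition analysis and sensitivity figure described above can be sketched in a few lines of Python. The example assumes each read has already been assigned a source label by an upstream alignment step; the labels, function names, and the 95% confidence level are hypothetical. The second function estimates how many reads would be needed to see at least one read from a contaminant present at a given fraction, under a simple independent-sampling model.

```python
from collections import Counter
from math import ceil, log, log1p


def composition_percent(assignments) -> dict:
    """Percentage of reads assigned to each source label."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read assignments provided")
    return {label: 100.0 * n / total for label, n in counts.items()}


def reads_for_detection(fraction: float, confidence: float = 0.95) -> int:
    """Smallest read count N such that 1 - (1 - fraction)**N >= confidence.

    Assumes reads are sampled independently and that a single contaminant
    read counts as detection (an optimistic, illustrative criterion).
    """
    return ceil(log(1.0 - confidence) / log1p(-fraction))


if __name__ == "__main__":
    # Toy labels standing in for per-read alignment results.
    labels = ["vector"] * 98 + ["residual_plasmid"] + ["host_cell_dna"]
    print(composition_percent(labels))
    # Reads needed to expect a contaminant at 0.01% (fraction 1e-4).
    print(reads_for_detection(1e-4))
```

In practice a detection call would require more than one supporting read and a background-noise estimate, so the real read requirement would be higher than this lower bound.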

In-depth char­ac­ter­i­za­tion of plas­mid DNA by long-read se­quenc­ing

Production of high-quality rAAV starts with utilizing high-quality plasmid DNA for triple transfection during the upstream AAV manufacturing process. Ensuring sequence identity is a critical aspect of plasmid DNA characterization and quality control. The high GC content and palindromic nature of ITRs make gene of interest (GOI) plasmids susceptible to mutations in these regions during plasmid production. Mutations in the ITR region can impact rAAV production [3]. Therefore, it is imperative to confirm the sequence of ITR regions in GOI plasmids prior to use in upstream AAV manufacturing.

Forge Bi­o­log­ics em­ploys a long-read se­quenc­ing-based analy­sis pipeline to rapid­ly char­ac­ter­ize GMP-Path­way and GMP grade plas­mids pro­duced us­ing Forge’s plas­mid man­u­fac­tur­ing ser­vices (Fig­ure 3A).

Incorporating sequencing-based analytics into multiple steps of plasmid manufacturing workflows ensures that only high-quality and high-purity materials are used for downstream applications (e.g., rAAV manufacturing). A particular challenge in manufacturing ITR-containing plasmids is the possibility of spontaneous generation of sub-populations of bacteria harboring truncated ITRs (Figure 3B). From the initial generation of bacterial master cell banks, long-read sequencing checkpoints ensure even coverage and monitor ITR integrity and stability throughout culture and plasmid production, with the ability to detect the presence of any sub-populations containing truncated ITRs (Figure 3C-D).

Figure 3. Long-read sequencing of high-quality DNA ensures the absence of sub-populations in bacterial cell banks. A) Schematic of plasmid DNA preparation and sequencing workflow. B) Graphical representation of spontaneous generation of bacterial sub-populations containing truncated ITRs. The GOI plasmid contains intact ITR regions (blue) at the initial timepoint (T1) post-transformation into a bacterial host (e.g., E. coli). As the bacteria are passaged (T2-T3), sub-populations may acquire spontaneous mutations resulting in a truncated ITR region (red) in the GOI plasmid. The proportion of the bacterial population harboring a truncated ITR may increase across passages (T4). C) Visual representation of a plasmid across four timepoints. A truncated sub-population can be observed growing as timepoints advance, with the graphical representations showing the coverage across the plasmid genome, the observed truncation percent, and a histogram of read lengths showing a secondary peak appearing, corresponding to the truncated (shorter) sub-population. D) Graphical representation of coverage across a plasmid genome, showing even coverage across most of the genome, with a noted deletion in the 5’ ITR sequence.
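The truncation percent tracked across timepoints can be approximated from read lengths alone, as in the minimal Python sketch below. It assumes every read is a full pass over a linearized plasmid, so that a read shorter than the reference by some minimum amount indicates a deletion. The 6,000-base plasmid length, the 50-base minimum deletion, and the toy timepoint data are hypothetical; fragmented reads would be miscounted as truncations under this simplification.

```python
def truncation_percent(read_lengths, plasmid_len: int, min_deletion: int = 50) -> float:
    """Percent of reads shorter than the reference by at least `min_deletion` bases."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    truncated = sum(1 for n in read_lengths if plasmid_len - n >= min_deletion)
    return 100.0 * truncated / len(read_lengths)


if __name__ == "__main__":
    plasmid_len = 6000  # hypothetical reference length in bases
    # Toy data: a sub-population with a 130-base deletion expands over passages.
    timepoints = {
        "T1": [6000] * 100,
        "T2": [6000] * 98 + [5870] * 2,
        "T3": [6000] * 90 + [5870] * 10,
        "T4": [6000] * 70 + [5870] * 30,
    }
    for name, lengths in timepoints.items():
        print(name, truncation_percent(lengths, plasmid_len))
```

A rising value across passages corresponds to the secondary peak in the read-length histogram described in the caption; confirming that the deletion lies in an ITR would additionally require alignment to the reference.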

Forge Biologics’ state-of-the-art hybrid sequencing approach for plasmid and rAAV gene therapy products

Currently available sequencing solutions have required significant development and optimization to be compatible with gene therapy products. However, proper sequencing of plasmid starting materials and rAAV products is critical for maintaining product quality and safety. Forge’s hybrid short- and long-read sequencing approach allows for adaptable, reliable, and comprehensive characterization of products, including the detection of residual impurities with high sensitivity (Figure 4). The orthogonal validation that Forge’s hybrid short- and long-read genome sequencing workflow provides inherently increases confidence in the safety and quality of gene therapy products.

Fig­ure 4. Overview of se­quenc­ing-based meth­ods for plas­mid and rAAV gene ther­a­py prod­ucts


1. Tai PWL, Xie J, Fong K, Seetin M, Hein­er C, Su Q, Weiand M, Wilmot D, Zapp ML, Gao G. Ade­no-as­so­ci­at­ed Virus Genome Pop­u­la­tion Se­quenc­ing Achieves Full Vec­tor Genome Res­o­lu­tion and Re­veals Hu­man-Vec­tor Chimeras. Mol Ther Meth­ods Clin Dev. 2018 Feb 13;9:130-141. doi: 10.1016/j.omtm.2018.02.002. PMID: 29766023; PM­CID: PMC5948225.

2. Namkung S, Tran NT, Manokaran S, He R, Su Q, Xie J, Gao G, Tai PWL. Di­rect ITR-to-ITR Nanopore Se­quenc­ing of AAV Vec­tor Genomes. Hum Gene Ther. 2022 Nov;33(21-22):1187-1196. doi: 10.1089/hum.2022.143. PMID: 36178359; PM­CID: PMC9700346.

3. As­sad W, Vo­los P, Mak­si­mov D, Khav­ina E, De­vi­atkin A, Mityae­va O, Volchkov P. AAV genome mod­i­fi­ca­tion for ef­fi­cient AAV pro­duc­tion. He­liy­on. 2023 Apr 1;9(4):e15071. doi: 10.1016/j.he­liy­on.2023.e15071. PMID: 37095911; PM­CID: PMC10121408.

Con­tribut­ing Au­thor: Es­ko Kaut­to, Ph.D., Se­nior Sci­en­tist, An­a­lyt­i­cal De­vel­op­ment, Forge Bi­o­log­ics


Rachael Hardison, Ph.D.

Senior Manager, Technical Sales & Scientific Advisory, Forge Biologics