
Automated Detection of Off-Label Drug Use

Kenneth Jung1*, Paea LePendu3, William S. Chen3, Srinivasan V. Iyer1, Ben Readhead2, Joel T. Dudley2, Nigam H. Shah3

1 Program in Biomedical Informatics, Stanford University, Stanford, California, United States of America, 2 Icahn School of Medicine at Mount Sinai, New York, New York, United States of America, 3 Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States of America

Abstract

Off-label drug use, defined as use of a drug in a manner that deviates from its approved use defined by the drug's FDA label, is problematic because such uses have not been evaluated for safety and efficacy. Studies estimate that 21% of prescriptions are off-label, and only 27% of those have evidence of safety and efficacy. We describe a data-mining approach for systematically identifying off-label usages using features derived from free text clinical notes and features extracted from two databases on known usage (Medi-Span and DrugBank). We trained a highly accurate predictive model that detects novel off-label uses among 1,602 unique drugs and 1,475 unique indications. We validated 403 predicted uses across independent data sources. Finally, we prioritize well-supported novel usages for further investigation on the basis of drug safety and cost.

Citation: Jung K, LePendu P, Chen WS, Iyer SV, Readhead B, et al. (2014) Automated Detection of Off-Label Drug Use. PLoS ONE 9(2): e89324. doi:10.1371/journal.pone.0089324

Editor: Indra Neil Sarkar, University of Vermont, United States of America

Received August 26, 2013; Accepted January 17, 2014; Published February 19, 2014

Copyright: © 2014 Jung et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors acknowledge support from National Institutes of Health grant U54-HG004028 for the National Center for Biomedical Ontology. KJ was supported by the Smith Stanford Graduate Fellowship. WC was funded by the Bio-X Interdisciplinary Initiatives Program (http://biox.stanford.edu/grant/iip_program.html). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: kjung@stanford.edu

Introduction

Off-label drug use occurs when a drug is used in a manner that differs from its approved use as described by its FDA label. This practice is common and provides a pathway for clinical innovation. However, such uses escape the scientific scrutiny that goes into the labeling and marketing of new medicines [1,2]. Estimates from office-based practices found that 21% of prescriptions are off-label [3]. Of these, 73% had little or no scientific support [3,4], raising concerns about patient safety and costs to the healthcare system. For instance, tiagabine was approved for use as an adjunctive therapy for partial epilepsies. However, when used as the sole or primary treatment, it was found to cause seizures. In 1998, 20% of uses of tiagabine were off-label, but by 2004 this fraction had increased to 94% [5].

Off-label use is to some extent inevitable because not every condition can be tested during pre-approval [6,7]. Nevertheless, all stakeholders in the healthcare system have an interest in the timely, systematic detection of off-label use. Drug manufacturers are required to report on off-label use observed in post-marketing surveillance in the European Union [8]. Regulatory agencies and clinical researchers can use knowledge of emerging off-label uses to identify potential benefits or risks that require further investigation. Furthermore, patients and their healthcare providers should minimize exposure to risks without clinical benefit. Unfortunately, current pharmacovigilance and post-market surveillance efforts in the United States do not monitor off-label use. Standard surveillance approaches using the FDA's Adverse Event Reporting System (FAERS) do not specifically account for use in off-label indications; efforts such as the Observational Medical Outcomes Partnership (OMOP) and the Mini-Sentinel projects do not specifically look at off-label use [9]; and physician surveys, such as the NDTI, are limited by coverage, timeliness and cost.

In this work, we focus on the problem of automatically discovering off-label uses of drugs (defined as the use of drugs for unapproved indications) from electronic health records and ranking the newly discovered uses for follow-up based on risk and cost metrics. At its core, this requires matching drugs to the diseases they are being used to treat. We refer to such matches as drug-indication usage pairs, and say that a used-to-treat relationship exists between the drug and disease (the indication).

Previous work by Wei et al [10] used structured and semi-structured data from RxNorm, MedlinePlus, SIDER 2, and Wikipedia to compile a comprehensive list of drug-indication usage pairs. Similarly, Xu et al [11] used data from ClinicalTrials.gov and Medline to compile such a list. However, both these efforts rely on curated data sources that may not reflect current clinical practice. In contrast, the data in electronic health records represent current clinical practice and can reveal such usages before they are incorporated into curated data sources.

Thus, widespread adoption of electronic medical records (EMR) provides an opportunity to detect off-label use in an automated, scalable and timely manner [8]. However, structured data in EMRs usually do not explicitly link diseases to the drugs being used to treat them [2] and are not as comprehensive as the free text of clinical notes [12]. Therefore, Natural Language Processing (NLP) is often used to extract used-to-treat relationships between drugs and indications from clinical text. Previous efforts use one of two approaches. The first approach identifies used-to-treat relationships at the level of specific occurrences of drugs and indications in text. For example, from the phrase "on Plavix for PAD", a used-to-treat relationship between clopidogrel and peripheral artery disease is detected. Submissions to the 2010 i2b2 NLP Challenge [13] represent the state of the art of this approach. The best performing methods require examples of text in which occurrences of drugs, indications and the relationships between them are explicitly labeled [14]. Such labeled training data is difficult to obtain (the i2b2 Challenge included 871 labeled notes) and collections of labeled text covering all drugs and indications are not available.

PLOS ONE | www.plosone.org | February 2014 | Volume 9 | Issue 2 | e89324

To overcome this limitation, an alternative approach is to infer used-to-treat relationships at the population level: rather than asking whether a sentence or note implies an instance of a used-to-treat relationship, we ask whether the data as a whole suggests that a used-to-treat relationship holds in general [15–17]. The basic idea is to count the number of times a drug and indication are mentioned in the same clinical record, and compare that count to the number of co-mentions expected by chance. We have previously used such an approach for detecting drug-related adverse events [18], identifying drug-drug interactions [19], and profiling drug usages [17]. Such approaches can use relatively simple methods for detecting drug and indication mentions in free text that do not require labeled text corpora for training. As a result, such approaches scale to very large collections of clinical text and the entire range of drugs and indications encountered in the data. In Jung et al [20], we demonstrated that it is possible to detect off-label usage using inputs derived from clinical text, combined with prior knowledge of drugs and indications from Medi-Span and DrugBank. Other researchers [21] have also used prior knowledge of known usages to match drugs and known indication mentions in clinical notes, demonstrating that use of prior knowledge does improve the accuracy of detecting used-to-treat relationships.
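The population-level idea described above (count co-mentions of a drug and an indication, then compare with the count expected by chance) can be illustrated with a minimal Python sketch. This is not the authors' feature computation: the function name `comention_enrichment`, the representation of each note as a set of already-recognized terms, and the toy notes are all assumptions made for illustration, and "expected by chance" is taken here to mean independence of the two mentions.

```python
from typing import Dict, Iterable, Set


def comention_enrichment(notes: Iterable[Set[str]], drug: str, indication: str) -> Dict[str, float]:
    """Compare observed drug/indication co-mentions with the count expected by chance."""
    n_notes = n_drug = n_ind = n_both = 0
    for terms in notes:
        n_notes += 1
        has_drug = drug in terms
        has_ind = indication in terms
        n_drug += has_drug
        n_ind += has_ind
        n_both += has_drug and has_ind
    # Assumption: under independence, P(both) = P(drug) * P(indication),
    # so the expected co-mention count is n_drug * n_ind / n_notes.
    expected = n_drug * n_ind / n_notes if n_notes else 0.0
    ratio = n_both / expected if expected else float("nan")
    return {"observed": n_both, "expected": expected, "ratio": ratio}


# Hypothetical notes, each reduced to the set of terms recognized in it.
notes = [
    {"clopidogrel", "peripheral artery disease"},
    {"clopidogrel", "peripheral artery disease", "hypertension"},
    {"clopidogrel", "hypertension"},
    {"metformin", "diabetes"},
    {"metformin", "hypertension"},
    {"peripheral artery disease"},
    {"hypertension"},
    {"diabetes"},
]
result = comention_enrichment(notes, "clopidogrel", "peripheral artery disease")
print(result)
```

In this toy data the pair is co-mentioned in 2 notes against 1.125 expected, a ratio above 1. A ratio of this kind is only one possible summary; on real data a statistical test or a smoothed estimate would likely be preferable for rare terms.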

In this paper, we build on our previous work. First, we have improved the accuracy of the classifier by taking known usage into account when counting co-mentions of drugs and indications in the clinical notes in order to reduce spurious associations arising from co-morbidities. Second, we have filtered the set of predicted novel off-label usages for support in independent, complementary data sources. We also filtered out spurious associations due to causal relationships using the SIDER 2 database [22]. Finally, in order to triage the off-label uses for follow-up, we developed indices of drug cost and risk associated with a drug's usage based on the unit price and known adverse events of drugs. These indices were used to rank off-label usages by the risk that they present to patients, along with their monetary cost. High cost and high risk usages are natural candidates for further investigation as they represent expensive and potentially dangerous cases. In contrast, low cost and low risk usages could be potential expanded indications. Our methods do not require labeled training text, and thus combine the scalability of association-based approaches with the discriminative power of machine learning techniques.

Results

We trained an SVM classifier to recognize used-to-treat relationships between drugs and indications and applied the classifier to all possible drug-indication pairs. Filtering for high prediction confidence yielded 14,174 high confidence used-to-treat relationships. We then removed known usages listed in two curated sources of known usage, Medi-Span and the National Drug File – Reference Terminology (NDF-RT) [23], leaving 6,142 predictions that could be novel off-label usages. We assessed support for the putative novel off-label uses in independent and complementary data sources including the FDA's Adverse Event Reporting System (FAERS) and MEDLINE. When possible, we also assessed the biological plausibility of these usages using publicly available gene expression data [24]. We reduced spurious results arising from drug adverse events by filtering these usages using SIDER 2, yielding a final set of 403 well-supported novel off-label usages. Overall, we tested 1,602 unique drugs and 1,475 unique indications, resulting in 403 well-supported novel off-label usages that we prioritized by their potential risks and cost. The overall approach and results are summarized in Figure 1.

A classifier for detecting used-to-treat relationships

Classifiers such as support vector machines map inputs, or features, to outputs. In this study, the inputs come from clinical text and domain knowledge about drugs from Medi-Span and DrugBank. Medi-Span encodes information about known usages, while DrugBank encodes information about drug targets and mechanisms of action. For each drug-indication pair, we construct a set of features that the classifier uses to predict whether a used-to-treat relationship holds between the drug and indication. The classifier learns to make accurate predictions using inputs for which we know the desired output, i.e., positive or negative examples of known usages [25]. We constructed such a gold standard dataset of known usages from the Medi-Span Drug Indications Database (Wolters Kluwer Health, Indianapolis, IN) as positive examples, along with negative examples constructed as detailed in Methods. An SVM classifier was trained on a random subset (80%) of the gold standard and achieved a positive predictive value of 0.963, specificity of 0.993, sensitivity of 0.764 and F1 score of 0.852 on the remaining 20% of the gold standard (see Figure 2). Feature ablation experiments showed that each group of features contributed to overall performance, particularly with respect to sensitivity and positive predictive value (Table 1). Individually, the features learned from clinical notes in the Stanford Translational Research Integrated Data Environment (STRIDE) and Medi-Span yielded sensitivities of 0.681 and 0.662 respectively, while all features together resulted in a sensitivity of 0.764.

In identifying population level associations, drugs and diseases may also get associated because of causal relationships (i.e., the drug is causing the disease, as an adverse drug event) or indirect relationships (i.e., the disease is a common co-morbidity of an approved indication) rather than used-to-treat relationships. We count co-mentions of drugs and indications taking known indications into account, and as a result, obtain substantially better performance than previous methods that ignore known indications [20]. Similarly, the PPV achieved using all features was 0.963, substantially better than the 0.936 achieved using only features derived from STRIDE and consistent with the hypothesis that prior knowledge is able to reduce spurious results arising from causal and indirect relationships [21].

Predicting novel off-label usages

We applied an SVM trained on the entire gold standard dataset to all 2,362,950 possible drug-disease pairs to find used-to-treat relationships. SVMs do not output class membership probabilities; thus we fit a logistic regression model to the output of the SVM to estimate the probability of the used-to-treat relationship being true for a given drug-disease pair [26]. Applying a cut-off of 0.99 to this estimate yielded 14,174 high confidence used-to-treat relationships, which we interpret as potential drug-indication usage pairs. After filtering out known usages listed in Medi-Span and the National Drug File – Reference Terminology (NDF-RT) [23], we removed usages in which the predicted indication is closely related to already known indications as described in Methods, resulting in 6,142 high confidence novel usages. Because approved usages are presumably known, these are interpreted to be high confidence novel off-label usages.


Figure 1. Overview of methods and results. For each of the 2,362,950 possible drug-indication pairs, we calculated 9 empirical features (e.g., co-mention count) from the free text of clinical notes in STRIDE and 16 domain knowledge features (e.g., similarity in known usage to other drugs used to treat the indication) from Medi-Span and DrugBank. These features were used by an SVM classifier trained on a gold standard dataset to recognize the used-to-treat relationship, yielding a set of predictions that were filtered for known usages, near misses in the indications, and support in two independent and complementary datasets (FAERS and MEDLINE). Predicted usages that appeared to be drug adverse events listed in SIDER 2 were removed. The resulting set of 403 well-supported novel off-label usages was binned using indices of risk and cost. doi:10.1371/journal.pone.0089324.g001
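The Results describe turning SVM outputs into probabilities by fitting a logistic regression to them and keeping pairs whose estimated probability is at least 0.99. The sketch below illustrates that step in plain Python under stated assumptions: it fits a two-parameter sigmoid by simple gradient descent, which is a simplified stand-in for the calibration method cited as [26] and not the authors' implementation. The decision values, labels, pair names, learning rate and epoch count are all hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_sigmoid(scores: Sequence[float], labels: Sequence[int],
                lr: float = 0.1, epochs: int = 5000) -> Tuple[float, float]:
    """Fit p = sigmoid(a * score + b) to 0/1 labels by gradient descent on log-loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def to_probability(score: float, a: float, b: float) -> float:
    """Map a raw SVM decision value to an estimated probability."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Hypothetical SVM decision values for labeled pairs (1 = known usage, 0 = negative example).
scores = [-2.5, -1.8, -1.2, -0.6, -0.3, 0.2, 0.4, 0.9, 1.5, 2.2, 3.0]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
a, b = fit_sigmoid(scores, labels)

CUTOFF = 0.99  # the cut-off reported in the text
# Hypothetical decision values for unlabeled drug-indication pairs.
new_scores = {"pair_a": 4.5, "pair_b": 0.1, "pair_c": -1.0}
high_confidence = [name for name, s in new_scores.items()
                   if to_probability(s, a, b) >= CUTOFF]
print(a, b, high_confidence)
```

A strict cut-off such as 0.99 trades sensitivity for precision: only pairs far on the positive side of the decision boundary survive, which suits a setting where each surviving prediction is examined further.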

Support in FAERS, MEDLINE and SIDER 2

The 6,142 high confidence novel off-label usages were examined for positive support in two independent and complementary data sources (FAERS and MEDLINE) and for negative support in SIDER 2 as described in Methods. FAERS case reports explicitly link indications and the drugs used to treat them [27]. These reports are created by patients, healthcare providers and drug manufacturers, and directly reflect clinical practice. In contrast, MEDLINE provides curated annotations of the biomedical literature with terms from the National Library of Medicine's Medical Subject Headings (MeSH) vocabulary. We found that 766 novel off-label usages are supported by at least 10 records in FAERS, and 537 of those are also supported by at least two articles co-annotated with the drug and indication in MEDLINE [28]. We then filtered out usages that appeared to be bona fide drug adverse events listed in SIDER 2 in order to eliminate drug-disease pairs that are actually drug-adverse event relationships, leaving us with 466 candidate novel off-label usages. We manually examined these to filter out known usages that were missed in Medi-Span and the NDF-RT, leaving us with 403 well-supported novel off-label usages.

Figure 2. Training and testing a classifier to recognize used-to-treat relationships. We created a gold standard of positive and negative examples of known drug usage. Positive examples were taken from Medi-Span. We created negative examples by randomly selecting positive examples and then randomly choosing a drug and indication with roughly the same frequency of mentions in STRIDE as the real usage. These were then checked against Medi-Span to filter out inadvertently generated known usages. The gold standard dataset contained 4 negative examples for each positive case. For each drug-indication pair in the gold standard, we calculated features summarizing the pattern of mentions of the drugs and indications in 9.5 million clinical notes from STRIDE. We used Medi-Span and DrugBank to calculate features summarizing domain knowledge about drugs and their usages. 80% of the gold standard was used to train an SVM classifier, and the resulting model was tested on the remaining 20%. doi:10.1371/journal.pone.0089324.g002

Table 1. Performance of classifier on hold-out test set using different feature sets.

Feature Set            PPV     Specificity   Sensitivity   F1
Naïve STRIDE only      0.771   0.964         0.483         0.594
STRIDE only            0.936   0.988         0.681         0.788
Medi-Span only         0.945   0.990         0.662         0.778
DrugBank only          0.831   0.981         0.377         0.518
STRIDE + Medi-Span     0.967   0.994         0.744         0.841
STRIDE + DrugBank      0.936   0.988         0.697         0.799
All                    0.963   0.993         0.764         0.852

We performed feature ablation experiments to assess the contribution of different feature sets to the performance of the classifier for detecting used-to-treat relationships. The first column indicates the features used to train and test the classifiers. Classifier performance was evaluated in a hold-out test set of 1,749 positive and 7,035 negative examples of drug usage after training in a set of 7,112 positive and 27,938 negative examples. The first row shows performance using STRIDE derived features in which co-mentions are counted without regard to known indications present in the clinical record. doi:10.1371/journal.pone.0089324.t001
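The performance measures reported for the classifier (PPV, specificity, sensitivity and F1) all derive from the four cells of a confusion matrix, and F1 is the harmonic mean of PPV and sensitivity. The short Python sketch below shows the standard definitions; the function names and the confusion-matrix counts are hypothetical and chosen only for illustration, while the last lines use the PPV and sensitivity reported in the text for the full feature set.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts."""
    ppv = tp / (tp + fp)            # positive predictive value (precision)
    sensitivity = tp / (tp + fn)    # recall on positive examples
    specificity = tn / (tn + fp)    # recall on negative examples
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"ppv": ppv, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}


def f1_from(ppv: float, sensitivity: float) -> float:
    """F1 as the harmonic mean of PPV and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=380, fn=20)
print(metrics)

# PPV and sensitivity reported in the text for the classifier using all features.
f1_all = f1_from(0.963, 0.764)
print(round(f1_all, 3))
```

With the reported PPV of 0.963 and sensitivity of 0.764, the harmonic mean evaluates to about 0.852, matching the reported F1 score.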

These usages (Table S1) cover 210 drugs and 184 indications, and recapitulate previously noted patterns of off-label usage (Figure 3). Medical specialties such as oncology have been noted to have high rates of off-label usage [29,30]. Consistent with this observation, there are many cancer drugs among our results, e.g., ofatumumab for non-Hodgkin's lymphoma [31] and fludarabine for chronic myelogenous leukemia [32]. Other previously noted usage patterns include the use of anti-seizure medications such as pregabalin and lamotrigine for migraines [33,34], and the use of immuno-modulators such as etanercept and adalimumab, two Tumor Necrosis Factor (TNF) inhibitors, for systemic lupus erythematosus (SLE) [35,36]. Interestingly, etanercept and infliximab, another TNF inhibitor, have both been investigated as treatments for SLE [37], lending support to the classifier's prediction. However, etanercept and adalimumab have also been implicated in causing SLE [38,39]. Thus, in this case both the used-to-treat and causal relationships may be true.

Figure 3. Distribution of indication classes in predicted novel usages. Each indication for the 403 high confidence novel usages with support in FAERS and MEDLINE was mapped to the first level of the NDF-RT disease hierarchy. 63 usages were not mapped to NDF-RT and were left out of this chart. doi:10.1371/journal.pone.0089324.g003

Plausibility based on mechanisms of action

We also evaluated the plausibility of the novel, predicted off-label usages using previously published methods [24] applied to gene expression data from the Connectivity Map [40] and NCBI Gene Expression Omnibus [41]. Briefly, if a drug modulates gene expression in the opposite manner to a disease condition, the drug is considered a plausible treatment for the indication. This approach requires gene expression data for both drug exposure and the disease condition. Of our well-supported novel usages, two had appropriate publicly available data and both yielded significant gene sets suggesting possible mechanisms of action (Table S2). Given sparse coverage of drugs and diseases in public data, it is difficult to apply this process systematically. Nevertheless, this method yielded testable hypotheses regarding mechanisms of action. For instance, simvastatin is linked to diabetes by PPAR-gamma; simvastatin treatment enriches a gene set known to be activated by PPAR-gamma activity, while PPAR-gamma agonists, e.g., thiazolidinediones, are known to be used to treat diabetes [42,43].

Manual validation of the predicted usages

Examination of the 403 well-supported novel off-label usages revealed terminological challenges. For instance, we predict that alendronic acid is used to treat osteopenia, the clinical precursor to osteoporosis. However, Medi-Span and the NDF-RT list the indication as osteoporosis instead of osteopenia, i.e., they encode the used-to-prevent relationship. Such issues reflect challenges in normalizing medical terms. As a result, although we can detect used-to-treat relationships quite well, recognizing whether or not uses are already known is difficult.

Some predicted uses represent bona fide new uses confirmed in the biomedical literature by case reports, clinical trials, or resources such as MedlinePlus, but not yet incorporated in our curated sources of known usage (see Table 2 for selected examples). For instance, our system predicts that bevacizumab is used to treat ovarian cancer. This usage has been shown to improve progression free survival in a phase III trial [44] and has been approved in the EU, but does not yet appear in Medi-Span, DrugBank, the NDF-RT or MedlinePlus. These results show that it is possible to detect emerging off-label use before it has been officially recognized.


Prioritizing predicted off-label usages for further investigation

We designed indices of drug risk and cost using adverse event associations and unit cost data from Medi-Span to objectively triage usages for further investigation. The drug risk index is
