more potential knowledge which must then be validated.
Evidence Building and Validation - evidence comes from a series of weak
approaches, which are combined to give a rating that can be used to validate
knowledge. Firstly, additional redundant sources of the same data are
identified via a simple search, e.g. Google finding relevant URLs. This
redundancy by itself is no longer considered valid evidence, as the reliability
of the sources themselves must also be taken into account; e.g. a source with
lots of potential academics is considered a better source than a source
detailing just a single academic.
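The idea of weighting redundancy by source reliability can be sketched as follows. This is a minimal illustration only: the function names, the `saturation` constant and the reliability formula are assumptions made for this example and are not taken from Armadillo.

```python
from collections import defaultdict


def source_reliability(entity_count, saturation=10):
    """Hypothetical reliability in [0, 1): grows with the number of
    candidate entities (e.g. potential academics) a source lists."""
    return entity_count / (entity_count + saturation)


def weighted_redundancy(sightings, entities_per_source):
    """Score each candidate by the summed reliability of its sources.

    sightings: iterable of (candidate, source_url) pairs.
    entities_per_source: dict mapping source_url -> number of candidates found there.
    """
    scores = defaultdict(float)
    # set() ensures a source is counted only once per candidate.
    for candidate, source in set(sightings):
        scores[candidate] += source_reliability(entities_per_source[source])
    return dict(scores)
```

Under this assumed scheme, a candidate seen on a page listing 30 potential academics receives more support than one seen on a page listing a single academic, even though each has only one sighting.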
The rating of sources influences Armadillo's perceived validity of the potential
knowledge. Using the previously mentioned SimMetric library (Section 3), the
extracted entities are cross-examined for simple similarities: if something is
suitably dissimilar from the rest, for example a failed capture 40 times longer
than normal, it is considered more likely to be an error, and its validity
rating is decreased, as is the rating of the source from which it was found. At
present the cross-similarity tests are a combination of SimMetric's similarity
metrics, although more complex approaches could be employed. The context around
a capture can also indicate likelihood, e.g. the similarity of the Part of
Speech tags surrounding instances. Various other similarity techniques could be
used to provide further weak evidence, for example a vector space model of
source (e.g. webpage) similarity, a comparison of similarities in a link
analysis of data sources, or an analysis of the source document's DOM structure
and its similarity to other sources.
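The cross-similarity check described above can be sketched in Python. This is only an illustration under stated assumptions: Python's standard `difflib.SequenceMatcher` stands in for SimMetric's metrics, and the median-based cutoff, the `ratio` and `penalty` parameters and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import median


def similarity(a, b):
    """Stand-in string similarity in [0, 1]; SimMetric itself is not used here."""
    return SequenceMatcher(None, a.lower(), b.lower(), autojunk=False).ratio()


def mean_similarity_to_rest(captures):
    """For each capture, the mean similarity to every other capture.

    Expects a list with at least two captures.
    """
    scores = []
    for i, capture in enumerate(captures):
        others = captures[:i] + captures[i + 1:]
        scores.append(sum(similarity(capture, o) for o in others) / len(others))
    return scores


def find_outliers(captures, ratio=0.5):
    """Indices of captures whose mean similarity is below ratio * median (assumed rule)."""
    scores = mean_similarity_to_rest(captures)
    cutoff = ratio * median(scores)
    return [i for i, score in enumerate(scores) if score < cutoff]


def penalise(captures, sources, capture_rating, source_rating, penalty=0.5):
    """Scale down the rating of each outlier and of the source it was found in."""
    for i in find_outliers(captures):
        capture_rating[captures[i]] *= penalty
        source_rating[sources[i]] *= penalty
```

With four short, similarly formatted captures and one failed capture roughly 40 times longer, only the long capture falls below the cutoff, so its rating and the rating of its source are reduced.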
This combination of multiple weak techniques can provide improved confidence
in the extracted knowledge Armadillo finds. Although not yet implemented, it is
also envisioned that techniques will be prioritised through Information Gain
and memory or time costs.
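One way such a combination and prioritisation might look is sketched below. The source does not specify how the weak ratings are combined or how techniques would be ordered, so the weighted mean, the gain-per-cost ordering and all names here are assumptions for illustration.

```python
def combine_evidence(scores, weights):
    """Weighted mean of per-technique scores (an assumed combination rule).

    scores: dict technique name -> score in [0, 1] for one piece of knowledge.
    weights: dict technique name -> non-negative weight.
    """
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


def prioritise(techniques):
    """Order technique names by estimated information gain per unit cost.

    techniques: dict name -> (estimated_gain, cost), where cost is a positive
    number standing for time or memory (hypothetical units).
    """
    return sorted(
        techniques,
        key=lambda name: techniques[name][0] / techniques[name][1],
        reverse=True,
    )
```

A cheap technique with modest gain can therefore be run before an expensive one with higher gain, which is one plausible reading of prioritising by both Information Gain and cost.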
Extraction of potential knowledge - Armadillo, as in previous papers [5],
integrates the Amilcare [2] (LP)² algorithm to extract potential new knowledge,
but can be extended to encompass different ML algorithms, for example T-Rex [7].
Using ratings from the evidence building approach (Section 4.3), the Information
Extraction learns its contextual rules on the most highly rated sources, using
the most likely instances as seed data.
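The selection of training material from the ratings can be sketched as follows. The data layout, the `top_sources` and `min_validity` parameters and the function name are hypothetical; the actual learning step (Amilcare or T-Rex) is outside this sketch.

```python
def select_training_set(instances, source_rating, top_sources=2, min_validity=0.7):
    """Pick the most highly rated sources and the most likely instances in them.

    instances: list of (text, source, validity) tuples.
    source_rating: dict source -> rating from the evidence building step.
    Returns (chosen_sources, seeds), where seeds is a list of (text, source).
    """
    chosen = sorted(source_rating, key=source_rating.get, reverse=True)[:top_sources]
    seeds = [
        (text, source)
        for text, source, validity in instances
        if source in chosen and validity >= min_validity
    ]
    return chosen, seeds
```

The returned seeds would then be passed to the chosen Information Extraction learner as annotated examples on the selected sources.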