
Estimating the quality of data in relational databases(8)

Source: user-shared. Date: 2021-06-02. Shared by 义无反顾.

4 Rating the Quality of Databases

4.1 Necessary Procedures for Goodness Estimation

The amount of data in practical databases is often large. To compute the exact soundness and completeness of a particular view we would need to (1) authenticate every value pair in the stored view, and (2) determine how many pairs are missing from this view. This method is clearly infeasible in any real system. Thus, we must resort to sampling techniques [16, 4]. Sampling techniques allow us to estimate the mean and variance of a particular parameter of a population by using a sample which is usually only a fraction of the size of the entire population. The theory of statistics also gives us methods for establishing a sample size to achieve predetermined accuracy of the estimates. It is then possible to supplement our estimates with confidence intervals. For more detailed discussion on sampling from databases the reader is referred to the literature on the topic (see, for example, [12] for a good survey).

Note that two different populations must be sampled. To estimate soundness we sample the given (stored) view, whereas to estimate completeness, we sample the ideal view. To establish both soundness and completeness it is necessary to have access to the ideal database. For soundness, we need to determine whether a specific value pair of the stored database is in the ideal database. For completeness, it is necessary to determine whether a specific pair from the ideal database is in the stored database. These procedures (verify a pair from a stored database against the ideal database and retrieve an arbitrary pair from the ideal database) must be implemented in an ad-hoc manner [1]. For each concrete database, human expertise will be required. The expert will access a variety of available sources to perform these two procedures. Note that this effort is performed only once and only for a sample, which then helps estimate the overall goodness.
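The sampling procedure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes simple random sampling and a normal-approximation confidence interval for a proportion, and it ignores the finite-population correction. The names `estimate_proportion`, `required_sample_size`, `estimate_soundness`, and `estimate_completeness` are hypothetical, as are the callbacks `is_in_ideal`, `is_in_stored`, and `draw_ideal_pair`, which stand in for the expert-driven verification and retrieval procedures.

```python
import math
import random


def estimate_proportion(sample_flags, z=1.96):
    """Return (estimate, low, high) for a proportion from a list of 0/1 flags.

    Uses the normal approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(sample_flags)
    if n == 0:
        raise ValueError("sample must be non-empty")
    p = sum(sample_flags) / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def required_sample_size(margin, z=1.96, p_guess=0.5):
    """Smallest n whose normal-approximation half-width is at most `margin`.

    p_guess=0.5 is the worst case (largest variance) for a proportion.
    """
    return math.ceil(z * z * p_guess * (1 - p_guess) / (margin * margin))


def estimate_soundness(stored_pairs, is_in_ideal, n, rng):
    """Sample n pairs from the STORED view; check each against the ideal database."""
    sample = rng.sample(stored_pairs, n)  # sampling without replacement
    return estimate_proportion([1 if is_in_ideal(pair) else 0 for pair in sample])


def estimate_completeness(draw_ideal_pair, is_in_stored, n):
    """Draw n pairs from the IDEAL view; check whether each appears in the stored view."""
    flags = [1 if is_in_stored(draw_ideal_pair()) else 0 for _ in range(n)]
    return estimate_proportion(flags)
```

Under these assumptions, `required_sample_size(0.05)` evaluates to 385, which is the number of pairs an expert would need to verify for a margin of plus or minus five percentage points at about 95% confidence, regardless of how large the view is.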

A critical stage of our solution is to build a set of homogeneous views on a stored database, called a goodness basis. The goodness of the views of this basis will be measured and thereafter used in establishing the goodness of answers to arbitrary queries against this database. Since we cannot guarantee a single set of views that will be homogeneous with respect to both quality measures, we construct two separate sets: a soundness basis and a completeness basis. In constructing each basis, we consider each database relation individually. Each relation may be partitioned both horizontally (by a selection) and vertically (by a projection), and the basis comprises the union of all such partitions. Selections are limited to ranges; i.e., the selection criterion is a conjunction of conditions, where each individual condition specifies an attribute and a range of permitted values for this attribute.
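As an illustration of what a single basis view looks like, the following Python sketch models a view as a projection (a tuple of attribute names) combined with a conjunction of range conditions. The class name `BasisView`, its fields, the representation of a relation as a list of dictionaries, and the choice of inclusive range bounds are all assumptions made for this example; the text above does not prescribe a representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasisView:
    """One view of a goodness basis over a single relation (hypothetical model)."""
    relation: str
    attributes: tuple   # vertical partition: the projected attribute names
    ranges: tuple = ()  # horizontal partition: (attribute, low, high) conditions

    def selects(self, row):
        """True if the row satisfies every range condition (a conjunction).

        Bounds are treated as inclusive, which is an assumption of this sketch.
        """
        return all(low <= row[attr] <= high for attr, low, high in self.ranges)

    def extension(self, rows):
        """Apply the selection, then the projection.

        Duplicate result rows are not eliminated in this sketch.
        """
        return [
            {attr: row[attr] for attr in self.attributes}
            for row in rows
            if self.selects(row)
        ]


# Example: split a hypothetical "emp" relation horizontally by salary range.
employees = [
    {"id": 1, "salary": 30000, "dept": "A"},
    {"id": 2, "salary": 70000, "dept": "B"},
    {"id": 3, "salary": 50000, "dept": "A"},
]
low_paid = BasisView("emp", ("id", "salary"), (("salary", 0, 49999),))
high_paid = BasisView("emp", ("id", "salary"), (("salary", 50000, 10**9),))
```

A soundness basis and a completeness basis would each be a collection of such views, one collection per quality measure, with the ranges chosen so that quality is roughly uniform within each view.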

We assign to an incorrect value pair the value 0 and to a correct pair the value 1. Thus, we can represent an error distribution pattern in a view extension as a two-dimensional matrix of 0s and 1s, in which rows correspond to the tuples and columns correspond to the attributes of the view. A value in a particular cell of this matrix is either 0 or 1 depending on the correctness of the corresponding pair of attribute values. We call this new data structure
