The chi-squared test can also be used to rank the attributes by the degree of association. Figure 2 shows how the chi-squared values differ for the size and lifespan properties. There are two important points to take from this figure. First, the attribute association differs across properties for a given trace: for example, in CAMPUS the uid shows a relatively strong association with the lifespan, yet a weak association with the size. The second point is that the relative rankings differ across traces. For example, on CAMPUS the middle component of a file name has a strong association with lifespan and size, but the association is much weaker on DEAS03 and EECS03.
Although we show only two properties in these graphs, similarly diverse associations exist for other properties (e.g., directory entry lifespan and read/write ratio). In Section 5 we show how these associations can be dynamically discovered and used to make predictions.
The chi-squared test described in this section is a one-way test for association. This test provides statistical evidence that individual attributes are associated with file properties. It does not, however, capture associations between subsets of the attributes and file properties. It also does not provide an easy way to understand exactly what those associations are. One can extend this methodology to use multi-way chi-squared tests, but the next section discusses a more efficient way of both capturing multi-way associations and extracting those associations.
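To make the one-way test and the ranking concrete, the following is a minimal sketch, not the procedure used to produce Figure 2. It computes the Pearson chi-squared statistic for each attribute against a single file property and sorts the attributes by that value. The function names (`chi_squared`, `rank_attributes`), the record fields (`uid`, `ext`, `lifespan`), and the four toy records are all assumptions made for illustration; none of them come from the traces.

```python
from collections import Counter


def chi_squared(pairs):
    """Pearson chi-squared statistic for (attribute_value, property_class) pairs."""
    pairs = list(pairs)
    n = len(pairs)
    observed = Counter(pairs)                  # cell counts of the contingency table
    row_totals = Counter(a for a, _ in pairs)  # counts per attribute value
    col_totals = Counter(p for _, p in pairs)  # counts per property class
    stat = 0.0
    for a, row in row_totals.items():
        for p, col in col_totals.items():
            expected = row * col / n           # count expected under independence
            stat += (observed[(a, p)] - expected) ** 2 / expected
    return stat


def rank_attributes(records, attributes, prop):
    """Sort attributes by their chi-squared value against one property, strongest first."""
    scores = {
        attr: chi_squared((r[attr], r[prop]) for r in records)
        for attr in attributes
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy records (not trace data): uid determines lifespan, ext does not.
records = [
    {"uid": 1, "ext": "a", "lifespan": "short"},
    {"uid": 1, "ext": "b", "lifespan": "short"},
    {"uid": 2, "ext": "a", "lifespan": "long"},
    {"uid": 2, "ext": "b", "lifespan": "long"},
]

ranking = rank_attributes(records, ["ext", "uid"], "lifespan")
print(ranking)
```

On these toy records the uid attribute ranks above the extension, because uid fully determines the lifespan class while the extension is independent of it. Raw chi-squared values also grow with the sample size and with the number of distinct attribute values, so a ranking of this kind is best read as a relative ordering within one trace.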