We present evidence that attributes that are known to the file system when a file is created, such as its name, permission mode, and owner, are often strongly related to future properties of the file such as its ultimate size, lifespan, and access pattern.
5 The ABLE Predictor
The results of the previous section establish that each of a file's attributes (file name, uid, gid, mode) is, to some extent, associated with its long-term properties (size, lifespan, and access pattern). This fact suggests that these associations can be used to predict the properties of a file at creation time. The chi-squared results also give us hope that higher-order associations (i.e., an association between more than one attribute and a property) may exist, which could result in more accurate predictions.
To investigate the possibility of creating a predictive model from our data, we constructed an Attribute-Based Learning Environment (ABLE). ABLE is a learning environment for evaluating the predictive power of file attributes. The input to ABLE is a table of information about files whose attributes and properties we have already observed, together with a list of properties that we wish to predict. The output is a statistical analysis of the sample, a chi-squared ranking of each file attribute relative to each property, and a collection of predictive models that can be used to make predictions about new files.
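To make the chi-squared ranking concrete, the following is a minimal sketch of how attributes might be ranked against a single Boolean property. It uses the standard Pearson chi-squared statistic over a contingency table; the function names, the record layout, and the sample data are all hypothetical and are not taken from ABLE itself. Raw statistics are compared directly here, which ignores differences in degrees of freedom between attributes.

```python
from collections import Counter


def chi_squared(rows, attribute, prop):
    """Pearson chi-squared statistic for one attribute against one property.

    `rows` is a list of dicts; `attribute` and `prop` are keys into each dict.
    (Hypothetical layout, chosen only for this illustration.)
    """
    n = len(rows)
    attr_counts = Counter(r[attribute] for r in rows)
    prop_counts = Counter(r[prop] for r in rows)
    joint = Counter((r[attribute], r[prop]) for r in rows)

    stat = 0.0
    for a, a_count in attr_counts.items():
        for p, p_count in prop_counts.items():
            # Count expected in this cell if attribute and property were independent.
            expected = a_count * p_count / n
            observed = joint[(a, p)]
            stat += (observed - expected) ** 2 / expected
    return stat


def rank_attributes(rows, attributes, prop):
    """Return (attribute, statistic) pairs, strongest association first."""
    scores = {a: chi_squared(rows, a, prop) for a in attributes}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample: mode separates the classes perfectly, uid not at all.
sample = [
    {"mode": "0600", "uid": 1, "read_only": False},
    {"mode": "0600", "uid": 2, "read_only": False},
    {"mode": "0444", "uid": 1, "read_only": True},
    {"mode": "0444", "uid": 2, "read_only": True},
]
ranking = rank_attributes(sample, ["mode", "uid"], "read_only")
```

In this toy sample, `mode` ranks above `uid` because the observed counts for `mode` deviate from what independence would predict, while those for `uid` do not.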
In this paper, we focus on three properties: the file size, the file access pattern (read-only or write-only), and the file lifespan. On UNIX file systems, there are two aspects of file lifespan that are interesting: the first is how long the underlying file container (usually implemented as an inode) will live, and the other is how long a particular name of a file will live (because each file may be linked from more than one name). We treat these cases separately and make predictions for each.
To simplify our evaluation, each property we wish to predict is represented by a Boolean predicate. For example:
size
size 16KB
inode lifespan 1 sec
file name lifespan 1 sec
read-only
write-only
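As an illustration, predicates of this kind could be expressed as simple Boolean functions over an observed file record. The sketch below covers the predicates that carry a threshold or an access pattern. It assumes that each threshold is an upper bound (size of at most 16 KB, lifespan of at most 1 second), that "read-only" means the file was read but never written after creation, and that "write-only" means the reverse. These interpretations, the `FileRecord` type, and all field names are assumptions made for the example rather than definitions from the text.

```python
from dataclasses import dataclass

KB = 1024


@dataclass
class FileRecord:
    """Observed properties of one file (hypothetical field names)."""
    size_bytes: int
    inode_lifespan_sec: float
    name_lifespan_sec: float
    reads: int    # number of read operations observed after creation
    writes: int   # number of write operations observed after creation


# Each property is a Boolean predicate over the record.
# The "at most" comparisons are an assumption of this sketch.
PREDICATES = {
    "size_le_16KB": lambda f: f.size_bytes <= 16 * KB,
    "inode_lifespan_le_1sec": lambda f: f.inode_lifespan_sec <= 1.0,
    "name_lifespan_le_1sec": lambda f: f.name_lifespan_sec <= 1.0,
    "read_only": lambda f: f.reads > 0 and f.writes == 0,
    "write_only": lambda f: f.writes > 0 and f.reads == 0,
}


def label(record):
    """Evaluate every predicate, giving one Boolean label per property."""
    return {name: pred(record) for name, pred in PREDICATES.items()}


labels = label(FileRecord(size_bytes=4096, inode_lifespan_sec=0.5,
                          name_lifespan_sec=0.5, reads=3, writes=0))
```

Reducing each property to a Boolean label turns every prediction task into a two-class classification problem, which keeps the evaluation simple.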
We believe these properties are representative of properties that a file or storage system designer might use to optimize for different classes of files. For example, if we know that a file will be read-only, then we might choose to replicate it for performance and availability, but this optimization would be inappropriate for files that are written frequently but rarely read. Write-only files might be stored in a partition optimized for writes (e.g., a log-structured file system), and short-lived files could live their brief lives in NVRAM. In Section 6, for example, we show that by identifying small, short-lived files and hot directories, we can use predictions to optimize directory updates in a real file system.
ABLE consists of three steps:
Step 1: Obtaining Training Data. Obtain a sample of files and for each file record its attributes (name, uid, gid, mode) and properties (size, lifespan, and access pattern).
Step 2: Constructing a Predictive Classifier. For each file property, we train a learning algorithm to classify each file in the training data according to that property. The result of this step is a set of predictive models that classify each file in the training data and can be used to make predictions on newly created files.
Step 3: Validating the Model. Use the model to predict the properties of new files, and then check whether the predictions are accurate.
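The three steps above can be sketched end to end with a deliberately simple stand-in classifier. The example below is not the learning algorithm used by ABLE: it memorizes the majority label for each combination of attributes and falls back to the overall majority label for unseen combinations. The record layout, the `short_lived` property name, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

ATTRIBUTES = ("name", "uid", "gid", "mode")


def train(records, prop):
    """Step 2: build a majority-vote table keyed by the attribute tuple."""
    votes = defaultdict(Counter)
    overall = Counter()
    for r in records:
        key = tuple(r[a] for a in ATTRIBUTES)
        votes[key][r[prop]] += 1
        overall[r[prop]] += 1
    default = overall.most_common(1)[0][0]  # fallback for unseen attribute tuples
    table = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    return table, default


def predict(model, record):
    """Predict a property from creation-time attributes only."""
    table, default = model
    key = tuple(record[a] for a in ATTRIBUTES)
    return table.get(key, default)


def validate(model, records, prop):
    """Step 3: fraction of held-out files whose property is predicted correctly."""
    correct = sum(predict(model, r) == r[prop] for r in records)
    return correct / len(records)


# Step 1: training data (made up for illustration).
train_set = [
    {"name": "lock", "uid": 10, "gid": 10, "mode": "0600", "short_lived": True},
    {"name": "lock", "uid": 10, "gid": 10, "mode": "0600", "short_lived": True},
    {"name": "inbox", "uid": 10, "gid": 10, "mode": "0600", "short_lived": False},
    {"name": "notes", "uid": 11, "gid": 10, "mode": "0644", "short_lived": False},
    {"name": "paper", "uid": 11, "gid": 10, "mode": "0644", "short_lived": False},
]
# Held-out files used only for validation.
test_set = [
    {"name": "lock", "uid": 10, "gid": 10, "mode": "0600", "short_lived": True},
    {"name": "inbox", "uid": 10, "gid": 10, "mode": "0600", "short_lived": False},
    {"name": "tmp", "uid": 12, "gid": 10, "mode": "0600", "short_lived": True},
]

model = train(train_set, "short_lived")
accuracy = validate(model, test_set, "short_lived")
```

Validating on files that were not part of the training sample matters because a table like this one classifies its own training data almost perfectly by construction; only held-out files show whether the associations generalize.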
Each of these steps contains a number of interesting issues. For the first step, we must decide how to obtain representative samples. For the second, we must choose