We present evidence that attributes that are known to the file system when a file is created, such as its name, permission mode, and owner, are often strongly related to future properties of the file such as its ultimate size, lifespan, and access pattern.
the hottest possible region on CAMPUS, and on EECS03 nearly 6% of all the disk accesses during the trace are now confined to the target area. These percentages may seem small, but keep in mind that we are focusing only on small files and directories, and normal file traffic is the dominant cause of disk accesses in these workloads.

7 Conclusions
We have shown that the attributes of a file are strong hints of how that file will be used. Furthermore, we have exploited these hints to make accurate predictions about the longer-term properties of files, including the size, read/write ratio, and lifespan. Overall, file names provide the strongest hints, but using additional attributes can improve prediction accuracy. In some cases, accurate predictions are possible without considering names at all. Using traces from three NFS environments, we have demonstrated how classification trees can predict file and directory properties, and that these predictions can be used within an existing file system.
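As an illustration of the idea, the following is a minimal sketch of a classification tree trained on create-time attributes. It is not the ABLE implementation: the entropy-based (ID3-style) split is chosen here only for illustration, and the attribute names (`suffix`, `mode`, `owner`), the labels, and the sample rows are all hypothetical rather than taken from the traces.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_attribute(rows, labels, attrs):
    """Pick the attribute whose split leaves the lowest weighted entropy."""
    def remainder(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(attrs, key=remainder)

def build_tree(rows, labels, attrs):
    """Return a leaf label (str) or a node (attribute, branches, majority)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    attr = best_attribute(rows, labels, attrs)
    rest = [a for a in attrs if a != attr]
    branches = {}
    for value in {row[attr] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), rest)
    return (attr, branches, majority)

def predict(tree, row):
    """Walk the tree; fall back to the node's majority label for unseen values."""
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        tree = branches.get(row[attr], majority)
    return tree

# Hypothetical training sample: attributes known at create time, plus the
# lifespan class observed later. None of these values come from the paper.
rows = [
    {"suffix": ".lock", "mode": "0600", "owner": "alice"},
    {"suffix": ".lock", "mode": "0644", "owner": "bob"},
    {"suffix": ".tmp",  "mode": "0600", "owner": "alice"},
    {"suffix": ".c",    "mode": "0644", "owner": "alice"},
    {"suffix": ".c",    "mode": "0644", "owner": "bob"},
    {"suffix": ".pdf",  "mode": "0644", "owner": "bob"},
    {"suffix": ".h",    "mode": "0644", "owner": "alice"},
]
labels = ["short-lived", "short-lived", "short-lived",
          "long-lived", "long-lived", "long-lived", "long-lived"]

tree = build_tree(rows, labels, ["suffix", "mode", "owner"])
new_file = {"suffix": ".lock", "mode": "0644", "owner": "carol"}
print(tree[0], predict(tree, new_file))
```

In this toy sample the name-derived attribute separates the classes on its own, so the tree splits on `suffix` first, which mirrors the observation that file names provide the strongest hints.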
Our results are encouraging. Contemporary file systems use hard-coded policies and heuristics based on general assumptions about their workloads. Even the most advanced file systems do no more than adapt to violations of these assumptions. We have demonstrated how to construct a learning environment that can discover patterns in the workload and predict the properties of new files. These predictions enable optimization through dynamic policy selection: instead of reacting to the properties of new files, the file system can anticipate these properties. Although we only provide one example file system optimization (clustering of hot directory data), this proof-of-concept demonstrates the potential for the system-wide deployment of predictive models.
ABLE is a first step towards a self-tuning file system or storage device. Future work involves automation of the entire ABLE process, including sample collection, attribute selection, and model building. Furthermore, since changes in the workload will cause the accuracy of our models to degrade over time, we plan to automate the process of detecting when models are failing (or are simply suboptimal) and retraining. When cataclysmic changes in the workload occur (e.g., tax season in an accounting firm, or September on a college campus), we must learn to detect that such an event has occurred and switch to a new (or cached) set of models. We also plan to explore mechanisms to include the cost of different types of mispredictions in our training in order to minimize the anticipated total cost of errors, rather than simply trying to minimize the number of errors.
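One way to read the distinction between minimizing the number of errors and minimizing their total cost is sketched below. This is an assumption about how such a mechanism could look, not a description of planned ABLE functionality: the function `min_cost_class`, the class names, the probabilities, and the cost values are all hypothetical. The sketch applies costs at prediction time, which is simpler than folding them into training.

```python
def min_cost_class(probs, cost):
    """Return the prediction with the lowest expected misprediction cost.

    probs: dict mapping each class to its predicted probability.
    cost:  cost[actual][predicted], the penalty for predicting `predicted`
           when the file turns out to be `actual` (0 on the diagonal).
    """
    def expected_cost(predicted):
        return sum(p * cost[actual][predicted] for actual, p in probs.items())
    return min(probs, key=expected_cost)

# Hypothetical model output for a new file: probably small, possibly large.
probs = {"small": 0.7, "large": 0.3}

# Hypothetical costs: treating a large file as small is assumed to be five
# times worse than treating a small file as large.
cost = {
    "small": {"small": 0, "large": 1},
    "large": {"small": 5, "large": 0},
}

most_likely = max(probs, key=probs.get)   # minimizes the number of errors
cheapest = min_cost_class(probs, cost)    # minimizes the expected cost
print(most_likely, cheapest)
```

With these numbers, predicting "small" has an expected cost of 0.3 × 5 = 1.5 and predicting "large" has 0.7 × 1 = 0.7, so the cost-aware choice differs from the most likely class.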