Dynamic Bayesian networks provide a compact and natural representation for complex dynamic systems. However, in many cases, there is no expert available from whom a model can be elicited. Learning provides an alternative approach for constructing models of
network structure we need only re-evaluate changes to the family of X′. Second, the term that evaluates the family of X′ is a function only of the sufficient statistics for X′ and its parents. Thus, these sufficient statistics are the only aspects of the data that we need to preserve. For each choice of parents for X′, we need to collect statistics on different events. Evaluation of local changes usually involves computing new sufficient statistics, followed by evaluating the score with respect to the statistics of the new model and its dimension. The Bayesian score is somewhat more complex: it takes a prior distribution over models and parameters into account. Without going into details, we note that for some choices of priors, such as the BDe priors of [Heckerman et al., 1995], the main features of
the BIC also hold for the Bayesian score: the score decomposes into a sum of terms, and it depends only on the sufficient statistics collected from the data. Although the Bayesian score and the BIC are asymptotically equivalent, for small sample sizes the Bayesian score often performs better.
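To make the decomposability concrete, a single family's BIC and BDe contributions can be computed from that family's counts alone. The sketch below is our own illustration, not code from the text: variable names, the dict-based data representation, and the uniform spreading of the BDe equivalent sample size over configurations are assumptions.

```python
import math
from collections import Counter

def family_stats(data, child, parents):
    """Sufficient statistics for one family: counts N(u, x) and N(u),
    where u is a parent configuration and x a child value.
    data is a list of dicts mapping variable name -> discrete value."""
    counts, parent_counts = Counter(), Counter()
    for row in data:
        u = tuple(row[p] for p in parents)
        counts[(u, row[child])] += 1
        parent_counts[u] += 1
    return counts, parent_counts

def family_bic(data, child, parents, arity):
    """BIC term for one family: maximized log-likelihood minus a
    penalty of (dimension / 2) * log N.  arity maps a variable to
    its number of values.  (Illustrative helper, our own naming.)"""
    counts, parent_counts = family_stats(data, child, parents)
    loglik = sum(n * math.log(n / parent_counts[u])
                 for (u, x), n in counts.items())
    q = 1
    for p in parents:
        q *= arity[p]                     # number of parent configurations
    dim = (arity[child] - 1) * q          # independent CPT parameters
    return loglik - 0.5 * dim * math.log(len(data))

def family_bde(data, child, parents, arity, ess=1.0):
    """Log marginal likelihood of one family under a BDe prior with
    equivalent sample size ess, spread uniformly over configurations."""
    counts, parent_counts = family_stats(data, child, parents)
    q = 1
    for p in parents:
        q *= arity[p]
    r = arity[child]
    a_u, a_uk = ess / q, ess / (q * r)    # Dirichlet hyperparameters
    score = 0.0
    for u, n_u in parent_counts.items():
        score += math.lgamma(a_u) - math.lgamma(a_u + n_u)
        for x in range(r):
            n_uk = counts[(u, x)]
            score += math.lgamma(a_uk + n_uk) - math.lgamma(a_uk)
    return score
```

On data where the child tracks a candidate parent perfectly, both scores prefer the family that includes that parent, and only these counts ever need to be kept.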
2.3 Learning DBNs: Incomplete Data

The main difficulty with learning from partial observations is that we no longer know the counts in the data. As a consequence, the score no longer decomposes into separate components corresponding to individual families. The most common solution to the missing data problem is the Expectation-Maximization (EM) algorithm [Dempster et al., 1977; Lauritzen, 1995]. The algorithm is an iterative procedure that searches for a parameter vector which is a local maximum of the likelihood function. It starts with some initial (often random) parameter vector. It then repeatedly executes a two-phase procedure. In the E-step, the current parameters are used to complete the data by "filling in" unobserved values with their expected values. In the M-step, the completed data is used as if it were real, in a maximum likelihood estimation step. More precisely, given a current parameter vector, the algorithm computes the expected sufficient statistics (ESS) for D relative to the current parameters:
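The E-step/M-step loop can be sketched for the simplest case of a two-valued child with a fully observed two-valued parent, where some child values are missing. This is a toy stand-in with our own names, not the paper's procedure: with no other evidence, the posterior over a missing child value is just the current CPT row, so the expected sufficient statistics are fractional counts.

```python
def em_cpt(rows, theta, n_iters=50):
    """EM for the CPT P(C | P) of a binary child C with a binary
    observed parent P; a missing C is recorded as None.

    rows:  list of (p, c) pairs, c possibly None
    theta: theta[p][c] = current estimate of P(C=c | P=p)
    (Toy sketch; assumes every parent value occurs in rows.)"""
    for _ in range(n_iters):
        # E-step: expected sufficient statistics (fractional counts),
        # "filling in" missing values with their expected values.
        ess = [[0.0, 0.0], [0.0, 0.0]]
        for p, c in rows:
            if c is None:
                for k in (0, 1):
                    ess[p][k] += theta[p][k]
            else:
                ess[p][c] += 1.0
        # M-step: treat the completed counts as real data and take
        # the maximum likelihood estimate.
        theta = [[ess[p][k] / sum(ess[p]) for k in (0, 1)]
                 for p in (0, 1)]
    return theta
```

Each iteration pulls the estimate toward the observed-data frequencies; for a parent value with fully observed children, a single M-step already recovers the maximum likelihood estimate.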