### First learning an approximateBN structure. The approach constrains

First emerging in the late 60s, BNs were inspired by graph theory, combinatorial optimisationand traditional statistics. The key pre-requisite of a successful BN implementation is an accuratestructure.

To this end a large variety of structure learning algorithms exist, falling into threebroad categories:•Constraint Based Learning: where a set of dependence conditions between variables in-forms the network structure; e.g. the Chow Liu algorithm (Chow & Liu 1968)•Bayesian Model Averaging: where Bayesian statistics are used to estimate a structuresand are then combined; e.g. via Markov Chain Monte Carlo (Friedman & Koller 2000)•Scoring Based Learning: where the search is treated as a combinatorial optimisation/searchproblem informed by a (function derived) score for a proposed structure; e.g. the K2 al-gorithm (Cooper & Herskovits 1992)The relative merits of these approaches have been analysed in depth (Cheng & Greiner 1999,Carvalho 2009, Daly et al. 2011), with comparable efficacy found in most methods.

Thesefindings made the choice of a control algorithm a straightforward endeavour, with the popularK2, BIC, Bdeu, and Chow Liu algorithms all worthy candidates. However, owing to packagelimitations explored in the subsequent section (3.1), the Chow Liu algorithm was selected asthe basis for comparison. Consequently it has been explored in greater depth than the otherBN algorithms within this chapter.2.2 Chow-Liu Minimum Weight Spanning TreesThe Chow-Liu is a Maximum Weight Spanning Trees (MSWT) definition algorithm, first de-scribed in a paper by Chow & Liu (1968), is an efficient mechanism for learning an approximateBN structure.

The approach constrains the search space by assuming that the BN in questionis a tree. Chow and Liu proposed a straightforward algorithmic approach to traverse the searchspace and hence construct an optimal tree where the edge that adds the maximum mutualinformation pair is added iteratively to the tree provided no path already exists between thenodes. For example:•Suppose a five variable/node dataset ‘A’, ‘B’, ‘C’, ‘D’, ‘E’•Order the possible edge’s by a pairwise information measure e.

g. EdgeAB, EdgeCE,EdgeBC, EdgeAE, EdgeBD, EdgeAC, EdgeBE, EdgeAD, EdgeCE, EdgeCD•Iteratively construct the graph ensuring the MWST1. criteria is satisfied (representedgraphically in figure 2.1 below)