Creating, Viewing and Sampling a Bayesian Network

This demonstration shows how to create, view and sample the example Weather Bayesian network.

Creating the Graph

The first component of a Bayesian network is its directed acyclic graph (DAG). We create the graph of the Weather network as a sparse adjacency matrix, in which a nonzero element (i, j) means that node i is a parent of node j.

node = struct('Cloudy', 1, ...
              'Sprinkler', 2, ...
              'Rain', 3, ...
              'WetGrass', 4);

G = sparse(4, 4);
G(node.Cloudy, node.Sprinkler) = 1;
G(node.Cloudy, node.Rain) = 1;
G([node.Rain node.Sprinkler], node.WetGrass) = 1;
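
To make the adjacency convention concrete, the following minimal sketch reads the parents of each node back from the matrix and checks that the graph has no directed cycles. It uses only base MATLAB functions and plain node indices instead of the node struct; the variable names parents and isAcyclic are introduced here for illustration and are not part of the toolbox.

```matlab
% Rebuild the Weather graph so that this snippet runs on its own
% (1 = Cloudy, 2 = Sprinkler, 3 = Rain, 4 = WetGrass).
G = sparse(4, 4);
G(1, 2) = 1;        % Cloudy    -> Sprinkler
G(1, 3) = 1;        % Cloudy    -> Rain
G([3 2], 4) = 1;    % Rain      -> WetGrass, Sprinkler -> WetGrass

% The parents of node j are the rows with a nonzero entry in column j.
parents = cell(1, 4);
for j = 1:4
    parents{j} = find(G(:, j))';
end

% A directed graph on n nodes is acyclic exactly when the n-th power of
% its adjacency matrix is all zeros, i.e. no walk of length n exists.
isAcyclic = nnz(G^4) == 0;
```

Here parents{4} contains the indices 2 and 3 (Sprinkler and Rain), and parents{1} is empty because Cloudy is a root node.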

Creating the Conditional Probability Distribution

The second component of a Bayesian network is its conditional probability distribution (CPD). For discrete Bayesian networks (the kind supported in this toolbox), the CPD consists of one conditional probability table (CPT) per node. We create a cell array of ND-arrays, where each ND-array encodes the CPT of one node. The first dimension of an ND-array corresponds to the node itself and the remaining dimensions correspond to its parents. For example, P{4}(2, 1, 1) is the probability of the 4th node taking its 2nd value (level), given that its 1st parent takes its 1st value and its 2nd parent also takes its 1st value. The sums along the 1st dimension must all be 1.

% create the CPT
P{node.Cloudy} = [0.5; 0.5];
P{node.Sprinkler} = reshape([0.5 0.5 0.9 0.1], [2 2]);
P{node.Rain} = reshape([0.8 0.2 0.2 0.8], [2 2]);
P{node.WetGrass} = reshape([0.01 0.99 1.0 0.0 0.1 0.9 0.1 0.9], [2 2 2]);

% demonstrate for each node that the sums along the 1st dimension of the
% CPT are all 1
reshape(sum(P{node.Cloudy}, 1), 1, [])
reshape(sum(P{node.Sprinkler}, 1), 1, [])
reshape(sum(P{node.Rain}, 1), 1, [])
reshape(sum(P{node.WetGrass}, 1), 1, [])
ans =

     1


ans =

     1     1


ans =

     1     1


ans =

     1     1     1     1
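
The indexing convention of the CPTs can be illustrated with a few lookups. The sketch below uses only base MATLAB and the CPT values defined above; the names pSprinkler, pWet, pJoint and isValid are hypothetical helpers introduced for this example. Every lookup conditions on parents at their 1st level, so the results do not depend on how the toolbox orders the parents of a node.

```matlab
% CPTs of the Weather network, repeated so that this snippet runs on its own
% (1 = Cloudy, 2 = Sprinkler, 3 = Rain, 4 = WetGrass).
P = cell(1, 4);
P{1} = [0.5; 0.5];
P{2} = reshape([0.5 0.5 0.9 0.1], [2 2]);
P{3} = reshape([0.8 0.2 0.2 0.8], [2 2]);
P{4} = reshape([0.01 0.99 1.0 0.0 0.1 0.9 0.1 0.9], [2 2 2]);

% Probability that Sprinkler takes its 2nd level given that its only
% parent (Cloudy) takes its 1st level.
pSprinkler = P{2}(2, 1);      % 0.5

% Probability that WetGrass takes its 2nd level given that both of its
% parents take their 1st level.
pWet = P{4}(2, 1, 1);         % 0.99

% Chain rule: joint probability of all four nodes taking their 1st level
% is the product of one CPT entry per node (0.5 * 0.5 * 0.8 * 0.01).
pJoint = P{1}(1) * P{2}(1, 1) * P{3}(1, 1) * P{4}(1, 1, 1);

% Sanity check for every CPT: entries are nonnegative and every slice
% along the 1st dimension sums to 1 (up to rounding).
isValid = cellfun(@(T) all(T(:) >= 0) && ...
    all(abs(reshape(sum(T, 1), 1, []) - 1) < 1e-12), P);
```

All four entries of isValid are true for the tables above, which is the same check as the column sums displayed earlier, expressed as a single test per node.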

Creating Annotations for the Network

Next, we create a cell array of strings containing the names of our variables, and a cell array of cell arrays of strings containing the labels of the levels of each variable. The variable names appear when we visualize the network using the view method. Both the variable names and the level labels are also used in the datasets sampled from the network using the rnd method. Consult the MATLAB® Statistics Toolbox (TM) documentation to learn more about levels and labels of categorical arrays and datasets. Finally, we create a string with a description of the network.

% create variable names
VarNames = fieldnames(node)';

% create level labels
Labels = {{'false', 'true'}, ...
          {'false', 'true'}, ...
          {'false', 'true'}, ...
          {'false', 'true'}};

% create description
description = ['Example from Russell and Norvig, "Artificial ' ...
               'Intelligence: a Modern Approach", Prentice Hall, ' ...
               '1995, page 454.'];

Creating the Network

We create a network object net by passing the graph G and the CPD P as input arguments, and the variable names, level labels and description as parameter name/value pairs, to the constructor of the org.mensxmachina.bnet.Network class, which is the class for Bayesian networks used throughout this toolbox. We also provide a 'ClassNames' parameter declaring that all of our variables are nominal. Consequently, datasets sampled using rnd will consist of nominal variables only.

% construct the network object
net = org.mensxmachina.bnet.Network(...
    G, P, 'VarNames', VarNames, 'Labels', Labels, ...
    'ClassNames', 'nominal', 'Description', description);

Visualizing the Network Graph

We visualize the network graph using the view method. Notice that the nodes are labelled with the variable names we supplied.

% view the network graph
net.view();

Sampling the Network

We create a dataset of 10 samples from the network, using the rnd method. Notice that the variable names and level labels are the ones we supplied.

% sample the network
a = net.rnd(10)
Sampling...

Creating column #1, 'Cloudy'...

Creating column #2, 'Rain'...

Creating column #3, 'Sprinkler'...

Creating column #4, 'WetGrass'...

a = 

    Cloudy    Sprinkler    Rain     WetGrass
    false     false        false    true    
    true      false        true     true    
    true      false        false    true    
    true      false        true     true    
    false     true         false    false   
    false     true         false    false   
    false     true         false    false   
    true      false        true     true    
    false     true         false    false   
    true      false        true     true
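
For readers who want to see what sampling a discrete Bayesian network involves, here is a minimal sketch of ancestral (forward) sampling written in base MATLAB. It is not the implementation of the rnd method, whose internals are not shown in this demonstration. It assumes that the parent dimensions of each CPT follow increasing node index (Sprinkler before Rain for WetGrass), and the names order, X and labels are introduced for this example only.

```matlab
% Graph and CPTs of the Weather network, repeated so the sketch is
% self-contained (1 = Cloudy, 2 = Sprinkler, 3 = Rain, 4 = WetGrass).
G = sparse(4, 4);
G(1, [2 3]) = 1;
G([2 3], 4) = 1;
P = {[0.5; 0.5], ...
     reshape([0.5 0.5 0.9 0.1], [2 2]), ...
     reshape([0.8 0.2 0.2 0.8], [2 2]), ...
     reshape([0.01 0.99 1.0 0.0 0.1 0.9 0.1 0.9], [2 2 2])};

% Every edge goes from a lower to a higher index, so 1:4 is a
% topological order: each node is sampled after all of its parents.
order = 1:4;
n = 10;                 % number of samples
X = zeros(n, 4);        % X(s, j) is the level index of node j in sample s

for s = 1:n
    for j = order
        % ASSUMPTION: parent dimensions of P{j} follow increasing node index.
        pa = find(G(:, j))';
        % Select the column of the CPT that matches the sampled parent levels.
        idx = [{':'}, num2cell(X(s, pa))];
        p = P{j}(idx{:});
        % Inverse-CDF draw of one level from the conditional distribution p.
        k = find(rand() < cumsum(p(:)), 1);
        if isempty(k)
            k = numel(p);   % guard against rounding in the cumulative sum
        end
        X(s, j) = k;
    end
end

% Map level indices to the level labels used in this demonstration.
labels = {'false', 'true'};
samples = labels(X);    % n-by-4 cell array of strings
```

Under the stated parent ordering, a sample with Sprinkler true and Rain false always has WetGrass false, because the corresponding CPT column is [1.0; 0.0].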