Creating, viewing and sampling a Bayesian network

This demo shows how to create, view and sample the sprinkler Bayesian network, an example from Artificial Intelligence: A Modern Approach (1st Edition). The terms variable and node are used interchangeably.

Creating the structure

The first component of a Bayesian network is its structure, a directed acyclic graph (DAG). In MATLAB®, graphs are represented as sparse matrices: a nonzero element in row i and column j denotes an edge from node i to node j. We create the structure of the sprinkler network, which has 4 nodes and the edges 1->2, 1->3, 2->4 and 3->4. The nodes are numbered in the order cloudy, sprinkler, rain, wetGrass, the same order used for the CPDs later in this demo.

% create structure
structure = sparse([1 1 2 3], [2 3 4 4], 1, 4, 4);
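The sparse-matrix encoding can be inspected with base MATLAB functions. The following is a minimal sketch, not part of the toolbox: the variable names adjacency, parentsOfNode4 and isAcyclic are illustrative only. It shows the dense form of the matrix, reads off the parents of node 4, and checks acyclicity using the fact that a directed graph on n nodes has no cycle exactly when the n-th power of its adjacency matrix is zero.

```matlab
% Rebuild the structure from the demo so this snippet runs on its own.
structure = sparse([1 1 2 3], [2 3 4 4], 1, 4, 4);

% Dense view of the adjacency matrix: row = parent, column = child.
adjacency = full(structure);
disp(adjacency);

% The parents of node 4 are the rows with a nonzero entry in column 4.
parentsOfNode4 = find(structure(:, 4))';   % nodes 2 and 3

% A directed graph on n nodes is acyclic exactly when the n-th power of
% its adjacency matrix is all zeros (no walk of length n exists).
n = size(structure, 1);
isAcyclic = nnz(structure^n) == 0;
```

Adding a back edge such as 4->1 to the matrix would make isAcyclic false, which is a quick sanity check before handing a structure to the network constructor.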

Creating the conditional probability distributions

The second component of a Bayesian network is the conditional probability distributions (CPDs) of the nodes given values of their parents. In general, a CPD is the probability distribution of a set of response variables given values of a set of explanatory variables. The CPDs in a Bayesian network are CPDs of a single response variable, the explanatory variables being the parents of that variable.

The variables of the sprinkler network are cloudy, sprinkler, rain, and wetGrass. All of them are binary variables taking the values false and true. We represent these values as a Statistics Toolbox™ nominal array (a type of categorical array) with levels false and true.

We create a tabular CPD for each variable. Tabular CPDs are encoded as tables. For each tabular CPD, we supply the variable names, the variable values, the CPD-variable types and the values of the CPD. A CPD-variable type is either Explanatory or Response. The values of a tabular CPD form an N-D array with one dimension per variable, in the order in which the variable names are listed. Each element of the N-D array is the probability of the corresponding combination of variable values.

import org.mensxmachina.stats.cpd.cpdvartype;
import org.mensxmachina.stats.cpd.tabular.tabcpd;

% create variable values -- same for all variables
varValues = nominal([1; 2], {'false', 'true'});

% create CPDs

E = cpdvartype.Explanatory;
R = cpdvartype.Response;

cloudyCpd = tabcpd(...
    {'cloudy'}, ...
    {varValues}, ...
    R, ...
    reshape([0.5 0.5], 2, 1))

sprinklerCpd = tabcpd(...
    {'cloudy', 'sprinkler'}, ...
    {varValues, varValues}, ...
    [E R], ...
    reshape([0.5 0.5; 0.9 0.1], 2, 2))

rainCpd = tabcpd(...
    {'cloudy', 'rain'}, ...
    {varValues, varValues}, ...
    [E R], ...
    reshape([0.8 0.2; 0.2 0.8], 2, 2))

wetGrassCpd = tabcpd(...
    {'sprinkler', 'rain', 'wetGrass'}, ...
    {varValues, varValues, varValues}, ...
    [E E R], ...
    reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], 2, 2, 2))

% put them all together
cpd = {cloudyCpd, sprinklerCpd, rainCpd, wetGrassCpd};

cloudyCpd = 
        cloudy = false     cloudy = true
                   0.5               0.5

sprinklerCpd = 
                      sprinkler = false     sprinkler = true
    cloudy = false                  0.5                  0.5
    cloudy =  true                  0.9                  0.1

rainCpd = 
                      rain = false     rain = true
    cloudy = false             0.8             0.2
    cloudy =  true             0.2             0.8

wetGrassCpd = 
                                       wetGrass = false     wetGrass = true
    sprinkler = false, rain = false                   1                   0
    sprinkler =  true, rain = false                 0.1                 0.9
    sprinkler = false, rain =  true                 0.1                 0.9
    sprinkler =  true, rain =  true                0.01                0.99
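The N-D array layout can be checked against the wetGrass table using plain MATLAB arrays, without the toolbox classes. This is a sketch under the assumption that index 1 stands for false and index 2 for true, which matches the order of the levels in varValues; the names P, F, T, pWet and isNormalized are illustrative only.

```matlab
% Values of P(wetGrass | sprinkler, rain) as passed to tabcpd in the demo.
% reshape fills in column-major order: the first dimension (sprinkler)
% varies fastest, then rain, then wetGrass.
P = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], 2, 2, 2);

% Assumed index convention: 1 = false, 2 = true in every dimension.
F = 1;
T = 2;

% P(wetGrass = true | sprinkler = true, rain = false)
pWet = P(T, F, T);

% For every combination of explanatory values, the probabilities along
% the response dimension (the third one here) must sum to 1.
rowSums = sum(P, 3);
isNormalized = all(abs(rowSums(:) - 1) < 1e-12);
```

Here pWet evaluates to 0.9, the entry in the row "sprinkler = true, rain = false" and the column "wetGrass = true" of the table.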

Creating the network

We create the sprinkler network by supplying its structure and its CPDs.

import org.mensxmachina.pgm.bn.bayesnet;

% create Bayesian network
BayesNet = bayesnet(structure, cpd);
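Together, the structure and the CPDs define a joint distribution that factorizes as the product of each node's CPD given its parents: P(cloudy, sprinkler, rain, wetGrass) = P(cloudy) P(sprinkler | cloudy) P(rain | cloudy) P(wetGrass | sprinkler, rain). The sketch below evaluates this product for one assignment using plain arrays rather than the bayesnet object; the array names and the index convention (1 = false, 2 = true) are assumptions made for illustration.

```matlab
% CPD values copied from the demo; index 1 = false, index 2 = true.
pCloudy    = [0.5; 0.5];            % P(cloudy)
pSprinkler = [0.5 0.5; 0.9 0.1];    % rows: cloudy, columns: sprinkler
pRain      = [0.8 0.2; 0.2 0.8];    % rows: cloudy, columns: rain
pWetGrass  = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], 2, 2, 2);
                                    % dimensions: sprinkler, rain, wetGrass
F = 1;
T = 2;

% P(cloudy = true, sprinkler = false, rain = true, wetGrass = true)
c = T; s = F; r = T; w = T;
pJoint = pCloudy(c) * pSprinkler(c, s) * pRain(c, r) * pWetGrass(s, r, w);
% = 0.5 * 0.9 * 0.8 * 0.9 = 0.324
```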

Viewing the structure

Bayesian networks are displayed by Bayesian network viewers. To view the structure of the sprinkler network, we use a viewer built on a Bioinformatics Toolbox™ biograph.

import org.mensxmachina.pgm.bn.viewers.biograph.biographbayesnetviewer;

% create Bayesian network viewer
Viewer = biographbayesnetviewer(BayesNet);

% view the Bayesian network structure
Viewer.viewbayesnetstructure();

Sampling the network

A Bayesian network is itself a CPD and can be sampled. We sample a CPD by supplying a Statistics Toolbox™ dataset array containing values of the explanatory variables of the CPD. The sample is a dataset array containing values of the response variables of the CPD.

A Bayesian network has no explanatory variables, so to draw a random sample of 10 observations from the sprinkler network we supply an empty 10-by-0 dataset array.

% get a random sample from the Bayesian network
D = random(BayesNet, dataset.empty(10, 0))

Sampling...

Creating column #1, 'cloudy' (1 of 4, 25.00%)...

Creating column #3, 'rain' (2 of 4, 50.00%)...

Creating column #2, 'sprinkler' (3 of 4, 75.00%)...

Creating column #4, 'wetGrass' (4 of 4, 100.00%)...

D = 

    cloudy    sprinkler    rain     wetGrass
    true      false        true     true    
    false     false        false    false   
    false     false        false    false   
    false     false        false    false   
    true      false        true     true    
    false     true         false    true    
    false     true         true     true    
    false     false        false    false   
    false     false        false    false   
    false     false        false    false
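The log above shows the columns being created parents first (cloudy, then rain and sprinkler, then wetGrass). One standard way to sample in such an order is ancestral sampling: draw each variable from its CPD given the already-drawn values of its parents. The following is a minimal sketch of that idea in base MATLAB, not the toolbox's actual implementation; all names are illustrative, and logical false/true stand in for the nominal levels.

```matlab
rng(0);        % fix the seed so the run is repeatable
nObs = 10;

% Probabilities of "true" taken from the demo's CPDs.
% Index 1 = parent is false, index 2 = parent is true.
pSprinklerTrue = [0.5; 0.1];          % P(sprinkler = true | cloudy)
pRainTrue      = [0.2; 0.8];          % P(rain = true | cloudy)
pWetTrue       = [0 0.9; 0.9 0.99];   % rows: sprinkler, columns: rain

% Draw each variable after its parents (a topological order of the DAG).
cloudy    = rand(nObs, 1) < 0.5;
rain      = rand(nObs, 1) < pRainTrue(cloudy + 1);
sprinkler = rand(nObs, 1) < pSprinklerTrue(cloudy + 1);
wetIdx    = sub2ind([2 2], sprinkler + 1, rain + 1);
wetGrass  = rand(nObs, 1) < pWetTrue(wetIdx);

% One row per observation; columns: cloudy, sprinkler, rain, wetGrass.
sample = [cloudy sprinkler rain wetGrass];
disp(sample);
```

Because P(wetGrass = true | sprinkler = false, rain = false) is 0, every row with sprinkler and rain both false must also have wetGrass false, as in the toolbox sample shown above.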