With the tools we have so far, we can only define very simple
probability distributions. In fact, if we define a distribution over
a tuple type, the different components will be independent of each
other.
For example, try <a : flip 0.5, b : flip 0.5>,
which produces the result
*.a | *.b
[True; False] | [True; False]
---------------|---------------
[True] | [True] | 0.25
[True] | [False] | 0.25
[False] | [True] | 0.25
[False] | [False] | 0.25
We see in this result that the components a and b are
independent. We can introduce dependence and correlations by defining
variables.
When we define a variable, we associate its name with some expression. That expression stochastically returns an outcome, which becomes the value of the variable. We can then refer to the name of the variable in some subsequent expression, and get its value.
The basic form for defining variables is the let
expression, which has the form let v = e_1 in e_2. The name
of the variable, v, must be an identifier.
e_1 is the expression defining v, while e_2 is the
expression in which v may appear.
This let expression can be understood as defining an experiment
that first runs e_1, assigns the outcome to v, and
then runs e_2 and returns its outcome, using the assigned value
of v wherever v appears.
From a probabilistic point of view, e_1 defines a probability
distribution over the value of v, while e_2 defines a
conditional distribution over the result given v.
For example, try
let x = flip 0.5 in <a : x, b : x>
The result is
*.a | *.b
[True; False] | [True; False]
---------------|---------------
[True] | [True] | 0.5
[False] | [False] | 0.5
Compare this result to the result of
<a : flip 0.5, b : flip 0.5> from above.
In that case, *.a and *.b were independent of each other.
Now, they are either both True or both False, because both
are equal to the value of x.
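The difference between the two expressions can be checked by exact enumeration outside the language. The following is a minimal Python sketch, not part of the system described here; the helper names flip, independent_pair and shared_pair are illustrative assumptions. It represents a distribution as a dictionary from outcomes to probabilities.

```python
from fractions import Fraction
from itertools import product


def flip(p):
    """Distribution of a biased coin as {outcome: probability}."""
    p = Fraction(p)
    return {True: p, False: 1 - p}


def independent_pair():
    """Mirrors <a : flip 0.5, b : flip 0.5>: each component runs its own flip."""
    dist = {}
    for (a, pa), (b, pb) in product(flip("0.5").items(), repeat=2):
        dist[(a, b)] = dist.get((a, b), 0) + pa * pb
    return dist


def shared_pair():
    """Mirrors let x = flip 0.5 in <a : x, b : x>: one flip, used twice."""
    dist = {}
    for x, px in flip("0.5").items():
        dist[(x, x)] = dist.get((x, x), 0) + px
    return dist
```

In this sketch, independent_pair() assigns 1/4 to each of the four combinations, while shared_pair() puts 1/2 on each of the two outcomes in which both components agree, matching the two tables above.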
There is another way to define a variable: in a pattern appearing in a
case statement.
The pattern v, where v is an identifier, is similar to the
pattern _, in that it matches any value.
In addition, it defines v to be the value matched.
For example, try
case <true, 'hello> of <_, x> : x
Here the pattern x matches the second component and binds x to 'hello, so the expression returns 'hello.
In fact, there is a more general form of the let expression:
let pat = e_1 in e_2, where pat is a pattern.
This expression can be understood as evaluating e_1, matching
the outcome to pat, and binding any variables in pat
accordingly. Then e_2 is evaluated, using the bound values for
variables appearing in the pattern.
This form is useful for defining multiple variables in a single
expression.
For example:
let <x, 'a, y, _> = <true, 'a, 'hello, flip 0.9> in <a : x, b : y>
With variable definitions, we have the tools to construct Bayesian networks. The following defines the classical three-node Burglary-Earthquake-Alarm network.
let b = flip 0.01 in
let e = flip 0.001 in
let a = case <b,e> of
# <false, false> : flip 0.01
# <false, true> : flip 0.1
# <true, false> : flip 0.7
# <true, true> : flip 0.8 in
< burglary : b, earthquake : e, alarm : a >
This is a general way to define a Bayesian network. For each node, we
have a let expression. For root nodes, the definition is a
simple dist expression (or flip for binary nodes). For
nodes with parents, the definition is a case over the values of
the parents, defining a conditional probability table that specifies
a probability distribution over the node for each assignment of values
to the parents. After all the nodes have been defined, they are
bundled together into a tuple. The probability distribution over the
tuple is the distribution defined by the Bayesian network.
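To make the last point concrete, the joint distribution of the Burglary-Earthquake-Alarm network can be computed by multiplying, for each assignment, the probability of each node given its parents. The following Python sketch does this by brute-force enumeration, using the probabilities from the program above. It is only an illustration of the semantics, not a description of how the system performs inference, and the names P_BURGLARY, P_EARTHQUAKE, P_ALARM, bernoulli, joint and alarm_marginal are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# Root-node probabilities, taken from the let definitions of b and e.
P_BURGLARY = Fraction(1, 100)
P_EARTHQUAKE = Fraction(1, 1000)

# Conditional probability table for the alarm, indexed by (b, e),
# taken from the clauses of the case expression.
P_ALARM = {
    (False, False): Fraction(1, 100),
    (False, True): Fraction(1, 10),
    (True, False): Fraction(7, 10),
    (True, True): Fraction(8, 10),
}


def bernoulli(p, value):
    """Probability that a flip with parameter p yields the given value."""
    return p if value else 1 - p


def joint():
    """Joint distribution over (burglary, earthquake, alarm)."""
    dist = {}
    for b, e, a in product([True, False], repeat=3):
        dist[(b, e, a)] = (
            bernoulli(P_BURGLARY, b)
            * bernoulli(P_EARTHQUAKE, e)
            * bernoulli(P_ALARM[(b, e)], a)
        )
    return dist


def alarm_marginal():
    """Marginal probability that the alarm is True."""
    return sum(p for (b, e, a), p in joint().items() if a)
```

The eight probabilities returned by joint() sum to 1, and alarm_marginal() evaluates to 169901/10000000, that is, about 0.017.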
Instead of full conditional probability tables, structured conditional probability tables can easily be defined. For example, we can say that alarm depends on earthquake only when burglary is false (an example of context-specific independence), with
...
let a = case <b,e> of
# <false, false> : flip 0.01
# <false, true> : flip 0.1
# <true, _> : flip 0.75 in
...
We could also make the alarm node a noisy-or node:
... let a = (b & flip 0.7) | (e & flip 0.3) in ...
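The conditional probability table implied by this noisy-or expression can be worked out by enumerating the two independent flips for each assignment of b and e. The Python sketch below does so; it assumes that & and | denote logical and and or, and the names flip and noisy_or_cpt are illustrative assumptions rather than part of the language.

```python
from fractions import Fraction
from itertools import product


def flip(p):
    """Distribution of a biased coin as {outcome: probability}."""
    p = Fraction(p)
    return {True: p, False: 1 - p}


def noisy_or_cpt():
    """P(alarm = True | b, e) for a = (b & flip 0.7) | (e & flip 0.3)."""
    cpt = {}
    for b, e in product([False, True], repeat=2):
        p_true = Fraction(0)
        # The two flips are separate expressions, hence independent.
        for (f1, p1), (f2, p2) in product(flip("0.7").items(), flip("0.3").items()):
            if (b and f1) or (e and f2):
                p_true += p1 * p2
        cpt[(b, e)] = p_true
    return cpt
```

Under these assumptions the alarm never goes off when both parents are false, goes off with probability 0.3 for an earthquake alone and 0.7 for a burglary alone, and with probability 1 - 0.3 * 0.7 = 0.79 when both occur.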