Continuous Emission Hidden Markov Models

Continuous Emission Hidden Markov Models

AUTHOR:

  • William Stein, 2010-03
class sage.stats.hmm.chmm.GaussianHiddenMarkovModel

Bases: sage.stats.hmm.hmm.HiddenMarkovModel

GaussianHiddenMarkovModel(A, B, pi)

Gaussian emissions Hidden Markov Model.

INPUT:

  • A – matrix; the N x N transition matrix
  • B – list of pairs (mu,sigma) that define the distributions
  • pi – initial state probabilities
  • normalize –bool (default: True)

EXAMPLES:

We illustrate the primary functions with an example 2-state Gaussian HMM:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,1), (-1,1)], [.5,.5]); m
Gaussian Hidden Markov Model with 2 States
Transition matrix:
[0.1 0.9]
[0.5 0.5]
Emission parameters:
[(1.0, 1.0), (-1.0, 1.0)]
Initial probabilities: [0.5000, 0.5000]

We query the defining transition matrix, emission parameters, and initial state probabilities:

sage: m.transition_matrix()
[0.1 0.9]
[0.5 0.5]
sage: m.emission_parameters()
[(1.0, 1.0), (-1.0, 1.0)]
sage: m.initial_probabilities()
[0.5000, 0.5000]

We obtain a sample sequence with 10 entries in it, and compute the logarithm of the probability of obtaining his sequence, given the model:

sage: obs = m.sample(10); obs
[-1.6835, 0.0635, -2.1688, 0.3043, -0.3188, -0.7835, 1.0398, -1.3558, 1.0882, 0.4050]
sage: m.log_likelihood(obs)
-15.2262338077988...

We compute the Viterbi path, and probability that the given path of states produced obs:

sage: m.viterbi(obs)
([1, 0, 1, 0, 1, 1, 0, 1, 0, 1], -16.67738270170788)

We use the Baum-Welch iterative algorithm to find another model for which our observation sequence is more likely:

sage: m.baum_welch(obs)
(-10.6103334957397..., 14)
sage: m.log_likelihood(obs)
-10.6103334957397...

Notice that running Baum-Welch changed our model:

sage: m
Gaussian Hidden Markov Model with 2 States
Transition matrix:
[   0.415498136619    0.584501863381]
[   0.999999317425 6.82574625899e-07]
Emission parameters:
[(0.417888242712, 0.517310966436), (-1.50252086313, 0.508551283606)]
Initial probabilities: [0.0000, 1.0000]
baum_welch(obs, max_iter=500, log_likelihood_cutoff=0.0001, min_sd=0.01, fix_emissions=False, v=False)

Given an observation sequence obs, improve this HMM using the Baum-Welch algorithm to increase the probability of observing obs.

INPUT:

  • obs – a time series of emissions
  • max_iter – integer (default: 500) maximum number of Baum-Welch steps to take
  • log_likelihood_cutoff – positive float (default: 1e-4); the minimal improvement in likelihood with respect to the last iteration required to continue. Relative value to log likelihood.
  • min_sd – positive float (default: 0.01); when reestimating, the standard deviation of emissions is not allowed to be less than min_sd.
  • fix_emissions – bool (default: False); if True, do not change emissions when updating

OUTPUT:

  • changes the model in places, and returns the log likelihood and number of iterations.

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.log_likelihood([-2,-1,.1,0.1])
-8.858282215986275
sage: m.baum_welch([-2,-1,.1,0.1])
(4.534646052182..., 7)
sage: m.log_likelihood([-2,-1,.1,0.1])
4.534646052182...
sage: m
Gaussian Hidden Markov Model with 2 States
Transition matrix:
[   0.999999999243 7.56983939444e-10]
[   0.499984627912    0.500015372088]
Emission parameters:
[(0.1, 0.01), (-1.49995081476, 0.50007105049)]
Initial probabilities: [0.0000, 1.0000]

We illustrate bounding the standard deviation below. Note that above we had different emission parameters when the min_sd was the default of 0.01:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.baum_welch([-2,-1,.1,0.1], min_sd=1)
(-4.07939572755..., 32)
sage: m.emission_parameters()
[(-0.2663018798..., 1.0), (-1.99850979..., 1.0)]

We watch the log likelihoods of the model converge, step by step:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: v = m.sample(10)
sage: stats.TimeSeries([m.baum_welch(v,max_iter=1)[0] for _ in range(len(v))])
[-20.1167, -17.7611, -16.9814, -16.9364, -16.9314, -16.9309, -16.9309, -16.9309, -16.9309, -16.9309]

We illustrate fixing emissions:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.9,.1]], [(1,2),(-1,.5)], [.3,.7])
sage: set_random_seed(0); v = m.sample(100)
sage: m.baum_welch(v,fix_emissions=True)
(-164.72944548204..., 23)
sage: m.emission_parameters()
[(1.0, 2.0), (-1.0, 0.5)]
sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.9,.1]], [(1,2),(-1,.5)], [.3,.7])
sage: m.baum_welch(v)
(-162.854370397998..., 49)
sage: m.emission_parameters()
[(1.27224191726, 2.37136875176), (-0.948617467518, 0.576236038512)]
emission_parameters()

Return the parameters that define the normal distributions associated to all of the states.

OUTPUT:

  • a list B of pairs B[i] = (mu, std), such that the distribution associated to state i is normal with mean mu and standard deviation std.

EXAMPLES:

sage: hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9]).emission_parameters()
[(1.0, 0.5), (-1.0, 3.0)]
generate_sequence(length, starting_state=None)

Return a sample of the given length from this HMM.

INPUT:

  • length – positive integer
  • starting_state – int (or None); if specified then generate a sequence using this model starting with the given state instead of the initial probabilities to determine the starting state.

OUTPUT:

  • an IntList or list of emission symbols
  • TimeSeries of emissions

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.generate_sequence(5)
([-3.0505, 0.5317, -4.5065, 0.6521, 1.0435], [1, 0, 1, 0, 1])
sage: m.generate_sequence(0)
([], [])
sage: m.generate_sequence(-1)
Traceback (most recent call last):
...
ValueError: length must be nonnegative

Example in which the starting state is 0 (see trac 11452):

sage: set_random_seed(23);  m.generate_sequence(2)
([0.6501, -2.0151], [0, 1])

Force a starting state of 1 even though as we saw above it would be 0:

sage: set_random_seed(23);  m.generate_sequence(2, starting_state=1)
([-3.1491, -1.0244], [1, 1])

Verify numerically that the starting state is 0 with probability about 0.1:

sage: set_random_seed(0)
sage: v = [m.generate_sequence(1)[1][0] for i in range(10^5)]
sage: 1.0 * v.count(int(0)) / len(v)
0.0998200000000000
log_likelihood(obs)

Return the logarithm of a continuous analogue of the probability that this model produced the given observation sequence.

Note that the “continuous analogue of the probability” above can be bigger than 1, hence the logarithm can be positive.

INPUT:

  • obs – sequence of observations

OUTPUT:

  • float

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.log_likelihood([1,1,1])
-4.297880766072486
sage: set_random_seed(0); s = m.sample(20)
sage: m.log_likelihood(s)
-40.115714129484...
viterbi(obs)

Determine “the” hidden sequence of states that is most likely to produce the given sequence seq of observations, along with the probability that this hidden sequence actually produced the observation.

INPUT:

  • seq – sequence of emitted ints or symbols

OUTPUT:

  • list – “the” most probable sequence of hidden states, i.e., the Viterbi path.
  • float – log of probability that the observed sequence was produced by the Viterbi sequence of states.

EXAMPLES:

We find the optimal state sequence for a given model:

sage: m = hmm.GaussianHiddenMarkovModel([[0.5,0.5],[0.5,0.5]], [(0,1),(10,1)], [0.5,0.5])
sage: m.viterbi([0,1,10,10,1])
([0, 0, 1, 1, 0], -9.0604285688230...)

Another example in which the most likely states change based on the last observation:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.viterbi([-2,-1,.1,0.1])
([1, 1, 0, 1], -9.61823698847639...)
sage: m.viterbi([-2,-1,.1,0.3])
([1, 1, 1, 0], -9.566023653378513)
class sage.stats.hmm.chmm.GaussianMixtureHiddenMarkovModel

Bases: sage.stats.hmm.chmm.GaussianHiddenMarkovModel

GaussianMixtureHiddenMarkovModel(A, B, pi)

Gaussian mixture Hidden Markov Model.

INPUT:

  • A – matrix; the N x N transition matrix
  • B – list of mixture definitions for each state. Each state may have a varying number of gaussians with selection probabilities that sum to 1 and encoded as (p,(mu,sigma))
  • pi – initial state probabilities
  • normalize –bool (default: True); if given, input is normalized to define valid probability distributions, e.g., the entries of A are made nonnegative and the rows sum to 1, and the probabilities in pi are normalized.

EXAMPLES:

sage: A  = [[0.5,0.5],[0.5,0.5]]
sage: B  = [[(0.9,(0.0,1.0)), (0.1,(1,10000))],[(1,(1,1)), (0,(0,0.1))]]
sage: hmm.GaussianMixtureHiddenMarkovModel(A, B, [1,0])
Gaussian Mixture Hidden Markov Model with 2 States
Transition matrix:
[0.5 0.5]
[0.5 0.5]
Emission parameters:
[0.9*N(0.0,1.0) + 0.1*N(1.0,10000.0), 1.0*N(1.0,1.0) + 0.0*N(0.0,0.1)]
Initial probabilities: [1.0000, 0.0000]

TESTS:

If a standard deviation is 0, it is normalized to be slightly bigger than 0.:

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [[(1,(0,0))]], [1])
Gaussian Mixture Hidden Markov Model with 1 States
Transition matrix:
[1.0]
Emission parameters:
[1.0*N(0.0,1e-08)]
Initial probabilities: [1.0000]

We test that number of emission distributions must be the same as the number of states:

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [], [1])
Traceback (most recent call last):
...
ValueError: number of GaussianMixtures must be the same as number of entries of pi

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [[]], [1])
Traceback (most recent call last):
...
ValueError: must specify at least one component of the mixture model

We test that the number of initial probabilities must equal the number of states:

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [[]], [1,2])
Traceback (most recent call last):
...
ValueError: number of entries of transition matrix A must be the square of the number of entries of pi
baum_welch(obs, max_iter=1000, log_likelihood_cutoff=1e-12, min_sd=0.01, fix_emissions=False)

Given an observation sequence obs, improve this HMM using the Baum-Welch algorithm to increase the probability of observing obs.

INPUT:

  • obs – a time series of emissions
  • max_iter – integer (default: 1000) maximum number of Baum-Welch steps to take
  • log_likelihood_cutoff – positive float (default: 1e-12); the minimal improvement in likelihood with respect to the last iteration required to continue. Relative value to log likelihood.
  • min_sd – positive float (default: 0.01); when reestimating, the standard deviation of emissions is not allowed to be less than min_sd.
  • fix_emissions – bool (default: False); if True, do not change emissions when updating

OUTPUT:

  • changes the model in places, and returns the log likelihood and number of iterations.

EXAMPLES:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: set_random_seed(0); v = m.sample(10); v
[0.3576, -0.9365, 0.9449, -0.6957, 1.0217, 0.9644, 0.9987, -0.5950, -1.0219, 0.6477]
sage: m.log_likelihood(v)
-8.31408655939536...
sage: m.baum_welch(v)
(2.18905068682..., 15)
sage: m.log_likelihood(v)
2.18905068682...
sage: m
Gaussian Mixture Hidden Markov Model with 2 States
Transition matrix:
[   0.874636333977    0.125363666023]
[              1.0 1.45168520229e-40]
Emission parameters:
[0.500161629343*N(-0.812298726239,0.173329026744) + 0.499838370657*N(0.982433690378,0.029719932009), 1.0*N(0.503260056832,0.145881515324)]
Initial probabilities: [0.0000, 1.0000]

We illustrate bounding the standard deviation below. Note that above we had different emission parameters when the min_sd was the default of 0.01:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: m.baum_welch(v, min_sd=1)
(-12.617885761692..., 1000)
sage: m.emission_parameters()
[0.503545634447*N(0.200166509595,1.0) + 0.496454365553*N(0.200166509595,1.0), 1.0*N(0.0543433426535,1.0)]

We illustrate fixing all emissions:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: set_random_seed(0); v = m.sample(10)
sage: m.baum_welch(v, fix_emissions=True)
(-7.58656858997..., 36)
sage: m.emission_parameters()
[0.4*N(0.0,1.0) + 0.6*N(1.0,0.1), 1.0*N(0.0,1.0)]
emission_parameters()

Returns a list of all the emission distributions.

OUTPUT:

  • list of Gaussian mixtures

EXAMPLES:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: m.emission_parameters()
[0.4*N(0.0,1.0) + 0.6*N(1.0,0.1), 1.0*N(0.0,1.0)]
sage.stats.hmm.chmm.unpickle_gaussian_hmm_v0(A, B, pi, name)

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[1]], [(0,1)], [1])
sage: sage.stats.hmm.chmm.unpickle_gaussian_hmm_v0(m.transition_matrix(), m.emission_parameters(), m.initial_probabilities(), 'test')
Gaussian Hidden Markov Model with 1 States
Transition matrix:
[1.0]
Emission parameters:
[(0.0, 1.0)]
Initial probabilities: [1.0000]
sage.stats.hmm.chmm.unpickle_gaussian_hmm_v1(A, B, pi, prob, n_out)

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[1]], [(0,1)], [1])
sage: loads(dumps(m)) == m   # indirect test
True
sage.stats.hmm.chmm.unpickle_gaussian_mixture_hmm_v1(A, B, pi, mixture)

EXAMPLES:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[1]], [[(.4,(0,1)), (.6,(1,0.1))]], [1])
sage: loads(dumps(m)) == m   # indirect test
True

Previous topic

Hidden Markov Models

Next topic

Distributions used in implementing Hidden Markov Models

This Page