Add MLE estimation for Markov chains #658

Merged
merged 14 commits into master from markov_est on Dec 14, 2022

Conversation

@jstac (Contributor) commented Nov 27, 2022

Adds a straightforward implementation of MLE estimation for Markov transition matrices.

Estimation is from a single time series. States that are never visited in the data are removed from the state space before the MarkovChain object is constructed (since otherwise the matrix is not stochastic).

This PR solves one part of the problem in #640, which is more naturally separated out.
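For context, the estimator just counts observed transitions and normalizes each row of the count matrix, with never-visited states dropped. A minimal sketch of that idea (the name mle_transition_matrix is illustrative only, not the PR's actual API):

import numpy as np

def mle_transition_matrix(index_series):
    # Relabel observed states consecutively; np.unique drops any state
    # that never appears in the data.
    X = np.asarray(index_series)
    state_values, indices = np.unique(X, return_inverse=True)

    # Count observed transitions i -> j.
    n = len(state_values)
    counts = np.zeros((n, n))
    for t in range(len(indices) - 1):
        counts[indices[t], indices[t + 1]] += 1

    # MLE: P[i, j] = counts[i, j] / (number of departures from i).
    P = counts / counts.sum(axis=1, keepdims=True)
    return P, state_values

# Example: mle_transition_matrix([0, 0, 2, 0, 2, 2, 0]) gives
# P = [[1/3, 2/3], [2/3, 1/3]] over the visited states (0, 2).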

@jstac (Contributor, Author) commented Nov 27, 2022

All help getting this merged is appreciated! CC @mmcky @Smit-create

@oyamad (Member) left a comment

The same functionality in QuantEcon.jl is called estimate_mc_discrete. Use the same name?

@oyamad (Member) commented Nov 27, 2022

Another approach is to accept a series of state values (rather than indices) and let np.unique derive the corresponding series of indices as well as the values of the states that are visited.

import numpy as np

X = ['a', 'b', 'b', 'd', 'a']
state_values, indices = np.unique(X, return_inverse=True)

# state_values
# array(['a', 'b', 'd'], dtype='<U1')

# indices
# array([0, 1, 1, 2, 0])

Series of vector state values:

X = [[0, 0], [1, 2], [1, 2], [3, 3]]
state_values, indices = np.unique(X, return_inverse=True, axis=0)

# state_values
# array([[0, 0],
#        [1, 2],
#        [3, 3]])

# indices
# array([0, 1, 1, 2])

So the code would look like:

def estimate_P(X):
    X = np.asarray(X)
    axis = 0 if X.ndim > 1 else None
    state_values, indices = np.unique(X, return_inverse=True, axis=axis)
    
    n = len(state_values)
    P = np.zeros((n, n))  # dtype=float to modify in place
    _count_transition_frequencies(indices, P)
    P /= P.sum(1)[:, np.newaxis]
    
    mc = MarkovChain(P, state_values=state_values)
    return mc
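For completeness, _count_transition_frequencies is left undefined in the sketch above; a plain-Python stand-in (ignoring any jit compilation a real helper would likely use) could be:

def _count_transition_frequencies(index_series, trans_counter):
    # Tally each observed transition index_series[t] -> index_series[t+1]
    # into the pre-allocated count matrix, in place.
    for t in range(len(index_series) - 1):
        trans_counter[index_series[t], index_series[t + 1]] += 1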

@oyamad (Member) commented Nov 27, 2022

Just to recall (I had completely forgotten it), there is a thread of interesting discussions here in QuantEcon/QuantEcon.jl#161 from five years ago.

@coveralls commented Nov 27, 2022

Coverage Status

Coverage increased (+0.06%) to 95.313% when pulling 074d98e on markov_est into fcde8c9 on master.

@jstac (Contributor, Author) commented Nov 27, 2022

Thanks @oyamad for the thoughts. I've changed the function name as you suggested.

The discussion you mention was interesting. We have the same issue in #640 --- the need to estimate a discrete MC from continuous data.

I suggest that we merge this as is and then open issues for adding functionality, including the option you mention of passing in state values rather than indices.

@mmcky requested review from mmcky and oyamad on November 28, 2022 01:46
@mmcky (Contributor) left a comment

thanks @jstac. I will merge this in 24 hours pending any further public review.

from .core import MarkovChain

def estimate_mc_discrete(index_series, state_values=None):
    """
Comment from a Member:

Suggested change
"""
r"""

Comment on lines 61 to 67
new_n = len(state_values)
P = np.empty((new_n, new_n))

# Normalize
row_sum = np.sum(trans_counter, axis=1)
for i in range(new_n):
    P[i, :] = trans_counter[i, :] / row_sum[i]
Comment from a Member:

The for loop should be avoided when possible:

Suggested change
new_n = len(state_values)
P = np.empty((new_n, new_n))
# Normalize
row_sum = np.sum(trans_counter, axis=1)
for i in range(new_n):
    P[i, :] = trans_counter[i, :] / row_sum[i]
P = trans_counter / trans_counter.sum(1)[:, np.newaxis]

@oyamad (Member) left a comment

  • Register estimate_mc_discrete in markov/__init__.py (a one-line sketch follows below).
  • Run python qe_apidoc.py in the docs directory.
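For instance, the first item might amount to a one-line re-export (the module name estimate below is an assumption; the PR may place the function elsewhere):

# in quantecon/markov/__init__.py (module path is an assumption)
from .estimate import estimate_mc_discrete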

@oyamad (Member) commented Nov 28, 2022

We might ignore this case, but if index_series is e.g. [0, 0, 0, 1] (the last number appears only once), then this function (or the algorithm itself) does not work properly. (The same problem applies to the QuantEcon.jl version as well.)
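To see the failure concretely: state 1 occurs only as the final observation, so its row of transition counts is all zeros and the row normalization divides by zero. A minimal sketch of the problem (standalone counting code, not the PR's implementation):

import numpy as np

X = np.array([0, 0, 0, 1])        # state 1 appears only at the very end
n = X.max() + 1
counts = np.zeros((n, n))
for t in range(len(X) - 1):
    counts[X[t], X[t + 1]] += 1
# counts = [[2., 1.],
#           [0., 0.]]             <- no departures from state 1

with np.errstate(divide='ignore', invalid='ignore'):
    P = counts / counts.sum(1)[:, np.newaxis]
# P[1] = [nan, nan], so P is not a stochastic matrix.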

@jstac (Contributor, Author) commented Nov 29, 2022

Thanks @oyamad, much appreciated.

@mmcky, would you mind making those suggested changes and merging (or delegating and then merging)?

Let's ignore the edge case that @oyamad pointed out for now (although it's a good point).

@Smit-create (Member) commented

@jstac @mmcky I have rebased this branch on top of the latest master branch (to fix a doc error that existed in this branch but is fixed in master). I have also made the suggested changes pointed out by @oyamad. Thanks!

@Smit-create requested a review from oyamad on December 1, 2022 11:40
@jstac (Contributor, Author) commented Dec 1, 2022

Many thanks @Smit-create !

All looks good to me. It will be great to get this merged so we can finish up #640.

@jstac (Contributor, Author) commented Dec 3, 2022

@oyamad or @mmcky please merge when ready

@oyamad (Member) commented Dec 3, 2022

  1. This is a proposal for refactoring the current implementation (avoiding unnecessary array creations, fixing a bug in _fill_counters, and editing the docstrings to shorten the line lengths).

  2. This is a proposal for extending the function to accept a path of state values, as described in #658 (comment).

    >>> X = ([0, 0], [1, 0], [1, 1], [0, 1], [0, 0])
    >>> mc = qe.markov.estimate_mc_discrete(X)
    >>> mc.P
    array([[0., 0., 1., 0.],
           [1., 0., 0., 0.],
           [0., 0., 0., 1.],
           [0., 1., 0., 0.]])
    >>> mc.state_values
    array([[0, 0],
           [0, 1],
           [1, 0],
           [1, 1]])

    If you happen to have a path of state indices:

    >>> X_indices = (0, 2, 3, 1, 0)
    >>> state_values = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    >>> mc = qe.markov.estimate_mc_discrete(X_indices)
    >>> mc.state_values
    array([0, 1, 2, 3])
    >>> mc.state_values = state_values[mc.state_values]
    >>> mc.state_values
    array([[0, 0],
           [0, 1],
           [1, 0],
           [1, 1]])

But the second implementation is about 1.5x to 2x slower (seemingly due to sorting inside np.unique):

import numpy as np
import quantecon as qe

rng = np.random.default_rng(281334440973545911265024757027747152088)
n = 1_000
T = 1_000_000
X = rng.integers(n, size=T)

%timeit qe.markov.estimate_mc_discrete(X)

Version 1:

55.4 ms ± 2.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Version 2:

85.3 ms ± 3.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

@jstac (Contributor, Author) commented Dec 3, 2022

@oyamad Many thanks for these proposals. Number 1 is a clear improvement and should be adopted. Regarding 2, there is a speed trade-off but I am still in favor because, at least for macro models, these discretizations tend to happen once at the start, rather than repeatedly.

One small point on proposal 2: In most instances, when people generate a vector time series, they will generate it as a matrix where columns are observations, as in https://quanteconpy.readthedocs.io/en/latest/tools/lss.html#quantecon.lss.simulate_linear_model. Should we handle that case?

@oyamad (Member) commented Dec 4, 2022

@jstac I did not realize that the output of simulate_linear_model stores observations in columns. In that particular case, one can pass x.T.

I would like to keep the simple assumption "X is any array_like such that X[t] contains the t-th observation". (I added a note in 8296b62.)
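For instance (a sketch assuming the Version 2 behavior described above, and reusing the series from that example):

import numpy as np
import quantecon as qe

# Columns as observations, the layout simulate_linear_model produces
# (discrete values here purely for illustration).
x = np.array([[0, 1, 1, 0, 0],
              [0, 0, 1, 1, 0]])   # shape (k, T) = (2, 5)

# Transposing restores the assumed convention X[t] = t-th observation;
# x.T is [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]], the same series as in
# the Version 2 example, so mc.P and mc.state_values match it.
mc = qe.markov.estimate_mc_discrete(x.T)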

@jstac (Contributor, Author) commented Dec 4, 2022

@oyamad

I would like to keep the simple assumption "X is any array_like such that X[t] contains the t-th observation".

I think you are right. I'm in favor of all these proposed changes.

@jstac (Contributor, Author) commented Dec 4, 2022

I propose that we change estimate_mc_discrete to estimate_mc because, at least in economics, a Markov chain always has a finite state space. (I realize the same is not true in all of mathematics.)

If not, then we should change it to estimate_finite_mc, which is accurate, sounds more natural and hence is easier to remember.

@oyamad (Member) commented Dec 5, 2022

That makes sense.
Maybe we want to be consistent with the function called fit_discrete_mc in #640 (comment)?

  • estimate_mc / fit_mc
  • estimate_finite_mc / fit_finite_mc

(I have no strong preference over these. The shorter the better?)

@jstac (Contributor, Author) commented Dec 5, 2022

In my mind, "fit" sounds like we are fitting continuous data to a discrete model. Hence it's appropriate for #640 but not here.

So, following your principle that shorter is better, let's go for estimate_mc.

@oyamad (Member) commented Dec 5, 2022

Changed the function name to estimate_mc in 8547cba.

("fit_discrete_mc" is fine with me, but just to argue for logical consistency: I am with you for the convention that a Markov chain is intrinsically finite, but if we follow that convention, isn't "discrete_mc" redundant?)

@jstac (Contributor, Author) commented Dec 5, 2022

"fit_discrete_mc" is fine with me, but just to argue for logical consistency: I am with you for the convention that a Markov chain is intrinsically finite, but if we follow that convention, isn't "discrete_mc" redundant?

Yes, good point. You are right @oyamad .

Perhaps redundancy is sometimes useful to provide emphasis.

(Information theory says that redundancy is the key to communication in a noisy channel...)

@oyamad (Member) commented Dec 14, 2022

If we go with Version 2 in #658 (comment), we can merge markov_est_unique branch into this PR and then merge this PR itself into master.

@jstac (Contributor, Author) commented Dec 14, 2022

If we go with Version 2 in #658 (comment), we can merge markov_est_unique branch into this PR and then merge this PR itself into master.

Agreed, and thanks. Please go ahead @oyamad or @Smit-create

@oyamad (Member) left a comment

Ready to merge.

@Smit-create (Member) commented

Merging, thanks @oyamad, @jstac and @mmcky
