[Question] Different results every run #1128

Open
gusamarante opened this issue Jan 13, 2025 · 1 comment

@gusamarante

Description

I am trying to estimate a 4-state Gaussian Hidden Markov Model. Two things are happening:

  1. Every time I run the same code, I get different estimates for the parameters, even when setting a random seed. Is this the expected behavior?
  2. Even though the transition matrix edges are different every time, the computed steady-state distribution always ends up being the uniform distribution.

It may very well be the case that I am doing something wrong, but I went deep into the documentation and could not find anything that helps. Given the sample size, I believe there is no identification problem.

Reproduction Code

I am reading data from this Excel worksheet.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from pomegranate.distributions import Normal
from pomegranate.hmm import DenseHMM

n_states = 4

# READ DATA
file_path = r"path/to/file/NAVs.xlsx"
df = pd.read_excel(file_path, index_col=0)
df.index = pd.to_datetime(df.index)
rets = df.resample("M").last().pct_change(1).dropna()  # about 330 rows and 5 columns

# THE MODEL
hmm = DenseHMM(
    distributions=[Normal() for _ in range(n_states)],
    verbose=True,
)
hmm = hmm.fit(X=np.array([rets.values]))  # to make sure that X is 3D

# Transition Probability (changes every time I run)
trans_prob = pd.DataFrame(np.exp(np.array(hmm.edges)))
trans_prob = trans_prob.div(trans_prob.sum(axis=1), axis=0)  # Reduce numerical error

# Stationary distribution (always outputs a uniform distribution)
vals, vecs = np.linalg.eig(trans_prob)
stat_dist = pd.Series(vecs[:, np.argmax(vals)])
stat_dist = stat_dist * np.sign(stat_dist)
stat_dist = stat_dist / stat_dist.sum()
stat_dist.plot(kind="bar")
plt.show()

# States probabilities (changes every time I run)
state_probs = pd.DataFrame(data=hmm.predict_proba(np.array([rets.values]))[0], index=rets.index)
state_probs.plot()
plt.show()

# Predicted / Most likely State (changes every time I run)
state_pred = pd.Series(data=hmm.predict(np.array([rets.values]))[0], index=rets.index)
state_pred.plot()
plt.show()

gusamarante changed the title from "Different results every run" to "[Question] Different results every run" on Jan 13, 2025
@jmschrei (Owner)

Sorry you're encountering this issue.

The only source of randomness in pomegranate's HMMs should be in the initial clustering. The predictions do not involve randomness at all. How are you setting a seed? You might need to set both the numpy and torch random seeds. Unfortunately, torch is a bit challenging to make fully deterministic. Maybe you could try the first-k initialization and see? Can you post the code where you set the random seed?
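
As a rough sketch, seeding both libraries before the model is constructed would look something like the snippet below (the seed value 0 is arbitrary, and the torch.use_deterministic_algorithms call is optional; whether this is enough to make the initialization fully reproducible on your setup is something to verify):

import numpy as np
import torch

# Seed both RNGs before constructing and fitting the HMM.
np.random.seed(0)
torch.manual_seed(0)

# Optional: force torch to use only deterministic kernels. This can be
# slower and raises an error for ops without a deterministic implementation.
torch.use_deterministic_algorithms(True)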
