Indices of model fit quality for SSMs (R2) #46
Replies: 1 comment 2 replies
-
@DominiqueMakowski, I think you are correct on both counts. Using measures of absolute fit is necessary to determine whether the best-fitting model is adequate, and absolute measures can help diagnose deficiencies in the model. I also think you are right that R2 is not the right measure, because we want to know how well the model fits the entire distribution. I think there are a few reasonable approaches. One would be to compare the probability mass between different quantiles of the conditional RT distributions. For example, the probability mass between the .4 and .45 quantiles might be .15 for correct responses, but the 95% credible interval of the posterior predictive distribution for this statistic might range between .16 and .19. You could possibly do something with the conditional CDF, but a downside, in my opinion, is that a misfit in the leading edge of the distribution would carry through (e.g., shift the CDF to the left or right for all values > t). Yet another idea would be to use a distance metric between the conditional distributions; the challenge would be selecting the right metric. KL divergence is popular but has the disadvantage of being asymmetric. Wasserstein distance is symmetric but more challenging to compute.
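A minimal sketch of the quantile-mass check and the two distance metrics mentioned above, using synthetic lognormal samples as stand-ins for observed correct-response RTs and posterior predictive draws (nothing here is tied to a particular SSM package, and the lognormal parameters are arbitrary):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Synthetic stand-in for observed correct-response RTs; in practice the
# predictive draws below would come from the fitted SSM's posterior
observed = rng.lognormal(-0.5, 0.4, size=2_000)

# --- Quantile-mass check ---------------------------------------------
# RT interval spanned by the .40 and .45 quantiles of the observed data
lo, hi = np.quantile(observed, [0.40, 0.45])
observed_mass = np.mean((observed >= lo) & (observed < hi))  # ~.05 by construction

n_draws = 500
pred_mass = np.empty(n_draws)
for i in range(n_draws):
    sim = rng.lognormal(-0.45, 0.42, size=2_000)  # one predictive draw
    pred_mass[i] = np.mean((sim >= lo) & (sim < hi))
ci_low, ci_high = np.quantile(pred_mass, [0.025, 0.975])

# --- Distance metrics ------------------------------------------------
predicted = rng.lognormal(-0.45, 0.42, size=2_000)
w = wasserstein_distance(observed, predicted)  # symmetric, works on raw samples

# KL divergence needs density estimates; a shared histogram is a crude one
edges = np.histogram_bin_edges(np.concatenate([observed, predicted]), bins=50)
p, _ = np.histogram(observed, bins=edges, density=True)
q, _ = np.histogram(predicted, bins=edges, density=True)
mask = (p > 0) & (q > 0)  # KL is undefined where either density is zero
width = np.diff(edges)[0]
kl_pq = np.sum(p[mask] * np.log(p[mask] / q[mask])) * width
kl_qp = np.sum(q[mask] * np.log(q[mask] / p[mask])) * width

print(f"observed mass {observed_mass:.3f}, predictive 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
print(f"Wasserstein {w:.4f}; KL(p||q) {kl_pq:.4f} vs KL(q||p) {kl_qp:.4f}")
```

The asymmetry of KL shows up directly as KL(p||q) ≠ KL(q||p), while the Wasserstein distance is computed straight from the two samples without any density estimation step.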
-
I've been searching the literature for examples or discussions of assessing the fit quality and/or predictive power of SSMs, but I have found nothing.
I think the Bayes factor comparison enabled by Pigeons is an amazing feature, but having an "absolute" index (like an R2) would be a good addition. Especially since a model (in particular a basic one with no predictor variables) could in principle be very "good" by posterior predictive checks (the predicted distribution exactly matches the data distribution) yet have an R2 of 0 (no individual observation is predicted at all).
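That point can be made concrete with a toy simulation (plain NumPy, synthetic data, not tied to any SSM): an intercept-only "model" reproduces the marginal data distribution perfectly yet predicts nothing at the level of individual observations.

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.normal(0.8, 0.3, size=10_000)  # stand-in for observed RTs

# Intercept-only model: every observation is "predicted" by the mean
y_hat = np.full_like(y, y.mean())

# Classical R^2 = 1 - SS_res / SS_tot is exactly 0 here
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Yet draws from the fitted marginal distribution match the data closely,
# so a posterior predictive check would look fine
sim = rng.normal(y.mean(), y.std(), size=10_000)
print(f"R^2 = {r2:.3f}, |mean diff| = {abs(y.mean() - sim.mean()):.3f}")
```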
The first of two problems I can foresee is that SSMs are fundamentally distributional, but that's not necessarily an issue in itself: Bayesian R2 exists and also returns a distribution, from which one can compute the mean and CI.
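For reference, a sketch of how a common Bayesian R2 formulation yields a distribution: it is computed once per posterior draw as var(fit) / (var(fit) + var(residual)). The regression and its "posterior draws" below are synthetic stand-ins rather than output of a real sampler:

```python
import numpy as np

rng = np.random.default_rng(7)
n, n_draws = 200, 1_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=1.0, size=n)

# Stand-in posterior draws for the intercept and slope of a simple regression
alpha = rng.normal(0.0, 0.05, size=n_draws)
beta = rng.normal(0.5, 0.05, size=n_draws)

r2 = np.empty(n_draws)
for s in range(n_draws):
    fit = alpha[s] + beta[s] * x          # linear predictor for draw s
    res = y - fit
    r2[s] = np.var(fit) / (np.var(fit) + np.var(res))

# One R2 value per draw: summarize with a mean and credible interval
print(f"Bayesian R^2: mean {r2.mean():.2f}, "
      f"95% CI [{np.quantile(r2, 0.025):.2f}, {np.quantile(r2, 0.975):.2f}]")
```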
The second is that SSMs jointly predict two outcomes (RT and choice). But I wonder whether it would make sense, in principle, to compute, say, the R2 for the RTs (within each choice) and, separately, the R2 for the choice (using an R2 variant made for categorical outcomes). Of course, by doing that separately we lose some nuance, but is this approach fundamentally misguided?
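For the categorical half of that split, one candidate R2 variant is Tjur's coefficient of discrimination: the mean predicted P(correct) among correct trials minus the mean among error trials. A minimal sketch, with synthetic trial-level probabilities standing in for model-implied choice probabilities:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
p_true = rng.beta(5, 2, size=n)     # trial-level P(correct response)
choice = rng.binomial(1, p_true)    # 1 = correct, 0 = error
p_hat = p_true                      # suppose the model recovers these exactly

# Tjur's R^2: separation of predicted probabilities by observed outcome
tjur_r2 = p_hat[choice == 1].mean() - p_hat[choice == 0].mean()
print(f"Tjur R^2 for choices: {tjur_r2:.3f}")

# A model that predicts the base rate for every trial scores exactly 0
p_flat = np.full(n, choice.mean())
tjur_flat = p_flat[choice == 1].mean() - p_flat[choice == 0].mean()
print(f"Tjur R^2, base-rate model: {tjur_flat:.3f}")
```

Like the ordinary R2, this is 0 for a base-rate-only model even when that model reproduces the marginal choice proportions perfectly, which mirrors the distribution-vs-observation distinction raised above.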