mutual information between different high dimensional continuous signal #10
Comments
Those dimensionalities are a little high for the nearest neighbor based estimators. It might work if the signal is intrinsically low dimensional.
Sorry, but I don't have much knowledge of upper or lower bounds for MI. Following your suggestion about dimensionality reduction: if I apply PCA to the 300-dim signal but the first components do not capture most of the variance, do you think it's still OK to use MI after this? I have also applied MINE (code found on GitHub) to my data; yes, for higher dimensions (for example, over 10) it is not stable, although I am not sure the code is correct.
I've heard from others that MINE is actually not that stable, so it may not just be you. You can always do dimensionality reduction to get a LOWER bound on mutual information. So if you take the first K components and then apply NPEET, that can be interpreted as an estimator for a lower bound on mutual information. It's hard to tell how good the lower bound will be... but you could try using K=10 or 20 components and see how stable your estimate looks. If K is too small, you will probably lose a lot of information. If K is too large, the NPEET estimates will be unstable. What would be cool is to make a plot of the mutual information estimate using different K, with error bars constructed using the shuffle/permutation test. You should see the MI estimate go up with K, but then at some point the error bars will become large. Hopefully you can find a good middle ground (i.e., a K which has large mutual information estimates, but small error bars).
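The sweep described above could be sketched roughly as follows. This is not the author's code: it uses a minimal self-contained KSG-style knn estimator as a stand-in for NPEET's estimator, a simple shuffle/permutation null, and synthetic data with a hidden low-dimensional signal; all names and parameter choices here are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma
from sklearn.decomposition import PCA

def ksg_mi(x, y, k=3):
    """Minimal KSG (Kraskov et al., algorithm 1) knn estimate of I(X;Y) in nats.
    A stand-in sketch for NPEET's MI estimator; x, y are (n_samples, dim) arrays."""
    n = len(x)
    xy = np.hstack([x, y])
    # Chebyshev distance to the k-th nearest neighbor in the joint space
    d = cKDTree(xy).query(xy, k=k + 1, p=np.inf)[0][:, -1]
    tx, ty = cKDTree(x), cKDTree(y)
    # count neighbors strictly within that distance in each marginal space
    nx = np.array([len(tx.query_ball_point(x[i], d[i] - 1e-12, p=np.inf)) - 1
                   for i in range(n)])
    ny = np.array([len(ty.query_ball_point(y[i], d[i] - 1e-12, p=np.inf)) - 1
                   for i in range(n)])
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

def shuffle_null(x, y, n_perm=10, k=3, seed=None):
    """Shuffle/permutation test: MI estimates under the null of independence."""
    rng = np.random.default_rng(seed)
    return np.array([ksg_mi(x, y[rng.permutation(len(y))], k=k)
                     for _ in range(n_perm)])

# Illustrative synthetic data: a hidden 5-dim signal drives both observed signals.
rng = np.random.default_rng(0)
n = 400
z = rng.normal(size=(n, 5))
x = z @ rng.normal(size=(5, 300)) + 0.5 * rng.normal(size=(n, 300))  # 300-dim
y = z @ rng.normal(size=(5, 21)) + 0.5 * rng.normal(size=(n, 21))    # 21-dim

for K in (2, 5, 10, 20):
    xk = PCA(n_components=K).fit_transform(x)
    mi = ksg_mi(xk, y)
    null = shuffle_null(xk, y, n_perm=10, seed=1)
    print(f"K={K:3d}  MI={mi:.3f}  null={null.mean():.3f} +/- {null.std():.3f}")
```

Plotting `mi` against K with the null spread as error bars gives the curve discussed above: the estimate should rise with K until the estimator degrades.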
Some of the issues with MINE are discussed in this paper:
I have tried increasing the number of PCA components applied to my 300-dim signal, then computed MI between the reduced signal and my other 21-dim (y) signal. I have used this. From this plot, should I choose K=52 as my optimum value?
Interesting plot, thanks! I think it does make sense to pick K=52 as optimal. The way the decrease looked on the right side surprised me. My prediction was that error bars would get large, and I didn't predict the decrease. I looked around for some literature on this but didn't find anything. A little discussion about what (might!) be going on.
One other thought: It's probably hard to come up with results that say that MI is systematically under-estimated in high dimensions because it is always possible that the data still lie approximately on a low dimensional manifold (in which case we'd still expect knn estimators to work). However, when you do PCA, all the components are normalized so that as you add more top components the effective dimensionality stays large, even if that isn't true for the original data.
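The normalization point can be seen directly with a whitened PCA, sketched here under the assumption that "normalized" means unit-variance components (scikit-learn's `whiten=True`): even when the raw spectrum decays quickly, whitened components all carry equal spread, so the effective dimensionality the knn estimator sees stays high.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Data with a rapidly decaying spectrum: nearly low-dimensional.
x = rng.normal(size=(1000, 20)) * (0.9 ** np.arange(20))

raw = PCA(n_components=10).fit_transform(x)                  # original scales
white = PCA(n_components=10, whiten=True).fit_transform(x)   # unit variances

print(np.round(raw.std(axis=0, ddof=1), 2))    # decreasing spread
print(np.round(white.std(axis=0, ddof=1), 2))  # all ~1.0
```

With the raw projection, later components are nearly negligible; after whitening they contribute as much as the first, which matches the remark above.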
Hello Prof. Greg Ver Steeg,
I want to compute MI between two high-dimensional continuous time-varying signals. Their dimensions are 39 and 300. It seems like this toolbox is not suitable for that. Do you know if there is any easy way to measure MI in this situation?