Density heatmap for large datasets #314
This would be an excellent feature, but it is unclear (1) how much support there is for this kind of idea in plotly and (2) how much this would weigh on the memory footprint of the dataset and widget. Chemiscope is built assuming that everything can be made portable, and even the dynamical loading of structures is something we never exploited much. Perhaps one possibility would be to still have only a few hardcoded representative structures in the dataset, but add volumetric data that can be visualized in plotly, to give a better sense of the distribution of data. In this sense, one could imagine providing "shape" data for the property panel, similar to what we recently added for structures. This way, one could visualize a convex hull, or do something like a volumetric plot of the density of points. Perhaps it would help advance the discussion if you explained what problem you are facing and want to solve.
Thanks for the fast reply and consideration. I'm hoping to use chemiscope to visualise structures stored in the NOMAD database. The dream is to have an interactive plot that updates in "near real time" as a user specifies/adjusts their query. This is an example query and there are already some interactive widgets. The whole database contains ~10 million structures, so I suspect (but admit I haven't checked this...) that visualising large queries will be very slow atm and that some sort of heatmap + dynamical loading of structure data is the way to go. We're planning to precompute averaged SOAP vectors (element agnostic) and MACE descriptors (learned alchemical embedding) for every structure. We would then use some combination of PCA and parametric UMAP for the dimensional reduction, depending on the size of the query.
In terms of your points:
Since this is intended for a specific deployment of chemiscope at NOMAD, another possible solution would be to replace the map widget entirely, and only re-use the other parts of the code. You could write a new widget with whatever technology works best to display the heatmap and link it to the chemiscope structure viewer, loading structures on-demand.
I think one possibility that would combine a lot of advantages and be relatively easy would be to have the query generate a chemiscope .json that is then loaded dynamically, "sparsified" to show some representative structures. If you then zoom in, there could be a button to update the view, re-generating a .json for that section.
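To make the "sparsify, then re-generate on zoom" idea a bit more concrete, here is a minimal sketch in plain Python. It only picks which structure indices would go into a regenerated file for the current view; it does not write a chemiscope .json. The function name `sparsify_view`, the `view` tuple layout, and the uniform random subsampling are all assumptions made for illustration, not part of chemiscope.

```python
import random


def sparsify_view(points, view, max_points, seed=0):
    """Pick at most `max_points` indices of 2D points inside `view`.

    `points` is a sequence of (x, y) map coordinates and `view` is
    (xmin, xmax, ymin, ymax). Both names are hypothetical.
    """
    xmin, xmax, ymin, ymax = view
    # Keep only the points that fall inside the current zoom window.
    visible = [
        i
        for i, (x, y) in enumerate(points)
        if xmin <= x <= xmax and ymin <= y <= ymax
    ]
    if len(visible) <= max_points:
        return visible
    # Too many points: fall back to a uniform random subsample
    # (an assumption; any other selection scheme could be swapped in).
    rng = random.Random(seed)
    return sorted(rng.sample(visible, max_points))
```

Pressing an "update view" button would then amount to calling this again with the new zoom window and building the dataset from the returned indices.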
For very large datasets it would be nice to have the option of replacing the scatter plot with a density heatmap. I'm imagining loading a random structure from each "bin" and maybe dynamically updating the binning with the zoom level.
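As a rough sketch of what I mean by binning, the following plain-Python function counts points on a regular grid and keeps one random representative index per bin. The name `density_bins`, the `bounds` layout, and the use of single-item reservoir sampling are my own assumptions for illustration; nothing here is existing chemiscope or plotly API.

```python
import random


def density_bins(points, n_bins, bounds, seed=0):
    """Bin 2D points on an n_bins x n_bins grid.

    Returns {(ix, iy): (count, representative_index)}; empty bins
    are omitted. `bounds` is (xmin, xmax, ymin, ymax).
    """
    xmin, xmax, ymin, ymax = bounds
    rng = random.Random(seed)
    bins = {}
    for index, (x, y) in enumerate(points):
        # Skip points outside the current view.
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue
        # Points on the upper edge are clamped into the last bin.
        ix = min(int((x - xmin) / (xmax - xmin) * n_bins), n_bins - 1)
        iy = min(int((y - ymin) / (ymax - ymin) * n_bins), n_bins - 1)
        count, representative = bins.get((ix, iy), (0, None))
        count += 1
        # Reservoir sampling with one slot: the k-th point seen in a bin
        # replaces the representative with probability 1/k.
        if rng.randrange(count) == 0:
            representative = index
        bins[(ix, iy)] = (count, representative)
    return bins
```

The counts would feed the heatmap, and the representative index is the structure to load when a bin is selected. Updating the binning with the zoom level would just mean calling the function again with the new `bounds`.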
Happy to have a go at this myself but keen for any suggestions!
@Luthaf already suggested using a custom loadStructure callback for visualising the structures on demand.