[Feature Request]: Quick class method to get cluster count and cluster composition #436

Open
1 task done
Jeff-oakley opened this issue Oct 27, 2023 · 7 comments

Comments

@Jeff-oakley

Problem

Can we have a class method to get the cluster count (in the indicator basis) and the cluster composition more easily? Alternatively, can I double check whether the code below correctly gives the number of clusters we have?

# Assuming cs is a defined cluster subspace

feature_multiplicity = [1] # First cluster is empty cluster

for orbit in cs.orbits:
    feature_multiplicity.extend([orbit.multiplicity for _ in range(len(orbit.bit_combos))])

feature_multiplicity.append(1) # Last cluster is 1/dielectric constant

Proposed Solution

Add two class methods? For the cluster count, maybe just use my code above, or a corrected version if my code is not correct.

For the cluster composition, it would need a bit of work to map bit_combos into a string. Is there a class attribute that maps the indices in bit_combos to specific species?

Alternatives

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@kamronald
Collaborator

kamronald commented Oct 27, 2023

Hi Bin,

If you use cs.num_corr_functions, that should give you what you're looking for. I'm assuming that by "cluster" you mean correlation function. If you mean actual geometric clusters of sites, then cs.num_clusters is also available.

For your second point, we do not have this feature yet. To get the relationship between bit_combos and a species (only in indicator basis) I think the best option right now is to print out a site space:
cs.orbits[0].site_bases[0].site_space
or
from smol.cofe.space.domain import get_site_spaces
get_site_spaces(cs.structure)
and the bit combo should correspond to the index of the species listed in the site_space. I think it would be a good idea to add the feature you mention.
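
As a minimal sketch of that lookup, assuming an indicator basis in which each entry of a bit combo is the index of a species in the corresponding site's site space. The helper label_bit_combo and the plain lists standing in for site spaces are hypothetical and not part of smol:

```python
def label_bit_combo(bits, site_species):
    """Join the species selected by each bit index into a label like 'Mn-Mn'.

    bits: one integer per site in the cluster (assumed to index the site space).
    site_species: one list of species names per site, in site-space order.
    """
    if len(bits) != len(site_species):
        raise ValueError("need one bit index per site in the cluster")
    return "-".join(species[bit] for bit, species in zip(bits, site_species))


# Hypothetical pair cluster whose two sites allow the same species.
pair_species = [["Mn", "Ni", "Li"], ["Mn", "Ni", "Li"]]
print(label_bit_combo((0, 0), pair_species))  # Mn-Mn
print(label_bit_combo((0, 1), pair_species))  # Mn-Ni
```

In practice the species lists would be read from the printed site spaces rather than typed by hand.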

@Jeff-oakley
Author

Thanks Ronald. cs.num_corr_functions only gives the dimensionality of the feature matrix. I am asking how to obtain something like "how many Mn-Mn dimers we have with the geometry defined by a specific orbit". This is essentially the value in StructureWrangler.feature_matrix, but it has to be scaled up somehow. The code I provided is how I think the feature_matrix can be scaled up to reflect the number of clusters in each orbit with a given species decoration. Is that correct?

@kamronald
Collaborator

kamronald commented Oct 27, 2023

Thanks for clarifying, I think I understand your question better now. So you want to multiply a value of your correlation vector in the feature matrix by a certain value (N) to obtain a concentration, and you are trying to obtain N?
If that is so, I think your for loop should be changed to:

for orbit in cs.orbits:
    feature_multiplicity.extend([orbit.multiplicity * len(arr) for arr in orbit.bit_combos])

Your code as written would only multiply the matrix element by the orbit multiplicity. However the composition of a cluster decoration may be degenerate by value len(arr), so you should multiply by that degeneracy as well.

@kamronald
Collaborator

kamronald commented Oct 27, 2023

Actually thinking about it again, @Jeff-oakley I think you were right the first time. That extra multiplicity I mentioned shows up in the orbit but not in the correlation function, I believe.
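
For reference, a self-contained sketch of that first version, with one orbit multiplicity per correlation function. StubOrbit is a hypothetical stand-in that only mimics the multiplicity and bit_combos attributes used in this thread, and the has_trailing_term flag is an illustrative name for the extra last entry in the original snippet:

```python
class StubOrbit:
    """Hypothetical stand-in exposing only the attributes used in this thread."""

    def __init__(self, multiplicity, bit_combos):
        self.multiplicity = multiplicity
        self.bit_combos = bit_combos


def feature_multiplicities(orbits, has_trailing_term=True):
    """Return one multiplicity per feature, in feature-matrix column order."""
    mults = [1]  # the empty cluster comes first
    for orbit in orbits:
        # One entry per correlation function, each carrying the orbit multiplicity.
        mults.extend([orbit.multiplicity] * len(orbit.bit_combos))
    if has_trailing_term:
        mults.append(1)  # extra last term, as in the original snippet
    return mults


orbits = [
    StubOrbit(multiplicity=1, bit_combos=[[(0,)], [(1,)]]),
    StubOrbit(multiplicity=3, bit_combos=[[(0, 0)], [(0, 1), (1, 0)], [(1, 1)]]),
]
print(feature_multiplicities(orbits))  # [1, 1, 1, 3, 3, 3, 1]
```

Note that the second bit combo of the pair orbit has two equivalent orderings but still contributes a single entry, which is the point of difference between the two loops discussed above.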

@lbluque
Collaborator

lbluque commented Nov 7, 2023

Hi @Jeff-oakley and @kamronald,

The correlation function multiplicities should already be available through the cs.function_ordering_multiplicities property.

As you mention, the only way right now to obtain the total number of specific cluster occupations (such as your Mn-Mn dimer example) is to use a cs with an indicator basis (assuming the occupation you want is included). To do so, simply pass normalized=False to cs.corr_from_structure; equivalently, if you are using a ClusterExpansionProcessor, it should already be computing the extensive value. I would double check that I am not off by a multiplicity factor somewhere...
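
The arithmetic involved can be sketched as follows, under the assumption (not confirmed in this thread) that each normalized correlation value is an average per cluster and per primitive cell, so that an extensive value is recovered by multiplying by a per-function multiplicity and by the supercell size. The function name and its inputs are hypothetical, and which multiplicity factors actually apply is exactly the point that still needs double checking:

```python
def extensive_values(normalized_corr, multiplicities, num_prim_cells):
    """Scale normalized correlations to extensive values.

    Assumes value * multiplicity * num_prim_cells is the right scaling,
    which is an assumption of this sketch and not a confirmed smol convention.
    """
    if len(normalized_corr) != len(multiplicities):
        raise ValueError("need one multiplicity per correlation function")
    return [
        corr * mult * num_prim_cells
        for corr, mult in zip(normalized_corr, multiplicities)
    ]


# Made-up numbers for illustration only.
print(extensive_values([1.0, 0.5, 0.25], [1, 2, 6], num_prim_cells=4))
# [4.0, 4.0, 6.0]
```

If a factor is missing from the scaling, the result can come out as a non-integer, so a fractional "count" is a useful signal that the multiplicities need another look.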

For the case of other basis functions, I have code that is not fully tested to obtain the transformation matrix needed to compute cluster counts from correlation vectors, but I have not had the time to clean it up and fully test it. However I would be happy to push it to a dev branch in case you are interested.

@Jeff-oakley
Author

Thanks Ronald and Luis! I will look into that.

The indicator basis should be good enough for now, but it would be great to have the cluster count for other bases as well :)

@Jeff-oakley
Author

Hi Luis, I tried what you recommended, but I obtain fractional numbers of clusters. Why would we have 0.08333333333333333 of a cluster? Or maybe the cluster counting is not correct? I prepared an example, attached below:
github_debug.zip

@Jeff-oakley Jeff-oakley reopened this Feb 24, 2024