Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New update to ClusterSubspace.from_cutoffs can miss some clusters below cutoff #104

Closed
lbluque opened this issue Dec 12, 2020 · 8 comments · Fixed by #138
Closed

New update to ClusterSubspace.from_cutoffs can miss some clusters below cutoff #104

lbluque opened this issue Dec 12, 2020 · 8 comments · Fixed by #138
Assignees
Labels
bug Something isn't working

Comments

@lbluque
Copy link
Collaborator

lbluque commented Dec 12, 2020

Expected Behavior

When creating a ClusterSubspace.from_cutoffs all the clusters of given size below the provided cutoff should be obtained.

Current Behavior

The newest update which improved performance of orbit/cluster generation has too tight of a search window such that in some cases it may miss clusters that are within the given cutoff. For example for a system I am currently using setting the cutoff for clusters of size 4 to 3, misses a quadruplet of diameter 2.97. Setting the cutoff slightly higher to 3.1 correctly finds it.

Possible Solution

The optimal point where not too many unnecessary or ideally no clusters are discarded is needed to get the correct behavior.

@lbluque lbluque added the bug Something isn't working label Dec 12, 2020
@lbluque lbluque self-assigned this Dec 12, 2020
@lbluque
Copy link
Collaborator Author

lbluque commented Dec 12, 2020

Actually for pairs and triplets it seems to do just fine. Even setting the cutoff to exactly the diameter of a cluster finds that cluster. But for quads the issue remains.

@lbluque
Copy link
Collaborator Author

lbluque commented May 21, 2021

@zhongpc thanks for looking into this (#128). The solution of removing the division by 2 gives the correct results but makes the code a lot slower since the initial pool of clusters includes much larger clusters than those within the given cutoffs.

I am not sure if the issue actually comes from this function, or the pymatgen method for getting neighbors. It seems that could be were the issue arises, since I also notices pairs and triplets have no issue when compared with pyabinitio.

As a meanwhile solution, we could increase max_lp but I think the factor of 2 (as in pyabinitio) is too much, it would be better to have the smallest number that gives the correct results. And ideally check if there is actually some wrong behavior from pymatgen.

@qchempku2017
Copy link
Collaborator

qchempku2017 commented Sep 15, 2021

Recapped this issue with Peichen.

The purpose of adding an max_lp in your neighbor search diameter, is to gurantee that, measuring from the primitive cell centoid [0.5, 0.5, 0.5], any site in all your searched orbits of size N should not be further away than diameter + max_lp. So the max_lp should be chosen as the maximum possible distance from the centoid to another point INSIDE the primitive cell.

This distance is clearly not max(a,b,c)/2, And setting to max(a,b,c), like peichen did, should work, but is too large and not efficient. Geometrically, norm(vector_a+vector_b+vector_c)/2 is the minimun correct max_lp distance to use. And we should also add a small numerical tolerance, like 0.01

@qchempku2017
Copy link
Collaborator

And another point is that, in this line of code, you should append point clusters' coordinates as np.mod(site.frac_coords, 1), not site.frac_coords:

    for nbit, site, sbasis in zip(nbits, exp_struct, site_bases):

        new_orbit = Orbit([np.mod(site.frac_coords, 1)],
                          exp_struct.lattice,
                          [list(range(nbit))], [sbasis], symops)

@qchempku2017
Copy link
Collaborator

qchempku2017 commented Sep 15, 2021

Because for your radius search assumption to work, all the orbits in your searched cluster subspace must have at least one point inside the initial 0.5, 0.5,0.5 primitive cell, but the input format of your cluster expansion structure does not necessarily guarantee that. The fractional coordinates can be any real number, and is not automatically rounded to [0, 1]. So to make your algorithm robust to input data, I think we should mod the coordinates by 1 before feeding into your cluster search algorithm. And I think this is what pymatgen also does before finding neighbors

@qchempku2017
Copy link
Collaborator

BTW, since you are generating a list of neighbors for each cluster size (unlike in pyabinitio, where Danill used whatever maximum cluster radius in that dictionary), here is a new problem. You need to specify cutoff cluster diameters to monotonically decrease with cluster size.

Since you generate higher order clusters by appending new sites in the neighbor sphere to smaller clusters, if your lower order cluster cutoff is smaller than that of higher order, you may miss out some higher order clusters which could have been built based on longer low-order clusters.

This modification is excellent to improve computational cost, but you also need to add a line in the documentation to mention the need of a decreasing input.

@qchempku2017
Copy link
Collaborator

image

I have modified the mentioned points in the cn-sgmc branch. Please update master, and mark issue solved.

These tests in the attached picture are done on (LiMnVTi)(OF), with a=b=c=2.96984848, alpha=beta=gamma=60. Smol gives the same number of correlation functions as pyabinitio, and finishes search in slightly shorter time (23s vs 20s at 1231 ECIs), up to quintuplets, and total of 1231 ECIs.

@lbluque
Copy link
Collaborator Author

lbluque commented Sep 15, 2021

Thanks a lot @qchempku2017 !

I am cherry picking this fix into master, and running all tests to make sure things are working well.

The only last suggestion I have is adding a relative tolerance rather than an absolute one to handle different cell sizes appropriately, any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants