-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New update to ClusterSubspace.from_cutoffs can miss some clusters below cutoff #104
Comments
Actually for pairs and triplets it seems to do just fine. Even setting the cutoff to exactly the diameter of a cluster finds that cluster. But for quads the issue remains. |
@zhongpc thanks for looking into this (#128). The solution of removing the division by 2 gives the correct results but makes the code a lot slower since the initial pool of clusters includes much larger clusters than those within the given cutoffs. I am not sure if the issue actually comes from this function, or the pymatgen method for getting neighbors. It seems that could be were the issue arises, since I also notices pairs and triplets have no issue when compared with pyabinitio. As a meanwhile solution, we could increase |
Recapped this issue with Peichen. The purpose of adding an max_lp in your neighbor search diameter, is to gurantee that, measuring from the primitive cell centoid [0.5, 0.5, 0.5], any site in all your searched orbits of size N should not be further away than diameter + max_lp. So the max_lp should be chosen as the maximum possible distance from the centoid to another point INSIDE the primitive cell. This distance is clearly not max(a,b,c)/2, And setting to max(a,b,c), like peichen did, should work, but is too large and not efficient. Geometrically, norm(vector_a+vector_b+vector_c)/2 is the minimun correct max_lp distance to use. And we should also add a small numerical tolerance, like 0.01 |
And another point is that, in this line of code, you should append point clusters' coordinates as np.mod(site.frac_coords, 1), not site.frac_coords:
|
Because for your radius search assumption to work, all the orbits in your searched cluster subspace must have at least one point inside the initial 0.5, 0.5,0.5 primitive cell, but the input format of your cluster expansion structure does not necessarily guarantee that. The fractional coordinates can be any real number, and is not automatically rounded to [0, 1]. So to make your algorithm robust to input data, I think we should mod the coordinates by 1 before feeding into your cluster search algorithm. And I think this is what pymatgen also does before finding neighbors |
BTW, since you are generating a list of neighbors for each cluster size (unlike in pyabinitio, where Danill used whatever maximum cluster radius in that dictionary), here is a new problem. You need to specify cutoff cluster diameters to monotonically decrease with cluster size. Since you generate higher order clusters by appending new sites in the neighbor sphere to smaller clusters, if your lower order cluster cutoff is smaller than that of higher order, you may miss out some higher order clusters which could have been built based on longer low-order clusters. This modification is excellent to improve computational cost, but you also need to add a line in the documentation to mention the need of a decreasing input. |
I have modified the mentioned points in the cn-sgmc branch. Please update master, and mark issue solved. These tests in the attached picture are done on (LiMnVTi)(OF), with a=b=c=2.96984848, alpha=beta=gamma=60. Smol gives the same number of correlation functions as pyabinitio, and finishes search in slightly shorter time (23s vs 20s at 1231 ECIs), up to quintuplets, and total of 1231 ECIs. |
Thanks a lot @qchempku2017 ! I am cherry picking this fix into master, and running all tests to make sure things are working well. The only last suggestion I have is adding a relative tolerance rather than an absolute one to handle different cell sizes appropriately, any thoughts? |
Expected Behavior
When creating a
ClusterSubspace.from_cutoffs
all the clusters of given size below the provided cutoff should be obtained.Current Behavior
The newest update which improved performance of orbit/cluster generation has too tight of a search window such that in some cases it may miss clusters that are within the given cutoff. For example for a system I am currently using setting the cutoff for clusters of size 4 to 3, misses a quadruplet of diameter 2.97. Setting the cutoff slightly higher to 3.1 correctly finds it.
Possible Solution
The optimal point where not too many unnecessary or ideally no clusters are discarded is needed to get the correct behavior.
The text was updated successfully, but these errors were encountered: