-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge devel to master to release v0.2.23 #794
Conversation
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.9.2 → v0.9.3](astral-sh/ruff-pre-commit@v0.9.2...v0.9.3) <!--pre-commit.ci end--> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.9.3 → v0.9.4](astral-sh/ruff-pre-commit@v0.9.3...v0.9.4) <!--pre-commit.ci end--> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!--pre-commit.ci start--> updates: - [github.com/astral-sh/ruff-pre-commit: v0.9.4 → v0.9.6](astral-sh/ruff-pre-commit@v0.9.4...v0.9.6) <!--pre-commit.ci end--> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
) Refactor the codes to read and write ABACUS/STRU, and move the functions in a single file abaucs/stru.py Now, now using dpdata.system to read an ABACUS STRU will also return below informations in data dict: ``` { "masses": list of atomic masses, "pp_files", list of pseudo potential files, "orb_files", list of orbital files, "dpks_descriptor": the deepks descriptor file, } ``` And, these information can also be written to a new STRU file automatically. Later, I will based on this commit to fix the bug in dpgen deepmodeling/dpgen#1711 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a dedicated module for structure file handling, enhancing the parsing and conversion of lattice, species, and coordinate data. - **Refactor** - Streamlined data extraction processes for simulation and relaxation workflows, reducing redundant operations and improving error clarity. - Updated plugin methods to leverage the enhanced structure processing functions for improved efficiency. - **Tests** - Improved test setups and cleanups, ensuring consistent handling of structure files and robust validation of the new parsing logic. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: root <pxlxingliang> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
CodSpeed Performance ReportMerging #794 will not alter performanceComparing Summary
|
📝 WalkthroughWalkthroughThis pull request upgrades the ruff linter version in the pre-commit configuration and significantly refactors the ABACUS modules. The changes remove legacy functions for cell and coordinate extraction and consolidate logic into new STRU file parsing functions. A dedicated module is introduced to handle STRU files, replacing scattered functionality in multiple modules. Test setups and teardown routines are updated accordingly, and method calls in the Abacus plugin are simplified by leveraging the new STRU functions. Changes
Sequence Diagram(s)sequenceDiagram
participant Plugin as AbacusSTRUFormat
participant STRU as dpdata.abacus.stru
participant System as Data System
Plugin->>STRU: get_frame_from_stru(file_name)
STRU->>STRU: Parse STRU file (split blocks, extract positions, etc.)
STRU-->>Plugin: Return structured data
Plugin->>STRU: make_unlabeled_stru(data, frame_idx)
STRU-->>Plugin: Generate unlabeled STRU file output
Possibly related PRs
Suggested reviewers
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (15)
dpdata/abacus/relax.py (1)
211-212
: Expandingmove
for each frame
The same'move'
value is assigned to all frames. Confirm this is intended. If per-frame values differ, consider storing a distinct entry for each frame.dpdata/abacus/scf.py (1)
213-217
: Return early if not converged
This prevents processing partially converged states. Consider logging a warning or an error to highlight incomplete data.dpdata/abacus/md.py (1)
219-221
: Expand'move'
across MD frames
Assigning the first'move'
entry to all frames may be fine if'move'
is constant. Otherwise, consider a per-frame value.dpdata/abacus/stru.py (6)
57-82
: Add error checks for empty inputCurrently, if the
lines
list is empty or malformed, attempting to split on line 70 may raise an index error. Consider adding a guard clause to raise a descriptive error if the block is unexpectedly empty or missing fields.
126-270
: Consider a more modular approachThis function handles many potential keywords and values within a single routine, which can be harder to maintain or extend. You could split the parsing of different aspects (move, velocity, mag, angles, etc.) into smaller helper functions or use a more structured data format to improve readability.
272-315
: Consolidateisinstance
checks and use a ternary operatorYou can simplify the logic by merging the
isinstance
checks foratommag
and replacing theif
-else
block formag_norm
with a ternary operator. For example:- if not (isinstance(atommag, list) or isinstance(atommag, float)): + if not isinstance(atommag, (list, float)): raise RuntimeError(f"Invalid atommag: {atommag}") ... - if isinstance(atommag, list): - mag_norm = np.linalg.norm(atommag) - else: - mag_norm = atommag + mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag🧰 Tools
🪛 Ruff (0.8.2)
291-291: Multiple
isinstance
calls foratommag
, merge into a single callMerge
isinstance
calls foratommag
(SIM101)
306-309: Use ternary operator
mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag
instead ofif
-else
-blockReplace
if
-else
-block withmag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag
(SIM108)
317-336
: Correct the spelling of "carteisan_coords"The function name and docstring wording contain a small typo. Consider renaming to improve clarity and consistency with standard terminology:
-def get_carteisan_coords(coords, coord_type, celldm, cell): +def get_cartesian_coords(coords, coord_type, celldm, cell):
338-415
: Rename unused loop variable and consider ternary operators
- The loop variable
iline
(line 375) is never referenced; rename it to_
or remove it.- The blocks at lines 396-399 and 401-404 could be expressed as ternary assignments to simplify:
-for iline in range(atom_numbs[it]): +for _ in range(atom_numbs[it]): ... -if all([i is None for i in move]): - move = [] -else: - move = np.array(move, dtype=bool) +move = [] if all(i is None for i in move) else np.array(move, dtype=bool) -if all([i is None for i in velocity]): - velocity = [] -else: - velocity = np.array(velocity) +velocity = [] if all(i is None for i in velocity) else np.array(velocity)🧰 Tools
🪛 Ruff (0.8.2)
375-375: Loop control variable
iline
not used within loop bodyRename unused
iline
to_iline
(B007)
396-399: Use ternary operator
move = [] if all([i is None for i in move]) else np.array(move, dtype=bool)
instead ofif
-else
-blockReplace
if
-else
-block withmove = [] if all([i is None for i in move]) else np.array(move, dtype=bool)
(SIM108)
401-404: Use ternary operator
velocity = [] if all([i is None for i in velocity]) else np.array(velocity)
instead ofif
-else
-blockReplace
if
-else
-block withvelocity = [] if all([i is None for i in velocity]) else np.array(velocity)
(SIM108)
487-787
: Refactor for maintainability and consider stacklevel in warningsThis function is quite large and handles many details (linking files, processing multiple optional fields, etc.). Splitting it into smaller helpers can improve readability. Also, consider including
stacklevel=2
(or higher) forwarnings.warn
calls to help callers locate the warning source:-warnings.warn( +warnings.warn( "pp_file is not provided, will use empty string for pseudo potential file.", + stacklevel=2 )🧰 Tools
🪛 Ruff (0.8.2)
623-627: Use a single
if
statement instead of nestedif
statements(SIM102)
641-641: No explicit
stacklevel
keyword argument found(B028)
tests/test_abacus_relax.py (6)
21-24
: Consolidate repeated file-copy logicCopying the same files in multiple test classes can be centralized to reduce duplication. Consider using a shared setup fixture or helper function for better maintainability.
30-31
: Tear-down file removal repeatedFile removal logic is repeated in various tearDown methods. Encapsulating this step in a shared routine or test fixture might simplify future modifications.
100-103
: Use a shared setup for STRU file copiesSimilar to the previous tests, copying "STRU.h2o" to "STRU" can be managed by a common fixture or helper, reducing duplication across test classes.
119-120
: Repeated teardown logicRemoving "STRU" is consistent with other test classes. Consolidating this pattern would improve overall code clarity.
139-142
: Suggest shared teardown utilityAgain, the same removal logic appears here. Moving it into a shared teardown utility or fixture can help reduce boilerplate and errors.
185-188
: Unify test cleanupRepeating file removal steps in each teardown method can be refactored by leveraging a common cleanup strategy or a utility function shared by these test classes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.pre-commit-config.yaml
(1 hunks)dpdata/abacus/md.py
(3 hunks)dpdata/abacus/relax.py
(4 hunks)dpdata/abacus/scf.py
(2 hunks)dpdata/abacus/stru.py
(1 hunks)dpdata/plugins/abacus.py
(3 hunks)tests/test_abacus_relax.py
(5 hunks)tests/test_abacus_stru_dump.py
(5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
dpdata/abacus/stru.py
291-291: Multiple isinstance
calls for atommag
, merge into a single call
Merge isinstance
calls for atommag
(SIM101)
306-309: Use ternary operator mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag
instead of if
-else
-block
Replace if
-else
-block with mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag
(SIM108)
375-375: Loop control variable iline
not used within loop body
Rename unused iline
to _iline
(B007)
396-399: Use ternary operator move = [] if all([i is None for i in move]) else np.array(move, dtype=bool)
instead of if
-else
-block
Replace if
-else
-block with move = [] if all([i is None for i in move]) else np.array(move, dtype=bool)
(SIM108)
401-404: Use ternary operator velocity = [] if all([i is None for i in velocity]) else np.array(velocity)
instead of if
-else
-block
Replace if
-else
-block with velocity = [] if all([i is None for i in velocity]) else np.array(velocity)
(SIM108)
623-627: Use a single if
statement instead of nested if
statements
(SIM102)
641-641: No explicit stacklevel
keyword argument found
(B028)
🔇 Additional comments (19)
dpdata/abacus/relax.py (4)
17-17
: Import new function fromstru
This addition is consistent with the refactoring approach. No issues found.
49-49
: Enhanced error message
This clarified text about atom number mismatch is helpful for debugging.
183-187
: Confirm logic for removing spins
Removing spins fromdata
might be fine, but double-check that there's no scenario requiring spin data later on. Also verify that"atom_numbs"
reliably exists indata
.
194-194
: Use ofget_coords_from_log
The function call appears correct. Please ensure that thenatoms
argument aligns with the expected data across all usage sites.dpdata/abacus/scf.py (6)
11-12
: Added imports for length and pressure conversions
These imports align with the code’s approach to unit handling. No concerns.
205-205
: Minor comment modification
No functional change introduced here.
218-224
: Centralizing data retrieval withget_frame_from_stru
Ensure"atom_numbs"
is always defined in the returned dictionary. Removing spins is likely correct for SCF scenarios, but confirm you never need the spin data further.
226-226
: Comment on retrieving magmom and forces
No logical changes are made here.
238-238
: Initialize'orig'
to zero
This sets a default origin. Confirm there's no requirement to retain a non-zero origin from prior data.
240-241
: Scaling stress by cell volume
Usingdata["cells"][0]
might be fine for single-frame SCF. For multi-frame scenarios, consider scaling stress frame-by-frame instead.dpdata/abacus/md.py (2)
16-16
: Importget_frame_from_stru
This aligns with the refactoring strategy for STRU-based data retrieval.
166-171
: Adopt new method for reading geometry
Usingget_frame_from_stru
is consistent with other modules. Confirm removing spins is valid for MD workflows or if they’re needed.dpdata/abacus/stru.py (2)
14-55
: Well-structured parsing logicThe block-based parsing approach here is clear and well documented. The implementation neatly handles multiple keyword-defined sections in the ABACUS STRU file. Good job!
417-485
: Comprehensive file parsing logicThis function verifies the file path and parses all necessary blocks for atomic and lattice data. The step-by-step workflow is well documented, and the return structure is clear. No pressing issues found.
dpdata/plugins/abacus.py (1)
11-11
: Streamlined imports and usageImporting and using
get_frame_from_stru
andmake_unlabeled_stru
directly improves readability and avoids referencing removed module-level functions. No issues found with this refactoring.Also applies to: 24-24, 42-47
tests/test_abacus_stru_dump.py (3)
10-10
: LGTM!The import statement has been correctly updated to reflect the renamed function.
32-42
: LGTM! Well-structured test case.The new test method thoroughly validates the STRU file reading functionality by checking:
- Presence of required files (pp_files, orb_files, dpks_descriptor)
- Content validation for atomic species, numerical orbitals, and descriptors
211-211
: LGTM!The test method has been correctly renamed to match the updated function name.
Also applies to: 213-213
.pre-commit-config.yaml (1)
24-24
:✅ Verification successful
Verify the Ruff version.
Let's check if v0.9.6 is the latest stable version of Ruff.
🌐 Web query:
What is the latest stable version of Ruff?
💡 Result:
The latest stable version of Ruff is 0.9.6, released on February 10, 2025[2]. This version is available for installation via
pip
and includes recent updates such as Python 3.13 compatibility and improved linting rules[2][3].For comparison:
- Version 0.9.2 introduced fixes for notebook diagnostics and rule improvements[5].
- Version 0.8.6 included updates to string formatting and error handling[5].
Ruff follows a custom versioning scheme where minor versions indicate breaking changes and patch versions denote bug fixes[4].
Citations:
- 1: https://ruff.io/docs/release-notes.html
- 2: https://pypi.org/project/ruff/
- 3: https://docs.astral.sh/ruff/
- 4: https://docs.astral.sh/ruff/versioning/
- 5: https://github.com/astral-sh/ruff/releases
- 6: https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md
- 7: https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff
Ruff version is up-to-date.
The
rev: v0.9.6
update in.pre-commit-config.yaml
correctly reflects the latest stable release of Ruff (released February 10, 2025), so no changes are needed.
if move is not None: | ||
data["move"] = move |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Preserving 'move'
data
In other modules, 'move'
is expanded for multiple frames. Verify that returning it unmodified here is consistent with the rest of the code.
Summary by CodeRabbit
Chores
New Features
Refactor
Tests