merge devel to master to release v0.2.23 #794

Merged
merged 5 commits into deepmodeling:master from njzjz:devel-0.2.23 on Feb 14, 2025

Conversation

njzjz
Member

@njzjz njzjz commented Feb 14, 2025

Summary by CodeRabbit

  • Chores

    • Upgraded the pre-commit configuration to improve code quality.
  • New Features

    • Introduced robust parsing and generation for simulation structure files, enhancing the reliability of data extraction.
  • Refactor

    • Streamlined the processing of structural and atomic information, resulting in clearer error messages and more consistent output.
  • Tests

    • Optimized file management during test setups and teardowns to ensure reliable handling of simulation files.

njzjz and others added 5 commits January 24, 2025 17:42
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.9.2 → v0.9.3](astral-sh/ruff-pre-commit@v0.9.2...v0.9.3)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.9.3 → v0.9.4](astral-sh/ruff-pre-commit@v0.9.3...v0.9.4)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
<!--pre-commit.ci start-->
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.9.4 → v0.9.6](astral-sh/ruff-pre-commit@v0.9.4...v0.9.6)
<!--pre-commit.ci end-->

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Refactor the code that reads and writes ABACUS STRU files, and move the functions
into a single file, abacus/stru.py.

Now, using dpdata.System to read an ABACUS STRU file will also return the
following information in the data dict:
```
{
    "masses": list of atomic masses,
    "pp_files": list of pseudopotential files,
    "orb_files": list of orbital files,
    "dpks_descriptor": the DeePKS descriptor file,
}
```
This information can also be written back to a new STRU file automatically.
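
For example, a minimal usage sketch (the file names and the example values in the comments are illustrative, not taken from this PR):

```python
import dpdata

# Read an ABACUS STRU file; besides coordinates and cell, the data dict now also
# carries the extra keys described above.
system = dpdata.System("STRU", fmt="abacus/stru")

print(system.data.get("masses"))           # e.g. [15.999, 1.008]
print(system.data.get("pp_files"))         # e.g. ["O.upf", "H.upf"]
print(system.data.get("orb_files"))        # e.g. ["O.orb", "H.orb"]
print(system.data.get("dpks_descriptor"))  # e.g. "jle.orb"

# The same information is carried over automatically when dumping a new STRU file.
system.to("abacus/stru", "STRU.new")
```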


Later, I will build on this commit to fix the bug in dpgen
deepmodeling/dpgen#1711.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
  - Introduced a dedicated module for structure file handling, enhancing the parsing and conversion of lattice, species, and coordinate data.

- **Refactor**
  - Streamlined data extraction processes for simulation and relaxation workflows, reducing redundant operations and improving error clarity.
  - Updated plugin methods to leverage the enhanced structure processing functions for improved efficiency.

- **Tests**
  - Improved test setups and cleanups, ensuring consistent handling of structure files and robust validation of the new parsing logic.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: root <pxlxingliang>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@njzjz njzjz enabled auto-merge February 14, 2025 07:55

codspeed-hq bot commented Feb 14, 2025

CodSpeed Performance Report

Merging #794 will not alter performance

Comparing njzjz:devel-0.2.23 (bd42e6a) with master (5b7af6f)

Summary

✅ 2 untouched benchmarks

@njzjz njzjz disabled auto-merge February 14, 2025 07:56

coderabbitai bot commented Feb 14, 2025

📝 Walkthrough

Walkthrough

This pull request upgrades the ruff linter version in the pre-commit configuration and significantly refactors the ABACUS modules. The changes remove legacy functions for cell and coordinate extraction and consolidate logic into new STRU file parsing functions. A dedicated module is introduced to handle STRU files, replacing scattered functionality in multiple modules. Test setups and teardown routines are updated accordingly, and method calls in the Abacus plugin are simplified by leveraging the new STRU functions.

Changes

| File(s) | Change Summary |
| --- | --- |
| .pre-commit-config.yaml | Updated the ruff-pre-commit revision from v0.9.2 to v0.9.6. |
| dpdata/abacus/md.py, dpdata/abacus/relax.py | Removed get_cell and get_coords; integrated new logic using get_frame_from_stru; modified handling of atomic coordinates, spins, and the move attribute. |
| dpdata/abacus/scf.py | Removed various STRU parsing functions (e.g., get_block, parse_stru_pos, etc.); added a new unified get_frame_from_stru; updated management of magnetic moments and forces. |
| dpdata/abacus/stru.py | Introduced a new module with functions for splitting STRU blocks, parsing atomic species, numerical orbitals, lattice constants/vectors, positions, and generating unlabeled STRU files. |
| dpdata/plugins/abacus.py | Refactored method calls by removing the module prefix, directly using the imported get_frame_from_stru and make_unlabeled_stru functions from the STRU module. |
| tests/test_abacus_relax.py | Revised setup and teardown to handle the STRU file by copying and removing the file appropriately, streamlining the file management in tests. |
| tests/test_abacus_stru_dump.py | Renamed parse_stru_pos to parse_pos_oneline in both import and test cases; added a new test method (test_read_stru) to verify the STRU file reading process. |

Sequence Diagram(s)

sequenceDiagram
    participant Plugin as AbacusSTRUFormat
    participant STRU as dpdata.abacus.stru
    participant System as Data System

    Plugin->>STRU: get_frame_from_stru(file_name)
    STRU->>STRU: Parse STRU file (split blocks, extract positions, etc.)
    STRU-->>Plugin: Return structured data
    Plugin->>STRU: make_unlabeled_stru(data, frame_idx)
    STRU-->>Plugin: Generate unlabeled STRU file output
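
For reference, a hedged sketch of how the plugin layer might call into the new module (the function names and arguments follow the diagram above; return types and exact signatures are assumptions):

```python
from dpdata.abacus.stru import get_frame_from_stru, make_unlabeled_stru

# Parse a STRU file into a dpdata-style data dict (return type assumed here).
data = get_frame_from_stru("STRU")

# Regenerate an unlabeled STRU for the first frame; whether the function returns the
# file content or writes a file directly is an assumption in this sketch.
stru_out = make_unlabeled_stru(data, 0)
```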


Suggested reviewers

  • wanghan-iapcm

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (15)
dpdata/abacus/relax.py (1)

211-212: Expanding move for each frame
The same 'move' value is assigned to all frames. Confirm this is intended. If per-frame values differ, consider storing a distinct entry for each frame.
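
For illustration, a hedged numpy sketch of the two options (array shapes follow dpdata's per-frame layout as an assumption):

```python
import numpy as np

nframes, natoms = 5, 64
move_single = np.zeros((natoms, 3), dtype=bool)  # selective-dynamics flags for one frame

# Behaviour described above: repeat the same flags for every frame.
move_all = np.tile(move_single, (nframes, 1, 1))  # shape (nframes, natoms, 3)

# Per-frame alternative: collect one (natoms, 3) array per frame and stack them.
per_frame_flags = [move_single.copy() for _ in range(nframes)]  # would differ per frame in practice
move_all = np.stack(per_frame_flags)  # shape (nframes, natoms, 3)
```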

dpdata/abacus/scf.py (1)

213-217: Return early if not converged
This prevents processing partially converged states. Consider logging a warning or an error to highlight incomplete data.
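
A minimal sketch of the suggested pattern (the function name and message are hypothetical):

```python
import warnings


def collect_scf_frame(converged: bool, frame_data: dict):
    # Make the skipped, partially converged frame visible to the caller instead of failing silently.
    if not converged:
        warnings.warn("SCF calculation did not converge; skipping this frame.", stacklevel=2)
        return None
    return frame_data
```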

dpdata/abacus/md.py (1)

219-221: Expand 'move' across MD frames
Assigning the first 'move' entry to all frames may be fine if 'move' is constant. Otherwise, consider a per-frame value.

dpdata/abacus/stru.py (6)

57-82: Add error checks for empty input

Currently, if the lines list is empty or malformed, attempting to split on line 70 may raise an index error. Consider adding a guard clause to raise a descriptive error if the block is unexpectedly empty or missing fields.
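
One possible guard clause along these lines (the helper name and error messages are hypothetical, not the module's actual API):

```python
def parse_atomic_species_block(lines):
    """Hypothetical helper: parse an ATOMIC_SPECIES block into (element, mass) pairs."""
    if not lines:
        # Fail fast with a descriptive message instead of an opaque IndexError.
        raise RuntimeError("ATOMIC_SPECIES block is empty or missing in the STRU file.")
    species = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            raise RuntimeError(f"Malformed ATOMIC_SPECIES line: {line!r}")
        species.append((fields[0], float(fields[1])))
    return species
```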


126-270: Consider a more modular approach

This function handles many potential keywords and values within a single routine, which can be harder to maintain or extend. You could split the parsing of different aspects (move, velocity, mag, angles, etc.) into smaller helper functions or use a more structured data format to improve readability.
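
A hedged sketch of one way to split it up (the keyword names and helpers are illustrative, not the actual dpdata implementation):

```python
# Map each optional per-atom keyword to a small dedicated parser instead of
# handling every case inside one large routine.
def _parse_move(tokens):
    return [bool(int(t)) for t in tokens[:3]]


def _parse_velocity(tokens):
    return [float(t) for t in tokens[:3]]


def _parse_mag(tokens):
    return [float(t) for t in tokens]


KEYWORD_PARSERS = {
    "m": _parse_move,
    "v": _parse_velocity,
    "mag": _parse_mag,
}


def parse_optional_field(keyword, tokens):
    try:
        return KEYWORD_PARSERS[keyword](tokens)
    except KeyError:
        raise RuntimeError(f"Unsupported keyword in STRU position line: {keyword}")
```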


272-315: Consolidate isinstance checks and use a ternary operator

You can simplify the logic by merging the isinstance checks for atommag and replacing the if-else block for mag_norm with a ternary operator. For example:

-    if not (isinstance(atommag, list) or isinstance(atommag, float)):
+    if not isinstance(atommag, (list, float)):
         raise RuntimeError(f"Invalid atommag: {atommag}")

...

-        if isinstance(atommag, list):
-            mag_norm = np.linalg.norm(atommag)
-        else:
-            mag_norm = atommag
+        mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag
🧰 Tools
🪛 Ruff (0.8.2)

291-291: Multiple isinstance calls for atommag, merge into a single call

Merge isinstance calls for atommag

(SIM101)


306-309: Use ternary operator mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag instead of if-else-block

Replace if-else-block with mag_norm = np.linalg.norm(atommag) if isinstance(atommag, list) else atommag

(SIM108)


317-336: Correct the spelling of "carteisan_coords"

The function name and docstring wording contain a small typo. Consider renaming to improve clarity and consistency with standard terminology:

-def get_carteisan_coords(coords, coord_type, celldm, cell):
+def get_cartesian_coords(coords, coord_type, celldm, cell):

338-415: Rename unused loop variable and consider ternary operators

  1. The loop variable iline (line 375) is never referenced; rename it to _ or remove it.
  2. The blocks at lines 396-399 and 401-404 could be expressed as ternary assignments to simplify:
-for iline in range(atom_numbs[it]):
+for _ in range(atom_numbs[it]):

...

-if all([i is None for i in move]):
-    move = []
-else:
-    move = np.array(move, dtype=bool)
+move = [] if all(i is None for i in move) else np.array(move, dtype=bool)

-if all([i is None for i in velocity]):
-    velocity = []
-else:
-    velocity = np.array(velocity)
+velocity = [] if all(i is None for i in velocity) else np.array(velocity)
🧰 Tools
🪛 Ruff (0.8.2)

375-375: Loop control variable iline not used within loop body

Rename unused iline to _iline

(B007)


396-399: Use ternary operator move = [] if all([i is None for i in move]) else np.array(move, dtype=bool) instead of if-else-block

Replace if-else-block with move = [] if all([i is None for i in move]) else np.array(move, dtype=bool)

(SIM108)


401-404: Use ternary operator velocity = [] if all([i is None for i in velocity]) else np.array(velocity) instead of if-else-block

Replace if-else-block with velocity = [] if all([i is None for i in velocity]) else np.array(velocity)

(SIM108)


487-787: Refactor for maintainability and consider stacklevel in warnings

This function is quite large and handles many details (linking files, processing multiple optional fields, etc.). Splitting it into smaller helpers can improve readability. Also, consider including stacklevel=2 (or higher) for warnings.warn calls to help callers locate the warning source:

-warnings.warn(
+warnings.warn(
     "pp_file is not provided, will use empty string for pseudo potential file.",
+    stacklevel=2
 )
🧰 Tools
🪛 Ruff (0.8.2)

623-627: Use a single if statement instead of nested if statements

(SIM102)


641-641: No explicit stacklevel keyword argument found

(B028)

tests/test_abacus_relax.py (6)

21-24: Consolidate repeated file-copy logic

Copying the same files in multiple test classes can be centralized to reduce duplication. Consider using a shared setup fixture or helper function for better maintainability.


30-31: Tear-down file removal repeated

File removal logic is repeated in various tearDown methods. Encapsulating this step in a shared routine or test fixture might simplify future modifications.


100-103: Use a shared setup for STRU file copies

Similar to the previous tests, copying "STRU.h2o" to "STRU" can be managed by a common fixture or helper, reducing duplication across test classes.


119-120: Repeated teardown logic

Removing "STRU" is consistent with other test classes. Consolidating this pattern would improve overall code clarity.


139-142: Suggest shared teardown utility

Again, the same removal logic appears here. Moving it into a shared teardown utility or fixture can help reduce boilerplate and errors.


185-188: Unify test cleanup

Repeating file removal steps in each teardown method can be refactored by leveraging a common cleanup strategy or a utility function shared by these test classes.
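
A hedged sketch of the shared fixture these comments point toward (the class names and paths are illustrative):

```python
import os
import shutil
import unittest


class StruFileMixin:
    """Hypothetical shared fixture: copy a reference STRU into place and remove it afterwards."""

    stru_src = "abacus.relax/STRU.h2o"  # illustrative paths
    stru_dst = "abacus.relax/STRU"

    def setUp(self):
        shutil.copy(self.stru_src, self.stru_dst)

    def tearDown(self):
        if os.path.isfile(self.stru_dst):
            os.remove(self.stru_dst)


class TestABACUSRelax(StruFileMixin, unittest.TestCase):
    def test_read_stru(self):
        ...  # test body would use the copied STRU file
```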

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b7af6f and bd42e6a.

📒 Files selected for processing (8)
  • .pre-commit-config.yaml (1 hunks)
  • dpdata/abacus/md.py (3 hunks)
  • dpdata/abacus/relax.py (4 hunks)
  • dpdata/abacus/scf.py (2 hunks)
  • dpdata/abacus/stru.py (1 hunks)
  • dpdata/plugins/abacus.py (3 hunks)
  • tests/test_abacus_relax.py (5 hunks)
  • tests/test_abacus_stru_dump.py (5 hunks)

🔇 Additional comments (19)
dpdata/abacus/relax.py (4)

17-17: Import new function from stru
This addition is consistent with the refactoring approach. No issues found.


49-49: Enhanced error message
This clarified text about atom number mismatch is helpful for debugging.


183-187: Confirm logic for removing spins
Removing spins from data might be fine, but double-check that there's no scenario requiring spin data later on. Also verify that "atom_numbs" reliably exists in data.
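
A hedged sketch of a defensive check along those lines (the key names follow the comment; the exact data layout is an assumption):

```python
def validate_stru_data(data: dict) -> dict:
    """Hypothetical check: drop spin data if present and require atom_numbs."""
    data.pop("spins", None)
    if "atom_numbs" not in data:
        raise KeyError("'atom_numbs' missing from data returned by get_frame_from_stru")
    return data
```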


194-194: Use of get_coords_from_log
The function call appears correct. Please ensure that the natoms argument aligns with the expected data across all usage sites.

dpdata/abacus/scf.py (6)

11-12: Added imports for length and pressure conversions
These imports align with the code’s approach to unit handling. No concerns.


205-205: Minor comment modification
No functional change introduced here.


218-224: Centralizing data retrieval with get_frame_from_stru
Ensure "atom_numbs" is always defined in the returned dictionary. Removing spins is likely correct for SCF scenarios, but confirm you never need the spin data further.


226-226: Comment on retrieving magmom and forces
No logical changes are made here.


238-238: Initialize 'orig' to zero
This sets a default origin. Confirm there's no requirement to retain a non-zero origin from prior data.


240-241: Scaling stress by cell volume
Using data["cells"][0] might be fine for single-frame SCF. For multi-frame scenarios, consider scaling stress frame-by-frame instead.
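
For reference, a hedged numpy sketch of the per-frame alternative (shapes and any unit conversion are assumptions):

```python
import numpy as np


def stress_to_virial(stress, cells):
    """Hypothetical per-frame scaling: stress (nframes, 3, 3), cells (nframes, 3, 3)."""
    volumes = np.abs(np.linalg.det(cells))  # one volume per frame, shape (nframes,)
    return stress * volumes[:, None, None]  # scale each frame by its own cell volume
```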

dpdata/abacus/md.py (2)

16-16: Import get_frame_from_stru
This aligns with the refactoring strategy for STRU-based data retrieval.


166-171: Adopt new method for reading geometry
Using get_frame_from_stru is consistent with other modules. Confirm removing spins is valid for MD workflows or if they’re needed.

dpdata/abacus/stru.py (2)

14-55: Well-structured parsing logic

The block-based parsing approach here is clear and well documented. The implementation neatly handles multiple keyword-defined sections in the ABACUS STRU file. Good job!


417-485: Comprehensive file parsing logic

This function verifies the file path and parses all necessary blocks for atomic and lattice data. The step-by-step workflow is well documented, and the return structure is clear. No pressing issues found.

dpdata/plugins/abacus.py (1)

11-11: Streamlined imports and usage

Importing and using get_frame_from_stru and make_unlabeled_stru directly improves readability and avoids referencing removed module-level functions. No issues found with this refactoring.

Also applies to: 24-24, 42-47

tests/test_abacus_stru_dump.py (3)

10-10: LGTM!

The import statement has been correctly updated to reflect the renamed function.


32-42: LGTM! Well-structured test case.

The new test method thoroughly validates the STRU file reading functionality by checking:

  • Presence of required files (pp_files, orb_files, dpks_descriptor)
  • Content validation for atomic species, numerical orbitals, and descriptors

211-211: LGTM!

The test method has been correctly renamed to match the updated function name.

Also applies to: 213-213

.pre-commit-config.yaml (1)

24-24:

✅ Verification successful

Verify the Ruff version.

Let's check if v0.9.6 is the latest stable version of Ruff.


🌐 Web query:

What is the latest stable version of Ruff?

💡 Result:

The latest stable version of Ruff is 0.9.6, released on February 10, 2025[2]. This version is available for installation via pip and includes recent updates such as Python 3.13 compatibility and improved linting rules[2][3].

For comparison:

  • Version 0.9.2 introduced fixes for notebook diagnostics and rule improvements[5].
  • Version 0.8.6 included updates to string formatting and error handling[5].

Ruff follows a custom versioning scheme where minor versions indicate breaking changes and patch versions denote bug fixes[4].


Ruff version is up-to-date.

The rev: v0.9.6 update in .pre-commit-config.yaml correctly reflects the latest stable release of Ruff (released February 10, 2025), so no changes are needed.

Comment on lines +248 to +249
if move is not None:
    data["move"] = move

🛠️ Refactor suggestion

Preserving 'move' data
In other modules, 'move' is expanded for multiple frames. Verify that returning it unmodified here is consistent with the rest of the code.

@njzjz njzjz merged commit 35a0af0 into deepmodeling:master Feb 14, 2025
21 checks passed