-
-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix]#921 #925 Manual scrapping integration fixed. #924
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe changes in this pull request involve updates to the Changes
Assessment against linked issues
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
src/program/services/downloaders/__init__.py (1)
155-155
: Update docstring to reflect new return typeThe
get_instant_availability
method's docstring currently states "Check if the torrent is cached," but it now returns anOptional[TorrentContainer]
. Please update the docstring to reflect that it may returnNone
when the torrent is not available.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
src/program/services/downloaders/__init__.py
(1 hunks)src/routers/secure/scrape.py
(5 hunks)
🔇 Additional comments (2)
src/routers/secure/scrape.py (2)
222-225
: Good handling of optional return value from get_instant_availability
You've correctly updated the code to handle the Optional[TorrentContainer]
returned by get_instant_availability
. Checking for None
before setting stream.is_cached
ensures robust error handling.
285-285
: Verify the structure of containers
At line 285, containers
is assigned as a list containing a single dictionary:
containers = [{file.file_id: file for file in torrent_container.files}]
Please ensure that this structure matches the expected format for the client consuming this data. If the client expects a list of container dictionaries, this is acceptable. However, if only one container is expected without nesting in a list, consider adjusting the assignment to:
containers = {file.file_id: file for file in torrent_container.files}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (2)
entrypoint.sh (2)
6-6
: Add port number validationWhile setting a default port is good, it's recommended to validate that the PORT value is within the valid range (1-65535) to prevent runtime issues.
-PORT=${PORT:-8080} +PORT=${PORT:-8080} +if ! echo "$PORT" | grep -qE '^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$'; then + echo "PORT is not a valid port number (1-65535). Exiting..." + exit 1 +fi
88-90
: Improve non-root execution and debug setupSeveral improvements can be made to the non-root execution:
- debugpy should be installed during image build, not at runtime
- Command execution could be improved for better error handling
- poetry add debugpy - exec su -m $USERNAME -c "cd /riven/src && poetry run python3 -m debugpy --listen 0.0.0.0:5678 main.py --port $PORT" + exec su -m $USERNAME -c "cd /riven/src && set -e && poetry run python3 -m debugpy --listen 127.0.0.1:5678 main.py --port $PORT"- su -m "$USERNAME" -c "cd /riven/src && poetry run python3 main.py --port $PORT" + exec su -m "$USERNAME" -c "cd /riven/src && set -e && poetry run python3 main.py --port $PORT"Also, consider adding debugpy to your
pyproject.toml
as a development dependency instead of installing it at runtime.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
entrypoint.sh
(2 hunks)
🔇 Additional comments (1)
entrypoint.sh (1)
81-90
: Verify port configuration alignment with manual scraping functionality
The addition of port configuration seems unrelated to the PR's primary objective of fixing manual scraping integration. Let's verify if this change is intentional and necessary.
✅ Verification successful
Port configuration changes are intentional and necessary
The port-related changes are a valid addition based on the following findings:
- A dedicated commit (81bacd5) addresses the need for configurable ports to support running multiple Riven instances in the same space
- The changes are properly implemented in both CLI arguments (
cli.py
) and the main application - The feature is documented in the PR description and release notes
- The default port is properly configured (8080) with command-line override capability
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if port-related changes are mentioned in the issue or PR
gh issue view 921 | grep -i "port"
gh pr view 924 | grep -i "port"
# Check if there are any other port-related changes in the codebase
rg -l "port.*=.*8080" --type py
Length of output: 156
Script:
#!/bin/bash
# Check for port-related configurations and usage in Python files
rg -l "port" --type py
# Check for any port-related command line arguments
rg -A 3 "add_argument.*port" --type py
# Look for default port configurations
rg "PORT.*=.*\d+" --type py
# Check main.py specifically for port usage
cat src/main.py
Length of output: 6380
Script:
#!/bin/bash
# Check the CLI argument handling for port
cat src/program/utils/cli.py
# Check the PR description and commits for port-related changes
gh pr view 924 --json body,commits
# Check for any recent changes to port-related configurations
git diff HEAD~5 -- src/main.py src/program/utils/cli.py entrypoint.sh
Length of output: 5212
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (5)
src/program/services/scrapers/shared.py (5)
93-94
: Enhance year mismatch logging for better debuggingConsider including the actual year values in the log message to make debugging easier.
- logger.debug(f"Ignoring torrent {torrent.infohash} due to not in time year range, {item.log_string}") + logger.debug(f"Ignoring torrent {torrent.infohash} due to year mismatch (expected: {item.aired_at.year}, got: {torrent.data.year}), {item.log_string}")
102-103
: Fix grammatical error in log messageThere's a minor grammatical error in the log message.
- logger.debug(f"Ignoring torrent {torrent.infohash} due mismatch in season size, required {len(needed_seasons)} current {len(torrent.data.seasons)}, {item.log_string}") + logger.debug(f"Ignoring torrent {torrent.infohash} due to mismatch in season size, required: {len(needed_seasons)}, current: {len(torrent.data.seasons)}, {item.log_string}")
109-111
: Include specific season information in log messageConsider including which season is missing in the log message for better debugging context.
- logger.debug(f"Ignoring torrent {torrent.infohash} due to missing required season, {item.log_string}") + logger.debug(f"Ignoring torrent {torrent.infohash} due to missing required season (needed: {needed_seasons}, found: {list(torrent.data.seasons)}), {item.log_string}")
127-128
: Include specific episode and season numbers in log messageConsider including the expected and actual episode/season numbers in the log message for better debugging context.
- logger.debug(f"Ignoring torrent {torrent.infohash} due to mismatch in season and episode numbers, {item.log_string}") + logger.debug(f"Ignoring torrent {torrent.infohash} due to mismatch in season/episode numbers (expected S{item.parent.number}E{item.number}, found seasons: {list(torrent.data.seasons)}, episodes: {list(torrent.data.episodes)}), {item.log_string}")
130-131
: Include item type in category mismatch log messageConsider including the item type in the log message for better debugging context.
- logger.debug(f"Skipping torrent {infohash} as it does not fit to any category. {item.log_string}") + logger.debug(f"Skipping torrent {infohash} as item type '{item.type}' does not fit any category. {item.log_string}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
src/program/services/scrapers/shared.py
(3 hunks)
🔇 Additional comments (2)
src/program/services/scrapers/shared.py (2)
135-135
: LGTM! Enhanced error logging
The improved error logging for invalid torrents and garbage torrents will help with debugging issues. The inclusion of the specific error messages is particularly helpful.
Also applies to: 143-143
Line range hint 1-156
: Verify location of item_type fix
While these logging enhancements are valuable, I notice that the fix for the missing item_type
argument mentioned in the PR objectives (#921) is not present in this file.
src/routers/secure/scrape.py
Outdated
@@ -290,7 +292,7 @@ def get_info_hash(magnet: str) -> str: | |||
"message": "Started manual scraping session", | |||
"session_id": session.id, | |||
"torrent_id": torrent_id, | |||
"torrent_info": torrent_info, | |||
"torrent_info": torrent_info.dict(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the api issues in pr #915
just waiting for it to get merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that's in draft, with multiple comments from reviewer. Is there a timeframe you are looking to merge ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometime today #915 should be merged. Then we can get this one in as well. Lets remove the api changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets wait for raise, will rebase on updated mainline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated mainline, if you want to rebase and tweak 👌
@@ -140,6 +153,7 @@ def _parse_results(item: MediaItem, results: Dict[str, str], log_msg: bool = Tru | |||
torrents_dict[torrent.infohash] = Stream(torrent) | |||
logger.log("SCRAPER", f"Kept {len(torrents_dict)} streams for {item.log_string} after processing bucket limit") | |||
return torrents_dict | |||
logger.debug(f"No valid torrent remains after filtering for {item.log_string}: {raw_title}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're given a list of streams to parse here in this function, so this line would only read out the last stream that gets parsed from the list. This line can be removed as we handle individual stream logging throughout this function already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a additional explicit debug checkpoint never hurt, updated to remove raw_title
…ariable This is useful if multiple instance of riven are require to run in same space. One such usecase is to have separate instance for separate torrent profiles. In current setting space, multiple profiles cannot be specified. [Testing] - Changes tested by running two separate instance in same space at different port folders
This additional information is useful for debugging issues with ranking and torrent profiles, where if nothing is filtered for a config, it's very hard to debug which setting cause all things to filtered out. [TESTING] Changes tested by runing stack in local env with docker.
updated changes, have these reviewed and merged, removed api fix commit, only container port and debug logs changes now in this PR. |
Pull Request Check List
Resolves: #921
Description:
Post updating debrid instant availability api fix in recent days, manual scraping endpoints logic require modification to be compaitible with updated models. In this commit, gap in models are addressed, ensuring no model change is introduced, saving changes on UI end.
In longer term, it can be looked upon if model change need to be propagated to UI as well.
[Testing]
Following testing mechanism is utilised.
Summary by CodeRabbit
New Features
PORT
, defaulting to8080
, for enhanced flexibility in application execution.Bug Fixes
Documentation