Scanner rework #25

Merged
merged 5 commits into develop from aleopold--scanner-rework
Jul 9, 2022

Conversation

aaronleopold
Collaborator

@aaronleopold aaronleopold commented Jul 9, 2022

Reworked the scanner to both simplify the logic and make it a little faster. In terms of cognitive complexity, scanning is much simpler now. The general flow is:

  1. precheck (before the scan starts), which also creates any new series. (I did this to simplify the main scan FS walk and to allow scanning on a per-series basis. It was mainly for my concurrent scan tests, but it is also just a simpler process.)
  2. grab job meta (e.g. files to scan, start timer, etc.)
  3. scan each series for media
  4. mark unvisited media as missing
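
For anyone skimming, here is a rough sketch of the four steps above. It is not the code in this PR: `Db`, `Disk`, `Status`, `ScanReport`, and `scan_library` are all hypothetical names, the database and filesystem are replaced by in-memory maps, and resetting a re-found file to `Ready` is my assumption rather than something stated in the description.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};
use std::time::{Duration, Instant};

/// Hypothetical media status; not the project's real model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ready,
    Missing,
}

/// In-memory stand-in for the database.
#[derive(Default)]
pub struct Db {
    pub series: BTreeSet<String>,
    /// Keyed by "series/file".
    pub media: HashMap<String, Status>,
}

/// Stand-in for the filesystem: series directory name -> file names inside it.
pub type Disk = BTreeMap<String, Vec<String>>;

pub struct ScanReport {
    pub new_series: usize,
    pub new_media: usize,
    pub missing: usize,
    pub total_files: usize,
    pub elapsed: Duration,
}

pub fn scan_library(db: &mut Db, disk: &Disk) -> ScanReport {
    // 1. Precheck: create any new series before the main walk starts.
    let mut new_series = 0;
    for name in disk.keys() {
        if db.series.insert(name.clone()) {
            new_series += 1;
        }
    }

    // 2. Job meta: count the files to scan and start the timer.
    let total_files: usize = disk.values().map(Vec::len).sum();
    let started = Instant::now();

    // 3. Scan each series for media, remembering every path we visit.
    let mut visited: HashSet<String> = HashSet::new();
    let mut new_media = 0;
    for (series, files) in disk {
        for file in files {
            let key = format!("{}/{}", series, file);
            // Assumption: a file that is found again goes back to Ready.
            if db.media.insert(key.clone(), Status::Ready).is_none() {
                new_media += 1;
            }
            visited.insert(key);
        }
    }

    // 4. Mark media that was not visited during this scan as missing.
    let mut missing = 0;
    for (key, status) in db.media.iter_mut() {
        if !visited.contains(key) && *status != Status::Missing {
            *status = Status::Missing;
            missing += 1;
        }
    }

    ScanReport {
        new_series,
        new_media,
        missing,
        total_files,
        elapsed: started.elapsed(),
    }
}
```

Because series creation happens up front in step 1, the loop in step 3 only ever deals with media, which is what makes a per-series scan straightforward in this sketch.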

(NOTE: the following numbers were not measured with a release profile.) The rework has a 1.52x speedup on my machine and averages about 20 new media files a second (generating the checksum, getting and parsing ComicInfo.xml (for archive formats), grabbing various other metadata (file size, pages, etc.), and inserting into the database). If there is no new media, i.e. no insertions into the database, it handles about 1555 files a second.

Notes:

  • Some concurrent scanning logic was added, and while it gave an impressive speedup, it is not going to be used until, potentially, far in the future. The speedup averaged around 2.092x, but it brought about connection issues in the prisma client (too many concurrent writers to the database). I am happy with the speed gains from the synchronous scan rework alone, and will consider revisiting concurrency once batching and transactions are supported by the prisma client.
  • I snuck in a few UI changes, mainly a new ReadMore component that toggles the truncation of longer text (e.g. media/series descriptions), as well as a fix for the ellipsis navigation to paginated locations.

@vercel

vercel bot commented Jul 9, 2022

The latest updates on your projects.

1 Ignored Deployment
Name Status Preview Updated
stump ⬜️ Ignored (Inspect) Jul 9, 2022 at 9:24PM (UTC)

@aaronleopold aaronleopold merged commit f2b06d6 into develop Jul 9, 2022
@aaronleopold aaronleopold deleted the aleopold--scanner-rework branch July 9, 2022 21:30