
v0.8.0 - A New Era for OCR

@icereed released this 06 Jan 22:17
· 77 commits to main since this release
32f83ec

paperless-gpt v0.8.0

We’re thrilled to unveil paperless-gpt v0.8.0, which brings a major step forward in document management: OCR powered by Large Language Models (LLMs). Instead of relying only on traditional OCR algorithms, paperless-gpt can now use an LLM to extract text, which can improve accuracy and is especially valuable for complex or low-quality scans.


A New Era for OCR

LLM-Enhanced OCR: paperless-gpt uses LLMs to perform OCR, going beyond traditional algorithms.

  • Higher Accuracy: The model can use surrounding context to resolve ambiguous text, boosting success rates on tough or noisy scans.
  • Versatile Tagging: Combine OCR with new environment variables to automatically sort, tag, and categorize your documents.

Why It Matters:

  • Faster, Smarter Data Extraction: Let AI handle text extraction, so you can focus on insights, not data entry.
  • Effortless Setup: Set a few environment variables to enable this feature and tailor it to your own workflow.

What Else Is New?

  1. Flexible Tag Configurability

    • MANUAL_TAG, AUTO_TAG, and AUTO_OCR_TAG environment variables: Easily customize how documents are labeled, all within your existing environment setup.
  2. Streamlined OCR Flow

    • The ProcessDocumentOCR method in ocr.go handles the entire pipeline: it downloads the images, performs LLM-based OCR, and updates the document automatically.
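
The download, OCR, update sequence described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in ocr.go: the OCRPipeline type, its three function fields, and the Process method are hypothetical names chosen for this example, and the real ProcessDocumentOCR signature may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// OCRPipeline bundles the three steps named in the release notes.
// All three fields are hypothetical stand-ins, not paperless-gpt's real API.
type OCRPipeline struct {
	// DownloadImages returns one image per page of the document.
	DownloadImages func(documentID int) ([][]byte, error)
	// RecognizePage sends a single page image to a vision LLM and returns its text.
	RecognizePage func(image []byte) (string, error)
	// UpdateContent writes the recognized text back to the document.
	UpdateContent func(documentID int, content string) error
}

// Process runs download -> OCR -> update for one document and returns the
// combined text. It stops at the first error, so a partial result is never saved.
func (p OCRPipeline) Process(documentID int) (string, error) {
	images, err := p.DownloadImages(documentID)
	if err != nil {
		return "", fmt.Errorf("download images for document %d: %w", documentID, err)
	}
	if len(images) == 0 {
		return "", errors.New("document has no pages to OCR")
	}

	pages := make([]string, 0, len(images))
	for i, img := range images {
		text, err := p.RecognizePage(img)
		if err != nil {
			return "", fmt.Errorf("OCR page %d of document %d: %w", i+1, documentID, err)
		}
		pages = append(pages, strings.TrimSpace(text))
	}

	// Pages are joined with a blank line between them (an arbitrary choice for this sketch).
	content := strings.Join(pages, "\n\n")
	if err := p.UpdateContent(documentID, content); err != nil {
		return "", fmt.Errorf("update document %d: %w", documentID, err)
	}
	return content, nil
}
```

Passing the three steps in as functions keeps the sketch independent of any particular LLM provider or HTTP client; a caller would plug in whatever actually talks to paperless-ngx and the configured vision model.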

Key Highlights

  • LLM-Enhanced OCR: A game changer for paperless workflows. Tap into deep-learning models for better text extraction.
  • Expanded Environment Variables: Fine-tune your entire processing strategy using new tags for manual, automatic, and OCR flows.

Get Started / Next Steps

  1. Enable OCR with LLMs

    • Set your environment variables: AUTO_OCR_TAG, VISION_LLM_PROVIDER, and VISION_LLM_MODEL.
      • Example model for Ollama: minicpm-v or x/llama3.2-vision:latest (better results, but needs more GPU power)
      • Example model for OpenAI: gpt-4o
    • See the README for details on hooking up your LLM (OpenAI, Ollama, etc.).
  2. Try the New Tag System

    • MANUAL_TAG for manual sorting,
    • AUTO_TAG for auto-sorting,
    • AUTO_OCR_TAG for OCR-based flow.
  3. Feedback Welcome

    • This feature is experimental, and we invite all feedback to help shape its future.
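
To make the role of these variables concrete, here is a small sketch of how a program could read them and decide which flow a document belongs to. It is an illustration under assumptions, not paperless-gpt's implementation: the Settings type, loadSettings, FlowFor, and the Flow constants are hypothetical names, and the precedence between tags (OCR first, then auto, then manual) is an arbitrary choice made for the example. The only parts taken from the notes above are the variable names and the fact that LLM-based OCR needs AUTO_OCR_TAG, VISION_LLM_PROVIDER, and VISION_LLM_MODEL.

```go
package main

import (
	"errors"
	"os"
)

// Settings holds the values of the environment variables named in the release notes.
type Settings struct {
	ManualTag      string // MANUAL_TAG
	AutoTag        string // AUTO_TAG
	AutoOCRTag     string // AUTO_OCR_TAG
	VisionProvider string // VISION_LLM_PROVIDER
	VisionModel    string // VISION_LLM_MODEL
}

// loadSettings reads the variables through getenv, which makes it easy to test.
// It rejects a configuration that enables the OCR tag without a vision LLM.
func loadSettings(getenv func(string) string) (Settings, error) {
	s := Settings{
		ManualTag:      getenv("MANUAL_TAG"),
		AutoTag:        getenv("AUTO_TAG"),
		AutoOCRTag:     getenv("AUTO_OCR_TAG"),
		VisionProvider: getenv("VISION_LLM_PROVIDER"),
		VisionModel:    getenv("VISION_LLM_MODEL"),
	}
	if s.AutoOCRTag != "" && (s.VisionProvider == "" || s.VisionModel == "") {
		return Settings{}, errors.New("AUTO_OCR_TAG is set but VISION_LLM_PROVIDER or VISION_LLM_MODEL is empty")
	}
	return s, nil
}

// loadSettingsFromEnv reads the real process environment.
func loadSettingsFromEnv() (Settings, error) {
	return loadSettings(os.Getenv)
}

// Flow names the processing path chosen for a document (hypothetical labels).
type Flow string

const (
	FlowNone   Flow = "none"
	FlowManual Flow = "manual"
	FlowAuto   Flow = "auto"
	FlowOCR    Flow = "ocr"
)

// FlowFor picks a flow from a document's tags. Unset (empty) settings never match.
// The precedence OCR > auto > manual is an assumption made for this sketch.
func (s Settings) FlowFor(tags []string) Flow {
	has := func(want string) bool {
		if want == "" {
			return false
		}
		for _, t := range tags {
			if t == want {
				return true
			}
		}
		return false
	}
	switch {
	case has(s.AutoOCRTag):
		return FlowOCR
	case has(s.AutoTag):
		return FlowAuto
	case has(s.ManualTag):
		return FlowManual
	default:
		return FlowNone
	}
}
```

In a real program, loadSettingsFromEnv() would be called once at startup and FlowFor applied to each document's tag list. Check the README for the actual default tag names and the accepted provider values.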

A Little Poetic Fanfare

“OCR re-imagined, with AI might,
Flawless text from scans day or night,
Paperless-GPT rewriting the fight,
Your docs are free—digitized just right!”


Upgrade to v0.8.0 and discover how LLM-powered OCR can revolutionize your paperless workflow!

Happy Tagging & Document Managing!

Full Changelog: v0.7.0...v0.8.0