Commit graph

7 commits

Author SHA1 Message Date
Alessandro
9e4b2f1843 Tune LiteParse OCR defaults
Add an adaptive OCR heuristic that samples PDF text density and disables LiteParse OCR for large text-rich PDFs before the OCR path reaches timeout territory.

Keep LiteParse isolated in a subprocess regardless of stale user config, remove the subprocess toggle from the settings UI, and raise the default LiteParse worker count to 2 for a safer multi-chat speedup.

Update Document Query docs and focused tests for the new heuristic, mandatory isolation, and worker default.
2026-05-30 19:02:10 +02:00
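The adaptive OCR heuristic in 9e4b2f1843 is only described in prose, so the following is a minimal sketch of how such a check might look, not the plugin's actual code. It assumes the per-page extractable text is already available as a list of strings. The names `should_use_ocr` and `sample_indices` and all three thresholds are hypothetical illustration values, not LiteParse defaults.

```python
from typing import Sequence

# Hypothetical thresholds for illustration only; not taken from the plugin.
LARGE_PDF_PAGES = 50            # page count from which a PDF counts as "large"
TEXT_RICH_CHARS_PER_PAGE = 400  # mean extractable chars/page that counts as "text-rich"
SAMPLE_SIZE = 8                 # number of pages to sample


def sample_indices(page_count: int, sample_size: int = SAMPLE_SIZE) -> list[int]:
    """Pick up to sample_size page indices spread evenly across the document."""
    if page_count <= 0:
        return []
    if page_count <= sample_size:
        return list(range(page_count))
    step = page_count / sample_size
    return [int(i * step) for i in range(sample_size)]


def should_use_ocr(page_texts: Sequence[str]) -> bool:
    """Return False only for large PDFs whose sampled pages already carry text."""
    page_count = len(page_texts)
    if page_count < LARGE_PDF_PAGES:
        return True  # small document: OCR cost is bounded, keep it on
    indices = sample_indices(page_count)
    chars = sum(len(page_texts[i].strip()) for i in indices)
    density = chars / len(indices)  # mean characters per sampled page
    return density < TEXT_RICH_CHARS_PER_PAGE
```

Sampling a few evenly spaced pages keeps the check cheap on large files, and a scanned PDF with little extractable text still goes through OCR because its sampled density stays below the threshold.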
Alessandro
98f6c17d15 feat(document_query): expand settings panel and thumbnail
Expose the main Document Query parser, retrieval, fetch, LiteParse/OCR, and fallback controls in the plugin settings UI. Add a generated 256x256 JPEG thumbnail under the plugin size limit and cover both the settings wiring and thumbnail constraints with focused tests.
2026-05-29 16:47:38 +02:00
Alessandro
59dd1c99cb feat(document_query): expose parser concurrency setting
Add a Document Query plugin settings panel that maps to the existing parser_concurrency runtime limit, with a focused regression test so the UI remains wired to the backend setting.
2026-05-29 16:30:47 +02:00
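As a rough illustration of what a `parser_concurrency` runtime limit can mean in practice, the sketch below caps simultaneous parse calls with a semaphore. It is an assumption about the general technique, not the plugin's implementation. `BoundedParser`, `parse_all`, and the default value of 2 are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")


class BoundedParser:
    """Allow at most `parser_concurrency` parse calls to run at once."""

    def __init__(self, parser_concurrency: int = 2) -> None:  # hypothetical default
        if parser_concurrency < 1:
            raise ValueError("parser_concurrency must be >= 1")
        self._slots = threading.BoundedSemaphore(parser_concurrency)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous parses observed

    def run(self, parse: Callable[[], T]) -> T:
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return parse()
            finally:
                with self._lock:
                    self._active -= 1


def parse_all(parser: BoundedParser, jobs: list[Callable[[], T]], threads: int = 8) -> list[T]:
    """Submit every job from a larger thread pool; the parser still enforces its limit."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(parser.run, jobs))
```

The `peak` counter exists only so a regression test can assert that the configured limit is respected even when more callers than slots arrive at once.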
Alessandro
6df3acc1e6 fix(document_query): pin LiteParse dependency
Pin LiteParse to 2.0.3 in both Docker requirements and the plugin hook requirements so new images and existing plugin installs resolve the same tested runtime.
2026-05-29 16:07:20 +02:00
Alessandro
b2ead06a4e fix(document_query): isolate LiteParse parsing
Run LiteParse in a subprocess so native parser crashes cannot take down the Web UI process. Bound parser concurrency and LiteParse workers for multi-chat stability, seed Q&A context with leading document chunks for title/abstract grounding, and keep a small-document fallback when vector search returns no chunks.
2026-05-29 15:51:59 +02:00
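The isolation idea in b2ead06a4e can be sketched with the standard library alone: run the risky work in a child process and treat any crash, non-zero exit, or timeout as a recoverable failure in the parent. This is a minimal sketch under that assumption, not the plugin's code. `run_isolated` and its `timeout_s` default are hypothetical, and the child here runs an arbitrary Python snippet in place of the real parser.

```python
import subprocess
import sys
from typing import Optional


def run_isolated(code: str, timeout_s: float = 30.0) -> Optional[str]:
    """Run `code` in a fresh Python process; return its stdout, or None on failure."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None  # child hung; subprocess.run kills it before raising
    if proc.returncode != 0:
        return None  # child crashed or raised; the parent process is unaffected
    return proc.stdout
```

A caller can then fall back to another parser when `run_isolated` returns `None`, which is the property that keeps a native crash from taking down the hosting process.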
Alessandro
d039af512a fix(document_query): clean prompt spelling and legacy references
Rename the query optimization prompt from optmimize to optimize, update the helper lookup, and fix the misspelling of "concise" inside the prompt.

Also add a regression assertion for the corrected prompt filename and remove the remaining literal a0_small test references so global audits stay clean.
2026-05-29 12:46:32 +02:00
Alessandro
6ccbae0712 feat(document_query): add liteparse runtime and progressive skill
Add LiteParse as the preferred parser path with legacy parser fallbacks, centralized document fetching, generic user-facing progress, and compatibility shims for the former helper/tool imports.

Install the runtime through Docker requirements for fresh images and through the _document_query plugin hook/startup migration for existing installations.

Move the long document_query tool instructions into a document-query skill and leave a compact tool prompt stub that directs the model to load the skill before using document_query for documents, code-file Q&A, and document-image OCR. Also add default Agent Zero guidance for document/code/OCR Q&A routing.

Tests:
- PYTHONPATH=/home/eclypso/a0/agent-zero-pr-1528 conda run -n a0 pytest tests/test_document_query_plugin.py -q
- python -m compileall -q plugins/_document_query helpers/document_query.py tools/document_query.py tests/test_document_query_plugin.py
- git diff --check
- Live Agent Zero Web UI E2E at localhost:32080: PDF Q&A, code-file Q&A through document_query skill, and W-4 document-image OCR

A broader legacy pytest run remains blocked by unrelated failures in this older PR worktree: browser-agent tests, a Docker workflow branch expectation, and webui fixture paths.
2026-05-29 12:45:14 +02:00