Add an adaptive OCR heuristic that samples PDF text density and disables LiteParse OCR for large text-rich PDFs before the OCR path reaches timeout territory.
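
A minimal sketch of how such a density check might look, assuming per-page character counts have already been sampled from the PDF. The function name `should_disable_ocr` and every threshold below are hypothetical illustrations, not the values used in this change.

```python
from typing import Sequence


def should_disable_ocr(
    sampled_page_chars: Sequence[int],
    total_pages: int,
    *,
    min_pages: int = 50,             # hypothetical: what counts as a "large" PDF
    min_chars_per_page: int = 200,   # hypothetical: what counts as a text-rich page
    min_text_rich_ratio: float = 0.8,  # hypothetical: share of sampled pages that must be text-rich
) -> bool:
    """Return True when a PDF looks large and text-rich enough to skip OCR."""
    if total_pages < min_pages or not sampled_page_chars:
        # Small or unsampled documents keep OCR enabled.
        return False
    rich = sum(1 for n in sampled_page_chars if n >= min_chars_per_page)
    return rich / len(sampled_page_chars) >= min_text_rich_ratio
```

Sampling only a handful of pages keeps the check cheap compared with the OCR pass it is meant to avoid; scanned PDFs with little extractable text fall below the ratio and keep OCR on.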
Keep LiteParse isolated in a subprocess regardless of stale user config, remove the subprocess toggle from the settings UI, and raise the default LiteParse worker count to 2 for a safer multi-chat speedup.
Update Document Query docs and focused tests for the new heuristic, mandatory isolation, and worker default.
Expose the main Document Query parser, retrieval, fetch, LiteParse/OCR, and fallback controls in the plugin settings UI. Add a generated 256x256 JPEG thumbnail under the plugin size limit and cover both the settings wiring and thumbnail constraints with focused tests.
Add a Document Query plugin settings panel that maps to the existing parser_concurrency runtime limit, with a focused regression test so the UI remains wired to the backend setting.
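
As an illustration of what a parser_concurrency limit does at runtime, here is a hedged sketch using an asyncio semaphore. `ParserLimiter`, `demo`, and `fake_parse` are hypothetical names and do not reflect the plugin's actual implementation.

```python
import asyncio


class ParserLimiter:
    """Caps how many parse jobs run at once (hypothetical helper)."""

    def __init__(self, parser_concurrency: int) -> None:
        self._sem = asyncio.Semaphore(max(1, parser_concurrency))
        self.active = 0
        self.peak = 0  # highest number of jobs observed running together

    async def run(self, job):
        async with self._sem:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await job()
            finally:
                self.active -= 1


async def demo(jobs: int, limit: int) -> int:
    """Run `jobs` fake parse jobs under `limit` and report peak concurrency."""
    limiter = ParserLimiter(limit)

    async def fake_parse():
        await asyncio.sleep(0.01)  # stand-in for real parsing work
        return "ok"

    await asyncio.gather(*(limiter.run(fake_parse) for _ in range(jobs)))
    return limiter.peak
```

With this shape, a settings value maps directly onto the semaphore size, so extra chats queue for a parser slot instead of starting more parsers in parallel.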
Pin LiteParse to 2.0.3 in both Docker requirements and the plugin hook requirements so new images and existing plugin installs resolve the same tested runtime.
Run LiteParse in a subprocess so native parser crashes cannot take down the Web UI process. Bound parser concurrency and LiteParse workers for multi-chat stability, seed Q&A context with leading document chunks for title/abstract grounding, and keep a small-document fallback when vector search returns no chunks.
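
The isolation idea can be sketched as follows, assuming the parser is invoked as a child Python process that prints JSON. The `_CHILD` program is a hypothetical stand-in for the real parser entry point, and `parse_isolated` is an illustrative name rather than the helper added here.

```python
import json
import subprocess
import sys
from typing import Optional

# Hypothetical child program standing in for the real parser entry point.
_CHILD = r"""
import json, os, sys
path = sys.argv[1]
if path == "crash.pdf":
    os._exit(139)  # simulate a native crash killing the child process
print(json.dumps({"text": "parsed " + path}))
"""


def parse_isolated(path: str, timeout: float = 30.0) -> Optional[dict]:
    """Run the parser in a child process; return None if it crashes or hangs."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        # The parent survives; the caller can fall back to a legacy parser.
        return None
    return json.loads(proc.stdout)
```

Because the crash happens in the child, the parent only sees a non-zero exit code and can continue serving other chats.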
Rename the query optimization prompt from optmimize to optimize, update the helper lookup, and fix the "concise" typo inside the prompt.
Also add a regression assertion for the corrected prompt filename and remove the remaining literal a0_small test references so global audits stay clean.
Add LiteParse as the preferred parser path with legacy parser fallbacks, centralized document fetching, generic user-facing progress, and compatibility shims for the former helper/tool imports.
Install the runtime through Docker requirements for fresh images and through the _document_query plugin hook/startup migration for existing installations.
Move the long document_query tool instructions into a document-query skill and leave a compact tool prompt stub that directs the model to load the skill before using document_query for documents, code-file Q&A, and document-image OCR. Also add default Agent Zero guidance for document/code/OCR Q&A routing.
Tests:
- PYTHONPATH=/home/eclypso/a0/agent-zero-pr-1528 conda run -n a0 pytest tests/test_document_query_plugin.py -q
- python -m compileall -q plugins/_document_query helpers/document_query.py tools/document_query.py tests/test_document_query_plugin.py
- git diff --check
- Live Agent Zero Web UI E2E at localhost:32080: PDF Q&A, code-file Q&A through document_query skill, and W-4 document-image OCR
The broader legacy pytest probe remains blocked by unrelated failures in this older PR worktree: browser-agent tests, a Docker workflow branch expectation, and Web UI fixture paths.