3 commits

Author SHA1 Message Date
Alessandro
6ccbae0712 feat(document_query): add liteparse runtime and progressive skill
Add LiteParse as the preferred parser path with legacy parser fallbacks, centralized document fetching, generic user-facing progress, and compatibility shims for the former helper/tool imports.

Install the runtime through Docker requirements for fresh images and through the _document_query plugin hook/startup migration for existing installations.

Move the long document_query tool instructions into a document-query skill and leave a compact tool prompt stub that directs the model to load the skill before using document_query for documents, code-file Q&A, and document-image OCR. Also add default Agent Zero guidance for document/code/OCR Q&A routing.

Tests:
- PYTHONPATH=/home/eclypso/a0/agent-zero-pr-1528 conda run -n a0 pytest tests/test_document_query_plugin.py -q
- python -m compileall -q plugins/_document_query helpers/document_query.py tools/document_query.py tests/test_document_query_plugin.py
- git diff --check
- Live Agent Zero Web UI E2E at localhost:32080: PDF Q&A, code-file Q&A through document_query skill, and W-4 document-image OCR

The broader legacy pytest probe remains blocked by unrelated failures in this older PR worktree: browser-agent tests, a Docker workflow branch expectation, and webui fixture paths.
2026-05-29 12:45:14 +02:00
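
The "preferred parser path with legacy parser fallbacks" described in commit 6ccbae0712 can be pictured as a simple ordered chain. The following is a minimal sketch of that idea only, not the plugin's actual code: `parse_with_fallbacks`, `liteparse_stub`, and `legacy_stub` are hypothetical names, and the stubs stand in for the real LiteParse and legacy parsers, whose APIs are not shown in the commit message.

```python
from typing import Callable, Sequence

# A parser here is any callable that turns raw bytes into text (assumption).
Parser = Callable[[bytes], str]


def parse_with_fallbacks(
    data: bytes, preferred: Parser, fallbacks: Sequence[Parser]
) -> str:
    """Try the preferred parser first, then each fallback in order."""
    errors: list[str] = []
    for parser in (preferred, *fallbacks):
        try:
            return parser(data)
        except Exception as exc:  # any failure moves on to the next parser
            errors.append(f"{parser.__name__}: {exc}")
    raise RuntimeError("all parsers failed: " + "; ".join(errors))


def liteparse_stub(data: bytes) -> str:
    # Hypothetical stand-in: simulates the preferred runtime being unavailable.
    raise ImportError("preferred runtime not installed")


def legacy_stub(data: bytes) -> str:
    # Hypothetical stand-in for a legacy parser.
    return data.decode("utf-8", errors="replace")


text = parse_with_fallbacks(b"hello", liteparse_stub, [legacy_stub])
```

Collecting the per-parser errors keeps the final failure message useful when every parser in the chain fails.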
Deimos Agent
5fd7a6a79e feat: extract document_query into _document_query plugin with parser strategy pattern
- Create plugins/_document_query/ with full plugin structure:
  plugin.yaml, default_config.yaml, tools/, helpers/, helpers/parsers/, prompts/, README.md
- Add BaseParser ABC with asyncio.to_thread offload and configurable timeouts
- Implement 5 parsers: PDF (PyMuPDF+Tesseract), HTML (Markdownify),
  Text (expanded mimetypes: YAML, XML, TOML, JS, TS, shell),
  Image (Unstructured), Unstructured (catch-all)
- Add MIME type registry with priority-based routing via get_parser_for_mimetype()
- Add gather_timeout on asyncio.gather for bounded concurrent fetches
- All config externalized to default_config.yaml
- Disable the core files that the plugin replaces (._py.bak)
- Update the import in knowledge_tool._py to the plugin path
2026-05-29 12:45:05 +02:00
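
The parser strategy pattern listed in commit 5fd7a6a79e (an abstract base with `asyncio.to_thread` offload and timeouts, priority-based MIME routing, and a bounded `asyncio.gather`) might look roughly like the sketch below. Only the names `BaseParser`, `get_parser_for_mimetype`, and `gather_timeout` come from the commit message; their signatures, the `parse_sync`/`parse` split, the two example parsers, the `PARSERS` list, and all default values are assumptions made for illustration.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseParser(ABC):
    """Hypothetical base class: runs blocking parse work off the event loop."""

    mimetypes: tuple[str, ...] = ()
    priority: int = 0      # higher wins when several parsers match (assumed)
    timeout: float = 30.0  # seconds; assumed to be loaded from config

    @abstractmethod
    def parse_sync(self, data: bytes) -> str:
        """Blocking parse implemented by each concrete parser."""

    async def parse(self, data: bytes) -> str:
        # Offload to a worker thread and bound how long we wait for it.
        # Note: on timeout the await returns, but the thread itself keeps running.
        return await asyncio.wait_for(
            asyncio.to_thread(self.parse_sync, data), timeout=self.timeout
        )


class TextParser(BaseParser):
    mimetypes = ("text/plain", "application/x-yaml")
    priority = 10

    def parse_sync(self, data: bytes) -> str:
        return data.decode("utf-8", errors="replace")


class CatchAllParser(BaseParser):
    mimetypes = ("*/*",)  # matches anything, at the lowest priority
    priority = 0

    def parse_sync(self, data: bytes) -> str:
        return f"<{len(data)} bytes>"


PARSERS: list[BaseParser] = [TextParser(), CatchAllParser()]


def get_parser_for_mimetype(mimetype: str) -> BaseParser:
    """Pick the highest-priority parser that accepts the MIME type."""
    candidates = [
        p for p in PARSERS if mimetype in p.mimetypes or "*/*" in p.mimetypes
    ]
    return max(candidates, key=lambda p: p.priority)


async def parse_all(
    docs: list[tuple[str, bytes]], gather_timeout: float = 60.0
) -> list[str]:
    """Parse several (mimetype, data) pairs concurrently within one deadline."""
    tasks = [get_parser_for_mimetype(m).parse(d) for m, d in docs]
    return await asyncio.wait_for(asyncio.gather(*tasks), timeout=gather_timeout)


results = asyncio.run(
    parse_all([("text/plain", b"hi"), ("image/png", b"\x00\x01\x02")])
)
```

In this sketch the per-parser timeout limits a single slow document, while `gather_timeout` caps the whole batch, so one stalled parse cannot hold up the caller indefinitely.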
frdel
d02dda3667 BIG PYTHON REFACTOR
Python scripts moved out of python/ folder to root to be unified with plugins

+ frontend extension around api calls
2026-03-05 17:28:11 +01:00
Renamed from python/tools/document_query.py