- Deleted the document_processing module and its associated docling_service.
- Updated imports in documents_routes.py and background_tasks.py to reflect the new service structure.
- Ensured compatibility with the task logging system by adjusting type hints for log entries.
- Integrated Docling ETL service with new task logging system
- Maintained consistent logging pattern across all ETL services
- Added progress and success/failure logging for Docling processing
- Added DOCLING as third ETL_SERVICE option (alongside UNSTRUCTURED/LLAMACLOUD)
- Implemented add_received_file_document_using_docling function
- Added Docling processing logic in documents_routes.py
- Enhanced chunking with configurable overlap support
- Added comprehensive document processing service
- Supports both CPU and GPU processing with user selection
Addresses #161 - Add Docling Support as an ETL_SERVICE
Follows same pattern as LlamaCloud integration (PR #123)
- Integrated TaskLoggingService to log the start, progress, success, and failure of podcast generation tasks.
- Updated user ID handling to ensure it is consistently converted to a string across various tasks.
- Modified frontend success message to direct users to the logs tab for status updates on podcast generation.
- Updated import paths for LLM, connector, query, and streaming services to reflect their new location in the 'services' module.
- Removed obsolete utility service files that have been migrated.
- Added support for processing YouTube videos, including transcript extraction and document creation.
- Implemented a new background task for adding YouTube video documents.
- Enhanced the connector service to search for YouTube videos and return relevant results.
- Updated frontend components to include YouTube video options in the dashboard and connector sources.
- Added necessary dependencies for YouTube transcript API.