Commit graph

37 commits

Author SHA1 Message Date
Abdullah 3li
f117d94ef7 fix: Resolve merge conflict in documents_routes.py
- Integrated Docling ETL service with new task logging system
- Maintained consistent logging pattern across all ETL services
- Added progress and success/failure logging for Docling processing
2025-07-21 10:43:15 +03:00
Abdullah 3li
aa00822169 feat: Add Docling support as ETL_SERVICE option
- Added DOCLING as third ETL_SERVICE option (alongside UNSTRUCTURED/LLAMACLOUD)
- Implemented add_received_file_document_using_docling function
- Added Docling processing logic in documents_routes.py
- Enhanced chunking with configurable overlap support
- Added comprehensive document processing service
- Supports both CPU and GPU processing with user selection

Addresses #161 - Add Docling Support as an ETL_SERVICE
Follows same pattern as LlamaCloud integration (PR #123)
2025-07-20 11:42:55 +03:00
Utkarsh-Patel-13
92781e726c Updated Streaming Service to efficiently stream content
- Previously, the whole message (with all annotations included) was
  streamed for each chunk, leading to extremely large data length.
- Fixed to only stream the new chunk.
- Updated ANSWER part to be streamed as message content (following
  Vercel's Stream Protocol)
- Fixed yield typo
2025-07-18 17:43:07 -07:00
MSI\ModSetter
1eb072cc69 feat(BACKEND): Added Log Management System for better Bug Tracking
- Background tasks are now logged so non-technical users can easily track the failure points.
2025-07-16 01:10:33 -07:00
DESKTOP-RTLN3BA\$punk
a19a8af4ff fix: bad chat history logic 2025-07-10 15:06:25 -07:00
DESKTOP-RTLN3BA\$punk
21fb231683 fix: Markdown & Text files as default support. 2025-07-07 22:55:51 -07:00
DESKTOP-RTLN3BA\$punk
a85f7920a9 feat: added configurable LLMs 2025-06-09 15:50:15 -07:00
DESKTOP-RTLN3BA\$punk
99fa03d78b feat: Added Calendar Based Indexing.
- This should stabilize manual syncing.
2025-06-06 18:17:47 -07:00
DESKTOP-RTLN3BA\$punk
d7bb31f894 feat: Document Selector in Chat.
- Still needs improvements, but let's use it first.
2025-06-04 21:46:50 -07:00
Muhamad Aji Wibisono
1d67a87b82 feat: discord knowledge retrieval 2025-06-02 18:43:32 +07:00
Muhamad Aji Wibisono
4b3c662478 feat: added discord indexer 2025-06-02 18:30:38 +07:00
DESKTOP-RTLN3BA\$punk
73751c0eb1 feat: Removed Hard Dependency on Unstructured.io
- Added Llamaparse Support :)
2025-05-30 19:17:19 -07:00
DESKTOP-RTLN3BA\$punk
5411bac8e0 feat: Added content based hashing to prevent duplicates and fix resync issues 2025-05-28 23:52:00 -07:00
DESKTOP-RTLN3BA\$punk
a8080d2dc7 feat: Added Speech to Text support.
- Supports audio & video files.
- Will be useful for YouTube videos that don't have transcripts.
2025-05-13 21:13:53 -07:00
DESKTOP-RTLN3BA\$punk
a9db0a8ceb feat: Introduce the RAPTOR Search. 2025-05-11 23:05:56 -07:00
DESKTOP-RTLN3BA\$punk
a58550818b feat: Added chat_history to researcher agent 2025-05-10 20:06:19 -07:00
DESKTOP-RTLN3BA\$punk
1586a0bd78 chore: Added direct handling for markdown files.
- Fixed podcast imports.
2025-05-07 22:04:57 -07:00
DESKTOP-RTLN3BA\$punk
b4bee887bd feat: Added Podcast Feature and it's actually fast.
- Fully Async
2025-05-05 23:18:12 -07:00
DESKTOP-RTLN3BA\$punk
130f43a0fa feat: Removed GPT-Researcher in favour of own SurfSense LangGraph Agent 2025-04-20 19:19:35 -07:00
DESKTOP-RTLN3BA\$punk
2008b07304 fix: Docs & Chats in other search spaces 2025-04-17 23:19:56 -07:00
Adamsmith6300
32c721113c update edit connectors page to support linear connector 2025-04-16 22:06:50 -07:00
Adamsmith6300
f2f426d5eb merge conflicts 2025-04-16 21:34:51 -07:00
Adamsmith6300
5176569e30 edit repos for gh connector 2025-04-16 20:29:50 -07:00
Adamsmith6300
ae8c74a5aa select repos when adding gh connector 2025-04-16 19:59:38 -07:00
DESKTOP-RTLN3BA\$punk
e0eb9d4b8b feat: Added Linear Connector 2025-04-15 23:10:35 -07:00
Adamsmith6300
a69bbb32f7 Merge branch 'main' of https://github.com/MODSetter/SurfSense into add-github-connector 2025-04-14 15:25:29 -07:00
DESKTOP-RTLN3BA\$punk
aaddd5ca9c Refactor: Remove redundant integer conversion for search_space_id in chat data handling 2025-04-13 20:53:21 -07:00
DESKTOP-RTLN3BA\$punk
0b93c9dfef Fixed current agent citation issues and added sub_section_writer agent for upcoming SurfSense research agent 2025-04-13 20:47:23 -07:00
Adamsmith6300
bb198e38c0 add github connector, add alembic for db migrations, fix bug updating connectors 2025-04-13 13:56:22 -07:00
DESKTOP-RTLN3BA\$punk
b43272a115 feat(youtube): integrate YouTube video processing connector
- Added support for processing YouTube videos, including transcript extraction and document creation.
- Implemented a new background task for adding YouTube video documents.
- Enhanced the connector service to search for YouTube videos and return relevant results.
- Updated frontend components to include YouTube video options in the dashboard and connector sources.
- Added necessary dependencies for YouTube transcript API.
2025-04-11 15:05:17 -07:00
DESKTOP-RTLN3BA\$punk
1609e59086 YouTube video processing utils 2025-04-09 18:46:10 -07:00
DESKTOP-RTLN3BA\$punk
8cd1264d3f feat: Updated the extension for SurfSense v0.0.6 2025-03-26 20:02:53 -07:00
DESKTOP-RTLN3BA\$punk
24fd873ca7 fix: Fixed Slack Reindexing 2025-03-26 17:44:38 -07:00
DESKTOP-RTLN3BA\$punk
23da404177 fix: Fixed Notion Reindexing & Updating 2025-03-26 17:19:10 -07:00
DESKTOP-RTLN3BA\$punk
ee0c518553 not-integrated: Add DocumentHybridSearchRetriever 2025-03-20 22:56:24 -07:00
DESKTOP-RTLN3BA\$punk
709aa6f303 feat: Added Docker Support and missing dependencies. 2025-03-20 18:52:06 -07:00
DESKTOP-RTLN3BA\$punk
da23012970 feat: SurfSense v0.0.6 init 2025-03-14 18:53:14 -07:00