Commit graph

5797 commits

Author SHA1 Message Date
Rohan Verma
1119f557df
Merge pull request #1399 from MODSetter/dev
Some checks are pending
Build and Push Docker Images / tag_release (push) Waiting to run
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-24.04-arm, linux/arm64, arm64, production) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-latest, linux/amd64, amd64, production) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-24.04-arm, linux/arm64, arm64, runner) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-latest, linux/amd64, amd64, runner) (push) Blocked by required conditions
Build and Push Docker Images / create_manifest (backend, surfsense-backend) (push) Blocked by required conditions
Build and Push Docker Images / create_manifest (web, surfsense-web) (push) Blocked by required conditions
feat: multi-agent chat architecture, streaming runtime rewrite, and full E2E test harness
2026-05-15 18:08:04 -07:00
DESKTOP-RTLN3BA\$punk
9fb9778bd0 test: enhance index batch parallel tests to include hybrid chunker
Some checks are pending
Build and Push Docker Images / tag_release (push) Waiting to run
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-24.04-arm, linux/arm64, arm64, production) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-latest, linux/amd64, amd64, production) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-24.04-arm, linux/arm64, arm64, runner) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-latest, linux/amd64, amd64, runner) (push) Blocked by required conditions
Build and Push Docker Images / create_manifest (backend, surfsense-backend) (push) Blocked by required conditions
Build and Push Docker Images / create_manifest (web, surfsense-web) (push) Blocked by required conditions
Updated the test for the indexing pipeline to verify that both the standard and hybrid chunkers are called via asyncio.to_thread, ensuring non-blocking behavior. This change reflects the routing of non-code documents through the hybrid chunker, maintaining the event loop contract.
2026-05-15 18:02:04 -07:00
DESKTOP-RTLN3BA\$punk
c187b04e82 chore: linting 2026-05-15 17:33:44 -07:00
DESKTOP-RTLN3BA\$punk
219a5977b7 fix: update URLs to use the "www" subdomain across the application
This commit modifies various metadata and canonical URLs in the SurfSense application to ensure consistency by using "https://www.surfsense.com" instead of "https://surfsense.com". Changes were made in layout files, blog posts, and SEO components to reflect this update.
2026-05-15 12:35:15 -07:00
DESKTOP-RTLN3BA\$punk
dc88ce0277 Merge branch 'dev' of https://github.com/MODSetter/SurfSense into dev 2026-05-15 11:55:40 -07:00
DESKTOP-RTLN3BA\$punk
52a64fb96c feat: added blog posts 2026-05-15 11:55:30 -07:00
Rohan Verma
953c654452
Merge pull request #1398 from mvanhorn/osc/1373-platefile-jsdoc-mod-shift-s-dev
docs(editor): align PlateEditor onSave JSDoc with Mod+Shift+S chord
2026-05-15 11:28:15 -07:00
Rohan Verma
1afecc9194
Merge pull request #1397 from voidborne-d/fix/suppress-global-error-toast-mutations-dev
fix(web): suppress global error toast on mutations that own their toast UX
2026-05-15 11:27:40 -07:00
Rohan Verma
7e484b26a4
Merge pull request #1393 from CREDO23/feature/multi-agent-with-task-parallelization
[Feature] Parallel multi-agent task delegation with parallel HITL approvals
2026-05-15 11:26:51 -07:00
Matt Van Horn
f0a51fad6f
docs(editor): align PlateEditor onSave JSDoc with Mod+Shift+S chord
Per #1373, the registered save chord is Mod+Shift+S (not Mod+S, which
collides with the browser's Save-Page-As). The JSDoc on PlateEditorProps.onSave
still claims Mod+S, which is misleading for downstream consumers of the
component. Update the JSDoc to match the actual chord and call out why.

Targeting dev per maintainer request.
2026-05-15 09:06:42 -07:00
voidborne-d
bf2b4ebeb0 fix(web): suppress global error toast on mutations that own their toast UX
Closes #1371. Retarget of #1385 onto dev per maintainer request.

surfsense_web/lib/query-client/client.ts configures a global
MutationCache.onError that shows an error toast for every failed
mutation unless meta.suppressGlobalErrorToast is set. The opt-out
hook existed in the consumer but had zero producers — every mutation
atom that already had its own onError: toast.error(...) was
double-toasting on failure.

Add meta: { suppressGlobalErrorToast: true } to the 30 mutations
across 9 atom files that own their own error toast:

- atoms/prompts/prompts-mutation.atoms.ts (4)
- atoms/invites/invites-mutation.atoms.ts (4)
- atoms/chat-comments/comments-mutation.atoms.ts (4)
- atoms/new-llm-config/new-llm-config-mutation.atoms.ts (4)
- atoms/members/members-mutation.atoms.ts (3)
- atoms/roles/roles-mutation.atoms.ts (3)
- atoms/image-gen-config/image-gen-config-mutation.atoms.ts (3)
- atoms/vision-llm-config/vision-llm-config-mutation.atoms.ts (3)
- atoms/public-chat-snapshots/public-chat-snapshots-mutation.atoms.ts (2)

Atoms intentionally left alone (no local onError, rely on global):
auth, user, search-spaces, logs, documents, connectors.

Local validation (against dev): pnpm biome check on the 9 touched
files is clean; tsc --noEmit shows no new errors in the touched files
(pre-existing errors elsewhere unrelated).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 23:43:30 +08:00
CREDO23
4980f9f1ba Merge remote-tracking branch 'upstream/dev' into feature/multi-agent-with-task-parallelization 2026-05-15 16:44:22 +02:00
CREDO23
5327f3348c connector-popup: surface trusted-tools UI in MCP edit view; consolidate disconnect
- Slot MCPTrustedTools in mcp-service-config (gated on connector.id > 0) so
  any connected MCP-backed connector exposes a revoke surface for
  approve_always grants.
- Add new mcp-trusted-tools.tsx (audit + revoke list) and
  connectorsApiService.untrustMCPTool() that backs it.
- Drop the redundant row-level Disconnect from ConnectorAccountsListView:
  Manage now leads to the edit view whose own Disconnect is the single
  source of truth. Remove the now-dead onDisconnect prop, confirm-flow
  state, and handleDisconnectFromList hook callback + return entry.
2026-05-15 16:40:16 +02:00
CREDO23
a22e0e915f schemas/new_chat: accept 'approve_always' on the resume HTTP boundary
ResumeDecision is the Pydantic gate at the /resume HTTP route. It was
the last spot still rejecting the new wire decision-type, so the FE's
'approve_always' dispatch was being 422'd before it could reach the
permission middleware that already speaks it.
2026-05-15 15:23:39 +02:00
CREDO23
1f1b6c5425 hitl/generic-approval: drop client-side MCP gate, dispatch approve_always
The 'Always Allow' button is now driven entirely by the server-supplied
allowed_decisions palette. The card no longer peeks at
context.mcp_connector_id to decide whether to render the button, and no
longer fires a separate trust-tool HTTP call on click - one
{type: 'approve_always'} dispatch is enough; the agent middleware
handles the in-memory promotion and (for MCP tools) the database save
via its trusted_tool_saver callback.

Drops the dead trustMCPTool / untrustMCPTool service helpers - they had
no remaining callers after this rework. The backing HTTP routes are
kept on the server as a programmatic surface.
2026-05-15 14:59:45 +02:00
CREDO23
98b6977c68 permissions/ask: gate 'approve_always' palette entry on MCP-ness
Only MCP tools have a persistence target for 'approve_always' (the
connector's trusted-tools list); for native tools the decision lives
only in the in-memory runtime ruleset. Reflect that in the wire palette
so the FE can stay a pure renderer of allowed_decisions instead of
peeking at context.mcp_connector_id to decide whether to show the
'Always Allow' button.

The backend still accepts an 'approve_always' reply for any tool kind
(in-memory promotion is harmless), it just doesn't advertise it when
there's nowhere to persist.
2026-05-15 14:54:16 +02:00
CREDO23
c8b756ae8f hitl/wire: rename 'always' decision-type to 'approve_always'
Renames the SurfSense HITL extension decision-type from "always" to
"approve_always" so it sits in the same verb-first family as "approve",
"reject", and "edit". The Python constant is now SURFSENSE_DECISION_APPROVE_ALWAYS;
the wire value, the permission-domain decision_type, and the FE union members
all match (no wire/internal mismatch).

Both the multi_agent_chat permission middleware and the legacy new_chat one
accept the new wire value; the FE types.ts union is updated accordingly.

The "context.always" payload key is intentionally left untouched - it's the
patterns-to-promote field, semantically distinct from the decision type.
2026-05-15 14:47:32 +02:00
CREDO23
6671c91841 multi_agent_chat/permissions: persist 'always' decisions to trusted-tools list
Until now an "Always Allow" reply only updated the in-memory runtime
ruleset, evaporating after the session ended. Persist it to the
existing connector.config['trusted_tools'] list so the next session's
fetch_user_allowlist_rulesets picks it up and the user is never asked
again for the same (connector, tool) pair.

- TrustedToolSaver + make_trusted_tool_saver(user_id) in
  user_tool_allowlist: opens its own session via async_session_maker
  per call, logs and swallows failures (in-memory promotion is the
  canonical "always" path, durable persistence is opportunistic).

- PermissionMiddleware._process is now pure: returns
  (state_update, list[_AlwaysPromotion]). aafter_model awaits the
  saver for each promotion; after_model discards them. Promotions are
  only emitted for tools whose metadata exposes mcp_connector_id, so
  native tools and KB FS ops are correctly skipped.

- main_agent factory builds the saver once per turn and stashes it in
  dependencies["trusted_tool_saver"]; pack_subagent and the KB
  middleware stack forward it through build_permission_mw.

- Renamed pm._process(state, None) call sites in two existing tests to
  pm.after_model(state, None) so they exercise the public hook
  contract instead of the now-tuple-returning private method.
2026-05-15 14:07:08 +02:00
Rohan Verma
eea2d68098
Merge pull request #1396 from guangyang1206/fix/shared-thread-timestamp-formatter-1376
Some checks are pending
Build and Push Docker Images / tag_release (push) Waiting to run
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-24.04-arm, linux/arm64, arm64, production) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-latest, linux/amd64, amd64, production) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-24.04-arm, linux/arm64, arm64, runner) (push) Blocked by required conditions
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-latest, linux/amd64, amd64, runner) (push) Blocked by required conditions
Build and Push Docker Images / create_manifest (backend, surfsense-backend) (push) Blocked by required conditions
Build and Push Docker Images / create_manifest (web, surfsense-web) (push) Blocked by required conditions
feat(shared): extract formatThreadTimestamp helper for chats sidebars…
2026-05-15 04:55:47 -07:00
Rohan Verma
7f66159af1
Merge pull request #1391 from guangyang1206/fix/log-mutations-invalidate-all-keys-1369
fix(web): invalidate all log cache keys on log mutations
2026-05-15 04:55:25 -07:00
Rohan Verma
9475036b8a
Merge pull request #1389 from CREDO23/feature/multi-agent
[Feature] Fix multi-agent delegation: orchestrator-only main agent with knowledge_base specialist
2026-05-15 04:54:17 -07:00
Rohan Verma
4ab9544a66
Merge pull request #1382 from mvanhorn/osc/1372-use-canonical-log-types
refactor(use-logs): use canonical log types from contracts/types/log.types
2026-05-15 04:49:21 -07:00
Rohan Verma
4db3cf7fd5
Merge pull request #1377 from AnishSarkar22/feat/e2e-testing-ci
feat: add E2E CI and harden Docker build migrations
2026-05-15 04:47:26 -07:00
CREDO23
a97d1548a6 multi_agent_chat/permissions: surface MCP tool metadata into ask interrupts
The FE permission card needs mcp_connector_id, mcp_server, and
tool_description in the interrupt context to render "Always Allow"
against the right connected account. Thread the tool through the
ask pipeline:

- pack_subagent → build_permission_mw(tools=...) → PermissionMiddleware
  (tools_by_name) → request_permission_decision(tool=...) →
  build_permission_ask_payload(tool=...) projects card fields out of
  BaseTool.

- mcp_tool.py: stdio path now stashes mcp_connector_id in metadata for
  parity with the HTTP path.
2026-05-15 11:28:06 +02:00
DESKTOP-RTLN3BA\$punk
e8aad48ddf refactor(report): enhance citations and clarify implementation details
Updated the multimodal_doc_parser_compare_n171_report.md to include detailed code citations for preprocessing costs and retry logic. Improved clarity on the implementation of the retry mechanism and its impact on failure rates. Added a new section for a code citations index to ensure reproducibility of technical claims.

This enhances the report's transparency and allows readers to trace the source of each claim back to the codebase.
2026-05-14 20:07:14 -07:00
DESKTOP-RTLN3BA\$punk
9bcd50164d feat(evals): publish multimodal_doc parser_compare benchmark + n=171 report
Adds the full parser_compare experiment for the multimodal_doc suite:
six arms compared on 30 PDFs / 171 questions from MMLongBench-Doc with
anthropic/claude-sonnet-4.5 across the board.

Source code:
- core/parsers/{azure_di,llamacloud,pdf_pages}.py: direct parser SDK
  callers (Azure Document Intelligence prebuilt-read/layout, LlamaParse
  parse_page_with_llm/parse_page_with_agent) used by the LC arms,
  bypassing the SurfSense backend so each (basic/premium) extraction
  is a clean A/B independent of backend ETL routing.
- suites/multimodal_doc/parser_compare/{ingest,runner,prompt}.py:
  six-arm benchmark (native_pdf, azure_basic_lc, azure_premium_lc,
  llamacloud_basic_lc, llamacloud_premium_lc, surfsense_agentic) with
  byte-identical prompts per question, deterministic grader, Wilson
  CIs, and the per-page preprocessing tariff cost overlay.

Reproducibility:
- pyproject.toml + uv.lock pin pypdf, azure-ai-documentintelligence,
  llama-cloud-services as new deps.
- .env.example documents the AZURE_DI_* and LLAMA_CLOUD_API_KEY env
  vars now required for parser_compare.
- 12 analysis scripts under scripts/: retry pass with exponential
  backoff, post-retry accuracy merge, McNemar / latency / per-PDF
  stats, context-overflow hypothesis test, etc. Each produces one
  number cited by the blog report.

Citation surface:
- reports/blog/multimodal_doc_parser_compare_n171_report.md: 1219-line
  technical writeup (16 sections) covering headline accuracy, per-format
  accuracy, McNemar pairwise significance, latency / token / per-PDF
  distributions, error analysis, retry experiment, post-retry final
  accuracy, cost amortization model with closed-form derivation, threats
  to validity, and reproducibility appendix.
- data/multimodal_doc/runs/2026-05-14T00-53-19Z/parser_compare/{raw,
  raw_retries,raw_post_retry}.jsonl + run_artifact.json + retry summary
  whitelisted via data/.gitignore as the verifiable numbers source.

Gitignore:
- ignore logs_*.txt + retry_run.log; structured artifacts cover the
  citation surface, debug logs are noise.
- data/.gitignore default-ignores everything, whitelists the n=171 run
  artifacts only (parser manifest left ignored to avoid leaking local
  Windows usernames in absolute paths; manifest is fully regenerable
  via 'ingest multimodal_doc parser_compare').
- reports/.gitignore now whitelists hand-curated reports/blog/.

Also retires the abandoned CRAG Task 3 implementation (download script,
streaming Task 3 ingest, CragTask3Benchmark + tests) and trims the
runner / ingest module APIs to match.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 19:54:41 -07:00
CREDO23
ef1152b80e multi_agent_chat/permissions: layer user allow-list into subagent compile 2026-05-14 21:57:38 +02:00
CREDO23
e99c06c887 user_tool_allowlist: extract trust-tool storage into reusable service 2026-05-14 21:20:30 +02:00
CREDO23
31d6b43a42 multi_agent_chat/shared: drop bucket types and helpers 2026-05-14 20:10:25 +02:00
CREDO23
014801c764 multi_agent_chat/loader: MCP tools as flat list[BaseTool] per agent 2026-05-14 20:10:11 +02:00
CREDO23
5a00df8e48 multi_agent_chat/builtins: KB+deliverables+memory+research adopt RULESET + flat load_tools() 2026-05-14 20:09:55 +02:00
CREDO23
3bb90124d2 multi_agent_chat/connectors: every route declares its own RULESET + flat load_tools() 2026-05-14 20:09:49 +02:00
CREDO23
d45dfbfbd6 multi_agent_chat: pack_subagent owns per-subagent PermissionMiddleware via Ruleset 2026-05-14 20:09:29 +02:00
CREDO23
67142e68b1 multi_agent_chat: scope MCP allow/ask permissions per subagent + drop "policy" synonym 2026-05-14 18:09:14 +02:00
CREDO23
0723702320 multi_agent_chat: real-graph regressions for unified HITL paths + format pass 2026-05-14 17:41:24 +02:00
CREDO23
adb52fb575 multi_agent_chat: KB owns its ruleset, drop interrupt_on duplication 2026-05-14 17:41:07 +02:00
CREDO23
d68280113b multi_agent_chat/connectors+builtins: adopt symmetric self_gated_tool_permission_row helper 2026-05-14 17:40:59 +02:00
CREDO23
a06aec2821 multi_agent_chat/subagents: HITL umbrella + ToolKind rename 2026-05-14 17:40:29 +02:00
CREDO23
8eaab12971 multi_agent_chat/permissions: restructure slice + simplify factory 2026-05-14 17:40:12 +02:00
CREDO23
a36b15b834 multi_agent_chat/middleware: tighten parallel-keying test with heterogeneous bundles and per-slice assertions 2026-05-14 10:11:51 +02:00
CREDO23
d69d2cc1fc multi_agent_chat/middleware: tighten heterogeneous slice arithmetic to (2,3) bundles 2026-05-14 10:05:04 +02:00
CREDO23
668b89927b multi_agent_chat/middleware: real-graph regression test for partial-pause parallel routing 2026-05-14 09:47:24 +02:00
CREDO23
8e10f38f32 multi_agent_chat/middleware: real-graph regression test for all-reject parallel routing 2026-05-14 09:36:03 +02:00
CREDO23
ca57b2106e multi_agent_chat/middleware: real-graph regression test for heterogeneous parallel decisions 2026-05-14 09:26:08 +02:00
DESKTOP-RTLN3BA\$punk
3737118050 chore: evals
Some checks failed
Build and Push Docker Images / tag_release (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-24.04-arm, linux/arm64, arm64) (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_backend, ./surfsense_backend/Dockerfile, backend, surfsense-backend, ubuntu-latest, linux/amd64, amd64) (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-24.04-arm, linux/arm64, arm64) (push) Has been cancelled
Build and Push Docker Images / build (./surfsense_web, ./surfsense_web/Dockerfile, web, surfsense-web, ubuntu-latest, linux/amd64, amd64) (push) Has been cancelled
Build and Push Docker Images / create_manifest (backend, surfsense-backend) (push) Has been cancelled
Build and Push Docker Images / create_manifest (web, surfsense-web) (push) Has been cancelled
2026-05-13 14:02:26 -07:00
CREDO23
f2495092da chat/stream_resume: salt thinking-step prefix with turn_id to avoid duplicate React keys 2026-05-13 21:15:51 +02:00
CREDO23
1bb9f435e5 chat-messages: render and batch-submit multiple HITL approval cards 2026-05-13 21:00:01 +02:00
CREDO23
0fd87ccb7f chat/stream_resume: key Command(resume=...) by Interrupt.id for parallel HITL 2026-05-13 20:59:57 +02:00
CREDO23
c06dd6e8ba chat/stream_new_chat: emit one SSE frame per pending interrupt 2026-05-13 20:59:48 +02:00
CREDO23
583ac83735 multi_agent_chat/middleware: refresh module layout docs 2026-05-13 19:58:59 +02:00