Commit graph

6 commits

Author SHA1 Message Date
orihatav
5a350a7622 fix: narrow exception to (ImportError, OSError) and include error in log
Broad 'except Exception' could silently swallow unexpected failures.
URLError and ConnectionError are both subclasses of OSError, so
'except (ImportError, OSError)' captures all real offline/not-installed
cases while letting genuine programming errors propagate.

Also include the exception detail in the warning message so failures
are diagnosable in logs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 19:45:14 -05:00
orihatav
d0bbe4a921 fix: handle tiktoken network errors in offline environments (issue #264)
In air-gapped / offline Docker deployments, tiktoken.get_encoding() tries
to download the encoding file from openaipublic.blob.core.windows.net.
When that request fails it raises a URLError / OSError — not an ImportError
— so the previous except clause silently missed it and the crash surfaced in
the UI.

Widened `except ImportError` to `except Exception` so all failures —
"not installed" and "network unreachable" — fall through to the word-count
fallback (words × 1.3). Added a loguru WARNING so operators can see when
the fallback is active.

TIKTOKEN_CACHE_DIR now reads from the environment with a blank-safe
fallback (`or` guard prevents os.makedirs("") on empty env var). This lets
Docker images redirect the cache to a path outside /app/data/ so user-data
volume mounts cannot shadow the pre-baked encoding.

Both images now pre-download the o200k_base encoding during the builder
stage (internet is available at build time) and copy it into the runtime
image at /app/tiktoken-cache. ENV TIKTOKEN_CACHE_DIR=/app/tiktoken-cache
is set in the runtime stage so no network call is ever needed at runtime.

Added test_token_count_network_error_fallback in tests/test_utils.py:
patches tiktoken.get_encoding with a URLError and asserts token_count()
returns a positive int instead of raising.

Fixes #264

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 19:45:14 -05:00
MisonL
67dd85c928
Feat/localization tests docker (#371)
* feat(i18n): complete 100% internationalization and fix Next.js 15 compatibility

* feat(i18n): complete 100% internationalization coverage

* chore(test): finalize component tests and project cleanup

* test(logic): add unit tests for useModalManager hook

* fix(test): resolve timeout in AppSidebar tests by mocking TooltipProvider

* feat(i18n): comprehensive i18n audit, fixes for hardcoded strings, and complete zh-TW support

* fix(i18n): resolve TypeScript warnings and improve translation hook stability

- Remove unused useTranslation import from ConnectionGuard
- Add ref-based checking state to prevent dependency cycles
- Fix useTranslation hook to return empty string for undefined translations
- Add comment for backward compatibility on ExtractedReference interface
- Ensure .replace() string methods work safely with nested translation keys

* feat(i18n): complete internationalization implementation with Docker deployment

- Add LanguageLoadingOverlay component for smooth language transitions
- Update all translation files (en-US, zh-CN, zh-TW) with improved terminology
- Optimize Docker configuration for better performance
- Update version check and config handling for i18n support
- Fix route handling for language-specific content
- Add comprehensive task documentation

* fix(i18n): resolve localization errors, duplicates, and type issues

* chore(i18n): finalize 100% internationalization coverage

* chore(test): supplement i18n test cases and cleanup redundant files

* fix(test): resolve lint type errors and finalize delivery documents

* feat(i18n): finalize full internationalization and zh-TW localization

* fix(frontend): add missing devDependency and fix build tsconfig

* feat(ui): enhance sidebar hover effects with better visual feedback

* fix(frontend): resolve accessibility, i18n, and lint issues

- fix: add missing id, name, autocomplete attributes to dialog inputs
- fix: add aria labels and DialogDescription for accessibility
- fix: resolve uncontrolled component warning in SettingsForm
- fix: correct duplicate 'Traditional Chinese' label in zh-TW locale
- feat: add i18n support for podcast template names
- chore: fix lint errors in Dialogs

* fix: address all 21 PR feedback items from cubic-dev-ai bot

Configuration:
- Remove ignoreDuringBuilds flags from next.config.ts

Testing:
- Fix AppSidebar.test.tsx regex pattern and add missing assertion

Logic:
- Fix ConnectionGuard.tsx re-entry prevention logic

Internationalization (I18n) - Translations:
- Add missing keys: notebooks.archived, common.note/insight, accessibility keys
- Add specific keys: sources.allSourcesDescShort, transformations.selectModel
- Add singular/plural keys: podcasts.usedByCount_one/other, common.note/notes
- Add common.created/updated with {time} placeholder

Internationalization (I18n) - Usage:
- SourcesPage: use allSourcesDescShort instead of string splitting
- TransformationPlayground: use navigation.transformation and selectModel
- CommandPalette: use dedicated keys instead of string concatenation
- GeneratePodcastDialog: fix zh-TW date locale handling
- NotebookHeader: correctly interpolate {time} placeholder
- TransformationCard: use common.description instead of undefined key
- ChatPanel/SpeakerProfilesPanel: implement proper pluralization
- SystemInfo: correctly interpolate {version} placeholder
- LanguageLoadingOverlay: use t.common.loading instead of hardcoded string
- MessageActions: use specific error key cannotSaveNoteNoNotebook

Other:
- Fix SessionManager.tsx exhaustive-deps warning

* fix: remove duplicate locale keys and add missing zh-CN translations

- en-US: remove duplicate loading key (line 59) and addNew key (sources)
- zh-CN: remove duplicate common keys (loading, note, insight, newSource, newNotebook, newPodcast)
- zh-CN: remove duplicate accessibility.searchNotebooks key
- zh-CN: remove duplicate sources.addNew key
- zh-CN: remove duplicate navigation.transformation key
- zh-CN: add missing usedByCount_one and usedByCount_other keys in podcasts
- zh-TW: remove duplicate common keys (loading, note, insight, newSource, newNotebook, newPodcast)
- zh-TW: remove duplicate accessibility.searchNotebooks key
- zh-TW: remove duplicate sources.addNew key

* docs: remove info.md

* fix: remove duplicate notebook keys and unused ts-expect-error

- zh-CN: remove duplicate notebooks keys (archived, archive, unarchive, deleteNotebook, deleteNotebookDesc)
- zh-TW: remove duplicate notebooks keys (archived, archive, unarchive, deleteNotebook, deleteNotebookDesc)
- GeneratePodcastDialog: remove unused @ts-expect-error directive

* fix(a11y): fix unassociated labels in search page

- Replace <Label> with role='group' + aria-labelledby for search type section
- Replace <Label> with role='group' + aria-labelledby for search in section
- Follows WAI-ARIA best practices for labeling form field groups

* fix(a11y): fix unassociated labels across multiple components

- search/page.tsx: use role='group' + aria-labelledby for search type and search in sections
- RebuildEmbeddings.tsx: use role='group' + aria-labelledby for include checkboxes
- TransformationPlayground.tsx: replace Label with span for non-form output label

* chore: revert to npm stack and ensure i18n compatibility

* chore: polish zh-TW translations for better idiomatic usage

* fix: resolve linter errors (ruff import sort, mypy config duplicate)

* style: apply ruff formatting

* fix: finalize upstream compliance (Dockerfile.single, i18n hooks, docker-compose)

* style: polish strings, fix timeout cleanup, and improve test mocks

* fix: use relative imports in test setup to resolve IDE path errors

* perf(docker): optimize build speed by removing apt-get upgrade and build tools

- Remove apt-get upgrade from both builder and runtime stages (saves 10-15 min each)
- Remove gcc/g++/make/git from builder (uv downloads pre-built wheels)
- Add --no-install-recommends to minimize package footprint
- Keep npm mirror (npmmirror.com) for faster frontend deps
- Add npm registry config for reliable China network access

Also includes:
- fix(a11y): add missing labels and aria attributes to form fields
- fix(i18n): add 2s safety timeout to LanguageLoadingOverlay
- fix(i18n): add robustness checks to use-translation proxy

Build time reduced from 2+ hours to ~34 minutes (~70% improvement)

* fix(a11y): resolve 16 form field accessibility warnings in notebook and podcast pages

* fix(a11y): resolve 4 button and 1 select field accessibility warnings in models page

* fix(a11y): resolve redundant attributes and residual warnings in transformations and podcast forms

* fix(i18n): deep fix for language switch hang using proxy protection and safer access

* fix(a11y): add name attributes to ModelSelector, TransformationPlayground, and SourceDetailContent

* fix: add missing Label import to SourceDetailContent

* fix(i18n): use native react-i18next in LanguageLoadingOverlay to prevent hang during language switch

* fix(i18n): rewrite use-translation Proxy with strict depth limit and expanded blocked props to prevent language switch hang

* fix: add type assertion to fix TypeScript comparison error

* fix(i18n): disable useSuspense to prevent thread hang during language resource loading

* fix(i18n): add infinite loop detection circuit breaker to useTranslation hook

* fix(i18n): update traditional chinese label to native script in en-US

* feat: add new localization strings for notebook and note management.

* fix: resolve config priority, docker build deps, and ui glitches

* refactor: improve ui details and test coverage based on feedback

* refactor: improve ui details (version check/lang toggle) and test coverage

* fix: polish language matching and test cleanup

* fix(test): update mocks to resolve timeouts and proxy errors

* fix(frontend): restore tsconfig.json structure and enable IDE support for tests

* fix: address PR review findings and resolve CI OIDC failure

* fix: merge exception headers in custom handler

* fix: comprehensive PR review remediations and async performance fixes

* refactor: address all PR #371 review feedback

- Docker: consolidate SURREAL_URL to docker.env, add single-container override
- Security: restore apt-get upgrade in Dockerfile and Dockerfile.single
- Create centralized getDateLocale helper (lib/utils/date-locale.ts)
- Refactor 7 files to use getDateLocale helper
- Revert config/route.ts to origin/main version
- Move test files to co-located pattern (3 files)
- Remove local useTranslation mock from ConfirmDialog.test.tsx
- Simplify use-version-check to single useEffect pattern
- Fix test import paths after moving to co-located pattern

* fix: add jest-dom types for test files

* fix: address remaining review issues

- Add apt-get upgrade -y to Dockerfile.single backend-builder stage
- Refactor ChatColumn.test.tsx: use 'as unknown as ReturnType<typeof hook>' instead of 'as any'
- Use toBeInTheDocument() assertions instead of toBeDefined()
2026-01-15 13:51:05 -03:00
Luis Novo
1a67f1f912
fix: enhance chat reference links and prevent text overflow (#173)
This commit addresses two related issues in the chat interface:

1. **Fix broken reference links (OSS-310)**
   - Completely rewrote convertReferencesToMarkdownLinks() with greedy pattern matching
   - Now handles all edge cases: references after commas, nested brackets, bold markdown
   - Added visual icon indicators (FileText, Lightbulb, FileEdit) for reference types
   - Implemented proper error handling with toast notifications
   - Added validation for reference types and ID lengths

2. **Fix long URL/text overflow (#172)**
   - Added break-words and overflow-wrap classes to chat messages
   - Long URLs and text now wrap properly within chat bubbles
   - Applied fix consistently across source chat, notebook chat, and search results

**Technical Details:**
- Enhanced reference detection algorithm processes from end to start to preserve indices
- Context analysis (50 chars before/after) determines original formatting
- Icons are 12px, accessible, and themed appropriately
- All changes pass linting and build successfully

**Files Modified:**
- frontend/src/lib/utils/source-references.tsx (core algorithm rewrite)
- frontend/src/components/source/ChatPanel.tsx (error handling + text wrapping)
- frontend/src/components/search/StreamingResponse.tsx (error handling + text wrapping)
- open_notebook/utils/token_utils.py (ruff formatting fix)

fixes #172
2025-10-19 15:38:59 -03:00
Luis Novo
aa593c60bd
feat: add persistent tiktoken cache to reduce re-downloads (#171)
Some checks are pending
Development Build / extract-version (push) Waiting to run
Development Build / test-build-regular (push) Blocked by required conditions
Development Build / test-build-single (push) Blocked by required conditions
Development Build / summary (push) Blocked by required conditions
Configure tiktoken to cache tokenizer encodings in ./data/tiktoken-cache
instead of using system temp directory. This prevents re-downloading
encoding files on every container restart and improves startup time.

Changes:
- Add TIKTOKEN_CACHE_DIR configuration in config.py
- Set TIKTOKEN_CACHE_DIR environment variable in token_utils.py
- Bump version to 1.0.7
2025-10-19 14:50:52 -03:00
Luis Novo
b7e656a319
Version 1 (#160)
New front-end
Launch Chat API
Manage Sources
Enable re-embedding of all contents
Sources can be added without a notebook now
Improved settings
Enable model selector on all chats
Background processing for better experience
Dark mode
Improved Notes

Improved Docs: 
- Remove all Streamlit references from documentation
- Update deployment guides with React frontend setup
- Fix Docker environment variables format (SURREAL_URL, SURREAL_PASSWORD)
- Update docker image tag from :latest to :v1-latest
- Change navigation references (Settings → Models to just Models)
- Update development setup to include frontend npm commands
- Add MIGRATION.md guide for users upgrading from Streamlit
- Update quick-start guide with correct environment variables
- Add port 5055 documentation for API access
- Update project structure to reflect frontend/ directory
- Remove outdated source-chat documentation files
2025-10-18 12:46:22 -03:00