DrAgnes DermLite Integration Research

Status: Research & Planning
Date: 2026-03-21

Overview

DermLite (manufactured by 3Gen Inc., San Juan Capistrano, CA) is the world's most widely used line of dermatoscopes. DrAgnes is designed as a DermLite-native platform, providing purpose-built integration with their device ecosystem for standardized dermoscopic imaging and analysis.

DermLite Device Lineup

DermLite HUD (Heads-Up Display)

Form Factor: Standalone camera with built-in display and optics
Magnification: 10x polarized
Illumination: LED ring with polarization filter
Camera: Built-in 12MP sensor, 1920x1080 capture
Connectivity: Wi-Fi (image transfer), Bluetooth (metadata/control)
Unique Features:
- Hands-free operation (no phone attachment needed)
- Built-in display shows magnified real-time view
- Dual-mode: polarized and non-polarized switching
- Internal storage for batch capture
DrAgnes Integration: Wi-Fi direct for image transfer; Bluetooth for device control and metadata. Best suited for high-volume clinical environments.

DermLite DL5

Form Factor: Handheld dermatoscope with smartphone adapter
Magnification: 10x, hybrid polarized/non-polarized (toggle)
Illumination: 20 PigmentBoost LEDs + 4 polarized LEDs
Adapter: Universal magnetic mount (MagnetiConnect)
Power: Rechargeable lithium-ion, 4+ hours continuous use
Unique Features:
- PigmentBoost mode enhances pigmented structures
- Hybrid mode allows instant switching without contact loss
- Crystal-clear optics with minimal distortion
- Compact enough for pocket carry
DrAgnes Integration: Phone camera passthrough via adapter. Camera API captures at phone's native resolution. DL5's PigmentBoost mode is flagged in metadata for preprocessing calibration.

DermLite DL4

Form Factor: Compact pocket dermatoscope
Magnification: 10x, polarized only
Illumination: LED ring, polarized
Adapter: Smartphone adapter available (MagnetiConnect)
Power: Rechargeable or AA batteries
Unique Features:
- Most affordable DermLite model
- Widely adopted in primary care
- Lightweight (50g)
DrAgnes Integration: Same phone camera passthrough as DL5. Lower-tier device but adequate for DrAgnes classification. Ideal for primary care adoption.

DermLite DL200 Hybrid

Form Factor: Handheld with contact/non-contact dual mode
Magnification: 10x
Illumination: Hybrid LED system
Contact Mode: Immersion fluid or direct contact with glass plate
Non-Contact Mode: Cross-polarized at distance
Adapter: Magnetic smartphone mount
Unique Features:
- Contact mode reveals subsurface structures (vessels, deeper pigment)
- Non-contact mode for mucosal surfaces, painful areas
- Dual-mode in single device
DrAgnes Integration: Contact mode detection via metadata or image analysis (presence of glass plate reflection). Different preprocessing paths for contact vs. non-contact images.

Image Capture Integration

MediaStream API (Browser-Based)

DrAgnes Camera Module
    │
    ├── navigator.mediaDevices.getUserMedia({
    │       video: {
    │           facingMode: 'environment',    // Rear camera (DermLite side)
    │           width: { ideal: 1920 },
    │           height: { ideal: 1080 },
    │           frameRate: { ideal: 30 },
    │           focusMode: 'manual',          // Lock focus for dermoscopy
    │           whiteBalanceMode: 'manual',   // Calibrated for DermLite LEDs
    │       }
    │   })
    │
    ├── Live Preview (Canvas)
    │       ├── Real-time focus quality indicator
    │       ├── Lesion centering guide (circle overlay)
    │       ├── Exposure warning (over/under)
    │       └── DermLite detection indicator
    │
    ├── Capture (requestVideoFrameCallback)
    │       ├── High-res still capture (max sensor resolution)
    │       ├── Multi-frame averaging (3 frames for noise reduction)
    │       └── Auto-rotation correction
    │
    └── Storage (IndexedDB)
            ├── Original capture (encrypted)
            ├── Preprocessed 224x224 tensor
            └── Metadata (device, timestamp, settings)
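
The constraint set in the diagram can be assembled defensively. The sketch below is a minimal illustration, not the DrAgnes implementation: `buildDermVideoConstraints` and `DermVideoConstraints` are hypothetical names, and it assumes that `focusMode` and `whiteBalanceMode` (which come from the MediaStream Image Capture extensions) should only be requested when the browser reports support for them.

```typescript
// Subset of the video constraints shown in the diagram. focusMode and
// whiteBalanceMode are Image Capture extensions and are not supported everywhere.
interface DermVideoConstraints {
  facingMode: string;
  width: { ideal: number };
  height: { ideal: number };
  frameRate: { ideal: number };
  focusMode?: string;
  whiteBalanceMode?: string;
}

// `supported` mirrors the shape returned by
// navigator.mediaDevices.getSupportedConstraints().
function buildDermVideoConstraints(
  supported: Record<string, boolean | undefined>
): DermVideoConstraints {
  const video: DermVideoConstraints = {
    facingMode: "environment", // rear camera (DermLite side)
    width: { ideal: 1920 },
    height: { ideal: 1080 },
    frameRate: { ideal: 30 },
  };
  // Only ask for manual focus / white balance when the browser advertises them.
  if (supported["focusMode"]) video.focusMode = "manual";
  if (supported["whiteBalanceMode"]) video.whiteBalanceMode = "manual";
  return video;
}
```

In a browser, the result would be passed as the `video` member of the object given to `navigator.mediaDevices.getUserMedia`. Whether a given phone actually honors manual focus through a DermLite adapter is something to verify per device.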

DermLite Device Detection

DrAgnes auto-detects DermLite attachment through multiple signals:

1. Image analysis: DermLite images have characteristic features:
   - Circular field of view (dark corners from circular optics)
   - Consistent illumination pattern (LED ring)
   - Magnification level (10x produces distinctive scale)
   - Polarization artifacts (cross-polarized light produces specific color shifts)
2. EXIF metadata: Some DermLite-phone combinations include device info
3. User confirmation: Manual DermLite model selection in UI as fallback
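
One of the image-analysis signals, the dark corners produced by circular optics, can be approximated with a simple brightness comparison. This is a rough sketch under stated assumptions: the function names, the patch size (one tenth of the shorter side), and the `maxRatio` threshold of 0.2 are all illustrative and would need tuning on real DermLite captures.

```typescript
// Mean intensity of a square patch in a row-major 8-bit grayscale image.
function meanOfPatch(
  gray: Uint8Array, width: number, x0: number, y0: number, size: number
): number {
  let sum = 0;
  for (let y = y0; y < y0 + size; y++) {
    for (let x = x0; x < x0 + size; x++) sum += gray[y * width + x];
  }
  return sum / (size * size);
}

// Ratio of average corner brightness to centre brightness.
// Values near 0 indicate dark corners around a bright centre.
function cornerToCenterRatio(gray: Uint8Array, width: number, height: number): number {
  const size = Math.max(1, Math.floor(Math.min(width, height) / 10));
  const cornerMean =
    (meanOfPatch(gray, width, 0, 0, size) +
      meanOfPatch(gray, width, width - size, 0, size) +
      meanOfPatch(gray, width, 0, height - size, size) +
      meanOfPatch(gray, width, width - size, height - size, size)) / 4;
  const centerMean = meanOfPatch(
    gray, width, Math.floor((width - size) / 2), Math.floor((height - size) / 2), size
  );
  return centerMean === 0 ? 1 : cornerMean / centerMean;
}

// Hypothetical decision rule: corners much darker than the centre.
function looksLikeCircularField(
  gray: Uint8Array, width: number, height: number, maxRatio = 0.2
): boolean {
  return cornerToCenterRatio(gray, width, height) <= maxRatio;
}
```

A dark-corner check alone cannot distinguish a dermatoscope from any other vignetting lens, which is why the document combines it with EXIF metadata and manual confirmation.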

Image Quality Assessment

Before classification, DrAgnes assesses image quality:

Quality Assessment Pipeline
    │
    ├── Focus Quality (Laplacian variance)
    │       ├── Score < 100: "Blurry -- please refocus"
    │       ├── Score 100-500: "Acceptable"
    │       └── Score > 500: "Sharp"
    │
    ├── Exposure Check (histogram analysis)
    │       ├── Mean intensity < 50: "Underexposed"
    │       ├── Mean intensity > 200: "Overexposed"
    │       └── Dynamic range < 100: "Low contrast"
    │
    ├── Lesion Coverage (center ROI analysis)
    │       ├── Lesion < 10% of frame: "Too far -- zoom in"
    │       ├── Lesion > 90% of frame: "Too close -- zoom out"
    │       └── Lesion off-center: "Center the lesion"
    │
    ├── Hair Occlusion (line detection)
    │       ├── > 20% coverage: "Excessive hair -- consider removal"
    │       └── Software hair removal applied regardless
    │
    └── Artifact Detection
            ├── Bubble artifacts (contact dermoscopy)
            ├── Reflection artifacts (glass plate)
            └── Motion blur (movement during capture)
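
The focus and exposure checks in the pipeline above can be sketched as follows. The thresholds are the ones listed in the diagram; the function names and the choice of a 4-neighbour Laplacian kernel are assumptions for illustration, and the absolute variance values depend on image resolution and kernel, so the thresholds would need calibration against real captures.

```typescript
// Variance of a 4-neighbour Laplacian over the interior pixels of a
// row-major 8-bit grayscale image. Higher values mean more fine detail.
function laplacianVariance(gray: Uint8Array, width: number, height: number): number {
  let sum = 0;
  let sumSq = 0;
  let n = 0;
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      const lap =
        gray[i - 1] + gray[i + 1] + gray[i - width] + gray[i + width] - 4 * gray[i];
      sum += lap;
      sumSq += lap * lap;
      n++;
    }
  }
  if (n === 0) return 0;
  const mean = sum / n;
  return sumSq / n - mean * mean;
}

// Maps a focus score to the messages from the pipeline diagram.
function focusQualityLabel(score: number): string {
  if (score < 100) return "Blurry -- please refocus";
  if (score <= 500) return "Acceptable";
  return "Sharp";
}

// Exposure check: mean intensity and dynamic range (max - min).
function exposureWarnings(gray: Uint8Array): string[] {
  const warnings: string[] = [];
  if (gray.length === 0) return warnings;
  let sum = 0;
  let min = 255;
  let max = 0;
  for (let i = 0; i < gray.length; i++) {
    const v = gray[i];
    sum += v;
    if (v < min) min = v;
    if (v > max) max = v;
  }
  const mean = sum / gray.length;
  if (mean < 50) warnings.push("Underexposed");
  if (mean > 200) warnings.push("Overexposed");
  if (max - min < 100) warnings.push("Low contrast");
  return warnings;
}
```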

Dermoscopic Analysis Modules

ABCDE Criteria Automation

The ABCDE mnemonic is the most widely taught screening tool for melanoma detection.

A - Asymmetry:

Method: Divide lesion along two perpendicular axes of maximum symmetry
    │
    ├── Segmentation: Otsu thresholding + morphological cleanup
    ├── Axis detection: Principal Component Analysis on contour points
    ├── Mirror comparison: XOR of left/right and top/bottom halves
    ├── Scoring:
    │       ├── 0: Symmetric along both axes
    │       ├── 1: Asymmetric along one axis
    │       └── 2: Asymmetric along both axes
    └── Weight: 1.3x (highest discriminative power for melanoma)
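
The mirror-comparison and scoring steps can be sketched as below. This is a simplified illustration: it assumes segmentation and PCA alignment have already happened, so the binary mask arrives centred with its principal axes parallel to the image axes. The function names and the 10% mismatch tolerance are hypothetical.

```typescript
// Fraction of lesion pixels that have no counterpart after mirroring the mask.
// mask: row-major binary mask (1 = lesion, 0 = background).
// horizontal = true mirrors left/right; false mirrors top/bottom.
function mirrorMismatch(
  mask: Uint8Array, width: number, height: number, horizontal: boolean
): number {
  let area = 0;
  let diff = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const mx = horizontal ? width - 1 - x : x;
      const my = horizontal ? y : height - 1 - y;
      const a = mask[y * width + x];
      const b = mask[my * width + mx];
      area += a;
      if (a !== b) diff++; // XOR of the mask with its mirror image
    }
  }
  // Each mismatched pair is visited from both sides, hence the division by 2.
  return area === 0 ? 0 : diff / (2 * area);
}

// 0 = symmetric on both axes, 1 = asymmetric on one, 2 = asymmetric on both.
function asymmetryScore(
  mask: Uint8Array, width: number, height: number, tolerance = 0.1
): number {
  const leftRight = mirrorMismatch(mask, width, height, true) > tolerance ? 1 : 0;
  const topBottom = mirrorMismatch(mask, width, height, false) > tolerance ? 1 : 0;
  return leftRight + topBottom;
}
```

Real lesions are never perfectly symmetric at the pixel level, so the tolerance does most of the work here and would have to be chosen from annotated data.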

B - Border Irregularity:

Method: Divide border into 8 equal segments, assess each
    │
    ├── Contour extraction: Canny edge detection on segmentation mask
    ├── Segment division: 8 equal arc-length segments from centroid
    ├── Irregularity metrics per segment:
    │       ├── Fractal dimension (box-counting method)
    │       ├── Curvature variation (second derivative of contour)
    │       └── Abrupt border cutoff (gradient magnitude at boundary)
    ├── Scoring: 0-8 (count of irregular segments)
    └── Weight: 0.1x per segment

C - Color:

Method: Count distinct colors present in lesion
    │
    ├── Color space: Convert to perceptually uniform CIELAB
    ├── Reference colors (6 clinically significant):
    │       ├── Light brown (tan)
    │       ├── Dark brown
    │       ├── Black
    │       ├── Red
    │       ├── Blue-gray
    │       └── White (regression)
    ├── Detection: K-means clustering (k=6) + distance to reference
    ├── Scoring: 1-6 (count of colors present)
    └── Weight: 0.5x
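
The "distance to reference" part of the color count can be illustrated without the clustering step. The sketch below swaps K-means for a direct nearest-reference assignment per pixel, which is simpler but not the method described above. The CIELAB anchor values are rough illustrative guesses, not clinically calibrated references, and the 5% minimum coverage is likewise an assumption.

```typescript
type LabColor = [number, number, number]; // [L*, a*, b*]

// Illustrative CIELAB anchors only; NOT clinically calibrated values.
const REFERENCE_LAB: { label: string; lab: LabColor }[] = [
  { label: "light brown", lab: [65, 15, 25] },
  { label: "dark brown", lab: [30, 15, 20] },
  { label: "black", lab: [10, 0, 0] },
  { label: "red", lab: [50, 55, 35] },
  { label: "blue-gray", lab: [50, -5, -15] },
  { label: "white", lab: [95, 0, 0] },
];

// Index of the reference color with the smallest Euclidean distance in CIELAB.
function nearestReferenceIndex(pixel: LabColor): number {
  let best = 0;
  let bestDist = Infinity;
  for (let i = 0; i < REFERENCE_LAB.length; i++) {
    const [l, a, b] = REFERENCE_LAB[i].lab;
    const d = (pixel[0] - l) ** 2 + (pixel[1] - a) ** 2 + (pixel[2] - b) ** 2;
    if (d < bestDist) {
      bestDist = d;
      best = i;
    }
  }
  return best;
}

// Returns the reference colors covering at least minFraction of lesion pixels.
// The C score is the length of the returned list.
function detectLesionColors(pixels: LabColor[], minFraction = 0.05): string[] {
  const counts = REFERENCE_LAB.map(() => 0);
  for (const px of pixels) counts[nearestReferenceIndex(px)]++;
  return REFERENCE_LAB
    .filter((_, i) => pixels.length > 0 && counts[i] / pixels.length >= minFraction)
    .map((ref) => ref.label);
}
```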

D - Diameter:

Method: Maximum diameter of lesion in mm
    │
    ├── Calibration: DermLite ruler overlay or known magnification (10x)
    ├── Measurement: Maximum Feret diameter of segmentation contour
    ├── Threshold: 6mm is the clinical cutoff
    ├── Note: Nodular melanomas can be < 6mm; size alone is insufficient
    └── Weight: Binary (>= 6mm adds to risk score)
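
As a worked example of the measurement step, the maximum Feret diameter is the largest distance between any two contour points, converted to millimetres with a per-device calibration factor. The names below are hypothetical, and the brute-force pair search is an assumption chosen for clarity; it is quadratic in the number of contour points, so a production version would likely restrict the search to the convex hull.

```typescript
interface ContourPoint {
  x: number;
  y: number;
}

// Largest pairwise distance between contour points, in pixels.
function maxFeretDiameterPx(contour: ContourPoint[]): number {
  let best = 0;
  for (let i = 0; i < contour.length; i++) {
    for (let j = i + 1; j < contour.length; j++) {
      const d = Math.hypot(contour[i].x - contour[j].x, contour[i].y - contour[j].y);
      if (d > best) best = d;
    }
  }
  return best;
}

// Converts to millimetres and applies the 6 mm clinical cutoff.
// pixelsPerMm is the calibration factor stored for the device.
function diameterCriterion(
  contour: ContourPoint[], pixelsPerMm: number
): { diameterMm: number; meetsCutoff: boolean } {
  const diameterMm = maxFeretDiameterPx(contour) / pixelsPerMm;
  return { diameterMm, meetsCutoff: diameterMm >= 6 };
}
```

For instance, a contour whose farthest points are 50 px apart measures 10 mm at 5 px/mm and 5 mm at 10 px/mm, so only the first case meets the cutoff.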

E - Evolution:

Method: Compare current image to prior captures of same lesion
    │
    ├── Registration: Affine alignment using lesion contour landmarks
    ├── Change detection:
    │       ├── Area change (growth rate in mm^2/month)
    │       ├── Color change (new colors appearing)
    │       ├── Shape change (symmetry score delta)
    │       ├── Border change (irregularity score delta)
    │       └── New structures (dermoscopic features appearing/disappearing)
    ├── Scoring: Composite change score normalized to 0-1
    └── Note: Most powerful criterion but requires longitudinal data
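
The area-change term can be made concrete with a small calculation. This sketch assumes two captures of the same lesion that are already registered and measured in mm^2; the `LesionCapture` shape and the use of an average month of 30.4375 days are assumptions, and the composite 0-1 score is not shown because its formula is not specified here.

```typescript
interface LesionCapture {
  capturedAt: Date;
  areaMm2: number;
}

// Average month length in milliseconds (365.25 days / 12).
const MS_PER_MONTH = 30.4375 * 24 * 60 * 60 * 1000;

// Area growth rate in mm^2 per month between two captures of the same lesion.
function areaGrowthRate(prior: LesionCapture, current: LesionCapture): number {
  const elapsedMs = current.capturedAt.getTime() - prior.capturedAt.getTime();
  if (elapsedMs <= 0) {
    throw new Error("current capture must be later than prior capture");
  }
  return (current.areaMm2 - prior.areaMm2) / (elapsedMs / MS_PER_MONTH);
}
```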

7-Point Checklist (Argenziano Method)

A structured scoring system for dermoscopic evaluation:

| Criterion | Points | Detection Method |
| --- | --- | --- |
| Atypical pigment network | 2 (major) | CNN feature detection on dermoscopic structures |
| Blue-whitish veil | 2 (major) | Color analysis in blue-gray spectrum + opacity detection |
| Atypical vascular pattern | 2 (major) | Red channel analysis + vessel topology extraction |
| Irregular streaks | 1 (minor) | Directional filter banks + radial analysis from center |
| Irregular dots/globules | 1 (minor) | Blob detection (LoG) + regularity analysis |
| Irregular blotches | 1 (minor) | Connected component analysis in dark regions |
| Regression structures | 1 (minor) | White scar-like areas + blue-gray peppering detection |

Interpretation: Total score >= 3 suggests melanoma. Sensitivity ~95%, specificity ~75% in clinical studies.

DrAgnes Implementation: Each criterion has a dedicated CNN sub-head trained on the Derm7pt dataset, which provides expert annotations for all 7 criteria. The sub-heads share the MobileNetV3 backbone but have independent classification layers.
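
Once the sub-heads have decided which criteria are present, the checklist arithmetic is straightforward. The sketch below uses the point values and the >= 3 threshold from the table above; the identifier names are hypothetical and the detection itself is out of scope.

```typescript
// Point values from the 7-point checklist: major criteria 2, minor criteria 1.
const SEVEN_POINT_WEIGHTS = {
  atypicalPigmentNetwork: 2,
  blueWhitishVeil: 2,
  atypicalVascularPattern: 2,
  irregularStreaks: 1,
  irregularDotsGlobules: 1,
  irregularBlotches: 1,
  regressionStructures: 1,
} as const;

type SevenPointCriterion = keyof typeof SEVEN_POINT_WEIGHTS;

// Sums the points of the criteria reported as present (duplicates ignored)
// and applies the total >= 3 interpretation.
function sevenPointScore(
  present: SevenPointCriterion[]
): { total: number; suspicious: boolean } {
  const seen: Partial<Record<SevenPointCriterion, boolean>> = {};
  let total = 0;
  for (const criterion of present) {
    if (seen[criterion]) continue;
    seen[criterion] = true;
    total += SEVEN_POINT_WEIGHTS[criterion];
  }
  return { total, suspicious: total >= 3 };
}
```

A single major criterion plus one minor criterion reaches the threshold (2 + 1 = 3), while two minor criteria alone do not.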

Menzies Method

A simplified 2-step approach used in clinical practice:

Step 1 - Negative Features (must be absent for melanoma):

- Point symmetry of pigmentation
- Single color presence

Step 2 - Positive Features (at least one must be present for melanoma):

- Blue-white veil
- Multiple brown dots
- Pseudopods
- Radial streaming
- Scar-like depigmentation
- Peripheral black dots/globules
- Multiple colors (5-6)
- Multiple blue-gray dots
- Broadened network

DrAgnes Implementation: Binary classifiers for each positive and negative feature. If both negative features are absent AND at least one positive feature is present, flag for melanoma consideration.
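
The decision rule described above reduces to a small boolean expression. This is a sketch with hypothetical type and function names; the per-feature binary classifiers that would populate `MenziesFindings` are assumed to exist elsewhere.

```typescript
type MenziesPositiveFeature =
  | "blueWhiteVeil"
  | "multipleBrownDots"
  | "pseudopods"
  | "radialStreaming"
  | "scarLikeDepigmentation"
  | "peripheralBlackDotsGlobules"
  | "multipleColors"
  | "multipleBlueGrayDots"
  | "broadenedNetwork";

interface MenziesFindings {
  // Negative features: both must be absent for a melanoma flag.
  pointSymmetryOfPigmentation: boolean;
  singleColor: boolean;
  // Positive features detected as present.
  positiveFeatures: MenziesPositiveFeature[];
}

// True when both negative features are absent and at least one
// positive feature is present.
function menziesFlagsMelanoma(findings: MenziesFindings): boolean {
  const negativesAbsent =
    !findings.pointSymmetryOfPigmentation && !findings.singleColor;
  return negativesAbsent && findings.positiveFeatures.length > 0;
}
```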

Pattern Analysis (Advanced Dermoscopy)

Beyond ABCDE and checklists, DrAgnes performs pattern-level analysis:

Global Patterns:

| Pattern | Association | Detection |
| --- | --- | --- |
| Reticular | Benign melanocytic | Network detection via Gabor filters |
| Globular | Benign melanocytic | Blob detection (LoG, DoG) |
| Homogeneous | Benign (blue nevus, dermatofibroma) | Variance analysis (low variance = homogeneous) |
| Starburst | Spitz nevus or melanoma | Radial streaks from center + symmetry |
| Multicomponent | Melanoma (multiple patterns) | Pattern diversity score (entropy) |
| Nonspecific | Various | Low confidence flag for expert review |
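
The pattern diversity score for the multicomponent case can be read as Shannon entropy over the share of lesion area assigned to each pattern. The sketch below assumes per-pattern areas are already available from upstream detectors; the function name and the choice of bits as the unit are illustrative.

```typescript
// Shannon entropy (in bits) of the distribution of lesion area across patterns.
// areas: non-negative area (or pixel count) attributed to each global pattern.
function patternDiversityBits(areas: number[]): number {
  const total = areas.reduce((sum, a) => sum + a, 0);
  if (total <= 0) return 0;
  let entropy = 0;
  for (const a of areas) {
    if (a <= 0) continue; // a pattern with no area contributes nothing
    const p = a / total;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}
```

A lesion dominated by one pattern scores 0 bits, and a lesion split evenly across four patterns scores 2 bits. Where to place the cutoff for "multicomponent" is left open here.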

Local Structures:

| Structure | Clinical Significance | Detection Method |
| --- | --- | --- |
| Pigment network | Regular=benign, irregular=suspicious | Gabor filter response + regularity metrics |
| Dots | Regular=benign, irregular=melanoma | LoG blob detection + spatial distribution analysis |
| Globules | Regular=benign, irregular=melanoma | Larger blob detection + shape analysis |
| Streaks | Radial=melanoma, regular=Spitz | Directional filter + radial pattern detection |
| Blue-white veil | Melanoma indicator | Color segmentation + opacity detection |
| Regression structures | Melanoma regression | White+blue-gray area detection |
| Vascular structures | Various (type-dependent) | Red channel + vessel topology |
| Milia-like cysts | Seborrheic keratosis | Bright spot detection with specific shape |
| Comedo-like openings | Seborrheic keratosis | Dark spot detection + shape analysis |
| Leaf-like structures | BCC | Edge structure detection + morphology |
| Large blue-gray ovoid nests | BCC | Connected component + color analysis |

EHR Integration Research

FHIR R4 Resources

DrAgnes maps to standard FHIR resources for EHR interoperability:

| DrAgnes Entity | FHIR Resource | Notes |
| --- | --- | --- |
| DermImage | Media | With bodySite coding (SNOMED CT) |
| LesionClassification | DiagnosticReport | result element references the supporting Observations |
| ABCDE Scores | Observation | One per criterion, grouped |
| Clinician Feedback | ClinicalImpression | Links to DiagnosticReport |
| Biopsy Result | DiagnosticReport | histopathology category |
| Follow-Up | ServiceRequest | scheduled monitoring |
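
To make the mapping concrete, here is a minimal sketch of how one ABCDE score and its parent report might be serialized as FHIR R4 JSON. Only a small subset of each resource is modeled. The resource ids, the helper names, and the text-only `code` values are assumptions; a real integration would use proper LOINC or SNOMED CT codings, which are deliberately left out rather than guessed.

```typescript
interface FhirReference {
  reference: string;
}

// Minimal subset of a FHIR R4 Observation for one ABCDE criterion.
interface AbcdeObservation {
  resourceType: "Observation";
  id: string;
  status: "preliminary" | "final";
  code: { text: string }; // text only; no terminology codes are invented here
  subject: FhirReference;
  derivedFrom: FhirReference[]; // the Media resource holding the image
  valueInteger: number;
}

// Minimal subset of a FHIR R4 DiagnosticReport.
interface LesionDiagnosticReport {
  resourceType: "DiagnosticReport";
  status: "preliminary" | "final";
  code: { text: string };
  subject: FhirReference;
  result: FhirReference[]; // references to the Observations
}

function toAbcdeObservation(
  criterion: string, score: number, patientId: string, mediaId: string
): AbcdeObservation {
  return {
    resourceType: "Observation",
    id: `obs-${criterion.toLowerCase()}`, // hypothetical id scheme
    status: "preliminary",
    code: { text: `ABCDE ${criterion} score` },
    subject: { reference: `Patient/${patientId}` },
    derivedFrom: [{ reference: `Media/${mediaId}` }],
    valueInteger: score,
  };
}

function toDiagnosticReport(
  patientId: string, observations: AbcdeObservation[]
): LesionDiagnosticReport {
  return {
    resourceType: "DiagnosticReport",
    status: "preliminary",
    code: { text: "Dermoscopic lesion classification" },
    subject: { reference: `Patient/${patientId}` },
    result: observations.map((o) => ({ reference: `Observation/${o.id}` })),
  };
}
```

Using `preliminary` as the status reflects that an AI-generated score has not yet been reviewed by a clinician; that choice is an assumption, not something the mapping table prescribes.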

Practice Management Systems

| System | Integration Method | Coverage |
| --- | --- | --- |
| Epic | Epic on FHIR (R4), CDS Hooks | ~38% US market |
| Cerner (Oracle Health) | FHIR R4 API | ~25% US market |
| athenahealth | athenaFlex (FHIR R4) | ~10% US market |
| Modernizing Medicine (EMA) | Proprietary API + FHIR | Dermatology specialty leader |
| Nextech | Proprietary API | Dermatology/plastic surgery focus |

Priority Integration: Modernizing Medicine's EMA (Electronic Medical Assistant) is the dominant EHR for dermatology practices. Integration with EMA should be a Phase 2 priority.

Calibration & Quality Assurance

Color Calibration

DermLite LEDs have a known color temperature (~4500K). DrAgnes calibrates:

1. Capture image of ColorChecker (X-Rite) chart through DermLite
2. Compute color correction matrix (3x3 affine in CIELAB)
3. Apply correction to all subsequent captures
4. Re-calibrate monthly or when device changes

Magnification Calibration

1. Capture image of known-size reference (DermLite ruler or 1mm grid)
2. Compute pixels-per-mm at 10x magnification
3. Store calibration factor per device
4. Use for accurate diameter measurements (ABCDE "D" criterion)

Inter-Device Consistency

Different DermLite models produce subtly different images. DrAgnes normalizes:

- Color normalization: Shades of Gray algorithm standardizes illumination
- Magnification normalization: Resize to consistent pixels-per-mm
- Polarization normalization: Separate processing paths for polarized vs. non-polarized
- Contact artifact handling: Detect and compensate for contact plate reflections
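
The Shades of Gray step estimates the illuminant per channel with a Minkowski p-norm and rescales the channels so that the estimate becomes neutral. The sketch below is one simple variant under stated assumptions: `p = 6` is a commonly used default rather than a value given here, the input is interleaved RGB without alpha, and gains are normalized to the mean of the three channel estimates so overall brightness is roughly preserved.

```typescript
// Shades of Gray color constancy on interleaved 8-bit RGB data (no alpha).
// Returns a corrected copy; values are rounded and clamped to 0-255 by
// Uint8ClampedArray.
function shadesOfGray(rgb: Uint8ClampedArray, p = 6): Uint8ClampedArray {
  const pixelCount = rgb.length / 3;
  const out = new Uint8ClampedArray(rgb.length);
  if (pixelCount === 0) return out;

  // Per-channel Minkowski p-norm mean: (mean(v^p))^(1/p).
  const estimate = [0, 0, 0];
  for (let i = 0; i < rgb.length; i++) {
    estimate[i % 3] += Math.pow(rgb[i], p);
  }
  for (let c = 0; c < 3; c++) {
    estimate[c] = Math.pow(estimate[c] / pixelCount, 1 / p);
  }

  // Scale each channel so its illuminant estimate matches the neutral level.
  const neutral = (estimate[0] + estimate[1] + estimate[2]) / 3;
  for (let i = 0; i < rgb.length; i++) {
    const e = estimate[i % 3];
    out[i] = e > 0 ? (rgb[i] * neutral) / e : rgb[i];
  }
  return out;
}
```

With `p = 1` this reduces to the Gray World assumption, and as `p` grows it approaches the max-RGB (White Patch) estimate, which is why an intermediate value is usually chosen.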

DermLite SDK & API Research

Current State (2026)

3Gen Inc. does not provide a public SDK for DermLite devices. Integration relies on:

- Phone camera passthrough (DermLite acts as optical adapter)
- Wi-Fi direct for HUD model image transfer
- Bluetooth for HUD model control
- EXIF metadata extraction where available

Recommended API Strategy

Phase 1: Camera API integration (no DermLite SDK dependency)
- Works with all DermLite models via phone camera
- Auto-detect DermLite presence via image analysis
- Manual device selection fallback
Phase 2: Partner with 3Gen for official SDK access
- Direct device control (focus, illumination, capture)
- Device serial number for calibration persistence
- Firmware version tracking for compatibility
Phase 3: Co-develop next-gen DermLite with embedded AI
- On-device CNN inference (edge deployment)
- Built-in calibration reference
- Direct brain connectivity
- Real-time AR overlay with diagnostic guidance
