mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-29 19:33:34 +00:00
Extract DrAgnes dermatology intelligence platform from ui/ruvocal/ into a self-contained SvelteKit application under examples/dragnes/. Includes all library modules, components, API routes, tests, deployment config, PWA assets, and research documentation. Updated paths for standalone routing (no /dragnes prefix), fixed static asset references, and adjusted test imports. Co-Authored-By: claude-flow <ruv@ruv.net>
557 lines
22 KiB
Markdown
557 lines
22 KiB
Markdown
# DrAgnes Google Cloud Deployment Plan
|
|
|
|
**Status**: Research & Planning
|
|
**Date**: 2026-03-21
|
|
|
|
## Overview
|
|
|
|
DrAgnes leverages the existing pi.ruv.io Google Cloud infrastructure, extending it with dermatology-specific services. The deployment follows a multi-region, HIPAA-compliant architecture using Google Cloud's BAA-covered services.
|
|
|
|
## Architecture Overview
|
|
|
|
```
|
|
┌─────────────────────────────────┐
|
|
│ Cloud CDN + LB │
|
|
│ (Global, HTTPS termination) │
|
|
└──────────┬──────────────────────┘
|
|
│
|
|
┌──────────────┼──────────────┐
|
|
│ │ │
|
|
┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐
|
|
│ us-east1 │ │ us-west1 │ │ europe-w1 │
|
|
│ (primary) │ │ (failover)│ │ (EU data) │
|
|
└─────┬─────┘ └─────┬─────┘ └─────┬─────┘
|
|
│ │ │
|
|
┌──────────┴──────────────┴──────────────┴──────────┐
|
|
│ Service Mesh │
|
|
│ │
|
|
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
|
|
│ │ DrAgnes │ │ Brain │ │ CNN Model │ │
|
|
│ │ API │ │ Server │ │ Server │ │
|
|
│ │ (Cloud Run)│ │ (Cloud Run)│ │ (Cloud Run)│ │
|
|
│ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ │
|
|
│ │ │ │ │
|
|
│ ┌─────┴───────────────┴───────────────┴─────┐ │
|
|
│ │ Data Layer │ │
|
|
│ │ │ │
|
|
│ │ Firestore │ GCS │ Memorystore │ BigQuery │ │
|
|
│ └────────────────────────────────────────────┘ │
|
|
│ │
|
|
│ ┌────────────────────────────────────────────┐ │
|
|
│ │ Event Layer │ │
|
|
│ │ │ │
|
|
│ │ Pub/Sub │ Cloud Scheduler │ Cloud Tasks │ │
|
|
│ └────────────────────────────────────────────┘ │
|
|
│ │
|
|
│ ┌────────────────────────────────────────────┐ │
|
|
│ │ Security Layer │ │
|
|
│ │ │ │
|
|
│ │ IAM │ Secret Manager │ CMEK │ VPC-SC │ │
|
|
│ └────────────────────────────────────────────┘ │
|
|
└────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
## Service Configuration
|
|
|
|
### 1. DrAgnes API Service (Cloud Run)
|
|
|
|
Primary API service for classification requests and practice management.
|
|
|
|
```yaml
|
|
# dragnes-api.yaml
|
|
apiVersion: serving.knative.dev/v1
|
|
kind: Service
|
|
metadata:
|
|
name: dragnes-api
|
|
annotations:
|
|
run.googleapis.com/launch-stage: GA
|
|
run.googleapis.com/ingress: internal-and-cloud-load-balancing
|
|
spec:
|
|
template:
|
|
metadata:
|
|
annotations:
|
|
autoscaling.knative.dev/minInstances: "2"
|
|
autoscaling.knative.dev/maxInstances: "100"
|
|
run.googleapis.com/cpu-throttling: "false"
|
|
run.googleapis.com/execution-environment: gen2
|
|
spec:
|
|
containerConcurrency: 80
|
|
timeoutSeconds: 300
|
|
containers:
|
|
- image: gcr.io/ruvector-brain-dev/dragnes-api:latest
|
|
ports:
|
|
- containerPort: 8080
|
|
resources:
|
|
limits:
|
|
cpu: "2"
|
|
memory: 2Gi
|
|
env:
|
|
- name: BRAIN_URL
|
|
value: "https://brain-server-internal.run.app"
|
|
- name: MODEL_BUCKET
|
|
value: "gs://dragnes-models"
|
|
- name: RUST_LOG
|
|
value: "info"
|
|
startupProbe:
|
|
httpGet:
|
|
path: /health
|
|
initialDelaySeconds: 5
|
|
periodSeconds: 5
|
|
```
|
|
|
|
### 2. CNN Model Server (Cloud Run)
|
|
|
|
Server-side CNN inference for practices without WASM capability.
|
|
|
|
```yaml
|
|
# dragnes-cnn.yaml
|
|
apiVersion: serving.knative.dev/v1
|
|
kind: Service
|
|
metadata:
|
|
name: dragnes-cnn
|
|
spec:
|
|
template:
|
|
metadata:
|
|
annotations:
|
|
autoscaling.knative.dev/minInstances: "1"
|
|
autoscaling.knative.dev/maxInstances: "50"
|
|
run.googleapis.com/cpu-throttling: "false"
|
|
run.googleapis.com/execution-environment: gen2
|
|
spec:
|
|
containerConcurrency: 20
|
|
timeoutSeconds: 30
|
|
containers:
|
|
- image: gcr.io/ruvector-brain-dev/dragnes-cnn:latest
|
|
ports:
|
|
- containerPort: 8080
|
|
resources:
|
|
limits:
|
|
cpu: "4"
|
|
memory: 4Gi
|
|
env:
|
|
- name: MODEL_PATH
|
|
value: "/models/mobilenetv3_small_int8.bin"
|
|
- name: SIMD_ENABLED
|
|
value: "true"
|
|
```
|
|
|
|
**Performance Notes**:
|
|
- Cloud Run gen2 provides AVX2 SIMD acceleration
|
|
- INT8 quantized model fits in <5MB memory
|
|
- Target: <50ms inference per image
|
|
- Concurrency limited to 20 (CPU-bound workload)
|
|
|
|
### 3. Brain Server (Existing)
|
|
|
|
The existing pi.ruv.io brain server at `brain-server-*.run.app` handles:
|
|
- Knowledge graph management (316K edges)
|
|
- HNSW search (128-dim, PiQ3 quantized)
|
|
- PubMed integration
|
|
- Sparsifier analytics (ADR-116)
|
|
- Witness chain management
|
|
|
|
**DrAgnes-specific extensions**:
|
|
- New memory namespace: `dragnes-dermatology`
|
|
- Custom similarity threshold for dermoscopic embeddings
|
|
- Dermoscopy-specific PubMed search templates
|
|
- Classification feedback ingestion endpoint
|
|
|
|
### 4. PWA Frontend (Firebase Hosting)
|
|
|
|
```
|
|
Firebase Hosting Configuration
|
|
│
|
|
├── Hosting
|
|
│ ├── SPA routing (all paths → index.html)
|
|
│ ├── CDN caching (immutable assets: 1 year)
|
|
│ ├── WASM files: Cache-Control: public, max-age=31536000
|
|
│ ├── Model weights: Cache-Control: public, max-age=86400
|
|
│ └── API proxy: /api/** → Cloud Run dragnes-api
|
|
│
|
|
├── Service Worker (Workbox)
|
|
│ ├── Precache: app shell, WASM module, model weights
|
|
│ ├── Runtime cache: brain search results (stale-while-revalidate)
|
|
│ ├── Background sync: diagnosis submissions
|
|
│ └── Offline fallback page
|
|
│
|
|
└── PWA Manifest
|
|
├── name: "DrAgnes"
|
|
├── display: "standalone"
|
|
├── orientation: "portrait"
|
|
├── theme_color: "#1a365d"
|
|
└── icons: 192x192, 512x512 (maskable)
|
|
```
|
|
|
|
## Data Storage
|
|
|
|
### Firestore (De-Identified Metadata)
|
|
|
|
```
|
|
Firestore Collections
|
|
│
|
|
├── /practices/{practiceId}
|
|
│ ├── name: string
|
|
│ ├── region: string
|
|
│ ├── modelVersion: string
|
|
│ ├── totalClassifications: number
|
|
│ ├── dpBudgetUsed: number
|
|
│ └── createdAt: timestamp
|
|
│
|
|
├── /classifications/{classificationId}
|
|
│ ├── practiceId: string (hashed)
|
|
│ ├── lesionClass: string
|
|
│ ├── confidence: number
|
|
│ ├── abcdeTotal: number
|
|
│ ├── sevenPointScore: number
|
|
│ ├── riskLevel: string
|
|
│ ├── clinicianAction: string
|
|
│ ├── fitzpatrickType: number (I-VI)
|
|
│ ├── bodyLocationCategory: string
|
|
│ ├── ageDecade: number
|
|
│ ├── witnessHash: string
|
|
│ └── createdAt: timestamp
|
|
│ NOTE: No patient identifiers. No raw images.
|
|
│
|
|
├── /feedback/{feedbackId}
|
|
│ ├── classificationId: string
|
|
│ ├── clinicianReview: string
|
|
│ ├── correctedClass: string (optional)
|
|
│ ├── histopathResult: string (optional)
|
|
│ └── createdAt: timestamp
|
|
│
|
|
└── /modelVersions/{versionId}
|
|
├── version: string (semver)
|
|
├── trainedOn: number (embedding count)
|
|
├── accuracy: number
|
|
├── sensitivityMelanoma: number
|
|
├── specificityMelanoma: number
|
|
├── fairnessScore: number
|
|
└── releasedAt: timestamp
|
|
```
|
|
|
|
**Firestore Security Rules**:
|
|
- Practice-level tenant isolation
|
|
- Write access: authenticated clinicians only
|
|
- Read access: same practice only
|
|
- Admin access: platform operators only
|
|
- No cross-practice data access
|
|
|
|
### Google Cloud Storage (GCS)
|
|
|
|
```
|
|
GCS Buckets
|
|
│
|
|
├── gs://dragnes-models/
|
|
│ ├── mobilenetv3_small_int8.bin (INT8 model, ~5MB)
|
|
│ ├── mobilenetv3_small_fp32.bin (FP32 model, ~15MB)
|
|
│ ├── mobilenetv3_small.wasm (WASM module, ~2MB)
|
|
│ ├── lora_weights/{practiceId}/latest.bin (per-practice LoRA)
|
|
│ └── reference_embeddings/top1000.bin (offline cache)
|
|
│ Encryption: CMEK (AES-256)
|
|
│ Access: dragnes-api service account only
|
|
│
|
|
├── gs://dragnes-rvf/
|
|
│ ├── {contributorHash}/{memoryId}.rvf (RVF containers)
|
|
│ Encryption: CMEK (AES-256)
|
|
│ Access: brain server service account only
|
|
│ Lifecycle: Archive after 90 days, delete after 7 years
|
|
│
|
|
└── gs://dragnes-audit/
|
|
├── access_logs/YYYY/MM/DD/*.jsonl
|
|
├── classification_logs/YYYY/MM/DD/*.jsonl
|
|
└── security_events/YYYY/MM/DD/*.jsonl
|
|
Encryption: CMEK (AES-256)
|
|
Retention: 6 years (HIPAA minimum)
|
|
Access: Security team only
|
|
```
|
|
|
|
### Memorystore (Redis) -- Optional Performance Layer
|
|
|
|
```
|
|
Redis Instance (Basic tier, 1GB)
|
|
│
|
|
├── Session cache (15-min TTL)
|
|
├── Rate limiting counters (per-practice, per-hour)
|
|
├── HNSW search result cache (5-min TTL)
|
|
└── Model version cache (1-hour TTL)
|
|
```
|
|
|
|
## Event Architecture
|
|
|
|
### Pub/Sub Topics
|
|
|
|
```
|
|
Pub/Sub Configuration
|
|
│
|
|
├── dragnes-classification (new classification events)
|
|
│ ├── Publisher: dragnes-api
|
|
│ ├── Subscriber: brain-server (brain ingestion)
|
|
│ ├── Subscriber: dragnes-analytics (BigQuery sink)
|
|
│ └── Subscriber: dragnes-alerts (monitoring)
|
|
│
|
|
├── dragnes-feedback (clinician feedback events)
|
|
│ ├── Publisher: dragnes-api
|
|
│ ├── Subscriber: brain-server (model improvement)
|
|
│ └── Subscriber: dragnes-analytics (accuracy tracking)
|
|
│
|
|
├── dragnes-model-update (model version events)
|
|
│ ├── Publisher: dragnes-training (Cloud Run job)
|
|
│ ├── Subscriber: dragnes-api (hot-reload)
|
|
│ └── Subscriber: dragnes-cnn (hot-reload)
|
|
│
|
|
└── dragnes-alerts (monitoring alerts)
|
|
├── Publisher: various services
|
|
└── Subscriber: Cloud Monitoring → PagerDuty
|
|
```
|
|
|
|
### Cloud Scheduler Jobs
|
|
|
|
```
|
|
Scheduled Jobs
|
|
│
|
|
├── dragnes-model-retrain
|
|
│ ├── Schedule: Weekly (Sunday 02:00 UTC)
|
|
│ ├── Action: Trigger Cloud Run job for model retraining
|
|
│ ├── Input: New feedback + brain embeddings since last train
|
|
│ └── Output: New model version to GCS
|
|
│
|
|
├── dragnes-drift-check
|
|
│ ├── Schedule: Daily (06:00 UTC)
|
|
│ ├── Action: Brain drift analysis on dermoscopy namespace
|
|
│ └── Alert: If drift > 0.15, trigger early retrain
|
|
│
|
|
├── dragnes-fairness-audit
|
|
│ ├── Schedule: Weekly (Monday 08:00 UTC)
|
|
│ ├── Action: Compute accuracy by Fitzpatrick type
|
|
│ └── Alert: If disparity > 5%, flag for investigation
|
|
│
|
|
├── dragnes-privacy-audit
|
|
│ ├── Schedule: Daily (04:00 UTC)
|
|
│ ├── Action: Verify no PII in Firestore/GCS
|
|
│ └── Alert: Any PII detection triggers incident
|
|
│
|
|
└── dragnes-backup
|
|
├── Schedule: Daily (00:00 UTC)
|
|
├── Action: Firestore export to GCS
|
|
└── Retention: 30 daily + 12 monthly + 7 yearly
|
|
```
|
|
|
|
## Security Configuration
|
|
|
|
### Google Secrets Manager
|
|
|
|
```
|
|
Secrets (extending existing pi.ruv.io secrets)
|
|
│
|
|
├── dragnes-api-key (API authentication key)
|
|
├── dragnes-jwt-signing-key (JWT token signing)
|
|
├── dragnes-cmek-key-id (CMEK key reference)
|
|
├── dragnes-oauth-client-id (Google OAuth client)
|
|
├── dragnes-oauth-client-secret (Google OAuth secret)
|
|
├── dragnes-firebase-config (Firebase project config)
|
|
└── dragnes-pubmed-api-key (NCBI E-utilities key)
|
|
|
|
Existing secrets reused:
|
|
├── ANTHROPIC_API_KEY (for chat interface LLM)
|
|
└── huggingface-token (for model downloads)
|
|
```
|
|
|
|
### IAM Configuration
|
|
|
|
```
|
|
Service Accounts
|
|
│
|
|
├── dragnes-api@ruvector-brain-dev.iam.gserviceaccount.com
|
|
│ ├── roles/run.invoker (invoke brain server)
|
|
│ ├── roles/datastore.user (Firestore read/write)
|
|
│ ├── roles/storage.objectViewer (model bucket)
|
|
│ ├── roles/pubsub.publisher (classification events)
|
|
│ └── roles/secretmanager.secretAccessor (secrets)
|
|
│
|
|
├── dragnes-cnn@ruvector-brain-dev.iam.gserviceaccount.com
|
|
│ ├── roles/storage.objectViewer (model bucket)
|
|
│ └── roles/secretmanager.secretAccessor (secrets)
|
|
│
|
|
└── dragnes-training@ruvector-brain-dev.iam.gserviceaccount.com
|
|
├── roles/storage.objectAdmin (model bucket, write new versions)
|
|
├── roles/datastore.viewer (read feedback data)
|
|
├── roles/pubsub.publisher (model update events)
|
|
└── roles/bigquery.dataViewer (analytics queries)
|
|
```
|
|
|
|
### VPC Service Controls
|
|
|
|
```
|
|
VPC-SC Perimeter: dragnes-perimeter
|
|
│
|
|
├── Protected Services
|
|
│ ├── firestore.googleapis.com
|
|
│ ├── storage.googleapis.com
|
|
│ ├── bigquery.googleapis.com
|
|
│ └── secretmanager.googleapis.com
|
|
│
|
|
├── Access Levels
|
|
│ ├── Corporate network CIDR ranges
|
|
│ ├── Cloud Run service accounts (internal)
|
|
│ └── Emergency break-glass accounts
|
|
│
|
|
└── Ingress Rules
|
|
├── Allow: Cloud Run → Firestore/GCS (internal)
|
|
├── Allow: Cloud Scheduler → Cloud Run (internal)
|
|
└── Deny: All other access to protected services
|
|
```
|
|
|
|
## Multi-Region Deployment
|
|
|
|
### Region Selection
|
|
|
|
| Region | Role | Justification |
|
|
|--------|------|---------------|
|
|
| us-east1 (South Carolina) | Primary | Low latency to East Coast US; HIPAA eligible |
|
|
| us-west1 (Oregon) | Failover | West Coast coverage; disaster recovery |
|
|
| europe-west1 (Belgium) | EU Data Residency | GDPR compliance for EU practices |
|
|
| asia-southeast1 (Singapore) | Future | APAC coverage (Phase 4) |
|
|
|
|
### Cross-Region Data Flow
|
|
|
|
```
|
|
Data Residency Rules
|
|
│
|
|
├── Patient metadata: Region-locked (US data stays in US, EU in EU)
|
|
├── De-identified brain embeddings: Global (privacy-preserving)
|
|
├── Model weights: Global (no PHI)
|
|
├── Audit logs: Region-locked
|
|
└── WASM/PWA assets: Global CDN
|
|
```
|
|
|
|
## Monitoring & Observability
|
|
|
|
### Cloud Monitoring Dashboard
|
|
|
|
```
|
|
DrAgnes Operations Dashboard
|
|
│
|
|
├── Service Health
|
|
│ ├── API latency (p50, p95, p99)
|
|
│ ├── CNN inference latency
|
|
│ ├── Error rate by endpoint
|
|
│ ├── Active instances per region
|
|
│ └── Request volume (per hour, per practice)
|
|
│
|
|
├── Classification Metrics
|
|
│ ├── Classifications per hour (global)
|
|
│ ├── Distribution by lesion class
|
|
│ ├── Average confidence score
|
|
│ ├── Clinician override rate
|
|
│ └── Sensitivity/specificity (rolling 30-day)
|
|
│
|
|
├── Brain Health
|
|
│ ├── Memory count (dermatology namespace)
|
|
│ ├── Drift status
|
|
│ ├── Embedding quality score
|
|
│ └── Sync latency
|
|
│
|
|
├── Privacy & Compliance
|
|
│ ├── PII scan results (should always be 0)
|
|
│ ├── DP budget consumption per practice
|
|
│ ├── Access audit anomalies
|
|
│ └── Witness chain verification failures
|
|
│
|
|
└── Cost Tracking
|
|
├── Cloud Run cost by service
|
|
├── Storage cost by bucket
|
|
├── Network egress cost
|
|
└── Total monthly cost vs. budget
|
|
```
|
|
|
|
### Alert Policies
|
|
|
|
| Alert | Condition | Severity | Action |
|
|
|-------|-----------|----------|--------|
|
|
| API error rate > 1% | 5-min window | P2 | PagerDuty notification |
|
|
| CNN latency > 500ms (p95) | 15-min window | P3 | Slack notification |
|
|
| PII detected in cloud | Any occurrence | P1 | Immediate incident response |
|
|
| Melanoma sensitivity < 90% | 7-day rolling | P1 | Model freeze + investigation |
|
|
| Fairness disparity > 5% | Weekly audit | P2 | Investigation within 24 hours |
|
|
| Brain drift > 0.15 | Daily check | P3 | Trigger early retrain |
|
|
| DP budget > 80% for practice | Per check | P3 | Notify practice admin |
|
|
|
|
## Cost Projections
|
|
|
|
### Monthly Cost Estimates (by Scale)
|
|
|
|
| Component | 10 Practices | 100 Practices | 1,000 Practices |
|
|
|-----------|-------------|--------------|-----------------|
|
|
| Cloud Run (API) | $50 | $200 | $1,500 |
|
|
| Cloud Run (CNN) | $30 | $150 | $1,000 |
|
|
| Brain Server (shared) | $150 (existing) | $150 | $300 |
|
|
| Firestore | $10 | $50 | $300 |
|
|
| GCS (models + RVF) | $5 | $20 | $100 |
|
|
| Cloud CDN | $10 | $30 | $150 |
|
|
| Firebase Hosting | $0 (free tier) | $25 | $100 |
|
|
| Memorystore (Redis) | $0 (skip) | $50 | $100 |
|
|
| Cloud Monitoring | $0 (free tier) | $50 | $200 |
|
|
| Secret Manager | $1 | $1 | $5 |
|
|
| Pub/Sub | $1 | $5 | $30 |
|
|
| Cloud Scheduler | $1 | $1 | $5 |
|
|
| BigQuery (analytics) | $0 (free tier) | $20 | $100 |
|
|
| **Total Monthly** | **~$258** | **~$752** | **~$3,890** |
|
|
| **Per Practice/Month** | **$25.80** | **$7.52** | **$3.89** |
|
|
|
|
### Revenue Model
|
|
|
|
| Tier | Price | Features |
|
|
|------|-------|---------|
|
|
| Starter | $99/mo/practice | 500 classifications/mo, WASM offline, basic brain |
|
|
| Professional | $199/mo/practice | Unlimited, LoRA adaptation, full brain, teledermatology |
|
|
| Enterprise | Custom | Multi-practice, EHR integration, dedicated support, SLA |
|
|
| Academic | Free | Research use, data contribution agreement |
|
|
| Underserved | Free | Qualifying community health centers |
|
|
|
|
**Break-even**: approximately 30 practices on Professional tier covers infrastructure costs at the 100-practice scale.
|
|
|
|
## Deployment Pipeline
|
|
|
|
```
|
|
Deployment Pipeline (Cloud Build)
|
|
│
|
|
├── Source: GitHub (ruvector/dragnes)
|
|
├── Trigger: Push to main branch
|
|
│
|
|
├── Build Stage
|
|
│ ├── Rust compilation (--release --target x86_64-unknown-linux-gnu)
|
|
│ ├── WASM compilation (--target wasm32-unknown-unknown)
|
|
│ ├── Docker image build (distroless base)
|
|
│ └── SvelteKit build (npm run build)
|
|
│
|
|
├── Test Stage
|
|
│ ├── Unit tests (cargo test)
|
|
│ ├── Integration tests (against staging brain)
|
|
│ ├── WASM inference accuracy test (reference images)
|
|
│ ├── Security scan (cargo audit + npm audit)
|
|
│ └── HIPAA compliance checks (PII scanner)
|
|
│
|
|
├── Deploy Stage (Canary)
|
|
│ ├── Deploy to staging (full test suite)
|
|
│ ├── Canary deployment (5% traffic for 30 minutes)
|
|
│ ├── Monitor error rate and latency
|
|
│ ├── Auto-rollback if error rate > 0.5%
|
|
│ └── Promote to 100% if healthy
|
|
│
|
|
└── Post-Deploy
|
|
├── Smoke tests against production
|
|
├── Notify operations channel
|
|
├── Update model version registry
|
|
└── Archive previous version artifacts
|
|
```
|
|
|
|
## Disaster Recovery
|
|
|
|
| Scenario | RTO | RPO | Recovery Procedure |
|
|
|----------|-----|-----|-------------------|
|
|
| Single region outage | 5 minutes | 0 (multi-region) | Automatic failover via Cloud LB |
|
|
| Firestore corruption | 1 hour | 24 hours | Restore from daily export |
|
|
| Model corruption | 10 minutes | N/A | Roll back to previous model version |
|
|
| Brain server outage | 5 minutes | 0 | Existing brain HA (pi.ruv.io) |
|
|
| Complete GCP outage | 4 hours | 24 hours | Multi-cloud DR (backup to AWS S3) |
|
|
| Security breach | 1 hour | N/A | Incident response plan activation |
|