supermemory/apps/docs/connectors/s3.mdx
2026-05-13 12:40:22 -07:00

---
title: "S3 Connector"
description: "Connect Amazon S3 or S3-compatible storage to sync files into your Supermemory knowledge base"
icon: "aws"
---
Connect Amazon S3 buckets or S3-compatible storage services (MinIO, DigitalOcean Spaces, Cloudflare R2) to sync files into your Supermemory knowledge base.
<Note>
The S3 connector requires a **Scale Plan** or higher. You can also create S3 connections directly from the [Supermemory Console](https://console.supermemory.ai).
</Note>
## Quick Setup
<Tabs>
<Tab title="TypeScript">
```typescript
import Supermemory from 'supermemory';

const client = new Supermemory({
  apiKey: process.env.SUPERMEMORY_API_KEY!
});

const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    bucket: 'my-documents-bucket',
    region: 'us-east-1'
  },
  containerTags: ['org-123']
});
```
</Tab>
<Tab title="Python">
```python
import os

from supermemory import Supermemory

client = Supermemory(api_key=os.environ["SUPERMEMORY_API_KEY"])

# S3-specific fields go inside `metadata` (camelCase keys);
# general options such as `container_tags` stay top-level.
connection = client.connections.create(
    's3',
    metadata={
        'accessKeyId': os.environ["AWS_ACCESS_KEY_ID"],
        'secretAccessKey': os.environ["AWS_SECRET_ACCESS_KEY"],
        'bucket': 'my-documents-bucket',
        'region': 'us-east-1'
    },
    container_tags=['org-123']
)
```
</Tab>
<Tab title="cURL">
```bash
curl -X POST "https://api.supermemory.ai/v3/connections/s3" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
      "secretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
      "bucket": "my-documents-bucket",
      "region": "us-east-1"
    },
    "containerTags": ["org-123"]
  }'
```
</Tab>
</Tabs>
## Configuration Options
For S3, provider-specific connection fields are passed inside the top-level `metadata` object. General connection options stay top-level.
| Parameter | Location | Required | Description |
|-----------|----------|----------|-------------|
| `accessKeyId` | `metadata.accessKeyId` | Yes | AWS access key ID or S3-compatible service key |
| `secretAccessKey` | `metadata.secretAccessKey` | Yes | AWS secret access key or S3-compatible service secret |
| `bucket` | `metadata.bucket` | Yes | S3 bucket name |
| `region` | `metadata.region` | Yes | AWS region (e.g., `us-east-1`). Use `auto` for Cloudflare R2. |
| `endpoint` | `metadata.endpoint` | No | Custom endpoint for S3-compatible services |
| `prefix` | `metadata.prefix` | No | Key prefix filter (e.g., `documents/`) |
| `containerTagRegex` | `metadata.containerTagRegex` | No | Regex to extract container tags from file paths |
| `containerTags` | top-level | No | Tags for organizing connections |
| `documentLimit` | top-level | No | Maximum documents to sync (default: 10,000) |
<Note>
In the Python SDK, use `container_tags` for the top-level option, but keep S3 metadata keys in camelCase: `accessKeyId`, `secretAccessKey`, and `containerTagRegex`.
</Note>
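
The split between `metadata` fields and top-level options can be captured in a small typed helper. This is a minimal sketch, not part of the Supermemory SDK: the `S3Metadata` and `S3ConnectionBody` types and the `buildS3ConnectionBody` function are hypothetical names, and the field list is taken only from the table above.

```typescript
// Hypothetical types mirroring the Configuration Options table.
interface S3Metadata {
  accessKeyId: string;
  secretAccessKey: string;
  bucket: string;
  region: string;
  endpoint?: string;
  prefix?: string;
  containerTagRegex?: string;
}

interface S3ConnectionOptions {
  containerTags?: string[];
  documentLimit?: number;
}

interface S3ConnectionBody extends S3ConnectionOptions {
  metadata: S3Metadata;
}

// The four fields the table marks as required.
const REQUIRED_KEYS = ['accessKeyId', 'secretAccessKey', 'bucket', 'region'] as const;

// Hypothetical helper: checks the required fields, then nests the
// S3-specific fields under `metadata` and keeps general options top-level.
function buildS3ConnectionBody(
  metadata: S3Metadata,
  options: S3ConnectionOptions = {}
): S3ConnectionBody {
  for (const key of REQUIRED_KEYS) {
    if (!metadata[key]) {
      throw new Error(`metadata.${key} is required`);
    }
  }
  return { metadata, ...options };
}

const requestBody = buildS3ConnectionBody(
  {
    accessKeyId: 'example-key',
    secretAccessKey: 'example-secret',
    bucket: 'my-documents-bucket',
    region: 'us-east-1',
    prefix: 'documents/'
  },
  { containerTags: ['org-123'] }
);

console.log(JSON.stringify(requestBody, null, 2));
```

The resulting object has the same shape as the request bodies in Quick Setup, so it could be passed as the second argument to `client.connections.create('s3', ...)` or serialized for the REST endpoint.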
## S3-Compatible Services
Use `metadata.endpoint` to connect to S3-compatible storage:
```typescript
// MinIO
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: 'minio-key',
    secretAccessKey: 'minio-secret',
    bucket: 'my-bucket',
    region: 'us-east-1',
    endpoint: 'https://minio.example.com'
  },
  containerTags: ['minio-sync']
});
```
Common S3-compatible endpoint values:
| Service | `metadata.endpoint` | `metadata.region` |
|---------|----------------------|-------------------|
| DigitalOcean Spaces | `https://nyc3.digitaloceanspaces.com` | `nyc3` |
| Cloudflare R2 | `https://<account-id>.r2.cloudflarestorage.com` | `auto` |
Cloudflare R2 example:
```typescript
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
    bucket: 'my-bucket',
    region: 'auto',
    endpoint: 'https://<account-id>.r2.cloudflarestorage.com'
  },
  containerTags: ['r2-sync']
});
```
<Note>
For S3-compatible services, `metadata.endpoint` is the base S3 endpoint. Do not include the bucket name in the endpoint URL; pass the bucket separately as `metadata.bucket`.
</Note>
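
A quick local check can catch the two most common ways a bucket name ends up in the endpoint: as a URL path or as a subdomain. The `checkEndpoint` helper below is a hypothetical pre-flight sketch, not something the SDK provides, and the hostname test is only a heuristic (a bucket that happens to share its name with the first label of the host would be flagged incorrectly).

```typescript
// Hypothetical pre-flight check for `metadata.endpoint`; not part of the Supermemory SDK.
// Returns a list of problems; an empty list means the endpoint looks like a base URL.
function checkEndpoint(endpoint: string, bucket: string): string[] {
  const problems: string[] = [];
  const url = new URL(endpoint); // throws a TypeError on malformed URLs

  // Path-style mistake: https://host/my-bucket
  if (url.pathname !== '/' && url.pathname !== '') {
    problems.push(`endpoint contains a path (${url.pathname}); pass the base URL only`);
  }

  // Virtual-hosted-style mistake: https://my-bucket.host
  if (url.hostname.startsWith(`${bucket}.`)) {
    problems.push('endpoint hostname starts with the bucket name; use the base endpoint');
  }

  return problems;
}

console.log(checkEndpoint('https://minio.example.com', 'my-bucket'));
console.log(checkEndpoint('https://nyc3.digitaloceanspaces.com/my-bucket', 'my-bucket'));
console.log(checkEndpoint('https://my-bucket.nyc3.digitaloceanspaces.com', 'my-bucket'));
```

Only the first call returns an empty list; the other two each report one problem.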
## Prefix Filtering
Sync only files within a specific path:
```typescript
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    bucket: 'company-data',
    region: 'us-east-1',
    prefix: 'documents/engineering/' // Only syncs files under this path
  },
  containerTags: ['engineering-docs']
});
```
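
To preview which keys a prefix would select, the filter can be approximated locally. This sketch assumes the connector applies standard S3 prefix semantics, where the prefix is a literal string matched from the start of the object key rather than a glob or a directory path. The `filterByPrefix` function and the sample keys are illustrative only.

```typescript
// Assumption: the prefix is a literal "starts with" match on the object key,
// as with the Prefix parameter of S3 list operations.
function filterByPrefix(keys: string[], prefix: string): string[] {
  return keys.filter((key) => key.startsWith(prefix));
}

// Illustrative keys, not from a real bucket.
const sampleKeys = [
  'documents/engineering/design.md',
  'documents/engineering/rfcs/001.pdf',
  'documents/engineering-archive/old.md',
  'documents/sales/q4.pdf'
];

// With the trailing slash, only keys under documents/engineering/ match.
console.log(filterByPrefix(sampleKeys, 'documents/engineering/'));

// Without it, documents/engineering-archive/old.md matches as well.
console.log(filterByPrefix(sampleKeys, 'documents/engineering'));
```

Under that assumption, ending the prefix with `/` is what keeps sibling paths such as `documents/engineering-archive/` out of the sync.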
## Dynamic Container Tags
Extract container tags from S3 key paths for multi-tenant setups:
```typescript
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    bucket: 'user-files',
    region: 'us-east-1',
    containerTagRegex: 'users/(?<userId>[^/]+)/'
  },
  containerTags: ['user-files']
});

// File: users/user-123/documents/notes.md → container tag: user-123
// File: users/user-456/reports/q4.pdf → container tag: user-456
```
<Warning>
The regex must contain a named capture group `(?<userId>...)` and be fewer than 200 characters long.
</Warning>
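
Before creating the connection, it can help to test a pattern against a few sample keys. The sketch below assumes the server evaluates `containerTagRegex` with JavaScript-compatible syntax, which this page does not state. The `previewContainerTag` helper is hypothetical, and what happens to files that do not match the pattern is not covered here.

```typescript
// Limit taken from the warning above: the pattern must be under 200 characters.
const MAX_REGEX_LENGTH = 200;

// Hypothetical local preview; assumes JavaScript-compatible regex syntax on the server.
// Requiring the exact group name `userId` follows the wording of the warning above.
function previewContainerTag(pattern: string, key: string): string | null {
  if (pattern.length >= MAX_REGEX_LENGTH) {
    throw new Error('containerTagRegex must be fewer than 200 characters');
  }
  if (!pattern.includes('(?<userId>')) {
    throw new Error('containerTagRegex must contain the named capture group (?<userId>...)');
  }
  const match = new RegExp(pattern).exec(key);
  return match?.groups?.userId ?? null; // null when the key does not match
}

const tagPattern = 'users/(?<userId>[^/]+)/';

console.log(previewContainerTag(tagPattern, 'users/user-123/documents/notes.md')); // user-123
console.log(previewContainerTag(tagPattern, 'users/user-456/reports/q4.pdf'));     // user-456
console.log(previewContainerTag(tagPattern, 'shared/readme.md'));                  // null
```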
## Connection Management
### Delete Connection
<Tabs>
<Tab title="TypeScript">
```typescript
await client.connections.deleteByID('conn_s3_abc123');
```
</Tab>
<Tab title="cURL">
```bash
curl -X DELETE "https://api.supermemory.ai/v3/connections/conn_s3_abc123" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY"
```
</Tab>
</Tabs>
<Warning>
By default, deleting a connection also removes all of its synced documents from Supermemory. To keep the documents, pass `deleteDocuments=false` as a query parameter: `DELETE /v3/connections/:id?deleteDocuments=false`.
</Warning>
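
For callers using the REST endpoint directly, the query parameter can be added with the standard `URL` API. This is a sketch under the assumption that the endpoint and parameter behave exactly as described in the warning above; `buildDeleteConnectionUrl` and `deleteConnection` are hypothetical helper names, and `deleteConnection` relies on the global `fetch` available in Node 18+ and browsers.

```typescript
const API_BASE = 'https://api.supermemory.ai/v3';

// Builds the DELETE URL; adds deleteDocuments=false only when documents should be kept.
function buildDeleteConnectionUrl(connectionId: string, keepDocuments: boolean): string {
  const url = new URL(`${API_BASE}/connections/${encodeURIComponent(connectionId)}`);
  if (keepDocuments) {
    url.searchParams.set('deleteDocuments', 'false');
  }
  return url.toString();
}

// Hypothetical wrapper around the REST call shown in the cURL tab.
async function deleteConnection(
  connectionId: string,
  apiKey: string,
  keepDocuments = false
): Promise<void> {
  const response = await fetch(buildDeleteConnectionUrl(connectionId, keepDocuments), {
    method: 'DELETE',
    headers: { Authorization: `Bearer ${apiKey}` }
  });
  if (!response.ok) {
    throw new Error(`Delete failed with status ${response.status}`);
  }
}

// Only builds the URL; no request is sent here.
console.log(buildDeleteConnectionUrl('conn_s3_abc123', true));
// https://api.supermemory.ai/v3/connections/conn_s3_abc123?deleteDocuments=false
```

Defaulting `keepDocuments` to `false` mirrors the documented default, so keeping documents is always an explicit choice at the call site.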
### Manual Sync
<Tabs>
<Tab title="TypeScript">
```typescript
await client.connections.import('s3', {
  containerTags: ['org-123']
});
```
</Tab>
<Tab title="cURL">
```bash
curl -X POST "https://api.supermemory.ai/v3/connections/s3/import" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"containerTags": ["org-123"]}'
```
</Tab>
</Tabs>
## Sync Behavior
| Feature | Behavior |
|---------|----------|
| **Initial sync** | Fetches all files matching the prefix filter |
| **Incremental sync** | Fetches only files modified since the last sync |
| **Sync schedule** | Automatically every 4 hours, plus any manual triggers |
| **Document limit** | 10,000 files per connection (default) |
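
The incremental behavior in the table can be pictured as a "modified since" filter. The sketch below is a conceptual illustration only, not Supermemory's implementation: the `S3Object` shape, `selectChangedObjects`, and `estimateNextSync` are hypothetical, and the exact timing of scheduled syncs may differ from a simple four-hour offset.

```typescript
// Hypothetical shape for a listed object; real listings carry more fields.
interface S3Object {
  key: string;
  lastModified: Date;
}

const SYNC_INTERVAL_MS = 4 * 60 * 60 * 1000; // "Every 4 hours" from the table

// Initial sync (no previous sync): everything is selected.
// Incremental sync: only objects modified after the last sync.
function selectChangedObjects(objects: S3Object[], lastSyncedAt: Date | null): S3Object[] {
  if (lastSyncedAt === null) {
    return objects;
  }
  const cutoff = lastSyncedAt.getTime();
  return objects.filter((object) => object.lastModified.getTime() > cutoff);
}

// Rough estimate of the next scheduled run, assuming a fixed interval.
function estimateNextSync(lastSyncedAt: Date): Date {
  return new Date(lastSyncedAt.getTime() + SYNC_INTERVAL_MS);
}

const demoObjects: S3Object[] = [
  { key: 'a.md', lastModified: new Date('2026-01-01T00:00:00Z') },
  { key: 'b.md', lastModified: new Date('2026-01-02T12:00:00Z') }
];
const lastSync = new Date('2026-01-02T00:00:00Z');

console.log(selectChangedObjects(demoObjects, null).map((o) => o.key));     // [ 'a.md', 'b.md' ]
console.log(selectChangedObjects(demoObjects, lastSync).map((o) => o.key)); // [ 'b.md' ]
console.log(estimateNextSync(lastSync).toISOString());                      // 2026-01-02T04:00:00.000Z
```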
## IAM Permissions
Minimum required permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
```
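
In IAM, `s3:ListBucket` is evaluated against the bucket ARN and `s3:GetObject` against object ARNs, so the same access can also be written as two statements that pair each action with the resource it applies to. The generator below is a sketch of that variant for teams that template policies per bucket. `buildReadOnlyPolicy` is a hypothetical helper, and the policy above remains the documented minimum.

```typescript
interface PolicyStatement {
  Effect: 'Allow';
  Action: string[];
  Resource: string[];
}

interface PolicyDocument {
  Version: '2012-10-17';
  Statement: PolicyStatement[];
}

// Hypothetical generator: one statement per action, each scoped to its resource type.
function buildReadOnlyPolicy(bucket: string): PolicyDocument {
  const bucketArn = `arn:aws:s3:::${bucket}`;
  return {
    Version: '2012-10-17',
    Statement: [
      // Listing keys is a bucket-level action.
      { Effect: 'Allow', Action: ['s3:ListBucket'], Resource: [bucketArn] },
      // Reading file contents is an object-level action.
      { Effect: 'Allow', Action: ['s3:GetObject'], Resource: [`${bucketArn}/*`] }
    ]
  };
}

console.log(JSON.stringify(buildReadOnlyPolicy('your-bucket-name'), null, 2));
```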
## Error Codes
| Code | Message | Solution |
|------|---------|----------|
| 401 | Authentication failed | Verify access key and secret |
| 403 | Access denied | Check IAM permissions and bucket policy |
| 404 | Bucket not found | Verify bucket name and region |
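
The table can be turned into a small lookup for logging or user-facing messages. This sketch takes a plain HTTP status code as input. How the SDK surfaces status codes on thrown errors is not covered on this page, so `describeConnectionError` is a hypothetical helper that leaves that part to the caller.

```typescript
// Messages and solutions copied from the Error Codes table.
const S3_ERROR_HINTS: Record<number, string> = {
  401: 'Authentication failed: verify access key and secret.',
  403: 'Access denied: check IAM permissions and bucket policy.',
  404: 'Bucket not found: verify bucket name and region.'
};

// Hypothetical helper: maps a status code to the hint from the table,
// falling back to a generic message for codes the table does not list.
function describeConnectionError(status: number): string {
  return S3_ERROR_HINTS[status] ?? `Unexpected status ${status}; no S3-specific hint available.`;
}

console.log(describeConnectionError(403));
console.log(describeConnectionError(500));
```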