---
title: "S3 Connector"
description: "Connect Amazon S3 or S3-compatible storage to sync files into your Supermemory knowledge base"
icon: "aws"
---

Connect Amazon S3 buckets or S3-compatible storage services (MinIO, DigitalOcean Spaces, Cloudflare R2) to sync files into your Supermemory knowledge base.

<Note>
The S3 connector requires a **Scale Plan** or higher. You can also create S3 connections directly from the [Supermemory Console](https://console.supermemory.ai).
</Note>

## Quick Setup

<Tabs>
<Tab title="TypeScript">
```typescript
import Supermemory from 'supermemory';

const client = new Supermemory({
  apiKey: process.env.SUPERMEMORY_API_KEY!
});

const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    bucket: 'my-documents-bucket',
    region: 'us-east-1'
  },
  containerTags: ['org-123']
});
```
</Tab>
<Tab title="Python">
```python
from supermemory import Supermemory
import os

client = Supermemory(api_key=os.environ["SUPERMEMORY_API_KEY"])

connection = client.connections.create(
    's3',
    metadata={
        'accessKeyId': os.environ["AWS_ACCESS_KEY_ID"],
        'secretAccessKey': os.environ["AWS_SECRET_ACCESS_KEY"],
        'bucket': 'my-documents-bucket',
        'region': 'us-east-1'
    },
    container_tags=['org-123', 's3-sync']
)
```
</Tab>
<Tab title="cURL">
```bash
curl -X POST "https://api.supermemory.ai/v3/connections/s3" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
      "secretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
      "bucket": "my-documents-bucket",
      "region": "us-east-1"
    },
    "containerTags": ["org-123"]
  }'
```
</Tab>
</Tabs>

## Configuration Options

For S3, provider-specific connection fields are passed inside the top-level `metadata` object. General connection options stay top-level.

| Parameter | Location | Required | Description |
|-----------|----------|----------|-------------|
| `accessKeyId` | `metadata.accessKeyId` | Yes | AWS access key ID or S3-compatible service key |
| `secretAccessKey` | `metadata.secretAccessKey` | Yes | AWS secret access key or S3-compatible service secret |
| `bucket` | `metadata.bucket` | Yes | S3 bucket name |
| `region` | `metadata.region` | Yes | AWS region (e.g., `us-east-1`). Use `auto` for Cloudflare R2. |
| `endpoint` | `metadata.endpoint` | No | Custom endpoint for S3-compatible services |
| `prefix` | `metadata.prefix` | No | Key prefix filter (e.g., `documents/`) |
| `containerTagRegex` | `metadata.containerTagRegex` | No | Regex to extract container tags from file paths |
| `containerTags` | top-level | No | Tags for organizing connections |
| `documentLimit` | top-level | No | Maximum documents to sync (default: 10,000) |

<Note>
In the Python SDK, use `container_tags` for the top-level option, but keep S3 metadata keys in camelCase: `accessKeyId`, `secretAccessKey`, and `containerTagRegex`.
</Note>

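The split between `metadata` fields and top-level options can be written down as TypeScript types. This is a sketch derived from the parameter table, not the SDK's own type definitions; the names `S3Metadata`, `S3ConnectionRequest`, and `buildS3Request` are hypothetical.

```typescript
// Hypothetical types derived from the parameter table; not the SDK's own definitions.
interface S3Metadata {
  accessKeyId: string;
  secretAccessKey: string;
  bucket: string;
  region: string; // e.g. 'us-east-1', or 'auto' for Cloudflare R2
  endpoint?: string; // only needed for S3-compatible services
  prefix?: string; // e.g. 'documents/'
  containerTagRegex?: string;
}

interface S3ConnectionRequest {
  metadata: S3Metadata;
  containerTags?: string[];
  documentLimit?: number;
}

// Builds a request body and fails early if a required metadata field is empty.
function buildS3Request(
  metadata: S3Metadata,
  options: { containerTags?: string[]; documentLimit?: number } = {},
): S3ConnectionRequest {
  const required = ['accessKeyId', 'secretAccessKey', 'bucket', 'region'] as const;
  for (const field of required) {
    if (!metadata[field]) {
      throw new Error(`metadata.${field} is required`);
    }
  }
  // Provider-specific fields stay under `metadata`; general options stay top-level.
  return { metadata, ...options };
}
```

The returned object has the same shape as the second argument to `client.connections.create('s3', ...)` in the Quick Setup examples.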
## S3-Compatible Services

Use `metadata.endpoint` to connect to S3-compatible storage:

```typescript
// MinIO
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: 'minio-key',
    secretAccessKey: 'minio-secret',
    bucket: 'my-bucket',
    region: 'us-east-1',
    endpoint: 'https://minio.example.com'
  },
  containerTags: ['minio-sync']
});
```

Common S3-compatible endpoint values:

| Service | `metadata.endpoint` | `metadata.region` |
|---------|----------------------|-------------------|
| DigitalOcean Spaces | `https://nyc3.digitaloceanspaces.com` | `nyc3` |
| Cloudflare R2 | `https://<account-id>.r2.cloudflarestorage.com` | `auto` |

Cloudflare R2 example:

```typescript
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
    bucket: 'my-bucket',
    region: 'auto',
    endpoint: 'https://<account-id>.r2.cloudflarestorage.com'
  },
  containerTags: ['r2-sync']
});
```

<Note>
For S3-compatible services, `metadata.endpoint` is the base S3 endpoint. Do not include the bucket name in the endpoint URL; pass the bucket separately as `metadata.bucket`.
</Note>

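A quick local check can catch a bucket name that has slipped into the endpoint before the connection is created. The helper below is a sketch under that assumption; `checkEndpoint` is a hypothetical name, and the connector's own validation may behave differently.

```typescript
// Hypothetical pre-flight check; the connector's own validation may differ.
function checkEndpoint(endpoint: string, bucket: string): string[] {
  const problems: string[] = [];
  const url = new URL(endpoint); // throws a TypeError on a malformed URL

  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    problems.push('endpoint must use http or https');
  }
  // Virtual-hosted style, e.g. https://<bucket>.nyc3.digitaloceanspaces.com
  if (url.hostname.startsWith(`${bucket}.`)) {
    problems.push('bucket name appears in the endpoint hostname');
  }
  // Path style, e.g. https://minio.example.com/<bucket>
  const firstSegment = url.pathname.split('/').filter(Boolean)[0];
  if (firstSegment === bucket) {
    problems.push('bucket name appears in the endpoint path');
  }
  return problems;
}
```

An empty array means the endpoint looks like a base S3 endpoint; each string otherwise names one problem to fix before calling `client.connections.create`.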
## Prefix Filtering

Sync only files within a specific path:

```typescript
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    bucket: 'company-data',
    region: 'us-east-1',
    prefix: 'documents/engineering/' // Only syncs files under this path
  },
  containerTags: ['engineering-docs']
});
```

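In S3 itself, a prefix is a literal string match on the object key rather than a directory lookup, so the trailing slash matters. The sketch below illustrates that behavior, assuming the connector forwards `prefix` to S3 unchanged; `matchesPrefix` and the sample keys are made up for illustration.

```typescript
// Mirrors S3's literal string-prefix matching; assumes the connector
// forwards `prefix` to S3 unchanged.
function matchesPrefix(key: string, prefix: string): boolean {
  return key.startsWith(prefix);
}

// Sample object keys, invented for this example.
const keys = [
  'documents/engineering/design.md',
  'documents/engineering-archive/old.md',
  'documents/sales/q4.pdf',
];

const withSlash = keys.filter((k) => matchesPrefix(k, 'documents/engineering/'));
const withoutSlash = keys.filter((k) => matchesPrefix(k, 'documents/engineering'));
// withSlash    contains only 'documents/engineering/design.md'
// withoutSlash also contains 'documents/engineering-archive/old.md'
```

Ending the prefix with `/` is the usual way to keep sibling paths that share a name stem out of the sync.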
## Dynamic Container Tags

Extract container tags from S3 key paths for multi-tenant setups:

```typescript
const connection = await client.connections.create('s3', {
  metadata: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    bucket: 'user-files',
    region: 'us-east-1',
    containerTagRegex: 'users/(?<userId>[^/]+)/'
  },
  containerTags: ['user-files']
});

// File: users/user-123/documents/notes.md → container tag: user-123
// File: users/user-456/reports/q4.pdf → container tag: user-456
```

<Warning>
The regex must contain a named capture group `(?<userId>...)` and be fewer than 200 characters long.
</Warning>

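Before creating the connection, the pattern can be dry-run locally against a few sample keys. The helpers below are a sketch of the two documented constraints, assuming the capture group must be named `userId` as in the example; `validateContainerTagRegex` and `extractContainerTag` are hypothetical names, and the server-side matching may differ in details.

```typescript
// Hypothetical local check mirroring the documented constraints:
// a named capture group `userId` and a length under 200 characters.
function validateContainerTagRegex(pattern: string): RegExp {
  if (pattern.length >= 200) {
    throw new Error('containerTagRegex must be fewer than 200 characters long');
  }
  if (!pattern.includes('(?<userId>')) {
    throw new Error('containerTagRegex must contain a named capture group (?<userId>...)');
  }
  return new RegExp(pattern); // throws a SyntaxError on an invalid pattern
}

// Returns the captured tag, or undefined when the key does not match.
function extractContainerTag(regex: RegExp, key: string): string | undefined {
  return regex.exec(key)?.groups?.userId;
}

const regex = validateContainerTagRegex('users/(?<userId>[^/]+)/');
const tag = extractContainerTag(regex, 'users/user-123/documents/notes.md'); // 'user-123'
const none = extractContainerTag(regex, 'shared/readme.md'); // undefined
```

Keys that do not match the pattern yield no dynamic tag in this sketch; how the connector treats such files is not covered here.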
## Connection Management

### Delete Connection

<Tabs>
<Tab title="TypeScript">
```typescript
await client.connections.deleteByID('conn_s3_abc123');
```
</Tab>
<Tab title="cURL">
```bash
curl -X DELETE "https://api.supermemory.ai/v3/connections/conn_s3_abc123" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY"
```
</Tab>
</Tabs>

<Warning>
By default, deleting a connection removes all synced documents from Supermemory. To keep documents, pass `deleteDocuments=false` as a query parameter: `DELETE /v3/connections/:id?deleteDocuments=false`
</Warning>

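The `deleteDocuments=false` variant can be called over plain HTTP. The sketch below builds the URL from the documented route and query parameter and sends it with the standard `fetch` API; `deleteConnectionUrl` and `deleteConnection` are hypothetical helpers, not SDK methods.

```typescript
// Builds the DELETE URL from the documented route and query parameter.
// Hypothetical helper, not part of the SDK.
function deleteConnectionUrl(connectionId: string, keepDocuments: boolean): string {
  const url = new URL(
    `/v3/connections/${encodeURIComponent(connectionId)}`,
    'https://api.supermemory.ai',
  );
  if (keepDocuments) {
    url.searchParams.set('deleteDocuments', 'false');
  }
  return url.toString();
}

// Sends the DELETE request; throws if the API responds with a non-2xx status.
async function deleteConnection(
  apiKey: string,
  connectionId: string,
  keepDocuments: boolean,
): Promise<void> {
  const response = await fetch(deleteConnectionUrl(connectionId, keepDocuments), {
    method: 'DELETE',
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Delete failed with status ${response.status}`);
  }
}
```

Calling `deleteConnection(apiKey, 'conn_s3_abc123', true)` removes the connection while leaving the synced documents in place.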
### Manual Sync

<Tabs>
<Tab title="TypeScript">
```typescript
await client.connections.import('s3', {
  containerTags: ['org-123']
});
```
</Tab>
<Tab title="cURL">
```bash
curl -X POST "https://api.supermemory.ai/v3/connections/s3/import" \
  -H "Authorization: Bearer $SUPERMEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"containerTags": ["org-123"]}'
```
</Tab>
</Tabs>

## Sync Behavior

| Feature | Behavior |
|---------|----------|
| **Initial sync** | Fetches all files matching the prefix filter |
| **Incremental sync** | Fetches only files modified since the last sync |
| **Sync schedule** | Every 4 hours, plus manual triggers |
| **Document limit** | 10,000 files per connection (default) |

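The difference between an initial and an incremental sync can be pictured as a filter over the bucket listing. This is only an illustration of the rules in the table; the connector's real implementation is not shown in this guide, and `S3ObjectSummary` and `selectForSync` are hypothetical names.

```typescript
interface S3ObjectSummary {
  key: string;
  lastModified: Date;
}

// Illustrative selection rule only; not the connector's actual code.
function selectForSync(
  objects: S3ObjectSummary[],
  prefix: string,
  lastSyncedAt: Date | null,
): S3ObjectSummary[] {
  return objects.filter((object) => {
    if (!object.key.startsWith(prefix)) return false;
    // Initial sync: no previous sync time, so every matching file qualifies.
    if (lastSyncedAt === null) return true;
    // Incremental sync: only files modified after the previous sync.
    return object.lastModified.getTime() > lastSyncedAt.getTime();
  });
}
```

Passing `null` for `lastSyncedAt` models the initial sync; passing the previous sync time models each later scheduled or manual run.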
## IAM Permissions

Minimum required permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
```

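In IAM, `s3:ListBucket` applies to the bucket ARN and `s3:GetObject` applies to object ARNs, so the policy above can also be written as two statements and narrowed to a key prefix. The generator below is a sketch of that tighter variant; `readOnlyPolicy` is a hypothetical helper, and the prefix-scoped form assumes the connector only lists and reads keys under `metadata.prefix`, so test it against your own setup.

```typescript
interface PolicyStatement {
  Effect: 'Allow';
  Action: string[];
  Resource: string[];
  Condition?: { StringLike: { 's3:prefix': string[] } };
}

interface PolicyDocument {
  Version: '2012-10-17';
  Statement: PolicyStatement[];
}

// Hypothetical helper: read-only policy for one bucket and, optionally, one key prefix.
function readOnlyPolicy(bucket: string, prefix = ''): PolicyDocument {
  const bucketArn = `arn:aws:s3:::${bucket}`;

  // Bucket-level action: listing keys.
  const listStatement: PolicyStatement = {
    Effect: 'Allow',
    Action: ['s3:ListBucket'],
    Resource: [bucketArn],
  };
  if (prefix) {
    // Restrict listing to keys under the prefix.
    listStatement.Condition = { StringLike: { 's3:prefix': [`${prefix}*`] } };
  }

  // Object-level action: reading file contents.
  const getStatement: PolicyStatement = {
    Effect: 'Allow',
    Action: ['s3:GetObject'],
    Resource: [`${bucketArn}/${prefix}*`],
  };

  return { Version: '2012-10-17', Statement: [listStatement, getStatement] };
}

const policyJson = JSON.stringify(readOnlyPolicy('your-bucket-name', 'documents/'), null, 2);
```

With an empty prefix, the result grants the same access as the single-statement policy above.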
## Error Codes

| Code | Message | Solution |
|------|---------|----------|
| 401 | Authentication failed | Verify access key and secret |
| 403 | Access denied | Check IAM permissions and bucket policy |
| 404 | Bucket not found | Verify bucket name and region |
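
When calling the REST endpoints directly, the table above can be turned into a small lookup for logging or user-facing messages. This is a sketch keyed on the HTTP status code only; `S3_ERROR_HINTS` and `hintForStatus` are hypothetical names, and how the SDK surfaces these errors is not covered here.

```typescript
// Hints copied from the error table; keyed by HTTP status code.
const S3_ERROR_HINTS: Record<number, string> = {
  401: 'Verify access key and secret',
  403: 'Check IAM permissions and bucket policy',
  404: 'Verify bucket name and region',
};

// Returns the documented hint, or a generic message for any other status.
function hintForStatus(status: number): string {
  return S3_ERROR_HINTS[status] ?? `Unexpected status ${status}`;
}
```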