airi/services/minecraft

WIP

Caution: Documentation below may be out of date.

🧠 Cognitive Architecture

AIRI's Minecraft agent is built on a four-layered cognitive architecture inspired by cognitive science, enabling reactive, conscious, and physically grounded behaviors.

Architecture Overview

graph TB
    subgraph "Layer A: Perception"
        Events[Raw Events]
        EM[Event Manager]
        Events --> EM
    end

    subgraph "Layer B: Reflex (Subconscious)"
        RM[Reflex Manager]
        FSM[State Machine]
        RM --> FSM
    end

    subgraph "Layer C: Conscious (Reasoning)"
        ORC[Orchestrator]
        Planner["Planning Agent (LLM)"]
        Chat["Chat Agent (LLM)"]
        ORC --> Planner
        ORC --> Chat
    end

    subgraph "Layer D: Action (Execution)"
        TE[Task Executor]
        AA[Action Agent]
        Planner -->|Plan| TE
        TE -->|Action Steps| AA
    end

    EM -->|High Priority| RM
    EM -->|All Events| ORC
    RM -.->|Inhibition Signal| ORC
    ORC -->|Execution Request| TE

    style EM fill:#e1f5ff
    style RM fill:#fff4e1
    style ORC fill:#ffe1f5
    style TE fill:#dcedc8

Layer A: Perception

Location: src/cognitive/perception/

The perception layer is the sensory input hub. It collects raw Mineflayer signals and translates them into typed events and signals through a pipeline made of an event registry and a rule engine.

Pipeline:

  • Event definitions in events/definitions/* bind Mineflayer events to normalized raw events.
  • EventRegistry emits raw:<modality>:<kind> events to the cognitive event bus.
  • RuleEngine evaluates YAML rules and emits derived signal:* events consumed by Reflex/Conscious layers.
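
The pipeline above can be pictured with a small in-memory sketch. Everything in it is an assumption made for illustration: `MiniEventBus`, `Rule`, `attachRules`, and the event names `raw:felt:damage_taken` and `signal:danger` are hypothetical, and the rules are written inline rather than loaded from YAML as the real RuleEngine does.

```typescript
// Hypothetical sketch: none of these names are the project's real API.
interface CognitiveEvent {
  type: string
  payload: Record<string, unknown>
}
type Handler = (event: CognitiveEvent) => void

class MiniEventBus {
  private readonly handlers: Array<{ prefix: string, handler: Handler }> = []

  // Subscribe to every event whose type starts with `prefix`.
  on(prefix: string, handler: Handler): void {
    this.handlers.push({ prefix, handler })
  }

  emit(event: CognitiveEvent): void {
    for (const { prefix, handler } of this.handlers) {
      if (event.type.startsWith(prefix))
        handler(event)
    }
  }
}

// A rule turns one raw event type into a derived signal when its condition holds.
interface Rule {
  when: string
  condition: (payload: Record<string, unknown>) => boolean
  emit: string
}

// Listen to all raw events and re-emit matching ones as signals.
function attachRules(bus: MiniEventBus, rules: Rule[]): void {
  bus.on('raw:', (event) => {
    for (const rule of rules) {
      if (rule.when === event.type && rule.condition(event.payload))
        bus.emit({ type: rule.emit, payload: event.payload })
    }
  })
}

const bus = new MiniEventBus()
const signals: string[] = []
bus.on('signal:', (event) => { signals.push(event.type) })

attachRules(bus, [{
  when: 'raw:felt:damage_taken',
  condition: (payload) => {
    const amount = payload.amount
    return typeof amount === 'number' && amount >= 4
  },
  emit: 'signal:danger',
}])

bus.emit({ type: 'raw:felt:damage_taken', payload: { amount: 6 } }) // passes the rule
bus.emit({ type: 'raw:felt:damage_taken', payload: { amount: 1 } }) // filtered out
```

Only the first raw event crosses the threshold, so `signals` ends up holding a single `signal:danger` entry. Downstream layers would subscribe to the `signal:` prefix in the same way.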

Key files:

  • events/index.ts
  • events/definitions/*
  • rules/engine.ts
  • rules/*.yaml
  • pipeline.ts

Layer B: Reflex

Location: src/cognitive/reflex/

The reflex layer handles immediate, instinctive reactions. It follows a finite state machine (FSM) pattern so that responses stay fast and predictable.

Components:

  • Reflex Manager (reflex-manager.ts): Coordinates reflex behaviors
  • Inhibition: Reflexes can inhibit Conscious layer processing to prevent redundant responses.
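
As a rough illustration of the FSM pattern and the inhibition flag described above, here is a minimal sketch. The states, the input names (`signal:danger`, `reflex:done`), and the `MiniReflexMachine` class are all assumptions and do not mirror the contents of reflex-manager.ts.

```typescript
// Hypothetical names throughout; this is not the API of reflex-manager.ts.
type ReflexState = 'idle' | 'reacting'
type ReflexInput = 'signal:danger' | 'reflex:done'

// Transition table: current state -> input -> next state.
// A missing entry means "stay in the current state".
const transitions: Record<ReflexState, Partial<Record<ReflexInput, ReflexState>>> = {
  idle: { 'signal:danger': 'reacting' },
  reacting: { 'reflex:done': 'idle' },
}

class MiniReflexMachine {
  private state: ReflexState = 'idle'

  get current(): ReflexState {
    return this.state
  }

  // While a reflex is running, the conscious layer is asked to hold back.
  get inhibitsConscious(): boolean {
    return this.state === 'reacting'
  }

  handle(input: ReflexInput): ReflexState {
    const next = transitions[this.state][input]
    if (next !== undefined)
      this.state = next
    return this.state
  }
}

const reflex = new MiniReflexMachine()
const afterDanger = reflex.handle('signal:danger')
const inhibitedDuringReflex = reflex.inhibitsConscious
const afterDone = reflex.handle('reflex:done')
const inhibitedAfterReflex = reflex.inhibitsConscious
```

Keeping the transitions in a table makes every reachable state explicit, which is one way to get the predictability the FSM approach is chosen for.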

Layer C: Conscious

Location: src/cognitive/conscious/

The conscious layer handles complex reasoning, planning, and high-level decision-making. No physical execution happens in this layer; that is the job of the Action layer.

Components:

  • Brain (brain.ts): Event queue orchestration, LLM turn lifecycle, safety/budget guards, debug REPL integration.
  • JavaScript Planner (js-planner.ts): Sandboxed planning/runtime execution against exposed tools/globals.
  • Query Runtime (query-dsl.ts): Read-only world/inventory/entity query helpers for planner scripts.
  • Task State (task-state.ts): Cancellation token and task lifecycle primitives used by action execution.
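
The cancellation-token idea mentioned for Task State can be sketched as follows. This is a generic version of the pattern written under assumptions: `CancellationToken`, `onCancel`, and `runSteps` are hypothetical names, and the actual primitives in task-state.ts may look quite different.

```typescript
// Hypothetical sketch of a cooperative cancellation token.
class CancellationToken {
  private cancelled = false
  private readonly listeners: Array<() => void> = []

  get isCancelled(): boolean {
    return this.cancelled
  }

  // Runs the listener when cancel() is first called,
  // or immediately if the token is already cancelled.
  onCancel(listener: () => void): void {
    if (this.cancelled)
      listener()
    else
      this.listeners.push(listener)
  }

  // Idempotent: only the first call notifies listeners.
  cancel(): void {
    if (this.cancelled)
      return
    this.cancelled = true
    for (const listener of this.listeners)
      listener()
  }
}

type Step = (token: CancellationToken) => void

// Runs steps in order and stops before the next step once the token is cancelled.
// Returns how many steps actually ran.
function runSteps(steps: Step[], token: CancellationToken): number {
  let completed = 0
  for (const step of steps) {
    if (token.isCancelled)
      break
    step(token)
    completed++
  }
  return completed
}

const token = new CancellationToken()
const cancelLog: string[] = []
token.onCancel(() => { cancelLog.push('cleanup') })

// The second step cancels the task, so the third step never runs.
const completedSteps = runSteps([() => {}, t => t.cancel(), () => {}], token)
```

Cancellation here is cooperative: a step in progress is not interrupted, the token is only checked between steps.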

Layer D: Action

Location: src/cognitive/action/

The action layer is responsible for the actual execution of tasks in the world. It isolates "Doing" from "Thinking".

Components:

  • Task Executor (task-executor.ts): Runs normalized action instructions and emits action lifecycle events.
  • Action Registry (action-registry.ts): Validates params and dispatches tool calls.
  • Tool Catalog (llm-actions.ts): Action/tool definitions and schemas bound to mineflayer skills.
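
To make "validates params and dispatches tool calls" more concrete, here is a minimal sketch of a registry with hand-written validation. The names `MiniActionRegistry`, `ActionDefinition`, `CollectParams`, and the `collect` action are assumptions for illustration; the real action-registry.ts and its schemas are not shown in this README and may use a schema library instead.

```typescript
// Hypothetical sketch: not the API of action-registry.ts.
interface ActionDefinition<P> {
  name: string
  // Returns typed params or throws if the input is malformed.
  validate: (input: unknown) => P
  perform: (params: P) => string
}

class MiniActionRegistry {
  private readonly actions = new Map<string, (input: unknown) => string>()

  // Store a closure so that validation always runs before the action itself.
  register<P>(definition: ActionDefinition<P>): void {
    this.actions.set(definition.name, input => definition.perform(definition.validate(input)))
  }

  dispatch(name: string, input: unknown): string {
    const action = this.actions.get(name)
    if (action === undefined)
      throw new Error(`Unknown action: ${name}`)
    return action(input)
  }
}

interface CollectParams {
  block: string
  count: number
}

const registry = new MiniActionRegistry()
registry.register<CollectParams>({
  name: 'collect',
  validate: (input) => {
    const { block, count } = (input ?? {}) as { block?: unknown, count?: unknown }
    if (typeof block !== 'string' || typeof count !== 'number' || !Number.isInteger(count) || count < 1)
      throw new Error('collect requires { block: string, count: positive integer }')
    return { block, count }
  },
  // A real action would call a mineflayer skill here; this one only reports.
  perform: params => `collected ${params.count} x ${params.block}`,
})

const collectResult = registry.dispatch('collect', { block: 'oak_log', count: 3 })
```

Rejecting malformed input at the registry boundary means a tool call produced by the LLM with the wrong shape fails with a clear error before any in-world action starts.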

🔄 Event Flow Example

Scenario: "Build a house"

Player: "build a house"
  ↓
[Perception] Event detected
  ↓
[Conscious] Architect plans the structure
  ↓
[Action] Executor takes the plan and manages the construction loop:
    - Step 1: Collect wood (calls ActionRegistry tool)
    - Step 2: Craft planks
    - Step 3: Build walls
  ↓
[Conscious] Brain confirms completion: "House is ready!"
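
The construction loop in this scenario can be approximated by a small executor that walks a plan and reports lifecycle events. This is a sketch under assumptions: `PlanStep`, `executePlan`, and the `action:started`, `action:completed`, and `action:failed` event names are hypothetical, and the steps are no-ops standing in for real tool calls.

```typescript
// Hypothetical sketch of a plan executor; not the API of task-executor.ts.
type LifecycleEvent
  = | { type: 'action:started', step: string }
    | { type: 'action:completed', step: string }
    | { type: 'action:failed', step: string, error: string }

interface PlanStep {
  name: string
  run: () => void
}

// Runs steps in order, emitting an event before and after each one.
// Stops at the first failure and returns false; returns true if all steps succeed.
function executePlan(plan: PlanStep[], emit: (event: LifecycleEvent) => void): boolean {
  for (const step of plan) {
    emit({ type: 'action:started', step: step.name })
    try {
      step.run()
    }
    catch (error) {
      const message = error instanceof Error ? error.message : String(error)
      emit({ type: 'action:failed', step: step.name, error: message })
      return false
    }
    emit({ type: 'action:completed', step: step.name })
  }
  return true
}

const lifecycle: LifecycleEvent[] = []
const housePlan: PlanStep[] = [
  { name: 'Collect wood', run: () => {} },
  { name: 'Craft planks', run: () => {} },
  { name: 'Build walls', run: () => {} },
]

const houseBuilt = executePlan(housePlan, (event) => { lifecycle.push(event) })
```

With three successful steps the log contains six events, one started/completed pair per step, which is the information the conscious layer would need to confirm completion.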

📁 Project Structure

src/
├── cognitive/                  # 🧠 Perception → Reflex → Conscious → Action
│   ├── perception/            # Event definitions + rule evaluation
│   │   ├── events/
│   │   │   ├── index.ts
│   │   │   └── definitions/*
│   │   ├── rules/
│   │   │   ├── *.yaml
│   │   │   ├── engine.ts
│   │   │   ├── loader.ts
│   │   │   └── matcher.ts
│   │   └── pipeline.ts
│   ├── reflex/                # Fast, rule-based reactions
│   │   ├── reflex-manager.ts
│   │   ├── runtime.ts
│   │   ├── context.ts
│   │   └── behaviors/idle-gaze.ts
│   ├── conscious/             # LLM-powered reasoning
│   │   ├── brain.ts           # Core reasoning loop/orchestration
│   │   ├── js-planner.ts      # JS planning sandbox
│   │   ├── query-dsl.ts       # Read-only query runtime
│   │   ├── llm-log.ts         # Turn/log query helpers
│   │   ├── task-state.ts      # Task lifecycle enums/helpers
│   │   └── prompts/           # Prompt definitions (e.g., brain-prompt.ts)
│   ├── action/                # Task execution layer
│   │   ├── task-executor.ts   # Executes actions and emits lifecycle events
│   │   ├── action-registry.ts # Tool dispatch + schema validation
│   │   ├── llm-actions.ts     # Tool catalog
│   │   └── types.ts
│   ├── event-bus.ts           # Event bus core
│   ├── container.ts           # Dependency injection wiring
│   ├── index.ts               # Cognitive system entrypoint
│   └── types.ts               # Shared cognitive types
├── libs/
│   └── mineflayer/           # Mineflayer bot wrapper/adapters
├── skills/                   # Atomic bot capabilities
├── composables/              # Reusable functions (config, etc.)
├── plugins/                  # Mineflayer/bot plugins
├── debug/                    # Debug web dashboard + MCP bridge
├── utils/                    # Helpers
└── main.ts                   # Bot entrypoint

🎯 Design Principles

  1. Separation of Concerns: Each layer has a distinct responsibility
  2. Event-Driven: Loose coupling via centralized event system
  3. Inhibition Control: Reflexes prevent unnecessary LLM calls
  4. Extensibility: Easy to add new reflexes or conscious behaviors
  5. Cognitive Realism: Mimics human-like perception → reaction → deliberation

🚧 Future Enhancements

  • Perception Layer:
    • ⏱️ Temporal context window (remember recent events)
    • 🎯 Salience detection (filter noise, prioritize important events)
  • Reflex Layer:
    • 🏃 Dodge hostile mobs
    • 🛡️ Emergency combat responses
  • Conscious Layer:
    • 💭 Emotional state management
    • 🧠 Long-term memory integration
    • 🎭 Personality-driven responses

🛠️ Development

Commands

  • pnpm dev - Start the bot in development mode
  • pnpm lint - Run ESLint
  • pnpm typecheck - Run TypeScript type checking
  • pnpm test - Run tests

🙏 Acknowledgements

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.