Concepts — How FowyldAI Works¶

Understand FowyldAI's architecture before you configure or deploy it.

Crown Engine¶

The Crown Engine is the core runtime — a FastAPI server (src/main.py) that initializes 115+ subsystems in a specific startup sequence via an async lifespan() context manager.

Startup order includes:

Observability — logging, metrics collection, Prometheus
Safety — content filters, guardrails
Compliance — GDPR, AI Act, SOC 2 checks
Inference Queue — request queuing and prioritization
Model Registry — loads config/models.yaml, maps models to nodes
Warm Pool — preloads configured models into GPU VRAM (see config/warm_pool.yaml)
Routing — request-to-model routing logic
Sovereign Brain — reasoning and planning intelligence
Agent Orchestrator — multi-step agent and council execution
Peer Mesh — cluster node discovery, heartbeat, delegation
Gateway — reverse proxy coordination
Crown Tunnel — mTLS peering between Crown instances

When all subsystems are ready, the engine logs fowyldai.startup.READY and begins accepting requests on port 8400.

Sovereign Brain¶

The Sovereign Brain is the intelligence layer that handles reasoning, planning, and memory. It provides:

/brain/reason — Primary reasoning endpoint, routes to the best available model
/brain/plan — Decomposes a goal into sub-tasks
/brain/remember — Stores facts in institutional memory (vector store)
/brain/recall — Semantic search over stored memories
/brain/compress — Extractive summarization for token savings
/brain/cached_answer — Checks semantic cache before running inference

The Brain works with the Adaptive Intelligence Agent (AIA) to learn from interactions and improve routing over time.

Model Routing¶

FowyldAI doesn't use a single model — it routes requests to the best model for the task:

/classify — The Scout model (qwen25-1b, 3 GB VRAM) classifies what kind of task is being requested
/auto — Auto-routing uses the classification to pick a reasoning tier:
- Light (/reason/light) — phi3-mini: fast, low-VRAM tasks
- Deep (/reason/deep) — mistral-7b GPTQ: complex analysis
- Heavy (/reason/heavy) — openhermes-7b GPTQ: multi-step reasoning
VRAM-Aware Eviction — When GPU memory is tight, the warm pool evicts less-used models to make room

The model registry in config/models.yaml defines available models, their HuggingFace repos, VRAM requirements, quantization settings, and which cluster node they run on.

Council Protocol¶

For high-stakes decisions, the Council Protocol runs multi-model deliberation:

The same prompt is sent to multiple models simultaneously
Each model produces an independent response
Responses are synthesized, conflicts are identified
A consensus answer is produced with confidence scores

Access via POST /agent/council.

Editions¶

FowyldAI ships in two editions:

Edition	Environment Variable	Description
Crown	`FOWYLD_EDITION=crown`	Full-featured, enterprise subsystems enabled
Ranger	`FOWYLD_EDITION=ranger`	Lightweight, base product pack

Set the edition in .env or pass it as a Docker build arg.

MCP Server (VS Code Integration)¶

FowyldAI exposes an MCP (Model Context Protocol) server (src/mcp_server.py) that integrates with VS Code and other MCP-compatible tools.

The MCP server runs over stdio using FastMCP("FowyldAI") and exposes 20+ tools:

Tool	Purpose
`classify`	Task type classification
`reason`	Sovereign reasoning
`plan`	Goal decomposition
`remember` / `recall`	Memory storage and retrieval
`compress`	Token-saving summarization
`cached_answer`	Semantic cache lookup
`code_review`	Code review analysis
`explain_code`	Code explanation
`generate_code`	Code generation
`write_tests`	Test generation
`find_bugs`	Bug detection
`refactor`	Refactoring suggestions
`commit_message`	Commit message generation
`generate_docstring`	Docstring generation
`auto_route`	Smart model auto-routing
`eval_quality`	Output quality checking

Each MCP tool is an HTTP proxy that calls the FowyldAI REST API. It connects to http://127.0.0.1:8400 (local Crown Engine) first, with fallback to a remote endpoint.

Gateway¶

FowyldAI includes a mesh-aware reverse proxy (gateway/app.py) — a separate FastAPI application that sits in front of one or more Crown Engine instances.

Two routing modes:

FAILOVER (default) — Routes to on-premise GPU, fails over to a cloud CPU engine
COOPERATIVE — Routes GPU-heavy tasks (image/video) to GPU nodes, CPU tasks to CPU nodes

Peer Mesh¶

For multi-node deployments, the Peer Mesh handles:

Node discovery and heartbeat (/peer/heartbeat)
Task delegation to remote nodes (/peer/delegate)
Consensus and quorum protocols (/peer/consensus, /peer/quorum)
Crown-to-Crown communication via mTLS tunnel (config/crown_tunnel.yaml)