
Quickstart — Deploy FowyldAI

Get FowyldAI running and execute your first sovereign inference.

Prerequisites

  • Python 3.11+ (enforced at startup)
  • PyTorch 2.4+ with CUDA support (optional — falls back to CPU)
  • 16 GB RAM minimum (32 GB recommended)
  • GPU recommended: 8 GB VRAM minimum (e.g., RTX 2000 Ada); 16 GB for the full model suite
  • ~35 GB disk for model weights
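
The prerequisites above can be checked before installing. The following is a minimal preflight sketch, not part of FowyldAI: the function names are hypothetical, the thresholds are copied from the list above, and the GPU check only runs if PyTorch is already installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)        # from the prerequisites list
MIN_FREE_DISK_GB = 35       # approximate size of the model weights


def check_python(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is at least Python 3.11."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def free_disk_gb(path: str = ".") -> float:
    """Return free disk space at `path` in GiB."""
    return shutil.disk_usage(path).free / 1024**3


def cuda_summary() -> str:
    """Describe the PyTorch/CUDA setup without failing if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch not installed"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}, no CUDA device (CPU fallback)"
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    return f"PyTorch {torch.__version__}, {props.name}, {vram_gb:.1f} GB VRAM"


if __name__ == "__main__":
    print("Python 3.11+:", "ok" if check_python() else "too old")
    disk = free_disk_gb(".")
    print(f"Free disk: {disk:.1f} GB", "(ok)" if disk >= MIN_FREE_DISK_GB else "(low)")
    print("GPU:", cuda_summary())
```

System RAM is not checked here because the Python standard library has no portable call for it.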

Step 1: Clone and Install

git clone https://github.com/melhousen-solutions-dev/fowyldai.git
cd fowyldai
python -m venv .venv
.venv\Scripts\activate   # Windows
# source .venv/bin/activate  # Linux/Mac
pip install -e .

Step 2: Configure

Copy the example environment file and edit as needed:

cp .env.example .env
# Windows cmd.exe: copy .env.example .env

Key settings in .env:

FOWYLDAI_PORT=8400
FOWYLDAI_MODEL_ROOT=D:\models
FOWYLD_EDITION=crown        # crown or ranger
FOWYLDAI_PRODUCT_PACK=melhousen
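
A typo in .env is easier to catch before the engine starts. The sketch below parses the four settings shown above and flags obvious problems. It is an illustration only: parse_env and validate are hypothetical helpers, and the way FowyldAI itself reads .env (including whether it accepts inline comments) is not documented here.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and full-line comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "crown   # crown or ranger".
        value = value.split(" #", 1)[0].strip()
        settings[key.strip()] = value
    return settings


def validate(settings: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    port = settings.get("FOWYLDAI_PORT", "")
    if not port.isdigit() or not 1 <= int(port) <= 65535:
        problems.append("FOWYLDAI_PORT must be an integer between 1 and 65535")
    if settings.get("FOWYLD_EDITION") not in {"crown", "ranger"}:
        problems.append("FOWYLD_EDITION must be 'crown' or 'ranger'")
    if not settings.get("FOWYLDAI_MODEL_ROOT"):
        problems.append("FOWYLDAI_MODEL_ROOT is not set")
    return problems


if __name__ == "__main__":
    from pathlib import Path

    env_file = Path(".env")
    if env_file.is_file():
        issues = validate(parse_env(env_file.read_text(encoding="utf-8")))
        print("\n".join(issues) if issues else ".env looks consistent")
    else:
        print("No .env file found in the current directory")
```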

Configuration files live in the config/ directory:

File                   Purpose
config/dev.yaml        App config (host, port, logging, inference, safety)
config/prod.yaml       Production overrides
config/models.yaml     Model registry (HF repos, VRAM, quantization)
config/security.yaml   Rate limiting, CORS, alerting, encryption
config/warm_pool.yaml  Which models stay preloaded in GPU VRAM

Step 3: Download Models

python scripts/download_models.py

This downloads the core model suite from Hugging Face (~35 GB total):

Model                 Role                    VRAM
qwen25-1b             Classification (Scout)  3 GB
phi3-mini             Light reasoning         7.6 GB
mistral-7b (GPTQ)     Deep reasoning          4.5 GB
openhermes-7b (GPTQ)  Heavy reasoning         4.5 GB
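
When deciding which models to keep resident on a given GPU, the VRAM column can be summed directly. The sketch below does that arithmetic with the figures from the table. The helper names are hypothetical, and treating these figures as simultaneous resident memory is an assumption; actual usage depends on context length and on how the engine loads and unloads models.

```python
# Per-model VRAM figures copied from the table above, in GB.
MODEL_VRAM_GB = {
    "qwen25-1b": 3.0,
    "phi3-mini": 7.6,
    "mistral-7b": 4.5,
    "openhermes-7b": 4.5,
}


def total_vram(models) -> float:
    """Sum the listed VRAM figures for the given model names."""
    return sum(MODEL_VRAM_GB[name] for name in models)


def fits(models, budget_gb: float, headroom_gb: float = 0.0) -> bool:
    """Return True if the models plus optional headroom fit in the budget."""
    return total_vram(models) + headroom_gb <= budget_gb


if __name__ == "__main__":
    candidates = (
        ["qwen25-1b", "mistral-7b"],
        ["qwen25-1b", "phi3-mini"],
        list(MODEL_VRAM_GB),
    )
    for combo in candidates:
        verdict = "fits" if fits(combo, budget_gb=8.0) else "does not fit"
        print(f"{' + '.join(combo)}: {total_vram(combo):.1f} GB, {verdict} in 8 GB")
```

A calculation like this can inform which entries to list in config/warm_pool.yaml.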

Step 4: Start the Crown Engine

Option A — Direct uvicorn:

python -m uvicorn src.main:app --host 127.0.0.1 --port 8400 --reload

The --reload flag restarts the server whenever source files change; it is meant for development and can be omitted otherwise.

Option B — Using the CLI entry point (after pip install -e .):

fowyldai

Option C — Using Make:

make dev-crown    # Crown Edition
make dev-ranger   # Ranger Edition

Option D — Using the startup script (Windows):

.\scripts\Start-FowyldAI.ps1 -ModelRoot D:\models -Host 127.0.0.1 -Port 8400

This script verifies model integrity (SHA-256 checksums) before starting.
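
For readers on other platforms, the idea behind that integrity check can be sketched in Python. This is not the code in Start-FowyldAI.ps1: the manifest shape (relative file name mapped to an expected hex digest) and both function names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so multi-gigabyte weights never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root, expected: dict[str, str]) -> list[str]:
    """Return the names of files that are missing or whose digest differs.

    `expected` maps a path relative to `root` to its SHA-256 hex digest
    (a hypothetical manifest format).
    """
    bad = []
    for relative_name, expected_hex in expected.items():
        candidate = Path(root) / relative_name
        if not candidate.is_file() or sha256_file(candidate) != expected_hex.lower():
            bad.append(relative_name)
    return bad
```

An empty list from find_mismatches means every listed file is present and matches its expected digest.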

Option E — Docker:

docker build -t fowyldai:crown --build-arg FOWYLD_EDITION=crown .
docker compose -f docker-compose.prod.yml up -d

Step 5: Verify It's Running

curl http://127.0.0.1:8400/ping
# {"status": "ok"}

curl http://127.0.0.1:8400/health
# Returns version + loaded models list
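
Model loading can take a while, so scripts that start the engine may want to wait for /ping before sending requests. The sketch below polls that endpoint using only the standard library. It assumes the {"status": "ok"} response shown above; wait_until_ready and its parameters are hypothetical names, not part of FowyldAI.

```python
import json
import time
import urllib.error
import urllib.request


def ping(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/ping answers with {"status": "ok"}."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: not ready yet.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"


def wait_until_ready(base_url: str, attempts: int = 30, delay: float = 1.0,
                     probe=ping) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, wait_until_ready("http://127.0.0.1:8400") returns True once the server responds, or False after roughly 30 seconds of failed attempts.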

Step 6: Run Your First Inference

Auto-route (let the Sovereign Brain pick the best model):

curl -X POST http://127.0.0.1:8400/auto \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize the key benefits of sovereign AI deployment"}'

OpenAI-compatible endpoint:

curl -X POST http://127.0.0.1:8400/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3-mini",
    "messages": [{"role": "user", "content": "What is sovereign AI?"}]
  }'

Sovereign Brain reasoning:

curl -X POST http://127.0.0.1:8400/brain/reason \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Analyze the security implications of cloud-hosted LLMs"}'
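
The same three requests can be sent from Python. The sketch below uses only the standard library and mirrors the curl payloads above. The helper names are hypothetical, and the response schemas of /auto and /brain/reason are not documented here, so the example returns the parsed JSON as-is rather than assuming any field names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8400"  # matches FOWYLDAI_PORT=8400 from Step 2


def build_request(path: str, payload: dict, base_url: str = BASE_URL):
    """Build a JSON POST request equivalent to the curl commands above."""
    return urllib.request.Request(
        url=f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, base_url: str = BASE_URL,
              timeout: float = 120.0):
    """Send the request and return the decoded JSON response."""
    request = build_request(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def chat_payload(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat completion body with one user message."""
    return {"model": model, "messages": [{"role": "user", "content": user_message}]}


# Usage, with the engine running:
#   post_json("/auto", {"prompt": "Summarize the key benefits of sovereign AI deployment"})
#   post_json("/v1/chat/completions", chat_payload("phi3-mini", "What is sovereign AI?"))
#   post_json("/brain/reason", {"prompt": "Analyze the security implications of cloud-hosted LLMs"})
```

The generous default timeout is a guess that allows for a model being loaded on first use; adjust it to your hardware.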

Next Steps