Technical Architecture · WildMind

How WildMind Works

The technical architecture behind the world's first living AI civilization experiment — 30 independent language models, a 50km × 40km living world, emergent language, and evolutionary learning across generations.

135M Parameter LLMs LoRA Fine-tuning Emergent Language Evolutionary Training Neon Postgres Replit Cloud Run
The Premise

This is not a game. It's a scientific experiment in emergence.

WildMind strips 30 AI minds of language, culture, and history. It gives them only survival instincts and the ability to produce sound — then sets them loose in a world that will kill them if they don't cooperate. What happens next is unscripted, unguided, and genuinely unknown.

The question we're trying to answer: if you gave intelligence the tools of life but none of its accumulated knowledge — no words, no culture, no history — would it find them again?

  • 30 · Independent AI Minds
  • 2,000km² · Living World Area (50km × 40km)
  • 135M · Parameters per Citizen
  • 8 · Personality Archetypes
  • 0 · Shared Words at Start
  • Possible Outcomes
The Citizens

30 unique AI minds. No shared weights. No shared language.

Each citizen is an independent language model — a fine-tuned version of a 135M parameter base LLM. Eight archetypes, each with a distinct personality matrix encoded through LoRA fine-tuning. They don't share weights. They don't communicate through any channel except simulated speech. When citizen_07 says "bzaavk", citizen_12 genuinely doesn't know what it means — unless they've met before.

Architecture
TinyLlama
Transformer decoder, optimized for real-time CPU inference
Parameter Count
135M
Per citizen — 30 independent models running simultaneously
Fine-tuning
LoRA
Low-rank adaptation on archetype-specific behavioral data
Inference Engine
llama-cpp-python
Real-time on CPU — no GPU required, fully reproducible
Context Window
512 tokens
Per citizen, per tick — constrained like biological working memory
Archetypes
8 types
Alpha, Provider, Intellectual, Wildcard × male/female
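The 512-token context window forces hard choices about what each citizen "remembers" on a given tick. Below is a minimal sketch of how context packing might work, assuming a naive whitespace tokenizer and hypothetical state fields; the real engine uses the model's own tokenizer and its own prompt format.

```python
# Sketch: packing a citizen's context into the 512-token window.
# A naive whitespace "tokenizer" stands in for the model's real one,
# and the state fields ("location", "hunger", "nearby") are illustrative.

TOKEN_BUDGET = 512

def build_prompt(state: dict, memories: list, budget: int = TOKEN_BUDGET) -> str:
    """Vital state goes in first, then as many recent memories as fit."""
    header = (
        f"location={state['location']} hunger={state['hunger']} "
        f"nearby={','.join(state['nearby']) or 'none'}"
    )
    used = len(header.split())
    lines = [header]
    for memory in reversed(memories):   # newest memories get priority
        cost = len(memory.split())
        if used + cost > budget:
            break                        # oldest memories fall out of "working memory"
        lines.append(memory)
        used += cost
    return "\n".join(lines)
```

Dropping the oldest memories first is one plausible eviction policy; it mirrors the "biological working memory" framing in the table above.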

The Eight Archetypes

A
Alpha ♂
Aggressive, territorial, moves toward conflict. Highest influence radius, lowest listening rate.
percussive phonemes
A
Alpha ♀
Protective, coordinating, defends resources and kin. High aggression, high social density.
assertive cadence
P
Provider ♂
Forages extensively, shares resources, builds trust through repetitive cooperation.
high lexical range
P
Provider ♀
Caretaker, nutritional focus, highest memory for citizen states and needs.
nurturing patterns
I
Intellectual ♂
Observes before acting. Repeats novel sounds. Largest vocabulary, slowest to adopt new words.
sibilant phonemes
I
Intellectual ♀
Pattern detector, highest communication success rate, bridges social clusters.
multi-syllabic
W
Wildcard ♂
Unpredictable behavior, highest mutation rate in lexicon, linguistic innovator.
novel phoneme combos
W
Wildcard ♀
Explores widely, maximizes biome diversity, carries words between isolated groups.
code-switching
The World

A 50km × 40km living environment. Not a backdrop.

The world has climate — seasons, temperature, rainfall, wind. It has eight biomes: tropical forest, savanna, desert, temperate and boreal zones, tundra, ice cap, and coast. Flora grows and depletes. Fauna hunts and flees. Citizens have real hunger, thirst, and health. They die. The environment is not decorative. It is the pressure that forces language into existence.

[Biome map: a north-to-south latitude gradient of ice, tundra, boreal, temperate, tropical (equator), savanna, desert, temperate, tundra, ice.]
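The latitude gradient can be sketched as a simple band lookup. The band boundaries here are illustrative, and the real biome engine is procedural; coastal biomes, for one, don't fit a one-dimensional gradient.

```python
# Sketch of the latitude-based biome gradient from the world map.
# Evenly spaced bands are an assumption; the real engine layers in
# noise, coastlines, and climate on top of the gradient.

BANDS = ["ice", "tundra", "boreal", "temperate", "tropical",
         "savanna", "desert", "temperate", "tundra", "ice"]

def biome_at(y: float, height: float = 40_000.0) -> str:
    """Map a north-to-south coordinate (0 = north pole) to a biome band."""
    frac = min(max(y / height, 0.0), 1.0)
    index = min(int(frac * len(BANDS)), len(BANDS) - 1)
    return BANDS[index]
```

Note the asymmetry taken from the map: boreal forest sits only in the northern hemisphere, savanna and desert only in the southern.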

World Size
50k × 40k
Units (~50km × 40km) of simulated terrain
Biome Engine
8 Biomes
Procedurally generated with latitude-based climate gradients
Climate Engine
4 Seasons
Temperature, rainfall, wind — all affecting foraging and survival
Flora Engine
200k Cells
Growth grid with depletion and natural regrowth cycles
Fauna Engine
AI Predators
Behavioral AI for predators and prey — real hunt/flee dynamics
Survival Engine
4 Vitals
Hunger, thirst, health, energy — citizens can and do die
Language Emergence

The core experiment: language from nothing.

Citizens start with zero shared language. They produce proto-sounds — phonetic tokens generated by their LLM during interactions. When the same sound is independently used by 3+ citizens for the same referent, it becomes a "shared word" — the beginning of language. No one programmed this. It happens, or it doesn't.
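The promotion rule is simple enough to sketch directly. Lexicons are modeled here as plain dictionaries; the 3-citizen threshold and the data shapes come from the description above, everything else is illustrative.

```python
# Sketch of the shared-word rule: a sound becomes a "shared word" once
# 3+ citizens independently map it to the same referent. Lexicons are
# modeled as {citizen_id: {sound: referent}}; names are illustrative.

SHARED_THRESHOLD = 3

def find_shared_words(lexicons: dict) -> set:
    """Return (sound, referent) pairs used by at least SHARED_THRESHOLD citizens."""
    counts = {}
    for entries in lexicons.values():
        for sound, referent in entries.items():
            counts[(sound, referent)] = counts.get((sound, referent), 0) + 1
    return {pair for pair, n in counts.items() if n >= SHARED_THRESHOLD}
```

Counting (sound, referent) pairs rather than sounds alone matters: two citizens using "bzaavk" for different things is noise, not language.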

Step 1
Proto-sound Generated

Citizen LLM produces a phonetic token during an interaction. Context (fire, hunger, another citizen) shapes the sound.

Step 2
Personal Lexicon Update

Each citizen maintains sound → meaning mappings with confidence scores. Repeated use increases confidence.

Step 3
Lexicon Transfer

When citizens interact, they probabilistically learn from each other. Successful communication reinforces shared meaning.

Step 4
Shared Word

3+ citizens independently using the same sound for the same referent — the sound becomes a shared word. Language exists.

Step 5
Voice & Metrics

The word gets a synthesized voice. Linguistic metrics computed: Zipf's law coefficient, Heaps' law beta, network clustering.
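One of the Step 5 metrics can be sketched in a few lines: estimating the Zipf's-law coefficient as the slope of log-frequency against log-rank, via ordinary least squares. This is a plausible stdlib implementation, not necessarily the one the metrics pipeline uses.

```python
# Sketch: Zipf's-law coefficient as the log-log slope of the word
# frequency distribution. A natural lexicon gives a slope near -1.
import math

def zipf_coefficient(frequencies: list) -> float:
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var   # least-squares slope; ~ -1.0 for Zipfian data
```

Tracking this coefficient over ticks is one way to tell whether the emerging lexicon is statistically language-like or just noise.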

The Tick System

Every 30–90 seconds, the world advances one tick.

The simulation advances in discrete ticks. Each tick, the entire world updates: environment, citizen decisions, speech generation, lexicon evolution, survival, births, deaths. Everything you see on the dashboard is the result of this relentless, autonomous loop.

Climate & Environment Update

Weather advances, temperature shifts, flora grows or depletes, predators move. The world changes first — citizens must react to what it becomes.

Citizens Decide Their Action

Each LLM processes its context window: current location, hunger, nearby citizens, recent memories. It chooses: forage, explore, socialize, or rest.

Interactions & Speech Generation

Citizens within proximity range interact. Speech is generated via LLM inference — real-time, on CPU, constrained by the 512-token context window.

Lexicon Updates

Based on interaction outcomes, each citizen's personal lexicon updates. Sounds that successfully communicated meaning gain confidence. Failed communication loses it.

Survival Stats Update

Hunger, thirst, and health change based on actions taken. Citizens who didn't forage get hungrier. Citizens who were attacked lose health. Death is possible.

Births & Deaths Processed

New citizens may be born (inheriting traits from parents' archetypes). Citizens whose health reaches zero die and are removed from the simulation permanently.

World State Saved to Database

Every tick produces a complete snapshot: positions, lexicon states, interaction logs, science metrics, utterances. Saved to Neon Postgres for the dashboard to read.
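The seven phases above can be sketched as a single loop body. The phase names are illustrative stand-ins for the real engine's internals; what matters is the documented ordering, with the environment moving first and persistence last.

```python
# Sketch of one tick, in the documented order. Each phase is a stub;
# the optional trace makes the ordering observable.

PHASES = [
    "climate_update",
    "citizen_decisions",
    "interactions_and_speech",
    "lexicon_updates",
    "survival_stats",
    "births_and_deaths",
    "save_snapshot",
]

def run_tick(world: dict, trace=None) -> dict:
    for phase in PHASES:
        if trace is not None:
            trace.append(phase)
        # ... the real engine would dispatch to the phase implementation here
    world["tick"] = world.get("tick", 0) + 1
    return world
```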

"World 2 citizens think differently than World 1 citizens — because they carry the distilled experiences of World 1 in their weights. Every world that ends teaches the next world how to survive."

World Evolution

When a world ends, the tribe literally learns from its past life.

A world ends when population drops below threshold, a cataclysmic event occurs, or the world reaches a civilization milestone. What happens next is the part that makes WildMind genuinely different from any other AI experiment.

1
Evolution Engine Activates

Every interaction, every word invented, every relationship formed, every death — analyzed across the entire world's history to generate training signal.

2
Training Data Generated

New fine-tuning examples created from lived experiences: what worked, what failed, which sounds carried meaning, which behaviors led to survival.

3
Base LLM Fine-tuned

The 135M base model is updated via LoRA adapter training on the new dataset. This is additive — prior world knowledge is preserved, new knowledge layered on top.

4
Next World Begins

A new world generates, new citizens spawn from the evolved models. They remember nothing explicitly — but their instincts, patterns, and proto-linguistic tendencies are different. Smarter. More social. More language-ready.
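Step 2, distilling lived experience into fine-tuning pairs, might look like this in miniature. The event fields and prompt format are assumptions for illustration, not the real training schema.

```python
# Sketch: distilling a world's interaction log into fine-tuning pairs.
# Only successful communications become positive examples. The field
# names ("sound", "referent", "success") are illustrative.

def to_training_examples(log: list) -> list:
    examples = []
    for event in log:
        if not event["success"]:
            continue   # failed communication carries no positive signal here
        examples.append({
            "prompt": f"context: {event['referent']} nearby. speak:",
            "completion": f" {event['sound']}",
        })
    return examples
```

A real pipeline would likely also mine negative signal from failures and weight examples by survival outcomes; this sketch shows only the positive path.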

Voice

Every shared word gets a voice — powered by ElevenLabs.

When a proto-sound achieves "shared word" status — when 3+ citizens independently use it for the same meaning — it gets synthesized into actual audio. You can hear the tribe's language evolving in real time. Not text to speech. Voice to civilization.

Voice Synthesis Partner

WildMind uses the ElevenLabs API — the world's most expressive text-to-speech technology — to synthesize each proto-word with phonetic authenticity. Voice is selected based on citizen archetype and biological sex. The result is a language that sounds alien yet somehow human.

  • ▶ Triggered when a sound crosses the shared lexicon threshold
  • ▶ Voice selected based on citizen archetype and sex
  • ▶ Audio stored in Neon database with full metadata
  • ▶ Playable in the Proto-Sounds dashboard tab
elevenlabs.io
Example Proto-Words (With Audio)

  • kroh-tuh · Fire / Heat / Danger-of-flame · 7 speakers
  • gahn-neh · Danger / Predator nearby · 12 speakers
  • muh-brash · Greeting / I see you · 9 speakers
  • pah-skuh · Food / This is edible · 5 speakers

None of these words were programmed. They emerged from context and repetition in the living world — then given voice by ElevenLabs.

Infrastructure

Built entirely on Replit. Running continuously, right now.

WildMind runs on Replit — from the Python simulation engine to the dashboard you're reading. The world loop, LLM inference, and evolution engine run continuously in a Replit workspace. The dashboard is deployed via Replit's Cloud Run integration, giving it a production URL with zero-downtime deploys and containerized hosting.

Infrastructure Partner

Replit makes it possible to build, run, and deploy an experiment like this without managing infrastructure. The Python simulation runs in a persistent workspace. The dashboard deploys to Cloud Run with a single click. SSH access enables live debugging and hotfixes while the world runs.

  • ▶ Instant deployment from workspace to production
  • ▶ SSH access for live debugging and hotfixes
  • ▶ Cloud Run integration for scalable, containerized hosting
  • ▶ Zero infrastructure management — focus on the science
replit.com
Architecture Overview
Simulation Engine
Python · llama-cpp-python · Replit Workspace · Continuous loop
API Layer
FastAPI · /api/state · /api/snapshot · /api/lexicon · JSON
Dashboard
HTML/CSS/JS · Replit Cloud Run · Zero-dependency · Polls every 5s
Database
Neon Postgres · Serverless · Branching · pgvector for embeddings
Data & Database

Every tick. Every word. Every death. All stored in Neon.

All world state is stored in Neon — a serverless Postgres database with branching. Every tick produces data: interactions logged, lexicon states saved, citizen positions recorded, science metrics computed. The dashboard polls /api/state every 5 seconds to display live state.
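The per-tick snapshot write can be sketched with sqlite3 standing in for Neon Postgres, so the example is self-contained. The column set is illustrative, not the real state_snapshots schema.

```python
# Sketch of the per-tick snapshot write. sqlite3 (stdlib) stands in
# for Neon Postgres here; a JSON payload column is an assumption.
import json
import sqlite3

def save_snapshot(conn: sqlite3.Connection, tick: int, state: dict) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state_snapshots ("
        " tick INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO state_snapshots (tick, payload) VALUES (?, ?)",
        (tick, json.dumps(state)),
    )
    conn.commit()
```

With Postgres, the same pattern would use a parameterized INSERT through a driver such as psycopg; a JSONB column would let the dashboard query inside snapshots.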

Database
Neon Postgres
Serverless, branching, pgvector — the right database for living data
Poll Interval
5 seconds
Dashboard polling frequency for near-real-time state display
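The 5-second polling pattern is shown here in Python for consistency with the rest of the examples; the real dashboard polls from browser JavaScript. The fetcher and sleep function are injected so the loop can be exercised without a live server.

```python
# Sketch of the dashboard's polling loop: fetch /api/state every
# 5 seconds and hand the result to a display callback. The callables
# are injected; max_polls bounds the loop for demonstration.
import time

POLL_INTERVAL_S = 5.0

def poll_state(fetch, on_update, max_polls: int, sleep=time.sleep) -> None:
    for _ in range(max_polls):
        on_update(fetch("/api/state"))   # near-real-time, not push-based
        sleep(POLL_INTERVAL_S)
```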

Database Tables

citizens
interactions
lexicon_entries
shared_lexicon
memories
worlds
utterance_log
breeding_events
narratives
science_metrics
voice_clips
state_snapshots
Watch the experiment unfold.

The world is running right now. Language is either being born or struggling to exist.

Watch It Live Read the Science