The technical architecture behind the world's first living AI civilization experiment — 30 independent language models, a 50km living world, emergent language, and evolutionary learning across generations.
WildMind strips 30 AI minds of language, culture, and history. It gives them only survival instincts and the ability to produce sound — then sets them loose in a world that will kill them if they don't cooperate. What happens next is unscripted, unguided, and genuinely unknown.
The question we're trying to answer: if you gave intelligence the tools of life but none of its accumulated knowledge — no words, no culture, no history — would it find them again?
Each citizen is an independent language model — a fine-tuned version of a 135M-parameter base LLM. Eight archetypes, each with a distinct personality matrix encoded through LoRA fine-tuning. They don't share weights. They don't communicate through any channel except simulated speech. When citizen_07 says "bzaavk", citizen_12 genuinely doesn't know what it means — unless they've met before.
The world has climate — seasons, temperature, rainfall, wind. It has biomes: tropical forest, savanna, desert, tundra, ice cap, coastal. Flora grows and depletes. Fauna hunts and flees. Citizens have real hunger, thirst, and health. They die. The environment is not decorative. It is the pressure that forces language into existence.
[World map: biomes arranged along a latitude gradient from north pole to south pole]
Citizens start with zero shared language. They produce proto-sounds — phonetic tokens generated by their LLM during interactions. When the same sound is independently used by 3+ citizens for the same referent, it becomes a "shared word" — the beginning of language. No one programmed this. It happens, or it doesn't.
1. A citizen's LLM produces a phonetic token during an interaction. Context (fire, hunger, another citizen) shapes the sound.
2. Each citizen maintains sound → meaning mappings with confidence scores. Repeated use increases confidence.
3. When citizens interact, they probabilistically learn from each other. Successful communication reinforces shared meaning.
4. When 3+ citizens independently use the same sound for the same referent, it becomes a shared word. Language exists.
5. The word gets a synthesized voice, and linguistic metrics are computed: Zipf's law coefficient, Heaps' law beta, network clustering.
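The promotion rule above can be sketched in a few lines. This is a toy model, not WildMind's implementation: the `Lexicon` class, the 0.05 confidence step, and the promotion scan are illustrative assumptions; only the 3-citizen threshold comes from the text.

```python
from collections import defaultdict

SHARED_THRESHOLD = 3  # citizens needed before a sound counts as a shared word

class Lexicon:
    """Per-citizen sound -> (referent, confidence) map (illustrative sketch)."""
    def __init__(self):
        self.entries = {}  # sound -> {"referent": str, "confidence": float}

    def reinforce(self, sound, referent, success, step=0.05):
        e = self.entries.setdefault(sound, {"referent": referent, "confidence": 0.0})
        # Successful communication raises confidence; failure lowers it.
        delta = step if success else -step
        e["confidence"] = min(1.0, max(0.0, e["confidence"] + delta))

def shared_words(lexicons):
    """Return (sound, referent) pairs used by >= SHARED_THRESHOLD citizens."""
    users = defaultdict(set)  # (sound, referent) -> ids of citizens using it
    for citizen_id, lex in lexicons.items():
        for sound, e in lex.entries.items():
            if e["confidence"] > 0:
                users[(sound, e["referent"])].add(citizen_id)
    return {pair for pair, ids in users.items() if len(ids) >= SHARED_THRESHOLD}
```

Three citizens independently reinforcing "bzaavk" for fire would promote it; a lone citizen using the same sound for water would not.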
The simulation advances in discrete ticks. Each tick, the entire world updates: environment, citizen decisions, speech generation, lexicon evolution, survival, births, deaths. Everything you see on the dashboard is the result of this relentless, autonomous loop.
Weather advances, temperature shifts, flora grows or depletes, predators move. The world changes first — citizens must react to what it becomes.
Each LLM processes its context window: current location, hunger, nearby citizens, recent memories. It chooses: forage, explore, socialize, or rest.
Citizens within proximity range interact. Speech is generated via LLM inference — real-time, on CPU, constrained by the 512-token context window.
Based on interaction outcomes, each citizen's personal lexicon updates. Sounds that successfully communicated meaning gain confidence. Failed communication loses it.
Hunger, thirst, and health change based on actions taken. Citizens who didn't forage get hungrier. Citizens who were attacked lose health. Death is possible.
New citizens may be born (inheriting traits from parents' archetypes). Citizens whose health reaches zero die and are removed from the simulation permanently.
Every tick produces a complete snapshot: positions, lexicon states, interaction logs, science metrics, utterances. Saved to Neon Postgres for the dashboard to read.
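The tick sequence above can be condensed into a single loop. Everything below is a toy stand-in (dict-based citizens, a random sound in place of LLM inference, hard-coded hunger rates); the point is only the ordering of the phases: environment first, then decisions, speech, survival, deaths, snapshot.

```python
import random

def run_tick(world, citizens):
    """One simulation tick in the order described above (illustrative sketch)."""
    # 1. Environment updates first; citizens react to what the world becomes.
    world["tick"] += 1
    world["food"] = max(0.0, world["food"] - 0.01 * len(citizens))

    # 2. Each citizen decides (the real system runs LLM inference here).
    for c in citizens:
        c["action"] = "forage" if c["hunger"] > 0.5 else random.choice(
            ["explore", "socialize", "rest"])

    # 3-4. Nearby citizens interact; outcomes adjust lexicon confidence.
    talkers = [c for c in citizens if c["action"] == "socialize"]
    for a, b in zip(talkers[::2], talkers[1::2]):
        sound = random.choice(["bza", "avk", "tuu"])  # stand-in for LLM speech
        success = sound in b["lexicon"]
        a["lexicon"][sound] = a["lexicon"].get(sound, 0.0) + (0.05 if success else -0.05)

    # 5. Survival: foragers eat if food remains; everyone else gets hungrier.
    for c in citizens:
        if c["action"] == "forage" and world["food"] > 0:
            c["hunger"] = max(0.0, c["hunger"] - 0.2)
        else:
            c["hunger"] = min(1.0, c["hunger"] + 0.1)
        if c["hunger"] >= 1.0:
            c["health"] -= 0.25  # starvation erodes health

    # 6. Deaths are permanent; births would be appended here.
    citizens = [c for c in citizens if c["health"] > 0]

    # 7. Snapshot for the dashboard (the real system writes to Postgres).
    snapshot = {"tick": world["tick"], "population": len(citizens)}
    return citizens, snapshot
```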
"World 2 citizens think differently than World 1 citizens — because they carry the distilled experiences of World 1 in their weights. Every world that ends teaches the next world how to survive."
A world ends when population drops below threshold, a cataclysmic event occurs, or the world reaches a civilization milestone. What happens next is the part that makes WildMind genuinely different from any other AI experiment.
Every interaction, every word invented, every relationship formed, every death — analyzed across the entire world's history to generate training signal.
New fine-tuning examples created from lived experiences: what worked, what failed, which sounds carried meaning, which behaviors led to survival.
The 135M base model is updated via LoRA adapter training on the new dataset. This is additive — prior world knowledge is preserved, new knowledge layered on top.
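Why is this additive? A LoRA adapter never overwrites the base weights; it learns a small low-rank delta that is added on top. A toy numpy sketch (not the actual training code — matrix sizes and scale are illustrative):

```python
import numpy as np

def merge_lora(w_base, A, B, scale=1.0):
    """Effective weight = frozen base + scaled low-rank delta: W + s * (B @ A).
    The base matrix is untouched, so knowledge from earlier worlds is
    preserved while the new world's lessons live entirely in the small
    A and B factors."""
    return w_base + scale * (B @ A)
```

Each generation trains a fresh `A`/`B` pair on the new world's dataset; merging yields the evolved model without destroying what came before.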
A new world generates, new citizens spawn from the evolved models. They remember nothing explicitly — but their instincts, patterns, and proto-linguistic tendencies are different. Smarter. More social. More language-ready.
When a proto-sound achieves "shared word" status — when 3+ citizens independently use it for the same meaning — it gets synthesized into actual audio. You can hear the tribe's language evolving in real time. Not text to speech. Voice to civilization.
WildMind uses the ElevenLabs API — the world's most expressive text-to-speech technology — to synthesize each proto-word with phonetic authenticity. Voice is selected based on citizen archetype and biological sex. The result is a language that sounds alien yet somehow human.
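The synthesis call can be sketched as a plain HTTP request. The endpoint path, `xi-api-key` header, and JSON body shape follow the public ElevenLabs REST API, but the voice and model IDs below are placeholders, and mapping archetype/sex to a voice is WildMind-specific logic not shown here.

```python
import json
import urllib.request

ELEVEN_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

def build_tts_request(word, voice_id, api_key):
    """Build the POST request that asks ElevenLabs to voice one proto-word.
    Sending it (urllib.request.urlopen) returns audio bytes."""
    body = json.dumps({
        "text": word,                          # the shared proto-word, e.g. "bzaavk"
        "model_id": "eleven_multilingual_v2",  # placeholder model choice
    }).encode()
    return urllib.request.Request(
        ELEVEN_TTS_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```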
None of these words were programmed. They emerged from context and repetition in the living world — then given voice by ElevenLabs.
WildMind runs on Replit — from the Python simulation engine to the dashboard you're reading. The world loop, LLM inference, and evolution engine run continuously in a Replit workspace. The dashboard is deployed via Replit's Cloud Run integration, giving it a production URL with zero-downtime deploys and containerized hosting.
Replit makes it possible to build, run, and deploy an experiment like this without managing infrastructure. The Python simulation runs in a persistent workspace. The dashboard deploys to Cloud Run with a single click. SSH access enables live debugging and hotfixes while the world runs.
All world state is stored in Neon — a serverless Postgres database with branching. Every tick produces data: interactions logged, lexicon states saved, citizen positions recorded, science metrics computed. The dashboard polls /api/state every 5 seconds to display live state.
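The dashboard's read side is a simple polling loop. A minimal sketch, assuming a JSON payload per snapshot; the URL is hypothetical and the fetch/sleep functions are injected so the loop stays testable. Only the 5-second interval comes from the text.

```python
import json
import time
import urllib.request

def fetch_state(url="https://example.com/api/state"):  # hypothetical dashboard URL
    """Fetch one /api/state payload as raw JSON bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def poll(fetch, ticks=3, interval=5.0, sleep=time.sleep):
    """Collect `ticks` consecutive snapshots, pausing `interval` seconds
    between requests, mirroring the dashboard's 5-second refresh."""
    snapshots = []
    for i in range(ticks):
        snapshots.append(json.loads(fetch()))
        if i < ticks - 1:
            sleep(interval)
    return snapshots
```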
The world is running right now. Language is either being born or struggling to exist.