WILDMIND
01

Overview Dashboard

Key metrics from the WildMind simulation at a glance.


02

Zipf's Law Analysis

The rank-frequency distribution of sounds in the emergent language. Natural languages follow Zipf's Law with slope near -1.0 on a log-log plot.


Log-Log Rank vs Frequency
Zipf Coefficient Over Time
03

Heaps' Law Analysis

Vocabulary growth as a function of corpus size. Natural languages typically exhibit beta between 0.4 and 0.6.


Unique Types vs Total Tokens
Heaps' Beta Over Time
04

Network Topology

Social network structure from citizen interactions. Compares clustering and path lengths to random graph baselines.


Degree Distribution
Structured vs Random
05

Vocabulary Growth

Shared vocabulary size over simulation ticks, with best-fit growth model.


Shared Vocabulary Over Time
Population & Vocab (Snapshots)
06

Influence & Cascades

How vocabulary spreads through the population. Linguistic influence ranking and information cascade tracking.


Linguistic Influence Ranking
07

Language Dictionary

Complete lexicon data: shared community words and personal citizen vocabularies.


SHARED LEXICON

Vocabulary Size Per Citizen

PERSONAL LEXICON (ALL CITIZENS)

08

Phoneme Analysis

Character-level frequency analysis of all lexicon sounds, color-coded by vowel/consonant. Radar chart shows archetype phoneme profiles.


Phoneme (Character) Frequencies
Archetype Phoneme Profiles
09

Communication Efficiency

Success rate and utterance length trends, plus context type breakdown.


Success Rate & Utterance Length Over Time
Context Type Distribution
10

Social Network

Relationship graph data: bond types, strengths, and distribution.


Relationship Type Distribution
Relationship Score Histogram
11

Survival & Population

Death causes, age distributions, and population trends.


Death Causes
Age at Death Distribution
Population Over Time
12

World History

Comparison of all simulation worlds.


Cross-World Comparison
13

Training History

Additive training runs per citizen: dataset composition across versions.


Training Dataset Composition
14

Methodology

How the simulation, language metrics, and network analysis work.


Simulation Architecture

Each citizen is an independent AI model (a fine-tuned LLM) that perceives its environment, makes decisions, and generates vocalizations. Citizens live in a simulated wilderness with latitude-based climate, biomes, predators, prey, and survival mechanics. Language emerges through repeated contextual interactions: when two citizens encounter the same stimulus and both vocalize, the repeated co-occurrence of sound and context gradually binds the sound to a meaning.
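As a rough illustration of the co-occurrence mechanism, here is a toy sketch. The `LexiconBinder` class, its threshold, and the counting scheme are invented for illustration; they are not the simulation's actual implementation.

```python
from collections import Counter

class LexiconBinder:
    """Toy model of sound-context binding: once a sound has co-occurred
    with the same context enough times, treat it as meaning that context.
    Threshold and counting scheme are illustrative assumptions."""

    def __init__(self, threshold=3):
        self.counts = Counter()       # (sound, context) -> co-occurrence count
        self.threshold = threshold
        self.meanings = {}            # sound -> bound context

    def observe(self, sound, context):
        self.counts[(sound, context)] += 1
        if self.counts[(sound, context)] >= self.threshold:
            # First context to cross the threshold wins the binding
            self.meanings.setdefault(sound, context)

binder = LexiconBinder()
for _ in range(3):
    binder.observe("kraa", "predator")
print(binder.meanings)  # -> {'kraa': 'predator'}
```

In the real simulation the "binding" lives in each citizen's fine-tuned weights rather than an explicit table; this table form only makes the co-occurrence logic visible.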

Zipf's Law Computation

We count the frequency of each unique sound in the full utterance log and rank sounds by frequency. On a log-log plot, Zipf's Law predicts a linear relationship with slope approximately -1.0; we estimate the coefficient by ordinary least squares regression of log(frequency) on log(rank). Natural human languages typically show coefficients between -0.8 and -1.2.
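The fit described above can be sketched as follows. The `zipf_coefficient` helper and the synthetic corpus are illustrative; the real pipeline runs over the utterance log.

```python
import numpy as np
from collections import Counter

def zipf_coefficient(sounds):
    """OLS slope of log(frequency) against log(rank) for a token list."""
    freqs = sorted(Counter(sounds).values(), reverse=True)
    ranks = np.arange(1, len(freqs) + 1)
    # Zipfian data gives a slope near -1.0 on the log-log plot
    slope, _intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return float(slope)

# Synthetic corpus: sound i appears roughly 1000/i times (ideal Zipf shape)
corpus = [f"s{i}" for i in range(1, 51) for _ in range(1000 // i)]
print(zipf_coefficient(corpus))  # close to -1.0
```

At the small token counts noted in the limitations section, this slope estimate is noisy; the dashboard's over-time chart is one way to see whether it stabilizes.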

Heaps' Law Computation

Heaps' Law models vocabulary size as a function of corpus length: V(n) = K * n^beta, where V(n) is the number of unique types after n tokens. Beta values between 0.4 and 0.6 indicate natural-language-like sublinear growth; values near 1.0 mean almost every token is novel, and values near 0.0 mean the vocabulary saturated early.
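A minimal sketch of the fit, assuming a log-log OLS regression like the Zipf computation (the `heaps_fit` helper and the synthetic token stream are illustrative):

```python
import numpy as np

def heaps_fit(tokens):
    """Fit V(n) = K * n**beta by OLS on log V(n) against log n."""
    seen, growth = set(), []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))          # running vocabulary size V(n)
    n = np.arange(1, len(tokens) + 1)
    beta, log_k = np.polyfit(np.log(n), np.log(growth), 1)
    return float(np.exp(log_k)), float(beta)

# Synthetic stream whose vocabulary grows like sqrt(n), i.e. beta ~ 0.5
tokens = [f"w{int(i ** 0.5)}" for i in range(1, 10001)]
k, beta = heaps_fit(tokens)
print(beta)  # near 0.5, inside the natural-language band
```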

Network Analysis

The social network is constructed from the interaction history. Edges are weighted by interaction frequency and relationship score. We compute the clustering coefficient, average path length, and degree distribution, and compare them to an Erdős–Rényi random graph with matching parameters.
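A pure-Python sketch of the two headline metrics on an unweighted toy graph (the adjacency data is invented; the actual pipeline works over the weighted interaction graph):

```python
from collections import deque
from itertools import combinations

def avg_clustering(adj):
    """Mean local clustering coefficient over all nodes."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among v's neighbours (closed triangles through v)
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def avg_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(d for node, d in dist.items() if node != src)
        pairs += len(dist) - 1
    return total / pairs

# Toy interaction graph: two triangles bridged by a single edge
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = {i: set() for i in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
print(avg_clustering(adj), avg_path_length(adj))
```

Running the same two functions on an Erdős–Rényi graph with the same node and edge counts gives the random baseline the comparison chart refers to: high clustering with short paths relative to that baseline is the usual small-world signature.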

PageRank / Influence

Linguistic influence is ranked with a modified PageRank over the vocabulary transfer graph. An edge from A to B exists if B learned a sound that A established. Citizens with high PageRank act as vocabulary hubs.
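The modification is not specified here, so as a baseline, standard power-iteration PageRank looks like this. The toy edge list is invented, and it is oriented learner -> teacher so that originators accumulate rank (the opposite of the A-to-B orientation above, which is one plausible way to "modify" the algorithm for influence ranking):

```python
def pagerank(edges, damping=0.85, iters=100):
    """Standard power-iteration PageRank over a directed edge list."""
    nodes = {n for edge in edges for n in edge}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, targets in out.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
            else:
                # Dangling node: spread its mass uniformly
                for n in nodes:
                    nxt[n] += damping * rank[src] / len(nodes)
        rank = nxt
    return rank

# Toy transfer graph: b, c, d all learned sounds that "a" established
edges = [("b", "a"), ("c", "a"), ("d", "a"), ("a", "b")]
ranks = pagerank(edges)
print(max(ranks, key=ranks.get))  # -> 'a', the vocabulary hub
```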

Known Limitations

Small population (6-15 citizens) limits statistical power. The LLM substrate introduces biases. Phoneme analysis operates at character level. Zipf and Heaps fits may be noisy at low token counts. Cross-world comparisons are confounded by parameter changes.

WildMind: Emergent Language in Self-Teaching AI Populations. 2026. Available at: https://cosmic-piroshki-aeb441.netlify.app
15

API Documentation

Public endpoints for programmatic access to simulation data.


Base URL

https://cosmic-piroshki-aeb441.netlify.app

GET /api/state

Full simulation state. Primary endpoint used by this page.

curl -s https://cosmic-piroshki-aeb441.netlify.app/api/state | jq '.shared_lexicon[:3]'

GET /api/api?endpoint=meta

API metadata and available endpoints.

curl -s "https://cosmic-piroshki-aeb441.netlify.app/api/api?endpoint=meta" | jq .

GET /api/api?endpoint=citizens

All living citizens, with positions, mood, energy, personality, and vocabulary.

curl -s "https://cosmic-piroshki-aeb441.netlify.app/api/api?endpoint=citizens"

GET /api/api?endpoint=lexicon

Shared and personal lexicons.

curl -s "https://cosmic-piroshki-aeb441.netlify.app/api/api?endpoint=lexicon"

GET /api/api?endpoint=interactions&limit=50&offset=0

Paginated interaction history. Max 500 per page.
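A sketch of walking the paginated endpoint with the standard library. The response key `interactions` and the stop-on-short-page condition are assumptions about the payload shape, not documented guarantees:

```python
import json
from urllib.request import urlopen

BASE = "https://cosmic-piroshki-aeb441.netlify.app"

def fetch_json(url):
    """Fetch a URL and parse the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)

def iter_interactions(fetch=fetch_json, limit=500, key="interactions"):
    """Yield interaction records page by page, stopping at the first
    short page. `key` is an assumed field name in the JSON payload."""
    offset = 0
    while True:
        page = fetch(f"{BASE}/api/api?endpoint=interactions"
                     f"&limit={limit}&offset={offset}")
        items = page.get(key, [])
        yield from items
        if len(items) < limit:
            return
        offset += limit

# Live usage (network call, uncomment to run):
# for record in iter_interactions(limit=500):
#     print(record)
```

Injecting `fetch` keeps the pagination logic testable without hitting the network.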

GET /api/api?endpoint=science

Latest science metrics: Zipf, Heaps, network, cascades, growth, efficiency, semantic fields.

GET /api/api?endpoint=utterances&limit=50&citizen_id=ID

Utterance log. Optionally filter by citizen. Max 500.

GET /api/api?endpoint=history&table=TABLE&limit=50

Historical data. Allowed tables: science_metrics, state_snapshots, utterance_log.

Response Format

All endpoints return JSON with CORS headers. Error responses include an error field.

16

Data Export

Download current simulation data for offline analysis.


Generated from current API response. Refresh for latest.