Can evolution produce abstraction? A research hub.
Two research frontiers are converging: self-replicating programs emerge from chaos (BFF), and measuring intelligence requires abstraction (ARC). Can evolution produce abstraction? This hub explores that question.
In 2024, Google researchers showed that random programs placed in a "primordial soup" spontaneously evolve self-replicators about 40% of the time. No fitness function. No design. Just interaction and time.
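The dynamic is easy to sketch. Below is a toy soup loop in the spirit of BFF: tapes of random bytes are spliced pairwise, executed as self-modifying code with two heads and copy instructions, then split apart. The instruction set and parameters here are illustrative, not the paper's exact simulator.

```python
import random

# A minimal BFF-style "primordial soup" sketch. Each tape is both code
# and data; two heads (h0, h1) read and write the same tape.

def run(tape, max_steps=500):
    """Execute a tape as self-modifying code; mutates and returns it."""
    ip, h0, h1, n = 0, 0, 0, len(tape)
    for _ in range(max_steps):
        if ip >= n:
            break
        op = tape[ip]
        if op == ord('<'):   h0 = (h0 - 1) % n
        elif op == ord('>'): h0 = (h0 + 1) % n
        elif op == ord('{'): h1 = (h1 - 1) % n
        elif op == ord('}'): h1 = (h1 + 1) % n
        elif op == ord('+'): tape[h0] = (tape[h0] + 1) % 256
        elif op == ord('-'): tape[h0] = (tape[h0] - 1) % 256
        elif op == ord('.'): tape[h1] = tape[h0]   # copy head0 -> head1
        elif op == ord(','): tape[h0] = tape[h1]   # copy head1 -> head0
        elif op == ord('[') and tape[h0] == 0:     # skip to matching ]
            depth = 1
            while depth and ip < n - 1:
                ip += 1
                depth += (tape[ip] == ord('[')) - (tape[ip] == ord(']'))
        elif op == ord(']') and tape[h0] != 0:     # loop back to matching [
            depth = 1
            while depth and ip > 0:
                ip -= 1
                depth += (tape[ip] == ord(']')) - (tape[ip] == ord('['))
        ip += 1
    return tape

def soup_step(soup, tape_len):
    """One interaction: splice two random tapes, execute, split back."""
    i, j = random.sample(range(len(soup)), 2)
    joined = run(soup[i] + soup[j])
    soup[i], soup[j] = joined[:tape_len], joined[tape_len:]

# No fitness function, no design: just random bytes interacting over time.
random.seed(0)
TAPE_LEN = 64
soup = [bytearray(random.randbytes(TAPE_LEN)) for _ in range(128)]
for _ in range(1000):
    soup_step(soup, TAPE_LEN)
```

Detecting replicators (e.g. by compression-based complexity metrics, as the paper does) is a separate analysis step; the point here is only that the loop has no objective function at all.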
Meanwhile, François Chollet's ARC benchmark reveals that current AI, despite trillions of parameters, fails at the kind of abstract reasoning a child performs effortlessly. The benchmark tests fluid intelligence (the ability to solve novel problems without prior training, in contrast with crystallized, accumulated knowledge): can you learn new abstractions from just a few examples?
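Concretely, an ARC task gives a handful of input-to-output grid pairs and one test input. A toy illustration (a hypothetical task in the ARC format, not from the real dataset; `mirror` is just one candidate rule):

```python
# A toy ARC-style task: integers are colors, grids are lists of rows.
task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[0, 0], [2, 0]], "output": [[0, 0], [0, 2]]},
    ],
    "test": {"input": [[3, 0], [0, 4]]},
}

def mirror(grid):
    """Candidate rule: flip each row left-to-right."""
    return [row[::-1] for row in grid]

# "Fluid intelligence" here means confirming a rule from two examples
# only, then transferring it to an unseen input.
assert all(mirror(p["input"]) == p["output"] for p in task["train"])
print(mirror(task["test"]["input"]))   # -> [[0, 3], [4, 0]]
```

Real ARC tasks use larger grids and far subtler rules; the few-shot structure is the same.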
This creates a fascinating gap:
Emergence (BFF): self-copying programs, autocatalysis, replication without design. But systems plateau: complexity stops growing. No abstraction, no learning, just copying.
Intelligence (ARC): transfer learning, novel problem-solving, core knowledge priors, abstraction from examples. Current deep learning fails; program synthesis helps but isn't enough.
Replication is necessary but not sufficient for intelligence. The gap between self-copying programs and abstraction-capable systems is the central mystery. Can evolutionary dynamics bridge it?
The ARC Prize 2025 winner was a 7-million-parameter recursive model that outperformed models with 100,000× more parameters. Size isn't the answer. Architecture matters. Perhaps how you learn matters more than how much you've seen.
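Why can a tiny recursive model punch so far above its weight? One intuition, in miniature (this is Newton's method as an analogy, not the actual TRM architecture): a small fixed update rule, applied repeatedly to its own output, solves what no single application could.

```python
def refine(answer, x):
    """One small refinement step (here: Newton's update for sqrt(x))."""
    return 0.5 * (answer + x / answer)

def recursive_solve(x, steps=20):
    """Apply the same tiny rule to its own draft answer, repeatedly."""
    answer = 1.0                 # crude initial draft
    for _ in range(steps):
        answer = refine(answer, x)
    return answer

print(recursive_solve(2.0))      # converges toward sqrt(2)
```

TRM works in the same spirit: a small network iterated in a refinement loop over its own draft answers, rather than one enormous forward pass.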
 EMERGENCE                                     INTELLIGENCE
     │                                               │
     ▼                                               ▼
┌──────────┐   ┌──────────┐   ┌─────────────┐   ┌──────────┐
│  Random  │──▶│  Self-   │──▶│ Complexity  │╌╌▶│ Abstrac- │
│ Programs │   │ Replica- │   │   Growth    │   │   tion   │
│  (BFF)   │   │   tors   │   │ (gap here)  │   │  (ARC)   │
└──────────┘   └──────────┘   └─────────────┘   └──────────┘
     │             │                │                │
   chaos     autocatalysis      plateau?         transfer
                                                 learning
The dashed arrow is where research is needed: what happens between replication and reasoning?
Random BFF programs in a primordial soup spontaneously evolve self-replicators. The foundation of emergence research.
Intelligence is skill-acquisition efficiency; introduces the ARC benchmark. The foundation of abstraction research.
Test-time training, refinement loops, and program synthesis emerge as key techniques. The current frontier.
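Program synthesis, the last of those techniques, can be sketched as enumerative search over a small domain-specific language: try compositions of primitives until one reproduces every training pair. The primitives and search below are illustrative, not any prize entry's actual DSL.

```python
from itertools import product

# Tiny grid DSL: each primitive maps a grid to a grid.
PRIMITIVES = {
    "identity":  lambda g: g,
    "flip_lr":   lambda g: [row[::-1] for row in g],
    "flip_ud":   lambda g: g[::-1],
    "transpose": lambda g: [list(r) for r in zip(*g)],
}

def synthesize(train_pairs, max_len=2):
    """Return the first primitive sequence consistent with all examples."""
    for length in range(1, max_len + 1):
        for names in product(PRIMITIVES, repeat=length):
            def program(g, names=names):
                for name in names:          # apply primitives in order
                    g = PRIMITIVES[name](g)
                return g
            if all(program(i) == o for i, o in train_pairs):
                return names, program
    return None, None

# Hidden rule: rotate 90 degrees clockwise.
train = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
names, program = synthesize(train)
print(names)   # first primitive sequence consistent with the example
```

With only one example, several sequences may fit; real systems rank candidates (by length, by priors, or with a learned guide) and verify against every demonstration pair, which is exactly where test-time compute gets spent.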
Understanding how intelligence works requires understanding where it fails. Esoteric programming languages expose the limits of current AI through minimal, Turing-complete systems that challenge different cognitive capabilities.
BFF extends Brainfuck's eight commands into computational life. Whitespace makes syntax invisible. Unlambda removes variables entirely. Befunge makes execution spatial. Each language isolates a different dimension of computation, and reveals where statistical learning cannot substitute for genuine understanding.
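The minimality is real: Brainfuck's entire semantics, all eight commands, fits in a few dozen lines of interpreter, which is what makes the family such a clean probe.

```python
def brainfuck(code, inp=""):
    """Interpret Brainfuck: commands > < + - . , [ ] ; all else ignored."""
    tape, ptr, out = [0] * 30000, 0, []
    jumps, stack = {}, []
    for i, c in enumerate(code):          # precompute bracket matches
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    ip, it = 0, iter(inp)
    while ip < len(code):
        c = code[ip]
        if c == '>':   ptr += 1
        elif c == '<': ptr -= 1
        elif c == '+': tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '.': out.append(chr(tape[ptr]))
        elif c == ',': tape[ptr] = ord(next(it, '\0'))
        elif c == '[' and tape[ptr] == 0: ip = jumps[ip]
        elif c == ']' and tape[ptr] != 0: ip = jumps[ip]
        ip += 1
    return "".join(out)

# 8 increments, times 9 in a loop, gives cell value 72 = 'H'.
print(brainfuck("++++++++[>+++++++++<-]>."))   # prints 'H'
```

Nothing here is statistically learnable shortcut material: solving a task in this language requires tracking exact state, which is precisely the capability the esolang benchmarks isolate.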
This research question has been asked for 30+ years. Most ALife systems plateau. Replication ≠ learning. The burden of novelty is high. See the honest critique.