RESEARCH DIRECTION

Intrinsic Motivation & Open-Ended Learning

External reward is too sparse a teacher for open-ended growth. Following a long line of work on artificial curiosity, we build agents whose intrinsic objective is learning progress itself — systems that are drawn to exactly the experiences that improve their world model fastest, and that grow bored, productively, with what they have already mastered.

Curiosity as compression progress

A principled formulation of interestingness: an experience is valuable to the degree that it improves the model's compression of the world. Noise is incompressible, and already-mastered structure is already compressed, so neither yields progress; the reward concentrates on what is learnable but not yet learned. This turns exploration from a heuristic into a matter of economics: each experience is priced by the compression gain it buys.
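To make the idea concrete, here is a minimal sketch of a compression-progress reward, not a description of any particular system. It assumes a toy adaptive bigram counter as the "world model"; the names `BigramCoder` and `compression_progress` are hypothetical and chosen only for illustration. The reward is the number of bits saved on the same experience after the model has learned from it.

```python
import math
import random
from collections import defaultdict


class BigramCoder:
    """Toy adaptive bigram model with add-one smoothing (a stand-in world model)."""

    def __init__(self, alphabet):
        self.alphabet_size = len(set(alphabet))
        self.counts = defaultdict(lambda: defaultdict(int))

    def code_length(self, seq):
        """Bits needed to encode the transitions in seq under the current counts."""
        bits = 0.0
        for prev, cur in zip(seq, seq[1:]):
            row = self.counts[prev]
            total = sum(row.values()) + self.alphabet_size
            bits -= math.log2((row.get(cur, 0) + 1) / total)
        return bits

    def update(self, seq):
        """Learn from the experience by counting its transitions."""
        for prev, cur in zip(seq, seq[1:]):
            self.counts[prev][cur] += 1


def compression_progress(model, seq):
    """Intrinsic reward: bits saved on seq by learning from seq."""
    before = model.code_length(seq)
    model.update(seq)
    after = model.code_length(seq)
    return before - after


# Learnable structure: a strictly alternating sequence.
pattern = "ab" * 50
model = BigramCoder("ab")
first = compression_progress(model, pattern)   # large: the structure is being learned
second = compression_progress(model, pattern)  # small: the structure is mostly mastered

# Noise: seeded coin flips, scored by a fresh model.
rng = random.Random(0)
noise = "".join(rng.choice("ab") for _ in range(100))
noise_gain = compression_progress(BigramCoder("ab"), noise)

print(f"pattern, first pass:  {first:.2f} bits")
print(f"pattern, second pass: {second:.2f} bits")
print(f"noise, first pass:    {noise_gain:.2f} bits")
```

Because this sketch scores progress on the same data it just trained on, even noise earns a small fitting gain; measuring the saving on held-out experience would be a stricter variant.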

Self-generated curricula

An agent steered by learning progress orders its own syllabus: easy structure first, harder structure as competence grows, each acquired skill widening the set of reachable next skills. We study how to make this compounding loop stable — how banked capabilities become the substrate that makes the next capability cheaper.
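The compounding loop can be illustrated with a toy simulation. This is a sketch under strong assumptions: skills form a prerequisite graph, practice closes a fixed fraction of the remaining competence gap, and the agent always practises the reachable skill with the highest last measured progress. The function `run_curriculum` and the skill names are hypothetical.

```python
def run_curriculum(prereqs, rate=0.5, threshold=0.9, max_steps=1000):
    """Return the order in which skills are mastered.

    prereqs maps each skill to the set of skills that must be mastered first.
    """
    competence = {skill: 0.0 for skill in prereqs}
    # Optimistic start: an untried skill looks maximally promising, so it gets probed once.
    last_progress = {skill: float("inf") for skill in prereqs}
    mastered, order = set(), []

    for _ in range(max_steps):
        # The frontier is every unmastered skill whose prerequisites are all banked.
        frontier = [s for s in prereqs if s not in mastered and prereqs[s] <= mastered]
        if not frontier:
            break
        skill = max(frontier, key=lambda s: last_progress[s])
        # Assumed learning dynamics: each practice step closes a fixed share of the gap.
        gain = rate * (1.0 - competence[skill])
        competence[skill] += gain
        last_progress[skill] = gain
        if competence[skill] >= threshold:
            mastered.add(skill)
            order.append(skill)
    return order


# Hypothetical skill graph: "stack" needs "lift", which needs "grasp".
skills = {
    "grasp": set(),
    "lift": {"grasp"},
    "stack": {"lift"},
    "push": set(),
}
order = run_curriculum(skills)
print(order)
```

In this toy setting, mastering a skill is what moves its dependents onto the frontier, so the syllabus emerges from the progress signal rather than being written in advance.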

The allocation problem

Compute spent re-confirming what the system knows is compute not spent at the frontier. We treat attention, experimentation, and training budget as a portfolio-allocation problem driven by expected learning progress — measured, not assumed — with saturation detected and treated as a signal to move on.
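One simple way to picture this allocation is sketched below. It assumes each domain keeps a history of held-out losses, estimates progress as the drop in mean loss between two adjacent windows, and splits the budget in proportion to that measured progress. The helpers `measured_progress` and `allocate`, the threshold `saturation_eps`, and the domain names are all hypothetical.

```python
def measured_progress(losses, window=3):
    """Drop in mean held-out loss between the previous window and the latest one.

    Returns None when there is not yet enough history to measure anything.
    """
    if len(losses) < 2 * window:
        return None
    older = losses[-2 * window:-window]
    recent = losses[-window:]
    return sum(older) / window - sum(recent) / window


def allocate(budget, histories, window=3, saturation_eps=0.01):
    """Split budget across domains in proportion to measured learning progress.

    Domains whose progress is at or below saturation_eps are flagged as saturated
    and receive nothing. Domains with too little history also receive nothing in
    this sketch; a real system would need an explicit probing rule for them.
    """
    progress = {d: measured_progress(h, window) for d, h in histories.items()}
    active = {d: p for d, p in progress.items() if p is not None and p > saturation_eps}
    saturated = [d for d, p in progress.items() if p is not None and p <= saturation_eps]
    total = sum(active.values())
    shares = {d: (budget * active[d] / total if d in active else 0.0) for d in progress}
    return shares, saturated


# Hypothetical held-out loss histories for three domains.
histories = {
    "arithmetic": [0.30, 0.30, 0.30, 0.30, 0.30, 0.30],  # flat: nothing left to learn here
    "algebra": [1.0, 0.9, 0.8, 0.7, 0.6, 0.5],           # steady improvement
    "geometry": [2.0, 1.9, 1.9, 1.8, 1.8, 1.8],          # slower improvement
}
shares, saturated = allocate(100.0, histories)
print(shares)
print("saturated:", saturated)
```

Proportional allocation is only one possible policy; the point of the sketch is that the budget follows measured progress and that a flat loss curve frees its share for the frontier.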

WORKING PRINCIPLES

How we hold this work to account.

Reward understanding itself

Learning progress is the primary objective, not a bonus term.

Seek your own frontier

The best next problem is just beyond current competence.

Boredom is information

Saturation on a domain is the signal to widen, not to drill.
