RESEARCH DIRECTION

Intrinsic Motivation & Open-Ended Learning

External reward is too sparse a teacher for open-ended growth. Following a long line of work on artificial curiosity, we build agents whose intrinsic objective is learning progress itself — systems that are drawn to exactly the experiences that improve their world model fastest, and that grow bored, productively, with what they have already mastered.

Curiosity as compression progress

A principled formulation of interestingness: an experience is valuable to the degree that it improves the model's compression of the world. Noise is incompressible, so it never yields an improvement; already-mastered structure is already compressed, so it yields none either. The reward gradient therefore points at what is learnable but not yet learned. This turns exploration from a heuristic into a question of economics: where does the next unit of experience buy the most compression?
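To make the idea concrete, here is a minimal sketch, not a description of any production system. It assumes a deliberately tiny world model (`CountModel`, a hypothetical Laplace-smoothed symbol-frequency estimator) and defines the intrinsic reward as the number of bits saved on a batch by learning from it. All names and the toy data streams are illustrative assumptions.

```python
import math
import random
from collections import Counter


class CountModel:
    """Hypothetical toy world model: Laplace-smoothed symbol frequencies."""

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = Counter()
        self.total = 0

    def code_length(self, data):
        """Bits needed to encode `data` under the current model."""
        k = len(self.alphabet)
        return sum(
            -math.log2((self.counts[s] + 1) / (self.total + k)) for s in data
        )

    def update(self, data):
        """Learn from `data` by accumulating symbol counts."""
        self.counts.update(data)
        self.total += len(data)


def compression_progress(model, batch):
    """Intrinsic reward: bits saved on `batch` by learning from it."""
    before = model.code_length(batch)
    model.update(batch)
    after = model.code_length(batch)
    return before - after


# Learnable structure: a stream that is 90% "a" (assumed toy data).
structured = (["a"] * 9 + ["b"]) * 20
model = CountModel("ab")
r_new = compression_progress(model, structured)       # large: structure is being learned
r_mastered = compression_progress(model, structured)  # near zero: nothing left to gain

# Noise: fair coin flips, which a frequency model cannot compress.
rng = random.Random(0)
noise = [rng.choice("ab") for _ in range(200)]
r_noise = compression_progress(CountModel("ab"), noise)  # small compared with r_new
```

One caveat of this sketch: progress is measured on the same batch the model just learned from, so a small positive reward on noise reflects overfitting to that batch. Measuring on held-out data would be one way to tighten it.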

Self-generated curricula

An agent steered by learning progress orders its own syllabus: easy structure first, harder structure as competence grows, each acquired skill widening the set of reachable next skills. We study how to make this compounding loop stable — how banked capabilities become the substrate that makes the next capability cheaper.
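One simple way to picture a self-ordered syllabus is a sampler that favors tasks whose recent scores are changing fastest. The sketch below is an illustration under stated assumptions, not the method used in this work: `ProgressCurriculum`, its `window` and `eps` parameters, and the three task names are all hypothetical. Learning progress is approximated as the absolute difference between the mean score of the newer and the older half of a sliding window.

```python
import random
from collections import defaultdict, deque


class ProgressCurriculum:
    """Hypothetical sampler: pick tasks in proportion to recent learning progress."""

    def __init__(self, tasks, window=5, eps=0.05, seed=0):
        self.tasks = list(tasks)
        self.window = window
        self.eps = eps  # floor weight so no task is ever starved of attention
        self.history = defaultdict(lambda: deque(maxlen=2 * window))
        self.rng = random.Random(seed)

    def record(self, task, score):
        """Store the latest competence score (e.g. success rate) for a task."""
        self.history[task].append(score)

    def learning_progress(self, task):
        """Absolute change in mean score between the older and newer window."""
        h = list(self.history[task])
        if len(h) < 2 * self.window:
            return 0.0  # not enough evidence yet; only the eps floor applies
        old, new = h[: self.window], h[self.window:]
        return abs(sum(new) / self.window - sum(old) / self.window)

    def probabilities(self):
        weights = [self.learning_progress(t) + self.eps for t in self.tasks]
        total = sum(weights)
        return {t: w / total for t, w in zip(self.tasks, weights)}

    def sample(self):
        p = self.probabilities()
        return self.rng.choices(self.tasks, weights=[p[t] for t in self.tasks])[0]


# Toy score traces: "easy" is mastered, "medium" is improving, "hard" is out of reach.
cur = ProgressCurriculum(["easy", "medium", "hard"])
for step in range(10):
    cur.record("easy", 1.0)
    cur.record("medium", 0.1 * step)
    cur.record("hard", 0.0)

probs = cur.probabilities()  # most of the mass lands on "medium"
next_task = cur.sample()
```

Using the absolute change means a task whose score is dropping also attracts attention, which is one possible way to catch forgetting. As "medium" saturates its progress falls toward zero, and the `eps` floor keeps "hard" sampled often enough to notice when it becomes learnable.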

The allocation problem

Compute spent re-confirming what the system knows is compute not spent at the frontier. We treat attention, experimentation, and training budget as a portfolio-allocation problem driven by expected learning progress — measured, not assumed — with saturation detected and treated as a signal to move on.
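The allocation view can be sketched as a small budgeting routine. This is an assumption-laden illustration: the function `allocate_budget`, its `saturation_threshold` and `monitor_share` parameters, and the domain names are hypothetical. Domains whose measured learning progress falls below the threshold are flagged as saturated and keep only a small monitoring share; the rest of the budget is split in proportion to measured progress.

```python
def allocate_budget(measured_progress, budget,
                    saturation_threshold=0.01, monitor_share=0.02):
    """Split `budget` across domains by measured learning progress.

    Returns (allocation, saturated). Saturated domains receive only a
    small monitoring share, so a later change in them can still be seen.
    """
    saturated = [d for d, lp in measured_progress.items()
                 if lp < saturation_threshold]
    active = {d: lp for d, lp in measured_progress.items()
              if lp >= saturation_threshold}

    allocation = {d: budget * monitor_share for d in saturated}
    remaining = budget - sum(allocation.values())

    # If every domain is saturated, the remainder stays unallocated here:
    # in this sketch that is the cue to look for new domains.
    total_lp = sum(active.values())
    for d, lp in active.items():
        allocation[d] = remaining * lp / total_lp
    return allocation, saturated


# Hypothetical measurements of learning progress per domain.
progress = {"arithmetic": 0.001, "algebra": 0.30, "geometry": 0.10}
allocation, saturated = allocate_budget(progress, budget=1000)
```

With these illustrative numbers, "arithmetic" is flagged as saturated and keeps only the monitoring share, while "algebra" receives three times the budget of "geometry" because its measured progress is three times higher.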

WORKING PRINCIPLES

How we hold this work to account.

Reward understanding itself

Learning progress is the primary objective, not a bonus term.

Seek your own frontier

The best next problem is just beyond current competence.

Boredom is information

Saturation on a domain is the signal to widen, not to drill.
