
A cognitive model that thinks like a specific person

As AI capability scales, human intent becomes the bottleneck. We built a self-model that encodes how one person thinks, reacts, and decides, so AI systems can stay aligned with them without constant manual correction.

Figure 1. Intent prediction accuracy across three iterations of the self-model:
Self-model only: 14%
+ conversation history: 50%
+ split intent/voice: 91%

Overview

A peripheral encodes how a specific person thinks into a model that AI systems can check themselves against. It's built from structured interviews and distilled into cognitive domains: how they reason, react, judge, and communicate.

We evaluate it on two axes. Intent prediction tests whether the model understands what the person will do: approve, correct, redirect, or give a new instruction. Voice modeling tests whether it has internalized how they communicate: terse or verbose, when they paste code versus describe the problem, what they emphasize and what they skip. Both are scored by a separate LLM judge against the person's actual response.
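
To make the scoring step concrete, here is a minimal sketch of how an LLM judge might compare a prediction against the person's actual response. The `call_llm` callable, the prompt wording, and the 0-to-1 scale are assumptions for illustration; the post does not specify the judge's actual rubric.

```python
# Minimal sketch of the LLM-judge step. `call_llm` is a hypothetical function
# that sends a prompt to the judge model and returns its text output.
from typing import Callable

def judge_prediction(
    call_llm: Callable[[str], str],
    predicted: str,
    actual: str,
    axis: str,  # "intent" or "voice"
) -> float:
    """Score how closely a predicted response matches the person's actual one."""
    prompt = (
        f"You are grading a {axis} prediction.\n\n"
        f"Predicted response:\n{predicted}\n\n"
        f"Actual response:\n{actual}\n\n"
        "Return a single number from 0 to 1 for how well the prediction "
        "matches the actual response on this axis. Output only the number."
    )
    return float(call_llm(prompt).strip())
```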

Approach

The self-model is organized into cognitive domains: reasoning, reaction, writing, and judgment. Each domain captures a different facet of how the person thinks. Rather than dumping the entire self-model into every task, we route context selectively so each cognitive process only sees what's relevant to it.

Intent prediction draws on the reasoning and reaction domains to predict what the person will do next. Given an AI assistant's output, will they approve it, push back, redirect, or give a new instruction entirely? This tests whether the model has internalized the person's decision-making process.

Voice modeling draws on the writing and reaction domains. In the eval, we test it by predicting how the person would phrase their response. In production, the same model serves as a quality gate for LLM-generated text.
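
A minimal sketch of what that selective routing could look like, assuming the self-model is stored as one text block per cognitive domain. The `SELF_MODEL`, `TASK_DOMAINS`, and `build_context` names are hypothetical; only the domain names and the task-to-domain pairings come from the description above.

```python
# Illustrative per-task context routing over the self-model's domains.
SELF_MODEL = {
    "reasoning": "...",   # how the person works through problems
    "reaction": "...",    # how they respond to output they didn't ask for
    "writing": "...",     # sentence rhythm, word choice, formatting habits
    "judgment": "...",    # what they consider good enough to ship
}

# Each cognitive process only sees the domains relevant to it.
TASK_DOMAINS = {
    "intent_prediction": ["reasoning", "reaction"],
    "voice_modeling": ["writing", "reaction"],
}

def build_context(task: str, conversation: str) -> str:
    """Assemble the prompt context for one task from its routed domains."""
    domains = "\n\n".join(SELF_MODEL[d] for d in TASK_DOMAINS[task])
    return f"{domains}\n\nConversation so far:\n{conversation}"
```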

Controls

The eval data comes from private coding sessions that were never published online, so the base model hasn't seen these conversations during training. But that's not sufficient on its own because the base model might be generically good at predicting what developers say.

To isolate the self-model's contribution, we run every scenario twice: once with the full peripheral (self-model + conversation history) and once as a baseline (same model, same conversation, but no self-model). The baseline predicts what "a developer" would say. The peripheral predicts what this specific person would say. The gap between the two is what the self-model adds, and it can't come from memorization because both runs use the same base model on the same unseen data.
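
As a sketch of that paired run: the same `predict` function is called with and without the self-model, each prediction is scored against the person's actual response (for example with the judge sketched earlier), and the lift is the difference in mean scores. All function names here are illustrative.

```python
# Paired evaluation: baseline vs. peripheral on the same scenarios.
from statistics import mean
from typing import Callable, Optional

def paired_eval(
    scenarios: list,                                  # dicts with "conversation" and "actual"
    predict: Callable[[str, Optional[str]], str],     # (conversation, self_model) -> prediction
    judge: Callable[[str, str], float],               # (predicted, actual) -> score
    self_model: str,
) -> dict:
    baseline, peripheral = [], []
    for s in scenarios:
        # Same base model, same conversation; the only variable is the self-model.
        base_pred = predict(s["conversation"], None)
        peri_pred = predict(s["conversation"], self_model)
        baseline.append(judge(base_pred, s["actual"]))
        peripheral.append(judge(peri_pred, s["actual"]))
    return {
        "baseline": mean(baseline),
        "peripheral": mean(peripheral),
        "self_model_lift": mean(peripheral) - mean(baseline),
    }
```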

The processing layer

Every time someone reads AI output, corrects it, and re-prompts, they burn the bandwidth that AI was supposed to free up. The voice model addresses this by scoring any LLM-generated text against the person's actual communication patterns: their sentence rhythm, word choices, whether they'd use a bullet list or a run-on sentence, whether they'd open with context or jump straight to the point.

This creates a closed loop. An LLM generates a draft, the peripheral scores it, and the LLM revises until the output passes. The person gets text they'd actually send. Less correction, less re-prompting, more time on the work itself.

LLM generates draft → Peripheral scores voice match → Output or revise
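
A rough sketch of that loop, assuming the drafting LLM sits behind a `generate` callable and the peripheral's voice check behind `score_voice`. The 0.8 threshold and three-round cap are placeholders, not production values.

```python
# Closed loop: generate a draft, score it against the person's voice, revise until it passes.
from typing import Callable

def revise_until_passing(
    generate: Callable[[str, str], str],   # (task, feedback) -> draft
    score_voice: Callable[[str], float],   # draft -> 0-1 voice-match score
    task: str,
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> str:
    feedback = ""
    draft = generate(task, feedback)
    for _ in range(max_rounds):
        if score_voice(draft) >= threshold:
            break  # reads like something the person would actually send
        feedback = "Revise to match the person's usual phrasing and structure."
        draft = generate(task, feedback)
    return draft
```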

Examples

Intent prediction

The model predicts what the person will do next: approve, correct, redirect, or give a new instruction. These examples show it catching corrections and creative direction across different projects.

Voice modeling

The model predicts how the person will phrase their response. Voice accuracy is lower overall because the eval demands raw prediction of phrasing rather than the easier production task of gating a draft, but when it lands, it captures the person's actual communication style.

Where it breaks

The model fails when the person's next action can't be inferred from the conversation context alone, like when they pivot to something entirely new. But these failures point at what's missing: context that lives outside the conversation. When the model knows more about what the person is paying attention to, pivots stop looking random.

That's the problem Curvilinear (the company behind peripheral) is working on through Membrane, a line of research into higher-bandwidth input modalities like voice and biometric data so the model can pick up on intent that never makes it into text.

What's next

Right now, peripheral models one person from structured interviews. The next step is continuous learning: the model updates itself from every interaction, so it gets sharper over time without requiring new interviews. After that, multi-person modeling, where a team's peripherals coordinate to predict how a group decision will land before it's made.

The longer arc connects to agent coordination. As AI teams scale, each agent needs to know what the human steering it actually wants, not just what they last said. A peripheral gives every agent in the system a live model of the person's intent, so the whole team stays aligned without the person having to repeat themselves across every tool and conversation.