In the quest for a true “Digital Twin,” the bottleneck is not data storage—it is resonance.
Large Language Models (LLMs) are fundamentally trained on a single objective: to minimize surprisal (cross-entropy loss) across a massive corpus of human text. By learning to assign high probability to the actual next token, they minimize perplexity, a measure of how “confused” a model is by new data. This mathematical drive toward the expected is what makes LLMs consistent, coherent, and useful; it allows them to internalize the “average” of human knowledge and linguistic structure.
However, this same drive toward the expected is what makes most AI assistants feel generic and corporate. They are optimized for the statistical mean. To build a Digital Twin, we must invert this objective. Instead of aligning with the “average” human, the agent must align specifically with you.
By calculating the Predictive Gap—the distance between what the agent expects to hear and what the user actually says—we can identify the unique stylistic and conceptual deviations that define an individual. When the agent acts to minimize the surprisal gap between itself and its specific user, it ceases to be a generic tool and begins to mimic the user’s unique cadence, vocabulary, and worldview in real-time.
This post details the implementation of a recursive, entropy-weighted memory system in OpenClaw designed to achieve this resonance.
I. The Theory of the Predictive Gap
The core of this implementation rests on Surprisal Theory. In information theory, surprisal (or self-information) is the negative logarithm of an event’s probability. We apply this to the human-AI interaction loop by monitoring the Predictive Gap:
- Predicted Interaction: When a user’s input aligns with the agent’s internalized “expectations” (based on previous interactions), the surprisal is low.
- The Entropy Gap: When a user introduces specific jargon, unique stylistic flair, or complex new concepts, the “gap” between the agent’s internal probability model and the user’s actual input widens.
The mathematical surprisal $I(x)$ of an event $x$ is defined as:
$$I(x) = -\log_2(P(x))$$
A token the model assigns probability 0.5 carries exactly 1 bit of surprisal, while one assigned probability 0.001 carries nearly 10. By measuring this gap, we identify which moments are truly worth “weighting” in long-term memory. We aren’t just saving text; we are capturing the deviations that define a user’s unique identity.
II. Implementation: How to Reproduce the System
The architecture (Memory System V2) consists of several Python-based layers integrated into the OpenClaw workspace.
1. The Persona Engine (surprisal_persona.py)
To calculate surprisal relative to the user, the agent first needs a baseline of its own “style.” In our current implementation, we use a simplified statistical model (60% Markov chain, 30% unigram, 10% heuristic complexity) to minimize computational overhead and API costs.
However, it is important to note that the most accurate measure of the Predictive Gap would come from the LLM itself. Ideally, one would use the participating model to score the log-probability of every token in the user’s response. Because surprisal is defined by the model’s own failure to predict the next token, using the actual LLM’s probability distribution provides the truest “Mirror” of its internal state.
In a production environment, this would be extremely costly, requiring high-frequency logprob requests for every interaction. Our simplified model serves as a “heuristic proxy”: it captures the essence of these deviations without the exhaustive cost of full-model probability extraction.
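For reference, converting token log-probabilities into a surprisal score is trivial once you have them; the cost lies entirely in obtaining the logprobs. A minimal sketch, assuming a hypothetical `get_user_token_logprobs` helper backed by whatever logprob endpoint your provider exposes (the helper name and the natural-log convention are assumptions, not OpenClaw API):

```python
import math

def mean_surprisal_bits(token_logprobs: list[float]) -> float:
    """Average per-token surprisal in bits, from natural-log probabilities."""
    if not token_logprobs:
        return 0.0
    # I(x) = -log2 P(x); convert nats to bits by dividing by ln(2).
    return -sum(token_logprobs) / (len(token_logprobs) * math.log(2))

# Hypothetical usage:
# logprobs = get_user_token_logprobs(model, user_message)
# gap = mean_surprisal_bits(logprobs)
```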
Mathematical Foundation:
The Markov component calculates the transition probability of token $x_i$ given $x_{i-1}$:
$$P(x_i \mid x_{i-1}) = \frac{\text{count}(x_{i-1}, x_i) + 1}{\text{count}(x_{i-1}) + |V|}$$
where $|V|$ is the vocabulary size (Laplace smoothing).
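A minimal sketch of such a proxy, using the Laplace-smoothed Markov formula above and the 60/30/10 blend from the post (the class and method names, and the word-length complexity stand-in, are illustrative rather than the actual surprisal_persona.py internals):

```python
import math
from collections import Counter, defaultdict

class PersonaModel:
    """Cheap statistical proxy for the agent's own 'style' baseline."""

    def __init__(self):
        self.unigrams = Counter()             # token -> count
        self.bigrams = defaultdict(Counter)   # prev token -> Counter of next tokens
        self.total_tokens = 0

    def train(self, text: str) -> None:
        tokens = text.lower().split()
        self.unigrams.update(tokens)
        self.total_tokens += len(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            self.bigrams[prev][cur] += 1

    def _markov_surprisal(self, prev: str, cur: str) -> float:
        # Laplace-smoothed transition probability, per the formula above.
        v = len(self.unigrams)
        p = (self.bigrams[prev][cur] + 1) / (sum(self.bigrams[prev].values()) + v)
        return -math.log2(p)

    def _unigram_surprisal(self, tok: str) -> float:
        v = len(self.unigrams)
        p = (self.unigrams[tok] + 1) / (self.total_tokens + v)
        return -math.log2(p)

    def surprisal(self, text: str) -> float:
        """Blend: 60% Markov chain, 30% unigram, 10% heuristic complexity."""
        tokens = text.lower().split()
        if len(tokens) < 2 or not self.unigrams:
            return 0.0
        markov = sum(self._markov_surprisal(p, c)
                     for p, c in zip(tokens, tokens[1:])) / (len(tokens) - 1)
        unigram = sum(self._unigram_surprisal(t) for t in tokens) / len(tokens)
        # Stand-in complexity heuristic: mean word length (assumption).
        complexity = sum(len(t) for t in tokens) / len(tokens)
        return 0.6 * markov + 0.3 * unigram + 0.1 * complexity
```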
2. The Prose & Flair Analyzer (test_prose.py, test_flair.py)
To mimic the user, the system must categorize “how” something is said; a sketch of these metrics follows the list.
- Cadence: Measures syllable variance between words to detect rhythm.
- Lexical Density: The ratio of syllables to words to distinguish casual vs. academic tone.
- Flair Categories: Detects if a user is an “Architect” (complex clauses), “Minimalist” (punchy fragments), or “Orator” (rhetorical repetition).
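A hedged sketch of the cadence and density metrics plus a toy flair classifier (the syllable heuristic and all thresholds are crude assumptions, and the “Neutral” fallback is not a category the post defines):

```python
import re
import statistics

VOWEL_GROUPS = re.compile(r"[aeiouy]+", re.IGNORECASE)

def count_syllables(word: str) -> int:
    # Crude heuristic: one syllable per vowel group, minimum one.
    return max(1, len(VOWEL_GROUPS.findall(word)))

def analyze_prose(text: str) -> dict:
    words = re.findall(r"[A-Za-z']+", text)
    if len(words) < 2:
        return {"cadence": 0.0, "lexical_density": 0.0}
    syllables = [count_syllables(w) for w in words]
    return {
        "cadence": statistics.variance(syllables),       # rhythm via syllable variance
        "lexical_density": sum(syllables) / len(words),  # casual vs. academic tone
    }

def flair_category(text: str) -> str:
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    openers = [s.split()[0].lower() for s in sentences if s.split()]
    avg_len = sum(len(s.split()) for s in sentences) / max(1, len(sentences))
    if len(openers) >= 3 and len(set(openers)) < len(openers):
        return "Orator"       # repeated sentence openers: rhetorical repetition
    if avg_len > 18:
        return "Architect"    # long, clause-heavy sentences
    if avg_len < 7:
        return "Minimalist"   # punchy fragments
    return "Neutral"
```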
3. The Weighted Recorder (memory_weighted.py)
This script acts as the primary memory tool (a sketch follows the steps below). For every user input, it:
1. Calculates the total weight $W$:
$$W_{\text{total}} = I(\text{text}) + (0.5 \times V_{\text{prose}})$$
2. Writes the entry to a Markdown log with hidden <OC_METADATA /> tags.
3. Injects “Boost Tokens” (IMPORTANT_MEMORY) into the hidden tags. Because OpenClaw uses Hybrid Search (BM25 + Vector), repeating these tokens in metadata makes high-entropy memories “louder” and easier to retrieve.
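A sketch of the recording step under those rules (the boost threshold, the repetition count, and the attribute names on the metadata tag are assumptions):

```python
from datetime import datetime, timezone

BOOST_THRESHOLD = 8.0  # hypothetical cutoff for "high-entropy" entries

def record_memory(text: str, surprisal: float, prose_variance: float,
                  log_path: str = "MEMORY.md") -> float:
    """Append a weighted entry: W_total = I(text) + 0.5 * V_prose."""
    weight = surprisal + 0.5 * prose_variance
    # Repeating the boost token makes BM25 rank high-entropy memories "louder".
    boost = " ".join(["IMPORTANT_MEMORY"] * 3) if weight >= BOOST_THRESHOLD else ""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"- {text}\n")
        log.write(f'  <OC_METADATA weight="{weight:.2f}" ts="{stamp}" boost="{boost}" />\n')
    return weight
```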
4. Real-Time Mimicry (context_filter.py)
During retrieval, a filter strips the metadata and prepends a stylistic cue: [Style: rhythmic, architect | Mimic: prioritize this tone]. This instructs the LLM, in real time, to adopt the style detected in the current user prompt when generating its response.
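A sketch of that retrieval-time filter (the regex and the function signature are illustrative):

```python
import re

OC_METADATA = re.compile(r"\s*<OC_METADATA[^>]*/>")

def filter_context(memory_entry: str, style: str, flair: str) -> str:
    """Strip hidden metadata and prepend the stylistic mimicry cue."""
    clean = OC_METADATA.sub("", memory_entry).strip()
    return f"[Style: {style}, {flair} | Mimic: prioritize this tone] {clean}"

# e.g. filter_context(entry, "rhythmic", "architect")
```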
III. Expanding to Psychometric Mimicry (Big Five)
The next frontier is mapping these metrics to the Big Five (OCEAN) personality traits for deeper alignment (a speculative sketch follows the list):
- Openness: High surprisal in conceptual shifts and abstract metaphors.
- Conscientiousness: High “Architect” flair scores (structured, precise, orderly).
- Extraversion: Measured through high “Vibe” scores and intense punctuation (!!!).
- Agreeableness: Analyzed via the “Orator” patterns and sentiment markers.
- Neuroticism: Detected via high variance in surprisal scores within a single session.
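One speculative shape that mapping could take (every key and combination rule below is a placeholder; no normalization or psychometric validation is implied):

```python
def ocean_estimate(m: dict) -> dict:
    """Rough OCEAN scores from the stylistic metrics above (all hypothetical)."""
    return {
        "openness":          m["conceptual_surprisal"],           # abstract shifts
        "conscientiousness": m["architect_score"],                # structured flair
        "extraversion":      m["vibe_score"] + m["exclamation_rate"],
        "agreeableness":     m["orator_score"] * m["sentiment"],
        "neuroticism":       m["surprisal_variance"],             # in-session variance
    }
```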
IV. Recursive Consolidation
To prevent the system from becoming cluttered, a weekly cron job (memory_reevaluate.py) re-scans the core memory. As the agent’s persona evolves and begins to “expect” the user’s previously surprising inputs, the entropy score for those memories drops. When it falls below a threshold, the memory is demoted from the core MEMORY.md to the daily archives, mimicking the biological process of memory consolidation.
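A sketch of the weekly re-evaluation pass (the threshold and the entry schema are assumptions; `persona` is the PersonaModel proxy sketched in section 1):

```python
DEMOTION_THRESHOLD = 3.0  # hypothetical entropy floor for staying in core memory

def reevaluate(entries: list, persona) -> tuple:
    """Re-score core memories against the evolved persona; demote the stale ones."""
    core, archived = [], []
    for entry in entries:
        score = persona.surprisal(entry["text"])
        if score < DEMOTION_THRESHOLD:
            archived.append(entry)       # demote to the daily archives
        else:
            entry["weight"] = score      # refresh the stored weight
            core.append(entry)
    return core, archived
```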
Technical Note: Implementation requires OpenClaw with Python 3.9+ and a Markdown-based workspace.