I’ve been experimenting with running LLMs locally and kept running into a common issue: constraints and corrections drift out of the prompt over time.
Example:

- User: “don’t use peanuts”
- …long conversation…
- Model suggests something with peanuts anyway
This gets worse with smaller models or limited context windows.
So I built a small deterministic tool called a context compiler.
Instead of relying only on the transcript, it extracts structured state like:
- `facts.focus.primary = "vegan curry"`
- `policies.prohibit = ["peanuts"]`
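To make the idea concrete, here is a minimal sketch of what a deterministic extraction step could look like. The rule set and function names are my simplified illustration, not the repo's actual implementation; a real compiler would recognize far more patterns:

```python
import re

def compile_turn(state: dict, user_message: str) -> dict:
    """Deterministically fold one user turn into the compiled state.

    This single regex rule is illustrative only; a real compiler
    would apply a much richer, ordered rule set.
    """
    m = re.search(r"don'?t use (\w+)", user_message, re.IGNORECASE)
    if m:
        item = m.group(1).lower()
        prohibited = state.setdefault("policies", {}).setdefault("prohibit", [])
        if item not in prohibited:
            prohibited.append(item)
    return state

# Example: the constraint survives as structured state, not transcript text.
state = {"facts": {"focus": {"primary": "vegan curry"}}}
state = compile_turn(state, "please don't use peanuts")
# state["policies"]["prohibit"] now contains "peanuts"
```

Because the extraction is plain string matching rather than a model call, the same input always produces the same state.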
Then that state is injected into the prompt every turn, so important constraints don’t get lost.
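The injection step can be sketched like this (again an assumed shape, not the repo's exact code): the compiled state is rendered into a header that is prepended to every prompt, so it survives even when early turns fall out of the window:

```python
def render_prompt(state: dict, transcript: list[str]) -> str:
    """Prepend compiled state to the prompt on every turn.

    Only the most recent transcript turns are kept, simulating a
    limited context window; the state header never rotates out.
    """
    lines = ["## Compiled state (always honor these):"]
    for item in state.get("policies", {}).get("prohibit", []):
        lines.append(f"- NEVER include: {item}")
    focus = state.get("facts", {}).get("focus", {}).get("primary")
    if focus:
        lines.append(f"- Current focus: {focus}")
    lines.append("## Conversation:")
    lines.extend(transcript[-10:])  # window size of 10 is arbitrary here
    return "\n".join(lines)

prompt = render_prompt(
    {"policies": {"prohibit": ["peanuts"]},
     "facts": {"focus": {"primary": "vegan curry"}}},
    ["User: what should I cook tonight?"],
)
```

The design choice is that the transcript can be truncated freely, while the state header is regenerated from structured data each turn.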
Key idea:
- prompt engineering helps
- compiled state makes constraints persistent
I added a set of demos comparing:
- baseline prompting
- stronger prompt engineering
- prompt + compiled state
The interesting part: better prompting improves things, but the compiled state is what actually keeps the invariants enforced on every turn.
Repo + demos: