June 9, 2026 · 7 min read

How one always-on agent wrote 12k memories with Vilix

One power user connected an always-on autonomous agent to Vilix and accumulated roughly 12,000 memory writes across thousands of turns. This is what that setup looks like, and why per-turn read/write beats pull-based ingestion for agents.

One power user running an always-on autonomous agent through Vilix accumulated around 12,000 memory writes without any manual curation. The agent called get_context before each action and save_turn after, and Vilix handled the rest: storage, search, deduplication, and cross-session retrieval. This post walks through the setup, what the agent actually did with memory, and why the per-turn read/write pattern outperforms pull-based ingestion for long-running agents.

The setup: one MCP connector, one always-on agent

The user connected their agent to Vilix by registering a single custom MCP connector pointed at api.vilix.ai/mcp. There is no browser extension to install, no SDK to integrate, and no separate pipeline to build. Any agent framework that supports custom MCP connectors can connect the same way.

The agent itself was a general-purpose task runner: it would wake on a schedule, pick up pending items from a task queue, execute them (web lookups, code generation, data transforms), and then go idle until the next trigger. Each run was a fresh process with no shared in-memory state between sessions. Without persistent memory, each run started completely blind.

With Vilix connected, the agent was given two instructions in its system prompt: call get_context at the start of each turn and call save_turn at the end. That is the entire integration. No custom retrieval logic. No chunking pipeline. No vector database to operate.
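The read-then-write loop described above can be sketched in a few lines. This is a minimal illustration, not Vilix's client API: only the tool names get_context and save_turn come from this post, while the MemoryTools interface, the in-memory stand-in, the argument shapes, and the toy act function are assumptions made so the example runs locally. In the real setup the model issues these tool calls itself because the system prompt tells it to.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class MemoryTools(Protocol):
    """Hypothetical wrapper around the two MCP tools named in the post."""

    def get_context(self, query: str) -> list[str]: ...
    def save_turn(self, user_input: str, output: str) -> None: ...


@dataclass
class InMemoryTools:
    """Local stand-in so the sketch runs without any network call."""

    turns: list[str] = field(default_factory=list)

    def get_context(self, query: str) -> list[str]:
        # Return every stored turn; a real service would rank and trim.
        return list(self.turns)

    def save_turn(self, user_input: str, output: str) -> None:
        self.turns.append(f"{user_input} -> {output}")


def run_turn(
    memory: MemoryTools,
    task: str,
    act: Callable[[str, list[str]], str],
) -> str:
    context = memory.get_context(task)  # read before acting
    output = act(task, context)         # the agent's actual work
    memory.save_turn(task, output)      # write after acting
    return output


def act(task: str, context: list[str]) -> str:
    # Toy policy: skip a task whose earlier result already appears in context.
    if any(line.startswith(f"{task} -> ") for line in context):
        return "skipped (already done)"
    return f"done: {task}"


memory = InMemoryTools()
first = run_turn(memory, "resize logo", act)
second = run_turn(memory, "resize logo", act)
print(first)   # done: resize logo
print(second)  # skipped (already done)
```

The second call returns a different result only because the first call's output was written back and then read again, which is the whole point of pairing the two tools on every turn.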

What get_context actually loads

When the agent calls get_context, Vilix returns several things in a single response:

Recent messages from prior sessions, ranked by recency and semantic relevance to the current turn
Saved memories: facts, decisions, and snippets the agent explicitly stored with save_turn
User rules: short personal directives the owner has set (things like preferred output format or constraints on tool use)
Live project and task state, including any notes attached to the active project
Related past conversations, surfaced by keyword and semantic search over the full memory store

All of this arrives before the agent produces its first token for the turn. The agent does not need to decide what to retrieve or construct a query. The context load is automatic and shaped by what is most relevant to the current moment.
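One way to picture that single response is as a structured payload that gets flattened into text ahead of the turn. The sketch below is an assumption throughout: the ContextPayload class, its field names, and the rendering format are invented for illustration and simply mirror the five items listed above. The actual response schema is not described in this post.

```python
from dataclasses import dataclass, field


@dataclass
class ContextPayload:
    # Field names are illustrative only; they mirror the five listed items.
    recent_messages: list[str] = field(default_factory=list)
    saved_memories: list[str] = field(default_factory=list)
    user_rules: list[str] = field(default_factory=list)
    project_state: dict[str, str] = field(default_factory=dict)
    related_conversations: list[str] = field(default_factory=list)


def render_preamble(ctx: ContextPayload) -> str:
    """Flatten the payload into text placed ahead of the turn's prompt."""
    sections = [
        ("User rules", ctx.user_rules),
        ("Saved memories", ctx.saved_memories),
        ("Recent messages", ctx.recent_messages),
        ("Project state", [f"{k}: {v}" for k, v in ctx.project_state.items()]),
        ("Related conversations", ctx.related_conversations),
    ]
    lines: list[str] = []
    for title, items in sections:
        if not items:
            continue  # omit sections that came back empty
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


ctx = ContextPayload(
    user_rules=["Reply in Markdown"],
    project_state={"status": "in review"},
)
print(render_preamble(ctx))
```

Putting user rules first in the rendered text is a design choice of this sketch, made so that standing directives are read before situational context.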

What save_turn actually writes

After each turn, the agent calls save_turn with the exchange: the input, the output, and any structured data the agent flags as worth keeping. Vilix stores this server-side in the user account. It is tied to the user, not to the agent process, so it survives restarts, redeploys, and tool switches.
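The property that matters here is that storage is keyed by the user rather than by the running process. The toy below makes that concrete with one JSON-lines file per user in a temporary directory. Everything in it is a stand-in: UserScopedStore, the file layout, and the user IDs are hypothetical, and Vilix's actual storage is server-side and not described in this post.

```python
import json
import tempfile
from pathlib import Path


class UserScopedStore:
    """Toy store: one JSON-lines file per user, standing in for server-side storage."""

    def __init__(self, root: Path, user_id: str) -> None:
        self.path = root / f"{user_id}.jsonl"

    def save_turn(self, user_input: str, output: str) -> None:
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"input": user_input, "output": output}) + "\n")

    def load_turns(self) -> list[dict]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f]


root = Path(tempfile.mkdtemp())

# "Process" one writes a turn, then goes away.
run_a = UserScopedStore(root, "user-123")
run_a.save_turn("summarize report", "three-paragraph summary")
del run_a

# A fresh "process" for the same user sees the earlier turn.
run_b = UserScopedStore(root, "user-123")
print(run_b.load_turns())

# A different user sees nothing.
print(UserScopedStore(root, "user-456").load_turns())
```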

Over thousands of agent runs, these per-turn writes accumulated to roughly 12,000 stored memories. The agent never manually managed the memory store. It just wrote, consistently, every turn.

The result was a growing knowledge base that made each subsequent run smarter. By the time the store reached around 12,000 entries, the agent could recall decisions made months earlier, avoid repeating work it had already done, and apply lessons from past failures without being explicitly told about them.

Why per-turn read/write beats pull-based ingestion

The dominant pattern for adding memory to agents today is pull-based ingestion: you run a pipeline that chunks documents, embeds them, and stores them in a vector database. The agent queries that database at retrieval time. This works for static knowledge bases, but it has real limitations for agents that learn from their own actions.

Pull-based ingestion is a one-time or batched write. If the agent encounters something important on Tuesday, it may not be in the store until Friday when the next pipeline runs. Per-turn writes are immediate. Every action the agent takes is available in the next context load.
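The freshness gap is easy to show with two toy stores. Both classes below are hypothetical and written only for this comparison: BatchedStore makes writes searchable when flush() runs, standing in for a scheduled pipeline, while PerTurnStore makes each write searchable immediately.

```python
class BatchedStore:
    """Writes land in a buffer and only become searchable after flush()."""

    def __init__(self) -> None:
        self.pending: list[str] = []
        self.visible: list[str] = []

    def write(self, item: str) -> None:
        self.pending.append(item)

    def flush(self) -> None:
        # Stands in for the scheduled pipeline run.
        self.visible.extend(self.pending)
        self.pending.clear()


class PerTurnStore:
    """Every write is searchable immediately."""

    def __init__(self) -> None:
        self.visible: list[str] = []

    def write(self, item: str) -> None:
        self.visible.append(item)


batched, per_turn = BatchedStore(), PerTurnStore()
fact = "API v1 is deprecated"
batched.write(fact)
per_turn.write(fact)

# Next turn, before the pipeline has run:
batched_sees_it = fact in batched.visible
per_turn_sees_it = fact in per_turn.visible
print(batched_sees_it)   # False
print(per_turn_sees_it)  # True

# Only after the pipeline runs does the batched store catch up.
batched.flush()
print(fact in batched.visible)  # True
```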

Pull-based ingestion is also expensive to operate. You maintain a vector database, manage embeddings, handle chunking, and tune retrieval parameters. Per-turn writes through MCP move that operational work off your side: Vilix handles storage, search, and retrieval. The agent just calls two tools.

Finally, pull-based ingestion does not capture conversational structure. It treats documents as chunks. Per-turn writes through save_turn preserve the full exchange, including the agent reasoning that produced an output, which is often the most valuable thing to retrieve later.

What changed for the agent at scale

Around 1,000 memories, the agent started skipping redundant work it had already completed in prior runs. The context load was surfacing prior task outputs, and the agent recognized them.

Around 3,000 memories, the agent began applying stylistic preferences it had learned from user corrections. The user had never updated the system prompt. The corrections lived in the memory store, and get_context was surfacing them as relevant prior context.

By the time the store reached roughly 12,000 entries, the agent's outputs were meaningfully shaped by its history. It was not just executing tasks. It was executing tasks with accumulated knowledge about what had worked, what had not, and what the user preferred.

That is a qualitative shift that pull-based ingestion cannot easily replicate, because it requires writing conversational structure at turn granularity, not just indexing static documents.

How to replicate this setup

If you are building or running an autonomous agent and want to add long-term memory, the setup is straightforward. Connect your agent to Vilix using a custom MCP connector at api.vilix.ai/mcp. Add two instructions to your system prompt: call get_context before each turn and save_turn after. That is the full integration.

For a deeper walkthrough of the MCP integration pattern, see how to add memory to any agent with Vilix MCP. If you are using OpenClaw specifically, there is a dedicated guide at OpenClaw agent memory with Vilix.

Vilix works with any agent framework that supports custom MCP connectors, including ChatGPT, Claude, Claude Code, Cursor, Codex, Manus, Windsurf, and Lovable. Memory lives server-side in your account and follows you across tools and devices.

You can start with the free plan and upgrade to Pro (7-day full trial, $19.99/month after) when you need higher memory limits and priority retrieval. Try Vilix free and connect your first agent in a few minutes.

Frequently asked questions

How did the agent accumulate 12,000 memories without manual curation?

The agent called save_turn at the end of every turn, automatically. Over thousands of agent runs across many sessions, those per-turn writes added up to roughly 12,000 stored memories. No manual curation was involved. The agent just wrote consistently, and Vilix stored and indexed everything server-side.

What is the difference between get_context and a standard vector database query?

A vector database query retrieves similar chunks based on embedding distance. By contrast, get_context returns a richer payload: recent messages, saved memories, user rules, live project state, and semantically related past conversations, all in one call. The agent does not need to construct a retrieval query or manage embedding logic. Vilix handles that internally.

Does the agent need a custom integration or SDK to use Vilix?

No. The only requirement is that the agent framework supports custom MCP connectors. You point it at api.vilix.ai/mcp, add two instructions to the system prompt (call get_context before each turn and save_turn after), and the integration is complete. There is no SDK, no browser extension, and no separate pipeline to build or maintain.

Will memory from one agent session carry over to a completely fresh process?

Yes. Memory in Vilix is stored server-side in your account, not in the agent process. When a fresh agent process starts and calls get_context, it receives the full memory accumulated in prior sessions. The agent does not need to be the same process or even the same tool to access its history.

How does per-turn memory scale as the store grows large?

Vilix uses semantic and keyword search to surface the most relevant memories for each context load, so the amount of context the agent receives does not grow with the size of the store. The agent receives a ranked, relevant subset of its full history on each turn, not the entire store. This is what made the 12,000-memory store usable in practice rather than just large.
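To make "a ranked, relevant subset" concrete, here is a toy ranker. It is not Vilix's ranking, which this post does not describe: the Memory class, the scoring formula, and the 30-day half-life are all assumptions, and the sketch covers only keyword overlap and recency, with no semantic search. What it does show is that the agent receives at most k entries however large the store gets.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    age_days: float


def score(query: str, memory: Memory, half_life_days: float = 30.0) -> float:
    """Keyword overlap, discounted for age (illustrative formula only)."""
    q = set(query.lower().split())
    m = set(memory.text.lower().split())
    overlap = len(q & m) / len(q) if q else 0.0
    recency = 0.5 ** (memory.age_days / half_life_days)
    # Old but relevant memories keep at least half of their overlap score.
    return overlap * (0.5 + 0.5 * recency)


def top_k(query: str, store: list[Memory], k: int = 3) -> list[Memory]:
    """Return at most k memories with a non-zero score, best first."""
    scored = [(score(query, m), m) for m in store]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in relevant[:k]]


store = [
    Memory("user prefers csv exports", age_days=90),
    Memory("csv exports failed on large files", age_days=2),
    Memory("deployed landing page", age_days=1),
    Memory("user prefers dark mode", age_days=10),
]

results = top_k("csv exports", store, k=2)
for memory in results:
    print(memory.text)
```

Both matching memories come back, with the two-day-old failure report ranked ahead of the 90-day-old preference, and the two unrelated entries are left out.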