Your agent’s memory problem is an information architecture problem
Why the persistent knowledge layer your agents need comes from librarians, ontologists, and database engineers — not from cognitive science or computer science alone
I have been building agent systems for a while now, and I have been thinking about memory wrong. Not because I didn’t care about it — I did — but because I was reaching for the wrong mental models. I suspect most of the industry is making the same mistake, and the purpose of this essay is to trace how I arrived at that suspicion and what I found when I followed it.
The starting point was a simple observation: every agent framework ships a memory module, and almost all of them are thin wrappers around vector stores. Embed, index, retrieve by similarity. The consensus is that RAG solved retrieval, and retrieval solved memory. For demos, this holds. For anything that needs to persist knowledge across sessions — actually know things, detect contradictions, decide what to keep and what to discard — the consensus falls apart quickly.
Here is one way to see the crack. “Hybrid search” is now standard practice across the vector database ecosystem. Pinecone, Weaviate, Qdrant — they all combine semantic similarity with BM25 keyword matching. That combination gets marketed as innovation, but think about what the admission actually means: pure similarity wasn’t enough, so they bolted on a technique from the 1990s. If your cutting-edge retrieval system needs a thirty-year-old algorithm as a crutch, maybe similarity was never the right primitive for knowledge in the first place.
Retrieval is not memory. Similarity is not meaning. Cosine distance is not knowledge.
That observation sent me down a path I didn’t expect. The path led through computer science and cognitive science — the two disciplines the industry reaches for when thinking about agent memory — and then, surprisingly, out the other side into Library Science, Information Science, and Knowledge Engineering. This essay traces that path.
I should say upfront: this thesis was shaped by the work of two people I want to credit explicitly. Jessica Talisman has been arguing from the Library Science side that enterprises outsourced knowledge work and now lack the infrastructure AI needs. Kurt Cagle has been making the case from ontology engineering that agents maintaining state are making ontological commitments, and most do it accidentally. Both were saying this before it was fashionable.
What computer science actually gives us
Let me start with what CS gets right, because the critique only works if the credit is honest.
Computer science gave us B-trees, hash maps, LSM trees, vector indices, transaction isolation, query optimization. These are real contributions — load-bearing infrastructure that nobody is building agent memory without. The question is not whether CS matters. It does. The question is whether the primitives CS provides are sufficient for the problem of persistent agent knowledge. I think the answer is no.
The most interesting recent work from the CS camp borrows from operating systems. Letta (formerly MemGPT — the paper is literally titled “MemGPT: Towards LLMs as Operating Systems”) treats the context window as virtual memory with two tiers: core memory that the agent can edit during conversations, and archival memory that is searchable but out of context. The agent pages between them using tools like memory_insert, memory_replace, and archival_memory_search.
Letta’s self-editing memory is genuinely clever — it gives agents agency over their own context. The agent decides what to remember, what to update, what to search for. That is real innovation.
But here is what kept nagging me. Virtual memory manages space. It answers “what fits in the window right now?” using access recency and frequency. It does not answer “is this fact still true?” or “does this contradict something else I know?” or “where did this come from, and how much should I trust it?”
LRU eviction doesn’t know that a user-stated budget constraint is more important than an agent-inferred style preference. Both are pages. One is load-bearing. The other is speculative. An eviction policy that treats them identically will eventually evict the wrong one.
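A minimal sketch of that failure mode, assuming a toy two-page cache. The Page class, the provenance labels, and the provenance_aware_evict policy are illustrative assumptions, not Letta's implementation.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class Page:
    text: str
    provenance: str  # hypothetical labels: "user_declared" or "agent_inferred"


def lru_evict(cache: OrderedDict) -> str:
    """Evict the least recently used key, ignoring what the page means."""
    key, _ = cache.popitem(last=False)
    return key


def provenance_aware_evict(cache: OrderedDict) -> str:
    """Evict the least recently used page that the user did not state directly."""
    for key, page in list(cache.items()):  # oldest entries come first
        if page.provenance != "user_declared":
            del cache[key]
            return key
    return lru_evict(cache)  # only user-declared pages remain: fall back to LRU


def build_cache() -> OrderedDict:
    cache = OrderedDict()
    cache["budget"] = Page("Maximum budget: $500", "user_declared")
    cache["style"] = Page("Probably prefers minimalist style", "agent_inferred")
    cache.move_to_end("style")  # the inference was touched most recently
    return cache


evicted_by_lru = lru_evict(build_cache())
evicted_by_policy = provenance_aware_evict(build_cache())
print(evicted_by_lru, evicted_by_policy)
```

Plain LRU evicts the budget constraint because it was touched least recently. The provenance-aware variant evicts the inference instead, which is only possible because each page carries a label that says where it came from.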
There is no controlled vocabulary to normalize concepts — “dark mode” and “night mode” may live as two separate entries. There is no provenance hierarchy to distinguish user-stated facts from agent inferences. There is no appraisal system to evaluate whether a fact is worth keeping based on uniqueness, actionability, or sensitivity. If contradictory facts end up in archival memory, there is no mechanism to detect the contradiction. There is no principled strategy for what should be discarded and why.
These OS primitives are excellent low-level building blocks. They’d work better with a proper vocabulary layer, provenance hierarchy, and appraisal system feeding the agent’s decisions about what to keep in core memory. The problem is not that Letta exists. The problem is treating context window management as the entire memory architecture when it is one layer of a larger system.
Someone might reasonably object: “But databases have schemas, and schemas impose structure.” Fair point, but a schema describes the shape of data, not its meaning. A memories table with content, embedding, and timestamp columns tells you nothing about whether those memories are facts, preferences, constraints, or contradictions. That distinction matters, and CS doesn’t make it. Shape without semantics is a filing cabinet without labels.
The more I sat with this, the clearer it became: CS provides containers. It tells you how to store and retrieve data efficiently. It doesn’t tell you what the data means, how it relates to other data, or how someone will need to find it in a context you can’t predict at design time. These are different problems. The first is engineering. The second is something I didn’t have a name for yet.
What cognitive science offered (and where it went wrong)
The second discipline the industry reaches for is cognitive science. Endel Tulving’s 1972 taxonomy — episodic versus semantic memory — was a genuine breakthrough in understanding human cognition. The AI community borrowed it wholesale: agents need “episodic memory” for experiences, “semantic memory” for facts, “procedural memory” for skills. The taxonomy gave teams a vocabulary and the vocabulary felt like a design.
Mem0 is the most prominent example. Its documentation explicitly uses the CogSci taxonomy — “semantic (facts), episodic (interactions), and procedural (styles) memory.” Under the hood, an LLM extracts “memories” from conversations, stores them as text with vector embeddings, and retrieves by semantic similarity.
What is instructive is how Mem0 has evolved. V1 gave the LLM four operations — ADD, UPDATE, DELETE, NOOP — so it could, in theory, detect conflicts and update existing memories. In practice, this was entirely LLM-mediated: it worked when the model noticed a contradiction, and silently failed when it didn’t.
Mem0 v3, released in 2026, made a deliberate architectural choice: drop UPDATE and DELETE entirely. ADD-only. Store everything, resolve contradictions at retrieval through ranking. From their migration docs: “When information changes (e.g., a user moves from New York to San Francisco), both facts are preserved with temporal context.” The community pushed back. GitHub issue #4896 documented the failure (“my name is Alice” followed by “my name is Bob” yields two stored facts, both retrieved with similar scores). Issue #4904 proposed a concrete fix with a full TDD plan to reintroduce the UPDATE path via cosine similarity. Both were declined. The resolution pressure didn’t disappear — it migrated to the skills layer, where memory_update now handles in-place edits — but at the core extraction level, ADD-only stands.
To be fair about what v3 improved: entity linking across memories, hybrid retrieval combining semantic, keyword, and entity signals, temporal reasoning for time-aware queries, and strong benchmark results (91.6 on LoCoMo, 93.4 on LongMemEval). These are real improvements, and the engineering is solid.
But the core philosophical bet is now explicit: store everything, resolve at retrieval. I will get to why I think this is backwards shortly. For now, note the trajectory: v1 delegated conflict resolution to the LLM (probabilistic), v3 abandoned write-time resolution entirely. The direction is toward less structure at write time, not more.
I kept turning this over. The deeper problem is that the mapping from human memory to agent memory is structurally wrong because the design requirements are opposite. Human memory is reconstructive — we rebuild narratives from fragments. Agent knowledge should be authoritative — the stored fact should be the fact. Human memory is lossy by design — forgetting enables generalization. Agent knowledge should be versioned — old values archived, not lost. Human memory is subjective — the same event is remembered differently by different people. Agent knowledge should be consistent — the same query should return the same fact. Human memory tolerates contradiction. Agent knowledge must detect and resolve conflicts.
Borrowing a taxonomy designed to describe a lossy reconstructive system and using it as a blueprint for a system that needs to be precise and reliable — that is not interdisciplinary thinking. That is anthropomorphization dressed up as architecture.
The CogSci labels gave teams a way to name their modules (“let’s build the episodic memory component”) without giving them a methodology for deciding what knowledge to persist, how to structure it, how to maintain it over time, or how to handle two facts that contradict each other. The labels created the illusion of having a design when what they had was a metaphor. Mem0’s evolution illustrates this: the move from v1 to v3 went further away from principled write-time knowledge management, not toward it.
This diagnosis didn’t originate with me. Jessica Talisman has been arguing from the Library Science side — that enterprises underinvested in the knowledge infrastructure that AI needs to function reliably. Her core concept of intentional arrangement — deliberately deciding how knowledge should be classified, related, and retrieved — stands in direct contrast to the “embed everything and search” approach. Kurt Cagle has been making the case from Ontology Engineering — that every agent maintaining state is making ontological commitments, and most do it accidentally, in JSON blobs.
The turn I didn’t expect
If CS gives you containers without content architecture, and CogSci gives you labels without methodology, where do you find both?
The answer, once I found it, felt almost embarrassingly obvious. The discipline that has been solving the problem of “how do you classify, organize, relate, store, and retrieve knowledge so that someone can find what they need in a context you can’t predict” — for over a century — is Library Science. And its adjacent fields: Information Science, Knowledge Engineering, Ontology Engineering.
But first, a reframe that changes everything.
An agentic memory system is not a brain simulator. It is a Customer Data Platform where the channels are agents and the signals include natural language. The agent doesn’t have “a memory.” The user has a profile. Agents are channels that read from and write signals to it. This replaces cognitive metaphors with data engineering patterns that have been battle-tested for decades: identity resolution, signal hierarchies, golden records, traits versus events, computed attributes.
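A minimal sketch of the reframe, assuming in-memory Python objects. The Signal and Profile classes, the field names, and the agent identifiers are hypothetical, chosen only to show that the profile belongs to the user while agents act as writers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signal:
    user_id: str
    agent_id: str   # the channel that produced the signal
    concept: str
    value: str
    provenance: str


@dataclass
class Profile:
    user_id: str
    traits: dict = field(default_factory=dict)  # concept -> current Signal
    events: list = field(default_factory=list)  # append-only history

    def write_trait(self, signal: Signal) -> None:
        self.traits[signal.concept] = signal  # one current value per concept

    def write_event(self, signal: Signal) -> None:
        self.events.append(signal)  # events accumulate, never overwrite


profile = Profile(user_id="u1")
profile.write_trait(Signal("u1", "shopping_agent", "budget.max", "300", "user_declared"))
profile.write_trait(Signal("u1", "support_agent", "budget.max", "500", "user_declared"))
profile.write_event(Signal("u1", "shopping_agent", "product.rejected", "sku-123", "agent_observed"))
```

Two different agents write to the same profile, and the trait they both touch has exactly one current value. The traits versus events split here is the same distinction the write path relies on later.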
One clarification worth making explicit: this article addresses one specific layer — the persistent knowledge profile for users. What the system knows about the user across sessions, how it’s structured, how it’s maintained, how it’s retrieved. There is a separate and genuinely interesting question about agent identity — giving agents a consistent reasoning style, evolving beliefs, and disposition parameters that shape how they interpret facts. Hindsight’s CARA component tackles this with configurable skepticism, literalism, and empathy dimensions. For multi-tenant agent systems where different agents need different reasoning personalities over the same user knowledge, that’s a real problem worth solving. But these are complementary layers. This article is about the first one.
The disciplines we should have been reading
Library Science — intentional arrangement
Talisman’s core concept. Library and Information Science organizes knowledge through intentional arrangement — deliberately deciding how knowledge should be classified, related, and retrieved. Not metadata-as-afterthought. Metadata-as-architecture.
What it contributes to agent memory:
Archival appraisal (Schellenberg, 1956) is value judgment at write time. Not “store everything and search later” — decide at ingestion whether something is worth keeping, based on uniqueness, evidential value, and actionability. A fact like “I have a severe peanut allergy” scores differently from “show me the blue one.” The system should know that at write time, not discover it during retrieval.
CREW/MUSTIE weeding provides systematic criteria for what to discard — Misleading, Ugly, Superseded, Trivial, Irrelevant, Elsewhere. Agents need to forget deliberately, not through cache eviction. LRU is not a knowledge management strategy.
Faceted classification (Ranganathan, 1933) offers multi-dimensional classification composed from independent facets, not pre-enumerated categories. Domain concepts multiplied by value types multiplied by provenance levels — composable, not combinatorial. An agent’s working vocabulary about one user is small (30 to 300 concepts per domain), not the 400K headings of the Library of Congress.
Authority control ensures concept normalization through controlled vocabulary. Without it, “dark mode,” “night mode,” and “dark theme” are three different memories instead of one concept with three surface forms. With it, they all resolve to a single canonical concept, and deduplication is exact, not probabilistic.
The Reference Interview (Taylor, 1968) models the gap between the stated question and the actual need. When an agent asks “what do I know about this user?” it needs a structured retrieval spec, not a vector similarity search. Taylor identified four levels of need — visceral, conscious, formalized, compromised — and the formalization step is exactly what a read path should perform.
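Of the concepts above, authority control is the easiest to make concrete. A minimal sketch, assuming a hand-written authority file held in a dictionary; the concept identifiers and the resolve helper are hypothetical.

```python
from typing import Optional

# Hypothetical authority file: canonical concept -> accepted surface forms.
AUTHORITY_FILE = {
    "ui.theme.dark": {"dark mode", "night mode", "dark theme"},
    "ui.theme.light": {"light mode", "light theme"},
}

# Invert once so lookups are exact dictionary hits, not similarity scores.
SURFACE_TO_CONCEPT = {
    form: concept for concept, forms in AUTHORITY_FILE.items() for form in forms
}


def resolve(surface_form: str) -> Optional[str]:
    """Return the canonical concept, or None if the form is out of vocabulary."""
    return SURFACE_TO_CONCEPT.get(surface_form.strip().lower())


memory = {}
for utterance in ["Dark mode", "night mode", "dark theme "]:
    concept = resolve(utterance)
    if concept is not None:
        memory[concept] = "preferred"  # UPSERT keyed on the canonical concept
```

Three surface forms produce one entry in memory. A form that is not in the authority file returns None rather than a nearest neighbor, which is what makes the miss visible.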
Knowledge and ontology engineering
This is Cagle’s territory. Every agent that maintains state makes ontological commitments: what exists in its domain, what properties those things have, what relationships connect them. Most agents do this accidentally, in ad-hoc key-value pairs and JSON blobs. Do it intentionally and you get a vocabulary layer with hierarchical concepts, scope notes, synonym mappings, and lifecycle management.
Cagle’s persistent point: knowledge graphs are mature infrastructure, not hype. They are one of the older data structures in computing. And they are what LLMs actually need underneath — not as a replacement for the LLM, but as the structured knowledge layer the LLM reads from and writes to.
Data management — the boring brilliance
The patterns that make persistent knowledge reliable:
SCD Type 2 temporal versioning preserves full history with zero information loss. When a user’s budget changes from $300 to $500, the old value is not deleted — it gets a valid_to timestamp. Any previous state is recoverable.
Cascade invalidation via foreign keys means when a parent fact changes, derived facts are marked for re-evaluation automatically. If a computed trait (“prefers minimalist style”) was derived from three rejection events and those events are reassessed, the derived trait gets flagged.
Provenance-weighted retrieval ensures that, at equal appraisal value, user-stated facts (weight 1.0) outrank agent-inferred facts (weight 0.6). The signal hierarchy (user_declared, agent_observed, tool_returned, agent_inferred, computed) determines trust, not recency.
UPSERT semantics combined with controlled vocabulary make deduplication exact. No near-duplicate detection, no probabilistic matching.
Constraints and conflict detection at write time catch two contradictory facts on the same concept at the database layer, not during the agent’s mid-conversation reasoning.
Agent memory systems have all the data management problems that databases solved decades ago — and ignore all the solutions because “we’re doing AI, not database work.”
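Two of the patterns above, SCD Type 2 versioning and a write-time constraint on conflicting current values, can be sketched with Python's built-in sqlite3 module. The table layout, the one_current_fact index, and the supersede and as_of helpers are assumptions for illustration; the same idea carries over to any relational store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        concept TEXT NOT NULL,
        value TEXT NOT NULL,
        valid_from TEXT NOT NULL,
        valid_to TEXT            -- NULL marks the current version
    )
""")
# At most one current version per (user, concept), enforced by the database.
conn.execute("""
    CREATE UNIQUE INDEX one_current_fact
    ON facts (user_id, concept) WHERE valid_to IS NULL
""")


def supersede(user_id: str, concept: str, value: str, now: str) -> None:
    with conn:  # one transaction: close the old row, then open the new one
        conn.execute(
            "UPDATE facts SET valid_to = ? "
            "WHERE user_id = ? AND concept = ? AND valid_to IS NULL",
            (now, user_id, concept),
        )
        conn.execute(
            "INSERT INTO facts (user_id, concept, value, valid_from) "
            "VALUES (?, ?, ?, ?)",
            (user_id, concept, value, now),
        )


def as_of(user_id: str, concept: str, when: str):
    """Return the value that was current at the given ISO date, if any."""
    row = conn.execute(
        "SELECT value FROM facts WHERE user_id = ? AND concept = ? "
        "AND valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)",
        (user_id, concept, when, when),
    ).fetchone()
    return row[0] if row else None


supersede("u1", "budget.max", "300", "2025-01-01")
supersede("u1", "budget.max", "500", "2025-06-01")
```

The old budget is still recoverable with as_of, and a second current row for the same user and concept is rejected by the partial unique index instead of being noticed later by the agent.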
Figure: The write path, read path, and maintenance path. Most steps are deterministic. The LLM classifies within a framework; it doesn’t architect freeform memories.
The three flows
Here is how these principles translate into architecture. I want to stay at the principle level: the general direction, not a specific database.
The write path
Three steps. First, detect candidate signals — rule-based, no LLM, cheap. Pattern matching identifies preference statements, corrections, constraints, goals. High recall, low precision — it is cheap to over-detect because the next step filters.
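A minimal sketch of that first step, assuming a handful of regular expressions. The pattern set is a hypothetical starting point, deliberately loose because the next step filters false positives.

```python
import re

# Hypothetical, deliberately over-eager patterns: recall matters more than precision.
SIGNAL_PATTERNS = {
    "constraint": re.compile(
        r"\b(nothing over|no more than|at most|cannot|allergic|allergy)\b", re.I
    ),
    "preference": re.compile(r"\bi (really )?(like|love|prefer|hate|dislike)\b", re.I),
    "correction": re.compile(r"\b(actually|i meant)\b", re.I),
    "goal": re.compile(r"\b(i want to|i need to|my goal)\b", re.I),
}


def detect_signals(utterance: str) -> list:
    """Return every signal type whose pattern matches. No LLM involved."""
    return [kind for kind, pattern in SIGNAL_PATTERNS.items() if pattern.search(utterance)]


candidates = {
    text: detect_signals(text)
    for text in [
        "Nothing over five hundred, please",
        "I have a severe peanut allergy",
        "Actually, I prefer the red one",
        "show me the blue one",
    ]
}
```

Utterances with an empty result never reach the LLM call, which is where the cost saving comes from.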
Second, normalize, appraise, and conflict-check — one structured LLM call acting as a librarian. Normalize the input to a controlled vocabulary (authority control). Extract a canonical value while preserving the original utterance — the user said “nothing over five hundred,” the system stores “Maximum budget: $500,” and both are preserved. Appraise on five dimensions: uniqueness, replaceability, actionability, stability, sensitivity. Check for conflicts with existing facts on the same concept.
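The LLM call itself is out of scope here, but the contract around it can be sketched. The example below assumes the model returns JSON and validates it against the vocabulary and the five appraisal dimensions; the CatalogedFact shape, the field names, and the 0 to 1 score range are assumptions.

```python
import json
from dataclasses import dataclass

APPRAISAL_DIMENSIONS = (
    "uniqueness", "replaceability", "actionability", "stability", "sensitivity",
)


@dataclass(frozen=True)
class CatalogedFact:
    concept: str
    canonical_value: str
    original_utterance: str
    appraisal: dict
    conflicts_with: tuple


def parse_librarian_output(raw: str, vocabulary: set) -> CatalogedFact:
    """Validate the structured output of the (not shown) LLM call."""
    data = json.loads(raw)
    if data["concept"] not in vocabulary:
        raise ValueError(f"out-of-vocabulary concept: {data['concept']}")
    scores = data["appraisal"]
    if set(scores) != set(APPRAISAL_DIMENSIONS):
        raise ValueError("appraisal must score exactly the five rubric dimensions")
    if not all(0.0 <= float(score) <= 1.0 for score in scores.values()):
        raise ValueError("appraisal scores must lie in [0, 1]")
    return CatalogedFact(
        concept=data["concept"],
        canonical_value=data["canonical_value"],
        original_utterance=data["original_utterance"],
        appraisal=dict(scores),
        conflicts_with=tuple(data.get("conflicts_with", [])),
    )


# Stand-in for what a model might return for the budget example.
raw_output = json.dumps({
    "concept": "budget.max",
    "canonical_value": "Maximum budget: $500",
    "original_utterance": "nothing over five hundred",
    "appraisal": {"uniqueness": 0.9, "replaceability": 0.2, "actionability": 1.0,
                  "stability": 0.6, "sensitivity": 0.3},
    "conflicts_with": ["Maximum budget: $300"],
})
fact = parse_librarian_output(raw_output, vocabulary={"budget.max", "person.name"})
```

Both the canonical value and the original utterance survive validation, and anything that does not fit the rubric raises instead of being stored.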
Third, deterministic write — UPSERT for traits, APPEND for events. The schema enforces structure. No LLM in the write step.
The UPSERT/APPEND distinction deserves a closer look, because it’s where the thesis becomes concrete. When a new value arrives for the same concept as an existing value, is the old value now false or now historical? “My name is Alice” followed by “my name is Bob” — the old value is false. A person has one current name. Both stored means retrieval poisoning. “I live in New York” followed by “I moved to San Francisco” — the old value is historical. Both are true, time-scoped. Both stored means correct temporal reasoning.
Mem0’s ADD-only approach doesn’t model this distinction. It appends always. It’s right by luck on the location case and wrong by luck on the name case — and they market the case where the bug looks like a feature (“both facts preserved with temporal context”). An architecture with a vocabulary layer decides on purpose, per concept, at write time: supersede-with-history (new value current, old gets valid_to, both recoverable) or append-only (every value permanently true). The LLM never makes the resolution decision. It makes a classification (which concept), and resolution is a deterministic property of where the fact was filed.
The vocabulary carries this temporal semantics per concept. At least three classes: mutate-in-place (typo corrections, scratch values), supersede-with-history (name, budget, address — SCD Type 2), and append-only event stream (purchases, interactions, rejections). The vocabulary isn’t frozen at design time either. It grows through a governed lifecycle: LLM proposes candidate concepts, a review process admits or maps them, and a baking period accumulates evidence (frequency, observed cardinality, synonym collapse) before promotion. This is how library authority files have always worked — LC’s SACO program is exactly a propose-review-admit pipeline for new headings. The governance is the part libraries spent a century building.
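A minimal sketch of resolution as a property of where the fact was filed, assuming an in-memory store. The concept names and the write and current helpers are hypothetical; the three temporal classes are the ones named above.

```python
from enum import Enum


class TemporalClass(Enum):
    MUTATE_IN_PLACE = "mutate_in_place"
    SUPERSEDE_WITH_HISTORY = "supersede_with_history"
    APPEND_ONLY = "append_only"


# Hypothetical vocabulary: each concept carries its own temporal semantics.
VOCABULARY = {
    "scratch.note": TemporalClass.MUTATE_IN_PLACE,
    "person.name": TemporalClass.SUPERSEDE_WITH_HISTORY,
    "purchase": TemporalClass.APPEND_ONLY,
}


def write(store: dict, concept: str, value: str, now: str) -> None:
    rule = VOCABULARY[concept]  # a KeyError here means out-of-vocabulary
    versions = store.setdefault(concept, [])
    new_version = {"value": value, "valid_from": now, "valid_to": None}
    if rule is TemporalClass.MUTATE_IN_PLACE:
        versions[:] = [new_version]          # overwrite, keep no history
    elif rule is TemporalClass.SUPERSEDE_WITH_HISTORY:
        for version in versions:
            if version["valid_to"] is None:
                version["valid_to"] = now    # close the old value, keep it
        versions.append(new_version)
    else:
        versions.append(new_version)         # events are never closed


def current(store: dict, concept: str) -> list:
    return [v["value"] for v in store.get(concept, []) if v["valid_to"] is None]


store = {}
write(store, "person.name", "Alice", "2025-01-01")
write(store, "person.name", "Bob", "2025-02-01")
write(store, "purchase", "sku-1", "2025-01-05")
write(store, "purchase", "sku-2", "2025-01-09")
```

The write function contains no judgment call. Once the classifier has picked the concept, the name case and the purchase case diverge because the vocabulary says they should.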
The LLM’s role here is classifier and cataloger, not reasoner. Classification-grade, not reasoning-grade. You don’t need a frontier model for the memory subsystem. You need reliable structured output and a good rubric.
A careful reader will notice I just criticized Mem0 for relying on LLM-mediated conflict detection — and then proposed an architecture that also relies on an LLM during ingestion. That tension deserves to be named, not hidden.
The difference, I would argue, is structural. The LLM operates as a classifier within an explicit framework — a bounded vocabulary of 50 to 200 concepts, an appraisal rubric with defined dimensions, existing facts injected as comparison context. The framework constrains the LLM’s judgment; database constraints enforce the output after. The LLM proposes; the schema enforces.
But the weakness is real. The LLM can mis-classify, mis-appraise, or miss conflicts. The difference from Mem0 isn’t “LLM versus no LLM”; it’s “uncertainty observable versus uncertainty invisible.” When the LLM-as-librarian can’t confidently classify a signal, that failure is typed. A low-confidence classification (a plausible concept exists, but the LLM isn’t sure) routes to adjudication against the existing vocabulary. An out-of-vocabulary signal (no concept exists) routes to the promotion pipeline as evidence that the vocabulary is incomplete. Both go to a dead-letter queue (DLQ) where they’re visible, measurable, and drainable. Mem0’s Alice/Bob contradiction doesn’t error, flag, or queue: it succeeds wrongly. ADD-only with MD5 dedup has no place to admit that something didn’t classify cleanly, so it doesn’t. The queue is not a guarantee on its own, though: a DLQ that nobody drains is slow data loss, not zero data loss. The true claim is visibility, not perfection, but visibility is the precondition for fixing.
In Library Science, this curation has traditionally been done by trained professionals who understood classification theory, authority control, and their specific domain. No modern library has a human cataloger describe every item from scratch. Libraries use automated classification, vendor-supplied records, and copy cataloging, but always within a framework of authority files and classification schemes. The LLM-as-librarian is the next step in that trajectory. It is a bet, and it should be named as such.
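A minimal sketch of typed failure routing, assuming the classifier reports a concept and a confidence score. The 0.8 threshold, the stage names, and the Classification shape are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff, tune per deployment


@dataclass(frozen=True)
class Classification:
    utterance: str
    concept: Optional[str]  # None when the classifier found no concept
    confidence: float


def route(result: Classification, vocabulary: set, dead_letters: list) -> str:
    """Return the next stage; failures are queued with a type, never dropped."""
    if result.concept is None or result.concept not in vocabulary:
        dead_letters.append(("out_of_vocabulary", result))
        return "promotion_pipeline"
    if result.confidence < CONFIDENCE_THRESHOLD:
        dead_letters.append(("low_confidence", result))
        return "adjudication"
    return "deterministic_write"


vocabulary = {"budget.max", "person.name"}
dead_letters = []
outcomes = [
    route(Classification("nothing over five hundred", "budget.max", 0.95),
          vocabulary, dead_letters),
    route(Classification("call me B", "person.name", 0.55), vocabulary, dead_letters),
    route(Classification("I only fly window seats", None, 0.0), vocabulary, dead_letters),
]

# Queue depth per failure type is the metric that makes the uncertainty observable.
queue_depth_by_type = {}
for kind, _ in dead_letters:
    queue_depth_by_type[kind] = queue_depth_by_type.get(kind, 0) + 1
```

The useful output is not the routing itself but queue_depth_by_type: a number someone can watch, alert on, and drain.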
The read path
Four steps. Query formulation translates the agent’s raw need into a structured retrieval spec by domain, provenance level, and concept type. This is Taylor’s reference interview formalized: translate “help the user pick a thing” into a precise retrieval specification. Not “embed the query and find nearest neighbors.”
Retrieval is a parameterized query against the fact store, filtered by domain, provenance, and appraisal value. No LLM.
Deterministic ranking scores each fact by appraisal value multiplied by provenance weight. The weights are tunable configuration, not learned parameters.
Frame composition groups facts by provenance so the consuming agent can see trust levels: confirmed facts (user-stated, high confidence) are separated from observed patterns (behavioral, moderate confidence), which are separated from tool-provided context. No summarization. No “the LLM condensed your memories into a paragraph.” The frame is a view, not a lossy compression, so no information is lost.
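The read path can be sketched end to end without a model in the loop. In the example below, the weights for user_declared (1.0) and agent_inferred (0.6) come from the text; the other three weights, the Fact and RetrievalSpec shapes, and the sample data are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass

# 1.0 and 0.6 are from the text; the remaining weights are assumed placeholders.
PROVENANCE_WEIGHT = {
    "user_declared": 1.0, "agent_observed": 0.8, "tool_returned": 0.7,
    "agent_inferred": 0.6, "computed": 0.5,
}


@dataclass(frozen=True)
class Fact:
    domain: str
    concept: str
    value: str
    provenance: str
    appraisal: float


@dataclass(frozen=True)
class RetrievalSpec:
    domain: str
    min_weight: float = 0.0
    min_appraisal: float = 0.0


def score(fact: Fact) -> float:
    return fact.appraisal * PROVENANCE_WEIGHT[fact.provenance]


def retrieve(facts: list, spec: RetrievalSpec) -> list:
    """Filter by the structured spec, then rank deterministically."""
    hits = [
        f for f in facts
        if f.domain == spec.domain
        and PROVENANCE_WEIGHT[f.provenance] >= spec.min_weight
        and f.appraisal >= spec.min_appraisal
    ]
    return sorted(hits, key=score, reverse=True)


def compose_frame(ranked: list) -> dict:
    """Group facts by provenance so the consumer can see trust levels."""
    frame = defaultdict(list)
    for fact in ranked:
        frame[fact.provenance].append(f"{fact.concept} = {fact.value}")
    return dict(frame)


facts = [
    Fact("shopping", "style", "minimalist", "agent_inferred", 0.9),
    Fact("shopping", "budget.max", "$500", "user_declared", 0.9),
    Fact("shopping", "shipping.eta", "3 days", "tool_returned", 0.5),
    Fact("travel", "seat", "window", "user_declared", 0.8),
]
ranked = retrieve(facts, RetrievalSpec(domain="shopping", min_appraisal=0.4))
frame = compose_frame(ranked)
```

Every fact in the frame is returned verbatim, and the reason for its position is a product of two numbers that can be inspected and tuned.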
Vector search is the fallback, not the primary path. When the vocabulary doesn’t cover a topic or the agent can’t formulate a structured query, semantic similarity helps find the nearest concept. Otherwise, structured retrieval wins because it is interpretable, auditable, and composable.
The maintenance path
Weeding is not hygiene. It is compliance.
A store-everything-forever architecture is not GDPR or CCPA compliant by construction. Right-to-erasure is not satisfiable by “we ranked it lower” — the fact must be provably gone, with an audit trail. Retention schedules require deaccessioning on a clock. A regulator does not accept “the embedding makes it unlikely to surface.” This is where the “boring” data management patterns stop being elegance and become compliance machinery: provenance tells you what to cascade-invalidate during an erasure request. SCD Type 2 valid_to timestamps enforce a retention clock. You literally cannot be compliant without these. Table stakes, not taste.
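A minimal sketch of an erasure request handled by the database rather than by ranking, using sqlite3 with foreign keys enabled. The table names, the cascade from facts to derived traits, and the audit columns are assumptions; a production audit trail would likely store less than a raw user identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        concept TEXT NOT NULL,
        value TEXT NOT NULL
    );
    CREATE TABLE derived_traits (
        id INTEGER PRIMARY KEY,
        source_fact_id INTEGER NOT NULL REFERENCES facts(id) ON DELETE CASCADE,
        value TEXT NOT NULL
    );
    CREATE TABLE erasure_audit (
        user_id TEXT NOT NULL,
        facts_deleted INTEGER NOT NULL,
        erased_at TEXT NOT NULL
    );
""")

with conn:
    conn.execute("INSERT INTO facts VALUES (1, 'u1', 'product.rejected', 'sku-9')")
    conn.execute("INSERT INTO facts VALUES (2, 'u2', 'budget.max', '500')")
    conn.execute("INSERT INTO derived_traits VALUES (1, 1, 'prefers minimalist style')")


def erase_user(user_id: str, now: str) -> int:
    """Delete a user's facts, cascade to derived traits, and log the erasure."""
    with conn:  # the delete and the audit row commit together or not at all
        cursor = conn.execute("DELETE FROM facts WHERE user_id = ?", (user_id,))
        conn.execute(
            "INSERT INTO erasure_audit VALUES (?, ?, ?)",
            (user_id, cursor.rowcount, now),
        )
    return cursor.rowcount


deleted = erase_user("u1", "2025-06-01T00:00:00Z")
```

The derived trait disappears because the schema records what it was derived from. Without that link there is nothing to cascade along.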
Beyond compliance, there is the correctness argument. Ranking-only conflict resolution assumes the ranker can always detect that two facts are about the same concept and in conflict. That detection is exactly the write-time step the store-everything camp deleted. “Just rank better” is circular — it smuggles back the conflict resolution it claimed to avoid, now at query time under latency pressure with less context. Mem0’s own issue tracker provides the proof: #4896 reports that “search returns both with similar scores, degrading retrieval quality.” That is the poisoning mechanism, stated by the reporter, confirmed by code.
MUSTIE criteria, applied as a background job, provide the principled alternative. Misleading facts that contradict a newer, higher-provenance fact — archive them. Ugly records that are malformed, partial, or corrupted — quarantine them. Superseded facts where a newer version exists — version them with SCD Type 2. Trivial facts with low value and zero access — remove them. Irrelevant facts where the user’s context has shifted — this one requires LLM judgment: “the user was planning a wedding; the wedding happened; wedding preferences are now irrelevant.” Elsewhere — facts redundant with an authoritative external source — replace with a pointer.
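The criteria above can be sketched as a rule function a background job would call per record. The FactRecord fields, the order of the checks, the 0.2 triviality threshold, and the action names are all assumptions; the Irrelevant check consumes a flag set by a separate LLM judgment that is not shown.

```python
from dataclasses import dataclass
from typing import Optional

TRIVIAL_APPRAISAL = 0.2  # assumed threshold for "low value"


@dataclass(frozen=True)
class FactRecord:
    malformed: bool = False
    contradicted_by_higher_provenance: bool = False
    has_newer_version: bool = False
    appraisal: float = 1.0
    access_count: int = 1
    context_shifted: Optional[bool] = None  # set by an LLM judgment step, not shown
    authoritative_external_source: Optional[str] = None


def weed(record: FactRecord) -> str:
    """Map one record to a maintenance action using MUSTIE-style rules."""
    if record.malformed:
        return "quarantine"             # Ugly
    if record.contradicted_by_higher_provenance:
        return "archive"                # Misleading
    if record.has_newer_version:
        return "close_version"          # Superseded: set valid_to, keep history
    if record.authoritative_external_source is not None:
        return "replace_with_pointer"   # Elsewhere
    if record.context_shifted:
        return "flag_irrelevant"        # Irrelevant: final action left to policy
    if record.appraisal < TRIVIAL_APPRAISAL and record.access_count == 0:
        return "remove"                 # Trivial
    return "keep"


actions = [
    weed(FactRecord(malformed=True)),
    weed(FactRecord(contradicted_by_higher_provenance=True)),
    weed(FactRecord(has_newer_version=True)),
    weed(FactRecord(appraisal=0.1, access_count=0)),
    weed(FactRecord(context_shifted=True)),
    weed(FactRecord(authoritative_external_source="crm://contact/42")),
    weed(FactRecord()),
]
```

Five of the six criteria reduce to deterministic checks on fields the write path already populated, which is why the job can run unattended for everything except the Irrelevant case.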
This is the part nobody builds. It is also the part that determines whether your memory system can be deployed on real user data in a regulated environment.
What the field is getting right, and what’s still missing
Figure: CS and CogSci ask the wrong questions. Library Science, Ontology Engineering, and Data Management ask the ones that directly address persistent agent knowledge.
The right questions produce the right systems. CS asks “how do I store and retrieve this efficiently?” — necessary but not sufficient. CogSci asks “how does a human remember this?” — interesting but misleading. Library Science asks “what is this, how does it relate to other things, and how will someone need to find it?” Ontology Engineering asks “what commitments am I making about the structure of this domain?” Data Management asks “how do I keep this consistent and reliable as it changes?” The last three directly address the problem of persistent, structured, reliable agent knowledge.
Not every existing system ignores these questions. Hindsight, from Vectorize.io (paper co-authored with Virginia Tech and The Washington Post), is the strongest existing system relative to this thesis — and it’s more than a counterpoint. It’s convergent evidence.
Hindsight organizes memory into four networks — World, Experience, Opinion, and Observation — that distinguish types of knowledge structurally. It performs entity resolution to canonicalize mentions. It runs four-way parallel retrieval — semantic, BM25, graph traversal, and temporal — fused with Reciprocal Rank Fusion and neural reranking. Its observation consolidation is functionally similar to materialized views. Its opinion evolution with confidence scores is a form of re-appraisal. The benchmark results are strong: 91.4% on LongMemEval with a frontier backbone (83.6% with the open-source 20B model), outperforming full-context GPT-4o.
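Reciprocal Rank Fusion itself is a small, well-known formula: each document scores the sum of 1 / (k + rank) over every ranking it appears in. The sketch below is the generic technique with the commonly used k = 60, not Hindsight's code; the four input rankings are made-up examples.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse several ranked lists of ids; ranks start at 1, ties break by id."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of four retrievers over the same memory store.
semantic = ["a", "b", "c"]
bm25 = ["b", "a", "d"]
graph = ["b", "c"]
temporal = ["a", "b"]

fused = reciprocal_rank_fusion([semantic, bm25, graph, temporal])
```

Because only ranks are used, the four retrievers do not need comparable score scales, which is what makes the fusion step cheap to add.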
Here is what matters for this argument: Hindsight is write-heavy by design. Its retain() pipeline does LLM fact extraction, network classification, entity resolution to canonical entities, and four-way link construction — all at write time. Their docs state it directly: “Writes are heavier but designed for background ingestion.” The field’s best-performing system does NOT defer structure to retrieval. Mem0 defers. They diverge. The one with heavier write-time structuring posts the SOTA number.
Hindsight independently arrived at write-time structuring, epistemic separation, entity canonicalization — without citing a single library scientist. That’s the strongest possible evidence the principles are real, not borrowed. The field is rediscovering authority control, appraisal-by-type, and structured ingestion under benchmark pressure.
But the gap remains. Hindsight offers an optional controlled vocabulary (their docs describe user-defined concept sets normalized at retain time), but it’s a tuning knob, not the architectural spine. Authority control isn’t the load-bearing primitive: cardinality and temporal semantics don’t flow from it. There is no archival appraisal at write time (no Schellenberg-style value judgment on ingestion). The world/experience/opinion split is a type classification, not a trust hierarchy: a world fact from a tool API has the same standing as one the user explicitly stated. There is no MUSTIE-style principled weeding, and no machinery to make it lawful (see the maintenance path above). Memory banks are per-agent, not per-user-across-agents, so the CDP reframing isn’t present.
What Library Science adds isn’t decoration over work that good engineers already derived. It names the things that are still missing from the best system in the field — and names them as a coherent discipline rather than a list of patches.
Could Hindsight be extended? I think so. It is already two separable components: TEMPR (the retain/recall memory infrastructure) and CARA (the reasoning/belief/personality layer). For the persistent user knowledge problem, TEMPR is the relevant foundation — you would add a vocabulary layer, appraisal, provenance, and weeding to its retain pipeline. CARA addresses the complementary problem of agent reasoning identity and could remain an optional layer for use cases where agent personality matters.
We have been here before. Every technology eventually discovers it needs information architecture. The web did — information architecture became a discipline in the early 2000s because websites built without it were unusable. Enterprise data did — master data management exists because decades of unmanaged data created expensive chaos. AI agents are next. The only question is whether we learn from those cycles or rediscover the same lessons from scratch.
Where to start
This is harder than spinning up a vector store. Ontology development takes time. Building controlled vocabularies requires domain expertise. Schema design requires upfront thought. The payoff is a knowledge layer that is interpretable, maintainable, auditable, and composable — instead of a high-dimensional prayer.
Read Jessica Talisman’s Intentional Arrangement newsletter and her Ontology Pipeline framework. Her Graph Power Hour episode on Library Science for AI systems is a direct on-ramp. Read Kurt Cagle’s The Ontologist newsletter and his work on knowledge graph architecture and ontology-driven agents. Learn the Library Science concepts that transfer directly: archival appraisal, faceted classification, authority control, MUSTIE weeding, the reference interview.
You don’t need a specialized “AI memory database.” You need structured knowledge on top of a reliable data platform — PostgreSQL, MongoDB, whatever you already run. The principles are database-agnostic.
Agents don’t need better recall. They need better librarianship. The oldest information profession has more to teach the newest than either is comfortable admitting. The tools exist. The theory exists. The practitioners exist. What’s missing is the willingness to look outside the disciplines that got us here and learn from the ones that have been organizing human knowledge since before computers existed.
References
Jessica Talisman — Semantic Engineer, Information Architect, knowledge infrastructure strategist. Intentional Arrangement (Substack) · A Library Science Approach to Enterprise AI · Graph Power Hour Ep. 9 · The Ontology Pipeline Refresh
Kurt Cagle — Ontologist, knowledge graph architect, author of The Cagle Report. The Ontologist (Substack) · The Future of Knowledge Graphs · Knowledge Graphs and AIs
Packer et al. (2023), MemGPT: Towards LLMs as Operating Systems — the paper behind Letta’s virtual memory architecture.
Chhikara et al. (2025), Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory — published at ECAI 2025.
Mem0 v3 migration guide — documents the ADD-only architectural shift.
Latimer, Boschi et al. (2025), Hindsight is 20/20: Building Agent Memory that Retains, Recalls, and Reflects — co-authored with Virginia Tech and The Washington Post. GitHub.
Tulving, E. (1972), Episodic and Semantic Memory — in Organization of Memory (pp. 381-403). Academic Press.
Schellenberg, T.R. (1956), The Appraisal of Modern Public Records — National Archives Bulletin 8.
Texas State Library, CREW: A Weeding Manual for Modern Libraries — the source of the MUSTIE framework.
Ranganathan, S.R. (1931), Five Laws of Library Science — the foundation of faceted classification.
Taylor, R.S. (1968), Question-negotiation and information-seeking in libraries — College & Research Libraries, 29(3), 178-194.
IFLA, Functional Requirements for Bibliographic Records (FRBR) — the Work / Expression / Manifestation / Item abstraction.
DCMI, Dublin Core Metadata Basics — common metadata envelope principles.


