Open Knowledge Format (OKF) Glossary

A concise, factual glossary of OKF and adjacent terms. Each entry is self-contained. For the full explainer, see "What is the Open Knowledge Format".

Independent resource, not affiliated with Google.

Open Knowledge Format (OKF)
An open specification published by Google Cloud on 12 June 2026 at version 0.1, for representing the metadata, context, and curated knowledge that AI systems need. It is vendor-neutral and designed to be readable by both humans and agents.
OKF bundle
A directory tree of UTF-8 markdown files that together form a unit of curated knowledge. A bundle requires no database or runtime: it is just a folder you can store in version control, zip, or copy.
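Because a bundle is only a folder of UTF-8 markdown files, reading one needs nothing beyond a file walk. The sketch below is one possible way to do that; the function name load_bundle and the file names in the demo are hypothetical, not part of OKF.

```python
import tempfile
from pathlib import Path


def load_bundle(root):
    """Map each .md file's bundle-relative path to its UTF-8 text."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(root.rglob("*.md"))
    }


# Demo with a throwaway bundle; the file names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    sales = Path(tmp) / "sales"
    sales.mkdir()
    (Path(tmp) / "index.md").write_text("# Bundle index\n", encoding="utf-8")
    (sales / "orders.md").write_text(
        "---\ntype: table\n---\nOrders.\n", encoding="utf-8"
    )
    bundle = load_bundle(tmp)
```

The result is an ordinary dictionary, so the same bundle can be inspected, diffed, or passed to later steps without any runtime of its own.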
Concept document
Any non-reserved .md file in a bundle. It captures a single concept (such as a table, dataset, metric, playbook, runbook, or API) using YAML frontmatter followed by free-form markdown.
Concept
The unit of knowledge an OKF document describes. A concept can be anything you capture and want an agent or person to understand, with no constraint on subject matter.
Frontmatter
The YAML metadata block at the top of every concept document, delimited by triple-dash lines. It holds structured fields such as type, title, and tags that describe the document.
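To illustrate the layout described above, here is a minimal sketch that splits a concept document into its frontmatter and its markdown body. The function name split_frontmatter and the sample document are hypothetical, and the YAML is returned as raw text; a real consumer would pass it to a YAML parser.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (raw_yaml, body) for a document that starts with a '---' line.

    Raises ValueError if the opening or closing delimiter is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with a '---' line")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            raw_yaml = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return raw_yaml, body
    raise ValueError("closing '---' line not found")


# Hypothetical concept document, for illustration only.
SAMPLE = """---
type: table
title: Orders table
tags: [sales, orders]
---
One row per customer order.
"""

raw_yaml, body = split_frontmatter(SAMPLE)
```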
type field
The single required frontmatter field: a non-empty short string identifying the kind of concept. Type values are not registered centrally, and consumers must tolerate unknown types gracefully.
title
An optional but recommended frontmatter field giving the human-readable name of a concept, for example "Orders table".
description
An optional but recommended frontmatter field providing a short, one- or two-line summary of what the concept is.
resource (resource URI)
An optional but recommended frontmatter field containing a URI that identifies the underlying asset a concept describes, such as a database table or dataset.
tags
An optional but recommended frontmatter field holding a YAML list of labels used to group, filter, or search concepts within a bundle.
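Filtering by tag can be sketched as follows, assuming each concept's frontmatter has already been parsed into a dictionary. The helper name with_tag and the sample concepts are hypothetical. Because tags is optional, the sketch treats a missing or empty value as "no tags" rather than an error.

```python
def with_tag(concepts, tag):
    """Return the concepts whose 'tags' list contains the given tag."""
    # 'or []' covers both a missing key and an explicit empty/null value.
    return [c for c in concepts if tag in (c.get("tags") or [])]


# Hypothetical parsed frontmatter, for illustration only.
CONCEPTS = [
    {"type": "table", "title": "Orders table", "tags": ["sales", "orders"]},
    {"type": "metric", "title": "Refund rate"},           # no tags field
    {"type": "table", "title": "Customers", "tags": None},  # empty tags field
]

sales = with_tag(CONCEPTS, "sales")
```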
timestamp
An optional but recommended frontmatter field holding an ISO 8601 datetime, typically recording when a concept was captured or last updated.
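As a rough sketch, a timestamp value can be read with Python's standard library. The helper name parse_timestamp and the sample values are hypothetical, and datetime.fromisoformat accepts only a subset of ISO 8601 on older Python versions, so treat this as a starting point rather than a complete parser.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 datetime string such as '2026-06-12T09:30:00Z'."""
    # Python versions before 3.11 reject a trailing 'Z', so normalise it
    # to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


captured = parse_timestamp("2026-06-12T09:30:00Z")
```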
Reserved file
A file whose name has special meaning in OKF: index.md or log.md. Reserved files must follow their expected structures; all other .md files are treated as concept documents.
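The distinction between reserved files and concept documents can be sketched as a small classifier. The function name classify and the example paths are hypothetical, and matching the reserved names exactly in lowercase is an assumption of this sketch.

```python
from pathlib import PurePosixPath

# Assumption: reserved names are matched exactly, in lowercase.
RESERVED_NAMES = {"index.md", "log.md"}


def classify(path: str) -> str:
    """Return 'reserved', 'concept', or 'other' for a path inside a bundle."""
    p = PurePosixPath(path)
    if p.name in RESERVED_NAMES:
        return "reserved"
    if p.suffix == ".md":
        return "concept"
    return "other"
```

For example, classify("sales/index.md") gives "reserved", while classify("sales/orders.md") gives "concept".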
index.md
A reserved file that provides a directory listing and supports progressive disclosure, letting a reader see what a directory contains and drill in selectively rather than loading everything at once.
log.md
A reserved file that records the update history of a bundle or directory, giving a human-curated, narrative record of what changed and when.
LLM-wiki
The informal pattern of writing human-authored notes and context for a large language model to read as grounding. OKF formalises this pattern into a portable, interoperable format.
Agent-readable knowledge
Curated information structured so that AI agents can parse and use it without bespoke SDKs. In OKF this is achieved through predictable YAML frontmatter plus plain markdown.
Progressive disclosure
The practice of revealing knowledge in layers so a reader loads only what is relevant. In OKF, index.md files enable this by listing available concepts before any are read in full.
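The idea can be sketched as reading the index first and opening only the concepts that look relevant. This glossary does not describe how index.md is laid out, so the sketch assumes, purely for illustration, that it lists concepts as markdown links. The names list_concepts and select, and the sample index, are hypothetical.

```python
import re

# Assumption: index.md lists concepts as markdown links, e.g. [Title](file.md).
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+\.md)\)")


def list_concepts(index_text: str) -> list[tuple[str, str]]:
    """Return (title, path) pairs for every .md link found in an index."""
    return LINK.findall(index_text)


def select(index_text: str, keyword: str) -> list[str]:
    """Return only the paths whose link title mentions the keyword."""
    keyword = keyword.lower()
    return [
        path
        for title, path in list_concepts(index_text)
        if keyword in title.lower()
    ]


# Hypothetical index content, for illustration only.
INDEX = """# Sales

- [Orders table](orders.md)
- [Refund rate metric](refund-rate.md)
"""

to_load = select(INDEX, "refund")
```

Only the files named in to_load would then be read in full, which keeps the rest of the directory out of the reader's working set.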
Vendor-neutral
A property of OKF meaning it depends on no single vendor’s tools: no schema registry, no central authority, no required SDK or runtime, and no proprietary account.
Conformance
The set of conditions that make a bundle valid: every non-reserved .md file has parseable frontmatter with a non-empty type, and reserved files follow their structures. Consumers, for their part, must accept broken links, missing optional fields, and unknown types.
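The bundle-side condition on concept documents can be sketched as a simple check. The names read_type and check_bundle and the sample files are hypothetical. As a deliberate simplification, the sketch scans for a top-level type: line instead of parsing YAML, and it does not validate reserved files.

```python
from pathlib import PurePosixPath

RESERVED_NAMES = {"index.md", "log.md"}


def read_type(text: str):
    """Return the frontmatter 'type' value, or None if absent or empty.

    Simplification: looks for a top-level 'type:' line rather than
    parsing the YAML block properly.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    found = None
    for line in lines[1:]:
        if line.strip() == "---":
            return found
        if line.startswith("type:"):
            found = line[len("type:"):].strip().strip("'\"") or None
    return None  # closing delimiter never found


def check_bundle(files: dict[str, str]) -> list[str]:
    """Return paths of concept documents that lack a non-empty type."""
    failures = []
    for path, text in sorted(files.items()):
        p = PurePosixPath(path)
        if p.suffix != ".md" or p.name in RESERVED_NAMES:
            continue  # reserved and non-markdown files are not checked here
        if read_type(text) is None:
            failures.append(path)
    return failures


# Hypothetical bundle contents, for illustration only.
BUNDLE = {
    "index.md": "# Index\n",
    "orders.md": "---\ntype: table\n---\nBody\n",
    "broken.md": "---\ntitle: No type\n---\n",
    "odd.md": "---\ntype: some-unknown-kind\n---\n",
}

failures = check_bundle(BUNDLE)
```

Note that odd.md passes: the check requires only that type is present and non-empty, and does not compare it against any list of known values.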
RAG (retrieval-augmented generation)
A runtime technique that retrieves relevant information and supplies it to a language model at query time. OKF is complementary: a way to author and store curated knowledge that a RAG system can retrieve from, not a retrieval engine itself.
llms.txt
A convention for signposting a website’s important content to AI systems via a single file. It operates at site level, whereas OKF is a directory-structured format for typed concept knowledge.
Knowledge catalog
In the OKF context, the Google Cloud repository where the specification and reference implementations are published. More broadly, an organised inventory of an organisation’s data assets and their context.
BigQuery enrichment agent
A Google reference implementation that walks a BigQuery dataset, drafts an OKF concept document per table or view, then runs a second LLM pass crawling authoritative documentation to enrich each concept.
Static HTML visualiser
A Google reference implementation that turns an OKF bundle into an interactive graph in the browser, letting people explore concepts and their relationships without specialist software.
Vector database
A store of embeddings used for similarity search, common in retrieval pipelines. It is an implementation detail rather than a content format; an OKF bundle can be embedded into one, but OKF itself remains human-readable plain text.
Grounding
Supplying a language model with factual, curated context so its responses are anchored to real source knowledge rather than relying on its parameters alone. OKF exists to make this grounding portable and reviewable.
Context window
The amount of text a language model can consider at once. Because context windows are finite, OKF’s progressive disclosure via index.md helps keep only relevant concepts loaded.
Markdown
A lightweight plain-text formatting syntax. OKF uses UTF-8 markdown for concept bodies, which keeps bundles human-readable and diffable in version control.
YAML
A human-readable data-serialisation syntax used for OKF frontmatter. It encodes structured fields such as type, tags, and timestamp at the top of each concept document.