governance whitepaper · version 1.0 · 2026

The Aletheia Protocol:
Governance Whitepaper

A decentralised architecture for human meaning-making, semantic attribution, and the governed preservation of collective meaning.

kibela network · kibela.ai · for academic review and institutional partnership

abstract

Every correction a human makes to an AI system is an act of genuine sense-making. It takes seconds. It costs nothing. And it currently disappears without trace into a model owned by someone else, contributing to a system that grows more valuable for having received it while the person who provided it receives nothing in return.

This is not an unfortunate byproduct of how AI systems are built. It is the intended architecture. Current AI infrastructure is designed on seven implicit assumptions that, taken together, constitute what this paper calls the extractive paradigm: the treatment of human semantic contribution as a free input to be absorbed, anonymised, homogenised, and monetised by whoever owns the model, with no attribution, sovereignty, persistence, or economic return to the contributors who built its intelligence.

Kibela is built on three foundational claims. First: meaning requires a subject. Second: individual meaning-making is irreducibly singular and cannot be averaged without loss. Third: the capacity for sense-making must be exercised to exist. The Aletheia Protocol defines valid meaning as a semantic contribution satisfying five conditions: genuine subjecthood (S), explanatory power (E), contextual coherence (C), temporal priority (T), and collective connection (K). The validity condition: V(WSU) = S · E · C^α · K^β · T.

1 · The Problem: What Current Models Implicitly Assume

Current AI infrastructure hides an ideology. Seven design assumptions, each individually defensible, combine into a system that is structurally extractive. Understanding them requires examining not what AI systems do wrong but what they are designed to do, and what that design inevitably produces.

1.1 The statistical approximation assumption

The foundational design principle of current large language models is that meaning is recoverable from statistical distributions over large corpora of text. This is the distributional hypothesis, first formalised by Zellig Harris in 1954. The implicit claim: scale solves the problem. More data, more parameters, more computation produces better approximation. The structural error: this is wrong not as a matter of degree but of kind. Better approximation of the surface of meaning is not a path to meaning itself.

The model has no mechanism for grounding meaning in genuine present understanding. It cannot distinguish between what a word statistically co-occurs with and what it means to a specific person in a specific context. It has no inside. It has encountered the word “risk” approximately twelve billion times. It has never been at risk of anything. Bender et al. (2021) formalise this precisely: language models learn to produce text statistically indistinguishable from text produced by beings who have meaning. The distinction is invisible in the output and absolute in the nature of the process.

1.2 The stateless query assumption — semantic decay by design

Each query is treated as independent. No persistent memory of corrections made, no accumulation of domain-specific understanding, no record of what was resolved last session. The semantic labour of correction is permanently ephemeral. You correct the machine today. It is better today. Tomorrow it returns to the same approximation.

This produces three forms of decay. Drift: the AI’s statistical approximation shifts over model updates, pulling away from grounded interpretations previously established by domain experts. Dilution: specific domain meanings are averaged with generic usage across populations until precision is irretrievably lost. Erasure: meanings disappear entirely from the system’s resolution when the contributor is not present to re-establish them.

Semantic decay

The process by which a meaning loses its precision, its specificity, and its connection to the human understanding that generated it, through the combined effects of drift, dilution, and erasure. A structural feature of stateless AI architectures, not an occasional failure mode.

1.3 The centralised ownership assumption — extraction without attribution

All corrections, all feedback, and all human semantic labour that improves the model’s outputs is absorbed into weights owned by the organisation running the model. The economic consequence: value generated by human correction compounds in model quality and corporate valuation while contributors receive nothing. The epistemic consequence: the provenance of understanding is permanently erased.

Varoufakis (2023) frames this as technofeudalism: platforms own the cloud capital through which all value must pass and charge rent for access. Your correction does not just generate value for the platform. It constitutes the platform’s capital. You are not selling your labour to a capitalist who owns the means of production. You are building the means of production. For free. Without knowing it.

Semantic extraction

The absorption of human meaning-making into AI systems without provenance, attribution, or value return. Three distinct harms: economic (uncompensated labour), epistemic (meanings lose their authors and become irresponsible), and civilisational (systematic suppression of the human capacity for sense-making).

1.4 The homogenisation assumption — precision lost at scale

Models are trained to be helpful on average: to produce outputs that satisfy the broadest range of users. The implicit claim: a response that works for most people in most contexts is the right response. What it systematically erases: the domain-specific, role-specific, culturally specific meanings that constitute the most valuable knowledge in any community. The CFO’s “risk” averaged with the teenager’s “risky neighbourhood.” Domain-specific precision erased by scale.

Page’s diversity prediction theorem (2007) shows that collective error equals average individual error minus prediction diversity: the more the individual predictions differ, the more accurate the collective. A homogenised system loses the diversity term and cannot recover it from scale alone. Zollman (2010) shows formally that epistemic monoculture fails catastrophically when its shared assumption turns out to be wrong. When all AI systems converge on the same semantic approximations, a single miscalibration propagates across all deployments simultaneously with no diversity to catch or correct it.
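The identity behind the theorem can be checked numerically. The following is a minimal sketch in Python using squared error; the function name and the numbers are illustrative assumptions, not data from this paper. It shows that a crowd whose predictions differ can beat its average member, while a homogenised crowd cannot.

```python
from statistics import fmean


def diversity_decomposition(predictions, truth):
    """Return (collective_error, average_individual_error, diversity).

    All three terms are squared errors, so that
    collective_error == average_individual_error - diversity.
    """
    crowd = fmean(predictions)
    collective_error = (crowd - truth) ** 2
    average_error = fmean([(p - truth) ** 2 for p in predictions])
    diversity = fmean([(p - crowd) ** 2 for p in predictions])
    return collective_error, average_error, diversity


# Illustrative numbers only: four estimates of a quantity whose true value is 10.
diverse = diversity_decomposition([8.0, 12.0, 9.0, 13.0], truth=10.0)
homogenised = diversity_decomposition([12.0, 12.0, 12.0, 12.0], truth=10.0)

for name, (collective, average, diversity) in (
    ("diverse", diverse),
    ("homogenised", homogenised),
):
    print(f"{name}: collective={collective:.2f} "
          f"average={average:.2f} diversity={diversity:.2f}")
```

In the diverse case the collective error (0.25) is far below the average individual error (4.50) because the diversity term (4.25) absorbs the difference. In the homogenised case the diversity term is zero and the collective is exactly as wrong as each of its members.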

1.5 The confidence without calibration assumption

Current models produce responses with uniform apparent confidence regardless of actual epistemic state. A response grounded in rich accurate domain data and a response confabulated from weak statistical signal arrive with the same surface fluency. Guo et al. (2017) demonstrate that neural network confidence is systematically poorly calibrated. The system cannot signal the difference between a domain where it has rich accurate grounding and one where it is confabulating. The specific harm in high-stakes domains: legal, medical, compliance, and financial decisions made on the basis of confidently wrong AI outputs whose confidence was a formatting convention, not an epistemic signal.

1.6 The anonymisation-as-privacy assumption

Platforms treat the removal of identifying information as sufficient privacy protection. Two problems. First: de-identification is substantially less effective than claimed — semantic contribution patterns can be re-identifying even without personal data attached, especially in small communities or specialised domains (Narayanan and Felten, 2014). Second: the question is not only identification but sovereignty. Fricker (2007) names the epistemic injustice: the testimonial injustice committed when a contributor’s understanding is absorbed without recognition is not primarily a violation of privacy. It is a violation of their standing as a knower — a denial of their status as the author of genuine understanding.

1.7 The economic invisibility of semantic labour

No market price for a correction. No mechanism for attributing the marginal value of a human insight to its contributor. No record of which corrections improved which outputs by how much. Economic invisibility is not an oversight. It is a structural feature maintained because measuring it would require acknowledging it, and acknowledging it would require either compensating it or defending the decision not to.

Semantic labour

The cognitive work of establishing, revising, contextualising, and communicating genuine meaning. Three distinguishing properties: constitutively contextual (its product is inseparable from the context of its production), irreversibly donated under the current architecture (it cannot be retrieved once absorbed), and compounding without proportional return (each act of semantic labour increases the value of a system that returns nothing to its contributor).

Arrieta-Ibarra et al. (2018) establish the economic case for attributing value to human contributions to AI systems. Posner and Weyl (2018) provide the mechanisms. The market opportunity is the delta between what semantic labour currently costs contributors — their time, their expertise, their cognitive effort — and what it currently returns to them. The delta is approximately the entire value of the AI industry.

1.8 The extractive paradigm — a structural summary

Extractive paradigm

The design philosophy, implicit in current AI infrastructure, that treats human semantic contribution as a free input to be absorbed, anonymised, homogenised, and monetised by whoever owns the model, with no attribution, sovereignty, persistence, or economic return to the contributors. The seven assumptions above are not incidental failures or correctable bugs. They are the coherent expression of a design philosophy optimised for corporate scale at the expense of human value.

2 · The Three Foundational Claims

2.1 Categorical difference

Machine output and human meaning differ in kind, not degree. A machine produces statistically coherent outputs correlated with inputs. It does not understand its outputs. It does not intend them. It does not care whether they are true, useful, or just. These are not missing features awaiting implementation. They are properties that require a subject — a being for whom the world shows up as mattering.

The specific ideological threat this claim addresses is not the claim that machines are conscious. The threat is subtler: the claim that since machines can produce intelligent-seeming outputs, human interiority is economically and culturally redundant. This is wrong in a way that matters enormously for how we build infrastructure. Every goal the machine pursues traces back, somewhere in the chain, to a being who cared. Remove the caring subject and you have not transcended the need for meaning. You have hidden its origin, making it invisible, unaccountable, and irrevocable by the people whose caring it was built on.

Categorical difference

The foundational claim that machine output and human meaning differ in kind, not degree, because meaning requires a subject for whom the distinction between surface approximation and genuine understanding is real and matters. No increase in model scale converts statistical approximation into meaning.

2.2 Irreducible singularity

Every act of meaning arises from a particular consciousness with a particular history in a particular context. Averaging across such acts produces something categorically different from and less valuable than the acts themselves. This is not a claim that individual meanings are always right. It is a claim about what is lost in averaging: the specificity that makes the meaning useful for the specific person in the specific context who needs it.

Attribution, in this framework, is not an economic nicety or a matter of credit. It is constitutively necessary. A meaning without an author is irresponsible by design: it cannot be questioned at source, cannot be corrected by the person whose understanding generated it, cannot be held accountable for the consequences of its application. Agency, authorship, and accountability are constitutive of ethical semantic life, not optional additions to it.

Irreducible singularity

The claim that the individual act of meaning-making is the irreplaceable generative source of all collective semantic value, and that attribution is constitutively necessary because a meaning stripped of its author is not just uncompensated but irresponsible: unquestionable, unaccountable, and uncorrectable at source.

2.3 Essential faculty

The capacity for genuine interpretation, contextual understanding, and meaning-making is a faculty that develops through exercise and atrophies through neglect. Infrastructure that systematically replaces its exercise causes demonstrable harm at individual, community, and civilisational level. Kibela is not a conservative technology. It does not argue for less AI. It argues for a specific relationship between human and machine: one in which the machine extends and honours the exercise of human sense-making rather than replacing it.

Essential faculty

The claim that sense-making is constitutive of human flourishing rather than instrumental to it, and that infrastructure which systematically replaces its exercise causes demonstrable harm. The human is not in the loop as a safety mechanism. The human is the source.

3 · Values: Eight Architectural Commitments

Not aspirations. Architectural commitments baked into the protocol before any product decision is made. Each is a direct consequence of the three foundational claims applied to a specific design choice. Each names what the extractive paradigm fails to do and what the Kibela architecture commits to doing instead.

Reduce.

What has already been understood should not have to be understood again. Resolved meaning is reused. Token cost falls as map density grows.

Stabilise.

Meaning drifts. We hold it. The decay function requires continued human engagement. Abandonment is as costly as falsification.

Attribute.

A meaning without an author is irresponsible by design. Every contribution is timestamped, anchored, and provably yours. Permanently.

Collectivise.

Individual meaning is the source. The collective map is what it becomes. Not an average. An emergence.

Protect.

The raw content of your thinking never leaves your control. Enforced by cryptographic architecture, not by a privacy policy.

Redistribute.

When the map earns, the people who built it earn. The technical consequence of an architecture that takes authorship seriously.

Include.

Humans are not in the loop. Humans are the source. The loop exists for the human, not the reverse.

Buffer.

A layer between human meaning and machine execution, governed by humans. The alternative to epistemic monoculture.

4 · What Meaning Is: A Formal Definition

4.1 Meaning as event, not property

Meaning is not stored in words, encoded in training data, or recoverable from statistical distributions over text. It is an event — it occurs when a conscious subject, situated in a specific context with a specific history of engagement, makes genuine contact with a concept and asserts its significance for that role, at that moment.

Meaning event

An act of semantic assertion arising from genuine engagement with a concept, carrying the weight of the contributor’s history of engagement with it, committed before outcomes are observable, and capable of resonating independently in others navigating the same domain. Distinct from a machine output in three essential ways: it has a subject who was present, a history that grounds it, and a stake in whether it is right.

4.2 Three conditions of genuine meaning

A meaning is genuine when three conditions are satisfied simultaneously. It comes from somewhere — a demonstrable history of real engagement behind it, the expressed product of a contributor who has been working in this domain over time, whose understanding has been built, tested, and revised through actual contact with the domain’s problems. It is committed before the answer is known — asserted before outcomes are observable, staking the contributor’s understanding on a position before the world confirms or denies it. The willingness to commit is itself the mark of genuine understanding rather than strategic positioning. It lands in others — it resonates with independent contributors working from different contexts toward the same understanding. The collective uptake is not what makes the meaning true but is evidence it was grounded in something real rather than idiosyncratic.

4.3 Five measurable parameters

The three conditions of genuine meaning translate into five quantifiable parameters used throughout the protocol.

S — Genuine Subjecthood: the temporal and contextual signatures of real human engagement over real time. Temporal irregularity consistent with actual work cycles, contextual anchoring in specific real-world tasks, revision history tracking domain evolution, language reflecting the domain’s current living vocabulary.

E — Explanatory Power: the semantic sufficiency of the contribution to ground a resolvable meaning distinction. A contribution that cannot change how the system resolves future queries has no explanatory power.

C — Contextual Coherence: consistency with the contributor’s demonstrated domain history and semantic trajectory. The contribution fits the pattern of genuine understanding developing over time.

T — Temporal Priority: cryptographic proof that the contribution was committed before any outcome was observable.

K — Collective Connection: logarithmically saturated independent activation by contributors with non-overlapping coherence histories. Why each is necessary and none is sufficient alone: each addresses a different attack surface and a different form of meaning loss.
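The text does not fix a formula for logarithmic saturation. The following is a minimal sketch of one possible reading of K, assuming a hypothetical saturation point at which the score reaches 1 and assuming the caller has already filtered activations down to independent contributors. The function name, the default of 100, and the normalisation to [0, 1] are all assumptions made for illustration.

```python
import math


def collective_connection(independent_activations: int,
                          saturation_point: int = 100) -> float:
    """Log-saturated K score in [0, 1] (hypothetical formula).

    Each additional independent activation adds less than the one before,
    so sheer volume cannot inflate the score without bound.
    """
    if independent_activations < 0:
        raise ValueError("activation count must be non-negative")
    if saturation_point < 1:
        raise ValueError("saturation point must be at least 1")
    score = math.log1p(independent_activations) / math.log1p(saturation_point)
    return min(1.0, score)


# The first ten activations move the score more than the next ten do.
first_ten = collective_connection(10) - collective_connection(0)
next_ten = collective_connection(20) - collective_connection(10)
print(first_ten > next_ten)
```

The design point is the shape of the curve rather than the constants: early independent confirmation matters a great deal, and a flood of later activations matters very little.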

4.4 The singular and the collective

The irreducible singularity of individual meaning-making is the generative source. The collective sense-making map is the network effect — something that transcends the sum of its parts and is irreducible to any individual contribution. The collective map is not an average of individual meanings. It is an emergence: individual singularity is the source; collective intelligence is what it becomes when those singularities are preserved and connected.

Collective sense-making map

An emergent structure built from attributed individual meaning events, whose value is irreducible to any single contribution, impossible to commodify as a static asset because its value resides entirely in its living connection to ongoing human sense-making, and which decays without continued human engagement.

5 · Memory Architecture

5.1 Working memory

Session-scoped, high plasticity, resolves meaning for this user in this role in this context right now. Ephemeral by design — the same person means different things in different contexts, and a system that cannot distinguish session context from long-term pattern produces systematically wrong resolutions. Raw content not retained.

Working memory layer

Session-scoped semantic resolution that integrates the user’s current context with signals from personal and collective layers, weighted by signal confidence for this specific query. Ephemeral: raw content and individual corrections in identifiable form are not retained.

5.2 Personal semantic memory

Role-bound, private by default, accumulated over sessions, encoding the contributor’s stable interpretive patterns per domain. Encrypted with keys that never leave the contributor’s control. Never shared in raw form. The basis of the contributor’s coherence score (C parameter) and the source of their provenance chain. What propagates to the collective layer is not the content of the personal memory but an anonymised semantic signal derived from it — the fingerprint of understanding, stripped of identifying content.

Personal semantic memory

Role-bound, private-by-default accumulated record of the contributor’s stable interpretive patterns. The basis of the C parameter and provenance weight. Sovereign: encrypted with keys that never leave the contributor’s control.

5.3 Collective sense-making maps

Domain-scoped, community-owned, built from anonymised and attributed contributions that have passed Aletheia protocol verification. Temporally alive: require continued human engagement to maintain meaning weights. Commercially licensable when domain density is sufficient. High stability in established zones. Sensitive to domain evolution. Resistant to noise through logarithmic saturation and coherence weighting. Preserving contested zones as first-class data rather than resolving them into false consensus.

Why the map cannot be a static database: its value is entirely in its living connection to ongoing human sense-making. Extract it, package it, sell it as a finished product, and you are selling something already dying. The moment genuine human contribution stops feeding it, the map begins its decay toward obsolescence.

5.4 Flow between layers

Working → Personal: automatic trace, weighted by signal strength. Every session leaves a record in the personal layer when patterns are sufficiently consistent to qualify as stable understanding.

Personal → Collective: gated by the Aletheia validity condition. Only contributions that satisfy the S, E, C, T, and K thresholds propagate.

Collective → Working: top-down grounding at session start. The system pre-loads resolved meanings for the user’s role and domain from the collective map. This is where token cost reduction occurs.

Why the flow is asymmetric: individual meaning generates collective intelligence, but collective intelligence does not override individual meaning. The personal layer takes precedence in familiar contexts.

5.5 Adaptive retrieval — the weighted combination

All three layers contribute simultaneously to meaning resolution, weighted by signal confidence for the specific query context. A naive cascade — checking working first, then personal, then collective — ignores the simultaneous relevance of all three layers. The correct model is a weighted combination where each layer’s weight reflects how much the system knows and trusts its signal for this specific user, role, domain, and moment.

Adaptive weighting

The dynamic adjustment of working memory (α), personal memory (β), and collective map (γ) contributions to meaning resolution, where α + β + γ = 1. Each weight reflects the confidence of the respective layer’s signal for the current query context. These retrieval weights are distinct from the exponents α and β in the validity condition V(WSU).
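The weighted combination can be sketched as follows. This is a minimal illustration, assuming each layer reports a non-negative confidence and a resolution vector of equal length; the function names and the choice to normalise confidences directly into weights are assumptions, not part of the protocol definition.

```python
from typing import List, Sequence, Tuple


def adaptive_weights(working_conf: float,
                     personal_conf: float,
                     collective_conf: float) -> Tuple[float, float, float]:
    """Normalise per-layer confidences into (alpha, beta, gamma) summing to 1."""
    confidences = (working_conf, personal_conf, collective_conf)
    if any(c < 0 for c in confidences):
        raise ValueError("confidences must be non-negative")
    total = sum(confidences)
    if total == 0:
        raise ValueError("at least one layer must carry a signal")
    return (working_conf / total, personal_conf / total, collective_conf / total)


def resolve(working: Sequence[float],
            personal: Sequence[float],
            collective: Sequence[float],
            weights: Tuple[float, float, float]) -> List[float]:
    """Combine the three layers' vectors at once instead of cascading."""
    if not (len(working) == len(personal) == len(collective)):
        raise ValueError("layer vectors must have the same length")
    alpha, beta, gamma = weights
    return [alpha * w + beta * p + gamma * c
            for w, p, c in zip(working, personal, collective)]


# A familiar context: the personal layer carries the strongest signal.
weights = adaptive_weights(1.0, 3.0, 1.0)
print(weights)
print(resolve([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], weights))
```

Unlike a cascade, no layer is consulted only when the previous one fails: a weak working-memory signal still contributes, in proportion to its confidence.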

6 · The Weighted Semantic Unit

6.1 Definition

Weighted Semantic Unit (WSU)

The atomic primitive of the collective sense-making map. A structured record representing a domain-specific, role-sensitive meaning that has been produced by genuine human sense-making, verified by the Aletheia protocol, attributed to its contributors, and weighted by the Confidence Level derived from its verification. Not a dictionary entry, not a raw embedding, not a stored preference. Modality-agnostic: the same structure applies to linguistic, visual, and audio semantics.

6.2 The four structural components

Canonical interpretation — the primary resolved meaning for this domain and role context, expressed as natural language plus an embedding vector. The natural language statement is what humans and enterprises read. The vector is what the system queries.

Validated variants — alternative interpretations that are meaningfully distinct from the canonical but legitimate. Attributed to their contributors, weighted by validation. These represent genuine role or subculture differences within a domain.

Contested zones — interpretations in genuine expert disagreement. Not errors to be resolved — first-class semantic data. Represented as a structured debate: each position attributed and weighted, the specific axis of disagreement identified, the contexts in which each applies specified.

Provenance chain — the complete attributed history of all contributions to this WSU: who contributed what, when, in what context, with what validation, activation count, and current weight. The economic layer’s ledger.
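The four components suggest a record shape along the following lines. This is a sketch only: every class and field name below is a hypothetical choice made for illustration, and the protocol does not prescribe a serialisation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interpretation:
    """A resolved meaning: readable statement plus queryable vector."""
    statement: str
    embedding: List[float]
    contributor_id: str
    weight: float = 0.0


@dataclass
class ContestedZone:
    """A structured disagreement kept as data rather than resolved away."""
    axis_of_disagreement: str
    positions: List[Interpretation]
    applicable_contexts: List[str] = field(default_factory=list)


@dataclass
class ProvenanceEntry:
    """One attributed contribution in the unit's history."""
    contributor_id: str
    committed_at: str
    context: str
    validation: str
    activation_count: int = 0
    current_weight: float = 0.0


@dataclass
class WeightedSemanticUnit:
    domain: str
    role: str
    canonical: Interpretation
    validated_variants: List[Interpretation] = field(default_factory=list)
    contested_zones: List[ContestedZone] = field(default_factory=list)
    provenance_chain: List[ProvenanceEntry] = field(default_factory=list)


# Hypothetical example values.
wsu = WeightedSemanticUnit(
    domain="finance",
    role="cfo",
    canonical=Interpretation(
        statement="Material risk: exposure large enough to change a reporting decision.",
        embedding=[0.12, 0.80, 0.05],
        contributor_id="contributor-001",
        weight=0.9,
    ),
)
print(wsu.canonical.statement, len(wsu.provenance_chain))
```

Keeping contested zones and provenance as fields of the unit itself, rather than as external logs, mirrors the text's insistence that disagreement and authorship are part of the meaning, not metadata about it.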

6.3 WSU Confidence Level — eight factors

The single composite score that summarises a WSU’s epistemic trustworthiness. Drives adaptive retrieval state decisions, licensing eligibility, and provenance distribution calculations. Composed of eight factors:

Coherence weight: the C parameter of the primary contributor(s), covering demonstrated domain history and trajectory consistency.

Contextual relevance: how closely the WSU’s domain and role context matches the current query context.

Authorship verification: the S parameter, the degree to which contributions have passed genuine subjecthood verification.

Meritocratic standing: the verified expertise level of contributing authors in this domain.

Domain expertise depth: the contributor’s demonstrated accuracy and consistency specifically in this domain.

Collective sense-making density: the K parameter, logarithmically saturated independent activations.

Temporal stability: how long the WSU has maintained its weight across multiple epochs without significant challenge.

Contestation history: whether the WSU has been challenged and how the challenge resolved. A WSU that survived scrutiny carries stronger confidence than one never questioned.

6.4 Contested zones as first-class data

The most epistemically valuable regions of the map are the places where genuine experts hold genuinely different interpretations. These are not errors to be resolved — they are semantic information about contested territory. The contested zone trigger combines statistical variance detection with embedded logic judgements, comparing contributions against domain baselines to distinguish linguistic ambivalence (registered automatically), interpretive variance (flagged for community assessment), and genuine contested zones (elevated with full structured disagreement record).

Contested zone

A region of the map where two or more contributors with independent high-confidence levels hold interpretations exceeding the individuation threshold between them, persisting across more than one operational epoch. Not a sign of map failure — the map’s most epistemically honest territory.

6.5 Semantic individuation

The two-level criterion: embedding similarity at the surface level suggests the same unit; role-context divergence overrides. Two contributions with similar embeddings but significantly different role contexts are treated as variants rather than confirmations — because the CFO’s understanding of “material risk” and the engineer’s understanding are meaningfully distinct even if their surface descriptions are close. Role-context is the primary individuation signal; embedding distance is secondary. The inverse of how current AI systems handle it, which collapse role-context differences into noise.

Semantic individuation

The determination of whether a new contribution confirms an existing WSU, adds a validated variant, creates a new WSU, or opens a contested zone. Primary signal: role-context divergence. Secondary signal: embedding distance.
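The two-level criterion can be sketched as a small decision function. The text fixes the ordering of the signals (role context first, embedding distance second) and the variant rule, but not the thresholds or the full branch table, so the threshold values and the handling of the remaining branches below are assumptions. The sketch also assumes the comparison is already scoped to one concept within one domain.

```python
from enum import Enum


class Outcome(Enum):
    CONFIRMATION = "confirmation"
    VALIDATED_VARIANT = "validated_variant"
    NEW_WSU = "new_wsu"
    CONTESTED_ZONE = "contested_zone"


def individuate(embedding_distance: float,
                role_divergence: float,
                *,
                both_high_confidence: bool = False,
                epochs_persisted: int = 0,
                distance_threshold: float = 0.25,   # hypothetical value
                role_threshold: float = 0.5) -> Outcome:  # hypothetical value
    """Classify a new contribution against an existing WSU."""
    role_differs = role_divergence >= role_threshold
    close = embedding_distance < distance_threshold

    # Primary signal: role context. Similar wording from a different role
    # is a variant, never a confirmation.
    if role_differs:
        return Outcome.VALIDATED_VARIANT if close else Outcome.NEW_WSU

    # Secondary signal: embedding distance within the same role context.
    if close:
        return Outcome.CONFIRMATION

    # Same role, divergent interpretation: contested only when both sides are
    # high-confidence and the split has lasted more than one epoch.
    if both_high_confidence and epochs_persisted > 1:
        return Outcome.CONTESTED_ZONE
    return Outcome.NEW_WSU


# The CFO and the engineer describe "material risk" in similar words.
print(individuate(embedding_distance=0.10, role_divergence=0.9))
```

Checking role divergence before embedding distance is what prevents the CFO's and the engineer's contributions from being merged into one confirmation.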

6.6 Meaning weight W(t)

The meaning weight of a WSU is not static. It grows as the unit is activated and validated; it decays when not reinforced. The decay is the protocol’s primary safeguard against the map calcifying into historical record.

W(t) = V(WSU) · e^(−λt) · R(t)

Where λ is the domain decay rate and R(t) is the reinforcement function that resets the decay clock when the WSU is validated or activated. A meaning actively used maintains its weight indefinitely. A meaning no longer activated decays toward zero: deprioritised in retrieval, flagged for review, eventually eligible for retirement.

Domain decay rate (λ)

The governance parameter specifying how quickly a WSU loses meaning weight without reinforcement. High for fast-moving regulatory domains where ground truths update frequently. Low for foundational domains where established meanings are slow to change. Set at domain creation, adjustable by governance.
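The meaning weight formula leaves the exact form of R(t) open. The following is a minimal sketch under one possible reading, in which R(t) multiplicatively cancels the decay accumulated up to the most recent reinforcement, so the product behaves as if the decay clock had been reset at that moment. The function names and this choice of R(t) are assumptions.

```python
import math
from typing import Sequence


def reinforcement(t: float, reinforcement_times: Sequence[float],
                  decay_rate: float) -> float:
    """R(t): cancels decay accrued before the latest reinforcement at or before t."""
    past = [r for r in reinforcement_times if r <= t]
    last = max(past, default=0.0)
    return math.exp(decay_rate * last)


def meaning_weight(validity: float, t: float, decay_rate: float,
                   reinforcement_times: Sequence[float] = ()) -> float:
    """W(t) = V * exp(-lambda * t) * R(t)."""
    return (validity
            * math.exp(-decay_rate * t)
            * reinforcement(t, reinforcement_times, decay_rate))


V = 0.8          # validity score, illustrative
LAMBDA = 0.1     # domain decay rate per epoch, illustrative

print(meaning_weight(V, 10.0, LAMBDA))               # never reinforced
print(meaning_weight(V, 10.0, LAMBDA, [10.0]))       # reinforced just now
print(meaning_weight(V, 12.0, LAMBDA, [4.0, 10.0]))  # two epochs since last use
```

Under this reading a unit reinforced at every epoch holds its full validity weight indefinitely, while an abandoned unit decays exponentially at the domain rate, which matches the behaviour described in the text.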

6.7 Minimum contribution threshold — explanatory power

The floor below which a contribution cannot generate a WSU candidate or accumulate provenance credit. Determined not by length but by explanatory power: a contribution must describe a distinction, relationship, or interpretation that the system can use to resolve future queries differently from the absence of that contribution. A single word that specifies a domain boundary may satisfy the threshold. A paragraph of generic domain-adjacent language may not. The system assesses explanatory power algorithmically and can request elaboration when a contribution is semantically insufficient.

6.8 WSU retirement

WSU retirement

Removal from active retrieval without deletion of provenance history. A WSU is eligible for retirement when activation rate falls below domain floor across multiple epochs, when a superseding WSU reaches sufficient confidence level, or when a ground truth update renders the unit obsolete. Retirement requires community proposal and contributor council decision. The archive is permanent: provenance chain intact, historical queries resolvable, contributor credits preserved. Retirement is not erasure. It is respectful deprioritisation with full memory.

7 · The Aletheia Protocol

7.0 Why Aletheia

Aletheia is the ancient Greek word for truth, recovered by Heidegger from its common translation as correctness (the matching of a proposition to a fact) to its original meaning: unconcealment. The disclosure of what was previously hidden. The bringing-into-the-open of something that was present but not yet visible.

When a human corrects an AI’s approximation of a concept, they perform exactly this: they unconceal the meaning that the machine’s statistical output had covered over. They reveal what was already there — in their understanding, in their domain’s living practice, in their community’s accumulated knowledge — but had not yet been made legible to the system.

Aletheia Protocol

The decentralised meaning-making and truth-preserving algorithm that governs how genuine human sense-making is anchored, verified, weighted, aggregated, retrieved, and attributed within the Kibela network. It makes semantic dishonesty structurally more expensive than semantic honesty, without requiring trust in any central authority.

7.1 Proof of meaning

Bitcoin’s proof of work, built on SHA-256, defines valid work as a block hash falling below a target value. Such a hash is computationally expensive to produce and trivially cheap to verify, and the target is adjustable to maintain a stable block time. The result makes dishonesty more expensive than honesty without requiring trust in any central authority.

The Aletheia Protocol defines valid meaning as a semantic contribution satisfying five simultaneous conditions. The validity condition:

V(WSU) = S · E · C^α · K^β · T

S, E, and T are binary gates: they cannot be compensated for by high scores on other parameters. A contribution with perfect coherence and massive collective connection produces V = 0 if S = 0 (no genuine subjecthood) or T = 0 (no prior commitment) or E = 0 (no explanatory power). C and K are weighted continuous scores, domain-configurable through α and β. The meaning weight function:

W(t) = V(WSU) · e^(−λt) · R(t)

Where proof of work rewards computational scale, proof of meaning rewards epistemic depth. The cost that makes dishonesty irrational is not computational. It is existential. To fake genuine subjecthood you must become a genuine subject. To fake a history of coherent domain engagement you must actually engage.
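The gating behaviour of the validity condition can be made concrete with a short sketch. It assumes C and K are both normalised to [0, 1] (the text states this for C; for K it is an assumption), and the function name and default exponents are illustrative.

```python
def validity(s: bool, e: bool, t: bool, c: float, k: float,
             alpha: float = 1.0, beta: float = 1.0) -> float:
    """V(WSU) = S * E * C**alpha * K**beta * T.

    S, E and T are binary gates; C and K are continuous scores whose
    influence is tuned per domain through alpha and beta.
    """
    if not (0.0 <= c <= 1.0 and 0.0 <= k <= 1.0):
        raise ValueError("c and k must lie in [0, 1]")
    gates = int(s) * int(e) * int(t)
    return gates * (c ** alpha) * (k ** beta)


# All gates open: the continuous scores determine the result.
print(validity(True, True, True, c=0.9, k=0.5))

# Perfect coherence and connection cannot compensate for a closed gate.
print(validity(False, True, True, c=1.0, k=1.0))

# A domain that weights coherence more heavily (alpha = 2).
print(validity(True, True, True, c=0.9, k=0.5, alpha=2.0))
```

Because the gates multiply rather than add, no amount of coherence or collective uptake can rescue a contribution that fails subjecthood, explanatory power, or prior commitment.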

7.2 The five parameters — precise definitions

Genuine Subjecthood · binary gate

Verifies that contributions arise from a real human being engaged in real tasks over real time — not from automation, prompt engineering, or strategic input. Measured across four signals: temporal irregularity (genuine engagement clusters around real events and work cycles — too regular is as suspicious as too sparse), contextual anchoring (contributions carry the fingerprint of specific real-world situations), revision history (genuine understanding revises when reality pushes back — absence of revision over time signals automation), domain language evolution (genuine engagement tracks the domain’s current living vocabulary rather than a phase-shifted historical approximation). S = 0 for automated systems regardless of all other scores.

tampers against: bots · automated contribution · synthetic coherence-building · prompt-engineered contributions

Explanatory Power · binary gate

Verifies that a contribution contains sufficient semantic content to ground a resolvable meaning distinction: the system must resolve future queries involving the concept differently than it would without the contribution. E = 1 when the contribution produces a meaningful embedding that is distinct from existing units at the individuation threshold and can operate in a logical sense. The system may request elaboration when a contribution is semantically insufficient. Explanatory power is a precondition for all other parameters: the S gate verifies the subject; the E gate verifies that the subject said something meaningful.

tampers against: trivial contributions · generic domain vocabulary without semantic content · padding designed to satisfy thresholds without contributing understanding

Contextual Coherence · [0,1] score

Consistency of the contribution with the contributor’s demonstrated domain history and semantic trajectory. Composed of three sub-signals weighted by domain type: trajectory consistency A (embedding distance to contributor’s domain centroid — primary for all domains), predictive accuracy B (retrospective evaluation against domain ground truths at epoch transition: regulations, case law, verified factual frameworks — primary for verifiable domains such as law, medicine, and finance), peer validation C (weighted by validators’ own C scores, logarithmically saturated against echo-chamber inflation — supplementary for all domains). Contributor reliability incorporated: contributors whose inputs are consistently accurate and fact-verified earn higher domain weight.

tampers against: strategic coherence-building · surface vocabulary imitation · coordinated validation rings · high-volume low-quality flooding
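One way the three sub-signals might combine is a domain-weighted average, sketched below. Everything numeric here is an assumption made for illustration: the weight values, the `cap` constant, and the specific saturation formula are hypothetical, since this document does not fix them. The sketch treats predictive accuracy as optional so that non-verifiable domains fall back to the remaining two signals.

```python
import math
from typing import Dict, List, Optional


def saturated_peer_score(validator_c_scores: List[float], cap: float = 10.0) -> float:
    """Peer validation in [0, 1]; each validator counts in proportion to their own C score.

    The logarithm makes each additional endorsement worth less than the last
    (hypothetical saturation form; `cap` is the endorsement mass that maps to 1.0).
    """
    mass = sum(validator_c_scores)
    return min(1.0, math.log1p(mass) / math.log1p(cap))


def coherence(trajectory: float, peer: float, predictive: Optional[float],
              weights: Dict[str, float]) -> float:
    """Weighted average of the available sub-signals, renormalised over those present."""
    parts = {"trajectory": trajectory, "peer": peer}
    if predictive is not None:  # only verifiable domains have ground truths to score against
        parts["predictive"] = predictive
    for value in parts.values():
        if not 0.0 <= value <= 1.0:
            raise ValueError("sub-signals must lie in [0, 1]")
    total = sum(weights[name] for name in parts)
    return sum(weights[name] * value for name, value in parts.items()) / total
```

Renormalising over the signals that are present keeps C in [0, 1] whether or not a domain has ground truths.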

Temporal Priority · binary gate

Cryptographic proof that the interpretation was committed before any outcome was observable. The commit-reveal construction: C_hash = H(content ∥ role_id ∥ context ∥ nonce). T = 1 when cryptographic commitment verifiably precedes the relevant outcome or validation event. T = 0 for any contribution whose commitment cannot be verified as prior. Retroactive provenance claims are structurally impossible, not merely discouraged. The entire provenance chain depends on commitment as its foundation.

tampers against: retroactive provenance claims · intellectual appropriation · claiming credit for understanding derived from observable outcomes

Collective Connection · logarithmically saturated score

Independent activation of the WSU by contributors with non-overlapping coherence histories. K = log(1 + n_activations) · (1 / (1 + β · n_similar)), where n_activations counts activations of the WSU and n_similar counts activations from contributors with overlapping semantic histories. The similarity discount in the second factor prevents coordinated inflation. The first independent activations add significant weight; subsequent activations yield diminishing returns. Genuine uptake from independent minds reaching the same understanding from different starting points is what makes the signal trustworthy.

tampers against: coordinated activation rings · echo-chamber validation · viral misinformation · popularity substituting for accuracy
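The K formula translates directly into code. The sketch below assumes the natural logarithm, which the text does not specify, and uses a hypothetical function name; `beta` is the coefficient written as β in the formula for K.

```python
import math


def collective_connection(n_activations: int, n_similar: int, beta: float) -> float:
    """K = log(1 + n_activations) * 1 / (1 + beta * n_similar)."""
    if n_activations < 0 or n_similar < 0 or beta < 0:
        raise ValueError("counts and beta must be non-negative")
    # log1p(x) computes log(1 + x) accurately for small x.
    return math.log1p(n_activations) / (1.0 + beta * n_similar)
```

Under these assumptions the first activation raises K by ln 2, while the hundred-and-first raises it by less than 0.01, and any activation from an overlapping history shrinks the whole score through the denominator.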

7.3 What the protocol tampers against

Automated contribution

S gate: requires temporal and contextual signatures of genuine human engagement. No automated system, however fluent, can fake these without actually becoming what it imitates.

Retroactive provenance claims

T gate: cryptographic commitment must precede the relevant outcome. Retroactive claims are structurally impossible.

Lies and deception

C parameter: a false interpretation requires maintaining a coherent false history across many contributions in a domain where genuinely coherent contributors will contradict it.

Misinformation

K parameter’s similarity discount: coordinated false meanings require activation from contributors with genuinely independent coherence histories. Echo-chamber inflation is algorithmically penalised.

High-volume low-quality flooding

E gate and logarithmic saturation on K together prevent volume substituting for depth. Trivial contributions earn nothing. Popular but shallow contributions earn diminishing returns.

Semantic decay through neglect

Decay function W(t): continued human engagement is the condition for maintaining meaning weight. Abandoned meanings decay toward zero.

Extraction without attribution

Provenance distribution: every use of a WSU is traceable and every reuse compensable. Value cannot accumulate in the collective map without being attributable to its sources.

False consensus

Contested zone structure: genuine disagreement between high-confidence contributors is preserved, attributed, and surfaced rather than resolved into misleading consensus.

Domain irrelevance

Individuation function: a contribution must be placeable on the domain map. Contributions that cannot be located within domain semantic space do not generate WSU candidates.

Meaninglessness passing as meaning

Combined S + E gate: genuine subjecthood and explanatory power are both required. Fluent outputs from systems without subjects and without genuine semantic content cannot produce valid WSUs.

Coordinated Sybil attacks

The combination of T (pre-commitment), S (temporal irregularity detection), and K (similarity discount) makes it prohibitively expensive to generate many fake coherent identities simultaneously.

Value loss through averaging

Role-context as primary individuation signal: the CFO’s meaning and the engineer’s meaning are preserved as variants, not collapsed into a generic approximation that serves neither.

7.4 – 7.12 · The nine protocol functions

Commitment

what it does: Anchors a semantic contribution to its author and moment before any outcome is observable. Commit-reveal construction: C_hash = H(content ∥ role_id ∥ context ∥ nonce). Verifiable at any future point without revealing content.

tampers against: Retroactive provenance claims, intellectual appropriation, claiming credit for understanding derived from outcomes. What breaks without it: attribution becomes a social claim rather than a cryptographic fact. The entire provenance chain loses its evidentiary foundation.

Genuine Subjecthood Verification (S)

what it does: Verifies that contributions arise from a real human being engaged in real tasks over real time. Detects four signatures: temporal irregularity, contextual anchoring in specific real-world tasks, revision history, domain language evolution.

tampers against: All forms of automated contribution, prompt-engineered coherence-building, synthetic semantic labour. What breaks without it: the protocol’s foundational claim, that meaning requires a subject, becomes unenforceable. The extractive paradigm is reproduced within the system designed to replace it.

Explanatory Power Verification (E)

what it does: Verifies that a contribution contains sufficient semantic content to ground a resolvable meaning distinction. Assessed algorithmically; the system may request elaboration. The E gate verifies that the subject said something meaningful, which is the precondition for all other parameters to run.

tampers against: Trivial contributions, generic domain vocabulary without semantic content, padding designed to satisfy thresholds without contributing understanding. What breaks without it: the map fills with semantically inert contributions that accumulate provenance credit without adding genuine intelligence.

Coherence Verification (C)

what it does: Verifies that a contribution is consistent with the contributor’s genuine domain history. Three-stage, domain-configurable: trajectory consistency (A), predictive accuracy against ground truths at epoch transition (B, primary for verifiable domains), peer validation weighted by validators’ own C scores (C, logarithmically saturated). Ground truths in verifiable domains serve as objective reference points. Qualitative community assessment at critical thresholds supplements algorithmic scoring.

tampers against: Strategic coherence-building, surface-level vocabulary imitation, coordinated validation rings. What breaks without it: the coherence signal fails; the map admits contributions that are statistically plausible but not genuinely grounded.

Individuation

what it does: Determines whether a new contribution confirms an existing WSU, adds a validated variant, creates a new unit, or opens a contested zone. Two-level criterion: embedding similarity first, role-context as primary override. Logic model for contested zone detection combines statistical variance with embedded truth judgements comparing against domain baselines. Novelty criterion: domain-configurable from conservative (law, medicine) through innovative (applied professional) to rebellious (philosophy, arts).

tampers against: False consensus from premature merging, map fragmentation, missed contested zones. What breaks without it: there are no units, only an undifferentiated cloud of contributions with no structure to query, weight, or license.

Aggregation

what it does: Combines verified, individuated contributions into collective sense-making maps preserving singularity, representing divergence, and producing irreducible collective value. Populates WSU components from the stream of verified contributions. The network effect: individual meaning-making generates collective intelligence that cannot be reduced to the sum of its individual inputs.

tampers against: Value loss through averaging, false homogenisation, static commodification of living meaning. What breaks without it: the collective layer does not exist; the architecture reduces to a personal memory system with no network effect and no commercial value proposition.

Adaptive Retrieval

what it does: Determines the form of response based on user history, domain map density, WSU confidence levels, and query stakes. Four states: State 0 (no user history → globally ranked list), State 1 (accumulating history → personalised ranked list), State 2 (contested zone and/or high stakes → structured disagreement: the honest AI state), State 3 (dense history, familiar domain → single resolved meaning: the cost-reduction state). Stakes assessed through query inference and user declaration. State 2 fires as an override regardless of individual history richness when stakes are high and a contested zone exists.

tampers against: False confidence, confidently wrong answers in high-stakes domains. What breaks without it: the system either always pretends to know or is always unhelpful. The honest AI property, knowing when not to resolve, is lost.
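The four retrieval states can be read as a small decision procedure. The sketch below is one hedged interpretation: the function name, the `dense_history_threshold` parameter, and the use of a simple history count are assumptions, and the sketch omits domain map density and WSU confidence levels, which the full function also weighs. State 2 is implemented as the override stated above (high stakes and a contested zone together); reading the state list's "and/or" literally would replace that `and` with `or`.

```python
from enum import IntEnum


class RetrievalState(IntEnum):
    GLOBAL_RANKED = 0            # no user history: globally ranked list
    PERSONALISED_RANKED = 1      # accumulating history: personalised ranked list
    STRUCTURED_DISAGREEMENT = 2  # contested and high stakes: show the disagreement
    SINGLE_RESOLVED = 3          # dense history, familiar domain: one resolved meaning


def select_state(history_count: int, dense_history_threshold: int,
                 familiar_domain: bool, contested_zone: bool,
                 high_stakes: bool) -> RetrievalState:
    """Pick the response form; the State 2 override is checked before user history."""
    if contested_zone and high_stakes:
        return RetrievalState.STRUCTURED_DISAGREEMENT
    if history_count == 0:
        return RetrievalState.GLOBAL_RANKED
    if history_count >= dense_history_threshold and familiar_domain:
        return RetrievalState.SINGLE_RESOLVED
    return RetrievalState.PERSONALISED_RANKED
```

Checking the override first is what makes State 2 independent of history richness: even a user with dense history in a familiar domain is shown the structured disagreement.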

WSU Revision

what it does: Governs legitimate updating of a WSU’s canonical interpretation when the domain has evolved. Distinct from retirement (meaning is obsolete) and correction (canonical was wrong). Specifies conditions for revision, the evidence threshold, the initiation process, and how the provenance chain updates to record revision history without erasing the previous canonical interpretation.

tampers against: Calcification (outdated meanings maintained because retirement is the only alternative), unauthorised revision, revision history erasure. What breaks without it: the map gradually falls out of phase with the domain it claims to represent. The living character that distinguishes Kibela’s maps from static databases is lost.

Provenance Distribution

what it does: Tracks every WSU activation and maintains the provenance chain enabling value to flow back to contributors. Records for every WSU: contributor identity and role context, contribution timestamp and commitment hash, validation received and validators’ weights, activation count, and role in WSU structure. Reward structure: original contributor earns permanent provenance floor plus diminishing adoption share; correctors earn growing adoption share; validators earn micro-share per reuse; vindicated dissenters earn full provenance plus retroactive vindication multiplier.

tampers against: Extraction without attribution, value anonymisation. What breaks without it: the redistribution mechanism is rhetorical rather than architectural. Contributors build a map that earns without return. The Kibela architecture becomes indistinguishable from the extractive paradigm it was built to replace.
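The fields recorded for every WSU suggest a simple record type. The sketch below is illustrative only: the class, field, and method names are hypothetical, and the `ChainRole` values are inferred from the reward structure described above rather than taken from a specification. Reward amounts are deliberately left out because this document does not give them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class ChainRole(Enum):
    """Role of a contribution in the WSU structure (inferred from the reward tiers)."""
    ORIGINAL_CONTRIBUTOR = "original_contributor"
    CORRECTOR = "corrector"
    VALIDATOR = "validator"
    DISSENTER = "dissenter"


@dataclass
class ProvenanceEntry:
    contributor_id: str       # contributor identity (pseudonymous in this sketch)
    role_context: str         # the role the contributor was acting in, e.g. "cfo"
    committed_at: float       # contribution timestamp
    commitment_hash: bytes    # C_hash from the commitment function
    chain_role: ChainRole
    validator_weights: Dict[str, float] = field(default_factory=dict)
    activation_count: int = 0

    def record_validation(self, validator_id: str, weight: float) -> None:
        """Store a validation together with the validator's weight."""
        self.validator_weights[validator_id] = weight

    def record_activation(self) -> None:
        """Count one more use of the WSU that this entry contributed to."""
        self.activation_count += 1
```

Keeping activations and validations on the entry itself is what would let a later distribution step compute each contributor's share from the chain alone.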

8 · The Epoch Structure

Three speeds run simultaneously per domain, each optimised for a different function.

Operational epoch

Short cycle, domain-configurable. Contributions accumulate, meaning weights update, the map evolves. The living speed of the protocol. Runs continuously.

Licensing epoch

Triggered by domain density threshold. Produces the stable commercial snapshot. Evaluates coherence against domain ground truths (the B component of C). Distributes provenance credits from the period. The commercial speed of the protocol.

Event epoch

Triggered by significant map events: a contested zone opening or resolution, a major ground truth update (regulatory change, legal ruling, verified factual revision), a governance proposal reaching threshold, or a domain density milestone. The governance speed of the protocol.

Domain density threshold

The minimum measurable conditions for licensing activation: minimum WSU count across meaningful conceptual territory (not concentrated in one semantic zone), minimum verified contributor count, minimum contested zone coverage in high-stakes regions, and minimum ground truth anchoring percentage for verifiable domains. Coverage of contested territory matters as much as total WSU count.
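The four conditions can be expressed as a single readiness check. In the sketch below all field names and the `max_zone_share` concentration measure are assumptions: the text requires WSUs to be spread across conceptual territory but does not say how spread is measured, so the sketch caps the share of WSUs in any one semantic zone. Threshold values are placeholders that governance would set per domain.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DensityThreshold:
    min_wsu_count: int
    max_zone_share: float          # hypothetical cap on WSUs concentrated in one zone
    min_contributors: int
    min_contested_coverage: float  # share of high-stakes regions with mapped contested zones
    min_ground_truth_share: float  # applied to verifiable domains only


@dataclass(frozen=True)
class DomainStats:
    wsu_per_zone: Dict[str, int]
    verified_contributors: int
    contested_coverage: float
    ground_truth_share: float
    verifiable: bool


def licensing_ready(stats: DomainStats, th: DensityThreshold) -> bool:
    """True only when every minimum condition for licensing activation holds."""
    total = sum(stats.wsu_per_zone.values())
    if total == 0 or total < th.min_wsu_count:
        return False
    if max(stats.wsu_per_zone.values()) / total > th.max_zone_share:
        return False  # too concentrated in a single semantic zone
    if stats.verified_contributors < th.min_contributors:
        return False
    if stats.contested_coverage < th.min_contested_coverage:
        return False
    if stats.verifiable and stats.ground_truth_share < th.min_ground_truth_share:
        return False
    return True
```

A domain with many WSUs packed into one zone fails the check even when its total count is high, which matches the requirement that coverage matters as much as volume.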

Domain novelty criterion

The configurable threshold determining when a contribution is novel enough to create a new WSU rather than confirm an existing one. Spectrum: Conservative (tight threshold, high ground truth anchoring — law, medicine, formal compliance) · Innovative (moderate thresholds — applied professional domains, technology, policy) · Rebellious (loose thresholds, high tolerance for heterodox contributions — philosophy, arts, emerging fields). Set at domain creation, adjustable by governance. Protocol-wide floor ensures trivial confirmations cannot accumulate weight regardless of domain setting.

9 · Access Design and Product Principles

Universal benefit from day one. All users benefit from adaptive retrieval across general domains immediately. Every user is simultaneously building the map and benefiting from it. There is no separation between contributor phase and user phase.

Expert contribution is permissioned. High-stakes domains gate contribution at the write level through three verification mechanisms: credential verification (qualifications, professional memberships, institutional affiliations), contribution history (demonstrated accuracy and consistency over time), and peer validation (existing verified contributors recognising new contributors’ genuine domain expertise). The gate is meritocratic: demonstrated quality, not institutional membership alone.

What the system sells and what it structurally cannot sell. Licensed users access the intelligence produced by the collective map: resolved meanings, ranked interpretations, and structured disagreements through the adaptive retrieval system. Individual user histories, raw contributor content, personal semantic memories, contributor identities, and the provenance chain in identifiable form are architecturally inaccessible. Individual semantic memory is encrypted with keys that never leave the contributor’s control.

Enterprises access the output of resolved queries. They do not access the data that produced them. You license the route, not the road graph, not the individual journey histories that trained the model. You receive intelligence. The data that generated it remains with those who generated it.

Two access tiers. Individual license: personal use, limited role contexts, collective map access, personal memory private and included. Enterprise license: unlimited role count, private institutional memory layer built from employee contributions sitting above the general collective map, role-sensitive resolution across the organisation’s defined structure. The private institutional memory is the enterprise tier’s differentiating product: once built from an organisation’s contributors, it reflects their specific terminology, role structure, and domain expertise. High switching costs arise from the value of what has been built, not from contractual lock-in.

10 · Governance Principles

Contributor council

The protocol’s primary governance body whose seats are earned through demonstrated contribution quality. Governance rights accrue through epistemic performance, not purchase. Governs protocol parameters (coherence weights, epoch structure, novelty thresholds), domain registry criteria, disputed attribution cases, WSU retirement decisions, and expert gate standards. AI-assisted deliberation for council meetings: AI surfaces relevant evidence, precedents, and parameter implications. The council decides. No token governance at this stage. Token model deferred to scale.

Domain creation. Domains are created with a validity function evaluating taxonomic distinctiveness: is this domain meaningfully different from existing ones in scope, vocabulary, and contested territory? Domain parameters set at creation using available market and expert data and adjustable by governance. The principle: domains should reflect genuine communities of practice with distinct meaning systems, not arbitrary subdivisions designed to accumulate provenance credit in uncrowded territory.

Cross-domain weighting. Full independence as default — coherence in one domain does not transfer to another automatically. Manual cross-domain connection available when expertise genuinely spans domains, subject to independent coherence verification in each. Broad categories (natural language nuance, general reasoning patterns, domain-crossing frameworks) gather contributions from all users. Deep domain categories are siloed.

Protocol fee principle. When licensing revenue is distributed, Kibela takes a protocol fee for infrastructure and development. The fee is publicly stated, governance-determined, and the minimum necessary to sustain the protocol. The majority of licensing revenue flows to contributors through provenance distribution. The community’s trust in the protocol fee is secured by governance transparency, not by goodwill.

11 · Competitive Position and Structural Moat

Against current AI systems. They re-infer meaning from scratch on every query. Kibela resolves it once, persists it, and applies it at session start through collective map grounding. Cost reduction is real and measurable, and it compounds as map density grows. The improvement is the architectural consequence of having a persistent semantic layer, not a feature that can be toggled on by competitors without rebuilding their architecture.

Against fine-tuning, RAG, and memory systems. Fine-tuning modifies model weights: expensive, opaque, requires data science infrastructure. RAG retrieves documents: operates above the semantic layer, returns documents rather than resolved meaning, has no mechanism for attributing the value of retrieval to the humans who wrote them. Memory systems store facts and preferences: what the user did, not what they genuinely understand. Kibela stores meaning — the interpretive structure beneath facts, documents, and preferences. One verified correction changes how every future query in that domain resolves. No infrastructure, no training pipeline, no data team required.

Honest AI and calibrated uncertainty. Adaptive retrieval State 2 surfaces structured disagreement rather than false resolution in contested, high-stakes domains. The first AI system designed to show its uncertainty when stakes are high — a structural consequence of having a map that represents contested zones as first-class data. The collective map, built from diverse attributed contributions across verified domain experts, produces less systematically biased outputs than homogenised training data for one fundamental reason: the contributors are identified, the contributions are attributed, and the disagreements are preserved rather than averaged.

Every other AI product is built on the premise that the human is the user and the machine is the intelligence. Kibela is built on the premise that the human is the source and the machine is infrastructure. This cannot be added as a feature. It requires rebuilding the entire architecture from a different foundational claim. A competitor who copies the feature set without the foundational claim produces a system that extracts rather than honours. The ideology and the product are the same thing. The moat is philosophical and structural simultaneously, enforced by architecture rather than by intellectual property.

Scope — What This Document Does Not Cover

This document establishes the governance specification of the Kibela protocol. Companion documents cover mathematical derivations of the scoring functions and decay rates, cryptographic construction of the zero-knowledge coherence proof and commit-reveal system, the marketplace extension covering visual and audio semantic assets, licensing business model and token economics at scale, and philosophical and cognitive science foundations underlying the three foundational claims.

Glossary

Adaptive retrieval

The function determining the form of response (State 0–3) based on user history, domain map state, WSU confidence levels, and query stakes.

Adaptive weighting

Dynamic adjustment of working memory (α), personal memory (β), and collective map (γ) contributions, where α + β + γ = 1.

Aletheia Protocol

The decentralised meaning-making and truth-preserving algorithm. Validity condition: V(WSU) = S · E · C^α · K^β · T.

Categorical difference

Machine output and human meaning differ in kind, not degree, because meaning requires a subject for whom things matter.

Collective connection (K)

Logarithmically saturated independent activation of a WSU by contributors with non-overlapping coherence histories.

Collective sense-making map

An emergent structure built from attributed meaning events, whose value is irreducible to any single contribution and impossible to commodify as a static asset.

Commitment

Cryptographic anchoring of a contribution before any outcome is observable. C_hash = H(content ∥ role_id ∥ context ∥ nonce).

Contextual coherence (C)

Consistency with the contributor’s demonstrated domain history. Composed of trajectory consistency, predictive accuracy, and peer validation.

Contested zone

A map region where high-confidence contributors hold interpretations that diverge beyond the individuation threshold, with the divergence persisting across epochs. Not a map failure but the map’s most epistemically honest territory.

Contributor council

Merit-based governance body whose seats are earned through contribution quality. Governs protocol parameters, domain registry, and disputed attribution.

Domain decay rate (λ)

Governance parameter specifying how quickly a WSU loses meaning weight without reinforcement. High for fast-moving domains, low for foundational ones.

Domain density threshold

Minimum conditions for licensing activation: WSU count, contributor count, contested zone coverage, ground truth anchoring.

Domain novelty criterion

Configurable threshold for WSU creation. Spectrum: Conservative (law, medicine) · Innovative (applied professional) · Rebellious (philosophy, arts).

Essential faculty

Sense-making is constitutive of human flourishing, not instrumental to it. Infrastructure suppressing its exercise causes demonstrable harm.

Explanatory power (E)

Binary gate verifying that a contribution grounds a resolvable meaning distinction.

Extractive paradigm

The design philosophy treating human semantic contribution as a free input to be absorbed, anonymised, homogenised, and monetised without attribution or return.

Genuine subjecthood (S)

Binary gate verifying contributions arise from a real human engaged in real tasks over real time, detected through temporal and contextual signatures.

Individuation

Determination of whether a contribution confirms an existing WSU, adds a validated variant, creates a new WSU, or opens a contested zone. Role-context is the primary signal.

Irreducible singularity

Individual meaning-making is the irreplaceable generative source of collective semantic value. Attribution is constitutively necessary.

Meaning event

An act of semantic assertion arising from genuine engagement, committed before outcomes are observable, capable of resonating independently in others.

Meaning weight W(t)

The evolving value of a WSU: W(t) = V(WSU) · e^(−λt) · R(t). Grows with activation, decays without reinforcement.

Personal semantic memory

Role-bound, private, accumulated record of the contributor’s stable interpretive patterns. Sovereign: encrypted with keys that never leave the contributor’s control.

Proof of meaning

A semantic contribution arising from a genuine subject, coherent with their domain history, committed before outcomes, activated independently by others. The epistemic equivalent of proof of work.

Provenance chain

Complete attributed history of all contributions to a WSU: who, when, what role, what validation, activation count, current weight.

Provenance distribution

Function tracking WSU activations and distributing licensing revenue to contributors according to their provenance chain position.

Semantic decay

Meaning loss through drift, dilution, and erasure. A structural feature of stateless AI architectures.

Semantic extraction

Absorption of human meaning-making into AI systems without provenance, attribution, or value return.

Semantic labour

Cognitive work of establishing genuine meaning. Constitutively contextual, irreversibly donated, compounding without proportional return under the current architecture.

Temporal priority (T)

Binary gate: cryptographic proof that a contribution was committed before any outcome was observable.

Validated variants

Alternative interpretations within a WSU that are meaningfully distinct from the canonical but legitimate, attributed to their contributors.

Weighted Semantic Unit (WSU)

The atomic primitive of the collective map: verified, attributed, weighted. Contains canonical interpretation, validated variants, contested zones, and provenance chain.

Working memory layer

Session-scoped, ephemeral semantic resolution drawing from personal and collective layers with adaptive weighting.

WSU Confidence Level

Composite score of eight factors summarising epistemic trustworthiness. Drives adaptive retrieval state decisions and licensing eligibility.

WSU retirement

Removal from active retrieval without deletion of provenance history. Requires community proposal and council decision.

WSU revision

Legitimate updating of a WSU’s canonical interpretation when the domain has evolved. Distinct from retirement and correction. Provenance history preserved.

Key References

Author(s)	Year	Work	Relevance
Arrieta-Ibarra et al.	2018	Should We Treat Data as Labor?	Economic case for attributing value to semantic contributions. Section 1.7.
Bender, Gebru et al.	2021	On the Dangers of Stochastic Parrots	Language models do not learn meaning from text. Section 1.1.
Couldry & Mejias	2019	The Costs of Connection	Data colonialism framework. Section 1.3.
Fricker, M.	2007	Epistemic Injustice	Testimonial and hermeneutical injustice. Section 1.6.
Guo et al.	2017	On Calibration of Modern Neural Networks	Neural network confidence is poorly calibrated. Section 1.5.
Harris, Z.	1954	Distributional Structure	The foundational assumption Section 1.1 examines and critiques.
Kadavath et al.	2022	Language Models (Mostly) Know What They Know	Models’ limited capacity to signal uncertainty. Section 1.5.
Lazer et al.	2014	The Parable of Google Flu: Traps in Big Data Analysis	Failure modes of scale in statistical systems. Section 1.4.
Longino, H.	1990	Science as Social Knowledge	Epistemic value of diversity. Section 1.4.
Merrill et al.	2021	Provable Limitations of Acquiring Meaning from Ungrounded Form	Formal proof of limits on semantic content from text alone. Section 1.1.
Narayanan & Felten	2014	No Silver Bullet: De-identification Still Doesn’t Work	The failure of anonymisation as privacy. Section 1.6.
Page, S.	2007	The Difference	Diversity prediction theorem. Section 1.4.
Posner & Weyl	2018	Radical Markets	Mechanisms for data as labour. Section 1.7.
Varoufakis, Y.	2023	Technofeudalism	Cloud capital and cognitive rent extraction. Section 1.3.
Zollman, K.	2010	Epistemic Benefit of Transient Diversity	Epistemic monoculture risk. Section 1.4.


