The Voynich Project: what the numbers actually say

Start here: pick your path

I'm new to thisStart with the theory scorecard: every explanation you've heard, tested, in one glance. Or read the five-act story first, zero jargon. I'm skepticalGood. Jump to the hypothesis ledger, everything we tried and killed, then the hostile audit and the falsification lab. I want to touch itOpen any folio of the manuscript, run the experiments yourself, write statistically real Voynichese, test your own theory. All in the browser. I'm a researcherDatasets, every script, the citation-ready paper, the 64-part lab notebook, and the open problems list.

The whole investigation in eight plain sentences

The Voynich manuscript is a real 600-year-old book in an alphabet nobody can read.
It is not random scribbling. The writing has deep patterns that repeat, the same way every time we check.
It is not a simple secret code of any known language. The letters are far too easy to guess for that to work.
It has a real grammar that survives every test we could throw at it, even deleting all the spaces.
But a meaning-free machine can fake that grammar, so following rules does not prove the book means anything.
From line to line, there is no flow of ideas, so it does not read like a normal story or letter.
Two answers survive: a fancy but empty system, or a list or recipe book in a lost code whose key is not in the book.
Every "translation" you have heard about fails the checks on this page, and you can run all of them yourself, free.

Jargon-buster: the words this page keeps using

Every term below also has a dotted underline the first time it appears in a section, hover or tap it for a one-line reminder.

Entropy (predictability)If I say "the cat sat on the ___", you can guess "mat", that's low entropy (predictable). If I say a random word, that's high entropy. Voynichese letters are unusually easy to guess from the letter before them.

Zipf's lawIn any real language, a few words (like "the", "a") are used constantly, and most words are rare, and the drop-off follows a precise curve. Voynichese words follow that exact curve, like a real vocabulary.

Null model / controlThe "boring explanation" we test against. Before asking "is this surprising?", we ask "what would this number look like if nothing interesting were going on?", e.g. shuffled letters, or a fake Voynich text written by a simple machine.

Grammar / morphologyThe rules for how word-parts combine, e.g. in English, "-s" usually means "more than one", and it changes what kind of word can come next. We tested whether Voynichese endings behave the same way.

z-scoreA way of saying "how surprising is this, in standard units of luck". z=2 is "somewhat surprising"; z=6 is "essentially impossible by chance". Bigger = stronger evidence.

Currier A / BIn the 1970s, codebreaker Prescott Currier noticed the manuscript's handwriting and word-statistics split into two distinct "dialects", nicknamed A and B. Both appear in the same book, sometimes on facing pages.

r (correlation)A number from -1 to 1 measuring how strongly two things move together. r=0 means no relationship; r≈0.55 (this page's grammar result) is a moderately strong, consistent relationship, about as strong as everyday cause-and-effect patterns get in messy real-world data.

The story: no jargon, promise

How one curious person put a 600-year-old book to the test

Act I · The book that says nothing

A mystery with a century of famous failures

In 1912 a book dealer named Wilfrid Voynich bought a small, strange book. Every page is full of neat, flowing handwriting, in letters found nowhere else on Earth. There are drawings of plants that do not exist. Naked figures bathe in green pools joined by pipes. There are star charts for no sky we know. The people who broke Germany's secret codes in both World Wars tried to read it. So did professors, computer scientists, and AI labs. Every one of them failed. Most people who take on this book start by guessing what it says.

I started with a smaller question: what could I measure to catch it lying?

Act II · The interrogation

You can't read it: but you can make it answer yes/no questions

Here is the trick that makes this whole project work. You do not need to understand a text to ask it simple yes-or-no questions. Do your words repeat the way a real language repeats? Are your letters used like letters, or like numbers? If I hide the spaces, do your words still know where they end? Each question has an answer we can measure. And here is the important part: we always work out what the answer would look like if the book were faking it. For every test, we ran the same test on Latin, English, Finnish, and Turkish, on shuffled fakes, and on pretend Voynich text from a machine we built for this. If the real book and the fake give the same answer, the test proves nothing, and we say so.

Act III · The twist

It has grammar. And that means less than you'd hope.

The book passed tests that plain gibberish should fail. Its words follow rules about what comes next. That is what grammar is. And that grammar held up against everything we tried, even after we deleted every space and let a computer cut the words apart again. For a week it felt like a door was about to open.

Then we did the hard thing you are supposed to do to your own best result: we tried to kill it. We built a mindless machine, a pretend scribe with three copying habits and no idea what it was writing. Its pages passed the grammar test too. The door slammed shut. It turns out structure is cheap. Meaning is not. Telling the two apart is the whole game.

Act IV · The funerals

We buried fourteen of our own theories. In public.

Maybe the words at the end of each line are sums, like a ledger? Tested. Dead. Maybe the zodiac labels are day numbers? Dead. Maybe the star cluster that looks like the Pleiades is labeled "Pleiades", one crack to pry the book open, the way names cracked Egyptian hieroglyphs? Dead, just chance look-alikes. Maybe it is Turkish, like one famous theory says? We built a dictionary of 700-year-old steppe Turkic from a medieval phrasebook to test it fairly. Dead. Maybe a famous hoax machine from 2004 explains it? Dead, it cannot even make enough different words.

And here is the one that explains all those "I cracked it" headlines. We let a computer hunt for the best way to give Voynich symbols sounds, so the words would turn into real Latin, Greek, or Turkic. It happily "translated" more than half the book. Then we ran the same hunt against fake dictionaries of scrambled nonsense words, and it did just as well. That is the size of the trap every published "translation" fell into. The book shows you whatever you bring to it. So our rule is simple: the data gets a vote, and the data's vote wins.

Act V · Where the trail stands

Two suspects left: and an honest map for whoever comes next

After more than 60 experiments, two answers are still standing. One: the book means nothing. It is a fancy performance, written by habits that slowly drifted over the months of work. We can actually watch that drift move page by page, and it even tells us the order the pages were first made in. Two: the book is a kind of reference book, lists, recipes, or notes, one entry per line, written in a code whose key was never inside the book. The writing alone cannot tell us which. Anyone who claims it can should go pass the test at the bottom of this page.

What we leave behind is not a translation. We think it is something better: the first full, checkable map of what this book is and is not. Every way in, every dead end, every tool free to use, so the next curious person starts where we stopped instead of where we started.

Why trust an amateur's map? Because you don't have to trust it. Every claim above is a number, every number comes from code you can run, and the experiments that failed are documented as carefully as the ones that worked. That's the difference between a story and a scientific story.

The object

A 600-year-old book nobody can read

In plain words: It is a real book from about 600 years ago. It is full of odd plant and star pictures, and page after page of neat writing. But the letters are not from any alphabet we know, and no one has ever read a single word for sure. Smart people have tried for over 100 years, including the code-breakers who beat the Nazis. All of them got stuck.

Beinecke MS 408, the Voynich manuscript, at Yale, is a ~240-page parchment book, radiocarbon-dated to the early 1400s, filled with botanical, astronomical and bathing illustrations and roughly 38,000 words of flowing script in an alphabet that appears nowhere else on Earth. It has resisted every professional and amateur decipherment attempt for over a century, including the WWII codebreakers who cracked Enigma-era ciphers.

Voynich folio 68r: circular astronomical diagram with stars and radial labels — f68r: astronomical diagram with the labelled-star register (Beinecke MS 408, Yale; public domain)

Voynich folio 72r2: zodiac medallion ringed by nymphs holding stars — f72r2: zodiac medallion; the ring text behind the day-number hypothesis we falsified (Part 52)

Detail: the star cluster identified as the Pleiades with the label doaro — f68r3 detail: the Pleiades and the hapax label doaro: our best image crib, honestly inconclusive (Part 51)

Worked example, the words this page uses for "where something is"

illustration (plant, star, bathing figure…)

qokeedy shol daiin chedy qokain

otaiin qokedy chey shol daiin ol

qokeedy dal qokedy qokedy chedy

cheol daiin shol qokain otaiin

Folio / page, one side of a sheet of parchment; the manuscript has ~240 of them (numbered like 68r, 72r2 in the captions). Paragraph, a block of lines, usually starting with a tall "gallows" glyph. Line, one row of writing; our "records" finding (below) is about what happens within vs across these. Word / token, a glyph-cluster separated by spaces, like qokeedy. Glyph, a single written symbol, the Voynich equivalent of a letter, but, as the entropy section shows, glyphs don't behave quite like letters in any known alphabet.

Instead of proposing another reading, this project did something more boring and more useful: it measured the text, exhaustively, with controls, and let the numbers eliminate explanations until only one family survived.

Start here: the theories, and how each held up

Every explanation you've heard about this book, on one page

In plain words: For over 100 years, people have had big ideas about what this book is. Here they all are, in plain language, with how we tested each one and what we found. This is the whole project at a glance. Every section after it is just the longer version of one row below.

How to read the verdicts: ruled out means a measurable test excluded it. still open means it survived every test we could throw at it and is one of the two explanations still standing. Each row names the Part of the full report where you can check the numbers.

The theory (what you may have heard)	How we put it to the test	What we found
Is it a secret code or cipher?
A simple letter-for-letter cipher of Latin, Hebrew or another known language	Measured how predictable each next letter is (its "entropy") and compared against 13 real languages.	ruled outThe text is far more predictable than any language tested. A letter-swap can only relabel letters, it cannot remove choices that were in the original, so this is mathematically impossible. (Findings 1 & 2)
A word-level code: every word swapped for a symbol from a code-book that no longer exists	Checked whether the word-stream still behaves like a real language, and whether an alphabetised code-book would explain the thousands of near-identical words.	still openSurvives. This is one of the two explanations still standing (see below).
Is it just a normal language in a strange alphabet?
Plain Latin, Hebrew or "proto-Romance": the many published "I finally translated it" claims	Ran each proposed reading through the falsification lab that measures the manuscript's statistical fingerprint.	ruled outEvery published translation to date fails several measurable checks. You can paste any of them into the lab lower down and watch it fail.
Turkish, or medieval Kipchak Turkic (a widely shared 2018 claim)	Built a vowel-harmony detector, proved it works on real Turkish, then applied it to the Voynich.	ruled outNo vowel harmony at any level, so a phonetic Turkic reading is excluded. We even built a 700-year-old steppe-Turkic dictionary to test it fairly. (Part 54)
An agglutinative language like Hungarian, where words are built by stacking many endings	Counted word-parts per word against real agglutinative languages.	ruled outA theory we raised ourselves and then buried: Voynich words carry fewer parts than Turkish, not more. (Parts 46–47)
Is it meaningless?
Random scribbling with no structure at all	Zipf's law, the grammar test, topic-locking, and line structure.	ruled outFar too much deep, reproducible structure at every level for pure noise.
An elaborate hoax made with a 16th-century "grille" trick (Rugg, 2004)	Rebuilt that exact machine faithfully and ran the full battery against its output.	ruled outThe specific machine cannot even produce enough different words. But note: a meaning-free system in general is still one of the two survivors. (Part 57)
Can you crack it straight from the drawings?
The star and zodiac labels are readable: the cluster that looks like the Pleiades, or zodiac day-numbers	Null tests on every picture-based crib.	ruled outAll indistinguishable from chance look-alikes, the honest graveyard of the "one crack to pry it open" hope. (Parts 49–52)
The line-end words are running totals, like a bookkeeper's ledger	Compared against a position-matched control (do other positions "add up" too?).	ruled outThe first words of each line correlate just as well, so the signal was position on the line, not arithmetic. (Part 25)

The two explanations still standing

After 60+ experiments, everything above is ruled out and exactly two readings survive. The honest headline of this whole project is that the text alone cannot choose between them.

1. A meaning-free formal system

An elaborate performance that carries no message, written by hand-habits that drifted slowly across the months of work. We can actually watch that drift move page by page, and it is the book's only long-range signal.

2. A list-like reference book in a lost notation

Something like a set of recipes or property-tables, one short record per line, written in a word-level code whose key was a separate document, now lost or waiting in an archive.

Everything below is the detailed version: one section per finding, each with the chart, the control it was tested against, and, where possible, a machine you can run yourself in the browser. For the complete, filterable list of every hypothesis we tested, including the more technical ones this table leaves out, jump to the hypothesis ledger.

Findings 1 & 2

It's not random: and it's not a simple cipher of any language

In plain words: In every real language, some letters are easy to guess and some are hard. That mix is like a fingerprint. The Voynich letters are far too easy to guess, easier than in any of the 13 languages we checked. A secret code that just swaps one letter for another cannot make writing that easy to guess. So this is not simply Latin or English hiding behind funny letters.

Voynich word frequencies follow Zipf's law exactly like a real language (a random-scribble control fails badly). But the predictability of each next letter, conditional entropy, is far below every natural language. A substitution cipher cannot lower this number: you cannot get 2.0 bits from enciphering 3.3-bit Latin letter-by-letter. Every "I translated it into Latin / Hebrew / proto-Romance" headline collides with this single chart.

Worked example, what "predictable" means here

In English, after the letter t you might see h, o, r, a, i, e, lots of options ("the", "to", "tree", "tap"...). In Voynichese, after the glyph q the next glyph is o essentially every single time, there's almost no choice. Multiply that effect across the whole alphabet and you get a number, 2.09 bits, that says "this text has roughly 4x fewer real choices per letter than English". A simple letter-swap cipher can only relabel the letters; it can't remove choices that were there in the original language. So whatever produced this text, it wasn't "take a normal language and swap the letters".

Conditional letter entropy (bits per character), computed identically for all corpora. Lower = more predictable. Voynich: 2.09. No natural language tested comes near it; no letter-substitution cipher can produce it from one. Want to measure your own text on this axis? The Theory Lab below runs this exact battery in your browser.

Stranger still: where a word sits on the page changes how it is spelled. Words ending in the glyph m are 17× more common at line-ends; words containing p or f are 17× more common in paragraph-opening lines; 82% of paragraphs open with a tall "gallows" glyph. No natural language does this, open any folio in the explorer below and you can see it yourself.

Finding 3 · interactive

A medieval scribe could write this: try it yourself

In plain words: We built a simple word machine. It knows no meaning at all. It just copies and lightly changes words, using three easy habits. Its writing looks almost exactly like the real book, right down to the numbers we measure. So a person 600 years ago could have filled pages like this while writing nothing real. That does not prove the book is empty. It proves the way the text looks, on its own, cannot tell us.

We built an 11-parameter mechanical recipe, a word-table plus three habits (reuse a word, mutate a recent word, compose a new one) plus the layout rules, and tuned it against the manuscript. It reproduces nearly the entire statistical fingerprint, including the letter entropy to the third decimal. The studio below runs that exact fitted model in your browser, and hands you the dials: drag a habit off its fitted value and watch the live fingerprint meters drift away from the manuscript's targets.

ReuseCopy a word you (or a previous scribe) already wrote earlier on the page

→

MutateTake a recent word and swap one piece, like writing "qokedy" then "qokeedy"

→

ComposeBuild a brand-new word from the same prefix/middle/suffix pieces as everything else

Think of it like a scribe with a personal "phrasebook" of word-pieces, who mostly recycles and lightly remixes recent words, and occasionally snaps together a new combination from the same Lego set of pieces, with no idea what any of it means.

Loading the generator studio…

Free tool · the honest "translator"

There is no translator: here is the next best thing

In plain words: Nobody can turn Voynich words into English, us included. Anyone who says they can has not passed the test at the bottom of this page. But we can still show you everything we measured about any word: its parts, how often each dialect uses it, which sections it lives in, and its look-alike cousins. Type a word and see.

Nobody can translate Voynichese. Not us, not anyone, every published "translation" fails the falsification lab below, and saying so plainly is part of doing this honestly. What 60+ experiments can give you is everything measurable about any word: its slot anatomy, how often each "language" uses it, which sections it lives in, its near-twin neighbours, and now every place it occurs in the manuscript, in context. Type any EVA word (or click a suggestion):

Worked example, anatomy of a word

qokeedy

Almost every Voynich word splits cleanly into three slots, like a tiny assembly line: a small set of opening pieces (prefix, here qo), a short core (middle, here k), and a small set of closing pieces (suffix, here eedy). It's a bit like English words built from prefixes and suffixes (un-believ-able), except in Voynichese, nearly every word is built this way, from a surprisingly small shared parts-bin.

Loading the word analyzer…

Findings 4 & 5

It follows rules, like grammar: but that still isn't proof it means anything

In plain words: The words follow rules about what kind of word comes next. That is what grammar is. We found this same rule-pattern in three different typed-up copies of the book, and it even holds after we erase every space. But here is the catch: our meaning-free machine makes grammar just as strong. So grammar shows the book follows rules. It does not show the book means anything. Saying that out loud is the most important line on this page.

Voynich words come in suffix-swapped pairs across dozens of unrelated families (qokedy / qokeedy, chedy / chey). We measured whether the suffix changes a word's following context the same way in every family, and it does, at the strength of real English morphology (and it replicates in a second, independent transliteration, across separate physical quires, and even in the other Currier dialect on its own markers). A context-free "scribble" generator scores zero on this, so the text is genuinely not context-free pseudo-text.

Worked example, why a "suffix" can prove grammar

In English, "-s" mostly means "plural", and that has a knock-on effect: after a plural noun like dogs, you're more likely to see a verb like run than after the singular dog (which prefers runs). The suffix changes what's allowed next. We tested whether Voynichese suffixes do the same thing: does qokedy vs qokeedy change what kind of word tends to follow, and is that change consistent across dozens of unrelated word-families that share the same suffix pair? The answer is yes, about as consistently as real English "-s" does. That's the signature of grammar. But, and this is the twist, a machine with no idea what it's writing can produce that same signature, just by picking the next word's "category" from the previous word's category. So this test proves rule-following, not meaning.

Split-half context consistency. The honest scorecard, after a steelman test: the Voynich matches real English morphology, but a meaningless class-conditioned Markov process (which just picks the next word from the previous word's class) also reproduces it (0.51), while a context-free generator cannot (0.00). So this proves rule-governed class-sequential structure, not semantic content, we narrowed our own headline claim when the steelman passed. Replay this experiment with its controls in the notebooks below.

Clustering the 236 most frequent words purely by context, the algorithm never saw a single letter, yields eight coherent word classes with stable transition rules (an -or/-ar word is followed by a particle 53% of the time; dedicated line-opener and line-closer classes emerge on their own). The structure is real and deep; whether it carries recoverable meaning is a separate question the text alone cannot answer, see the audit and the model below.

Findings 6 & 7

The page is a topic. The line is a record.

In plain words: The same word pops up again and again inside one line, then stops at the end of the line. That is how a list works, like rows in a notebook, not like sentences in a story. So if the book is saying anything, it seems to say it one line at a time.

Two discoveries about structure. First: vocabulary locks onto the illustrated subject, even within a single scribe's handwriting, plant-pages and bathing-pages use measurably different words (ruling out "different scribes' habits"). Second: the famous repeated words (qokedy qokedy dal qokedy qokedy) turn out to respect line boundaries perfectly:

Probability (per 1000) that a word recurs at distance d, within its own line vs across a line break, at matched distance. Voynich recurrence drops off a cliff exactly at the line break; our distance-based generator (dashed) can't see lines at all. A scribe's memory doesn't reset at line ends, but a record does.

Worked example, what "the line is a record" looks like

Compare two ways of writing about three plants. Prose carries an idea across line breaks, you need the previous line to understand the next. A ledger restarts each line as its own self-contained entry, often repeating a key word:

Prose (ideas flow across lines)

The root of this plant, when dried
and ground to powder, eases the
stomach if taken with warm wine
before the evening meal each day.

Ledger (each line restarts)

Mallow root, dried, ground, stomach
Fennel seed, crushed, warm, stomach
Anise leaf, boiled, sweet, stomach
Mallow flower, fresh, cold, fever

The burst-chart above shows Voynich text behaves like the right-hand column: a word that just appeared is much more likely to reappear within the same line than after a line break, exactly the fingerprint of restated, record-like entries, not flowing sentences.

Together: each page treats a topic; each line behaves like a self-contained record that restates its subject, the texture of inventories, recipes and tallies, not of sentences and stories.

Finding 8

If a language is underneath: which kind?

In plain words: If a real language is hiding inside, we can guess its family from how the words behave, even without reading them. It behaves most like languages that build long words by stacking lots of endings (like Finnish or Hungarian), and least like English. But a later test, about vowels, ruled those exact languages out for a simple sound-swap. So this clue now points at the code itself, not at one language.

A word-for-word code hides what each word means but cannot hide how the word-stream behaves. So the century's proposed plaintext languages can finally be screened without reading anything. Thirteen languages, identical pipeline, matched corpus sizes:

Worked example, "fingerprint", not meaning

Think of this like recognizing someone's typing style without reading what they wrote: do they use lots of short connector-words, or long words with many endings stuck together? Do words repeat in clusters or spread out evenly? Every language has a "shape" like this, independent of its alphabet or vocabulary. Replacing every word with a code symbol (a word-level code) keeps that shape intact, even though no word is readable any more. So this chart isn't reading the Voynich, it's comparing its "typing style" to thirteen real languages' typing styles. Closest bars = most similar shape (Finnish, Greek, Hungarian, all famous for long words built from many stuck-together endings). Farthest = English, which relies on short separate words and word order instead.

Distance from the Voynich word-stream fingerprint (shorter bar = better match). The top tier is uniformly inflection-rich; the languages most often "deciphered" in headlines: English, Spanish, Hebrew, Arabic, fit worst. The strangest Voynich habit, word-class chaining, exists in only three tested languages: Finnish (1.36), Hebrew (1.20) and Turkish (1.00) vs Voynich's 1.34.

The same ruler, applied to the Indus script

The toolkit generalizes. We built a calibrated "notation ↔ language" scale, validated on Roman numerals (rigid sign-order), decimal digits (no order) and four languages, and placed two famous undeciphered scripts on it:

Decimal digits
0.02

Languages
~0.14

Indus script
0.215

Voynich
0.336

Roman numerals
0.475

← free ordering (language-like)rigid ordering (notation-like) →

The Indus script, center of a 20-year "is it writing at all?" dispute, measures more language-like than the Voynich, with a real syntax of dedicated opening, connecting and closing signs. The Voynich sits on the notation side: its words are built like ordered symbols, not pronounceable words. (Preliminary: 178-inscription sample.)

New · the manuscript, open on your desk

Explore every folio of the manuscript

In plain words: This is the whole book, typed out, every page and every word, with its measurements beside it. Click any word to look it up. Watch the writers' habits slowly change from page to page, like weather moving across the book.

Each folio card shows its codicological metadata (quire, scribal hand, Currier dialect, illustrated section), its full running text, and where it sits on the production drift curve: the page-by-page change in writing habits (Part 26) that is the manuscript's only long-range signal, strong enough that it can flag misbound pages. High-resolution scans of every folio are free at the Yale Beinecke library; each card links straight to them.

Loading 33,122 words of manuscript…

New · run the experiments yourself

The experiment notebooks: evidence you can poke

In plain words: These are the four biggest experiments, turned into buttons you can press. Flip on a "control", like shuffled text or the meaning-free machine, and watch what fair evidence looks like when the boring explanation gets its turn to answer.

Loading the notebooks…

The surviving model

What the Voynich manuscript most likely is

In plain words: After more than 60 experiments, two answers are still alive. One: the book means nothing, and was made by habits that slowly drifted as the writer worked. Two: the book is a kind of list or recipe book, real but written in a code whose key is lost. The writing by itself cannot tell us which one is right, and we would rather admit that than fake an answer.

An encoded compendium of short records, page = topic, line = record that restates its subject, written in a word-level code (each plaintext word replaced by a code token from an alphabetized, domain-organized, partly homophonic key) over a plaintext whose word-stream type is inflection-rich, with the key a separate physical document, now lost or waiting in an archive.

Why this and nothing simpler: hoax/gibberish died on the grammar and topic results; plain language in an exotic alphabet died on the entropy and layout results; letter-cipher died on the entropy math; sequential catalog numbers died on the label tests. The word-level code over an inflected language is the one description that survives every experiment run against it, including the ones designed to kill it. And one mundane detail explains the manuscript's weirdest feature: if the key was an alphabetized list numbered in order, related plaintext words (think aqua, aquae, aquam) get adjacent code values, which is exactly why thousands of Voynich words differ by a single glyph.

Worked example, how a "word-level code" could create near-twin words

Imagine a code-book where every word of a language is listed alphabetically and given a number, like a dictionary index:

101aqua (water)

102aquae (of water)

103aquam (water, object)

Because aqua / aquae / aquam sit next to each other alphabetically, their code numbers are also next to each other. If those numbers are then re-written in a made-up notation (Voynichese), related words end up looking almost identical, differing by one symbol, the way qokedy and qokeedy do. This single mundane mechanism, an alphabetized index, would explain why ~87% of Voynich words have a near-twin elsewhere in the text, without needing any of it to be readable today.

What this is not. Not a translation, nobody can read the book, including us. Not peer-reviewed, it is a documented, reproducible analysis awaiting hostile scrutiny, which is how knowledge gets made. Several individual findings reproduce prior scholarship (credited in the report); the controls, calibrations, the grammar result and the record-structure results are, to our knowledge, new.

Update (second audit round, Parts 53–63): the picture above survived a second adversarial pass and got sharper. No vowel harmony at sign or unit level (with detectors validated on Turkish and Finnish); no discourse-scale information, the only long-range structure is slowly drifting page register; the grammar survives with the scribal spaces deleted; Rugg's grille and per-letter verbose encodings of abbreviated Latin are both excluded constructively. Whatever Voynichese is, its redundancy is manufactured at the slot/word-assembly level, and if it carries meaning, that meaning lives in line-records, not prose.

Update (third round, Parts 68–70): three more probes at the two survivors, each checked against the earlier result it might duplicate. Word identity adds almost nothing beyond word class that a meaningless class-Markov process cannot fake (79% of the signal reproduced, Part 68), so the "structure is not meaning" lesson holds right down to the single word. Repeated labels do carry a weak page-level signal (Part 69), but it is confounded with plain within-page repetition and does not show that a label names its own picture. And each line clusters its own rare vocabulary the way a real inventory entry does (Part 70) which, with the earlier finding of no information flow between lines, reads as a book of independent line-records. None of the three picks a winner; together they sharpen the record picture and leave the verdict exactly where it was, honestly undecided.

Adversarial validation

We invited a hostile review: then ran its audit

In plain words: We asked an outside critic to try to break our work. They handed us six ways we might be fooling ourselves. We ran all six. Five of our claims survived the attack. One did not, so we crossed it out (and we made two of the survivors more careful). Here is the scoreboard, wins and losses in plain sight.

An independent reviewer examined the full report and prescribed six "break-it" experiments. We ran all six, each built on a single canonical parser (canonical.py) so every number prints from one auditable pipeline. The verdicts, including the casualties:

How to read this table: each row is a "find the flaw" attempt by an outside critic. Ablation means deliberately removing some of the data to see if the result survives without it. A shuffle trap scrambles the text on purpose, if a test still finds a "pattern" in scrambled gibberish, that test is broken. Green = our claim survived the attack. Red = the attack found a real problem and we retracted or narrowed the claim.

Claim under attack	Hostile test	Verdict
The grammar is an artifact of the EVA transcription	Re-run on the independent v101 transliteration (different alphabet), plus a shuffle trap that catches false positives	Grammar survives (r 0.66–0.81; collapses correctly under shuffle)
The grammar is a one-scribe / one-batch convention	Train on one set of physical quires, test on disjoint quires	-edy/-ey generalizes at the ceiling (r 0.553 vs 0.547); -dy/-y weakens, claim narrowed
The grammar is a layout artifact	Delete line-final, paragraph-initial, and p/f/m words; keep only line-medial words	Survives every ablation (z 4.1–5.8)
The e/i "operator" is just graphic line-fill	Layout covariates vs within-item variance partition	Layout explains <0.5%; the operator is a page-subject-bound graded attribute (refined: not locally contextual)
Lines are rows of a flattened table	Vertical position–class information vs the proper global-edge null	Falsified, the "column" signal was the universal line-edge gradient
Plant names exist as multi-token formulas	Bigram/trigram page-concentration vs a burstiness-preserving null	"No names" holds, the apparent signal was topic burstiness

Methodological lesson worth stealing: two of the six tests initially produced impressive false positives (z = 6.6!) that only the correct null model, within-line shuffles and burstiness-preserving permutations, exposed. Naive permutation tests are not enough for this manuscript.

The scientific method, kept honest

Everything we tried: what held, what died, and what's next

In plain words: science is mostly funerals. This wall is ours, including five of our own favourite ideas, tested and buried by our own rules. The last tab, Still to test, is our open to-do list, the experiments we have not run yet.

This is the part most write-ups hide. Every card is explicit about four things: what it was, why we tried it, the test, and what we found (for the open items: the planned test and what a result would show). Each verdict came from a pre-registered pass/fail rule and a null model, not from taste. Killed cards stay published, they are how the surviving picture earned its place. Every card links into the full lab notebook (REPORT.md) where the test is derived.

Loading the ledger…

Two readings remain standing after all of this: a meaning-thin formal system with drifting production habits (now positively evidenced at long range), or a meaningful line-record formulary under a slot-level encoding whose key is outside the manuscript. The text alone cannot split them further, we say so, rather than pick a winner.

What we measured against

The datasets: all public, all fetchable by script

In plain words: Every test here compares the Voynich book to something else: real languages, shuffled copies of itself, or fake text from our machine. This table lists all the "something else" we used. A number like "2.09" means nothing on its own. Next to these, you can see how strange it really is.

Quick definitions: a transliteration (ZL 3b, v101, Takahashi) is a letter-by-letter transcription of the manuscript's glyphs into typeable text, three independent ones exist, made by different scholars, which is how we check a result isn't just an artifact of one person's reading choices. CLTK, bible-corpus and Quran-JSON are public, downloadable collections of real-language text used as "known normal" comparisons. The Digby recipe book and Codex Cumanicus are real historical documents we use as period-appropriate stand-ins for "what writing from this era looks like".

Dataset	Role	Source
Zandbergen–Landini ZL 3b (May 2025), 33,122 tokens	Primary transcription	voynich.nu (pinned mirror in fetch_data_mirrors.sh)
v101 (GC) + Takahashi transliterations	Independent replication	voynich.nu
Latin (Caesar), English (Gutenberg), + 11 more languages	Language controls, matched token counts	Latin Library/CLTK, Gutenberg, bible-corpus, Quran-JSON
Digby recipe compendium (1669)	Record-genre control	Project Gutenberg
Codex Cumanicus (c. 1330): 2,685-type Cuman lexicon + texts we mined from Kuun's 1880 edition	Period-correct Kipchak corpus (built for P54, committed in repo)	build_cuman.py from public-domain OCR
Linear A & Indus corpora	Undeciphered-script comparanda	lineara.xyz, CISI digitization
Fitted self-citation generator, class-Markov, word-bigram, Cardan grille	Null machines, what a meaningless process can and cannot fake	generators.py, rugg_test.py

Parsing decisions are documented and sensitivity-tested: the two silent choices in the transcription pipeline (alternate readings, uncertain spaces) move no headline number beyond noise (P55). A fresh clone on a clean machine reproduces every number exactly (P53).

New · the falsification lab

Test your theory against the manuscript

In plain words: Think you cracked it? Here is the test. Paste in a few hundred words of your answer, and this tool checks it against the real book's measurements, right in your browser. Nothing gets uploaded. Every famous "translation" so far fails several checks. If yours passes, that would be real news.

Paste at least a few hundred words of any text: your proposed plaintext encoded by your proposed method, a "translation" read back into its claimed source language, another language entirely, or your own hand-rolled gibberish. The lab computes the full surface battery, conditional letter entropy, Zipf slope, word lengths, vocabulary concentration, the near-twin network, in your browser (nothing is uploaded), and grades each measurement against the Currier B targets and tolerances.

Loading the lab…

What do these rows actually mean? (plain-language glossary)

Conditional letter entropy, how easy each next letter is to guess (see the Jargon-buster above). Zipf slope, how steeply word-frequency drops off from "the most common word" to "the rarest" (real languages have a characteristic slope). Word length, the average and spread of word lengths; Voynich words almost never exceed ~9 letters. Top-100 coverage, what share of all running text the 100 most common words account for, a measure of how concentrated the vocabulary is. Near-twin network, the share of word types that differ from another word by exactly one letter (explained in the model section above). Layout constraints that need a full manuscript to measure (line-final m-words, paragraph-initial p/f gallows, the grammar consistency r ≈ 0.55, line-bound recurrence) are listed in the report with their measurement code.

Questions everyone asks

Honest answers, no hedging

What does "EVA" mean, and is it the same as a translation?

No. EVA is just a way to type the book's letters using normal keyboard letters, so computers and researchers can work with the text. When this page shows daiin or qokeedy, that is EVA. It tells you which shapes are on the page, the same way writing "alpha beta gamma" tells you which Greek letters are there, without telling you what they mean. EVA is a map of the shapes. A translation is the meaning. We have the map. Nobody has the meaning.

How would you even pronounce a Voynich word?

You can't, really. Nobody ever matched the shapes to sounds. The EVA letters were picked because they look like the shapes and are easy to type, not because we know how they sound. The shape typed "k" does not have to sound like an English k. When people "read Voynichese aloud" in videos, they are choosing sounds, not discovering them.

Has anyone ever decoded even a single confirmed word?

No. A few words have been guessed as names for plants or stars. We tested some of the most famous ones, like the "Pleiades" star label and the zodiac day-numbers (Parts 51 and 52). They came out either unclear or wrong. Not one proposed meaning has passed a fair, blind check. If one ever does, it will be huge news, not a quiet footnote.

Is it a hoax?

Maybe, and that is a serious idea, not an insult. Our simple machine copies almost the whole statistical look of the book, and the only slow pattern in the text is drifting habit, which is what happens when a human writes on autopilot. But the book was also expensive to make, stays steady over ~38,000 words, and is laid out like a reference book. So "fancy but empty" and "hoax" are not quite the same thing.

Why can't modern AI just translate it?

To translate, you need one of three things: a key, a matching text in a language we know, or enough meaning flowing across the text to grab onto. The Voynich has none of them. We measured the third one and found that nothing flows from line to line (Part 56). AI can make any text look "translated", which is exactly why our test lab exists. Any answer, from a human or an AI, has to pass the measurements first.

Is it aliens / magic / a lost civilization?

No. Nothing here needs anything strange. The parchment is dated to the 1400s, the ink and binding are ordinary European, five people did the writing, and the numbers can come from a normal copying process. The mystery is real, but it is a human-sized mystery.

What would change your mind?

A key found in an old archive (we published a ranked list of where to look), a matching text in a language we know, or any proposed reading that passes the test lab above. We also list the experiments that could kill our own current answer, see the open problems.

Who funded this? What are the credentials?

Nobody, and none. This is the free-time work of one curious person, with AI help, said openly. That is exactly why everything is open: the data, the code, the failures, and the strict rules are the credentials. Check any number on this page yourself.

Free tools: take everything

Every tool, free, no dependencies

All code is plain Python 3 (stdlib only, if you can run python3, you can rerun this investigation or point the tools at your own theory). The report documents every experiment including the failed conjectures. Data sources are linked rather than redistributed where licensing is unclear. Start with INDEX.md, the map of every document and script.

Not a programmer? Start here: the Full report is the readable version of everything below, every chart and table on this page links back to a numbered "Part" in it. If you just want to see the evidence, read the report; the scripts exist so anyone can check the report isn't making the numbers up. Roughly: files starting audit_* and decipher_* are the hostile/adversarial tests; *_test.py are the individual experiments referenced by the ledger and audit tables; canonical.py and generators.py are the shared plumbing everything else builds on.

Data sources & prior work

Voynich transliteration: Zandbergen–Landini ZL 3b (May 2025), voynich.nu · Manuscript scans: Yale Beinecke · Linear A: lineara.xyz corpus · Indus: CISI digitization · Control texts: Project Gutenberg, bible-corpus, Quran-JSON. The Voynich Project is maintained by Burak Genç, independent researcher. Prior work this builds on and credits: Currier (languages A/B); Bennett 1976 (entropy); Stolfi (word structure); Montemurro & Zanette 2013 (topic information); Timm & Schinner 2020 (self-citation generation); Bowern & Lindemann 2021 (linguistic survey); Lisa Fagin Davis (scribes); Rao et al. / Farmer et al. (Indus debate).

Open problems we left on the table

For anyone who wants to pick up where this project stops, six concrete next steps, in our own words:

① ~~Syllable-value search~~, done (Part 64): null against search-matched scrambled-lexicon controls; the last internal route is closed, and the ~57–74% null hit-rates measure exactly how convincing a wrong "translation" can feel. ② ~~Takahashi-transcription replication of the new Parts 54/56/58 results~~, done (Part 65): all three replicate on Takeshi Takahashi's fully independent transcription, no threshold flipped and the no-harmony result came out cleaner. ③ Hand-proofing the OCR-mined Cuman corpus against Kuun's printed pages. ④ ~~A true medieval list/inventory corpus (account books, litanies) for the long-range genre axis~~, done (Part 66): a Vulgate list/inventory register carries the strongest long-range information of any corpus tested, far above the Voynich's flat profile, so "it is just a list" no longer explains the absence. ⑤ The key, if it exists, is an archival object, the Kircher correspondence and Rudolf II inventories are where provenance points. ⑥ Everything here deserves hostile replication; the falsification lab above is the bar we set for ourselves too.