Theory - Grains of Sand

libraryofbabel.info is now in its second iteration. When the project began, I thought that even a virtual universal library faced insurmountable limits that made its realization impossible. But with the help of some advice I received from friends and visitors to the first version of the site, I’m happy to say that I’ve proven myself wrong. (You can take a look at the algorithm source code here.)

The library originally worked by randomly generating text documents, storing them on disk, and reading from them when visitors to the site made page requests. Searches worked by reading through the books one by one. It was a method with no hope of ever achieving the proportions of the library Borges envisioned; it would have required longer than the lifespan of our planet to create and more disk space than would fit in the knowable universe to store. I wrote about the cosmic proportions of this shortcoming in a former theory page.

Because the virtual world often inspires suspicion, I feel I should explain how the new library functions, to reassure anyone who might think some chicanery was at work. I would be the first to be disappointed if this site did not truly contain what it claims to: every possible page of 3200 characters. I encourage those who prefer a sense of mystery, rather than knowing what goes on behind the looking glass, not to read on.

drawing of the library of Babel's hexagon pattern

The new site uses a pseudo-random number generating algorithm to produce the books in a seemingly random distribution, without needing to store anything on disk. Though I considered similar methods when I was starting out, at the time I lacked the mathematical knowledge and programming abilities to see how to realize it while remaining true to the story. I needed an algorithm regular enough to create the same block of text in the same place every time, yet random-seeming enough that no user would notice patterns moving from one page to the next.

It would be easy enough to use any programming language’s built-in random number generator to accomplish this task. With a few exceptions, there’s no real randomness in computing - a random number generator is just a deterministic equation which produces different values by starting from a different input each time. This input is called the random seed, and a computer’s system time is often used to guarantee a changing value. It would have been possible to create a sizeable library by using the book’s “location” (hexagon, wall, shelf, and volume) as a random seed, thus guaranteeing the same page in the same place every time.
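The location-as-seed idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not the code the site runs: the 29-character set (lowercase letters, space, comma, period), the `page` argument, and the function name `page_at` are assumptions made for the example.

```python
import random

# Assumed character set: 26 lowercase letters plus space, comma, and period.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."
PAGE_LENGTH = 3200


def page_at(hexagon: str, wall: int, shelf: int, volume: int, page: int) -> str:
    """Return the text found at a location by seeding a generator with it.

    The same location always yields the same seed, so it always yields
    the same page, and nothing has to be stored on disk.
    """
    # A string seed is hashed deterministically by random.Random, so the
    # result does not change between runs of the program.
    seed = f"{hexagon}-{wall}-{shelf}-{volume}-{page}"
    rng = random.Random(seed)
    return "".join(rng.choice(ALPHABET) for _ in range(PAGE_LENGTH))


if __name__ == "__main__":
    first_visit = page_at("abc123", 2, 4, 17, 301)
    second_visit = page_at("abc123", 2, 4, 17, 301)
    print(first_visit == second_visit)  # the page is stable across visits
    print(first_visit[:60])
```

A generator seeded this way only runs forwards: given a block of text, it offers no shortcut for finding the location that would produce it.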

However, I had grown quite attached to the idea of having a searchable library. For this to be possible, the algorithm I chose needed to be invertible as well. This means that for any block of text, the program can work backwards to calculate its location in the library (the random seed which would produce that output). I couldn’t help but feel that the result was a computer-age form of gematria, converting text to numbers and back again to text.
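The conversion of text to numbers and back can be made concrete by treating each character as a digit in base 29. The sketch below is a minimal illustration under assumptions: the character ordering and the function names `text_to_number` and `number_to_text` are invented for the example, and this is not presented as the site's own scheme.

```python
# Assumed character set: 26 lowercase letters plus space, comma, and period.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."
BASE = len(ALPHABET)  # 29


def text_to_number(text: str) -> int:
    """Read the text as a base-29 numeral, most significant character first."""
    number = 0
    for character in text:
        number = number * BASE + ALPHABET.index(character)
    return number


def number_to_text(number: int, length: int) -> str:
    """Invert text_to_number, producing exactly `length` characters."""
    characters = []
    for _ in range(length):
        number, digit = divmod(number, BASE)
        characters.append(ALPHABET[digit])
    return "".join(reversed(characters))


if __name__ == "__main__":
    phrase = "the library of babel."
    as_number = text_to_number(phrase)
    print(as_number)
    print(number_to_text(as_number, len(phrase)) == phrase)  # round trip holds
```

Because the mapping is one-to-one, every block of text has exactly one number and every number in range has exactly one block of text. Used by itself it would place near-identical pages side by side, so a library built on it would still need an invertible scrambling step to look random.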

It took a significant amount of experimentation to find an algorithm capable of meeting these requirements and producing 29³²⁰⁰ unique values. One of my early attempts used a Halton sequence, which produces a pseudo-random distribution by creating fractions evenly distributed between 0 and 1, which I then multiplied by a number around 29³²⁰⁰. I’ve been working with C++, whose native integer types only store numbers of up to 64 bits, which is about 19 digits in base 10. The library requires working with values of about 4,680 digits, or roughly 15,500 bits.
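The sizes involved can be checked directly. The short Python sketch below counts the bits and decimal digits needed to hold 29³²⁰⁰ and compares them with the largest 64-bit unsigned value; the variable names are illustrative only.

```python
import math

CHARSET_SIZE = 29
PAGE_LENGTH = 3200

# Python integers are arbitrary precision, so this value is exact.
total_pages = CHARSET_SIZE ** PAGE_LENGTH

# Number of binary digits needed to write total_pages.
bits_needed = total_pages.bit_length()

# Number of decimal digits, computed with logarithms rather than by
# converting the whole integer to a string.
decimal_digits = math.floor(PAGE_LENGTH * math.log10(CHARSET_SIZE)) + 1

# Largest value a 64-bit unsigned integer can hold, for comparison.
uint64_max = 2 ** 64 - 1

if __name__ == "__main__":
    print(bits_needed)           # 15546
    print(decimal_digits)        # 4680
    print(len(str(uint64_max)))  # 20
```

The largest 64-bit value has 20 digits, but only numbers of up to 19 digits are guaranteed to fit, which is why a multiprecision library is unavoidable here.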

Even using a multiprecision library capable of calculating with numbers of that size, any division operation broke invertibility. Fractional values are difficult to represent in computing because many of them have infinite expansions, which must be cut off to fit a finite binary sequence. A tiny amount of information was lost every time the operation was performed, and that loss was enough to keep the program from working backwards to the original value.
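The loss is easy to reproduce at a much smaller scale. The sketch below uses ordinary 64-bit floating point rather than a multiprecision library, and a toy range of 29⁴⁰ rather than 29³²⁰⁰, so it only illustrates the kind of failure described, under those assumptions. Dividing an index into a fraction between 0 and 1 and multiplying back does not return the index, while exact rational arithmetic does.

```python
from fractions import Fraction

# Toy range, far smaller than the real library's 29**3200.
RANGE = 29 ** 40

# An odd index somewhere inside the range.
index = (RANGE // 3) | 1

# Division rounds the quotient to a float with a 53-bit significand.
as_float = index / RANGE

# Scaling back up cannot restore the bits that were rounded away.
recovered = int(as_float * RANGE)

# Exact rational arithmetic keeps every bit, at the cost of never
# collapsing the value into a fixed-size representation.
exact = Fraction(index, RANGE) * RANGE

if __name__ == "__main__":
    print(recovered == index)  # False: information was lost
    print(exact == index)      # True: nothing was rounded
```

Raising the precision shrinks the error but never removes it, since any fixed number of bits still truncates most quotients. Staying in integer arithmetic avoids the problem altogether.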

I found a successful formula combining modular arithmetic and bit-shifting operations, and the result is the library you see today. To date I have never witnessed a search fail, though I hope that if any reader comes across something which appears erroneous, she will bring it to my attention. Of course, if 5,000 visitors each made a hundred page requests per day, it would require 10⁴⁶⁷¹ years to test every possible value contained in the library.
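The text does not give the formula itself, so the following is a generic sketch of how modular arithmetic and bit shifts can be combined into a scrambling step that can be undone exactly. It is not the site's algorithm. The 64-bit width, the constants, and the names `scramble` and `unscramble` are all assumptions chosen for the example; a real library would need a domain wide enough for every page.

```python
WIDTH = 64                       # toy width in bits (assumption)
MODULUS = 1 << WIDTH
MASK = MODULUS - 1

MULTIPLIER = 6364136223846793005  # any odd constant is invertible mod 2**WIDTH
INCREMENT = 1442695040888963407   # arbitrary constant
SHIFT = 29                        # arbitrary shift distance

# Modular inverse of the multiplier (three-argument pow, Python 3.8+).
MULTIPLIER_INVERSE = pow(MULTIPLIER, -1, MODULUS)


def scramble(value: int) -> int:
    """Map a location number to a scrambled number of the same width."""
    value = (value * MULTIPLIER + INCREMENT) & MASK  # modular arithmetic
    value ^= value >> SHIFT                          # bit-shifting step
    return value


def unscramble(scrambled: int) -> int:
    """Work backwards from a scrambled number to the original location."""
    # Undo the xor-shift: the top SHIFT bits are already correct, and each
    # pass repairs another SHIFT bits below them.
    value = scrambled
    for _ in range(WIDTH // SHIFT + 1):
        value = scrambled ^ (value >> SHIFT)
    # Undo the modular multiply-and-add.
    return ((value - INCREMENT) * MULTIPLIER_INVERSE) & MASK


if __name__ == "__main__":
    for location in (0, 1, 2, 123456789, MASK):
        print(location, scramble(location), unscramble(scramble(location)) == location)
```

Every step uses only integer operations, so no information is rounded away, and neighbouring inputs land far apart in the output. Those are the two properties a searchable, random-seeming library requires.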

Imagine a demiurge were to create the shoreline of earth’s continents, not with the coarse tools we use in constructing artificial beaches, but with enough resolution to know each grain of sand by name, and recall its exact location when asked for it. The pages of rational text which this algorithm can locate are rarer than a single grain of sand in that collection, yet intrinsically no more meaningful.

From Erik Desmazières’s illustrations of the Library of Babel. Librarians comb through endlessly receding shelves of books beneath a hexagonal skylight.

Though the more powerful search engine changes the experience of the library, I feel that it leaves the essence of our encounter with its texts unchanged. Previously, finding a word of more than six letters was a rare occurrence. Now, it is trivial to find entire pages of prose in any language which uses a Roman alphabet. (As I mentioned in Alphabets & Irony, there is still an implicit Englishness to our character set.)

Interestingly, this leaves the frustration of using the library unaltered. One can find only text one has already written, and any attempt to find it among other meaningful prose is certain to fail. The tantalizing promise of the universal library is the potential to discover what hasn’t been written, or what once was written and now is lost. But there is still no way for us to find what we don’t know how to look for. Unless, of course, you’re brave enough to browse, or open books at random.

In truth, if the new algorithm somehow allowed reason to triumph over iterability, I’m not sure I would’ve gone ahead with it. The most important experience the library can offer us is that of being overwhelmed by irrationality. The initial reaction of unbounded joy which Borges described the librarians experiencing at the proclamation that their library contained all books is one I imagine some users may feel encountering the collection here. No doubt, the desire it produces will become a more melancholy longing with time, for all those who fail to find what Borges described as their “vindication.” Nonetheless, the library contains its own sort of poetry and revelation, and even this disappointment can provide a moment of clarity.

The Tower of Babel by Pieter Bruegel The Elder