    Jonathan Basile

    Compared with reading the titles, delving into the text is far more maddening. It’s rare to find a word of more than four or five letters on a page of text, and if you restrict yourself to looking for words separated by white space, it’s rare to find any at all. Still, to embrace the essence of the library one must open its books, whether to look for sense, or to welcome nonsense.

    The Anglishize function, which has a link on each page of the library’s books, can be helpful for cutting through some of the library’s noise, or creating more, depending on your outlook. It highlights all of the English words on a page of text.
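
    As a rough illustration of that kind of highlighting (not the site's actual code), the sketch below scans a page for runs of letters and reports those found in a word list. The tiny `ENGLISH_WORDS` set, the `find_english_words` name, and the `min_len` cutoff are all assumptions made up for this example; a real version would load a full dictionary and might also match words embedded inside longer runs.

```python
import re

# Hypothetical stand-in for a real English dictionary.
ENGLISH_WORDS = {"a", "i", "an", "at", "cat", "dog", "the", "book", "library"}


def find_english_words(page: str, min_len: int = 3) -> list[tuple[int, str]]:
    """Return (offset, word) pairs for delimited letter runs found in the word list."""
    hits = []
    # Each maximal run of lowercase letters is treated as one candidate word.
    for match in re.finditer(r"[a-z]+", page):
        word = match.group()
        if len(word) >= min_len and word in ENGLISH_WORDS:
            hits.append((match.start(), word))
    return hits


if __name__ == "__main__":
    sample = "xqz cat,pdlw the.zzk book"
    for offset, word in find_english_words(sample):
        print(offset, word)
```

    The offsets are enough to wrap each hit in whatever highlight markup a page uses.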

    Please share if you come across any unexpected words in the library, or even, some day, a phrase or two. The text search can be helpful in looking for words, or you can always try your luck with a random page of a random book.

    The text… it speaks to me…

    Out of curiosity: does this website, in theory, have the code to produce all the possible combinations? If so, why not use a bot to scour the library and find the thousands of books with the potential to revolutionize mankind? Or, better yet, to entertain us; the owner of this website could also make a mint.

    I understand the purpose of the website, namely its metaphoric beauty, but the above is a beauty in itself…

    Jonathan Basile

    Hey Smartypants,

    Well, my goal in creating the site wasn’t necessarily to create what I thought were the greatest of humanity’s books. I was more interested in trying to recreate the experience of being a librarian in the library of babel, which requires that any meaningful texts be drowned out by seas of nonsense.

    Additionally, I didn’t want to impose any of my own value judgments, aesthetic or otherwise, on what ultimately gets created. I might think that some of the books are more interesting or beautiful than others, but I wouldn’t want to remove the ones I find uninteresting, because someone else might see something in them.

    If my goal were to post on the web the greatest books ever written, I would’ve just done what I did on the about page, and posted links to Borges’ work:

    Borges – Complete Fiction

    Borges – Selected Non-Fiction

    Borges – Selected Poems

    There’s no point in writing a sequential automaton that searches for comprehensible strings. No matter how fast it is, the longer the expected string, the longer (I mean, extremely long) the running time. I am not entirely sure, but this is probably an NP-hard problem. Think it through, dude: high-value texts will be found in long strings. You don’t want to search for the string ‘abc’, as you won’t learn anything from it, but for something like ‘for a successful time travel you will need 3 lemons, 5 axes and 2 ku klux klan members’, and this is already a long string, and we don’t know in advance the type and number of ingredients.
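
    To put a rough number on how fast the cost grows with string length, here is a back-of-the-envelope sketch. It assumes (nothing in this thread states it) an alphabet of 29 symbols and uniformly random text, so a specific string of length n matches at a given position with probability 29^-n, and the expected number of positions to inspect is on the order of 29^n. The function names are made up for the example, and overlap effects between neighbouring positions are ignored.

```python
import math

ALPHABET_SIZE = 29  # assumption: 26 letters plus space, comma and period


def expected_positions(target: str, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Order-of-magnitude count of random positions checked before one matches."""
    return alphabet_size ** len(target)


def log10_expected(target: str, alphabet_size: int = ALPHABET_SIZE) -> float:
    """Same quantity as a power of ten, which stays readable for long strings."""
    return len(target) * math.log10(alphabet_size)


if __name__ == "__main__":
    for text in ("abc", "lemons", "for a successful time travel"):
        print(f"{text!r}: about 10^{log10_expected(text):.1f} positions")
```

    Every extra character multiplies the expected work by the alphabet size, so a faster scanner only buys a few more characters.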

    The only even remotely viable method I can think of might be to build an enormous Hadoop cluster fed with books that have passed a very minimalistic pre-filtering algorithm.

    For the pre-filtering algorithm one should write extremely small and very fast checks. For example, if the title is not an English word, skip the book. Or if the book starts with exactly one or two, or at least four, dots, skip the book (some artsy authors start their books with ‘…’, so we accept this case). Or if it contains any triplet of the same letter, skip the book (this is not strictly correct, but don’t dig that deep), and so on. These are just wild ideas, with no proof that they actually help.
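
    The three checks above could be sketched roughly like this. It is only an illustration under stated assumptions: `ENGLISH_WORDS` is a tiny made-up stand-in for a real dictionary, and `keep_book`, `leading_dots` and `has_letter_triplet` are hypothetical names, not anything the site provides.

```python
import re

# Hypothetical stand-in for a real English dictionary.
ENGLISH_WORDS = {"time", "travel", "library", "lemons"}


def leading_dots(text: str) -> int:
    """Count the dots at the very start of the text."""
    return len(text) - len(text.lstrip("."))


def has_letter_triplet(text: str) -> bool:
    """True if the same letter occurs three times in a row anywhere in the text."""
    return re.search(r"([a-z])\1\1", text) is not None


def keep_book(title: str, text: str) -> bool:
    """Cheap pre-filter: return False as soon as any check rejects the book."""
    if title not in ENGLISH_WORDS:
        return False
    dots = leading_dots(text)
    # Zero dots or exactly three ('...') are accepted; 1, 2, or 4+ are rejected.
    if dots in (1, 2) or dots >= 4:
        return False
    # Deliberately crude: this can also reject legitimate text.
    if has_letter_triplet(text):
        return False
    return True


if __name__ == "__main__":
    print(keep_book("library", "...once upon a time"))
    print(keep_book("library", "..once upon a time"))
```

    Ordering the checks from cheapest to most expensive lets most books be rejected after looking at only the title or the first few characters.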

    Then feed this reduced book set into the above-mentioned Hadoop cluster (as if the set were exactly one book with nearly infinite pages), and try to come up with a huge-scale statistical approach, on top of which some kind of AI sorts out the books that have more than an arbitrarily defined probability of containing anything usable.

    With this approach you might find something valuable every few billion years, I think.

