Proving it is fake
This topic contains 3 replies, has 1 voice, and was last updated by Ray 2 months, 1 week ago.
April 11, 2017 at 7:06 pm #14801
Let’s talk about three things :
1°) When I search for “bob”, the “exact match” section shows my “bob”. However, if I follow only the location link of the book and go to the page, there is only “bob” on the page, while absolutely EVERY OTHER page is always made of 3278 characters.
2°) If I search “il drache dans el chnord hein comment ksa va min pti quien hein ptite bagette o fromag de chevre allegrement langoureux au sein de la fusion torride des sentiments du divin pretre connu sous le nom de einverstanden”, which is a mix of misspelled and correctly written French words, some words specific to my region, and one German word, the “English generation” finds it… which is not possible, since the words are absolutely not English.
3°) A query with random characters always shows the matched query half at the end of the 20th line and the other half at the beginning of the 21st line.
4°) The hexcode is always the same length for each category: 3254 for random characters (so it is impossible to browse by hand).
5°) In programming, it is not possible to search for a query among 3254*6*5*32 pages in less than 2 seconds. Even supercomputers can’t do that.
Query > Generates a nicely encrypted hash of 3254 characters that contains the query (remember: 3200 characters maximum), calculates the length of the query to know where to split it, then generates a page of random characters, puts the query in and fills the rest with the other characters. (Works with the English words too, but I didn’t check the hash length.)
HashCode Query > The hash contains all the information needed, including the location and the other books around it, to keep the obfuscation perfect.
Browsing > Fully random generated characters
Random > Fully random generated characters
Note to moderation : Freedom of Expression. Nothing is disrespectful.
April 11, 2017 at 7:07 pm #14802
Well, I ended up with more than 3 things because I found the others along the way and forgot to change the beginning, sorry.
April 11, 2017 at 11:54 pm #14810
You are, in fact, absolutely right when you say that it is a generator, not a storage.
Though, ‘fake’ is a word that is a little off here.
I reverse-engineered the process and got the following:
The string of text you search for is converted to bytes, then those bytes are converted to an integer.
Then it is integer-divided by the number of pages in the book, and the remainder is stored as the page number.
Then by the number of books on the shelf, in the same way; the remainder is the book’s number.
Then by the number of shelves on the rack (bookcase).
Then by the number of racks (bookcases) in the room.
And then some magic happens: the rest of the number is converted to a base-16 (hex) string, which is now considered the room’s address.
And then it all is put together in a fancy way. R:BC:S:B:P
See what I did there?
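The division steps above can be sketched in Python. This is only an illustration of the described scheme, not the site's actual code: the sizes (410 pages, 32 books, 5 shelves, 4 bookcases), the UTF-8 encoding, the big-endian byte order and the function name `text_to_address` are all assumptions that do not come from the post.

```python
# Illustrative sizes; none of these numbers are given in the post above.
PAGES_PER_BOOK = 410
BOOKS_PER_SHELF = 32
SHELVES_PER_BOOKCASE = 5
BOOKCASES_PER_ROOM = 4


def text_to_address(text: str) -> str:
    """Map a text to an R:BC:S:B:P address by repeated division."""
    # Text -> bytes -> one big integer (encoding and byte order are assumed).
    n = int.from_bytes(text.encode("utf-8"), "big")

    # Each divmod peels off one remainder and keeps the quotient.
    n, page = divmod(n, PAGES_PER_BOOK)
    n, book = divmod(n, BOOKS_PER_SHELF)
    n, shelf = divmod(n, SHELVES_PER_BOOKCASE)
    n, bookcase = divmod(n, BOOKCASES_PER_ROOM)

    # Whatever is left becomes the room, written in hexadecimal.
    room = format(n, "x")
    return f"{room}:{bookcase}:{shelf}:{book}:{page}"


print(text_to_address("bob"))  # 18:2:1:22:102 with the sizes assumed above
```

A three-letter query gives a short room number; a full page of text would give a room number thousands of hex digits long, which would fit the observation that the hexcode cannot be browsed by hand.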
Now, assuming you got your address by yourself or from a friend:
Hex to integer
Then we multiply it by the number of bookcases in the room and add the bookcase number from the address
Then the same with shelf
Same with book
Same with page
Then just convert it to a regular base10 integer and convert the integer back to text.
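The way back can be sketched the same way. Again this is a hedged illustration, not the site's implementation: the sizes, the UTF-8 encoding, the byte order and the name `address_to_text` are assumptions, and the address format is the R:BC:S:B:P one described above.

```python
# Illustrative sizes; none of these numbers are given in the post above.
PAGES_PER_BOOK = 410
BOOKS_PER_SHELF = 32
SHELVES_PER_BOOKCASE = 5
BOOKCASES_PER_ROOM = 4


def address_to_text(address: str) -> str:
    """Rebuild the text from an R:BC:S:B:P address by multiply-and-add."""
    room_hex, bookcase, shelf, book, page = address.split(":")

    # Hex to integer, then undo each division in reverse order.
    n = int(room_hex, 16)
    n = n * BOOKCASES_PER_ROOM + int(bookcase)
    n = n * SHELVES_PER_BOOKCASE + int(shelf)
    n = n * BOOKS_PER_SHELF + int(book)
    n = n * PAGES_PER_BOOK + int(page)

    # Integer -> bytes -> text (encoding and byte order are assumed).
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw.decode("utf-8")


print(address_to_text("18:2:1:22:102"))  # bob, with the sizes assumed above
```

Because every step is plain arithmetic on one big integer, nothing has to be stored or searched: the same address always reproduces the same text.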
About your #1, ‘Bob’ is not all that is really on the page; it is, most likely, ‘Bob’ + 3275 spaces (3278 minus the length of the word ‘Bob’).
The pages nearby are, of course, filled with random stuff, but since the hash doesn’t change, they won’t change either. So it is semi-legit.
Now, to the point. Even considering the fact that the Library of Babel does not really contain anything, no knowledge, no nothing, even then you can find what you want. The only thing you need is to know what exactly you are looking for.
That is the reason you can find gibberish text and wrong facts, because you were looking exactly for them in the first place. I hope this bit covers your #1 and #2.
As for #3, I believe that when you specify “with random characters” they are added one by one on each side until the limit of 3278 is met, and then the hashing occurs. That’s why you get your text in the middle of the 40 lines (which is line 20 or, if the text is a bit longer, line 21). This is semi-legit too, but yeah, it would be nice to randomize where the text is placed when it comes to random generation.
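The line numbers follow from simple arithmetic if the query is centered on the page. A small sketch, assuming 40 lines of 80 characters (the line width is an assumption chosen to match the 3200-character maximum mentioned in the first post; `centered_query_lines` is a hypothetical helper):

```python
LINES_PER_PAGE = 40  # mentioned in the reply above
CHARS_PER_LINE = 80  # assumption: 40 * 80 = 3200 characters per page


def centered_query_lines(query_length: int) -> tuple[int, int]:
    """Return the 1-based lines on which a centered query starts and ends."""
    page_length = LINES_PER_PAGE * CHARS_PER_LINE
    if not 0 < query_length <= page_length:
        raise ValueError("query must fit on one page")

    # Equal padding on both sides puts the query in the middle of the page.
    start = (page_length - query_length) // 2
    end = start + query_length - 1
    return start // CHARS_PER_LINE + 1, end // CHARS_PER_LINE + 1


print(centered_query_lines(10))  # (20, 21)
```

Under these assumptions, a 10-character query starts near the end of line 20 and finishes at the beginning of line 21, which is the pattern described in #3.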
#4 seems a bit off, but I didn’t check this, so I can’t really say anything about it.
#5 made me giggle a bit (2 seconds is a bit long for these operations. Maybe that’s because of code quality, I don’t really know). But yet again, you are right: the data is not actually being ‘searched’, it’s being generated during these 2 seconds.
That’s it :]
The other thing that I was concerned about is that the author states that there is every possible book in the LoB. Good luck trying to find any book in any other non-iso-8859-1 language :]
Still, I think it is a software engineering and philosophical marvel, and I respect the creator of the idea and its realization and the developer.
April 17, 2017 at 7:43 pm #14911
1. Space is a character. The idea behind exact search is that the page is your query with every other character being a space. In a character set that includes spaces, it’s natural that there will be a page that is nothing but spaces, and every variation of it with 1, 2, or more letters in different places.
2. That’s because it’s your search query, plus random English words. In order for the algorithm to reverse engineer itself to determine what page your query is on, it must have a full page of text. In order to obtain this full page, it either
a) adds spaces (exact search)
b) randomly generates characters around your query (with random characters)
c) randomly generates English words around your query (with English words)
Once the site has combined everything into a String matching the page length, it can then reverse engineer the algorithm to determine a location where that page would appear.
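The three ways of filling the page can be sketched as below. This is a rough illustration under stated assumptions, not the site's code: the page length of 3200, the `ALPHABET`, the tiny `WORDS` list and the names `build_page` and `word_filler` are all made up for the example, and exact mode puts the query first followed by spaces, as suggested earlier in the thread.

```python
import random
import string

PAGE_LENGTH = 3200                         # assumed page size
ALPHABET = string.ascii_lowercase + " ,."  # assumed character set
WORDS = ["the", "library", "of", "babel", "book", "page"]  # stand-in word list


def word_filler(length: int, rng: random.Random) -> str:
    """Return exactly `length` characters of space-separated words."""
    parts, size = [], 0
    while size < length:
        word = rng.choice(WORDS)
        parts.append(word)
        size += len(word) + 1
    # The trailing space guarantees enough characters before truncating.
    return (" ".join(parts) + " ")[:length]


def build_page(query: str, mode: str, rng: random.Random) -> str:
    """Pad a query to a full page in one of the three search modes."""
    if len(query) > PAGE_LENGTH:
        raise ValueError("query is longer than a page")
    free = PAGE_LENGTH - len(query)
    if mode == "exact":
        return query + " " * free  # a) spaces only
    before, after = free // 2, free - free // 2
    if mode == "random_chars":     # b) random characters around the query
        fill = lambda n: "".join(rng.choice(ALPHABET) for _ in range(n))
    elif mode == "english_words":  # c) random English words around the query
        fill = lambda n: word_filler(n, rng)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return fill(before) + query + fill(after)


page = build_page("bob", "english_words", random.Random(0))
print(len(page), "bob" in page)  # 3200 True
```

Whichever mode is chosen, the result is one full-length string, and it is that whole string that gets turned into a location.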
3. This is a result of the page creation stated above, which is not currently designed to randomize location.
5. You don’t understand the library. Please read the theory and so on; the site explains its own design. Nothing is physically stored, but everything is kept consistent so that all truths and lies can be found in the same places every time you visit a location.
This library is less about what exists but hasn’t been found; it’s more about what CAN exist but hasn’t been found. In a sense, it’s basically the same thing either way.