Proof that the babel image archive and image search are fake


This topic contains 15 replies, has 2 voices, and was last updated by  James Kitching 4 months, 4 weeks ago.

Viewing 15 posts - 1 through 15 (of 16 total)
  • #22244 Reply

    TheSkeptic

    You can prove for yourself that this whole image archive thing is a hoax. Most probably you have looked through the universal slide show in hopes of finding something, and most probably you found nothing. Just colorful pixels randomly scrambled. But what we want to find are images of something other than random pixels. Well, the “image search” function makes you think that this is possible, or somewhat possible: if it found the image you submitted somewhere in the archive, then there is a possibility of finding something, right?

    Well, no.

    Ignoring the fact that it would be near impossible, mathematically speaking, to find a relevant image of something, it simply isn’t possible. Maybe I don’t understand how this works, most probably I don’t, but from what I get, you have all possible arrays of pixels in that space with those colors, so you have every possible visual outcome. But the outcomes shouldn’t repeat; if there’s one image at location x, then that image should not show up at location y.

    And here’s where the proof comes in. The proof is very simple. I started the universal slide show. I downloaded the slide that showed up, which was at some location x with fewer than 20 digits. Then I searched for that downloaded image with the search function, the same image of random pixels that had been shown to me before, and boom. It found my image at location y, a location with tons of digits. But if it showed up to me at location x, why did the search find it at location y? You can try it yourself.

    That’s why I believe, or I would say that I know, that this all is fake. It all just makes you believe that it actually has all possible images, and you can even search for the images. Nah, it’s more like an image uploader like imgur or something like that. It’s like a cheap magician, like Oz. It’s just a foolish trick.

    Thanks for reading.

    (This was written as of 10 / 10 / 2017, if you try to prove it yourself later, the dev might have “patched it up” already, so you can’t uncover the trick)

    #22245 Reply

    TheSkeptic

    I forgot to say that I actually tried several times and every time the same thing happened. So it wasn’t just a lucky attempt or an error or whatever.

    #22246 Reply

    Delengroth
    Participant

    @TheSkeptic
    “Maybe I don’t understand how this works”

    Somewhat, but you also overlooked something. You see, in order for the image you search for to be found, the Library has to convert it to only contain 4096 colors. However, I’m fairly certain that Jonathan’s script does this without checking if it already contains only 4096 colors. Try repeating your experiment, but with an extra step: Search for the image you found at location Y. You’ll see it’s now in a new location Z. This will keep happening, because each time you search for the newly-found image it slightly reduces the quality, producing a “needs more JPEG” effect.

    I tried the experiment myself, and sure enough, that’s what’s happening. I used the Mona Lisa:
    After 1 iteration: https://babelia.libraryofbabel.info/imagebookmark2.cgi?mona_lisa_quality_loss_1
    After 10 iterations: https://babelia.libraryofbabel.info/imagebookmark2.cgi?mona_lisa_quality_loss_10
    After 20 iterations: https://babelia.libraryofbabel.info/imagebookmark2.cgi?mona_lisa_quality_loss_20

    You can see the difference between 1 and 10 easily. Between 10 and 20 is a little more difficult, but there are some pixels that flicker around the inside curve of the road to the left of her right shoulder, as well as some in the mountains right above the road. Perhaps there’s a point where it tapers off, and the compression algorithm doesn’t reduce the quality anymore.
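
    As a rough illustration of the color reduction described above, here is a minimal Python sketch. It assumes the 4096 colors come from keeping 4 bits per RGB channel (16 × 16 × 16 = 4096); the thread does not say how the site actually reduces colors, and `to_4096_colours` is a hypothetical helper, not Jonathan’s script. In this simplified model the reduction on its own is idempotent, which would suggest that a location that keeps changing comes from some additional lossy step, such as re-encoding the downloaded file.

```python
def quantize_channel(value):
    """Keep the top 4 bits of an 8-bit channel, then scale back to 0..255."""
    return (value >> 4) * 17  # 17 maps the 16 levels 0..15 onto 0..255


def to_4096_colours(pixel):
    """Reduce an (r, g, b) pixel to one of 16 * 16 * 16 = 4096 colours.

    Assumption: 4 bits per channel. The real site may quantize differently.
    """
    r, g, b = pixel
    return (quantize_channel(r), quantize_channel(g), quantize_channel(b))


off_white = (250, 251, 249)
once = to_4096_colours(off_white)
twice = to_4096_colours(once)
print(once, twice)  # the second pass changes nothing in this model
```

    Under these assumptions, an off-white pixel snaps to pure white on the first pass and then stays put.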

    #22250 Reply

    TheSkeptic

    Hmmm, you’re right. I didn’t take into account that the jpeg effect might happen; that’s probably the reason for what’s happening here. However, I still think that this is kinda “fake”. Why is every image you search for always at some location with a huge number of digits? Does that mean that in the first 10^800.000 locations there’s nothing relevant? Just pure pixels? I would like to see some relevant image in some of the first, I don’t know, 10^100 locations or something like that, to disprove the belief that all interesting images are at huge locations.

    Also, thanks for the input!

    #22254 Reply

    TheSkepticIsRight

    Obviously this site is a farce. Obviously. The “About” section claims this “archive” contains approximately 10^{961755} images: this is wildly impossible on its face. I am not disputing the count of how many images there would be if you drew all of them with 416 rows and 640 columns in 4096 colors, namely 4096^{266240}: for each pixel we have 4096 choices of color, and we make that choice 416×640 times. Here I am assuming 416×640=266240 and 4096^{266240}~10^{961755} as claimed; I am not bothering to do the arithmetic.

    Instead the point is that this number, 10^{961755}, is vastly too large. There are only about 10^{80} atoms in the observable universe (Google it). Even if we could encode each of these pictures on a single atom (which is impossible, at least based on our current understanding of the number of degrees of freedom of any single atom versus the number of bits requisite to specify such an image), this would only comprise about 10^{-961673} percent of the possible images. This is .000…0001%, with 961672 zeros in total between the decimal point and the 1.
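
    Since the post skips the arithmetic, here is a short Python check of the figures it quotes. It works with logarithms rather than printing the full number, and it takes the post’s rough figure of 10^80 atoms as given.

```python
import math

width, height, colours = 640, 416, 4096

pixels = width * height                      # pixels per image
log10_images = pixels * math.log10(colours)  # log10 of 4096 ** pixels
digits = math.floor(log10_images) + 1        # decimal digits in the image count

# Rough figure from the post: about 10^80 atoms in the observable universe.
log10_atoms = 80
log10_fraction = log10_atoms - log10_images  # log10 of (atoms / images)

print(pixels)                    # 266240
print(digits)                    # 961755
print(round(log10_fraction, 2))  # as a percentage, add 2 to this exponent
```

    So the image count is a 961755-digit number, and one image per atom would cover a fraction of roughly 10^-961675 of them, which is the 10^{-961673} percent quoted above.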

    Anyway, at best, the only thing this site could be doing is computing a function that assigns a unique number to each of the theoretically possible 416×640 images in 4096 colors. So when you input an image, it computes what number in the list it would be if the site had already stored every image, which it hasn’t. Though @TheSkeptic’s test would seem to indicate that it’s not even doing that…

    #22268 Reply

    Delengroth
    Participant

    @TheSkepticIsRight

    I think you might have misinterpreted the About page. Yes, it says the library “contains” all combinations of those images, but it doesn’t actually store them. That would be impossible. It would be more accurate to say it contains the possibility to calculate them (which you described in your last paragraph). It’s not a farce, just math.
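
    To make “the possibility to calculate them” concrete, here is a minimal Python sketch of one way to number images without storing any. It simply reads the pixels as digits of a base-4096 number. This is an assumption for illustration only: the thread does not say which numbering the site actually uses, and `image_to_index` and `index_to_image` are hypothetical names.

```python
def image_to_index(pixels, colours=4096):
    """Read a flat list of colour numbers as the digits of one big integer."""
    index = 0
    for colour in pixels:
        if not 0 <= colour < colours:
            raise ValueError("colour out of range")
        index = index * colours + colour
    return index


def index_to_image(index, pixel_count, colours=4096):
    """Inverse of image_to_index: peel off base-`colours` digits."""
    pixels = []
    for _ in range(pixel_count):
        index, colour = divmod(index, colours)
        pixels.append(colour)
    return pixels[::-1]  # digits were collected least significant first


tiny = [0, 4095, 17, 2048]  # a hypothetical 2 x 2 image, one colour number per pixel
location = image_to_index(tiny)
print(location, index_to_image(location, len(tiny)) == tiny)
```

    Every image gets exactly one number and every number in range gives back exactly one image, so nothing needs to be kept on disk. For a full 640×416 image the number simply has close to a million digits.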

    #22433 Reply

    TheSkeptic

    For whoever read or will read this thread…

    I did the experiment I first posted about, but instead of some random image of colorful noise (susceptible to the “more jpeg” effect), I performed the experiment with an image of a small black square on a white background (if anyone wants to see the pic I can post it). The background was plain white, with every pixel the same color; the square was pitch black, with every pixel inside the square the same color; and around the black square there were some white pixels of a slightly different color than the background, a really slight tone variation.

    The testing:

    So I uploaded the image. The location of this image was at 906…815.

    I downloaded this image (it “should” be the same image as the one I first uploaded).

    I uploaded the “new” image. The location was at 240…833.

    Hmmm, different location.

    Repeat the process; Download the image (240…833), upload it.

    Now it showed up at 128…709. Different location again.

    Repeat the process.

    Now it showed up at 128…709. Same location.

    I did this two more times. 128…709 both.

    The analysis

    The subject image was really low in pixel diversity (plain white, plain black, some rebellious pixels around the square), so the jpeg effect shouldn’t really happen, but it did (the locations changed instead of remaining the same). However, it did just for the first 3 iterations. After the third, the location remained the same; it took 3 iterations of the experiment for the image to “stabilize”.

    What most probably happened was that those rebellious pixels gave room for the jpeg effect to occur (although very little). Had the image been cleaner, it would not have happened.

    I actually did the experiment with an image of white. Just white, plain white, and the location didn’t vary.
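
    The “stabilizing” behavior can be imitated with a toy model in Python. Everything here is an assumption made for illustration: the image is a single row of grey values, the lossy step is a small blur followed by snapping to 16 levels, and `lossy_round_trip` is a hypothetical stand-in for whatever the site and the download actually do. The blur is much cruder than real jpeg loss, so only the general pattern carries over.

```python
def quantize(value):
    """Snap an 8-bit grey value to one of 16 levels (stand-in for 4096 colours)."""
    return (value >> 4) * 17


def lossy_round_trip(row):
    """Toy lossy step: a 1-2-1 blur with clamped edges, then quantization."""
    last = len(row) - 1
    out = []
    for i, centre in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, last)]
        out.append(quantize((left + 2 * centre + right) // 4))
    return out


def round_trips_until_stable(row, max_trips=1000):
    """Count how many round trips change the row before it stops changing."""
    current = list(row)
    for trips in range(max_trips):
        nxt = lossy_round_trip(current)
        if nxt == current:
            return trips
        current = nxt
    return None  # did not settle within max_trips


plain_white = [255] * 5
off_white_speck = [255, 255, 250, 255, 255]  # one "rebellious" pixel

print(round_trips_until_stable(plain_white))      # 0
print(round_trips_until_stable(off_white_speck))  # 1
```

    In this model the plain white row never moves, while the row with one off-white pixel changes once and then settles, which is the same pattern as in the test above.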

    The conclusion

    When performing the experiment, the images undergo a type of “needs more jpeg” effect, which is the reason why the “same” image shows up at different locations.

    The “cleaner” the image, the fewer iterations it takes until it stops changing locations, because there is less room for the jpeg effect to happen.

    The babel image archive and image search function are not fake (or have not been proved fake for the moment, and may or may not be fake, as proving either can be hard). The experiment that I claimed “proved” these to be “fake” didn’t take into account the flaw of the jpeg effect, which is what tampered with the results. The proof that I first posted is not valid and it has been debunked.

    May you try your luck at proving the site to be either a farce or not.
    May you keep using the site under the assumption of your choice.
    May you enjoy sinking your free time looking at colorful noise.
    May you enjoy sinking your free time trying to find a meaningful image in the void of infinity.

    Stay skeptic.

    #22435 Reply

    TheSkeptic

    Still, I have the question of why all meaningful images (found with the search function) are at locations around 1,000,000 characters long, that is, numbers on the order of 10^1000000. Aren’t there any meaningful images before such huge locations?

    When you enter the slide show, it loads a random image at a location 16 characters long. It’s never any longer or any shorter.

    Are there any relevant images in the 10^16 range or are they all far away?

    That’s my big concern, and my big doubt about the authenticity of the image archive. I will try to do some testing in this area because I really want to find some cool image but if they are all in the million character realm… well… it would be hard. I hope someone joins the cause. Any experiment you think could prove this, or any proof you have would be appreciated.

    Thank you all

    #22456 Reply

    Micah

    The reason your searched pics are at locations with the highest possible number of digits (almost one million) is that most of the pics are there. Around 99% of the images have either that highest possible number of digits or the second-highest.

    #22458 Reply

    TheSkeptic

    That’s my problem. If 99% of meaningful images are there (1% is a lot, don’t get fooled; I would say it’s more like 99.9999999…% of images though), then I most probably will never find any cool image, which is what I want (and many others too, I think). I still most probably wouldn’t find any cool image if the meaningful images were distributed equally or something like that, but the way we assume they are distributed right now takes the probability from impossible to even more impossible.

    I really just want to find something. I’d be happy with something other than noise. Though I most likely won’t find anything.

    #22878 Reply

    Skeptik-Hueptik

    I think the reason for the conversion to the jpeg format is the need to save many images on the server’s drive. And it is fake. If it is not fake, then it is a GREAT thing! The links to images have too few digits. It would be the best file compressor in human history!

    sorry for my bad english

    #23189 Reply

    Doomfrost

    I wonder if this website is storing images rather than generating them.

    1. User uploads an image which is then converted to jpeg and stored on a server.
    2. The website assigns this image a unique string of numbers.
    3. If the sequence of numbers is called it links to the file on the server.
    4. If a sequence of numbers doesn’t have an image stored on the server the website generates a random image.

    #26088 Reply

    Jacob

    Sorry for bumping this after such a long time, but it didn’t seem to resolve TheSkeptic’s confusion. Micah did a good job saying that 99% of the images will fall in the last two orders of magnitude (max length of numbers or one less than the max length), but I think it is good to elaborate a little more.

    Say you have numbers ranging from 1 to 9999. 99% of them are going to have at least three digits. This just means that the numbers 100 through 9999 are 100x more numerous than the numbers 1 through 99; it doesn’t mean that a picture is more likely to occur at the end than at the beginning. 100 is certainly not towards the end. It just looks like that on a logarithmic scale. Every time you increase by a factor of 10, you get 10 times as many values with that length. So between 1 and 999, about 90% of the numbers are in the hundreds, 9% are in the tens, and only 1% are single digits. Add the numbers 1000 through 9999, and it reduces the proportion of the previous ones by a factor of 10. However, grouping up numbers like this is sort of meaningless. It is just as easy to group them as 1-9900 and 9901-9999. If you group them like this, 99% of the numbers are in the first group, and only 1% are in the last group. Grouping numbers by length is just as arbitrary.
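
    The 1 to 9999 example is small enough to count directly. This short Python sketch tallies the numbers by digit length; `count_by_digit_length` is just an illustrative helper name.

```python
from collections import Counter


def count_by_digit_length(upper):
    """Count how many integers in 1..upper have each number of digits."""
    return Counter(len(str(n)) for n in range(1, upper + 1))


counts = count_by_digit_length(9999)
total = sum(counts.values())

for length in sorted(counts):
    share = counts[length] / total
    print(f"{length} digits: {counts[length]} numbers ({share:.1%})")

# Share of numbers with at least three digits.
print((counts[3] + counts[4]) / total)
```

    The counts come out as 9, 90, 900, and 9000, so each extra digit holds ten times as many numbers as the one before. Long locations dominate for that reason alone, without images being bunched up “at the end”.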

    Just to clarify, if there are 10^961755 images, and you are looking at the 10^961752nd image, you are less than 1% of the way done looking at all the images. The fun thing is that it is just as likely for a given image to be the first, last, 100th, or googolth image. However, if you group up all the static noise images into one group, they vastly outnumber all the images that can be interpreted as looking like something. I guess the silver lining is that once you have gone through the first 1% of all images, you have probably seen everything there is to see. (Each image has countless slight variations, one pixel difference here, two pixel differences there, etc., but even a trillionth of this archive is HUGE.)

    But then to get rid of that silver lining, if you were to spend your entire life digging through the images, you will almost certainly not find anything but static. To try to give a sense of scale, digging through these images for your entire life makes less of a dent than removing an atom from the universe, or even removing a googolth of a googolth of a … (maybe 10,000 times more) … of a googolth of an atom from the universe. Even though I don’t think anyone will ever find anything, I think this archive is really cool, and it is mind-blowing that basically every 640×416 image imaginable is inside it. All you need is the lucky number. You’re just more likely to pick the winning lottery numbers every day for the rest of your life than you are to pick one that corresponds to a meaningful image here.

    #26450 Reply

    SilverLining

    To bring back the silver lining, what if someone could ramp up the viewing of the images across hundreds or thousands of 4k monitors, each starting at a different point, and do some image processing to determine which images are suitable candidates for being “not just noise”? How long then until a meaningful image is found? Humans would only have to look at the candidates with a good amount of definition instead of filtering through m/b/tr/quadrillions of images just containing noise.

    You wouldn’t even need physical monitors – there are tools that allow you to create virtual monitors of large resolutions (no idea how many could be created at once). As each image takes 3 seconds to load, we are more limited by the slideshow time than processing power.

    If we wrote a simple edge-detection program and let it run for years/decades, how many candidates would we be likely to get?

    #26451 Reply

    SilverLining

    Following on from my previous post, just applying a Gaussian blur with a ~30 pixel radius to a noisy image results in essentially a totally grey image. If we could define a “grey tolerance” we could come up with a pretty simple filter for potentially-interesting images.
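
    A “grey tolerance” filter along these lines could be sketched in Python as follows. To stay dependency-free it averages 30×30 tiles instead of applying a true Gaussian blur, which is a cruder stand-in for the same idea. The tile size, the tolerance of 25 grey levels, and the function names are all assumptions picked for illustration, and the images are plain greyscale lists of rows rather than real slides.

```python
import random


def block_means(image, block):
    """Average each block x block tile of a greyscale image (list of rows)."""
    height, width = len(image), len(image[0])
    means = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            total = sum(
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            means.append(total / (block * block))
    return means


def looks_like_noise(image, block=30, grey_tolerance=25):
    """True if every tile averages out to roughly the same grey."""
    means = block_means(image, block)
    return max(means) - min(means) <= grey_tolerance


rng = random.Random(0)  # fixed seed so the example is repeatable
noise = [[rng.randrange(256) for _ in range(60)] for _ in range(60)]
half_and_half = [[0] * 30 + [255] * 30 for _ in range(60)]  # black left, white right

print(looks_like_noise(noise))
print(looks_like_noise(half_and_half))
```

    Uniform noise averages out to nearly the same grey in every tile, so it is discarded, while an image with large light and dark regions fails the tolerance check and would be kept as a candidate for a human to look at.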
