How to recover information from pixelized screenshots using Depix with Python

Learn how Depix works, a tool that helps you recover information from pixelized screenshots.

Yeah well, why would you try to obtain information from the text of a picture that has been pixelized with an image editor? Normally, you would be scared if you knew that the information you're trying to hide could be recovered by some malicious person. However, such a thing is impossible, right? ... Right? A couple of months ago, Sipke Mellema, an information security consultant, showed that it is possible to determine the text of an image that has been pixelized through a deterministic algorithm.

Depix is an awesome and innovative tool for recovering data from pixelized screenshots that could probably scare the hell out of you. It works on images that were pixelated with a linear box filter. For more information about Depix, please visit the official repository on GitHub.
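To make the box filter idea concrete, here is a minimal sketch in plain Python. It is not the code Depix or any image editor uses. It assumes a grayscale image stored as a list of rows and simple integer averaging, and the function name pixelate is made up for this example. The point is that every cell is processed on its own and the same input always produces the same output, which is what makes the pixelation attackable.

```python
def pixelate(gray, block):
    """Replace every block x block cell of a grayscale grid by its average value.

    Simplified illustration: real editors may average in another color space.
    """
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for top in range(0, height, block):
        for left in range(0, width, block):
            bottom = min(top + block, height)
            right = min(left + block, width)
            # Collect the pixels of this cell (cells at the border may be smaller).
            cell = [gray[y][x] for y in range(top, bottom) for x in range(left, right)]
            average = sum(cell) // len(cell)
            # Paint the whole cell with the single averaged value.
            for y in range(top, bottom):
                for x in range(left, right):
                    out[y][x] = average
    return out


image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 10, 20],
    [255, 255, 30, 40],
]
pixelated = pixelate(image, 2)
for row in pixelated:
    print(row)
```

Running pixelate twice on the same image gives identical cells, so an attacker who can reproduce the original rendering can pixelate candidate text and compare the results block by block.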

How to use it

You need to extract the pixelated blocks from the pixelated screenshot, cropping only the text.
You will need a picture with the same font, font size, color, and style that was used before the pixelation; this increases the chance of obtaining something that makes sense. This picture must contain a De Bruijn sequence of all the characters that could appear in the pixelized image.
Run Depix with the 2 input images that you have.
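If you need to build the text for your own search image, the De Bruijn sequence from the second step can be generated with a few lines of Python. This is only a sketch: it uses the standard recursive construction, the function names de_bruijn and search_text are made up for this example, and the choice of order 2 (every possible pair of characters appears once) is an assumption, not something taken from the Depix source. You would then type the resulting text in the target editor and take a screenshot of it.

```python
def de_bruijn(alphabet, order):
    """Return a cyclic De Bruijn sequence over `alphabet` for words of length `order`."""
    k = len(alphabet)
    a = [0] * (k * order)
    sequence = []

    def db(t, p):
        if t > order:
            if order % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


def search_text(alphabet, order=2):
    """Unroll the cyclic sequence so every word also appears in a plain line of text."""
    cyclic = de_bruijn(alphabet, order)
    return cyclic + cyclic[:order - 1]


# Small alphabet for demonstration; a real search image would use all expected characters.
text = search_text("abc")
print(text)
```

With a full alphabet the text gets long quickly (the cyclic sequence has as many characters as there are possible words), which is why it pays off to limit the alphabet to the characters you actually expect.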

Keeping that in mind, let's get started with the simplest example to show how this library works. Start by cloning the source code of Depix with git:

git clone https://github.com/beurtschipper/Depix.git

You need to install Pillow before using Depix. If it's not installed on your system, install it with the following command (note that you need Python):

python -m pip install --upgrade Pillow

The following command runs the default test included in the repository. Switch to the directory where you cloned the source code and run it:

python depix.py -p "./images/testimages/testimage3_pixels.png" -s "./images/searchimages/debruinseq_notepad_Windows10_closeAndSpaced.png" -o output.png

The input image, set with the -p parameter, is the following one (testimage3_pixels.png):

The original version of the image is the following one, so you have a reference of what the output should look like:

Then, specify the source of characters (debruinseq_notepad_Windows10_closeAndSpaced.png) with -s, which is the following image:

So, with the parameters that we provided, the output image (output.png) created by Depix will be the following one:

Incredible, isn't it? You can identify the "Hello from the other side" text that was originally in the image.

What this tool isn't

Depix isn't a magical solution that will automatically discover text pixelated by any tool in 100% of cases. There are a lot of scenarios where it simply doesn't work with the current logic and never will (thank god, for the sake of people's privacy). For example, suppose that instead of pixelating the image with the Python script included in the repository, you decide to use an external tool, let's say Paint.NET. Using a pixelation scale of 4 out of 100, the pixelated image will be the following one:

Trying to extract the information from this pixelated image with Depix generates the same pixelated image, even though the dictionary of characters has the same font, style, color, and so on, the screenshot was made with the same tool, and we even included the same phrase in the image of the characters. There are other cases where there is clearly nothing to be done. For example, how would you be able to extract the information from a picture with the same text but with a larger cell size?

Now you get it, right? It also depends on the type of pixelation that was originally used. Some tools, instead of using the selected pattern, simply inject random black and white pixels, so there's no entry point.
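To put a number on the cell size problem, here is a deliberately simplified sketch. It assumes pure black and white pixels and plain integer averaging, and the function name distinct_averages is made up for this example. It counts how many different pixel patterns fit in a cell and how many different averaged values they can produce.

```python
from itertools import product


def distinct_averages(cell_pixels):
    """Count black/white patterns of a cell and the distinct averages they produce."""
    patterns = list(product((0, 255), repeat=cell_pixels))
    averages = {sum(pattern) // cell_pixels for pattern in patterns}
    return len(patterns), len(averages)


for cell_pixels in (4, 9, 16):
    patterns, averages = distinct_averages(cell_pixels)
    print(f"{cell_pixels} pixels per cell: {patterns} patterns, {averages} distinct averages")
```

The number of possible patterns grows exponentially with the cell size, while the number of averaged values grows only linearly. The bigger the cell, the more different originals collapse into the same block, and the less there is to match against.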

And please, don't get me wrong: the library is totally awesome and does its job incredibly well for such a complicated topic as extracting information from where there shouldn't be any. Under certain conditions and environments, you will surely find a way to achieve something with Depix.