Date: 2024-02-17

Services like YouTube offer unlimited video uploads, and that makes a lot of sense. I'd wager most people wouldn't pay to upload on YouTube, especially given that most videos on YouTube get a small number of views. What if I told you that there was a way to convert your files to videos for uploading to YouTube, and that you could then decode them later?

To be clear, this is not something you should do seriously, and you definitely shouldn't rely on YouTube as a storage medium. Not only do the files end up bigger than when you started (the opposite of compression), but it's also something that YouTube could crack down on in the future if people did it in droves. However, this article will hopefully serve as an interesting dive into the way files are stored, and the wild and wacky ways they can be distributed.

The source code for the programs I created to encode and decode files into videos is below, if you want to give it a try yourself! With the power of technology, I successfully turned a 7KB image into a 9MB video!

We do not recommend using YouTube as a file storage medium. It can be insecure, and it is an incredibly impractical and inefficient way to store or retrieve files.

Understanding files

A basic primer on how files are stored

At its core, a file is a sequence of bytes, and when it is split, this sequence is divided into smaller, manageable segments. This division is done in such a way that each segment is an exact, contiguous subset of the original file's byte sequence. The process is inherently lossless, meaning it does not alter the content of the bytes themselves. As long as these segments are reassembled in the correct order, the original file can be perfectly recreated.

The technique that we'll use to split files up into individual "frames" of a video is to operate on files at the binary level; that is, treating files as mere collections of 1s and 0s. This binary consistency means that the process is agnostic to the file's content—be it text, image, video, or any data type. Splitting and merging directly interact with binary data, ensuring that operations are universally applicable and content-independent. By avoiding any compression or encoding that would alter the data, we can ensure a perfect reconstruction of the original file, preserving its exact state without loss of information or quality.

Maintaining the correct sequence of the split parts is the most important part of the process. In the case of our videos, we'll ensure that file parts are stored frame by frame in a contiguous line, so that the first frame is the first part, the second frame is the second part, etc.
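
To make the splitting idea concrete, here is a minimal sketch in Python. It is not taken from the project's source code; the function names `split_bytes` and `join_chunks` and the 300-byte chunk size are made up for illustration. It shows that cutting a byte sequence into contiguous chunks and joining them in the same order gives back exactly the same bytes, while joining them in the wrong order does not.

```python
import hashlib


def split_bytes(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut data into contiguous chunks; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_chunks(chunks: list[bytes]) -> bytes:
    """Reassemble chunks in the order they are given."""
    return b"".join(chunks)


# Stand-in for a real file's contents: 1,024 bytes.
payload = bytes(range(256)) * 4

chunks = split_bytes(payload, 300)   # chunk lengths: 300, 300, 300, 124
restored = join_chunks(chunks)

# Same order in, same bytes out: the hashes match.
same_hash = hashlib.sha256(restored).digest() == hashlib.sha256(payload).digest()

# Wrong order in, different bytes out.
shuffled = join_chunks(list(reversed(chunks)))
```

Here `same_hash` is `True` and `shuffled` differs from `payload`, which is why the frame order matters so much.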

Trying out QR codes

QR codes didn't work

I started by investigating whether I could use QR codes. The idea was simple: cut a file into "chunks" that could each be stored in an individual QR code, stitch those codes together into a video, and then decode the video afterward. Python has a QRCode library that I thought would work well, and while it technically did, it had some problems. Most notably, the most data I could fit into a single QR code was 2.9KB, and that was with QR code version 40, the largest version. For reference, these are massive QR codes. When decoding, the QR code reader in the Python library had trouble validating the codes, which I learned was a result of video compression and the small pixels that the QR code uses.

With that, I was back to square one. If I decreased the size of the QR code (such as using version 20), each code would hold less data, so the video would end up too large, and it would take a lot of time to encode files into hundreds of QR codes. I had to come at this from another angle, and I had a much better idea: a custom format that could fit much more data into a single frame.
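
For a sense of scale, here is a quick back-of-the-envelope calculation. It assumes the 2.9KB figure corresponds to 2,953 bytes, which is the byte-mode capacity of a version 40 QR code at the lowest error-correction level, and that each QR code becomes one video frame. The helper name `qr_codes_needed` is hypothetical; the two file sizes are the 7KB image and the 100MB file mentioned elsewhere in this article.

```python
# Assumed capacity: version 40 QR code, byte mode, lowest error correction.
QR_V40_BYTES = 2953


def qr_codes_needed(file_size: int, capacity: int = QR_V40_BYTES) -> int:
    """Number of QR codes needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // capacity)


small_image = qr_codes_needed(7_000)        # 3 codes
large_file = qr_codes_needed(100_000_000)   # 33,864 codes
```

Tens of thousands of maximum-size QR codes for a single 100MB file gives a feel for why this approach gets slow and unwieldy.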

Building our own format for storing data

Using pixels as 1s and 0s

[Embedded video]

Instead of trying to use an already-established format, I opted instead to just... make my own. It's obviously nothing special, but on a very basic level, I created a format capable of holding 0.26MB per frame. For every frame of the video, a black pixel represents a 1, and a white pixel represents a 0. A 1920x1080 video has 2,073,600 pixels per frame, which means 2,073,600 potential bits of information, or 259,200 bytes (0.2592MB).
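
The arithmetic behind that number is easy to check. The snippet below is just a worked calculation, with 1MB taken as 1,000,000 bytes; `frames_needed` is a hypothetical helper that estimates how many frames a file of a given size would occupy at one bit per pixel.

```python
WIDTH, HEIGHT = 1920, 1080

bits_per_frame = WIDTH * HEIGHT                      # 2,073,600 pixels, one bit each
bytes_per_frame = bits_per_frame // 8                # 259,200 bytes
megabytes_per_frame = bytes_per_frame / 1_000_000    # 0.2592 MB


def frames_needed(file_size: int) -> int:
    """Frames required for file_size bytes at one bit per pixel (ceiling division)."""
    return -(-file_size // bytes_per_frame)


frames_for_100mb = frames_needed(100_000_000)        # 386 frames
```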

I wrote a Python program that generates canvases of 1s and 0s by reading the binary data of a file. When it fills all of the pixels in a canvas, it adds that canvas to a list of frames and moves on to the next. Once done, it stitches the frames together, one after another, into an MP4, which a decoder can then take and reverse the process. The above video is the result of running this on a ZIP file of the US census data from 2000 to 2020. The encoder can also add music, seeing as it makes no difference to the video quality; this is disabled in the video above, but can be enabled in the source code.
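
Here is a simplified sketch of that encode and decode loop, using plain Python lists so it runs without any imaging or video libraries. It is not the published source code: the function names, the tiny 8x2 "frame" size, and the grayscale values (0 for black, 255 for white) are assumptions for illustration, and the step that writes frames into an MP4 is left out entirely.

```python
BLACK, WHITE = 0, 255  # black pixel = 1, white pixel = 0


def bytes_to_bits(data: bytes) -> list[int]:
    """Expand each byte into 8 bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_frames(bits: list[int], width: int, height: int) -> list[list[list[int]]]:
    """Fill frames row by row; the final frame is padded with 0 bits (white)."""
    per_frame = width * height
    frames = []
    for start in range(0, len(bits), per_frame):
        chunk = bits[start:start + per_frame]
        chunk = chunk + [0] * (per_frame - len(chunk))
        pixels = [BLACK if bit else WHITE for bit in chunk]
        frames.append([pixels[r * width:(r + 1) * width] for r in range(height)])
    return frames


def frames_to_bytes(frames: list[list[list[int]]]) -> bytes:
    """Reverse the process: read pixels in order and pack every 8 bits into a byte."""
    bits = [1 if px == BLACK else 0 for frame in frames for row in frame for px in row]
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


frames = bits_to_frames(bytes_to_bits(b"Hi!"), width=8, height=2)
decoded = frames_to_bytes(frames)
```

With 24 bits of input and 16 pixels per frame, this produces two frames, and `decoded` comes back as `b"Hi!\x00"`: the original three bytes plus one extra zero byte from the white padding in the last frame.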

There are drawbacks to this method though, with the biggest being that compression will make it difficult to get data back. To get around this, I modified my encoder to make each 1 and 0 a block five pixels across and five pixels down. This significantly decreases the amount we can store per frame, knocking it down to about 10KB, but makes it much easier to retrieve data even when a video is compressed. That's still more than three times what each QR code was capable of, so it's still a win.
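
As a rough illustration of what the 5x5 change does, the sketch below computes the new per-frame capacity and expands a small grid of bits into pixel blocks. The names `block_capacity_bytes` and `upscale` are hypothetical, and the pixel values (0 for black, 255 for white) are an assumption; the real encoder may do this differently.

```python
BLOCK = 5  # each bit becomes a BLOCK x BLOCK square of pixels


def block_capacity_bytes(width: int, height: int, block: int = BLOCK) -> int:
    """Whole bytes that fit in one frame when each bit occupies a block x block square."""
    return (width // block) * (height // block) // 8


def upscale(bit_grid: list[list[int]], block: int = BLOCK) -> list[list[int]]:
    """Turn a grid of bits into pixel rows: 1 -> black (0), 0 -> white (255)."""
    pixel_rows = []
    for bit_row in bit_grid:
        pixel_row = []
        for bit in bit_row:
            pixel_row.extend([0 if bit else 255] * block)
        # Repeat the widened row so every block is also `block` pixels tall.
        pixel_rows.extend(list(pixel_row) for _ in range(block))
    return pixel_rows


capacity = block_capacity_bytes(1920, 1080)   # 384 x 216 blocks = 82,944 bits = 10,368 bytes
pixels = upscale([[1, 0]])                    # 5 rows of 10 pixels
```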

I struggled to get some data back from YouTube thanks to its compression, but I was able to verify in a hex editor that the metadata and file data were in place. Checking the integrity of the ZIP file revealed that it wouldn't unzip because the file size didn't exactly match. I'm sure this script can be refined over time to add redundancy too, such as generating two frames for every "chunk" of information and averaging the data between the two.
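
One way the two-frame idea could work is sketched below. This is only a guess at an approach, not something the current script does: each chunk is assumed to be written to the video twice, the decoder averages the grayscale value of each pixel across the copies, and anything darker than a threshold of 128 is read as a 1. The function name and the sample pixel values are made up.

```python
def average_copies(copies: list[list[list[int]]], threshold: int = 128) -> list[list[int]]:
    """Average each pixel across duplicate frames, then read dark as 1 and light as 0."""
    height, width = len(copies[0]), len(copies[0][0])
    bits = []
    for y in range(height):
        row = []
        for x in range(width):
            mean = sum(copy[y][x] for copy in copies) / len(copies)
            row.append(1 if mean < threshold else 0)
        bits.append(row)
    return bits


# The intended bits are 1, 0, 0, 1. In the first copy, the third pixel was
# darkened to 100 by compression, so on its own it would read as a 1.
copy_a = [[10, 240, 100, 20]]
copy_b = [[30, 250, 230, 90]]

recovered = average_copies([copy_a, copy_b])   # [[1, 0, 0, 1]]
```

Averaging only helps when the two copies are damaged differently, and it doubles the size of an already bloated video, so it's a trade-off rather than a fix.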

As it stands, the best safeguard I could build into each frame was to calculate an "average" value of 0 or 1 for each 5x5 block of pixels. This worked for standard MP4 compression, but YouTube is a different beast. I'm not the first to do this, mind you. Other tools out there use more advanced techniques to do the same thing, such as this application written in Rust, but YouTube's compression can mess up those results, too.
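
The block-averaging step can be sketched like this. It's an approximation of the idea rather than the project's decoder: pixels are assumed to be grayscale values from 0 (black) to 255 (white), the threshold of 128 is an arbitrary midpoint, and `decode_blocks` is a hypothetical name.

```python
def decode_blocks(pixel_rows: list[list[int]], block: int = 5, threshold: int = 128) -> list[list[int]]:
    """Average each block x block square; a dark average reads as 1, a light one as 0."""
    height, width = len(pixel_rows), len(pixel_rows[0])
    bit_grid = []
    for top in range(0, height - block + 1, block):
        bit_row = []
        for left in range(0, width - block + 1, block):
            total = sum(
                pixel_rows[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            mean = total / (block * block)
            bit_row.append(1 if mean < threshold else 0)
        bit_grid.append(bit_row)
    return bit_grid


# A mostly dark block with two pixels brightened by compression artifacts...
dark = [[30] * 5 for _ in range(5)]
dark[0][0], dark[4][4] = 220, 180
# ...next to a mostly light block with one pixel darkened.
light = [[230] * 5 for _ in range(5)]
light[2][2] = 40

frame = [d + l for d, l in zip(dark, light)]   # 5 rows of 10 pixels
bits = decode_blocks(frame)                    # [[1, 0]]
```

The averages here are 43.6 and 222.4, so both blocks still decode correctly despite the damaged pixels. Heavier compression smears many pixels at once, which is presumably where this stops being enough.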

With smaller files, I was able to retrieve the data without any problems. For example, I turned a regular image into a video, which I was able to upload to YouTube, download back from YouTube, and turn back into the original file. The SHA-256 hashes of the two files came back different, meaning the files aren't byte-for-byte identical, but the image could still be opened. I suspect part of the reason is that the last frame of the video has a lot of white space, which the decoder then interprets as a run of 0s rather than the end of the file. I was able to confirm this in a hex editor.
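
If trailing padding is indeed the culprit, one common fix is to record the file's length up front so the decoder knows where to stop. The sketch below assumes a hypothetical layout with an 8-byte big-endian length header in front of the data; whether the existing metadata already stores something like this isn't covered here, so treat it as one possible approach.

```python
import hashlib
import struct

# Hypothetical header layout: one unsigned 64-bit big-endian integer holding the payload length.
HEADER = struct.Struct(">Q")


def wrap(payload: bytes) -> bytes:
    """Prefix the payload with its length before encoding it into frames."""
    return HEADER.pack(len(payload)) + payload


def unwrap(decoded: bytes) -> bytes:
    """Read the length header and return exactly that many bytes, dropping any padding."""
    (length,) = HEADER.unpack_from(decoded)
    body = decoded[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("decoded data is shorter than the header claims")
    return body


original = b"example image bytes"

# Simulate a decoder that read 37 extra zero bytes from the white space in the last frame.
decoded_with_padding = wrap(original) + b"\x00" * 37
recovered = unwrap(decoded_with_padding)

hashes_match = hashlib.sha256(recovered).hexdigest() == hashlib.sha256(original).hexdigest()
padded_hash_differs = (
    hashlib.sha256(original + b"\x00" * 37).hexdigest() != hashlib.sha256(original).hexdigest()
)
```

Both `hashes_match` and `padded_hash_differs` are `True`: the trimmed file hashes the same as the original, while the padded one does not, which matches the behavior described above.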

Running our encoder and decoder

We have a working solution you can try out

If you want to give it a try, I've published my source code on GitHub for you to use! You need to specify the input and output file names in the source code, and it only runs in a single thread. It's also fairly RAM intensive: a 100MB file used more than 100GB of swap on my M1 Pro MacBook... before failing.
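
If you want to experiment with the memory usage, one direction worth trying is to produce frame data one frame at a time instead of collecting everything in a list first. The generator below is a sketch of that idea under my own assumptions; `iter_frame_payloads` is a hypothetical name, and I haven't verified that this alone would fix the swap usage.

```python
import io
from typing import BinaryIO, Iterator


def iter_frame_payloads(stream: BinaryIO, bytes_per_frame: int) -> Iterator[bytes]:
    """Yield one frame's worth of bytes at a time, zero-padding the final chunk."""
    while True:
        chunk = stream.read(bytes_per_frame)
        if not chunk:
            break
        yield chunk.ljust(bytes_per_frame, b"\x00")


# An in-memory stream stands in for a file opened with open(path, "rb").
stream = io.BytesIO(b"abcdefghij")
payloads = list(iter_frame_payloads(stream, 4))
```

In this toy run, `payloads` is `[b"abcd", b"efgh", b"ij\x00\x00"]`. In a real encoder, you'd render and write each frame inside the loop rather than calling `list()`, so only one frame needs to be in memory at once.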

Keep in mind that it's not a perfect program, and this is more of a proof of concept of what you can do with files. There are always creative ways to store files, and this is certainly one of them. It'll work out of the box, but you may run into problems at times using it with YouTube. If you do, feel free to make changes and see if you can get it working!