A couple of days ago, a video, ad hoc named 11B-X-1371 and containing some hidden puzzles, went viral. See the original source or the Reddit thread for more information on that. Most of these puzzles aren't very sophisticated (though some seem to be), but never mind. Some of them were hidden within the audio track, and that's pretty neat. It could be done better, though, and that's what I'd like to cover here.
Using sound to store miscellaneous data is of course nothing new (take a look at old analog modems or the ZX Spectrum filesystem, for example), but this technology has seen better days. However, a whole new field of applications might have been opened up by steganography.
I've written a Python script for encoding images into sound files whose spectrograms look like the input images. Let's take the eye picture from the header of this page and encode it into a WAV file. First, let me convert it into a proper format: a Windows 24-bit BMP file. I also reduced its size and added a secret message.
$ ./imageEncode -i eye.bmp -o eye.wav
Input: eye.bmp
Output: eye.wav
Pixel per second: 15
Max amplitude per sample: 300
Image Width: 200 Height: 106
Frequency Interval: 186.792452830189
Samples per Pixel: 2940
Generating wave file
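The numbers in this output fit together with some simple arithmetic. The sketch below reproduces them, assuming a 44100 Hz sample rate and a 19800 Hz usable frequency range. Neither value is printed by the script; both are inferred from the figures above, so treat them as assumptions.

```python
# Assumed values (not printed by the script, inferred from its output).
SAMPLE_RATE = 44100        # samples per second
FREQUENCY_RANGE = 19800.0  # Hz covered by the image, bottom to top

# Values taken from the output above.
pixels_per_second = 15
width, height = 200, 106

samples_per_pixel = SAMPLE_RATE // pixels_per_second   # samples per image column
frequency_interval = FREQUENCY_RANGE / height          # Hz per image row
duration = width / pixels_per_second                   # seconds of audio

print(samples_per_pixel)              # 2940
print(round(frequency_interval, 6))   # 186.792453
print(round(duration, 2))             # 13.33
```

In other words, each image column becomes 2940 samples of audio, each image row gets a slice of the spectrum roughly 187 Hz tall, and the whole picture lasts a little over 13 seconds.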
Let's take a look at the file's spectrogram (I use Sonic Visualiser for that):
It works just fine, but the WAV file is pretty big. What about a compressed version? Definitely usable and only 200 KB in size. Let's listen to it:
It's actually hard to stand. Can we make it more bearable? Can we hide it within a song so people won't notice?
Doing that may be a bit tricky and involves a lot of experimenting, but in some cases it can be done. Listen to this piece of music (warning: it's not about its artistic value (sic!)).
Let's brutally merge our secret message with that music (320 kbps):
Maybe it would fit in some experimental electro song, but in this case it sounds just awful. It's easy to notice that we lost some valuable data at the bottom of the spectrum. Still, it's readable. Now, to hide our message better, let's sacrifice the bottom. What about -36 dB on the message track?
More losses, but data can still be stored successfully in the higher frequencies. And it's harder to notice that there's something wrong with the audio file (it just sounds like a low-quality recording).
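For reference, a gain in decibels corresponds to a linear amplitude factor of 10^(dB/20). A small sketch (the function name is just illustrative) shows what -36 dB, and the other gains tried in this article, do to the sample values:

```python
def db_to_amplitude(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)

for gain_db in (-10, -36, -40, -60):
    print(f"{gain_db} dB -> x{db_to_amplitude(gain_db):.4f}")
```

So -36 dB scales the message to under 2% of its original amplitude, and -60 dB to one thousandth.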
But hey, remember that we have two channels to use? We can, for example, pan our song 50% left and the message 100% right, and then analyze just the right channel with our software. I also applied -10 dB to the message track:
On speakers it sounds great, but on headphones you can probably still hear something strange in your right ear.
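Pulling a single channel out of a stereo file for analysis is straightforward. The following is a minimal sketch using only the standard library, assuming a 16-bit PCM WAV file; the function name is hypothetical and this is only an illustration, not how any particular spectrogram tool does it.

```python
import struct
import wave

def read_channel(path, channel):
    """Return one channel (0 = left, 1 = right) of a 16-bit PCM WAV as a list of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        n_channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())
    # Samples are interleaved: L0, R0, L1, R1, ...
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return list(samples[channel::n_channels])
```

The returned samples can then be written to a mono file or fed to whatever spectrogram software you prefer.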
Another try: I lowered the gain to a level at which I can't hear anything on my headphones when playing only the message channel (at a reasonable volume level, both channels centered). Let's take a look:
And still, much of the data has survived. By the way, MP3 cuts the frequency range off at about 20 kHz. If a proper codec is used, a lot of data can be stored in ranges that you won't even hear. This article, however, covers only the most popular ones.
This one is -60 dB on the message, music 100% left, message 100% right. This spectrogram is made only from the second channel.
No audio in the second channel? That's suspicious. Let's try -60 dB on the message, music 90% left, message 100% right.
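A mix like this one (message at -60 dB, music 90% left, message 100% right) can be sketched in a few lines. This assumes a simple linear balance law, where panning a track N% left leaves the left channel untouched and scales the right channel by 1 - N/100; a given audio editor may well use a different law. All names are illustrative.

```python
def pan_gains(pan):
    """Linear balance: pan in [-1.0, 1.0], -1 = fully left, +1 = fully right.

    Returns (left_gain, right_gain).
    """
    left = 1.0 if pan <= 0 else 1.0 - pan
    right = 1.0 if pan >= 0 else 1.0 + pan
    return left, right

def mix_stereo(music, message, music_pan, message_pan, message_gain_db):
    """Mix two mono sample lists into a list of (left, right) float pairs."""
    message_gain = 10 ** (message_gain_db / 20)
    music_l, music_r = pan_gains(music_pan)
    msg_l, msg_r = pan_gains(message_pan)
    mixed = []
    for m, s in zip(music, message):
        s *= message_gain
        mixed.append((m * music_l + s * msg_l, m * music_r + s * msg_r))
    return mixed

# Music 90% left, message 100% right, message at -60 dB.
stereo = mix_stereo([1000.0, -1000.0], [500.0, 500.0], -0.9, 1.0, -60)
```

With these settings the right channel carries 10% of the music plus the attenuated message, so it is no longer suspiciously silent. A real mixer would also clamp the result to the sample format's range before writing it out.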
Same audio settings, but output file saved to
And this is probably the best one I've created, but the experimenting has only just begun.
This secret message is 13 seconds long as audio. Let's say we have a 2-hour movie. We can store a lot of unnoticeable data there (though only unimportant data, since the methods described have too many problems to be used for serious purposes). And that's fun.
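To put a rough number on that: the message above is 200 pixels wide and was encoded at 15 pixels per second, which gives about 13 seconds. Here is a back-of-the-envelope estimate, assuming the same settings and images placed back to back with no gaps:

```python
pixels_per_second = 15
image_width = 200                    # pixels, as in the eye picture
movie_seconds = 2 * 60 * 60          # 2-hour movie

seconds_per_image = image_width / pixels_per_second
# Integer arithmetic avoids floating-point rounding in the division.
images_in_movie = movie_seconds * pixels_per_second // image_width

print(round(seconds_per_image, 2))   # 13.33
print(images_in_movie)               # 540
```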
I've rewritten the encoder script; it's available here: https://github.com/solusipse/spectrology. It's now possible to select the range of frequencies to be used, and all popular image formats are supported.
All the spectrograms above can be produced from files made with spectrology. Take a look at the spectrogram below. It was produced from an audio file made this way:
python spectrology.py test.bmp -b 13000 -t 19000
The message track's gain was lowered to -40 dB. Both tracks are centered. It's the best method I've tested: the image quality is great and no noise can be heard. If you'd like to test the methods described here, I would recommend this last one.
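To make the idea concrete, here is a minimal sketch of how an image can be turned into sound inside a chosen frequency band. Each image row gets its own sine frequency between the bottom and top limits, and each column becomes a short slice of audio in which the brightness of a pixel sets the amplitude of its row's sine. This is an illustration written with the standard library only, not the actual spectrology code; the function names, the tiny hard-coded image, and the defaults (44100 Hz sample rate, 15 pixels per second, amplitude 300 per tone) are assumptions.

```python
import math
import struct
import wave

def encode_image(pixels, bottom_hz, top_hz, sample_rate=44100,
                 pixels_per_second=15, max_amplitude=300):
    """Turn a grayscale image (rows of 0..255 values, top row first) into samples."""
    height, width = len(pixels), len(pixels[0])
    step = (top_hz - bottom_hz) / height          # Hz between neighbouring rows
    samples_per_pixel = sample_rate // pixels_per_second
    samples = []
    for x in range(width):
        # Rows that are lit in this column, with their angular step and amplitude.
        tones = []
        for y in range(height):
            if pixels[y][x]:
                freq = top_hz - y * step          # top row -> highest frequency
                tones.append((2 * math.pi * freq / sample_rate,
                              pixels[y][x] / 255 * max_amplitude))
        for n in range(samples_per_pixel):
            t = x * samples_per_pixel + n
            samples.append(int(sum(a * math.sin(w * t) for w, a in tones)))
    return samples

def write_wav(path, samples, sample_rate=44100):
    """Write mono 16-bit PCM, clamping values to the valid range."""
    clamped = [max(-32768, min(32767, s)) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(clamped), *clamped))

# A 3x4 test image: a bright diagonal on a black background.
image = [[255, 0, 0, 0],
         [0, 255, 0, 0],
         [0, 0, 255, 255]]
samples = encode_image(image, bottom_hz=13000, top_hz=19000)
```

Calling `write_wav("message.wav", samples)` (the file name is arbitrary) produces a file whose spectrogram should show the diagonal between 13 kHz and 19 kHz. Keeping the band that high is what makes the message hard to hear once it's mixed under music.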