Wednesday 1 July 2020

The Ethics of Sources - Day 3 - ffmpeg and hex editing


  FFmpeg and hex-editing 
 
Welcome back to part 3 of The Ethics of Sources. The original talk can be found here - Ethics of Sources Day 3

Following on from the previous blog, I mentioned a command-line trick for editing values in a text file: sed, short for stream editor, which, if you remember, looks something like the command below. If you want to find out what else sed is capable of, just type man sed in a Linux terminal, or there is a handy online page from those nice people at gnu.org here: Sed

  sed 's/-1/-2/g;s/0/32/g;' test.json > testhex.json

What this command does is search sequentially through a file from beginning to end. In the first instance it looks for the value -1 and replaces it with the value -2 (this part: s/-1/-2/g), then it replaces every 0 with 32 (this part: s/0/32/g). The output is then redirected (the > sign) to a new file, because redirecting the output back onto the original file results in a blank file. Redirection is useful for all sorts of reasons, not least that you can send the output to a different directory, which becomes handy if you are working on a lot of files (which I'm going to be covering in another talk).
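To see both substitutions at work without risking a real file, here is a minimal sketch. The contents of test.json below are made up for the demo; only the sed command itself comes from the text above.

```shell
# Make a small stand-in for test.json (contents invented for this demo).
printf '%s\n' '{"a": -1, "b": 0, "c": 10}' > test.json

# Two substitutions run in order on every line:
#   s/-1/-2/g  turns every "-1" into "-2"
#   s/0/32/g   turns every "0" into "32"
# The result goes to a NEW file; test.json itself is left untouched.
sed 's/-1/-2/g;s/0/32/g;' test.json > testhex.json

cat testhex.json   # {"a": -2, "b": 32, "c": 132}
```

Notice that the 10 became 132: sed matches characters, not numbers, so every 0 is swapped wherever it sits. That bluntness is exactly what we lean on when we point it at a video.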


Now you can use the same command to, say, mess up a video file, so let's go ahead and do that on one of the files we captured on Monday, adbreak.webm (notice the extension: this points to the file being WebM, here holding Google's VP8 codec, which is in fact what the Cheese webcam booth captures with). Navigate to the folder the file is in, open a terminal and issue this command:

  sed 's/-1/-2/g;s/0/32/g;' adbreak.webm > adbreaksed.webm 





We can open the end result in, say, VLC. As you can see, though, VLC won't play it, so let's try mpv. Yep, it plays, but not well, and there's not much information there. (Something else to consider is that different video players react differently to the same file: a hex-edited file may not play in VLC but will in mpv, but mpv may not show the file as it is; it may try to correct it so that it plays as it should, without damage.) WebM and VP8 can be quite finicky about what you do to them, so let's change some values. Remember, we are still treating it as a text file at this point; in some ways what we are doing is similar to the WordPad effect.

So that didn't work so well, but as we have the command in the terminal we can just flick it back up using the arrow key and change some values. Maybe these (note that the second substitution now swaps 0 for 0, so only the first one changes anything):

  sed 's/oa/ob/g;s/0/0/g;' adbreak.webm > adbreaksed.webm




When we play that back in VLC (and it plays this time, as VLC can actually read the file) you can see it's much better: more blocky and broken but still readable.
 
Which raises the question: how readable do we want a video (or image or sound piece) to be? Well, from my point of view, unless you are going for complete abstraction you need to have something left of the original to give people something to hold onto. But I also don't think people are going to watch a whole video, and I certainly don't make them with linear viewing in mind. Given the way that we exist online, our viewing is by its nature fragmentary: we don't just watch a whole movie, we watch a clip of it on YouTube, or a meme made from it, or even a still. I don't make videos intending them to be seen as a complete thing; they are part of the ocean of information we float in.

Given that we are treating the file as a text file, does that mean it's just a text file? Well, with this command-line script we are treating the video as a text file, and this does work, but not with any great subtlety. As a video file is much more than just a text file, we need to change the script.
  Hex-editing

Hex editing is one of the key methods that I use in my work. It's a way of manipulating a file, be it video or image or sound, in a way that reflects the true structure of the file: we change hexadecimal values while avoiding those we know are essential for the file to be readable (headers, footers etc.). We can look at and manipulate a file in a hex editor like Bless, and it's a good idea to begin with to look at how a file is constructed, just to get a feel for what to target and what to avoid.
But I prefer to use a command-line tool like xxd, which is fast, and because in the bash terminal I can recall a command from the history, it is easy to change values quite quickly on the fly. For example, let's take the previous video and make multiple changes to the file while it's playing in VLC on a loop.
And before we get into using xxd to edit the file, let's open a terminal and issue the command 'xxd -p adbreak.webm' to see exactly what a file looks like to xxd.
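If you want to get a feel for that plain hexdump before pointing it at a whole video, here is a small sketch on a four-byte file. The file tiny.bin and its contents are invented for the demo; swap in adbreak.webm to see the real thing.

```shell
# A 4-byte stand-in file (made up for this demo).
printf 'GLIT' > tiny.bin

# -p prints a plain hexdump: hex digits only, no offsets, no ASCII column.
xxd -p tiny.bin            # 474c4954

# -r -p reverses a plain hexdump back into bytes, so a round trip
# with nothing in between gives back an identical file.
xxd -p tiny.bin | xxd -r -p > roundtrip.bin
cmp tiny.bin roundtrip.bin && echo "identical"
```

Each pair of hex digits is one byte (47 is G, 4c is L and so on), and that stream of digits is what sed will be let loose on.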



Now let's use that command in a loop and look at the resulting file.



Now, at the end of all of that, let's bake that file with FFmpeg. (Although I suggested yesterday that HandBrake and MEncoder were the best methods for baking files, let's have a look at what using FFmpeg on this damaged file does. It might work; sometimes it does, sometimes it doesn't.)
Let's bake it with FFmpeg
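For anyone following along in text, a bake could look something like the sketch below. This is an assumption rather than the exact command used in the talk: the target codec (H.264 via libx264), the yuv420p pixel format and the output file name are all my own picks for illustration.

```shell
# bake: re-encode a glitched file so the damage becomes ordinary, playable frames.
# Assumptions (not from the talk): libx264 as the target codec and
# yuv420p so that most players will accept the result.
bake() {
    # -y overwrites an existing output file without asking
    ffmpeg -y -i "$1" -c:v libx264 -pix_fmt yuv420p "$2"
}

# Usage:
#   bake adbreakhex.webm adbreakbaked.mp4
```

If FFmpeg gives up on a badly damaged file, that is the "sometimes it doesn't" case, and HandBrake or MEncoder are the fallbacks.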





And let's look at it after baking.



Now what is that command doing? Let's break it down. The first part is fairly simple: xxd -p tells xxd to read the file adbreak.webm and write it to standard output as a plain hexdump. The | symbol pipes the output of xxd into whatever command or script comes after the pipe, which in this case is sed. sed replaces the values we tell it to, and its output is then fed through another pipe back into xxd (this time as xxd -r -p) to be turned back into its original binary form. Finally the redirector symbol > writes that into a new file we call adbreakhex.webm. It's very important when using this method not to write a file back onto itself, or you will end up with a zero-byte file and will lose that file unless you have made a copy elsewhere.
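Written out, the breakdown above has this shape. The sketch runs on a tiny made-up file (demo.bin) so you can see exactly what changes; the hex values 41 and 42 are arbitrary picks for the demo, not values recommended for video.

```shell
# Stand-in input so the sketch runs anywhere; in practice this is adbreak.webm.
printf 'header-AAAA-body-AAAA' > demo.bin

# 1. xxd -p     : file -> plain hexdump on standard output
# 2. sed        : swap hex values (41 is "A", 42 is "B")
# 3. xxd -r -p  : hexdump -> binary again
# 4. > file     : write to a NEW file, never the input
xxd -p demo.bin | sed 's/41/42/g' | xxd -r -p > demohex.bin

cat demohex.bin   # header-BBBB-body-BBBB
```

Two things worth knowing: sed matches hex digits wherever they fall, so a pattern can straddle two neighbouring bytes, and xxd -p wraps its output into lines of 30 bytes, so a pattern split across a line end is missed. For glitching that hardly matters, but it explains why the number of changes is not always what you expect.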
Now what goes between the two pipes is up to you: it could be sed, it could be something else. There are various things you could do. For instance, this is a script for batch pixel sorting:

  find . -type f -name '*.png' | while read -r filename; do echo "${filename}"; pxlsrt brute "${filename}" "/home/ian/texture2/${filename}" --min 20 --max 30 --vertical --smooth --reverse --method hue; done

It requires the installation of Ruby (the Ruby packages from Synaptic) plus this gem: https://github.com/GlitchTools/pxlsrt
This method is fast and clean, but you have to have an understanding of how each codec will break to use it effectively, so let's run through some codec types and see how they differ. I'm going to use the same file over and again, just for comparison.

Computational cost

Just as a side note, there is a computational cost with some of these codecs, in that some files take a lot longer to encode and need beefier hardware. Longer files can literally take hours, depending on the age and speed of your computer; for instance, msvideo1 will take a lot longer to encode than libxvid. A lot of the time, making glitch art is more about waiting around for the computer to finish an encoding run, so match the codecs you use to the computer you have. There is also size to take into account: how much hard-drive space do you have? rawvideo obviously takes a lot of space, but then so does ogv, and even webm when playing with high bitrates and quality.

As a side note, with libxvid and a few other codecs you can add to or subtract from the file, i.e. make it bigger or smaller, using this method. libxvid responds especially well to this, for instance:

  xxd -p adbreak.avi | sed 's/0808/808/g;' | xxd -r -p > adbreaklibxvidhex.avi
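To see how that swap shrinks a file, here is the same command on a six-byte stand-in. The file demo.avi and its contents are invented for the demo and it is not a real video.

```shell
# Six-byte stand-in (made up): 08 08 'A' 08 08 'B'. In practice this is adbreak.avi.
printf '\010\010A\010\010B' > demo.avi

# Each 0808 -> 808 swap deletes one hex digit, i.e. half a byte.
xxd -p demo.avi | sed 's/0808/808/g;' | xxd -r -p > demoshort.avi

wc -c < demo.avi        # 6 bytes in
wc -c < demoshort.avi   # 5 bytes out: two swaps removed two hex digits
xxd -p demoshort.avi    # 8084180842
```

Look at the last line: the A (41) has gone, because after the first swap the bytes that follow are read half a byte out of step until the next swap pulls them back. My guess, and it is only a guess, is that this slippage is part of why the result looks so much like datamoshing.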

So this is how different file formats look when using this method.

 First libschroedinger

 



 

H261



 

H265(hevc)


 
(This one is quite blocky and angular with a lot of horizontal break out striping and odd sideways stripes) 
Libxvid




Libxvid - interestingly, with this one I subtracted from the file, i.e. did a 0808 to 808 swap, reducing the size of the file, which led to something akin to datamoshing.
Msvideo1




Msvideo1 - is the dinosaur of codecs and Microsoft's original video format. I love this one: it's quite cartoony and you get these beautiful almost-echoes going on, and that strange pixellated early 90s feel.
Snow



 
Snow is possibly the most beautiful of all codecs when hex-edited, but it's also one of the most abstract: often the artifacts you get bear no relation to what's happening in the video or the movement involved.
Let's take a look at my workflow and really push a file to its limits. I'll use one of the files I downloaded the other day, transcode it from h264 to h261 using FFmpeg, hex-edit it using the examples I've given, transcode and bake it, then run it through Kaspar Ravel's tomato script that I talked about yesterday, and show real-time video of the processes as they happen and then some of the final result. 90% of glitch is waiting around for the computer to finish doing what you asked it to do. Sometimes it seems like it's doing nothing, and then you get to poke it and hope you haven't broken the process halfway through!
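The first step of that workflow, the transcode to h261, might be sketched like this. The AVI container, dropping the audio and the file names are my assumptions for illustration, not the exact settings used in the videos below.

```shell
# to_h261: transcode any input to H.261 inside an AVI container.
# H.261 only allows 176x144 (QCIF) or 352x288 (CIF) frames, so we scale to CIF.
# Assumptions (not from the talk): AVI as the container, audio dropped with -an.
to_h261() {
    # -y overwrites an existing output file without asking
    ffmpeg -y -i "$1" -c:v h261 -s 352x288 -an "$2"
}

# Usage:
#   to_h261 downloaded.mp4 source_h261.avi
```

From there the file goes through the xxd and sed pipeline, gets baked, and is then handed to tomato.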
 Realtime glitching 



Add fresh Tomato



 
Edit in Flowblade




And the final film finished (short version)

 



 

The final film ended up being about 4 hours long. An unexpected side effect of tomato is that it can often timestretch a film. I did edit it down to 50 minutes in Flowblade, but then the file ended up being over 1 GB in size, and on rural broadband that's just not going to be viable for upload (even over 4G), so I went with the original tomatoed file, hence the length, but I'm okay with that. The editor I'm using is called Flowblade (Linux only, as far as I know). It's my editor of choice; find it here: https://jliljebl.github.io/flowblade/

 

The next blog post will be about Dividing by Stills.






















