Friday, 6 October 2017

Querying video file metadata with mediainfo

I am working on a script that will query media files (mp4/mkv videos) to obtain metadata that can be subsequently used to rename the file to enforce a naming convention. I use the excellent mediainfo tool (available in the standard repositories) to do this.

mediainfo has a metric tonne of options and functions that you can use for various purposes. In my case I want to know the aspect ratio, height and video codec for the file. This can be done in a single command:

mediainfo --Inform="Video;%DisplayAspectRatio%,%Height%,%Format%" "${_TARGET}"

Here "${_TARGET}" holds the path to the media file. This works fine and returns something like this:

1.85,720p,AVC
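
For the renaming step, the three comma-separated values need to end up in separate variables. A minimal sketch of one way to do that in plain POSIX shell follows. The variable names (_INFO, _ASPECT, _HEIGHT, _CODEC) are placeholders of my own, and the sample string stands in for real mediainfo output:

```shell
# Sample value standing in for the mediainfo output shown above.
_INFO='1.85,720p,AVC'

# Split on commas into three (hypothetical) variables.
# Setting IFS on the same line limits the change to this read command.
IFS=, read -r _ASPECT _HEIGHT _CODEC <<EOF
${_INFO}
EOF

echo "aspect=${_ASPECT} height=${_HEIGHT} codec=${_CODEC}"
```

This assumes none of the three values ever contains a comma itself.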

When I say it works fine, I mean it works fine in 99% of cases. The other 1% is made up of files that contain more than one video stream. Sometimes people package a JPEG image inside the container, which is designated internally as "Video#2". In such cases the above command also returns the values for the JPEG image, appended directly to the first set with no separator, producing something like this:

1.85,720p,AVC1.85,720p,JPEG

When this happens my script breaks. The workaround is to pipe the results through some Unix tools to massage the output:

mediainfo --Inform="Video;%DisplayAspectRatio%,%Height%,%Format%\n" "${_TARGET}" | xargs | awk '{print $1;}'

There are a few things to note in the revised command. There is a newline ("\n") at the end of the --Inform parameters, which puts the unwanted data on a line of its own, like this:

1.85,720p,AVC
1.85,720p,JPEG

xargs, run without a command, echoes its input back on a single line, which replaces that line feed with a space:

1.85,720p,AVC 1.85,720p,JPEG

And finally awk prints only the first "word" (space delimited) from the result, which is the desired output:

1.85,720p,AVC
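
The post-processing can be tried without a media file to hand. The sketch below is only an illustration: sample_output is a made-up function that prints the two-line output shown earlier in place of a real mediainfo call. It also shows head -n 1 as a possible alternative, on the assumption that each stream always ends up on its own line:

```shell
# Hypothetical stand-in for the mediainfo call; prints one line per video stream.
sample_output() {
    printf '1.85,720p,AVC\n1.85,720p,JPEG\n'
}

# Same pipeline as above: join the lines, then keep the first word.
_FIRST=$(sample_output | xargs | awk '{print $1;}')

# Alternative: keep only the first line.
_FIRST_ALT=$(sample_output | head -n 1)

echo "${_FIRST}"
echo "${_FIRST_ALT}"
```

Both variables end up holding 1.85,720p,AVC for this sample input.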

Now obviously this method assumes that the first video stream in the container is the one we are interested in. I'm struggling to imagine a scenario where this would not be the case so at this point I am OK with that. If I find a file that doesn't work I might have to revise my script, but for now I will stick with this solution.
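
If I ever want the script to tell me when it has hit one of these multi-stream files, counting the lines before discarding them would be one way. This is a rough sketch only: sample_output again stands in for the real mediainfo call, and _STREAMS and _WARNING are names I have made up for the example:

```shell
# Hypothetical stand-in for the mediainfo call; prints one line per video stream.
sample_output() {
    printf '1.85,720p,AVC\n1.85,720p,JPEG\n'
}

# Count the non-empty lines, i.e. one per video stream reported.
_STREAMS=$(sample_output | grep -c .)

# Warn on stderr so the message does not pollute the script's normal output.
if [ "${_STREAMS}" -gt 1 ]; then
    _WARNING="warning: ${_STREAMS} video streams found, using the first"
    echo "${_WARNING}" >&2
fi
```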