Video Analysis With FFmpeg
FFmpeg is an open source software program that allows you to manipulate audiovisual files without the need for expensive editing tools. It can do everything from encoding to extracting metadata.
FFmpeg lets you modify many aspects of a file through command-line flags (options). For example, the -r flag sets the frame rate.
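As a rough illustration of how flags are combined, the sketch below assembles an ffmpeg command in Python that sets the output frame rate with -r. The function name and the file names are hypothetical and chosen only for illustration; -y, -i and -r are standard ffmpeg options.

```python
import shlex

def build_framerate_command(src, dst, fps):
    """Return an ffmpeg argument list that writes dst at a constant frame rate."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file without asking
        "-i", src,       # input file
        "-r", str(fps),  # output frame rate (frames are duplicated or dropped to match)
        dst,
    ]

# Hypothetical file names, for illustration only.
cmd = build_framerate_command("input.mp4", "output_24fps.mp4", 24)
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed.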
Basics
The digital humanities (DH) community is often focused on applying digital methodologies to audiovisual materials, and FFmpeg is an invaluable tool for those looking for ways to manipulate these materials. In this article, we will explore a few basic commands that help us understand how to work with these files.
To start off, let's load a sample media file. This sample is an old, distorted voice recording, and it gives us a good example of different amplitude ranges in a waveform.
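One way to look at those amplitude ranges is to render the waveform as an image. The sketch below assumes FFmpeg's showwavespic filter is a suitable way to do this; the article does not say which command it has in mind. The function name, file names and image size are hypothetical.

```python
def build_waveform_command(src, dst, size="1280x240"):
    """Return an ffmpeg argument list that draws the audio waveform of src as one image."""
    return [
        "ffmpeg",
        "-y",
        "-i", src,
        # showwavespic renders the whole audio stream into a single picture
        "-filter_complex", f"showwavespic=s={size}",
        "-frames:v", "1",  # write exactly one video frame (the picture)
        dst,
    ]

# Hypothetical file names, for illustration only.
waveform_cmd = build_waveform_command("voice_recording.wav", "waveform.png")
print(" ".join(waveform_cmd))
```

Quiet and loud passages of the recording then show up as narrow and wide regions of the drawn waveform.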
Use the -pix_fmt option to set the pixel format for the output. If the requested format cannot be selected, ffmpeg prints a warning and picks the best format supported by the encoder. Prefixing the format with + makes ffmpeg exit with an error instead, and it also disables automatic conversions inside filtergraphs. A separate option, -sws_flags, sets the default flags for the libswscale library. These flags are used by automatically inserted scale filters and by those within simple filtergraphs, unless they are overridden in the filtergraph itself.
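A minimal sketch of setting the pixel format follows. The function name, the file names and the choice of yuv420p are assumptions for illustration. The optional + prefix follows the FFmpeg documentation, where it makes ffmpeg fail rather than fall back to another format.

```python
def build_pix_fmt_command(src, dst, pix_fmt="yuv420p", strict=False):
    """Return an ffmpeg argument list that requests a pixel format for the output.

    With strict=True the format is prefixed with '+', which asks ffmpeg to exit
    with an error if that exact format cannot be selected.
    """
    value = ("+" if strict else "") + pix_fmt
    return ["ffmpeg", "-y", "-i", src, "-pix_fmt", value, dst]

# Hypothetical file names, for illustration only.
lenient = build_pix_fmt_command("input.mov", "output.mp4")
strict = build_pix_fmt_command("input.mov", "output.mp4", strict=True)
print(lenient)
print(strict)
```

The formats available in a given build can be listed with ffmpeg -pix_fmts.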
Object detection
The FFmpeg[1] media framework is a de facto industry standard, thanks to its comprehensive support for video and audio stream formats and its encoding and decoding capabilities. This makes it a natural base to extend for analytics workloads.
The advanced options control inference at the frame level and between filters in a filter graph. Tuning them may improve the performance and accuracy of inference.
The -filter_hw_device option passes the hardware device with the given name to all filters in any filter graph. It is typically used to set the device for the hwupload or hwmap filters, although other filters that require a hardware device can use it too.
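The following sketch shows how a named hardware device might be handed to the hwupload filter. It assumes the option being described is -filter_hw_device, a Linux machine with VAAPI, the render node /dev/dri/renderD128, and an FFmpeg build that includes the h264_vaapi encoder. None of these come from the article, and the function and device names are hypothetical.

```python
def build_vaapi_upload_command(src, dst,
                               device_path="/dev/dri/renderD128",
                               device_name="gpu"):
    """Return an ffmpeg argument list that uploads frames to a VAAPI device for encoding."""
    return [
        "ffmpeg", "-y",
        # create a hardware device and give it a name
        "-init_hw_device", f"vaapi={device_name}:{device_path}",
        # make that named device available to filters such as hwupload
        "-filter_hw_device", device_name,
        "-i", src,
        # convert to a format the hardware accepts, then upload the frames
        "-vf", "format=nv12,hwupload",
        "-c:v", "h264_vaapi",
        dst,
    ]

# Hypothetical file names, for illustration only.
hw_cmd = build_vaapi_upload_command("input.mp4", "output.mp4")
print(" ".join(hw_cmd))
```

On other platforms the device type, device path and encoder name would differ.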
The -benchmark option shows benchmarking information at the end of the encode: the real, system and user time, and the maximum memory consumption. Maximum memory consumption is not reported on all systems.
The -max_error_rate option sets the fraction of decoding frame failures, across all inputs, above which ffmpeg returns exit code 69. Crossing the threshold does not stop processing; it only changes the exit code. The value is a floating-point number between 0 and 1.
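To make the last two options concrete, here is a small sketch that decodes a file without writing any output, prints benchmark figures, and interprets the exit code. It assumes the options in question are -benchmark and -max_error_rate. The helper names, the file name and the 0.1 threshold are hypothetical.

```python
def build_check_command(src, max_error_rate=0.1):
    """Return an ffmpeg argument list that decodes src and discards the result."""
    if not 0.0 <= max_error_rate <= 1.0:
        raise ValueError("max_error_rate must be between 0 and 1")
    return [
        "ffmpeg",
        "-benchmark",                            # print timing and memory figures at the end
        "-max_error_rate", str(max_error_rate),  # failure fraction that triggers exit code 69
        "-i", src,
        "-f", "null", "-",                       # decode only, write no file
    ]

def describe_exit_code(code):
    """Translate an ffmpeg exit code into a short message."""
    if code == 0:
        return "ok"
    if code == 69:
        return "decoding error rate exceeded the threshold"
    return "other failure"

# Hypothetical file name, for illustration only.
check_cmd = build_check_command("digitized_tape.mkv")
print(" ".join(check_cmd))
print(describe_exit_code(69))
```

In practice the list would be passed to subprocess.run, and the returncode of the result fed to describe_exit_code.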
Object recognition
FFmpeg can read from all sorts of inputs: your webcam, a file, a stream, a Blackmagic DeckLink and more. It can also output to files, streams, pipes and sockets – the options are endless.
The main parts of the FFmpeg framework (demuxing, decoding, color space conversion, scaling and asynchronous inference) are optimized for parallel execution. Moreover, the -filter_complex option can be used to create filter graphs that have multiple inputs and outputs; these can be mapped to output files manually with the -map option or left to be mapped automatically.
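A filter graph with one input and two outputs can be sketched as follows. The graph labels (full, tmp, small), the function name and the file names are hypothetical; split and scale are standard FFmpeg filters.

```python
def build_two_output_command(src, full_dst, small_dst, small_width=640):
    """Return an ffmpeg argument list that writes a full-size and a downscaled copy."""
    graph = (
        "[0:v]split=2[full][tmp];"             # duplicate the video stream
        f"[tmp]scale={small_width}:-2[small]"  # downscale one copy, keeping an even height
    )
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-filter_complex", graph,
        "-map", "[full]", full_dst,    # first output file takes the untouched copy
        "-map", "[small]", small_dst,  # second output file takes the scaled copy
    ]

# Hypothetical file names, for illustration only.
two_out = build_two_output_command("input.mp4", "full.mp4", "small.mp4")
print(" ".join(two_out))
```

Because only the video labels are mapped, audio is left out of both outputs in this sketch; adding -map 0:a before each output name would be one way to include it.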
Graph filters are very useful for detecting various errors. For example, graphs of the Y, U and V values and of their frame-to-frame differences (diffs) can reveal light/dark issues in your video, while the TOUT (temporal outliers) filter is well suited to detecting the white speckle patterns that can occur on damaged VHS tape. Once such errors are located, they can be addressed quickly.
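One possible way to obtain TOUT values with FFmpeg alone is the signalstats filter, whose per-frame metadata can be printed with the metadata filter and then parsed. The sketch below assumes that approach; the article may have a graphical tool in mind. The sample text only imitates the key=value lines that the metadata filter prints and is not real output. The function names, the file name and the 0.01 threshold are hypothetical.

```python
def build_signalstats_command(src):
    """Return an ffmpeg argument list that prints signalstats metadata to stdout."""
    return [
        "ffmpeg", "-i", src,
        "-vf", "signalstats=stat=tout,metadata=mode=print:file=-",
        "-an", "-f", "null", "-",  # ignore audio, write no output file
    ]

def parse_signalstats(text, key="TOUT"):
    """Collect the per-frame values of one signalstats key from printed metadata."""
    prefix = "lavfi.signalstats." + key + "="
    values = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            values.append(float(line[len(prefix):]))
    return values

def flag_frames(values, threshold):
    """Return the indexes of frames whose value exceeds the threshold."""
    return [i for i, v in enumerate(values) if v > threshold]

# Made-up sample in the general shape of the metadata filter's printout.
sample = """frame:0 pts:0 pts_time:0
lavfi.signalstats.YDIF=0.000000
lavfi.signalstats.TOUT=0.000125
frame:1 pts:1 pts_time:0.04
lavfi.signalstats.YDIF=1.250000
lavfi.signalstats.TOUT=0.031250
"""
tout = parse_signalstats(sample)
print(flag_frames(tout, threshold=0.01))
```

The same parser can collect YDIF, UDIF or VDIF by passing a different key, which corresponds to the "diffs" mentioned above.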
Segmentation
FFmpeg can be used to analyze a video and split it into multiple smaller segments. This can be helpful for OTT (over-the-top) streaming applications that require low latency and high speed. It can also write the same content and metadata to a different output format. Besides segmentation, FFmpeg has many other capabilities. It is free to use and is available in the default repositories of most Linux distributions.
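Specifies the split point time in seconds, that is, the time at which a new segment begins. In FFmpeg's segment muxer, the -segment_times option takes a comma-separated list of such split points in increasing order, while -segment_time sets a fixed segment duration. Both accept FFmpeg's duration syntax, so the values are not limited to integers.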
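As a sketch of splitting by time, the function below builds a command for FFmpeg's segment muxer, which is assumed here to be the segmenter under discussion. It accepts either a fixed duration (-segment_time) or an explicit list of split points (-segment_times). The function name, file names and times are hypothetical.

```python
def build_segment_command(src, pattern, segment_time=None, split_points=None):
    """Return an ffmpeg argument list that splits src without re-encoding."""
    if (segment_time is None) == (split_points is None):
        raise ValueError("give exactly one of segment_time or split_points")
    cmd = ["ffmpeg", "-y", "-i", src, "-map", "0", "-c", "copy", "-f", "segment"]
    if segment_time is not None:
        cmd += ["-segment_time", str(segment_time)]  # fixed segment duration in seconds
    else:
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be in increasing order")
        cmd += ["-segment_times", ",".join(str(t) for t in split_points)]
    cmd += ["-reset_timestamps", "1", pattern]       # each segment starts near time zero
    return cmd

# Hypothetical file names and times, for illustration only.
fixed = build_segment_command("input.mp4", "part%03d.mp4", segment_time=10)
listed = build_segment_command("input.mp4", "part%03d.mp4", split_points=[30, 75.5, 120])
print(" ".join(fixed))
print(" ".join(listed))
```

With stream copy, each cut can only happen at a key frame, so the actual segment boundaries may fall slightly later than the requested times.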
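By default, the segment muxer starts each new segment on a key frame, which avoids gaps between segments. The break_non_keyframes option relaxes this and lets segments start on other frames; this may improve behavior on some players when the key frame interval is inconsistent, but it can make things worse on others. To get a key frame exactly at each split point, the video can be re-encoded with -force_key_frames. In that case, the segment muxer documentation recommends also setting -segment_time_delta to 1/(2*frame_rate), so that a key frame that lands just before the split time because of rounding is still used as the start of the segment.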
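The interaction between forced key frames and split times can be sketched as below. This follows the recommendation in FFmpeg's segment muxer documentation to pair -force_key_frames with a -segment_time_delta of 1/(2*frame_rate); that pairing is an assumption about what this passage intends. The function name, file names, the 25 fps rate and the libx264 encoder are hypothetical choices.

```python
def build_keyframe_segment_command(src, pattern, segment_time=10, frame_rate=25.0):
    """Return an ffmpeg argument list that re-encodes src with a key frame at every split."""
    # Tolerance of half a frame, so a key frame that rounding places just
    # before the split time is still accepted as the segment start.
    delta = 1.0 / (2.0 * frame_rate)
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264",  # forcing key frames requires re-encoding the video
        "-force_key_frames", f"expr:gte(t,n_forced*{segment_time})",
        "-f", "segment",
        "-segment_time", str(segment_time),
        "-segment_time_delta", f"{delta:.6f}",
        "-reset_timestamps", "1",
        pattern,
    ]

# Hypothetical file names, for illustration only.
kf_cmd = build_keyframe_segment_command("input.mp4", "part%03d.mp4")
print(" ".join(kf_cmd))
```

Because this variant re-encodes the video, it is slower than stream copy but gives segment boundaries at the requested times.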