Audio and video series 4: how to get started with FFmpeg for beginners (decoding H.264 from FLV as an example)

The first step is installation. You can either install through a package manager (on Ubuntu, apt-get) or compile from source. The difference is that compiling from source lets you read the source code, while a package-manager install does not. For novices, I prefer compiling from source.

Compiling the official source

Notes on compiling and installing:
One real pitfall first: the compilation guide installed the snapshot version for me, not the release version. That causes two problems: the snapshot is more bug-prone, and you cannot view FFmpeg's version number. To find the corresponding API you can only read the header files, not the documentation (although even after reinstalling I have not found documentation for my exact version; maybe FFmpeg differs from OpenCV in this respect. Either way, the release version is better than the snapshot).

Fortunately, with this compilation method it is easy to update: just delete the relevant packages, download the new source, and compile again. So I downloaded the latest release from GitHub, recompiled FFmpeg, and updated to the current release, 4.1.5.

Getting started

Run the demos directly

For anyone with programming experience, the natural move is to run a demo first and study it afterwards. I did the same. If you search directly, you will find articles of very uneven quality, so I recommend reading Lei Xiaohua's blog (he is known as "Raytheon"/雷神 in the Chinese audio-video community). His blog contains many demos, but I think it assumes more background than a beginner has. After running a demo, you will find you still cannot understand what the code means. That is when you advance to the second stage.

Lei Xiaohua's blog:

Tutorials for the FFmpeg library

First, you need to understand what the basic classes of the FFmpeg library do and how data flows through FFmpeg. There is a tutorial on GitHub that does a good job of this. Through it, you can learn what the basic classes are for and how to drive FFmpeg. That gives you the basic operations, but it is not enough: you still lack the relevant domain knowledge. That is when you advance to the third stage.

Learning the domain knowledge

For example: what is the difference between H.264, YUV, and RGB? What are PTS and DTS? This knowledge lets you handle errors when they occur, or find them quickly. If you just want to get a demo running and have no intention of going deeper, systematic study is unnecessary; but when an error of this kind does appear, you may go around in circles without understanding it and end up able to solve it only by asking someone. Wikipedia or a professional textbook is recommended here.

Reading the FFmpeg source code

Sometimes Google turns up other people's demos, but not every function has a demo online. The source code, however, always does: reading it lets you quickly learn how a function is used.
This source code is in fftools/ under your FFmpeg source directory.

Let me show you how to pick up an API quickly with an example.

An example

Discovering the problem

Suppose your code compiles successfully but produces warnings like the following:
xxxx is deprecated [-Wdeprecated-declarations]

For example, the code used in Section 2 of this audio and video series produces the following warning:
av_bitstream_filter_init(const char*)' is deprecated [-Wdeprecated-declarations]

At this point, search Google for what this means. You will find that the warning appears because the API is deprecated. Of course, "deprecated" itself means out of date; readers with good English can see that directly from the word.

Then find every function in your code that triggers a deprecation warning.
For example, in my Section 2 code, several functions are deprecated.

What are these functions for? Note that the AVPacket obtained by FFmpeg's demuxing contains only compressed video data; it does not contain the related decoding information (such as the SPS/PPS header information of H.264, or the ADTS header information of AAC). Without these, the decoder cannot recognize and decode the stream. In FFmpeg, the functions above are used to add this decoding header information to each AVPacket. Refer here.

Finding the API documentation

API documentation:
The documentation explains how to use the new API for the related functions.
In the upper right corner there is a search box: search for the function that was just deprecated, for example av_bitstream_filter_init.

As you can see from the figure below, this function has been replaced in the new version. We need to change the original function to: av_bsf_get_by_name(), av_bsf_alloc(), and av_bsf_init().

Click through each one to learn how it is used.

OK, but even knowing the related functions and how they are meant to be used, I still can't put them to work. What should I do? Got a demo, my friend?

Yes, of course!

Starting from the FFmpeg source code

Open the fftools/ folder in the FFmpeg directory we just mentioned, and you will see many .c files. Look at ffmpeg.c first; if you can't find what you need there, try the others.

Use Ctrl+F to locate the new API and see how it is used.
For example, searching for av_bsf_init finds the usage shown in the following figure:

OK, now you finally know how to use the API; next comes the code.

The compiled result is as follows:

Modified source code

Here is the updated code for pulling the RGB stream from the Mango TV video feed. Interested readers can compare the code below with the code in Section 2 and with the corresponding code in ffmpeg.c; you will find that ffmpeg.c has in fact already shown you how to use it. Just adapt it to your needs.

#include <iostream>
extern "C"
{
#include "libavformat/avformat.h"
#include <libavutil/mathematics.h>
#include <libavutil/time.h>
#include <libavutil/samplefmt.h>
#include <libavcodec/avcodec.h>
#include <libavfilter/buffersink.h>
#include <libavfilter/buffersrc.h>
}
#include "opencv2/core.hpp"

void AVFrame2Img(AVFrame *pFrame, cv::Mat& img);
void Yuv420p2Rgb32(const uchar *yuvBuffer_in, const uchar *rgbBuffer_out, int width, int height);
using namespace std;
using namespace cv;

AVFilterContext *buffersrc_ctx;
AVFilterGraph *filter_graph;

int main(int argc, char* argv[])
{
    AVFormatContext *ifmt_ctx = NULL;
    AVPacket pkt;
    AVFrame *pframe = NULL;
    int ret, i;
    int videoindex = -1;

    AVCodecContext  *pCodecCtx;
    AVCodec         *pCodec;

    const AVBitStreamFilter *buffersrc = NULL;
    AVBSFContext *bsf_ctx;
    AVCodecParameters *codecpar = NULL;

    const char *in_filename  = "rtmp://";   //Mango TV rtmp address (truncated here)
    //const char *in_filename = "test.h264";
    //const char *in_filename = "rtmp://localhost:1935/rtmplive";
    const char *out_filename_v = "test1.h264";	//Output file URL

    if ((ret = avformat_open_input(&ifmt_ctx, in_filename, 0, 0)) < 0) {
        printf("Could not open input file.");
        return -1;
    }
    if ((ret = avformat_find_stream_info(ifmt_ctx, 0)) < 0) {
        printf("Failed to retrieve input stream information");
        return -1;
    }

    for (i = 0; i < ifmt_ctx->nb_streams; i++) {
        if (ifmt_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
            videoindex = i;
            codecpar = ifmt_ctx->streams[i]->codecpar;
        }
    }

    //Find H.264 Decoder
    pCodec = avcodec_find_decoder(AV_CODEC_ID_H264);
    if (pCodec == NULL) {
        printf("Couldn't find Codec.\n");
        return -1;
    }

    pCodecCtx = avcodec_alloc_context3(pCodec);
    if (!pCodecCtx) {
        fprintf(stderr, "Could not allocate video codec context\n");
        return -1;
    }

    if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0) {
        printf("Couldn't open codec.\n");
        return -1;
    }

    pframe = av_frame_alloc();
    if (!pframe) {
        printf("Could not allocate video frame\n");
        return -1;
    }

    FILE *fp_video = fopen(out_filename_v, "wb+"); //For saving H.264

    cv::Mat image_test;

    //Old API
    //AVBitStreamFilterContext* h264bsfc = av_bitstream_filter_init("h264_mp4toannexb");

    //New API: set up the h264_mp4toannexb bitstream filter
    buffersrc = av_bsf_get_by_name("h264_mp4toannexb");
    ret = av_bsf_alloc(buffersrc, &bsf_ctx);
    if (ret < 0)
        return -1;
    avcodec_parameters_copy(bsf_ctx->par_in, codecpar);
    ret = av_bsf_init(bsf_ctx);

    while (av_read_frame(ifmt_ctx, &pkt) >= 0) {
        if (pkt.stream_index == videoindex) {
            //Old API
            //av_bitstream_filter_filter(h264bsfc, ifmt_ctx->streams[videoindex]->codec, NULL,
            //                           &pkt.data, &pkt.size, pkt.data, pkt.size, 0);

            ret = av_bsf_send_packet(bsf_ctx, &pkt);
            if (ret < 0) {
                cout << " bsf_send_packet is error! " << endl;
                av_packet_unref(&pkt);
                continue;
            }
            ret = av_bsf_receive_packet(bsf_ctx, &pkt);
            if (ret < 0) {
                cout << " bsf_receive_packet is error! " << endl;
                av_packet_unref(&pkt);
                continue;
            }
            printf("Write Video Packet. size:%d\tpts:%ld\n", pkt.size, pkt.pts);

            //Save as H.264; this is used for testing
            //fwrite(pkt.data, 1, pkt.size, fp_video);

            //Decode AVPacket
            ret = avcodec_send_packet(pCodecCtx, &pkt);
            if (ret < 0 || ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                std::cout << "avcodec_send_packet: " << ret << std::endl;
                av_packet_unref(&pkt);
                continue;
            }
            //Get AVFrame
            ret = avcodec_receive_frame(pCodecCtx, pframe);
            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
                std::cout << "avcodec_receive_frame: " << ret << std::endl;
                av_packet_unref(&pkt);
                continue;
            }
            //AVFrame to rgb
            AVFrame2Img(pframe, image_test);
        }
        //Free AVPacket
        av_packet_unref(&pkt);
    }

    //Close filter (old API)
    //av_bitstream_filter_close(h264bsfc);
    av_bsf_free(&bsf_ctx);
    fclose(fp_video);
    avformat_close_input(&ifmt_ctx);

    if (ret < 0 && ret != AVERROR_EOF) {
        printf("Error occurred.\n");
        return -1;
    }
    return 0;
}
void Yuv420p2Rgb32(const uchar *yuvBuffer_in, const uchar *rgbBuffer_out, int width, int height)
{
    uchar *yuvBuffer = (uchar *)yuvBuffer_in;
    uchar *rgb32Buffer = (uchar *)rgbBuffer_out;

    int channels = 3;

    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            //yuv420p layout: full-resolution Y plane, then quarter-resolution U and V planes
            int indexY = y * width + x;
            int indexU = width * height + y / 2 * width / 2 + x / 2;
            int indexV = width * height + width * height / 4 + y / 2 * width / 2 + x / 2;

            uchar Y = yuvBuffer[indexY];
            uchar U = yuvBuffer[indexU];
            uchar V = yuvBuffer[indexV];

            //YUV -> RGB conversion
            int R = Y + 1.402 * (V - 128);
            int G = Y - 0.34413 * (U - 128) - 0.71414 * (V - 128);
            int B = Y + 1.772 * (U - 128);
            //Clamp to [0, 255]
            R = (R < 0) ? 0 : R;
            G = (G < 0) ? 0 : G;
            B = (B < 0) ? 0 : B;
            R = (R > 255) ? 255 : R;
            G = (G > 255) ? 255 : G;
            B = (B > 255) ? 255 : B;

            rgb32Buffer[(y * width + x) * channels + 2] = uchar(R);
            rgb32Buffer[(y * width + x) * channels + 1] = uchar(G);
            rgb32Buffer[(y * width + x) * channels + 0] = uchar(B);
        }
    }
}

void AVFrame2Img(AVFrame *pFrame, cv::Mat& img)
{
    int frameHeight = pFrame->height;
    int frameWidth = pFrame->width;
    int channels = 3;
    //Allocate memory for the output image
    img = cv::Mat::zeros(frameHeight, frameWidth, CV_8UC3);
    Mat output = cv::Mat::zeros(frameHeight, frameWidth, CV_8U);

    //Create a buffer to hold the yuv data
    uchar* pDecodedBuffer = (uchar*)malloc(frameHeight * frameWidth * sizeof(uchar) * channels);

    //Copy the yuv420p data from the AVFrame into the buffer
    int i, j, k;
    //Copy Y component
    for (i = 0; i < frameHeight; i++)
    {
        memcpy(pDecodedBuffer + frameWidth * i,
               pFrame->data[0] + pFrame->linesize[0] * i,
               frameWidth);
    }
    //Copy U component
    for (j = 0; j < frameHeight / 2; j++)
    {
        memcpy(pDecodedBuffer + frameWidth * i + frameWidth / 2 * j,
               pFrame->data[1] + pFrame->linesize[1] * j,
               frameWidth / 2);
    }
    //Copy V component
    for (k = 0; k < frameHeight / 2; k++)
    {
        memcpy(pDecodedBuffer + frameWidth * i + frameWidth / 2 * j + frameWidth / 2 * k,
               pFrame->data[2] + pFrame->linesize[2] * k,
               frameWidth / 2);
    }

    //Convert the yuv420p data in the buffer to RGB
    Yuv420p2Rgb32(pDecodedBuffer, img.data, frameWidth, frameHeight);

    //Simple processing: Canny for binarization
    //cvtColor(img, output, CV_RGB2GRAY);
    //waitKey(2);
    //Canny(img, output, 50, 50 * 2);
    //waitKey(2);
    //Test function
    //imwrite("test.jpg", img);

    //Release the buffer
    free(pDecodedBuffer);
}
The cmake file is not included here; please refer to Section 2.


Tags: codec github snapshot OpenCV

Posted on Mon, 16 Mar 2020 23:17:02 -0700 by danrah