Development of the video codec H.264

Banal as it may sound, I believe that video codec development should start with learning English. However, if you are reading these lines, this is not an urgent matter for you, and you can skip the next three paragraphs.

Why is English so important for video codec development? Nobody in Russia is engaged in such things, except perhaps the Elecard company; I have devoted a separate note, Internet-TV system developer toolkit, to a brief analysis of its products and a comparison with their analogues. But I have not seen any technical information from Elecard that could help us in our difficult work, so it can be left out of account. All the technical documentation you will need exists in English only. And reading a Russian translation of the documentation only muddles your head, convinces you once again that you have taken on the wrong business, and tempts you to throw the idea away.

But why not use translations of the documentation? It's a reasonable question. I once read a Russian translation of the MPEG-2 standards, ISO/IEC 13818-1, -2 and -3. I had to re-read the same sentences five times, trying to understand what was written there, to no avail. Then I put the Russian version aside and began to read the original English version. And immediately, as if by magic, everything straightened out. The problem with translations is that the material must be translated by someone who understands perfectly what it is about. There are not many such translators, and I think that in this case they simply do not exist.

As an example, I can quote one developer who, like me, was trying to cope with a Russian translation of some documentation. It was the documentation for a chip, and it stated that, when soldering the pins, one of the pins should be on the left. Finding no logical explanation for why it should be on the left rather than the right, he decided to look into the original English documentation. It turned out that the original used the word “left”, which, apart from the direction “left”, can also be a form of the verb “to leave”. Blunders like this are everywhere in technical translations. Therefore, if a translator does not understand what he is translating, the translation is better thrown out.

Let me list the basic resources that development will require.

First, the H.264 standard itself (MPEG-4 Part 10). The standard is rather voluminous, more than 600 pages. You will have to read it, study it thoroughly and understand it. There is no other way.

Second, Iain Richardson’s book “Video coding. H.264 and MPEG-4 - standards of new generation”. By the way, the Russian translation here is very accurate; I have not come across any slips, and everything is explained clearly. A second edition of the book has also been published, but without a Russian translation. In general, if you have the second edition, you can leave the first unread.

Third, the H.264 reference codec, which is available here: http://iphome.hhi.de/suehring/tml/ The codec can be compiled for both Windows and Linux. It serves as the basis for all my development.

Fourth, software for analyzing H.264 streams (see Internet-TV system developer toolkit). Unfortunately, the price of these analyzers runs to a figure with many zeros per workstation, so we have not purchased any of them, although I had the opportunity to try them out and make a choice. I have now received preliminary approval for the funds needed to buy an analyzer, so I hope we will buy one in the near future. So far, I have made do with the reference codec alone.

And last. You must clearly understand that in this business you can rely only on yourself and your team. In the course of the work there will be many questions whose answers you will have to find yourself, without relying on anyone else's help. This is simply because few companies in the world develop video codecs, and the people who work in them do not hold their discussions in public. Even the few ready-made solutions you do manage to find may turn out not to work.

Here is an example. I was developing the entropy coder. The standard describes the decoding algorithm, but the encoding algorithm is left to the discretion of codec developers. That is, I had to derive a correct encoding algorithm from the decoding algorithm, which turned out not to be so simple. First, as usual, I searched the Internet for a ready-made algorithm (why reinvent the wheel?), and I found one. It was not finished code, which would not have helped much here because it would depend too heavily on its context, but pseudo-code. In other words, it was just what I needed. I wrote an implementation of this algorithm and everything started to work, but at unpredictable moments the coder failed: an incorrect bit sequence appeared at the output. It took me a long time to realize that the bug was in the pseudo-code itself. I don't know, maybe it was put there on purpose? They say nothing good comes for free. Maybe so. But the fact remains: you have to look for solutions yourself.
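To illustrate what deriving an encoder from a decoder description looks like, here is a minimal sketch in Python. It is not my entropy coder: it uses the much simpler Exp-Golomb ue(v) code that H.264 uses for many header syntax elements, and it represents the bitstream as a string of '0' and '1' characters for readability. The function names are made up for this example. The round-trip check at the end is the kind of test that exposes a bug like the one in the pseudo-code I found.

```python
def decode_ue(bits, pos=0):
    """Decoder side, following the parsing rule: count leading zero bits,
    skip the terminating '1', then read that many suffix bits."""
    leading_zeros = 0
    while bits[pos] == "0":
        leading_zeros += 1
        pos += 1
    pos += 1  # skip the '1' that ends the zero prefix
    suffix = bits[pos:pos + leading_zeros]
    pos += leading_zeros
    value = (1 << leading_zeros) - 1 + (int(suffix, 2) if suffix else 0)
    return value, pos


def encode_ue(value):
    """Encoder side, obtained by inverting decode_ue: write value + 1 in
    binary and prefix it with (number of binary digits - 1) zero bits."""
    code = bin(value + 1)[2:]  # always starts with '1'
    return "0" * (len(code) - 1) + code


def round_trip_ok(limit=1000):
    """Encode 0..limit-1 into one stream, decode it back, compare."""
    stream = "".join(encode_ue(v) for v in range(limit))
    pos = 0
    for expected in range(limit):
        value, pos = decode_ue(stream, pos)
        if value != expected:
            return False
    return pos == len(stream)  # no stray bits may remain
```

A real coder would work on packed bytes rather than strings, and the check should be run on far more than a thousand values, because failures of the kind described above show up only on rare inputs.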

One peculiarity of H.264 is that only the decoding of the video stream is standardized; each developer is free to choose the encoding process according to his own preferences. This explains the existence of different codecs with different characteristics in bitrate, quality and performance. Accordingly, a great many optimal-coding algorithms have been developed. In the course of my work I read dozens, even hundreds, of scientific papers in which the authors proposed different ways of choosing optimal coding algorithms. Here I would like to note one interesting point.

I used to be sure that all major developments in video coding took place in the United States. Indeed, we have all heard such names as Google, Microsoft, Intel, Apple and other companies famous for their achievements in high technology. Yet while reading scientific papers on video coding, I met practically no American surnames and no American research centers or institutes. If anything deserving attention did turn up, it dated from the early or mid-1990s. Indian researchers account for the greater part of the work. Next come South Korea and China, and perhaps Japan. This was so unexpected that I could not help wondering: what does the USA live on, if it is considered a pioneer in high technology while all the real development (in video coding at least) is done in India and other Asian countries? The answer is patents. The United States is strong in patents. That is, the development is done in India, but the patent on the technology belongs to the USA. By the way, I never came across Russia. Although, as is well known, the number of scientific papers is inversely proportional to the progress of science.

Why did I have to read all these papers? Having understood the basic approaches behind these algorithms and studied the experimental results provided by the authors, I implemented some of them, looked at the results and chose the best. Then I sat for a while, thought, counted all the flies on the ceiling, went to the river with my wife and daughter for a swim, picked mushrooms, and came up with a technique that, in a number of the tests I ran, beats the algorithms given in the scientific papers in efficiency. However, this does not mean that my video codec codes better than other professional video codecs. Indeed, it would be strange if a video codec that I developed in about six months coded better than video codecs that have been developed for decades. Of course it does not. I obtained better figures in only one of the coder's several operating modes. Given that I have implemented only a small part of what the H.264 standard offers, professional codecs, of course, work better. For instance, in Intra mode I used only 4x4 and 16x16 blocks, and in Inter mode only 16x16. I have also not optimized constant-bitrate coding at all, so the loss of image quality is noticeable in fast scenes. In general, I still have a great deal of work to do here.
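For readers who have not seen such algorithms, here is a minimal sketch in Python of the textbook approach that many of these papers start from: Lagrangian rate-distortion mode decision, where each candidate mode gets a cost J = D + lambda * R and the cheapest one wins. This is not my technique, and the mode names, distortion values, bit counts and lambda values below are all made up for illustration.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R for one candidate mode."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """candidates maps a mode name to a (distortion, rate_bits) pair.
    Returns the name of the mode with the lowest cost J."""
    def cost(mode):
        distortion, rate_bits = candidates[mode]
        return rd_cost(distortion, rate_bits, lam)
    return min(candidates, key=cost)


# Hypothetical measurements for one macroblock: 4x4 prediction fits the
# picture better (lower distortion) but costs more bits to signal.
candidates = {
    "intra16x16": (1200, 40),
    "intra4x4": (700, 160),
}

low_lambda_choice = choose_mode(candidates, lam=1)    # quality matters more
high_lambda_choice = choose_mode(candidates, lam=10)  # bits matter more
```

With these made-up numbers the small lambda selects intra4x4 (cost 860 against 1240) and the large lambda selects intra16x16 (cost 1600 against 2300). How distortion is measured and how lambda is tied to the quantizer is exactly where the published algorithms differ.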

It would be wrong not to touch on the techniques used in the open source video codec x264. Of course, I studied them. What surprised me here? The overwhelming majority of the scientific papers I read are based on mathematical algorithms and calculation, that is, they start from theory and then move to practice. In x264 everything is different. One gets the impression that the theoretical research preceding the practical work was either kept quiet and not made public, or did not exist at all. This says nothing bad about the quality of x264; quite the contrary, I consider this codec one of the best. It is simply a feature of it. All the criteria for choosing coding algorithms, and the decisions about which branch the coding process takes, are based on bare numbers that for the most part are not supported by theory. In the rare cases when the x264 developer explained where these numbers came from, he said that he had used the results of experiments. That is, many experiments were run, and as a result a certain value of a certain parameter was chosen as optimal.

So, the video codec is finished; what remains is to deal with its performance. You can read about that in the next note, Optimizing video codec for performance.

Igor, October 2012.
