Android - Decoding and Encoding video simultaneously
While recently working on a project from ThinkApps, we did a lot of research about rapid video encoding and decoding on Android, and I want to share my findings in this post. The first obvious question is: why do we need hardware decoding and encoding for videos? The answer is simple: dedicated hardware codecs process video a lot faster than software running on the CPU can.
As you probably know, the Android ecosystem is very fragmented, with a lot of different software and hardware vendors on the market. There are a variety of device manufacturers such as LG, Samsung, and HTC, and different hardware manufacturers like Nvidia and Qualcomm which provide components like CPU and GPU chipsets for the devices.
Hardware-accelerated processing has to happen at a low level, i.e. “close to the hardware”. The code responsible for processing needs to run in parallel on the hardware components. This is why most hardware manufacturers provide their own components (encoders and decoders) that can be used within the Android ecosystem.
This close-to-the-hardware approach does have a problem though: all of these companies are competitors. Each has its own R&D department with superstar engineers, and the majority of the results they produce are proprietary in nature. That means one can’t expect a decoded video frame to be in the same format on all Android phones. One also can’t assume that all encoders will consume input frames in the same format.
Take Qualcomm for example. Qualcomm hardware decoders produce decoded frames in a format which is not described anywhere. There is no description available on the Qualcomm official site. Even in the official Android documentation, one can find every official color format except for the Qualcomm format which isn’t even mentioned!
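To make the layout problem concrete, here is a minimal sketch in plain Java (it uses no Android API and does not handle the undocumented Qualcomm format). It repacks a frame between two well-known YUV 4:2:0 layouts: planar I420, where the U and V planes are stored one after the other, and semi-planar NV12, where U and V samples are interleaved. The class and method names are hypothetical and exist only for illustration. The pixel data is identical in both layouts, yet an encoder that expects one will produce garbage colors when fed the other.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of any Android API. */
final class YuvLayouts {
    private YuvLayouts() {}

    /**
     * Repacks an I420 frame (Y plane, then U plane, then V plane) into
     * NV12 (Y plane, then interleaved U/V pairs).
     * Width and height must be even because chroma is subsampled 2x2.
     */
    static byte[] i420ToNv12(byte[] i420, int width, int height) {
        if (width % 2 != 0 || height % 2 != 0) {
            throw new IllegalArgumentException("width and height must be even");
        }
        int ySize = width * height;
        int chromaSize = ySize / 4; // one U and one V sample per 2x2 pixel block
        if (i420.length != ySize + 2 * chromaSize) {
            throw new IllegalArgumentException("unexpected buffer size");
        }
        byte[] nv12 = new byte[i420.length];
        // The luma plane is identical in both layouts.
        System.arraycopy(i420, 0, nv12, 0, ySize);
        // Interleave the two separate chroma planes as U0 V0 U1 V1 ...
        for (int i = 0; i < chromaSize; i++) {
            nv12[ySize + 2 * i] = i420[ySize + i];                  // U sample
            nv12[ySize + 2 * i + 1] = i420[ySize + chromaSize + i]; // V sample
        }
        return nv12;
    }

    public static void main(String[] args) {
        // A 4x2 frame: eight Y samples, two U samples (9, 10), two V samples (11, 12).
        byte[] i420 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
        System.out.println(Arrays.toString(i420ToNv12(i420, 4, 2)));
    }
}
```

With documented layouts like these, the conversion is a few lines of code. With a proprietary, undocumented layout there is nothing to write the loop against, which is the core of the problem described above.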
We tried several different approaches to solve this problem. We checked the FFmpeg solution, which can be compiled with Android’s hardware decoding, but we wanted a fully hardware-accelerated decoding, post-processing, and encoding chain. We also tried Qualcomm’s solution with their samples, without any success. This seems to be because Qualcomm’s solution is only for phones which use Qualcomm Snapdragon chipsets. Even with their sample application, the final result included broken frames on our Jelly Bean devices with Snapdragon chipsets.
What’s more, Qualcomm doesn’t provide ready-to-use binaries for the libraries. Qualcomm’s code is mostly native C and uses deep system integration. So in order to use it, we have to compile the native code for every major Android release separately using Android sources. Basically, this means that we have to compile five different shared object libraries, each of them including a full download and compilation of Android sources. Later, they can be packed into a single .apk file but we need to compile it again every time we change something or a major Android API update is released.
Apparently Google has noticed the fragmentation problem in hardware-accelerated video processing and has started to fix it. The fix is the MediaCodec class, which has been available since Android 4.1. Unfortunately, the first release had a serious implementation flaw: it didn’t support a portable encoding/decoding chain. It worked fine for encoding (as long as you used OpenGL to draw movie frames) or decoding (as long as you wanted to display the decoded movie on an Android OpenGL surface view), but when you tried to do a combined decoding/encoding cycle, you ran into the problem of color format conversions again.
As of Android API level 18 (Jelly Bean 4.3), there is a better way to use hardware acceleration, and the problem from the first release has been solved. New API calls (createInputSurface in particular) use an OpenGL surface as an intermediary: the decoder writes data to the surface using hardware acceleration, and the encoder then reads the data directly from the surface buffer. However, current device coverage for Jelly Bean 4.3 is extremely low. Official statistics show that only 1.5% of all devices have it installed, which means that for now it’s not really a feasible solution for wider audiences. Most of the people who are using the latest API agree that the pre-4.3 APIs are not really portable across devices and that it’s not feasible to try to perform the decoder/encoder loop on your own. It also means that from Jelly Bean 4.3 onward, all hardware manufacturers need to support two-way conversion for their hardware decoders and encoders (decoder color format to RGB to encoder color format).
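The "decoder color format to RGB to encoder color format" round trip can be illustrated per pixel in plain Java. This is only a sketch of the idea: on a real device the conversion happens in the GPU and the codec hardware, and which conversion matrix a given vendor uses is not something we verified. The full-range BT.601 coefficients below are an assumption, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper showing a YUV -> RGB -> YUV round trip on the CPU. */
final class ColorRoundTrip {
    private ColorRoundTrip() {}

    /** Rounds to the nearest integer and clamps to the 8-bit range 0..255. */
    private static int clamp(double v) {
        return (int) Math.max(0, Math.min(255, Math.round(v)));
    }

    /** Converts one full-range BT.601 YUV sample to {r, g, b}. */
    static int[] yuvToRgb(int y, int u, int v) {
        double cb = u - 128; // chroma is stored with a +128 offset
        double cr = v - 128;
        return new int[] {
            clamp(y + 1.402 * cr),
            clamp(y - 0.344136 * cb - 0.714136 * cr),
            clamp(y + 1.772 * cb)
        };
    }

    /** Converts one RGB pixel to full-range BT.601 {y, u, v}. */
    static int[] rgbToYuv(int r, int g, int b) {
        return new int[] {
            clamp(0.299 * r + 0.587 * g + 0.114 * b),
            clamp(-0.168736 * r - 0.331264 * g + 0.5 * b + 128),
            clamp(0.5 * r - 0.418688 * g - 0.081312 * b + 128)
        };
    }

    public static void main(String[] args) {
        // Decoder side: a YUV sample becomes RGB on the intermediate surface.
        int[] rgb = yuvToRgb(128, 128, 128);
        // Encoder side: the RGB pixel is converted back to YUV.
        int[] yuv = rgbToYuv(rgb[0], rgb[1], rgb[2]);
        System.out.println(Arrays.toString(rgb) + " -> " + Arrays.toString(yuv));
    }
}
```

Doing this arithmetic on the CPU for every pixel of every frame is exactly the work that the surface-based path hands over to the hardware.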
By using the MediaCodec solution, we were able to process video 6-7x faster than with software processing. We tested this approach using 720p recordings (1280x720 resolution at 30fps).
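For a sense of scale, the following sketch estimates how much raw data a 720p stream at 30fps represents once it is decoded. It assumes uncompressed YUV 4:2:0 frames at 12 bits per pixel, which is a common decoder output but not something guaranteed on every device. The class and method names are hypothetical.

```java
/** Hypothetical helper estimating the raw data rate of decoded video. */
final class RawVideoRate {
    private RawVideoRate() {}

    /** YUV 4:2:0 stores 1 byte of luma per pixel plus 0.5 bytes of chroma. */
    static long bytesPerFrameYuv420(int width, int height) {
        return (long) width * height * 3 / 2;
    }

    /** Raw bytes that pass through the pipeline each second of video. */
    static long bytesPerSecond(int width, int height, int fps) {
        return bytesPerFrameYuv420(width, height) * fps;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerFrameYuv420(1280, 720)); // prints 1382400
        System.out.println(bytesPerSecond(1280, 720, 30));  // prints 41472000
    }
}
```

That is roughly 41 MB of pixel data per second of video that has to be decoded, possibly converted, and encoded again, which helps explain why keeping the frames in hardware makes such a difference.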
Below you can find more information about playing with hardware accelerated decoding/encoding:
- http://bigflake.com/mediacodec/ - examples of decode/encode loops for the MediaCodec API (only the 4.3+ examples make sense for us)
- https://vec.io/tags/openmax - examples about software and hardware decode/encode
- http://stackoverflow.com/questions/14806987/how-to-use-bytebuffer-in-the-mediacodec-context-in-android - trying to use pre-API-18 with poor results
- http://stackoverflow.com/questions/17096726/how-to-encode-bitmaps-into-a-video-using-mediacodec?rq=1 - trying to encode raw data.