A deep learning image compression algorithm that can save 55% of bandwidth

As the Internet develops, demand for high-definition images keeps growing. Minimizing image size while preserving image quality has become an industry trend.

Two well-known newer image compression formats are WebP and BPG.

WebP: An image file format from Google that provides both lossy and lossless compression. It is built on VP8 encoding, and support for lossless compression and transparency was added in November 2011. Websites such as Facebook have adopted this format.

BPG: An image format introduced by Fabrice Bellard, the well-known programmer behind projects such as FFmpeg and QEMU. It is built on HEVC encoding. At the same quality, a BPG file is only about half the size of a JPEG file. BPG also supports 8-bit and 16-bit channels, among other features. Although BPG compresses well, HEVC carries high patent fees, so its current adoption is relatively small.

Both technologies have their own advantages and disadvantages. To meet market demand as fully as possible, the use of deep learning to build image compression algorithms has been receiving more and more attention from the industry.

Designing image compression algorithms with deep learning

Designing a compression algorithm with deep learning makes it possible to reach a higher compression ratio without relying on HEVC, which makes the result better suited to commercial use, while also reducing image size without sacrificing image quality.

The deep learning technique used in image compression is the convolutional neural network (CNN). A CNN is assembled like building blocks from modules such as convolution, pooling, nonlinear functions, and normalization layers. The final output depends on the application. In face recognition, for example, a CNN can extract a set of features that represent a face image, and recognition is then performed by comparing how similar or different those features are.

Figure 1: Schematic diagram of a convolutional neural network (source http://blog.csdn.net/hjimce/article/details/47323463)

How are convolutional neural networks used for compression?

As shown in Figure 2, the complete framework includes a CNN encoder, quantization, inverse quantization, a CNN decoder, entropy coding, codeword estimation, and rate-distortion optimization. The encoder converts the image into compressed features, and the decoder recovers the original image from those features. Both the encoder and the decoder can be designed and built from modules such as convolution, pooling, and nonlinearity.
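The order in which these modules run can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the function names are hypothetical, the "encoder" is plain averaging and the "decoder" is plain repetition rather than trained CNNs, and entropy coding, codeword estimation, and rate-distortion optimization are left out entirely.

```python
import math

def encode(pixels, factor=2):
    """Stand-in for the CNN encoder: average each group of `factor` samples."""
    return [sum(pixels[i:i + factor]) / factor
            for i in range(0, len(pixels), factor)]

def quantize(features):
    """Turn float features into integer codes by dropping the fractional part."""
    return [math.floor(f) for f in features]

def dequantize(codes):
    """Turn integer codes back into floats (here: the middle of each bin)."""
    return [c + 0.5 for c in codes]

def decode(features, factor=2):
    """Stand-in for the CNN decoder: repeat each feature `factor` times."""
    return [f for f in features for _ in range(factor)]

def compress(pixels):
    # Sender side: encoder followed by quantization.
    return quantize(encode(pixels))

def decompress(codes):
    # Receiver side: inverse quantization followed by the decoder.
    return decode(dequantize(codes))

pixels = [10, 12, 200, 204, 90, 91]
codes = compress(pixels)
restored = decompress(codes)
print(codes)     # [11, 202, 90]
print(restored)  # [11.5, 11.5, 202.5, 202.5, 90.5, 90.5]
```

In a real system the integer codes would additionally pass through an entropy coder before being stored or transmitted.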

Figure 2: Schematic diagram of image compression with deep learning

How is a compression algorithm evaluated?

Three indicators are currently important for judging a compression algorithm: PSNR (peak signal-to-noise ratio), BPP (bits per pixel), and MS-SSIM (multi-scale structural similarity index). All data in a computer is stored as bits, and the more bits required, the more storage space is needed. PSNR evaluates the quality of the restored image after decoding. BPP is the number of bits that each pixel of the image occupies. MS-SSIM measures the subjective quality of the image. Simply put, at the same rate (BPP), a higher PSNR means a better compression result and a higher MS-SSIM means better subjective quality.

The following figure compares the PSNR and MS-SSIM values of the Tiny Network Graphics (TNG) image compression format with those of other image formats at the same compression ratio:
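As a rough illustration of the first two indicators, the sketch below computes PSNR from the mean squared error using the standard definition, and BPP from a compressed size. The function names and the toy sample values are assumptions made for this example. MS-SSIM is considerably more involved and is not reproduced here.

```python
import math

def psnr(original, restored, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(restored):
        raise ValueError("images must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)

def bits_per_pixel(compressed_size_bytes, width, height):
    """Average number of bits spent on each pixel of the image."""
    return compressed_size_bytes * 8 / (width * height)

# Toy data: a flat list of 8-bit samples and a slightly distorted copy.
original = [10, 50, 90, 130, 170, 210]
restored = [12, 49, 90, 128, 171, 210]

print(f"PSNR: {psnr(original, restored):.2f} dB")
# A hypothetical 49152-byte file for a 768 * 512 image is exactly 1 bit/pixel.
print(f"BPP:  {bits_per_pixel(49152, 768, 512):.2f}")
```

Two codecs are normally compared by plotting PSNR (or MS-SSIM) against BPP, so that quality is read off at equal rate.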

Figure 3: PSNR and MS-SSIM values of the TNG image format compared with other image formats at the same compression ratio

As the comparison above shows, TNG consistently leads on MS-SSIM, and its PSNR exceeds that of commercial algorithms such as WebP and JPEG2000.

How is deep learning used for compression?

Take a single image as an example. A three-channel image of size 768 * 512 is fed into the encoder network, and after the forward pass, compressed features occupying 96 * 64 * 192 data units are obtained. Readers with a computing background will recognize that each data unit could hold a floating-point number, an integer, or a binary number. Which data type should be used?

From the perspective of image restoration and neural network principles, the restored image quality is highest if the compressed feature data are floating-point numbers. But a floating-point number occupies 32 bits, so the bits per pixel come to (96 * 64 * 192 * 32) / (768 * 512) = 96. After compression, each pixel grows from 24 bits to 96 bits! The image has not been compressed; it has grown. Floating-point numbers are clearly not a good choice.
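The arithmetic can be checked directly. The short sketch below uses only the sizes given in the example (variable names are chosen for illustration) and also shows that the feature tensor holds exactly as many values as the original image, so any saving has to come from spending fewer bits on each value.

```python
# Number of values in the input image and in the compressed features.
image_values = 768 * 512 * 3      # width * height * channels
feature_values = 96 * 64 * 192    # size of the encoder output

# Bits per pixel of the raw image: 3 channels of 8 bits each.
raw_bpp = 3 * 8

# Bits per pixel if every feature value is stored as a 32-bit float.
float_bpp = feature_values * 32 / (768 * 512)

print(image_values, feature_values)  # 1179648 1179648
print(raw_bpp, float_bpp)            # 24 96.0
```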

So, to design a workable algorithm, a technique called quantization is used. Its purpose is to convert floating-point numbers into integers or binary numbers. The simplest approach is to drop the fractional part of each floating-point number. Once each value is an integer that occupies only 8 bits, each pixel occupies (96 * 64 * 192 * 8) / (768 * 512) = 24 bits. Correspondingly, at the decoding end, inverse quantization restores the feature data to floating-point numbers, for example by adding a random number to each integer. This reduces the effect of quantization on the accuracy of the neural network to some extent and so improves the quality of the restored image.
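A minimal sketch of this simplest scheme follows. It assumes the feature values already lie in the range 0 to 255 so that each code fits in 8 bits, and it assumes the random number added during inverse quantization is uniform in [0, 1). Both are choices made for this illustration, not details confirmed by the description above, and the function names are hypothetical.

```python
import math
import random

def quantize(features):
    """Drop the fractional part and clamp each code to the 8-bit range 0..255."""
    return [min(255, max(0, math.floor(x))) for x in features]

def dequantize(codes, rng):
    """Map integer codes back to floats by adding uniform noise in [0, 1)."""
    return [q + rng.random() for q in codes]

rng = random.Random(0)  # fixed seed so the run is repeatable
features = [0.2, 17.9, 128.5, 254.99]

codes = quantize(features)
restored = dequantize(codes, rng)

print(codes)  # [0, 17, 128, 254]
# Each restored value lies in the same unit-wide bin as the original value.
print(all(abs(a - b) < 1 for a, b in zip(features, restored)))  # True
```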

Even if each value in the compressed features occupies only 1 bit, there is still room to improve compression. How can the algorithm be optimized further? Look at the formula for BPP.

Assuming that each compressed feature data unit occupies 1 bit, the formula is (96 * 64 * 192 * 1) / (768 * 512) = 3, or 3 bits per pixel. For compression, the smaller the BPP, the better. In this formula the denominator is fixed by the image, so only the numerator can be adjusted. The numbers 96, 64, and 192 are determined by the network structure, so a better network design can make these three numbers smaller.

Which modules affect that 1? The 1 means that each compressed feature data unit occupies 1 bit on average. Quantization affects this number, but it is not the only factor: rate control and entropy coding matter as well. The purpose of rate control is to make the distribution of values in the compressed features as concentrated as possible and their range as small as possible, so that entropy coding can push the average below 1 bit and further improve the image compression ratio.
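To make the dependence on the network structure concrete, the sketch below wraps the formula in a small helper and varies the number of feature channels. The helper name and the alternative channel counts (128 and 64) are hypothetical values picked for illustration; only the 96 * 64 * 192 case comes from the example above.

```python
def bits_per_pixel(feat_w, feat_h, channels, bits_per_unit,
                   img_w=768, img_h=512):
    """BPP = total bits in the compressed features / number of image pixels."""
    return feat_w * feat_h * channels * bits_per_unit / (img_w * img_h)

# The case from the text: 1 bit per data unit.
print(bits_per_pixel(96, 64, 192, 1))  # 3.0

# Hypothetical narrower networks: fewer channels give a smaller numerator.
for channels in (192, 128, 64):
    print(channels, bits_per_pixel(96, 64, channels, 1))
```

Whether a narrower network can still reconstruct the image well is a separate question that the formula does not answer.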
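Why a concentrated distribution helps can be illustrated with Shannon entropy, which gives a lower bound on the average number of bits per symbol that a lossless entropy coder needs. The sketch below is an illustration with made-up binary data, not the rate control or entropy coder of any particular codec, and the function name is hypothetical.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    """Shannon entropy of the empirical symbol distribution, in bits."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

spread = [0, 1] * 8                # both binary values equally likely
concentrated = [0] * 14 + [1] * 2  # values concentrated on 0

print(entropy_bits_per_symbol(spread))                  # 1.0
print(round(entropy_bits_per_symbol(concentrated), 2))  # 0.54
```

With the sizes from the example, the BPP is 3 times the average number of bits per data unit, so an average of about 0.54 bits instead of 1 would correspond to roughly 1.6 bits per pixel instead of 3.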

Summary

Overall, designing video and image compression algorithms with deep learning is a very promising, but also very challenging, technology.

