BASIC BINARY CONVOLUTION UNIT FOR BINARIZED IMAGE RESTORATION NETWORK

Abstract

Lighter and faster image restoration (IR) models are crucial for deployment on resource-limited devices. The binary neural network (BNN), one of the most promising model compression methods, can dramatically reduce the computations and parameters of full-precision convolutional neural networks (CNN). However, BNNs have quite different properties from full-precision CNNs, so experience in designing CNNs can hardly be transferred to developing BNNs. In this study, we reconsider the components of binary convolution, such as the residual connection, BatchNorm, activation function, and structure, for IR tasks. We conduct systematic analyses to explain the role of each component in binary convolution and discuss the pitfalls. Specifically, we find that the residual connection can reduce the information loss caused by binarization; BatchNorm can bridge the value-range gap between the residual connection and binary convolution; and the position of the activation function dramatically affects the performance of the BNN. Based on our findings and analyses, we design a simple yet efficient basic binary convolution unit (BBCU). Furthermore, we divide IR networks into four parts and specially design variants of BBCU for each part to explore the benefit of binarizing these parts. We conduct experiments on different IR tasks, and our BBCU significantly outperforms other BNNs and lightweight models, which shows that BBCU can serve as a basic unit for binarized IR networks.

1. INTRODUCTION

Image restoration (IR) aims to restore a high-quality (HQ) image from its low-quality (LQ) counterpart corrupted by various degradation factors. Typical IR tasks include image denoising, super-resolution (SR), and compression artifacts reduction. Due to its ill-posed nature and high practical value, image restoration is an active yet challenging research topic in computer vision. Recently, deep convolutional neural networks (CNN) have achieved excellent performance by learning a mapping from LQ to HQ image patches (Chen & Pock, 2016; Zhang et al., 2018a; Tai et al., 2017; Xia et al., 2023). However, most IR tasks require dense pixel prediction, and the powerful performance of CNN-based models usually relies on increasing model size and computational complexity, which demands extensive computing and memory resources. Meanwhile, most hand-held devices and small drones are equipped with neither GPUs nor enough memory to store and run computationally expensive CNN models. Thus, to promote the deployment of IR models, it is essential to largely reduce their computation and memory cost while preserving performance. The binary neural network (Courbariaux et al., 2016) (BNN, also known as 1-bit CNN) has been recognized as one of the most promising neural network compression methods (He et al., 2017; Jacob et al., 2018; Zoph & Le, 2016) for deploying models onto resource-limited devices. BNNs can achieve a 32× memory compression ratio and up to 64× computational reduction on specially designed processors (Rastegari et al., 2016). Nowadays, research on BNNs mainly concentrates on high-level tasks, especially classification (Liu et al., 2018; 2020), and remains underexplored in low-level vision, such as image denoising.
Considering the great significance of BNNs for the deployment of deep IR networks and the difference between high-level and low-level vision tasks, there is an urgent need to explore the properties of BNNs on low-level vision tasks and provide a simple, strong, universal, and extensible baseline for future research and deployment. Recently, several works have explored the application of BNNs to image SR networks. Specifically, Ma et al. (Ma et al., 2019) binarized the convolution kernel weights to decrease the SR network model size; however, the computational complexity remains high because the full-precision activations are preserved. BAM (Xin et al., 2020) then adopted a bit accumulation mechanism to approximate the full-precision convolution for the SR network. Zhang et al. (Zhang et al., 2021b) designed a compact uniform prior to constrain the convolution weights into a highly narrow range centered at zero for better optimization. BTM (Jiang et al., 2021) further introduced knowledge distillation (Hinton et al., 2015) to boost the performance of binarized SR networks. However, the above-mentioned binarized SR networks can hardly achieve the full potential of BNNs. In this work, we explore the properties of three key components of BNNs, including the residual connection (He et al., 2016), BatchNorm (BN) (Ioffe & Szegedy, 2015), and the activation function (Glorot et al., 2011), and design a strong basic binary Conv unit (BBCU) based on our analyses. (1) For IR tasks, we observe that the residual connection is quite important for binarized IR networks. This is because a BNN binarizes the input full-precision activations to 1 or -1 before the binary convolution (BC), which means it loses a large amount of information about the value range of the activations. By adding a full-precision residual connection to each binary convolution, the BNN can reduce the effect of this value-range information loss. (2) Then, we explore the BN for BBCU.
BNN methods (Liu et al., 2020) for classification always couple a BN with the BC. However, for IR tasks, EDSR (Lim et al., 2017) has demonstrated that BN is harmful to SR performance. We find that BN in a BNN for IR tasks is useful and balances the value ranges of the residual connection and the BC. Specifically, as shown in Fig. 2 (b), the values of the full-precision residual connection mostly lie in the range of -1 to 1 because the value range of input images is around 0 to 1 or -1 to 1, while the values of the BC without BN are large, ranging from -15 to 15 due to the bit counting operation. The value range of the BC is larger than that of the residual connection, which covers the preserved full-precision image information in the residual connection and limits the performance of the BNN. As shown in Fig. 2 (a), the BN in a BNN for image restoration aligns the value ranges of the residual connection and the BC. (3) Based on these findings, to remove BN, we propose a residual alignment (RA) scheme that multiplies the input image by an amplification factor k to increase the value range of the residual connection, rather than using BN to narrow the value range of the BC (Fig. 2 (c)). This scheme improves the performance of binarized IR networks and simplifies the BNN structure (Sec. 4.5). (4) As shown in Fig. 2 (d), different from BNNs (Liu et al., 2020; 2018) for classification, we further move the activation function into the residual connection, which improves performance (Sec. 4.5). This is because the activation function narrows the negative value range of the residual connection, whose information would otherwise be covered by the next BC with its large negative value range (Fig. 2 (c)). (5) Furthermore, we divide IR networks into four parts: head, body, upsampling, and tail (Fig. 3 (a)). These four parts have different input and output channel numbers. Previous binarized SR networks (Xin et al., 2020; Jiang et al., 2021) merely binarize the body part.
However, the upsampling part accounts for 52.3% of the total calculations and needs to be binarized as well. Besides, the binarized head and tail parts are also worth exploring. Thus, we design different variants of BBCU to binarize these four parts (Fig. 3 (b)). Overall, our contributions can be summarized as threefold:

• We believe our work is timely. The high computational and memory cost of IR networks hinders their application on resource-limited devices. BNN, as one of the most promising compression methods, can help IR networks resolve this dilemma. Since BNN-based networks have different properties from full-precision CNN networks, we reconsider, analyze, and visualize essential components of BNNs to explore their functions.

• According to our findings and analyses on BNNs, we develop a simple, strong, universal, and extensible basic binary Conv unit (BBCU) for IR networks. Furthermore, we develop variants of BBCU and adapt them to different parts of IR networks.

• Extensive experiments on different IR tasks show that BBCU outperforms SOTA BNN methods (Fig. 1). BBCU can serve as a strong basic binary convolution unit for future binarized IR networks, which is meaningful to academic research and industry.

2. RELATED WORK

3.1. BASIC BINARY CONV UNIT DESIGN

As shown in Fig. 2 (a), we first construct BBCU-V1. Specifically, the full-precision convolution $X^f_j \otimes W^f_j$ (where $X^f_j$, $W^f_j$, and $\otimes$ denote full-precision activations, weights, and convolution, respectively) is approximated by the binary convolution $X^b_j \otimes W^b_j$. For binary convolution, both weights and activations are binarized to -1 and +1. Efficient bitwise XNOR and bit counting operations can then replace computationally heavy floating-point matrix multiplication:

$$X^b_j \otimes W^b_j = \mathrm{bitcount}\big(\mathrm{XNOR}\big(X^b_j, W^b_j\big)\big),$$

$$x^b_{i,j} = \mathrm{Sign}\big(x^f_{i,j}\big) = \begin{cases} +1, & \text{if } x^f_{i,j} > \alpha_{i,j} \\ -1, & \text{if } x^f_{i,j} \leq \alpha_{i,j} \end{cases}, \qquad x^f_{i,j} \in X^f_j,\ x^b_{i,j} \in X^b_j,\ i \in [0, C),$$

$$w^b_{i,j} = \frac{\|W^f_j\|_1}{n}\,\mathrm{Sign}\big(w^f_{i,j}\big) = \begin{cases} +\frac{\|W^f_j\|_1}{n}, & \text{if } w^f_{i,j} > 0 \\ -\frac{\|W^f_j\|_1}{n}, & \text{if } w^f_{i,j} \leq 0 \end{cases}, \qquad w^f_{i,j} \in W^f_j,\ w^b_{i,j} \in W^b_j,\ i \in [0, C),$$

where $X^f_j \in \mathbb{R}^{C \times H \times W}$ and $W^f_j \in \mathbb{R}^{C_{out} \times C_{in} \times K_h \times K_w}$ are the full-precision activations and convolution weights in the $j$-th layer, respectively. Similarly, $X^b_j \in \mathbb{R}^{C \times H \times W}$ and $W^b_j \in \mathbb{R}^{C_{out} \times C_{in} \times K_h \times K_w}$ are the binarized activations and convolution weights in the $j$-th layer. $x^f_{i,j}$, $w^f_{i,j}$, $x^b_{i,j}$, and $w^b_{i,j}$ are elements of the $i$-th channel of $X^f_j$, $W^f_j$, $X^b_j$, and $W^b_j$, respectively. $\alpha_{i,j}$ is a learnable coefficient controlling the threshold of the sign function for the $i$-th channel of $X^f_j$. Notably, the weight binarization scheme is inherited from XNOR-Net (Rastegari et al., 2016), in which $\|W^f_j\|_1 / n$ is the average of the absolute weight values and acts as a scaling factor to minimize the difference between the binary and full-precision convolution weights.
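The XNOR-plus-bitcount identity above can be checked with a small NumPy sketch (function names are ours, for illustration only): for ±1 vectors with m matching sign positions out of n, the dot product equals 2m - n.

```python
import numpy as np

def sign_binarize(x, alpha=0.0):
    """Binarize activations to +/-1 around a threshold alpha (per-channel
    and learnable in the paper; a scalar here for simplicity)."""
    return np.where(x > alpha, 1.0, -1.0)

def binarize_weights(w):
    """XNOR-Net-style weight binarization: Sign(w) scaled by the mean
    absolute weight value ||W||_1 / n."""
    scale = np.abs(w).mean()
    return scale * np.where(w > 0, 1.0, -1.0)

def xnor_bitcount_dot(xb, wb):
    """Dot product of two +/-1 vectors via XNOR + bit counting.
    With m matching sign positions out of n, the result is 2*m - n."""
    n = xb.size
    m = int(np.sum((xb > 0) == (wb > 0)))  # XNOR of sign bits, then popcount
    return float(2 * m - n)

# the bit-level result matches the floating-point +/-1 dot product
x = sign_binarize(np.array([0.3, -0.7, 1.2, -0.1]))
w = np.sign(np.array([0.5, 0.4, -0.2, -0.9]))
assert xnor_bitcount_dot(x, w) == float(x @ w)
```

This equivalence is what allows bit-packed hardware to replace each 32-bit multiply-accumulate with one XNOR plus a population count.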
Then, we use RPReLU (Liu et al., 2020) as our activation function, defined as:

$$f(y_{i,j}) = \begin{cases} y_{i,j} - \gamma_{i,j} + \zeta_{i,j}, & \text{if } y_{i,j} > \gamma_{i,j} \\ \beta_{i,j}\,(y_{i,j} - \gamma_{i,j}) + \zeta_{i,j}, & \text{if } y_{i,j} \leq \gamma_{i,j} \end{cases}, \qquad y_{i,j} \in Y_j,\ i \in [0, C),$$

where $Y_j \in \mathbb{R}^{C \times H \times W}$ is the input feature map of the RPReLU function $f(\cdot)$ in the $j$-th layer, $y_{i,j}$ is an element of the $i$-th channel of $Y_j$, $\gamma_{i,j}$ and $\zeta_{i,j}$ are learnable shifts for moving the distribution, and $\beta_{i,j}$ is a learnable coefficient controlling the slope of the negative part, acting on the $i$-th channel of $Y_j$. Different from the common residual block with two convolutions used in full-precision IR networks (Lim et al., 2017), we find that a residual connection is essential for each binary convolution to compensate the information loss caused by binarization. Thus, we equip every binary convolution with its own residual connection. BBCU-V1 can then be expressed as:

$$X^f_{j+1} = f\big(\mathrm{BatchNorm}\big(X^b_j \otimes W^b_j\big) + X^f_j\big) = f\big(\kappa_j\big(X^b_j \otimes W^b_j\big) + \tau_j + X^f_j\big),$$

where $\kappa_j, \tau_j \in \mathbb{R}^C$ are learnable parameters of BatchNorm in the $j$-th layer. In BNNs, the derivative of the sign function in Eq. 2 is an impulse function that cannot be utilized in training. Thus, we adopt an approximated derivative as the derivative of the sign function:

$$\mathrm{Approx}\left(\frac{\partial\,\mathrm{Sign}\big(x^f_i\big)}{\partial x^f_i}\right) = \begin{cases} 2 + 2\big(x^f_i - \alpha_i\big), & \text{if } \alpha_i - 1 \leq x^f_i < \alpha_i \\ 2 - 2\big(x^f_i - \alpha_i\big), & \text{if } \alpha_i \leq x^f_i < \alpha_i + 1 \\ 0, & \text{otherwise} \end{cases}.$$

However, EDSR (Lim et al., 2017) demonstrated that BN changes the distribution of images, which is harmful to accurate pixel prediction in SR. So, can we also directly remove the BN in BBCU-V1 to obtain BBCU-V2 (Fig. 2 (b))? In BBCU-V1 (Fig. 2 (a)), we can see that the bit counting operation of binary convolution tends to output large value ranges (from -15 to 15).
In contrast, the residual connection preserves the full-precision information, which flows from the front end of the IR network with a small value range of around -1 to 1. By adding a BN, its learnable parameters can narrow the value range of the binary convolution and bring it close to that of the residual connection, so that the full-precision information is not covered. The process can be expressed as:

$$\mathrm{Mean}\big(\kappa_j\big(X^b_j \otimes W^b_j\big) + \tau_j\big) \rightarrow \mathrm{Mean}\big(X^f_j\big),$$

where $\kappa_j, \tau_j \in \mathbb{R}^C$ are learnable parameters of BN in the $j$-th layer. Thus, BBCU-V2, which simply removes the BN from BBCU-V1, suffers a huge performance drop. From this exploration, we learn that BN is essential for BBCU because it balances the value ranges of the binary convolution and the residual connection. However, BN changes image distributions, which limits restoration. In BBCU-V3, we therefore propose the residual alignment (RA) scheme, which multiplies the value range of the input image by an amplification factor k (k > 1) so that the BN can be removed (Figs. 2 (c) and 3 (b)):

$$\mathrm{Mean}\big(X^b_j \otimes W^b_j\big) \leftarrow \mathrm{Mean}\big(kX^f_j\big).$$

As Eq. 8 shows, since the residual connection flows from the amplified input image, the value range of $X^f_j$ is also amplified, which we denote as $kX^f_j$. Meanwhile, the values of the binary convolution are almost unaffected by RA, because $X^b_j$ filters out the amplitude information of $kX^f_j$. Different from BatchNorm, RA makes the value range of the residual connection close to that of the binary convolution (-60 to 60). Besides, we find that using RA to remove BatchNorm has two main benefits: (1) similar to full-precision IR networks, binarized IR networks perform better without BatchNorm; (2) the structure of BBCU becomes simpler, more efficient, and stronger. Based on the above findings, we observe that the activation function (Eq. 4) in BBCU-V3 (Fig. 2 (c)) narrows the negative value range of the residual connection, which means it loses negative full-precision information.
Thus, we further develop BBCU-V4 by moving the activation function into the residual connection to avoid losing the negative full-precision information. The experiments (Sec. 4.5) show that this design is effective. We take BBCU-V4 as the final design of BBCU.
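The RPReLU activation and the surrogate sign gradient defined above can be sketched in NumPy (a minimal version; scalar parameters stand in for the per-channel learnable ones, and the names are ours):

```python
import numpy as np

def rprelu(y, gamma=0.0, zeta=0.0, beta=0.25):
    """RPReLU: shift the input by gamma, apply slope beta on the negative
    side, then shift the result by zeta (all learnable per channel in the
    paper; fixed scalars here for illustration)."""
    s = y - gamma
    return np.where(s > 0, s, beta * s) + zeta

def approx_sign_grad(x, alpha=0.0):
    """Piecewise-linear surrogate for d Sign(x)/dx used in the backward
    pass; the true derivative is an impulse and cannot be trained through.
    The surrogate is a triangle peaking at 2 around the threshold alpha."""
    d = x - alpha
    g = np.zeros_like(d, dtype=float)
    left = (d >= -1) & (d < 0)
    right = (d >= 0) & (d < 1)
    g[left] = 2 + 2 * d[left]
    g[right] = 2 - 2 * d[right]
    return g
```

Note that the surrogate vanishes outside a unit window around the threshold, so only activations near the decision boundary receive gradient, which is the usual straight-through-estimator behavior.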

3.2. ARM IR NETWORKS WITH BBCU

As shown in Fig. 3 (b), we further design different variants of BBCU for these four parts. (1) For the head H part, the input is I_LQ ∈ R^{3×H×W}, while the binary convolution outputs C channels. Thus, we cannot directly add I_LQ to the binary convolution output due to the channel mismatch. To address this issue, we develop BBCU-H by repeating I_LQ to C channels. (2) For the body B part, since the input and output channel numbers are the same, we develop BBCU-B by directly adding the input activation to the binary convolution output. (3) For the upsampling U part, we develop BBCU-U by repeating the channels of the input activations before adding them to the binary convolution output. (4) For the tail T part, we develop BBCU-T by adopting I_LQ as the residual connection. To better evaluate the benefit of binarizing each part in terms of computations and parameters, we define two metrics:

$$V_C = \frac{\mathrm{PSNR}_f - \mathrm{PSNR}_b}{\mathrm{OPs}_f - \mathrm{OPs}_b}, \qquad V_P = \frac{\mathrm{PSNR}_f - \mathrm{PSNR}_b}{\mathrm{Parms}_f - \mathrm{Parms}_b},$$

where PSNR_b, OPs_b, and Parms_b denote the performance, calculations, and parameters after binarizing one part of the network, and PSNR_f, OPs_f, and Parms_f measure the full-precision network. We adopt a reconstruction loss L_rec to guide the image restoration training, defined as:

$$L_{rec} = \big\| I_{HQ} - \hat{I}_{HQ} \big\|_1,$$

where I_HQ and Î_HQ are the ground-truth and restored HQ images, respectively, and ∥·∥_1 denotes the L1 norm.
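The channel-repeating residual used by the head/upsampling variants and the two benefit metrics can be sketched as follows (helper names are ours, not the paper's):

```python
import numpy as np

def repeat_residual(x_in, c_out):
    """BBCU-H/BBCU-U style residual: tile the input along the channel axis
    so a C_in-channel input can be added to a C_out-channel binary-conv
    output. Shapes are (C, H, W)."""
    c_in = x_in.shape[0]
    reps = -(-c_out // c_in)               # ceil(c_out / c_in)
    return np.tile(x_in, (reps, 1, 1))[:c_out]

def binarization_benefit(psnr_f, psnr_b, cost_f, cost_b):
    """V_C (cost = OPs) or V_P (cost = params): PSNR lost per unit of cost
    saved when binarizing one part; smaller means the part is a better
    candidate for binarization."""
    return (psnr_f - psnr_b) / (cost_f - cost_b)

# a 3-channel LQ image tiled to match a 64-channel conv output
assert repeat_residual(np.ones((3, 4, 4)), 64).shape == (64, 4, 4)
```

Tiling (rather than a learned 1×1 projection) keeps the residual branch entirely free of multiplications, which matters for a fully binarized head.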

4.1. EXPERIMENT SETTINGS

Training Strategy. We apply our method to three typical image restoration tasks: image super-resolution, color image denoising, and image compression artifacts reduction. We train all models on DIV2K (Agustsson & Timofte, 2017), which contains 800 high-quality images. Besides, we adopt widely used test sets for evaluation and report PSNR and SSIM. For image super-resolution, we take the simple and practical SRResNet (Ledig et al., 2017) as the backbone. The mini-batch contains 16 images of size 192×192 randomly cropped from the training data. We set the initial learning rate to 1×10^-4, train the models for 300 epochs, and halve the learning rate every 200 epochs. For image denoising and compression artifacts reduction, we take DnCNN and DnCNN-3 as backbones (Zhang et al., 2017). The mini-batch contains 32 images of size 64×64 randomly cropped from the training data. We set the initial learning rate to 1×10^-4, train the models for 300,000 iterations, and halve the learning rate every 200,000 iterations. The amplification factor k in the residual alignment is set to 130. We implement our models on a Tesla V100 GPU.

Our BBCU also achieves the best results on the Live1 and Classic5 test sets at q = 40. In addition, our BBCU surpasses DnCNN-3-Lite by 0.16dB and 0.2dB on these benchmarks at q = 10. The visual comparisons are provided in the appendix.

Basic Binary Convolution Unit. To validate BBCU for the binarized IR network, we binarize the body part of SRResNet with the four variants of BBCU (Fig. 2) separately. The results are shown in Tab. 4. (1) Compared with BBCU-V1, BBCU-V2 declines by 0.14dB and 0.24dB on Urban100 and Manga109. This is because BBCU-V2 simply removes the BN, which makes the value range of the binary Conv far larger than that of the residual connection and covers the full-precision information (Fig. 2).
(2) BBCU-V3 adds the residual alignment scheme to BBCU-V2, which addresses the value range imbalance between the binary Conv and the residual connection while removing the BN. Since BN is harmful to IR networks, BBCU-V3 surpasses BBCU-V1 by 0.17dB, 0.16dB, and 0.29dB on Set5, Urban100, and Manga109, respectively. (3) BBCU-V4 moves the activation function into the residual connection, which preserves the full-precision negative values of the residual connection (Fig. 2). Thus, BBCU-V4 outperforms BBCU-V3.

Residual Connection. We take SRResNet as the backbone with 32 Convs in the body part and explore the relationship between performance and the number of binary convolutions sharing one residual connection. Specifically, for both full-precision and binarized SRResNet, we set 1, 2, 4, 8, 16, and 32 Convs within a residual connection as the basic convolution unit, respectively. We evaluate SRResNet equipped with these basic convolution units on 4× Urban100 (see Fig. 5).

4.5. ABLATION STUDY

(1) For full-precision networks, it is best to have 2 Convs with a residual connection as the basic convolution unit. Besides, with more than 4 Convs in a basic convolution unit, performance drops sharply. For binarized networks, it is best to equip each binary convolution with its own residual connection. In addition, compared with the full-precision network, the binarized network is less sensitive to the growth of the number of Convs in the basic convolution unit. (2) For binarized SRResNet, we delete the residual connection of a BBCU at a certain position. As shown in Tab. 5, once the residual connection is removed (i.e., the full-precision information is lost) at any position, the binarized SRResNet suffers a severe performance drop.

Amplification Factor. As shown in Fig. 6, the performance of the full-precision network remains basically unchanged under variation of the amplification factor k. However, the binarized network is sensitive to k. The best k* is related to the number of channels n, empirically fitting k* = 130n/64. Intuitively, a larger number of channels makes the binary convolution count more elements and produce larger outputs, which requires a larger k to raise the value range of the residual connection for balance.

The Binarization Benefit for Different Parts of the IR Network. We binarize one part of SRResNet with BBCU (Fig. 3) while keeping the other parts full-precision. The results are shown in Tab. 6. We use V_C (Eq. 10) and V_P (Eq. 11) as metrics to evaluate the binarization benefit of the different parts. We can see that the upsampling part is the most worthwhile to binarize. However, it is ignored by previous works (Xin et al., 2020). The binarization benefit of the first and last convolutions is relatively low.
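The residual alignment and the empirical rule k* = 130n/64 can be illustrated with a short sketch: scaling the input by any k > 0 leaves the sign pattern seen by the binary convolution unchanged, while enlarging the residual branch's value range (variable names are ours):

```python
import numpy as np

def best_amplification(n_channels):
    """Empirical rule from the ablation: k* = 130 * n / 64."""
    return 130.0 * n_channels / 64.0

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=1000)   # residual values, roughly in [-1, 1]
k = best_amplification(64)              # k = 130 for a 64-channel body

# the binary branch is unchanged: Sign(k*x) == Sign(x) for k > 0
assert np.array_equal(np.sign(k * x), np.sign(x))
# while the residual branch's magnitude grows toward the bit-count range
assert np.abs(k * x).max() > np.abs(x).max()
```

This is why RA can replace BN here: it fixes the range mismatch by amplifying the full-precision branch instead of shrinking the binary one.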

5. CONCLUSION

This work is devoted to exploring the performance of BNNs on low-level tasks and searching for a generic and efficient basic binary convolution unit. Through decomposing and analyzing the existing elements, we propose BBCU, a simple yet effective basic binary convolution unit that outperforms existing state-of-the-art methods with high efficiency. Furthermore, we divide IR networks into four parts and specially develop variants of BBCU for them to explore the binarization benefits. Overall, BBCU provides insightful analyses of BNNs and can serve as a strong basic binary convolution unit for future binarized IR networks, which is meaningful to academic research and industry.



(a) Super-Resolution. Take SRResNet as backbone and test on 4× Urban100. (b) Denoising. Take DnCNN as backbone and test on Urban100 with σ = 15. (c) Deblocking. Take DnCNN3 as backbone and test on Classic5 with q = 10.

Figure 1: Our BBCU achieves the SOTA performance on IR tasks with efficient computation.

Figure 2: The illustration of the improvement process of our BBCU. (a) The initial BBCU design. (b) We remove BatchNorm to explore its actual function in IR tasks; we find that BatchNorm is essential because it balances the value-range gap between the residual connection and binary convolution. (c) We further propose residual alignment (RA), multiplying the input image by an amplification factor k, to address the value-range gap. (d) Based on BBCU-V3, we move the activation function into the residual connection to reduce the loss of negative value information.

As shown in Fig. 3 (a), existing image restoration (IR) networks can be divided into four parts: head H, body B, upsampling U, and tail T. If an IR network does not need to increase the resolution (Zhang et al., 2017), the upsampling U part can be removed. Specifically, given an LQ input I_LQ, the process of the IR network restoring the HQ output Î_HQ can be formulated as:

$$\hat{I}_{HQ} = \mathcal{T}(\mathcal{U}(\mathcal{B}(\mathcal{H}(I_{LQ})))). \tag{9}$$

Previous BNN SR works (Xin et al., 2020; Jiang et al., 2021; Zhang et al., 2021b) concentrate on binarizing the body B part. However, the upsampling U part accounts for 52.3% of the total calculations and is essential to binarize. Besides, the binarized head H and tail T parts are also worth exploring.
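Eq. 9 is simply a composition of the four stages; a toy sketch (with hypothetical stage callables standing in for the real sub-networks) makes the structure explicit:

```python
def restore(i_lq, head, body, up, tail):
    """Eq. 9 as a composition: Î_HQ = T(U(B(H(I_LQ)))). For tasks whose
    resolution is unchanged (e.g., denoising), pass the identity for `up`."""
    return tail(up(body(head(i_lq))))

# toy check with scalar stand-ins for the four parts
identity = lambda x: x
assert restore(1.0, lambda x: x + 1, lambda x: 2 * x, identity, lambda x: x - 1) == 3.0
```

In a real network each stage would be a module (convolutions, BBCU blocks, PixelShuffle), but the dataflow is exactly this chain.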

Figure 3: The illustration of full-precision and binary IR networks. (a) The IR network can be generally divided into four parts: head, body, upsampling, and tail. Notably, for IR tasks whose resolution remains unchanged, such as denoising and deblocking, we can ignore the upsampling part. (b) We equip different parts of binarized IR networks with the variants of BBCU.

OPs and Parameters Calculation of BNN. Following Rastegari et al. (2016) and Liu et al. (2018), the operations of a BNN (OPs_b) are calculated as OPs_b = OPs_f / 64 (where OPs_f denotes FLOPs), and the parameters of a BNN (Parms_b) are calculated as Parms_b = Parms_f / 32.
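These accounting rules amount to simple divisions; a small sketch (helper names are ours):

```python
def bnn_ops(ops_f):
    """OPs of a binarized layer: 1/64 of the full-precision FLOPs, since
    64 XNOR + popcount operations can run per cycle on bit-packed words."""
    return ops_f / 64.0

def bnn_params(params_f):
    """1-bit weights vs. 32-bit floats: a 32x storage reduction."""
    return params_f / 32.0

# e.g. fully binarizing a 10-GFLOPs, 1.5M-parameter body part:
assert bnn_ops(10.0) == 0.15625      # GFLOPs-equivalent OPs
assert bnn_params(1.5) == 0.046875   # millions of parameter-equivalents
```

In practice only the binarized parts of the network get these factors; the remaining full-precision layers are counted at their original cost.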

Figure 4: Visual comparison of BNNs for 4× image super-resolution.

Besides, we reduce the number of channels of SRResNet to 12, obtaining SRResNet-Lite. In addition, we use Set5 (Bevilacqua et al., 2012), Set14 (Zeyde et al., 2010), B100 (Martin et al., 2001), Urban100 (Huang et al., 2015), and Manga109 (Matsui et al., 2017) for evaluation. The quantitative results (PSNR and SSIM), the number of parameters, and the operations of different methods are shown in Tab. 1. Compared with other binarized methods, our BBCU-L achieves the best results on all benchmarks and all scale factors. Specifically, for 4× SR, our BBCU-L surpasses ReActNet by 0.25dB, 0.27dB, and 0.44dB on Set5, Urban100, and Manga109, respectively. For 2× SR, BBCU-L also achieves 0.32dB, 0.27dB, and 0.58dB improvements on these three benchmarks compared with ReActNet. Furthermore, our BBCU-M achieves the second-best performance on most benchmarks while consuming 5% of the operations and 13.7% of the parameters of ReActNet (Liu et al., 2020) on 4× SR. Besides, BBCU-L significantly outperforms SRResNet-Lite by 0.16dB and 0.3dB with less computational cost, showing the superiority of BNNs. The qualitative results are shown in Fig. 4; our BBCU-L produces the best visual quality, containing more realistic details closer to the respective ground-truth HQ images. More qualitative results are provided in the appendix.

4.3. EVALUATION ON IMAGE DENOISING

For image denoising, we use DnCNN (Zhang et al., 2017) as the backbone and binarize its body B part with BNN methods, including BNN, Bi-Real, IRNet, BTM, ReActNet, and our BBCU. We also develop DnCNN-Lite by reducing the number of channels of DnCNN from 64 to 12. The standard benchmarks Urban100 (Huang et al., 2015), BSD68 (Martin et al., 2001), and Set12 (Shan et al.,

Figure 6: The effect of amplification factor on SRResNet with various number of channels.

Quantitative comparison (average PSNR/SSIM) with BNNs for classical image Super-Resolution on benchmark datasets. Best and second best performance among BNNs are in red and blue colors, respectively. OPs are computed based on LQ images with a resolution of 320×180.

Quantitative comparison (average PSNR) with BNNs for classical image denoising on benchmark datasets. Best and second best performance among BNNs are in red and blue colors, respectively. OPs are computed based on LQ images with a resolution of 320×180.

Quantitative comparison (average PSNR/SSIM) with BNNs for classical JPEG compression artifact reduction. Best and second best performance among BNNs are in red and blue colors, respectively. OPs are computed based on LQ images with a resolution of 320×180.

applied to evaluate each method. Additive white Gaussian noise (AWGN) with different noise levels σ (15, 25, 50) is added to the clean images. The quantitative results of image denoising are shown in Tab. 2. As one can see, our BBCU achieves the best performance among the compared BNNs. In particular, our BBCU surpasses the state-of-the-art BNN model ReActNet by 0.82dB, 0.88dB, and 1dB on the CBSD68, Kodak24, and Urban100 datasets, respectively, at σ = 15. Compared with DnCNN-Lite, our BBCU surpasses it by 0.92dB, 1.03dB, and 1.5dB on these three benchmarks at σ = 15 while consuming 58% of the computations of DnCNN-Lite. Qualitative results are provided in the appendix.

4.4. EVALUATION ON JPEG COMPRESSION ARTIFACT REDUCTION

For JPEG compression deblocking, we use the practical DnCNN-3 (Zhang et al., 2017) as the backbone and replace its full-precision body B part with competitive BNN methods, including BNN, Bi-Real, IRNet, BTM, ReActNet, and our BBCU. The compressed images are generated by the Matlab standard JPEG encoder with quality factors q ∈ {10, 20, 30, 40}. We take the widely used LIVE1 (Sheikh, 2005) and Classic5 (Foi et al., 2007) as test sets to evaluate the performance of each method. The quantitative results are presented in Tab. 3. As we can see, our BBCU achieves the best performance on all test sets and quality factors among all compared BNNs. Specifically, our BBCU surpasses the state-of-the-art BNN model ReActNet by 0.38dB and 0.34dB on the Live1 and Classic5 test sets at q = 40, respectively.

PSNR (dB) values (4×) for the four variants of the basic binary convolution unit (BBCU).

The breakpoint position of the residual connection.

The binarization benefit of different parts in IR networks. We test models on Set14, and PSNR f is 28.60dB.

ACKNOWLEDGMENTS

This work was partly supported by the Alexander von Humboldt Foundation, the National Natural Science Foundation of China (No. 62171251), the Natural Science Foundation of Guangdong Province (No. 2020A1515010711), the Special Foundations for the Development of Strategic Emerging Industries of Shenzhen (Nos. JCYJ20200109143010272 and CJGJZD20210408092804011), and the Oversea Cooperation Foundation of Tsinghua.

CODE AVAILABILITY

//github.com/

