STEGO NETWORKS: INFORMATION HIDING ON DEEP NEURAL NETWORKS

Abstract

The best way to keep a secret is to pretend there is none. In this spirit, a class of techniques called steganography aims to hide secret messages in various media while leaving as little detectable trace as possible. This paper considers neural networks as a novel steganographic cover medium, which we call stego networks, that can be used to hide one's secret messages. Although there have been numerous attempts to hide information in the outputs of neural networks, techniques for hiding information in the network parameters themselves have not been actively studied in the literature. The widespread deployment of deep learning models on cloud computing platforms and millions of mobile devices makes the safety issues surrounding stego networks important to deep learning researchers and practitioners. In response, this paper presents the advantages of stego networks over other types of stego media in terms of security and capacity. We observe that the fraction bits of typical network parameters in a floating-point representation tend to follow a uniform distribution, and we explain how this helps a secret sender embed encrypted messages that are indistinguishable from the original content. We demonstrate that network parameters can embed a large amount of secret information: even the most significant fraction bits can be used to hide secrets without inducing noticeable performance degradation, which also makes it significantly harder to remove the secrets by perturbing insignificant bits. Finally, we discuss possible use cases of stego networks and methods to detect or remove secrets from them.

1. INTRODUCTION

Just as it goes without saying that knowledge is power, inventing methods for keeping and selectively conveying secret messages has been a crucial mission throughout the history of humanity. Among the various methods for protecting secrets, an effective approach called steganography makes it difficult to detect the very existence of the secrets in an innocuous-looking object. The object containing the secrets is called a stego medium in the context of steganography. Starting with the case of hiding a secret message as a tattoo engraved on a hidden part of the human body in ancient Greece, numerous methods (e.g., using invisible inks, writing tiny letters) have been employed to transmit information without leaving detectable footprints (Kahn, 1996). Most recently, digital steganography, which embeds secret messages in digital images or audio files, has been actively developed. Traditional steganography is typically used in communication between two individuals, but steganography in digital media enables a brand-new usage: conveying secrets to a multitude of devices and unknowingly influencing their behavior when accompanied by a small decoding routine. The secrets in this scenario are often called stegomalware (Nagaraja et al., 2011; Suarez-Tangil et al., 2014).

Meanwhile, deep neural networks (DNNs) have shown remarkable success in various areas over the years, and are now being applied in industry and the consumer sector as well as in academia. DNNs have been deployed in a variety of computing systems, ranging from large-scale cloud computing systems to millions of mobile devices (Howard et al., 2017). More and more mobile devices run application programs that include deep learning models, with the numerous camera filter and speech recognition applications being good examples.
[Figure 1: Fraction bit distributions of image classification model parameters and byte distributions of various file formats (JPEG, MP3, and PDF). We visualize the distributions of the fraction bits of the parameters in three plots: three subsequences of the 23 fraction bits evaluated at their decimal values (top row, first three), and the distributions of the entire parameters of each model (top row, last). Next, we plot the byte distributions of the original and the encrypted files (bottom row, first two); the plotted values are averaged over 100 files for each file format. Finally, the probability of each bit in the floating-point representation being one is visualized (bottom row, last). An interesting observation is that the fraction bits of the five architectures follow nearly identical distributions, whereas the shapes of the parameter distributions are noticeably different; the exponent bits of the floating-point numbers are mainly responsible for this discrepancy.]

Furthermore, building upon existing large pre-trained models, such as ResNet (He et al., 2016), BERT (Devlin et al., 2019), and GPT-3 (Brown et al., 2020), rather than training complete neural networks from scratch has become a trend in deep learning research. Accordingly, files containing neural network parameters are uploaded to source code repositories and FTP servers, and are frequently exchanged among individuals and organizations. The fraction bits, sometimes called the significand or the mantissa, of a floating-point number follow a distribution close to uniform, as shown in Figure 1. A secret sender can therefore easily embed encrypted messages, which also typically follow a uniform distribution, without arousing much suspicion from static analysis tools.
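The uniformity claim above is easy to check empirically. The following sketch reinterprets float32 values as raw bits and measures how often each of the 23 fraction bits is set; the synthetic Gaussian weights are a stand-in assumption for real pretrained parameters, which the paper analyzes in Figure 1.

```python
import numpy as np

# Hypothetical stand-in for trained parameters: small-variance Gaussian
# float32 values, roughly the shape of typical weight distributions.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=100_000).astype(np.float32)

# Reinterpret each float32 as a uint32; the low 23 bits are the fraction
# (mantissa) field of the IEEE 754 single-precision format.
raw = weights.view(np.uint32)
fraction = raw & 0x007FFFFF

# Probability of each fraction bit (bit 0 = least significant) being 1.
# Low-order bits come out very close to 0.5, i.e., nearly uniform.
p_one = [float(((fraction >> k) & 1).mean()) for k in range(23)]
print([round(p, 3) for p in p_one])
```

Because the low-order fraction bits of such values are close to fair coin flips, a ciphertext stream (which is also near-uniform) written into them is statistically hard to distinguish from the original bits.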
Also, since pre-trained neural network parameters are often large, frequently exceeding hundreds of megabytes, they are a suitable medium for exchanging a nontrivial amount of secrets. In this paper, we analyze the distribution of the fraction bits of typical neural network parameters. The fraction bits, which carry the least significant information of a floating-point number, can conveniently be used to embed secret messages. We experimented with a special kind of weight perturbation that simulates general cases of hiding secrets in network parameters, and explored several methods to inject arbitrary data into neural network parameters without noticeable performance degradation. We empirically show that steganography in the least significant fraction bits is readily applicable, and that steganography even in the most significant fraction bits is possible. Our main contributions are as follows:

• We demonstrate the suitability of neural network parameters as a novel steganographic medium.

• We propose novel approaches to effectively embed secrets in neural network parameters.

• We give a comprehensive analysis of stego networks in terms of security and capacity, and discuss the advantages of stego networks over conventional stego media.
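To make the least-significant-fraction-bit embedding mentioned above concrete, here is a minimal sketch (not the paper's actual method): it writes a payload into the lowest n_bits fraction bits of each float32 weight and reads it back. All function names and the 8-bit-per-weight budget are illustrative assumptions; a real scheme would encrypt the payload first so the embedded bits remain indistinguishable from the cover.

```python
import numpy as np

def embed_lsb(weights, payload, n_bits=8):
    """Hide `payload` bytes in the `n_bits` least significant fraction
    bits of each float32 weight (illustrative sketch, not the paper's
    scheme)."""
    flat = weights.astype(np.float32).ravel()   # astype returns a copy
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    assert bits.size <= flat.size * n_bits, "payload too large for cover"
    ints = flat.view(np.uint32)                 # shares memory with flat
    for i, b in enumerate(bits):
        w, k = divmod(i, n_bits)                # weight index, bit position
        ints[w] = (ints[w] & ~np.uint32(1 << k)) | (np.uint32(b) << k)
    return flat.reshape(weights.shape)

def extract_lsb(weights, n_bytes, n_bits=8):
    """Recover `n_bytes` of payload written by embed_lsb."""
    ints = weights.astype(np.float32).ravel().view(np.uint32)
    bits = np.empty(n_bytes * 8, dtype=np.uint8)
    for i in range(bits.size):
        w, k = divmod(i, n_bits)
        bits[i] = (ints[w] >> k) & 1
    return np.packbits(bits).tobytes()

rng = np.random.default_rng(0)
cover = rng.normal(0.0, 0.05, size=(64, 64)).astype(np.float32)
secret = b"top secret"
stego = embed_lsb(cover, secret)
print(extract_lsb(stego, len(secret)))       # recovered payload
print(float(np.max(np.abs(stego - cover))))  # distortion is tiny
```

For weights of magnitude around 0.05, flipping the lowest 8 of the 23 fraction bits perturbs each value by at most a few times 1e-6, which is why such embeddings typically leave model accuracy unaffected.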

