EFFICIENT HYPERDIMENSIONAL COMPUTING

Abstract

Hyperdimensional computing (HDC) uses high-dimensional binary vectors to perform classification. Due to its simplicity and massive parallelism, HDC can be highly energy-efficient and is well-suited for resource-constrained platforms. However, in trading off orthogonality against efficiency, hypervectors may use tens of thousands of dimensions. In this paper, we examine the necessity of such high dimensions. In particular, we give a detailed theoretical analysis of the relationship among hypervector dimension, accuracy, and orthogonality. The main conclusion of this study is that a much lower dimension, typically less than 100, can achieve similar or even higher detection accuracy than state-of-the-art HDC models. Based on this insight, we propose a suite of novel techniques to build HDC models that use binary hypervectors with dimensions orders of magnitude smaller than those found in state-of-the-art HDC models, yet yield equivalent or even improved accuracy and efficiency¹. For image classification, we achieve an HDC accuracy of 96.88% with a dimension of only 32 on the MNIST dataset. We further explore our methods on more complex datasets such as CIFAR-10 and show the limits of HDC.

1. INTRODUCTION

Hyperdimensional computing (HDC) is an emerging learning paradigm inspired by an abstract representation of neuron activity in the human brain using high-dimensional binary vectors. Compared with other well-known training methods such as artificial neural networks (ANNs), HDC has the advantages of high parallelism and low energy consumption (low latency). This makes HDC well suited to resource-constrained applications such as electroencephalogram detection, robotics, language recognition, and federated learning (Hsieh et al., 2021; Asgarinejad et al., 2020; Neubert et al., 2019; Rahimi et al., 2016). HDC is also easy to implement in hardware (Schmuck et al., 2019; Salamat et al., 2019).

Unfortunately, the practical deployment of HDC suffers from low model accuracy and is often restricted to small and simple datasets. One commonly used remedy is to increase the hypervector dimension (Neubert et al., 2019; Schlegel et al., 2022; Yu et al., 2022). For example, on the MNIST dataset, hypervector dimensions of 10,000 are often used; Duan et al. (2022) and Yu et al. (2022) achieved state-of-the-art accuracies of 94.74% and 95.4%, respectively, this way. In these and other state-of-the-art HDC works, hypervectors are randomly drawn from the hyperspace {-1, +1}^d, where the dimension d is very high. This ensures high orthogonality, making the hypervectors more independent and easier to distinguish from each other (Thomas et al., 2020). As a result, accuracy is improved and more complex application scenarios can be targeted. However, the price paid for the higher dimension is higher energy consumption, possibly negating the advantage of HDC altogether (Neubert et al., 2019). This paper addresses this tradeoff.

In this paper, we analyze the relationship between hypervector dimension and accuracy, as well as between dimension and orthogonality. In our analysis, we found that strict orthogonality can be obtained for small d.
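The orthogonality argument above can be checked empirically: for random vectors drawn from {-1, +1}^d, the expected |cosine similarity| between a pair shrinks like 1/sqrt(d), so higher dimensions yield near-orthogonal (more distinguishable) hypervectors. The following sketch (an illustration we add here, not code from the paper) measures this concentration:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_cosine(d, n_pairs=2000):
    """Average |cosine similarity| over random pairs of {-1,+1}^d hypervectors."""
    a = rng.choice([-1, 1], size=(n_pairs, d))
    b = rng.choice([-1, 1], size=(n_pairs, d))
    # For +/-1 vectors, cosine similarity is the dot product divided by d.
    return np.mean(np.abs(np.sum(a * b, axis=1)) / d)

# Expected |cosine| shrinks like 1/sqrt(d): higher d -> closer to orthogonal.
for d in (64, 1024, 10000):
    print(d, mean_abs_cosine(d))
```

This is exactly the quasi-orthogonality that state-of-the-art HDC models buy with d = 10,000; the analysis in this paper argues that it can be obtained far more cheaply.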
We show that a dimension d of only 2^⌈log2 n⌉ is sufficient to yield n vectors in {-1, 1}^d with strict orthogonality; dimensions higher than that are not necessary. If we relax orthogonality to ε-quasi-orthogonality (Kainen & Krkova, 2020), constructing the hypervectors becomes even easier. Further, while it is intuitively true that high dimensions lead to high orthogonality (Thomas et al., 2020), we found, contrary to popular belief, that as the dimension d of the hypervectors increases, the upper bound on inference accuracy actually decreases (Statement 3.1 and Statement 3.2). In particular, if the hypervector dimension d is sufficient to represent a vector with K classes (d > log2 K), then the lower the dimension, the higher the accuracy. The key insight of our work is this: in HDC, it is not the higher dimension that determines accuracy, and the orthogonality required for a given problem can be achieved at lower hypervector dimensions using our proposed techniques.

Based on this analysis, we propose a combination of a novel trainable binary kernel-based encoder with the majority rule (shown in Figure 3) that reduces the hypervector dimension significantly while maintaining state-of-the-art accuracy. On the MNIST dataset, HDC accuracies of 96.88%/97.23% were achieved with hypervector dimensions of only 32/64. The total number of calculation operations of our method is a mere 7% of that of previous state-of-the-art works, where hypervector dimensions of 10,000 or more were needed. We further explored our methods on CIFAR-10, where an HDC accuracy of 46.18% was achieved. Both our analysis and experiments show that the dimensions of 5,000 or even 10,000 used by the state-of-the-art in HDC are not necessary.

The contributions of this paper are as follows:

• We give a comprehensive analysis of the relationship between hypervector dimension and the accuracy of HDC. Both the worst-case and average-case accuracy are studied.
Mathematically, we explain why relatively lower dimensions can yield higher model accuracies, which contradicts the standard assumption in HDC. Furthermore, the relationship between orthogonality and hypervector dimension is also discussed. Based on this analysis, we can reduce the dimension by nearly three orders of magnitude.

• We introduce a kernel-based binary encoder and two HDC retraining algorithms. With these techniques, we achieve higher detection accuracies with much smaller hypervector dimensions (lower latency) and better orthogonality than the state-of-the-art.

Organization. This paper is organized as follows. First, the basic workflow and background of HDC are introduced. Then, we describe our main dimension-accuracy and dimension-orthogonality analysis in Section 3. In Section 4, we present a trainable binary encoder and two HDC retraining approaches that improve accuracy while at the same time reducing energy consumption. We then show our experimental results and a comparison with state-of-the-art HDC models in Section 5, followed by a discussion and conclusion.
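The introduction's claim that d = 2^⌈log2 n⌉ suffices for n strictly orthogonal vectors in {-1, 1}^d can be illustrated with Sylvester's recursive construction of Hadamard matrices, whose rows are pairwise orthogonal ±1 vectors. This is a minimal sketch under that assumption; the paper's own construction may differ:

```python
import numpy as np

def orthogonal_hypervectors(n):
    """Return n mutually orthogonal {-1,+1} vectors of dimension d = 2^ceil(log2 n),
    built via the Sylvester recursion H_{2k} = [[H_k, H_k], [H_k, -H_k]]."""
    d = 1
    H = np.array([[1]])
    while d < n:
        H = np.block([[H, H], [H, -H]])  # doubles the order, preserves orthogonality
        d *= 2
    return H[:n]  # any subset of rows of a Hadamard matrix is pairwise orthogonal

# 10 strictly orthogonal hypervectors need only d = 16 = 2^ceil(log2 10).
V = orthogonal_hypervectors(10)
G = V @ V.T  # Gram matrix: d on the diagonal, 0 everywhere else
```

For n = 10 classes this yields d = 16, in stark contrast to the 10,000 dimensions commonly used to obtain merely approximate orthogonality.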

2. BACKGROUND

Hyperdimensional computing encodes data into binary hypervectors with typical dimensions of 5,000 to 10,000. Using the MNIST dataset as an example, HDC encodes one float32 image f = (f_0, f_1, ..., f_783) into a hypervector by binding and adding value hypervectors v and position hypervectors p. Both kinds of hypervectors are drawn independently and uniformly at random from the hyperspace {-1, +1}^d. Mathematically, we construct the representation r of each image as follows:

r = sgn(v_{f_0} ⊙ p_{f_0} + v_{f_1} ⊙ p_{f_1} + ... + v_{f_783} ⊙ p_{f_783}),

where sgn(·) is the sign function that binarizes the sum of hypervectors and returns -1 or 1 (sgn(0) is randomly assigned to 1 or -1), and ⊙ is the binding operation, which performs coordinate-wise (element-wise) multiplication. For example, [-1, 1, 1, -1] ⊙ [1, 1, 1, -1] = [-1, 1, 1, 1]. For training, all hypervectors r_1, ..., r_60,000 of the same digit are added together. The majority rule is then used to generate the representation R_c for class c:

R_c = sgn(Σ_{i∈c} r_i).

For inference, the encoded test image is compared with the representation R_c of each class, and the most similar one is selected. Cosine similarity, L2 distance, and Hamming distance are commonly used similarity measures.
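The encode / majority-rule / nearest-class workflow above can be sketched end to end. This is an illustrative implementation under assumptions not fixed by the text: pixel values are quantized to 256 levels to index the value hypervectors, and the position hypervector is indexed by pixel position (a common HDC convention; the paper's exact binding may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10000            # hypervector dimension (typical range cited: 5,000-10,000)
N_PIXELS = 784       # MNIST image size
N_LEVELS = 256       # assumed quantization of float pixel intensities

# Value and position hypervectors, drawn i.i.d. from {-1,+1}^D.
value_hv = rng.choice([-1, 1], size=(N_LEVELS, D))
pos_hv = rng.choice([-1, 1], size=(N_PIXELS, D))

def sgn(x):
    """Sign function with sgn(0) randomly assigned to -1 or 1, as in the text."""
    s = np.sign(x)
    s[s == 0] = rng.choice([-1, 1], size=np.count_nonzero(s == 0))
    return s.astype(int)

def encode(image):
    """Bind each pixel's value hypervector with a position hypervector, add, binarize."""
    levels = (image * (N_LEVELS - 1)).astype(int)        # quantize floats in [0,1]
    return sgn((value_hv[levels] * pos_hv).sum(axis=0))  # binding = elementwise product

def train(images, labels, n_classes=10):
    """Majority rule: R_c = sgn(sum of encoded images of class c)."""
    acc = np.zeros((n_classes, D))
    for img, y in zip(images, labels):
        acc[y] += encode(img)
    return sgn(acc)

def predict(R, image):
    r = encode(image)
    # For +/-1 vectors, the largest dot product is the smallest Hamming distance.
    return int(np.argmax(R @ r))
```

Note that every operation after encoding is a sum, a sign, or a dot product on ±1 entries, which is what makes HDC cheap in hardware; the cost scales linearly with D, which is precisely the dimension this paper argues can be reduced by orders of magnitude.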



¹ https://anonymous.4open.science/r/LowHDC-F74B/README.md

