ESTIMATION OF NUMBER OF COMMUNITIES IN ASSORTATIVE SPARSE NETWORKS

Abstract

Most community detection algorithms assume the number of communities, K, to be known a priori. Among the various approaches that have been proposed to estimate K, the non-parametric methods based on the spectral properties of the Bethe Hessian matrices have garnered much popularity for their simplicity, computational efficiency, and robust performance irrespective of the sparsity of the input data. Recently, one such method has been shown to estimate K consistently if the input network is generated from the (semi-dense) stochastic block model, when the average of the expected degrees ($\bar{d}$) of all the nodes in the network satisfies $\bar{d} \gg \log(N)$ (N being the number of nodes in the network). In this paper, we prove some finite sample results that hold for $\bar{d} = o(\log(N))$, which in turn show that the estimation of K based on the spectra of the Bethe Hessian matrices is consistent not only for the semi-dense regime, but also for the sub-logarithmic sparse regime when $1 \ll \bar{d} \ll \log(N)$. Thus, our estimation procedure is a robust method for a wide range of problem settings, regardless of the sparsity of the network input.

1. INTRODUCTION

Statistical analysis of network data has become an extensively studied field within statistics and machine learning (see Goldenberg et al. (2010), Kolaczyk & Csárdi (2014), and Newman (2018) for reviews). Network datasets show up in several disciplines. Examples include networks originating from biosciences, such as gene regulation networks (Emmert-Streib et al., 2014), protein-protein interaction networks (De Las Rivas & Fontanillo, 2010), structural (Rubinov & Sporns, 2010) and functional (Friston, 2011) networks of the brain, and epidemiological networks (Reis et al., 2007); networks originating from social media such as Facebook, Twitter and LinkedIn (Faloutsos et al., 2010); citation and collaboration networks (Lehmann et al., 2003); and information and technological networks such as internet-based networks (Adamic & Glance, 2005), power networks (Pagani & Aiello, 2013) and cell-tower networks (Isaacman et al., 2011). There are several active areas of research in developing statistical methodologies for network data analysis and in deriving the theoretical properties of these methods. In this paper, we focus on networks with community structure and on finding the number of communities in networks with an arbitrary sparsity level.

The last two decades saw a resurgence of interest in a problem popularly known as "community detection". A common problem definition is to partition N nodes in a graph into K communities such that the edge density within communities differs from the edge density between communities, where K is assumed to be known a priori. Estimating the number of communities (K) has recently become an active topic in the literature. While the initial focus in the literature for estimating K has been on developing algorithms and drawing support from domain-specific intuition and empirical studies using the Stochastic Block Model (SBM), first proposed in Holland et al. (1983) (such as Saade et al. (2014a), Yan et al.
(2018)), there has been recent progress on attaining a theoretical understanding of the number of communities. Bickel & Sarkar (2015) and Lei et al. (2016) proposed hypothesis testing approaches based on principal eigenvalues or singular values. Some likelihood-based methods using the BIC criterion were proposed by Wang et al. (2017) and Hu et al. (2019). From a Bayesian perspective, Riolo et al. (2017) discussed priors for the number of communities under the SBM and designed a Markov Chain Monte Carlo algorithm, Kemp et al. (2006) presented a nonparametric Bayesian approach for detecting concept systems, Xu et al. (2006) introduced an infinite-state latent variable as part of a Dirichlet process mixture model, and Cerqueira & Leonardi (2020) proposed an estimator based on integrated likelihood for the SBM. Rosvall & Bergstrom (2007) introduced the concept of the minimum description length (MDL) to describe network modularities in partitioning networks, and Peixoto (2013) employed MDL to detect the number of communities. Chen & Lei (2018) and Li et al. (2020) proposed cross-validation based approaches with theoretical guarantees to estimate K. Yan et al. (2018) proposed a semi-definite programming approach, and Ma et al. (2018) proposed an estimator based on the loss of binary segmentation using a pseudo-likelihood ratio. All of these approaches had theoretical guarantees. However, all the theoretical results were obtained under the assumption that the mean density of the networks is greater than log(N).

Methods based on the spectrum of a certain class of matrices have become increasingly popular in recent years as non-parametric alternatives that are more computationally efficient and applicable to a wider range of settings. Most notably, the non-backtracking matrices (e.g., Krzakala et al. (2013), Saade et al. (2014b), Coste & Zhu (2019), Bordenave et al. (2015), Saade et al. (2016)) and the Bethe Hessian matrices (e.g., Saade et al.
(2015b), Lelarge (2018), Dall'Amico et al. (2019), Saade et al. (2015a), Dall'Amico et al. (2020), Saade et al. (2014a), Le & Levina (2015)) have received much attention due to their non-parametric form and competitive performance in the presence of degree heterogeneity and sparsity. In particular, unlike the non-backtracking operator, the Bethe Hessian is a real symmetric operator and hence offers additional computational advantages. Through simulations, Saade et al. (2014a) demonstrated that the Bethe Hessian outperformed the non-backtracking operator, belief propagation, and the adjacency matrix in clustering, in both accuracy and efficiency. Le & Levina (2015) proved the consistency of the method based on the spectrum of the Bethe Hessian operator in semi-dense regimes, i.e., with the expected degree $\bar{d} \gg \log(N)$ and the scalar parameter chosen from the two values commonly used in the literature based on heuristics for assortative and disassortative networks. However, other than the two candidate values and their variations, there are no other known values for the scalar parameter that ensure the consistency result in any regime. Furthermore, real-world networks are generally much sparser, and there is no theoretical result in the literature that guarantees the effectiveness of the Bethe Hessian operator in sparser regimes.

Our contribution. In this paper, we contribute to the theoretical understanding of the Bethe Hessian operator in estimating K for networks generated from the SBM in any regime, regardless of the sparsity. We have three main contributions.

• We show that the method of estimating K based on the spectral properties of the Bethe Hessian matrix ("spectral method") is statistically consistent, even in regimes sparser than those previously considered in the literature, with the expected degree $1 \ll \bar{d} \ll \log(N)$. The precise definition of $\bar{d}$ is given in §2.1.

• We provide the first-of-its-kind interval of values for the scalar parameter of the Bethe Hessian operator that serves as a sufficient condition for the spectral method to correctly estimate K asymptotically in network data.

• Through extensive simulations, we demonstrate that for any value chosen from the interval for the scalar parameter, the spectral method correctly estimates K in networks regardless of sparsity. We also consider the heuristics-based values commonly used in the literature for the scalar parameter in the context of the interval.

The paper is arranged as follows. We present the definitions and a formal problem statement in §2. We present our main theoretical result and a sketch of the proof in §3, followed by empirical methods in §4. The simulation results and concluding remarks are given in §5 and §6, respectively.

2. PRELIMINARIES

2.1. NOTATION

An adjacency matrix, denoted by A, is a random matrix whose rows and columns are labeled by nodes $i, j \in [N]$, where $A_{ij} = 1$ if there is an edge between nodes i and j and 0 otherwise, and $[N]$ denotes the set $\{1, \dots, N\}$. The mean observed degree is denoted by $\hat{d} := \frac{1}{N} 1_N^T A 1_N$ and the expected degree by $\bar{d} := \frac{1}{N} 1_N^T \mathbb{E}A\, 1_N$. $\lambda_\ell^{\downarrow}(A)$ denotes the $\ell$-th largest eigenvalue of A and $\lambda_\ell^{\uparrow}(A)$ denotes the $\ell$-th smallest eigenvalue of A.

2.2. THE STOCHASTIC BLOCK MODEL

The stochastic block model (SBM) is a simple generative model for network data that embeds a community structure in the $N \times N$ adjacency matrix A of the randomly generated network. The SBM has three parameters: (1) the number of communities K; (2) the membership vector $z = (z_1, \dots, z_N)$ that assigns a community label $z_i \in [K]$ to each node $i \in [N]$; and (3) the $K \times K$ connectivity probability matrix B, where the element $B_{ab}$ represents the probability of an edge between nodes belonging to communities a and b, for $a, b \in [K]$. $Z \in \mathbb{Z}_{\geq 0}^{N \times K}$ is defined as the community membership matrix such that $Z_{ij} = 1$ if node i belongs to community j and 0 otherwise. We denote the maximum expected degree by $d_{\max} := \max_{i \in [N]} \sum_{j=1}^{N} [(ZBZ^T)_{ij} - \mathrm{Diag}(ZBZ^T)_{ij}]$ and the maximum entry in matrix B by $d/N$, where $d := N \max_{a,b \in [K]} B_{ab}$. $\lambda$ denotes the smallest eigenvalue of the normalized B matrix, $\lambda := \lambda_K^{\downarrow}\left(\frac{N}{d} B\right)$. $\bar{A}$ is the expectation of A and is computed as $\bar{A} = ZBZ^T - \mathrm{Diag}(ZBZ^T)$. $\bar{D}$ is a diagonal matrix whose i-th diagonal entry is the sum of the i-th row of $\bar{A}$. Let $\mathbf{N}$ be the vector of true community sizes, and let $N_{\min}$ denote the number of nodes in the smallest community.

A network generated from the SBM with parameters K, B, Z is defined to be assortative if $B_{aa} > B_{ab}$ for all $a, b \in [K]$ with $a \neq b$, and if B has all positive eigenvalues (i.e., B has full rank K). The existing works in the literature on the spectral method referenced above have considered assortative networks, and we also consider assortative networks in this paper.
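As an illustration of the definitions above, the following is a minimal NumPy sketch of drawing one network from the SBM and checking the assortativity condition. The function names (`sample_sbm`, `expected_adjacency`, `is_assortative`) and the use of zero-based community labels are assumptions made for this example; they are not part of the paper.

```python
import numpy as np

def sample_sbm(z, B, rng):
    """Draw a symmetric 0/1 adjacency matrix without self-loops from an SBM.

    z   : length-N array of community labels in {0, ..., K-1} (zero-based here)
    B   : K x K symmetric matrix of edge probabilities
    rng : numpy.random.Generator, passed in for reproducibility
    """
    z = np.asarray(z)
    B = np.asarray(B, dtype=float)
    n = z.size
    P = B[np.ix_(z, z)]                            # P[i, j] = B[z_i, z_j] = (Z B Z^T)_ij
    upper = np.triu(rng.random((n, n)) < P, k=1)   # independent Bernoulli draws for i < j
    return (upper | upper.T).astype(int)           # symmetrise; the diagonal stays zero

def expected_adjacency(z, B):
    """A-bar = Z B Z^T - Diag(Z B Z^T), the expectation of A."""
    z = np.asarray(z)
    P = np.asarray(B, dtype=float)[np.ix_(z, z)]
    np.fill_diagonal(P, 0.0)
    return P

def is_assortative(B):
    """B_aa > B_ab for all a != b, and all eigenvalues of B are positive."""
    B = np.asarray(B, dtype=float)
    K = B.shape[0]
    dominant = all(B[a, a] > B[a, b] for a in range(K) for b in range(K) if a != b)
    return dominant and bool(np.all(np.linalg.eigvalsh(B) > 0))
```

For example, `sample_sbm(np.repeat([0, 1, 2], 20), B, np.random.default_rng(0))` draws a 60-node network with three equal-sized communities for a given 3 x 3 matrix `B`.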

2.3. THE BETHE HESSIAN MATRIX

The Bethe Hessian matrix associated with an adjacency matrix A is defined as

$$H_\zeta := (\zeta^2 - 1) I_N + D - \zeta A \qquad (2.1)$$

where $\zeta > 1$ is a real scalar parameter, $D := \mathrm{Diag}(A 1_N)$ is a diagonal matrix whose i-th diagonal entry corresponds to the degree of the i-th node, and $I_N$ is an identity matrix of dimension $N \times N$. As a real symmetric operator, $H_\zeta$ is analytically tractable and computationally efficient, and has a number of useful properties. Saade et al. (2014a) demonstrated that the community structure in A can be recovered by applying a standard clustering algorithm (such as k-means clustering) to the eigenvectors of $H_\zeta$ corresponding to negative eigenvalues. In the spectral clustering literature, those eigenvalues whose eigenvectors encode the community structure are known as the informative eigenvalues and have been observed to be well-separated from the bulk of the spectrum. In Saade et al. (2014a), $\zeta$ was set to be the square root of the mean observed degree as a heuristic to render informative (negative) eigenvalues of $H_\zeta$. Le & Levina (2015) showed that the number of informative eigenvalues of $H_\zeta$ directly estimates K in the semi-dense regime ($\bar{d} \gg \log(N)$) when $\zeta$ is set to be either $r_m := \sqrt{(d_1 + \dots + d_N)^{-1}(d_1^2 + \dots + d_N^2) - 1}$ or $r_a := \sqrt{(d_1 + \dots + d_N)/N}$, where $d_i$ is the observed degree of node i. Both $r_m$ and $r_a$ are obtained based on heuristic arguments and are commonly used in the literature to estimate the radius of the bulk of the spectra. $r_a$ was considered in Saade et al. (2014a), and the choice of $r_m$ stems from the deep connection between the spectrum of $H_\zeta$ and that of another matrix known as the non-backtracking operator B. Denoting by m the number of edges in A, B is a $2m \times 2m$ matrix indexed by directed edges $i \to j$ and defined by $B_{i \to j, k \to l} = \delta_{jk}(1 - \delta_{il})$, where $\delta$ is the Kronecker delta.
As with $H_\zeta$, the informative eigenvalues of B are well-separated from the bulk of its spectrum and are real, so B has also been used to develop many popular non-parametric methods for clustering (see e.g., Saade et al. (2014b), Coste & Zhu (2019), Bordenave et al. (2015), Bruna & Li (2017), Gulikers et al. (2016)). The deep connection between $H_\zeta$ and B was noted in Krzakala et al. (2013) and can be summarized by the phenomenon that, given any eigenvalue $\nu$ of B other than $\pm 1$, the determinant of $H_\nu$ vanishes. However, unlike $H_\zeta$, B is non-symmetric and its dimension ($2m \times 2m$) can get quite large. These properties present analytical and computational challenges when using B, and in turn have popularized $H_\zeta$ as a tool for clustering. Le & Levina (2015) showed that in semi-dense regimes with expected degree $\bar{d} \gg \log(N)$, the number of negative eigenvalues of $H_\zeta$ directly estimates K for $\zeta \in \{r_m, r_a\}$, where the methods were called BHm and BHa. In addition, it was noted that the number of negative eigenvalues of $H_\zeta$ tends to underestimate K when networks are unbalanced. Hence, corrections for BHm and BHa were proposed, namely BHmc and BHac, which heuristically estimate $\hat{K} = \max\{k : t\rho_{N-k+1} \leq \rho_{N-k}\}$, where $\rho_1 \geq \dots \geq \rho_N$ are the sorted eigenvalues and $t > 0$ is a hyperparameter. In light of this, we present the following problem, which is the focus of this paper.

Problem Definition: Suppose that we observe one network generated from the SBM, where the parameters K, Z, B satisfy (i) assortativity, and (ii) the sparsity condition $d = o(\log(N))$. For the appropriate choices of $\zeta$, are the negative eigenvalues of the Bethe Hessian matrix $H_\zeta$ still informative for estimating K? If so, what are the appropriate choices for $\zeta$? Can there be other heuristic choices for $\zeta$? Are the popular heuristic choices of $\zeta$, i.e., $r_m$ and $r_a$ as defined above (hereinafter "heuristic choices"), appropriate in the above sense?
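The link between the non-backtracking operator and the Bethe Hessian described in this section can be checked numerically on a small graph. The sketch below is only an illustration: it builds the $2m \times 2m$ non-backtracking matrix from its entrywise definition, takes its leading eigenvalue $\nu$, and confirms that $H_\nu$ is singular. The function names are hypothetical, and the dense construction is only practical for small graphs.

```python
import numpy as np

def non_backtracking_matrix(A):
    """2m x 2m non-backtracking matrix indexed by directed edges i -> j.

    Entry (i->j, k->l) is 1 when j == k and i != l, and 0 otherwise.
    """
    A = np.asarray(A)
    rows, cols = np.nonzero(A)                       # both directions of every edge
    edges = [(int(i), int(j)) for i, j in zip(rows, cols)]
    index = {e: t for t, e in enumerate(edges)}
    NB = np.zeros((len(edges), len(edges)))
    for (i, j), row in index.items():
        for l in np.nonzero(A[j])[0]:
            if int(l) != i:                          # forbid immediate backtracking
                NB[row, index[(j, int(l))]] = 1.0
    return NB

def bethe_hessian(A, zeta):
    """H_zeta = (zeta^2 - 1) I + D - zeta A, as in equation (2.1)."""
    A = np.asarray(A, dtype=float)
    D = np.diag(A.sum(axis=1))
    return (zeta**2 - 1.0) * np.eye(A.shape[0]) + D - zeta * A

def leading_eigenvalue(M):
    """Eigenvalue of M with the largest real part, returned as a float."""
    ev = np.linalg.eigvals(M)
    return float(ev[np.argmax(ev.real)].real)
```

On the complete graph with four nodes, the leading eigenvalue of the non-backtracking matrix is 2, and $H_2 = 6I - 2A$ has a zero eigenvalue because the largest eigenvalue of A is 3.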

3. THEORETICAL RESULTS

Our main contribution is twofold. First, we show that even in a sparse regime with $1 \ll \bar{d} \ll \log(N)$, the number of informative eigenvalues of $H_\zeta$ directly estimates K consistently. Second, we provide the first-of-its-kind interval of appropriate values for $\zeta$, which serves conveniently as a sufficient condition, for which the number of informative eigenvalues of the associated matrix $H_\zeta$ directly estimates K. Below, we formally state this twofold result and provide a sketch of the proof, where we build intuition and provide key intermediate results. Precise statements and full proofs for all of the intermediate results discussed below are presented in §1.2 in the Supplement, along with statements and proofs of other relevant results in the literature.

Theorem 3.1 (Main Result). Let $\beta := -d(\lambda N_{\min} - 1)/N$. For any $\delta \in (0, 3/2)$, $H_\zeta$ has exactly K negative eigenvalues for all

$$\zeta \in \frac{1}{2}\left(-\beta \pm \sqrt{\beta^2 + 4 - 4 d_{\max}}\right),$$

that is, for all $\zeta$ between these two roots, with probability at least $1 - \exp[-(\zeta/\sqrt{d})^{3/2 - \delta}]$.

Sketch of the Proof. In assessing the spectral properties of $H_\zeta$, it is more convenient to work instead with the spectrum of the associated Laplacian matrix, since this allows us to use some important known results on the concentration of a certain regularized adjacency matrix A around its expectation. We are allowed to do so by Sylvester's law of inertia (Theorem 1.4 in Supplement §1.1), which implies that $H_\zeta$ and the associated Laplacian have the same inertia. Recall that the inertia of a real symmetric matrix is the vector consisting of the numbers of positive, negative, and zero eigenvalues of the matrix. To be more precise, consider the Laplacian $L_\zeta := \frac{1}{\zeta} H_\zeta = D_\zeta - A$, where $D_\zeta = (\zeta - \frac{1}{\zeta}) I_N + \frac{1}{\zeta} D$ and $\zeta > 1$. Now take its symmetric normalized version $\mathcal{L}(L_\zeta) := D_\zeta^{-1/2} L_\zeta D_\zeta^{-1/2}$. Then, by Sylvester's law of inertia, $H_\zeta$ and $\mathcal{L}(L_\zeta)$ have the same number of negative eigenvalues (see Lemma 1.5 in Supplement §1.2).
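The inertia argument can be illustrated numerically. The sketch below is a sanity check on one random graph, not a substitute for Lemma 1.5: it forms $H_\zeta$ and the normalized matrix $D_\zeta^{-1/2} L_\zeta D_\zeta^{-1/2}$ and compares their numbers of negative eigenvalues. The function names and the tolerance `tol` are choices made for this example.

```python
import numpy as np

def bethe_hessian(A, zeta):
    """H_zeta = (zeta^2 - 1) I + D - zeta A."""
    A = np.asarray(A, dtype=float)
    return (zeta**2 - 1.0) * np.eye(A.shape[0]) + np.diag(A.sum(axis=1)) - zeta * A

def normalized_laplacian(A, zeta):
    """D_zeta^{-1/2} L_zeta D_zeta^{-1/2} with L_zeta = H_zeta / zeta = D_zeta - A."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    d_zeta = (zeta - 1.0 / zeta) + deg / zeta    # diagonal of D_zeta, positive for zeta > 1
    L = np.diag(d_zeta) - A
    s = 1.0 / np.sqrt(d_zeta)
    return s[:, None] * L * s[None, :]           # congruence by the diagonal D_zeta^{-1/2}

def count_negative(M, tol=1e-10):
    """Number of eigenvalues of the symmetric matrix M below -tol."""
    return int(np.sum(np.linalg.eigvalsh(M) < -tol))
```

Because $D_\zeta$ is positive definite for $\zeta > 1$, the two matrices are congruent up to the positive factor $1/\zeta$, so the two counts agree for any graph and any admissible $\zeta$.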
Next, to make the problem more tractable, we show that $\mathcal{L}(L_\zeta)$ concentrates around its expectation $\mathcal{L}(\bar{L}_\zeta)$, so that the problem can be stated in terms of the latter, which is a deterministic matrix, rather than the former, its random counterpart. More concretely, denote the expectation of the Laplacian by $\mathcal{L}(\bar{L}_\zeta)$, where $\bar{L}_\zeta = \bar{D}_\zeta - \bar{A}$ and $\bar{D}_\zeta = (\zeta - \frac{1}{\zeta}) I_N + \frac{1}{\zeta} \bar{D}$. Then, we decompose $\mathcal{L}(L_\zeta) - \mathcal{L}(\bar{L}_\zeta)$ into two parts, the first of which is the difference between $A$ and $\bar{A}$. In a regime satisfying $d = o(\log(N))$, the first part is bounded by

$$\frac{C r^{2}}{\zeta - 1/\zeta}\left(\sqrt{d} + (\zeta^{2} - 1)^{1/4}\right),$$

where C is a constant, with probability at least $1 - 2N^{-r}$ (see Theorem 1.3 in Supplement §1.1), due to a concentration result in Le et al. (2017) where it is shown that the regularized A concentrates around its expectation. The second part is also bounded by

$$C r^{4}\, \frac{\zeta^{2}}{\zeta^{2} - 1} \left(\frac{d}{\zeta^{2} - 1}\right)^{3} \left(1 + \frac{d}{\zeta^{2} - 1}\right)^{2},$$

where C is a constant, with probability at least $1 - e^{-2r}$, due to the properties of the Orlicz norm and Markov-Bernstein-type inequalities. Hence, for $d = o(\log N)$, the difference between the sample Laplacian and its expectation, $\|\mathcal{L}(L_\zeta) - \mathcal{L}(\bar{L}_\zeta)\|$, is bounded. Note that this is a finite-sample result. We can obtain an asymptotic result from it by considering appropriate relationships among d, $\zeta$, and r, where $r \geq 1$ determines the probability ($1 - e^{-r}$) for the foregoing result. A sufficient condition for $\|\mathcal{L}(L_\zeta) - \mathcal{L}(\bar{L}_\zeta)\|$ to be $o(1)$ with high probability is $1 \ll r^{1/3} \ll \zeta/\sqrt{d}$ (see Lemma 1.6 in Supplement §1.2). As the last step in this proof sketch, we apply Weyl's inequality to $\lambda_K(-\bar{L}_\zeta)$ and $\lambda_{K+1}(-\bar{L}_\zeta)$, and readily see that only the K informative eigenvalues are negative, which gives the claimed result in the theorem (see Proof of Theorem 3.1 in Supplement §1.2).

Remark (Theorem 3.1). Note that Theorem 3.1 is a finite sample result. The sufficient condition $\sqrt{d} \ll \zeta$ implies a high probability asymptotic result showing consistent estimation of the number of communities by the spectral method with $\zeta$ chosen from the interval given in the theorem. Hereinafter, we refer to the interval for $\zeta$ stated in Theorem 3.1 as the "oracle interval". A sufficient threshold for detecting K is presented below, with a proof appearing in the Supplement.
Corollary 3.2. In the setup of Theorem 3.1, with high probability, K can be detected if the following threshold is satisfied:

$$\lambda > \frac{2N\sqrt{d_{\max} - 1}}{d N_{\min}} + \frac{1}{N_{\min}} \qquad (3.1)$$

4. EMPIRICAL METHODS
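When the model parameters are known, the oracle interval of Theorem 3.1 and the threshold of Corollary 3.2 reduce to elementary arithmetic: the endpoints are the two roots of $\zeta^2 + \beta\zeta + (d_{\max} - 1) = 0$, and the interval is non-empty exactly when the discriminant is positive. The sketch below illustrates this under the assumption that $d$, $d_{\max}$, $\lambda$, $N_{\min}$ and $N$ are supplied by the caller and that $d_{\max} \geq 1$; the function names, including the quantile helper, are hypothetical.

```python
import math

def oracle_interval(d, d_max, lam, n_min, n):
    """Endpoints of the interval for zeta in Theorem 3.1, or None if it is empty.

    beta = -d (lam * n_min - 1) / n, and the endpoints are
    (-beta -/+ sqrt(beta^2 + 4 - 4 d_max)) / 2.
    """
    beta = -d * (lam * n_min - 1.0) / n
    disc = beta**2 + 4.0 - 4.0 * d_max
    if disc <= 0.0:
        return None
    root = math.sqrt(disc)
    return (0.5 * (-beta - root), 0.5 * (-beta + root))

def threshold_met(d, d_max, lam, n_min, n):
    """Condition (3.1): lam > 2 n sqrt(d_max - 1) / (d n_min) + 1 / n_min."""
    return lam > 2.0 * n * math.sqrt(d_max - 1.0) / (d * n_min) + 1.0 / n_min

def zeta_at_quantile(interval, q):
    """Point at fraction q (between 0 and 1) of the way through the interval."""
    lo, hi = interval
    return lo + q * (hi - lo)
```

For instance, `zeta_at_quantile(oracle_interval(20.0, 10.0, 1.0, 300, 900), 0.5)` returns the midpoint of the interval for those illustrative parameter values, which are made up for this example and do not come from the paper's simulations.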

4.1. ESTIMATION OF THE INTERVAL FOR BETHE HESSIAN SCALAR PARAMETER

One practical consideration that needs to be addressed when implementing Theorem 3.1 and Corollary 3.2 is finding estimators of the parameters that are not directly observable in the data, namely $d$, $d_{\max}$, $\lambda$, and $N_{\min}$. Below, we propose an algorithm to compute the estimators for these oracle values. We do so by first estimating the community memberships Z using regularized spectral clustering (Amini et al. (2013); Le et al. (2017)) and then using maximum likelihood estimates for the rest of the parameters. The desired estimators are then computed in a straightforward way.

Procedure 4.1 PARAMS-ESTIMATION
Input: Adjacency matrix A; a candidate number of communities $K_0$
Output: $\hat{\mathbf{N}}_{K_0}$: estimator for $\mathbf{N}$; $\hat{B}_{K_0}$: estimator for B; and $\hat{Z}$: estimator for Z
1: Obtain $\hat{Z}$ using regularized spectral clustering of A with $K_0$ communities (see Remark 4.1)
2: $\hat{\mathbf{N}}_{K_0} \leftarrow \hat{Z}^T 1_N$
3: $\hat{B}_{K_0} \leftarrow \mathrm{Diag}(\hat{\mathbf{N}}_{K_0})^{-1} \hat{Z}^T A \hat{Z}\, \mathrm{Diag}(\hat{\mathbf{N}}_{K_0})^{-1}$
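Steps 2 and 3 of Procedure 4.1 are plain matrix products once community labels are available. The sketch below assumes that Step 1 has already been carried out by some clustering routine and that its output is passed in as a zero-based label vector with every label in use; the function name `params_estimation` and its signature are illustrative only.

```python
import numpy as np

def params_estimation(A, labels, k0):
    """Steps 2-3 of Procedure 4.1, given labels in {0, ..., k0-1} from Step 1.

    Returns (N_hat, B_hat, Z_hat). Assumes every label appears at least once,
    otherwise the inverse of Diag(N_hat) is undefined.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    n = A.shape[0]
    Z_hat = np.zeros((n, k0))
    Z_hat[np.arange(n), labels] = 1.0            # one-hot membership matrix
    N_hat = Z_hat.sum(axis=0)                    # step 2: Z_hat^T 1_N
    inv = np.diag(1.0 / N_hat)
    B_hat = inv @ Z_hat.T @ A @ Z_hat @ inv      # step 3
    return N_hat, B_hat, Z_hat
```

Each entry of `B_hat` is the number of ordered node pairs joined by an edge between two estimated communities, divided by the product of the two community sizes.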

Remark (4.1). In Step 1, we need an algorithm that can consistently recover communities from A. Other standard clustering algorithms can also be used in Step 1 as long as they consistently recover the community labels. The consistency of the estimators proposed in Procedure 4.1 has already been established in Le et al. (2017). The time complexity of the procedure is $O(N^3)$, driven by the eigenvalue computation in Step 1. Hereinafter, we refer to the interval computed with the estimators from this procedure as the "estimated interval" (recall that the interval in Theorem 3.1 is referred to as the "oracle interval").

Procedure 4.1 outputs $\hat{\mathbf{N}}_{K_0}$ and $\hat{B}_{K_0}$ with a candidate number of communities $K_0 \in \{1, \dots, K_{\max}\}$ as an input, where $K_{\max}$ is a tuning parameter. Then, the minimal community size is estimated with $\hat{N}_{\min} = \min\{\hat{\mathbf{N}}_2\}$. $\hat{N}_{\min}$ is an upper bound of $N_{\min}$ with high probability and has been shown in simulations to be a good estimate of $N_{\min}$. Details on the ad-hoc estimation of $d$, $d_{\max}$, and $\lambda$ using $\hat{\mathbf{N}}_{K_0}$, $\hat{B}_{K_0}$ and the tuning parameter $K_0$ are given in the Supplement §1.3.

Figure 4.1 shows the simulation results on the performance of the oracle and estimated intervals for $\zeta$, and of the two popular heuristic choices $r_m$ and $r_a$. Under the setting of a large network (large N) and the assortativity condition, the estimated intervals computed with Procedure 4.1 appear to match their oracle values well. It is shown that once the threshold in Corollary 3.2 is satisfied, $r_m$ and $r_a$ turn out to be sufficient, i.e., they fall within the oracle interval. In §5, it is shown that values from the interval other than $r_m$ and $r_a$ can improve the performance, especially when N is large in the sparse regime. Further extensive simulation results based on other parameter settings are included in the Supplement §1.3.

Procedure 4.2
1: $D \leftarrow \mathrm{Diag}(A 1_N)$
2: $H_\zeta \leftarrow (\zeta^2 - 1) I_N + D - \zeta A$
3: Obtain sorted eigenvalues $\lambda_1^{\uparrow}, \dots, \lambda_N^{\uparrow}$ of $H_\zeta$
4: $\hat{K} \leftarrow \max\{k : \lambda_k^{\uparrow} < 0\}$

Remark (4.2). Just as with Procedure 4.1, the time complexity of Procedure 4.2 is $O(N^3)$, driven by the eigenvalue computation in Step 3. Hereinafter, we refer to Procedures 4.1 and 4.2 as the "BHsparse" method.
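The four steps of Procedure 4.2 translate almost line by line into dense linear algebra. The following is a minimal sketch assuming a dense symmetric 0/1 adjacency matrix and a value of $\zeta$ chosen by the caller; the function name is hypothetical, and a full dense eigendecomposition is used for clarity rather than efficiency.

```python
import numpy as np

def estimate_num_communities(A, zeta):
    """Count the negative eigenvalues of the Bethe Hessian H_zeta."""
    A = np.asarray(A, dtype=float)
    D = np.diag(A.sum(axis=1))                                # step 1: degree matrix
    H = (zeta**2 - 1.0) * np.eye(A.shape[0]) + D - zeta * A   # step 2: Bethe Hessian
    eigvals = np.linalg.eigvalsh(H)                           # step 3: ascending eigenvalues
    return int(np.sum(eigvals < 0))                           # step 4: number of negatives
```

As a toy usage example, for a graph made of two disjoint cliques of ten nodes each and $\zeta$ set to the square root of the mean observed degree (which equals 3 here), the function returns 2.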

5. EMPIRICAL STUDIES

We define the empirical accuracy rate (ACR) as the fraction of accurate estimates of K out of 20 repetitions per simulation. Recent literature (Le & Levina (2015), Yan et al. (2018), Cerqueira & Leonardi (2020)) showed that methods based on the spectrum of the Bethe Hessian operator with the popular heuristic choices for $\zeta$, i.e., $\{r_m, r_a\}$, are competitive in performance and computational efficiency in the semi-dense regimes. However, the synthetic networks used in the above references were relatively small (in terms of N) and denser (with $\bar{d}$ of order at least $\log(N)$) than real-world networks. Through extensive simulations, we compare the performance of BHsparse with that of the methods based on the heuristic choices for $\zeta$ on large (N up to 35,000) and sparse ($\bar{d} = o(\log(N))$) networks. It is shown that BHsparse outperforms the methods based on the heuristic choices, especially as N gets large and the networks become more assortative.

5.1. DATA GENERATION AND SIMULATION SETTINGS

We simulate network data from the SBM under two different settings. In Simulation Setting (1), we define $B := \rho B_0 := \rho(\eta - 1) b \left[I_K + \frac{1}{\eta - 1} 1_K 1_K^T\right]$. Here $\rho$ controls the expected degree through $\bar{d} = \rho\,(1_N^T (Z B_0 Z^T - \mathrm{Diag}(Z B_0 Z^T)) 1_N)/N$, $\eta$ is the in/out ratio based on B and determines the degree of assortativity, and b is the baseline value in B, which is set to 0.1. We first simulate the membership vector $Z \sim \mathrm{Mult}\left(1; \frac{1}{K}, \dots, \frac{1}{K}\right)$. We set $\bar{d} \in \{3\log(N), 0.165(\log(N))^2, 0.788 N^{1/3}\}$ by varying $\rho$, to assess the performance of the algorithms under different sparsity regimes. The constants in the rates of $\bar{d}$ are chosen in such a way that $\bar{d}$ is the same at N = 1000 for all the rates. With a fixed Z and B, and given model parameters K, N, $\bar{d}$, and $\eta$, we then generate A with 20 repetitions. Table 5.1 summarises the combinations of model parameter settings used in the simulations. In Simulation Setting (2), we use a more general probability connectivity matrix as defined in equation 5.1.
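For Simulation Setting (1), the scaling $\rho$ that achieves a target average expected degree can be solved for in closed form from the relation between $\rho$, $B_0$ and the expected degree. The sketch below works with community sizes instead of the full $N \times N$ matrix $Z B_0 Z^T$; the function name and the choice to pass sizes directly are assumptions made for this example, and $\eta \neq 1$ is required.

```python
import numpy as np

def setting1_connectivity(sizes, eta, target_degree, b=0.1):
    """Return (B, rho) with B = rho * B0 and average expected degree target_degree.

    B0 = (eta - 1) b [I_K + 1 1^T / (eta - 1)], so diag(B0) = eta * b and
    off-diagonal entries equal b.
    """
    sizes = np.asarray(sizes, dtype=float)
    K = sizes.size
    B0 = (eta - 1.0) * b * (np.eye(K) + np.ones((K, K)) / (eta - 1.0))
    n = sizes.sum()
    # 1^T (Z B0 Z^T - Diag(Z B0 Z^T)) 1, written in terms of community sizes
    total = sizes @ B0 @ sizes - np.sum(sizes * np.diag(B0))
    rho = target_degree * n / total
    return rho * B0, rho
```

Calling `setting1_connectivity([100, 100, 100], 3.0, 3 * np.log(300))` gives a matrix whose diagonal entries are three times its off-diagonal entries, scaled so that the average expected degree equals the requested value.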

6. DISCUSSION

In this paper, we contribute theoretical results on the selection of the Bethe Hessian scalar parameter, $\zeta$, for consistent estimation of the number of communities (K) in networks that are generated from the SBM with an arbitrary degree of sparsity. To the best of our knowledge, this is the first study to theoretically prove the consistency of the Bethe Hessian spectral method for estimating K in sparse regimes with $d = o(\log(N))$. We also rigorously derive the oracle interval and provide a convenient way to empirically estimate the interval for selecting $\zeta$, so that the resulting Bethe Hessian operator consistently estimates K. We support our theoretical results with simulation studies and a real-world network application. In this paper, we only prove an upper bound on the hypothesized threshold for estimating the number of communities. An important direction for future work is to prove the corresponding lower bound, so that the existence of the threshold can be properly established.




Figure 4.1: The oracle interval for ζ (Theorem 3.1) and its estimation (Procedure 4.1) are shown with two popular heuristic choices for ζ (r m and r a ) commonly used in literature. Network data was simulated from the SBM with the parameter settings shown in Table 5.1 with K = 3 and d = 3 log(N ), each simulated with 20 repetitions. Intervals are shown as zeros when the threshold (3.1) is not met.

Figure 5.1 shows the ACR of BHsparse versus $\eta$, with varying values for $\zeta$ chosen from quantiles of the oracle interval in Theorem 3.1. There is a clear threshold value of $\eta$ below which the detection of K fails and above which it succeeds. The top row (A) shows that this threshold decreases as N increases from 5,000 to 15,000, while the bottom row (B) shows that the threshold increases with K. Note that the threshold for $\lambda$ in equation 3.1, which depends on $\eta$, decreases as N increases.

Figure 5.1: ACR of BHsparse with ζ set to quantiles (10%, 30%, 50%, 70%, 90%) of the oracle interval in Theorem 3.1. Network data was generated from Simulation Setting (1) with fixed d = 3 log(N ). (A) shows ACR versus η for varying levels of N with K = 3. (B) shows ACR versus η for varying levels of K with N = 25, 000.

Figure 5.2: ACR of BHsparse versus $\eta$ as K and N vary, with $\zeta$ set to quantiles (10%, 30%, 50%, 70%, 90%) of the intervals estimated using Procedure 4.1, based on networks satisfying the threshold (3.1). Network data was generated from Simulation Setting (1) with fixed $\bar{d} = 3\log(N)$.

Figure 5.2 shows the ACR of BHsparse with $\zeta$ set to different quantiles of the estimated intervals. Only those cases where either interval exists are shown in the plot. The performance becomes worse as $\zeta$ gets close to the end-points of the intervals. Generally, the 30% to 50% quantiles within the intervals appear to work best. In Figure 5.3 (respectively, Figure 5.4), we compare the performance of BHsparse using the 30% and 50% quantiles of the oracle intervals (respectively, the estimated intervals) with BHmc and BHac. Figures 5.3 and 5.4 show that when the threshold in Corollary 3.2 is satisfied, setting $\zeta$ to the 30% or 50% quantile of either the oracle or the estimated interval performs better than the two heuristic choices in Le & Levina (2015). The plots corresponding to Figures 5.1, 5.2, 5.3, and 5.4 for the other two density regimes, $\bar{d} \in \{0.165(\log(N))^2, 0.788 N^{1/3}\}$, are given in the Supplement §1.3.

We also compare the performance of BHsparse, with $\zeta$ equal to the 30%, 50%, and 70% quantiles of the estimated intervals, with BHmc and BHac under a more general setting of the probability connectivity matrix given in Equation 5.1. Figure 5.5 shows that our proposed method, with $\zeta$ chosen as the 30% to 50% quantiles of the intervals, outperforms the methods proposed in Le & Levina (2015).

Figure 5.3: Row (A) shows ACR versus η using oracle intervals, with different values of N and K = 3. Row (B) shows ACR versus η as K varies with fixed N = 25, 000. Both plots only include cases where oracle thresholds in Corollary 3.2 are satisfied and are based on data generated from Simulation Setting (1) with fixed d = 3 log(N ).

Figure 5.4: ACR versus η as K varies using estimated intervals, based on data from Simulation Setting (1) with fixed d = 3 log(N ) and only including cases where estimated thresholds in Corollary 3.2 are satisfied. For ζ, 30% and 50% quantiles of the estimated intervals are considered.

Figure 5.5: ACR versus $\eta$ as N varies using estimated intervals, based on data from Simulation Setting (2) with K = 3 and $\bar{d} = 3\log(N)$, and only including cases where the estimated thresholds in Corollary 3.2 are satisfied. For $\zeta$, the 30% and 50% quantiles of the estimated intervals are considered.



