LONG-TAILED LEARNING REQUIRES FEATURE LEARNING

Abstract

We propose a simple data model inspired by natural data such as text or images, and use it to study the importance of learning features in order to achieve good generalization. Our data model follows a long-tailed distribution in the sense that some rare subcategories have few representatives in the training set. In this context we provide evidence that a learner succeeds if and only if it identifies the correct features, and moreover derive non-asymptotic generalization error bounds that precisely quantify the penalty one must pay for not learning features. We begin with a simple example to explain our data model and to illustrate, at an intuitive level, the importance of learning features when faced with a long-tailed data distribution. For the sake of exposition we adopt NLP terminology such as 'words' and 'sentences,' but the image-based terminology of 'patches' and 'images' would do as well. The starting point is a very standard mechanism for generating observed data from some underlying collection of latent variables. Consider the data model depicted in Figure 1. We have a vocabulary of n_w = 12 words and a set of n_c = 3 concepts: V = {potato, cheese, carrots, chicken, . . .} and C = {vegetable, dairy, meat}.

1. INTRODUCTION

Part of the motivation for deploying a neural network arises from the belief that algorithms that learn features/representations generalize better than algorithms that do not. We try to give some mathematical ballast to this notion by studying a data model where, at an intuitive level, a learner succeeds if and only if it manages to learn the correct features. The data model itself attempts to capture two key structures observed in natural data such as text or images. First, it is endowed with a latent structure at the patch or word level that is directly tied to a classification task. Second, the data distribution has a long tail, in the sense that rare and uncommon instances collectively form a significant fraction of the data. We derive non-asymptotic generalization error bounds that quantify, within our framework, the penalty that one must pay for not learning features.

We first prove a two-part result that precisely quantifies the necessity of learning features within the context of our data model. The first part shows that a trivial nearest neighbor classifier performs perfectly when given knowledge of the correct features. The second part shows that it is impossible to craft, a priori, a feature map that generalizes well when using a nearest neighbor classification rule. In other words, success or failure depends only on the ability to identify the correct features and not on the underlying classification rule. Since this cannot be done a priori, the features must be learned. Our theoretical results therefore support the idea that algorithms cannot generalize on long-tailed data if they do not learn features. Conversely, an algorithm that does learn features can generalize well. Specifically, the most direct neural network architecture for our data model generalizes almost perfectly when using either a linear classifier or a nearest neighbor classifier on top of the learned features.
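The first part of this claim, that a trivial nearest neighbor rule succeeds once the correct features are known, can be sketched in Python using the vocabulary from Section 2. The features here simply replace each word by its concept; the words 'pork' and 'lamb' are hypothetical fillers for the meat concept, and the second training sentence is invented for contrast.

```python
# Words grouped into concepts, as in Figure 1 ('pork' and 'lamb' are hypothetical).
CONCEPTS = {
    "veggie": ["potato", "carrot", "leek", "lettuce"],
    "dairy":  ["butter", "cheese", "cream", "yogurt"],
    "meat":   ["beef", "chicken", "pork", "lamb"],
}
WORD_TO_CONCEPT = {w: c for c, ws in CONCEPTS.items() for w in ws}

def features(sentence):
    """The 'correct' features: replace each word by its underlying concept."""
    return [WORD_TO_CONCEPT[w] for w in sentence]

def nearest_neighbor_label(x_test, train):
    """1-NN under Hamming distance, computed on the chosen features."""
    def dist(a, b):
        return sum(fa != fb for fa, fb in zip(a, b))
    return min(train, key=lambda pair: dist(features(pair[0]), features(x_test)))[1]

# The train/test pair from Section 2: same concept sequence, zero word overlap.
x_11 = ["cheese", "butter", "lettuce", "chicken", "leek"]
x_test = ["butter", "yogurt", "carrot", "beef", "lettuce"]
train = [(x_11, 1), (["potato", "pork", "cream", "lamb", "carrot"], 2)]
print(nearest_neighbor_label(x_test, train))  # -> 1: distance 0 in concept space
```

With the correct features, x_test sits at Hamming distance 0 from x_11 even though the two sentences share no words, so the nearest neighbor rule recovers the right category.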
Crucially, designing the architecture requires knowing only the meta-structure of the problem, but no a priori knowledge of the correct features. This illustrates the built-in advantage of neural networks: their ability to learn features significantly eases the design burden placed on the practitioner.

Subcategories in commonly used visual recognition datasets tend to follow a long-tailed distribution (Salakhutdinov et al., 2011; Zhu et al., 2014; Feldman & Zhang, 2020). Some common subcategories have a wealth of representatives in the training set, whereas many rare subcategories have only a few. At an intuitive level, learning features seems especially important on a long-tailed dataset, since features learned from the common subcategories help to properly classify test points from a rare subcategory. Our theoretical results help support this intuition. We note that when considering complex visual recognition tasks, datasets are almost unavoidably long-tailed (Liu et al., 2019): even if the dataset contains millions of images, it is to be expected that many subcategories will have few samples. In this setting, the classical approach of deriving asymptotic performance guarantees based on a large-sample limit is not a fruitful avenue. Generalization must be approached from a different point of view (cf. Feldman (2020) for very interesting work in this direction). In particular, the analysis must be non-asymptotic. One of our main contributions is to derive, within the context of our data model, generalization error bounds that are non-asymptotic and relatively tight; by this we mean that our results hold for small numbers of data samples and track reasonably well with empirically evaluated generalization error.

In Section 2 we introduce our data model, and in Section 3 we discuss our theoretical results. For simplicity of exposition, both sections focus on the case where each rare subcategory has a single representative in the training set.
Section 4 is concerned with the general case in which each rare subcategory has a few representatives. Section 5 provides an overview of our proof techniques. Finally, in Section 6, we investigate empirically a few questions that we could not resolve analytically. In particular, our error bounds are restricted to the case in which a nearest neighbor classification rule is applied on top of the features; we provide empirical evidence in this last section that replacing the nearest neighbor classifier by a linear classifier leads to very minimal improvement. This further supports the notion that, on our data model, it is the ability to learn features that drives success, not the specific classification rule used on top of the features.

Related work. By now, a rich literature has developed that studies the generalization abilities of neural networks. A major theme in this line of work is the use of the PAC learning framework to derive generalization bounds for neural networks (e.g. Bartlett et al. (2017); Neyshabur et al. (2017); Golowich et al. (2018); Arora et al. (2018); Neyshabur et al. (2018)), usually by proving a bound on the difference between the finite-sample empirical loss and the true loss. While powerful in their generality, such approaches are usually task independent and asymptotic; that is, they are mostly agnostic to any idiosyncrasies in the data generating process and need a statistically meaningful number of samples in the training set. As such, the PAC learning framework is not well tailored to our specific aim of studying generalization on long-tailed data distributions; indeed, in such a setting, a rare subcategory might have only a handful of representatives in the training set. After breakthrough results (e.g. Jacot et al. (2018); Du et al. (2018); Allen-Zhu et al.
(2019); Ji & Telgarsky (2019)) showed that vastly over-parametrized neural networks become kernel methods (the so-called Neural Tangent Kernel, or NTK) in an appropriate limit, much effort has gone toward analyzing the extent to which neural networks outperform kernel methods (Yehudai & Shamir, 2019; Wei et al., 2019; Refinetti et al., 2021; Ghorbani et al., 2019; 2020; Karp et al., 2021; Allen-Zhu & Li, 2019; 2020; Li et al., 2020; Malach et al., 2021). Our interest lies not in proving such a gap for its own sake, but rather in using the comparison to gain some understanding of the importance of learning features in computer vision and NLP contexts. Analyses that shed theoretical light on learning with long-tailed distributions (Feldman, 2020; Brown et al., 2021) or on specific learning mechanisms (Karp et al., 2021) are perhaps closest to our own. The former analyses (Feldman, 2020; Brown et al., 2021) investigate the necessity of memorizing rare training examples in order to obtain near-optimal generalization error when the data distribution is long-tailed. Our analysis differs in that we focus on the necessity of learning features and sharing representations in order to properly classify rare instances. Like us, the latter analysis (Karp et al., 2021) also considers a computer vision inspired task and uses it to compare a neural network to a kernel method, with the ultimate aim of studying the learning mechanism involved. Their object of study (finding a sparse signal in the presence of noise), however, markedly differs from our own (learning with long-tailed distributions).
Figure 1: The sequence of words on the right was obtained by sampling each word uniformly at random from the corresponding concept. For example, the first word was randomly chosen out of the dairy concept (butter, cheese, cream, yogurt), and the last word was randomly chosen out of the vegetable concept (potato, carrot, leek, lettuce). Sequences of words will be referred to as sentences.
The non-standard aspect of our model comes from how we use the 'latent variable → observed datum' process to form a training distribution. The training set in Figure 1 is made of 15 sentences split into R = 3 categories. The latent variables c̄_1, c̄_2, c̄_3 each generate a single sentence, whereas the latent variables c_1, c_2, c_3 each generate 4 sentences. We will refer to c_1, c_2, c_3 as the familiar sequences of concepts, since they generate most of the sentences encountered in the training set. On the other hand, c̄_1, c̄_2, c̄_3 will be called unfamiliar. Similarly, a sentence generated by a familiar (resp. unfamiliar) sequence of concepts will be called a familiar (resp. unfamiliar) sentence. The former represents a datum sampled from the head of the distribution, while the latter represents a datum sampled from its tail. We denote by x_{r,s} the s-th sentence of the r-th category, indexed so that the first sentence of each category is unfamiliar and the remaining ones are familiar.

Suppose now that we have trained a learning algorithm on the training set described above, and that at inference time we are presented with a previously unseen sentence generated by the unfamiliar sequence of concepts c̄_1 = [dairy, dairy, veggie, meat, veggie]. To fix ideas, let's say that sentence is:

x_test = [butter, yogurt, carrot, beef, lettuce]    (1)

This sentence is hard to classify since there is a single sentence in the training set that has been generated by the same sequence of concepts, namely

x_{1,1} = [cheese, butter, lettuce, chicken, leek].    (2)

Moreover, these two sentences do not overlap at all (i.e. the i-th word of x_test is different from the i-th word of x_{1,1} for all i).
To properly classify x_test, the algorithm must have learned the equivalences butter ↔ cheese, yogurt ↔ butter, carrot ↔ lettuce, and so forth. In other words, the algorithm must have learned the underlying concepts. Nevertheless, a neural network with a well-chosen architecture can easily succeed at such a classification task. Consider, for example, the network depicted in Figure 2. Each word of the input sentence, after being encoded as a one-hot vector, goes through a multi-layer perceptron (MLP1 on the figure) shared across words. The output is then normalized using LayerNorm (Ba et al., 2016) to produce a representation of the word. The word representations are then concatenated into a single vector that goes through a second multi-layer perceptron (MLP2 on the figure).
I E + 6 e p i t O t F 6 V j z q g C F X D J r k z g q M a u Z Q c E l N E F a W S g Z n 7 M p J A 4 q V o D N 6 m V I D T 1 w z J h O t H F P I V 2 y T z d q V l i 7 K H K n d B f O 7 M t Z S 7 4 2 S y q c H G W 1 U G W F o P i D 0 a S S F D V t E 6 d j Y Y C j X D j A u B H u V s p n z D D u o l x x w V k R d m b h 4 0 H P N O 0 E t Z Y 2 t M j a q N 3 v w R u S J n A Z x y 8 T X Q X D 7 4 P j Q f z r R / 8 k 7 M L e I l / I P v l K Y v K T n J B z c k G G h J O / 3 q a 3 4 / X 8 M / + P X / r m Q e p 7 3 c 5 n 8 q z 8 + h 8 d z t x e < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " This network, if properly trained, will learn to give similar representations to words that belong to the same concept. Therefore, if it correctly classifies the train point x 1,1 given by (2), it will necessarily correctly classify the test point x test given by (1). So the neural network is able to classify the previously unseen sentence x test despite the fact that the training set contains a single example with the same underlying sequence of concepts. This comes from the fact that the neural network learns features and representations from the familiar part of the training set (generated by the head of the distribution), and uses these, at test time, to correctly classify the unfamiliar sentences (generated by the tail of the distribution). 
In other words, because it learns features, the neural network has no difficulty handling the long-tailed nature of the distribution. To summarize, the variables L, n_w, n_c, R and n_spl parametrize instances of our data model. They denote, respectively, the length of the sentences, the number of words in the vocabulary, the number of concepts, the number of categories, and the number of samples per category. So in the example presented in Figure 1 we have L = 5, n_w = 12, n_c = 3, R = 3 and n_spl = 5 (four familiar sentences and one unfamiliar sentence per category). The vocabulary V and the set of concepts C are discrete sets with |V| = n_w and |C| = n_c, rendered as V = {1, . . . , n_w} and C = {1, . . . , n_c} for concreteness.
A partition of the vocabulary into concepts, like the one depicted at the top of Figure 1, is encoded by a function ϕ : V → C that assigns words to concepts. We require that each concept contain the same number of words, so that ϕ satisfies

|ϕ^{-1}({c})| = |{w ∈ V : ϕ(w) = c}| = n_w / n_c   for all c ∈ C,   (3)

and we refer to a function ϕ : V → C satisfying (3) as an equipartition of the vocabulary. The set Φ = {all functions ϕ from V to C that satisfy (3)} denotes the collection of all such equipartitions, while the data space and latent space are denoted X = V^L and Z = C^L, respectively. Elements of X are sentences of L words and take the form x = [x_1, x_2, . . . , x_L], while elements of Z take the form c = [c_1, c_2, . . . , c_L] and correspond to sequences of concepts. In the context of this work, a feature map refers to any function ψ : X → F from data space to feature space. The feature space F can be any Hilbert space (possibly infinite dimensional), and we denote by ⟨•, •⟩_F the associated inner product. Our analysis applies to the case in which a nearest neighbor classification rule is applied on top of the extracted features. Such a rule works as follows: given a test point x, the inner products ⟨ψ(x), ψ(y)⟩_F are evaluated for all y in the training set; the test point x is then given the label of the training point y that yields the highest inner product.
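To make these objects concrete, the following is a minimal sketch of the data model in Python. The parameter values are those of the running example of Figure 1; the function names are our own, not the paper's.

```python
import random

L, n_w, n_c = 5, 12, 3   # sentence length, vocabulary size, number of concepts

def random_equipartition(n_w, n_c, rng):
    """Sample an equipartition phi : V -> C, with n_w/n_c words per concept."""
    words = list(range(n_w))
    rng.shuffle(words)
    size = n_w // n_c
    return {w: c for c in range(n_c) for w in words[c * size:(c + 1) * size]}

def sample_sentence(concepts, phi, rng):
    """Generate a sentence by replacing each concept with a uniformly random word."""
    inverse = {}
    for w, c in phi.items():
        inverse.setdefault(c, []).append(w)
    return [rng.choice(inverse[c]) for c in concepts]

rng = random.Random(0)
phi = random_equipartition(n_w, n_c, rng)
c_seq = [rng.randrange(n_c) for _ in range(L)]   # a latent sequence in Z = C^L
x = sample_sentence(c_seq, phi, rng)             # an observed sentence in X = V^L
```

By construction, applying ϕ word-by-word to the generated sentence recovers the latent concept sequence.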

3. STATEMENT AND DISCUSSION OF MAIN RESULTS

Our main result states that, in the context of our data model, features must be tailored (i.e. learned) to each specific task. Specifically, it is not possible to find a universal feature map ψ : X → F that performs well on a collection of tasks like the one depicted in Figure 1. In the context of this work, a task refers to a tuple

T = (ϕ; c_1, . . . , c_R; c̄_1, . . . , c̄_R) ∈ Φ × Z^{2R}   (4)

that prescribes a partition of the vocabulary into concepts, R familiar sequences of concepts c_1, . . . , c_R, and R unfamiliar sequences of concepts c̄_1, . . . , c̄_R. Given such a task T we generate a training set S as described in the previous section. This training set contains R × n_spl sentences split over R categories, and each category contains a single unfamiliar sentence. Randomly generating the training set S from the task T corresponds to sampling S ∼ D^train_T from a distribution D^train_T defined on the space X^{R×n_spl} and parametrized by the variables in (4) (the appendix provides an explicit formula for this distribution). We measure the performance of an algorithm by its ability to generalize to previously unseen unfamiliar sentences. Generating an unfamiliar sentence amounts to drawing a sample x ∼ D^test_T from a distribution D^test_T on the space X parametrized by the variables ϕ, c̄_1, . . . , c̄_R in (4) that determine the unfamiliar sequences of concepts. Finally, associated with every task T we have a labelling function f_T : X → {1, . . . , R} that assigns the label r to sentences generated by either c_r or c̄_r (this function is ill-defined if two sequences of concepts from different categories are identical, but this issue is easily resolved by the formal statements in the appendix). Summarizing our notation: for every task T ∈ Φ × Z^{2R} we have a distribution D^train_T on the space X^{R×n_spl}, a distribution D^test_T on the space X, and a labelling function f_T.
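A task and its labelling function can be sketched as follows. This is a hypothetical sketch with our own representation choices (dictionaries and tuples of integers); the parameter values are those of Theorem 1 below.

```python
import random

# Sample a task T = (phi; c_1..c_R; cbar_1..cbar_R) from Phi x Z^{2R}.
L, n_w, n_c, R = 9, 150, 5, 1000

rng = random.Random(0)
words = list(range(n_w))
rng.shuffle(words)
size = n_w // n_c
phi = {w: c for c in range(n_c) for w in words[c * size:(c + 1) * size]}

familiar = [tuple(rng.randrange(n_c) for _ in range(L)) for _ in range(R)]
unfamiliar = [tuple(rng.randrange(n_c) for _ in range(L)) for _ in range(R)]
task = (phi, familiar, unfamiliar)

def f_T(x):
    """Labelling function: label r for sentences generated by c_r or cbar_r.

    Returns None when the sentence matches no category (the ill-defined
    collision case is ignored in this sketch)."""
    concepts = tuple(phi[w] for w in x)
    for r in range(R):
        if concepts in (familiar[r], unfamiliar[r]):
            return r + 1
    return None
```

A training set sampled from D^train_T would then consist, per category r, of n_spl − 1 sentences generated from the familiar sequence c_r and one sentence generated from the unfamiliar sequence c̄_r.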
Given a feature space F, a feature map ψ : X → F, and a task T ∈ Φ × Z^{2R}, the expected generalization error of the nearest neighbor classification rule on unfamiliar sentences is given by:

err(F, ψ, T) = E_{S∼D^train_T} [ P_{x∼D^test_T} ( f_T( argmax_{y∈S} ⟨ψ(x), ψ(y)⟩_F ) ≠ f_T(x) ) ].   (5)

For simplicity, if the test point has multiple nearest neighbors with inconsistent labels in the training set (so that the argmax returns multiple training points y), we count the classification as a failure for the nearest neighbor classification rule. To make this explicit, we replace (5) by the more formal (but more cumbersome) formula

err(F, ψ, T) = E_{S∼D^train_T} [ P_{x∼D^test_T} ( ∃ y ∈ argmax_{y'∈S} ⟨ψ(x), ψ(y')⟩_F such that f_T(y) ≠ f_T(x) ) ].

Our main theoretical results concern the performance of a learner not on a single task T but on a collection of tasks T = {T_1, T_2, . . . , T_{N_tasks}}, and so we define

err(F, ψ, T) = (1/|T|) Σ_{T∈T} err(F, ψ, T)

as the expected generalization error on such a collection T of tasks. As a task refers to an element of the discrete set Φ × Z^{2R}, any subset T ⊂ Φ × Z^{2R} defines a collection of tasks. Our main result concerns the case where the collection of tasks T = Φ × Z^{2R} consists of all possible tasks that one might encounter. For concreteness, we choose specific values for the model parameters and state the following special case of our main theorem (Theorem 3 at the end of this section).

Theorem 1. Let L = 9, n_w = 150, n_c = 5, R = 1000 and n_spl ≥ 2. Let T = Φ × Z^{2R}. Then err(F, ψ, T) > 98.4% for all feature spaces F and all feature maps ψ : X → F.

In other words, for the model parameters specified above, it is not possible to design a 'task-agnostic' feature map ψ that works well if we are uniformly uncertain about which specific task we will face.
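The nearest neighbor rule inside (5), together with the tie-breaking convention just described, can be sketched in a few lines. The feature map psi and the inner product are placeholders supplied by the caller; the name nn_predict is our own.

```python
def nn_predict(x, train, psi, inner):
    """Nearest-neighbor rule on top of features.

    train is a list of (sentence, label) pairs. If the maximizers of the
    inner product carry inconsistent labels, the prediction counts as a
    failure and None is returned, matching the more formal definition."""
    scores = [(inner(psi(x), psi(y)), label) for y, label in train]
    top = max(score for score, _ in scores)
    labels = {label for score, label in scores if score == top}
    return labels.pop() if len(labels) == 1 else None
```

For instance, with psi the identity and the Euclidean inner product, a test point whose highest inner product is attained by two differently-labelled training points is counted as an error.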
Indeed, the best possible feature map will fail at least 98.4% of the time at classifying unfamiliar sentences (with a nearest-neighbor classification rule), where the probability is with respect to the random choices of the task, of the training set, and of the unfamiliar test sentence.

Interpretation: Our desire to understand learning demands that we consider a collection of tasks rather than a single one, for if we consider only a single task then the problem, in our setting, becomes trivial. Indeed, assume T = {T_1} with T_1 = (ϕ; c_1, . . . , c_R; c̄_1, . . . , c̄_R) consists of a single task. With knowledge of this task we can easily construct a feature map ψ : X → R^{L n_c} that performs perfectly. Indeed, the map

ψ([x_1, . . . , x_L]) = [e_{ϕ(x_1)}, . . . , e_{ϕ(x_L)}]   (8)

that simply 'replaces' each word x_ℓ of the input sentence by the one-hot encoding e_{ϕ(x_ℓ)} of its corresponding concept will do. A bit of thought reveals that the nearest neighbor classification rule associated with the feature map (8) perfectly solves the task T_1. This is because sentences generated by the same sequence of concepts are mapped by ψ to the exact same location in feature space. As a consequence, the nearest neighbor classification rule will match the unfamiliar test sentence x to the unique training sentence y that occupies the same location in feature space, and this training sentence has the correct label by construction (assuming that sequences of concepts from different categories are distinct). To put it formally:

Theorem 2. Given a task T ∈ Φ × Z^{2R} satisfying c_r ≠ c_s and c̄_r ≠ c̄_s for all r ≠ s, there exists a feature space F and a feature map ψ : X → F such that err(F, ψ, T) = 0.

Consider now the case where T = {T_1, T_2} consists of two tasks. According to Theorem 2 there exists a ψ that perfectly solves T_1, but this ψ might perform poorly on T_2, and vice versa.
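The feature map (8) is simple to implement. Below is a sketch in which the equipartition is a toy stand-in (consecutive blocks of words map to the same concept); it illustrates the key property used in the argument above: two different sentences generated by the same concept sequence land on the exact same point of R^{L n_c}.

```python
n_w, n_c = 12, 3
phi = lambda w: w * n_c // n_w   # toy equipartition: words 0-3 -> concept 0, etc.

def psi(x):
    """Feature map (8): replace each word by the one-hot encoding of its concept."""
    out = []
    for w in x:
        e = [0.0] * n_c
        e[phi(w)] = 1.0
        out.extend(e)
    return out

# Two different sentences generated by the same sequence of concepts [0, 1, 2]:
x1 = [0, 5, 9]
x2 = [3, 4, 11]
assert psi(x1) == psi(x2)   # identical representations in R^{L * n_c}
```

Since identical representations yield the maximal inner product, the nearest neighbor rule then matches an unfamiliar test sentence to the training sentence with the same concept sequence.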
So, it might not be possible to design good features if we do not know a priori which of these tasks we will face. Theorem 1 states that, in the extreme case where T contains all possible tasks, this is indeed the case: the best possible 'task-agnostic' features ψ perform catastrophically on average. In other words, features must be task-dependent in order to succeed. To draw a very approximate analogy, imagine once again that T = {T_1} and that T_1 represents, say, a hand-written digit classification task. A practitioner, after years of experience, could hand-craft a very good feature map ψ that performs almost perfectly for this task. If we then imagine the case T = {T_1, T_2} where T_1 represents a hand-written digit classification task and T_2 represents, say, an animal classification task, then it becomes more difficult for a practitioner to hand-craft a feature map ψ that works well for both tasks. In this analogy, the size of the set T encodes the amount of knowledge the practitioner has about the specific tasks she will face. The extreme choice T = Φ × Z^{2R} corresponds to the practitioner knowing nothing beyond the fact that natural images are made of patches. Theorem 1 quantifies, in this extreme case, the impossibility of hand-crafting a feature map ψ knowing only the range of possible tasks and not the specific task itself. In a realistic setting the collection of tasks T is smaller, of course, and the data generative process itself is more coherent than in our simplified setup. Nonetheless, we hope our analysis sheds some light on the essential limitations of algorithms that do not learn features. Finally, our empirical results (see Section 6) show that a simple algorithm that learns features does not face this obstacle. We do not need knowledge of the specific task T in order to design a good neural network architecture, but only of the family of tasks T = Φ × Z^{2R} that we will face.
Indeed, the architecture in Figure 2 succeeds at classifying unfamiliar test sentences more than 99% of the time. This probability, which we empirically evaluate, is with respect to the choice of the task, the choice of the training set, and the choice of the unfamiliar test sentence (we use the values of L, n_w, n_c and R from Theorem 1, and n_spl = 6, for this experiment). Continuing with our approximate analogy, this means our hypothetical practitioner needs no domain-specific knowledge beyond the patch structure of natural images when designing a successful architecture. In sum, successful feature design requires task-specific knowledge, while successful architecture design requires only knowledge of the task family.

Main Theorem: Our main theoretical result extends Theorem 1 to arbitrary values of L, n_w, n_c, n_spl and R. The resulting formula involves various combinatorial quantities. We denote by C(n, k) the binomial coefficients and by S(n, k) the Stirling numbers of the second kind. Let N = {0, 1, 2, . . .} and let γ, γ̄ : N^{L+1} → N be the functions defined by

γ(k) := Σ_{i=1}^{L+1} (i − 1) k_i   and   γ̄(k) := Σ_{i=1}^{L+1} i k_i,

respectively. We then define, for 0 ≤ ℓ ≤ L, the sets

S_ℓ := { k ∈ N^{L+1} : γ̄(k) = n_w and ℓ ≤ γ(k) ≤ L }.

We let S = S_0, and we note that the inclusion S_ℓ ⊂ S always holds. Given k ∈ N^{L+1} we denote by

A_k := { A ∈ N^{(L+1)×n_c} : Σ_{i=1}^{L+1} i A_{ij} = n_w/n_c for all j, and Σ_{j=1}^{n_c} A_{ij} = k_i for all i }

the set of k-admissible matrices. Finally, we let f, g : S → R be the functions defined by

f(k) := ((n_w/n_c)!)^{n_c} / (n_c^L n_w!) · Σ_{A∈A_k} Π_{i=1}^{L+1} k_i! / (A_{i,1}! A_{i,2}! ⋯ A_{i,n_c}!)

and

g(k) := γ̄(k)! / n_w^{2L} · n_w! / (k_1! k_2! ⋯ k_{L+1}!) · Π_{i=2}^{L+1} ( i^{i−2} / i! )^{k_i} · [ Σ_{i=γ(k)}^{L} C(L, i) i^{γ(k)} 2^i n_w^{L−i} ],

respectively. With these definitions in hand, we may now state our main theorem.

Theorem 3 (Main Theorem). Let T = Φ × Z^{2R}. Then

err(F, ψ, T) ≥ Σ_{k∈S_ℓ} f(k) g(k) − (1/R) ( 1 + (1/2) max_{k∈S_ℓ} f(k) )   (9)

for all feature spaces F, all feature maps ψ : X → F, and all 0 ≤ ℓ ≤ L.
The combinatorial quantities involved appear a bit daunting at first glance, but, within the context of the proof, they all take on a quite intuitive meaning. The heart of the proof involves the analysis of a measure of concentration that we call the permuted moment, and of an associated graph-cut problem. The combinatorial quantities arise quite naturally in the course of analyzing the graph-cut problem. We provide a quick overview of the proof in Section 5, and refer to the appendix for full details. For now, it suffices to note that we have a formula (i.e. the right-hand side of (9)) that can be evaluated exactly with a few lines of code. This formula provides a relatively tight lower bound on the generalization error. Theorem 1 is then a direct consequence: plugging L = 9, n_w = 150, n_c = 5, R = 1000 and ℓ = 7 into the right-hand side of (9) gives the claimed 98.4% lower bound.
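For instance, the elementary ingredients of the formula, binomial coefficients C(n, k) and Stirling numbers of the second kind S(n, k), can be evaluated exactly in a few lines. This is a small illustrative sketch; the helper name stirling2 is our own.

```python
from math import comb

def stirling2(n, k):
    """Stirling number of the second kind: partitions of an n-element set
    into k nonempty blocks, via S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == k:
        return 1          # includes the convention S(0, 0) = 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

assert comb(9, 2) == 36          # binomial coefficient C(9, 2)
assert stirling2(4, 2) == 7      # {1,2,3,4} into two nonempty blocks
```

Exact integer arithmetic of this kind is all that is needed to evaluate the right-hand side of (9) numerically.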

[Figure 1: example training sentences, e.g. x_{1,1} = [cheese, butter, lettuce, chicken, leek]; x_{1,2} = [yogurt, cheese, carrot, pork, carrot]; x_{1,3} = [carrot, pork, cream, carrot, cheese]; x_{1,4} = [lettuce, chicken, butter, potato, butter]; x_{1,5} = [lettuce, beef, yogurt, leek, cream]; x_{1,6} = [potato, lamb, butter, potato, yogurt].]
O Y f k Y + w j V i i 7 6 Y m p u a L 5 u a C 5 y n 3 e Z 6 g G u d f m 2 a n v 1 W s v U i t P t u J A E 1 U x 3 9 J h f 7 n Q H B 4 N i o a Z h l 0 b X K t e p v h 0 D N 2 A k i y G R h G I h H H u Q S k 9 h L i N C I e + 4 m Y A U k w k e g a P N B M c g P F X c y h z t a y R A I e P 6 T S Q q 0 H q E w r E Q U 1 0 o 2 t d Z j s W q z 4 D r f E 4 m w 7 e e i p I 0 k 5 C Q m V C Y U S Q Z M l c c B R E H I u l U G 5 j w S O e K y B h z T H Q T G y p y H P d L s f 4 8 o S W O 8 U j G q O g L i c 0 c 6 l + F K T 2 A U B t F V S q e T o F S d p 0 r P v J z p X v a R + X n u E E d c d D D X D I N 5 a g g N 3 g c g m q / Y 0 3 R j + Y k c E 1 Y H G N 9 t r N h y R 3 b U 8 q t R x U O M 3 F d O 8 / X x F y t x M w y + l 9 U s B J V l r w c p m f M X p 2 o p j E 8 P H h 3 Y J 8 d d k / 2 y m H b s p 5 Z z 6 0 X l m 2 9 s U 6 s j 9 a p N b R I 6 6 x 1 3 f r S + t r + 1 v 7 R / t n + N a N u b p Q x T 6 2 l 1 f 7 z F 2 L b F M 4 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " A i H Z B a A b 2 k 1 Y 2 b / 0 6 G n a C + f K K m 4 = " > A A A G M n i c h V R N b 9 N A E H U b A i V 8 t I U j l x V p I 4 S i K m 7 L 1 w G p E h e O r U R o J d u q 1 u t x Y m X t t X b X b a O V J f h R / B j E C X F D / A h 2 H c d x 4 l B W t j V 6 8 2 b f z O y s / Z R G Q g 4 G 3 z c 2 W 3 f a d + 9 t 3 e 8 8 e P j o 8 f b O 7 p P P g m W c w J A w y v i F j w X Q K I G h j C S F i 5 Q D j n 0 K 5 / 7 k g / G f X w E X E U s + y W k K X o x H S R R G B E s N X e 5 u / t 7 b 6 7 g + j K J E x V j y 6 O Z l 7 l C v o 1 w / R D f 5 p b L t H K H e e 9 R D y O m 5 E m 5 k o M g Y Q E D e 1 4 4 S 8 T M p g W t k B l w p C l J m x H B 6 q I B i H R W R C S Q L y L B g Y r b 3 k O v q p 6 Z 6 2 F S d s l H G Z V 2 1 y m O u S j D n T N Z F U 8 Y n d c W S s V b z a K 7 p N P Y r F R b 7 z T M w v V 6 z f 4 V U S W q g q X j c U F z 0 b V 3 j / t 3 v l E k s m a G Y w G I t U 0 3 F W n 5 Z / 9 U t + p W 8 D x D W C 6 r O Y f k Y + w j V i i 7 6 Y m p u a L 5 u a C 5 y n 3 e Z 6 g 
G u d f m 2 a n v 1 W s v U i t P t u J A E 1 U x 3 9 J h f 7 n Q H B 4 N i o a Z h l 0 b X K t e p v h 0 D N 2 A k i y G R h G I h H H u Q S k 9 h L i N C I e + 4 m Y A U k w k e g a P N B M c g P F X c y h z t a y R A I e P 6 T S Q q 0 H q E w r E Q U 1 0 o 2 t d Z j s W q z 4 D r f E 4 m w 7 e e i p I 0 k 5 C Q m V C Y U S Q Z M l c c B R E H I u l U G 5 j w S O e K y B h z T H Q T G y p y H P d L s f 4 8 o S W O 8 U j G q O g L i c 0 c 6 l + F K T 2 A U B t F V S q e T o F S d p 0 r P v J z p X v a R + X n u E E d c d D D X D I N 5 a g g N 3 g c g m q / Y 0 3 R j + Y k c E 1 Y H G N 9 t r N h y R 3 b U 8 q t R x U O M 3 F d O 8 / X x F y t x M w y + l 9 U s B J V l r w c p m f M X p 2 o p j E 8 P H h 3 Y J 8 d d k / 2 y m H b s p 5 Z z 6 0 X l m 2 9 s U 6 s j 9 a p N b R I 6 6 x 1 3 f r S + t r + 1 v 7 R / t n + N a N u b p Q x T 6 2 l 1 f 7 z F 2 L b F M 4 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " A i H Z B a A b 2 k 1 Y 2 b / 0 6 G n a C + f K K m 4 = " > A A A G M n i c h V R N b 9 N A E H U b A i V 8 t I U j l x V p I 4 S i K m 7 L 1 w G p E h e O r U R o J d u q 1 u t x Y m X t t X b X b a O V J f h R / B j E C X F D / A h 2 H c d x 4 l B W t j V 6 8 2 b f z O y s / Z R G Q g 4 G 3 z c 2 W 3 f a d + 9 t 3 e 8 8 e P j o 8 f b O 7 p P P g m W c w J A w y v i F j w X Q K I G h j C S F i 5 Q D j n 0 K 5 / 7 k g / G f X w E X E U s + y W k K X o x H S R R G B E s N X e 5 u / t 7 b 6 7 g + j K J E x V j y 6 O Z l 7 l C v o 1 w / R D f 5 p b L t H K H e e 9 R D y O m 5 E m 5 k o M g Y Q E D e 1 4 4 S 8 T M p g W t k B l w p C l J m x H B 6 q I B i H R W R C S Q L y L B g Y r b 3 k O v q p 6 Z 6 2 F S d s l H G Z V 2 1 y m O u S j D n T N Z F U 8 Y n d c W S s V b z a K 7 p N P Y r F R b 7 z T M w v V 6 z f 4 V U S W q g q X j c U F z 0 b V 3 j / t 3 v l E k s m a G Y w G I t U 0 3 F W n 5 Z / 9 U t + p W 8 D x D W C 6 r O Y f k Y + w j V i i 7 6 Y m p u a L 5 u a C 5 y n 3 e Z 6 g G u d f m 2 a n v 1 W s v U i t P t u J A E 1 U x 3 9 J h f 7 
n Q H B 4 N i o a Z h l 0 b X K t e p v h 0 D N 2 A k i y G R h G I h H H u Q S k 9 h L i N C I e + 4 m Y A U k w k e g a P N B M c g P F X c y h z t a y R A I e P 6 T S Q q 0 H q E w r E Q U 1 0 o 2 t d Z j s W q z 4 D r f E 4 m w 7 e e i p I 0 k 5 C Q m V C Y U S Q Z M l c c B R E H I u l U G 5 j w S O e K y B h z T H Q T G y p y H P d L s f 4 8 o S W O 8 U j G q O g L i c 0 c 6 l + F K T 2 A U B t F V S q e T o F S d p 0 r P v J z p X v a R + X n u E E d c d D D X D I N 5 a g g N 3 g c g m q / Y 0 3 R j + Y k c E 1 Y H G N 9 t r N h y R 3 b U 8 q t R x U O M 3 F d O 8 / X x F y t x M w y + l 9 U s B J V l r w c p m f M X p 2 o p j E 8 P H h 3 Y J 8 d d k / 2 y m H b s p 5 Z z 6 0 X l m 2 9 s U 6 s j 9 a p N b R I 6 6 x 1 3 f r S + t r + 1 v 7 R / t n + N a N u b p Q x T 6 2 l 1 f 7 z F 2 L b F M 4 = < / l a t < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3 v t 9 I 8 + F R C 6 F x Y = " > A A A E O H i c j V J N j 9 M w E E 1 a P p b w 1 Y U j F 4 u W g l B V N W U l 4 I C 0 E h e O i 0 T p S k 1 U O c 4 0 a 6 3 t V L b T 3 c j K F X 4 U v 4 Q j J 8 Q R f g F O m l b b d C W w H G s 8 8 5 7 f T G a i J a N K j 0 b f 3 V b 7 x s 1 b t w / u e H f v 3 X / w s H P 4 6 L N K M 0 l g Q l K W y t M I K 2 B U w E R T z e B 0 K Q H z i M E 0 O n 9 f x q c r k I q m 4 p P O l x B y n A i 6 o A R r 6 5 p 3 f v d 6 X h B B Q o X h W E t 6 + b K Y s d A z Q b R A p H g + H 6 P + u z 6 a 9 V G g 4 V L H J s Z U 5 s U A 1 Q 5 u O G B d 3 v / j u m E j 1 A 9 R E N i 9 0 W n I b H k N Y v O h z X 1 l V p A k F A r U U L Z C X g A i 3 t b m 9 X r z T n c 0 H F U L 7 R t + b X S d e p 3 M D 1 u j I E 5 J x k F o w r B S M 3 + 0 1 K H B U l P C o P C C T M E S k 3 O c w M y a A n N Q o a l 6 U 6 B n 1 h O j R S r t J z S q v F c Z B n O l c h 5 Z p E 3 y T D V j p f O 6 2 C z T i z e h o W K Z a R B k L b T I G N I p K h u N Y i q B a J Z b A x N J b a 6 I n G G J i b b j 0 F T R Z 3 x Q i w 0 2 C e 1 g y o h O U 6 Y G S l s X C D s x Z e k x L K x R V W V 4 n 
g N j 6 U V h Z B I V x v 7 T A a q P o z 1 o I g H E B l l C X l X g P Z y E e P v e k Y X Y b T E C L k j K O b a t X f e 7 m P m h M c F V V h U o B 6 z r F 8 U 1 n F W D s 8 7 o X 6 y 4 w a p L 3 q X Z G f O b E 7 V v T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3 v t 9 I 8 + F R C 6 F x Y = " > A A A E O H i c j V J N j 9 M w E E 1 a P p b w 1 Y U j F 4 u W g l B V N W U l 4 I C 0 E h e O i 0 T p S k 1 U O c 4 0 a 6 3 t V L b T 3 c j K F X 4 U v 4 Q j J 8 Q R f g F O m l b b d C W w H G s 8 8 5 7 f T G a i J a N K j 0 b f 3 V b 7 x s 1 b t w / u e H f v 3 X / w s H P 4 6 L N K M 0 l g Q l K W y t M I K 2 B U w E R T z e B 0 K Q H z i M E 0 O n 9 f x q c r k I q m 4 p P O l x B y n A i 6 o A R r 6 5 p 3 f v d 6 X h B B Q o X h W E t 6 + b K Y s d A z Q b R A p H g + H 6 P + u z 6 a 9 V G g 4 V L H J s Z U 5 s U A 1 Q 5 u O G B d 3 v / j u m E j 1 A 9 R E N i 9 0 W n I b H k N Y v O h z X 1 l V p A k F A r U U L Z C X g A i 3 t b m 9 X r z T n c 0 H F U L 7 R t + b X S d e p 3 M D 1 u j I E 5 J x k F o w r B S M 3 + 0 1 K H B U l P C o P C C T M E S k 3 O c w M y a A n N Q o a l 6 U 6 B n 1 h O j R S r t J z S q v F c Z B n O l c h 5 Z p E 3 y T D V j p f O 6 2 C z T i z e h o W K Z a R B k L b T I G N I p K h u N Y i q B a J Z b A x N J b a 6 I n G G J i b b j 0 F T R Z 3 x Q i w 0 2 C e 1 g y o h O U 6 Y G S l s X C D s x Z e k x L K x R V W V 4 n g N j 6 U V h Z B I V x v 7 T A a q P o z 1 o I g H E B l l C X l X g P Z y E e P v e k Y X Y b T E C L k j K O b a t X f e 7 m P m h M c F V V h U o B 6 z r F 8 U 1 n F W D s 8 7 o X 6 y 4 w a p L 3 q X Z G f O b E 7 V v T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > < l 
a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3 v t 9 I 8 + F R C 6 F x Y = " > A A A E O H i c j V J N j 9 M w E E 1 a P p b w 1 Y U j F 4 u W g l B V N W U l 4 I C 0 E h e O i 0 T p S k 1 U O c 4 0 a 6 3 t V L b T 3 c j K F X 4 U v 4 Q j J 8 Q R f g F O m l b b d C W w H G s 8 8 5 7 f T G a i J a N K j 0 b f 3 V b 7 x s 1 b t w / u e H f v 3 X / w s H P 4 6 L N K M 0 l g Q l K W y t M I K 2 B U w E R T z e B 0 K Q H z i M E 0 O n 9 f x q c r k I q m 4 p P O l x B y n A i 6 o A R r 6 5 p 3 f v d 6 X h B B Q o X h W E t 6 + b K Y s d A z Q b R A p H g + H 6 P + u z 6 a 9 V G g 4 V L H J s Z U 5 s U A 1 Q 5 u O G B d 3 v / j u m E j 1 A 9 R E N i 9 0 W n I b H k N Y v O h z X 1 l V p A k F A r U U L Z C X g A i 3 t b m 9 X r z T n c 0 H F U L 7 R t + b X S d e p 3 M D 1 u j I E 5 J x k F o w r B S M 3 + 0 1 K H B U l P C o P C C T M E S k 3 O c w M y a A n N Q o a l 6 U 6 B n 1 h O j R S r t J z S q v F c Z B n O l c h 5 Z p E 3 y T D V j p f O 6 2 C z T i z e h o W K Z a R B k L b T I G N I p K h u N Y i q B a J Z b A x N J b a 6 I n G G J i b b j 0 F T R Z 3 x Q i w 0 2 C e 1 g y o h O U 6 Y G S l s X C D s x Z e k x L K x R V W V 4 n g N j 6 U V h Z B I V x v 7 T A a q P o z 1 o I g H E B l l C X l X g P Z y E e P v e k Y X Y b T E C L k j K O b a t X f e 7 m P m h M c F V V h U o B 6 z r F 8 U 1 n F W D s 8 7 o X 6 y 4 w a p L 3 q X Z G f O b E 7 V v T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3 v t 9 I 8 + F R C 6 F x Y = " > A A A E O H i c j V J N j 9 M w E E 1 a P p b w 1 Y U j F 4 u W g l B V N W U l 4 I C 0 E h e O i 0 T p S k 1 U O c 4 0 a 6 3 t V L b T 3 c j K F X 4 U v 4 Q j J 8 Q R f g F O m l b b d C W w H G s 8 8 5 7 f T G a i J a N K j 0 b f 3 V b 7 x s 1 b t w / u e H f v 3 X / w s H P 4 6 L N K M 0 l g Q l K W y t M I K 2 
B U w E R T z e B 0 K Q H z i M E 0 O n 9 f x q c r k I q m 4 p P O l x B y n A i 6 o A R r 6 5 p 3 f v d 6 X h B B Q o X h W E t 6 + b K Y s d A z Q b R A p H g + H 6 P + u z 6 a 9 V G g 4 V L H J s Z U 5 s U A 1 Q 5 u O G B d 3 v / j u m E j 1 A 9 R E N i 9 0 W n I b H k N Y v O h z X 1 l V p A k F A r U U L Z C X g A i 3 t b m 9 X r z T n c 0 H F U L 7 R t + b X S d e p 3 M D 1 u j I E 5 J x k F o w r B S M 3 + 0 1 K H B U l P C o P C C T M E S k 3 O c w M y a A n N Q o a l 6 U 6 B n 1 h O j R S r t J z S q v F c Z B n O l c h 5 Z p E 3 y T D V j p f O 6 2 C z T i z e h o W K Z a R B k L b T I G N I p K h u N Y i q B a J Z b A x N J b a 6 I n G G J i b b j 0 F T R Z 3 x Q i w 0 2 C e 1 g y o h O U 6 Y G S l s X C D s x Z e k x L K x R V W V 4 n g N j 6 U V h Z B I V x v 7 T A a q P o z 1 o I g H E B l l C X l X g P Z y E e P v e k Y X Y b T E C L k j K O b a t X f e 7 m P m h M c F V V h U o B 6 z r F 8 U 1 n F W D s 8 7 o X 6 y 4 w a p L 3 q X Z G f O b E 7 V v T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > c01 = [ dairy, dairy, veggie, meat, veggie ] c1 = [ veggie, meat, dairy, veggie, dairy ] < l a t e x i t s h a 1 _ b a s e 6 4 = " i j i C o 1 a 4 M d g r Q b 2 N D B L X 3 7 8 L Z h 8 = " > A A A E Q H i c h V J N j 9 M w E E 1 a P p b w s V 0 4 c r F o K Q h V V b K s B B y Q V u L C c Z E o u 1 I T V Y 4 z z V p r O 5 X t d r e y c u Q K P 4 p f w U / g h D j C C S d N q 2 1 S h O V E k 5 n 3 5 s 1 k J p 4 x q r T v f 3 d b 7 R s 3 b 9 3 e u + P d v X f / w X 7 n 4 O E n l c 0 l g R H J W C b P Y q y A U Q E j T T W D s 5 k E z G M G p / H F u y J + u g C p a C Y + 6 u U M I o 5 T Q a e U Y G 1 d k 8 7 v X s 8 L Y 0 i p M B x r S a 9 e 5 G M W e S a M p 4 j k z y Y B 6 r / t o z H q h x q u d G I S T O U y H 6 C G Y / W 9 M A t I U w o b B z c c s G 7 G b Y I I h a G 9 a 6 2 1 V B / V M + 1 O 9 U 9 p V E c U U s g L 
Q S S b H r 1 e b 9 L p + k O / P K h p B J X R d a p z M j l o + W G S k T k H o Q n D S o 0 D f 6 Y j g 6 W m h E H u h X M F M 0 w u c A p j a w r M Q U W m n F G O n l p P g q a Z t I / Q q P R e Z x j M l V r y 2 C J t k e e q H i u c u 2 L j u Z 6 + j g w V s 7 k G Q V Z C 0 z l D O k P F w F F C J R D N l t b A R F J b K y L n W G K i 7 V r U V f Q 5 H 1 R i g 3 V B W 5 g i o r O M q Y H S 1 g X C b k 7 R e g J T a 5 R d G b 5 c A m P Z Z W 5 k G u f G / t M B q l 5 H D W g q A c Q a W U B e l u A G T k K y y X d k I f Z a j I B L k n G O 7 W h X O 5 K P g 8 i Y 8 D q r D B R L 1 g 3 y f A d n U e O s K v o f K 6 m x q p a 3 a X b H g v p G N Y 3 R 4 f D N M P h w 2 D 3 u V c u 2 5 z x 2 n j j P n c B 5 5 R w 7 7 5 0 T Z + Q Q d + J + d r + 4 X 9 v f 2 j / a P 9 u / V t C W W 3 E e O V u n / e c v J L l s X Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " i j i C o 1 a 4 M d g r Q b 2 N D B L X 3 7 8 L Z h 8 = " > A A A E Q H i c h V J N j 9 M w E E 1 a P p b w s V 0 4 c r F o K Q h V V b K s B B y Q V u L C c Z E o u 1 I T V Y 4 z z V p r O 5 X t d r e y c u Q K P 4 p f w U / g h D j C C S d N q 2 1 S h O V E k 5 n 3 5 s 1 k J p 4 x q r T v f 3 d b 7 R s 3 b 9 3 e u + P d v X f / w X 7 n 4 O E n l c 0 l g R H J W C b P Y q y A U Q E j T T W D s 5 k E z G M G p / H F u y J + u g C p a C Y + 6 u U M I o 5 T Q a e U Y G 1 d k 8 7 v X s 8 L Y 0 i p M B x r S a 9 e 5 G M W e S a M p 4 j k z y Y B 6 r / t o z H q h x q u d G I S T O U y H 6 C G Y / W 9 M A t I U w o b B z c c s G 7 G b Y I I h a G 9 a 6 2 1 V B / V M + 1 O 9 U 9 p V E c U U s g L Q S S b H r 1 e b 9 L p + k O / P K h p B J X R d a p z M j l o + W G S k T k H o Q n D S o 0 D f 6 Y j g 6 W m h E H u h X M F M 0 w u c A p j a w r M Q U W m n F G O n l p P g q a Z t I / Q q P R e Z x j M l V r y 2 C J t k e e q H i u c u 2 L j u Z 6 + j g w V s 7 k G Q V Z C 0 z l D O k P F w F F C J R D N l t b A R F J b K y L n W G K i 7 V r U V f Q 5 H 1 R i g 3 V B W 5 g i o r O M q Y H S 1 g X 
C b k 7 R e g J T a 5 R d G b 5 c A m P Z Z W 5 k G u f G / t M B q l 5 H D W g q A c Q a W U B e l u A G T k K y y X d k I f Z a j I B L k n G O 7 W h X O 5 K P g 8 i Y 8 D q r D B R L 1 g 3 y f A d n U e O s K v o f K 6 m x q p a 3 a X b H g v p G N Y 3 R 4 f D N M P h w 2 D 3 u V c u 2 5 z x 2 n j j P n c B 5 5 R w 7 7 5 0 T Z + Q Q d + J + d r + 4 X 9 v f 2 j / a P 9 u / V t C W W 3 E e O V u n / e c v J L l s X Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " i j i C o 1 a 4 M d g r Q b 2 N D B L X 3 7 8 L Z h 8 = " > A A A E Q H i c h V J N j 9 M w E E 1 a P p b w s V 0 4 c r F o K Q h V V b K s B B y Q V u L C c Z E o u 1 I T V Y 4 z z V p r O 5 X t d r e y c u Q K P 4 p f w U / g h D j C C S d N q 2 1 S h O V E k 5 n 3 5 s 1 k J p 4 x q r T v f 3 d b 7 R s 3 b 9 3 e u + P d v X f / w X 7 n 4 O E n l c 0 l g R H J W C b P Y q y A U Q E j T T W D s 5 k E z G M G p / H F u y J + u g C p a C Y + 6 u U M I o 5 T Q a e U Y G 1 d k 8 7 v X s 8 L Y 0 i p M B x r S a 9 e 5 G M W e S a M p 4 j k z y Y B 6 r / t o z H q h x q u d G I S T O U y H 6 C G Y / W 9 M A t I U w o b B z c c s G 7 G b Y I I h a G 9 a 6 2 1 V B / V M + 1 O 9 U 9 p V E c U U s g L Q S S b H r 1 e b 9 L p + k O / P K h p B J X R d a p z M j l o + W G S k T k H o Q n D S o 0 D f 6 Y j g 6 W m h E H u h X M F M 0 w u c A p j a w r M Q U W m n F G O n l p P g q a Z t I / Q q P R e Z x j M l V r y 2 C J t k e e q H i u c u 2 L j u Z 6 + j g w V s 7 k G Q V Z C 0 z l D O k P F w F F C J R D N l t b A R F J b K y L n W G K i 7 V r U V f Q 5 H 1 R i g 3 V B W 5 g i o r O M q Y H S 1 g X C b k 7 R e g J T a 5 R d G b 5 c A m P Z Z W 5 k G u f G / t M B q l 5 H D W g q A c Q a W U B e l u A G T k K y y X d k I f Z a j I B L k n G O 7 W h X O 5 K P g 8 i Y 8 D q r D B R L 1 g 3 y f A d n U e O s K v o f K 6 m x q p a 3 a X b H g v p G N Y 3 R 4 f D N M P h w 2 D 3 u V c u 2 5 z x 2 n j j P n c B 5 5 R w 7 7 5 0 T Z + Q Q d + J + d r + 4 X 9 v f 2 j / a P 9 u / V t C W W 3 E e O V u n / e 
c v J L l s X Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " i j i C o 1 a 4 M d g r Q b 2 N D B L X 3 7 8 L Z h 8 = " > A A A E Q H i c h V J N j 9 M w E E 1 a P p b w s V 0 4 c r F o K Q h V V b K s B B y Q V u L C c Z E o u 1 I T V Y 4 z z V p r O 5 X t d r e y c u Q K P 4 p f w U / g h D j C C S d N q 2 1 S h O V E k 5 n 3 5 s 1 k J p 4 x q r T v f 3 d b 7 R s 3 b 9 3 e u + P d v X f / w X 7 n 4 O E n l c 0 l g R H J W C b P Y q y A U Q E j T T W D s 5 k E z G M G p / H F u y J + u g C p a C Y + 6 u U M I o 5 T Q a e U Y G 1 d k 8 7 v X s 8 L Y 0 i p M B x r S a 9 e 5 G M W e S a M p 4 j k z y Y B 6 r / t o z H q h x q u d G I S T O U y H 6 C G Y / W 9 M A t I U w o b B z c c s G 7 G b Y I I h a G 9 a 6 2 1 V B / V M + 1 O 9 U 9 p V E c U U s g L Q S S b H r 1 e b 9 L p + k O / P K h p B J X R d a p z M j l o + W G S k T k H o Q n D S o 0 D f 6 Y j g 6 W m h E H u h X M F M 0 w u c A p j a w r M Q U W m n F G O n l p P g q a Z t I / Q q P R e Z x j M l V r y 2 C J t k e e q H i u c u 2 L j u Z 6 + j g w V s 7 k G Q V Z C 0 z l D O k P F w F F C J R D N l t b A R F J b K y L n W G K i 7 V r U V f Q 5 H 1 R i g 3 V B W 5 g i o r O M q Y H S 1 g X C b k 7 R e g J T a 5 R d G b 5 c A m P Z Z W 5 k G u f G / t M B q l 5 H D W g q A c Q a W U B e l u A G T k K y y X d k I f Z a j I B L k n G O 7 W h X O 5 K P g 8 i Y 8 D q r D B R L 1 g 3 y f A d n U e O s K v o f K 6 m x q p a 3 a X b H g v p G N Y 3 R 4 f D N M P h w 2 D 3 u V c u 2 5 z x 2 n j j P n c B 5 5 R w 7 7 5 0 T Z + Q Q d + J + d r + 4 X 9 v f 2 j / a P 9 u / V t C W W 3 E e O V u n / e c v J L l s X Q = = < / l a t e x i t > x21 = [ butter, pork, lamb, lamb, yogurt ] x22 = [ cream, beef, chicken, pork, butter ] x23 = [ chicken, cheese, cream, lettuce, beef ] x24 = [ beef, cheese, cheese, carrot, pork ] x25 = [ lamb, butter, cream, potato, lamb ] x26 = [ chicken, cream, butter, leek, pork ] < l a t e x i t s h a 1 _ b a s e 6 4 = " o p P b O u / 7 t S m S s p 0 A 7 o / c J L B 5 0 
Using a simple union bound, inequality (9) easily extends to this situation; the resulting formula is a bit cumbersome, so we present it in the appendix (see Theorem 7). In the concrete case where L = 9, n_w = 150, n_c = 5, R = 1000, this formula simplifies to

err(F, ψ, T) ≥ 1 − 0.015 n* − 1/R for all F and all ψ,   (10)

therefore exhibiting an affine relationship between the error rate and the number n* of unfamiliar sentences per category. Note that choosing n* = 1 in (10) leads to a 98.4% lower bound on the error rate, therefore recovering the result from Theorem 1. This lower bound then decreases by 1.5% with each additional unfamiliar sentence per category in the training set. We would like to emphasize one more time the importance of non-asymptotic analysis in the long-tailed learning setting. For example, in inequality (10), the difficulty lies in obtaining a value as small as possible for the coefficient in front of n*. We accomplish this via a careful analysis of the graph-cut problem associated with our data model.
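As a quick numerical check, the affine bound (10) can be evaluated directly. The sketch below is ours, not part of the paper; the helper name is illustrative.

```python
def error_lower_bound(n_star: int, R: int = 1000) -> float:
    """Lower bound (10) on the test error when each category has
    n_star unfamiliar sentences in the training set (L=9, n_w=150, n_c=5)."""
    return 1.0 - 0.015 * n_star - 1.0 / R

# n_star = 1 recovers the 98.4% bound of Theorem 1; each additional
# unfamiliar sentence per category lowers the bound by 1.5%.
print(round(error_lower_bound(1), 4))  # 0.984
print(round(error_lower_bound(2), 4))  # 0.969
```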

5. PROOF OUTLINE - PERMUTED MOMENT AND OPTIMAL FEATURE MAP

The proof involves two main ingredients. First, the key insight of our analysis is the realization that generalization in our data model is closely tied to the permuted moment of a probability distribution. To state this central concept, it will prove convenient to think of probability distributions on X as vectors p ∈ R^N with N = |X|, together with indices 0 ≤ i ≤ N − 1 given by some arbitrary (but fixed) indexing of the elements of data space. Then p_i denotes the probability of the i-th element of X in this indexing. We use S_N to denote the set of permutations of {0, 1, . . . , N − 1} and σ ∈ S_N to refer to a particular permutation. The t-th permuted moment of the probability vector p ∈ R^N is

H_t(p) := max_{σ ∈ S_N} Σ_{i=0}^{N−1} (i/N)^t p_{σ(i)}.   (11)

Since (11) involves a maximum over all possible permutations, the definition clearly does not depend on the way the set X was indexed. In order to maximize the sum, the permutation σ must match the largest values of p_i with the largest values of (i/N)^t, so the maximizing permutation simply orders the entries p_i from smallest to largest. A very peaked distribution that gives large probability to only a handful of elements of X will have a large permuted moment. Because of this, the permuted moment is akin to the negative entropy: it takes large values on delta-like distributions and small values on uniform ones. From definition (11) it is clear that 0 ≤ H_t(p) ≤ 1 for all probability vectors p, and it is easily verified that the permuted moment is convex. These properties, as well as various useful bounds for the permuted moment, are presented and proven in the appendix. Second, we identify a specific feature map ψ* : X → F* which is optimal for a collection of tasks closely related to the ones considered in our data model. Leveraging the optimality of ψ* on these related tasks allows us to derive an error bound that holds for the tasks of interest.
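The sort-and-pair recipe just described can be sketched in a few lines (a minimal illustration; the function name is ours):

```python
import numpy as np

def permuted_moment(p, t):
    """t-th permuted moment H_t(p): sort p ascending, then pair the
    i-th smallest entry with the weight (i/N)^t, as in (11)."""
    p = np.sort(np.asarray(p, dtype=float))   # ordering permutation
    N = len(p)
    weights = (np.arange(N) / N) ** t
    return float(weights @ p)

N = 1000
delta = np.zeros(N); delta[0] = 1.0   # very peaked distribution
uniform = np.full(N, 1.0 / N)         # very spread distribution

# A peaked distribution has permuted moment close to 1, while a uniform
# one has a much smaller moment (roughly 1/(t+1)).
print(permuted_moment(delta, t=3))    # close to 1
print(permuted_moment(uniform, t=3))  # about 0.25
```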
The feature map ψ* is better understood through its associated kernel, which is given by the formula

K*(x, y) = ⟨ψ*(x), ψ*(y)⟩_F = (n_c^L / n_w^L) · |{ϕ ∈ Φ : ϕ(x_ℓ) = ϕ(y_ℓ) for all 1 ≤ ℓ ≤ L}| / |Φ|.   (12)

Up to normalization, K*(x, y) simply counts the number of equipartitions of the vocabulary for which sentences x and y have the same underlying sequence of concepts. Intuitively this makes sense, for the best possible kernel must leverage the only information we have at hand. We know the general structure of the problem (words are partitioned into concepts) but not the partition itself. So to determine whether sentences (x, y) were generated by the same sequence of concepts, the best we can do is to try all possible equipartitions of the vocabulary and count how many of them wind up generating (x, y) from the same underlying sequence of concepts. A high count makes it more likely that (x, y) were generated by the same sequence of concepts. The optimal kernel K* does exactly this, and provides a good (in fact optimal, see the appendix) measure of similarity between pairs of sentences. For fixed x ∈ X, the function y ↦ K*(x, y) defines a probability distribution on data space. The connection between generalization error, permuted moment, and optimal feature map comes from the fact that

sup_{F,ψ} [1 − err(F, ψ, T)] ≤ (1/|X|) Σ_{x ∈ X} H_{2R−1}(K*(x, ·)) + 1/R,

and so, up to a small error 1/R, it is the permuted moments of K* that determine the success rate. We then obtain the lower bound (9) by studying these moments in great detail. A simple union bound is then used to obtain inequalities such as (10).
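The counting procedure behind K* can be approximated by Monte Carlo: sample random equipartitions of the vocabulary and record how often x and y share the same underlying concept sequence. This is our own illustrative sketch, assuming words are represented as integers 0, ..., n_w − 1 and that n_c divides n_w; the function names are not from the paper.

```python
import random

def random_equipartition(n_w, n_c):
    """Uniformly random split of words {0,...,n_w-1} into n_c equal concepts."""
    words = list(range(n_w))
    random.shuffle(words)
    size = n_w // n_c
    phi = [0] * n_w
    for c in range(n_c):
        for w in words[c * size:(c + 1) * size]:
            phi[w] = c
    return phi

def kernel_estimate(x, y, n_w, n_c, n_samples=5000):
    """Monte Carlo estimate of K*(x, y) from (12): the fraction of sampled
    equipartitions phi with phi(x_l) = phi(y_l) for all l, rescaled by (n_c/n_w)^L."""
    L = len(x)
    hits = sum(
        all(phi[a] == phi[b] for a, b in zip(x, y))
        for phi in (random_equipartition(n_w, n_c) for _ in range(n_samples))
    )
    return (n_c / n_w) ** L * hits / n_samples

random.seed(0)
x = [0, 1, 2]
# Identical sentences share their concept sequence under every equipartition,
# so the estimate reduces exactly to (n_c/n_w)^L.
print(kernel_estimate(x, x, n_w=6, n_c=3))  # 0.125
```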

6. EMPIRICAL RESULTS

We conclude by presenting empirical results that complement our theoretical findings. The full details of these experiments (training procedure, hyperparameter choices, number of experiments run to estimate the success rates, and standard deviations of these success rates), as well as additional experiments, can be found in Appendix E. Code is available at https://github.com/xbresson/Long_Tailed_Learning_Requires_Feature_Learning.

Parameter Settings. We consider five parameter settings for the data model depicted in Figure 3. Each setting corresponds to a column in Table 1. In all five settings, we set the parameters L = 9, n_w = 150, n_c = 5 and R = 1000 to the values for which the error bound (10) holds. We choose values for the parameters n_spl and n* so that the i-th column of the [...] where e_{x_ℓ} denotes the one-hot encoding of the ℓ-th word of the input sentence. Finally, the last row considers an SVM with a Gaussian kernel (also known as the RBF kernel).

Results. The first two rows of the table correspond to algorithms that learn features from the data; the remaining rows correspond to algorithms that use a predetermined (not learned) feature map. When n* = 1, our main theorem states that no feature map can succeed more than 1.6% of the time on unfamiliar test sentences (fifth row of the table). At first glance this appears to contradict the empirical performance of the feature map extracted by the neural network, which succeeds 99% of the time (second row of the table). The resolution of this apparent contradiction lies in the order of operations, which is precisely what separates hand-crafted or fixed features from learned features. If we choose the feature map before the random selection of the task, then the algorithm performs poorly since it uses unlearned, task-independent features.
By contrast, the neural network learns a feature map from the training set, and since the training set is generated by the task, this process takes place after the random selection of the task. It therefore uses task-dependent features, and the network performs almost perfectly on the specific task that generated its training set. But by our main theorem, it too must fail if the task changes while the features do not.

In Section B we prove the inequality
$$\sup_{\mathcal{F},\psi}\ \big[1-\mathrm{err}(\mathcal{F},\psi,\mathcal{T})\big] \ \le\ \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_{2R-1}\big(K^\star(x,\cdot)\big) + \frac{1}{R} \qquad (14)$$
where the collection of tasks $\mathcal{T} = \Phi\times\mathcal{Z}^{2R}$ consists of all possible tasks that one might encounter. Inequality (14) plays a central role in our work, as it establishes the connection between the generalization error, the permuted moment, and the optimal kernel $K^\star$ defined by (12). The proof is non-technical and easily accessible. In Section C we provide the following upper bound on the permuted moment of the optimal kernel:
$$\frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_{2R-1}\big(K^\star(x,\cdot)\big) \ \le\ 1 - \sum_{k\in S_\ell} f(k)\,g(k) + \frac{1}{2R}\max_{k\in S_\ell} f(k) \quad \text{for all } 0\le\ell\le L. \qquad (15)$$
The proof is combinatorial in nature, and involves the analysis of a graph-cut problem. Combining (14) and (15) establishes Theorem 3. In Section D we consider the case in which each unfamiliar sequence of concepts has $n^*$ representatives in the training set. A simple union bound shows that, in this situation, inequality (14) becomes
$$\sup_{\mathcal{F},\psi}\ \big[1-\mathrm{err}(\mathcal{F},\psi,\mathcal{T})\big] \ \le\ \frac{n^*}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_{2R-1}\big(K^\star(x,\cdot)\big) + \frac{1}{R}. \qquad (16)$$
Combining (16) and (15) then provides our most general error bound; see Theorem 7. Inequality (10) in the main body of the paper is just a special case of Theorem 7. Finally, in Section E, we provide the full details of the experiments.

A PROPERTIES OF THE PERMUTED MOMENT

The permuted moment was defined in Section 5 for probability vectors only. It will prove convenient to consider the permuted moment of nonnegative vectors as well. We denote by $\mathbb{R}_+ = [0,+\infty)$ the nonnegative real numbers, and by $\mathbb{R}^N_+$ the vectors with $N$ nonnegative real entries indexed from $i=0$ to $i=N-1$. The permuted moment of $u\in\mathbb{R}^N_+$ is then given by
$$H_t(u) := \max_{\sigma\in S_N} \sum_{i=0}^{N-1} (i/N)^t\, u_{\sigma(i)}, \qquad (17)$$
where $S_N$ denotes the set of permutations of $\{0,1,\ldots,N-1\}$. The concept of an ordering permutation will prove useful in the next lemma.

Definition 1. $\sigma\in S_N$ is said to be an ordering permutation of $u\in\mathbb{R}^N$ if
$$u_{\sigma(0)} \le u_{\sigma(1)} \le \ldots \le u_{\sigma(N-1)}. \qquad (18)$$
The lemma below shows that the permutation maximizing (17) is the one that sorts the entries $u_i$ from smallest to largest.

Lemma 1. Let $u\in\mathbb{R}^N_+$ and let $\sigma^*$ be an ordering permutation of $u$. Then
$$\sigma^* \in \arg\max_{\sigma\in S_N} \sum_{i=0}^{N-1} (i/N)^t\, u_{\sigma(i)}. \qquad (19)$$
Proof. The optimization problem (19) can be formulated as finding a pairing between the $u_i$'s and the $(i/N)^t$'s that maximizes the sum of the products of the pairs. An ordering permutation of $u$ corresponds to pairing the smallest entry of $u$ with $(0/N)^t$, the second smallest entry with $(1/N)^t$, the third smallest entry with $(2/N)^t$, and so forth. This pairing is clearly optimal.

In light of the previous lemma, we see that computing the permuted moment of a vector $u$ can be accomplished as follows: 1) sort the entries of $u$ from smallest to largest; 2) compute the dot product between this sorted vector and the vector
$$\Big[\big(\tfrac{0}{N}\big)^t,\ \big(\tfrac{1}{N}\big)^t,\ \big(\tfrac{2}{N}\big)^t,\ \ldots,\ \big(\tfrac{N-1}{N}\big)^t\Big]. \qquad (20)$$
Let us now focus on the case where $u$ is a probability distribution. If $u$ is very peaked, it must have a large permuted moment since, after sorting, most of its mass concentrates on the large entries of (20) located on the right. On the contrary, if $u$ is very spread out, it must have a small permuted moment since it 'wastes' its mass on the small entries of (20).
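The sort-then-dot-product recipe can be checked against definition (17) by brute force over all permutations. This is only an illustrative sketch on a small hand-picked vector (an assumption for the example, not data from the paper):

```python
from itertools import permutations

def H(u, t):
    """Permuted moment H_t(u), computed by sorting (Lemma 1):
    pair the i-th smallest entry of u with the weight (i/N)^t."""
    N = len(u)
    return sum((i / N) ** t * ui for i, ui in enumerate(sorted(u)))

def H_bruteforce(u, t):
    """Definition (17): maximize the weighted sum over all permutations."""
    N = len(u)
    return max(sum((i / N) ** t * u[sigma[i]] for i in range(N))
               for sigma in permutations(range(N)))

u = [0.1, 0.5, 0.05, 0.3, 0.05]   # arbitrary small nonnegative vector
assert abs(H(u, 3) - H_bruteforce(u, 3)) < 1e-12
```

The brute-force version visits all $N!$ permutations, so it is only feasible for tiny $N$; the sorting version runs in $O(N\log N)$.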
Because of this, the permuted moment is akin to a negative entropy: it takes large values on delta-like distributions and small values on uniform ones. We now show that the permuted moment is subadditive and one-homogeneous on $\mathbb{R}^N_+$ (as a consequence it is convex on the set of probability vectors), and we derive some elementary $\ell^1$ and $\ell^\infty$ bounds. We denote by $\|u\|_p$ the $\ell^p$-norm of a vector $u$. In particular, if $u\in\mathbb{R}^N_+$, we have $\|u\|_1 := \sum_{i=0}^{N-1} u_i$ and $\|u\|_\infty := \max_{0\le i\le N-1} u_i$. With this notation in hand, we can now state our lemma.

Lemma 2.
(i) $H_t(u+v) \le H_t(u) + H_t(v)$ for all $u,v\in\mathbb{R}^N_+$.
(ii) $H_t(c\,u) = c\, H_t(u)$ for all $u\in\mathbb{R}^N_+$ and all $c\ge 0$.
(iii) $H_t(u) \le \|u\|_1$ for all $u\in\mathbb{R}^N_+$.
(iv) $H_t(u) \le \frac{N}{t+1}\,\|u\|_\infty$ for all $u\in\mathbb{R}^N_+$.

Proof. Properties (i) and (ii) are obvious. To prove (iii) and (iv), define $w_i = (i/N)^t$ and note that
$$\|w\|_\infty \le 1 \quad\text{and}\quad \|w\|_1 = N\cdot\frac{1}{N}\sum_{i=0}^{N-1} (i/N)^t \le N\int_0^1 x^t\,dx = \frac{N}{t+1}.$$
Then (iii) comes from $H_t(u) \le \|w\|_\infty \|u\|_1$, whereas (iv) comes from $H_t(u) \le \|w\|_1 \|u\|_\infty$.

We conclude this section with a slightly more sophisticated bound that holds for probability vectors; this bound will play a central role in Section C.

Lemma 3. Suppose $p\in\mathbb{R}^N_+$ and $\sum_{i=0}^{N-1} p_i = 1$. Then
$$H_t(p) \le 1 - \sum_{i=0}^{N-1}\min\{p_i,\lambda\} + \frac{\lambda N}{t+1} \quad \text{for all } \lambda\ge 0.$$
Proof. Fix a $\lambda\ge 0$ and define the vectors $u$ and $v$ as follows:
$$u_i = \min\{p_i,\lambda\} \quad\text{and}\quad v_i = p_i - \min\{p_i,\lambda\} \quad \text{for all } 0\le i\le N-1.$$
Note that these two vectors are nonnegative and sum to $p$. We can therefore use Lemma 2 to obtain
$$H_t(p) = H_t(u+v) \le H_t(u) + H_t(v) \le \frac{N}{t+1}\,\|u\|_\infty + \|v\|_1.$$
To conclude, we note that $\|u\|_\infty \le \lambda$ and $\|v\|_1 = 1 - \sum_{i=0}^{N-1}\min\{p_i,\lambda\}$.
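As a quick numerical sanity check of Lemma 3 (an illustrative sketch with an arbitrary random probability vector, not data from the paper), one can compare $H_t(p)$ with the bound for several values of $\lambda$:

```python
import random

def H(u, t):
    """Permuted moment via sorting (Lemma 1)."""
    N = len(u)
    return sum((i / N) ** t * ui for i, ui in enumerate(sorted(u)))

random.seed(0)
N, t = 50, 7
p = [random.random() for _ in range(N)]
s = sum(p)
p = [pi / s for pi in p]               # normalize to a probability vector

# Lemma 3: H_t(p) <= 1 - sum_i min(p_i, lam) + lam * N / (t + 1), for all lam >= 0.
for lam in (0.0, 0.01, 0.05, 0.2, 1.0):
    bound = 1 - sum(min(pi, lam) for pi in p) + lam * N / (t + 1)
    assert H(p, t) <= bound + 1e-12
```

The case $\lambda = 0$ recovers bound (iii) of Lemma 2 ($H_t(p)\le 1$), while large $\lambda$ recovers a scaled version of bound (iv).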

B PERMUTED MOMENT OF K AND GENERALIZATION ERROR

This section is devoted to the proof of inequality (14). We start by recalling a few definitions. The vocabulary, set of concepts, data space, and latent space are $V=\{1,\ldots,n_w\}$, $C=\{1,\ldots,n_c\}$, $\mathcal{X}=V^L$ and $\mathcal{Z}=C^L$, respectively. Elements of $\mathcal{X}$ are sentences of $L$ words and take the form $x=[x_1,x_2,\ldots,x_L]$, while elements of $\mathcal{Z}$ take the form $c=[c_1,c_2,\ldots,c_L]$ and correspond to sequences of concepts. We also recall that the collection of all equipartitions of the vocabulary is
$$\Phi = \big\{\text{all functions } \varphi \text{ from } V \text{ to } C \text{ that satisfy } |\varphi^{-1}(\{c\})| = s_c \text{ for all } c\big\},$$
where $s_c := n_w/n_c$ denotes the size of the concepts. Given $\varphi\in\Phi$, we denote by $\hat\varphi:\mathcal{X}\to\mathcal{Z}$ the function
$$\hat\varphi\big([x_1,x_2,\ldots,x_L]\big) := \big[\varphi(x_1),\varphi(x_2),\ldots,\varphi(x_L)\big]$$
that operates on sentences element-wise. The informal statement "the sentence $x$ is randomly generated by the sequence of concepts $c$" means that $x$ is sampled uniformly at random from the set $\hat\varphi^{-1}(\{c\}) = \{x\in\mathcal{X} : \hat\varphi(x)=c\}$. We will often commit the abuse of notation of writing $\hat\varphi^{-1}(c)$ instead of $\hat\varphi^{-1}(\{c\})$. We now formally define the sampling process associated with our main data model.

Sampling Process DM:
(i) Sample $T=(\varphi;\ \bar c_1,\ldots,\bar c_R;\ c_1,\ldots,c_R)$ uniformly at random in $\mathcal{T} = \Phi\times\mathcal{Z}^{2R}$.
(ii) For $r=1,\ldots,R$:
• Sample $(x_{r,1},\ldots,x_{r,n_{\mathrm{unf}}})$ uniformly at random in $\hat\varphi^{-1}(\bar c_r)\times\cdots\times\hat\varphi^{-1}(\bar c_r)$.
• Sample $(x_{r,n_{\mathrm{unf}}+1},\ldots,x_{r,n_{\mathrm{spl}}})$ uniformly at random in $\hat\varphi^{-1}(c_r)\times\cdots\times\hat\varphi^{-1}(c_r)$.
(iii) Sample $x_{\mathrm{test}}$ uniformly at random in $\hat\varphi^{-1}(\bar c_1)$.

Step (i) of the above sampling process consists in selecting at random a task $T$ among all possible tasks. Step (ii) consists in generating a training set $S\in\mathcal{X}^{R\times n_{\mathrm{spl}}}$ exactly as depicted in Figure 1: each unfamiliar sequence of concepts $\bar c_r$ generates $n_{\mathrm{unf}}$ sentences, whereas each familiar sequence of concepts $c_r$ generates $n_{\mathrm{fam}}$ sentences (recall that the number of samples per category is $n_{\mathrm{spl}} = n_{\mathrm{unf}}+n_{\mathrm{fam}}$).
Finally, step (iii) consists in randomly generating an unfamiliar test sentence $x_{\mathrm{test}}\in\mathcal{X}$. Without loss of generality we assume that this test sentence is generated by the unfamiliar sequence of concepts $\bar c_1$. We denote by $p_{\mathrm{DM}}$ the p.d.f. of the sampling process DM. This function is defined on the sample space $\Omega_{\mathrm{DM}} := \Phi\times\mathcal{Z}^{2R}\times\mathcal{X}^{R\times n_{\mathrm{spl}}}\times\mathcal{X}$. A sample from $\Omega_{\mathrm{DM}}$ takes the form
$$\omega = (\underbrace{\varphi;\ \bar c_1,\ldots,\bar c_R;\ c_1,\ldots,c_R}_{\text{the task}};\ \underbrace{x_{1,1},\ldots,x_{1,n_{\mathrm{spl}}};\ \ldots;\ x_{R,1},\ldots,x_{R,n_{\mathrm{spl}}}}_{\text{the training sentences}};\ \underbrace{x_{\mathrm{test}}}_{\text{the test sentence}})$$
and we have the following formula for $p_{\mathrm{DM}}$:
$$p_{\mathrm{DM}}(\omega) := \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}} \prod_{r=1}^{R}\left[\frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|} \prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\right] \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|} \qquad (22)$$
where $\mathbf{1}_{\hat\varphi^{-1}(c_r)}$ and $\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}$ denote the indicator functions of the sets $\hat\varphi^{-1}(c_r)$ and $\hat\varphi^{-1}(\bar c_r)$, respectively. Let us compute a few marginals of $p_{\mathrm{DM}}$ in order to verify that it is indeed the p.d.f. of the sampling process DM. Writing $\omega = (T, S, x_{\mathrm{test}})$, summing over the variables $S$ and $x_{\mathrm{test}}$, and using the fact that $\sum_{x\in\mathcal{X}} \mathbf{1}_{\hat\varphi^{-1}(c)}(x) = |\hat\varphi^{-1}(c)|$, we obtain
$$\sum_{S}\sum_{x_{\mathrm{test}}\in\mathcal{X}} p_{\mathrm{DM}}(T,S,x_{\mathrm{test}}) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}}.$$
This shows that each task $T$ is equiprobable. Summing over the variable $S$ alone gives
$$\sum_{S} p_{\mathrm{DM}}(T,S,x_{\mathrm{test}}) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}}\, \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}.$$
This shows that, given a task $T$, the test sentence $x_{\mathrm{test}}$ is obtained by sampling uniformly at random from $\hat\varphi^{-1}(\bar c_1)$. A similar calculation shows that, given a task $T$, the training sentence $x_{r,s}$ is obtained by sampling uniformly at random from $\hat\varphi^{-1}(c_r)$ if $s\ge 2$, and from $\hat\varphi^{-1}(\bar c_r)$ if $s=1$. Given a feature space $\mathcal{F}$ and a feature map $\psi:\mathcal{X}\to\mathcal{F}$, we define the event $E_{\mathcal{F},\psi}\subset\Omega_{\mathrm{DM}}$ as follows:
$$E_{\mathcal{F},\psi} = \big\{\omega\in\Omega_{\mathrm{DM}} : \text{there exists } 1\le s^*\le n_{\mathrm{spl}} \text{ such that } \langle\psi(x_{\mathrm{test}}),\psi(x_{1,s^*})\rangle_{\mathcal{F}} > \langle\psi(x_{\mathrm{test}}),\psi(x_{r,s})\rangle_{\mathcal{F}} \text{ for all } 2\le r\le R \text{ and all } 1\le s\le n_{\mathrm{spl}}\big\}.$$
(23)
Note that this event consists of all the outcomes $\omega=(T,S,x_{\mathrm{test}})$ for which the feature map $\psi$ associates the test sentence $x_{\mathrm{test}}$ to a training point $x_{r,s}$ from the first category. Since, by construction, $x_{\mathrm{test}}$ belongs to the first category, $E_{\mathcal{F},\psi}$ consists of all the outcomes for which the nearest neighbor classification rule is 'successful'. As a consequence, when $\mathcal{T}=\Phi\times\mathcal{Z}^{2R}$, the generalization error can be expressed as
$$\mathrm{err}(\mathcal{F},\psi,\mathcal{T}) = 1 - \mathbb{P}_{\mathrm{DM}}\big[E_{\mathcal{F},\psi}\big] \qquad (24)$$
where $\mathbb{P}_{\mathrm{DM}}$ denotes the probability measure on $\Omega_{\mathrm{DM}}$ induced by $p_{\mathrm{DM}}$. Equation (24) should be viewed as our 'fully formal' definition of the quantity $\mathrm{err}(\mathcal{F},\psi,\mathcal{T})$, as opposed to the more informal definition given earlier by equations (6) and (7). The goal of this section is to prove inequality (14), which, in light of (24), is equivalent to
$$\sup_{\mathcal{F},\psi}\ \mathbb{P}_{\mathrm{DM}}\big[E_{\mathcal{F},\psi}\big] \ \le\ \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_{2R-1}\big(K^\star(x,\cdot)\big) + \frac{1}{R}, \qquad (25)$$
which in turn is equivalent to
$$\sup_{\substack{K:\mathcal{X}\times\mathcal{X}\to\mathbb{R}\\ K \text{ pos. semi-def.}}} \mathbb{P}_{\mathrm{DM}}\big[E_K\big] \ \le\ \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_{2R-1}\big(K^\star(x,\cdot)\big) + \frac{1}{R} \qquad (26)$$
where the event $E_K$ is defined by
$$E_K = \big\{\omega\in\Omega_{\mathrm{DM}} : \text{there exists } 1\le s^*\le n_{\mathrm{spl}} \text{ such that } K(x_{\mathrm{test}},x_{1,s^*}) > K(x_{\mathrm{test}},x_{r,s}) \text{ for all } 2\le r\le R \text{ and all } 1\le s\le n_{\mathrm{spl}}\big\} \qquad (27)$$
and where the supremum is taken over all kernels $K:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ which are symmetric positive semi-definite. We will actually prove a slightly stronger result, namely
$$\sup_{\substack{K:\mathcal{X}\times\mathcal{X}\to\mathbb{R}\\ K \text{ symmetric}}} \mathbb{P}_{\mathrm{DM}}\big[E_K\big] \ \le\ \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_{2R-1}\big(K^\star(x,\cdot)\big) + \frac{1}{R} \qquad (28)$$
where the supremum is taken over all functions $K:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ that satisfy $K(x,y)=K(y,x)$ for all $(x,y)\in\mathcal{X}\times\mathcal{X}$. The rest of the section is devoted to proving (28). In Subsection B.1 we start by considering a simpler data model; for this simpler data model we are able to show that the function $\psi^\star$ implicitly defined by (12) is the best possible feature map (we actually only work with the associated kernel $K^\star$, and never need $\psi^\star$ itself). We also show that the success rate is exactly equal to the permuted moment of $K^\star$; see Theorem 4, which is the central result of this section.
In the remaining subsections, namely Subsection B.2 and Subsection B.3, we leverage the bound obtained for the simpler data model in order to obtain bound (28) for the main data model. These two subsections are mostly notational. The core of the analysis takes place in Subsection B.1.
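Before turning to the proofs, the sampling process DM above can be sketched in code. The toy sizes below are assumptions chosen for illustration only; the sketch checks that every generated sentence is consistent with the concept sequence that produced it.

```python
import random

# Toy instantiation of Sampling Process DM (assumed small sizes for illustration).
n_w, n_c, L, R, n_spl, n_unf = 6, 3, 4, 2, 3, 1
s_c = n_w // n_c
rng = random.Random(0)

def sample_equipartition():
    """Step (i): a uniformly random equipartition of the vocabulary."""
    words = list(range(n_w))
    rng.shuffle(words)
    phi = [0] * n_w
    for c in range(n_c):
        for w in words[c * s_c:(c + 1) * s_c]:
            phi[w] = c
    return phi

def sample_sentence(phi, concept_seq):
    """Uniform sample from the preimage of a concept sequence:
    pick, at each position, a uniform word with the prescribed concept."""
    buckets = {c: [w for w in range(n_w) if phi[w] == c] for c in range(n_c)}
    return [rng.choice(buckets[c]) for c in concept_seq]

phi = sample_equipartition()
unfamiliar = [[rng.randrange(n_c) for _ in range(L)] for _ in range(R)]
familiar = [[rng.randrange(n_c) for _ in range(L)] for _ in range(R)]
# Step (ii): n_unf sentences per unfamiliar sequence, n_fam per familiar one.
train = [[sample_sentence(phi, unfamiliar[r]) for _ in range(n_unf)]
         + [sample_sentence(phi, familiar[r]) for _ in range(n_spl - n_unf)]
         for r in range(R)]
# Step (iii): the test sentence is generated by the first unfamiliar sequence.
x_test = sample_sentence(phi, unfamiliar[0])

assert all(phi[w] == c for w, c in zip(x_test, unfamiliar[0]))
```

Each category thus contains $n_{\mathrm{spl}}$ sentences, and the test point shares its (unfamiliar) concept sequence with exactly $n_{\mathrm{unf}}$ of the training points in the first category.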

B.1 A SIMPLER DATA MODEL

We start by presenting the sampling process associated with our simpler data model.

Sampling Process SDM:

(i) Sample $\varphi$ uniformly at random in $\Phi$. Sample $c_1, c_2, \ldots, c_{t+1}$ uniformly at random in $\mathcal{Z}$.
(ii) For $1\le r\le t+1$: sample $x_r$ uniformly at random in $\hat\varphi^{-1}(c_r)$.
(iii) Sample $x_{\mathrm{test}}$ uniformly at random in $\hat\varphi^{-1}(c_1)$.

The function
$$p_{\mathrm{SDM}}(\varphi;\ c_1,\ldots,c_{t+1};\ x_1,\ldots,x_{t+1};\ x_{\mathrm{test}}) := \frac{1}{|\Phi|\,|\mathcal{Z}|^{t+1}} \prod_{r=1}^{t+1} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_r)}{|\hat\varphi^{-1}(c_r)|}\ \frac{\mathbf{1}_{\hat\varphi^{-1}(c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(c_1)|} \qquad (29)$$
on $\Omega_{\mathrm{SDM}} := \Phi\times\mathcal{Z}^{t+1}\times\mathcal{X}^{t+2}$ is the p.d.f. of the above sampling process. We use $\mathbb{P}_{\mathrm{SDM}}$ to denote the probability measure on $\Omega_{\mathrm{SDM}}$ induced by this function. The identity in the next theorem is the central result of this section.

Theorem 4. Let $\mathcal{K}$ denote the set of all symmetric functions from $\mathcal{X}\times\mathcal{X}$ to $\mathbb{R}$. Then
$$\sup_{K\in\mathcal{K}}\ \mathbb{P}_{\mathrm{SDM}}\big[K(x_{\mathrm{test}},x_1) > K(x_{\mathrm{test}},x_r) \text{ for all } 2\le r\le t+1\big] = \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}} H_t(K^\star_x). \qquad (30)$$
In (30), $K^\star_x$ stands for the function $K^\star(x,\cdot)$. Theorem 4 establishes an intimate connection between the permuted moment and the ability of any fixed feature map (or, equivalently, any fixed kernel) to generalize well in our framework. The sampling process considered in this theorem involves two points, $x_{\mathrm{test}}$ and $x_1$, generated by the same sequence of concepts $c_1$, and $t$ 'distractor' points $x_2,\ldots,x_{t+1}$ generated by different sequences of concepts. Success for the kernel $K$ means correctly recognizing that $x_{\mathrm{test}}$ is more 'similar' to $x_1$ than to any of the distractors, and the success rate in (30) precisely quantifies its ability to do so as a function of the number $t$ of distractors. The theorem shows that the probability of success for the best possible kernel at this task is exactly equal to the averaged $t^{\mathrm{th}}$ permuted moment of $K^\star_x$, so it elegantly quantifies the generalization ability of the best possible fixed feature map in terms of the permuted moment. We also provide an explicit construction for a kernel $K(x,y)$ that achieves the supremum in (30). First, choose a kernel $\varepsilon(x,y)$ that satisfies
(i) $\varepsilon(x,y) \ne \varepsilon(x,z)$ for all $x,y,z\in\mathcal{X}$ with $y\ne z$;
(ii) $0\le\varepsilon(x,y)\le 1$ for all $x,y\in\mathcal{X}$;
and then define the following perturbation of $K^\star$:
$$K(x,y) = K^\star(x,y) + \varepsilon(x,y)\big/\big(2\,s_c^L\,|\Phi|\big). \qquad (31)$$
Any such kernel is a maximizer of the optimization problem in (30), so we may think of perturbations of $K^\star$ as bona fide optimizers. The rest of this subsection is devoted to the proof of Theorem 4; we also show, in the course of the proof, that (31) is a maximizer of the optimization problem in (30). We use $\mathcal{K}$ to denote the set of all symmetric functions from $\mathcal{X}\times\mathcal{X}$ to $\mathbb{R}$. We will refer to such functions as 'kernels' despite the fact that they are not necessarily positive semi-definite. Proving Theorem 4 requires that we study the following optimization problem:
$$\text{Maximize } \mathcal{E}(K) := \mathbb{P}_{\mathrm{SDM}}\big[K(x_{\mathrm{test}},x_1) > K(x_{\mathrm{test}},x_r) \text{ for all } 2\le r\le t+1\big] \qquad (32)$$
$$\text{over all kernels } K\in\mathcal{K}. \qquad (33)$$
We recall the definition of the optimal kernel,
$$K^\star(x,y) = \frac{1}{s_c^L}\,\frac{|\{\varphi\in\Phi : \varphi(x_\ell)=\varphi(y_\ell) \text{ for all } 1\le\ell\le L\}|}{|\Phi|}, \qquad (35)$$
where $s_c = n_w/n_c$ denotes the size of a concept. We start with the following simple lemma.

Lemma 4. The function $K^\star_x(\cdot) = K^\star(x,\cdot)$ is a probability distribution on $\mathcal{X}$.

Proof. First note that $K^\star$ can be written as
$$K^\star(x,y) = \frac{1}{s_c^L}\,\frac{|\{\varphi\in\Phi : \hat\varphi(x)=\hat\varphi(y)\}|}{|\Phi|} = \frac{1}{s_c^L\,|\Phi|} \sum_{\varphi\in\Phi} \mathbf{1}_{\{\hat\varphi(x)=\hat\varphi(y)\}}.$$
Since $\varphi$ maps exactly $s_c$ words to each concept $c\in\{1,\ldots,n_c\}$, we have that
$$|\{x\in\mathcal{X} : \hat\varphi(x) = c\}| = s_c^L \quad \text{for all } c\in\mathcal{Z}. \qquad (36)$$
Therefore
$$\sum_{y\in\mathcal{X}} K^\star(x,y) = \frac{1}{s_c^L\,|\Phi|} \sum_{\varphi\in\Phi}\sum_{y\in\mathcal{X}} \mathbf{1}_{\{\hat\varphi(x)=\hat\varphi(y)\}} = \frac{1}{s_c^L\,|\Phi|} \sum_{\varphi\in\Phi} |\{y\in\mathcal{X} : \hat\varphi(y)=\hat\varphi(x)\}| = 1.$$
We now show that the marginal of the p.d.f. $p_{\mathrm{SDM}}$ is related to $K^\star$.

Lemma 5. For all $x_1,\ldots,x_{t+1}$ and $x_{\mathrm{test}}$ in $\mathcal{X}$ we have
$$\sum_{\varphi\in\Phi}\sum_{c_1\in\mathcal{Z}}\cdots\sum_{c_{t+1}\in\mathcal{Z}} p_{\mathrm{SDM}}(\varphi;\ c_1,\ldots,c_{t+1};\ x_1,\ldots,x_{t+1};\ x_{\mathrm{test}}) = \frac{1}{|\mathcal{X}|^{t+1}}\, K^\star(x_1, x_{\mathrm{test}}).$$
Proof. Identity (36) can be expressed as $|\hat\varphi^{-1}(c)| = s_c^L$ for all $c\in\mathcal{Z}$.
As a consequence, definition (29) of $p_{\mathrm{SDM}}(\omega)$ simplifies to
$$p_{\mathrm{SDM}}(\omega) = \alpha \prod_{r=1}^{t+1} \mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_r)\ \mathbf{1}_{\hat\varphi^{-1}(c_1)}(x_{\mathrm{test}}) \qquad (37)$$
where the constant $\alpha$ is given by
$$\alpha = \frac{1}{|\Phi|\,|\mathcal{Z}|^{t+1}\,s_c^{L(t+2)}} = \frac{1}{|\Phi|\,n_c^{L(t+1)}\,s_c^{L(t+2)}} = \frac{1}{|\Phi|\,|\mathcal{X}|^{t+1}\,s_c^{L}}.$$
In the above we have used the facts that $|\mathcal{Z}| = n_c^L$ and $|\mathcal{X}| = n_w^L$. We then note that the identity $\mathbf{1}_{\hat\varphi^{-1}(c)}(x) = \mathbf{1}_{\{\hat\varphi(x)=c\}}$ implies
$$\sum_{c\in\mathcal{Z}} \mathbf{1}_{\hat\varphi^{-1}(c)}(x) = \sum_{c\in\mathcal{Z}} \mathbf{1}_{\{\hat\varphi(x)=c\}} = 1 \qquad (38)$$
$$\sum_{c\in\mathcal{Z}} \mathbf{1}_{\hat\varphi^{-1}(c)}(x)\,\mathbf{1}_{\hat\varphi^{-1}(c)}(y) = \sum_{c\in\mathcal{Z}} \mathbf{1}_{\{\hat\varphi(x)=c\}}\,\mathbf{1}_{\{\hat\varphi(y)=c\}} = \mathbf{1}_{\{\hat\varphi(x)=\hat\varphi(y)\}} \qquad (39)$$
for all $x,y\in\mathcal{X}$. Summing (37) over the variables $c_1,\ldots,c_{t+1}$ we obtain
$$\sum_{c_1\in\mathcal{Z}}\cdots\sum_{c_{t+1}\in\mathcal{Z}} p_{\mathrm{SDM}}(\omega) = \alpha \sum_{c_1\in\mathcal{Z}} \mathbf{1}_{\hat\varphi^{-1}(c_1)}(x_1)\,\mathbf{1}_{\hat\varphi^{-1}(c_1)}(x_{\mathrm{test}}) \sum_{c_2\in\mathcal{Z}}\cdots\sum_{c_{t+1}\in\mathcal{Z}}\prod_{r=2}^{t+1}\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_r) = \alpha\,\mathbf{1}_{\{\hat\varphi(x_1)=\hat\varphi(x_{\mathrm{test}})\}},$$
where we have used (38) and (39) to obtain the last equality. Summing the above over the variable $\varphi$ gives $K^\star(x_1,x_{\mathrm{test}})/|\mathcal{X}|^{t+1}$.

The next lemma provides a purely algebraic (as opposed to probabilistic) formulation of the functional $\mathcal{E}(K)$ defined in (32).

Lemma 6. The functional $\mathcal{E}:\mathcal{K}\to\mathbb{R}$ can be expressed as
$$\mathcal{E}(K) = \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{X}} K^\star(x,y)\left(\frac{|\{z\in\mathcal{X} : K(x,z) < K(x,y)\}|}{|\mathcal{X}|}\right)^{t}. \qquad (40)$$
Proof. Let $g:\mathcal{X}^{t+2}\times\mathcal{K}\to\{0,1\}$ be the indicator function defined by
$$g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K) = \begin{cases} 1 & \text{if } K(x_{\mathrm{test}},x_1) > K(x_{\mathrm{test}},x_r) \text{ for all } 2\le r\le t+1, \\ 0 & \text{otherwise.} \end{cases}$$
Let $\omega$ denote the sample $(\varphi;\ c_1,\ldots,c_{t+1};\ x_1,\ldots,x_{t+1};\ x_{\mathrm{test}})$. Since $g$ depends only on the last $t+2$ variables of $\omega$, we have
$$\mathcal{E}(K) = \mathbb{P}_{\mathrm{SDM}}\big[K(x_{\mathrm{test}},x_1) > K(x_{\mathrm{test}},x_r) \text{ for all } 2\le r\le t+1\big] \qquad (41)$$
$$= \sum_{\varphi\in\Phi}\sum_{c_1,\ldots,c_{t+1}\in\mathcal{Z}}\ \sum_{x_1,\ldots,x_{t+1}\in\mathcal{X}}\ \sum_{x_{\mathrm{test}}\in\mathcal{X}} g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K)\, p_{\mathrm{SDM}}(\omega) \qquad (42)$$
$$= \sum_{x_1,\ldots,x_{t+1}\in\mathcal{X}}\ \sum_{x_{\mathrm{test}}\in\mathcal{X}} g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K)\left(\sum_{\varphi\in\Phi}\sum_{c_1,\ldots,c_{t+1}\in\mathcal{Z}} p_{\mathrm{SDM}}(\omega)\right) \qquad (43)$$
$$= \sum_{x_1,\ldots,x_{t+1}\in\mathcal{X}}\ \sum_{x_{\mathrm{test}}\in\mathcal{X}} g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K)\,\frac{K^\star(x_1,x_{\mathrm{test}})}{|\mathcal{X}|^{t+1}} \qquad (44)$$
$$= \frac{1}{|\mathcal{X}|}\sum_{x_1\in\mathcal{X}}\sum_{x_{\mathrm{test}}\in\mathcal{X}} K^\star(x_1,x_{\mathrm{test}})\left(\frac{1}{|\mathcal{X}|^{t}}\sum_{x_2,\ldots,x_{t+1}\in\mathcal{X}} g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K)\right) \qquad (45)$$
where we have used Lemma 5 to go from (43) to (44). Writing the indicator function $g$ as a product of indicator functions,
$$g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K) = \prod_{r=2}^{t+1} \mathbf{1}_{\{K(x_{\mathrm{test}},x_1) > K(x_{\mathrm{test}},x_r)\}},$$
we obtain the following expression for the term appearing between parentheses in (45):
$$\frac{1}{|\mathcal{X}|^{t}}\sum_{x_2,\ldots,x_{t+1}\in\mathcal{X}} g(x_1,\ldots,x_{t+1},x_{\mathrm{test}},K) = \frac{1}{|\mathcal{X}|^{t}}\prod_{r=2}^{t+1}\sum_{x_r\in\mathcal{X}} \mathbf{1}_{\{K(x_{\mathrm{test}},x_1)>K(x_{\mathrm{test}},x_r)\}} = \left(\frac{|\{z\in\mathcal{X} : K(x_{\mathrm{test}},x_1) > K(x_{\mathrm{test}},z)\}|}{|\mathcal{X}|}\right)^{t}.$$
Renaming the variables $x_{\mathrm{test}}, x_1$ to $x, y$ gives (40).

We now use expression (40) for $\mathcal{E}(K)$ and reformulate the optimization problem (32)-(33) into an equivalent optimization problem over symmetric matrices. Putting an arbitrary ordering on the set $\mathcal{X}$ (starting with $i=0$) and denoting by $K_{ij}$ the value of the kernel $K$ on the pair consisting of the $i^{\mathrm{th}}$ and $j^{\mathrm{th}}$ elements of $\mathcal{X}$, we see that the optimization problem (32)-(33) can be written as
$$\text{Maximize } \mathcal{E}(K) := \frac{1}{N}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} K^\star_{ij}\left(\frac{|\{j'\in[N] : K_{ij'} < K_{ij}\}|}{N}\right)^{t} \qquad (46)$$
$$\text{over all symmetric matrices } K\in\mathbb{R}^{N\times N}. \qquad (47)$$
In the above we have used the letter $N$ to denote the cardinality of $\mathcal{X}$, that is $N = n_w^L$, and we have used the notation $[N] = \{0,1,\ldots,N-1\}$. Before solving the matrix optimization problem (46)-(47), we start with a simpler vector optimization problem. Let $p$ be a probability vector, that is $p\in\mathbb{R}^N_+$ with $\sum_{i=0}^{N-1} p_i = 1$, and consider the optimization problem:
$$\text{Maximize } e(v) := \sum_{j=0}^{N-1} p_j\left(\frac{|\{j'\in[N] : v_{j'} < v_j\}|}{N}\right)^{t} \qquad (48)$$
$$\text{over all vectors } v\in\mathbb{R}^{N}. \qquad (49)$$
Recall from Definition 1 that an ordering permutation of a vector $v$ is a permutation that sorts its entries from smallest to largest. We will say that two vectors $v,w\in\mathbb{R}^N$ have the same ordering if there exists $\sigma\in S_N$ which is ordering for both $v$ and $w$. The following lemma is key; it shows that the optimization problem (48)-(49) has a simple solution.

Lemma 7. The identity $\sup_{v\in\mathbb{R}^N} e(v) = H_t(p)$ holds.
Moreover, the supremum is achieved by any vector $v\in\mathbb{R}^N$ that has mutually distinct entries and the same ordering as $p$.

Proof. Let $\mathrm{Distinct}(\mathbb{R}^N)$ denote the set of vectors of $\mathbb{R}^N$ with mutually distinct entries. We first show that
$$\sup_{v\in\mathbb{R}^N} e(v) = \sup_{v\in\mathrm{Distinct}(\mathbb{R}^N)} e(v). \qquad (50)$$
To do this we show that for any $v\in\mathbb{R}^N$ there exists $w\in\mathrm{Distinct}(\mathbb{R}^N)$ such that
$$|\{j'\in[N] : v_{j'} < v_j\}| \le |\{j'\in[N] : w_{j'} < w_j\}| \quad \text{for all } 0\le j\le N-1. \qquad (51)$$
There are many ways to construct such a $w$. One way is to simply set $w_j = \sigma^{-1}(j)$ for some permutation $\sigma$ that orders $v$. Indeed, note that $\sigma^{-1}(j)$ gives the position of $v_j$ in the sequence of inequalities (18). Therefore if $v_{j'} < v_j$ we must have $\sigma^{-1}(j') < \sigma^{-1}(j)$. This implies
$$\{j'\in[N] : v_{j'} < v_j\} \subset \{j'\in[N] : \sigma^{-1}(j') < \sigma^{-1}(j)\} \quad \text{for all } j\in[N],$$
which in turn implies (51). Because of (50) we can now restrict our attention to $v\in\mathrm{Distinct}(\mathbb{R}^N)$. Note that if $v\in\mathrm{Distinct}(\mathbb{R}^N)$, then it has a unique ordering permutation $\sigma$,
$$v_{\sigma(0)} < v_{\sigma(1)} < v_{\sigma(2)} < \ldots < v_{\sigma(N-1)},$$
and, recalling that $\sigma^{-1}(j)$ gives the position of $v_j$ in the above ordering, we clearly have $|\{j'\in[N] : v_{j'} < v_j\}| = \sigma^{-1}(j)$. Therefore, if $v\in\mathrm{Distinct}(\mathbb{R}^N)$ and $\sigma$ denotes its unique ordering permutation, $e(v)$ can be expressed as
$$e(v) = \sum_{j=0}^{N-1} p_j\left(\frac{|\{j'\in[N] : v_{j'} < v_j\}|}{N}\right)^{t} = \sum_{j=0}^{N-1} p_j\left(\frac{\sigma^{-1}(j)}{N}\right)^{t} = \sum_{j=0}^{N-1} p_{\sigma(j)}\,(j/N)^{t}. \qquad (52)$$
Looking at definition (11) of the permuted moment, it is then clear that $e(v) \le H_t(p)$ for all $v\in\mathrm{Distinct}(\mathbb{R}^N)$. We then note that if $v\in\mathrm{Distinct}(\mathbb{R}^N)$ has the same ordering as $p$, then its unique ordering permutation $\sigma$ must also be an ordering permutation of $p$. Then (52) combined with Lemma 1 implies that $e(v) = H_t(p)$. This concludes the proof.
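Lemma 7 is easy to test numerically: build a vector $v$ with distinct entries ordered like $p$, check that $e(v) = H_t(p)$, and check that random vectors never exceed this value. A minimal sketch (the sizes and the random seed are arbitrary assumptions):

```python
import random

def e(v, p, t):
    """Objective (48): e(v) = sum_j p_j * (#{j': v_j' < v_j} / N)^t."""
    N = len(v)
    return sum(p[j] * (sum(vj2 < v[j] for vj2 in v) / N) ** t for j in range(N))

def H(p, t):
    """Permuted moment via sorting (Lemma 1)."""
    N = len(p)
    return sum((i / N) ** t * pi for i, pi in enumerate(sorted(p)))

random.seed(1)
N, t = 6, 2
p = [random.random() for _ in range(N)]
s = sum(p)
p = [pi / s for pi in p]                  # probability vector (distinct entries a.s.)

# A maximizer per Lemma 7: distinct entries with the same ordering as p.
order = sorted(range(N), key=lambda j: p[j])   # ordering permutation of p
v = [0.0] * N
for rank, j in enumerate(order):
    v[j] = float(rank)

assert abs(e(v, p, t) - H(p, t)) < 1e-12
for _ in range(100):
    w = [random.random() for _ in range(N)]
    assert e(w, p, t) <= H(p, t) + 1e-12
```

The construction `v[j] = rank of j` is exactly the $w_j = \sigma^{-1}(j)$ device used in the proof of (50)-(51).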
Relaxing the symmetry constraint in the optimization problem (46)-(47) gives the following unconstrained problem over all $N$-by-$N$ matrices:
$$\text{Maximize } \mathcal{E}(K) := \frac{1}{N}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} K^\star_{ij}\left(\frac{|\{j'\in[N] : K_{ij'} < K_{ij}\}|}{N}\right)^{t} \qquad (53)$$
$$\text{over all matrices } K\in\mathbb{R}^{N\times N}. \qquad (54)$$
Let us denote by $K_{i,:}$ the $i^{\mathrm{th}}$ row of the matrix $K$, and remark that $K^\star_{i,:}$ is a probability vector (because $K^\star(x,\cdot)$ is a probability distribution on $\mathcal{X}$, see Lemma 4). We then note that the above unconstrained problem decouples into $N$ separate optimization problems of the type (48)-(49), in which the probability vector $p$ must be replaced by the probability vector $K^\star_{i,:}$. Using Lemma 7, we therefore have that any $K\in\mathbb{R}^{N\times N}$ that satisfies, for each $0\le i\le N-1$,
(a) the entries of $K_{i,:}$ are mutually distinct,
(b) $K_{i,:}$ and $K^\star_{i,:}$ have the same ordering,
must be a solution of (53)-(54). Lemma 7 also gives
$$\sup_{K\in\mathbb{R}^{N\times N}} \mathcal{E}(K) = \frac{1}{N}\sum_{i=0}^{N-1} H_t(K^\star_{i,:}).$$
Now let $\varepsilon\in\mathbb{R}^{N\times N}$ be a symmetric matrix that satisfies
(i) $\varepsilon_{ij} \ne \varepsilon_{ij'}$ for all $i,j,j'\in[N]$ with $j\ne j'$,
(ii) $0\le\varepsilon_{ij}\le 1$ for all $i,j\in[N]$,
and define the following perturbation of the matrix $K^\star$:
$$K = K^\star + \frac{0.5}{s_c^L\,|\Phi|}\,\varepsilon. \qquad (55)$$
Recalling definition (35) of the kernel $K^\star$, it is clear that for each $i,j\in[N]$ we have $K^\star_{ij} = \ell/(s_c^L\,|\Phi|)$ for some integer $\ell$. As a consequence, perturbing $K^\star$ by adding to its entries quantities smaller than $1/(s_c^L\,|\Phi|)$ cannot change the ordering of its rows. Therefore the kernel $K$ defined by (55) satisfies (b). It also satisfies (a). Indeed, if $K^\star_{ij} = K^\star_{ij'}$ and $j\ne j'$, then we clearly have $K_{ij}\ne K_{ij'}$ due to (i). On the other hand, if $K^\star_{ij}\ne K^\star_{ij'}$, then $K_{ij}\ne K_{ij'}$ due to (ii) and (56). We have therefore constructed a symmetric matrix that is a solution of the optimization problem (53)-(54). As a consequence we have
$$\sup_{K\in\mathcal{K}} \mathcal{E}(K) = \sup_{K\in\mathbb{R}^{N\times N}} \mathcal{E}(K) = \frac{1}{N}\sum_{i=0}^{N-1} H_t(K^\star_{i,:}),$$
where $\mathcal{K}$ should now be interpreted as the set of $N$-by-$N$ symmetric matrices.
The above equality proves Theorem 4, and we have also shown that the perturbed kernel (55) achieves the supremum.
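As a sanity check of Theorem 4 and of the optimality of the perturbed kernel (55), one can, on a toy instance (the sizes $n_w=4$, $n_c=2$, $L=2$ are assumptions for illustration), compare the algebraic success rate (46) of a perturbed kernel with the averaged permuted moment of the rows of $K^\star$. Exact rational arithmetic turns the comparison into an equality test:

```python
from itertools import product
from fractions import Fraction

n_w, n_c, L, t = 4, 2, 2, 3        # toy sizes (assumption)
s_c = n_w // n_c
Phi = [phi for phi in product(range(n_c), repeat=n_w)
       if all(phi.count(c) == s_c for c in range(n_c))]
X = list(product(range(n_w), repeat=L))
N = len(X)

def lift(phi, x):                   # phi applied element-wise to a sentence
    return tuple(phi[w] for w in x)

Kstar = [[Fraction(sum(lift(p, x) == lift(p, y) for p in Phi),
                   s_c ** L * len(Phi)) for y in X] for x in X]

# Tie-breaking perturbation (55): eps symmetric, entries in [0,1),
# mutually distinct within each row (eps_ij = i + j + i*j is increasing in j).
eps = [[Fraction(i + j + i * j, N * N) for j in range(N)] for i in range(N)]
K = [[Kstar[i][j] + eps[i][j] / (2 * s_c ** L * len(Phi)) for j in range(N)]
     for i in range(N)]

def E(K):
    """Algebraic success rate (46) of the kernel matrix K."""
    total = Fraction(0)
    for i in range(N):
        for j in range(N):
            cnt = sum(K[i][jp] < K[i][j] for jp in range(N))
            total += Kstar[i][j] * Fraction(cnt, N) ** t
    return total / N

def H(row):
    """Permuted moment of a row, via sorting (Lemma 1)."""
    return sum(Fraction(i, N) ** t * v for i, v in enumerate(sorted(row)))

# Theorem 4: the perturbed kernel attains (1/N) * sum_i H_t(K*_{i,:}).
assert E(K) == sum(H(Kstar[i]) for i in range(N)) / N
```

The chosen $\varepsilon$ satisfies conditions (i) and (ii): it is symmetric, its entries lie in $[0,1)$, and within each row they are strictly increasing in $j$, so the perturbation breaks all ties without disturbing the strict orderings of $K^\star$.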

B.2 CONNECTION BETWEEN THE TWO SAMPLING PROCESSES

In this subsection we show that the p.d.f. of Sampling Process SDM can be obtained by marginalizing the p.d.f. of Sampling Process DM over a subset of the variables. We also compute another marginal of $p_{\mathrm{DM}}$ that will prove useful in the next subsection. Recall that
$$p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}} \prod_{r=1}^{R}\left[\frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|} \prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\right] \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}$$
on $\Omega_{\mathrm{DM}} := \Phi\times\mathcal{Z}^{2R}\times\mathcal{X}^{R\,n_{\mathrm{spl}}+1}$ is the p.d.f. of the sampling process for our main data model. Samples from $\Omega_{\mathrm{DM}}$ take the form
$$\omega = (\varphi;\ \bar c_1,\ldots,\bar c_R;\ c_1,\ldots,c_R;\ x_{1,1},\ldots,x_{1,n_{\mathrm{spl}}};\ \ldots;\ x_{R,1},\ldots,x_{R,n_{\mathrm{spl}}};\ x_{\mathrm{test}}).$$
We separate these variables into two groups, $\omega = (\omega_a, \omega_b)$, where
$$\omega_a = (\varphi;\ \bar c_1,\ldots,\bar c_R;\ c_1,\ldots,c_R;\ x_{1,1}, x_{1,2};\ \ldots;\ x_{R,1}, x_{R,2};\ x_{\mathrm{test}}) \qquad (58)$$
$$\omega_b = (x_{1,3}, x_{1,4},\ldots,x_{1,n_{\mathrm{spl}}};\ \ldots;\ x_{R,3}, x_{R,4},\ldots,x_{R,n_{\mathrm{spl}}}).$$
The variable $\omega_a$ belongs to $\Omega_a = \Phi\times\mathcal{Z}^{2R}\times\mathcal{X}^{2R+1}$, and the variable $\omega_b$ belongs to $\Omega_b = \mathcal{X}^{R(n_{\mathrm{spl}}-2)}$. Note that the variables in $\omega_a$ contain, among others, $2R$ sequences of concepts and $2R$ training points (the first and second training point of each category). Each of these $2R$ training points is generated by one of the $2R$ sequences of concepts, so the variables involved in $\omega_a$ are generated by a process similar to the one involved in the simpler data model. The following lemma shows that the p.d.f. of $\omega_a$, after marginalizing out $\omega_b$, is indeed $p_{\mathrm{SDM}}$.

Lemma 8. For all $\omega_a\in\Omega_a$ we have
$$\sum_{\omega_b\in\Omega_b} p_{\mathrm{DM}}(\omega_a,\omega_b) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}} \prod_{r=1}^{R} \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|}\,\frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,2})}{|\hat\varphi^{-1}(c_r)|}\ \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}.$$
Recalling the definition (29) of $p_{\mathrm{SDM}}$, and letting $t+1 = 2R$, we see that the above lemma states that $\sum_{\omega_b\in\Omega_b} p_{\mathrm{DM}}(\omega_a,\omega_b) = p_{\mathrm{SDM}}(\omega_a)$ and $\Omega_a = \Omega_{\mathrm{SDM}}$.

Proof of Lemma 8.
We start by reorganizing the terms in the product defining $p_{\mathrm{DM}}$ so that the variables in $\omega_a$ and $\omega_b$ are clearly separated:
$$p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}} \prod_{r=1}^{R}\left[\frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|}\,\frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,2})}{|\hat\varphi^{-1}(c_r)|}\right] \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}\ \prod_{r=1}^{R}\prod_{s=3}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}.$$
To demonstrate the process, let us start by summing the above formula over the first variable of $\omega_b$, namely $x_{1,3}$. Since this variable occurs only in the last product, we have:
$$\sum_{x_{1,3}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}} \prod_{r=1}^{R}\left[\frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|}\,\frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,2})}{|\hat\varphi^{-1}(c_r)|}\right] \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}\left[\prod_{\substack{1\le r\le R,\ 3\le s\le n_{\mathrm{spl}}\\ (r,s)\ne(1,3)}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\right]\left[\sum_{x_{1,3}\in\mathcal{X}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_1)}(x_{1,3})}{|\hat\varphi^{-1}(c_1)|}\right].$$
Since $\sum_{x\in\mathcal{X}} \mathbf{1}_{\hat\varphi^{-1}(c_1)}(x) = |\hat\varphi^{-1}(c_1)|$, the last factor in the above product equals $1$ and can therefore be omitted. Repeating this process for all the $x_{r,s}$ that constitute $\omega_b$ leads to the desired result.

In the next subsection we will need the marginal of $p_{\mathrm{DM}}$ with respect to another set of variables. To this aim we write $\omega = (\omega_c, \omega_d)$ where
$$\omega_c = (\varphi;\ x_{1,2}, x_{1,3},\ldots,x_{1,n_{\mathrm{spl}}};\ \ldots;\ x_{R,2}, x_{R,3},\ldots,x_{R,n_{\mathrm{spl}}};\ x_{\mathrm{test}}) \qquad (61)$$
$$\omega_d = (\bar c_1,\ldots,\bar c_R;\ c_1,\ldots,c_R;\ x_{1,1};\ \ldots;\ x_{R,1}). \qquad (62)$$
Note that all the unfamiliar training points are contained in $\omega_d$; the test point and the familiar training points are in $\omega_c$. We also let $\Omega_c = \Phi\times\mathcal{X}^{R(n_{\mathrm{spl}}-1)+1}$ and $\Omega_d = \mathcal{Z}^{2R}\times\mathcal{X}^{R}$.

Lemma 9. For all $\omega_c\in\Omega_c$ we have
$$\sum_{\omega_d\in\Omega_d} p_{\mathrm{DM}}(\omega_c,\omega_d) = \frac{1}{|\Phi|\,|\mathcal{X}|^{R+1}\,s_c^{LR(n_{\mathrm{spl}}-2)}} \prod_{r=1}^{R} \mathbf{1}_{\{\hat\varphi(x_{r,2})=\hat\varphi(x_{r,3})=\ldots=\hat\varphi(x_{r,n_{\mathrm{spl}}})\}}.$$
Proof.
We reorganize the terms in the product defining $p_{\mathrm{DM}}$ so that the variables in $\omega_c$ and $\omega_d$ are separated:
$$p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}}\ \prod_{r=1}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\ \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}\ \prod_{r=1}^{R} \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|}.$$
Summing the above formula over the last variable of $\omega_d$, namely $x_{R,1}$, gives
$$\sum_{x_{R,1}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}}\ \prod_{r=1}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\ \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}\ \prod_{r=1}^{R-1} \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_r)}(x_{r,1})}{|\hat\varphi^{-1}(\bar c_r)|}\ \sum_{x_{R,1}\in\mathcal{X}} \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_R)}(x_{R,1})}{|\hat\varphi^{-1}(\bar c_R)|}.$$
The last factor in the above product equals $1$ and can therefore be omitted. Iterating this process gives
$$\sum_{x_{1,1}\in\mathcal{X}}\cdots\sum_{x_{R,1}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}}\ \prod_{r=1}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\ \frac{\mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}})}{|\hat\varphi^{-1}(\bar c_1)|}.$$
We then use the fact that $\sum_{\bar c_1\in\mathcal{Z}} \mathbf{1}_{\hat\varphi^{-1}(\bar c_1)}(x_{\mathrm{test}}) = 1$, see (38), together with $|\hat\varphi^{-1}(\bar c_1)| = s_c^L$, see (36), to obtain
$$\sum_{\bar c_1\in\mathcal{Z}}\ \sum_{x_{1,1}\in\mathcal{X}}\cdots\sum_{x_{R,1}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{2R}\,s_c^{L}}\ \prod_{r=1}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}.$$
We then sum over $\bar c_2,\ldots,\bar c_R$. Since these variables are not involved in the above formula, this simply multiplies it by $|\mathcal{Z}|^{R-1}$:
$$\sum_{\bar c_1,\ldots,\bar c_R\in\mathcal{Z}}\ \sum_{x_{1,1}\in\mathcal{X}}\cdots\sum_{x_{R,1}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{R+1}\,s_c^{L}}\ \prod_{r=1}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|} = \frac{1}{|\Phi|\,|\mathcal{Z}|^{R}\,|\mathcal{X}|}\ \prod_{r=1}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|},$$
where we have used $|\mathcal{Z}|\,s_c^L = n_c^L\,s_c^L = |\mathcal{X}|$ to obtain the last equality. Summing over $c_1$ gives
$$\sum_{c_1\in\mathcal{Z}}\ \sum_{\bar c_1,\ldots,\bar c_R\in\mathcal{Z}}\ \sum_{x_{1,1}\in\mathcal{X}}\cdots\sum_{x_{R,1}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{R}\,|\mathcal{X}|}\ \prod_{r=2}^{R}\prod_{s=2}^{n_{\mathrm{spl}}} \frac{\mathbf{1}_{\hat\varphi^{-1}(c_r)}(x_{r,s})}{|\hat\varphi^{-1}(c_r)|}\ \frac{\mathbf{1}_{\{\hat\varphi(x_{1,2})=\hat\varphi(x_{1,3})=\ldots=\hat\varphi(x_{1,n_{\mathrm{spl}}})\}}}{s_c^{L(n_{\mathrm{spl}}-1)}}.$$
To obtain the last equality we have used (39), but for a product of $n_{\mathrm{spl}}-1$ indicator functions instead of just two (together with $|\hat\varphi^{-1}(c_1)| = s_c^L$).
Iterating this process we obtain
$$\sum_{c_1,\ldots,c_R\in\mathcal{Z}}\ \sum_{\bar c_1,\ldots,\bar c_R\in\mathcal{Z}}\ \sum_{x_{1,1}\in\mathcal{X}}\cdots\sum_{x_{R,1}\in\mathcal{X}} p_{\mathrm{DM}}(\omega) = \frac{1}{|\Phi|\,|\mathcal{Z}|^{R}\,|\mathcal{X}|}\ \prod_{r=1}^{R} \frac{\mathbf{1}_{\{\hat\varphi(x_{r,2})=\hat\varphi(x_{r,3})=\ldots=\hat\varphi(x_{r,n_{\mathrm{spl}}})\}}}{s_c^{L(n_{\mathrm{spl}}-1)}}.$$
Using once more that $|\hat\varphi^{-1}(c_r)| = s_c^L$ and $|\mathcal{Z}|\,s_c^L = |\mathcal{X}|$ gives the desired result.
9 u O U z m n p L K r U e e X O 1 Y 8 C S F C L p k x Q 8 e O 0 U u Z R s E l Z G 0 3 M R A z P m M T G O Z u x E I w X l p e h 4 y e 5 Z k x D Z T O 3 w h p m d 3 c k b L Q m E U h 0 1 k + 4 N Q 0 a 0 V y V 2 2 Y Y P D B S 0 U U J w g R X 4 K C R F J U t L h b d C w 0 c J S L 3 G F c i 3 x W y q d M M 5 7 L e J d S 9 E a l p C m k c Z p C b D v X / Z 5 j 9 5 w v F 6 e X r y q R D s h z 8 p K 8 J g 5 5 T y 7 J Z 3 J F B o R b 3 6 3 f 1 h / r 7 / 7 P 1 m H r p P V s u d T a q / a c k D v W e v E P 1 7 l g h A = = < / l a t t N 8 1 T W V M B 4 T h V P T K 1 u j M v a 7 p Q = " > A A A E n n i c f Z N N j 9 M w E I b T U G A p H 7 u F I x f D s h V C V d U s i 4 A D 0 k o c 4 I J Y J L p b q Y k q x 5 2 2 V p 0 4 s i e o V Z S f x g / h z B X + A 3 a a b t O P x U q k 8 X i c 5 3 0 n d p g I r r H b / V V z b 9 V v 3 7 l 7 c K 9 x / 8 H D R 4 d H z c e X W q a K Q Y 9 J I V U / p B o E j 6 G H H A X 0 E w U 0 C g V c h b O P d v 3 q B y j N Z f w d F w k E E Z 3 E f M w Z R Z M a N t 1 L P 4 Q J j 7 O I o u L z V / l A B I 3 M D 8 d k n g 8 z z 8 s J a X 0 g L U I G L R 9 h j h m b A m j I 2 y a / T I Q p I i i T K O c C E F N m K 1 p k t Y W z G c S V j A C Y 2 S 8 H x P f N U w G e r o D X P K q U x A o v k W q 2 p j F r t s p a l a / h h V 4 z 3 4 W 9 3 o a t t e 8 R f 5 P h R C J F a Q v s r m J s F F q f h r y J P r s Z v S K H A O P K d C E n q c L t J r Z J R W r R C + t 0 B / d m G 7 c W v e K b M 7 P 2 9 B + P r Y r D U l P x J x s + x K P r Y z Q 8 O u 5 2 u s U g u 4 F X B s d O O S 6 G z V r X H 0 m W R h A j E 1 T r g d d N M M i o Q s 4 E 5 A 0 / 1 Z B Q N q M T G J g w p h H o I C t u Q E 5 O T G Z E x l K Z N 0 Z S Z K s 7 M h p p v T A O y Y l R O N X b a z a 5 b 2 2 Q 4 v h d k P E 4 S R F i t g S N U 0 F Q E n u d y I g r Y C g W J q B M c a O V s C l V l J n 2 7 V B w G r V L W H s l a K P G r q C U Q r c 1 U n v s z M 0 0 1 s n e G t t j b 7 u j u 0 H v t P O + 4 3 0 7 O z 5 / U T b 7 w H n q P H d e O p 7 z 1 j l 3 P j s X T s 9 h 7 k / 3 t / v H 
/ V t / V v 9 U / 1 L / u i x 1 a + W e J 8 7 G q P f / A Y M 1 f c w = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " 8 X t N 8 1 T W V M B 4 T h V P T K 1 u j M v a 7 p Q = " > A A A E n n i c f Z N N j 9 M w E I b T U G A p H 7 u F I x f D s h V C V d U s i 4 A D 0 k o c 4 I J Y J L p b q Y k q x 5 2 2 V p 0 4 s i e o V Z S f x g / h z B X + A 3 a a b t O P x U q k 8 X i c 5 3 0 n d p g I r r H b / V V z b 9 V v 3 7 l 7 c K 9 x / 8 H D R 4 d H z c e X W q a K Q Y 9 J I V U / p B o E j 6 G H H A X 0 E w U 0 C g V c h b O P d v 3 q B y j N Z f w d F w k E E Z 3 E f M w Z R Z M a N t 1 L P 4 Q J j 7 O I o u L z V / l A B I 3 M D 8 d k n g 8 z z 8 s J a X 0 g L U I G L R 9 h j h m b A m j I 2 y a / T I Q p I i i T K O c C E F N m K 1 p k t Y W z G c S V j A C Y 2 S 8 H x P f N U w G e r o D X P K q U x A o v k W q 2 p j F r t s p a l a / h h V 4 z 3 4 W 9 3 o a t t e 8 R f 5 P h R C J F a Q v s r m J s F F q f h r y J P r s Z v S K H A O P K d C E n q c L t J r Z J R W r R C + t 0 B / d m G 7 c W v e K b M 7 P 2 9 B + P r Y r D U l P x J x s + x K P r Y z Q 8 O u 5 2 u s U g u 4 F X B s d O O S 6 G z V r X H 0 m W R h A j E 1 T r g d d N M M i o Q s 4 E 5 A 0 / 1 Z B Q N q M T G J g w p h H o I C t u Q E 5 O T G Z E x l K Z N 0 Z S Z K s 7 M h p p v T A O y Y l R O N X b a z a 5 b 2 2 Q 4 v h d k P E 4 S R F i t g S N U 0 F Q E n u d y I g r Y C g W J q B M c a O V s C l V l J n 2 7 V B w G r V L W H s l a K P G r q C U Q r c 1 U n v s z M 0 0 1 s n e G t t j b 7 u j u 0 H v t P O + 4 3 0 7 O z 5 / U T b 7 w H n q P H d e O p 7 z 1 j l 3 P j s X T s 9 h 7 k / 3 t / v H / V t / V v 9 U / 1 L / u i x 1 a + W e J 8 7 G q P f / A Y M 1 f c w = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " 8 X t N 8 1 T W V M B 4 T h V P T K 1 u j M v a 7 p Q = " > A A A E n n i c f Z N N j 9 M w E I b T U G A p H 7 u F I x f D s h V C V d U s i 4 A D 0 k o c 4 I J Y J L p b q Y k q x 5 2 2 V p 0 4 s i e o V Z S f x g / h z B X + A 3 a a b t O P x U q k 8 X 
i c 5 3 0 n d p g I r r H b / V V z b 9 V v 3 7 l 7 c K 9 x / 8 H D R 4 d H z c e X W q a K Q Y 9 J I V U / p B o E j 6 G H H A X 0 E w U 0 C g V c h b O P d v 3 q B y j N Z f w d F w k E E Z 3 E f M w Z R Z M a N t 1 L P 4 Q J j 7 O I o u L z V / l A B I 3 M D 8 d k n g 8 z z 8 s J a X 0 g L U I G L R 9 h j h m b A m j I 2 y a / T I Q p I i i T K O c C E F N m K 1 p k t Y W z G c S V j A C Y 2 S 8 H x P f N U w G e r o D X P K q U x A o v k W q 2 p j F r t s p a l a / h h V 4 z 3 4 W 9 3 o a t t e 8 R f 5 P h R C J F a Q v s r m J s F F q f h r y J P r s Z v S K H A O P K d C E n q c L t J r Z J R W r R C + t 0 B / d m G 7 c W v e K b M 7 P 2 9 B + P r Y r D U l P x J x s + x K P r Y z Q 8 O u 5 2 u s U g u 4 F X B s d O O S 6 G z V r X H 0 m W R h A j E 1 T r g d d N M M i o Q s 4 E 5 A 0 / 1 Z B Q N q M T G J g w p h H o I C t u Q E 5 O T G Z E x l K Z N 0 Z S Z K s 7 M h p p v T A O y Y l R O N X b a z a 5 b 2 2 Q 4 v h d k P E 4 S R F i t g S N U 0 F Q E n u d y I g r Y C g W J q B M c a O V s C l V l J n 2 7 V B w G r V L W H s l a K P G r q C U Q r c 1 U n v s z M 0 0 1 s n e G t t j b 7 u j u 0 H v t P O + 4 3 0 7 O z 5 / U T b 7 w H n q P H d e O p 7 z 1 j l 3 P j s X T s 9 h 7 k / 3 t / v H / V t / V v 9 U / 1 L / u i x 1 a + W e J 8 7 G q P f / A Y M 1 f c w = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " 8 X t N 8 1 T W V M B 4 T h V P T K 1 u j M v a 7 p Q = " > A A A E n n i c f Z N N j 9 M w E I b T U G A p H 7 u F I x f D s h V C V d U s i 4 A D 0 k o c 4 I J Y J L p b q Y k q x 5 2 2 V p 0 4 s i e o V Z S f x g / h z B X + A 3 a a b t O P x U q k 8 X i c 5 3 0 n d p g I r r H b / V V z b 9 V v 3 7 l 7 c K 9 x / 8 H D R 4 d H z c e X W q a K Q Y 9 J I V U / p B o E j 6 G H H A X 0 E w U 0 C g V c h b O P d v 3 q B y j N Z f w d F w k E E Z 3 E f M w Z R Z M a N t 1 L P 4 Q J j 7 O I o u L z V / l A B I 3 M D 8 d k n g 8 z z 8 s J a X 0 g L U I G L R 9 h j h m b A m j I 2 y a / T I Q p I i i T K O c C E F N m K 1 p k t Y W z G c S V j 
A C Y 2 S 8 H x P f N U w G e r o D X P K q U x A o v k W q 2 p j F r t s p a l a / h h V 4 z 3 4 W 9 3 o a t t e 8 R f 5 P h R C J F a Q v s r m J s F F q f h r y J P r s Z v S K H A O P K d C E n q c L t J r Z J R W r R C + t 0 B / d m G 7 c W v e K b M 7 P 2 9 B + P r Y r D U l P x J x s + x K P r Y z Q 8 O u 5 2 u s U g u 4 F X B s d O O S 6 G z V r X H 0 m W R h A j E 1 T r g d d N M M i o Q s 4 E 5 A 0 / 1 Z B Q N q M T G J g w p h H o I C t u Q E 5 O T G Z E x l K Z N 0 Z S Z K s 7 M h p p v T A O y Y l R O N X b a z a 5 b 2 2 Q 4 v h d k P E 4 S R F i t g S N U 0 F Q E n u d y I g r Y C g W J q B M c a O V s C l V l J n 2 7 V B w G r V L W H s l a K P G r q C U Q r c 1 U n v s z M 0 0 1 s n e G t t j b 7 u j u 0 H v t P O + 4 3 0 7 O z 5 / U T b 7 w H n q P H d e O p 7 z 1 j l 3 P j s X T s 9 h 7 k / 3 t / v H / V t / V v 9 U / 1 L / u i x 1 a + W e J 8 7 G q P f / A Y M 1 f c w = < / l a t e x i t > c 0 = [ dairy, meat, meat, meat, dairy ] c2 = [ meat, dairy, dairy, veggie, meat ] < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9  E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b 
d Y m g o N H O X S A b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k 
z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " R t Z W k F o j n t 9 U L Q 6 4 r Q 4 Y Y T 2 B V 9 E = " > A A A C N 3 i c b V F N S 8 N A E N 3 U r x q r t m c v i 0 X w U E r i R b 0 J X j x W M L b Q h r L Z b N q l m 9 2 w O x F K 6 B / w 6 o / y N / g j P I l 3 N 2 0 E 2 z q w 8 H h v l j f z J s o E N + B 5 H 0 5 t Z 3 d v / 6 B + 6 B 4 1 3 O O T 0 2 b j 2 a h c U x Z Q J Z Q e R M Q w w S U L g I N g g 0 w z k k a C 9 a P Z f a n 3 X 5 g 2 X M k n m G c s T M l E 8 o R T A p b q j Z t t r + s t C 2 8 D v w J t V N W 4 5 X i j W N E 8 Z R K o I M Y M f S + D s C A a O B V s 4 Y 5 y w z J C Z 2 T C h h Z K k j I T F s s 5 F / j C M j F O l L Z P A l 6 y f 3 8 U J D V m n k a 2 M y U w N Z t a S f 6 n D X N I b s K C y y w H J u n K K M k F B o X L p X H M N a M g 5 h Y Q q r m d F d M p 0 Y S C j W b T B a Z p p z L r / A 6 0 1 l M q o J Q w H Q O W Y n J i w 1 + 4 + N + e h Y 3 Y 3 w x 0 G w R X 3 d u u / + i h O j p D 5 + g S + e g a 3 a E H 1 E M B o i h G r + j N e X c + n a / V J W p O d Z I W W i v n + w e u A b C b < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " R D z C o Z O u G g y G e B Z k U T K 6 X t V j Y V 4 = " > A A A D H X i c h V J N j 9 M w E H X C 1 1 I W 6 H L l Y r G i I B R V y V 6 A A x I S B z g u E m V X S q L K c S a p t b Y T 2 Z O y V Z R f x K / h h P b I P 8 F p U 0 S 7 i x g 5 0 p s 3 M 3 p j v 2 S 1 F B b D 8 M r z b 9 2 + c / f e w f 3 R g 8 O H j x 6 P j w 6 / 2 q o x H G a 8 k p U 5 z 5 g F K T T M U K C E 8 9 o A U 5 m E s + z i Q 1 8 / W 4 K x o t J f c F V D q l i p R S E 4 Q 0 f N x 7 + S D E q h W 8 X Q i M t X X S z T U Z t k B e X d i / k J n b y b 0 H h C E 4 R L b H M m z K o L 6 D Z X w L B P / 5 9 t J i m d p D R J 3 N l K 7 C s M U 7 t j w T / S J Z S l g I 7 u a j q N U Q I 6 / 3 O j + f g 4 n I b r o N d B N I B j M s T p / M g L 
k 7 z i j Q K N X D J r 4 y i s M W 2 Z Q c E l d K O k s V A z f s F K i B 3 U T I F N 2 7 U X H X 3 u m J w W l X G f R r p m / 5 5 o m b J 2 p T L X 6 T Z c 2 P 1 a T 9 5 U i x s s 3 q S t 0 H W D o P l G q G g k x Y r 2 x t J c G O A o V w 4 w b o T b l f I F M 4 y j s 3 9 f B R c q G M S C 7 U I 7 P X 0 F q 0 r a w K K j Q L u f x F 2 d 3 t j T v 3 G 0 / 6 L X w e x k + n Y a f Q 7 J A X l K n p G X J C K v y X v y i Z y S G e H e R 0 9 5 S + + b / 9 3 / 4 f / c m O F 7 g y t P y E 7 4 V 7 8 B m C M A 1 w = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " R D z C o Z O u G g y G e B Z k U T K 6 X t V j Y V 4 = " > A A A D H X i c h V J N j 9 M w E H X C 1 1 I W 6 H L l Y r G i I B R V y V 6 A A x I S B z g u E m V X S q L K c S a p t b Y T 2 Z O y V Z R f x K / h h P b I P 8 F p U 0 S 7 i x g 5 0 p s 3 M 3 p j v 2 S 1 F B b D 8 M r z b 9 2 + c / f e w f 3 R g 8 O H j x 6 P j w 6 / 2 q o x H G a 8 k p U 5 z 5 g F K T T M U K C E 8 9 o A U 5 m E s + z i Q 1 8 / W 4 K x o t J f c F V D q l i p R S E 4 Q 0 f N x 7 + S D E q h W 8 X Q i M t X X S z T U Z t k B e X d i / k J n b y b 0 H h C E 4 R L b H M m z K o L 6 D Z X w L B P / 5 9 t J i m d p D R J 3 N l K 7 C s M U 7 t j w T / S J Z S l g I 7 u a j q N U Q I 6 / 3 O j + f g 4 n I b r o N d B N I B j M s T p / M g L k 7 z i j Q K N X D J r 4 y i s M W 2 Z Q c E l d K O k s V A z f s F K i B 3 U T I F N 2 7 U X H X 3 u m J w W l X G f R r p m / 5 5 o m b J 2 p T L X 6 T Z c 2 P 1 a T 9 5 U i x s s 3 q S t 0 H W D o P l G q G g k x Y r 2 x t J c G O A o V w 4 w b o T b l f I F M 4 y j s 3 9 f B R c q G M S C 7 U I 7 P X 0 F q 0 r a w K K j Q L u f x F 2 d 3 t j T v 3 G 0 / 6 L X w e x k + n Y a f Q 7 J A X l K n p G X J C K v y X v y i Z y S G e H e R 0 9 5 S + + b / 9 3 / 4 f / c m O F 7 g y t P y E 7 4 V 7 8 B m C M A 1 w = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " t m i W q 8 A i K a d 2 o A 6 y B 4 X I h t P 0 C Q U = " > A A A D K H i c h V J L b 9 N A E F 6 b V w m P p n 
D k s i I Q E L I i u x f g g F S J A z 0 W i d B K t h W t 1 2 N n 1 f X a 2 h 2 H R p Z / E b + G E + o R f k n X i Y N I W s R o V / N Y c p L 2 W p z x J m Q A o F U x Q o 4 a z S w I p E w m l y / r H L n y 5 A G 1 G q L 7 i s I C 5 Y r k Q m O E M b m g 1 / R Q n k Q j U F Q y 0 u 3 r S h j A d N l G S U t 6 9 m h 3 T 8 Y U z D M Y 0 Q L r B J m d D L 1 q M b v w C G n f t / b 8 2 k d B z T K L J n I 7 G r 0 L O 2 a d 4 / 3 A X k u Y C W b m t a j U E E K v 3 z o t l w 5 E / 8 l d H r I O j B i P R 2 M j t w / C g t e V 2 A Q i 6 Z M W H g V x g 3 T K P g E t p B V B u o G D 9 n O Y Q W K l a A i Z v V L l r 6 0 k Z S m p X a X o V 0 F f 2 b 0 b D C m G W R 2 E r b 4 d z s 5 r r g T b m w x u x d 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 5 v r O l m H O x O 9 D q Y H k 7 e T 4 L P / u j o R T / s P f K M P C e v S U D e k i N y T E 7 I l H D n k 1 M 4 C + e b + 9 3 9 4 f 5 0 L 9 e l r t N z n p I t c 3 9 f A b V h A k k = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9 E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 
2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9 E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9 E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X 
f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9 E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 
3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9 E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " N e 9 n / s K 5 P T g 9 E 3 b x j c f v U s y Q j b I = " > A A A D K H i c h V J b i 9 N A F J 7 E 2 1 p v X X 3 0 Z b B a R U J J F s H 1 Q V j w Q R 9 X s O 5 C E s p k c p I O O 5 m E m Z P u l p B f 5 K / x S f Z R f 4 m T N h X b X f E w A 9 + 5 f H x n z p m k k s 
K g 7 1 8 6 7 o 2 b t 2 7 f 2 b s 7 u H f / w c N H w / 3 H X 0 1 Z a w 5 T X s p S n y b M g B Q K p i h Q w m m l g R W J h J P k 7 E O X P 1 m A N q J U X 3 B Z Q V y w X I l M c I Y 2 N B v + j B L I h W o K h l p c v G 5 D G Q + a K M k o b 1 / O D u j 4 / Z i G Y x o h X G C T M q G X r U c 3 f g E M O / f / 3 p p J 6 T i m U W T P R m J X o W d t 0 7 x / u A v I c w E t 3 d a 0 G o M I V P r n R b P h y J / 4 K 6 N X Q d C D E e n t e L b v + F F a 8 r o A h V w y Y 8 L A r z B u m E b B J b S D q D Z Q M X 7 G c g g t V K w A E z e r X b T 0 h Y 2 k N C u 1 v Q r p K v o 3 o 2 G F M c s i s Z W 2 w 7 n Z z X X B 6 3 J h j d l h 3 A h V 1 Q i K r 4 W y W l I s a b d Y m g o N H O X S A s a 1 s L 1 S P m e a c b T r 3 1 X B e e H 1 Y t 6 m o a 2 a L o N l K Y 1 n 0 I Z A 2 U 9 i n 0 6 v r e l m H O x O 9 C q Y H k z e T Y L P b 0 Z H z / t h 7 5 G n 5 B l 5 R Q L y l h y R T + S Y T A l 3 P j q F s 3 D O 3 W / u d / e H e 7 k u d Z 2 e 8 4 R s m f v r N 7 a h A k 0 = < / l a t e x i t > Test point x test = [ yogurt, butter, carrot, chicken, carrot ] < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z T I l M c I Y O T f o / 4 x R m Q l U 5 Q y O W + 3 U k k 1 4 V p x l d 1 j 9 i h C V W C B Z r S o c n d E h p N G z h S s 9 K g / V H x 1 u Q l o h g H O j u n B m j m 4 I h 7 c B c 8 A W o 2 6 Q t c S M S 2 o t B T f 8 t M e k P g l G w D v p Q h J 0 Y k C 7 O J / 3 f 8 V T z M g e F X D J r o z A o M K m Y Q c E l 1 L 2 4 t F A w v m A z i J x U L A e b V G v 3 a v r O k S n N t H F H I V 3 T 2 x 0 V y 6 1 d 5 a m r d A v O 7 f 1 c A 
/ + X i 0 r M j p J K q K J E U L x 9 K C s l R U 2 b r 6 B T Y Y C j X D n B u B F u V 8 r n z D D u n L z 7 S j M b t Z a 2 s S a 8 b 8 R D c f l p F A a j 8 O J g c P q 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z T I l M c I Y O T f o / 4 x R m Q l U 5 Q y O W + 3 U k k 1 4 V p x l d 1 j 9 i h C V W C B Z r S o c n d E h p N G z h S s 9 K g / V H x 1 u Q l o h g H O j u n B m j m 4 I h 7 c B c 8 A W o 2 6 Q t c S M S 2 o t B T f 8 t M e k P g l G w D v p Q h J 0 Y k C 7 O J / 3 f 8 V T z M g e F X D J r o z A o M K m Y Q c E l 1 L 2 4 t F A w v m A z i J x U L A e b V G v 3 a v r O k S n N t H F H I V 3 T 2 x 0 V y 6 1 d 5 a m r d A v O 7 f 1 c A / + X i 0 r M j p J K q K J E U L x 9 K C s l R U 2 b r 6 B T Y Y C j X D n B u B F u V 8 r n z D D u n L z 7 S j M b t Z a 2 s S a 8 b 8 R D c f l p F A a j 8 O J g c P q 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V 
h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z T I l M c I Y O T f o / 4 x R m Q l U 5 Q y O W + 3 U k k 1 4 V p x l d 1 j 9 i h C V W C B Z r S o c n d E h p N G z h S s 9 K g / V H x 1 u Q l o h g H O j u n B m j m 4 I h 7 c B c 8 A W o 2 6 Q t c S M S 2 o t B T f 8 t M e k P g l G w D v p Q h J 0 Y k C 7 O J / 3 f 8 V T z M g e F X D J r o z A o M K m Y Q c E l 1 L 2 4 t F A w v m A z i J x U L A e b V G v 3 a v r O k S n N t H F H I V 3 T 2 x 0 V y 6 1 d 5 a m r d A v O 7 f 1 c A / + X i 0 r M j p J K q K J E U L x 9 K C s l R U 2 b r 6 B T Y Y C j X D n B u B F u V 8 r n z D D u n L z 7 S j M b t Z a 2 s S a 8 b 8 R D c f l p F A a j 8 O J g c P q 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z T I l M c I Y O T f o / 4 x R m Q l U 5 Q y O W + 3 U k k 1 4 V p x l d 1 j 9 i h C V W C B Z r S o c n d E h p N G z h S s 9 K g / V H x 1 u Q l o h g H O j u n B m j m 4 I h 7 c B c 8 A W o 2 6 Q t c S M S 2 o t B T f 8 t M e k P g l G w D v p Q h J 0 Y k C 7 O J / 3 f 8 V T z M g e F X D J r o z A o M K m Y Q c E l 1 L 2 4 t F A w v m A z i J x U L A e b V G v 3 a v r O k S n N t H F H I V 3 T 2 x 0 V y 6 1 d 5 a m r d A v O 7 f 1 c A / + X i 0 r M j p J K q K J E U L x 9 K C s l R U 2 b r 6 B T Y Y C j X D n B u B F u V 8 r n z D D u n L z 7 S j M b t Z a 2 s S a 8 b 8 R D c f l p F A a j 8 O J g c P q 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r 
H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > Figure 4 : The test point x test and the train point x 1,1 are generated by the same sequence of concepts.
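The generative mechanism behind Figure 4 can be sketched in a few lines: a latent sequence of concepts emits one word per position, so two points drawn from the same sequence (such as x_test and x_{1,1}) differ word-by-word yet share the same latent structure. The sketch below is illustrative only; the concept-to-word lists extend the paper's running example with made-up words, and `generate_sentence`/`recover` are hypothetical names, not code from the paper.

```python
import random

# Hypothetical mini-vocabulary grouped by concept, extending the paper's
# running example (the paper's actual setup has n_w = 12 words, n_c = 3 concepts).
CONCEPTS = {
    "veggie": ["potato", "carrot", "leek", "pumpkin"],
    "dairy":  ["cheese", "butter", "yogurt", "cream"],
    "meat":   ["chicken", "beef", "pork", "lamb"],
}

def generate_sentence(concept_seq, rng):
    """Latent-to-observed map: each concept emits one of its words uniformly."""
    return [rng.choice(CONCEPTS[c]) for c in concept_seq]

def recover(sentence):
    """Map each word back to its (unique) generating concept."""
    return [c for w in sentence for c, ws in CONCEPTS.items() if w in ws]

rng = random.Random(0)
c_bar_1 = ["dairy", "dairy", "veggie", "meat", "veggie"]  # an unfamiliar sequence
x_test = generate_sentence(c_bar_1, rng)  # test point
x_11 = generate_sentence(c_bar_1, rng)    # train point x_{1,1}, same latent sequence

# The two sentences generally differ word-by-word, yet share one concept sequence.
assert recover(x_test) == recover(x_11) == c_bar_1
```

The point of the sketch is that the shared structure lives entirely at the concept level: a learner that only compares raw words has no direct access to it.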

B.3 CONCLUSION OF PROOF

We now establish the desired upper bound (28), which we restate below for convenience:

    sup_{K ∈ 𝕂} P_DM[ E_K ] ≤ (1/|X|) ∑_{x ∈ X} H_{2R-1}( K(x, ·) ) + 1/R

where

    E_K = { ω ∈ Ω_DM : there exists 1 ≤ s* ≤ n_spl such that K(x_test, x_{1,s*}) > K(x_test, x_{r,s}) for all 2 ≤ r ≤ R and all 1 ≤ s ≤ n_spl }.

We recall that the test point x_test is generated by the unfamiliar sequence of concepts c̄_1 and that it belongs to category 1; see Figure 4. The event E_K describes all the outcomes in which the training point most similar to x_test (where similarity is measured with respect to the kernel K) belongs to the first category. There are two very distinct cases within the event E_K. The training point most similar to x_test can be x_{1,1}: this corresponds to a 'meaningful success,' in which the learner recognizes that x_{1,1} is generated by the same sequence of concepts as x_test, see Figure 4. Or the training point most similar to x_test can be one of the points x_{1,2}, . . . , x_{1,n_spl}: this corresponds to a 'lucky success,' because x_{1,2}, . . . , x_{1,n_spl} are not related to x_test (they are generated by a different sequence of concepts, see Figure 4). To make this discussion formal, we fix a kernel K ∈ 𝕂 and partition the event E_K as follows:

    E_K = E_meaningful ∪ E_luck

where

    E_meaningful = E_K ∩ { ω ∈ Ω_DM : K(x_test, x_{1,1}) > K(x_test, x_{1,s}) for all 2 ≤ s ≤ n_spl }
    E_luck = E_K ∩ { ω ∈ Ω_DM : K(x_test, x_{1,1}) ≤ K(x_test, x_{1,s}) for some 2 ≤ s ≤ n_spl }

The next two lemmas provide upper bounds for the probabilities of the events E_meaningful and E_luck.

Lemma 10. P_DM[E_meaningful] ≤ (1/|X|) ∑_{x ∈ X} H_{2R-1}(K_x).

Proof. Define the event

    A := { ω ∈ Ω_DM : K(x_test, x_{1,1}) > K(x_test, x_{r,1}) for all 2 ≤ r ≤ R }
       ∩ { ω ∈ Ω_DM : K(x_test, x_{1,1}) > K(x_test, x_{r,2}) for all 1 ≤ r ≤ R }.

This event involves only the first two training points of each category. In the example depicted in Figure 4, these would be the points x_{1,1} and x_{1,2}, the points x_{2,1} and x_{2,2}, and finally the points x_{3,1} and x_{3,2}.
The event $A$ consists of all the outcomes in which, among these $2R$ training points, $x_{1,1}$ is the most similar to $x_{test}$. We then make two key remarks. First, these $2R$ points are generated by $2R$ distinct sequences of concepts. So if we restrict our attention to these $2R$ points, we are in a situation very similar to the simpler data model $SDM$ (i.e. we first generate $2R$ sequences of concepts, then from each sequence of concepts we generate a single training point, and finally we generate a test point from the first sequence of concepts). We will make this intuition precise by appealing to the fact that $SDM$ is the marginal of $DM$, and this will allow us to obtain a bound for $P_{DM}[A]$ in terms of the permuted moment of $K$. The second remark is that $E_{meaningful}$ is clearly contained in $A$, and therefore we have
$$P_{DM}[E_{meaningful}] \le P_{DM}[A]. \tag{66}$$
Let us rename some of the variables. We define $d_1, \ldots, d_{2R}$ and $y_1, \ldots, y_{2R}$ as follows:
$$d_{2r-1} = \bar{c}_r \quad \text{and} \quad d_{2r} = c_r \qquad \text{for } r = 1, \ldots, R,$$
$$y_{2r-1} = x_{r,1} \quad \text{and} \quad y_{2r} = x_{r,2} \qquad \text{for } r = 1, \ldots, R.$$
In the example depicted in Figure 4, that would be:
$$y_1 = x_{1,1}, \quad y_2 = x_{1,2}, \quad y_3 = x_{2,1}, \quad y_4 = x_{2,2}, \quad y_5 = x_{3,1}, \quad y_6 = x_{3,2},$$
$$d_1 = \bar{c}_1, \quad d_2 = c_1, \quad d_3 = \bar{c}_2, \quad d_4 = c_2, \quad d_5 = \bar{c}_3, \quad d_6 = c_3.$$
In other words, the $y_r$'s are the first two training points of each category and the $d_r$'s are their corresponding sequences of concepts. With these notations it is clear that the training points $y_r$ are generated by distinct sequences of concepts, and that the test point $x_{test}$ is generated by the same sequence of concepts as $y_1$. Moreover the event $A$ can now be conveniently written as
$$A = \{ \omega \in \Omega_{DM} : K(x_{test}, y_1) > K(x_{test}, y_r) \text{ for all } 2 \le r \le 2R \}.$$
Let $h : \mathcal{X}^{2R+1} \to \mathbb{R}$ be the indicator function defined by
$$h(y_1, \ldots, y_{2R}, x_{test}) = \begin{cases} 1 & \text{if } K(x_{test}, y_1) > K(x_{test}, y_r) \text{ for all } 2 \le r \le 2R, \\ 0 & \text{otherwise.} \end{cases}$$
We now recall the splitting $\omega = (\omega_a, \omega_b)$ described in (58)-(59) and note that $\omega_a$ can be written as
$$\omega_a = (\varphi;\; d_1, \ldots, d_{2R};\; y_1, \ldots, y_{2R};\; x_{test}).$$
Since $h$ only depends on the variables involved in $\omega_a$, and since, according to Lemma 8, $\sum_{\omega_b} DM(\omega_a, \omega_b) = SDM(\omega_a)$, we obtain
$$P_{DM}[A] = \sum_{\omega_a \in \Omega_a} \sum_{\omega_b \in \Omega_b} h(y_1, \ldots, y_{2R}, x_{test}) \, DM(\omega_a, \omega_b) = \sum_{\omega_a \in \Omega_a} h(y_1, \ldots, y_{2R}, x_{test}) \, SDM(\omega_a)$$
$$= P_{SDM}\big[\, \omega_a \in \Omega_a : K(x_{test}, y_1) > K(x_{test}, y_r) \text{ for all } 2 \le r \le 2R \,\big] \;\le\; \frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} H_{2R-1}(K_x)$$
where we have used Theorem 4 in order to get the last inequality (with the understanding that $t + 1 = 2R$). Combining the above bound with (66) concludes the proof.

We now estimate the probability of the event $E_{luck}$.

Lemma 11. $P_{DM}[E_{luck}] \le \frac{1}{R}$.

Proof. For $1 \le r \le R$, we define the events
$$B_r = \bigcap_{\substack{1 \le r' \le R \\ r' \ne r}} \Big\{ \omega \in \Omega_{DM} : \max_{2 \le s \le n_{spl}} K(x_{test}, x_{r,s}) > \max_{2 \le s' \le n_{spl}} K(x_{test}, x_{r',s'}) \Big\}.$$
Note that the events $B_r$ involve only the training points with an index $s \ge 2$: these are the familiar training points. In the example depicted in Figure 4, these are the training points generated by $c_1$, $c_2$ and $c_3$. Let us pursue with this example. The event $B_1$ consists of all the outcomes in which one of the points generated by $c_1$ is more similar to $x_{test}$ than any of the points generated by $c_2$ and $c_3$. The event $B_2$ consists of all the outcomes in which one of the points generated by $c_2$ is more similar to $x_{test}$ than any of the points generated by $c_1$ and $c_3$. And finally the event $B_3$ consists of all the outcomes in which one of the points generated by $c_3$ is more similar to $x_{test}$ than any of the points generated by $c_1$ and $c_2$. Importantly, the test point $x_{test}$ is generated by the unfamiliar sequence of concepts $\bar{c}_1$, and this sequence of concepts is unrelated to the sequences $c_1$, $c_2$ and $c_3$.
So one would expect, by simple symmetry, that
$$P_{DM}[B_1] = P_{DM}[B_2] = P_{DM}[B_3]. \tag{67}$$
We will prove (67) rigorously, for general $R$, using Lemma 9 from the previous subsection. But before doing so, let us show that (67) implies the desired upper bound on the probability of $E_{luck}$. First, note that $E_{luck} \subset B_1$ and therefore
$$P_{DM}[E_{luck}] \le P_{DM}[B_1]. \tag{68}$$
Then, note that $B_1$, $B_2$ and $B_3$ are mutually disjoint, and therefore, continuing with the same example,
$$P_{DM}[B_1] + P_{DM}[B_2] + P_{DM}[B_3] = P_{DM}[B_1 \cup B_2 \cup B_3] \le 1$$
which, combined with (67) and (68), gives $P_{DM}[E_{luck}] \le 1/3$ as desired.

We now provide a formal proof of (67). As in the proof of the previous lemma, it is convenient to rename some of the variables. Let us denote by $fam_r$ the variable that consists of the familiar training points from category $r$:
$$fam_r = (x_{r,2}, \ldots, x_{r,n_{spl}}) \in \mathcal{X}^{n_{spl}-1}.$$
With this notation we have $fam_{r,s} = x_{r,s+1}$. We now recall the splitting $\omega = (\omega_c, \omega_d)$ described in (61)-(62), and note that $\omega_c$ can be written as
$$\omega_c = (\varphi;\; fam_1;\; \ldots;\; fam_R;\; x_{test}). \tag{69}$$
Using Lemma 9 we have
$$\sum_{\omega_d \in \Omega_d} DM(\omega_c, \omega_d) = \alpha \prod_{r=1}^{R} \mathbf{1}_{\{\varphi(x_{r,2}) = \varphi(x_{r,3}) = \cdots = \varphi(x_{r,n_{spl}})\}} \tag{70}$$
$$= \alpha \prod_{r=1}^{R} \mathbf{1}_{\{\varphi(fam_{r,1}) = \varphi(fam_{r,2}) = \cdots = \varphi(fam_{r,n_{spl}-1})\}} \tag{71}$$
$$= \alpha \prod_{r=1}^{R} h(\varphi, fam_r) \tag{72}$$
where $\alpha$ is the constant appearing in front of the product in Lemma 9 and $h : \Phi \times \mathcal{X}^{n_{spl}-1} \to \{0, 1\}$ is the indicator function implicitly defined by the equality between (71) and (72). With the slight abuse of notation of viewing $fam_r$ as a set instead of a tuple, let us rewrite the event $B_r$ as
$$B_r = \bigcap_{\substack{1 \le r' \le R \\ r' \ne r}} \Big\{ \omega \in \Omega_{DM} : \max_{x \in fam_r} K(x_{test}, x) > \max_{x' \in fam_{r'}} K(x_{test}, x') \Big\}.$$
We also define the corresponding indicator function
$$g(fam_r, fam_{r'}, x_{test}) = \begin{cases} 1 & \text{if } \max_{x \in fam_r} K(x_{test}, x) > \max_{x' \in fam_{r'}} K(x_{test}, x'), \\ 0 & \text{otherwise.} \end{cases}$$
We now compute $P_{DM}[B_1]$ (the formulas for the other $P_{DM}[B_r]$ are obtained in a similar manner). Recall from (69) that the variables involved in $fam_r$ only appear in $\omega_c$.
Therefore
$$P_{DM}[B_1] = \sum_{\omega_c \in \Omega_c} \sum_{\omega_d \in \Omega_d} \left[ \prod_{r=2}^{R} g(fam_1, fam_r, x_{test}) \right] DM(\omega_c, \omega_d) \tag{73}$$
$$= \sum_{\omega_c \in \Omega_c} \left[ \prod_{r=2}^{R} g(fam_1, fam_r, x_{test}) \right] \sum_{\omega_d \in \Omega_d} DM(\omega_c, \omega_d) \tag{74}$$
$$= \alpha \sum_{\omega_c \in \Omega_c} \left[ \prod_{r=2}^{R} g(fam_1, fam_r, x_{test}) \right] \prod_{r'=1}^{R} h(\varphi, fam_{r'}) \tag{75}$$
where we have used (72) to obtain the last equality. Let us now compare $P_{DM}[B_1]$ and $P_{DM}[B_2]$. Letting $\mathcal{Z} := \mathcal{X}^{n_{spl}-1}$, and recalling that $\omega_c = (\varphi; fam_1, \ldots, fam_R; x_{test})$, we obtain:
$$P_{DM}[B_1] = \alpha \sum_{\varphi \in \Phi} \sum_{fam_1 \in \mathcal{Z}} \cdots \sum_{fam_R \in \mathcal{Z}} \sum_{x_{test} \in \mathcal{X}} \Bigg[ \prod_{\substack{1 \le r \le R \\ r \ne 1}} g(fam_1, fam_r, x_{test}) \Bigg] \prod_{r'=1}^{R} h(\varphi, fam_{r'}),$$
$$P_{DM}[B_2] = \alpha \sum_{\varphi \in \Phi} \sum_{fam_1 \in \mathcal{Z}} \cdots \sum_{fam_R \in \mathcal{Z}} \sum_{x_{test} \in \mathcal{X}} \Bigg[ \prod_{\substack{1 \le r \le R \\ r \ne 2}} g(fam_2, fam_r, x_{test}) \Bigg] \prod_{r'=1}^{R} h(\varphi, fam_{r'}).$$
From the above it is clear that $P_{DM}[B_1] = P_{DM}[B_2]$ (as can be seen by exchanging the names of the variables $fam_1$ and $fam_2$). Similar reasoning shows that the events $B_r$ are all equiprobable, which concludes the proof.

Combining Lemmas 10 and 11, together with the fact that $E_K = E_{meaningful} \cup E_{luck}$, concludes the proof of (63). Inequality (63) implies inequality (26), which itself is equivalent to inequality (14).

C UPPER BOUND FOR THE PERMUTED MOMENT OF K*

This section is devoted to the proof of inequality (15), which we state below as a theorem for convenience.

Theorem 5. For all $0 \le \ell \le L$, we have the upper bound
$$\frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} H_t(K^\star_x) \;\le\; 1 - \sum_{k \in \mathcal{S}_\ell} f(k) g(k) \;+\; \frac{1}{t+1} \max_{k \in \mathcal{S}_\ell} f(k). \tag{76}$$
The rather intricate formulas for the functions $f$ and $g$ can be found in the main body of the paper, but we will recall them as we go through the proof. We also recall that the optimal kernel is given by the formula:
$$K^\star(x, y) = \frac{1}{s_c^L} \; \frac{|\{\varphi \in \Phi : \varphi(x_\ell) = \varphi(y_\ell) \text{ for all } 1 \le \ell \le L\}|}{|\Phi|}. \tag{77}$$
The key insight to derive the upper bound (76) is to note that each pair of sentences $(x, y)$ induces a graph on the vocabulary $\{1, 2, \ldots, n_w\}$, and that the quantity
$$|\{\varphi \in \Phi : \varphi(x_\ell) = \varphi(y_\ell) \text{ for all } 1 \le \ell \le L\}|$$
can be interpreted as the number of equipartitions of the vocabulary that do not sever any of the edges of the graph. This graph-cut interpretation of the optimal kernel is presented in detail in Subsection C.1. In Subsection C.2 we derive a formula for $K^\star$ which is more tractable than (77). To do this we partition $\mathcal{X} \times \mathcal{X}$ into subsets on which $K^\star$ is constant, then provide a formula for the value of $K^\star$ on each of these subsets (c.f. Lemma 16). With this formula at hand, we then appeal to Lemma 3 to derive a first bound for the permuted moment of $K^\star$ (c.f. Lemma 17). This first bound is not fully explicit because it involves the sizes of the subsets on which $K^\star$ is constant. In Subsection C.3 we appeal to Cayley's formula, a classical result from graph theory, to estimate the sizes of these subsets (c.f. Lemma 18) and therefore conclude the proof of Theorem 5.

We now introduce the combinatorial notations that will be used in this section, and we recall a few basic combinatorial facts. We denote by $\mathbb{N} = \{0, 1, 2, \ldots\}$ the nonnegative integers. We use the standard notations
$$\binom{n}{k} := \frac{n!}{k!(n-k)!} \qquad \text{and} \qquad \binom{n}{k_1, k_2, \ldots, k_m} := \frac{n!}{k_1! k_2! \cdots k_m!}$$
for the binomial and multinomial coefficients, with the understanding that $0! = 1$.
We recall that multinomial coefficients can be interpreted as the number of ways of placing $n$ distinct objects into $m$ distinct bins, with the constraint that $k_1$ objects must go in the first bin, $k_2$ objects must go in the second bin, and so forth. The Stirling numbers of the second kind ${n \brace k}$ are close relatives of the binomial coefficients: ${n \brace k}$ stands for the number of ways to partition a set of $n$ objects into $k$ nonempty subsets. To give a simple example, ${4 \brace 2} = 7$ because there are 7 ways to partition the set $\{1, 2, 3, 4\}$ into two nonempty subsets, namely:
$$\{1\} \cup \{2,3,4\}, \quad \{2\} \cup \{1,3,4\}, \quad \{3\} \cup \{1,2,4\}, \quad \{4\} \cup \{1,2,3\}, \quad \{1,2\} \cup \{3,4\}, \quad \{1,3\} \cup \{2,4\}, \quad \{1,4\} \cup \{2,3\}.$$
Stirling numbers are easily computed via the following variant of Pascal's recurrence formula:
$${n \brace 1} = 1, \qquad {n \brace n} = 1 \quad \text{for } n \ge 1, \qquad {n \brace k} = {n-1 \brace k-1} + k \, {n-1 \brace k} \quad \text{for } 2 \le k \le n-1.$$
The above formula is easily derived from the definition of the Stirling numbers as providing the number of ways to partition a set of $n$ objects into $k$ nonempty subsets (see for example chapter 6 of ?). Table 2 shows the first few Stirling numbers.

We recall that an undirected graph is an ordered pair $G = (V, E)$, where $V$ is the vertex set and $E \subset \{\{v, v'\} : v, v' \in V \text{ and } v \ne v'\}$ is the edge set. Edges are unordered pairs of distinct vertices (so loops are not allowed). A tree is a connected graph with no cycles. A tree on $n$ vertices has exactly $n - 1$ edges. Cayley's formula states that there are $n^{n-2}$ ways to put $n - 1$ edges on $n$ labeled vertices in order to make a tree. We formally state this classical result below:

Lemma 12 (Cayley's formula). There are $n^{n-2}$ trees on $n$ labeled vertices.

(Figure 5: the equipartition on the left is cut-free, meaning no edges are severed; the equipartition on the right is not cut-free, as 4 edges are severed. The optimal kernel $K^\star(x, y)$ can be interpreted as the number of distinct cut-free equipartitions of the graph $\zeta(x, y)$, modulo some scaling factor.)

Recall that $\Phi$ is the set of maps $\varphi : V \to \{1, \ldots, n_c\}$ that satisfy $|\varphi^{-1}(c)| = s_c$ for all $1 \le c \le n_c$. Given a graph $G$, the quantity $I(G)$ can therefore be interpreted as the number of ways to partition the vertices into $n_c$ labelled subsets of equal size so that no edges are severed (i.e. two connected vertices must be in the same subset). In other words, $I(G)$ is the number of "cut-free" equipartitions of the graph $G$; see Figure 5 for an illustration. Note that if the graph $G$ is connected, then $I(G) = 0$, since any equipartition of the graph will sever some edges. On the other hand, if the graph $G$ has no edges, then $I(G) = |\Phi|$, since there are no edges to be cut (and therefore any equipartition is acceptable). The optimal kernel $K^\star$ can be expressed as a composition of the functions $\zeta$ and $I$. Indeed:
$$K^\star(x, y) = \frac{1}{|\Phi| s_c^L} \, |\{\varphi \in \Phi : \varphi(x_\ell) = \varphi(y_\ell) \text{ for all } 1 \le \ell \le L\}| \tag{82}$$
$$= \frac{1}{|\Phi| s_c^L} \, |\{\varphi \in \Phi : \varphi(v) = \varphi(v') \text{ for all } \{v, v'\} \in \zeta(x, y)\}| \tag{83}$$
$$= \frac{1}{|\Phi| s_c^L} \, I(\zeta(x, y)) \tag{84}$$
where we have simply used that $\zeta(x, y) := \bigcup_{1 \le \ell \le L, \, x_\ell \ne y_\ell} \{\{x_\ell, y_\ell\}\}$ to go from (82) to (83). We will refer to (84) as the graph-cut formulation of the optimal kernel.

We have discussed earlier that the function $\zeta : \mathcal{X} \times \mathcal{X} \to \mathcal{G}$ is surjective but not injective. We conclude this subsection with a lemma that provides an exact count of how many distinct pairs $(x, y)$ are mapped by $\zeta$ to a same graph $G$.

Lemma 13. Suppose $G \in \mathcal{G}$ has $m$ edges. Then
$$|\zeta^{-1}(G)| = |\{(x, y) \in \mathcal{X} \times \mathcal{X} : \zeta(x, y) = G\}| = m! \sum_{\alpha=m}^{L} \binom{L}{\alpha} {\alpha \brace m} 2^\alpha \, n_w^{L-\alpha}.$$
Proof. For $0 \le \alpha \le L$, let $O_\alpha$ denote the set of pairs $(x, y) \in \zeta^{-1}(G)$ that have exactly $\alpha$ non-silent positions (a position $\ell$ is silent if $x_\ell = y_\ell$, and non-silent otherwise). We start by noting that the set $O_\alpha$ is empty for all $\alpha < m$: indeed, since $G$ has $m$ edges, at least $m$ positions of a pair $(x, y)$ must be coding for edges (i.e., must be non-silent) in order to have $\zeta(x, y) = G$. We therefore have the following partition:
$$\zeta^{-1}(G) = \bigcup_{\alpha=m}^{L} O_\alpha \qquad \text{and} \qquad O_\alpha \cap O_{\alpha'} = \emptyset \text{ if } \alpha \ne \alpha'.$$
To conclude the proof, we need to show that
$$|O_\alpha| = \binom{L}{\alpha} {\alpha \brace m} m! \, 2^\alpha \, n_w^{L-\alpha} \qquad \text{for all } m \le \alpha \le L. \tag{85}$$
Consider the following process to generate an ordered pair $(x, y)$ that belongs to $O_\alpha$. We start by deciding which positions of $(x, y)$ are going to be silent, and which positions are going to code for which edge of the graph $G$. This is equivalent to choosing a map $\rho : \{1, \ldots, L\} \to \{e_1, \ldots, e_m, s\}$, where $\{1, \ldots, L\}$ denotes the $L$ positions, $e_1, \ldots, e_m$ denote the $m$ edges of the graph $G$, and $s$ is the silent symbol. Choosing a map $\rho$ corresponds to assigning a "role" to each position: $\rho(\ell) = e_i$ means that position $\ell$ is given the role to code for edge $e_i$, and $\rho(\ell) = s$ means that position $\ell$ is given the role of being silent. The map $\rho$ must satisfy
$$|\rho^{-1}(s)| = L - \alpha \qquad \text{and} \qquad \rho^{-1}(e_i) \ne \emptyset \text{ for } 1 \le i \le m \tag{86}$$
because $L - \alpha$ positions must be silent and each edge must be represented by at least one position. The number of maps $\rho : \{1, \ldots, L\} \to \{e_1, \ldots, e_m, s\}$ that satisfy (86) is equal to
$$\binom{L}{L - \alpha} {\alpha \brace m} m!$$
Indeed, we first need to choose the $L - \alpha$ positions that will be mapped to the silent symbol $s$: there are $\binom{L}{L-\alpha}$ ways of accomplishing this. We then partition the $\alpha$ remaining positions into $m$ non-empty subsets: there are ${\alpha \brace m}$ ways of accomplishing this. We finally map each of these non-empty subsets to a different edge: there are $m!$ ways of accomplishing this. We have thus shown that there are $\binom{L}{\alpha} {\alpha \brace m} m!$ ways to assign roles to the positions. Say that position $\ell$ is assigned the role of coding for edge $\{v, v'\}$. Then we have two choices to generate the entries $x_\ell$ and $y_\ell$: either $x_\ell = v$ and $y_\ell = v'$, or $x_\ell = v'$ and $y_\ell = v$. Since $\alpha$ positions are coding for edges, this leads to the factor $2^\alpha$ in equation (85). Finally, if the position $\ell$ is silent, then we have $n_w$ choices to generate the entries $x_\ell$ and $y_\ell$ (because we need to choose $v \in V$ and set $x_\ell = y_\ell = v$), hence the factor $n_w^{L-\alpha}$ appearing in (85).
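The counting argument of Lemma 13 can be sanity-checked by brute force on a tiny instance. The sketch below implements the Stirling recurrence recalled above, the map $\zeta$ (one edge per non-silent position), and compares the closed-form count against direct enumeration. The parameters $L = 3$, $n_w = 4$ and the single-edge graph are arbitrary values chosen for the check, not parameters from the paper.

```python
from itertools import product
from math import comb, factorial

def stirling2(n, k):
    """Stirling number of the second kind, via the Pascal-like recurrence in the text."""
    if k == 0:
        return 1 if n == 0 else 0
    if k > n:
        return 0
    if k == 1 or k == n:
        return 1
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def zeta(x, y):
    """Graph induced by a pair of sentences: one edge {x_l, y_l} per non-silent position."""
    return frozenset(frozenset((a, b)) for a, b in zip(x, y) if a != b)

# Tiny test instance: sentences of length L = 3 over a vocabulary of n_w = 4 words.
L, n_w = 3, 4
G = frozenset({frozenset({1, 2})})   # a graph with m = 1 edge on the vocabulary
m = len(G)

words = range(1, n_w + 1)
brute = sum(1 for x in product(words, repeat=L)
              for y in product(words, repeat=L)
              if zeta(x, y) == G)

# Lemma 13: |zeta^{-1}(G)| = m! * sum_{a=m}^{L} C(L,a) {a brace m} 2^a n_w^{L-a}
closed = factorial(m) * sum(comb(L, a) * stirling2(a, m) * 2**a * n_w**(L - a)
                            for a in range(m, L + 1))

print(stirling2(4, 2), brute, closed)  # 7 152 152
```

Both counts agree on this instance, and `stirling2(4, 2) = 7` matches the worked example from the text.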

C.2 LEVEL SETS OF THE OPTIMAL KERNEL

We recall that a connected component (or simply a component) of a graph is a connected subgraph that is not part of any larger connected subgraph. Since graphs in $\mathcal{G}$ have at most $L$ edges, their components contain at most $L + 1$ vertices. This comes from the fact that the largest component that can be made with $L$ edges contains $L + 1$ vertices. It is therefore natural, given a vector $k = [k_1, \ldots, k_{L+1}] \in \mathbb{N}^{L+1}$, to define
$$\mathcal{G}_k := \{ G \in \mathcal{G} : G \text{ has exactly } k_1 \text{ components of size } 1, \text{ exactly } k_2 \text{ components of size } 2, \ldots, \text{ exactly } k_{L+1} \text{ components of size } L+1 \} \tag{87}$$
where the size of a component refers to the number of vertices it contains. We recall that $\mathbb{N} = \{0, 1, 2, \ldots\}$, therefore some of the entries of the vector $k$ can be equal to zero. Note that components of size 1 are simply isolated vertices. The following lemma identifies which $k \in \mathbb{N}^{L+1}$ lead to a nonempty $\mathcal{G}_k$.

Lemma 14. The set $\mathcal{G}_k$ is not empty if and only if $k$ satisfies
$$\sum_{i=1}^{L+1} i \, k_i = n_w \qquad \text{and} \qquad \sum_{i=1}^{L+1} (i-1) \, k_i \le L. \tag{88}$$
Proof. Suppose $\mathcal{G}_k$ is not empty. Then there exists a graph $G \in \mathcal{G}$ that has exactly $k_i$ components of size $i$, for $1 \le i \le L+1$. A component of size $i$ contains $i$ vertices (by definition) and at least $i - 1$ edges (otherwise it would not be connected). Since $G \in \mathcal{G}$ it must have $n_w$ vertices and at most $L$ edges. Therefore (88) must hold. Conversely, suppose that $k \in \mathbb{N}^{L+1}$ satisfies (88). Then we can easily construct a graph $G$ on $V$ that has at most $L$ edges, and that has exactly $k_i$ components of size $i$, for $1 \le i \le L+1$. To do this we first partition the vertices into subsets so that there are $k_i$ subsets of size $i$, for $1 \le i \le L+1$. We then put $i - 1$ edges on each subset of size $i$ so that it forms a connected component. The resulting graph has $k_i$ components of size $i$, for $1 \le i \le L+1$, and $\sum_{i=1}^{L+1} (i-1) k_i$ edges.
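Lemma 14 can be verified exhaustively on a small instance: the vectors $k$ allowed by condition (88) should coincide with the component-size profiles actually realized by graphs with at most $L$ edges. A minimal sketch, with $n_w = 5$ and $L = 2$ as arbitrary test values:

```python
from itertools import combinations, product
from collections import Counter

def profile(n, edges):
    """Component-size profile of a graph: entry i-1 = number of components of size i.
    Components of a graph with at most L edges have size at most L + 1, so the
    profile is safely truncated to length L + 1."""
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for u, v in edges:
        parent[find(u)] = find(v)
    k = [0] * n
    for size in Counter(find(v) for v in range(n)).values():
        k[size - 1] += 1
    return tuple(k[:L + 1])

n_w, L = 5, 2

# Vectors allowed by condition (88): sum_i i*k_i = n_w and sum_i (i-1)*k_i <= L.
S = {k for k in product(range(n_w + 1), repeat=L + 1)
     if sum(i * ki for i, ki in enumerate(k, 1)) == n_w
     and sum((i - 1) * ki for i, ki in enumerate(k, 1)) <= L}

# Profiles actually realized by graphs on n_w vertices with at most L edges.
all_edges = list(combinations(range(n_w), 2))
realized = {profile(n_w, es)
            for r in range(L + 1)
            for es in combinations(all_edges, r)}

print(S == realized)  # True
```

On this instance both sets contain the four profiles $(5,0,0)$, $(3,1,0)$, $(1,2,0)$ and $(2,0,1)$, exactly as Lemma 14 predicts.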
The previous lemma allows us to partition $\mathcal{G}$ into non-empty subsets as follows:
$$\mathcal{G} = \bigcup_{k \in \mathcal{S}} \mathcal{G}_k, \qquad \mathcal{G}_k \ne \emptyset \text{ for all } k \in \mathcal{S}, \qquad \mathcal{G}_k \cap \mathcal{G}_{k'} = \emptyset \text{ if } k \ne k' \tag{89}$$
where
$$\mathcal{S} := \Big\{ k \in \mathbb{N}^{L+1} : \sum_{i=1}^{L+1} i \, k_i = n_w \;\text{ and }\; \sum_{i=1}^{L+1} (i-1) k_i \le L \Big\}. \tag{90}$$
Recall that $I(G)$ counts the number of equipartitions that do not sever edges of $G$. The next lemma shows that two graphs that belong to the same subset $\mathcal{G}_k$ have the same number of cut-free equipartitions, and it provides a formula for this number in terms of the index $k$ of the subset.

Lemma 15. Suppose $k \in \mathcal{S}$ and define the set of admissible assignment matrices
$$\mathcal{A}_k := \Big\{ A \in \mathbb{N}^{(L+1) \times n_c} : \sum_{j=1}^{n_c} A_{ij} = k_i \text{ for all } i \quad \text{and} \quad \sum_{i=1}^{L+1} i \, A_{ij} = s_c \text{ for all } j \Big\}. \tag{91}$$
Then for all $G \in \mathcal{G}_k$, we have that
$$I(G) = \sum_{A \in \mathcal{A}_k} \prod_{i=1}^{L+1} \binom{k_i}{A_{i,1}, A_{i,2}, \ldots, A_{i,n_c}}. \tag{92}$$
Let us remark that, since $0! = 1$, the multinomial coefficient $\binom{k_i}{A_{i,1}, A_{i,2}, \ldots, A_{i,n_c}}$ appearing in (92) is equal to 1 when $k_i$ is equal to 0.

Proof of Lemma 15. Let $k \in \mathcal{S}$ and fix once and for all a graph $G \in \mathcal{G}_k$. Define the set
$$\Psi = \{ \varphi \in \Phi : \varphi(v) = \varphi(v') \text{ for all edges } \{v, v'\} \text{ of the graph } G \}$$
so that $I(G) = |\Psi|$. Note that a map $\varphi$ that belongs to $\Psi$ must map all vertices that are in a same connected component to the same concept (otherwise some edges of $G$ would be severed). So a map $\varphi \in \Psi$ can be viewed as assigning connected components to concepts. Given a matrix $A \in \mathbb{N}^{(L+1) \times n_c}$ we define the set:
$$\Psi_A = \{ \varphi \in \Psi : \varphi \text{ assigns } A_{ij} \text{ components of size } i \text{ to concept } j, \text{ for all } 1 \le i \le L+1 \text{ and } 1 \le j \le n_c \}.$$
We then note that the set $\Psi_A$ is empty unless the matrix $A$ satisfies:
$$A_{i,1} + A_{i,2} + A_{i,3} + \ldots + A_{i,n_c} = k_i \qquad \text{for all } 1 \le i \le L+1,$$
$$A_{1,j} + 2 A_{2,j} + 3 A_{3,j} + \ldots + (L+1) A_{L+1,j} = s_c \qquad \text{for all } 1 \le j \le n_c.$$
The first constraint states that the total number of connected components of size $i$ is equal to $k_i$ (because $G \in \mathcal{G}_k$). The second constraint states that concept $j$ must receive a total of $s_c$ vertices (because $\varphi \in \Phi$). The matrices that satisfy these two constraints constitute the set $\mathcal{A}_k$ defined in (91).
We therefore have the following partition of the set $\Psi$:
$$\Psi = \bigcup_{A \in \mathcal{A}_k} \Psi_A, \qquad \Psi_A \ne \emptyset \text{ if } A \in \mathcal{A}_k, \qquad \Psi_A \cap \Psi_B = \emptyset \text{ if } A \ne B.$$
To conclude the proof, we need to show that
$$|\Psi_A| = \prod_{i=1}^{L+1} \binom{k_i}{A_{i,1}, A_{i,2}, \ldots, A_{i,n_c}} \qquad \text{for all } A \in \mathcal{A}_k. \tag{93}$$
To see this, consider the $k_i$ components of size $i$. The number of ways to assign them to the $n_c$ concepts so that concept $j$ receives $A_{ij}$ of them is equal to the multinomial coefficient $\binom{k_i}{A_{i,1}, A_{i,2}, \ldots, A_{i,n_c}}$. Repeating this reasoning for the components of each size gives (93).

We now leverage the previous lemma to obtain a formula for $K^\star$. For $k \in \mathcal{S}$ we define $\Omega_k := \zeta^{-1}(\mathcal{G}_k)$. Since $\zeta : \mathcal{X} \times \mathcal{X} \to \mathcal{G}$ is surjective, partition (89) of $\mathcal{G}$ induces the following partition of $\mathcal{X} \times \mathcal{X}$:
$$\mathcal{X} \times \mathcal{X} = \bigcup_{k \in \mathcal{S}} \Omega_k, \qquad \Omega_k \ne \emptyset \text{ if } k \in \mathcal{S}, \qquad \Omega_k \cap \Omega_{k'} = \emptyset \text{ if } k \ne k'. \tag{94}$$
Using the graph-cut formulation of the optimal kernel together with Lemma 15 we therefore have
$$K^\star(x, y) = \frac{1}{|\Phi| s_c^L} I(\zeta(x, y)) = \frac{1}{|\Phi| s_c^L} \sum_{A \in \mathcal{A}_k} \prod_{i=1}^{L+1} \binom{k_i}{A_{i,1}, A_{i,2}, \ldots, A_{i,n_c}} \qquad \text{for all } (x, y) \in \Omega_k. \tag{95}$$
The above formula is key to our analysis. We restate it in the lemma below, but in a slightly different format that will better suit the rest of the analysis. Let $f : \mathcal{S} \to \mathbb{R}$ be the function defined by
$$f(k) := \frac{n_c^L \, (s_c!)^{n_c}}{n_w!} \sum_{A \in \mathcal{A}_k} \prod_{i=1}^{L+1} \binom{k_i}{A_{i,1}, A_{i,2}, \ldots, A_{i,n_c}}.$$
We then have:

Lemma 16 (Level set decomposition of $K^\star$). The kernel $K^\star$ is constant on each subset $\Omega_k$ of the partition (94). Moreover we have $K^\star(x, y) = f(k)/|\mathcal{X}|$ for all $(x, y) \in \Omega_k$ and for all $k \in \mathcal{S}$.

Proof. The quantity $|\Phi|$ appearing in (95) can be interpreted as the number of ways to assign the $n_w$ words to the $n_c$ concepts so that each concept receives $s_c$ words. Therefore
$$|\Phi| = \binom{n_w}{s_c, s_c, \ldots, s_c} = \frac{n_w!}{(s_c!)^{n_c}}.$$
Combined with the fact that $|\mathcal{X}| = n_w^L$, this leads to the desired formula for $K^\star$.

The above lemma provides us with the level sets of the optimal kernel. Together with Lemma 3, this allows us to derive the following upper bound for the permuted moment of $K^\star$.

Lemma 17. Let $\Omega = \mathcal{X} \times \mathcal{X}$. The inequality
$$\frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} H_t(K^\star_x) \le 1 - \sum_{k \in \mathcal{S}'} \frac{|\Omega_k|}{|\Omega|} f(k) + \frac{1}{t+1} \max_{k \in \mathcal{S}'} f(k)$$
holds for all $\mathcal{S}' \subset \mathcal{S}$.

Proof. Let $\mathcal{S}' \subset \mathcal{S}$ and define:
$$\lambda = \max_{k \in \mathcal{S}'} \; \max_{(x,y) \in \Omega_k} K^\star(x, y) = \frac{1}{|\mathcal{X}|} \max_{k \in \mathcal{S}'} f(k)$$
where we have used the fact that $K^\star$ is equal to $f(k)/|\mathcal{X}|$ on $\Omega_k$. We then appeal to Lemma 3 to obtain:
$$\frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} H_t(K^\star(x, \cdot)) \le \frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} \Big[ \frac{\lambda |\mathcal{X}|}{t+1} + 1 - \sum_{y \in \mathcal{X}} \min\{K^\star(x, y), \lambda\} \Big] \tag{97}$$
$$= \frac{\lambda |\mathcal{X}|}{t+1} + 1 - \frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{X}} \min\{K^\star(x, y), \lambda\} \tag{98}$$
$$= \frac{\lambda |\mathcal{X}|}{t+1} + 1 - \frac{1}{|\mathcal{X}|} \sum_{k \in \mathcal{S}} \sum_{(x,y) \in \Omega_k} \min\{K^\star(x, y), \lambda\} \tag{99}$$
$$\le \frac{\lambda |\mathcal{X}|}{t+1} + 1 - \frac{1}{|\mathcal{X}|} \sum_{k \in \mathcal{S}'} \sum_{(x,y) \in \Omega_k} \min\{K^\star(x, y), \lambda\} \tag{100}$$
$$= \frac{\lambda |\mathcal{X}|}{t+1} + 1 - \frac{1}{|\mathcal{X}|} \sum_{k \in \mathcal{S}'} \sum_{(x,y) \in \Omega_k} K^\star(x, y) \tag{101}$$
$$= \frac{\lambda |\mathcal{X}|}{t+1} + 1 - \frac{1}{|\mathcal{X}|} \sum_{k \in \mathcal{S}'} |\Omega_k| \, \frac{f(k)}{|\mathcal{X}|} \tag{102}$$
where we have used the fact that $\mathcal{X} \times \mathcal{X} = \bigcup_{k \in \mathcal{S}} \Omega_k$ to go from (98) to (99), and the fact that $K^\star(x, y) \le \lambda$ on $\bigcup_{k \in \mathcal{S}'} \Omega_k$ to go from (100) to (101). To conclude the proof, we simply note that $\lambda |\mathcal{X}| = \max_{k \in \mathcal{S}'} f(k)$ according to our definition of $\lambda$.

The bound provided by the above lemma is not fully explicit because it involves the sizes of the level sets $\Omega_k$. In the next subsection, we appeal to Cayley's formula to obtain a lower bound for $|\Omega_k|$.
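The assignment-matrix formula of Lemma 15 lends itself to a direct numerical check: enumerate the admissible matrices $A \in \mathcal{A}_k$ and compare against a brute-force count of the balanced maps $\varphi$ that sever no edge. A small sketch (the parameters $n_w = 6$, $n_c = 2$, $s_c = 3$ and the helper names are ours, chosen for the check):

```python
from itertools import product
from math import factorial

def multinom(n, parts):
    """Multinomial coefficient n! / (parts[0]! * parts[1]! * ...)."""
    out = factorial(n)
    for p in parts:
        out //= factorial(p)
    return out

def I_brute(n_w, n_c, s_c, edges):
    """Count balanced maps phi (each concept receives s_c words) severing no edge."""
    count = 0
    for phi in product(range(n_c), repeat=n_w):
        if all(phi.count(c) == s_c for c in range(n_c)) and \
           all(phi[u] == phi[v] for u, v in edges):
            count += 1
    return count

def I_formula(k, n_c, s_c):
    """Lemma 15: sum over admissible A of prod_i multinom(k_i; A_{i,1..n_c}).
    k[i-1] = number of components of size i."""
    rows = [[t for t in product(range(ki + 1), repeat=n_c) if sum(t) == ki]
            for ki in k]                      # row i sums to k_i
    total = 0
    for A in product(*rows):
        # column constraint: concept j must receive exactly s_c vertices
        if all(sum((i + 1) * A[i][j] for i in range(len(k))) == s_c
               for j in range(n_c)):
            term = 1
            for ki, row in zip(k, A):
                term *= multinom(ki, row)
            total += term
    return total

# Graph with the single edge {0, 1} on 6 vertices: four size-1 components and
# one size-2 component, i.e. k = [4, 1].
print(I_brute(6, 2, 3, [(0, 1)]), I_formula([4, 1], 2, 3))  # 8 8
print(I_brute(6, 2, 3, []), I_formula([6, 0], 2, 3))        # 20 20
```

The edgeless case recovers $I(G) = |\Phi| = \binom{6}{3,3} = 20$, as noted in Subsection C.1.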

C.3 FOREST LOWER BOUND FOR THE SIZE OF THE LEVEL SETS

We recall that a forest is a graph whose connected components are trees (equivalently, a forest is a graph with no cycles). Let us define:
$$\mathcal{F} := \{ G \in \mathcal{G} : G \text{ is a forest} \}.$$
In other words, $\mathcal{F}$ is the set of forests on $V = \{1, 2, \ldots, n_w\}$ that have at most $L$ edges. We obviously have the following lower bound on the size of the level sets:
$$|\Omega_k| = |\zeta^{-1}(\mathcal{G}_k)| \ge |\zeta^{-1}(\mathcal{G}_k \cap \mathcal{F})|. \tag{103}$$
In this subsection, we use Cayley's formula to derive an explicit formula for $|\zeta^{-1}(\mathcal{G}_k \cap \mathcal{F})|$. We start with the following lemma:

Lemma 18. Let $k \in \mathcal{S}$. Then
$$|\mathcal{G}_k \cap \mathcal{F}| = \frac{n_w!}{k_1! k_2! \cdots k_{L+1}!} \prod_{i=2}^{L+1} \left( \frac{i^{i-2}}{i!} \right)^{k_i}. \tag{104}$$
Proof. First we note that (104) can be written as
$$|\mathcal{G}_k \cap \mathcal{F}| = \binom{n_w}{k_1, 2k_2, \ldots, (L+1)k_{L+1}} \prod_{i=2}^{L+1} i^{k_i(i-2)} \, \frac{1}{k_i!} \binom{i k_i}{i, i, \ldots, i}.$$
We now explain the above formula. The set $\mathcal{G}_k \cap \mathcal{F}$ consists of all the forests that have exactly $k_1$ trees of size 1, $k_2$ trees of size 2, \ldots, $k_{L+1}$ trees of size $L+1$. In order to construct a forest with this specific structure, we start by assigning the $n_w$ vertices to $L+1$ bins, with bin 1 receiving $k_1$ vertices, bin 2 receiving $2k_2$ vertices, \ldots, bin $L+1$ receiving $(L+1)k_{L+1}$ vertices. The number of ways of accomplishing this is
$$\binom{n_w}{k_1, 2k_2, \ldots, (L+1)k_{L+1}}.$$
Let us now consider the vertices in bin $i$ for some $i \ge 2$. We claim that there are
$$\frac{1}{k_i!} \binom{i k_i}{i, i, \ldots, i} \, i^{k_i(i-2)}$$
ways of putting edges on these $i k_i$ vertices in order to make $k_i$ trees of size $i$. Indeed, there are $\frac{1}{k_i!} \binom{i k_i}{i, i, \ldots, i}$ ways of partitioning the vertices into $k_i$ disjoint subsets of size $i$, and then, according to Cayley's formula, there are $i^{i-2}$ ways of putting edges on each of these subsets so that it forms a tree. To conclude, we remark that there is obviously only one way to make $k_1$ trees of size 1 out of the $k_1$ vertices in the first bin.

Recall that a tree on $n$ vertices always has $n - 1$ edges. So a graph that belongs to $\mathcal{G}_k \cap \mathcal{F}$ has $m = \sum_{i=1}^{L+1} (i-1) k_i$ edges, since it is made of $k_1$ trees of size 1, $k_2$ trees of size 2, \ldots, $k_{L+1}$ trees of size $L+1$. The fact that all graphs in $\mathcal{G}_k \cap \mathcal{F}$ have the same number of edges allows us to obtain an explicit formula for $|\zeta^{-1}(\mathcal{G}_k \cap \mathcal{F})|$ by combining Lemmas 13 and 18, namely
$$|\zeta^{-1}(\mathcal{G}_k \cap \mathcal{F})| = \frac{n_w!}{k_1! k_2! \cdots k_{L+1}!} \prod_{i=2}^{L+1} \left( \frac{i^{i-2}}{i!} \right)^{k_i} \; m! \sum_{\alpha=m}^{L} \binom{L}{\alpha} {\alpha \brace m} 2^\alpha \, n_w^{L-\alpha}.$$
This leads us to define the function $g : \mathcal{S} \to \mathbb{R}$ by
$$g(k) = \frac{1}{n_w^{2L}} \, \frac{n_w!}{k_1! k_2! \cdots k_{L+1}!} \prod_{i=2}^{L+1} \left( \frac{i^{i-2}}{i!} \right)^{k_i} \left[ \gamma(k)! \sum_{\alpha=\gamma(k)}^{L} \binom{L}{\alpha} {\alpha \brace \gamma(k)} 2^\alpha \, n_w^{L-\alpha} \right] \qquad \text{where } \gamma(k) = \sum_{i=1}^{L+1} (i-1) k_i.$$
Recalling (103), and since $|\Omega| = |\mathcal{X} \times \mathcal{X}| = n_w^{2L}$, we therefore have that
$$\frac{|\Omega_k|}{|\Omega|} \ge g(k) \qquad \text{for all } k \in \mathcal{S}.$$
Combining the above inequality with Lemma 17 we obtain:

Theorem 6. The inequality
$$\frac{1}{|\mathcal{X}|} \sum_{x \in \mathcal{X}} H_t(K^\star_x) \le 1 - \sum_{k \in \mathcal{S}'} g(k) f(k) + \frac{1}{t+1} \max_{k \in \mathcal{S}'} f(k)$$
holds for all $\mathcal{S}' \subset \mathcal{S}$.

The above theorem is more general than Theorem 5: indeed, in Theorem 5 the choice of the subset $\mathcal{S}'$ is restricted to the $L + 1$ candidates
$$\mathcal{S}_\ell := \Big\{ k \in \mathbb{N}^{L+1} : \sum_{i=1}^{L+1} i \, k_i = n_w \;\text{ and }\; \ell \le \sum_{i=1}^{L+1} (i-1) k_i \le L \Big\} \qquad \text{where } \ell = 0, 1, \ldots, L.$$
When $L = 9$, $n_w = 150$, $n_c = 5$ and $t = 1999$ (these are the parameters used in Theorem 1), choosing $\mathcal{S}' = \mathcal{S}_7$ leads to a relatively tight upper bound. When $L = 15$, $n_w = 30$, $n_c = 5$ and $t = 5999$ (these are the parameters corresponding to the second experiment of the experimental section), choosing $\mathcal{S}' = \mathcal{S}_{11}$ gives a good upper bound.
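Lemma 18 can likewise be verified by enumerating all small forests. The sketch below compares the closed-form count with brute-force enumeration on $n_w = 4$ vertices (test values chosen for tractability; helper names are ours):

```python
from itertools import combinations
from collections import Counter
from math import factorial

def comp_sizes(n, edges):
    """Sorted list of component sizes of the graph (n vertices, given edge set)."""
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for u, v in edges:
        parent[find(u)] = find(v)
    return sorted(Counter(find(v) for v in range(n)).values())

def count_forests(n, sizes):
    """Brute force: forests on n labeled vertices with the given sorted component sizes."""
    total = 0
    all_edges = list(combinations(range(n), 2))
    for r in range(n):
        for es in combinations(all_edges, r):
            cs = comp_sizes(n, es)
            # a graph is a forest iff #edges = #vertices - #components
            if len(es) == n - len(cs) and cs == sizes:
                total += 1
    return total

def forest_formula(k):
    """Lemma 18: k[i-1] = number of trees of size i."""
    n = sum(i * ki for i, ki in enumerate(k, 1))
    out = factorial(n)
    for ki in k:
        out //= factorial(ki)
    for i, ki in enumerate(k, 1):
        if i >= 2 and ki > 0:
            out = out * (i ** (i - 2)) ** ki // factorial(i) ** ki
    return out

# Two trees of size 2 on 4 vertices (the perfect matchings): 3 of them.
print(count_forests(4, [2, 2]), forest_formula([0, 2]))      # 3 3
# One isolated vertex plus a tree of size 3: 4 * 3 = 12.
print(count_forests(4, [1, 3]), forest_formula([1, 0, 1]))   # 12 12
```

The second case also illustrates Cayley's formula directly: there are $3^{3-2} = 3$ labeled trees on each choice of 3 vertices.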

D MULTIPLE UNFAMILIAR SENTENCES PER CATEGORY

In the data model depicted in Figure 1, each unfamiliar sequence of concepts has a single representative in the training set. In this section we consider the more general case in which each unfamiliar sequence of concepts has $n_*$ representatives in the training set, where $1 \le n_* \le n_{spl}$. An example with $n_* = 2$ is depicted in Figure 6. The variables $L$, $n_w$, $n_c$, $R$, $n_{spl}$ and $n_*$ parametrize instances of this more general data model, and the associated sampling process is:

Sampling Process DM2:
(i) Sample $T = (\varphi;\; \bar{c}_1, \ldots, \bar{c}_R;\; c_1, \ldots, c_R)$ uniformly at random in $\mathcal{T} = \Phi \times \mathcal{Z}^{2R}$.
(ii) For $r = 1, \ldots, R$:
• Sample $(x_{r,1}, \ldots, x_{r,n_*})$ uniformly at random in $\varphi^{-1}(\bar{c}_r) \times \ldots \times \varphi^{-1}(\bar{c}_r)$.
• Sample $(x_{r,n_*+1}, \ldots, x_{r,n_{spl}})$ uniformly at random in $\varphi^{-1}(c_r) \times \ldots \times \varphi^{-1}(c_r)$.
(iii) Sample $x_{test}$ uniformly at random in $\varphi^{-1}(\bar{c}_1)$.

Our analysis easily adapts to this more general case and gives:

Theorem 7. Let $\mathcal{T} = \Phi \times \mathcal{Z}^{2R}$. Then
$$1 - \mathrm{err}(\mathcal{F}, \psi, \mathcal{T}) \le n_* \left( 1 - \sum_{k \in \mathcal{S}_\ell} f(k) g(k) + \frac{1}{2R} \max_{k \in \mathcal{S}_\ell} f(k) \right) + \frac{1}{R}$$
for all feature spaces $\mathcal{F}$, all feature maps $\psi : \mathcal{X} \to \mathcal{F}$, and all $0 \le \ell \le L$.
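For concreteness, Sampling Process DM2 can be sketched as follows. This is an illustrative implementation under stated assumptions, not the paper's code: words and concepts are 0-indexed, $\mathcal{Z}$ is taken to be the set of all length-$L$ concept sequences, and all helper names are ours.

```python
import random

def sample_dm2(n_w, n_c, L, R, n_spl, n_star, rng=None):
    """Sketch of Sampling Process DM2. A sentence is a length-L tuple of words."""
    rng = rng or random.Random(0)
    s_c = n_w // n_c
    # (i) sample phi uniformly among balanced maps (each concept gets s_c words),
    #     together with R unfamiliar and R familiar sequences of concepts
    words = list(range(n_w))
    rng.shuffle(words)
    phi = {w: i // s_c for i, w in enumerate(words)}
    words_of = [[w for w in range(n_w) if phi[w] == c] for c in range(n_c)]
    def sentence(seq):
        # one word drawn uniformly from phi^{-1}(c) for each concept c of the sequence
        return tuple(rng.choice(words_of[c]) for c in seq)
    unfam = [tuple(rng.randrange(n_c) for _ in range(L)) for _ in range(R)]
    fam   = [tuple(rng.randrange(n_c) for _ in range(L)) for _ in range(R)]
    # (ii) n_star representatives of the unfamiliar sequence, then
    #      n_spl - n_star representatives of the familiar one, per category
    train = [[sentence(unfam[r]) for _ in range(n_star)] +
             [sentence(fam[r]) for _ in range(n_spl - n_star)] for r in range(R)]
    # (iii) the test point comes from the unfamiliar sequence of category 1
    x_test = sentence(unfam[0])
    return phi, train, x_test

phi, train, x_test = sample_dm2(n_w=12, n_c=3, L=5, R=4, n_spl=6, n_star=2)
```

With these (arbitrary) parameters, `train` holds $R = 4$ categories of $n_{spl} = 6$ sentences each, the first $n_* = 2$ of which share their concept sequence with `x_test` in category 1.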
h d z n j K J O T U 5 b v / v 9 j u P B j I f J k q L k q + f p O H A 7 i e P 5 Z J V O k q G d E j J 4 T w a E j A c O w g o T L 0 Y E m b 7 Q + W 0 i E n K h p / k s 0 P R s R h q n a z G L J W Y 1 X e I 4 + q m g h g c o l v V S I X k A / o 7 E 5 p w t I K x U L 5 S Q P a W N s J c F b H x Q b V c f Q E G F V 6 j J R 9 k i I M Z s s z F f q a h t h L + q w / P G y p I l + k D L L k O l F L i V M 6 h i N y Z k 2 D r 1 d Z 1 a O 5 v y Y E s V e c M 7 e 5 G i 2 I q o H m 8 T 7 M 1 / + F u U r 9 + r w t / C X l j k z I Y m c 2 8 d C K f l D e 7 o S z 3 p 9 q x z a z P I Y W D n Q c / I x + W k + 9 e Z C h Y v I U Q W U K X G t h W h m 1 C J n A W Q d p x Y Q U T Z g s 5 g r M O Q L k G 5 y e Y X T M m Z z k y J L 6 R + Q y S b b P W L h C 6 V W m d W n W m R c 1 V f y 5 J N a + M Y / X d u w s M o R g j Z F u T H A U F B s v + Z T L k E h s F a B 5 R J r r U S N q e S M m 3 l P i W r j U I E K t X W 2 H U j D o O r 4 b l t n d u f h 7 2 L f m 7 S s f H E e G o 8 M 2 z j r X F h f D I u j Z H B 2 h / a 0 A 7 b w v x q f j d / m D + 3 W 1 t H + T e P j b 1 h / v o H 6 f e F v Q = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " z 9 F L V z c Z n P N g 0 t I 0 P K t q 0 I z 9 P c 4 < l a t e x i t s h a 1 _ b a s e 6 4 = " C W y e r V o 4 L O Q T z m j q K X x c e N X t 9 t y P J j x M A k o S r 5 8 n o 5 8 t 5 U 4 3 p Q s 0 3 F y Z q e E 9 N 6 S H i G j n o O w x I T N O V t A m J 7 q x F Y B U K C F 2 p x s h J W Q s 1 h i Q W F U S o H Z 3 i 5 x H P 0 U k I M a 0 t c d F X h e j A j y H z y W e V C Y R w I p i v 2 0 s y 1 t B w N Y F G C R k I s C K t 9 5 O 7 a E v M J e L p f q 3 I t 9 U c X m j p z W 9 q x 1 W S 1 t X U Y B u x J r s z N s l f q y 3 i x i z E r e 7 Y 7 3 R q n a e / M H F I 7 4 r 9 B X / 3 G 4 Q K y c 7 8 7 l s i F 5 u 6 X q C i Y L O Q T z m j q K X x c e N X t 9 t y P J j x M A k o S r 5 8 n o 5 8 t 5 U 4 3 p Q s 0 3 F y Z q e E 9 N 6 S H i G j n o O w x I T N O V t A m J 7 q x F Y B U K C F 2 p x s h J W Q s 1 h i Q W F U S o H Z 
3 i 5 x H P 0 U k I M a 0 t c d F X h e j A j y H z y W e V C Y R w I p i v 2 0 s y 1 t B w N Y F G C R k I s C K t 9 5 O 7 a E v M J e L p f q 3 I t 9 U c X m j p z W 9 q x 1 W S 1 t X U Y B u x J r s z N s l f q y 3 i x i z E r e 7 Y 7 3 R q n a e / M H F I 7 4 r 9 B X / 3 G 4 Q K y c 7 8 7 l s i F 5 u 6 X q C i Y L O Q T z m j q K X x c e N X t 9 t y P J j x M A k o S r 5 8 n o 5 8 t 5 U 4 3 p Q s 0 3 F y Z q e E 9 N 6 S H i G j n o O w x I T N O V t A m J 7 q x F Y B U K C F 2 p x s h J W Q s 1 h i Q W F U S o H Z 3 i 5 x H P 0 U k I M a 0 t c d F X h e j A j y H z y W e V C Y R w I p i v 2 0 s y 1 t B w N Y F G C R k I s C K t 9 5 O 7 a E v M J e L p f q 3 I t 9 U c X m j p z W 9 q x 1 W S 1 t X U Y B u x J r s z N s l f q y 3 i x i z E r e 7 Y 7 3 R q n a e / M H F I 7 4 r 9 B X / 3 G 4 Q K y c 7 8 7 l s i F 5 u 6 X q C i Y x test = [ yogurt, butter, carrot, chicken, carrot ] = " > A A A E 2 H i c j Z N N j 9 M w E I a z b Y C l f G w X j l w s 2 q 0 Q Q q u k f F 4 Q K 3 H h u E h 0 d 0 U T V Y 4 7 a a 2 m c W R P U K s o E j f E l V 8 H / 4 C f g d M m a Z r m g J V I 4 7 E z z z u v H S 8 K u E L L + n P U a p u 3 b t 8 5 v t u 5 d / / B w 5 P u 6 a M r J W L J Y M R E I O S N R x U E P I Q R c g z g J p J A l 1 4 A 1 9 7 i Y 7 Z + / Q 2 k 4 i L 8 g u s I 3 C W d h d z n j K J O T U 5 b v / v 9 j u P B j I f J k q L k q + f p O H A 7 i e P 5 Z J V O k q G d E j J 4 T w a E j A c O w g o T L 0 Y E m b 7 Q + W 0 i E n K h p / k s 0 P R s R h q n a z G L J W Y 1 X e I 4 + q m g h g c o l v V S I X k A / o 7 E 5 p w t I K x U L 5 S Q P a W N s J c F b H x Q b V c f Q E G F V 6 j J R 9 k i I M Z s s z F f q a h t h L + q w / P G y p I l + k D L L k O l F L i V M 6 h i N y Z k 2 D r 1 d Z 1 a O 5 v y Y E s V e c M 7 e 5 G i 2 I q o H m 8 T 7 M 1 / + F u U r 9 + r w t / C X l j k z I Y m c 2 8 d C K f l D e 7 o S z 3 p 9 q x z a z P I Y W D n Q c / I x + W k + 9 e Z C h Y v I U Q W U K X G t h W h m 1 C J n A W Q d p x Y Q U T Z g s 5 g r M O Q L k G 5 y e Y X T 
M m Z z k y J L 6 R + Q y S b b P W L h C 6 V W m d W n W m R c 1 V f y 5 J N a + M Y / X d u w s M o R g j Z F u T H A U F B s v + Z T L k E h s F a B 5 R J r r U S N q e S M m 3 l P i W r j U I E K t X W 2 H U j D o O r 4 b l t n d u f h 7 2 L f m 7 S s f H E e G o 8 M 2 z j r X F h f D I u j Z H B 2 h / a 0 A 7 b w v x q f j d / m D + 3 W 1 t H + T e P j b 1 h / v o H 6 f e F v Q = = < / l a t J k d Z K W 7 a x s c B g O Q e L C A = " > A A A E 4 3 i c h Z P L j t M w F I Y z b Y C h X K Y D S z Y W 7 V Q I j a q k w 2 2 D N B I b x G q Q 6 M x I T V Q 5 7 m l r N Y k j + w S 1 R H k C d o g t D 8 a a D Y + B 0 6 a d X A p Y i X T 8 n 9 j f O b 9 j L / K 5 Q s v 6 e d B o m r d u 3 z m 8 2 7 p 3 / 8 H D o / b x o 0 s l Y s l g y I Q v 5 L V H F f g 8 h C F y 9 O E 6 k k A D z 4 c r b / E u y 1 9 9 B q m 4 C D / h K g I 3 o L O Q T z m j q K X x c e N X t 9 t y P J j x M A k o S r 5 8 n o 5 8 t 5 U 4 3 p Q s 0 3 F y Z q e E 9 N 6 S H i G j n o O w x I T N O V t A m J 7 q x F Y B U K C F 2 p x s h J W Q s 1 h i Q W F U S o H Z 3 i 5 x H P 0 U k I M a 0 t c d F X h e j A j y H z y W e V C Y R w I p i v 2 0 s y 1 t B w N Y F G C R k I s C K t 9 5 O 7 a E v M J e L p f q 3 I t 9 U c X m j p z W 9 q x 1 W S 1 t X U Y B u x J r s z N s l f q y 3 i x i z E r e 7 Y 7 3 R q n a e / M H F I 7 4 r 9 B X / 3 G 4 Q K y c 7 8 7 l s i F 5 u 6 X q C i Y 7 E E 5 2 f 3 O r 2 x 2 3 O 1 b f W g 9 S D + w 8 6 B j 5 u B i 3 f z s T w e I A Q m Q + V W p k W x G 6 C Z X I m Q 9 p y 4 k V R J Q t 6 A x G O g x p A M p N 1 r c x J S d a m Z C p k P o N k a z V 4 o q E B k q t d C P k R N c 4 V 9 V c J u 7 L j W K c v n E T H k Y x Q s g 2 o G n s E x Q k u 9 p k w i U w 9 F c 6 o E x y X S t < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + 
d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z 2 M 2 m L 7 J E 3 5 D 0 J y S E 5 J W f k n I w J J 3 8 8 6 n 3 w 9 v 3 X / r H / 2 f / a l v p e 1 7 N L 7 o T / 7 S / d u s o L < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " n g 4 Q E m 9 z 7 l X R C J U v u 2 N b h p g D z p s = " > A A A C k 3 i c b Z H d a t s w F M d l t 9 u 6 7 C t t 2 V V v x L K F r Y x g j 0 I L p V C 2 X f R m 0 M L S F m w v y M p x I i J L R j o u C c Z v s Z f b G + w Z d j U 5 N q M f O y D 4 6 3 c + d P g r L a S w G A S / P H 9 j 8 9 H j J 1 t P e 8 + e v 3 j 5 q r + 9 c 2 l 1 a T i M u Z 
b a X K f M g h Q K x i h Q w n V h g O W p h K t 0 8 a X J X 9 2 A s U K r 7 7 g q I M n Z Theorem 3 in the main body of the paper can be viewed as a special case of the above theoremindeed, setting n * = 1 in inequality (105) we exactly recover (9). In order to prove Theorem 7, we simply need to revisit subsection B.3. We denote by Ω DM2 and P DM2 the sample space and probability measure associated with the sampling process DM2. As in subsection B.3, given a kernel K ∈ K, we define the event E K = ω ∈ Ω DM2 : There exists 1 ≤ s * ≤ n spl such that K x test , x 1,s * > K x test , x r,s for all 2 ≤ r ≤ R and all 1 ≤ s ≤ n spl . Such event describes all outcomes corresponding to successful classification of the test point x test . For simplicity let us assume that n * = 2 (therefore matching the scenario depicted in Figure 6 ). We further partition the event E K according to which training point from the first category is most similar to the test point: E K = E (1) meaningful ∪ E (2) meaningful ∪ E luck (106) The event E (1) meaningful consists in all the outcomes where, among the points from first category, x 1,1 is the most similar to x test , E (2) meaningful consists in all the outcomes where, among the points from first category, x 1,2 is the most similar to x test , and E luck consists in all the remaining cases. Formally we have: E (1) meaningful = E K ∩ ω ∈ Ω DM2 : K x test , x 1,1 > K x test , x 1,2 ∩ ω ∈ Ω DM2 : K x test , x 1,1 > K x test , x 1,s for all 3 ≤ s ≤ n spl

E^(2)_meaningful = E_K ∩ {ω ∈ Ω_DM2 : K(x_test, x_{1,2}) ≥ K(x_test, x_{1,1})} ∩ {ω ∈ Ω_DM2 : K(x_test, x_{1,2}) > K(x_test, x_{1,s}) for all 3 ≤ s ≤ n_spl}

E_luck = E_K ∩ {ω ∈ Ω_DM2 : there exists 3 ≤ s* ≤ n_spl such that K(x_test, x_{1,s*}) ≥ K(x_test, x_{1,s}) for all 1 ≤ s ≤ n_spl}

Exactly as in subsection B.3, we then prove that

P_DM2[E^(i)_meaningful] ≤ (1/|X|) Σ_{x∈X} H_{2R−1}(K_x)  for i = 1, 2    (107)

P_DM2[E_luck] ≤ 1/R.    (108)

The proof of inequality (107) is essentially identical to the proof of Lemma 10. We define the event

A^(1) := {ω ∈ Ω_DM2 : K(x_test, x_{1,1}) > K(x_test, x_{r,1}) for all 2 ≤ r ≤ R} ∩ {ω ∈ Ω_DM2 : K(x_test, x_{1,1}) > K(x_test, x_{r,3}) for all 1 ≤ r ≤ R}    (109)

and the event

A^(2) := {ω ∈ Ω_DM2 : K(x_test, x_{1,2}) > K(x_test, x_{r,2}) for all 2 ≤ r ≤ R} ∩ {ω ∈ Ω_DM2 : K(x_test, x_{1,2}) > K(x_test, x_{r,3}) for all 1 ≤ r ≤ R}.

The x's involved in the definition of the event A^(1) are highlighted in yellow in Figure 6. Crucially, they are all generated by different sequences of concepts, except for x_{1,1} and x_test. We can therefore appeal to Theorem 4 to obtain

P_DM2[A^(1)] ≤ (1/|X|) Σ_{x∈X} H_{2R−1}(K_x)

since there is a total of t = 2R − 1 'distractors' (the 'distractors' in Figure 6 are x_{1,3}, x_{2,1}, x_{2,3}, x_{3,1} and x_{3,3}). We then use the fact that E^(1)_meaningful ⊂ A^(1) to obtain (107) with i = 1. The case i = 2 is handled identically. We now prove (108). The proof is similar to the proof of Lemma 11. For 1 ≤ r ≤ R, we define the events B_r. By symmetry, these events are equiprobable. They are also mutually disjoint, and therefore P_DM2[B_r] ≤ 1/R. Inequality (108) then follows from the fact that E_luck ⊂ B_1. Combining (106), (107) and (108) then gives

sup_{K∈K} P_DM2[E_K] ≤ (2/|X|) Σ_{x∈X} H_{2R−1}(K_x) + 1/R    (110)

and, going back to the general case where n* denotes the number of representatives that each sequence of unfamiliar concepts has in the training set, we obtain (111), which in turn implies (16). Combining inequalities (15) and (16) then concludes the proof of Theorem 7.
sup_{K∈K} P_DM2[E_K] ≤ (n*/|X|) Σ_{x∈X} H_{2R−1}(K_x) + 1/R    (111)
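The symmetry argument behind the bound P_DM2[E_luck] ≤ 1/R can be sanity-checked numerically: when the similarity scores carry no information about categories, the most similar training point lands in the correct category with probability exactly 1/R. A minimal Monte Carlo sketch, under the simplifying assumption that all similarity scores are i.i.d. (this plain i.i.d. setup is our illustration, not the paper's construction):

```python
import numpy as np

def luck_rate(R, n_spl, trials=200_000, seed=0):
    """Fraction of trials in which the most similar training point belongs
    to category 1, when all R * n_spl similarity scores are i.i.d.
    (i.e. the kernel carries no information about categories)."""
    rng = np.random.default_rng(seed)
    scores = rng.random((trials, R, n_spl))   # stand-ins for K(x_test, x_{r,s})
    best = scores.reshape(trials, -1).argmax(axis=1)
    return np.mean(best // n_spl == 0)        # winner in category 1?

R = 10
rate = luck_rate(R, n_spl=3)
assert abs(rate - 1 / R) < 0.01               # consistent with the 1/R bound
```

By symmetry each of the R categories is equally likely to contain the winner, so the empirical rate concentrates at 1/R, matching the role of the B_r events in the proof.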

E DETAILS OF THE EXPERIMENTS

In this section we provide the details of the experiments described in Section 6, as well as additional experiments. Table 3 provides the results of experiments in which the parameters L, n_w, n_c and R are set to L = 9, n_w = 150, n_c = 5, R = 1000.

Table 3: Accuracy in % on unfamiliar test points (L = 9, n_w = 150, n_c = 5, R = 1000).

                                        n* = 1       n* = 2       n* = 3       n* = 4       n* = 5
Neural network                          99.8 ± 0.3   99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   100 ± 0.1
NN on feat. extracted by neural net     99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1
Neural network                          99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   100 ± 0.1
NN on feat. extracted by neural net     99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1   99.9 ± 0.1
NN on feat. extracted by ψ              2.4 ± 0.3    4.1 ± 0.6    5.5 ± 0.6    6.9 ± 0.8    8.0 ± 0.8
NN on feat. extracted by ψ_one-hot      2.0 ± 0.3    3.4 ± 0.5    4.8 ± 0.6    5.7 ± 0.5    6.7 ± 0.7
Upper bound (0.073 n* + 1/1000)         7.4          14.7         22.0         29.3         36.6
SVM on feat. extracted by ψ             2.2 ± 0.5    5.2 ± 0.9    8.6 ± 0.9    11.7 ± 0.6   15.1 ± 1.2
SVM on feat. extracted by ψ_one-hot     1.2 ± 0.1    3.5 ± 0.2    6.4 ± 0.2    9.9 ± 0.3    13.6 ± 0.4
SVM with Gaussian kernel                2.0 ± 0.1    3.7 ± 0.2    5.4 ± 0.2    8.6 ± 0.3    12.1 ± 0.3

The parameters n_spl and n* are chosen so that the training set contains 5 familiar sentences per category, and between 1 and 5 unfamiliar sentences per category. Table 3 is identical to Table 1 in Section 6, except that it contains additional information (namely the standard deviations of the obtained accuracies). The abbreviation NN appearing in Table 3 stands for 'Nearest Neighbor'. Table 4 provides the results of additional experiments in which the parameters L, n_w, n_c and R are set to L = 9, n_w = 50, n_c = 5, R = 1000. The parameters n_spl and n* are chosen, as in the previous set of experiments, so that the training set contains 5 familiar sentences per category, and between 1 and 5 unfamiliar sentences per category.
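The 'Upper bound' row above can be recomputed directly from the formula stated in its label, 0.073 n* + 1/R with R = 1000, expressed as a percentage:

```python
# Reproduce the 'Upper bound' row: success rate <= 0.073 * n_star + 1/R, R = 1000.
R = 1000
row = [round(100 * (0.073 * n_star + 1 / R), 1) for n_star in range(1, 6)]
print(row)  # [7.4, 14.7, 22.0, 29.3, 36.6]
```

The printed values match the tabulated row exactly.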
The tasks considered in this second set of experiments are easier because the vocabulary is smaller (n_w = 50 instead of n_w = 150).

E.1 NEURAL NETWORK EXPERIMENTS

We consider the neural network in Figure 2. The number of neurons in the input, hidden and output layers of the MLPs constituting the neural network are set to: For each of the 10 possible parameter settings in Table 3 and Table 4, we run 10^4 experiments. For each experiment we generate:
• A training set containing R × n_spl sentences.
• A test set containing 10,000 unfamiliar sentences (10 sentences per category).
We then train the neural network with stochastic gradient descent until the training loss reaches 10^-4 (we use a cross-entropy loss). The learning rate is set to 0.01 (constant learning rate), and the batch size to 100. At test time, we either use the neural network to classify the test points (first row of the tables) or we use a nearest neighbor classification rule on top of the features extracted by the neural network (second row of the tables). The mean and standard deviation of the 10^4 test accuracies, for each of the 10 settings and for each of the two evaluation strategies, are reported in the first two rows of Table 3 and Table 4.
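For reference, the generation of such a synthetic training set can be sketched as follows. This is a minimal illustration, not the paper's exact code: the partition of the vocabulary into equal-size concepts via `w % n_c`, the uniform choice of n_spl = 10, and the uniform word sampling are our assumptions.

```python
import numpy as np

# Hypothetical parameters in the spirit of Table 4 (L = 9, n_w = 50, n_c = 5).
L, n_w, n_c, R, n_spl = 9, 50, 5, 1000, 10

rng = np.random.default_rng(0)
# Assumption: words 0..n_w-1 are partitioned into n_c equal-size concepts,
# with word w expressing concept w % n_c.
words_of = [np.arange(n_w)[np.arange(n_w) % n_c == c] for c in range(n_c)]

# A category is a sequence of L concepts; a sentence realizes each concept
# by an independently, uniformly drawn word from that concept.
categories = rng.integers(0, n_c, size=(R, L))

def sample_sentence(cat):
    return np.array([rng.choice(words_of[c]) for c in cat])

train_x = np.array([[sample_sentence(categories[r]) for _ in range(n_spl)]
                    for r in range(R)])        # shape (R, n_spl, L)
train_y = np.repeat(np.arange(R), n_spl)       # category label of each sentence
```

Every word of every sentence then expresses the concept prescribed by its category, which is the latent structure the network must recover.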

E.2 NEAREST-NEIGHBOR EXPERIMENTS

In these experiments we use a nearest neighbor classification rule on top of features extracted by ψ (third row of Tables 3 and 4) or ψ_one-hot (fourth row). For each of the 10 possible parameter settings in Table 3 and Table 4, we do 50 experiments. For each experiment we generate:
• A training set containing R × n_spl sentences.
• A test set containing 1,000 unfamiliar sentences (one sentence per category).
In order to perform the nearest neighbor classification rule on the features extracted by ψ, one needs to evaluate the kernel K(x, y) = ⟨ψ(x), ψ(y)⟩_F for each pair of sentences. Computing K(x, y) requires an expensive combinatorial calculation, which is why we perform fewer experiments and use a smaller test set than in E.1. In order to break ties, the values of K(x, y) are perturbed according to (31). With the parameter setting L = 9, n_w = 50, n_c = 5 and R = 1000, our theoretical lower bound on the generalization error is

err(F, ψ, T) ≥ 1 − 0.073 n* − 1/R  for all F and all ψ,

which is obtained by choosing the parameter equal to 6 in inequality (105). This leads to an upper bound of 0.073 n* + 1/R on the success rate. This upper bound is evaluated for n* ranging from 1 to 5 in the fifth row of Table 4.

E.3 SVM ON FEATURES EXTRACTED BY ψ_one-hot AND SVM WITH GAUSSIAN KERNEL

For each of the 10 possible parameter settings in Table 3 and Table 4, we do 100 experiments. For each experiment we generate:
• A training set containing R × n_spl sentences.
• A test set containing 10,000 unfamiliar sentences (10 sentences per category).
We use the feature map ψ_one-hot (which simply concatenates the one-hot encodings of the words composing a sentence) to extract features from each sentence. These features are further normalized according to the formula

x = (ψ_one-hot(x) − p) / √(p(1 − p))  where p = 1/n_w    (113)

so that they are centered around 0 and are O(1). We then use the SVC function of Scikit-learn (Pedregosa et al., 2011), which itself relies on the LIBSVM library (Chang & Lin, 2011), in order to run a soft multiclass SVM algorithm on these features. We tried various values for the parameter controlling the ℓ2 regularization in the soft-SVM formulation, and found that the algorithm, on this task, is not sensitive to this choice, so we chose C = 1. The results are reported in the seventh row of both tables. We also tried a soft SVM with Gaussian kernel K(x, y) = e^{−γ‖x−y‖²} applied on top of features extracted by ψ_one-hot and normalized according to (113). We use the SVC function of Scikit-learn with ℓ2 regularization parameter set to C = 1. For the experiments in Table 3 (n_w = 150), the parameter γ involved in the definition of the kernel was set to γ = 0.25 when n* ∈ {1, 2} and to γ = 0.1 when n* ∈ {3, 4, 5}. For the experiments in Table 4 (n_w = 50), it was set to γ = 0.75 when n* = 1, to γ = 0.1 when n* = 2, and to γ = 0.005 when n* ∈ {3, 4, 5}.
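The feature map ψ_one-hot and the normalization (113) are straightforward to implement; a minimal sketch, assuming sentences are given as arrays of word indices (the specific sentence below is a made-up example):

```python
import numpy as np

def psi_one_hot(sentence, n_w):
    """Concatenate the one-hot encodings of the words in a sentence."""
    out = np.zeros((len(sentence), n_w))
    out[np.arange(len(sentence)), sentence] = 1.0
    return out.ravel()                      # length L * n_w

def normalize(feat, n_w):
    """Center and rescale as in (113): x = (psi - p) / sqrt(p (1 - p))."""
    p = 1.0 / n_w
    return (feat - p) / np.sqrt(p * (1 - p))

n_w = 50
x = np.array([3, 17, 3, 42, 0])             # a sentence of L = 5 word indices
f = normalize(psi_one_hot(x, n_w), n_w)
# The features take exactly two values, and each word contributes one 1,
# so the normalized feature vector has mean 0.
assert f.shape == (5 * n_w,)
assert abs(f.mean()) < 1e-12
```

These normalized features would then be fed to `sklearn.svm.SVC` with C = 1, as described above.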



We use e_i to denote the i-th basis vector of R^{n_c}. So e_{ϕ(x)} is a one-hot vector coding for the concept ϕ(x). We refer to S as the 'training set'; in our formalism, however, it is not a set but an element of X^{R×n_spl}. That is, v_i ≠ v_j for all i ≠ j. Alternatively, Stirling numbers can be defined through the formula

{n k} = (1/k!) Σ_{i=0}^{k} (−1)^i C(k, i) (k − i)^n.
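The explicit sum for the Stirling numbers (of the second kind) can be checked against small known values; a quick sketch:

```python
from math import comb, factorial

def stirling2(n, k):
    """Stirling number of the second kind via the explicit formula
    (1/k!) * sum_{i=0}^{k} (-1)^i * C(k, i) * (k - i)^n."""
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)            # the sum is always divisible by k!

# Known values: S(4, 2) = 7, S(5, 3) = 25, S(n, 1) = S(n, n) = 1.
assert stirling2(4, 2) == 7
assert stirling2(5, 3) == 25
assert stirling2(6, 1) == 1 and stirling2(6, 6) == 1
```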



[Figure: example training sentences from the three categories.
x_{1,1} = [cheese, butter, lettuce, chicken, leek], x_{1,2} = [carrot, pork, cream, carrot, cheese], x_{1,3} = [lettuce, chicken, butter, potato, butter], x_{1,4} = [lettuce, beef, yogurt, leek, cream], x_{1,5} = [potato, lamb, butter, potato, yogurt];
x_{2,1} = [butter, pork, lamb, lamb, yogurt], x_{2,2} = [chicken, cheese, cream, lettuce, beef], x_{2,3} = [beef, cheese, cheese, carrot, pork], x_{2,4} = [lamb, butter, cream, potato, lamb], x_{2,5} = [chicken, cream, butter, leek, pork];
x_{3,1} = [chicken, cheese, cheese, yogurt, carrot], …, [lettuce, chicken, cheese, chicken, yogurt], x_{3,5} = [leek, chicken, butter, lamb, cheese].]

L t b y e Z n f M r 2 9 U 0 x g d H n w 8 8 L 8 e 9 k 7 6 b t m 2 v V f e a + + N 5 3 v v v R P v i 3 f q j T z S + t d + 2 e 6 1 + 5 3 z z o / O z 8 6 v F b S 1 5 X J e e G u n 8 + c / Y V r s B g = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = "

L t b y e Z n f M r 2 9 U 0 x g d H n w 8 8 L 8 e 9 k 7 6 b t m 2 v V f e a + + N 5 3 v v v R P v i 3 f q j T z S + t d + 2 e 6 1 + 5 3 z z o / O z 8 6 v F b S 1 5 X J e e G u n 8 + c / Y V r s B g = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = "

L t b y e Z n f M r 2 9 U 0 x g d H n w 8 8 L 8 e 9 k 7 6 b t m 2 v V f e a + + N 5 3 v v v R P v i 3 f q j T z S + t d + 2 e 6 1 + 5 3 z z o / O z 8 6 v F b S 1 5 X J e e G u n 8 + c / Y V r s B g = = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = "

L t b y e Z n f M r 2 9 U 0 x g d H n w 8 8 L 8 e 9 k 7 6 b t m 2 v V f e a + + N 5 3 v v v R P v i 3 f q j T z S + t d + 2 e 6 1 + 5 3 z z o / O z 8 6 v F b S 1 5 X J e e G u n 8 + c / Y V r s B g = = < / l a t e x i t >

9 s 0 u 2 N e f a O a x v h w 9 H r k v T / s H f e r Z d t 3 H j m P n W e O 5 7 x 0 j p 1 3 z o k z d k j r U + t z 6 0 v r a / t b + 0 f 7 Z / v X C r r X q j g P n a 3 T / v M X w I 9 s c Q = = < / l a t e x i t > c 0 2 = [ dairy, meat, meat, meat, dairy ] c2 = [ meat, dairy, dairy, veggie, meat ] < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3

T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3

T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3

T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " m e D Y M 6 n V y a I 7 Q 3

T M b D t 0 P / 4 7 h 7 3 K u H 7 c B 5 4 j x 1 X j i + 8 9 o 5 d j 4 4 J 8 7 E I e 7 U L d w v 7 t f 2 t / a P 9 s / 2 r z W 0 5 d a c x 8 7 O a v / 5 C / 8 R a U Q = < / l a t e x i t > c 0 3 = [ meat, dairy, dairy, dairy, veggie ] c3 = [ veggie, meat, dairy, meat, dairy ]

Figure 1: Data model with parameters set to L = 5, n w = 12, n c = 3, R = 3, and n spl = 5.

Figure 2: A simple neural net.


Figure 3: More general version of our data model.

The two previous sections were concerned with the case in which each unfamiliar sequence of concepts has a single representative in the training set. In this section we consider the more general case in which each unfamiliar sequence of concepts has n * representatives in the training set. Figure 3 depicts an example with n spl = 6 and n * = 2. This means that each category contains a total of n spl = 6 sentences, and that n * = 2 of these sentences are generated by the unfamiliar sequence of concepts (the remaining four are generated by the familiar sequence of concepts). The other parameters in this example are L = 5, n w = 12, n c = 3 and R = 3.
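As a rough illustration, the sampling procedure just described can be sketched as follows. This is a minimal sketch under our own assumptions: the function names, the concept-to-word pools, and the two concept sequences are stand-ins for the ones depicted in Figure 3, not the paper's exact instantiation.

```python
import random

# Hypothetical parameters matching the Figure 3 example (L = 5, n_spl = 6, n_star = 2).
L, n_spl, n_star = 5, 6, 2

words_of = {                                     # each concept owns its own pool of words
    "veggie": ["potato", "carrot", "leek", "lettuce"],
    "dairy":  ["cheese", "yogurt", "butter", "cream"],
    "meat":   ["chicken", "lamb", "beef", "pork"],
}

def sample_sentence(concept_seq):
    """Replace each concept by a word drawn uniformly from that concept's pool."""
    return [random.choice(words_of[c]) for c in concept_seq]

def sample_category(familiar, unfamiliar):
    """A category = n_spl sentences: n_star generated by the unfamiliar concept
    sequence, and the remaining n_spl - n_star by the familiar one."""
    return ([sample_sentence(unfamiliar) for _ in range(n_star)] +
            [sample_sentence(familiar) for _ in range(n_spl - n_star)])

category = sample_category(
    familiar=["meat", "dairy", "dairy", "veggie", "meat"],
    unfamiliar=["dairy", "meat", "meat", "meat", "dairy"],
)
```

The long tail arises because, across the R categories, the unfamiliar sentences are rare individually but collectively form a fixed fraction n * / n spl of the training set.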


Thus an upper bound for P DM [A] is also an upper bound for P DM [E meaningful ].

Figure 5: Two equipartitions of the same graph (each subset of the equipartition contains 7 vertices). The equipartition on the left is cut-free (no edges are severed). The equipartition on the right is not cut-free (4 edges are severed). The optimal kernel K(x, y) can be interpreted as the number of distinct cut-free equipartitions of the graph ζ(x, y) (up to a scaling factor).
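The cut-free property itself is easy to check programmatically. The helper below is our own (not code from the paper): a partition of the vertices is cut-free exactly when no edge has its two endpoints in different blocks.

```python
def is_cut_free(edges, partition):
    """partition: list of disjoint vertex sets covering V.
    Cut-free iff every edge lies entirely inside a single block (no edge is severed)."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    return all(block_of[u] == block_of[v] for u, v in edges)

# Toy graph: edges (1,2), (2,3), (4,5) on vertices {1, ..., 5}.
edges = [(1, 2), (2, 3), (4, 5)]
assert is_cut_free(edges, [{1, 2, 3}, {4, 5}])        # no edge severed
assert not is_cut_free(edges, [{1, 2, 4}, {3, 5}])    # edges (2,3) and (4,5) are cut
```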

Fix once and for all a graph G = (V, E) with edge set E = {e 1 , . . . , e m } where m ≤ L. Given 0 ≤ α ≤ L, define the set

O α = {(x, y) ∈ X × X : ζ(x, y) = G and (x, y) has exactly α non-silent positions}.


Figure 6: Data model with n * = 2 unfamiliar sentences per category. The other parameters in this example are set to L = 5, n w = 12, n c = 3, R = 3 and n spl = 6. The points highlighted in yellow are the ones involved in the definition of the event A^(1); see equation (109).

K(x test , x r,s ) > max 3 ≤ s' ≤ n spl K(x test , x r,s' )


MLP 1: d in = 150, d hidden = 500, d out = 10; MLP 2: d in = 90, d hidden = 2000, d out = 1000.
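Assuming these two MLPs sit in the architecture of Figure 2 — a shared per-word MLP 1 whose 10-dimensional outputs for the L = 9 words are concatenated (9 × 10 = 90) and fed to MLP 2, which scores the R = 1000 categories — a shape-level sketch looks as follows. The word encoding of dimension 150 and the random weights are our own placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(d_in, d_hidden, d_out):
    """One-hidden-layer ReLU MLP as a pair of weight matrices (biases omitted for brevity)."""
    return (rng.standard_normal((d_in, d_hidden)) * 0.01,
            rng.standard_normal((d_hidden, d_out)) * 0.01)

def forward(x, params):
    W1, W2 = params
    return np.maximum(x @ W1, 0.0) @ W2

L = 9                                    # words per sentence
mlp1 = mlp(150, 500, 10)                 # shared per-word network
mlp2 = mlp(90, 2000, 1000)               # classifier on the concatenated word features

sentence = rng.standard_normal((L, 150))            # 9 word encodings of dimension 150
word_feats = forward(sentence, mlp1)                # shape (9, 10): one feature per word
logits = forward(word_feats.reshape(1, -1), mlp2)   # shape (1, 1000): one score per category
```

The "learned features" referred to later are precisely the concatenated `word_feats` vector right before MLP 2.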

Table 1: Success rate on unfamiliar test sentences (columns n spl = 5, 6, 7, 8, 9, 10).

Algorithms. We evaluate seven different algorithms empirically. The first two rows of the table correspond to experiments in which the neural network in Figure 2 is trained with SGD and a constant learning rate. At test time, we consider two different strategies to classify test sentences. The first row of the table corresponds to the usual situation in which the trained neural network itself classifies test points. The second row corresponds to the situation in which the trained neural network is only used to extract features (i.e., the concatenated word representations right before MLP2); the classification of test points is then accomplished by running a nearest neighbor classifier on these learned features.

Table 1 reports the success rate of each algorithm on unfamiliar test sentences. A crystal-clear pattern emerges: algorithms that learn features generalize almost perfectly, while algorithms that do not learn features fail catastrophically. Moreover, the specific classification rule matters little. For example, replacing MLP2 by a nearest neighbor classifier on top of the features learned by the neural network leads to equally accurate results. Similarly, replacing the nearest neighbor classifier by an SVM on top of the features extracted by ψ or ψ one-hot leads to almost equally poor results. The only thing that matters is whether or not the features are learned. Finally, inequality (10) gives an upper bound of 0.015n * + 1/1000 on the success rate of the nearest neighbor classification rule applied on top of any possible feature map (including ψ and ψ one-hot ). The fifth row of Table 1 compares this bound against the empirical accuracy obtained with ψ and ψ one-hot , and the comparison shows that our theoretical upper bound is relatively tight.
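The "nearest neighbor classifier on learned features" variant from the second row can be sketched in a few lines. This is a toy NumPy version with made-up feature vectors, not the paper's pipeline; the two clusters stand in for two well-separated categories in feature space.

```python
import numpy as np

def nearest_neighbor_label(test_feat, train_feats, train_labels):
    """Assign the label of the closest training point in feature space (Euclidean distance)."""
    dists = np.linalg.norm(train_feats - test_feat, axis=1)
    return train_labels[int(np.argmin(dists))]

# Toy example: two well-separated feature clusters standing in for two categories.
train_feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
train_labels = np.array([0, 0, 1, 1])

assert nearest_neighbor_label(np.array([0.05, 0.1]), train_feats, train_labels) == 0
assert nearest_neighbor_label(np.array([4.9, 5.2]), train_feats, train_labels) == 1
```

The point of the comparison in Table 1 is that this trivial rule works as well as MLP2 once the features are good, and no rule works when they are not.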

First five rows of the Stirling triangle for the Stirling numbers { n k } (the Stirling numbers of the second kind).
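These numbers count the partitions of an n-element set into k nonempty blocks and satisfy the standard recurrence S(n, k) = k·S(n−1, k) + S(n−1, k−1), which generates the triangle row by row:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: partitions of an n-set into k nonempty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

# First five rows of the Stirling triangle (n = 1, ..., 5; k = 1, ..., n).
triangle = [[stirling2(n, k) for k in range(1, n + 1)] for n in range(1, 6)]
# triangle == [[1], [1, 1], [1, 3, 1], [1, 7, 6, 1], [1, 15, 25, 10, 1]]
```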

Accuracy in % on unfamiliar test points (L = 9, n w = 50, n c = 5, R = 1000).


Acknowledgements. Xavier Bresson is supported by NUS-R-252-000-B97-133 and A*STAR Grant ID A20H4g2141.

Appendix

In Section A we prove a few elementary properties of the permuted moment (11). Section B is devoted to the proof of inequality (13), which we restate here for convenience:

C.1 GRAPH-CUT FORMULATION OF THE OPTIMAL KERNEL

In this section we consider undirected graphs on the vertex set V := {1, 2, . . . , n w }. Since the data space X consists of sentences of length L, graphs that have at most L edges will be of particular importance. We therefore define

G := {all graphs on V that have at most L edges}.

In other words, G consists of all the graphs G = (V, E) whose edge set E has cardinality less than or equal to L. Since these graphs all have the same vertex set, we will often identify them with their edge sets. We now introduce a mapping between pairs of sentences containing L words and graphs containing at most L edges. The right-hand side of (78) is a set of at most L edges. Since graphs in G are identified with their edge sets, ζ indeed defines a mapping from X × X to G.

Let us give a few examples illustrating how the map ζ works. Suppose we have a vocabulary of n w = 10 words and sentences of length L = 6. Consider a pair of sentences (x, y) ∈ X × X whose words agree in positions 1 and 3 and differ elsewhere. Then ζ(x, y) is a set of 3 edges, which indeed defines a graph on V. Note that positions ℓ = 2 and ℓ = 4 of (x, y) 'code' for the same edge {2, 5}, position 5 codes for the edge {9, 2}, and position 6 codes for the edge {7, 1}. On the other hand, positions 1 and 3 do not code for any edge: indeed, since x 1 = y 1 and x 3 = y 3 , these two positions do not contribute any edges to the edge set defined by (78). We will say that positions 1 and 3 are silent. We make this terminology formal in the definition below.

Definition 3. Let (x, y) ∈ X × X. If x ℓ = y ℓ for some 1 ≤ ℓ ≤ L, we say that position ℓ of the pair (x, y) is silent. If x ℓ ≠ y ℓ , we say that position ℓ of the pair (x, y) codes for the edge {x ℓ , y ℓ }.

Note that if (x, y) has some silent positions, or if multiple positions code for the same edge, then the graph ζ(x, y) will have strictly fewer than L edges. On the other hand, if neither of these takes place, then ζ(x, y) will have exactly L edges.
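The map ζ and the notion of silent position translate directly into code. The helper names below are our own; the pair (x, y) reproduces the worked example above, where positions 1 and 3 are silent and positions 2 and 4 code for the same edge {2, 5}.

```python
def silent_positions(x, y):
    """1-based positions where the two sentences carry the same word."""
    return {i + 1 for i, (a, b) in enumerate(zip(x, y)) if a == b}

def zeta(x, y):
    """Edge set of the graph zeta(x, y): one edge {x_l, y_l} per non-silent position l.
    Edges are undirected, so each is stored as a frozenset."""
    return {frozenset((a, b)) for a, b in zip(x, y) if a != b}

# Words are numbered 1..n_w = 10; sentences have length L = 6.
x = [3, 2, 6, 2, 9, 7]
y = [3, 5, 6, 5, 2, 1]
```

Duplicate edges collapse in the set, so here ζ(x, y) has only 3 edges even though 4 positions are non-silent.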
For example, a pair of sentences that has no silent positions, and whose positions all code for different edges, yields a graph with the maximal possible number of edges, namely L = 6 edges. From the above discussion, it is clear that any graph with L or fewer edges can be expressed as ζ(x, y) for some pair of sentences (x, y) ∈ X × X. Therefore ζ : X × X → G is surjective. On the other hand, different pairs of sentences can be mapped to the same graph, so ζ is not injective. We now introduce the following function.

Definition 4 (Number of cut-free equipartitions of a graph). The function I : G → N is defined by:

Published as a conference paper at ICLR 2023

Applying an SVM to the features extracted by ψ is equivalent to running a kernelized SVM with kernel K. A naive implementation of such an algorithm leads to very poor results on our data model. For the algorithm not to fail completely, it is important to carefully "rescale" K so that the eigenvalues of the corresponding Gram matrix are well behaved. Let ξ : R → R be a strictly increasing function. Since the nearest neighbor classification rule works by comparing the values of K on various pairs of points, using the rescaled kernel ξ(K(x, y)) is equivalent to using the kernel K(x, y) itself. In particular, choosing ξ(x) := log(1 + (n_w^L / α) x) gives a family of optimal kernels K α indexed by α. To be clear, all these kernels are exactly equivalent to the kernel K when used with a nearest neighbor classification rule; however, they lead to different algorithms when used for a kernelized SVM. We have experimented with various choices of the function ξ and found that this logarithmic scaling works well for the kernelized SVM.

For each of the 10 possible parameter settings in Table 3 and Table 4, we run 10 experiments.
For each experiment we generate:

• A training set containing R × n spl sentences.
• A test set containing 1,000 unfamiliar sentences (one sentence per category).

Let us denote by x train i , 1 ≤ i ≤ R × n spl , the data points in one of these training sets, and by x test i , 1 ≤ i ≤ 1000, the data points in the corresponding test set. In order to run the kernelized SVM algorithm we need to form the Gram matrices G train and G test. Constructing each of these Gram matrices takes a few days on CPU. We then use the SVC function of Scikit-learn to run a soft multiclass kernelized SVM algorithm. We tried various values for the parameter C controlling the ℓ2 regularization and found that the algorithm is not sensitive to this choice, so we chose C = 1. The algorithm, on the other hand, is quite sensitive to the choice of the hyperparameter α defining the kernel K α . We experimented with various choices of α and found that choosing the smallest α that makes the Gram matrix G train positive definite works well (note that the Gram matrix should be positive semidefinite for the kernelized SVM method to make sense). In Table 5 we show an example, on a specific pair of training and test sets⁵, of how the eigenvalues of G train and the test accuracy depend on α.

⁵ The training set and test set used in this experiment were generated by our data model with parameters L = 9, n w = 50, n c = 5, R = 1000, n spl = 8, and n * = 3.
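Two of the claims above are easy to check numerically: a strictly increasing ξ cannot change nearest neighbor decisions, and one can select α by scanning for the smallest value that makes the Gram matrix positive definite. The sketch below uses made-up similarity values and our own helper names; in particular, the toy Gram family (a fixed matrix plus an α-dependent diagonal) is a stand-in for the paper's K α , not the same construction.

```python
import numpy as np

# (1) A strictly increasing rescaling cannot change nearest neighbor decisions.
k_values = np.array([0.2, 3.0, 0.7, 3.5, 1.1])   # made-up similarities to 5 training points
scale = 1e6                                      # stand-in for n_w**L / alpha
rescaled = np.log1p(scale * k_values)            # xi(x) = log(1 + scale * x), increasing
assert np.argmax(k_values) == np.argmax(rescaled) == 3

# (2) Select the smallest alpha whose Gram matrix is positive definite.
def smallest_pd_alpha(gram_builder, candidates):
    """Scan candidate alphas in increasing order; return the first one whose
    Gram matrix has strictly positive smallest eigenvalue."""
    for alpha in sorted(candidates):
        if np.linalg.eigvalsh(gram_builder(alpha)).min() > 0:
            return alpha
    return None

# Toy Gram family: fixed off-diagonal similarities plus alpha on the diagonal.
S = np.array([[0.0, 1.0], [1.0, 0.0]])
build = lambda alpha: S + alpha * np.eye(2)

assert smallest_pd_alpha(build, [0.5, 1.5, 2.5]) == 1.5   # eigenvalues alpha +/- 1
```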

