ORIGINAL: `Ladino is in serious danger of extinction because many native speakers today are elderly as well as elderly ''olim'' (immigrants to [[Israel]]) who have not transmitted the language to their children or grandchildren.' FULL: None WITH CHUNKING: None ORIGINAL: `[[Software license]] gives the user the right to use the software in the licensed environment, some software comes with the license when purchased off the shelf, or an OEM license when bundled with hardware.' FULL: None WITH CHUNKING: None ORIGINAL: `In the summer of 1924, the [[American Radio Relay League]] adopted Esperanto as its official [[international auxiliary language]], and hoped that the language would be used by [[Amateur radio|radio amateurs]] in international communications, but its actual use for radio communications was negligible.' FULL: [[0.033879000000000006, 'In the summer of 1924 the American radio relay league adopted Esperanto as its official international auxiliary language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [0.033879000000000006, 'In the summer of 1924 the American radio relay league adopted Esperanto as its official auxiliary international language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [0.033879000000000006, 'In the summer of 1924 the American radio relay league adopted Esperanto as its international official auxiliary language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [0.033879000000000006, 'In the summer of 1924 the American radio relay league adopted Esperanto as its international auxiliary official language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [0.033879000000000006, 'In the summer of 1924 the American radio relay league adopted Esperanto as its auxiliary official international language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.']] WITH CHUNKING: [[-4.576657797065464, 'In the summer of 1924 the american radio relay league adopted esperanto as its auxiliary official international language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [-4.576657797065464, 'In the summer of 1924 the american radio relay league adopted esperanto as its international auxiliary official language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [-4.576657797065464, 'In the summer of 1924 the american radio relay league adopted esperanto as its international official auxiliary language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [-4.576657797065464, 'In the summer of 1924 the american radio relay league adopted esperanto as its official auxiliary international language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.'], [-4.576657797065464, 'In the summer of 1924 the american radio relay league adopted esperanto as its official international auxiliary language and hoped that the language would be used by radio amateurs in international communications but its actual use for radio communications was negligible.']] ORIGINAL: `Although German is usually cited as an outstanding example of a highly inflected language, the degree of inflection is considerably less than in [[Old German]], or in other old [[Indo-European languages]] such as [[Latin]], [[Ancient Greek]], or [[Sanskrit]].' FULL: [[0.0037130000000000006, 'Although German is usually cited as an outstanding example of a highly inflected language the degree of inflection is considerably less than in old German or in indo- european other old languages such as Latin ancient Greek or Sanskrit.'], [0.003336, 'Although German is usually cited as an outstanding example of a language inflected highly the degree of inflection is considerably less than in old German or in indo- european other old languages such as Latin ancient Greek or Sanskrit.'], [0.002783, 'Although German is usually cited as an outstanding example of a highly inflected language the degree of inflection is considerably less than in old German or in other indo- european old languages such as Latin ancient Greek or Sanskrit.'], [0.002725, 'Although German is usually cited as an outstanding example of a language inflected high the degree of inflection is considerably less than in old German or in indo- european other old languages such as Latin ancient Greek or Sanskrit.'], [0.0026769999999999997, 'Although German is usually cited as an outstanding example of a highly inflected language the degree of inflection is considerably less than in old German or in indo- european old other languages such as Latin ancient Greek or Sanskrit.']] WITH CHUNKING: [[-5.2713332978844845, 'Although usually german is cited as an outstanding example of a highly inflected language the degree of inflection is considerably more few than in old german or in indo- european other old languages such as latin ancient greek or sanskrit.'], [-5.2713332978844845, 'Although usually german is cited as an outstanding example of a highly inflected language the degree of inflection is considerably more little than in old german or in indo- european other old languages such as latin ancient greek or sanskrit.'], [-5.058321567553195, 'Although usually german is cited as an outstanding example of a highly inflected language the degree of inflection is considerably less than in old german or in indo- european old other languages such as latin ancient greek or sanskrit.'], [-5.01982639418871, 'Although usually german is cited as an outstanding example of a highly inflected language the degree of inflection is considerably less than in old german or in other indo- european old languages such as latin ancient greek or sanskrit.'], [-4.731660289528816, 'Although usually german is cited as an outstanding example of a highly inflected language the degree of inflection is considerably less than in old german or in indo- european other old languages such as latin ancient greek or sanskrit.']] ORIGINAL: `However, it is difficult if not impossible to precisely distinguish program "metadata" from general aspects of [[Von Neumann architecture|stored-program computing architecture]]; if the machine reads it and acts upon it, it is a computational [[Instruction (computer science)|instruction]], and the prefix "meta" has little significance.' FULL: [[0.00020999999999999995, 'However it is difficult if not impossible to distinguish program metadata from general aspects of stored program computing architecture precisely if the machine reads it and acts upon it it is a computational instruction and the prefix meta has little significance.'], [0.00019799999999999993, 'However to precisely distinguish program metadata from general aspects of stored program computing architecture it is difficult if not impossible if the machine reads it and acts upon it it is a computational instruction and the prefix meta has little significance.'], [0.00016899999999999993, 'However it is difficult if not impossible to precisely distinguish program metadata from general aspects of stored program computing architecture if the machine reads it and acts upon it it is a computational instruction and the prefix meta has little significance.'], [0.00014399999999999992, 'However to distinguish program metadata from general aspects of stored program computing architecture precisely it is difficult if not impossible if the machine reads it and acts upon it it is a computational instruction and the prefix meta has little significance.'], [9.699999999999997e-05, 'However to precisely distinguish program metadata from general aspects of stored program computing architecture it is difficult if not impossible it is a computational instruction if the machine reads it and acts upon it and the prefix meta has little significance.']] WITH CHUNKING: None ORIGINAL: `These categories evolve as [[learning theory (education)|learned]] concepts of the world — meaning is not an objective truth, but a subjective construct, learned from experience, and language arises out of the "grounding of our conceptual systems in shared [[embodied philosophy|embodiment]] and bodily experience".' FULL: None WITH CHUNKING: None ORIGINAL: `From the diagram, it is clear that the value of the hidden variable x(t) (at time t) ''only'' depends on the value of the hidden variable x(t-1) : the values at time t-2 and before have no influence.' FULL: [[0.002377, 'From the diagram it is clear that the value at time of the hidden variable only depends upon the value of the hidden variable : the values at and before time have no influence.'], [0.002377, 'From the diagram it is clear that the value at time of the hidden variable only depends on the value of the hidden variable : the values at and before time have no influence.'], [0.001712, 'From the diagram it is clear that the value at time of the hidden variable only depends upon the value of the variable hidden away : the values at and before time have no influence.'], [0.001712, 'From the diagram it is clear that the value at time of the hidden variable only depends on the value of the variable hidden away : the values at and before time have no influence.'], [0.0008679999999999998, 'From the diagram it is clear that the value at time of the hidden variable only depends upon the value of the hidden variable : the values at time and before have no influence.']] WITH CHUNKING: [[-6.531739453453014, 'From the diagram it is clear that only the value at time of the hidden variable depends upon the value of the hidden variable : the values at time and before have no influence.'], [-6.480299232488448, 'From the diagram it is clear that only the value at time of the hidden variable depends on the value of the variable hidden away : the values at and before time have no influence.'], [-6.480299232488448, 'From the diagram it is clear that only the value at time of the hidden variable depends upon the value of the variable hidden away : the values at and before time have no influence.'], [-5.522438955485155, 'From the diagram it is clear that only the value at time of the hidden variable depends on the value of the hidden variable : the values at and before time have no influence.'], [-5.522438955485155, 'From the diagram it is clear that only the value at time of the hidden variable depends upon the value of the hidden variable : the values at and before time have no influence.']] ORIGINAL: `No figures were provided during the 1995 and 2000 censuses; however, figures for 2000 did specify there were over 600,000 native speakers of [[Chavacano language|Chavacano]], a Spanish based [[Creole language|creole]] language spoken in [[Cavite]] and [[Zamboanga]].' FULL: [[0.020624999999999998, 'No figures were provided during the 1995 and 2000 censuses. figures for 2000 specified that there were over 600000 native speakers of Chavacano a Spanish based creole language spoken in Cavite and Zamboanga.'], [0.0025830000000000002, 'No figures were provided during the 1995 and 2000 censuses. figures for 2000 specified there were over 600000 native speakers of Chavacano a Spanish based creole language spoken in Cavite and Zamboanga.']] WITH CHUNKING: [[-7.793098000224208, 'No figures were provided during the 1995 and 2000 censuses. figures for 2000 specified there were over 600000 native speakers of chavacano a spanish based creole language spoken in cavite and zamboanga.'], [-4.674559397321639, 'No figures were provided during the 1995 and 2000 censuses. figures for 2000 specified that there were over 600000 native speakers of chavacano a spanish based creole language spoken in cavite and zamboanga.']] ORIGINAL: `Just because a particular [[operating system]] may run on different [[computer architecture]]s, that does not mean that the software written for that operating system will automatically work on all [[computer architecture|architecture]]s that the operating system supports.' FULL: None WITH CHUNKING: [[-6.639848541712948, 'Just because a particular operating system may run on different computer architectures that does not mean like the software written for that operating system will work on all architectures which the operating system supports automatically.'], [-6.577464380592872, 'That does not mean that the software written for that operating system will work on all architectures which the operating system supports automatically just because a particular operating system may run on different computer architectures.'], [-6.576584787443675, 'Just because a particular operating system may run on different computer architectures that does not mean that the software written for that operating system will work on all architectures who the operating system supports automatically.'], [-6.391454973029215, 'Just because a particular operating system may run on different computer architectures that does not mean that the software written for that operating system will work on all architectures that the operating system supports automatically.'], [-6.323231723844678, 'Just because a particular operating system may run on different computer architectures that does not mean that the software written for that operating system will work on all architectures which the operating system supports automatically.']] ORIGINAL: `It is the language with the largest number of speakers in [[South America]], spoken by nearly all of Brazil's population, which amounts to over 51% of the continent's population even though it is the only Portuguese-speaking nation in [[the Americas]].' FULL: [[0.011022000000000007, "It is the language with the largest number of speakers in South America spoken by nearly all Brazil's population which even though it is the only Portuguese speaking nation in the Americas amounts to over 51 % of the continent's population."], [0.008870000000000006, "It is the language with the largest number of speakers in South America spoken by nearly all Brazil's population who even though it is the only Portuguese speaking nation in the Americas amounts to over 51 % of the continent's population."], [0.005815000000000003, "It is the language with the largest number of speakers in South America spoken by nearly all Brazil's population which amounts to over 51 % of the continent's population even though it is the only Portuguese speaking nation in the Americas."], [0.004719000000000001, "It is the language with the largest number of speakers in South America spoken by nearly all Brazil's population who amounts to over 51 % of the continent's population even though it is the only Portuguese speaking nation in the Americas."], [0.0020819999999999988, "It is the language with the largest number of speakers in South America spoken by nearly all of Brazil's population which even though it is the only Portuguese speaking nation in the Americas amounts to over 51 % of the continent's population."]] WITH CHUNKING: None ORIGINAL: `Information theory, however, does not consider message importance or meaning, as these are matters of the quality of data rather than the quantity and readability of data, the latter of which is determined solely by probabilities.' FULL: None WITH CHUNKING: None ORIGINAL: `These disciplines also involve rigorous data analysis, and are widely used in business for segmentation and decision making, but have different purposes and the statistical techniques underlying them vary.' FULL: None WITH CHUNKING: None ORIGINAL: `As it happens, no other method can do any better, as was shown by [[Alan Turing]] with his celebrated result on the undecidability of the so-called [[halting problem]].' FULL: [[0.0008030000000000001, 'As it happens no other method can do any better as were shown by Alan Turing with his celebrated result on the undecidability of the so-called halting problem.'], [0.0008030000000000001, 'As it happens no other method can do any better as were shown by Alan Turing with his celebrated result on the undecidability of the so called halting problem.'], [0.0006659999999999999, 'As it happens no other method can do any better as was shown by Alan Turing with his celebrated result on the undecidability of the so-called halting problem.'], [0.0006659999999999999, 'As it happens no other method can do any better as was shown by Alan Turing with his celebrated result on the undecidability of the so called halting problem.'], [0.000253, 'As it happens no other method can do any better as were shown by Alan Turing with his result on the undecidability of the so-called halting problem celebrated.']] WITH CHUNKING: None ORIGINAL: `[[Music]], the [[performing arts]], [[amusement park]]s, works of [[fiction]] and so on are thus forms of information in this sense, but they are not necessarily forms of information according to some definitions given above.' FULL: [[6.599999999999998e-05, 'Thus music the performing arts amusement parks works of fiction and so on are forms of information in this sense but they are not necessarily forms of information according to some definitions given above.'], [3.499999999999999e-05, 'Music the performing arts amusement parks works of fiction and so on are thus forms of information in this sense but they are not necessarily forms of information according to some definitions given above.'], [1.9e-05, 'Music the performing arts amusement parks works of fiction and so on thus are forms of information in this sense but they are not necessarily forms of information according to some definitions given above.'], [2e-06, 'Music the performing arts amusement parks works of fiction and so on are forms of information in this sense thus but they are not necessarily forms of information according to some definitions given above.']] WITH CHUNKING: [[-12.95267963781293, 'Music the performing arts amusement parks works of fiction and so on are forms of information in this sense thus but they are not necessarily forms of information according to some definitions given above.'], [-12.30370367556612, 'Music the performing arts amusement parks works of fiction and so on are thus forms of information in this sense but they are not necessarily forms of information according to some definitions given above.'], [-12.235141004128826, 'Music the performing arts amusement parks works of fiction and so on thus are forms of information in this sense but they are not necessarily forms of information according to some definitions given above.'], [-10.389631421608417, 'Thus music the performing arts amusement parks works of fiction and so on are forms of information in this sense but they are not necessarily forms of information according to some definitions given above.']] ORIGINAL: `Conversely, if one distributes copies of the work without abiding by the terms of the GPL (for instance, by keeping the source code secret), he or she can be [[lawsuit|sued]] by the original author under copyright law.' FULL: [[0.020246, 'Under © law if one distributes copies of the work without abiding for instance by the terms of the GPL by keeping the source code secret he or she can be sued by the original author conversely.'], [0.020246, 'Under copyright law if one distributes copies of the work without abiding for instance by the terms of the GPL by keeping the source code secret he or she can be sued by the original author conversely.'], [0.009289999999999986, 'The original author if one distributes copies of the work without abiding for instance by the terms of the GPL by keeping the source code secret he or she can be sued by under © law conversely.'], [0.009289999999999986, 'The original author if one distributes copies of the work without abiding for instance by the terms of the GPL by keeping the source code secret he or she can be sued by under copyright law conversely.'], [0.008254999999999972, '© Law if one distributes copies of the work without abiding for instance by the terms of the GPL by keeping the source code secret he or she can be sued by the original author under conversely.']] WITH CHUNKING: None ORIGINAL: `Convinced that the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search, Page and Brin tested their thesis as part of their studies, and laid the foundation for their search engine.' FULL: [[0.0019419999999999995, 'Convinced that the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search Page and Brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [0.0011149999999999999, 'Convinced that the pages with the most links to them from highly relevant other web pages must be the most relevant pages associated with the search Page and Brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [0.0009810000000000008, 'Convinced that the pages with the most links to them from other web pages relevant highly must be the most relevant pages associated with the search Page and Brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [0.0008880000000000014, 'Convinced that the pages with the most links to them from other web pages relevant high must be the most relevant pages associated with the search Page and Brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [0.0003340000000000002, 'Convinced the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search Page and Brin tested their thesis as part of their studies and laid the foundation for their search engine.']] WITH CHUNKING: [[-7.121213409528758, 'Convinced the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search page and brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [-6.445176087164738, 'Convinced that the pages with the most links to them from other web pages relevant high must be the most relevant pages associated with the search page and brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [-6.347970956106668, 'Convinced that the pages with the most links to them from other web pages relevant highly must be the most relevant pages associated with the search page and brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [-6.250628246532372, 'Convinced that the pages with the most links to them from highly relevant other web pages must be the most relevant pages associated with the search page and brin tested their thesis as part of their studies and laid the foundation for their search engine.'], [-5.6964107950742395, 'Convinced that the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search page and brin tested their thesis as part of their studies and laid the foundation for their search engine.']] ORIGINAL: `Though there is very little literature on [[parsing]] [[algorithms]], most of these algorithms assume that the language to be parsed is initially ''described'' by means of a ''generative'' formal grammar, and that the goal is to transform this generative grammar into a working parser.' FULL: [[0.009888000000000001, 'Though there is very little literature on parsing algorithms most of these algorithms assume that the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.'], [0.007250000000000002, 'Though there is very little literature on parsing algorithms most of these algorithms assume that the language to be parsed is initially described by means of a generative formal grammar and that the goal is to transform this generative grammar into a working parser.'], [0.006770000000000001, 'Though there is very little literature on parsing algorithms most of these algorithms assume like the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.'], [0.006770000000000001, 'Though there is very little literature on parsing algorithms most of these algorithms assume as though the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.'], [0.006770000000000001, 'Though there is very little literature on parsing algorithms most of these algorithms assume as if the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.']] WITH CHUNKING: [[-5.173913602672884, 'Though there is very little literature on parsing algorithms most of these algorithms assume as if the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.'], [-5.173913602672884, 'Though there is very little literature on parsing algorithms most of these algorithms assume as though the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.'], [-5.173913602672884, 'Though there is very little literature on parsing algorithms most of these algorithms assume like the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.'], [-5.05918331766409, 'Though there is very little literature on parsing algorithms most of these algorithms assume that the language to be parsed is initially described by means of a generative formal grammar and that the goal is to transform this generative grammar into a working parser.'], [-4.748813856932851, 'Though there is very little literature on parsing algorithms most of these algorithms assume that the language to be parsed is initially described by means of a formal generative grammar and that the goal is to transform this generative grammar into a working parser.']] ORIGINAL: `The purpose of the random variance is to find close to globally optimal solutions rather than simply locally optimal ones, the idea being that the random element will be decreased as the algorithm settles down to a solution.' FULL: [[0.07322500000000011, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than ones locally optimal simply.'], [0.052618000000000074, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than ones optimal locally simply.'], [0.05090400000000013, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than ones optimal simply locally.'], [0.04000400000000001, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than ones simply optimal locally.'], [0.013393999999999998, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than locally simply optimal ones.']] WITH CHUNKING: [[-4.779864071225594, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than ones optimal locally simply.'], [-4.5915784520211265, 'The idea being that the random element will be decreased as the algorithm settles down to a solution the purpose of the random variance is to find close to globally optimal solutions rather than ones optimal simply locally.'], [-4.55830106939718, 'The idea being that the random element will be decreased as the algorithm settles down to a solution the purpose of the random variance is to find close to globally optimal solutions rather than ones optimal locally simply.'], [-4.449567257346313, 'The idea being that as the algorithm settles down to a solution the random element will be decreased the purpose of the random variance is to find close to globally optimal solutions rather than ones locally optimal simply.'], [-4.228079365562059, 'The idea being that the random element will be decreased as the algorithm settles down to a solution the purpose of the random variance is to find close to globally optimal solutions rather than ones locally optimal simply.']] ORIGINAL: `Two key [[data model]]s arose at this time: [[CODASYL]] developed the [[network model]] based on Bachman's ideas, and (apparently independently) the [[hierarchical model]] was used in a system developed by [[North American Rockwell]] later adopted by [[IBM]] as the cornerstone of their [[Information Management System|IMS]] product.' FULL: None WITH CHUNKING: [[-7.8693038853578035, "Two key data models arose at this time :. codasyl developed the network model based on bachman's ideas and apparently independently the hierarchical model was used in a system later adopted by ibm as the cornerstone of their ims product developed by north american rockwell."], [-7.481658850506863, "Two key data models arose : at this time. codasyl developed the network model based on bachman's ideas and apparently in a system later adopted by ibm as the cornerstone of their ims product developed by north american rockwell the hierarchical model was used independently."], [-7.448352208125062, "Two key data models arose at this time :. codasyl developed the network model based on bachman's ideas and apparently in a system later adopted by ibm as the cornerstone of their ims product developed by north american rockwell the hierarchical model was used independently."], [-7.404800103528466, "Two key data models arose : at this time. codasyl developed the network model based on bachman's ideas and apparently in a system developed by north american rockwell later adopted by ibm as the cornerstone of their ims product the hierarchical model was used independently."], [-7.371493461146665, "Two key data models arose at this time :. codasyl developed the network model based on bachman's ideas and apparently in a system developed by north american rockwell later adopted by ibm as the cornerstone of their ims product the hierarchical model was used independently."]] ORIGINAL: `Linguists such as [[David Crystal]] recognize that one impact of this massive growth of English, in common with other global languages, has been to reduce native [[Natural language#Linguistic diversity|linguistic diversity]] in many parts of the world historically, most particularly in [[Australasia]] and [[North America]], and its huge influence continues to play an important role in [[language attrition]].' FULL: [[1.1e-05, 'Linguists such as David Crystal recognize that one impact of this massive growth of English in common with other global languages has been to reduce native linguistic diversity historically in many parts of the world most particularly in Australasia and North America and its huge influence continues to play an important role in language attrition.'], [1.1e-05, 'Linguists such as David Crystal recognize that one impact of this massive growth of English in common with other global languages has been to reduce linguistic native diversity historically in many parts of the world most particularly in Australasia and North America and its huge influence continues to play an important role in language attrition.'], [9e-06, 'Linguists such as David Crystal recognize that one impact of this massive growth of English in common with other global languages has been to reduce native linguistic diversity most particularly in Australasia and North America in many parts of the world historically and its huge influence continues to play an important role in language attrition.'], [9e-06, 'Linguists such as David Crystal recognize that one impact of this massive growth of English in common with other global languages has been to reduce linguistic native diversity most particularly in Australasia and North America in many parts of the world historically and its huge influence continues to play an important role in language attrition.'], [9e-06, 'Linguists such as David Crystal recognize that one impact of this massive growth of English in common with global other languages has been to reduce native linguistic diversity historically in many parts of the world most particularly in Australasia and North America and its huge influence continues to play an important role in language attrition.']] WITH CHUNKING: [[-13.607177746175994, 'Linguists such as david crystal recognize that one impact of this massive growth of english in common with other global languages has been to historically reduce native linguistic diversity most particularly in australasia and north america in many parts of the world and its huge influence continues to play an important role in language attrition.'], [-11.912582025401585, 'Linguists such as david crystal recognize that one impact of this massive growth of english in common with global other languages has been to reduce linguistic native diversity most particularly in australasia and north america in many parts of the world historically and its huge influence continues to play an important role in language attrition.'], [-11.912582025401585, 'Linguists such as david crystal recognize that one impact of this massive growth of english in common with global other languages has been to reduce native linguistic diversity most particularly in australasia and north america in many parts of the world historically and its huge influence continues to play an important role in language attrition.'], [-11.630015053616576, 'Linguists such as david crystal recognize that one impact of this massive growth of english in common with other global languages has been to reduce linguistic native diversity most particularly in australasia and north america in many parts of the world historically and its huge influence continues to play an important role in language attrition.'], [-11.630015053616576, 'Linguists such as david crystal recognize that one impact of this massive growth of english in common with other global languages has been to reduce native linguistic diversity most particularly in australasia and north america in many parts of the world historically and its huge influence continues to play an important role in language attrition.']] ORIGINAL: `Because the [[inner product]] is a [[linear operator]] in the input space, the Perceptron can only perfectly classify a set of data for which different classes are [[linearly separable]] in the input space, while it often fails completely for non-separable data.' FULL: None WITH CHUNKING: None ORIGINAL: `An inflectional rule takes a stem, changes it as is required by the rule, and outputs a word-form; a derivational rule takes a stem, changes it as per its own requirements, and outputs a derived stem; a compounding rule takes word-forms, and similarly outputs a compound stem.' FULL: None WITH CHUNKING: None ORIGINAL: `Ratio measurements have both a zero value defined and the distances between different measurements defined; they provide the greatest flexibility in statistical methods that can be used for analyzing the data.' FULL: [[0.00013800000000000002, 'Ratio measurements have a zero value defined and the defined distances between different measurements from it. they provide the greatest flexibility in statistical methods who can be used for analyzing the data.'], [0.000134, 'Ratio measurements have a defined zero value and the defined distances between different measurements from it. they provide the greatest flexibility in statistical methods who can be used for analyzing the data.'], [0.00012800000000000002, 'Ratio measurements have a zero value defined and the defined distances between different measurements from it. they provide the greatest flexibility in statistical methods which can be used for analyzing the data.'], [0.00012700000000000002, 'Ratio measurements have a defined zero value and the defined distances between different measurements from it. they provide the greatest flexibility in statistical methods which can be used for analyzing the data.'], [0.000118, 'Ratio measurements have a zero value defined and the defined distances between different measurements from it. they provide the greatest flexibility in statistical methods that can be used for analyzing the data.']] WITH CHUNKING: [[-9.698371964805688, 'Ratio measurements have a zero value defined and the defined distances between different measurements. they provide the greatest flexibility in statistical methods that can be used for analyzing the data.'], [-9.667728899274628, 'Ratio measurements have a defined zero value and the defined distances between different measurements. they provide the greatest flexibility in statistical methods which can be used for analyzing the data.'], [-9.62116949572788, 'Ratio measurements have a zero value defined and the defined distances between different measurements. they provide the greatest flexibility in statistical methods which can be used for analyzing the data.'], [-9.595552836413669, 'Ratio measurements have a defined zero value and the defined distances between different measurements. they provide the greatest flexibility in statistical methods who can be used for analyzing the data.'], [-9.54899343286692, 'Ratio measurements have a zero value defined and the defined distances between different measurements. they provide the greatest flexibility in statistical methods who can be used for analyzing the data.']] ORIGINAL: `Today, Esperanto is employed in world travel, correspondence, cultural exchange, conventions, literature, language instruction, television, and radio broadcasting.' FULL: None WITH CHUNKING: [[-1.6391242228044607, 'Today esperanto is employed in world travel. correspondence cultural exchange conventions literature language instruction television and radio broadcasting.']] ORIGINAL: `French is an official language of [[Haiti]], although it is mostly spoken by the [[upper class]], while [[Haitian Creole]] (a [[French-based creole language]]) is more widely spoken as a [[mother tongue]].' FULL: None WITH CHUNKING: [[-8.201831035403439, 'While haitian creole a french based creole language is more widely spoken as a mother tongue although it is spoken by the upper class mostly french is an official language of haiti.'], [-8.06325458703309, 'While haitian creole a french based creole language is spoken more widely as a mother tongue although it is spoken by the upper class mostly french is an official language of haiti.'], [-8.012275864044605, 'While haitian creole a french based creole language is spoken as a mother tongue widely more although it is spoken by the upper class mostly french is an official language of haiti.'], [-7.724340298360454, 'While haitian creole a french based creole language is spoken as a mother tongue more widely although it is mostly spoken by the upper class french is an official language of haiti.'], [-6.741259140005176, 'While haitian creole a french based creole language is spoken as a mother tongue more widely although it is spoken by the upper class mostly french is an official language of haiti.']] ORIGINAL: `Most German vocabulary is derived from the Germanic branch of the Indo-European language family, although there are significant minorities of words derived from Latin, and [[Greek language|Greek]], and a smaller amount from French and most recently English .' FULL: [[0.00011599999999999997, 'Although there are significant minorities of words derived from Latin and Greek and a smaller amount from French and English most recently most German vocabulary is derived from the Germanic branch of the Indo-European language family.'], [0.00010499999999999995, 'Although there are significant minorities of words derived from Latin and Greek and a smaller amount from French and most recently English most German vocabulary is derived from the Germanic branch of the Indo-European language family.'], [8.299999999999994e-05, 'Although there are significant minorities of words derived from Latin and Greek and a small more amount from French and English most recently most German vocabulary is derived from the Germanic branch of the Indo-European language family.'], [7.399999999999997e-05, 'Although there are significant minorities of words derived from Latin and Greek and a small more amount from French and most recently English most German vocabulary is derived from the Germanic branch of the Indo-European language family.'], [1.5000000000000004e-05, 'Although there are significant minorities of words derived from Latin and Greek and a smaller amount from French and English most recently most German vocabulary is derived from the Germanic branch of the language Indo-European family.']] WITH CHUNKING: [[-11.091013967106008, 'Most german vocabulary is derived from the germanic branch of the indo-european language family although there are significant minorities of words derived from latin and greek and a smaller amount from french and english most recently.'], [-10.44139077887614, 'Although there are significant minorities of words derived from latin and greek and a small more amount from french and most recently english most german vocabulary is derived from the germanic branch of the indo-european language family.'], [-10.085009133077067, 'Although there are significant minorities of words derived from latin and greek and a smaller amount from french and most recently english most german vocabulary is derived from the germanic branch of the indo-european language family.'], [-9.950708016232374, 'Although there are significant minorities of words derived from latin and greek and a small more amount from french and english most recently most german vocabulary is derived from the germanic branch of the indo-european language family.'], [-9.591164658709848, 'Although there are significant minorities of words derived from latin and greek and a smaller amount from french and english most recently most german vocabulary is derived from the germanic branch of the indo-european language family.']] ORIGINAL: `In a word like ''independently'', we say that the morphemes are ''in-'', ''depend'', ''-ent'', and ''ly''; ''depend'' is the [[root (linguistics)|root]] and the other morphemes are, in this case, derivational affixes.' FULL: None WITH CHUNKING: None ORIGINAL: `These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same [[type system]] as the application program.' FULL: [[0.050601999999999994, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring that the database uses the same type system as the application program.'], [0.0050669999999999995, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring the database uses the same type system as the application program.'], [0.002407, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring that the database uses the type system same as the application program.'], [0.000241, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring the database uses the type system same as the application program.']] WITH CHUNKING: [[-7.675104404726516, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring the database uses the type system same as the application program.'], [-5.438087346755498, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring that the database uses the type system same as the application program.'], [-5.3668731235834315, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring the database uses the same type system as the application program.'], [-3.1298560656124144, 'These databases attempt to bring the database world and the application programming world close more together in particular by ensuring that the database uses the same type system as the application program.']] ORIGINAL: `The following language groupings can serve as some linguistically significant examples of areal linguistic units, or ''[[sprachbund]]s'': [[Balkan linguistic union]], or the bigger group of [[European languages]]; [[Caucasian languages]]; [[East Asian languages]].' FULL: None WITH CHUNKING: None ORIGINAL: `In contrast, OpenOffice.org was used in [[2005]] by ''[[The Guardian]]'' newspaper to illustrate what it claims are the limitations of open-source software, although the article does finish by stating that the software may be better than MS Word for books.' FULL: None WITH CHUNKING: None ORIGINAL: `This could lead to the fact that all attempts at physically observing a particle with an "entangled" relationship to another are slowed down, even though the particles are not connected in any other way other than by the information they carry.' FULL: None WITH CHUNKING: [[-12.49063526925409, 'This could lead to the fact as if all attempts at observing a particle with a entangled relationship to another physically are slowed down even though the particles are not connected in any other way other than by the information which they do carry.'], [-12.49063526925409, 'This could lead to the fact as though all attempts at observing a particle with a entangled relationship to another physically are slowed down even though the particles are not connected in any other way other than by the information which they do carry.'], [-12.49063526925409, 'This could lead to the fact like all attempts at observing a particle with a entangled relationship to another physically are slowed down even though the particles are not connected in any other way other than by the information which they do carry.'], [-12.444903363044716, 'This could lead to the fact that all attempts at observing a particle with a entangled relationship to another physically are slowed down even though the particles are not connected in any other way other than by the information that they do carry.'], [-12.35288205688224, 'This could lead to the fact that all attempts at observing a particle with a entangled relationship to another physically are slowed down even though the particles are not connected in any other way other than by the information which they do carry.']] ORIGINAL: `Other key components of a Linux system may use other licenses; many libraries use the [[GNU Lesser General Public License]] (LGPL), a more permissive variant of the GPL, and the [[X Window System]] uses the [[MIT License]].' FULL: None WITH CHUNKING: None ORIGINAL: `A common abbreviation of this is .htm; it originates from older operating systems and file systems, such as the [[DOS]] versions from the 80s and early 90s and [[File Allocation Table|FAT]], which limit file extensions to three letters.' FULL: [[0.010844000000000001, 'A common abbreviation of this is .htm. it originates from older operating systems and file systems such as the Dos versions from the 80s and early 90s and fat which limit file extensions to three letters.'], [0.010844000000000001, 'A common abbreviation of this is .htm. it originates from elder operating systems and file systems such as the Dos versions from the 80s and early 90s and fat which limit file extensions to three letters.'], [0.008936, 'A common abbreviation of this is .htm. it originates from older operating systems and file systems such as the Dos versions from the 80s and early 90s and fat who limit file extensions to three letters.'], [0.008936, 'A common abbreviation of this is .htm. it originates from elder operating systems and file systems such as the Dos versions from the 80s and early 90s and fat who limit file extensions to three letters.'], [0.008884, 'A common abbreviation of this is .htm. it originates from older operating systems and file systems such as the Dos versions from the 80s and early 90s and fat that limit file extensions to three letters.']] WITH CHUNKING: [[-4.723599779217783, 'A common abbreviation of this is .htm. it originates from older operating systems and file systems such as the dos versions from the 80s and early 90s and fat that limit file extensions to three letters.'], [-4.71765705835905, 'A common abbreviation of this is .htm. it originates from elder operating systems and file systems such as the dos versions from the 80s and early 90s and fat who limit file extensions to three letters.'], [-4.71765705835905, 'A common abbreviation of this is .htm. it originates from older operating systems and file systems such as the dos versions from the 80s and early 90s and fat who limit file extensions to three letters.'], [-4.52417203477306, 'A common abbreviation of this is .htm. it originates from elder operating systems and file systems such as the dos versions from the 80s and early 90s and fat which limit file extensions to three letters.'], [-4.52417203477306, 'A common abbreviation of this is .htm. it originates from older operating systems and file systems such as the dos versions from the 80s and early 90s and fat which limit file extensions to three letters.']] ORIGINAL: `The first '''import''' statement directs the Java compiler to include the {{Javadoc:SE|java/awt|BorderLayout}} class from the {{Javadoc:SE|package=java.awt|java/awt}} package in the compilation; the second '''import''' includes all of the public classes and interfaces from the '''{{Javadoc:SE|package=javax.swing|javax/swing}}''' package.' FULL: [[2e-06, 'The first statement directs the Java compiler to include the class from the package in the compilation. the second includes all of the public classes and interfaces from the package.'], [1e-06, 'The first statement directs the Java compiler to include the class from the package in the compilation. the second includes all of the public classes and interfaces from the package.']] WITH CHUNKING: [[-2.287070522758145, 'The first statement directs the java compiler to include the class from the package in the compilation. the second includes all of the public classes and interfaces from the package.'], [-1.9088338260216338, 'The first statement directs the java compiler to include the class from the package in the compilation. the second includes all of the public classes and interfaces from the package.']] ORIGINAL: `This phenomenon may be caused by mixing the word-order pattern used for the word ''{{lang|de|weil}}'' with the pattern used for an alternative word for "because", ''{{lang|de|denn}}'', which is used with the main clause order (''"{{lang|de|…denn ich bin pleite.}}"'').' FULL: None WITH CHUNKING: None ORIGINAL: `Specifically, some countries and militaries contend the software can be used to pinpoint with near-precision accuracy the physical location of critical infrastructure, commercial and residential buildings, bases, government agencies, and so on.' FULL: [[0.018538, 'Specifically some countries and militaries contends that the software can be used to pinpoint the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [0.018538, 'Specifically some countries and militaries contends that the software can be used to pin-point the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [0.018538, 'Specifically some countries and militaries contends that the software can be used to pin point the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [0.018424, 'Specifically some countries and militaries contend that the software can be used to pinpoint the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [0.018424, 'Specifically some countries and militaries contend that the software can be used to pin-point the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.']] WITH CHUNKING: [[-3.6798833818266226, 'Specifically some countries and militaries contend that the software can be used to pin-point the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [-3.6798833818266226, 'Specifically some countries and militaries contend that the software can be used to pinpoint the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [-3.673747296658978, 'Specifically some countries and militaries contends that the software can be used to pin point the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [-3.673747296658978, 'Specifically some countries and militaries contends that the software can be used to pin-point the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.'], [-3.673747296658978, 'Specifically some countries and militaries contends that the software can be used to pinpoint the physical location of critical infrastructure commercial and residential buildings bases government agencies and so on with near precision accuracy.']] ORIGINAL: `* ''molib''—A robust commercial application toolkit library that abstracts the system calls through C++ objects (such as the file system, database system and thread implementation.).' FULL: [[0.063712, 'Molib. a robust commercial application toolkit library who abstracts the system calls through c++ objects such as the file system database system and thread implementation.'], [0.059739999999999994, 'Molib. a robust commercial application toolkit library which abstracts the system calls through c++ objects such as the file system database system and thread implementation.'], [0.05624400000000001, 'Molib. a robust commercial application toolkit library who abstracts the systems calls through c++ objects such as the file system database system and thread implementation.'], [0.05593299999999998, 'Molib. a robust commercial application toolkit library that abstracts the system calls through c++ objects such as the file system database system and thread implementation.'], [0.05274, 'Molib. a robust commercial application toolkit library which abstracts the systems calls through c++ objects such as the file system database system and thread implementation.']] WITH CHUNKING: None ORIGINAL: `It turns out that when you select the k largest singular values, and their corresponding singular vectors from U and V, you get the rank k approximation to X with the smallest error ([[Frobenius norm]]).' FULL: None WITH CHUNKING: None ORIGINAL: `In the early 80s, AI research was revived by the commercial success of [[expert systems]] (a form of AI program that simulated the knowledge and analytical skills of one or more human experts) and by 1985 the market for AI had reached more than a billion dollars.' FULL: [[0.0019100000000000005, 'In the early 80s AI research was revived by the commercial success of expert systems a form of AI program which simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for AI had reached more than one billion dollars.'], [0.0016689999999999997, 'In the early 80s AI research was revived by the commercial success of expert systems a form of AI program that simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for AI had reached more than one billion dollars.'], [0.0016100000000000003, 'In the early 80s AI research was revived by the commercial success of expert systems a form of AI program which simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for AI had reached more than a billion dollars.'], [0.0015289999999999998, 'In the early 80s AI research was revived by the commercial success of expert systems a form of AI program who simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for AI had reached more than one billion dollars.'], [0.001405, 'In the early 80s AI research was revived by the commercial success of expert systems a form of AI program that simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for AI had reached more than a billion dollars.']] WITH CHUNKING: [[-8.099421835697797, 'In the early 80s ai research was revived by the commercial success of expert systems a form of ai program that simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for ai had reached more than a billion dollars.'], [-8.014496834261951, 'In the early 80s ai research was revived by the commercial success of expert systems a form of ai program who simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for ai had reached more than one billion dollars.'], [-7.964859967627623, 'In the early 80s ai research was revived by the commercial success of expert systems a form of ai program which simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for ai had reached more than a billion dollars.'], [-7.927638187322256, 'In the early 80s ai research was revived by the commercial success of expert systems a form of ai program that simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for ai had reached more than one billion dollars.'], [-7.793076319252083, 'In the early 80s ai research was revived by the commercial success of expert systems a form of ai program which simulated the knowledge and analytical skills of one or more human experts and by 1985 the market for ai had reached more than one billion dollars.']] ORIGINAL: `Its speakers belong to some Schmiedleit, Lehrerleit, and Dariusleit Hutterite groups, but there are also speakers among the older generations of Prairieleit (the descendants of those Hutterites who chose not to settle in colonies).' FULL: [[0.0034980000000000002, 'Its speakers do belong to some Schmiedleit Lehrerleit and Dariusleit Hutterite groups but also there are speakers among the older generations of Prairieleit the descendants of those Hutteriteses which chose not to settle in colonies.'], [0.0034980000000000002, 'Its speakers do belong to some Schmiedleit Lehrerleit and Dariusleit Hutterite groups but also there are speakers among the elder generations of Prairieleit the descendants of those Hutteriteses which chose not to settle in colonies.'], [0.003089999999999999, 'Its speakers do belong to some Schmiedleit Lehrerleit and Dariusleit Hutterite groups but also there are speakers among the older generations of Prairieleit the descendants of those Hutteriteses that chose not to settle in colonies.'], [0.003089999999999999, 'Its speakers do belong to some Schmiedleit Lehrerleit and Dariusleit Hutterite groups but also there are speakers among the elder generations of Prairieleit the descendants of those Hutteriteses that chose not to settle in colonies.'], [0.002851999999999999, 'Its speakers do belong to some Schmiedleit Lehrerleit and Dariusleit Hutterite groups but also there are speakers among the older generations of Prairieleit the descendants of those Hutteriteses which did choose not to settle in colonies.']] WITH CHUNKING: [[-6.338318646413683, 'Its speakers belong to some schmiedleit lehrerleit and dariusleit hutterite groups but also there are speakers among the older generations of prairieleit the descendants of those hutteriteses which did choose not to settle in colonies.'], [-6.256223695704797, 'Its speakers belong to some schmiedleit lehrerleit and dariusleit hutterite groups but also there are speakers among the elder generations of prairieleit the descendants of those hutteriteses that chose not to settle in colonies.'], [-6.256223695704797, 'Its speakers belong to some schmiedleit lehrerleit and dariusleit hutterite groups but also there are speakers among the older generations of prairieleit the descendants of those hutteriteses that chose not to settle in colonies.'], [-6.133808322102918, 'Its speakers belong to some schmiedleit lehrerleit and dariusleit hutterite groups but also there are speakers among the elder generations of prairieleit the descendants of those hutteriteses which chose not to settle in colonies.'], [-6.133808322102918, 'Its speakers belong to some schmiedleit lehrerleit and dariusleit hutterite groups but also there are speakers among the older generations of prairieleit the descendants of those hutteriteses which chose not to settle in colonies.']] ORIGINAL: `The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no [[back-propagation]] of error signals through the connections of the network.' FULL: [[0.007047, 'The cascade correlation architecture has several advantages over existing algorithms : it learns very quickly the network determines its own size and topology even if the training set changes it retains the structures which it has built and it requires no back propagation of error signals through the connections of the network.'], [0.006247000000000001, 'The cascade correlation architecture has several advantages over existing algorithms : it learns very quickly the network determines its own size and topology even if the training set changes it retains the structures that it has built and it requires no back propagation of error signals through the connections of the network.'], [0.005613999999999998, 'The cascade correlation architecture has several advantages over existing algorithms : it learns very quickly the network determines its own size and topology even if the training set changes it retains the structures who it has built and it requires no back propagation of error signals through the connections of the network.'], [0.0008320000000000031, 'The cascade correlation architecture has several advantages over algorithms existing : it learns very quickly the network determines its own size and topology even if the training set changes it retains the structures which it has built and it requires no back propagation of error signals through the connections of the network.'], [0.0007340000000000034, 'The cascade correlation architecture has several advantages over algorithms existing : it learns very quickly the network determines its own size and topology even if the training set changes it retains the structures that it has built and it requires no back propagation of error signals through the connections of the network.']] WITH CHUNKING: None ORIGINAL: `* 1947: [[Hans Peter Luhn]] (research engineer at IBM since 1941) began work on a mechanized, punch card based system for searching chemical compounds.' FULL: None WITH CHUNKING: None ORIGINAL: `In 2002, Linux coordinator [[Linus Torvalds]] decided to use BitKeeper to develop the Linux kernel, a free software project, claiming no free software alternative met his needs.' FULL: None WITH CHUNKING: None ORIGINAL: `International auxiliary languages are generally constructed languages that strive to be easier to learn than natural languages; other constructed languages strive to be more logical ("loglangs") than natural languages; a prominent example of this is [[Lojban]].' FULL: None WITH CHUNKING: None ORIGINAL: `Different ontologies in the same domain can also arise due to different perceptions of the domain based on cultural background, education, ideology, or because a different representation language was chosen.' FULL: None WITH CHUNKING: None ORIGINAL: `* [[Supervised learning]], such as [[statistical classification|classification]] (be able to determine what category something belongs in, after seeing a number of examples of things from each category), or [[regression]] (given a set of numerical input/output examples, discover a continuous function that would generate the outputs from the inputs).' FULL: None WITH CHUNKING: None ORIGINAL: `Separation of functionality attempts to simply omit those subsets of functionality that are not capable from within certain client browsers or operating systems, while still delivering a ‘complete’ application to the user. (see also [[Separation of concerns]]).' FULL: [[0.006164000000000001, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. ¦isee also :i¦ separation of concerns.'], [0.006164000000000001, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. see also separation of concerns.'], [0.006164000000000001, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. see also : separation of concerns.'], [0.005342000000000002, 'While still delivering a complete application to the user separation of functionality does attempt to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. ¦isee also :i¦ separation of concerns.'], [0.005342000000000002, 'While still delivering a complete application to the user separation of functionality does attempt to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. see also separation of concerns.']] WITH CHUNKING: [[-5.010071863053341, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality who are not capable from within certain client browsers or operating systems simply. see also separation of concerns.'], [-5.010071863053341, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality who are not capable from within certain client browsers or operating systems simply. ¦isee also :i¦ separation of concerns.'], [-4.828763660002341, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. see also : separation of concerns.'], [-4.828763660002341, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. see also separation of concerns.'], [-4.828763660002341, 'While still delivering a complete application to the user separation of functionality attempts to omit those subsets of functionality which are not capable from within certain client browsers or operating systems simply. ¦isee also :i¦ separation of concerns.']] ORIGINAL: `For example, an article about "[[sport utility vehicle]]s" would also be [[tag (metadata)|tagged]] "4 wheel drives", "4WDs" and "four wheel drives", as this is how SUVs are known in some countries.' FULL: [[0.005335999999999999, 'For example as this is how suvs are known in some countries an article ’bout sport utility vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [0.005335999999999999, 'For example as this is how suvs are known in some countries an article ’bout sport utilities vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [0.005335999999999999, 'For example as this is how suvs are known in some countries an article ‘bout sport utility vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [0.005335999999999999, 'For example as this is how suvs are known in some countries an article ‘bout sport utilities vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [0.005335999999999999, 'For example as this is how suvs are known in some countries an article about sport utility vehicles would also be tagged four wheel drives 4wds and four wheel drives.']] WITH CHUNKING: [[-5.768289699003522, 'For example as this is how suvs are known in some countries an article about sport utility vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [-5.768289699003522, 'For example as this is how suvs are known in some countries an article ‘bout sport utilities vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [-5.768289699003522, 'For example as this is how suvs are known in some countries an article ‘bout sport utility vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [-5.768289699003522, 'For example as this is how suvs are known in some countries an article ’bout sport utilities vehicles would also be tagged four wheel drives 4wds and four wheel drives.'], [-5.768289699003522, 'For example as this is how suvs are known in some countries an article ’bout sport utility vehicles would also be tagged four wheel drives 4wds and four wheel drives.']] ORIGINAL: `That is, the lowest level is the first normal form, and the database cannot meet the requirements for higher level normal forms without first having met all the requirements of the lesser normal form.' FULL: [[2.5000000000000005e-05, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form firstly.'], [2.5000000000000005e-05, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form first.'], [1.6e-05, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal higher level forms without having met all of the requirements of the lesser normal form firstly.'], [1.6e-05, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal higher level forms without having met all of the requirements of the lesser normal form first.'], [8e-06, 'That is the lowest level is the normal first form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form firstly.']] WITH CHUNKING: [[-12.346834156285237, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal high more level forms without having been meeting all of the requirements of the lesser normal form firstly.'], [-12.331192554721012, 'That is the lowest level is the normal first form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form first.'], [-12.331192554721012, 'That is the lowest level is the normal first form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form firstly.'], [-11.88501611162594, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form first.'], [-11.88501611162594, 'That is the lowest level is the first normal form and the database can not meet the requirements for normal higher level forms without having been meeting all of the requirements of the lesser normal form firstly.']] ORIGINAL: `Google was co-founded by [[Larry Page]] and [[Sergey Brin]] while they were students at [[Stanford University]] and the company was first incorporated as a [[privately held company]] on [[September 7]], [[1998]].' FULL: None WITH CHUNKING: None ORIGINAL: `Traditionally, business analysts have performed the task of extracting useful [[information]] from recorded [[data]], but the increasing volume of data in modern business and science calls for computer-based approaches.' FULL: [[0.268405, 'Traditionally business analysts have performed the task of extracting useful information from recorded data but the volume of data in modern business and science increasing calls for computer based approaches.'], [0.034475, 'Traditionally business analysts have performed the task of extracting useful information from recorded data but the increasing volume of data in modern business and science calls for computer based approaches.'], [0.019481, 'Traditionally business analysts have performed the task of extracting useful information from data recorded but the volume of data in modern business and science increasing calls for computer based approaches.'], [0.012344, 'The task of extracting useful information from recorded data business analysts have performed traditionally but the volume of data in modern business and science increasing calls for computer based approaches.'], [0.0031030000000000003, 'The task of extracting useful information from recorded data business analysts have traditionally performed but the volume of data in modern business and science increasing calls for computer based approaches.']] WITH CHUNKING: [[-5.608909241196676, 'Traditionally business analysts have performed the task of extracting useful information from recorded data but the increasing volume of data in modern business and science calls for computer based approaches.'], [-5.581477579958933, 'The task of extracting useful information from recorded data business analysts have traditionally performed but the volume of data in modern business and science increasing calls for computer based approaches.'], [-4.08961542227138, 'Traditionally business analysts have performed the task of extracting useful information from data recorded but the volume of data in modern business and science increasing calls for computer based approaches.'], [-3.7432680089044466, 'The task of extracting useful information from recorded data business analysts have performed traditionally but the volume of data in modern business and science increasing calls for computer based approaches.'], [-2.9516894666022786, 'Traditionally business analysts have performed the task of extracting useful information from recorded data but the volume of data in modern business and science increasing calls for computer based approaches.']] ORIGINAL: `Although, in practice, regular grammars are commonly expressed using [[regular expression]]s, some forms of regular expression used in practice do not strictly generate the regular languages and do not show linear recognitional performance due to those deviations.' FULL: None WITH CHUNKING: [[-10.674280086720593, 'Although in practice using regular expressions regular grammars are expressed commonly. some forms of regular expression used in practice do not strictly generate the regular languages and do not show recognitional linear performance due to those deviations.'], [-10.662674540600284, 'Although in practice using regular expressions regular grammars are expressed commonly. some forms of regular expression used in practice do not generate the regular languages strictly and do not show linear recognitional performance due to those deviations.'], [-10.596916052662712, 'Although in practice regular grammars are commonly expressed using regular expressions. some forms of regular expression used in practice do not generate the regular languages strictly and do not show recognitional linear performance due to those deviations.'], [-10.481515651505449, 'Although in practice regular grammars are expressed commonly using regular expressions. some forms of regular expression used in practice do not generate the regular languages strictly and do not show recognitional linear performance due to those deviations.'], [-9.800109208082292, 'Although in practice using regular expressions regular grammars are expressed commonly. some forms of regular expression used in practice do not generate the regular languages strictly and do not show recognitional linear performance due to those deviations.']] ORIGINAL: `The ability for bi-directional flow of inputs between neurons/nodes was produced with the [[Hopfield net|Hopfield's network]] (1982), and specialization of these node layers for specific purposes was introduced through the first [[hybrid neural network|hybrid network]].' FULL: [[0.045410000000000006, "The ability for bi directional flow of inputs between neurons and nodes was produced with the Hopfield's network 1982 and specialization of these node layers for specific purposes was introduced through the first hybrid network."], [0.013840000000000009, "The ability for flow of inputs directional bi between neurons and nodes was produced with the Hopfield's network 1982 and specialization of these node layers for specific purposes was introduced through the first hybrid network."], [0.012087999999999993, "The ability for bi directional flow of inputs between neurons and nodes was produced with the Hopfield's network 1982 and specialization for specific purposes of these node layers was introduced through the first hybrid network."], [0.0049299999999999995, "The ability for flow of inputs between neurons and nodes directional bi was produced with the Hopfield's network 1982 and specialization of these node layers for specific purposes was introduced through the first hybrid network."], [0.0036809999999999994, "The ability for flow of inputs directional bi between neurons and nodes was produced with the Hopfield's network 1982 and specialization for specific purposes of these node layers was introduced through the first hybrid network."]] WITH CHUNKING: [[-9.621220993294177, "The ability for flow of inputs directional bi between neurons and nodes was produced with the hopfield's network 1982 and specialization for specific purposes of these node layers was introduced through the first hybrid network."], [-9.329581550556046, "The ability for flow of inputs between neurons and nodes directional bi was produced with the hopfield's network 1982 and specialization of these node layers for specific purposes was introduced through the first hybrid network."], [-8.433045703765604, "The ability for bi directional flow of inputs between neurons and nodes was produced with the hopfield's network 1982 and specialization for specific purposes of these node layers was introduced through the first hybrid network."], [-8.297907449235488, "The ability for flow of inputs directional bi between neurons and nodes was produced with the hopfield's network 1982 and specialization of these node layers for specific purposes was introduced through the first hybrid network."], [-7.109732159706914, "The ability for bi directional flow of inputs between neurons and nodes was produced with the hopfield's network 1982 and specialization of these node layers for specific purposes was introduced through the first hybrid network."]] ORIGINAL: `Estimating the probability of sequences can become difficult in [[corpora]], in which [[phrase]]s or [[Sentence (linguistics)|sentence]]s can be arbitrarily long and hence some sequences are not observed during [[training]] of the language model ([[data sparseness problem]] of [[overfitting]]).' FULL: None WITH CHUNKING: None ORIGINAL: `If we compress data in a manner that assumes ''q(X)'' is the distribution underlying some data, when, in reality, ''p(X)'' is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression.' FULL: None WITH CHUNKING: None ORIGINAL: `''J'' may also appear in many words from different dialects, but its use is discouraged in contemporary Italian, and it is not part of the standard 21-letter contemporary Italian alphabet.' FULL: None WITH CHUNKING: None ORIGINAL: `Using vibration monitoring, it can be observed that each tap change operation generates a signal that contains information about the condition of the tap changer contacts and the drive mechanisms.' FULL: None WITH CHUNKING: [[-3.27805283335603, 'Using vibration monitoring it can be observed that each tap change operation generates a signal that contains information about the condition of the tap changer contacts and the drive mechanisms.'], [-3.12477124095719, 'Using vibration monitoring it can be observed that each tap change operation generates a signal which contains information about the condition of the tap changer contacts and the drive mechanisms.'], [-2.7428849618993447, 'Using vibration monitoring it can be observed that each tap change operation generates a signal who contains information on the condition of the tap changer contacts and the drive mechanisms.'], [-2.6830153370816636, 'Using vibration monitoring it can be observed that each tap change operation generates a signal that contains information on the condition of the tap changer contacts and the drive mechanisms.'], [-2.529750600930901, 'Using vibration monitoring it can be observed that each tap change operation generates a signal which contains information on the condition of the tap changer contacts and the drive mechanisms.']] ORIGINAL: `It is functionally equivalent to the Unix version, and its user interface resembles the [[look and feel]] of that version; for example, the application uses its own [[menu bar]] instead of the OS X menu at the top of the screen.' FULL: [[0.05170300000000001, 'It is functionally equivalent to the Unix version and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the OS x menu at the top of the screen.'], [0.05170300000000001, 'It is functionally equivalent to the Unix version and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the OS X. menu at the top of the screen.'], [0.02166200000000002, 'It is equivalent to the Unix version functionally and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the OS x menu at the top of the screen.'], [0.02166200000000002, 'It is equivalent to the Unix version functionally and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the OS X. menu at the top of the screen.'], [0.002200000000000001, 'It functionally is equivalent to the Unix version and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the OS x menu at the top of the screen.']] WITH CHUNKING: [[-7.279313472444603, 'It functionally is equivalent to the unix version and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the os x. menu at the top of the screen.'], [-4.796135792263962, 'It is equivalent to the unix version functionally and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the os x menu at the top of the screen.'], [-4.796135792263962, 'It is equivalent to the unix version functionally and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the os x. menu at the top of the screen.'], [-4.457649196135014, 'It is functionally equivalent to the unix version and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the os x menu at the top of the screen.'], [-4.457649196135014, 'It is functionally equivalent to the unix version and its user interface resembles the look and feel of that version. for example the application uses its own menu bar instead of the os x. menu at the top of the screen.']] ORIGINAL: `Some see avoiding the VM in this manner as defeating the point of developing in Java; however it can be useful to provide both a generic [[bytecode]] version, as well as an optimised native code version of an application.' FULL: [[0.11831599999999998, 'Some see avoiding the vm in this manner as defeating the point of developing in Java. however it can be useful to provide a generic bytecode version as well as a optimised native code version of an application.'], [0.11797400000000002, 'Some see avoiding the vm in this manner as defeating the point of developing in Java. however it can be useful to provide a generic bytecode version as well as a native code version of an application optimised.'], [0.05248300000000001, 'Some see avoiding the vm in this manner as defeating of the point of developing in Java. however it can be useful to provide a generic bytecode version as well as a optimised native code version of an application.'], [0.052329000000000014, 'Some see avoiding the vm in this manner as defeating of the point of developing in Java. however it can be useful to provide a generic bytecode version as well as a native code version of an application optimised.'], [0.013165999999999999, 'Some see avoiding the vm in this manner as defeating the point of developing in Java. it can be useful to provide a generic bytecode version as well as a optimised native code version of an application however.']] WITH CHUNKING: [[-6.010087164649599, 'Some see avoiding the vm in this manner as defeating the point of developing in java. it can be useful to provide a generic bytecode version as well as a optimised native code version of an application however.'], [-3.0445909234845194, 'Some see avoiding the vm in this manner as defeating of the point of developing in java. however it can be useful to provide a generic bytecode version as well as a native code version of an application optimised.'], [-3.041682414192775, 'Some see avoiding the vm in this manner as defeating of the point of developing in java. however it can be useful to provide a generic bytecode version as well as a optimised native code version of an application.'], [-2.0571319569975675, 'Some see avoiding the vm in this manner as defeating the point of developing in java. however it can be useful to provide a generic bytecode version as well as a native code version of an application optimised.'], [-2.0542234477058225, 'Some see avoiding the vm in this manner as defeating the point of developing in java. however it can be useful to provide a generic bytecode version as well as a optimised native code version of an application.']] ORIGINAL: `It is a [[First language|native language]] of most of the population in Portugal (100%), Brazil (99%), Angola (60%), and São Tomé and Príncipe (50%), and it is spoken by a [[plurality]] of the population of Mozambique (40%), though only 6.5% are native speakers.' FULL: None WITH CHUNKING: None ORIGINAL: `The longest application has been in the use of [[screenreaders]] for people with [[visual impairment]], but text-to-speech systems are now commonly used by people with [[dyslexia]] and other reading difficulties as well as by pre-literate youngsters.' FULL: None WITH CHUNKING: [[-5.742271139595489, 'The longest application has been in the use of screenreaders for people with visual impairment but text-to-speech systems now are used by people with dyslexia and other reading difficulties as well as by pre-literate youngsters commonly.'], [-5.040439265405141, 'The longest application has been in the use of screenreaders for people with visual impairment but text-to-speech systems commonly are now used by people with dyslexia and other reading difficulties as well as by pre-literate youngsters.'], [-4.947276785846808, 'The longest application has been in the use of screenreaders for people with visual impairment but text-to-speech systems now are commonly used by people with dyslexia and other reading difficulties as well as by pre-literate youngsters.'], [-4.405187256296076, 'The longest application has been in the use of screenreaders for people with visual impairment but text-to-speech systems are now used commonly by people with dyslexia and other reading difficulties as well as by pre-literate youngsters.'], [-3.6016813347495087, 'The longest application has been in the use of screenreaders for people with visual impairment but text-to-speech systems are now used by people with dyslexia and other reading difficulties as well as by pre-literate youngsters commonly.']] ORIGINAL: `For example, a literal < normally indicates the start of a tag, and & normally indicates the start of a character entity reference or numeric character reference; writing it as &amp; or &#x26; or &#38; allows & to be included in the content of elements or the values of attributes.' FULL: [[0.0007750000000000004, 'For example a literal indicates the start of a tag normally and indicates the start of a character entity reference or numeric character reference normally. writing it as or or allows to be included in the content of elements or the values of attributes.'], [0.00032800000000000017, 'For example a literal normally indicates the start of a tag and indicates the start of a character entity reference or numeric character reference normally. writing it as or or allows to be included in the content of elements or the values of attributes.'], [5.599999999999999e-05, 'For example a literal indicates the start of a tag normally and normally indicates the start of a character entity reference or numeric character reference. writing it as or or allows to be included in the content of elements or the values of attributes.'], [2.3000000000000003e-05, 'For example a literal normally indicates the start of a tag and normally indicates the start of a character entity reference or numeric character reference. writing it as or or allows to be included in the content of elements or the values of attributes.']] WITH CHUNKING: [[-12.565675436404941, 'For example a literal normally indicates the start of a tag and normally indicates the start of a character entity reference or numeric character reference. writing it as or or allows to be included in the content of elements or the values of attributes.'], [-10.27284870543881, 'For example a literal indicates the start of a tag normally and normally indicates the start of a character entity reference or numeric character reference. writing it as or or allows to be included in the content of elements or the values of attributes.'], [-9.90394704786617, 'For example a literal normally indicates the start of a tag and indicates the start of a character entity reference or numeric character reference normally. writing it as or or allows to be included in the content of elements or the values of attributes.'], [-7.61112031690004, 'For example a literal indicates the start of a tag normally and indicates the start of a character entity reference or numeric character reference normally. writing it as or or allows to be included in the content of elements or the values of attributes.']] ORIGINAL: `{{transl|ja|''Bungo''}} was the main method of writing Japanese until about 1900; since then {{transl|ja|''kōgo''}} gradually extended its influence and the two methods were both used in writing until the 1940s.' FULL: [[0.13972199999999993, ' was the main method of writing Japanese until about 1900 since then extended its influence gradually and the two methods were both used in writing until the 1940s.'], [0.09450500000000005, ' was the main method of writing of Japanese until about 1900 since then extended its influence gradually and the two methods were both used in writing until the 1940s.'], [0.05845800000000001, ' was the main method of writing Japanese until about 1900 since then gradually extended its influence and the two methods were both used in writing until the 1940s.'], [0.03953899999999999, ' was the main method of writing of Japanese until about 1900 since then gradually extended its influence and the two methods were both used in writing until the 1940s.']] WITH CHUNKING: None ORIGINAL: `LSA can use a [[term-document matrix]] which describes the occurrences of terms in documents; it is a [[sparse matrix]] whose rows correspond to [[terminology|terms]] and whose columns correspond to documents, typically [[stemming|stemmed]] words that appear in the documents.' FULL: [[0.02268300000000005, 'LSA can use a term document matrix which describes the occurrences in documents of terms. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words that appear in the documents.'], [0.021955000000000047, 'LSA can use a term document matrix which describes the occurrences of terms in documents. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words that appear in the documents.'], [0.02127900000000004, 'LSA can use a term document matrix which describes the occurrences in documents of terms. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words which appear in the documents.'], [0.02059800000000003, 'LSA can use a term document matrix which describes the occurrences of terms in documents. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words which appear in the documents.'], [0.020523000000000024, 'LSA can use a term document matrix which describes the occurrences in documents of terms. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words who appear in the documents.']] WITH CHUNKING: [[-3.8862510596250464, 'Lsa can use a term document matrix which describes the occurrences in documents of terms. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words who appear in the documents.'], [-3.8819674071926933, 'Lsa can use a term document matrix which describes the occurrences of terms in documents. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words which appear in the documents.'], [-3.849491668901026, 'Lsa can use a term document matrix which describes the occurrences in documents of terms. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words which appear in the documents.'], [-3.8182889158198963, 'Lsa can use a term document matrix which describes the occurrences of terms in documents. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words that appear in the documents.'], [-3.7858131775282287, 'Lsa can use a term document matrix which describes the occurrences in documents of terms. it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents typically stemmed words that appear in the documents.']] ORIGINAL: `These two invasions caused English to become "mixed" to some degree (though it was never a truly mixed language in the strict linguistic sense of the word; mixed languages arise from the cohabitation of speakers of different languages, who develop a hybrid tongue for basic communication).' FULL: None WITH CHUNKING: [[-16.37351180285357, 'Though never was it a truly mixed language in the strict linguistic sense of the word these two invasions caused english to become mixed to some degree. mixed languages arise from the cohabitation of speakers of different languages that develop a hybrid tongue for basic communication.'], [-16.37047689915842, 'Though never was it a truly mixed language in the linguistic strict sense of the word these two invasions caused english to become mixed to some degree. mixed languages arise from the cohabitation of speakers of different languages who develop a hybrid tongue for basic communication.'], [-16.37047689915842, 'Though never was it a truly mixed language in the strict linguistic sense of the word these two invasions caused english to become mixed to some degree. mixed languages arise from the cohabitation of speakers of different languages who develop a hybrid tongue for basic communication.'], [-16.26284623496605, 'Though never was it a truly mixed language in the linguistic strict sense of the word these two invasions caused english to become mixed to some degree. mixed languages arise from the cohabitation of speakers of different languages which develop a hybrid tongue for basic communication.'], [-16.26284623496605, 'Though never was it a truly mixed language in the strict linguistic sense of the word these two invasions caused english to become mixed to some degree. mixed languages arise from the cohabitation of speakers of different languages which develop a hybrid tongue for basic communication.']] ORIGINAL: `In the [[scientific journal]] style, the expression ~ s i n \alpha~ means product of variables ~s~, ~i~, ~n~ and ~\alpha~, although in a slideshow, it may mean ~\sin[\alpha]~.' FULL: None WITH CHUNKING: None ORIGINAL: `This could be seen to inhibit commercial use of GPL'ed code by others wishing to use that code for proprietary purposes—if they don't wish to avail themselves of GPL'ed code, they will have to re-implement it themselves.' FULL: [[0.116636, 'This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. if they do not wish to avail themselves of gpl’ed code they will have to re-implement it themselves.'], [0.08039299999999999, 'This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. they will have to re-implement it themselves if they do not wish to avail themselves of gpl’ed code.'], [0.007095999999999998, "This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. if they don't wish to avail themselves of gpl’ed code they will have to re-implement it themselves."], [0.005697000000000003, "This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. they will have to re-implement it themselves if they don't wish to avail themselves of gpl’ed code."]] WITH CHUNKING: [[-6.798527425525783, "This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. they will have to re-implement it themselves if they don't wish to avail themselves of gpl’ed code."], [-5.647711688334722, "This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. if they don't wish to avail themselves of gpl’ed code they will have to re-implement it themselves."], [-3.498368097636982, 'This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. they will have to re-implement it themselves if they do not wish to avail themselves of gpl’ed code.'], [-2.3475523604459214, 'This could be seen to inhibit commercial use of gpl’ed code by others wishing to use that code for proprietary purposes. if they do not wish to avail themselves of gpl’ed code they will have to re-implement it themselves.']] ORIGINAL: `Such a language can be defined, then, without any [[reference]] to any [[meaning (linguistics)|meaning]]s of any of its expressions; it can exist before any [[formal interpretation]] is assigned to it -- that is, before it has any meaning.' FULL: [[0.042466000000000004, 'Such a language can be defined then without any reference to any meanings of any of its expressions. that is before any formal interpretation is assigned to it it can exist before it has any meaning.'], [0.021399999999999995, 'Such a language can be defined then without any reference to any meanings of any of its expressions. before it has any meaning that is before any formal interpretation is assigned to it it can exist.'], [0.016699999999999996, 'Such a language can be defined without any reference to any meanings of any of its expressions then. that is before any formal interpretation is assigned to it it can exist before it has any meaning.'], [0.016674, 'Such a language can be defined then without any reference to any meanings of any of its expressions. that is it can exist before any formal interpretation is assigned to it before it has any meaning.'], [0.008414999999999999, 'Such a language can be defined without any reference to any meanings of any of its expressions then. before it has any meaning that is before any formal interpretation is assigned to it it can exist.']] WITH CHUNKING: [[-7.523948464615264, 'It can exist that is before any formal interpretation is assigned to it.'], [-6.609625351832062, 'Before any formal interpretation is assigned to it it can exist that is.'], [-3.04499421304717, 'That is it can exist before any formal interpretation is assigned to it.'], [-2.1306711002639673, 'Before any formal interpretation is assigned to it that is it can exist.']] ORIGINAL: `A [[greedy algorithm]] is similar to a [[dynamic programming|dynamic programming algorithm]], but the difference is that solutions to the subproblems do not have to be known at each stage; instead a "greedy" choice can be made of what looks best for the moment.' FULL: [[0.206708, 'A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems do not have to be known at each stage. instead a greedy choice can be made of what looks best for the moment.'], [0.16723100000000005, 'A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems do not have to be known at each stage. instead a greedy choice can be made of what does look best for the moment.'], [0.007103, "A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems don't have to be known at each stage. instead a greedy choice can be made of what looks best for the moment."], [0.005748, "A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems don't have to be known at each stage. instead a greedy choice can be made of what does look best for the moment."]] WITH CHUNKING: [[-6.17546626301088, "A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems don't have to be known at each stage. instead a greedy choice can be made of what does look best for the moment."], [-5.963536588830848, "A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems don't have to be known at each stage. instead a greedy choice can be made of what looks best for the moment."], [-2.804536022300306, 'A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems do not have to be known at each stage. instead a greedy choice can be made of what does look best for the moment.'], [-2.592606348120274, 'A greedy algorithm is similar to a dynamic programming algorithm but the difference is that solutions to the subproblems do not have to be known at each stage. instead a greedy choice can be made of what looks best for the moment.']] ORIGINAL: `* [[Omnis Studio]]—A proprietary [[Integrated development environment|IDE]] or Rapid Application Development tool for creating enterprise and web applications for Windows, Linux, and Mac OS X.' FULL: [[0.04578600000000002, 'Omnis Studio. a proprietary ide or rapid application development tool for creating enterprise and web applications for Windows Linux and OS Mac x.'], [0.04578600000000002, 'Omnis Studio. a proprietary ide or rapid application development tool for creating enterprise and web applications for Windows Linux and Mac OS x.']] WITH CHUNKING: None ORIGINAL: `This resulted in a legal dispute with [[Microsoft]] after Sun claimed that the Microsoft implementation did not support the [[Java remote method invocation|RMI]] and [[Java Native Interface|JNI]] interfaces and had added platform-specific features of their own.' FULL: [[0.020173999999999997, 'This did result in a legal dispute with Microsoft after Sun did claim that the Microsoft implementation did not support the RMI and JNI interfaces and had added platform specific features of their own.'], [0.01685, 'This resulted in a legal dispute with Microsoft after Sun did claim that the Microsoft implementation did not support the RMI and JNI interfaces and had added platform specific features of their own.'], [0.012816, 'After Sun did claim that the Microsoft implementation did not support the RMI and JNI interfaces and had added platform specific features of their own this resulted in a legal dispute with Microsoft.'], [0.009963999999999995, 'This did result in a legal dispute with Microsoft after Sun claimed that the Microsoft implementation did not support the RMI and JNI interfaces and had added platform specific features of their own.'], [0.009607999999999997, 'After Sun did claim that the Microsoft implementation did not support the RMI and JNI interfaces and had added platform specific features of their own this did result in a legal dispute with Microsoft.']] WITH CHUNKING: [[-7.26531850574722, 'This resulted in a legal dispute with microsoft after sun claimed the microsoft implementation did not support the rmi and jni interfaces and had added platform specific features of their own.'], [-6.7248454899098835, "After sun claimed that the microsoft implementation didn't support the rmi and jni interfaces and had added platform specific features of their own this resulted in a legal dispute with microsoft."], [-6.350995392964016, 'After sun claimed the microsoft implementation did not support the rmi and jni interfaces and had added platform specific features of their own this resulted in a legal dispute with microsoft.'], [-4.1961649704595905, 'This resulted in a legal dispute with microsoft after sun claimed that the microsoft implementation did not support the rmi and jni interfaces and had added platform specific features of their own.'], [-3.2818418576763873, 'After sun claimed that the microsoft implementation did not support the rmi and jni interfaces and had added platform specific features of their own this resulted in a legal dispute with microsoft.']] ORIGINAL: `Some vernacular forms of French in Africa can be difficult to understand for French speakers from other countries but written forms of the language are very closely related to those of the rest of the French-speaking world.' FULL: None WITH CHUNKING: [[-12.642802655954883, 'French in africa can be difficult to understand some vernacular forms of for french speakers from other countries but forms of the language written are related to those of the rest of the french speaking world very closely.'], [-12.61424940274689, 'It can be difficult to understand some vernacular forms of french in africa for french speakers from other countries but written forms of the language are related to those of the rest of the french speaking world very closely.'], [-12.490713358739239, 'It can be difficult to understand some vernacular forms of french in africa for french speakers from other countries but forms of the language written are related to those of the rest of the french speaking world of very closely.'], [-12.413350318566149, 'It can be difficult to understand some vernacular forms of french in africa for french speakers from other countries but forms of the language written are very closely related to those of the rest of the french speaking world.'], [-11.413548276185107, 'It can be difficult to understand some vernacular forms of french in africa for french speakers from other countries but forms of the language written are related to those of the rest of the french speaking world very closely.']] ORIGINAL: `The distinction between “traditional” and “web” applications is not always unambiguous, however, because applications have many different features, installation methods and architectures; and some of these can overlap and occur in ways that blur the distinction.' FULL: None WITH CHUNKING: [[-11.420515243137853, 'However the distinction between traditional and web applications is not always unambiguous because applications have many different features installation methods and architectures and some of these can overlap and occur in ways that blur the distinction.'], [-11.350377969223022, 'However the distinction between traditional and web applications is not always unambiguous because applications have many different features installation methods and architectures and some of these can overlap and occur in ways which blur the distinction.'], [-10.013789723951085, 'Because applications have many different features installation methods and architectures however the distinction between traditional and web applications is not always unambiguous and some of these can overlap and occur in ways who blur the distinction.'], [-9.920665934741695, 'Because applications have many different features installation methods and architectures however the distinction between traditional and web applications is not always unambiguous and some of these can overlap and occur in ways that blur the distinction.'], [-9.850528660826862, 'Because applications have many different features installation methods and architectures however the distinction between traditional and web applications is not always unambiguous and some of these can overlap and occur in ways which blur the distinction.']] ORIGINAL: `Kyoto-Osaka-type dialects are in the central region, with borders roughly formed by [[Toyama Prefecture|Toyama]], [[Kyoto Prefecture|Kyōto]], [[Hyōgo Prefecture|Hyōgo]], and [[Mie Prefecture|Mie]] Prefectures; most [[Shikoku]] dialects are also that type.' FULL: [[0.20480400000000004, 'Kyoto-osaka- type dialects are in the central region with borders formed by Toyama Kyōto Hyōgo and Mie Prefectureses roughly. also most Shikoku dialects are that type.'], [0.14302499999999996, 'Kyoto-osaka- type dialects are in the central region with borders roughly formed by Toyama Kyōto Hyōgo and Mie Prefectureses. also most Shikoku dialects are that type.'], [0.067977, 'Kyoto-osaka- type dialects are in the central region with borders formed by Toyama Kyōto Hyōgo and Mie Prefectureses roughly. most Shikoku dialects are also that type.'], [0.047474, 'Kyoto-osaka- type dialects are in the central region with borders roughly formed by Toyama Kyōto Hyōgo and Mie Prefectureses. most Shikoku dialects are also that type.'], [0.026156999999999996, 'Kyoto-osaka- type dialects are in the central region with borders formed by Toyama Kyōto Hyōgo and Mie Prefectureses roughly. most Shikoku dialects also are that type.']] WITH CHUNKING: [[-4.299340499062751, 'Kyoto-osaka- type dialects are in the central region with borders formed by toyama kyōto hyōgo and mie prefectureses roughly. most shikoku dialects also are that type.'], [-4.154163407532387, 'Kyoto-osaka- type dialects are in the central region with borders roughly formed by toyama kyōto hyōgo and mie prefectureses. most shikoku dialects are also that type.'], [-3.795133085770038, 'Kyoto-osaka- type dialects are in the central region with borders formed by toyama kyōto hyōgo and mie prefectureses roughly. most shikoku dialects are also that type.'], [-1.695080815329859, 'Kyoto-osaka- type dialects are in the central region with borders roughly formed by toyama kyōto hyōgo and mie prefectureses. also most shikoku dialects are that type.'], [-1.3360504935675104, 'Kyoto-osaka- type dialects are in the central region with borders formed by toyama kyōto hyōgo and mie prefectureses roughly. also most shikoku dialects are that type.']] ORIGINAL: `The pronoun {{lang|es|''vosotros''}} is the plural form of {{lang|es|''tú''}} in most of Spain, but in the Americas (and certain southern Spanish cities such as [[Cádiz]] or [[Seville]], and in the [[Canary Islands]]) it is replaced with {{lang|es|''ustedes''}}.' FULL: [[0.007417999999999999, 'The pronoun is the plural form of in most of Spain but in the Americas and southern certain Spanish cities such as Cádiz or Seville and in the Canary islands it is replaced with .'], [0.007417999999999999, 'The pronoun is the plural form of in most of Spain but in the Americas and southern Spanish certain cities such as Cádiz or Seville and in the Canary islands it is replaced with .'], [0.007417999999999999, 'The pronoun is the plural form of in most of Spain but in the Americas and certain southern Spanish cities such as Cádiz or Seville and in the Canary islands it is replaced with .'], [0.007417999999999999, 'The pronoun is the plural form of in most of Spain but in the Americas and certain Spanish southern cities such as Cádiz or Seville and in the Canary islands it is replaced with .'], [0.007417999999999999, 'The pronoun is the plural form of in most of Spain but in the Americas and Spanish southern certain cities such as Cádiz or Seville and in the Canary islands it is replaced with .']] WITH CHUNKING: [[-6.162141716865998, 'The pronoun is the plural form of in most of spain but in the americas and certain spanish southern cities such as cádiz or seville and in the canary islands it is replaced with .'], [-6.162141716865998, 'The pronoun is the plural form of in most of spain but in the americas and southern certain spanish cities such as cádiz or seville and in the canary islands it is replaced with .'], [-6.162141716865998, 'The pronoun is the plural form of in most of spain but in the americas and southern spanish certain cities such as cádiz or seville and in the canary islands it is replaced with .'], [-6.162141716865998, 'The pronoun is the plural form of in most of spain but in the americas and spanish certain southern cities such as cádiz or seville and in the canary islands it is replaced with .'], [-6.162141716865998, 'The pronoun is the plural form of in most of spain but in the americas and spanish southern certain cities such as cádiz or seville and in the canary islands it is replaced with .']] ORIGINAL: `It is estimated that 12% (4,200) of common French words found in a typical [[dictionary]] such as the ''Petit Larousse'' or ''Micro-Robert Plus'' (35,000 words) are of foreign origin.' FULL: None WITH CHUNKING: None ORIGINAL: `An ontology about the domain of [[poker]] would model the "[[playing card]]" meaning of the word, while an ontology about the domain of [[computer hardware]] would model the "[[punch card]]" and "[[video card]]" meanings.' FULL: [[0.058058, 'While a ontology ’bout the domain of computer hardware would model the punch card and video card meanings a ontology ’bout the domain of poker would model the playing card meaning of the word.'], [0.058058, 'While a ontology ’bout the domain of computer hardware would model the punch card and video card meanings a ontology ‘bout the domain of poker would model the playing card meaning of the word.'], [0.058058, 'While a ontology ’bout the domain of computer hardware would model the punch card and video card meanings a ontology about the domain of poker would model the playing card meaning of the word.'], [0.058058, 'While a ontology ‘bout the domain of computer hardware would model the punch card and video card meanings a ontology ’bout the domain of poker would model the playing card meaning of the word.'], [0.058058, 'While a ontology ‘bout the domain of computer hardware would model the punch card and video card meanings a ontology ‘bout the domain of poker would model the playing card meaning of the word.']] WITH CHUNKING: [[-2.9869563385574223, 'While a ontology ‘bout the domain of computer hardware would model the punch card and video card meanings a ontology ‘bout the domain of poker would model the playing card meaning of the word.'], [-2.9869563385574223, 'While a ontology ‘bout the domain of computer hardware would model the punch card and video card meanings a ontology ’bout the domain of poker would model the playing card meaning of the word.'], [-2.9869563385574223, 'While a ontology ’bout the domain of computer hardware would model the punch card and video card meanings a ontology about the domain of poker would model the playing card meaning of the word.'], [-2.9869563385574223, 'While a ontology ’bout the domain of computer hardware would model the punch card and video card meanings a ontology ‘bout the domain of poker would model the playing card meaning of the word.'], [-2.9869563385574223, 'While a ontology ’bout the domain of computer hardware would model the punch card and video card meanings a ontology ’bout the domain of poker would model the playing card meaning of the word.']] ORIGINAL: `As these pidgins became the mother tongue of succeeding generations, they evolved into fully fledged [[creole language]]s, which remained in use in many parts of Asia and Africa until the 18th century.' FULL: [[0.022140000000000003, 'As these pidgins became the mother tongue of succeeding of generations they evolved into fully fledged creole languages who remained in use until the 18th century in many parts of Asia and Africa.'], [0.019516000000000006, 'As these pidgins became the mother tongue of succeeding of generations they evolved into fully fledged creole languages which remained in use until the 18th century in many parts of Asia and Africa.'], [0.016600999999999998, 'As these pidgins became the mother tongue of succeeding of generations they evolved into fully fledged creole languages that remained in use until the 18th century in many parts of Asia and Africa.'], [0.01525299999999999, 'As these pidgins became the mother tongue of succeeding of generations they evolved into creole languages which remained in use until the 18th century in many parts of Asia and Africa fledged fully.'], [0.014114999999999989, 'As these pidgins became the mother tongue of succeeding of generations they evolved into creole languages that remained in use until the 18th century in many parts of Asia and Africa fledged fully.']] WITH CHUNKING: [[-4.719351292661626, 'As these pidgins became the mother tongue of succeeding of generations they evolved into creole languages that remained in use until the 18th century in many parts of asia and africa fledged fully.'], [-4.642465104116964, 'As these pidgins became the mother tongue of succeeding of generations they evolved into creole languages which remained in use until the 18th century in many parts of asia and africa fledged fully.'], [-4.558040147174348, 'As these pidgins became the mother tongue of succeeding of generations they evolved into fully fledged creole languages that remained in use until the 18th century in many parts of asia and africa.'], [-4.396275121353386, 'As these pidgins became the mother tongue of succeeding of generations they evolved into fully fledged creole languages which remained in use until the 18th century in many parts of asia and africa.'], [-4.270279901474977, 'As these pidgins became the mother tongue of succeeding of generations they evolved into fully fledged creole languages who remained in use until the 18th century in many parts of asia and africa.']] ORIGINAL: `Thus Japanese, like [[Chinese language|Chinese]], [[Korean language|Korean]], and many other Asian languages, is often called a [[topic-prominent language]], which means it has a strong tendency to indicate the topic separately from the subject, and the two do not always coincide.' FULL: [[8e-06, 'Thus Japanese like Chinese Korean and many other Asian languages is often called a topic prominent language who means that it has a strong tendency to separately indicate the topic from the subject and that the two do not coincide always.'], [6.999999999999999e-06, 'Thus Japanese like Chinese Korean and many other Asian languages is often called a topic prominent language who means that it has a strong tendency to indicate the topic from the subject separately and that the two do not coincide always.'], [5.999999999999999e-06, 'Thus Japanese like Chinese Korean and many other Asian languages is often called a topic prominent language who means that it does have a strong tendency to separately indicate the topic from the subject and that the two do not coincide always.'], [5.999999999999999e-06, 'Thus Japanese like Chinese Korean and many other Asian languages is often called a topic prominent language who does mean that it does have a strong tendency to separately indicate the topic from the subject and that the two do not coincide always.'], [4.9999999999999996e-06, 'Thus Japanese like Chinese Korean and many other Asian languages is often called a topic prominent language who means that it does have a strong tendency to indicate the topic from the subject separately and that the two do not coincide always.']] WITH CHUNKING: [[-9.805733110667651, 'Thus japanese like chinese korean and many other asian languages is called a topic prominent language which means that it has a strong tendency to separately indicate the topic from the subject and that the two do not coincide always often.'], [-9.796189675086488, 'Thus japanese like chinese korean and many other asian languages is called a topic prominent language that means that it has a strong tendency to indicate the topic separately from the subject and that the two do not coincide always often.'], [-9.736453608060069, 'Thus japanese like chinese korean and many other asian languages is called a topic prominent language which means that it has a strong tendency to indicate the topic separately from the subject and that the two do not coincide always often.'], [-9.728893642688318, 'Thus japanese like chinese korean and many other asian languages is called a topic prominent language who means that it has a strong tendency to separately indicate the topic from the subject and that the two do not coincide always often.'], [-9.659614140080738, 'Thus japanese like chinese korean and many other asian languages is called a topic prominent language who means that it has a strong tendency to indicate the topic separately from the subject and that the two do not coincide always often.']] ORIGINAL: `Because these systems are limited by the words and phrases in their databases, they are not general-purpose and can only synthesize the combinations of words and phrases with which they have been preprogrammed.' FULL: None WITH CHUNKING: None ORIGINAL: `The early versions of Windows were often thought of as just graphical user interfaces, mostly because they ran on top of [[MS-DOS]] and used it for [[file system]] services.' FULL: [[0.055111000000000035, 'The early versions of Windows were thought of as just graphical user interfaces mostly because they ran on top of ms. Dos and used it for file systems services often.'], [0.055111000000000035, 'The early versions of Windows were thought of as just graphical user interfaces mostly because they ran on top of ms Dos and used it for file systems services often.'], [0.053655999999999995, 'The early versions of Windows were thought of as just graphical user interfaces mostly because they ran on top of ms. Dos and used it for file system services often.'], [0.053655999999999995, 'The early versions of Windows were thought of as just graphical user interfaces mostly because they ran on top of ms Dos and used it for file system services often.'], [0.017902, 'The early versions of Windows were mostly because they ran on top of ms. Dos and used it for file systems services thought of as just graphical user interfaces often.']] WITH CHUNKING: None ORIGINAL: `Initially, nobody registered it, but on [[August 15]] [[1994]], William R. Della Croce, Jr. filed for the trademark ''Linux'', and then demanded royalties from Linux distributors.' FULL: None WITH CHUNKING: None ORIGINAL: `In the 1994 film ''[[Street Fighter]]'', Esperanto is the native language of the fictional country of [[Shadaloo]], and in a barracks scene the soldiers of villain [[M. Bison]] sing a rousing Russian Army-style chorus, the "Bison Troopers Marching Song", in the language.' FULL: [[0.009389999999999999, 'In the 1994 film street fighter Esperanto is the native language of the fictional country of Shadaloo and in a barracks scene the soldiers of villain M. Bison sing a rousing Russian army style chorus the Bison Troopers marching song in the language.'], [0.007970999999999999, 'In the 1994 film street fighter Esperanto is the native language of the fictional country of Shadaloo and in a barracks scene the soldiers of villain m Bison sing a rousing Russian army style chorus the Bison Troopers marching song in the language.'], [0.005964000000000003, 'In the 1994 film street fighter Esperanto is the native language of the fictional country of Shadaloo and in a barracks scene the soldiers of villain M. Bison sing a Russian army style chorus rousing the Bison Troopers marching song in the language.'], [0.005058000000000001, 'In the 1994 film street fighter Esperanto is the native language of the fictional country of Shadaloo and in a barracks scene the soldiers of villain m Bison sing a Russian army style chorus rousing the Bison Troopers marching song in the language.'], [0.0009270000000000006, 'In the 1994 film street fighter Esperanto is the native language of the fictional country of Shadaloo and in a barracks scene the soldiers of villain M. Bison sing a rousing Russian army style chorus the marching Bison Troopers song in the language.']] WITH CHUNKING: [[-8.481038863748601, 'In the 1994 film street fighter esperanto is the native language of the fictional country of shadaloo and in a barracks scene the soldiers of villain m. bison sing a rousing russian army style chorus the marching bison troopers song in the language.'], [-6.78340823927904, 'In the 1994 film street fighter esperanto is the native language of the fictional country of shadaloo and in a barracks scene the soldiers of villain m bison sing a russian army style chorus rousing the bison troopers marching song in the language.'], [-6.618760894646439, 'In the 1994 film street fighter esperanto is the native language of the fictional country of shadaloo and in a barracks scene the soldiers of villain m. bison sing a russian army style chorus rousing the bison troopers marching song in the language.'], [-6.3289468451827195, 'In the 1994 film street fighter esperanto is the native language of the fictional country of shadaloo and in a barracks scene the soldiers of villain m bison sing a rousing russian army style chorus the bison troopers marching song in the language.'], [-6.164325325534203, 'In the 1994 film street fighter esperanto is the native language of the fictional country of shadaloo and in a barracks scene the soldiers of villain m. bison sing a rousing russian army style chorus the bison troopers marching song in the language.']] ORIGINAL: `We may suppose this paper is divided into squares like a child's arithmetic book....I assume then that the computation is carried out on one-dimensional paper, i.e., on a tape divided into squares.' FULL: None WITH CHUNKING: [[-3.2343977287671657, "We may suppose that this paper is divided into squares like a child's arithmetic book. i assume that then that the computation is carried out i. e. on a tape divided into squares on paper which is one dimensional."], [-3.2343977287671657, "We may suppose that this paper is divided into squares like a child's arithmetic book. i assume that then that the computation is carried out i.e on a tape divided into squares on paper which is one dimensional."], [-3.2343977287671657, "We may suppose that this paper is divided into squares like a child's arithmetic book. i assume that then that the computation is carried out i.e. on a tape divided into squares on paper which is one dimensional."], [-3.2343977287671657, "We may suppose that this paper is divided into squares like a child's arithmetic book. i assume that then that the computation is carried out ie on a tape divided into squares on paper which is one dimensional."], [-3.2343977287671657, "We may suppose that this paper is divided into squares like a child's arithmetic book. i assume that then that the computation is carried out ie. on a tape divided into squares on paper which is one dimensional."]] ORIGINAL: `In the 20th century, over 100,000 German [[Refugee|political refugees]] and invited entrepreneurs settled in [[Latin America]], such as [[Costa Rica]], [[Panama]], Venezuela and the Dominican Republic to establish German-speaking enclaves, and there is a reportedly small [[German immigration to Puerto Rico]].' FULL: [[0.0032840000000000074, 'In the 20th century over 100000 political German refugees and invited entrepreneurs settled in Latin America such as Costa Rica Panama Venezuela and the Dominican Republic to establish German speaking enclaves and there is a German immigration to Puerto Rico small reportedly.'], [0.0032840000000000074, 'In the 20th century over 100000 German political refugees and invited entrepreneurs settled in Latin America such as Costa Rica Panama Venezuela and the Dominican Republic to establish German speaking enclaves and there is a German immigration to Puerto Rico small reportedly.'], [0.002158000000000001, 'In the 20th century over 100000 political German refugees and invited entrepreneurs settled in Latin America such as Costa Rica Panama Venezuela and the Dominican Republic to establish German speaking enclaves and there is a German reportedly small immigration to Puerto Rico.'], [0.002158000000000001, 'In the 20th century over 100000 German political refugees and invited entrepreneurs settled in Latin America such as Costa Rica Panama Venezuela and the Dominican Republic to establish German speaking enclaves and there is a German reportedly small immigration to Puerto Rico.'], [0.0011919999999999962, 'In the 20th century over 100000 political German refugees and invited entrepreneurs settled such as Costa Rica Panama Venezuela and the Dominican Republic in Latin America to establish German speaking enclaves and there is a German immigration to Puerto Rico small reportedly.']] WITH CHUNKING: [[-8.153142871251216, 'In the 20th century to establish german speaking enclaves over 100000 political german refugees and invited entrepreneurs settled in latin america such as costa rica panama venezuela and the dominican republic and there is a german immigration to puerto rico small reportedly.'], [-7.644338856574561, 'In the 20th century over 100000 german political refugees and invited entrepreneurs settled in latin america such as costa rica panama venezuela and the dominican republic to establish german speaking enclaves and there is a german reportedly small immigration to puerto rico.'], [-7.644338856574561, 'In the 20th century over 100000 political german refugees and invited entrepreneurs settled in latin america such as costa rica panama venezuela and the dominican republic to establish german speaking enclaves and there is a german reportedly small immigration to puerto rico.'], [-7.227623990587343, 'In the 20th century over 100000 german political refugees and invited entrepreneurs settled in latin america such as costa rica panama venezuela and the dominican republic to establish german speaking enclaves and there is a german immigration to puerto rico small reportedly.'], [-7.227623990587343, 'In the 20th century over 100000 political german refugees and invited entrepreneurs settled in latin america such as costa rica panama venezuela and the dominican republic to establish german speaking enclaves and there is a german immigration to puerto rico small reportedly.']] ORIGINAL: `[[Linux Weekly News]] is a weekly digest of Linux-related news; the [[Linux Journal]] is an online magazine of Linux articles published monthly; [[Slashdot]] is a technology-related news website with many stories on Linux and open source software; [[Groklaw]] has written in depth about Linux-related legal proceedings and there are many articles relevant to the Linux kernel and its relationship with [[GNU]] on the [[GNU Project|GNU project's]] website.' FULL: [[9.999999999999999e-06, "Linux weekly news is a weekly digest of Linux related news the Linux journal is an online magazine of Linux articles published monthly Slashdot is a technology related news website with many stories on Linux and open source software Groklaw has written legal about Linux related in-depth proceedings and there are many articles relevant to the Linux kernel and its relationship with GNU on the GNU project's website."], [9.999999999999999e-06, "Linux weekly news is a weekly digest of Linux related news the Linux journal is an online magazine of Linux articles published monthly Slashdot is a technology related news website with many stories on Linux and open source software Groklaw has written in-depth about Linux related legal proceedings and there are many articles relevant to the Linux kernel and its relationship with GNU on the GNU project's website."], [8e-06, "Linux weekly news is a weekly digest of Linux related news the Linux journal is an online magazine of Linux articles published monthly Slashdot is a technology related news website with many stories on Linux and open source software Groklaw has written in-depth legal about Linux related proceedings and there are many articles relevant to the Linux kernel and its relationship with GNU on the GNU project's website."], [8e-06, "Linux weekly news is a weekly digest of Linux related news the Linux journal is an online magazine of Linux articles published monthly Slashdot is a technology related news website with many stories on Linux and open source software Groklaw has written about Linux related legal in-depth proceedings and there are many articles relevant to the Linux kernel and its relationship with GNU on the GNU project's website."], [8e-06, "Linux weekly news is a weekly digest of Linux related news the Linux journal is an online magazine of Linux articles published monthly Slashdot is a technology related news website with many stories on Linux and open source software Groklaw has written about Linux related in-depth legal proceedings and there are many articles relevant to the Linux kernel and its relationship with GNU on the GNU project's website."]] WITH CHUNKING: None ORIGINAL: `The choice of syntax is affected by both [[linguistic]] and computational concerns; for instance some parsing systems use [[lexical functional grammar]], but in general, parsing for grammars of this type is known to be [[NP-complete]].' FULL: [[0.000386, 'The choice of syntax is affected by linguistic and computational concerns for instance some parsing systems use lexical functional grammar but in general parsing for grammars of this type is known to be NP-complete.'], [0.000386, 'The choice of syntax is affected by linguistic and computational concerns for instance some parsing systems use functional lexical grammar but in general parsing for grammars of this type is known to be NP-complete.'], [9.999999999999999e-05, 'The choice of syntax is affected by concerns linguistic and computational for instance some parsing systems use lexical functional grammar but in general parsing for grammars of this type is known to be NP-complete.'], [9.999999999999999e-05, 'The choice of syntax is affected by concerns linguistic and computational for instance some parsing systems use functional lexical grammar but in general parsing for grammars of this type is known to be NP-complete.']] WITH CHUNKING: None ORIGINAL: `IBM has been known through most of its recent history as the world's largest computer company; with over 388,000 employees worldwide, IBM is the largest [[information technology]] employer in the world.' FULL: None WITH CHUNKING: None ORIGINAL: `Since the early 1960s, with the availability of [[Oracle machine|oracle]]s for certain [[combinatorial game]]s, also called [[tablebase]]s (e.g. for 3x3-chess) with any beginning configuration, small-board [[dots-and-boxes]], small-board-hex, and certain endgames in chess, dots-and-boxes, and hex; a new area for data mining has been opened up.' FULL: None WITH CHUNKING: None ORIGINAL: `* The vocabulary, diacritic letters, and grammar are too dissimilar from the major Western European languages, and therefore Esperanto is not as easy as it could be for speakers of those languages to learn.' FULL: None WITH CHUNKING: [[-7.17649712767141, "The vocabulary diacritic letters and grammar are too dissimilar from the major western european languages and therefore esperanto isn't as easy as it could be for speakers of those languages to learn."], [-6.039245072538175, 'The vocabulary diacritic letters and grammar is too dissimilar from the major western european languages and therefore esperanto is not as easy as it could be for speakers of those languages to learn.'], [-5.643760816630717, 'The vocabulary diacritic letters and grammar am too dissimilar from the major western european languages and therefore esperanto is not as easy as it could be for speakers of those languages to learn.'], [-5.595179398939447, 'The vocabulary diacritic letters and grammar are too dissimilar from the western european major languages and therefore esperanto is not as easy as it could be for speakers of those languages to learn.'], [-3.403317107360678, 'The vocabulary diacritic letters and grammar are too dissimilar from the major western european languages and therefore esperanto is not as easy as it could be for speakers of those languages to learn.']] ORIGINAL: `Given the complexity of NLP problems, it is often difficult to predict performance only on the basis of glass-box evaluation, but this type of evaluation is more informative with respect to error analysis or future developments of a system.' FULL: None WITH CHUNKING: [[-8.701345033691078, 'Given the complexity of nlp problems performance is difficult to predict only on the basis of glass box evaluation often but this type of evaluation is more informative with respect to error analysis or future developments of a system.'], [-6.913221342288738, 'Given the complexity of nlp problems it is difficult to predict performance only on the basis of glass box evaluation often but this type of evaluation is informative more with respect to error analysis or future developments of a system.'], [-6.63805652817913, 'Given the complexity of nlp problems it is often difficult to predict performance only on the basis of glass box evaluation but this type of evaluation is informative more with respect to error analysis or future developments of a system.'], [-5.9270104151344025, 'Given the complexity of nlp problems it is difficult to predict performance only on the basis of glass box evaluation often but this type of evaluation is more informative with respect to error analysis or future developments of a system.'], [-5.651845601024795, 'Given the complexity of nlp problems it is often difficult to predict performance only on the basis of glass box evaluation but this type of evaluation is more informative with respect to error analysis or future developments of a system.']] ORIGINAL: `One of those new optional requirements, sometimes referred to as the Affero clause, is intended to fulfill a request regarding [[software as a service]]; the permitting addition of this requirement makes GPLv3 compatible with the [[Affero General Public License]].' FULL: None WITH CHUNKING: [[-12.886491575054281, 'One of those optional new requirements sometimes referred as the affero clause is intended to fulfill a request regarding software as a service. the permitting addition of this requirement makes gplv3 compatible with the affero public general license.'], [-12.702031331914164, 'One of those optional new requirements referred sometimes as the affero clause is intended to fulfill a request regarding software as a service. the permitting addition of this requirement makes gplv3 compatible with the affero general public licence.'], [-12.702031331914164, 'One of those optional new requirements referred sometimes as the affero clause is intended to fulfill a request regarding software as a service. the permitting addition of this requirement makes gplv3 compatible with the affero general public license.'], [-12.702031331914164, 'One of those optional new requirements referred sometimes as the affero clause is intended to fulfill a request regarding software as a service. the permitting addition of this requirement makes gplv3 compatible with the affero public general licence.'], [-12.702031331914164, 'One of those optional new requirements referred sometimes as the affero clause is intended to fulfill a request regarding software as a service. the permitting addition of this requirement makes gplv3 compatible with the affero public general license.']] ORIGINAL: `Linguistic structures are pairings of meaning and form (which may consist of sound patterns, movements of the hand, written symbols, and so on); such pairings are known as [[Ferdinand de Saussure|Saussurean]] [[linguistic sign|signs]].' FULL: [[0.008239, 'Linguistic structures are pairings of meaning and form which may consist of sound patterns movements of the hand written symbols and so on. such pairings are known as saussurean signs.'], [0.007609, 'Linguistic structures are pairings of meaning and form that may consist of sound patterns movements of the hand written symbols and so on. such pairings are known as saussurean signs.'], [0.007089000000000001, 'Linguistic structures are pairings of meaning and form who may consist of sound patterns movements of the hand written symbols and so on. such pairings are known as saussurean signs.'], [0.006619, 'Linguistic structures are pairings of meaning and form which may consist of sound patterns movements of the hand symbols written and so on. such pairings are known as saussurean signs.'], [0.0061140000000000005, 'Linguistic structures are pairings of meaning and form that may consist of sound patterns movements of the hand symbols written and so on. such pairings are known as saussurean signs.']] WITH CHUNKING: [[-5.438728183096516, 'Linguistic structures are pairings of meaning and form that may consist of sound patterns movements of the hand symbols written and so on. such pairings are known as saussurean signs.'], [-5.359365105428373, 'Linguistic structures are pairings of meaning and form which may consist of sound patterns movements of the hand symbols written and so on. such pairings are known as saussurean signs.'], [-5.290765119692279, 'Linguistic structures are pairings of meaning and form who may consist of sound patterns movements of the hand written symbols and so on. such pairings are known as saussurean signs.'], [-5.219977649370773, 'Linguistic structures are pairings of meaning and form that may consist of sound patterns movements of the hand written symbols and so on. such pairings are known as saussurean signs.'], [-5.140430429231706, 'Linguistic structures are pairings of meaning and form which may consist of sound patterns movements of the hand written symbols and so on. such pairings are known as saussurean signs.']] ORIGINAL: `A typical Spanish word is stressed on the [[syllable]] before the last if it ends with a vowel (not including ''y'') or with a vowel followed by ''n'' or ''s''; it is stressed on the last syllable otherwise.' FULL: None WITH CHUNKING: None ORIGINAL: `Use of HTML in e-mail is controversial because of compatibility issues, because it can be used in [[phishing]]/privacy attacks, because it can confuse [[E-Mail spam|spam]] filters, and because the message size is larger than plain text.' FULL: [[0.2768469999999999, 'Use of HTML in email is controversial because of compatibility issues because because it can confuse spam filters it can be used in phishing and privacy attacks and because the message size is larger than plain text.'], [0.06492699999999998, 'Use of HTML in email is controversial because of compatibility issues because it can be used in phishing and privacy attacks because it can confuse spam filters and because the message size is larger than plain text.']] WITH CHUNKING: [[-4.2855804808753515, 'Use of html in email is and is controversial because of compatibility issues because it can be used in phishing and privacy attacks because it can confuse spam filters.'], [-3.251325919243519, 'Use of html in email is and is controversial because of compatibility issues because because it can confuse spam filters it can be used in phishing and privacy attacks.'], [-2.7857311724791933, 'Because it can be used in phishing and privacy attacks because it can confuse spam filters use of html in email is and is controversial because of compatibility issues.'], [-1.751476610847361, 'Because because it can confuse spam filters it can be used in phishing and privacy attacks use of html in email is and is controversial because of compatibility issues.']] ORIGINAL: `According to industry experts, at its inception, speech recognition (SR) was sold as a way to completely eliminate transcription rather than make the transcription process more efficient, hence it was not accepted.' FULL: None WITH CHUNKING: [[-3.1338798467186315, 'According to industry experts at its inception speech recognition sr was sold as a way to eliminate transcription from completely rather than make more efficient the transcription process. hence it was not accepted.'], [-3.0301557292047585, 'According to industry experts at its inception speech recognition sr was sold as a way to eliminate transcription completely rather than make the transcription process efficient more. hence it was not accepted.'], [-2.6817992881123343, 'According to industry experts at its inception speech recognition sr was sold as a way to much rather eliminate transcription from completely than make be more efficient the transcription process. hence it was not accepted.'], [-2.6817992881123343, 'According to industry experts at its inception speech recognition sr was sold as a way to rather eliminate transcription from completely than make be more efficient the transcription process. hence it was not accepted.'], [-2.2603171441330137, 'According to industry experts at its inception speech recognition sr was sold as a way to eliminate transcription from completely rather than make be more efficient the transcription process. hence it was not accepted.']] ORIGINAL: `* [[Python (programming language)|Python]]—A modern [[scripting language]] where the focus is on [[rapid application development]] and ease-of-writing, instead of program run-time efficiency.' FULL: [[0.072816, 'Python. a modern scripting language where the focus does on rapid application development and ease of writing instead of program run time efficiency.'], [0.044854, 'Python. a modern scripting language where the focus is on rapid application development and ease of writing instead of program run time efficiency.'], [0.029448000000000002, 'Python. a modern scripting language where the focus does on rapid application development and ease of writing instead of run time program efficiency.'], [0.01814, 'Python. a modern scripting language where the focus is on rapid application development and ease of writing instead of run time program efficiency.']] WITH CHUNKING: None ORIGINAL: `By the 1980s, however, progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition, especially [[machine perception|perception]], [[robotics]], [[machine learning|learning]] and [[pattern recognition]].' FULL: [[0.05237, 'By the 1980s progress in symbolic AI seemed to stall and many believed that symbolic systems would be never able to imitate all of the processes of human cognition especially perception robotics learning and pattern recognition.'], [0.049510000000000005, 'By the 1980s progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all of the processes of human cognition especially perception robotics learning and pattern recognition.'], [0.048675, 'By the 1980s progress in symbolic AI seemed to stall and many believed that symbolic systems would be never able to imitate all the processes of human cognition especially perception robotics learning and pattern recognition.'], [0.046015, 'By the 1980s progress in symbolic AI seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition especially perception robotics learning and pattern recognition.'], [0.008060999999999999, 'By the 1980s progress in symbolic AI seemed to stall and many believed symbolic systems would never be able to imitate all of the processes of human cognition especially perception robotics learning and pattern recognition.']] WITH CHUNKING: [[-7.056385448008053, 'By the 1980s progress in symbolic ai seemed to stall and many believed symbolic systems would be never able to imitate all of the processes of human cognition especially perception robotics learning and pattern recognition.'], [-4.943462851138275, 'By the 1980s progress in symbolic ai seemed to stall and many believed that symbolic systems would never be able to imitate all the processes of human cognition especially perception robotics learning and pattern recognition.'], [-4.870259879601274, 'By the 1980s progress in symbolic ai seemed to stall and many believed that symbolic systems would never be able to imitate all of the processes of human cognition especially perception robotics learning and pattern recognition.'], [-4.0110433549247215, 'By the 1980s progress in symbolic ai seemed to stall and many believed that symbolic systems would be never able to imitate all the processes of human cognition especially perception robotics learning and pattern recognition.'], [-3.937846845105483, 'By the 1980s progress in symbolic ai seemed to stall and many believed that symbolic systems would be never able to imitate all of the processes of human cognition especially perception robotics learning and pattern recognition.']] ORIGINAL: `This strategy is arguably the most complicated and expensive way to fulfill cross-platform capability, since even different versions of the same client browser (within the same operating system) can differ dramatically between each other.' FULL: None WITH CHUNKING: [[-7.974175924543923, 'Since even different versions of the same client browser within the same operating system can differ dramatically between one another this strategy arguably is the most complicated and expensive way to fulfill cross platform capability.'], [-7.301396348750686, 'Since even different versions of the same client browser within the same operating system can differ between each other dramatically arguably this strategy is the most complicated and expensive way to fulfill cross platform capability.'], [-7.301396348750686, 'Since even different versions of the same client browser within the same operating system can differ between one another dramatically arguably this strategy is the most complicated and expensive way to fulfill cross platform capability.'], [-6.834373347643088, 'Since even different versions of the same client browser within the same operating system can differ dramatically between each other arguably this strategy is the most complicated and expensive way to fulfill cross platform capability.'], [-6.834373347643088, 'Since even different versions of the same client browser within the same operating system can differ dramatically between one another arguably this strategy is the most complicated and expensive way to fulfill cross platform capability.']] ORIGINAL: `A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the Euclidean distance or the squared Euclidean distance.' FULL: None WITH CHUNKING: [[-8.868349839196712, 'A review of cluster analysis in health psychology research found that the commonest distance measure in published studies in that research area is the euclidean distance or the euclidean distance squared.'], [-8.828344504583013, 'A review of cluster analysis in health psychology research found that the commonest distance measure in published studies in that research area is the euclidean distance or the euclidean squared distance.'], [-7.849969240532854, 'A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the euclidean distance or the squared euclidean distance.'], [-7.689171693483993, 'A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the euclidean distance or the euclidean distance squared.'], [-7.644174327553257, 'A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the euclidean distance or the euclidean squared distance.']] ORIGINAL: `In 1997, the [[Science Citation Index]] reported that 95% of its articles were written in English, even though only half of them came from authors in English-speaking countries.' FULL: None WITH CHUNKING: None ORIGINAL: `Although the members of each group are not closely [[genetic relatedness of languages|genetically related]], there is a reason for them to share similar features, namely: their speakers have been in contact for a long time within a common community and the languages ''converged'' in the course of the history.' FULL: None WITH CHUNKING: None ORIGINAL: `For example, the study of [[computer hardware]] is usually considered part of [[computer engineering]], while the study of commercial [[computer system]]s and their deployment is often called [[information technology]] or [[information systems]].' FULL: [[0.04732900000000002, 'For example while the study of commercial computer systems and their deployment is called information technology or information systems often the study of computer hardware is usually considered part of computer engineering.'], [0.022052999999999996, 'For example while the study of commercial computer systems and their deployment is called information technology or information systems often usually the study of computer hardware is considered part of computer engineering.'], [0.016490000000000015, 'For example while the study of commercial computer systems and their deployment is often called information technology or information systems the study of computer hardware is usually considered part of computer engineering.'], [0.01325899999999999, 'For example the study of computer hardware is usually considered part of computer engineering while the study of commercial computer systems and their deployment is called information technology or information systems often.'], [0.007684000000000002, 'For example while the study of commercial computer systems and their deployment is often called information technology or information systems usually the study of computer hardware is considered part of computer engineering.']] WITH CHUNKING: [[-5.611496777336156, 'For example while the study of commercial computer systems and their deployment is often called information technology or information systems usually the study of computer hardware is considered part of computer engineering.'], [-5.311225848701955, 'For example the study of computer hardware is usually considered part of computer engineering while the study of commercial computer systems and their deployment is called information technology or information systems often.'], [-4.317502479679944, 'For example while the study of commercial computer systems and their deployment is called information technology or information systems often the study of computer hardware is usually considered part of computer engineering.'], [-3.7745850661903404, 'For example usually the study of computer hardware is considered part of computer engineering while the study of commercial computer systems and their deployment is called information technology or information systems often.'], [-2.780861697168329, 'For example while the study of commercial computer systems and their deployment is called information technology or information systems often usually the study of computer hardware is considered part of computer engineering.']] ORIGINAL: `Some written languages like [[Chinese language|Chinese]], [[Japanese language|Japanese]] and [[Thai language|Thai]] do not have single-word boundaries either, so any significant text [[parsing]] usually requires the identification of word boundaries, which is often a non-trivial task.' FULL: None WITH CHUNKING: None ORIGINAL: `The data is often found to contain considerable variability, or [[noise]], and thus [[Hidden Markov model]] and [[change-point analysis]] methods are being developed to infer real [[copy number variation|copy number]] changes.' FULL: None WITH CHUNKING: [[-8.879251865755144, 'The datum is found to contain considerable variability or noise often and to infer real copy number changes hidden markov model and change point analysis methods are being developed.'], [-8.877941363757841, 'The data is found to contain considerable variability or noise often and hidden markov model and change point analysis methods are being developed to infer real copy numbers changes.'], [-8.80648535366086, 'The datum is found to contain considerable variability or noise often and hidden markov model and change point analysis methods are being developed to infer real copy number changes.'], [-8.773279542185913, 'The datum is found to contain considerable variability or noise often and to infer real copy numbers changes hidden markov model and change point analysis methods are being developed.'], [-8.700063186276715, 'The datum is found to contain considerable variability or noise often and hidden markov model and change point analysis methods are being developed to infer real copy numbers changes.']] ORIGINAL: `Since 1991, when Brazil signed into the economic market of Mercosul with other South American nations, such as Argentina, Uruguay, and Paraguay, there has been an increase in interest in the study of Portuguese in those South American countries.' FULL: [[0.12246800000000002, 'Since 1991 when Brazil signed into the economic market of mercosul with South American other nations such as Argentina Uruguay and Paraguay there has been an increase in interest in the study of Portuguese in those South American countries.'], [0.080231, 'Since 1991 when Brazil signed with South American other nations such as Argentina Uruguay and Paraguay into the economic market of mercosul there has been an increase in interest in the study of Portuguese in those South American countries.'], [0.064351, 'Since 1991 when Brazil signed into the economic market of mercosul with other South American nations such as Argentina Uruguay and Paraguay there has been an increase in interest in the study of Portuguese in those South American countries.'], [0.04215800000000001, 'Since 1991 when Brazil signed with other South American nations such as Argentina Uruguay and Paraguay into the economic market of mercosul there has been an increase in interest in the study of Portuguese in those South American countries.'], [0.011497, 'Since 1991 there has been an increase in interest in the study of Portuguese in those South American countries when Brazil signed with South American other nations such as Argentina Uruguay and Paraguay into the economic market of mercosul.']] WITH CHUNKING: None ORIGINAL: `Since variables conforming only to nominal or ordinal measurements cannot be reasonably measured numerically, sometimes they are called together as categorical variables, whereas ratio and interval measurements are grouped together as quantitative or [[continuous variables]] due to their numerical nature.' FULL: None WITH CHUNKING: [[-7.993901214208519, 'Since variables only conforming to nominal or ordinal measurements can not be measured reasonably numerically whereas ratio and interval measurements are grouped together due to their numerical nature as quantitative or continuous variables together they are called as categorical variables sometimes.'], [-7.941982080930774, 'Since variables only conforming to nominal or ordinal measurements can not be measured numerically reasonably whereas ratio and interval measurements are grouped together as quantitative or continuous variables due to their numerical nature together they are called as categorical variables sometimes.'], [-7.941982080930774, 'Since variables only conforming to nominal or ordinal measurements can not be measured reasonably numerically whereas ratio and interval measurements are grouped together as quantitative or continuous variables due to their numerical nature together they are called as categorical variables sometimes.'], [-7.539632802103448, 'Since variables only conforming to nominal or ordinal measurements can not be measured numerically reasonably whereas ratio and interval measurements are grouped as quantitative or continuous variables due to their numerical nature together together they are called as categorical variables sometimes.'], [-7.539632802103448, 'Since variables only conforming to nominal or ordinal measurements can not be measured reasonably numerically whereas ratio and interval measurements are grouped as quantitative or continuous variables due to their numerical nature together together they are called as categorical variables sometimes.']] ORIGINAL: `These detection methods simultaneously measure several hundred thousand sites throughout the genome, and when used in high-throughput to measure thousands of samples, generate [[terabyte]]s of data per experiment.' FULL: [[0.015009, 'These detection methods measure several hundred thousand sites throughout the genome simultaneously and when used in high throughput to measure thousands of samples generate terabytes of data per experiment.'], [0.009486, 'These detection methods simultaneously measure several hundred thousand sites throughout the genome and when used in high throughput to measure thousands of samples generate terabytes of data per experiment.'], [0.0026330000000000004, 'These detection methods measure several hundred thousand sites throughout the genome simultaneously and when used in high throughput to measure thousands of samples generate terabytes per experiment of data.'], [0.0016639999999999997, 'These detection methods simultaneously measure several hundred thousand sites throughout the genome and when used in high throughput to measure thousands of samples generate terabytes per experiment of data.']] WITH CHUNKING: [[-10.121265235774342, 'These detection methods simultaneously measure several hundred thousand sites throughout the genome and when used in high throughput to measure thousands of samples generate terabytes per experiment of data.'], [-8.381344462115607, 'These detection methods simultaneously measure several hundred thousand sites throughout the genome and when used in high throughput to measure thousands of samples generate terabytes of data per experiment.'], [-7.190120575269146, 'These detection methods measure several hundred thousand sites throughout the genome simultaneously and when used in high throughput to measure thousands of samples generate terabytes per experiment of data.'], [-5.450199801610412, 'These detection methods measure several hundred thousand sites throughout the genome simultaneously and when used in high throughput to measure thousands of samples generate terabytes of data per experiment.']] ORIGINAL: `#Achievement of very high recognition accuracy (95% or more) was the most critical factor for making the speech recognition system useful — with lower recognition rates, pilots would not use the system.' FULL: None WITH CHUNKING: None ORIGINAL: `That is, a script written in [[Python (programming language)|Python]] for a [[Unix-like]] system will likely run with little or no modification on [[Microsoft Windows|Windows]], because Python also runs on [[Microsoft Windows|Windows]]; there is also more than one implementation of Python that will run the same scripts (e.g., [[IronPython]] for [[.NET Framework|.NET]]).' FULL: None WITH CHUNKING: [[-5.647404459531294, 'That is likely a script written in python for a system which will be unix like will run with little or no modification on windows because also python runs on windows.'], [-4.649546455874269, 'Because also python runs on windows that is a script written in python for a system who will be unix like will likely run with little or no modification on windows.'], [-4.433181724404821, 'Because also python runs on windows that is a script written in python for a system which will be unix like will likely run with little or no modification on windows.'], [-4.364137519533191, 'Because also python runs on windows that is likely a script written in python for a system who will be unix like will run with little or no modification on windows.'], [-4.147555151135135, 'Because also python runs on windows that is likely a script written in python for a system which will be unix like will run with little or no modification on windows.']] ORIGINAL: `If a different menu entry such as "Paste" is chosen, the software may execute the instructions to copy the text from the clipboard data area to a specific location in the same or another document in memory.' FULL: None WITH CHUNKING: None ORIGINAL: `Italian is widely taught in many schools around the world, but rarely as the first non-native language of pupils; in fact, Italian generally is the fourth or fifth most taught second-language in the world.' FULL: None WITH CHUNKING: None ORIGINAL: `Front-End SR is where the provider dictates into a speech-recognition engine, the recognized words are displayed right after they are spoken, and the dictator is responsible for editing and signing off on the document.' FULL: None WITH CHUNKING: None ORIGINAL: `Computer science has many sub-fields; some emphasize the computation of specific results (such as [[computer graphics]]), while others relate to properties of [[computational problem]]s (such as [[computational complexity theory]]).' FULL: [[0.030567999999999998, 'Computer science has many sub-fields. some emphasize the computation of specific results such as computer graphics while others relate to properties of computational problems such as computational complexity theory.'], [0.023122, 'Computer science has many sub-fields. while others relate to properties of computational problems such as computational complexity theory some emphasize the computation of specific results such as computer graphics.'], [0.0024879999999999998, 'Computer science has many sub-fields. some emphasize the computation of specific results such as computer graphics while others relate properties of computational problems such as computational complexity theory.'], [0.0019229999999999998, 'Computer science has many sub-fields. while others relate properties of computational problems such as computational complexity theory some emphasize the computation of specific results such as computer graphics.']] WITH CHUNKING: [[-7.064144335214481, 'Computer science has many sub-fields. some emphasize the computation of specific results such as computer graphics while others relate properties of computational problems such as computational complexity theory.'], [-5.564295026818321, 'Computer science has many sub-fields. while others relate properties of computational problems such as computational complexity theory some emphasize the computation of specific results such as computer graphics.'], [-4.681073178329771, 'Computer science has many sub-fields. some emphasize the computation of specific results such as computer graphics while others relate to properties of computational problems such as computational complexity theory.'], [-3.1812238699336124, 'Computer science has many sub-fields. while others relate to properties of computational problems such as computational complexity theory some emphasize the computation of specific results such as computer graphics.']] ORIGINAL: `All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate [[signal (information theory)|signal]] from [[noise]] in high-throughput gene expression studies.' FULL: [[0.123779, 'All of these techniques are noise extremely prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.'], [0.06637399999999999, 'All of these techniques are extremely noise prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.'], [0.040392000000000004, 'All these techniques are noise extremely prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.'], [0.021662, 'All these techniques are extremely noise prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.']] WITH CHUNKING: [[-19.354842948731104, 'All these techniques are extremely noise prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.'], [-18.652474000617932, 'All these techniques are noise extremely prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.'], [-17.739840995699186, 'All of these techniques are extremely noise prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.'], [-17.037485231066775, 'All of these techniques are noise extremely prone and subject to bias in the biological measurement and a major research area in computational biology involves developing statistical tools to separate signal from noise in high throughput gene expression studies.']] ORIGINAL: `The use of '''ß''' has recently been limited by the latest German spelling reform and is no longer used for '''ss''' at the end of a syllable; Switzerland and Liechtenstein already abolished it in 1934.' FULL: None WITH CHUNKING: None ORIGINAL: `With the advent of digital copiers in the mid-1980s this technical restriction had largely disappeared; at roughly the same time, the 13-bar logo was abandoned for almost the opposite reason it was difficult to render accurately on the low-resolution digital printers (240 dots per inch) of the time.' FULL: None WITH CHUNKING: None ORIGINAL: `They have been used for example for extracting features for clustering large sets of satellite earth images and for determining what part of the Earth a particular image came from.' FULL: [[0.00046000000000000007, 'They have been used for extracting features for clustering large sets of satellite Earth images and for determining what part of the Earth a particular image came from for example.'], [0.00025800000000000004, 'They have been used for extracting features for clustering large sets of satellite Earth images and for determining which part of the Earth a particular image came from for example.'], [9.3e-05, 'They have been used for extracting features for clustering large sets of satellite Earth images and for determining from what part of the Earth a particular image did come for example.'], [8.1e-05, 'They have been used for extracting of features for clustering large sets of satellite Earth images and for determining what part of the Earth a particular image came from for example.'], [5.4999999999999995e-05, 'They have been used for extracting features for clustering large sets of satellite Earth images and for determining from which part of the Earth a particular image did come for example.']] WITH CHUNKING: [[-10.338606254578426, 'They have been used for extracting of features for clustering large sets of satellite earth images and for determining that from what part of the earth did a particular image come for example.'], [-9.427939447343892, 'They have been used for extracting features for clustering large sets of satellite earth images and for determining that which part of the earth did a particular image come from for example.'], [-9.211564200548965, 'They have been used for extracting features for clustering large sets of satellite earth images and for determining that what part of the earth did a particular image come from for example.'], [-9.21074867532065, 'They have been used for extracting features for clustering large sets of satellite earth images and for determining that from which part of the earth did a particular image come for example.'], [-8.860045453183977, 'They have been used for extracting features for clustering large sets of satellite earth images and for determining that from what part of the earth did a particular image come for example.']] ORIGINAL: `They have ''formal'' properties, like what kinds of [[morphology (linguistics)|morphological]] [[prefix]]es or [[suffix]]es they take and what kinds of other expressions they combine with; but they also have [[semantics|semantic]] properties, i.e. properties pertaining to their meaning.' FULL: [[4.7999999999999994e-05, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they do have semantic properties ie. properties pertaining to their meaning.'], [4.7999999999999994e-05, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they do have semantic properties ie properties pertaining to their meaning.'], [4.7999999999999994e-05, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they do have semantic properties i.e. properties pertaining to their meaning.'], [4.7999999999999994e-05, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they do have semantic properties i.e properties pertaining to their meaning.'], [4.7999999999999994e-05, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they do have semantic properties i. e. properties pertaining to their meaning.']] WITH CHUNKING: [[-10.182126163876235, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they have semantic properties i. e. properties pertaining to their meaning.'], [-10.182126163876235, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they have semantic properties i.e properties pertaining to their meaning.'], [-10.182126163876235, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they have semantic properties i.e. properties pertaining to their meaning.'], [-10.182126163876235, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they have semantic properties ie properties pertaining to their meaning.'], [-10.182126163876235, 'They have formal properties like what kinds of morphological prefixes or suffixes they do take and what kinds of other expressions they combine with but also they have semantic properties ie. properties pertaining to their meaning.']] ORIGINAL: `However, it is important to note that this study does not apply to Windows XP systems running the Service Pack 2 update (released in late 2004), which vastly improved the security of Windows XP.' FULL: [[0.07966700000000003, 'However it is important to note that this study does not apply to Windows XP systems running the Service Pack two update released in late 2004 which did improve the security of Windows XP vastly.'], [0.07502500000000015, 'However it is important whether to note that this study does not apply to Windows XP systems running the Service Pack two update released in late 2004 which did improve the security of Windows XP vastly.'], [0.06647, 'However it is important to note that this study does not apply to Windows XP systems running the Service Pack two update released in late 2004 who did improve the security of Windows XP vastly.'], [0.06254100000000007, 'However it is important whether to note that this study does not apply to Windows XP systems running the Service Pack two update released in late 2004 who did improve the security of Windows XP vastly.'], [0.062073, 'However it is important to note that this study does not apply to Windows XP systems running the Service Pack two update released in late 2004 that did improve the security of Windows XP vastly.']] WITH CHUNKING: [[-2.6428146506335257, 'However it is important to note that this study does not apply to windows xp systems running the service pack two update released in late 2004 that did improve the security of windows xp vastly.'], [-2.635659304854204, 'However it is important whether to note that this study does not apply to windows xp systems running the service pack two update released in late 2004 who did improve the security of windows xp vastly.'], [-2.576315281805023, 'However it is important to note that this study does not apply to windows xp systems running the service pack two update released in late 2004 who did improve the security of windows xp vastly.'], [-2.459311778677237, 'However it is important whether to note that this study does not apply to windows xp systems running the service pack two update released in late 2004 which did improve the security of windows xp vastly.'], [-2.3999677556280554, 'However it is important to note that this study does not apply to windows xp systems running the service pack two update released in late 2004 which did improve the security of windows xp vastly.']] ORIGINAL: `Copyleft thus uses copyright law to accomplish the opposite of its usual purpose: instead of imposing restrictions, it grants rights to other people, in a way that ensures the rights cannot subsequently be taken away.' FULL: None WITH CHUNKING: [[-6.10614159922908, 'Thus copyleft uses © law to accomplish the opposite of its usual purpose :. instead of imposing on restrictions it does grant rights to other people in a way who does ensure that the rights can not be taken away subsequently.'], [-6.031171178876594, 'Thus copyleft uses copyright law to accomplish the opposite of its usual purpose :. instead of imposing on restrictions it does grant rights to other people in a way that does ensure that the rights can not be taken away subsequently.'], [-6.031171178876594, 'Thus copyleft uses © law to accomplish the opposite of its usual purpose :. instead of imposing on restrictions it does grant rights to other people in a way that does ensure that the rights can not be taken away subsequently.'], [-5.8884850799776896, 'Thus copyleft uses copyright law to accomplish the opposite of its usual purpose :. instead of imposing on restrictions it does grant rights to other people in a way which does ensure that the rights can not be taken away subsequently.'], [-5.8884850799776896, 'Thus copyleft uses © law to accomplish the opposite of its usual purpose :. instead of imposing on restrictions it does grant rights to other people in a way which does ensure that the rights can not be taken away subsequently.']] ORIGINAL: `French and German are not official languages nor recognised minority languages in the [[Flemish Region]], although along borders with the Walloon and Brussels-Capital regions, there are a dozen of [[municipalities with language facilities]] for French-speakers; a mirroring situation exists for the Walloon Region with respect to the Dutch and German languages.' FULL: None WITH CHUNKING: [[-8.11235214547432, 'Although along borders with the walloon and brussels capital regions there are one dozen of municipalities with language facilities for french speakers french and german are not official languages nor recognized minority languages in the flemish region. a mirroring situation exists with respect to the dutch and german languages for the walloon region.'], [-6.873626115074121, 'Although along borders with the walloon and brussels capital regions there are a dozen of municipalities with language facilities for french speakers french and german are not neither official languages nor recognized minority languages in the flemish region. a mirroring situation exists for the walloon region with respect to the dutch and german languages.'], [-6.873626115074121, 'Although along borders with the walloon and brussels capital regions there are one dozen of municipalities with language facilities for french speakers french and german are not neither official languages nor recognized minority languages in the flemish region. a mirroring situation exists for the walloon region with respect to the dutch and german languages.'], [-6.698347307944344, 'Although along borders with the walloon and brussels capital regions there are a dozen of municipalities with language facilities for french speakers french and german are not neither official languages nor recognized minority languages in the flemish region. a mirroring situation exists with respect to the dutch and german languages for the walloon region.'], [-6.698347307944344, 'Although along borders with the walloon and brussels capital regions there are one dozen of municipalities with language facilities for french speakers french and german are not neither official languages nor recognized minority languages in the flemish region. a mirroring situation exists with respect to the dutch and german languages for the walloon region.']] ORIGINAL: `They hypothesized that a search engine that analyzed the relationships between websites would produce better ranking of results than existing techniques, which ranked results according to the number of times the search term appeared on a page.' FULL: None WITH CHUNKING: None ORIGINAL: `Torvalds has publicly stated that he would not move the Linux kernel (currently licensed under GPL version 2) to version 3 of the GPL, released in mid-2007, specifically citing some provisions in the new license which prohibit the use of the software in [[digital rights management]].' FULL: None WITH CHUNKING: None ORIGINAL: `For example, if (X,Y) represents the position of a [[chess]] piece — X the row and Y the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.' FULL: None WITH CHUNKING: [[-inf, 'For example if represents the position of a chess piece the row and the column then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.'], [-inf, 'For example then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece if represents the position of a chess piece the row and the column.']] ORIGINAL: `The ambiguity becomes even worse, if ~|x\rangle~ is used for the states with certain value of the coordinate, and ~|p\rangle~ means the state with certain value of the momentum, which may be used in books on [[quantum mechanics]].' FULL: None WITH CHUNKING: [[-12.046018069245395, 'If is used for the states with certain value of the co-ordinate and means the state which with certain value of the momentum may be used in books on quantum mechanics the ambiguity becomes even worse.'], [-12.046018069245395, 'If is used for the states with certain value of the coordinate and means the state which with certain value of the momentum may be used in books on quantum mechanics the ambiguity becomes even badder.'], [-12.046018069245395, 'If is used for the states with certain value of the coordinate and means the state which with certain value of the momentum may be used in books on quantum mechanics the ambiguity becomes even worse.'], [-12.046018069245395, 'If is used for the states with certain value of the coördinate and means the state which with certain value of the momentum may be used in books on quantum mechanics the ambiguity becomes even badder.'], [-12.046018069245395, 'If is used for the states with certain value of the coördinate and means the state which with certain value of the momentum may be used in books on quantum mechanics the ambiguity becomes even worse.']] ORIGINAL: `The backpropagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signalling was not obvious at the time, but most importantly because there was no plausible source for the 'teaching' or 'target' signal.' FULL: None WITH CHUNKING: None ORIGINAL: `These machines commonly run [[Microsoft Windows]], though they can run other [[operating system]]s as well, including [[Linux]], [[OpenBSD]], [[NetBSD]], [[Mac OS X]] and [[FreeBSD]].' FULL: [[0.21590900000000002, 'Though they can run other operating systems as well including Linux OpenBSD NetBSD Mac OS x and FreeBSD these machines run Microsoft Windows commonly.'], [0.21590900000000002, 'Though they can run other operating systems as well including Linux OpenBSD NetBSD Mac OS X. and FreeBSD these machines run Microsoft Windows commonly.'], [0.015595, 'Though they can run other operating systems as well including Linux OpenBSD NetBSD Mac OS x and FreeBSD these machines commonly run Microsoft Windows.'], [0.015595, 'Though they can run other operating systems as well including Linux OpenBSD NetBSD Mac OS X. and FreeBSD these machines commonly run Microsoft Windows.'], [0.005592000000000002, 'These machines run Microsoft Windows commonly though they can run other operating systems as well including Linux OpenBSD NetBSD Mac OS x and FreeBSD.']] WITH CHUNKING: [[-3.367438806574847, 'Though they can run other operating systems as well including linux openbsd netbsd mac os x. and freebsd these machines commonly run microsoft windows.'], [-3.352093182672458, 'These machines run microsoft windows commonly though they can run other operating systems as well including linux openbsd netbsd mac os x and freebsd.'], [-3.352093182672458, 'These machines run microsoft windows commonly though they can run other operating systems as well including linux openbsd netbsd mac os x. and freebsd.'], [-1.8522438742762999, 'Though they can run other operating systems as well including linux openbsd netbsd mac os x and freebsd these machines run microsoft windows commonly.'], [-1.8522438742762999, 'Though they can run other operating systems as well including linux openbsd netbsd mac os x. and freebsd these machines run microsoft windows commonly.']] ORIGINAL: `In BRP, if this iotated vowel {{IPA|/ju/}} occurs after {{IPA|/t/}}, {{IPA|/d/}}, {{IPA|/s/}} or {{IPA|/z/}}, it often triggers palatalization of the preceding consonant, turning it to {{IPA|/ʨ/}}, {{IPA|/ʥ/}}, {{IPA|/ɕ/}} and {{IPA|/ʑ/}} respectively, as in ''tune'', ''during'', ''sugar'', and ''azure''.' FULL: None WITH CHUNKING: None ORIGINAL: `These distinctive inputs and outputs perform the function of the input and output layers of a feed-forward or simple recurrent network, and also join all the other neurons in the recurrent processing.' FULL: [[0.052152000000000004, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or simple recurrent network and also join all of the other neurons in the recurrent processing.'], [0.028685000000000002, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or simple recurrent network and also join all the other neurons in the recurrent processing.'], [0.013272, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or recurrent simple network and also join all of the other neurons in the recurrent processing.'], [0.007297000000000002, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or recurrent simple network and also join all the other neurons in the recurrent processing.']] WITH CHUNKING: [[-6.241300742074525, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or recurrent simple network and also join all the other neurons in the recurrent processing.'], [-5.5680172395785785, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or recurrent simple network and also join all of the other neurons in the recurrent processing.'], [-4.872781513619283, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or simple recurrent network and also join all the other neurons in the recurrent processing.'], [-4.199498011123335, 'These distinctive inputs and outputs perform the function of the input and output layers of a feed forward or simple recurrent network and also join all of the other neurons in the recurrent processing.']] ORIGINAL: `A common point of confusion is that [[mail merge]] to generate emails requires the Java API JavaMail in [[StarOffice]]; however, as of version 2.0.1, OpenOffice.org uses a [[Python (programming language)|Python]]-component instead.' FULL: [[0.7175039999999999, 'A common point of confusion is that mail merge to generate emails requires the Java API JavaMail in StarOffice. however as of version 2.0.1 OpenOffice.org uses a python- component instead.']] WITH CHUNKING: [[-0.12835623545214223, 'A common point of confusion is that mail merge to generate emails requires the java api javamail in staroffice. however as of version 2.0.1 openoffice.org uses a python- component instead.']] ORIGINAL: `Another notable program, known as [[Jabberwacky]], also deals in strong AI, as it is claimed to learn new responses based on user interactions, rather than being driven from a static database like many other existing chatterbots.' FULL: None WITH CHUNKING: None ORIGINAL: `Other factors were that around the same time, the Hanseatic league lost its importance as new trade routes to [[Asia]] and the [[Americas]] were established, and that the most powerful German states of that period were located in Middle and Southern Germany.' FULL: [[0.0062099999999999985, 'Other factors were that around the same time as new trade routes to Asia and the Americas were established the hanseatic league lost its importance and that the most powerful German states of that period were located in middle and southern Germany.'], [0.004924, 'Other factors were like around the same time as new trade routes to Asia and the Americas were established the hanseatic league lost its importance and that the most powerful German states of that period were located in middle and southern Germany.'], [0.004924, 'Other factors were as though around the same time as new trade routes to Asia and the Americas were established the hanseatic league lost its importance and that the most powerful German states of that period were located in middle and southern Germany.'], [0.004924, 'Other factors were as if around the same time as new trade routes to Asia and the Americas were established the hanseatic league lost its importance and that the most powerful German states of that period were located in middle and southern Germany.'], [0.0032829999999999995, 'Other factors were that around the same time as new trade routes to Asia and the Americas were established the hanseatic league lost its importance and that the German most powerful states of that period were located in middle and southern Germany.']] WITH CHUNKING: None ORIGINAL: `The part of morphology that covers the relationship between [[syntax]] and morphology is called morphosyntax, and it concerns itself with inflection and paradigms, but not with word-formation or compounding.' FULL: [[0.014877000000000001, 'The part of morphology which covers the relationship between syntax and morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [0.012281999999999996, 'The part of morphology who covers the relationship between syntax and morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [0.012147000000000008, 'The part of morphology that covers the relationship between syntax and morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [1.1000000000000001e-05, 'The part which covers the relationship between syntax and morphology of morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [1.1000000000000001e-05, 'The part that covers the relationship between syntax and morphology of morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.']] WITH CHUNKING: [[-12.829120849272504, 'The part that covers the relationship between syntax and morphology of morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [-12.759325087336963, 'The part which covers the relationship between syntax and morphology of morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [-5.823397370840034, 'The part of morphology that covers the relationship between syntax and morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [-5.797943843501315, 'The part of morphology who covers the relationship between syntax and morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.'], [-5.610087578079829, 'The part of morphology which covers the relationship between syntax and morphology is called morphosyntax and it concerns itself with inflection and paradigms but not with word formation or compounding.']] ORIGINAL: `Following the apparent [[reverse engineering]] of BitKeeper's protocols, McVoy withdrew permission for gratis use by free software projects, leading the Linux kernel community to develop a free software replacement in [[Git (software)|Git]].' FULL: [[0.718465, "Following the apparent reverse engineering of BitKeeper's protocols McVoy withdrew permission for gratis use by free software projects leading the Linux kernel community to develop a free software replacement in Git."], [0.002802, "McVoy withdrew permission for gratis use by free software projects leading the Linux kernel community to develop a free software replacement in Git following the apparent reverse engineering of BitKeeper's protocols."]] WITH CHUNKING: None ORIGINAL: `For example, all regular languages can be recognized by a [[finite state machine]], and for useful subsets of context-free grammars there are well-known algorithms to generate efficient [[LL parser]]s and [[LR parser]]s to recognize the corresponding languages those grammars generate.' FULL: [[0.009734000000000003, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate efficient LL parsers and LR parsers to recognize the corresponding languages that those grammars generate.'], [0.009377000000000003, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate efficient LL parsers and LR parsers to recognize the corresponding languages which those grammars generate.'], [0.009249000000000005, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate efficient LL parsers and LR parsers to recognize the corresponding languages who those grammars generate.'], [0.0018649999999999993, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate LL efficient parsers and LR parsers to recognize the corresponding languages that those grammars generate.'], [0.001798, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate LL efficient parsers and LR parsers to recognize the corresponding languages which those grammars generate.']] WITH CHUNKING: [[-7.852842992356961, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate ll efficient parsers and lr parsers to recognize the corresponding languages which those grammars generate.'], [-7.815689691100687, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate ll efficient parsers and lr parsers to recognize the corresponding languages that those grammars generate.'], [-6.215420057471741, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate efficient ll parsers and lr parsers to recognize the corresponding languages who those grammars generate.'], [-6.201513235361093, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate efficient ll parsers and lr parsers to recognize the corresponding languages which those grammars generate.'], [-6.1644246914568805, 'For example all regular languages can be recognized by a finite state machine and for useful subsets of context free grammars there are well known algorithms to generate efficient ll parsers and lr parsers to recognize the corresponding languages that those grammars generate.']] ORIGINAL: `Just as English itself has borrowed words from many different languages over its history, English [[loanword]]s now appear in a great many languages around the world, indicative of the technological and cultural influence of its speakers.' FULL: [[0.11186400000000003, 'Just as English itself has borrowed words from many languages different from it over its history English loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers.'], [0.08189399999999998, 'Just as English itself has borrowed words from many different languages from it over its history English loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers.'], [0.074816, 'Just as English itself has borrowed words from many different languages over its history English loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers.'], [0.05466800000000001, 'English loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers just as English itself has borrowed words from many languages different from it over its history.'], [0.04002399999999999, 'English loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers just as English itself has borrowed words from many different languages from it over its history.']] WITH CHUNKING: [[-4.295940455225999, 'Just as english itself has borrowed words from many different languages over its history english loanwords now appear in a great many languages around the world indicative of the influence technological and cultural of its speakers.'], [-4.030208985794967, 'English loanwords now appear in a great many languages around the world indicative of the influence of its speakers technological and cultural just as english itself has borrowed words from many different languages over its history.'], [-3.775976329046773, 'Just as english itself has borrowed words from many different languages over its history english loanwords now appear in a great many languages around the world indicative of the influence of its speakers technological and cultural.'], [-1.949511774779458, 'English loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers just as english itself has borrowed words from many different languages over its history.'], [-1.6952791180312645, 'Just as english itself has borrowed words from many different languages over its history english loanwords now appear in a great many languages around the world indicative of the technological and cultural influence of its speakers.']] ORIGINAL: `This cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it.' FULL: None WITH CHUNKING: None ORIGINAL: `In some [[Common Law]] jurisdictions, the legal distinction between a license and a contract is an important one: contracts are enforceable by [[contract law]], whereas licenses are enforced under [[copyright law]].' FULL: [[0.005727, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licenses are enforced under © law contracts are enforceable by contract law.'], [0.005727, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licenses are enforced under copyright law contracts are enforceable by contract law.'], [0.005727, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licences are enforced under © law contracts are enforceable by contract law.'], [0.005727, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licences are enforced under copyright law contracts are enforceable by contract law.'], [0.005727, 'In some common law jurisdictions the legal distinction between a licence and a contract is an important one : whereas licenses are enforced under © law contracts are enforceable by contract law.']] WITH CHUNKING: [[-6.377468370876728, 'In some common law jurisdictions the legal distinction between a licence and a contract is an important one : whereas licenses are enforced under © law contracts are enforceable by contract law.'], [-6.377468370876728, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licences are enforced under copyright law contracts are enforceable by contract law.'], [-6.377468370876728, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licences are enforced under © law contracts are enforceable by contract law.'], [-6.377468370876728, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licenses are enforced under copyright law contracts are enforceable by contract law.'], [-6.377468370876728, 'In some common law jurisdictions the legal distinction between a license and a contract is an important one : whereas licenses are enforced under © law contracts are enforceable by contract law.']] ORIGINAL: `Google has created services and tools for the general public and business environment alike; including Web applications, advertising networks and solutions for businesses.' FULL: [[0.015915000000000002, 'Google has created services and tools for the general public and business environment alike. including web applications advertising networks and solutions for businesses.'], [0.015915000000000002, 'Google has created services and tools for the Gen. public and business environment alike. including web applications advertising networks and solutions for businesses.'], [0.00519, 'Google has created services and tools for the general public and business environment alike. including of web applications advertising networks and solutions for businesses.'], [0.00519, 'Google has created services and tools for the Gen. public and business environment alike. including of web applications advertising networks and solutions for businesses.'], [0.003839, 'Google has services and tools created for the general public and business environment alike. including web applications advertising networks and solutions for businesses.']] WITH CHUNKING: [[-9.447743801291605, 'Google has services and tools created for the general public and business environment alike. including web applications advertising networks and solutions for businesses.'], [-8.426860369038446, 'Google has created services and tools for the gen. public and business environment alike. including of web applications advertising networks and solutions for businesses.'], [-8.426860369038446, 'Google has created services and tools for the general public and business environment alike. including of web applications advertising networks and solutions for businesses.'], [-8.02401461895058, 'Google has created services and tools for the gen. public and business environment alike. including web applications advertising networks and solutions for businesses.'], [-8.02401461895058, 'Google has created services and tools for the general public and business environment alike. including web applications advertising networks and solutions for businesses.']] ORIGINAL: `There is no precise agreed-upon definition among researchers as to what a [[neural network]] is, but most would agree that it involves a network of simple processing elements ([[artificial neuron|neurons]]), which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters.' FULL: None WITH CHUNKING: None ORIGINAL: `As the OpenOffice.org database, called "Base", uses documents created under the Writer application for reports and forms, one could say that Base can also be programmed with OpenOffice.org Basic.' FULL: [[0.04516, 'As the OpenOffice.org database called Base uses documents created under the Writer application for reports and forms one could say that Base can also be programmed with OpenOffice.org BASIC.'], [0.032769999999999994, 'As the OpenOffice.org database called Base uses documents created under the Writer application for reports and forms one could say like Base can also be programmed with OpenOffice.org BASIC.'], [0.032769999999999994, 'As the OpenOffice.org database called Base uses documents created under the Writer application for reports and forms one could say as though Base can also be programmed with OpenOffice.org BASIC.'], [0.032769999999999994, 'As the OpenOffice.org database called Base uses documents created under the Writer application for reports and forms one could say as if Base can also be programmed with OpenOffice.org BASIC.'], [0.023214000000000002, 'As the OpenOffice.org database called Base uses documents created under the Writer application for reports and forms one could say that Base can be also programmed with OpenOffice.org BASIC.']] WITH CHUNKING: [[-4.502686422853312, 'As the openoffice.org database called base uses documents created under the writer application for reports and forms one could say that base can be also programmed with openoffice.org basic.'], [-3.488153225916303, 'As the openoffice.org database called base uses documents created under the writer application for reports and forms one could say as if base can also be programmed with openoffice.org basic.'], [-3.488153225916303, 'As the openoffice.org database called base uses documents created under the writer application for reports and forms one could say as though base can also be programmed with openoffice.org basic.'], [-3.488153225916303, 'As the openoffice.org database called base uses documents created under the writer application for reports and forms one could say like base can also be programmed with openoffice.org basic.'], [-3.1674420544017656, 'As the openoffice.org database called base uses documents created under the writer application for reports and forms one could say that base can also be programmed with openoffice.org basic.']] ORIGINAL: `The [[ELKS]] kernel [[fork (software development)|fork]] can run on [[Intel 8086]] or [[Intel 80286]] [[16-bit]] microprocessors, while the [[µClinux]] kernel fork may run on systems without a [[memory management unit]].' FULL: [[0.32668599999999987, 'While the µClinux kernel fork may run on systems without a memory management unit the ELKS kernel fork can run on 16-bit Intel 8086 or Intel 80286 microprocessors.'], [0.166549, 'While the µClinux kernel fork may run on systems without a memory management unit the ELKS kernel fork can run on Intel 8086 or Intel 80286 16-bit microprocessors.'], [0.044820000000000006, 'The ELKS kernel fork can run on Intel 8086 or Intel 80286 16-bit microprocessors while the µClinux kernel fork may run on systems without a memory management unit.'], [0.030088, 'The ELKS kernel fork can run on 16-bit Intel 8086 or Intel 80286 microprocessors while the µClinux kernel fork may run on systems without a memory management unit.']] WITH CHUNKING: [[-3.457171276298693, 'The elks kernel fork can run on intel 8086 or intel 80286 16-bit microprocessors while the µclinux kernel fork may run on systems without a memory management unit.'], [-2.6879915688235116, 'The elks kernel fork can run on 16-bit intel 8086 or intel 80286 microprocessors while the µclinux kernel fork may run on systems without a memory management unit.'], [-1.9573219679025347, 'While the µclinux kernel fork may run on systems without a memory management unit the elks kernel fork can run on intel 8086 or intel 80286 16-bit microprocessors.'], [-1.1881422604273533, 'While the µclinux kernel fork may run on systems without a memory management unit the elks kernel fork can run on 16-bit intel 8086 or intel 80286 microprocessors.']] ORIGINAL: `Practical [[computer system]]s divide [[software system]]s into three major classes: [[system software]], [[programming software]] and [[application software]], although the distinction is arbitrary, and often blurred.' FULL: [[0.029057999999999997, 'Although the distinction is arbitrary and is often blurred practical computer systems divide software systems : system software programming software and application software into three major classes.'], [0.023772, 'Although the distinction is arbitrary and is blurred often practical computer systems divide software systems : system software programming software and application software into three major classes.'], [0.009590000000000003, 'Although the distinction is arbitrary and is often blurred practical computer systems divide software systems : systems software programming software and application software into three major classes.'], [0.008574, 'Although the distinction is arbitrary and is often blurred practical computer systems divide software systems into three major classes : system software programming software and application software.'], [0.008376999999999999, 'Although the distinction is arbitrary and often is blurred practical computer systems divide software systems : system software programming software and application software into three major classes.']] WITH CHUNKING: [[-4.474736994205012, 'Although the distinction is arbitrary and is blurred often practical computer systems divide software systems : systems software programming software and application software into three major classes.'], [-4.269963577822949, 'Although the distinction is arbitrary and is often blurred practical computer systems divide software systems into three major classes : system software programming software and application software.'], [-4.040540209907061, 'Although the distinction is arbitrary and is blurred often practical computer systems divide software systems into three major classes : system software programming software and application software.'], [-3.59598076824657, 'Although the distinction is arbitrary and is often blurred practical computer systems divide software systems : system software programming software and application software into three major classes.'], [-3.3665574003306817, 'Although the distinction is arbitrary and is blurred often practical computer systems divide software systems : system software programming software and application software into three major classes.']] ORIGINAL: `Thus we might expect an algorithm to be an algebraic equation such as '''y = m + n''' — two arbitrary "input variables" '''m''' and '''n''' that produce an output '''y'''.' FULL: [[0.00035700000000000006, 'Thus we might expect an algorithm to be a algebraic equation such as y equal to m plus n. arbitrary two input variables m and n which produce an output y.'], [0.00035700000000000006, 'Thus we might expect an algorithm to be a algebraic equation such as y equal to m plus n. arbitrary two input variables m and N. which produce an output y.'], [0.00035700000000000006, 'Thus we might expect an algorithm to be a algebraic equation such as y equal to m + n. arbitrary two input variables m and n which produce an output y.'], [0.00035700000000000006, 'Thus we might expect an algorithm to be a algebraic equation such as y equal to m + n. arbitrary two input variables m and N. which produce an output y.'], [0.00035700000000000006, 'Thus we might expect an algorithm to be a algebraic equation such as Y. equal to m plus n. arbitrary two input variables m and n which produce an output y.']] WITH CHUNKING: [[-8.217120255022502, 'Thus we might expect an algorithm to be a algebraic equation such as y equal to m plus n. arbitrary two input variables m and n. which produce an output y.'], [-8.217120255022502, 'Thus we might expect an algorithm to be a algebraic equation such as y. equal to m + n. arbitrary two input variables m and n which produce an output y.'], [-8.217120255022502, 'Thus we might expect an algorithm to be a algebraic equation such as y. equal to m + n. arbitrary two input variables m and n. which produce an output y.'], [-8.217120255022502, 'Thus we might expect an algorithm to be a algebraic equation such as y. equal to m plus n. arbitrary two input variables m and n which produce an output y.'], [-8.217120255022502, 'Thus we might expect an algorithm to be a algebraic equation such as y. equal to m plus n. arbitrary two input variables m and n. which produce an output y.']] ORIGINAL: `Moreover, there is the question of data format: storing metadata in a human-readable format such as XML can be useful because users can understand and edit it without specialized tools.' FULL: [[0.02663999999999999, 'Moreover there is the question of data format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as XML can be useful.'], [0.02663999999999999, 'Moreover there is the question of data format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as X.M.L. can be useful.'], [0.015579999999999998, 'Moreover there is the question of datum format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as XML can be useful.'], [0.015579999999999998, 'Moreover there is the question of datum format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as X.M.L. can be useful.'], [0.003647000000000002, 'Moreover there is the question of data format : because users can understand and edit it without tools specialized storing metadata in a human readable format such as XML can be useful.']] WITH CHUNKING: [[-5.372862747610288, 'Moreover there is the question of data format : because users can understand and edit it without tools specialized storing metadata in a human readable format such as xml can be useful.'], [-4.724433130958857, 'Moreover there is the question of datum format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as x.m.l. can be useful.'], [-4.724433130958857, 'Moreover there is the question of datum format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as xml can be useful.'], [-4.1880416286430595, 'Moreover there is the question of data format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as x.m.l. can be useful.'], [-4.1880416286430595, 'Moreover there is the question of data format : because users can understand and edit it without specialized tools storing metadata in a human readable format such as xml can be useful.']] ORIGINAL: `Usually only a subset of the neurons receive external inputs in addition to the inputs from all the other neurons, and another disjunct subset of neurons report their output externally as well as sending it to all the neurons.' FULL: None WITH CHUNKING: [[-8.42235040965127, 'Usually only a subset of the neurons receive external inputs in addition to the inputs from all of the other neurons and another disjunct subset of neurons reports their output externally as well as sending it to all of the neurons.'], [-8.195825852694789, 'Usually only a subset of the neurons receive external inputs in addition to the inputs from all of the other neurons and another disjunct subset of neurons externally report to their output as well as sending it to all of the neurons.'], [-8.159986145183778, 'Usually only a subset of the neurons receive external inputs in addition to the inputs from all of the other neurons and another disjunct subset of neurons reports to their output externally as well as sending it to all of the neurons.'], [-8.077669750097979, 'Usually only a subset of the neurons receives external inputs in addition to the inputs from all of the other neurons and another disjunct subset of neurons report to their output externally as well as sending it to all of the neurons.'], [-7.679650579181557, 'Usually only a subset of the neurons receive external inputs in addition to the inputs from all of the other neurons and another disjunct subset of neurons report to their output externally as well as sending it to all of the neurons.']] ORIGINAL: `While a neural network does not have to be adaptive ''per se'', its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.' FULL: None WITH CHUNKING: None ORIGINAL: `The measure of sufficient randomness in extractors is [[min-entropy]], a value related to Shannon entropy through [[Rényi entropy]]; Rényi entropy is also used in evaluating randomness in cryptographic systems.' FULL: None WITH CHUNKING: None ORIGINAL: `The Japanese government provides standardised tests to measure spoken and written comprehension of Japanese for second language learners; the most prominent is the [[Japanese Language Proficiency Test]] (JLPT).' FULL: [[5.499999999999999e-05, 'The Japanese government provides standardized tests to measure spoken and written comprehension of Japanese for second language learners. the most prominent is the Japanese language proficiency test JLPT.'], [5.499999999999999e-05, 'The Japanese government provides standardized tests to measure comprehension of Japanese spoken and written for second language learners. the most prominent is the Japanese language proficiency test JLPT.'], [5.299999999999999e-05, 'The Japanese government provides tests to measure spoken and written comprehension of Japanese for second language learners standardized. the most prominent is the Japanese language proficiency test JLPT.'], [5.1e-05, 'The Japanese government provides tests to measure comprehension of Japanese spoken and written for second language learners standardized. the most prominent is the Japanese language proficiency test JLPT.'], [2e-05, 'The Japanese government provides tests standardized to measure spoken and written comprehension of Japanese for second language learners. the most prominent is the Japanese language proficiency test JLPT.']] WITH CHUNKING: [[-11.176453228349015, 'The japanese government provides tests standardized to measure comprehension of japanese spoken and written for second language learners. the most prominent is the japanese language proficiency test jlpt.'], [-10.349774655164548, 'The japanese government provides standardized tests to measure comprehension of japanese spoken and written for second language learners. the most prominent is the japanese language proficiency test jlpt.'], [-10.319002996497794, 'The japanese government provides standardized tests to measure spoken and written comprehension of japanese for second language learners. the most prominent is the japanese language proficiency test jlpt.'], [-10.26016249647486, 'The japanese government provides tests to measure comprehension of japanese spoken and written for second language learners standardized. the most prominent is the japanese language proficiency test jlpt.'], [-10.26016249647486, 'The japanese government provides tests to measure spoken and written comprehension of japanese for second language learners standardized. the most prominent is the japanese language proficiency test jlpt.']] ORIGINAL: `Methods have been developed for the [[analysis of algorithms]] to obtain such quantitative answers; for example, the algorithm above has a time requirement of O(''n''), using the [[big O notation]] with ''n'' as the length of the list.' FULL: None WITH CHUNKING: None ORIGINAL: `* The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an ''approximation'' (a "least and necessary evil").' FULL: [[0.120764, 'Too large for the computing resources the original term document matrix is presumed. in this case the low rank matrix approximated is interpreted as an approximation a least and necessary evil.'], [0.081534, 'Too large for the computing resources the original term document matrix is presumed. in this case the approximated low rank matrix is interpreted as an approximation a least and necessary evil.'], [0.05333900000000002, 'The original term document matrix is presumed too large for the computing resources. in this case the low rank matrix approximated is interpreted as an approximation a least and necessary evil.'], [0.036011, 'The original term document matrix is presumed too large for the computing resources. in this case the approximated low rank matrix is interpreted as an approximation a least and necessary evil.'], [0.033291999999999995, 'Too large for the computing resources the original term document matrix is presumed. in this case the low rank matrix approximated is interpreted as an approximation an evil least and necessary.']] WITH CHUNKING: [[-4.578782823955705, 'The original term document matrix is presumed too large for the computing resources. in this case the low rank matrix approximated is interpreted as an approximation a least and necessary evil.'], [-4.049904924569991, 'Too large for the computing resources the original term document matrix is presumed. in this case the approximated low rank matrix is interpreted as an approximation an evil least and necessary.'], [-3.407680073133844, 'Too large for the computing resources the original term document matrix is presumed. in this case the low rank matrix approximated is interpreted as an approximation an evil least and necessary.'], [-2.7614387851188464, 'Too large for the computing resources the original term document matrix is presumed. in this case the approximated low rank matrix is interpreted as an approximation a least and necessary evil.'], [-2.1191946708901956, 'Too large for the computing resources the original term document matrix is presumed. in this case the low rank matrix approximated is interpreted as an approximation a least and necessary evil.']] ORIGINAL: `As mentioned earlier some grammar formalisms are very computationally difficult to parse; in general, even if the desired structure is not [[context-free]], some kind of context-free approximation to the grammar is used to perform a first pass.' FULL: None WITH CHUNKING: None ORIGINAL: `* [[Wine (software)|Wine]] - a [[free and open source software]] implementation of the [[Windows API]], allowing one to run many Windows applications on x86-based platforms, including [[Linux]].' FULL: [[0.315392, 'Wine. a free and open source software implementation of the Windows API allowing one to run many Windows applications on x86- based platforms including Linux.'], [0.031427, 'Wine. a free and open source software implementation of the Windows API allowing one to run Windows many applications on x86- based platforms including Linux.']] WITH CHUNKING: None ORIGINAL: `In other words, a grammar describes which of the possible sequences of symbols (strings) in a language constitute valid words or statements in that language, but it does not describe their [[semantics]] (i.e. what they mean).' FULL: None WITH CHUNKING: None ORIGINAL: `* 1971: [[N. Jardine]] and [[C. J. Van Rijsbergen]] published "The use of hierarchic clustering in information retrieval", which articulated the "cluster hypothesis."' FULL: [[9e-06, '1971 : n Jardine and c j van Rijsbergen published the use of hierarchic clustering in information retrieval which articulated the cluster hypothesis.'], [9e-06, '1971 : n Jardine and c J. van Rijsbergen published the use of hierarchic clustering in information retrieval which articulated the cluster hypothesis.'], [9e-06, '1971 : n Jardine and C. j van Rijsbergen published the use of hierarchic clustering in information retrieval which articulated the cluster hypothesis.'], [9e-06, '1971 : n Jardine and C. J. van Rijsbergen published the use of hierarchic clustering in information retrieval which articulated the cluster hypothesis.'], [9e-06, '1971 : n Jardine and (C) j van Rijsbergen published the use of hierarchic clustering in information retrieval which articulated the cluster hypothesis.']] WITH CHUNKING: None ORIGINAL: `That is, for every context-free language, a machine can be built that takes a string as input and determines in O(n^3) time whether the string is a member of the language, where n is the length of the string.' FULL: None WITH CHUNKING: None ORIGINAL: `The [[Taxonomic classification|classification]] of natural languages can be performed on the basis of different underlying principles (different closeness notions, respecting different properties and relations between languages); important directions of present classifications are:' FULL: [[0.09600500000000001, 'The classification of natural languages can be performed on the basis of underlying different principles different closeness notions respecting different properties and relations between languages. important directions of present classifications are :'], [0.060788999999999996, 'The classification of natural languages can be performed on the basis of different underlying principles different closeness notions respecting different properties and relations between languages. important directions of present classifications are :']] WITH CHUNKING: [[-2.5949265386859635, 'The classification of natural languages can be performed on the basis of different underlying principles different closeness notions respecting different properties and relations between languages. important directions of present classifications are :.'], [-2.137927664865295, 'The classification of natural languages can be performed on the basis of underlying different principles different closeness notions respecting different properties and relations between languages. important directions of present classifications are :.']] ORIGINAL: `[[Yahoo!]] was among the most popular ways for people to find web pages of interest, but its search function operated on its [[web directory]], rather than full-text copies of web pages.' FULL: [[0.0011509999999999997, 'Yahoo! was among the most popular ways for people to find web pages of interest but its search function operated on its web directory rather than full text copies of web pages.']] WITH CHUNKING: [[-8.593922104099207, 'Yahoo! was among the most popular ways for people to find web pages of interest but its search function operated on its web directory rather than full text copies of web pages.']] ORIGINAL: `Indeed, it is generally fair to say that an English word derived from Latin/French roots typically corresponds to a Sino-Japanese word in Japanese, whereas a simpler Anglo-Saxon word would best be translated by a Yamato equivalent.' FULL: [[1.4999999999999999e-05, 'Indeed to say like whereas a simpler anglo-saxon word would be translated by an Yamato equivalent the best an English word derived from Latin and French roots typically corresponds to a Sino-Japanese word in Japanese it is fair generally.'], [1.4999999999999999e-05, 'Indeed to say like whereas a simpler anglo-saxon word would be translated by a Yamato equivalent the best an English word derived from Latin and French roots typically corresponds to a Sino-Japanese word in Japanese it is fair generally.'], [1.4999999999999999e-05, 'Indeed to say as though whereas a simpler anglo-saxon word would be translated by an Yamato equivalent the best an English word derived from Latin and French roots typically corresponds to a Sino-Japanese word in Japanese it is fair generally.'], [1.4999999999999999e-05, 'Indeed to say as though whereas a simpler anglo-saxon word would be translated by a Yamato equivalent the best an English word derived from Latin and French roots typically corresponds to a Sino-Japanese word in Japanese it is fair generally.'], [1.4999999999999999e-05, 'Indeed to say as if whereas a simpler anglo-saxon word would be translated by an Yamato equivalent the best an English word derived from Latin and French roots typically corresponds to a Sino-Japanese word in Japanese it is fair generally.']] WITH CHUNKING: [[-8.44507612960673, 'Indeed to say as though whereas a simpler anglo-saxon word would be translated by a yamato equivalent the best an english word derived from latin and french roots typically corresponds to a sino-japanese word in japanese it is fair generally.'], [-8.44507612960673, 'Indeed to say like whereas a simpler anglo-saxon word would be translated by a yamato equivalent the best an english word derived from latin and french roots typically corresponds to a sino-japanese word in japanese it is fair generally.'], [-8.429289787664823, 'Indeed to say as if whereas a simpler anglo-saxon word would be translated by an yamato equivalent the best an english word derived from latin and french roots typically corresponds to a sino-japanese word in japanese it is fair generally.'], [-8.429289787664823, 'Indeed to say as though whereas a simpler anglo-saxon word would be translated by an yamato equivalent the best an english word derived from latin and french roots typically corresponds to a sino-japanese word in japanese it is fair generally.'], [-8.429289787664823, 'Indeed to say like whereas a simpler anglo-saxon word would be translated by an yamato equivalent the best an english word derived from latin and french roots typically corresponds to a sino-japanese word in japanese it is fair generally.']] ORIGINAL: `*[[Top-down parsing]] - Top-down parsing can be viewed as an attempt to find left-most derivations of an input-stream by searching for [[parse tree|parse-trees]] using a top-down expansion of the given [[formal grammar]] rules.' FULL: [[0.00024900000000000004, 'Top-down parsing. top-down parsing can be viewed as an attempt to find leftmost derivations of an input stream by searching for parse trees using a top-down expansion of the given formal grammar rules.'], [0.000115, 'Top-down parsing. top-down parsing can be viewed as an attempt to find leftmost derivations of an input stream by searching for parse trees using a top-down expansion of the formal grammar rules given.'], [8.900000000000001e-05, 'Top-down parsing. top-down parsing can be viewed as an attempt to find leftmost derivations of an input stream by searching for parse trees using a top-down expansion of the formal given grammar rules.']] WITH CHUNKING: None ORIGINAL: `On the other hand, these formats are not optimized for storage capacity; it may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.' FULL: None WITH CHUNKING: None ORIGINAL: `In addition, patterns in the data may be [[mathematical model|modeled]] in a way that accounts for [[random]]ness and uncertainty in the observations, and then used to draw inferences about the process or population being studied; this is called '''[[inferential statistics]]'''.' FULL: None WITH CHUNKING: None ORIGINAL: `While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.' FULL: None WITH CHUNKING: None ORIGINAL: `Although the United States has no formally designated "official languages," Spanish is formally recognized at the state level beside English; in the U.S. state of [[New Mexico]], 30 per cent of the population speak it.' FULL: [[4.8999999999999965e-05, 'Although the United States has no official languages designated formally Spanish is formally recognized at the state level beside English. in the U.S. state of New Mexico thirties per cts of the population speak it.'], [4.8999999999999965e-05, 'Although the United States has no official languages designated formally Spanish is formally recognized at the state level beside English. in the U.S. state of New Mexico thirties per cent of the population speak it.'], [4.8999999999999965e-05, 'Although the United States has no official languages designated formally Spanish is formally recognized at the state level beside English. in the U.S. state of New Mexico 30s per cts of the population speak it.'], [4.8999999999999965e-05, 'Although the United States has no official languages designated formally Spanish is formally recognized at the state level beside English. in the U.S. state of New Mexico 30s per cent of the population speak it.'], [4.699999999999997e-05, 'Although the United States has no official languages designated formally Spanish is recognized beside English at the state level formally. in the U.S. state of New Mexico thirties per cts of the population speak it.']] WITH CHUNKING: [[-11.41601988178103, 'Although the united states has no official languages designated formally spanish is recognized at the state level beside english formally. in the u.s. state of new mexico thirties per cts of the population speak it.'], [-11.135936354286947, 'Although the united states has no official languages designated formally spanish is recognized beside english at the state level formally. in the u.s. state of new mexico 30s per cent of the population speak it.'], [-11.135936354286947, 'Although the united states has no official languages designated formally spanish is recognized beside english at the state level formally. in the u.s. state of new mexico 30s per cts of the population speak it.'], [-11.135936354286947, 'Although the united states has no official languages designated formally spanish is recognized beside english at the state level formally. in the u.s. state of new mexico thirties per cent of the population speak it.'], [-11.135936354286947, 'Although the united states has no official languages designated formally spanish is recognized beside english at the state level formally. in the u.s. state of new mexico thirties per cts of the population speak it.']] ORIGINAL: `Most Web search engines are commercial ventures supported by [[advertising]] revenue and, as a result, some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.' FULL: [[0.022788999999999997, 'Most web search engines are commercial ventures supported by advertising of revenue and as a result some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.'], [0.011830999999999998, 'Most web search engines are commercial ventures supported by advertising revenue and as a result some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.'], [0.002809, 'Most web search engines are commercial ventures supported by advertising of revenue and as a result some employ the controversial practice of allowing advertisers to have their listings ranked higher in search results to pay money.'], [0.0014580000000000003, 'Most web search engines are commercial ventures supported by advertising revenue and as a result some employ the controversial practice of allowing advertisers to have their listings ranked higher in search results to pay money.']] WITH CHUNKING: [[-7.793447921488982, 'Most web search engines are commercial ventures supported by advertising revenue and as a result some employ the controversial practice of allowing advertisers to have their listings ranked higher in search results to pay money.'], [-7.523379811678742, 'Most web search engines are commercial ventures supported by advertising of revenue and as a result some employ the controversial practice of allowing advertisers to have their listings ranked higher in search results to pay money.'], [-5.700502992657357, 'Most web search engines are commercial ventures supported by advertising revenue and as a result some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.'], [-5.430434882847118, 'Most web search engines are commercial ventures supported by advertising of revenue and as a result some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.']] ORIGINAL: `The Spanish dialects of Latin America have only one form of the second-person plural for daily use, {{lang|es|''ustedes''}} (formal or familiar, as the case may be, though {{lang|es|''vosotros''}} non-formal usage can sometimes appear in poetry and rhetorical or literary style).' FULL: None WITH CHUNKING: None ORIGINAL: `However, there are places where the traditional regional dialects have been replaced by standard German; this is the case in vast stretches of Northern Germany, but also in major cities in other parts of the country.' FULL: None WITH CHUNKING: None ORIGINAL: `[[Economist]] [[Herbert Simon]] and [[Alan Newell]] studied human problem solving skills and attempted to formalize them, and their work laid the foundations of the field of artificial intelligence, as well as [[cognitive science]], [[operations research]] and [[management science]].' FULL: [[0.00010099999999999999, 'Economist Herbert Simon and Alan Newell studied human problem solving skills and attempted to formalize them and their work laid the foundations of the field of artificial intelligence as well as cognitive science operations research and management science.'], [3.2e-05, 'Economist Herbert Simon and Alan Newell studied problem solving human skills and attempted to formalize them and their work laid the foundations of the field of artificial intelligence as well as cognitive science operations research and management science.']] WITH CHUNKING: [[-11.748535006883904, 'Economist herbert simon and alan newell studied problem solving human skills and attempted to formalize them and their work laid the foundations of the field of artificial intelligence as well as cognitive science operations research and management science.'], [-10.625541760088145, 'Economist herbert simon and alan newell studied human problem solving skills and attempted to formalize them and their work laid the foundations of the field of artificial intelligence as well as cognitive science operations research and management science.']] ORIGINAL: `German immigrants were instrumental in the country's three largest urban areas: [[Montreal]], [[Toronto]] and [[Vancouver]], but post-WWII immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections.' FULL: [[0.0007959999999999998, "German immigrants were instrumental in the country's three urban largest areas : Montreal Toronto and Vancouver but post-World War II immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections."], [0.0007959999999999998, "German immigrants were instrumental in the country's three urban largest areas : Montreal Toronto and Vancouver but post-WWII immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections."], [0.0007959999999999998, "German immigrants were instrumental in the country's three urban largest areas : Montreal Toronto and Vancouver but post-WW II immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections."], [0.0007190000000000001, "German immigrants were instrumental in the country's largest urban three areas : Montreal Toronto and Vancouver but post-World War II immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections."], [0.0007190000000000001, "German immigrants were instrumental in the country's largest urban three areas : Montreal Toronto and Vancouver but post-WWII immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections."]] WITH CHUNKING: [[-8.159532379633148, "German immigrants were instrumental : montreal toronto and vancouver in the country's largest urban three areas but post-ww ii immigrants managed to preserve a fluency in the german language in their respective neighborhoods and sections."], [-8.159532379633148, "German immigrants were instrumental : montreal toronto and vancouver in the country's largest urban three areas but post-wwii immigrants managed to preserve a fluency in the german language in their respective neighborhoods and sections."], [-8.054255454095017, "German immigrants were instrumental : montreal toronto and vancouver in the country's three urban largest areas but post-world war ii immigrants managed to preserve a fluency in the german language in their respective neighborhoods and sections."], [-8.054255454095017, "German immigrants were instrumental : montreal toronto and vancouver in the country's three urban largest areas but post-ww ii immigrants managed to preserve a fluency in the german language in their respective neighborhoods and sections."], [-8.054255454095017, "German immigrants were instrumental : montreal toronto and vancouver in the country's three urban largest areas but post-wwii immigrants managed to preserve a fluency in the german language in their respective neighborhoods and sections."]] ORIGINAL: `[[Biodiversity]] of an ecosystem might be defined as the total genomic complement of a particular environment, from all of the species present, whether it is a biofilm in an abandoned mine, a drop of sea water, a scoop of soil, or the entire [[biosphere]] of the planet [[Earth]].' FULL: [[0.021885, 'Whether it is a biofilm in an abandoned mine a drop of sea water a scoop of soil or the entire biosphere of the planet Earth Biodiversity of an ecosystem might be defined as the total genomic complement from all of the species present of a particular environment.'], [0.01499, 'Whether it is a biofilm in an abandoned mine a drop of sea water a scoop of soil or the entire biosphere of the planet Earth Biodiversity of an ecosystem might be defined as the total genomic complement of a particular environment from all of the species present.'], [0.010632, 'Whether it is a biofilm in a mine abandoned a drop of sea water a scoop of soil or the entire biosphere of the planet Earth Biodiversity of an ecosystem might be defined as the total genomic complement from all of the species present of a particular environment.'], [0.007281999999999999, 'Whether it is a biofilm in a mine abandoned a drop of sea water a scoop of soil or the entire biosphere of the planet Earth Biodiversity of an ecosystem might be defined as the total genomic complement of a particular environment from all of the species present.'], [0.005647, 'Whether it is a biofilm in an abandoned mine a drop of sea water a scoop of soil or the entire biosphere of the planet Earth Biodiversity of an ecosystem might be defined as the total genomic complement from all the species present of a particular environment.']] WITH CHUNKING: [[-5.202827301331882, 'Whether it is a biofilm in an abandoned mine a drop of sea water a scoop of soil or the entire biosphere of the planet earth biodiversity of an ecosystem might be defined as the total genomic complement from all the species present of a particular environment.'], [-4.948724502299398, 'Whether it is a biofilm in a mine abandoned a drop of sea water a scoop of soil or the entire biosphere of the planet earth biodiversity of an ecosystem might be defined as the total genomic complement of a particular environment from all of the species present.'], [-4.570273051678376, 'Whether it is a biofilm in a mine abandoned a drop of sea water a scoop of soil or the entire biosphere of the planet earth biodiversity of an ecosystem might be defined as the total genomic complement from all of the species present of a particular environment.'], [-4.226726790978409, 'Whether it is a biofilm in an abandoned mine a drop of sea water a scoop of soil or the entire biosphere of the planet earth biodiversity of an ecosystem might be defined as the total genomic complement of a particular environment from all of the species present.'], [-3.848275340357387, 'Whether it is a biofilm in an abandoned mine a drop of sea water a scoop of soil or the entire biosphere of the planet earth biodiversity of an ecosystem might be defined as the total genomic complement from all of the species present of a particular environment.']] ORIGINAL: `Although much less powerful than unrestricted grammars (Type 0), which can in fact express any language that can be accepted by a [[Turing machine]], these two restricted types of grammars are most often used because [[parsing|parser]]s for them can be efficiently implemented.' FULL: None WITH CHUNKING: None ORIGINAL: `As anyone who's ever used a telephone (mobile or landline) knows, however, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.' FULL: None WITH CHUNKING: None ORIGINAL: `'''Data mining''': Many [[data mining]] applications involve partitioning data items into related subsets; the marketing applications discussed above represent some examples.' FULL: [[0.033188999999999996, 'Data mining : many data mining applications involve partitioning data items into related subsets to. the marketing applications discussed above represent some examples.'], [0.023745999999999993, 'Data mining : many data mining applications involve partitioning data items into subsets related to. the marketing applications discussed above represent some examples.'], [0.022646, 'Data mining : many data mining applications involve partitioning data items into related subsets. the marketing applications discussed above represent some examples.'], [0.021974999999999998, 'Datum mining : many data mining applications involve partitioning data items into related subsets to. the marketing applications discussed above represent some examples.'], [0.015721000000000002, 'Datum mining : many data mining applications involve partitioning data items into subsets related to. the marketing applications discussed above represent some examples.']] WITH CHUNKING: None ORIGINAL: `Back-End SR or Deferred SR is where the provider dictates into a digital dictation system, and the voice is routed through a speech-recognition machine and the recognized draft document is routed along with the original voice file to the MT/editor, who edits the draft and finalizes the report.' FULL: None WITH CHUNKING: None ORIGINAL: `While mathematically it is feasible to apply multivariate regression to discrete ordered dependent variables, some of the assumptions behind the theory of multivariate linear regression no longer hold, and there are other techniques such as discrete choice models which are better suited for this type of analysis.' FULL: None WITH CHUNKING: [[-11.45883843652609, 'While mathematically it is feasible to apply multivariate regression to dependent discrete variables ordered some of the assumptions behind the theory of multivariate linear regression hold no longer and there are other techniques such as discrete choice models who are suited for this type of analysis better.'], [-11.376745040685321, 'While mathematically it is feasible to apply multivariate regression to discrete dependent variables ordered some of the assumptions behind the theory of multivariate linear regression hold no longer and there are other techniques such as discrete choice models that are suited for this type of analysis better.'], [-11.355715564379661, 'While mathematically it is feasible to apply multivariate regression to dependent discrete variables ordered some of the assumptions behind the theory of multivariate linear regression hold no longer and there are other techniques such as discrete choice models that are suited for this type of analysis better.'], [-11.31066695948214, 'While mathematically it is feasible to apply multivariate regression to discrete dependent variables ordered some of the assumptions behind the theory of multivariate linear regression hold no longer and there are other techniques such as discrete choice models which are suited for this type of analysis better.'], [-11.28963748317648, 'While mathematically it is feasible to apply multivariate regression to dependent discrete variables ordered some of the assumptions behind the theory of multivariate linear regression hold no longer and there are other techniques such as discrete choice models which are suited for this type of analysis better.']] ORIGINAL: `Linux is one of the most prominent examples of [[free software]] and [[open source]] development: typically all underlying [[source code]] can be freely modified, used, and redistributed by anyone.' FULL: [[0.025907000000000006, 'Linux is one of the most prominent examples of free software and open source development : typically all underlying source code can be modified be used and be re-distributed freely by anyone.'], [0.025907000000000006, 'Linux is one of the most prominent examples of free software and open source development : typically all underlying source code can be modified be used and be re-distributed freely by anybody.'], [0.007960000000000002, 'Linux is one of the most prominent examples of free software and open source development : typically all underlying source code can be modified used and re-distributed freely by anyone.'], [0.007960000000000002, 'Linux is one of the most prominent examples of free software and open source development : typically all underlying source code can be modified used and re-distributed freely by anybody.'], [0.007665, 'Linux is one of the most prominent examples of free software and open source development : typically all underlying source code can freely be modified be used and be re-distributed by anyone.']] WITH CHUNKING: [[-5.076144923406772, 'Linux is one of the most prominent examples of free software and open source development :. typically all underlying source code can freely be modified be used and be re-distributed by anyone.'], [-4.956616210584583, 'Linux is one of the most prominent examples of free software and open source development :. typically all underlying source code can be modified used and re-distributed freely by anybody.'], [-4.956616210584583, 'Linux is one of the most prominent examples of free software and open source development :. typically all underlying source code can be modified used and re-distributed freely by anyone.'], [-3.864666572512732, 'Linux is one of the most prominent examples of free software and open source development :. typically all underlying source code can be modified be used and be re-distributed freely by anybody.'], [-3.864666572512732, 'Linux is one of the most prominent examples of free software and open source development :. typically all underlying source code can be modified be used and be re-distributed freely by anyone.']] ORIGINAL: `However, if this is not true, it must periodically check if the world matches its predictions and it must change its plan as this becomes necessary, requiring the agent to reason under uncertainty.' FULL: None WITH CHUNKING: [[-5.328493603552603, 'However it must check that the world does match its predictions periodically and requiring the agent to reason under uncertainty it must change its plan as this becomes necessary if this is not true.'], [-5.200677070824758, 'However if this is not true it must periodically check that the world matches its predictions and requiring the agent to reason under uncertainty it must change its plan as this becomes necessary.'], [-5.0035254830540685, 'However if this is not true it must check that the world matches its predictions periodically and requiring the agent to reason under uncertainty it must change its plan as this becomes necessary.'], [-4.827563565012013, 'However if this is not true it must periodically check that the world does match its predictions and requiring the agent to reason under uncertainty it must change its plan as this becomes necessary.'], [-4.630411977241324, 'However if this is not true it must check that the world does match its predictions periodically and requiring the agent to reason under uncertainty it must change its plan as this becomes necessary.']] ORIGINAL: `This problem might be considered to be a mild form of [[linkrot]], and Google's handling of it increases [[usability]] by satisfying [[user expectations]] that the search terms will be on the returned webpage.' FULL: [[0.062145999999999965, "This problem might be considered to be a mild form of linkrot and Google's handling of it increases usability by satisfying user expectations which the search terms will be on the returned webpage."], [0.058825999999999996, "This problem might be considered to be a mild form of linkrot and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned webpage."], [0.051629999999999995, "This problem might be considered to be a mild form of linkrot and Google's handling of it increases usability by satisfying user expectations who the search terms will be on the returned webpage."], [0.04764400000000002, "This problem might be considered to be a mild form of linkrot and Google's handling it increases usability by satisfying user expectations which the search terms will be on the returned webpage."], [0.04509899999999999, "This problem might be considered to be a mild form of linkrot and Google's handling it increases usability by satisfying user expectations that the search terms will be on the returned webpage."]] WITH CHUNKING: [[-4.429900197246935, "This problem might be considered to be a mild form of linkrot and google's handling it increases usability by satisfying user expectations who the search terms will be on the returned webpage."], [-4.299660251350003, "This problem might be considered to be a mild form of linkrot and google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned webpage."], [-4.299428388329604, "This problem might be considered to be a mild form of linkrot and google's handling it increases usability by satisfying user expectations that the search terms will be on the returned webpage."], [-4.2447745867894735, "This problem might be considered to be a mild form of linkrot and google's handling of it increases usability by satisfying user expectations which the search terms will be on the returned webpage."], [-4.244545563857552, "This problem might be considered to be a mild form of linkrot and google's handling it increases usability by satisfying user expectations which the search terms will be on the returned webpage."]] ORIGINAL: `For example, applications developed for Mac OS X and [[GNOME]] are supposed to place the most important button on the right-hand side of windows and dialogs, whereas Microsoft Windows and [[KDE]] have the opposite convention.' FULL: [[0.00020999999999999987, 'For example applications developed for Mac OS x and GNOME are supposed to place the most important button on the right handed side of Windows and dialogues whereas Microsoft Windows and KDE have the opposite convention.'], [0.00020999999999999987, 'For example applications developed for Mac OS x and GNOME are supposed to place the most important button on the right handed side of Windows and dialogs whereas Microsoft Windows and KDE have the opposite convention.'], [0.00020999999999999987, 'For example applications developed for Mac OS X. and GNOME are supposed to place the most important button on the right handed side of Windows and dialogues whereas Microsoft Windows and KDE have the opposite convention.'], [0.00020999999999999987, 'For example applications developed for Mac OS X. and GNOME are supposed to place the most important button on the right handed side of Windows and dialogs whereas Microsoft Windows and KDE have the opposite convention.'], [0.00015799999999999994, 'For example applications developed for Mac OS x and GNOME are supposed to place the most important button on the right hand side of Windows and dialogues whereas Microsoft Windows and KDE have the opposite convention.']] WITH CHUNKING: [[-8.595760497226179, 'For example whereas microsoft windows and kde have the opposite convention applications developed for mac os x. and gnome are supposed to place the most important button on the right hand side of windows and dialogues.'], [-8.267050265940455, 'For example whereas microsoft windows and kde have the opposite convention applications developed for mac os x and gnome are supposed to place the most important button on the right handed side of windows and dialogs.'], [-8.267050265940455, 'For example whereas microsoft windows and kde have the opposite convention applications developed for mac os x and gnome are supposed to place the most important button on the right handed side of windows and dialogues.'], [-8.267050265940455, 'For example whereas microsoft windows and kde have the opposite convention applications developed for mac os x. and gnome are supposed to place the most important button on the right handed side of windows and dialogs.'], [-8.267050265940455, 'For example whereas microsoft windows and kde have the opposite convention applications developed for mac os x. and gnome are supposed to place the most important button on the right handed side of windows and dialogues.']] ORIGINAL: `XHTML, however, also introduces a new shortcut: an XHTML tag may be opened and closed within the same tag, by including a slash before the end of the tag like this: <br/>.' FULL: [[0.048340000000000015, 'Also XHTML introduces a new shortcut : an XHTML tag may be opened and be closed within the same tag by including a slash before the end of the tag like this : .'], [0.04478200000000004, 'Also XHTML introduces a new shortcut : an XHTML tag may be opened and closed within the same tag by including a slash before the end of the tag like this : .'], [0.04076300000000001, 'Also XHTML introduces a new shortcut : a XHTML tag may be opened and be closed within the same tag by including a slash before the end of the tag like this : .'], [0.037759999999999995, 'Also XHTML introduces a new shortcut : a XHTML tag may be opened and closed within the same tag by including a slash before the end of the tag like this : .'], [0.016066999999999998, 'Also XHTML introduces a new shortcut : an XHTML tag may be opened by and be closed within the same tag by including a slash before the end of the tag like this : .']] WITH CHUNKING: [[-4.431223504768957, 'Also xhtml introduces a new shortcut :. an xhtml tag may be opened by and be closed within the same tag by including a slash before the end of the tag like this : .'], [-3.554432641474198, 'Also xhtml introduces a new shortcut :. a xhtml tag may be opened and be closed within the same tag by including a slash before the end of the tag like this : .'], [-3.4486191903101675, 'Also xhtml introduces a new shortcut :. an xhtml tag may be opened and be closed within the same tag by including a slash before the end of the tag like this : .'], [-3.118654649177991, 'Also xhtml introduces a new shortcut :. a xhtml tag may be opened and closed within the same tag by including a slash before the end of the tag like this : .'], [-3.0128956551618913, 'Also xhtml introduces a new shortcut :. an xhtml tag may be opened and closed within the same tag by including a slash before the end of the tag like this : .']] ORIGINAL: `A distribution whose hazard function slopes upward is said to have positive duration dependence, a decreasing hazard shows negative duration dependence whereas constant hazard is a process with no memory usually characterized by the exponential distribution.' FULL: None WITH CHUNKING: [[-7.445464901629528, 'A distribution whose hazard function slopes upwards is said to have positive duration dependence. a decreasing hazard shows negative duration dependence whereas constant hazard is a process with no memory usually characterized by the exponential distribution.'], [-6.169259428134595, 'A distribution whose hazard function slopes upward is said to have positive duration dependence. whereas constant hazard is a process with no memory usually characterized by the exponential distribution a hazard decreasing shows negative duration dependence.'], [-6.169259428134595, 'A distribution whose hazard function slopes upwards is said to have positive duration dependence. whereas constant hazard is a process with no memory usually characterized by the exponential distribution a hazard decreasing shows negative duration dependence.'], [-5.945615593233369, 'A distribution whose hazard function slopes upward is said to have positive duration dependence. whereas constant hazard is a process with no memory usually characterized by the exponential distribution a decreasing hazard shows negative duration dependence.'], [-5.945615593233369, 'A distribution whose hazard function slopes upwards is said to have positive duration dependence. whereas constant hazard is a process with no memory usually characterized by the exponential distribution a decreasing hazard shows negative duration dependence.']] ORIGINAL: `[[William Jones (philologist)|Sir William Jones]] noted that [[Sanskrit]] shared many common features with classical [[Latin]] and [[Ancient Greek|Greek]], notably verb roots and grammatical structures, such as the [[case system]].' FULL: [[0.073818, 'William Sir Jones noted that Sanskrit shared many common features with classical Latin and Greek notably verb roots and grammatical structures such as the case system.'], [0.073818, 'Sir William Jones noted that Sanskrit shared many common features with classical Latin and Greek notably verb roots and grammatical structures such as the case system.'], [0.006169000000000001, 'William Sir Jones noted Sanskrit shared many common features with classical Latin and Greek notably verb roots and grammatical structures such as the case system.'], [0.006169000000000001, 'Sir William Jones noted Sanskrit shared many common features with classical Latin and Greek notably verb roots and grammatical structures such as the case system.']] WITH CHUNKING: [[-5.731227671312557, 'Sir william jones noted sanskrit shared many common features with classical latin and greek notably verb roots and grammatical structures such as the case system.'], [-5.731227671312557, 'William sir jones noted sanskrit shared many common features with classical latin and greek notably verb roots and grammatical structures such as the case system.'], [-2.6620918825252593, 'Sir william jones noted that sanskrit shared many common features with classical latin and greek notably verb roots and grammatical structures such as the case system.'], [-2.6620918825252593, 'William sir jones noted that sanskrit shared many common features with classical latin and greek notably verb roots and grammatical structures such as the case system.']] ORIGINAL: `[[Berlekamp]] in dots-and-boxes etc. and [[John Nunn]] in [[chess]] [[Chess endgame|endgames]] are notable examples of researchers doing this work, though they were not and are not involved in tablebase generation.' FULL: [[0.004372000000000003, 'Though they did not and are not involved in tablebase generation Berlekamp in dots and boxes etcetera and John Nunn in chess endgames are notable examples of researchers doing this work.'], [0.004372000000000003, 'Though they did not and are not involved in tablebase generation Berlekamp in dots and boxes etc. and John Nunn in chess endgames are notable examples of researchers doing this work.'], [0.004372000000000003, 'Though they did not and are not involved in tablebase generation Berlekamp in dots and boxes etc and John Nunn in chess endgames are notable examples of researchers doing this work.'], [0.004372000000000003, 'Though they did not and are not involved in tablebase generation Berlekamp in dots and boxes et cetera and John Nunn in chess endgames are notable examples of researchers doing this work.'], [0.00315, 'Though they were not and are not involved in tablebase generation Berlekamp in dots and boxes etcetera and John Nunn in chess endgames are notable examples of researchers doing this work.']] WITH CHUNKING: [[-5.7811926888612515, 'Though they were not and are not involved in tablebase generation berlekamp in dots and boxes etcetera and john nunn in chess endgames are notable examples of researchers doing this work.'], [-5.453037621630834, 'Though they did not and are not involved in tablebase generation berlekamp in dots and boxes et cetera and john nunn in chess endgames are notable examples of researchers doing this work.'], [-5.453037621630834, 'Though they did not and are not involved in tablebase generation berlekamp in dots and boxes etc and john nunn in chess endgames are notable examples of researchers doing this work.'], [-5.453037621630834, 'Though they did not and are not involved in tablebase generation berlekamp in dots and boxes etc. and john nunn in chess endgames are notable examples of researchers doing this work.'], [-5.453037621630834, 'Though they did not and are not involved in tablebase generation berlekamp in dots and boxes etcetera and john nunn in chess endgames are notable examples of researchers doing this work.']] ORIGINAL: `While the predominant European language in [[Egypt]] is [[English language|English]], French is considered to be a more sophisticated language by some elements of the Egyptian upper and upper-middle classes; for this reason, a typical educated Egyptian will learn French in addition to English at some point in his or her education.' FULL: None WITH CHUNKING: [[-8.646166333563425, 'While the predominant european language in egypt is english french is considered to be a more sophisticated language by some elements of the upper and upper middle egyptian classes. for this reason a typical educated egyptian will learn french in addition to english at some point in his education.'], [-8.092692375145482, 'While the european predominant language in egypt is english french is considered to be a more sophisticated language by some elements of the egyptian upper and upper middle classes. for this reason a typical egyptian educated will learn french in addition to english at some point in his education.'], [-8.092692375145482, 'While the predominant european language in egypt is english french is considered to be a more sophisticated language by some elements of the egyptian upper and upper middle classes. for this reason a typical egyptian educated will learn french in addition to english at some point in his education.'], [-8.069011340356177, 'While the european predominant language in egypt is english french is considered to be a more sophisticated language by some elements of the egyptian upper and upper middle classes. for this reason a typical educated egyptian will learn french in addition to english at some point in his education.'], [-8.069011340356177, 'While the predominant european language in egypt is english french is considered to be a more sophisticated language by some elements of the egyptian upper and upper middle classes. for this reason a typical educated egyptian will learn french in addition to english at some point in his education.']] ORIGINAL: `As a chip maker, IBM has been among the [[Worldwide Top 20 Semiconductor Sales Leaders]] in past years, and in 2007 IBM ranked second in the list of largest software companies in the world.' FULL: [[0.11517399999999998, 'As a chip maker IBM has been among the worldwide top twenty semiconductor sales leaders in past years and in 2007 IBM ranked second in the list of largest software companies in the world.'], [0.10381000000000001, 'As a chip maker IBM has been among the semiconductor worldwide top twenty sales leaders in past years and in 2007 IBM ranked second in the list of largest software companies in the world.']] WITH CHUNKING: [[-3.7973851962504357, 'As a chip maker ibm has been among the semiconductor worldwide top twenty sales leaders in past years and in 2007 ibm ranked second in the list of largest software companies in the world.'], [-3.6935072701849885, 'As a chip maker ibm has been among the worldwide top twenty semiconductor sales leaders in past years and in 2007 ibm ranked second in the list of largest software companies in the world.']] ORIGINAL: `Some search engines, such as [[Google]], store all or part of the source page (referred to as a [[web cache|cache]]) as well as information about the web pages, whereas others, such as [[AltaVista]], store every word of every page they find.' FULL: [[0.07408999999999998, 'Some search engines such as Google store all or part of the source page referred as a cache as well as information on the web pages whereas others such as AltaVista store every word of every page which they find.'], [0.06961000000000003, 'Some search engines such as Google store all or part of the source page referred as a cache as well as information on the web pages whereas others such as AltaVista store every word of every page that they find.'], [0.057361, 'Some search engines such as Google store all or part of the source page referred as a cache as well as information on the web pages whereas others such as AltaVista store every word of every page who they find.'], [0.05423000000000002, 'Some search engines such as Google store all or part of the source page referred as a cache as well as information about the web pages whereas others such as AltaVista store every word of every page which they find.'], [0.05095399999999998, 'Some search engines such as Google store all or part of the source page referred as a cache as well as information about the web pages whereas others such as AltaVista store every word of every page that they find.']] WITH CHUNKING: [[-2.8705392315786185, 'Whereas others such as altavista store every word of every page that they find some search engines such as google store all or part of the source page referred as a cache as well as information about the web pages.'], [-2.8081568361769182, 'Whereas others such as altavista store every word of every page which they find some search engines such as google store all or part of the source page referred as a cache as well as information about the web pages.'], [-2.7519442373422534, 'Whereas others such as altavista store every word of every page who they find some search engines such as google store all or part of the source page referred as a cache as well as information on the web pages.'], [-2.558485370663009, 'Whereas others such as altavista store every word of every page that they find some search engines such as google store all or part of the source page referred as a cache as well as information on the web pages.'], [-2.4961029752613086, 'Whereas others such as altavista store every word of every page which they find some search engines such as google store all or part of the source page referred as a cache as well as information on the web pages.']] ORIGINAL: `For example, if a picture has metadata that indicates the most important region — the one where there is a person — an image viewer on a small screen, such as on a mobile phone's, can narrow the picture to that region and thus show the user the most interesting details.' FULL: [[0.010225999999999985, "For example if a picture has metadata which indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user."], [0.00931399999999998, "For example if a picture has metadata who indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user."], [0.009039999999999987, "For example if a picture has metadata that indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user."], [0.0026090000000000032, "For example if a picture has metadata which indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture of to that region and thus show the most interesting details to the user."], [0.0023750000000000012, "For example if a picture has metadata who indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture of to that region and thus show the most interesting details to the user."]] WITH CHUNKING: [[-6.230520408091955, "For example an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user if a picture has metadata who indicates the most important region the one where there is a person."], [-6.138362353545007, "For example an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user if a picture has metadata which indicates the most important region the one where there is a person."], [-5.539487872606758, "For example if a picture has metadata that indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user."], [-5.507858095345387, "For example if a picture has metadata who indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user."], [-5.415700040798439, "For example if a picture has metadata which indicates the most important region the one where there is a person an image viewer on a small screen such as on a mobile phone's can narrow the picture to that region and thus show the most interesting details to the user."]] ORIGINAL: `The program downloaded the directory listings of all the files located on public anonymous FTP ([[File Transfer Protocol]]) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites.' FULL: None WITH CHUNKING: None ORIGINAL: `Dynamic recompilation can achieve optimizations superior to static compilation because the dynamic compiler can base optimizations on knowledge about the runtime environment and the set of loaded classes, and can identify the ''hot spots'' (parts of the program, often inner loops, that take up the most execution time).' FULL: None WITH CHUNKING: None ORIGINAL: `The medieval affricates {{IPA|/ts/}}, {{IPA|/dz/}}, {{IPA|/tʃ/}}, {{IPA|/dʒ/}} merged with the fricatives {{IPA|/s/}}, {{IPA|/z/}}, {{IPA|/ʃ/}}, {{IPA|/ʒ/}}, respectively, but not with each other, and there were no other significant changes to the consonant phonemes since then.' FULL: [[0.0002619999999999999, 'The mediæval affricates merged with the fricatives respectively but not with one another and there were no significant other changes to the consonant phonemes since then.'], [0.0002619999999999999, 'The mediæval affricates merged with the fricatives respectively but not with each other and there were no significant other changes to the consonant phonemes since then.'], [0.0002619999999999999, 'The mediæval affricates merged with the fricatives ( respectively but not with one another and there were no significant other changes to the consonant phonemes since then.'], [0.0002619999999999999, 'The mediæval affricates merged with the fricatives ( respectively but not with each other and there were no significant other changes to the consonant phonemes since then.'], [0.0002619999999999999, 'The mediæval affricates merged with the fricatives ( respectively but not with one another and there were no significant other changes to the consonant phonemes since then.']] WITH CHUNKING: [[-9.441160385939321, 'The mediæval affricates merged with the fricatives ( respectively but not with one another and there were no significant other changes to the consonant phonemes since then.'], [-9.441160385939321, 'The mediæval affricates merged with the fricatives ( respectively but not with each other and there were no significant other changes to the consonant phonemes since then.'], [-9.441160385939321, 'The mediæval affricates merged with the fricatives ( respectively but not with one another and there were no significant other changes to the consonant phonemes since then.'], [-9.441160385939321, 'The mediæval affricates merged with the fricatives respectively but not with each other and there were no significant other changes to the consonant phonemes since then.'], [-9.441160385939321, 'The mediæval affricates merged with the fricatives respectively but not with one another and there were no significant other changes to the consonant phonemes since then.']] ORIGINAL: `Cohabitation with the Scandinavians resulted in a significant grammatical simplification and lexical supplementation of the Anglo-Frisian core of English; the later [[Normans|Norman]] occupation led to the grafting onto that Germanic core of a more elaborate layer of words from the [[Italic languages|Italic]] branch of the European languages.' FULL: [[0.000162, 'Cohabitation with the Scandinavians resulted in a significant grammatical simplification and lexical supplementation of the Anglo Frisian core of English. the later Norman occupation led to the grafting of a more elaborate layer of words from the Italic branch of the european languages onto that Germanic core.'], [0.000162, 'Cohabitation with the Scandinavians resulted in a grammatical significant simplification and lexical supplementation of the Anglo Frisian core of English. the later Norman occupation led to the grafting of a more elaborate layer of words from the Italic branch of the european languages onto that Germanic core.'], [0.000116, 'Cohabitation with the Scandinavians resulted in a significant grammatical simplification and lexical supplementation of the Anglo Frisian core of English. the Norman later occupation led to the grafting of a more elaborate layer of words from the Italic branch of the european languages onto that Germanic core.'], [0.000116, 'Cohabitation with the Scandinavians resulted in a grammatical significant simplification and lexical supplementation of the Anglo Frisian core of English. the Norman later occupation led to the grafting of a more elaborate layer of words from the Italic branch of the european languages onto that Germanic core.'], [8.6e-05, 'Cohabitation with the Scandinavians resulted in a significant grammatical simplification and lexical supplementation of the Anglo Frisian core of English. the late more Norman occupation led to the grafting of a more elaborate layer of words from the Italic branch of the european languages onto that Germanic core.']] WITH CHUNKING: [[-9.398004816609205, 'Cohabitation with the scandinavians resulted in a significant grammatical simplification and lexical supplementation of the anglo frisian core of english. the norman later occupation led to the grafting of a more elaborate layer of words from the italic branch of the european languages onto that germanic core.'], [-9.217420879713792, 'Cohabitation with the scandinavians resulted in a grammatical significant simplification and lexical supplementation of the anglo frisian core of english. the late more norman occupation led to the grafting of a more elaborate layer of words from the italic branch of the european languages onto that germanic core.'], [-9.217420879713792, 'Cohabitation with the scandinavians resulted in a significant grammatical simplification and lexical supplementation of the anglo frisian core of english. the late more norman occupation led to the grafting of a more elaborate layer of words from the italic branch of the european languages onto that germanic core.'], [-8.589856243354753, 'Cohabitation with the scandinavians resulted in a grammatical significant simplification and lexical supplementation of the anglo frisian core of english. the later norman occupation led to the grafting of a more elaborate layer of words from the italic branch of the european languages onto that germanic core.'], [-8.589856243354753, 'Cohabitation with the scandinavians resulted in a significant grammatical simplification and lexical supplementation of the anglo frisian core of english. the later norman occupation led to the grafting of a more elaborate layer of words from the italic branch of the european languages onto that germanic core.']] ORIGINAL: `An important difference between inflection and word-formation is that inflected word-forms of lexemes are organized into paradigms, which are defined by the requirements of syntactic rules, whereas the rules of word-formation are not restricted by any corresponding requirements of syntax.' FULL: None WITH CHUNKING: [[-8.696637094332562, 'An important difference between inflection and word formation is that whereas the rules of word formation are not restricted by any corresponding requirements of syntax word forms of lexemes inflected are organized into paradigms that are defined by the requirements of syntactic rules.'], [-8.669267697372979, 'An important difference between inflection and word formation is that word forms of lexemes inflected are organized into paradigms who are defined by the requirements of syntactic rules whereas the rules of word formation are not restricted by any corresponding requirements of syntax.'], [-8.626970643672857, 'An important difference between inflection and word formation is that word forms of lexemes inflected are organized into paradigms that are defined by the requirements of syntactic rules whereas the rules of word formation are not restricted by any corresponding requirements of syntax.'], [-8.604390666482859, 'An important difference between inflection and word formation is that whereas the rules of word formation are not restricted by any corresponding requirements of syntax word forms of lexemes inflected are organized into paradigms which are defined by the requirements of syntactic rules.'], [-8.534724215823154, 'An important difference between inflection and word formation is that word forms of lexemes inflected are organized into paradigms which are defined by the requirements of syntactic rules whereas the rules of word formation are not restricted by any corresponding requirements of syntax.']] ORIGINAL: `Most free software is distributed [[online]] without charge, or [[off-line]] at the [[marginal cost]] of distribution, but this pricing model is not required, and people may sell copies of free software programs for any price.' FULL: None WITH CHUNKING: [[-10.973590835222915, "Online without charge or offline at the marginal cost of distribution most free software is distributed but this pricing model isn't required and people may sell copies of free software programs for any price."], [-10.338576881009702, 'Most free software is distributed online without charge or at the marginal cost of distribution offline but this pricing model is not required and people may sell copies of free software programs for any price.'], [-9.423276951982235, 'Online without charge or at the marginal cost of distribution offline most free software is distributed but this pricing model is not required and people may sell copies of free software programs for any price.'], [-7.880746827845099, 'Most free software is distributed online without charge or offline at the marginal cost of distribution but this pricing model is not required and people may sell copies of free software programs for any price.'], [-7.538254675616027, 'Online without charge or offline at the marginal cost of distribution most free software is distributed but this pricing model is not required and people may sell copies of free software programs for any price.']] ORIGINAL: `For example, person and number are categories that can be used to define paradigms in English, because English has [[Agreement (linguistics)|grammatical agreement]] rules that require the verb in a sentence to appear in an inflectional form that matches the person and number of the subject.' FULL: None WITH CHUNKING: [[-6.827059158224806, 'For example because english has grammatical agreement rules which require the verb in a sentence to appear in a inflectional form that matches the person and number of the subject person and number are categories which can be used to define paradigms in english.'], [-6.776826869453098, 'For example because english has grammatical agreement rules which require the verb in a sentence to appear in a inflectional form who matches the person and number of the subject person and number are categories that can be used to define paradigms in english.'], [-6.765710633949311, 'For example because english has grammatical agreement rules that require the verb in a sentence to appear in a inflectional form who matches the person and number of the subject person and number are categories which can be used to define paradigms in english.'], [-6.761921674220786, 'For example because english has grammatical agreement rules which require the verb in a sentence to appear in a inflectional form which matches the person and number of the subject person and number are categories which can be used to define paradigms in english.'], [-6.6986366565525755, 'For example because english has grammatical agreement rules which require the verb in a sentence to appear in a inflectional form who matches the person and number of the subject person and number are categories which can be used to define paradigms in english.']] ORIGINAL: `This page in Wikipedia is itself an example of such usage, where the textual information is data, how it is packaged, linked, referenced, styled and displayed is markup and aspects and characteristics of that markup are metadata set globally across Wikipedia.' FULL: None WITH CHUNKING: [[-6.674114568352025, 'This page in wikipedia is an example of such usage itself. where the textual information is data how it is packaged linked referenced styled and displayed is markup and aspects and characteristics of that markup are metadata set globally across wikipedia.'], [-6.673651120067869, 'This page in wikipedia is an example of such usage itself. where the textual information is data how it is packaged linked referenced styled and displayed is markup and aspects and characteristics of that markup am metadata globally set across wikipedia.'], [-6.512017342698557, 'This page in wikipedia is an example of such usage itself. where the textual information is data how it is packaged linked referenced styled and displayed is markup and aspects and characteristics of that markup are metadata globally set across wikipedia.'], [-6.304438985149688, 'This page in wikipedia is an example of such usage itself. where the textual information is data how it is packaged linked referenced styled and displayed is markup and aspects and characteristics of that markup am metadata set across wikipedia globally.'], [-6.142777600958288, 'This page in wikipedia is an example of such usage itself. where the textual information is data how it is packaged linked referenced styled and displayed is markup and aspects and characteristics of that markup are metadata set across wikipedia globally.']] ORIGINAL: `Some claim that this method is biased by counting more vulnerabilities for the free software, since its source code is accessible and its community is more forthcoming about what problems exist.' FULL: None WITH CHUNKING: [[-6.132582121072817, 'Since its source code is accessible and its community is more forthcoming ‘bout what problems of who exist some claim that this method is biased by counting more vulnerabilities for the free software.'], [-6.132582121072817, 'Since its source code is accessible and its community is more forthcoming ’bout what problems of who exist some claim that this method is biased by counting more vulnerabilities for the free software.'], [-5.865962186460973, 'Since its source code is accessible and its community is more forthcoming about what problems of which exist some claim that this method is biased by counting more vulnerabilities for the free software.'], [-5.865962186460973, 'Since its source code is accessible and its community is more forthcoming ‘bout what problems of which exist some claim that this method is biased by counting more vulnerabilities for the free software.'], [-5.865962186460973, 'Since its source code is accessible and its community is more forthcoming ’bout what problems of which exist some claim that this method is biased by counting more vulnerabilities for the free software.']] ORIGINAL: `BI systems provide historical, current, and predictive views of business operations, most often using data that has been gathered into a [[data warehouse]] or a [[data mart]] and occasionally working from operational data.' FULL: None WITH CHUNKING: [[-5.185253516485362, 'Bi systems provide historical current and predictive views of business operations. to be using data that has been gathered into a data warehouse or a datum mart most often and to be working from operational data occasionally.'], [-5.087685308646401, 'Bi systems provide historical current and predictive views of business operations. to be using data which has been gathered into a data warehouse or a datum mart most often and to be working from operational data occasionally.'], [-4.647103036759915, 'Bi systems provide historical current and predictive views of business operations. to be using data who has been gathered into a data warehouse or a data mart most often and to be working from operational data occasionally.'], [-4.62138618579521, 'Bi systems provide historical current and predictive views of business operations. to be using data that has been gathered into a data warehouse or a data mart most often and to be working from operational data occasionally.'], [-4.5217856040174595, 'Bi systems provide historical current and predictive views of business operations. to be using data which has been gathered into a data warehouse or a data mart most often and to be working from operational data occasionally.']] ORIGINAL: `In a 2007 report of the United States' richest people, [[Forbes]] reported that [[Sergey Brin]] and [[Larry Page]] were tied for #5 with a net worth of US$18.5 billion each.' FULL: None WITH CHUNKING: None ORIGINAL: `These techniques were primarily developed in the medical and biological sciences, but they are also widely used in the social sciences like economics, as well as in engineering (reliability and failure time analysis).' FULL: None WITH CHUNKING: None ORIGINAL: `The domain ''google.com'' was registered on [[September 15]], [[1997]], and the company was incorporated as ''Google Inc.'' on [[September 7]], [[1998]] at a friend's garage in [[Menlo Park, California]].' FULL: None WITH CHUNKING: None ORIGINAL: `This appears to be because learning subsequent foreign languages is easier than learning one's first, while the use of a grammatically simple and culturally flexible auxiliary language like Esperanto lessens the first-language learning hurdle.' FULL: [[0.013637000000000002, "This appears to be because while the use of an auxiliary language like Esperanto simple grammatically and flexible culturally lessens the first language learning hurdle learning of subsequent foreign languages is easier than learning one's first."], [0.013637000000000002, "This appears to be because while the use of an auxiliary language like Esperanto simple grammatically and flexible culturally lessens the first language learning hurdle learning of foreign subsequent languages is easier than learning one's first."], [0.011858999999999998, "This appears to be because while the use of an auxiliary language like Esperanto simple grammatically and flexible culturally lessens the first language learning hurdle learning subsequent foreign languages is easier than learning one's first."], [0.011858999999999998, "This appears to be because while the use of an auxiliary language like Esperanto simple grammatically and flexible culturally lessens the first language learning hurdle learning foreign subsequent languages is easier than learning one's first."], [0.010915, "This appears to be because while the use of an grammatically simple and culturally flexible auxiliary language like Esperanto lessens the first language learning hurdle learning of subsequent foreign languages is easier than learning one's first."]] WITH CHUNKING: None ORIGINAL: `The multiplication operator is pronounced "per" in Italian, and so it is sometimes used to replace the word "per", which means "for"; thus, for example, "per te" ("for you") is shortened to "x te" (compare with English "4 U").' FULL: None WITH CHUNKING: None ORIGINAL: `* [[IBM Information Management Software|Information Management Software]] — database servers and tools, text analytics, content management, business process management and business intelligence.' FULL: None WITH CHUNKING: None ORIGINAL: `Many [[French language|French]] words are also intelligible to an English speaker (though pronunciations are often quite different) because English absorbed a large vocabulary from [[Norman language|Norman]] and French, via [[Anglo-Norman]] after the Norman Conquest and directly from French in subsequent centuries.' FULL: None WITH CHUNKING: [[-5.062958777964749, 'Though pronunciations are quite different from it often many french words are also intelligible to an english speaker because english absorbed a large vocabulary from norman and french via anglo-norman after the norman conquest and directly from french in subsequent centuries.'], [-4.640344435658935, 'Because english absorbed a large vocabulary from norman and french via anglo-norman after the norman conquest and in subsequent centuries directly from french though pronunciations are quite different often many french words are also intelligible to an english speaker.'], [-4.628826518543797, 'Because english absorbed a large vocabulary from norman and french via anglo-norman after the norman conquest and directly from french in subsequent centuries though pronunciations are quite different often many french words are also intelligible to an english speaker.'], [-3.574627386683728, 'Because english absorbed a large vocabulary from norman and french via anglo-norman after the norman conquest and in subsequent centuries directly from french though pronunciations are quite different from it often many french words are also intelligible to an english speaker.'], [-3.56310946956859, 'Because english absorbed a large vocabulary from norman and french via anglo-norman after the norman conquest and directly from french in subsequent centuries though pronunciations are quite different from it often many french words are also intelligible to an english speaker.']] ORIGINAL: `This also means that just because a program is written in a popular programming language such as [[C (programming language)|C]] or [[C++]], it does not mean it will run on all [[operating systems]] that support that [[programming language]].' FULL: None WITH CHUNKING: [[-10.492830861268356, 'Also this means as though just because a program is written in a popular programming language such as c or c++ it does not mean that it will run on all operating systems which support that programming language.'], [-10.492830861268356, 'Also this means as though just because a program is written in a popular programming language such as c. or c++ it does not mean that it will run on all operating systems which support that programming language.'], [-10.492830861268356, 'Also this means like just because a program is written in a popular programming language such as (c) or c++ it does not mean that it will run on all operating systems which support that programming language.'], [-10.492830861268356, 'Also this means like just because a program is written in a popular programming language such as c or c++ it does not mean that it will run on all operating systems which support that programming language.'], [-10.492830861268356, 'Also this means like just because a program is written in a popular programming language such as c. or c++ it does not mean that it will run on all operating systems which support that programming language.']] ORIGINAL: `In the early years, speakers of Esperanto kept in contact primarily through correspondence and [[magazine|periodicals]], but in 1905 the first [[World Congress of Esperanto|world congress of Esperanto speakers]] was held in [[Boulogne-sur-Mer]], [[France]].' FULL: [[0.025808, 'In the early years speakers of Esperanto kept in contact primarily through correspondence and periodicals but France in 1905 the first world congress of Esperanto speakers was held in Boulogne-sur- Mer.'], [0.000984, 'In the early years speakers of Esperanto kept in contact through correspondence and periodicals primarily but France in 1905 the first world congress of Esperanto speakers was held in Boulogne-sur- Mer.']] WITH CHUNKING: [[-6.5734218129105795, 'In the early years speakers of esperanto kept in contact through correspondence and periodicals primarily but france in 1905 the first world congress of esperanto speakers was held in boulogne-sur- mer.'], [-5.024391175230294, 'In the early years speakers of esperanto kept in contact primarily through correspondence and periodicals but france in 1905 the first world congress of esperanto speakers was held in boulogne-sur- mer.']] ORIGINAL: `As of [[December 11]], [[2007]], Google, like the [[Microsoft]] search engine, stores "personal information for 18 months" and by comparison, [[Yahoo!]] and [[AOL]] ([[Time Warner]]) "retain search requests for 13 months."' FULL: None WITH CHUNKING: None ORIGINAL: `The French words which have developed from Latin are usually less recognisable than [[Italian language|Italian]] words of Latin origin because as French evolved from [[Vulgar Latin]], the unstressed final [[syllable]] of many words was dropped or elided into the following word.' FULL: None WITH CHUNKING: [[-9.927508710337682, 'Because as french evolved from vulgar latin the un-stressed final syllable of many words was dropped or elided into the following word usually the french words which have developed from latin are less recognisable than italian words of latin origin.'], [-9.820900267554471, 'Because as french evolved from vulgar latin the un-stressed final syllable of many words was dropped or elided into the following word usually the french words who have developed from latin are less recognisable than italian words of latin origin.'], [-9.275997973672755, 'Because as french evolved from vulgar latin the final syllable of many words un-stressed was dropped or elided into the following word usually the french words that have developed from latin are less recognisable than italian words of latin origin.'], [-9.146119174429467, 'Because as french evolved from vulgar latin the final syllable of many words un-stressed was dropped or elided into the following word usually the french words which have developed from latin are less recognisable than italian words of latin origin.'], [-9.039510731646256, 'Because as french evolved from vulgar latin the final syllable of many words un-stressed was dropped or elided into the following word usually the french words who have developed from latin are less recognisable than italian words of latin origin.']] ORIGINAL: `For difficult problems, most of these algorithms can require enormous computational resources — most experience a "[[combinatorial explosion]]": the amount of memory or computer time required becomes astronomical when the problem goes beyond a certain size.' FULL: None WITH CHUNKING: None ORIGINAL: `Results are not yet available from a study in Australia to see if similar benefits would occur for learning East Asian languages, but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.' FULL: [[0.029924000000000048, 'Results are not yet available from a study in Australia to see if benefits similar to it would occur for learning east Asian languages but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.'], [0.02656900000000003, 'Results are not yet available from a study in Australia to see if similar benefits to it would occur for learning east Asian languages but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.'], [0.021578000000000038, 'Results are not yet available from a study in Australia to see whether benefits similar to it would occur for learning east Asian languages but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.'], [0.021189000000000027, 'Results are not yet available from a study in Australia to see if similar benefits would occur for learning east Asian languages but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.'], [0.01915000000000004, 'Results are not yet available from a study in Australia to see whether similar benefits to it would occur for learning east Asian languages but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.']] WITH CHUNKING: [[-6.32711511590105, 'Results are not yet available from a study in australia to see that would similar benefits occur for learning of east asian languages but the pupils taking esperanto did better and enjoyed the subject more than those taking other languages.'], [-6.195145514784899, 'Results are not yet available from a study in australia to see that would similar benefits to it occur for learning of east asian languages but the pupils taking esperanto did better and enjoyed the subject more than those taking other languages.'], [-5.224752442491887, 'Results are not yet available from a study in australia to see that would benefits similar to it occur for learning east asian languages but the pupils taking esperanto did better and enjoyed the subject more than those taking other languages.'], [-5.067436524024533, 'Results are not yet available from a study in australia to see that would similar benefits occur for learning east asian languages but the pupils taking esperanto did better and enjoyed the subject more than those taking other languages.'], [-4.935475417091846, 'Results are not yet available from a study in australia to see that would similar benefits to it occur for learning east asian languages but the pupils taking esperanto did better and enjoyed the subject more than those taking other languages.']] ORIGINAL: `[[Windows NT]] and its successors are designed for security (including on a network) and multi-user PCs, but are not designed with Internet security in mind as much since, when it was first developed in the early 1990s, Internet use was less prevalent.' FULL: None WITH CHUNKING: None ORIGINAL: `For any one [[logical model]] various physical implementations may be possible, and most products will offer the user some level of control in tuning the [[physical implementation]], since the choices that are made have a significant effect on performance.' FULL: [[9.000000000000002e-06, 'For any one logical model various physical implementations may be possible and since the choices which are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [9.000000000000002e-06, 'For any one logical model physical various implementations may be possible and since the choices which are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [8e-06, 'For any one logical model various physical implementations may be possible and since the choices who are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [8e-06, 'For any one logical model physical various implementations may be possible and since the choices who are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [6.999999999999999e-06, 'For any one logical model various physical implementations may be possible and since the choices that are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.']] WITH CHUNKING: [[-13.39738629908021, 'For any one logical model various physical implementations may be possible and since the choices that are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [-13.335357336392265, 'For any one logical model physical various implementations may be possible and since the choices who are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [-13.335357336392265, 'For any one logical model various physical implementations may be possible and since the choices who are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [-13.157865263884647, 'For any one logical model physical various implementations may be possible and since the choices which are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.'], [-13.157865263884647, 'For any one logical model various physical implementations may be possible and since the choices which are made have a significant effect on performance most products will offer some level of control in tuning the physical implementation to the user.']] ORIGINAL: `Because formant-based systems have complete control of all aspects of the output speech, a wide variety of prosodies and [[Intonation (linguistics)|intonation]]s can be output, conveying not just questions and statements, but a variety of emotions and tones of voice.' FULL: None WITH CHUNKING: None ORIGINAL: `Software database drivers are available for most database platforms so that [[application software]] can use a common [[Application Programming Interface]] to retrieve the information stored in a database.' FULL: [[0.006508000000000001, 'Software database drivers are available for most database platforms so that application software can use a common application programming interface to retrieve the information stored in a database.'], [0.006088, 'Software database drivers are available for most database platforms so that application software can use a common programming application interface to retrieve the information stored in a database.'], [0.001624, 'So that application software can use a common application programming interface to retrieve the information stored in a database software database drivers are available for most database platforms.'], [0.0015199999999999999, 'So that application software can use a common programming application interface to retrieve the information stored in a database software database drivers are available for most database platforms.']] WITH CHUNKING: [[-6.345201092371656, 'Software database drivers are available for most database platforms so that application software can use a common programming application interface to retrieve the information stored in a database.'], [-6.278358833944351, 'Software database drivers are available for most database platforms so that application software can use a common application programming interface to retrieve the information stored in a database.'], [-4.845351783975499, 'So that application software can use a common programming application interface to retrieve the information stored in a database software database drivers are available for most database platforms.'], [-4.778509525548193, 'So that application software can use a common application programming interface to retrieve the information stored in a database software database drivers are available for most database platforms.']] ORIGINAL: `Unlike [[Alan Newell|Newell]] and [[Herbert Simon|Simon]], [[John McCarthy (computer scientist)|John McCarthy]] felt that machines did not need to simulate human thought, but should instead try to find the essence of abstract reasoning and problem solving, regardless of whether people used the same algorithms.' FULL: [[0.0024090000000000006, 'Unlike Newell and Simon John Mccarthy felt like machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.'], [0.0024090000000000006, 'Unlike Newell and Simon John Mccarthy felt as though machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.'], [0.0024090000000000006, 'Unlike Newell and Simon John Mccarthy felt as if machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.'], [0.002356, 'Unlike Newell and Simon John Mccarthy did feel like machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.'], [0.002356, 'Unlike Newell and Simon John Mccarthy did feel as though machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.']] WITH CHUNKING: [[-5.502845312444922, 'Unlike newell and simon john mccarthy felt as though machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people used the same algorithms.'], [-5.502845312444922, 'Unlike newell and simon john mccarthy felt like machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people used the same algorithms.'], [-5.413593611631001, 'Unlike newell and simon john mccarthy felt as if machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.'], [-5.413593611631001, 'Unlike newell and simon john mccarthy felt as though machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.'], [-5.413593611631001, 'Unlike newell and simon john mccarthy felt like machines did not need to simulate human thought but should instead try to find the essence of abstract reasoning and problem solving regardless of whether people did use the same algorithms.']] ORIGINAL: `Although there is a lack of Linux ports for some [[Mac OS X]] and [[Microsoft Windows]] programs in domains such as [[desktop publishing]] and [[professional audio]], applications equivalent to those available for Mac and Windows are available for Linux.' FULL: None WITH CHUNKING: None ORIGINAL: `Portuguese and Spanish are the fastest-growing European languages, and, according to estimates by UNESCO, Portuguese is the language with the highest potential for growth as an international language in southern Africa and South America.' FULL: [[0.00015800000000000002, 'Portuguese and Spanish are the european fastest growing languages and according to estimates by UNESCO Portuguese is the language with the highest potential for growth as an international language in southern Africa and South America.']] WITH CHUNKING: [[-3.3819799104839965, 'Portuguese and spanish are the european fastest growing languages and according to estimates by unesco portuguese is the language with the highest potential for growth as an international language in southern africa and south america.']] ORIGINAL: `The acoustic noise problem is actually more severe in the helicopter environment, not only because of the high noise levels but also because the helicopter pilot generally does not wear a facemask, which would reduce acoustic noise in the microphone.' FULL: None WITH CHUNKING: None ORIGINAL: `One might argue that "definitions" of this sort really rely on speakers' prior intuitive knowledge of what nouns, verbs and adjectives are, and, so don't really add anything over and beyond this.' FULL: [[0.0035350000000000012, 'One might argue that really definitions of this sort do rely on and so do not really add anything over and beyond this to speakers ’ prior intuitive knowledge of what nouns verbs and adjectives are.'], [0.0035350000000000012, 'One might argue that really definitions of this sort do rely on and so do not really add anything over and beyond this to speakers ’ intuitive prior knowledge of what nouns verbs and adjectives are.'], [0.002995, 'One might argue that really definitions of this sort rely on and so do not really add anything over and beyond this to speakers ’ prior intuitive knowledge of what nouns verbs and adjectives are.'], [0.002995, 'One might argue that really definitions of this sort rely on and so do not really add anything over and beyond this to speakers ’ intuitive prior knowledge of what nouns verbs and adjectives are.'], [0.0028999999999999994, "One might argue that really definitions of this sort do rely on and so do not really add anything over and beyond this to speakers's prior intuitive knowledge of what nouns verbs and adjectives are."]] WITH CHUNKING: [[-5.671288773447082, 'One might argue that really definitions of this sort rely on speakers ’ prior intuitive knowledge of what nouns verbs and adjectives are and so do not really add anything over and beyond this.'], [-5.516178349477905, 'One might argue that really definitions of this sort do rely on and so do not really add anything over and beyond this to speakers ’ intuitive prior knowledge of what nouns verbs and adjectives are.'], [-5.516178349477905, 'One might argue that really definitions of this sort do rely on and so do not really add anything over and beyond this to speakers ’ prior intuitive knowledge of what nouns verbs and adjectives are.'], [-5.3783925364818, 'One might argue that really definitions of this sort do rely on speakers ’ intuitive prior knowledge of what nouns verbs and adjectives are and so do not really add anything over and beyond this.'], [-5.3783925364818, 'One might argue that really definitions of this sort do rely on speakers ’ prior intuitive knowledge of what nouns verbs and adjectives are and so do not really add anything over and beyond this.']] ORIGINAL: `It is applied as a means of coding and storage of universal knowledge — 60–70% of all world information is published in English and Russian languages.' FULL: [[0.005254000000000001, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy per cent of all world information is published in English and Russian languages.'], [0.0025039999999999997, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy percent of all world information is published in English and Russian languages.'], [0.0025039999999999997, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy % of all world information is published in English and Russian languages.'], [0.000749, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy per cent of all world information is published in languages English and Russian.'], [0.00035699999999999995, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy percent of all world information is published in languages English and Russian.']] WITH CHUNKING: [[-10.396807171152933, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy percent of all world information is published in languages english and russian.'], [-9.655933090744885, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy per cent of all world information is published in languages english and russian.'], [-8.447789664259869, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy % of all world information is published in english and russian languages.'], [-8.447789664259869, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy percent of all world information is published in english and russian languages.'], [-7.706915890524959, 'It is applied as a means of coding and storage of universal knowledge. sixty to seventy per cent of all world information is published in english and russian languages.']] ORIGINAL: `Section three of the license requires that programs distributed as pre-compiled binaries are accompanied by a copy of the source code, a written offer to distribute the source code via the same mechanism as the pre-compiled binary or the written offer to obtain the source code that you got when you received the pre-compiled binary under the GPL.' FULL: None WITH CHUNKING: None ORIGINAL: `It runs on [[Unix-like]] systems with [[X11]], Microsoft Windows and Mac OS X. It permits applications written to use it to run on all of the systems that it supports, if the application does not use any [[operating system]]-specific programming in addition to it.' FULL: None WITH CHUNKING: [[-8.659995165079188, 'It runs on unix like systems with x11 microsoft windows and mac os x. if the application does not use any operating systems specific programming in addition to it it permits applications written to use it to run on all of the systems who it supports.'], [-8.644995971762262, 'It runs on unix like systems with x11 microsoft windows and mac os x. if the application does not use any operating system specific programming in addition to it it permits applications written to use it to run on all of the systems that it supports.'], [-8.545846767271415, 'It runs on unix like systems with x11 microsoft windows and mac os x. if the application does not use any operating system specific programming in addition to it it permits applications written to use it to run on all of the systems which it supports.'], [-8.521391334813991, 'It runs on unix like systems with x11 microsoft windows and mac os x. if the application does not use any operating systems specific programming in addition to it it permits applications written to use it to run on all of the systems that it supports.'], [-8.422242130323145, 'It runs on unix like systems with x11 microsoft windows and mac os x. if the application does not use any operating systems specific programming in addition to it it permits applications written to use it to run on all of the systems which it supports.']] ORIGINAL: `Research in NLP evaluation has received considerable attention, because the definition of proper evaluation criteria is one way to specify precisely an NLP problem, going thus beyond the vagueness of tasks defined only as ''language understanding'' or ''language generation''.' FULL: [[0.0010909999999999997, 'Because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to specify a nlp problem precisely research in NLP evaluation has received considerable attention.'], [0.00038100000000000005, 'Because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to precisely specify a nlp problem research in NLP evaluation has received considerable attention.'], [0.0003410000000000002, 'Research in NLP evaluation has received considerable attention because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to specify a nlp problem precisely.'], [0.000184, 'Because the definition of proper evaluation criterion is one way to precisely specify a nlp problem going beyond the vagueness of tasks only defined as language understanding or language generation research in NLP evaluation has received considerable attention.'], [0.000123, 'Because the definition of proper evaluation criterion is one way to specify a nlp problem precisely going beyond the vagueness of tasks only defined as language understanding or language generation research in NLP evaluation has received considerable attention.']] WITH CHUNKING: [[-6.748728306475685, 'Because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way precisely to specify a nlp problem research in nlp evaluation has received considerable attention.'], [-6.43157007509807, 'Research in nlp evaluation has received considerable attention because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to precisely specify a nlp problem.'], [-4.931720766701912, 'Because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to precisely specify a nlp problem research in nlp evaluation has received considerable attention.'], [-4.483137530371052, 'Research in nlp evaluation has received considerable attention because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to specify a nlp problem precisely.'], [-2.983288221974894, 'Because going beyond the vagueness of tasks only defined as language understanding or language generation the definition of proper evaluation criterion is one way to specify a nlp problem precisely research in nlp evaluation has received considerable attention.']] ORIGINAL: `Historically, Linux has mainly been used as a [[Server (computing)|server]] operating system, and has risen to prominence in that area; [[Netcraft]] reported in September 2006 that eight of the ten most reliable internet hosting companies run Linux on their [[web server]]s.' FULL: None WITH CHUNKING: [[-6.735700315725024, 'Historically linux has mainly been used as a server operating system and has risen in that area to prominence. netcraft reported that eight of the most reliable ten internet hosting companies run linux on their web servers in september 2006.'], [-6.72140516469128, 'Historically linux has been used as a server operating system mainly and has risen to prominence in that area. netcraft reported that eight of the ten most reliable internet hosting companies run linux on their web servers in september 2006.'], [-6.568306095903479, 'Historically linux has been used as a server operating system mainly and has risen in that area to prominence. netcraft reported that eight of the ten most reliable internet hosting companies run linux on their web servers in september 2006.'], [-6.543215918155359, 'Historically linux has been used as a server operating system mainly and has risen to prominence in that area. netcraft reported that eight of the most reliable ten internet hosting companies run linux on their web servers in september 2006.'], [-6.390116849367557, 'Historically linux has been used as a server operating system mainly and has risen in that area to prominence. netcraft reported that eight of the most reliable ten internet hosting companies run linux on their web servers in september 2006.']] ORIGINAL: `The Item-and-Arrangement approach fits very naturally with agglutinative languages; while the Item-and-Process and Word-and-Paradigm approaches usually address fusional languages.' FULL: [[0.03872299999999999, 'The item and arrangement approach fits with agglutinative languages very naturally. while the item and process and word and paradigm approaches usually address fusional languages.'], [0.030902000000000002, 'The item and arrangement approach fits very naturally with agglutinative languages. while the item and process and word and paradigm approaches usually address fusional languages.'], [0.008865, 'The item and arrangement approach fits with agglutinative languages very naturally. while usually the item and process and word and paradigm approaches address fusional languages.'], [0.008612000000000002, 'The item and arrangement approach very naturally fits with agglutinative languages. while the item and process and word and paradigm approaches usually address fusional languages.'], [0.0070790000000000046, 'The item and arrangement approach fits very naturally with agglutinative languages. while usually the item and process and word and paradigm approaches address fusional languages.']] WITH CHUNKING: [[-6.219036823262854, 'The item and arrangement approach fits very naturally with agglutinative languages. while usually the item and process and word and paradigm approaches address fusional languages.'], [-5.9622781955768165, 'The item and arrangement approach very naturally fits with agglutinative languages. while the item and process and word and paradigm approaches usually address fusional languages.'], [-5.787794064382789, 'The item and arrangement approach fits with agglutinative languages very naturally. while usually the item and process and word and paradigm approaches address fusional languages.'], [-4.63192787163733, 'The item and arrangement approach fits very naturally with agglutinative languages. while the item and process and word and paradigm approaches usually address fusional languages.'], [-4.200685112757264, 'The item and arrangement approach fits with agglutinative languages very naturally. while the item and process and word and paradigm approaches usually address fusional languages.']] ORIGINAL: `In other words, we wish to ''infer'' the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data and it implicitly contains prior knowledge about the problem domain.' FULL: [[1e-06, 'In other words we wish to infer the mapping implied by the data. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge ’bout the problem domain implicitly.'], [1e-06, 'In other words we wish to infer the mapping implied by the data. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge ‘bout the problem domain implicitly.'], [1e-06, 'In other words we wish to infer the mapping implied by the data. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge about the problem domain implicitly.'], [1e-06, 'In other words we wish to infer the mapping by the data implied. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge ’bout the problem domain implicitly.'], [1e-06, 'In other words we wish to infer the mapping by the data implied. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge ‘bout the problem domain implicitly.']] WITH CHUNKING: [[-10.594024084779466, 'In other words we wish to infer the implied mapping by the data. the cost function is related to the mismatch between our mapping and the data and it contains prior knowledge ‘bout the problem domain implicitly.'], [-10.594024084779466, 'In other words we wish to infer the implied mapping by the data. the cost function is related to the mismatch between our mapping and the data and it contains prior knowledge ’bout the problem domain implicitly.'], [-10.515112205646048, 'In other words we wish to infer the implied mapping by the data. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge about the problem domain implicitly.'], [-10.515112205646048, 'In other words we wish to infer the implied mapping by the data. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge ‘bout the problem domain implicitly.'], [-10.515112205646048, 'In other words we wish to infer the implied mapping by the data. the cost function is related to the mismatch between our mapping and the datum and it contains prior knowledge ’bout the problem domain implicitly.']] ORIGINAL: `This can have a crucial impact on meaning, specifically in relation to polarity, the positive–negative opposition; thus, falling pitch means "polarity known", while rising pitch means "polarity unknown".' FULL: [[0.017376000000000006, 'This can have a crucial impact on meaning in relation to polarity the positive negative opposition specifically. while pitch rising means un-known polarity pitch thus falling means known polarity.'], [0.015086000000000002, 'This can have a crucial impact on meaning specifically in relation to polarity the positive negative opposition. while pitch rising means un-known polarity pitch thus falling means known polarity.'], [0.009530999999999986, 'This can have a crucial impact on meaning in relation to polarity the positive negative opposition specifically. while pitch rising means un-known polarity pitch thus falling means polarity known.'], [0.008268999999999985, 'This can have a crucial impact on meaning specifically in relation to polarity the positive negative opposition. while pitch rising means un-known polarity pitch thus falling means polarity known.'], [0.005774000000000001, 'This can specifically have a crucial impact on meaning in relation to polarity the positive negative opposition. while pitch rising means un-known polarity pitch thus falling means known polarity.']] WITH CHUNKING: [[-5.031829028876245, 'This can have a crucial impact on meaning in relation to polarity the positive negative opposition specifically. while pitch rising means un-known polarity pitch thus falling means polarity known.'], [-4.91954258427263, 'This can have a crucial impact on meaning specifically in relation to polarity the positive negative opposition. while pitch rising means polarity un-known pitch thus falling means known polarity.'], [-4.909221823342171, 'This can have a crucial impact on meaning specifically in relation to polarity the positive negative opposition. while pitch rising means un-known polarity pitch thus falling means polarity known.'], [-4.374528809019871, 'This can have a crucial impact on meaning in relation to polarity the positive negative opposition specifically. while pitch rising means un-known polarity pitch thus falling means known polarity.'], [-4.251921603485798, 'This can have a crucial impact on meaning specifically in relation to polarity the positive negative opposition. while pitch rising means un-known polarity pitch thus falling means known polarity.']] ORIGINAL: `In some cases, there are regional differences: In central Germany (Hessen), the ''o'' in the [[Noun#Proper nouns and common nouns|proper name]] "Hoffmann" is pronounced long while most other Germans would pronounce it short; the same applies to the ''e'' in the geographical name "Mecklenburg" for people in that region.' FULL: None WITH CHUNKING: None ORIGINAL: `Around the world, German is spoken by approximately 100 million [[First language|native speakers]] and also about 80 million non-native speakers, and [[Standard German]] is widely taught in schools, universities, and [[Goethe Institute]]s worldwide.' FULL: None WITH CHUNKING: None ORIGINAL: `The license is also meant to cause Microsoft to extend the patent licenses it grants to Novell customers for the use of GPLv3 software to ''all'' users of that GPLv3 software; this is possible only if Microsoft is legally a "conveyor" of the GPLv3 software.' FULL: [[0.005751000000000002, 'The license is also meant to cause Microsoft to extend the patent licenses which it grants to Novell customers for the use of GPLv3 software to all users of that GPLv3 software. this is possible only if Microsoft is a conveyor of the GPLv3 software legally.'], [0.005751000000000002, 'The license is also meant to cause Microsoft to extend the patent licences which it grants to Novell customers for the use of GPLv3 software to all users of that GPLv3 software. this is possible only if Microsoft is a conveyor of the GPLv3 software legally.'], [0.005751000000000002, 'The licence is also meant to cause Microsoft to extend the patent licenses which it grants to Novell customers for the use of GPLv3 software to all users of that GPLv3 software. this is possible only if Microsoft is a conveyor of the GPLv3 software legally.'], [0.005751000000000002, 'The licence is also meant to cause Microsoft to extend the patent licences which it grants to Novell customers for the use of GPLv3 software to all users of that GPLv3 software. this is possible only if Microsoft is a conveyor of the GPLv3 software legally.'], [0.0049720000000000025, 'The license is also meant to cause Microsoft to extend the patent licenses that it grants to Novell customers for the use of GPLv3 software to all users of that GPLv3 software. this is possible only if Microsoft is a conveyor of the GPLv3 software legally.']] WITH CHUNKING: [[-6.678305730386175, 'Also the license is meant to cause microsoft to extend the patent licenses that it grants to novell customers for the use of gplv3 software to all users of that gplv3 software. only if microsoft is a conveyor of the gplv3 software legally this is possible.'], [-6.534413574139016, 'Also the licence is meant to cause microsoft to extend the patent licences which it grants to novell customers for the use of gplv3 software to all users of that gplv3 software. only if microsoft is a conveyor of the gplv3 software legally this is possible.'], [-6.534413574139016, 'Also the licence is meant to cause microsoft to extend the patent licenses which it grants to novell customers for the use of gplv3 software to all users of that gplv3 software. only if microsoft is a conveyor of the gplv3 software legally this is possible.'], [-6.534413574139016, 'Also the license is meant to cause microsoft to extend the patent licences which it grants to novell customers for the use of gplv3 software to all users of that gplv3 software. only if microsoft is a conveyor of the gplv3 software legally this is possible.'], [-6.534413574139016, 'Also the license is meant to cause microsoft to extend the patent licenses which it grants to novell customers for the use of gplv3 software to all users of that gplv3 software. only if microsoft is a conveyor of the gplv3 software legally this is possible.']] ORIGINAL: `The Normal distribution, being a symmetric distribution, takes positive as well as negative values, but duration by its very nature cannot be negative and therefore normality cannot be assumed when dealing with duration/survival data.' FULL: [[0.023907999999999992, 'The normal distribution being a symmetric distribution does take positive as well as negative values but duration by its very nature can not be negative and therefore normality can not be assumed when dealing with duration and survival data.'], [0.015854000000000004, 'The normal distribution being a symmetric distribution does take positive as well as negative values but duration by its very nature can not be negative and therefore when dealing with duration and survival data normality can not be assumed.'], [0.005308000000000002, 'The normal distribution being a symmetric distribution does take positive as well as negative values but duration by its very nature can not be negative and therefore normality when dealing with duration and survival data can not be assumed.'], [0.005304999999999999, 'The normal distribution being a symmetric distribution does take positive as well as negative values but duration by its very nature can not be negative and normality therefore can not be assumed when dealing with duration and survival data.'], [0.0038610000000000007, 'The normal distribution being a symmetric distribution takes positive as well as negative values but duration by its very nature can not be negative and therefore normality can not be assumed when dealing with duration and survival data.']] WITH CHUNKING: [[-7.31846003978247, 'The normal distribution being a symmetric distribution takes positive as well as negative values but duration by its very nature can not be negative and therefore normality when dealing with duration and survival data can not be assumed.'], [-7.2377911742818295, 'The normal distribution being a symmetric distribution takes values positive as well as negative but duration by its very nature can not be negative and therefore when dealing with duration and survival data normality can not be assumed.'], [-7.08794027219316, 'The normal distribution being a symmetric distribution takes values positive as well as negative but duration by its very nature can not be negative and therefore normality can not be assumed when dealing with duration and survival data.'], [-6.303903349667962, 'The normal distribution being a symmetric distribution takes positive as well as negative values but duration by its very nature can not be negative and therefore when dealing with duration and survival data normality can not be assumed.'], [-6.154052447579293, 'The normal distribution being a symmetric distribution takes positive as well as negative values but duration by its very nature can not be negative and therefore normality can not be assumed when dealing with duration and survival data.']] ORIGINAL: `Many linguists argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family, and should not be carried over to other languages or language families.' FULL: [[0.00028900000000000003, 'Many linguists do argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family and should not be carried over to other languages or language families.'], [0.000216, 'Many linguists argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family and should not be carried over to other languages or language families.'], [1.6e-05, 'Many linguists do argue the formal distinctions between parts of speech must be made within the framework of a specific language or language family and should not be carried over to other languages or language families.'], [9e-06, "Many linguists do argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family and shouldn't be carried over to other languages or language families."], [9e-06, 'Many linguists argue the formal distinctions between parts of speech must be made within the framework of a specific language or language family and should not be carried over to other languages or language families.']] WITH CHUNKING: [[-14.455135929121854, "Many linguists argue the formal distinctions between parts of speech must be made within the framework of a specific language or language family and shouldn't be carried over to other languages or language families."], [-14.094042619733656, 'Many linguists argue that the formal distinctions between parts of speech must be made within and should not be carried by over to other languages or language families the framework of a specific language or language family.'], [-11.579689821442365, 'Many linguists argue the formal distinctions between parts of speech must be made within the framework of a specific language or language family and should not be carried over to other languages or language families.'], [-11.385992418631446, "Many linguists argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family and shouldn't be carried over to other languages or language families."], [-8.510546310951957, 'Many linguists argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family and should not be carried over to other languages or language families.']] ORIGINAL: `The changes of pitch most commonly encountered in English are the '''rising pitch''' and the '''falling pitch''', although the '''fall-rising pitch''' and/or the '''rise-falling pitch''' are sometimes used.' FULL: [[0.006908999999999999, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in English most commonly are the rising pitch and the falling pitch.'], [0.006272999999999998, 'Although the fall rising pitch and the rise falling pitch are sometimes used the changes of pitch encountered in English most commonly are the rising pitch and the falling pitch.'], [0.005665000000000006, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in English most commonly are the rising pitch and the pitch falling.'], [0.005146999999999999, 'Although the fall rising pitch and the rise falling pitch are sometimes used the changes of pitch encountered in English most commonly are the rising pitch and the pitch falling.'], [0.00490900000000001, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in English most commonly are the pitch rising and the falling pitch.']] WITH CHUNKING: [[-6.03877519716306, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered most commonly in english are the rising pitch and the falling pitch.'], [-5.916166855395851, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in english most commonly are the pitch rising and the pitch falling.'], [-5.718370020066221, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in english most commonly are the pitch rising and the falling pitch.'], [-5.5746440033469, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in english most commonly are the rising pitch and the pitch falling.'], [-5.376888545446137, 'Although the fall rising pitch and the rise falling pitch are used sometimes the changes of pitch encountered in english most commonly are the rising pitch and the falling pitch.']] ORIGINAL: `In August 2007, Google announced that it would shut down its video rental and sale program and offer refunds and [[Google Checkout]] credits to consumers who had purchased videos to own.' FULL: [[0.0018229999999999995, 'In August 2007 Google announced that it would shut down its video rental and sale program and offer refunds and Google Checkout credits to consumers which had purchased videos to own.'], [0.001666, 'In August 2007 Google announced that it would shut down its video rental and sale program and offer refunds and Google Checkout credits to consumers that had purchased videos to own.'], [0.0015840000000000001, 'In August 2007 Google announced that it would shut down its video rental and sale program and offer refunds and Google Checkout credits to consumers who had purchased videos to own.'], [0.0011260000000000003, 'In August 2007 Google announced that it would shut down its video rental and sale program and offer refunds and Google Checkout credits to consumers which had videos purchased to own.'], [0.0010210000000000009, 'In August 2007 Google announced that it would shut down its video rental and sale program and offer refunds and Google Checkout credits to consumers that had videos purchased to own.']] WITH CHUNKING: [[-6.360334508925841, 'In august 2007 google announced that it would shut down its video rental and sale program and offer refunds and google checkout credits to consumers that had videos purchased to own.'], [-6.2730224898483495, 'In august 2007 google announced that it would shut down its video rental and sale program and offer refunds and google checkout credits to consumers which had videos purchased to own.'], [-5.9750019794490274, 'In august 2007 google announced that it would shut down its video rental and sale program and offer refunds and google checkout credits to consumers who had purchased videos to own.'], [-5.925744416346104, 'In august 2007 google announced that it would shut down its video rental and sale program and offer refunds and google checkout credits to consumers that had purchased videos to own.'], [-5.8373154864820345, 'In august 2007 google announced that it would shut down its video rental and sale program and offer refunds and google checkout credits to consumers which had purchased videos to own.']] ORIGINAL: `Statistical natural-language processing uses [[stochastic]], [[probabilistic]] and [[statistical]] methods to resolve some of the difficulties discussed above, especially those which arise because longer sentences are highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses.' FULL: None WITH CHUNKING: None ORIGINAL: `Operating systems may run multiple programs through [[process scheduling]] — a software mechanism to [[Context switch|switch]] the CPU among processes frequently so that users can [[Time-sharing|interact]] with each program while it is running.' FULL: [[0.10730100000000005, 'Operating systems may run multiple programs through process scheduling. a software mechanism to switch the cpu among processes frequently so that users can interact with each program while it is running.'], [0.06479, 'Operating systems may run multiple programs through process scheduling. a software mechanism to switch the cpu frequently among processes so that users can interact with each program while it is running.'], [0.040929000000000014, 'Operating systems may run multiple programs through process scheduling. a software mechanism to frequently switch the cpu among processes so that users can interact with each program while it is running.'], [0.005395, 'Operating systems may run multiple programs through process scheduling. a software mechanism frequently to switch the cpu among processes so that users can interact with each program while it is running.']] WITH CHUNKING: [[-5.318520073865556, 'Operating systems may run multiple programs through process scheduling. a software mechanism frequently to switch the cpu among processes so that users can interact with each program while it is running.'], [-3.292979950875473, 'Operating systems may run multiple programs through process scheduling. a software mechanism to frequently switch the cpu among processes so that users can interact with each program while it is running.'], [-2.831600145960247, 'Operating systems may run multiple programs through process scheduling. a software mechanism to switch the cpu frequently among processes so that users can interact with each program while it is running.'], [-2.3239568507317383, 'Operating systems may run multiple programs through process scheduling. a software mechanism to switch the cpu among processes frequently so that users can interact with each program while it is running.']] ORIGINAL: `Most researchers hope that their work will eventually be incorporated into a machine with ''general'' intelligence (known as [[strong AI]]), combining all the skills above and exceeding human abilities at most or all of them.' FULL: [[3.2e-05, 'Most researchers hope that their work will be incorporated into a machine with general intelligence known as strong AI combining all of the skills above and exceeding human abilities at most or all of them eventually.']] WITH CHUNKING: [[-15.024830968616804, 'Most researchers hope their work will eventually be incorporated into a machine with general intelligence known as strong ai combining all of the skills above and exceeding human abilities at most or all of them.'], [-13.009927948074541, 'Most researchers hope their work will be incorporated into a machine with general intelligence known as strong ai combining all of the skills above and exceeding human abilities at most or all of them eventually.'], [-12.50674539935362, 'Most researchers hope that their work will be incorporated into a machine with general intelligence known as strong ai combining all the skills above and exceeding human abilities at most or all of them eventually.'], [-12.50674539935362, 'Most researchers hope that their work will eventually be incorporated into a machine with general intelligence known as strong ai combining all of the skills above and exceeding human abilities at most or all of them.'], [-10.491842378811358, 'Most researchers hope that their work will be incorporated into a machine with general intelligence known as strong ai combining all of the skills above and exceeding human abilities at most or all of them eventually.']] ORIGINAL: `It is probably impossible to accurately enumerate the living languages because our worldwide knowledge is incomplete, and it is a "moving target", as explained in greater detail by the [[Ethnologue]]'s Introduction, p. 7 - 8.' FULL: None WITH CHUNKING: [[-11.500981516167993, "Because our worldwide knowledge is incomplete it is probably impossible to enumerate the living languages accurately and it is a moving target as explained in greater detail by the ethnologue's introduction p. seven eight."], [-11.457571487325064, "Because our worldwide knowledge is incomplete probably it is impossible to enumerate the living languages accurately and it is a moving target as explained in greater detail by the ethnologue's introduction p seven eight."], [-11.457571487325064, "Because our worldwide knowledge is incomplete probably it is impossible to enumerate the living languages accurately and it is a moving target as explained in greater detail by the ethnologue's introduction p. seven eight."], [-11.144005013166634, "Because our worldwide knowledge is incomplete probably it is impossible whether to enumerate the living languages accurately and it is a moving target as explained in greater detail by the ethnologue's introduction p seven eight."], [-11.144005013166634, "Because our worldwide knowledge is incomplete probably it is impossible whether to enumerate the living languages accurately and it is a moving target as explained in greater detail by the ethnologue's introduction p. seven eight."]] ORIGINAL: `* 1964: [[Karen Spärck Jones]] finished her thesis at Cambridge, ''Synonymy and Semantic Classification'', and continued work on [[computational linguistics]] as it applies to IR' FULL: None WITH CHUNKING: None ORIGINAL: `In 1997, Sun Microsystems approached the [[International Organization for Standardization#JTC1|ISO/IEC JTC1 standards body]] and later the [[Ecma International]] to formalize Java, but it soon withdrew from the process.' FULL: [[0.003846000000000001, 'In 1997 Sun Microsystems approached the ISO and IEC JTC1 standards body and later the Ecma International to formalize Java but it soon withdrew from the process.'], [0.0025110000000000006, 'In 1997 to formalize Java Sun Microsystems approached the ISO and IEC JTC1 standards body and later the Ecma International but it soon withdrew from the process.']] WITH CHUNKING: [[-7.69100505530562, 'In 1997 to formalize java sun microsystems approached the iso and iec jtc1 standards body and later the ecma international but it soon withdrew from the process.'], [-7.147780442637769, 'In 1997 sun microsystems approached the iso and iec jtc1 standards body and later the ecma international to formalize java but it soon withdrew from the process.']] ORIGINAL: `For example, there is talk of a "computational turn in philosophy" which includes philosophers analyzing the formal ontologies of computer science (sometimes even working directly with the software), while researchers in computer science have been making more references to those philosophers who work on ontology (sometimes with direct consequences for their methods).' FULL: None WITH CHUNKING: None ORIGINAL: `Also, in the [[Danny Phantom]] Episode, "Public Enemies", Danny, Tucker, and Sam come across a ghost wolf who speaks Esperanto, but only Tucker can understand at first.' FULL: [[0.1326320000000003, 'Also in the Danny Phantom episode Public Enemies Danny Tucker and Sam come across a ghost wolf which speaks Esperanto but only Tucker can understand at first.'], [0.11055100000000023, 'Also in the Danny Phantom episode Public Enemies Danny Tucker and Sam come across a ghost wolf who speaks Esperanto but only Tucker can understand at first.'], [0.10559100000000014, 'Also in the Danny Phantom episode Public Enemies Danny Tucker and Sam come across a ghost wolf that speaks Esperanto but only Tucker can understand at first.'], [0.018576000000000016, 'Also in the Danny Phantom episode Public Enemies Danny Tucker and Sam come across a ghost wolf which speaks Esperanto but only Tucker can at first understand.'], [0.015480999999999986, 'Also in the Danny Phantom episode Public Enemies Danny Tucker and Sam come across a ghost wolf who speaks Esperanto but only Tucker can at first understand.']] WITH CHUNKING: [[-6.557729882591048, 'Also in the danny phantom episode public enemies danny tucker and sam come across a ghost wolf who speaks esperanto but only tucker can at first understand.'], [-6.376296764512192, 'Also in the danny phantom episode public enemies danny tucker and sam come across a ghost wolf which speaks esperanto but only tucker can at first understand.'], [-4.326874074163941, 'Also in the danny phantom episode public enemies danny tucker and sam come across a ghost wolf that speaks esperanto but only tucker can understand at first.'], [-4.278873964913492, 'Also in the danny phantom episode public enemies danny tucker and sam come across a ghost wolf who speaks esperanto but only tucker can understand at first.'], [-4.097440846834635, 'Also in the danny phantom episode public enemies danny tucker and sam come across a ghost wolf which speaks esperanto but only tucker can understand at first.']] ORIGINAL: `{{transl|ja|''Bungo''}} still has some relevance for historians, literary scholars, and lawyers (many Japanese laws that survived [[World War II]] are still written in {{transl|ja|''bungo''}}, although there are ongoing efforts to modernize their language).' FULL: None WITH CHUNKING: None ORIGINAL: `In the same manner ß can be circumscribed as ss. German readers understand those circumscriptions (although they look unusual), but they are avoided if the regular umlauts are available because they are considered a makeshift, not proper spelling.' FULL: [[0.05115799999999998, 'In the same manner ß can be circumscribed as SS. although they look unusual German readers understand those circumscriptions but because they are considered a makeshift not proper spelling if the regular umlauts are available they are avoided.'], [0.047717999999999997, 'In the same manner ß can be circumscribed as SS. although they look unusual German readers understand those circumscriptions but because they are considered a makeshift not proper spelling they are avoided if the regular umlauts are available.'], [0.029054999999999984, 'In the same manner ß can be circumscribed as SS. German readers understand those circumscriptions although they look unusual but because they are considered a makeshift not proper spelling if the regular umlauts are available they are avoided.'], [0.027107000000000013, 'In the same manner ß can be circumscribed as SS. German readers understand those circumscriptions although they look unusual but because they are considered a makeshift not proper spelling they are avoided if the regular umlauts are available.'], [0.009384999999999998, 'In the same manner ß can be circumscribed as SS. although they look unusual German readers understand those circumscriptions but they are avoided if the regular umlauts are available because they are considered a makeshift not proper spelling.']] WITH CHUNKING: [[-7.1191250361750456, 'In the same manner ß can be circumscribed as ss. german readers understand those circumscriptions although they look unusual but because they are considered a makeshift not proper spelling they are avoided if the regular umlauts are available.'], [-5.968309298983986, 'In the same manner ß can be circumscribed as ss. although they look unusual german readers understand those circumscriptions but if the regular umlauts are available they are avoided because they are considered a makeshift not proper spelling.'], [-5.968309298983986, 'In the same manner ß can be circumscribed as ss. german readers understand those circumscriptions although they look unusual but because they are considered a makeshift not proper spelling if the regular umlauts are available they are avoided.'], [-5.619275727778888, 'In the same manner ß can be circumscribed as ss. although they look unusual german readers understand those circumscriptions but because they are considered a makeshift not proper spelling they are avoided if the regular umlauts are available.'], [-4.468459990587827, 'In the same manner ß can be circumscribed as ss. although they look unusual german readers understand those circumscriptions but because they are considered a makeshift not proper spelling if the regular umlauts are available they are avoided.']] ORIGINAL: `Thus, in the study of [[logic gates]], the theoretical lower bound of thermal energy released by an ''AND gate'' is higher than for the ''NOT gate'' (because information is destroyed in an ''AND gate'' and simply converted in a ''NOT gate'').' FULL: [[0.00021899999999999998, 'Thus in the study of logic gates because information is destroyed in an AND gate and simply converted in an NOT gate the lower theoretical bound of thermal energy released by a AND gate is higher than for the NOT gate.'], [0.00021099999999999995, 'Thus in the study of logic gates because information is destroyed in a AND gate and simply converted in an NOT gate the lower theoretical bound of thermal energy released by a AND gate is higher than for the NOT gate.'], [0.000195, 'Thus in the study of logic gates because information is destroyed in an AND gate and simply converted in an NOT gate the lower theoretical bound of thermal energy released by an AND gate is higher than for the NOT gate.'], [0.000195, 'Thus in the study of logic gates because information is destroyed in a AND gate and simply converted in an NOT gate the lower theoretical bound of thermal energy released by an AND gate is higher than for the NOT gate.'], [0.000173, 'Thus in the study of logic gates because information is destroyed in an AND gate and simply converted in a NOT gate the lower theoretical bound of thermal energy released by a AND gate is higher than for the NOT gate.']] WITH CHUNKING: None ORIGINAL: `The issue of OpenOffice.org’s use of Java came to the fore in May 2005, when [[Richard Stallman]] appeared to call for a [[fork (software)|fork]] of the application in a posting on the [[Free Software Foundation]] website.' FULL: [[0.0009570000000000007, "When Richard Stallman appeared to call for a fork of the application in a posting on the free software foundation website the issue of OpenOffice.org's use of Java came to the fore in May 2005."], [0.00033300000000000034, "When Richard Stallman appeared to call for a fork of the application in a posting on the free software foundation website the issue of OpenOffice.org's use of Java came in May 2005 to the fore."], [3e-06, "The issue of OpenOffice.org's use of Java came to the fore in May 2005 when Richard Stallman appeared to call for a fork of the application in a posting on the free software foundation website."], [1e-06, "The issue of OpenOffice.org's use of Java came in May 2005 to the fore when Richard Stallman appeared to call for a fork of the application in a posting on the free software foundation website."]] WITH CHUNKING: [[-4.579850988118017, "The issue of openoffice.org's use of java came in may 2005 to the fore when richard stallman appeared to call for a fork of the application in a posting on the free software foundation website."], [-3.0800016797218586, "When richard stallman appeared to call for a fork of the application in a posting on the free software foundation website the issue of openoffice.org's use of java came in may 2005 to the fore."], [-3.0335379129388897, "The issue of openoffice.org's use of java came to the fore in may 2005 when richard stallman appeared to call for a fork of the application in a posting on the free software foundation website."], [-1.5336886045427311, "When richard stallman appeared to call for a fork of the application in a posting on the free software foundation website the issue of openoffice.org's use of java came to the fore in may 2005."]] ORIGINAL: `While initially research had been concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of [[neuromodulators]] such as [[dopamine]], [[acetylcholine]], and [[serotonin]] on behaviour and learning.' FULL: [[0.005448000000000001, 'While mostly research had been concerned with the electrical characteristics of neurons initially a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [0.005448000000000001, 'While initially research had been concerned with the electrical characteristics of neurons mostly a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [0.002815000000000001, 'While the electrical characteristics of neurons research had been concerned with mostly initially a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [0.002815000000000001, 'While the electrical characteristics of neurons research had been concerned with initially mostly a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [0.001759, 'While mostly research had initially been concerned with the electrical characteristics of neurons a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.']] WITH CHUNKING: [[-6.004822335577367, 'While the electrical characteristics of neurons research had been concerned with mostly initially a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [-5.7166662523286815, 'While concerned with the electrical characteristics of neurons research had been initially mostly a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [-5.7166662523286815, 'While concerned with the electrical characteristics of neurons research had been mostly initially a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [-5.269330728074403, 'While initially research had been concerned with the electrical characteristics of neurons mostly a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.'], [-5.269330728074403, 'While mostly research had been concerned with the electrical characteristics of neurons initially a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine acetylcholine and serotonin on behavior and learning.']] ORIGINAL: `Results have been encouraging, and voice applications have included: control of communication radios; setting of navigation systems; and control of an automated target handover system.' FULL: [[0.146497, 'Results have been encouraging and voice applications have included : control of communication radios setting of navigation systems and control of an automated target handover system.'], [0.048836000000000004, 'Results have been encouraging and voice applications have included : control of communications radios setting of navigation systems and control of an automated target handover system.'], [0.024906999999999995, 'Results have been encouraging and voice applications have included : control of communication radios setting of navigation systems and control of a target handover system automated.'], [0.008301, 'Results have been encouraging and voice applications have included : control of communications radios setting of navigation systems and control of a target handover system automated.']] WITH CHUNKING: [[-6.430173218887113, 'Results have been encouraging and voice applications have included : control of communications radios setting of navigation systems and control of a target handover system automated.'], [-5.331560930219003, 'Results have been encouraging and voice applications have included : control of communication radios setting of navigation systems and control of a target handover system automated.'], [-4.658264500105513, 'Results have been encouraging and voice applications have included : control of communications radios setting of navigation systems and control of an automated target handover system.'], [-3.5597017952872454, 'Results have been encouraging and voice applications have included : control of communication radios setting of navigation systems and control of an automated target handover system.']] ORIGINAL: `When one tries to minimise this cost using [[gradient descent]] for the class of neural networks called [[Multilayer perceptron|Multi-Layer Perceptrons]], one obtains the common and well-known [[Backpropagation|backpropagation algorithm]] for training neural networks.' FULL: [[0.07800199999999997, 'When one tries to minimize this cost using gradient descent for the class of neural networks called Multi-Layer Perceptronsses one obtains the common and well known backpropagation algorithm for training neural networks.'], [0.045891000000000036, 'When one tries to minimize this cost using gradient descent for the class of neural networks called Multi-Layer Perceptronsses one obtains the common and well known backpropagation algorithm for training of neural networks.'], [0.03777500000000001, 'One obtains the common and well known backpropagation algorithm for training neural networks when one tries to minimize this cost using gradient descent for the class of neural networks called Multi-Layer Perceptronsses.'], [0.022226999999999993, 'One obtains the common and well known backpropagation algorithm for training of neural networks when one tries to minimize this cost using gradient descent for the class of neural networks called Multi-Layer Perceptronsses.'], [0.003451999999999999, 'When one tries to minimize this cost using gradient descent for the class of neural networks called Multi- Layer Perceptronsses one obtains the common and well known backpropagation algorithm for training neural networks.']] WITH CHUNKING: [[-5.475082797924182, 'When one tries to minimize this cost using gradient descent for the class of neural networks called multi- layer perceptronsses one obtains the common and well known backpropagation algorithm for training neural networks.'], [-4.31938830814241, 'One obtains the common and well known backpropagation algorithm for training of neural networks when one tries to minimize this cost using gradient descent for the class of neural networks called multi-layer perceptronsses.'], [-3.7890669043137835, 'One obtains the common and well known backpropagation algorithm for training neural networks when one tries to minimize this cost using gradient descent for the class of neural networks called multi-layer perceptronsses.'], [-2.819538999746252, 'When one tries to minimize this cost using gradient descent for the class of neural networks called multi-layer perceptronsses one obtains the common and well known backpropagation algorithm for training of neural networks.'], [-2.289217595917625, 'When one tries to minimize this cost using gradient descent for the class of neural networks called multi-layer perceptronsses one obtains the common and well known backpropagation algorithm for training neural networks.']] ORIGINAL: `Davis does this to his subtraction algorithm — he fixes his algorithm in a second example so that it is proper subtraction (Davis 1958:12-15).' FULL: None WITH CHUNKING: None ORIGINAL: `Tasks that fall within the paradigm of unsupervised learning are in general [[estimation]] problems; the applications include [[Data clustering|clustering]], the estimation of [[statistical distributions]], [[Data compression|compression]] and [[Bayesian spam filtering|filtering]].' FULL: [[0.011915, 'Tasks which fall within the paradigm of un-supervised learning in general are estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [0.011103000000000002, 'Tasks who fall within the paradigm of un-supervised learning in general are estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [0.010346, 'Tasks that fall within the paradigm of un-supervised learning in general are estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [0.01, 'Tasks which fall within the paradigm of un-supervised learning are in general estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [0.009319, 'Tasks who fall within the paradigm of un-supervised learning are in general estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.']] WITH CHUNKING: [[-5.238053498736679, 'Tasks which fall within the paradigm of un-supervised learning are estimation problems in general. the applications include clustering the estimation of statistical distributions compression and filtering.'], [-5.215368215905597, 'Tasks which fall within the paradigm of un-supervised learning are in general estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [-4.288849593848185, 'Tasks that fall within the paradigm of un-supervised learning in general are estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [-4.22256652686088, 'Tasks who fall within the paradigm of un-supervised learning in general are estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.'], [-4.149497983807741, 'Tasks which fall within the paradigm of un-supervised learning in general are estimation problems. the applications include clustering the estimation of statistical distributions compression and filtering.']] ORIGINAL: `This was a problematic stand for them, as they had distributed Linux and other GPL'ed code in their [[Caldera OpenLinux]] distribution, and there is little evidence that they had any legal right to do so except under the terms of the GPL.' FULL: [[0.015107, 'As they had distributed Linux and other gpl’ed code in their caldera OpenLinux distribution this was a problematic stand for them and there is little evidence who they had any legal right to do so except under the terms of the GPL.'], [0.014427000000000004, 'As they had distributed Linux and other gpl’ed code in their caldera OpenLinux distribution this was a problematic stand for them and there is little evidence which they had any legal right to do so except under the terms of the GPL.'], [0.013871000000000003, 'As they had distributed Linux and other gpl’ed code in their caldera OpenLinux distribution this was a problematic stand for them and there is little evidence that they had any legal right to do so except under the terms of the GPL.'], [0.012293000000000004, 'As they had distributed Linux and gpl’ed other code in their caldera OpenLinux distribution this was a problematic stand for them and there is little evidence who they had any legal right to do so except under the terms of the GPL.'], [0.011739000000000005, 'As they had distributed Linux and gpl’ed other code in their caldera OpenLinux distribution this was a problematic stand for them and there is little evidence which they had any legal right to do so except under the terms of the GPL.']] WITH CHUNKING: [[-5.501415426506786, 'As they had distributed linux and gpl’ed other code in their caldera openlinux distribution this was a problematic stand for them and there is little evidence which they had any legal right to do so except under the terms of the gpl.'], [-5.455634309372575, 'As they had distributed linux and gpl’ed other code in their caldera openlinux distribution this was a problematic stand for them and there is little evidence who they had any legal right to do so except under the terms of the gpl.'], [-5.334751715855139, 'As they had distributed linux and other gpl’ed code in their caldera openlinux distribution this was a problematic stand for them and there is little evidence that they had any legal right to do so except under the terms of the gpl.'], [-5.295069477542836, 'As they had distributed linux and other gpl’ed code in their caldera openlinux distribution this was a problematic stand for them and there is little evidence which they had any legal right to do so except under the terms of the gpl.'], [-5.249288360408626, 'As they had distributed linux and other gpl’ed code in their caldera openlinux distribution this was a problematic stand for them and there is little evidence who they had any legal right to do so except under the terms of the gpl.']] ORIGINAL: `While it is true that the only explicit dependency range is (n-1) tokens for an n-gram model, it is also true that the effective range of dependency is significantly longer than this although long range correlations drop exponentially with distance for any Markov model.' FULL: None WITH CHUNKING: [[-4.786655522514394, 'While it is true that the only explicit dependency range is n. one tokens for an n. gram model also it is true that although long range correlations drop exponentially for any markov model with distance the effective range of dependency is longer than this significantly.'], [-4.393667731345789, 'While it is true that the only explicit dependency range is n one tokens for an n gram model also it is true that although long range correlations drop with distance for any markov model exponentially the effective range of dependency is longer than this significantly.'], [-4.393667731345789, 'While it is true that the only explicit dependency range is n one tokens for an n. gram model also it is true that although long range correlations drop with distance for any markov model exponentially the effective range of dependency is longer than this significantly.'], [-4.393667731345789, 'While it is true that the only explicit dependency range is n. one tokens for an n gram model also it is true that although long range correlations drop with distance for any markov model exponentially the effective range of dependency is longer than this significantly.'], [-4.393667731345789, 'While it is true that the only explicit dependency range is n. one tokens for an n. gram model also it is true that although long range correlations drop with distance for any markov model exponentially the effective range of dependency is longer than this significantly.']] ORIGINAL: `All of this metadata can be interesting to one party or another — such as the spectators, sponsors or a counter-terrorist unit of the police — and even for a simple resource the amount of possible metadata can be gigantic.' FULL: [[0.119046, 'All of this metadata can be interesting to one party or another such as the spectators sponsors or a counter-terrorist unit of the police and even for a simple resource the amount of possible metadata can be gigantic.'], [0.025948999999999996, 'All this metadata can be interesting to one party or another such as the spectators sponsors or a counter-terrorist unit of the police and even for a simple resource the amount of possible metadata can be gigantic.']] WITH CHUNKING: [[-19.465355874845322, 'All this metadata can be interesting to one party or another such as the spectators sponsors or a counter-terrorist unit of the police and even for a simple resource the amount of possible metadata can be gigantic.'], [-17.446761733899084, 'All of this metadata can be interesting to one party or another such as the spectators sponsors or a counter-terrorist unit of the police and even for a simple resource the amount of possible metadata can be gigantic.']] ORIGINAL: `Linguists do not consider these to be "language", but describe them as [[animal communication]], because the interaction between animals in such communication is fundamentally different in its underlying principles from human language.' FULL: None WITH CHUNKING: None ORIGINAL: `In 2003, J. D. Bekenstein claimed there is a growing trend in [[physics]] to define the physical world as being made of information itself (and thus information is defined in this way).' FULL: None WITH CHUNKING: None ORIGINAL: `The [[Free Software Foundation]] views Linux distributions which use GNU software as [[GNU variants]] and they ask that such operating systems be referred to as ''GNU/Linux'' or ''a Linux-based GNU system''.' FULL: [[0.02712699999999999, 'The Free software foundation views Linux distributions which use GNU software as GNU variants and they ask that such operating systems be referred as GNU and Linux or a Linux based GNU system.'], [0.024372000000000005, 'The Free software foundation views Linux distributions that use GNU software as GNU variants and they ask that such operating systems be referred as GNU and Linux or a Linux based GNU system.'], [0.024225, 'The Free software foundation views Linux distributions who use GNU software as GNU variants and they ask that such operating systems be referred as GNU and Linux or a Linux based GNU system.'], [0.0076419999999999995, 'The Free software foundation views Linux distributions which use GNU software as GNU variants and they ask that such operating systems be referred to as GNU and Linux or a Linux based GNU system.'], [0.006866999999999998, 'The Free software foundation views Linux distributions that use GNU software as GNU variants and they ask that such operating systems be referred to as GNU and Linux or a Linux based GNU system.']] WITH CHUNKING: None ORIGINAL: `It has, however, been asserted that in certain applications, e.g. product descriptions written in a [[controlled language]], a [[dictionary-based machine translation|dictionary-based machine-translation]] system has produced satisfactory translations that require no human intervention.' FULL: [[0.0018010000000000003, 'However it has been asserted that in certain applications eg. product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [0.0018010000000000003, 'However it has been asserted that in certain applications eg product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [0.0018010000000000003, 'However it has been asserted that in certain applications e.g. product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [0.0018010000000000003, 'However it has been asserted that in certain applications e.g product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [0.0018010000000000003, 'However it has been asserted that in certain applications e. g. product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.']] WITH CHUNKING: [[-6.069029611077178, 'However it has been asserted that in certain applications e. g. product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [-6.069029611077178, 'However it has been asserted that in certain applications e.g product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [-6.069029611077178, 'However it has been asserted that in certain applications e.g. product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [-6.069029611077178, 'However it has been asserted that in certain applications eg product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.'], [-6.069029611077178, 'However it has been asserted that in certain applications eg. product descriptions written in a controlled language a dictionary based machine translation system has produced satisfactory translations who require no human intervention.']] ORIGINAL: `Several tools already produce or consume PMML documents, these include [[ADAPA]], [[IBM DB2]] Warehouse, CART, SAS Enterprise Miner, and [[SPSS]].' FULL: None WITH CHUNKING: [[-3.50760473472978, 'Several tools already produce or consume pmml documents. these include adapa ibm db2 warehouse cart sas enterprise miner and spss.'], [-1.0026888054727634, 'Several tools produce or consume pmml documents already. these include adapa ibm db2 warehouse cart sas enterprise miner and spss.']] ORIGINAL: `One example as of August, 2006 was [[OpenOffice.org]], which did not natively run on the [[AMD64]] or [[EM64T]] lines of processors implementing the [[x86-64]] [[64-bit]] standards for computers; this has since been changed, and the OpenOffice.org suite of software is “mostly” ported to these 64-bit systems[http://wiki.services.openoffice.org/wiki/Porting_to_x86-64_(AMD64,_EM64T)].' FULL: None WITH CHUNKING: [[-9.752938855003363, 'One example as of august 2006 was openoffice.org who did not natively run on the amd64 or em64t lines of processors implementing the x86- 64 64-bit standards for computers. this has been changed since and the openoffice.org suite of software is ported to these 64-bit systemswiki.services.openoffice.org/wiki/porting_to_x86-64_(amd64_em64t) mostly.'], [-9.59820997336757, 'One example as of august 2006 was openoffice.org which did not run on the amd64 or em64t lines of processors implementing the 64-bit x86- 64 standards for computers natively. this has been changed since and the openoffice.org suite of software is ported to these 64-bit systemswiki.services.openoffice.org/wiki/porting_to_x86-64_(amd64_em64t) mostly.'], [-9.526615479403194, 'One example as of august 2006 was openoffice.org who did not run on the amd64 or em64t lines of processors implementing the 64-bit x86- 64 standards for computers natively. this has been changed since and the openoffice.org suite of software is ported to these 64-bit systemswiki.services.openoffice.org/wiki/porting_to_x86-64_(amd64_em64t) mostly.'], [-7.988980175073264, 'One example as of august 2006 was openoffice.org which did not run on the amd64 or em64t lines of processors implementing the x86- 64 64-bit standards for computers natively. this has been changed since and the openoffice.org suite of software is ported to these 64-bit systemswiki.services.openoffice.org/wiki/porting_to_x86-64_(amd64_em64t) mostly.'], [-7.917278016841106, 'One example as of august 2006 was openoffice.org who did not run on the amd64 or em64t lines of processors implementing the x86- 64 64-bit standards for computers natively. this has been changed since and the openoffice.org suite of software is ported to these 64-bit systemswiki.services.openoffice.org/wiki/porting_to_x86-64_(amd64_em64t) mostly.']] ORIGINAL: `It is also possible that this means of developing a cross-platform application will result in more problems with bug tracking and fixing, because the two different ''source trees'' would have different programmers, and thus different defects in each version.' FULL: None WITH CHUNKING: [[-5.477423549585362, 'Also it is possible that this means of developing a cross platform application will result in more problems with bug tracking and fixing because the different two source trees would have different programmers and thus different defects in each version.'], [-4.9848633948659025, 'It is also possible that because the two different source trees would have different programmers and thus different defects in each version this means of developing a cross platform application will result in more problems with bug tracking and fixing.'], [-4.782408436333662, 'Also it is possible that this means of developing a cross platform application will result in more problems with bug tracking and fixing because the two different source trees would have different programmers and thus different defects in each version.'], [-3.9775742411892043, 'Also it is possible that because the different two source trees would have different programmers and thus different defects in each version this means of developing a cross platform application will result in more problems with bug tracking and fixing.'], [-3.282559127937504, 'Also it is possible that because the two different source trees would have different programmers and thus different defects in each version this means of developing a cross platform application will result in more problems with bug tracking and fixing.']] ORIGINAL: `However, there are other Latinate words that are used normally in everyday speech and do not sound formal; these are mainly words for concepts that no longer have Germanic words, and are generally assimilated better and in many cases do not appear Latinate.' FULL: None WITH CHUNKING: None ORIGINAL: `Versions of NT from 3.1 to 4.0 variously supported [[PowerPC]], [[DEC Alpha]] and [[MIPS Technologies|MIPS]] R4000, some of which were 64-bit processors, although the operating system treated them as 32-bit processors.' FULL: [[0.18552300000000002, 'Although the operating system treated them as 32-bit processors versions of NT from 3.1 to 4.0 supported PowerPC December alpha and MIPS R4000 some of which were 64-bit processors variously.'], [0.16026400000000002, 'Although the operating system treated them as 32-bit processors versions of NT from 3.1 to 4.0 supported PowerPC December alpha and MIPS R4000 some of who were 64-bit processors variously.'], [0.03596400000000001, 'Although the operating system treated them as 32-bit processors versions of NT from 3.1 to 4.0 variously supported PowerPC December alpha and MIPS R4000 some of which were 64-bit processors.'], [0.031070000000000004, 'Although the operating system treated them as 32-bit processors versions of NT from 3.1 to 4.0 variously supported PowerPC December alpha and MIPS R4000 some of who were 64-bit processors.'], [0.0024470000000000004, 'Versions of NT from 3.1 to 4.0 supported PowerPC December alpha and MIPS R4000 some of which were 64-bit processors variously although the operating system treated them as 32-bit processors.']] WITH CHUNKING: [[-3.5646990848013154, 'Versions of nt from 3.1 to 4.0 supported powerpc december alpha and mips r4000 some of which were 64-bit processors variously although the operating system treated them as 32-bit processors.'], [-3.024414725935205, 'Although the operating system treated them as 32-bit processors versions of nt from 3.1 to 4.0 variously supported powerpc december alpha and mips r4000 some of who were 64-bit processors.'], [-2.8780805168195935, 'Although the operating system treated them as 32-bit processors versions of nt from 3.1 to 4.0 variously supported powerpc december alpha and mips r4000 some of which were 64-bit processors.'], [-2.211200544563526, 'Although the operating system treated them as 32-bit processors versions of nt from 3.1 to 4.0 supported powerpc december alpha and mips r4000 some of who were 64-bit processors variously.'], [-2.064849776405157, 'Although the operating system treated them as 32-bit processors versions of nt from 3.1 to 4.0 supported powerpc december alpha and mips r4000 some of which were 64-bit processors variously.']] ORIGINAL: `In May of [[2005]], [[Wallace versus International Business Machines et al|Daniel Wallace]] filed suit against the [[Free Software Foundation]] (FSF) in the [[U.S. District Court for the Southern District of Indiana|Southern District of Indiana]], contending that the GPL is an illegal attempt to fix prices at zero.' FULL: None WITH CHUNKING: None ORIGINAL: `Many GPL proponents have strongly advocated that free/open source software developers use only GPL-compatible licenses, because doing otherwise makes it difficult to reuse software in larger wholes.' FULL: None WITH CHUNKING: [[-4.3557938786647075, 'Because doing otherwise makes it difficult to reuse software in larger wholes many gpl proponents have advocated that free and open source software developers use only gpl-compatible licenses strongly.'], [-4.345676982146445, 'Because doing otherwise makes it be difficult to reuse software in larger wholes gpl many proponents have advocated that free and open source software developers use only gpl-compatible licences strongly.'], [-4.345676982146445, 'Because doing otherwise makes it be difficult to reuse software in larger wholes gpl many proponents have advocated that free and open source software developers use only gpl-compatible licenses strongly.'], [-4.094251763638253, 'Because doing otherwise makes it difficult to reuse software in larger wholes gpl many proponents have advocated that free and open source software developers use only gpl-compatible licences strongly.'], [-4.094251763638253, 'Because doing otherwise makes it difficult to reuse software in larger wholes gpl many proponents have advocated that free and open source software developers use only gpl-compatible licenses strongly.']] ORIGINAL: `Boltzmann machine learning was at first slow to simulate, but the [[contrastive divergence algorithm]] of Geoff Hinton (circa 2000) allows models such as Boltzmann machines and ''products of experts'' to be trained much faster.' FULL: [[0.006197000000000008, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm of Geoff Hinton circa 2000 allows models such as Boltzmann machines and products of experts to be trained much faster.'], [0.006197000000000008, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm of Geoff Hinton ca. 2000 allows models such as Boltzmann machines and products of experts to be trained much faster.'], [0.006008000000000008, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm circa 2000 of Geoff Hinton allows models such as Boltzmann machines and products of experts to be trained much faster.'], [0.006008000000000008, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm ca. 2000 of Geoff Hinton allows models such as Boltzmann machines and products of experts to be trained much faster.'], [0.0015309999999999987, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm of Geoff Hinton circa 2000 allows models such as Boltzmann machines and products of experts to be trained faster much.']] WITH CHUNKING: [[-8.799669654217157, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm of geoff hinton circa 2000 allows models such as boltzmann machines and products of experts to be trained faster much.'], [-7.4083613703643785, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm ca. 2000 of geoff hinton allows models such as boltzmann machines and products of experts to be trained much faster.'], [-7.4083613703643785, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm circa 2000 of geoff hinton allows models such as boltzmann machines and products of experts to be trained much faster.'], [-7.378154461810717, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm of geoff hinton ca. 2000 allows models such as boltzmann machines and products of experts to be trained much faster.'], [-7.378154461810717, 'Boltzmann machine learning was slow to simulate at first but the contrastive divergence algorithm of geoff hinton circa 2000 allows models such as boltzmann machines and products of experts to be trained much faster.']] ORIGINAL: `The project and software are informally referred to as ''OpenOffice'', but project organizers report that this term is a [[trademark]] held by another party, requiring them to adopt ''OpenOffice.org'' as its formal name.' FULL: None WITH CHUNKING: None ORIGINAL: `The XML-based specification is usually called XHTML to distinguish it clearly from the more traditional definition; however, the root element name continues to be 'html' even in the XHTML-specified HTML.' FULL: [[0.018675000000000008, 'Usually to clearly distinguish it from the more traditional definition the XML based specification is called xhtml. however the root element name continues to be html even in the xhtml- specified HTML.'], [0.018675000000000008, 'Usually to clearly distinguish it from the more traditional definition the X.M.L. based specification is called xhtml. however the root element name continues to be html even in the xhtml- specified HTML.'], [0.01674200000000002, 'Usually the XML based specification is called xhtml to distinguish it from the more traditional definition clearly. however the root element name continues to be html even in the xhtml- specified HTML.'], [0.01674200000000002, 'Usually the X.M.L. based specification is called xhtml to distinguish it from the more traditional definition clearly. however the root element name continues to be html even in the xhtml- specified HTML.'], [0.006315000000000002, 'Usually to distinguish it from the more traditional definition clearly the XML based specification is called xhtml. however the root element name continues to be html even in the xhtml- specified HTML.']] WITH CHUNKING: [[-4.979905688968286, 'Usually the xml based specification is called xhtml to clearly distinguish it from the more traditional definition. however the root element name continues to be html even in the xhtml- specified html.'], [-3.996304493376019, 'Usually to clearly distinguish it from the more traditional definition the x.m.l. based specification is called xhtml. however the root element name continues to be html even in the xhtml- specified html.'], [-3.996304493376019, 'Usually to clearly distinguish it from the more traditional definition the xml based specification is called xhtml. however the root element name continues to be html even in the xhtml- specified html.'], [-3.819265190683483, 'Usually the x.m.l. based specification is called xhtml to distinguish it from the more traditional definition clearly. however the root element name continues to be html even in the xhtml- specified html.'], [-3.819265190683483, 'Usually the xml based specification is called xhtml to distinguish it from the more traditional definition clearly. however the root element name continues to be html even in the xhtml- specified html.']] ORIGINAL: `To decode the meaning of the [[source text]] in its entirety, the translator must interpret and analyse all the features of the text, a process that requires in-depth knowledge of the [[grammar]], [[semantics]], [[syntax]], [[idiom]]s, etc., of the [[source language]], as well as the [[culture]] of its speakers.' FULL: None WITH CHUNKING: [[-12.971774605073087, 'To decode the meaning of the source text in its entirety the translator must interpret and analyze all of the features of the text. a process which requires in-depth knowledge of the grammar semantics syntax idioms etc.? of the source language as well as the culture of its speakers.'], [-12.925797093527269, 'To decode the meaning of the source text in its entirety the translator must interpret and analyze all of the features of the text. a process which requires in-depth knowledge of the grammar semantics syntax idioms et cetera of the source language as well as the culture of its speakers.'], [-12.925797093527269, 'To decode the meaning of the source text in its entirety the translator must interpret and analyze all of the features of the text. a process which requires in-depth knowledge of the grammar semantics syntax idioms etc of the source language as well as the culture of its speakers.'], [-12.925797093527269, 'To decode the meaning of the source text in its entirety the translator must interpret and analyze all of the features of the text. a process which requires in-depth knowledge of the grammar semantics syntax idioms etc. of the source language as well as the culture of its speakers.'], [-12.925797093527269, 'To decode the meaning of the source text in its entirety the translator must interpret and analyze all of the features of the text. a process which requires in-depth knowledge of the grammar semantics syntax idioms etcetera of the source language as well as the culture of its speakers.']] ORIGINAL: `Today, since it includes a number of advanced statistical methods for regression and classification, it finds application in a wide variety of fields including [[medical diagnostics]], [[credit card fraud detection]], [[Face recognition|face]] and [[speech recognition]] and analysis of the [[stock market]].' FULL: [[0.003456, 'Today it finds application in a wide variety of fields including medical diagnostics credit card fraud detection face and speech recognition and analysis of the stock market since it includes a number of statistical advanced methods for regression and classification.'], [0.002931, 'Today it finds application in a wide variety of fields including medical diagnostics credit card fraud detection face and speech recognition and analysis of the stock market since it includes a number of advanced statistical methods for regression and classification.'], [0.0023859999999999997, 'Today it finds application in a wide variety of fields including medical diagnostics credit card fraud detection face and speech recognition and analysis of the stock market since it includes a number of statistical methods for regression and classification advanced.'], [0.000788, 'Today since it includes a number of statistical advanced methods for regression and classification it finds application in a wide variety of fields including medical diagnostics credit card fraud detection face and speech recognition and analysis of the stock market.'], [0.00067, 'Today since it includes a number of advanced statistical methods for regression and classification it finds application in a wide variety of fields including medical diagnostics credit card fraud detection face and speech recognition and analysis of the stock market.']] WITH CHUNKING: None ORIGINAL: `In April 2003, [[Windows Server 2003]] was introduced, replacing the [[Windows 2000]] line of server products with a number of new features and a strong focus on security; this was followed in December 2005 by Windows Server 2003 R2.' FULL: None WITH CHUNKING: [[-0.5150631644729586, 'In april 2003 windows server 2003 was introduced replacing the windows 2000 line of server products with a number of new features and a strong focus on security. this was followed by windows server 2003 r2 in december 2005.']] ORIGINAL: `The FSF recommends using the term "free software" rather than "open source software" because that term and the associated marketing campaign focuses on the technical issues of software development, avoiding the issue of user freedoms.' FULL: [[0.010863, 'The FSF recommends that avoiding the issue of user freedoms using the term of free software rather than open source software because that term and the associated marketing campaign focuses on the technical issues of software development.'], [0.005905, 'The FSF recommends that using the term of free software rather than open source software because that term and the associated marketing campaign focuses on the technical issues of software development avoiding the issue of user freedoms.'], [0.0019690000000000003, 'The FSF recommends avoiding the issue of user freedoms using the term of free software rather than open source software because that term and the associated marketing campaign focuses on the technical issues of software development.'], [0.000902, 'The FSF recommends that avoiding the issue of user freedoms using the term of free software rather than open source software because that term and the marketing associated campaign focuses on the technical issues of software development.'], [0.000674, 'The FSF recommends using the term of free software rather than open source software because that term and the associated marketing campaign focuses on the technical issues of software development avoiding the issue of user freedoms.']] WITH CHUNKING: None ORIGINAL: `Depending on intended application, this can be beneficial or disadvantageous: the programmer is freed from performing low-level tasks, but at the same time loses the option of writing lower level code.' FULL: [[0.0024570000000000004, 'Depending upon intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing lower level code.'], [0.0024570000000000004, 'Depending on intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing lower level code.'], [0.0014550000000000001, 'Depending upon intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing of lower level code.'], [0.0014550000000000001, 'Depending on intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing of lower level code.'], [0.000905, 'Depending upon intended application this can be beneficial or disadvantageous : the programmer is freed from performing of low level tasks but at the same time loses the option of writing lower level code.']] WITH CHUNKING: [[-8.721685735653729, 'Depending upon intended application this can be beneficial or disadvantageous : the programmer is freed from performing of low level tasks but at the same time loses the option of writing lower level code.'], [-8.245351021429272, 'Depending on intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing of lower level code.'], [-8.245351021429272, 'Depending upon intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing of lower level code.'], [-7.721359011866314, 'Depending on intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing lower level code.'], [-7.721359011866314, 'Depending upon intended application this can be beneficial or disadvantageous : the programmer is freed from performing low level tasks but at the same time loses the option of writing lower level code.']] ORIGINAL: `The suit was dismissed in March 2006, on the grounds that Wallace had failed to state a valid anti-trust claim; the court noted that "the GPL encourages, rather than discourages, free competition and the distribution of computer operating systems, the benefits of which directly pass to consumers."' FULL: None WITH CHUNKING: None ORIGINAL: `Others, including Kleene, include procedures that could run forever without stopping; such a procedure has been called a "computational method" (Knuth 1997:5) or "''calculation procedure'' or ''algorithm''" (Kleene 1952:137); however, Kleene notes that such a method must eventually exhibit "some object" (Kleene 1952:137).' FULL: None WITH CHUNKING: None ORIGINAL: `Many people believe that such lexically-ambiguous, miscommunication-prone words should be avoided altogether, since the user generally has to waste time, effort, and [[attention span]] to define what is meant when they are used.' FULL: None WITH CHUNKING: None ORIGINAL: `[[Eötvös Loránd University]] in Budapest had a department of Interlinguistics and Esperanto from 1966 to 2004, after which time instruction moved to vocational colleges; there are state examinations for Esperanto instructors.' FULL: None WITH CHUNKING: None ORIGINAL: `# In some dialects, such as [[Cockney]], the interdentals /θ/ and /ð/ are usually merged with /f/ and /v/, and in others, like [[African American Vernacular English]], /ð/ is merged with dental /d/.' FULL: [[0.05691699999999998, 'In some dialects such as Cockney usually the interdentals /θ/ and /ð/ are merged with /f/ and /v/ and in others like American African Vernacular English /ð/ is merged with dental /d/.'], [0.05691699999999998, 'In some dialects such as Cockney usually the interdentals /θ/ and /ð/ are merged with /f/ and /v/ and in others like African American Vernacular English /ð/ is merged with dental /d/.'], [0.029029000000000003, 'In some dialects such as Cockney the interdentals /θ/ and /ð/ are usually merged with /f/ and /v/ and in others like American African Vernacular English /ð/ is merged with dental /d/.'], [0.029029000000000003, 'In some dialects such as Cockney the interdentals /θ/ and /ð/ are usually merged with /f/ and /v/ and in others like African American Vernacular English /ð/ is merged with dental /d/.'], [0.010479999999999998, 'In some dialects such as Cockney the interdentals /θ/ and /ð/ usually are merged with /f/ and /v/ and in others like American African Vernacular English /ð/ is merged with dental /d/.']] WITH CHUNKING: [[-5.561055419965914, 'In some dialects such as cockney the interdentals /θ/ and /ð/ usually are merged with /f/ and /v/ and in others like american african vernacular english /ð/ is merged with dental /d/.'], [-4.62198988657007, 'In some dialects such as cockney usually the interdentals /θ/ and /ð/ are merged with /f/ and /v/ and in others like african american vernacular english /ð/ is merged with dental /d/.'], [-4.62198988657007, 'In some dialects such as cockney usually the interdentals /θ/ and /ð/ are merged with /f/ and /v/ and in others like american african vernacular english /ð/ is merged with dental /d/.'], [-4.51254132265661, 'In some dialects such as cockney the interdentals /θ/ and /ð/ are usually merged with /f/ and /v/ and in others like african american vernacular english /ð/ is merged with dental /d/.'], [-4.51254132265661, 'In some dialects such as cockney the interdentals /θ/ and /ð/ are usually merged with /f/ and /v/ and in others like american african vernacular english /ð/ is merged with dental /d/.']] ORIGINAL: `A prototypical example of an "algorithm" is Euclid's algorithm to determine the maximum common divisor of two integers greater than one: "subtract the smallest number from the biggest one, repeat until you get a zero or a one".' FULL: None WITH CHUNKING: None ORIGINAL: `Under this philosophy, the GPL is said to grant the recipients of a [[computer program]] the rights of the [[free software definition]] and uses copyleft to ensure the freedoms are preserved, even when the work is changed or added to.' FULL: [[0.00018799999999999996, 'Under this philosophy even when the work is changed or added to the GPL is said to grant the recipients of a computer program the rights of the free software definition and uses copyleft to ensure that the freedoms are preserved.'], [0.000155, 'Under this philosophy the GPL is said to grant the recipients of a computer program and uses copyleft to ensure that the freedoms are preserved by the rights of the free software definition even when the work is changed or added to.'], [6.2e-05, 'Under this philosophy the GPL is said to grant the rights of the free software definition and uses copyleft to ensure that the freedoms are preserved by the recipients of a computer program even when the work is changed or added to.'], [6e-05, 'Under this philosophy the GPL is said to grant the recipients of the rights of the free software definition and uses copyleft to ensure that the freedoms are preserved by a computer program even when the work is changed or added to.'], [3.6e-05, 'Under this philosophy even when the work is changed or added to the GPL is said to grant the recipients of a computer program and uses copyleft to ensure that the freedoms are preserved by the rights of the free software definition.']] WITH CHUNKING: [[-12.56109467057803, 'Under this philosophy even when the work is changed or added to the gpl is said to grant the recipients of a computer program the rights of the free software definition and uses copyleft to ensure the freedoms are preserved.'], [-12.073799544222233, 'Under this philosophy the gpl is said to grant the recipients of a computer program of the rights of the free software definition and uses to ensure that the freedoms are preserved copyleft even when the work is changed or added to.'], [-11.658855692159525, 'Under this philosophy the gpl is said to grant the recipients of a computer program the rights of the free software definition and uses copyleft to ensure the freedoms are preserved even when the work is changed or added to.'], [-10.379495411144891, 'Under this philosophy even when the work is changed or added to the gpl is said to grant the recipients of a computer program the rights of the free software definition and uses copyleft to ensure that the freedoms are preserved.'], [-9.476414911007542, 'Under this philosophy the gpl is said to grant the recipients of a computer program the rights of the free software definition and uses copyleft to ensure that the freedoms are preserved even when the work is changed or added to.']] ORIGINAL: `In a 1969 guest appearance on ''[[The Tonight Show]]'', [[Jay Silverheels]] of ''[[The Lone Ranger]]'' fame appeared in character as [[Tonto]] for a comedy sketch with [[Johnny Carson]], and claimed Esperanto skills as he sought new employment.' FULL: None WITH CHUNKING: None ORIGINAL: `In many places in eastern [[Ukraine]] and [[Belarus]], these languages are spoken interchangeably, and in certain areas traditional bilingualism resulted in language mixture, e.g. [[Surzhyk]] in eastern Ukraine and [[Trasianka]] in Belarus.' FULL: [[0.002993, 'In many places in eastern Ukraine and Belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture eg. surzhyk in eastern Ukraine and Trasianka in Belarus.'], [0.002993, 'In many places in eastern Ukraine and Belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture eg surzhyk in eastern Ukraine and Trasianka in Belarus.'], [0.002993, 'In many places in eastern Ukraine and Belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture e.g. surzhyk in eastern Ukraine and Trasianka in Belarus.'], [0.002993, 'In many places in eastern Ukraine and Belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture e.g surzhyk in eastern Ukraine and Trasianka in Belarus.'], [0.002993, 'In many places in eastern Ukraine and Belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture e. g. surzhyk in eastern Ukraine and Trasianka in Belarus.']] WITH CHUNKING: [[-7.403638595145226, 'In many places in eastern ukraine and belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture e. g. surzhyk in eastern ukraine and trasianka in belarus.'], [-7.403638595145226, 'In many places in eastern ukraine and belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture e.g surzhyk in eastern ukraine and trasianka in belarus.'], [-7.403638595145226, 'In many places in eastern ukraine and belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture e.g. surzhyk in eastern ukraine and trasianka in belarus.'], [-7.403638595145226, 'In many places in eastern ukraine and belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture eg surzhyk in eastern ukraine and trasianka in belarus.'], [-7.403638595145226, 'In many places in eastern ukraine and belarus these languages are spoken interchangeably and in certain areas traditional bilingualism resulted in language mixture eg. surzhyk in eastern ukraine and trasianka in belarus.']] ORIGINAL: `* $25,000 for the first chatterbot that judges cannot distinguish from a real human in a text-only Turing test, and that can convince judges that the other (human) entity they are talking to simultaneously is a computer.' FULL: [[5.6999999999999935e-05, 'USD 25000 for the first chatterbot that judges can not distinguish from a real human in a text only Turing test and that can convince judges that the other human entity who they are talking to simultaneously is a computer.'], [5.6999999999999935e-05, 'USD 25000 for the first chatterbot that judges can not distinguish from a real human in a text only Turing test and that can convince judges that the human other entity who they are talking to simultaneously is a computer.'], [5.6999999999999935e-05, '$ 25000 for the first chatterbot that judges can not distinguish from a real human in a text only Turing test and that can convince judges that the other human entity who they are talking to simultaneously is a computer.'], [5.6999999999999935e-05, '$ 25000 for the first chatterbot that judges can not distinguish from a real human in a text only Turing test and that can convince judges that the human other entity who they are talking to simultaneously is a computer.'], [5.3999999999999944e-05, 'USD 25000 for the first chatterbot which judges can not distinguish from a real human in a text only Turing test and that can convince judges that the human other entity who they are talking to simultaneously is a computer.']] WITH CHUNKING: [[-4.063660180091106, "Usds 25000 for the first chatterbot judges can't distinguish from a real human in a text only turing test and who can convince judges the other human entity who they are talking to simultaneously is a computer."], [-4.063660180091106, "Usds 25000 for the first chatterbot judges can't distinguish from a real human in a turing text only test and which can convince judges the other human entity who they are talking to simultaneously is a computer."], [-4.063660180091106, 'Usds 25000 for the first chatterbot judges cannot distinguish from a real human in a text only turing test and which can convince judges the other human entity who they are talking to simultaneously is a computer.'], [-4.063660180091106, 'Usds 25000 for the first chatterbot judges cannot distinguish from a real human in a text only turing test and who can convince judges the other human entity who they are talking to simultaneously is a computer.'], [-4.063660180091106, 'Usds 25000 for the first chatterbot judges cannot distinguish from a real human in a turing text only test and which can convince judges the other human entity who they are talking to simultaneously is a computer.']] ORIGINAL: `If the MIME type is not recognized as HTML, the Web browser should not attempt to render the document as HTML, even if the document is prefaced with a correct Document Type Declaration.' FULL: None WITH CHUNKING: [[-7.176622995149277, "If the mime type is not recognized as html even if the document is prefaced with a correct document type declaration the web browser shouldn't attempt to render the document as html."], [-6.308105278111546, 'The web browser should not attempt to render the document as html even if the document is prefaced with a correct document type declaration if the mime type is not recognized as html.'], [-5.484988517856218, 'Even if the document is prefaced with a correct document type declaration the web browser should not attempt to render the document as html if the mime type is not recognized as html.'], [-5.157289540920484, 'If the mime type is not recognized as html the web browser should not attempt to render the document as html even if the document is prefaced with a correct document type declaration.'], [-4.334172780665157, 'If the mime type is not recognized as html even if the document is prefaced with a correct document type declaration the web browser should not attempt to render the document as html.']] ORIGINAL: `Later, Esperanto speakers began to see the language and the culture that had grown up around it as ends in themselves, even if Esperanto is never adopted by the United Nations or other international organizations.' FULL: [[0.014096, 'Later even if Esperanto is never adopted by the United Nations or other international organizations Esperanto speakers began to see the language and the culture which had grown up around it as ends in themselves.'], [0.013973, 'Later even if Esperanto is never adopted by the United Nations or other international organizations Esperanto speakers began to see the language and the culture that had grown up around it as ends in themselves.'], [0.010614, 'Later even if Esperanto is never adopted by the United Nations or other international organizations Esperanto speakers began to see the language and the culture who had grown up around it as ends in themselves.'], [0.010229000000000004, 'Later even if Esperanto is never adopted by the United Nations or international other organizations Esperanto speakers began to see the language and the culture which had grown up around it as ends in themselves.'], [0.010140999999999999, 'Later even if Esperanto is never adopted by the United Nations or international other organizations Esperanto speakers began to see the language and the culture that had grown up around it as ends in themselves.']] WITH CHUNKING: None ORIGINAL: `* [[Simple DirectMedia Layer]]—An [[open source]] cross-platform multimedia library written in C that creates an abstraction over various platforms’ graphics, sound, and input [[Application programming interface|API]]s.' FULL: [[0.0010049999999999992, 'Simple DirectMedia Layer. a cross open source platform multimedia library written in c which creates a abstraction over various platforms ’ graphics sound and input APIs.'], [0.0010049999999999992, 'Simple DirectMedia Layer. a cross open source platform multimedia library written in C. which creates a abstraction over various platforms ’ graphics sound and input APIs.'], [0.0010049999999999992, 'Simple DirectMedia Layer. a cross open source platform multimedia library written in (C) which creates a abstraction over various platforms ’ graphics sound and input APIs.'], [0.0010049999999999992, 'DirectMedia Simple Layer. a cross open source platform multimedia library written in c which creates a abstraction over various platforms ’ graphics sound and input APIs.'], [0.0010049999999999992, 'DirectMedia Simple Layer. a cross open source platform multimedia library written in C. which creates a abstraction over various platforms ’ graphics sound and input APIs.']] WITH CHUNKING: None ORIGINAL: `In 1983, [[Richard Stallman]], longtime member of the [[hacker (free and open source software)|hacker]] community at the [[MIT Artificial Intelligence Laboratory]], announced the [[GNU project]], saying that he had become frustrated with the effects of the change in culture of the computer industry and its users.' FULL: [[4.9999999999999996e-05, 'In 1983 Richard Stallman longtime member of the hacker community at the artificial intelligence MIT laboratory announced the GNU project saying like he had become frustrated with the effects of the change in culture of the computer industry and its users.'], [4.9999999999999996e-05, 'In 1983 Richard Stallman longtime member of the hacker community at the artificial intelligence MIT laboratory announced the GNU project saying like he had become frustrated by the effects of the change in culture of the computer industry and its users.'], [4.9999999999999996e-05, 'In 1983 Richard Stallman longtime member of the hacker community at the artificial intelligence MIT laboratory announced the GNU project saying like he had become frustrated at the effects of the change in culture of the computer industry and its users.'], [4.9999999999999996e-05, 'In 1983 Richard Stallman longtime member of the hacker community at the artificial intelligence MIT laboratory announced the GNU project saying as though he had become frustrated with the effects of the change in culture of the computer industry and its users.'], [4.9999999999999996e-05, 'In 1983 Richard Stallman longtime member of the hacker community at the artificial intelligence MIT laboratory announced the GNU project saying as though he had become frustrated by the effects of the change in culture of the computer industry and its users.']] WITH CHUNKING: [[-5.016714783599565, 'In 1983 richard stallman longtime member of the hacker community at the artificial intelligence mit laboratory announced the gnu project saying like he had become frustrated with the effects of the change in culture of the computer industry and its users.'], [-4.881043153208357, 'In 1983 richard stallman longtime member of the hacker community at the artificial intelligence mit laboratory announced the gnu project saying that he had become frustrated at the effects of the change in culture of the computer industry and its users.'], [-4.76429744547147, 'In 1983 richard stallman longtime member of the hacker community at the artificial intelligence mit laboratory announced the gnu project saying as if he had become frustrated at the effects of the change in culture of the computer industry and its users.'], [-4.76429744547147, 'In 1983 richard stallman longtime member of the hacker community at the artificial intelligence mit laboratory announced the gnu project saying as though he had become frustrated at the effects of the change in culture of the computer industry and its users.'], [-4.76429744547147, 'In 1983 richard stallman longtime member of the hacker community at the artificial intelligence mit laboratory announced the gnu project saying like he had become frustrated at the effects of the change in culture of the computer industry and its users.']] ORIGINAL: `Besides working with the community on the free productivity suite's software, IBM will also leverage OpenOffice.org technology in its products" as has been seen with [[Lotus Symphony]].' FULL: None WITH CHUNKING: None ORIGINAL: `Outside of [[Quebec]], the highest number of Francophones in Canada, 485,000, excluding those who claim multiple mother tongues, reside in [[Ontario]], whereas [[New Brunswick]], home to the vast majority of [[Acadians]], has the highest ''percentage'' of Francophones after [[Quebec]], 33%, or 237,000.' FULL: None WITH CHUNKING: None ORIGINAL: `A common rebuttal often used within the AI community against criticism of such approaches asks, "How do we know that humans don't also just follow some cleverly devised rules?" (in the way that Chatterbots do).' FULL: None WITH CHUNKING: [[-15.56379683021719, 'A common rebuttal often used within the ai community against criticism of such approaches does ask how do we know that humans do not also follow some rules devised cleverly justly in the way as if chatterbots do.'], [-15.56379683021719, 'A common rebuttal often used within the ai community against criticism of such approaches does ask how do we know that humans do not also follow some rules devised cleverly justly in the way as though chatterbots do.'], [-15.56379683021719, 'A common rebuttal often used within the ai community against criticism of such approaches does ask how do we know that humans do not also follow some rules devised cleverly justly in the way like chatterbots do.'], [-15.489614999728644, 'A common rebuttal often used within the ai community against criticism of such approaches does ask how do we know that humans do not also follow some cleverly devised rules justly in the way that chatterbots do.'], [-15.435726343570252, 'A common rebuttal often used within the ai community against criticism of such approaches does ask how do we know that humans do not also follow some rules devised cleverly justly in the way that chatterbots do.']] ORIGINAL: `These problems plague firms all across the spectrum and some examples of likely victims are [[Credit card fraud|credit card issuers]], insurance companies, retail merchants, manufacturers, business to business suppliers and even services providers.' FULL: None WITH CHUNKING: None ORIGINAL: `[[Windows XP]] and [[Windows Server 2003]] users who have [[Windows Genuine Advantage|genuine]] copies of Microsoft Windows can freely download the program from Microsoft's web site, and Windows Defender ships as part of [[Windows Vista]].' FULL: [[0.006099000000000025, "Windows XP and Windows Server 2003 users which have genuine copies of Microsoft windows can download the program from Microsoft's web site freely and Windows Defender ships as part of Windows Vista."], [0.005630000000000019, "Windows XP and Windows Server 2003 users who have genuine copies of Microsoft windows can download the program from Microsoft's web site freely and Windows Defender ships as part of Windows Vista."], [0.005319000000000019, "Windows XP and Windows Server 2003 users that have genuine copies of Microsoft windows can download the program from Microsoft's web site freely and Windows Defender ships as part of Windows Vista."], [0.004798000000000021, "Windows XP and Windows Server 2003 users which have genuine copies of Microsoft windows can down load the program from Microsoft's web site freely and Windows Defender ships as part of Windows Vista."], [0.004460000000000025, "Windows XP and Windows Server 2003 users who have genuine copies of Microsoft windows can down load the program from Microsoft's web site freely and Windows Defender ships as part of Windows Vista."]] WITH CHUNKING: [[-20.140003992335465, "Windows xp and windows server 2003 users who have genuine copies of microsoft windows can down load the program from microsoft's web site freely and windows defender ships as part of windows vista."], [-20.0632918946056, "Windows xp and windows server 2003 users which have genuine copies of microsoft windows can down load the program from microsoft's web site freely and windows defender ships as part of windows vista."], [-19.966105879119148, "Windows xp and windows server 2003 users that have genuine copies of microsoft windows can download the program from microsoft's web site freely and windows defender ships as part of windows vista."], [-19.906436874078707, "Windows xp and windows server 2003 users who have genuine copies of microsoft windows can download the program from microsoft's web site freely and windows defender ships as part of windows vista."], [-19.829426567887616, "Windows xp and windows server 2003 users which have genuine copies of microsoft windows can download the program from microsoft's web site freely and windows defender ships as part of windows vista."]] ORIGINAL: `The two were at one time a single language, known today as [[Galician-Portuguese]], but since the political separation of Portugal from Galicia they have diverged somewhat, especially in pronunciation and vocabulary.' FULL: [[0.0011809999999999993, 'The two were a single language known as galician- Portuguese today at one time but since the political separation from Galicia of Portugal they have diverged especially in pronunciation and vocabulary somewhat.'], [0.0010929999999999991, 'The two were a single language known as galician- Portuguese today at one time but since the political separation from Galicia of Portugal they have diverged somewhat especially in pronunciation and vocabulary.'], [0.00039500000000000044, 'The two were a single language known as galician- Portuguese today at one time but since the political separation of Portugal from Galicia they have diverged especially in pronunciation and vocabulary somewhat.'], [0.00036900000000000024, 'The two were a single language known as galician- Portuguese today at one time but since the political separation of Portugal from Galicia they have diverged somewhat especially in pronunciation and vocabulary.']] WITH CHUNKING: [[-9.512446491754272, 'The two were a single language known as galician- portuguese today at one time but since the political separation of portugal from galicia they have diverged especially in pronunciation and vocabulary somewhat.'], [-8.921329801807964, 'The two were a single language known as galician- portuguese today at one time but since the political separation of portugal from galicia they have diverged somewhat especially in pronunciation and vocabulary.'], [-8.431582433524227, 'The two were a single language known as galician- portuguese today at one time but since the political separation from galicia of portugal they have diverged especially in pronunciation and vocabulary somewhat.'], [-7.840446025265656, 'The two were a single language known as galician- portuguese today at one time but since the political separation from galicia of portugal they have diverged somewhat especially in pronunciation and vocabulary.']] ORIGINAL: `Corpus linguistics does away with Chomsky's ''competence/performance'' split; adherents believe that reliable language analysis best occurs on field-collected samples, in natural contexts and with minimal experimental interference.' FULL: [[0.04983900000000001, "Corpus linguistics does away with Chomsky's competence and performance split. adherents believe that reliable language analysis occurs the best on field collected samples in natural contexts and with minimal experimental interference."], [0.04983900000000001, "Corpus linguistics does away with Chomsky's competence and performance split. adherents believe that reliable language analysis occurs the best on field collected samples in natural contexts and with experimental minimal interference."], [0.036898, "Corpus linguistics does away with Chomsky's competence and performance split. adherents believe that reliable language analysis occurs on field collected samples the best in natural contexts and with minimal experimental interference."], [0.036898, "Corpus linguistics does away with Chomsky's competence and performance split. adherents believe that reliable language analysis occurs on field collected samples the best in natural contexts and with experimental minimal interference."], [0.03493100000000002, "Corpus linguistics does away with Chomsky's competence and performance split. adherents believe that reliable language analysis occurs best on field collected samples in natural contexts and with minimal experimental interference."]] WITH CHUNKING: [[-5.264498979438006, "Corpus linguistics does away with chomsky's competence and performance split. adherents believe that reliable language analysis occurs on field collected samples the best in natural contexts and with minimal experimental interference."], [-5.206072781377547, "Corpus linguistics does away with chomsky's competence and performance split. adherents believe that reliable language analysis occurs best on field collected samples in natural contexts and with experimental minimal interference."], [-5.206072781377547, "Corpus linguistics does away with chomsky's competence and performance split. adherents believe that reliable language analysis occurs best on field collected samples in natural contexts and with minimal experimental interference."], [-4.941134258327391, "Corpus linguistics does away with chomsky's competence and performance split. adherents believe that reliable language analysis occurs the best on field collected samples in natural contexts and with experimental minimal interference."], [-4.941134258327391, "Corpus linguistics does away with chomsky's competence and performance split. adherents believe that reliable language analysis occurs the best on field collected samples in natural contexts and with minimal experimental interference."]] ORIGINAL: `If the sending device is equally likely to send any one of a set of N messages, then the preferred measure of "the information produced when one message is chosen from the set" is the base two [[logarithm]] of N (This measure is called ''[[self-information]]'').' FULL: None WITH CHUNKING: None ORIGINAL: `Early computer science was strongly influenced by the work of mathematicians such as [[Kurt Gödel]] and [[Alan Turing]], and there continues to be a useful interchange of ideas between the two fields in areas such as [[mathematical logic]], [[category theory]], [[domain theory]], and [[algebra]].' FULL: [[0.019010000000000006, 'Early computer science was influenced by the work of mathematicians such as Kurt Gödel and Alan Turing strongly and there continues to be a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.'], [0.009948999999999998, 'Early computer science was strongly influenced by the work of mathematicians such as Kurt Gödel and Alan Turing and there continues to be a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.'], [0.002746000000000001, 'Early computer science was influenced by the work of mathematicians such as Kurt Gödel and Alan Turing strongly and there continues to be a useful interchange between the two fields of ideas in areas such as mathematical logic category theory domain theory and algebra.'], [0.0023550000000000003, 'Early computer science strongly was influenced by the work of mathematicians such as Kurt Gödel and Alan Turing and there continues to be a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.'], [0.0014419999999999993, 'Early computer science was influenced by the work of mathematicians such as Kurt Gödel and Alan Turing strongly and there continues being a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.']] WITH CHUNKING: [[-8.12230412590708, 'Early computer science was influenced by the work of mathematicians such as kurt gödel and alan turing strongly and there continues being a useful interchange between the two fields of ideas in areas such as mathematical logic category theory domain theory and algebra.'], [-7.465306672700626, 'Early computer science was strongly influenced by the work of mathematicians such as kurt gödel and alan turing and there continues to be a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.'], [-7.190703227455611, 'Early computer science was influenced by the work of mathematicians such as kurt gödel and alan turing strongly and there continues to be a useful interchange between the two fields of ideas in areas such as mathematical logic category theory domain theory and algebra.'], [-6.18859511542936, 'Early computer science was influenced by the work of mathematicians such as kurt gödel and alan turing strongly and there continues being a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.'], [-5.2570484881380874, 'Early computer science was influenced by the work of mathematicians such as kurt gödel and alan turing strongly and there continues to be a useful interchange of ideas between the two fields in areas such as mathematical logic category theory domain theory and algebra.']] ORIGINAL: `Some ISO standards are already published but most of them are under construction, mainly on lexicon representation (see [[lexical markup framework|LMF]]), annotation and data category registry.' FULL: None WITH CHUNKING: None ORIGINAL: `In [[regular grammar]]s, the left hand side is again only a single nonterminal symbol, but now the right-hand side is also restricted: It may be the empty string, or a single terminal symbol, or a single terminal symbol followed by a nonterminal symbol, but nothing else.' FULL: [[0.004659999999999999, 'In regular grammars the left handed side is only a nonterminal single symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a terminal single symbol followed by a nonterminal symbol but nothing else.'], [0.004659999999999999, 'In regular grammars the left handed side is only a nonterminal single symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a single terminal symbol followed by a nonterminal symbol but nothing else.'], [0.0038509999999999994, 'In regular grammars the left handed side is only a single nonterminal symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a terminal single symbol followed by a nonterminal symbol but nothing else.'], [0.0038509999999999994, 'In regular grammars the left handed side is only a single nonterminal symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a single terminal symbol followed by a nonterminal symbol but nothing else.'], [0.003213, 'In regular grammars the left handed side again is only a nonterminal single symbol but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a terminal single symbol followed by a nonterminal symbol but nothing else.']] WITH CHUNKING: [[-6.500506651922611, 'In regular grammars the left handed side is only a nonterminal single symbol again but now the right handed side is also restricted : it may be the empty string or a single terminal symbol or a terminal single symbol followed by a nonterminal symbol but nothing else.'], [-6.149842992787814, 'In regular grammars the left handed side is only a single nonterminal symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a single terminal symbol followed by a nonterminal symbol but nothing else.'], [-6.149842992787814, 'In regular grammars the left handed side is only a single nonterminal symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a terminal single symbol followed by a nonterminal symbol but nothing else.'], [-5.959061036564078, 'In regular grammars the left handed side is only a nonterminal single symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a single terminal symbol followed by a nonterminal symbol but nothing else.'], [-5.959061036564078, 'In regular grammars the left handed side is only a nonterminal single symbol again but now the right hand side is also restricted : it may be the empty string or a single terminal symbol or a terminal single symbol followed by a nonterminal symbol but nothing else.']] ORIGINAL: `The [[Bayes Theorem]] is applied to p(e|f), the probability that the foreign string produces the native string to get p(e|f) \propto p(f|e) p(e), where the [[translation model]] p(f|e) is the probability that the native string is the translation of the foreign string, and the [[language model]] p(e) is the probability of seeing that native string.' FULL: None WITH CHUNKING: [[-8.030616339704503, 'The bayes theorem is applied to the probability that the foreign string produces the native string to get where the translation model is the probability that the native string is the translation of the foreign string and that the language model is the probability of seeing of that native string.'], [-7.90248403254008, 'The bayes theorem is applied to the probability that the foreign string produces the native string to get where the translation model is the probability as if the native string is the translation of the foreign string and that the language model is the probability of seeing that native string.'], [-7.90248403254008, 'The bayes theorem is applied to the probability that the foreign string produces the native string to get where the translation model is the probability as though the native string is the translation of the foreign string and that the language model is the probability of seeing that native string.'], [-7.90248403254008, 'The bayes theorem is applied to the probability that the foreign string produces the native string to get where the translation model is the probability like the native string is the translation of the foreign string and that the language model is the probability of seeing that native string.'], [-7.77457868904105, 'The bayes theorem is applied to the probability that the foreign string produces the native string to get where the translation model is the probability that the native string is the translation of the foreign string and that the language model is the probability of seeing that native string.']] ORIGINAL: `While non-humans acquire their own communication systems, they do not acquire human language in this way (although many non-human animals can learn to respond to language, or can even be trained to use it to a degree).' FULL: None WITH CHUNKING: None ORIGINAL: `The [[source code]] of the suite was released in July 2000 with the aim of reducing the dominant [[market share]] of [[Microsoft Office]] by providing a free, open and high-quality alternative; later versions of StarOffice are based upon OpenOffice.org with additional proprietary components.' FULL: [[0.04017299999999998, 'The source code of the suite was released with the aim of reducing the dominant market share of Microsoft Office by providing a free open and high quality alternative in July 2000. later versions of StarOffice are based upon OpenOffice.org with proprietary additional components.'], [0.04017299999999998, 'The source code of the suite was released with the aim of reducing the dominant market share of Microsoft Office by providing a free open and high quality alternative in July 2000. later versions of StarOffice are based upon OpenOffice.org with additional proprietary components.'], [0.013588999999999999, 'The source code of the suite was released with the aim of reducing the dominant market share of Microsoft Office by providing a free open and high quality alternative in July 2000. late more versions of StarOffice are based upon OpenOffice.org with proprietary additional components.'], [0.013588999999999999, 'The source code of the suite was released with the aim of reducing the dominant market share of Microsoft Office by providing a free open and high quality alternative in July 2000. late more versions of StarOffice are based upon OpenOffice.org with additional proprietary components.'], [0.004454000000000002, 'The source code of the suite was released in July 2000 with the aim of reducing the dominant market share of Microsoft Office by providing a free open and high quality alternative. later versions of StarOffice are based upon OpenOffice.org with proprietary additional components.']] WITH CHUNKING: [[-6.448432671804387, 'The source code of the suite was released in july 2000 with the aim of reducing the dominant market share of microsoft office by providing a free open and high quality alternative. later versions of staroffice are based upon openoffice.org with proprietary additional components.'], [-5.333434195684667, 'The source code of the suite was released with the aim of reducing the dominant market share of microsoft office by providing a free open and high quality alternative in july 2000. late more versions of staroffice are based upon openoffice.org with additional proprietary components.'], [-5.333434195684667, 'The source code of the suite was released with the aim of reducing the dominant market share of microsoft office by providing a free open and high quality alternative in july 2000. late more versions of staroffice are based upon openoffice.org with proprietary additional components.'], [-4.249456083189399, 'The source code of the suite was released with the aim of reducing the dominant market share of microsoft office by providing a free open and high quality alternative in july 2000. later versions of staroffice are based upon openoffice.org with additional proprietary components.'], [-4.249456083189399, 'The source code of the suite was released with the aim of reducing the dominant market share of microsoft office by providing a free open and high quality alternative in july 2000. later versions of staroffice are based upon openoffice.org with proprietary additional components.']] ORIGINAL: `Uppercase ''[[È]]'' is particularly rare, as it is absent from the [[Keyboard layout#Italian|Italian keyboard layout]], and is very often written as ''E''' (even though there are [[:it:Aiuto:Manuale di stile#Scrivere .C3.88|several ways]] of producing the uppercase È on a computer).' FULL: None WITH CHUNKING: None ORIGINAL: `Russian-language schooling is also available in Latvia, Estonia and Lithuania, but due to education reforms, a number of subjects taught in Russian are reduced at the high school level.' FULL: [[0.0026339999999999983, 'Russian language schooling is also available in Latvia Estonia and Lithuania but due to education reforms a number of subjects taught in Russian is reduced at the high school level.'], [0.0023069999999999974, 'Also Russian language schooling is available in Latvia Estonia and Lithuania but due to education reforms a number of subjects taught in Russian is reduced at the high school level.'], [0.0018689999999999998, 'Russian language schooling also is available in Latvia Estonia and Lithuania but due to education reforms a number of subjects taught in Russian is reduced at the high school level.'], [0.0005160000000000006, 'Russian language schooling is also available in Latvia Estonia and Lithuania but due to education reforms a number of subjects taught in Russian is reduced at the high schools level.'], [0.00045300000000000017, 'Also Russian language schooling is available in Latvia Estonia and Lithuania but due to education reforms a number of subjects taught in Russian is reduced at the high schools level.']] WITH CHUNKING: [[-8.670775235022754, 'Also russian language schooling is available in latvia estonia and lithuania but due to education reforms a number of subjects taught in russian is reduced at the high schools level.'], [-8.57716467895956, 'Russian language schooling is also available in latvia estonia and lithuania but due to education reforms a number of subjects taught in russian is reduced at the high schools level.'], [-7.956702105318387, 'Russian language schooling also is available in latvia estonia and lithuania but due to education reforms a number of subjects taught in russian is reduced at the high school level.'], [-7.044889324491128, 'Also russian language schooling is available in latvia estonia and lithuania but due to education reforms a number of subjects taught in russian is reduced at the high school level.'], [-6.951278768427934, 'Russian language schooling is also available in latvia estonia and lithuania but due to education reforms a number of subjects taught in russian is reduced at the high school level.']] ORIGINAL: `Invented by [[Geoff Hinton]] and [[Terry Sejnowski]] in 1985, the Boltzmann machine is important because it is one of the first neural networks to demonstrate learning of latent variables (hidden units).' FULL: None WITH CHUNKING: None ORIGINAL: `This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content.' FULL: [[0.034351, 'Also this typically requires the addition of some kind of query language since conventional programming languages do not have the ability to find objects based on their information content.'], [0.021052000000000005, 'Also this typically does require the addition of some kind of query language since conventional programming languages do not have the ability to find objects based on their information content.'], [0.02072, 'Also this does typically require the addition of some kind of query language since conventional programming languages do not have the ability to find objects based on their information content.'], [0.0046050000000000015, 'This does also typically require the addition of some kind of query language since conventional programming languages do not have the ability to find objects based on their information content.'], [0.0038720000000000004, 'This also does typically require the addition of some kind of query language since conventional programming languages do not have the ability to find objects based on their information content.']] WITH CHUNKING: None ORIGINAL: `For example, if a search engine understands that "Van Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page.' FULL: None WITH CHUNKING: [[-9.730961321125191, 'For example although the exact words dutch painters never occur on that page it can answer a search query on dutch painters with a link to a web page ‘bout vincent van gogh if a search engine understands that van gogh was a dutch painter.'], [-9.730961321125191, 'For example although the exact words dutch painters never occur on that page it can answer a search query on dutch painters with a link to a web page ’bout vincent van gogh if a search engine understands that van gogh was a dutch painter.'], [-9.008299008378623, 'For example if a search engine understands that van gogh was a dutch painter although the exact words dutch painters never occur on that page it can answer a search query on dutch painters with a link to a web page about vincent van gogh.'], [-9.008299008378623, 'For example if a search engine understands that van gogh was a dutch painter although the exact words dutch painters never occur on that page it can answer a search query on dutch painters with a link to a web page ‘bout vincent van gogh.'], [-9.008299008378623, 'For example if a search engine understands that van gogh was a dutch painter although the exact words dutch painters never occur on that page it can answer a search query on dutch painters with a link to a web page ’bout vincent van gogh.']] ORIGINAL: `HTML documents can be delivered by the same means as any other computer file; however, they are most often delivered in one of two forms: over [[HTTP]] servers and through e-mail.' FULL: None WITH CHUNKING: [[-11.386123136802903, 'Html documents can be delivered by the same means as any other computer file. however they are delivered in one of two forms over http servers and through email : most often.'], [-11.343563522384107, 'Html documents can be delivered by the same means as any other computer file. however they are delivered over http servers and through email in one of two forms : most often.'], [-11.26352081471057, 'Html documents can be delivered by the same means as any other computer file. however they are delivered : in one of two forms over http servers and through email most often.'], [-10.910699440087829, 'Html documents can be delivered by the same means as any other computer file. however they are delivered in one of two forms : over http servers and through email most often.'], [-10.737427718813791, 'Html documents can be delivered by the same means as any other computer file. however they are most often delivered in one of two forms : over http servers and through email.']] ORIGINAL: `DeRose used a table of pairs, while Church used a table of triples and an ingenious method of estimating the values for triples that were rare or nonexistent in the Brown Corpus (actual measurement of triple probabilities would require a much larger corpus).' FULL: None WITH CHUNKING: None ORIGINAL: `The noun {{transl|ja|''hon''}} ({{lang|ja|本}}) may refer to a single book or several books; {{transl|ja|''hito''}} ({{lang|ja|人}}) can mean "person" or "people"; and {{transl|ja|''ki''}} ({{lang|ja|木}}) can be "tree" or "trees".' FULL: None WITH CHUNKING: None ORIGINAL: `The spelling system, or [[orthography]], is multilayered, with elements of French, Latin and Greek spelling on top of the native Germanic system; it has grown to vary significantly from the [[phonology]] of the language.' FULL: [[0.015016000000000002, 'The spelling system or orthography is multilayered with elements of French Latin and Greek spelling on top of the native Germanic system. it has grown to vary significantly from the phonology of the language.'], [0.015016000000000002, 'The spelling system or orthography is multilayered with elements of French Latin and Greek spelling on top of the Germanic native system. it has grown to vary significantly from the phonology of the language.'], [0.014514999999999998, 'The spelling system or orthography is multilayered with elements of French Latin and Greek spelling on top of the native Germanic system. it has grown to vary from the phonology of the language significantly.'], [0.014514999999999998, 'The spelling system or orthography is multilayered with elements of French Latin and Greek spelling on top of the Germanic native system. it has grown to vary from the phonology of the language significantly.'], [0.010145999999999999, 'The spelling system or orthography is multilayered on top of the native Germanic system with elements of French Latin and Greek spelling. it has grown to vary significantly from the phonology of the language.']] WITH CHUNKING: [[-4.582314822350353, 'The spelling system or orthography is multilayered on top of the native germanic system with elements of french latin and greek spelling. it has grown to vary significantly from the phonology of the language.'], [-4.224011773962321, 'The spelling system or orthography is multilayered with elements of french latin and greek spelling on top of the germanic native system. it has grown to vary from the phonology of the language significantly.'], [-4.224011773962321, 'The spelling system or orthography is multilayered with elements of french latin and greek spelling on top of the native germanic system. it has grown to vary from the phonology of the language significantly.'], [-4.190101930588328, 'The spelling system or orthography is multilayered with elements of french latin and greek spelling on top of the germanic native system. it has grown to vary significantly from the phonology of the language.'], [-4.190101930588328, 'The spelling system or orthography is multilayered with elements of french latin and greek spelling on top of the native germanic system. it has grown to vary significantly from the phonology of the language.']] ORIGINAL: `Significant advances in the state-of-the-art in CSR have been achieved, and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resource management system.' FULL: [[0.0068850000000000005, 'Significant advances in the state of the art in CSR have been achieved and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resource management system.'], [0.004475, 'Significant advances in the state of the art in CSR have been achieved and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resources management system.']] WITH CHUNKING: [[-9.039419865053137, 'Significant advances in the state of the art in csr have been achieved and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resources management system.'], [-8.608646657745677, 'Significant advances in the state of the art in csr have been achieved and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resource management system.']] ORIGINAL: `While [[grammarians]], writers of dictionaries, and language policy-makers all have a certain influence on the evolution of language, their ability to influence what people think they 'ought' to say is distinct from what people actually say.' FULL: None WITH CHUNKING: None ORIGINAL: `By [[law]], the federal government must operate and provide services in both English and French, proceedings of the [[Parliament of Canada]] must be translated into both these languages, and most products sold in Canada must have labeling in both languages.' FULL: [[0.0008190000000000001, 'By law the federal government must operate and provide services in English and French proceedings of the Parliament of Canada must be translated into both these languages and most products sold in Canada must have labeling in both languages.']] WITH CHUNKING: None ORIGINAL: `The GPL has been described as being [[Copyleft#Is copyleft .22viral.22.3F|"viral"]] by many of its critics because the GPL only allows conveyance of whole programs, which means that programmers are not allowed to convey programs that [[GPL linking exception|link]] to libraries having GPL-incompatible licenses.' FULL: None WITH CHUNKING: None