The Sketch Engine: Text Corpus Query System for All

About the Collins Corpus and the Bank of English™

The Collins Corpus is a 2.5-billion-word analytical database of English. It contains written material from websites, newspapers, magazines and books published around the world, and spoken material from radio, TV and everyday conversations. New data is fed into the corpus every month to help the Collins dictionary editors identify new words and meanings from the moment they are first used. The Bank of English™ forms part of the Collins Corpus. It contains 650 million words from a carefully chosen selection of sources, to give a balanced and accurate reflection of English as it is used every day. All COBUILD dictionaries are based on the information found in the Bank of English™ and the Collins Corpus. When a dictionary editor wants to add a new word to COBUILD, they search the corpus for every example of the word. The corpus lies at the heart of COBUILD, and you can be confident that COBUILD will show you what you need to know to communicate easily and accurately in English.

WebCorp: The Web as Corpus

WebCorp Live lets you access the Web as a corpus: a large collection of texts from which examples of real language use can be extracted. WebCorp LSE is a large-scale search engine with more search options, part-of-speech tags and quantitative analyses.

Enter the word or phrase you wish to search for in the search box. A case-insensitive search will match both upper- and lower-case variants of the search terms. Span sets the number of words or characters to display as the left and right contexts of the search term. WebCorp works 'on top of' existing web search engines, so you can also specify a language or market for the pages to search, as classified by the web search engine. Show URLs displays a link to, and other meta-information for, each matching web page. Pages gives the maximum number of web pages WebCorp will search. One concordance line per web page retrieves only one match from each page searched.
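
The WebCorp search options described above (case-insensitive matching, span, and one concordance line per web page) can be illustrated with a minimal Python sketch. This is not WebCorp's implementation or API: the function name `concordance`, its parameters, and the `pages` mapping of URL to already-downloaded plain text are all hypothetical, and span is counted in words only.

```python
import re


def concordance(pages, term, span=5, case_insensitive=True, one_per_page=False):
    """Build keyword-in-context lines for `term`.

    Hypothetical helper, not part of WebCorp. `pages` maps a URL to the
    plain text of that page. Returns (left, match, right, url) tuples.
    """
    flags = re.IGNORECASE if case_insensitive else 0
    # Word boundaries keep "corpus" from matching inside "corpuscle".
    # This assumes the term starts and ends with a word character.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags)

    lines = []
    for url, text in pages.items():
        for match in pattern.finditer(text):
            left_words = text[:match.start()].split()
            right_words = text[match.end():].split()
            # Keep at most `span` words on each side of the match.
            left = left_words[max(0, len(left_words) - span):]
            right = right_words[:span]
            lines.append((" ".join(left), match.group(0), " ".join(right), url))
            if one_per_page:
                break  # only the first match from this page
    return lines
```

Calling `concordance(pages, "corpus", span=2, one_per_page=True)` would return at most one line per URL, each with up to two words of context on either side of the match.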
