WebCorp: The Web as Corpus

WebCorp Live lets you access the Web as a corpus: a large collection of texts from which examples of real language use can be extracted. Have you tried WebCorp LSE? It is our large-scale search engine, with more search options, part-of-speech tags and quantitative analyses.

Enter the word or phrase you wish to search for in the search box. A case-insensitive search will match both upper- and lower-case variants of the search terms. Span sets the number of words or characters to display as the left and right contexts of the search term. WebCorp works 'on top of' existing web search engines. You can also specify a language or market for the pages to search, as classified by the web search engine. Show URLs will display a link to, and other meta-information for, each matching web page. Pages sets the maximum number of web pages WebCorp will search. One concordance line per web page will retrieve only one match from each page searched.
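The search options described above can be illustrated with a small keyword-in-context sketch. This is not WebCorp's own code: the function name `concordance`, its parameters, and the sample pages are all assumptions made for illustration, and the pages are plain strings held in memory rather than results fetched from a web search engine. Span is counted in words here.

```python
import re

def concordance(pages, term, span=5, ignore_case=True,
                one_per_page=False, max_pages=None):
    """Return (url, left context, match, right context) tuples.

    pages: dict mapping a URL to that page's plain text (hypothetical input).
    span: number of words shown on each side of the match.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # Whole-word match; assumes the term starts and ends with a word character.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags)
    lines = []
    # Slicing with None keeps every page; otherwise only the first max_pages.
    for url, text in list(pages.items())[:max_pages]:
        for m in pattern.finditer(text):
            before = text[:m.start()].split()
            after = text[m.end():].split()
            left = before[max(0, len(before) - span):]
            right = after[:span]
            lines.append((url, " ".join(left), m.group(0), " ".join(right)))
            if one_per_page:
                break  # keep only the first match from this page
    return lines

if __name__ == "__main__":
    sample = {
        "http://example.org/a": "The corpus is large. A Corpus of web texts helps corpus linguists.",
        "http://example.org/b": "No match here.",
    }
    for url, left, hit, right in concordance(sample, "corpus", span=2):
        print(f"{left:>12} [{hit}] {right}  ({url})")
```

Setting `ignore_case=False` would skip the capitalised "Corpus", and `one_per_page=True` would return a single line for the first sample page.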

Words and phrases: frequency, genres, collocates, concordances, synonyms, and WordNet

Leeds collection of Internet corpora

The Internet corpora used here were developed using the methodology outlined in Sharoff, S. (2006) Creating general-purpose corpora using automated search engine queries. In Marco Baroni and Silvia Bernardini (eds), WaCky! Working papers on the Web as Corpus. Gedit, Bologna. Steps 2 and 3 of that methodology use customised versions of tools from Marco Baroni's BootCat, which also has a very extensive description of installation requirements and tool functions. Have a look at them. The English CC corpus has been compiled from web pages published under permissive Creative Commons licences. The Perl scripts are free software. The interface and corpora were developed by Serge Sharoff; contact me at s.sharoff leeds.ac.uk if you have further queries.

Oxford Text Checker at Oxford Learner's Dictionaries

The Oxford Text Checker will check the vocabulary in any text against one of three word lists. You can find out which words in a text are part of: the Oxford 3000, our list of the most useful and important words to learn in English (which we call "keywords"); the top 2,000 keywords taken from the Oxford 3000 (the keyword list from our Oxford Essential Dictionary and Oxford Basic American Dictionary); or the Academic Word List, a list of words that you are likely to meet if you study at an English-speaking university.

To use the Text Checker, first choose which word list you want to check against. In a typical low-intermediate text, close to 100% of the words will be Oxford 3000 keywords. The Text Checker will automatically ignore any numbers and symbols.
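The kind of check described above can be sketched in a few lines. This is a rough illustration, not the Text Checker's actual implementation: `check_text` is a hypothetical function, the word list is a tiny made-up sample rather than the Oxford 3000, and words are matched exactly, with no handling of inflected forms. Numbers and symbols are skipped because only alphabetic tokens are extracted.

```python
import re

def check_text(text, wordlist):
    """Return (coverage, words not in the list) for a text.

    coverage is the share of word tokens found in wordlist (0.0 to 1.0).
    wordlist: a set of lower-case words (hypothetical sample data).
    """
    # Keep alphabetic tokens only, so digits and symbols are ignored.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0, []
    found = sum(1 for t in tokens if t in wordlist)
    missing = sorted({t for t in tokens if t not in wordlist})
    return found / len(tokens), missing

if __name__ == "__main__":
    sample_list = {"the", "cat", "sat", "on", "mat"}
    coverage, missing = check_text("The cat sat on the mat in 2024!", sample_list)
    print(f"{coverage:.0%} of words are in the list; not in the list: {missing}")
```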