
WebCorp: The Web as Corpus

WebCorp Live lets you access the Web as a corpus: a large collection of texts from which examples of real language use can be extracted. Have you tried WebCorp LSE, our large-scale search engine with more search options, part-of-speech tags and quantitative analyses? Enter the word or phrase you wish to search for in the search box. A case-insensitive search will match both upper- and lower-case variants of the search terms. Span sets the number of words or characters to display as the left and right contexts of the search term. WebCorp works 'on top of' existing web search engines. You can also specify the language of the pages to search, as classified by the web search engine. Show URLs will display a link to each matching web page, along with other meta-information. Pages specifies the maximum number of web pages WebCorp will search. One concordance line per web page will retrieve only one match from each page searched.
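The options described above (case sensitivity, span, one concordance line per page) can be illustrated with a small keyword-in-context sketch. This is not WebCorp's implementation: the function name `concordance`, its parameters, and the whitespace tokenisation are all assumptions made for illustration, and span is counted in words only.

```python
import re


def concordance(text, term, span=5, case_insensitive=True, one_per_page=False):
    """Return (left, match, right) tuples for each hit of `term` in `text`.

    Hypothetical helper: `span` is the number of words of context kept on
    each side, and `text` stands in for the contents of one web page.
    """
    tokens = re.findall(r"\S+", text)          # naive whitespace tokenisation
    target = term.split()
    if not target:
        return []

    def normalise(token):
        # Ignore surrounding punctuation, and case if requested.
        token = token.strip(".,;:!?\"'()")
        return token.lower() if case_insensitive else token

    wanted = [normalise(t) for t in target]
    width = len(target)
    lines = []
    for i in range(len(tokens) - width + 1):
        if [normalise(t) for t in tokens[i:i + width]] == wanted:
            left = " ".join(tokens[max(0, i - span):i])
            match = " ".join(tokens[i:i + width])
            right = " ".join(tokens[i + width:i + width + span])
            lines.append((left, match, right))
            if one_per_page:                   # keep only the first hit per page
                break
    return lines


page = "The corpus is large. A Corpus of texts helps; the corpus grows."
for left, match, right in concordance(page, "corpus", span=2):
    print(f"{left:>12} | {match} | {right}")
```

With `case_insensitive=False` the capitalised "Corpus" is skipped, and with `one_per_page=True` only the first match on the page is returned.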

Related: Corpus Sites, words

Using CLAN Warning: After installing a new version of CLAN for use with old data, you will need to get a new version of the MOR grammar and run MOR, POST, and CHECK again on your old data to make sure they work with the newer format. Alternatively, you may wish to continue using old versions of CLAN with old versions of corpora. However, CHILDES data on the web are always updated to run with new versions of CLAN.

21 Digital Tools to Build Vocabulary | Dr. Kimberly's Literacy Blog If you follow this blog, you know that I believe effective vocabulary instruction is just about the most important instructional activity for teachers to get right. For lots of reasons. Vocabulary influences fluency, comprehension, and student achievement.

CORPORA: 1.9 billion - 45 million words each: free online access The BYU Wikipedia corpus, which was released in early 2015, was created by Mark Davies (professor of linguistics at Brigham Young University). It contains 1.9 billion words in 4.4 million web pages, and you can search the entire corpus with the same type of queries as the other BYU corpora. More importantly, though, you can also quickly and easily create "virtual" corpora "on the fly" for any topic that you want, such as: biology, investments, Buddhism, psychology, cars, basketball.

Laurence Anthony's AntConc Older Versions All previous releases of AntConc can be found at the following link. <.exe> files are for Windows. <.zip> files are for Macintosh OS X. <.tar.gz> files are for Linux.

10 Ways to Use Technology to Build Vocabulary

Cambridge University Press Language Research at Cambridge Cambridge University Press is committed to language research - the investigation of written and spoken English in order to understand more about how we use language. Our research helps to inform and improve our English Language Teaching resources. All of our authors and editors have access to the language research facilities at Cambridge. Our language research features in most of our materials. In particular, we use it to:

Santa Barbara Corpus of Spoken American English Parts 1-4 of the Santa Barbara Corpus of Spoken American English (SBCSAE) are now available, for a total of approximately 249,000 words. The Santa Barbara Corpus includes transcriptions, audio, and timestamps which correlate transcription and audio at the level of individual intonation units.

9 Word Cloud Generators That Aren't Wordle The use of word clouds in the classroom is a powerful way to really get through to visual learners. The details about the following nine word cloud generators will give you a fair idea how, as an educator, you can get the best out of them. A quick note: Wordle is quite easily the most popular word cloud generator out there. It’s free and easy to use.

ELISA - English Language Interview Corpus as a Second-Language Learning Application The ELISA corpus is being developed at the University of Tuebingen (Dept of Applied English Linguistics, AEL) and the University of Surrey (Dept of Languages and Translation Studies, LTS) as a resource for language learning and teaching, and interpreter training. It contains interviews with native speakers of English. They talk about their professional career (e.g. in tourism, politics, the media or environmental education). We are very grateful to all speakers for their kind contributions.

Self-Study English Grammar Quizzes HTML-Only Quizzes Grammar | Places | Vocabulary | Idioms | Homonyms | Scrambled Words | Misc. Activities for ESL Students has over 1,000 activities to help you study English as a Second Language. This project of The Internet TESL Journal has contributions by many teachers. Page Contents Articles | Cloze | Conjunctions | Dialogs | Plurals | Prepositions | Pronouns | Sentence Structure | Tag Questions | Verbs | What's the Correct Sequence | Word Choice | Other Quizzes

ELFA Project – University of Helsinki Description of the ELFA corpus project The ELFA corpus was completed in 2008 and its development work is ongoing.

VOICE - Project - 'Lingua Franca corpus' In the early 21st century, English in the world finds itself in an “unstable equilibrium”: On the one hand, the majority of the world's English users are not native speakers of the language, but use it as an additional language, as a convenient means for communicative interactions that cannot be conducted in their mother tongues. On the other hand, linguistic descriptions have as yet predominantly been focusing on English as it is spoken and written by its native speakers. VOICE seeks to redress the balance by providing a sizeable, computer-readable corpus of English as it is spoken by this non-native speaking majority of users in different contexts. These speakers use English successfully on a daily basis all over the world, in their personal, professional or academic lives. We therefore see them primarily not as language learners but as language users in their own right. It is therefore clearly worth finding out just how they use the language.

Geoffrey Sampson: SUSANNE Scheme - Parsed Corpus The Need for Grammatical Taxonomy Since the 1990s, the exciting growth-area in linguistics has been corpus linguistics: studying how English and other languages are used in real life, through analysis of large electronic samples – “corpora” – of spoken or written usage. In 2004, together with my colleague Diana McCarthy, I edited an anthology of papers illustrating the diverse strengths of modern corpus linguistics. Many findings of corpus linguistics shed new light on the nature of language as a human ability.