Morphology (linguistics)

The discipline that deals specifically with the sound changes occurring within morphemes is morphophonology. The history of morphological analysis dates back to the ancient Indian linguist Pāṇini, who formulated the 3,959 rules of Sanskrit morphology in the text Aṣṭādhyāyī by using a constituency grammar. The Greco-Roman grammatical tradition also engaged in morphological analysis, and studies in Arabic morphology, such as the Marāḥ al-arwāḥ of Aḥmad b. ʿAlī b. Masʿūd, date back to at least 1200 CE. The term "morphology" was coined by August Schleicher in 1859.[2]

Here is an example from another language of the failure of a single phonological word to coincide with a single morphological word form:

kwixʔid-i-da bəgwanəma-χ-a q'asa-s-is t'alwagwayu

Morpheme-by-morpheme translation:
kwixʔid-i-da = clubbed-PIVOT-DETERMINER
bəgwanəma-χ-a = man-ACCUSATIVE-DETERMINER
q'asa-s-is = otter-INSTRUMENTAL-3SG-POSSESSIVE
t'alwagwayu = club

"The man clubbed the otter with his club."

Phonologically, however, the same material divides into different words: kwixʔid i-da-bəgwanəma χ-a-q'asa s-is-t'alwagwayu.

Natural language generation

Natural language generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form. Psycholinguists prefer the term language production when such formal representations are interpreted as models of mental representations. An NLG system can be thought of as a translator that converts a computer-based representation into a natural-language representation; however, the methods for producing the final language differ from those of a compiler because of the inherent expressivity of natural languages. NLG may be viewed as the opposite of natural language understanding: whereas in natural language understanding the system must disambiguate the input sentence to produce the machine representation, in NLG the system must decide how to put a concept into words. Simple examples are systems that generate form letters.
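As a minimal sketch of the form-letter case just mentioned, the following Python snippet realizes text from a structured record standing in for a knowledge base; the field names, template wording, and record values are invented for illustration:

```python
# Minimal template-based NLG sketch: realize text from a structured
# record (a stand-in for a knowledge base). Field names, template
# wording, and the record itself are invented for illustration.

RECORD = {"name": "A. Reader", "overdue_items": 2, "due": "2024-07-01"}

def generate_letter(record: dict) -> str:
    # One small "how to word it" decision: number agreement.
    n = record["overdue_items"]
    noun = "item is" if n == 1 else "items are"
    return (
        f"Dear {record['name']},\n"
        f"Our records show that {n} {noun} overdue "
        f"as of {record['due']}.\n"
    )

print(generate_letter(RECORD))
```

Even in this toy setting the generator makes a wording decision (singular versus plural agreement), which is the kind of choice an NLG system must make at scale.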

Optical character recognition

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned or photographed images of typewritten or printed text into machine-encoded, computer-readable text. It is widely used as a form of data entry from some sort of original paper data source, whether passports, invoices, bank statements, receipts, business cards, mail, or any number of printed records. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as machine translation, text-to-speech, key data extraction, and text mining. OCR is a field of research in pattern recognition, artificial intelligence, and computer vision. Early versions needed to be programmed with images of each character and worked on one font at a time; "intelligent" systems with a high degree of recognition accuracy for most fonts are now common.
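As a sketch of how OCR is typically invoked from code, assuming the open-source Tesseract engine and its pytesseract Python wrapper are installed (the image file name is hypothetical):

```python
# Minimal OCR sketch using the open-source Tesseract engine through
# the pytesseract wrapper (both must be installed separately).
# The image path below is hypothetical.
from PIL import Image
import pytesseract

def ocr_page(path: str) -> str:
    """Convert a scanned page image into machine-readable text."""
    return pytesseract.image_to_string(Image.open(path))

if __name__ == "__main__":
    print(ocr_page("scanned_invoice.png"))
```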

Proofreading

Copy holding or copy reading employs two readers per proof; experienced copy holders employ various codes and verbal shortcuts that accompany their reading. In double reading, a single proofreader checks a proof in the traditional manner and then passes it to a second reader who repeats the process. Before it is typeset, copy is often marked up by an editor or customer with various instructions as to typefaces, art, and layout. Checklists are commonly employed in proof-rooms where there is sufficient uniformity of product to distill some or all of its components to a list format. Printers, publishers, advertising agencies, and law firms tend not to specifically require a degree. Proofreader applicants are tested primarily on their spelling, speed, and skill in finding errors in sample text; a contrasting approach to testing is to identify and reward persistence more than an arbitrarily high level of expertise.

Query expansion

Query expansion (QE) is the process of reformulating a seed query to improve retrieval performance in information retrieval operations.[1] In the context of web search engines, query expansion involves evaluating a user's input (what words were typed into the search query area, and sometimes other types of data) and expanding the search query to match additional documents. Query expansion involves techniques such as finding synonyms and semantically related words, finding the various morphological forms of words by stemming each query term, fixing spelling errors, and re-weighting the terms in the original query. Query expansion is a methodology studied in the field of computer science, particularly within the realm of natural language processing and information retrieval.

Search engines invoke query expansion to increase the quality of user search results, on the assumption that users do not always formulate search queries using the best terms. Expanding a query generally retrieves more of the relevant documents (higher recall) at the cost of also retrieving more irrelevant ones (lower precision); this tradeoff is one of the defining problems in query expansion, namely whether it is worthwhile to perform given its mixed effects on precision and recall.
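As a sketch of one of the techniques listed above, synonym-based expansion, assuming NLTK is installed and its WordNet corpus has been downloaded (via nltk.download("wordnet")):

```python
# Minimal query-expansion sketch: grow a query with WordNet synonyms.
# One technique among several; real systems also stem, correct
# spelling, and re-weight terms.
from nltk.corpus import wordnet as wn

def expand_query(query: str) -> list[str]:
    """Return the original terms plus their WordNet synonyms."""
    terms = query.lower().split()
    expanded = set(terms)
    for term in terms:
        for synset in wn.synsets(term):
            for lemma in synset.lemmas():
                expanded.add(lemma.name().replace("_", " "))
    return sorted(expanded)

print(expand_query("car repair"))
# e.g. adds "automobile", "fix", "mend", ... alongside the seed terms
```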

Truecasing

Truecasing is the problem in natural language processing (NLP) of determining the proper capitalization of words where such information is unavailable. The problem commonly arises because of the standard practice (in English and many other languages) of automatically capitalizing the first word of a sentence. It can also arise in badly cased or non-cased text (for example, all-lowercase or all-uppercase text messages). Truecasing aids many other NLP tasks, such as named entity recognition, machine translation, and Automatic Content Extraction.[1] Truecasing is unnecessary in languages whose scripts do not distinguish between uppercase and lowercase letters.

[1] Lita, L. V., Ittycheriah, A., Roukos, S., & Kambhatla, N. (2003). "tRuEcasIng". Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003).
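A minimal sketch of one simple truecasing strategy, a unigram most-frequent-casing model with a sentence-initial fallback; this illustrates the task only and is not the method of the cited paper, and the training text is invented:

```python
# Minimal truecasing sketch: restore each word's most frequent casing
# as observed in a (tiny, invented) training corpus. Real systems use
# richer sequence models; this only illustrates the task.
from collections import Counter, defaultdict

TRAIN = "We met Alice in Paris . Alice showed us Paris .".split()

# Map each lowercased word to the surface casings seen for it.
case_counts: defaultdict[str, Counter] = defaultdict(Counter)
for token in TRAIN:
    case_counts[token.lower()][token] += 1

def truecase(tokens: list[str]) -> list[str]:
    out = []
    for i, tok in enumerate(tokens):
        seen = case_counts.get(tok.lower())
        best = seen.most_common(1)[0][0] if seen else tok.lower()
        # Fall back to capitalizing the sentence-initial word.
        if i == 0 and best.islower():
            best = best.capitalize()
        out.append(best)
    return out

print(truecase("alice visited paris".split()))
# -> ['Alice', 'visited', 'Paris']
```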

Natural language processing

Natural language processing (NLP) is a field of linguistics and computer science. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation.

History

Natural language processing has its roots in the 1950s.

Symbolic NLP (1950s – early 1990s): The premise of symbolic NLP is well summarized by John Searle's Chinese room experiment: given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts (a toy sketch of this rule-lookup premise appears after this section).

Statistical NLP (1990s–2010s): Up to the 1980s, most natural language processing systems were based on complex sets of hand-written rules. In the 1990s, many of the notable early successes of statistical methods in NLP occurred in the field of machine translation, due especially to work at IBM Research, such as the IBM alignment models.

Neural NLP (present): In the 2010s, representation learning and deep neural network-style machine learning methods became widespread in natural language processing.
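As promised above, a toy illustration of the phrasebook-style, rule-lookup premise of symbolic NLP; the rule table is invented:

```python
# Toy symbolic-NLP sketch in the Chinese-room spirit: answer by rule
# lookup, with no understanding involved. The rules are invented.
RULES = {
    "what is your name?": "I am a rule table.",
    "how are you?": "Rules do not have feelings.",
}

def respond(utterance: str) -> str:
    """Apply the rule that matches the input, if any."""
    return RULES.get(utterance.strip().lower(), "No matching rule.")

print(respond("How are you?"))  # -> "Rules do not have feelings."
```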
