
Natural language processing

Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve natural language understanding, that is, enabling computers to derive meaning from human or natural language input, and others involve natural language generation.

History
The history of NLP generally starts in the 1950s, although work can be found from earlier periods. In 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence. The Georgetown experiment in 1954 involved fully automatic translation of more than sixty Russian sentences into English. Up to the 1980s, most NLP systems were based on complex sets of hand-written rules.


Machine translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation (MAHT) or interactive translation), is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one natural language to another. On a basic level, MT performs simple substitution of words in one natural language for words in another, but that alone usually cannot produce a good translation of a text, because recognition of whole phrases and their closest counterparts in the target language is needed. Solving this problem with corpus and statistical techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies.[1] The progress and potential of machine translation have been much debated throughout its history.
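The gap between word substitution and phrase recognition can be sketched in a few lines of Python. This is a toy illustration, not how any real MT system works: the French example, the two tiny dictionaries `WORDS` and `PHRASES`, and both function names are invented for this sketch.

```python
# Toy dictionaries, invented for this sketch (not taken from any real system).
WORDS = {"il": "it", "pleut": "rains", "des": "some", "cordes": "ropes"}
PHRASES = {("il", "pleut", "des", "cordes"): "it is raining cats and dogs"}


def translate_word_by_word(sentence):
    """Substitute each word independently; unknown words pass through."""
    return " ".join(WORDS.get(tok, tok) for tok in sentence.lower().split())


def translate_with_phrases(sentence, max_len=4):
    """Prefer the longest known phrase at each position, else fall back to words."""
    tokens = sentence.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            chunk = tuple(tokens[i:i + n])
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:
            # No phrase matched here: translate the single word.
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


print(translate_word_by_word("Il pleut des cordes"))  # it rains some ropes
print(translate_with_phrases("Il pleut des cordes"))  # it is raining cats and dogs
```

Word-by-word substitution yields a literal but unidiomatic result, while the phrase table recovers the idiom. Statistical systems learn such phrase correspondences from corpora rather than from a hand-written table.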

List of Java virtual machines
This article provides non-exhaustive lists of Java SE Java virtual machines (JVMs). It does not include a large number of Java ME vendors. Note that Java EE runs on the standard Java SE JVM but that some vendors specialize in providing a modified JVM optimized for Java EE applications. A large amount of Java development work takes place on Windows, Solaris, Linux and FreeBSD, primarily with the Sun JVMs. Note the further complication of different 32-bit/64-bit varieties. The primary reference Java VM implementation is HotSpot, produced by Oracle Corporation.

Morphology (linguistics)
The discipline that deals specifically with the sound changes occurring within morphemes is morphophonology. The history of morphological analysis dates back to the ancient Indian linguist Pāṇini, who formulated the 3,959 rules of Sanskrit morphology in the text Aṣṭādhyāyī by using a constituency grammar. The Greco-Roman grammatical tradition also engaged in morphological analysis. Studies in Arabic morphology, including the Marāḥ al-arwāḥ of Aḥmad b. ‘Alī Mas‘ūd, date back to at least 1200 CE.[1] The term "morphology" was coined by August Schleicher in 1859.[2]

Hybris (software)
Hybris has also been picked up by the Open webOS community for WebOS Ports,[6][7] and by Canonical for Ubuntu Touch.[5][8] libhybris enables Android device drivers to be used on glibc-based Linux systems. The main feature of Hybris is the overriding of Bionic calls and their translation into glibc calls, which allows Bionic-based software to be used on glibc-based Linux distributions.[3]

Natural language generation
Natural language generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form. Psycholinguists prefer the term language production when such formal representations are interpreted as models for mental representations. An NLG system can be likened to a translator that converts a computer-based representation into a natural language representation. However, the methods used to produce the final language differ from those of a compiler because of the inherent expressivity of natural languages. NLG may be viewed as the opposite of natural language understanding: whereas in natural language understanding the system needs to disambiguate the input sentence to produce the machine representation language, in NLG the system needs to make decisions about how to put a concept into words. Simple examples are systems that generate form letters.
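A form-letter generator of the kind mentioned above can be sketched with Python's standard `string.Template`. The record fields, the letter wording, and the `generate_letter` function are assumptions made up for this illustration; the only "decision about how to put a concept into words" it makes is choosing between "day" and "days".

```python
from string import Template

# Hypothetical letter template; the field names are invented for this sketch.
LETTER = Template(
    "Dear $name,\n"
    "Your order $order_id shipped on $date "
    "and should arrive within $days $day_word."
)


def generate_letter(record):
    """Turn a structured record (the machine representation) into text."""
    # A small wording decision: pick the noun form that agrees with the number.
    day_word = "day" if record["days"] == 1 else "days"
    return LETTER.substitute(record, day_word=day_word)


print(generate_letter(
    {"name": "Ada", "order_id": "A-17", "date": "3 May", "days": 1}
))
```

Fuller NLG systems make many more such choices (what content to include, how to order it, which referring expressions to use), but the direction of the mapping is the same: from structured data to words.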

klik (packaging method)
klik was a system for software download and usage on GNU/Linux. klik integrated with web browsers on the user's computer. Users downloaded and installed software by typing a URL with a dedicated prefix. This downloaded a klik "recipe" file, which was used to generate the .cmg file. In this way, one recipe could be used to supply packages to a wide variety of platforms.

Optical character recognition
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned or photographed images of typewritten or printed text into machine-encoded, computer-readable text. It is widely used as a form of data entry from an original paper data source, whether passport documents, invoices, bank statements, receipts, business cards, mail, or any number of printed records. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, key data extraction and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

GLX
Silicon Graphics developed GLX as part of their effort to support OpenGL in the X Window System.

Stemming
Stemming programs are commonly referred to as stemming algorithms or stemmers. A stemmer for English, for example, should identify the string "cats" (and possibly "catlike", "catty" etc.) as based on the root "cat", and "stemmer", "stemming", "stemmed" as based on "stem".
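The "cat" and "stem" examples can be reproduced with a minimal suffix-stripping sketch. This is a toy, not the Porter algorithm or any published stemmer: the suffix list, the minimum stem length of three letters, and the rule for collapsing a doubled final consonant are all simplifying assumptions chosen to cover the words above.

```python
# Suffixes tried in order; a toy list chosen for the examples in the text.
SUFFIXES = ("like", "ing", "ed", "er", "s", "y")


def stem(word):
    """Strip one known suffix, then collapse a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Require at least three letters to remain after stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stemmed = word[:-len(suffix)]
            # "stemm" -> "stem", "catt" -> "cat"; l, s, z and vowels are left alone.
            if stemmed[-1] == stemmed[-2] and stemmed[-1] not in "aeioulsz":
                stemmed = stemmed[:-1]
            return stemmed
    return word


for w in ("cats", "catlike", "catty", "stemmer", "stemming", "stemmed"):
    print(w, "->", stem(w))
```

A rule set this small will mis-stem many ordinary English words; real stemmers use larger, ordered rule sets with conditions on the remaining stem.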

FrontPage - JythonWiki
"The purpose of a programming language is to let software developers express their intentions as simply and directly as possible." (Jim Hugunin)
Jython is a Java implementation of Python that combines expressive power with clarity. Jython is freely available for both commercial and non-commercial use and is distributed with source code. Jython is complementary to Java and is especially suited for the following tasks:
Embedded scripting: Java programmers can add the Jython libraries to their system to allow end users to write simple or complicated scripts that add functionality to the application.

Proofreading
Alternative methods
Copy holding or copy reading employs two readers per proof. The first reads the text aloud literally as it appears, usually at a comparatively fast but uniform rate of speed. The second reader follows along and marks any pertinent differences between what is read and what was typeset.