

Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions.[1][2][3][4] A theme in the development of this field has been to duplicate the abilities of human vision by electronically perceiving and understanding an image.[5] This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.[6] Computer vision has also been described as the enterprise of automating and integrating a wide range of processes and representations for visual perception.[7] As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images.

Vision Lab; Prof. Fei-Fei Li (Please cite all of Fei-Fei's papers with the name L. Fei-Fei.)

- Large-Scale Video Classification with Convolutional Neural Networks. Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei
- Socially-aware Large-scale Crowd Forecasting. Alexandre Alahi, Vignesh Ramanathan, and Li Fei-Fei
- Co-localization in Real-World Images. Kevin Tang, Armand Joulin, Li-Jia Li, Li Fei-Fei
- Scalable Multi-Label Annotation. Jia Deng, Olga Russakovsky, Jonathan Krause, Michael Bernstein, Alexander C.
- Visual Categorization is Automatic and Obligatory: Evidence from a Stroop-like Paradigm. Michelle Greene, Li Fei-Fei. Journal of Vision, 2014
- 3D Object Representations for Fine-Grained Categorization. Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. ICCV 2013, Workshop on 3D Representation and Recognition
- Combining the Right Features for Complex Event Recognition. Kevin Tang, Bangpeng Yao, Li Fei-Fei, Daphne Koller
- Video Event Understanding using Natural Language Descriptions. O. B. L. J. V. K. L.

Vision par ordinateur (Computer vision). From Wikipédia, the free encyclopedia. Computer vision (also called artificial vision or digital vision) is a branch of artificial intelligence whose goal is to enable a machine to understand what it "sees" when connected to one or more cameras. Artist's impression of an automated rover exploring the surface of Mars; it is fitted on top with two video cameras giving it stereoscopic vision. One approach attempts to imitate human or animal vision (e.g., the wide-field vision of certain birds, insects equipped with compound eyes, or night vision) by means of electronic components. Applications: As a technological discipline, computer vision seeks to apply its theories and models to a variety of systems. The problems posed by modeling vision are far from solved.

A smart-object recognition algorithm that doesn't need humans (Credit: BYU Photo) BYU engineer Dah-Jye Lee has created an algorithm that can accurately identify objects in images or video sequences without human calibration. "In most cases, people are in charge of deciding what features to focus on, and they then write the algorithm based on that," said Lee, a professor of electrical and computer engineering. "With our algorithm, we give it a set of images and let the computer decide which features are important." Humans need not apply. Not only is Lee's genetic algorithm able to set its own parameters, but it also doesn't need to be reset each time a new object is to be recognized; it learns them on its own. Lee likens the idea to teaching a child the difference between dogs and cats. Comparison with other object-recognition algorithms: In a study published in the December issue of the journal Pattern Recognition, Lee and his students demonstrate both the independence and the accuracy of their "ECO features" genetic algorithm.
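The article does not describe Lee's ECO-features method in detail, but the core idea it mentions, letting a genetic algorithm decide which features matter instead of a human, can be sketched in miniature. The following is an illustrative toy, not Lee's algorithm: each individual is a bitmask over candidate features, fitness is the accuracy of a trivial majority-vote predictor on the selected features, and the population evolves by elitist selection, single-point crossover, and bit-flip mutation.

```python
import random

def fitness(mask, features, labels):
    """Toy fitness: accuracy of a majority-vote predictor over the
    feature columns selected by the bitmask (stand-in for a real classifier)."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    correct = 0
    for row, label in zip(features, labels):
        score = sum(row[i] for i in selected) / len(selected)
        correct += (score > 0.5) == label
    return correct / len(labels)

def evolve(features, labels, n_feats, pop_size=20, generations=30, seed=0):
    """Evolve a feature-selection bitmask with elitism, crossover, mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, features, labels), reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_feats)       # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:                # occasional bit-flip mutation
                j = rng.randrange(n_feats)
                child[j] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda m: fitness(m, features, labels))
```

In a real system the features would be image descriptors and the fitness a trained classifier's accuracy, but the evolutionary loop has the same shape.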

Autodesk Labs Project Photofly ~ Create 3D Model Scenes from Photos - Between the Lines. Capture reality with Project Photofly! Autodesk Labs was started as a way to show emerging technologies and gather feedback to shape their future features and direction. Project Photofly is one of those shining examples: cutting-edge photogrammetry technology combined with cloud computing, with many potential uses, and we need your feedback to help decide where to focus our investment. To hijack a phrase from a song, "The future is so bright with this technology, you need shades." It is currently available in English only and relies on a web connection to process the photos in the cloud. Here is a simplified workflow, without complex terms or deep dives into the mathematical algorithms used, for taking photos and converting them to 3D points. Here are two screen captures of the early Photofly prototype and the 3D point cloud with splats from some photos I took in Tucson and Philadelphia. Download Photofly. Blog post by Kean Walmsley: Photo Scene Editor on Autodesk Labs.
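The geometric core of converting photos to 3D points, once camera positions are known, is triangulation: two cameras that see the same feature each define a viewing ray, and the 3D point sits where the rays (nearly) meet. Photofly's actual pipeline is not described here; the sketch below is just the standard closest-point-between-two-rays (midpoint) construction, with camera centers and ray directions as inputs.

```python
def triangulate(c1, d1, c2, d2):
    """Midpoint triangulation: given two camera centers c1, c2 and viewing
    directions d1, d2 toward the same scene feature, return the midpoint of
    the segment of closest approach between the two rays."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; point cannot be triangulated")
    t1 = (b * e - c * d) / denom   # parameter along ray 1
    t2 = (a * e - b * d) / denom   # parameter along ray 2
    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]
```

Real photogrammetry additionally has to estimate the camera poses themselves (structure from motion) and match features across many photos; triangulation is the final, simplest step.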

One-shot learning. One-shot learning is an object categorization problem of current research interest in computer vision. Whereas most machine-learning-based object categorization algorithms require training on hundreds or thousands of images and very large datasets, one-shot learning aims to learn information about object categories from one, or only a few, training images. The primary focus of this article is the solution to this problem presented by L. Fei-Fei, R. Fergus, and P. Perona in IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28(4), 2006, which uses a generative object category model and a variational Bayesian framework for representation and learning of visual object categories from a handful of training examples. Motivation. Background. As with most classification schemes, one-shot learning involves three main challenges, the first being representation: how should we model objects and categories? Theory. Bayesian framework.
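The formal statement that was garbled in the excerpt above follows the standard Bayesian decision rule; the notation here is a reconstruction, not necessarily the paper's exact symbols. Given a query image $I$ and training examples $I_t$ of the new category, the classifier compares the posterior odds of the foreground category against background clutter:

```latex
R \;=\; \frac{p(\mathcal{O}_{\mathrm{fg}} \mid I, I_t)}
             {p(\mathcal{O}_{\mathrm{bg}} \mid I, I_t)}
   \;=\; \frac{p(I \mid \mathcal{O}_{\mathrm{fg}}, I_t)\,
               p(\mathcal{O}_{\mathrm{fg}})}
              {p(I \mid \mathcal{O}_{\mathrm{bg}}, I_t)\,
               p(\mathcal{O}_{\mathrm{bg}})},
\qquad
p(I \mid \mathcal{O}, I_t) \;=\; \int p(I \mid \theta)\,
                                      p(\theta \mid \mathcal{O}, I_t)\, d\theta .
```

The integral over model parameters $\theta$ is intractable, which is where the variational Bayesian approximation enters; one-shot learning works because the prior over $\theta$ is learned from previously seen categories, so only a handful of new examples are needed to sharpen the posterior.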

GPAI Project - GPAI. From GPAI: The GPAI Project (General Public Artificial Intelligence Project) is an open project for everyone to come together to develop AI under the GPL.

Aim: The aim of this project is to develop functionally human-equivalent artificial intelligence that exists in OpenSim or SecondLife.

Content: The AI system built in this project contains the following parts:
- an ultimate-goal and principle management system
- a self-planning system for the mid and long term
- a short-term schedule
- a decision-making system
- a fuzzy-logic engine
- an emotion core
- a semantic web in Lojban
- a personal knowledge base in Lojban
- a personal database of individual history, preferences, and personality
- links between parts of the personal knowledge base and raw experience and hard-coded content
- a Lojban sentence generator, which generates sentences not at random but according to the expressive needs of the other parts
- a Lojban grammar parser
- a binding to OpenSim

Routine and steps: Step 0 Basic Component; Step 1 Common Chatbot; Step 3 Conjective Robot; Step 4 Learning Robot. See also: realXtend, SNePS.

With Emotion Recognition Algorithms, Computers Know What You're Thinking. Back when Google was first getting started, there were plenty of skeptics who didn't think a list of links could ever turn a profit. That was before advertising came along and gave Google a way to pay its bills, and then some, as it turned out. Thanks in part to that fortuitous accident, in today's Internet market advertising isn't just an also-ran with new technologies: marketers are bending innovation to their needs as startups chase prospective revenue streams. A handful of companies are developing algorithms that can read the human emotions behind nuanced and fleeting facial expressions to maximize advertising and market-research campaigns. Companies building the emotion-detecting algorithms include California-based Emotient, which released its product Facet this summer; Massachusetts-based Affectiva, which will debut its Affdex mobile software development kit in early 2014; and U.K. Here's how the systems work. Photos: Realeyes, Emotient, Affectiva.

Wearable sensor system automatically maps buildings as the wearer moves. MIT researchers have built a wearable sensor system that automatically creates a digital map of the environment through which the wearer is moving. The prototype system, described in a paper slated for the Intelligent Robots and Systems conference in Portugal next month, is envisioned as a tool to help emergency responders coordinate disaster response. In experiments conducted on the MIT campus, a graduate student wearing the sensor system wandered the halls, and the sensors wirelessly relayed data to a laptop in a distant conference room. Observers in the conference room were able to track the student's progress on a map that sprang into being as he moved. Connected to the array of sensors is a handheld pushbutton device that the wearer can use to annotate the map. In the prototype system, pressing the button simply designates a particular location as a point of interest. Shaky aim: the new work builds on previous research on systems that enable robots to map their environments.
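The robot-mapping research this builds on typically represents the environment as an occupancy grid updated from range readings. The article gives no implementation details, so the following is a deliberately minimal sketch of that general idea, with made-up parameter names: cells along a range beam are marked free, the cell at the beam's endpoint is marked occupied, and unobserved cells stay unknown. Real systems use probabilistic log-odds updates and simultaneous localization (SLAM) rather than a known pose.

```python
import math

def update_grid(grid, pose, ranges, max_range=5.0, cell=1.0):
    """Mark cells crossed by each range beam as free (0) and the cell where
    the beam ended as occupied (1); unknown cells stay at -1.
    `pose` is (x, y, heading); `ranges` maps relative beam angles to distances."""
    x, y, heading = pose
    for angle, dist in ranges.items():
        theta = heading + angle
        for s in range(int(dist / cell)):     # free space along the beam
            cx = int((x + s * cell * math.cos(theta)) / cell)
            cy = int((y + s * cell * math.sin(theta)) / cell)
            if 0 <= cx < len(grid[0]) and 0 <= cy < len(grid):
                grid[cy][cx] = 0
        if dist < max_range:                  # beam stopped at an obstacle
            cx = int((x + dist * math.cos(theta)) / cell)
            cy = int((y + dist * math.sin(theta)) / cell)
            if 0 <= cx < len(grid[0]) and 0 <= cy < len(grid):
                grid[cy][cx] = 1
    return grid
```

The wearer's annotations would simply be extra labels attached to grid coordinates, like the point-of-interest button described above.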

NARS. What makes NARS different from conventional reasoning systems is its ability to learn from its experience and to work with insufficient knowledge and resources. NARS attempts to uniformly explain and reproduce many cognitive faculties, including reasoning, learning, planning, reacting, perceiving, categorizing, prioritizing, remembering, decision making, and so on. The research results include a theory of intelligence, a formal model of the theory, and a computer implementation of the model. The ultimate goal of this research is to fully understand the mind, as well as to build thinking machines. This research field is now often called "Artificial General Intelligence" (AGI).

The app that scans food and tells you everything you're eating. Do the market stalls make your mouth water? But are the fruit, vegetables, eggs, and fish on display as fresh and healthy as the vendor claims? Tellspec, your new shopping companion, will answer that question before the seller has even had a chance to extol the merits of his products. Developed by Stephen Watson and Isabel Hoffman, the device works like a Shazam for food: just point it at a food item or dish, press a button, and wait for its chime to announce the results of the analysis. The presence of gluten or peanut residue, the presence of chemicals, vitamin content, and calorie balance: everything that goes into what you are about to eat is detailed on your phone's screen. [Initiative spotted by Julie Rivoire, scout for Soon Soon Soon]

Finally A TV Ad That Encourages Hand Gestures: Brainient Taps Kinect For Interactive TV Ads European online video startup, Brainient, whose BrainRolls system enables advertisers to incorporate interactive elements into online video adverts to boost brand engagement and recognition — such as clickable Facebook Like buttons and photo galleries — is tapping into Microsoft’s Xbox Kinect gesture-based controller to push into the connected TV space. Brainient already sells its BrainRolls product for viewing video ads on computers, smartphones and tablets — its system automatically tailors the ad to the type of screen it’s being viewed on, and can therefore offer advertisers the ability to run what is effectively the same campaign across a variety of devices. Today it’s opening a new front with the launch of an interactive video ad that taps up Kinect gestures to extend interactive video ads to connected TVs. Brainient’s first Kinect-friendly ad is for the forthcoming film The Hobbit.

Computer learns language by playing games. Computers are great at treating words as data: word-processing programs let you rearrange and format text however you like, and search engines can quickly find a word anywhere on the Web. But what would it mean for a computer to actually understand the meaning of a sentence written in ordinary English, or French, or Urdu, or Mandarin? One test might be whether the computer could analyze and follow a set of instructions for an unfamiliar task. And indeed, in the last few years, researchers at MIT's Computer Science and Artificial Intelligence Lab have begun designing machine-learning systems that do exactly that, with surprisingly good results. Starting from scratch: "Games are used as a test bed for artificial-intelligence techniques simply because of their complexity," says Branavan, who was first author on both ACL papers. Moreover, Barzilay says, game manuals have "very open text." The system begins with no knowledge of the game, so initially its behavior is almost totally random. Proof of concept.

Machine olfaction. Machine olfaction is the automated simulation of the sense of smell. It is an emerging application of modern engineering in which robots or other automated systems are needed to measure the existence of a particular chemical concentration in air. Such an apparatus is often called an electronic nose or e-nose. Machine olfaction is complicated by the fact that e-nose devices to date have had a limited number of sensing elements, whereas each odor is produced by its own unique set of (potentially numerous) odorant compounds. This technology is still in the early stages of development, but it promises many applications.[2] Pattern analysis constitutes a critical building block in the development of gas sensor array instruments capable of detecting, identifying, and measuring volatile compounds, a technology that has been proposed as an artificial substitute for the human olfactory system. Detection: there are three basic detection techniques.
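Since each odor excites the limited sensor array in a characteristic pattern, the pattern-analysis step mentioned above amounts to classifying response vectors. A minimal illustration, not any specific e-nose product's method, is nearest-centroid classification: average the array readings for each known odor during training, then assign a new reading to the closest centroid.

```python
def train_centroids(samples):
    """samples: dict mapping odor label -> list of sensor-array readings.
    Returns the mean response vector (centroid) per odor."""
    centroids = {}
    for label, readings in samples.items():
        n = len(readings)
        centroids[label] = [sum(vals) / n for vals in zip(*readings)]
    return centroids

def classify(centroids, reading):
    """Assign a new reading to the odor with the nearest centroid
    (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, reading))
    return min(centroids, key=lambda label: dist(centroids[label]))
```

Production systems use richer techniques (PCA, neural networks, support vector machines) on the same kind of response-vector data, but the pipeline, train on labeled array responses and then match new patterns, is the same.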