

Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions.[1][2][3][4] A theme in the development of this field has been to duplicate the abilities of human vision by electronically perceiving and understanding an image.[5] This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.[6] Computer vision has also been described as the enterprise of automating and integrating a wide range of processes and representations for vision perception.[7] As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images.

Vision Lab; Prof. Fei-Fei Li (Please cite all of Fei-Fei’s papers with the name L. Fei-Fei.)

Large-Scale Video Classification with Convolutional Neural Networks. Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei.
Socially-aware Large-scale Crowd Forecasting. Alexandre Alahi, Vignesh Ramanathan, Li Fei-Fei.
Co-localization in Real-World Images. Kevin Tang, Armand Joulin, Li-Jia Li, Li Fei-Fei.
Scalable Multi-Label Annotation. Jia Deng, Olga Russakovsky, Jonathan Krause, Michael Bernstein, Alexander C.
Visual Categorization is Automatic and Obligatory: Evidence from a Stroop-like Paradigm. Michelle Greene, Li Fei-Fei. Journal of Vision, 2014.
3D Object Representations for Fine-Grained Categorization. Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. ICCV 2013, Workshop on 3D Representation and Recognition.
Combining the Right Features for Complex Event Recognition. Kevin Tang, Bangpeng Yao, Li Fei-Fei, Daphne Koller.
Video Event Understanding using Natural Language Descriptions. O. B. L. J. V. K. L.

Autodesk Labs Project Photofly ~ Create 3D Model Scenes from Photos - Between the Lines. Capture Reality with Project Photofly! Autodesk Labs was started as a way to show emerging technologies and gather feedback that shapes their future features and direction. Project Photofly is one of those shining examples of cutting-edge photogrammetry technology combined with cloud computing; it has many potential uses, and we need your feedback to help decide where to focus our investment. To hijack a phrase from a song, “The future is so bright with this technology, you need shades”. It is currently available in English only and relies on a web connection to process the photos in the cloud. Here is a simplified workflow, without complex terms or deep dives into the mathematical algorithms used, for taking photos and converting them to 3D points. Here are two screen captures of the early Photofly prototype and the 3D point cloud with splats from some photos I took in Tucson and Philadelphia. Download Photofly. Blog post by Kean Walmsley. Photo Scene Editor on Autodesk Labs.
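Stripped of detail, the photo-to-points step rests on triangulation: the same feature seen from two known camera positions pins down a 3D point near where the two viewing rays cross. Here is a minimal midpoint-triangulation sketch, in no way Photofly's actual algorithm, assuming the camera centers and ray directions are already known:

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + t1*d1 and c2 + t2*d2.

    c1, c2: camera centers; d1, d2: viewing-ray directions (3-vectors).
    A toy stand-in for real photogrammetric triangulation.
    """
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    w0 = [x - y for x, y in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b          # approaches 0 when the rays are parallel
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]   # closest point on ray 1
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]   # closest point on ray 2
    return [(u + v) / 2 for u, v in zip(p1, p2)]
```

With many overlapping photos, a Photofly-style pipeline must also solve for the camera poses themselves (bundle adjustment); this sketch assumes they are given.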

One-shot learning. One-shot learning is an object categorization problem of current research interest in computer vision. Whereas most machine-learning-based object categorization algorithms require training on hundreds or thousands of images and very large datasets, one-shot learning aims to learn information about object categories from one, or only a few, training images. The primary focus of this article is the solution to this problem presented by L. Fei-Fei, R. Fergus and P. Perona in IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28(4), 2006, which uses a generative object category model and a variational Bayesian framework for representation and learning of visual object categories from a handful of training examples. As with most classification schemes, one-shot learning involves three main challenges: representation (how should we model objects and categories?), learning (how may such models be acquired from few examples?), and recognition (given a new image, how is a known category detected?). The Bayesian framework formalizes these ideas by comparing the probability that a query image contains an instance of the new category against the probability that it contains only background clutter.
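The Bayesian decision at the heart of this framework can be reconstructed from the cited Fei-Fei, Fergus and Perona paper (a sketch in their spirit, not a verbatim quote): a query image I is evaluated under a foreground hypothesis O_fg, whose model is learned from the few training examples I_t, against a generic background hypothesis O_bg, via the ratio

```latex
R = \frac{p(\mathcal{O}_{\mathrm{fg}} \mid I, I_t)}{p(\mathcal{O}_{\mathrm{bg}} \mid I, I_t)}
  = \frac{p(I \mid I_t, \mathcal{O}_{\mathrm{fg}})\, p(\mathcal{O}_{\mathrm{fg}})}
         {p(I \mid I_t, \mathcal{O}_{\mathrm{bg}})\, p(\mathcal{O}_{\mathrm{bg}})},
\qquad
p(I \mid I_t, \mathcal{O}_{\mathrm{fg}})
  = \int p(I \mid \theta)\, p(\theta \mid I_t, \mathcal{O}_{\mathrm{fg}})\, d\theta .
```

The integral over the model parameters θ is intractable in general; the variational Bayesian framework approximates the posterior p(θ | I_t, O_fg), and the object is declared present when R exceeds a threshold.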

* Wearable sensor system automatically builds maps while the wearer is moving. MIT researchers have built a wearable sensor system that automatically creates a digital map of the environment through which the wearer is moving. The prototype system, described in a paper slated for the Intelligent Robots and Systems conference in Portugal next month, is envisioned as a tool to help emergency responders coordinate disaster response. In experiments conducted on the MIT campus, a graduate student wearing the sensor system wandered the halls, and the sensors wirelessly relayed data to a laptop in a distant conference room. Observers in the conference room were able to track the student's progress on a map that sprang into being as he moved. Connected to the array of sensors is a handheld pushbutton device that the wearer can use to annotate the map. In the prototype system, pressing the button simply designates the current location as a point of interest. Shaky aim: the new work builds on previous research on systems that enable robots to map their environments.

Finally A TV Ad That Encourages Hand Gestures: Brainient Taps Kinect For Interactive TV Ads. European online video startup Brainient, whose BrainRolls system enables advertisers to incorporate interactive elements into online video adverts to boost brand engagement and recognition (such as clickable Facebook Like buttons and photo galleries), is tapping into Microsoft's Xbox Kinect gesture-based controller to push into the connected TV space. Brainient already sells its BrainRolls product for viewing video ads on computers, smartphones and tablets; its system automatically tailors the ad to the type of screen it's being viewed on, and can therefore offer advertisers the ability to run what is effectively the same campaign across a variety of devices. Today it's opening a new front with the launch of an interactive video ad that uses Kinect gestures to bring interactive video ads to connected TVs. Brainient's first Kinect-friendly ad is for the forthcoming film The Hobbit.

Face detection using HTML5, JavaScript, WebRTC, WebSockets, Jetty and OpenCV. Through HTML5 and the corresponding standards, modern browsers get more standardized features with every release. Most people have heard of WebSockets, which let you easily set up a two-way communication channel with a server, but one of the specifications that hasn't been getting much coverage is WebRTC. With the WebRTC specification it will become easier to create pure HTML/JavaScript real-time video and audio applications where you can access a user's microphone or webcam and share this data with other peers on the internet. For instance, you can create video-conferencing software that doesn't require a plugin, create a baby monitor using your mobile phone, or more easily facilitate webcasts, all using cross-browser features without a single plugin. Update: in the newest versions of the WebRTC spec we can also access the microphone! For this we need to take the following steps. Which tools and technologies do we use? What do we use at the backend:
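The pipeline this tutorial builds (the browser captures a webcam frame, ships it over a WebSocket, and the server runs OpenCV face detection) is easy to see in miniature on the server side. The following is a hedged Python sketch of just the wire-format step, with the detector stubbed out; the tutorial's actual backend is Jetty/Java with OpenCV, and all names here are illustrative:

```python
import base64

def decode_frame(data_url):
    """Decode a frame sent as a canvas.toDataURL() string,
    e.g. "data:image/jpeg;base64,<payload>", into raw JPEG bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or "base64" not in header:
        raise ValueError("expected a base64 data URL")
    return base64.b64decode(payload)

def handle_message(data_url, detect_faces):
    # detect_faces is the pluggable detector: on the real server this
    # would be an OpenCV Haar cascade; here it is any callable on bytes.
    jpeg = decode_frame(data_url)
    return detect_faces(jpeg)
```

On the browser side, one typical approach is drawing each video frame to a canvas and pushing the result of toDataURL('image/jpeg') over the socket; the detection results travel back over the same channel.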

Accenture Innovation Awards Concept of the Week – ThirdSight: personalized advertising. It sounds unreal, but it is possible: automatically tailoring advertising messages to interests and emotions, using the software of AIA participant ThirdSight. ThirdSight is a collaboration between staff of the University of Amsterdam (UvA) and media group BlueBubbleLab, led by Ben van Dongen, who is also ThirdSight's CEO. ThirdSight is convinced that its products have the potential to transform marketing as a discipline and to make every message relevant to the consumer. Cameras recognize people. Ben: “Our technology provides access to a source of information about personal characteristics, emotion and behaviour. Information that was unavailable until now.” It means that ThirdSight can make every message on every digital medium interesting for every consumer. Measuring the effect of advertising: the software package EmoVision, for example, focuses on individual analysis. What does ThirdSight want to achieve? Accenture is organizing the 2012 Innovation Awards in five different industries.

Quividi - Automated Audience Measurement of Billboards and Out Of Home Digital Media - OOH. Rhonda Software: Computer Vision. Computer vision is a relatively new but rapidly growing domain of Rhonda's expertise. Since 2007 Rhonda has been doing research and development in this area. As the mainstream of its CV R&D, Rhonda is planning to release two audience measurement products. Rhonda also offers custom CV solutions in other domains such as barcodes, tools and pattern recognition. Rhonda leverages modern CV methods and mathematical approaches:
KDE, Mean Shift or Running Gaussian Average methods to extract objects from the background (depending on the background scene)
Color-based histograms and Mean Shift for object detection and tracking
The Viola-Jones method for face detection
Hidden Markov Models and Neural Networks for face recognition
RTP (MPEG4) or MJPEG over HTTP for streaming metadata and video
Note: Rhonda does not distribute its CV solutions in the form of a library or SDK.
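Of the methods listed, the Running Gaussian Average is the simplest to sketch: each pixel keeps a running mean and variance of its brightness, and a pixel is flagged as foreground when it strays too many standard deviations from its mean. Below is a toy per-pixel version on a flat grayscale frame; the learning rate, threshold and initial variance are textbook defaults, not Rhonda's implementation:

```python
class RunningGaussianBackground:
    """Per-pixel running Gaussian background model (toy version).

    first_frame / frame: flat lists of grayscale values;
    alpha: learning rate; k: foreground threshold in standard deviations.
    """
    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.alpha, self.k = alpha, k
        self.mean = [float(p) for p in first_frame]
        self.var = [225.0] * len(first_frame)   # initial std of 15 gray levels

    def update(self, frame):
        foreground = []
        for i, p in enumerate(frame):
            diff = p - self.mean[i]
            is_fg = abs(diff) > self.k * self.var[i] ** 0.5
            foreground.append(is_fg)
            if not is_fg:
                # adapt the model only where the scene looks like background
                self.mean[i] += self.alpha * diff
                self.var[i] = (1 - self.alpha) * self.var[i] + self.alpha * diff * diff
        return foreground
```

A single Gaussian per pixel fails on multimodal backgrounds such as flickering screens or swaying foliage, which is why KDE and Mean Shift variants are listed alongside it.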

121VIEW Digital Signage Media. Who We Are: 121View is a digital media software development company. We provide two-way digital signage media networks that interact with customers in the marketplace. What We Do: we are a software development and digital media management company. Our Vision: a digital media network for every business. Our Mission: to engineer and manage digital media software solutions that enable businesses and customers to communicate with each other. Value Proposition: providing dedicated web-based media networks that are user-friendly, cost-effective and deliver measured results. Join Us: in a world where media delivery and communications are migrating from print to digital, we believe every company needs its own digital media network.