Augmented Reality with Google Tablets and Glasses

Project Description

Wearable devices have attracted considerable attention from the research community because they offer a window into a user’s day-to-day life. Many of them carry multi-modal sensors that can record and/or transmit sensory data, including video and audio. Coupling this data with network connectivity enables a wide range of augmented reality (AR) applications that enrich the user’s life and provide more insight for making decisions.

For example, a wearable device backed by computer vision and machine learning methods can automatically infer information about the user’s surroundings and situation and relay it directly to the user. This gives the user more information for a particular decision, e.g. whether or not to buy a product in a store based on reviews and competitor pricing found online. These augmented capabilities could also be very beneficial for people with sensory impairments: a visually impaired person wearing an AR device can be warned (through audio) of immediate obstacles in their way, and a hearing-impaired person can be notified (through words on a display) that someone is calling out to them.

In this project, we aim to build an AR system based on the Google Glass and a Google Tablet, which will automatically acquire visual and audio data and transfer it to a central processing station for analysis. Information inferred from this data will be transferred back to the Glass and conveyed to the user in visual or audio form. This is possible because the Glass supports both visual and audio sensors. One possible output of this project is the ability to show on the Glass display the automatically generated results of recognizing (i.e. labeling) and detecting (i.e. localizing) objects in front of the user.
Program - Electrical Engineering
Division - Computer, Electrical and Mathematical Sciences and Engineering
Center Affiliation - Visual Computing Center
Field of Study - Computer, Electrical, Mathematical Sciences, Engineering

About the Researcher

Bernard Ghanem

Associate Professor, Electrical and Computer Engineering

Professor Ghanem's research interests focus on topics in computer vision, machine learning, and image processing. They include:
  • Modeling dynamic objects in video sequences to improve motion segmentation, video compression, video registration, motion estimation, and activity recognition.
  • Developing efficient optimization and randomization techniques for large-scale computer vision and machine learning problems.
  • Exploring novel means of involving human judgment to develop more effective and perceptually-relevant recognition and compression techniques.
  • Developing frameworks for joint representation and classification by exploiting data sparsity and low-rankness.

Desired Project Deliverables

  • A software module based on the Google Glass SDK to acquire and transfer still images and videos from the Glass to a central processing station and transfer metadata in the opposite direction.
  • An API for the central processing station to invoke automatic computer vision and machine learning algorithms on the received images and videos.
  • A software module based on the Google Glass SDK that conveys to the user, on the Glass display, the metadata acquired from the central processing station.
  • A large-scale dataset of videos and still images captured by a Google Glass during day-to-day activities. The important objects and activities in these videos will be manually labeled and used for training as well as testing the overall system. This dataset will be made publicly available to the research community for future algorithm evaluation and comparison.
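
The round trip between the Glass and the central processing station (a frame goes in, metadata comes back and is turned into display text) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's design: every name here (ProcessingStation, Detection, dummy_detector, render_overlay) is hypothetical, no Google Glass SDK calls are used, the network transport is omitted, and the "detector" is a placeholder that returns a fixed result instead of running a real vision model.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List


@dataclass
class Detection:
    """One recognized object: a label, a pixel bounding box, and a confidence."""
    label: str
    box: tuple  # (x, y, width, height), an assumed convention
    score: float


# Hypothetical analyzer interface: raw image bytes in, detections out.
Analyzer = Callable[[bytes], List[Detection]]


class ProcessingStation:
    """Stand-in for the central processing station; runs registered analyzers."""

    def __init__(self) -> None:
        self._analyzers: Dict[str, Analyzer] = {}

    def register(self, name: str, analyzer: Analyzer) -> None:
        """Make an algorithm available under a name of the caller's choosing."""
        self._analyzers[name] = analyzer

    def analyze(self, frame_id: int, image: bytes) -> str:
        """Run every analyzer on one frame and return the metadata as JSON."""
        results = {
            name: [asdict(d) for d in analyzer(image)]
            for name, analyzer in self._analyzers.items()
        }
        return json.dumps({"frame_id": frame_id, "results": results})


def dummy_detector(image: bytes) -> List[Detection]:
    """Placeholder for a real vision model; always reports one fixed object."""
    return [Detection(label="cup", box=(10, 20, 50, 60), score=0.9)]


def render_overlay(metadata_json: str) -> List[str]:
    """Turn returned metadata into text lines a display module might show."""
    metadata = json.loads(metadata_json)
    lines = []
    for detections in metadata["results"].values():
        for d in detections:
            lines.append(f"{d['label']} ({d['score']:.0%}) at {tuple(d['box'])}")
    return lines


if __name__ == "__main__":
    station = ProcessingStation()
    station.register("objects", dummy_detector)
    reply = station.analyze(frame_id=1, image=b"fake-image-bytes")
    for line in render_overlay(reply):
        print(line)
```

Keeping the analyzers behind a single registration call means recognition and detection algorithms could be swapped or added without changing the code that sends frames or renders metadata, which is one plausible reason to separate the station API from the two device-side modules.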