| Visual Perception with Deep Learning |
 |
Google Tech Talks
April 9, 2008
ABSTRACT
A long-term goal of Machine Learning research
is to solve highly
complex "intelligent" tasks, such as visual
perception, auditory
perception, and language understanding. To
reach that goal, the ML
community must solve two problems: the Deep
Learning Problem, and the
Partition Function Problem.
There is considerable theoretical and
empirical evidence that complex
tasks, such as invariant object recognition
in vision, require "deep"
architectures, composed of multiple layers of
trainable non-linear
modules. The Deep Learning Problem is related
to the difficulty of
training such deep architectures.
Several methods have recently been proposed
to train (or pre-train)
deep architectures in an unsupervised
fashion. Each layer of the deep
architecture is composed of an encoder which
computes a feature vector
from the input, and a decoder which
reconstructs the input from the
features. A large number of such layers can
be stacked and trained
sequentially, thereby learning a deep
hierarchy of features with
increasing levels of abstraction. The
training of each layer can be
seen as shaping an energy landscape with low
valleys around the
training samples and high plateaus everywhere
else. Forming these
high plateaus constitutes the so-called
Partition Function Problem.
A particular class of methods for deep
energy-based unsupervised
learning will be described that solves the
Partition Function Problem
by imposing sparsity constraints on the
features. The method can learn
multiple levels of sparse and overcomplete
representations of
data. When applied to natural image patches,
the method produces
hierarchies of filters similar to those found
in the mammalian visual
cortex.
An application to category-level object
recognition with invariance to
pose and illumination will be described (with
a live demo). Another
application to vision-based navigation for
off-road mobile robots will
be described (with videos). The system
autonomously learns to
discriminate obstacles from traversable areas
at long range.
This is joint work with Y-Lan Boureau, Sumit
Chopra, Raia Hadsell,
Fu-Jie Huang, Koray Kavukcuoglu, and
Marc'Aurelio Ranzato.
Speaker: Yann Le Cun
Computational and Biological Learning Lab,
Courant Institute of Mathematical Sciences,
New York University.
Tags: google techtalks techtalk engedu talk talks googletechtalks education |
|
Views: 6332
Duration: 3445 s |
| Mobile Visual Search Engine on the Apple iPhone |
 |
Imagine you could search the Internet just by
taking a picture of something, such as snapping
a picture of a music CD to look up reviews
and listen to the tracks, instantly getting
information on a product featured in a
magazine, or looking up recommended places to
visit by taking a picture of a famous
landmark.
Millions of mobile phone users can "visually
search" the Internet today with the help of
this revolutionary search engine powered by
Evolution Robotics' Visual Pattern
Recognition (ViPR®) technology.
Evolution is developing a number of
applications for visual search, and we came
up with this quick demo for the iPhone to
showcase some of the possibilities.
For more information, please visit
www.evolution.com.
Tags: Mobile Visual Search Evolution Robotics Camera Phone Robotic Vision Pattern Recognition ViPR |
|
Views: 120317
Duration: 246 s |
| The Visual Wiki: a new metaphor for knowledge access and management |
 |
Google Tech Talks
June 4, 2008
ABSTRACT
Successful knowledge management results in a
competitive advantage in today's information-
and knowledge-rich industries. The
elaboration and integration of emerging
web-based tools and services have proven
suitable for collecting and organizing
intellectual property. Due to an increasing
information overload, information and
knowledge visualization have become an
effective method for representing complex
bodies of knowledge in an alternative fashion
by using visual languages. The focus of this
research is the development of a "Visual
Wiki", which combines the notion of a textual
and a visual representation of knowledge. A
Visual Wiki model has been proposed which
provides a unified framework to design and
discuss different approaches. Three
prototypes of Visual Wikis have been
implemented and evaluated according to the
improvements to knowledge management
applications that they facilitate. This is
joint work with Christian Hirsch and John
Grundy.
Tags: google techtalks techtalk engedu talk talks googletechtalks education |
|
Views: 6002
Duration: 1981 s |