Computing Reviews
Protocols from perceptual observations
Needham C., Ferreira L., Magee D., Devin V., Hogg D., Cohn A. Artificial Intelligence 167(1-2): 103-136, 2005. Type: Article
Date Reviewed: May 8 2006

This is a very interesting paper on the integration of sub-symbolic and symbolic systems. One of the main features of the described system is its ability to learn, under both unsupervised and supervised training. The authors have taken an important step in the quest for artificial intelligence: from visual and acoustic inputs, the system learns to correlate what is important in the sequence it is shown, and then selects an appropriate action for the perceived input signals. This was achieved in real time, with real data, on inexpensive hardware: two personal computers, Web cameras, and a microphone.

Simple real-world scenarios were used to demonstrate the principles: card games played with a pack of cards showing pictures of objects with different attributes (for instance, color and shape). The system uses Prolog as a high-level formalism to represent objects and relationships, while PROGOL is used for inductive learning, working directly from raw visual and acoustic data (color, shape, and single-word utterances). An attention mechanism based on motion analysis is used to select key frames, and objects’ attributes are clustered using unsupervised learning, with the resulting classes denoted by attribute labels. Finally, a supervised learning method, a vector-quantization-based nearest-neighbor classifier, is applied to the object’s attributes. Audio signals are processed in a similar fashion, using K-means clustering over the set of utterances. For each utterance, a symbolic data stream is created. In order to relate an object’s attributes to the uttered word, it is necessary to keep track of time. Once an utterance is classified, it is traced back to the corresponding video segment so that audio and visual symbols can be correlated. On the issue of linking perception to action, actions are defined as utterances: the system chooses to play back a video sequence of a person speaking the selected word, according to the currently perceived visual signals.
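The clustering, nearest-prototype classification, and time-based correlation described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (kmeans, nearest, align), the two-dimensional feature vectors, the timestamps, and the word symbols "w3" and "w7" are all hypothetical, and neither the motion-based attention mechanism nor the PROGOL induction step is shown. It only assumes that each key frame and each utterance carries a timestamp.

```python
from bisect import bisect_right
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means, seeded with the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # New centroid = mean of its group; an empty group keeps its old centroid.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def nearest(p, centroids):
    """Index of the closest prototype (a vector-quantization lookup)."""
    return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))


def align(utterances, segments):
    """Pair each (time, word_symbol) with the visual symbol of the video
    segment that was current at that time.

    segments: list of (start_time, visual_symbol), sorted by start_time.
    Utterances that precede the first segment are dropped.
    """
    starts = [start for start, _ in segments]
    pairs = []
    for t, word in utterances:
        i = bisect_right(starts, t) - 1  # last segment starting at or before t
        if i >= 0:
            pairs.append((segments[i][1], word))
    return pairs


# Hypothetical attribute vectors (e.g. two color features per object).
training = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
centroids = kmeans(training, k=2)

# Hypothetical key frames: (timestamp in seconds, attribute vector).
frames = [(0.0, (0.95, 0.05)), (5.0, (0.05, 0.95))]
segments = [(t, nearest(f, centroids)) for t, f in frames]

# Hypothetical utterance stream: (timestamp in seconds, word symbol).
utterances = [(1.2, "w3"), (6.5, "w7")]
pairs = align(utterances, segments)
print(pairs)
```

Each resulting pair links a visual class to a word class. Co-occurrences of this kind are the sort of symbolic input from which an inductive learner could generalize rules relating what is seen to what is said.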

The authors should be commended for tackling the difficult issue of symbol grounding. The burning question is how sensory projections can give rise to iconic representations, such that symbols can be attached to them, providing a semantic interpretation of the world. A clear answer is provided, and its limitations are highlighted in this paper.

Reviewer: Marcos Rodrigues
Review #: CR132749 (0703-0299)
Perceptual Reasoning (I.2.10 ... )
Modeling And Recovery Of Physical Attributes (I.2.10 ... )
Clustering (I.5.3)
Vision And Scene Understanding (I.2.10)
