Computing Reviews, the leading online review service for computing literature.

Protocols from perceptual observations
Needham C., Ferreira L., Magee D., Devin V., Hogg D., Cohn A. Artificial Intelligence 167(1-2): 103-136, 2005. Type: Article

Date Reviewed: May 8 2006

This is a very interesting paper on the integration of sub-symbolic and symbolic systems. One of the main features of the described system is its ability to learn under both unsupervised and supervised training. The authors have achieved an important step in the quest for artificial intelligence: given visual and acoustic inputs, the system learns to correlate what is important in the observed sequence and then selects an appropriate action for the perceived input signals. This was achieved in real time, with real data, on inexpensive hardware: two personal computers, Web cameras, and a microphone. Simple real-world scenarios were used to demonstrate the principles: card games played with a pack of cards showing pictures of objects with different attributes (for instance, color and shape).
The system uses Prolog as a high-level formalism to represent objects and relationships, while PROGOL is used for inductive learning, working directly from raw visual and acoustic data (color, shape, and single-word utterances). An attention mechanism based on motion analysis selects key frames, and the objects’ attributes are clustered using unsupervised learning, with the different classes denoted by attribute labels. Finally, a supervised learning method (a vector quantization-based nearest neighbor classifier) is applied to the objects’ attributes. Audio signals are processed in a similar fashion, using K-means clustering over the set of utterances. For each utterance, a symbolic data stream is created. To relate an object’s attributes to the uttered word, it is necessary to keep track of time: once an utterance is classified, it is traced back to the corresponding video segment so that audio and visual symbols can be correlated.
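To make the processing chain described above more concrete, the following is a minimal sketch in Python, not the authors' implementation. It clusters attribute vectors with K-means, labels a new vector by its nearest centroid (the vector quantization nearest neighbor rule), and pairs each classified utterance with an earlier key frame. The feature values, the symbol names (`colour_0`, `word_a`), the seeding of centroids with the first k points, and the rule of pairing an utterance with the latest key frame at or before it are all assumptions made for illustration.

```python
from math import dist


def nearest(vector, centroids):
    """Index of the closest centroid (vector-quantization nearest-neighbor rule)."""
    return min(range(len(centroids)), key=lambda i: dist(vector, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain K-means (Lloyd's algorithm); the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members; keep the old
        # centroid if a cluster ended up empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def align(utterances, key_frames):
    """Pair each (time, word_symbol) with the latest key frame at or before it."""
    pairs = []
    for t, word in utterances:
        earlier = [frame for frame in key_frames if frame[0] <= t]
        if earlier:
            frame = max(earlier, key=lambda f: f[0])
            pairs.append((frame[1], word))
    return pairs


# Hypothetical 2-D colour features for six observed cards.
features = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8), (1.0, 0.0), (0.0, 1.0)]
centroids = kmeans(features, k=2)

# Classify a new observation by its nearest centroid.
label = f"colour_{nearest((0.85, 0.15), centroids)}"

# Hypothetical time-stamped symbol streams: (seconds, symbol).
key_frames = [(1.0, "colour_0"), (4.0, "colour_1")]
utterances = [(2.5, "word_a"), (5.2, "word_b")]
pairs = align(utterances, key_frames)
```

Here `pairs` holds the correlated visual and audio symbols that a symbolic learner could then consume. An utterance that precedes every key frame is simply dropped in this sketch.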
On the issue of linking perception to action, actions are defined as utterances: the system chooses to play back a video sequence showing a person speaking the word selected according to the currently perceived visual signals. The authors should be commended for tackling the difficult issue of symbol grounding. The burning question is how sensory projections can give rise to iconic representations to which symbols can be attached, providing a semantic interpretation of the world. A clear answer is provided, and its limitations are highlighted in this paper.
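The link from perception to action mentioned above, where a perceived visual symbol leads to the choice of a word to play back, can be illustrated with a deliberately simplified sketch. It swaps in a plain co-occurrence count for the inductive logic programming step, so it cannot capture relations between objects the way induced rules can. The function names and the symbols in the example data are hypothetical.

```python
from collections import Counter, defaultdict


def learn_associations(observations):
    """Count how often each word symbol is paired with each visual symbol."""
    table = defaultdict(Counter)
    for visual_symbol, word in observations:
        table[visual_symbol][word] += 1
    return table


def select_action(table, visual_symbol):
    """Return the word most often paired with the symbol, or None if unseen."""
    if visual_symbol not in table:
        return None
    return table[visual_symbol].most_common(1)[0][0]


# Hypothetical training pairs of (visual symbol, uttered word symbol).
observations = [
    ("colour_0", "word_a"),
    ("colour_0", "word_a"),
    ("colour_0", "word_b"),
    ("colour_1", "word_b"),
]
table = learn_associations(observations)

# The selected word would index the video clip to play back.
action = select_action(table, "colour_0")
```

A visual symbol never seen during training yields `None`, which a real system would have to handle explicitly, for example by staying silent.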

Reviewer: Marcos Rodrigues	Review #: CR132749 (0703-0299)

Perceptual Reasoning (I.2.10 ... )

Modeling And Recovery Of Physical Attributes (I.2.10 ... )

Clustering (I.5.3)

Vision And Scene Understanding (I.2.10)
