WIREs Cogn Sci

ART, cognitive science, and technology transfer



Three computational examples illustrate how cognitive science can introduce new approaches to the analysis of large datasets. The first example addresses the question: how can a neural system learning from one example at a time absorb information that is inconsistent but correct, as when a family pet is called Spot and dog and animal, while rejecting similar incorrect information, as when the same pet is called wolf? How does this system transform such scattered information into the knowledge that dogs are animals, but not conversely? The second example asks: how can a real‐time system, initially trained with a few labeled examples and a limited feature set, continue to learn from experience when confronted with oceans of additional information, without eroding reliable early memories? How can such individual systems adapt to their unique application contexts? The third example asks: how can a neural system that has made an error refocus attention on environmental features that it had initially ignored? Three models that address these questions, each based on the distributed adaptive resonance theory (dART) neural network, are applied to a spatial testbed created from multimodal remotely sensed data. The article summarizes key design elements of ART models, and provides links to open‐source code for each system and the testbed dataset.

WIREs Cogn Sci 2013, 4:707–719. doi: 10.1002/wcs.1260

This article is categorized under:
Computer Science > Machine Learning
Computer Science > Neural Networks
Psychology > Memory

Distributed adaptive resonance theory (dART). (a) At the field F0, complement coding transforms the feature pattern a to the system input A, which represents both scaled feature values a_i ∈ [0,1] and their complements (1 − a_i) (i = 1 … M). (b) F2 is a competitive field that transforms its input pattern into the working memory code y. The F2 nodes that remain active following competition send the pattern σ of learned top‐down expectations to the match field F1. The pattern active at F1 becomes x = A ∧ σ, where ∧ denotes the component‐wise minimum, or fuzzy intersection. (c) A parameter ρ ∈ [0,1], called vigilance, sets the matching criterion. The system registers a mismatch if the size of x is less than ρ times the size of A. A top‐down/bottom‐up mismatch triggers a signal that resets the active F2 code. (d) Medium‐term memories in the F0‐to‐F2 dynamic weights allow the system to activate a new code y. When only one F2 node remains active following competition, the code is maximally compressed, or winner‐take‐all. When |x| ≥ ρ|A|, y remains approximately constant until the next reset, even if input A changes or F0‐to‐F2 signals habituate. During learning, thresholds τ_ij in paths from F0 to F2 increase according to the dInstar law, and thresholds τ_ji in paths from F2 to F1 increase according to the dOutstar law.
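
The matching steps in panels (a) to (c) can be illustrated with a minimal Python sketch. It is not the article's open‐source code: the function names and the example expectation pattern are hypothetical, and the "size" of a pattern is taken to be the city‐block (L1) norm, as is usual in fuzzy ART models.

```python
def complement_code(a):
    """Map scaled features a_i in [0, 1] to the input A = (a, 1 - a), as at F0."""
    if any(not 0.0 <= v <= 1.0 for v in a):
        raise ValueError("features must be scaled to [0, 1]")
    return list(a) + [1.0 - v for v in a]


def fuzzy_and(u, v):
    """Component-wise minimum (fuzzy intersection) of two patterns."""
    return [min(p, q) for p, q in zip(u, v)]


def size(v):
    """Pattern size |v|, assumed here to be the sum of non-negative components."""
    return sum(v)


def match(A, sigma, rho):
    """Return the F1 pattern x = A ∧ σ and whether it meets |x| >= ρ|A|."""
    x = fuzzy_and(A, sigma)
    return x, size(x) >= rho * size(A)


a = [0.25, 0.75]                 # hypothetical two-feature input
A = complement_code(a)           # complement-coded input, |A| = M = 2
sigma = [0.5, 0.5, 1.0, 0.25]    # hypothetical top-down expectation pattern

x, resonates = match(A, sigma, rho=0.75)   # |x| = 1.75 >= 1.5, so no reset
_, resonates_strict = match(A, sigma, rho=0.9)  # 1.75 < 1.8, so a reset would follow
```

Because complement coding makes |A| equal to the number of features M for every input, the vigilance test compares |x| against a fixed threshold ρM, so raising ρ directly tightens how much of the input the expectation must cover.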
Boston Testbed simulations comparing test strip accuracy (recall) of a winner‐take‐all network without featural bias (λ = 0) and biased adaptive resonance theory (ART) (λ > 0). The bias index λ is set equal to 10 by default, and performance is generally insensitive to precise parameter values once λ is large enough to bias search.
In biased adaptive resonance theory (ART), the medium‐term memory e cumulatively tracks features as they are attended at the match field for a given input. Following a reset, e reduces activation of previously attended features in both the bottom‐up input and the new matched pattern. Compare the search step in panel (d) of the dART figure, where the network is unbiased (λ = 0) but otherwise the same.
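
The biasing idea can be sketched in a few lines of Python. The caption does not give the update equations, so both rules below are assumptions made for illustration: e is updated as a running component‐wise maximum of attended patterns, and the bias is applied as a rectified subtraction of λe. The function names and example patterns are hypothetical.

```python
def update_attention_memory(e, x):
    """Assumed rule: e_i keeps the largest activation attended so far for this input."""
    return [max(ei, xi) for ei, xi in zip(e, x)]


def apply_bias(pattern, e, lam):
    """Assumed rule: subtract lam * e_i from each component, rectified at zero."""
    return [max(0.0, p - lam * ei) for p, ei in zip(pattern, e)]


A = [0.25, 0.75, 0.75, 0.25]        # hypothetical complement-coded input
e = [0.0, 0.0, 0.0, 0.0]            # medium-term memory starts empty for a new input

x_first = [0.0, 0.5, 0.75, 0.0]     # hypothetical pattern attended before the reset
e = update_attention_memory(e, x_first)

A_biased = apply_bias(A, e, lam=10.0)    # previously attended features are suppressed
A_unbiased = apply_bias(A, e, lam=0.0)   # lam = 0 leaves the input unchanged
```

Under these assumptions, features 2 and 3 drop out of the biased input after the reset, so the next search favors codes that match the features ignored on the first attempt. With λ = 0 the input is unchanged and the search is unbiased.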
Confusion matrices for the self‐supervised adaptive resonance theory (ART) simulation shown in panel (b) of the self‐supervised ART performance figure. (a) Test performance after Stage 1 learning on labeled inputs with the five blue‐related feature values. (b) Performance on the same test set after further Stage 2 learning.
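
Per‐class recall can be read off a confusion matrix as the diagonal entry divided by its row total. The sketch below assumes that rows index the ground‐truth class and columns the predicted class; the matrix values are invented for illustration and are not results from the article.

```python
def per_class_recall(confusion):
    """Recall for each class, assuming confusion[i][j] counts pixels of
    true class i that were predicted as class j."""
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        # A class with no labeled pixels has undefined recall.
        recalls.append(row[i] / total if total else float("nan"))
    return recalls


# Hypothetical three-class confusion matrix (not data from the article).
confusion = [
    [8, 2, 0],
    [1, 3, 0],
    [0, 5, 5],
]
recalls = per_class_recall(confusion)
```

Comparing the recall vectors computed from the Stage 1 and Stage 2 matrices shows which classes gained or lost accuracy after the additional unlabeled learning.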
Self‐supervised adaptive resonance theory (ART) performance on the Boston Testbed. Stage 1 training specifies class labels and five blue‐related features in three strips. Stage 2 training then specifies all input features but no class labels for pixels in the remaining strip. Testing measures the performance of the system on fully featured inputs from the fourth strip. Each simulation result is the average of the independent accuracies across the four test strips. (a) Performance histograms for 500 randomized trials. (b) For one trial, the percentage of pixels labeled as belonging to a class that were predicted as in that class (recall) after Stage 1 (dark bars) and Stage 2 (light bars).
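
The four‐strip protocol can be sketched as a fold generator plus an average. This reading assumes that the "remaining strip" used for Stage 2 and the "fourth strip" used for testing are the same held‐out strip; the function names, dictionary keys, and accuracy values are hypothetical.

```python
def strip_folds(n_strips=4):
    """Yield one fold per held-out strip.

    Assumption: Stage 1 uses the other three strips (labels, limited features),
    while Stage 2 (unlabeled, all features) and testing both use the held-out strip.
    """
    for held_out in range(n_strips):
        stage1 = [s for s in range(n_strips) if s != held_out]
        yield {
            "stage1_labeled": stage1,
            "stage2_unlabeled": held_out,
            "test": held_out,
        }


def mean_accuracy(per_strip_accuracy):
    """Average the independent accuracies obtained on each test strip."""
    return sum(per_strip_accuracy) / len(per_strip_accuracy)


folds = list(strip_folds())

# Hypothetical per-strip accuracies, one per fold (not results from the article).
result = mean_accuracy([0.5, 0.75, 1.0, 0.75])
```

Averaging over strips rather than pooling pixels gives each strip equal weight, which matters when class distributions differ substantially from strip to strip.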
For the Boston Testbed, the adaptive resonance theory (ART) knowledge discovery system correctly and robustly produces all class rules and levels, and no equivalence relations.
The Boston Testbed was derived from data acquired on the morning of January 1, 2001, by the Earth Resources Observation System (EROS) Data Center, U.S. Geological Survey, Sioux Falls, SD. The 5.4 km × 9 km area includes portions of northeast Boston and suburbs, and encompasses mixed urban, suburban, industrial, water, and park spaces. Ground‐truth pixels are labeled ocean, ice, river, beach, park, residential, and industrial. Of the 216,000 pixels, 10% are labeled ocean, and only 3% represent other classes. Cross‐validation divides the image into four vertical strips, with class distributions varying substantially across strips.
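
The figures in this caption imply a ground resolution and approximate label counts, which the short calculation below works out. It assumes square pixels that tile the 5.4 km × 9 km area exactly, and it treats the stated percentages as rounded, so the counts are approximate.

```python
import math

width_m, height_m = 5_400, 9_000   # 5.4 km x 9 km area, in meters
n_pixels = 216_000                 # total pixels stated in the caption

# Ground area covered by one pixel, and the side length if pixels are square.
area_per_pixel_m2 = width_m * height_m / n_pixels
pixel_side_m = math.sqrt(area_per_pixel_m2)

# Image dimensions in pixels under the square-pixel assumption.
cols = round(width_m / pixel_side_m)
rows = round(height_m / pixel_side_m)

# Approximate labeled-pixel counts from the stated percentages.
ocean_pixels = n_pixels * 10 // 100
other_labeled_pixels = n_pixels * 3 // 100
unlabeled_pixels = n_pixels - ocean_pixels - other_labeled_pixels
```

Under these assumptions each pixel covers 15 m × 15 m, and roughly 87% of the image carries no ground‐truth label.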
