RESULTS OF RESEARCH
The description of
results will be organised around the major research issues addressed during
the research. For more details on the individual models and results, please
download the publications.
§ Grounding transfer in neural networks
§ Evolution of syntax
§ Co-evolution of language and brain
§ Language universals
§ Language in evolutionary robots
1 Symbols and Words: Grounding transfer in neural networks
During the research two new models were developed
to address the symbol-grounding problem (Harnad, 1990) and the
grounding-transfer capabilities of neural networks. The first model was
based on an extension of Cangelosi et al.’s (2000) work to deal with larger
category sets and to look at different strategies for the transfer of
grounding in new categories. The architecture of the network was
significantly changed. The connections were modularly organised, by
dividing hidden units into two groups, one for processing shapes, and one
for features. The networks were first trained to classify retinal images
using the backpropagation algorithm. Training stimuli consisted of four
animal shapes (e.g., horses) and four texture features (e.g., stripes).
Subsequently, nets were trained to name these stimuli (entry-level stage).
Learning occurs during these two phases through direct trial and error
experience supervised by corrective feedback (‘sensorimotor toil’).
Therefore, names acquired this way can be considered as symbols grounded in
retinal input. During the third training phase (higher-level stage) the
nets acquired new names (e.g. "zebra") defined on the sole basis
of symbolic strings containing combinations of previously grounded names
(‘symbolic theft’). In the following test phase (symbol transfer test), new
retinal images exhibiting combinations of previously learned shapes and
features were presented to the networks (e.g. images of zebras obtained by
combining a horse shape and the stripe feature). Although these nets had
never seen these images before, they were able to correctly categorise and
name images with entry- and higher-level symbols. This result clearly
showed that grounding is "transferred" from directly grounded
names to higher-order ones. Moreover, the networks were able to give the
correct sensorimotor response when they received the name of a higher-level
category as input (inverse grounding transfer).
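The modular organisation of the first model, in which regions of the input retina project only to their own group of hidden units, can be pictured as a 0/1 connectivity mask applied to the input-to-hidden weights. The following is a minimal sketch under stated assumptions: the layer sizes, the split of the retina into a "shape" region and a "feature" region, and the function names are hypothetical and are not taken from the original model.

```python
def modular_mask(n_shape_in, n_feature_in, n_shape_hid, n_feature_hid):
    """Return mask[i][h]: 1 if input unit i may connect to hidden unit h.

    Assumption: the first n_shape_in inputs form the shape region and the
    first n_shape_hid hidden units form the shape group; the remaining
    units belong to the feature region and feature group.
    """
    n_in = n_shape_in + n_feature_in
    n_hid = n_shape_hid + n_feature_hid
    mask = []
    for i in range(n_in):
        input_is_shape = i < n_shape_in
        row = []
        for h in range(n_hid):
            hidden_is_shape = h < n_shape_hid
            # A connection exists only inside the same module.
            row.append(1 if input_is_shape == hidden_is_shape else 0)
        mask.append(row)
    return mask


def masked_hidden_input(x, weights, mask):
    """Net input to each hidden unit, using only connections the mask allows."""
    n_hid = len(mask[0])
    return [
        sum(x[i] * weights[i][h] * mask[i][h] for i in range(len(x)))
        for h in range(n_hid)
    ]


# Hypothetical toy sizes: 2 shape inputs, 1 feature input, 1 hidden unit per group.
mask = modular_mask(2, 1, 1, 1)
weights = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
net = masked_hidden_input([1.0, 1.0, 1.0], weights, mask)
```

With this mask the shape hidden unit sums only the two shape inputs and the feature hidden unit sums only the feature input. Replacing the mask with all ones gives a fully connected layer.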
In a different model, the architecture of the
network was changed (Figure 1 Left). Modular connections were kept only
between the hidden and output layer, while the input and hidden layer were
fully connected. In fact, the previous separation of connections from
regions of the input retina to groups of hidden units was implausible and
only due to the peculiar organisation of stimuli. In this new simulation, a
new and expanded stimulus set was used (Figure 1 Right). Again, the
networks were able to transfer the grounding to new categories acquired via
language.

Figure 1: (Left) Neural network architecture for second
model with hidden-output modularity. (Right) Sample of images for
categorisation training and test
All these results support the approach to symbol
grounding based on fully connectionist models. The same network processes
both the sensorimotor grounding and the acquisition of new categories
through symbolic learning. The modular organisation of the hidden units
suggests that it is important that sensorimotor grounding is separated for
different classification features. In fact, when a fully distributed
network was used, the grounding transfer was difficult to achieve in
scaled-up models. We are currently
working on various extensions of this model. For example, in order to
improve the psychological plausibility and scalability of connectionist
approaches to symbol grounding, new algorithms are being used, such as
Kohonen’s self-organizing maps and Hebbian learning, for the basic
categorization stage. These extensions will be part of Thomas Riga’s PhD
research programme in Plymouth.
2 Evolution of syntax
Two main simulations were conducted to study the
evolution of syntax in grounded multi-agent systems. Both are based on the
original model of the PI (Cangelosi 1999; 2001) that showed the emergence
of early syntactic categories resembling the word categories of verbs and
nouns. A first study directly expanded the 1999 model (mushroom foraging
scenario) and looked at the interaction between language and learning in
the form of the Baldwin effect. In the second study, a completely new model
was developed to analyse the evolutionary acquisition of verbs and nouns in
object manipulation tasks.
The Baldwin Effect has been explicitly used as an
argument for the explanation of the origins of language and the evolution
of a Language Acquisition Device. In new simulations on the evolution of
compositional languages in foraging agents (Cangelosi 2001), the role of
cultural variation and of learning costs in the Baldwin Effect was
specifically addressed. Results showed that when there is a high cost
associated with language learning, agents gradually assimilate in their
genome some explicit features (e.g. lexical properties) of the specific
language they are exposed to. When the structure of the language is allowed
to vary using a process of cultural transmission, Baldwinian processes
cause, instead, the assimilation of a predisposition to learn, rather than
any structural properties associated with a specific language (Figure 2).
The analysis of the mechanisms underlying such a predisposition (using
categorical perception techniques) supports Deacon's hypothesis regarding
the Baldwinian inheritance of general underlying cognitive capabilities
that serve language acquisition. This is in opposition to the thesis that
argues for assimilation of structural properties needed for the
specification of a full-blown Language Acquisition Device (Pinker &
Bloom, 1990).

Figure 2 - The difference in learning error between the
initial and final generations indicates the presence of the Baldwin effect
for a predisposition to learn. Rather than possessing the full lexicon in
the first epoch, the network starts with a high error level, which
decreases quickly within a few generations.
The second model used a different behavioural
task for the evolution of verbs and nouns. It simulated a simple
two-segment arm that had to manipulate objects in a 2-D environment. The
lexicon of verbs (names of actions) and nouns (names of objects) was not
evolved autonomously by the agents, but was provided externally. The
analyses of results shed some light on the reciprocal influences between
language and non-linguistic cognition, on the differences between nouns and
verbs, and on the internal organization of neural networks that use
language in an ecological context. In particular it was shown that language
has a beneficial effect on non-linguistic cognition if it emerges on the
already existing basis of non-linguistic skills, but not if it evolves
together with them. The basis for this beneficial influence of language on
behaviour appears to be that language produces better internal categorical
representations of reality, that is, more similar representations of
different situations that must be responded to with the same action, and
more different internal representations of similar situations that must be
responded to with different behaviours. This effect is accentuated in verbs
(Figure 3). Verbs have a more beneficial effect on behaviour than nouns
because verbs, by their nature, tend to covary with the organism's actions
while nouns tend to covary with the objects of reality that may be
responded to with different actions on different occasions. Finally, the
model also permits some comparisons between the computational model of
language evolution and the literature on children’s language acquisition.
It shows that the evolution of nouns precedes that of verbs, as observed in
children’s language development (Tomasello & Brooks, 1999).

Figure 3 - Categorical perception measurements in the
model of the evolution of verbs and nouns. Note the increase in
between-category distances between no_language and all language conditions,
and between noun_only and verb_only conditions.
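A categorical perception measurement of the kind reported in Figure 3 compares how far apart internal representations are within a category and between categories. The sketch below shows one simple way to compute such distances, assuming Euclidean distance over hidden-unit activation vectors. The function name and the toy activations are hypothetical, and the exact measure used in the study may differ.

```python
from itertools import combinations
from math import dist


def mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def category_distances(representations):
    """Return (mean within-category distance, mean between-category distance).

    representations maps a category label to a list of hidden-activation
    vectors recorded while the network processed members of that category.
    """
    labelled = [
        (label, vector)
        for label, vectors in representations.items()
        for vector in vectors
    ]
    within, between = [], []
    for (label_a, vec_a), (label_b, vec_b) in combinations(labelled, 2):
        target = within if label_a == label_b else between
        target.append(dist(vec_a, vec_b))
    return mean(within), mean(between)


# Hypothetical hidden activations for two action categories.
toy = {
    "push": [(0.0, 0.0), (0.0, 1.0)],
    "pull": [(3.0, 0.0), (3.0, 1.0)],
}
within_d, between_d = category_distances(toy)
```

A stronger categorical perception effect shows up as a smaller within-category value, a larger between-category value, or both, when the same computation is repeated across conditions such as no_language and verb_only.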
3 Co-evolution of language and brain
Categorisation and language are some of the
fundamental abilities of cognitive organisms. Computational modelling
through neural networks permits the investigation of the functional role of
categorical perception in category learning and language acquisition. In
particular, the use of neural network models permits the investigation of
the neural mechanisms underlying both learning phenomena. Categorical
perception has been hypothesized to constitute the groundwork of
cognition, as in the case of the acquisition and evolution of language
(Harnad, 1987; Cangelosi & Harnad, 2000). In addition, it has been
hypothesised (Deacon, 1997) that the co-evolution of language and other
cognitive and symbolic abilities has played a major role in the evolution
of human language.
As seen in Figure 3, enhanced categorical perception effects have been
found in evolutionary neural networks for syntactic categories such as
nouns and verbs. When the network must respond
to the same object in different contexts with different actions (verbs),
the similarity space of verbs is optimised with respect to that of nouns
(Cangelosi & Parisi, 2001). It was also hypothesized that verbs have a
more beneficial effect on behaviour than nouns because the former tend to
covary with the network's sensorimotor tasks (actions), while nouns
tend to covary with the objects of reality that may be responded to with
different actions on different occasions. To understand better the neural
mechanisms behind category learning, language processing and sensorimotor
knowledge, the method of synthetic brain imaging (Arbib et al., 2000) has
been applied to these artificial neural networks. The PI adapted the
computational neuroscience method of Arbib et al. to the evolutionary connectionist
models used in this research. Analyses of data obtained from different
experimental conditions (e.g. manipulations of the network architecture)
showed that the representations of perceptual categories and syntactic
classes are sensitive to the internal organization of the network and to
the level of integration of linguistic information with sensorimotor
knowledge (Cangelosi & Parisi, 2003). Moreover, these models show
functional organizations that reflect those observed in human experiments
(Cappa & Perani 2003). For example, the synthetic brain imaging on the
evolutionary model showed that verbs are active in the network module
specialised for integrating sensorimotor knowledge (corresponding to the
Prefrontal motor cortex), while nouns are active in the sensory/associative
processing module (corresponding to associative, temporal areas of the
brain) (Figure 4).

Figure 4 – Data for synthetic brain imaging (fMRI) in
the neural network. Note the high activation for nouns in the first hidden
layer (sensory processing module) and the high activation for verbs in the
second hidden layer (sensorimotor integration module).
These neural network and synthetic brain imaging
studies support a series of general hypotheses on the interaction between
category learning, language emergence and the evolution of the brain.
First, categorical perception induced by language can be seen as an
instance of the Whorfian Hypothesis (Whorf, 1964). Our language influences
the way the world looks to us. Second, the enhancement of dissimilarities
in the category similarity space due to language acquisition (symbolic
theft) and its beneficial effects in the emergence of language highlight
some of the evolutionary and adaptive advantages of language. This can also
be used to support Deacon’s hypothesis on the co-evolution of language and
brain. Finally, the use of evolutionary models of category and language
learning produces some functional and architectural equivalence between
cognitive computational models and real organisms.
4 Language universals under cognitive constraints
It is not always straightforward to draw
conclusions about the evolution of characteristically human language from
the results of abstract computational models (Turner, 2002). For instance,
in Kirby’s (2001) simulations, a combinatorial syntax emerges in the
languages of the agents, but the ability to combine symbols to create
composite meanings is as characteristic of computer programming
‘languages’ and other language-like systems as it is of natural human
languages. Some examples of properties that are unique to the syntax of
human language are case constraints (e.g., in English these constraints
determine where a speaker can use he versus him), agreement constraints
(e.g., in English these constraints make utterances like “I are happy”
ungrammatical), binding constraints (e.g., in English these constraints
require him to refer to someone other than George in a sentence like
“George attacked him”, but allow co-reference in a sentence like “George’s
father attacked him”), constraints on displacement (e.g., in English these
constraints allow the man to appear in different structural positions in
paraphrases like “The dog bit the man” and
“The man was bitten”), constraints on word-order (e.g., subjects
precede verbs in the basic word order of all languages), and other
constraints on structure (e.g., sentences in all languages are
hierarchically structured in a way that can be represented using binary
trees). All of these constraints are found in some form in all human
languages, but there is almost no research in the literature addressing
their evolutionary origins. One of the research activities made possible by
the grant was the development of a new computational methodology to begin
to address this question.
The method, developed by Huck Turner, comes from
the realisation that computers now make it possible to perform optimisations
very rapidly so that hypotheses of the form “language constraint X is
optimal for Y” can be tested realistically. For instance, if it could be
shown computationally that the word orders observed in human language texts
are optimal for minimising demands on working memory, then given that there
are many more ways that words could be ordered that are not optimal, it
would be surprising if working memory didn’t have something to do with the
evolution of the specific word orders that we observe. Indeed this specific
hypothesis has received preliminary support under the research undertaken
so far.
In exploring the link between word order and
working memory, it was necessary to develop a theory of what is represented
when an utterance is held in working memory. The theory that Huck Turner
developed is essentially a representational variant of the derivational
theory of Chomsky (1995, chapter 4), but with some modifications. Under the
theory, a representation of a sentence’s structure can be completely
specified by the unordered set of ‘dependencies’ that hold between its
tokens. We therefore have a hypothesis about what is held in syntactic
working memory when a sentence has been parsed. The descriptive adequacy of
the theory was tested by applying it to the description of the first
chapter of The Wizard of Oz. It was unclear how the formalism could be used
to describe certain phenomena appearing in this sample including idioms,
aspects of non-declarative quoted speech, and co-ordinated structures arising
from the use of conjunctions, but most of the rest of the text can be
described with only trivial modifications.
The theory was implemented in a computational
parser to demonstrate how this representation could be constructed from an
input text. The parser identifies dependencies by matching features of
lexical tokens. The system has a number of interesting properties that may
have useful applications: 1) It can analyse the structure of a sentence
incrementally so does not need to wait until the whole sentence is input to
start processing, 2) it can determine the category of unknown words based
on the context in which they occur thus providing the potential to
automatically acquire syntactic information, and 3) text from different
languages can be mixed together in the same file and parsed without having
to specify which part of the text is from which language.
Word-order optimisation studies are on-going.
Huck Turner is in the process of producing a version of the Wizard of Oz
sample text with optimised word order to see if the resulting word order is
the same as that of the original English text. The variable that is being
optimised is the length of the dependencies in the structural descriptions
of the sentences – a property that is a function of word-order
specifications. If the optimal word order of the resulting text is the same
as English word order, then it will suggest an explanation for why English
has the word order that it has.
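The quantity being optimised, total dependency length as a function of word order, can be illustrated on a toy sentence. The sketch below is only an illustration under stated assumptions: the dependency pairs are a hypothetical analysis rather than the output of the project's formalism, tokens are assumed to be unique, and the exhaustive search over permutations stands in for whatever optimisation procedure the study uses and is feasible only for very short sentences.

```python
from itertools import permutations


def total_dependency_length(order, dependencies):
    """Sum of linear distances between the two tokens of each dependency."""
    position = {token: index for index, token in enumerate(order)}
    return sum(abs(position[a] - position[b]) for a, b in dependencies)


def optimal_orders(tokens, dependencies):
    """Return (minimum total length, list of all orders that reach it)."""
    best_length, best = None, []
    for order in permutations(tokens):
        length = total_dependency_length(order, dependencies)
        if best_length is None or length < best_length:
            best_length, best = length, [order]
        elif length == best_length:
            best.append(order)
    return best_length, best


# Hypothetical dependencies for a simplified "the dog bit man".
tokens = ("the", "dog", "bit", "man")
dependencies = [("dog", "the"), ("bit", "dog"), ("bit", "man")]
english_length = total_dependency_length(tokens, dependencies)
best_length, best = optimal_orders(tokens, dependencies)
```

In this toy case the English order reaches the minimum total length, as does its mirror image. This shows why dependency length alone may leave several orders tied and why the comparison with attested word order is informative only over a larger text.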
5 Emergence of communication in evolutionary robots
The evolutionary robotics approach has been
successfully applied to the synthesis of robots able to exploit
sensorimotor coordination (Nolfi, 2002), body and brain co-evolution
(Lipson and Pollack, 2000) and competing and cooperative collective
behaviours (Baldassarre et al., 2002). In this grant, a new model was
developed by Davide Marocco to run new experiments on the emergence of
communication. The experiments are based on an extended version of Nolfi and Marocco’s
(2002) model for the emergence of sensorimotor categorization. In this new
model, the robotic agents share the explicit categorization of objects
(spheres and cubes) with which they interact (Figure 5). The activation of
the output linguistic nodes in the robot’s neural controller is the signal
(“name”) sent to another agent to instruct it on what to do with the
object. Agents are selected on their ability to manipulate objects
correctly, not on their (linguistic) ability to name them correctly. A
variety of experiments were executed to test the role of different
sensorimotor, social and cognitive factors in the emergence of
communication.

Figure 5 - View of the evolutionary robotic arm while it interacts with the sphere
The simulation of this evolutionary robotics
model of the evolution of communication showed that: (a) the emergence of
language brings direct benefits to the agents and the population, in terms
of increased fitness and comprehension ability; (b) there is a benefit in
communicating with kin-related agents (e.g. between parents and
children), since this improves the possibilities of successfully evolving
shared lexicons also by maintaining stable and reliable signals; (c) good
sensorimotor and cognitive abilities permit the establishment of a link
between production and comprehension/behavioural abilities; (d) the kinship
relation between speaking parents and listening offspring does not fully
explain the emergence of communication – instead, this is important in the
early stages of communication because it exploits the cognitive benefits of
positive production/fitness correlations.
Most of these results also have important
implications for the theories and hypotheses on the origins of language.
For example, this simulation highlights and explains the role of cognitive
factors in the emergence of communication (Burling, 1993). In particular,
the model supports the hypothesis that the ability to form categories
constitutes the grounding for the subsequent evolution of words and
language (Harnad, 1996; Cangelosi & Harnad, 2000; cf. also §2.1). In
addition, future developments of this model could also have an impact on
computational investigations of the mirror neuron hypothesis for the
origins of language (Arbib, 2002).