Neural networks are an advanced technique for building artificially intelligent machines. I will define "artificially intelligent" more specifically as describing machines that perform at least as well as humans on a specific task. There are many questions we cannot definitively answer with our current technology or methodology. Our knowledge of the brain is what has led us to the design of neural networks.
Increasing knowledge of brain physiology is helping us come closer, at least in theory, to building effective neural networks. Hardware limitations aside, we can theoretically use neural networks to learn how to mimic the brain's intelligence artificially. The application of chaos theory to AI and neural networks is another interesting and very important consideration I will make.
In contrast to the processing of a conventional computer, neural networks exhibit four major differences that vastly change how the computer processes information. The first difference is adaptive learning. A neural network can react to different sets of input without being told how to react to every imaginable set of input, as conventional computing requires. Thus it can use general properties of the input data to make a decision. As it receives new input over time, it revises what it "knows," so in essence it becomes more intelligent in its computations. It adapts to new information about its input as it receives it.
The second major feature is self-organization. The neural network organizes its own structure in response to its input data sets. In most neural network types this is simply the strength of certain links between neurons. This leads to the network reflecting and organizing itself based on the type of information it receives. The third difference is error tolerance. The neural network can generalize over the input data sets it has received, so if there are slight errors in the data, whether accidental or intentional, the network will be rather forgiving in its decisions. Finally, there is parallel information processing. Neural networks have the inherent ability to process input and output data in parallel.
Conventional computers follow a preprogrammed algorithmic approach. A computer follows exact instructions step by step, and it must know everything it needs at that time to solve the problem at hand. Neural networks, on the other hand, solve problems by learning from different examples and adjusting their network in response. The disadvantage of this compared with conventional computers is that their learning is unpredictable. We may think they are learning something one way when in fact they have learned something else unexpected from the input data. We also have little control over what they learn.
A simple example: suppose I gave a neural network a plain white background as input one, a white background with a blue square as input two, a white background with a purple circle as input three, and a white background with a green triangle as input four. My goal is for the neural network to detect whether there is a circle. From this example we can see that it could be trained to detect any object that is purple, or any object that is a circle regardless of color, or only circles that are purple. We don't know which it has really learned at this point; the data set is limited. No matter which of the three rules it has learned, the output (solution) will be the same for this example. We have no control over what it has learned in this situation. Giving it a wide range of varied examples will help steer it toward learning what we want it to learn. It is not easy to know when it is learning exactly what we want it to learn, since there can be more than one pattern in a group of data sets.
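The ambiguity can be made concrete in a short sketch. The (color, shape) feature encoding and the three candidate rules below are my own hypothetical reconstruction of the example above, not an actual trained network:

```python
# Hypothetical encoding of the four training inputs: each image reduced
# to (color, shape) features; (None, None) is the empty white background.
training_set = [
    ((None, None), False),          # input 1: blank white background
    (("blue", "square"), False),    # input 2: blue square
    (("purple", "circle"), True),   # input 3: purple circle
    (("green", "triangle"), False), # input 4: green triangle
]

# Three rules the network could have internalized; all fit the data.
def rule_color(color, shape):
    return color == "purple"

def rule_shape(color, shape):
    return shape == "circle"

def rule_both(color, shape):
    return color == "purple" and shape == "circle"

# Every rule reproduces the training labels exactly...
for (color, shape), label in training_set:
    assert rule_color(color, shape) == label
    assert rule_shape(color, shape) == label
    assert rule_both(color, shape) == label

# ...yet they disagree on unseen input, e.g. a red circle:
print(rule_color("red", "circle"), rule_shape("red", "circle"))  # False True
```

All three rules are indistinguishable on the limited data set, which is exactly why we cannot tell which one the network has "really" learned.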
An artificial network uses a node to represent a neuron in the brain. Structurally the network looks like a graph, and the links between nodes carry weights just like the edges of a weighted graph. A node has weighted inputs that cause its output to fire (in computer science terms, return a Boolean value of 1) only when their sum has reached some predetermined threshold. Weights can in theory be positive (excitatory) or negative (inhibitory). There are usually "layers" of nodes. Each neuron in a layer works in parallel, and the inputs and outputs of the network are also processed in parallel. These ideas come from the neuron in the brain, where a certain threshold of firings from the dendrites (inputs) can cause the axon (the neuron's output) to fire. Nodes are connected to other nodes in various ways depending on the type of net.
There are four possible types of connections between nodes. Feed-forward connections bring weighted output from a node in one layer to the nodes of the next layer. Feedback connections bring weighted output back to a lower layer. Lateral connections carry weighted output between nodes within their own layer. Finally, there are time-delayed connections, which add a time element. None of these connections has to be global: if one node has a feed-forward connection to a node in the next layer, it is possible that not all of the nodes in that next layer are connected to it.
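The threshold behavior described above can be sketched in a few lines. The weights and threshold here are arbitrary values of my own for illustration, not taken from any particular network:

```python
def fire(inputs, weights, threshold):
    """Threshold node: weighted inputs are summed, and the node fires
    (returns 1) only when the sum reaches the threshold.  Positive
    weights are excitatory, negative weights inhibitory."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Two excitatory inputs and one inhibitory input, threshold 1.0:
weights = [0.6, 0.6, -0.5]
print(fire([1, 1, 0], weights, 1.0))  # 0.6 + 0.6 = 1.2 >= 1.0 -> 1 (fires)
print(fire([1, 1, 1], weights, 1.0))  # 1.2 - 0.5 = 0.7 < 1.0 -> 0 (inhibited)
```

Note how the third, inhibitory input is enough to keep the node below threshold even when both excitatory inputs are active.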
Some common but relatively simple networks are the perceptron network, the Hopfield net, the Adaptive Resonance Theory (ART) network, and the Self-Organizing Map (SOM) of Kohonen. The perceptron is a multi-layered feed-forward network with no lateral or feedback connections. The Hopfield net is a one-layer network with lateral connections among all the nodes of that layer. ART has bi-directional connections between layers; a layer "resonates" a certain number of times before it propagates to the next layer. In the SOM of Kohonen, each node has a feature vector that is compared to the input of the network until the closest match is found. That node's vector is then updated, and the neighboring nodes are moved closer to or farther from it depending on the implementation.
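As a rough sketch of how a Hopfield net's all-lateral connections let it settle back into a stored pattern, here is a minimal version using the standard Hebbian weight rule. The six-node pattern and the node states of +1/-1 are my own arbitrary illustration:

```python
def train(patterns):
    """Hebbian weights: every node is laterally connected to every other."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, sweeps=10):
    """Repeatedly update each node from its lateral inputs until stable."""
    n = len(state)
    state = list(state)
    for _ in range(sweeps):
        for i in range(n):
            total = sum(w[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state

stored = [1, 1, -1, -1, 1, -1]       # one stored pattern of node states
w = train([stored])
noisy = [1, -1, -1, -1, 1, -1]       # same pattern with one node flipped
print(recall(w, noisy) == stored)    # True: the net settles on the pattern
```

This settling toward a stored equilibrium is exactly the "relaxation to an energy minimum" behavior that Freeman's quote later in this paper contrasts with the brain.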
There are various algorithms for the network to learn from its data sets. Some are better for certain types of networks. Every neural network mentioned except the SOM of Kohonen uses a supervised learning algorithm. This means the actual output is compared to the expected output, and adjustments are then made (usually weight changes on connections) until the network has correctly learned the appropriate output for the input. This corresponds to the theory that neurons in the brain strengthen their synapses to influence the interaction between neurons. The simplest example is the perceptron. It learns with the backpropagation algorithm, which is the standard algorithm and the easiest to understand. When the network produces output, that output is compared to what is expected for the input, and the weights of the links between the nodes of the outermost layer are increased or decreased to reduce the output error. Then the layer before that is adjusted, and so on, back to the input layer. You must cycle through the whole range of sample data types before going back for another cycle. If you have the network learn one input type, then move on and never return to that type, it will forget what it learned from that earlier input. For example, if the input is the alphabet, you would cycle through a to z over and over instead of cycling through the letter a many times before moving on to the letter b. We want the network to learn an average of weight values over the given set of data inputs. This keeps things general rather than too specific to one particular input, which would cause it to remember only the most recent input type.

The SOM of Kohonen is an example of unsupervised learning, which is not driven by comparing expected output against a given input, but only by the input data sets themselves. It learns the input data and clusters it in a self-organizing process that reflects the properties of the data. An example would be a large data set of numbers whose properties we want to know more about. There is no expected output to calculate, so the only way to learn about the data's properties is for the network to organize itself internally based on its input. A pattern of organization emerges as the network learns from the data set. The pattern can be displayed as a tree structure showing how the data relate to one another (see http://odur.let.rug.nl/~kleiweg/kohonen/kohonen.html).
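To illustrate the cycling idea, here is a minimal single-layer sketch of the supervised loop described above, learning logical OR. For a single layer, the backpropagation weight update reduces to the simple error-correction rule used here; the learning rate of 0.1 and the 20 epochs are arbitrary choices of mine:

```python
# Supervised learning: compare actual output to expected output, nudge
# the weights, and sweep the WHOLE data set each epoch (like cycling
# a..z) rather than drilling one example repeatedly.
data = [  # learn logical OR: (inputs, expected output)
    ((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1),
]
weights, bias, rate = [0.0, 0.0], 0.0, 0.1

def predict(x):
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s >= 0 else 0

for epoch in range(20):          # each epoch is one full cycle of the data
    for x, expected in data:     # never repeat one input type in isolation
        error = expected - predict(x)
        bias += rate * error
        for i, xi in enumerate(x):
            weights[i] += rate * error * xi

print(all(predict(x) == y for x, y in data))  # True once converged
```

Because every epoch revisits every example, the final weights reflect the whole data set instead of only the most recently seen input type.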
There are many ways to build a neural network; not all neural networks are created equal. Some of the practical application areas for neural networks are pattern recognition (for a simple walk-through example of image pattern recognition see http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html#Firing rules), filtering, data segmentation and compression, and optimization. Some networks have been engineered toward particular tasks, but there are no hard and fast rules about what a network can and cannot do.
While some of the generic networks mentioned above may seem, on the surface, sufficient to solve a variety of problems, they may be inadequate for many kinds of nontrivial problems that AI usually encompasses. Expecting these simple models to handle them would be like having a monkey do linear algebra: while a monkey may do other tasks fine, it cannot do certain complex things. The basic models described earlier do not accurately model a complex system like the brain. They can perform some tasks intelligently, but only to a very limited extent. If they performed them as well as humans, that would be the end of the story and all new research would halt. That is not the case, however, and there is research today into making them more like the brain. We must look further into advanced neurobiology to see whether and how we can improve on these kinds of models. The study of chaos in conjunction with neurobiology is only roughly ten years old, an extremely young field relatively speaking. It appears chaos has been discovered in the brain at both the microscopic (neural) and macroscopic levels, and much new research on neural networks involves learning how to simulate brain activity at both levels. To understand what we mean by chaos and its possible relevance to neural network construction, we will have to examine it in more detail.
Chaos theory grew slowly for most of the 20th century and has started to receive serious attention from scientists only in the last couple of decades. It is a new scientific paradigm, and it seems it may hold some answers, or at least, in some areas, a new way to view the world around us.
Henri Poincaré was the first to suggest anything like modern chaos theory. At the end of the 19th century, while working on the three-body problem for a prize competition sponsored by King Oscar II of Sweden, he was the first to arrive at the revolutionary idea of long-term unpredictability emerging from deterministic mathematical formulas. This means that if I give a mathematical formula a slight change in its input, it can lead to vastly different output; what will happen with each slightly different input is unpredictable. This is only possible because of the nonlinearity inherent in the formula. Even though it is deterministic (not random), it is unpredictable for a given set of inputs.
This is unlike classical Newtonian mechanics, where prediction was relatively easy using formulas and there were no "surprises," only consequences of the initial conditions. Nonlinearity is necessary for chaos, but not every unpredictable or deterministic system is chaotic, and deterministic chaos is not the only explanation for random, noisy, or unpredictable behavior. Since it is deterministic, there is no randomness built into deterministic chaos, yet through repeated iteration it becomes totally unpredictable over the long term. A graph of a chaotic expression's movement is cyclical in fashion, but what the graph will look like for a given input is not predictable. A function with a modulo in it demonstrates this conceptually: you can start with inputs that differ by only a very small amount, and after a while the graphs show totally different patterns. There was no way to predict one from the other even though they started with very similar inputs.
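A concrete version of that modulo idea, using the simple iteration x → 2x mod 1 (a standard textbook chaotic map of my choosing, not one named in the sources):

```python
# Iterating x -> 2x mod 1 is fully deterministic: each step doubles x
# and keeps only the fractional part.  Yet two starting points about
# one billionth apart end up far from each other after 40 steps.
def doubling_map(x, steps):
    for _ in range(steps):
        x = (2 * x) % 1.0
    return x

a = doubling_map(0.123456789, 40)
b = doubling_map(0.123456790, 40)   # initial difference: about 1e-9
print(abs(a - b))                   # the gap has grown to roughly 0.5
```

Each iteration doubles the tiny initial difference, and the modulo wraps the result back into [0, 1), so after a few dozen steps the two trajectories no longer resemble each other at all, even though nothing random ever happened.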
In the 1960s, Edward Lorenz accidentally entered a different value into his weather simulation program and got unpredictable results. His input varied only by a tiny amount, since he had mistyped a single digit, but he still got wildly different results. The Lorenz attractor, a solution to Lorenz's differential equations for the atmosphere, consists of two lobes between which the trajectory jumps back and forth in an unpredictable fashion.
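The same sensitivity can be reproduced numerically. The sketch below integrates the Lorenz equations with a crude forward-Euler step and the classic parameters (sigma = 10, rho = 28, beta = 8/3); the step size and starting points are arbitrary choices of mine:

```python
def lorenz_step(x, y, z, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz differential equations."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dt * dx, y + dt * dy, z + dt * dz

def run(x, y, z, steps):
    # Iterate the Euler step to follow one trajectory.
    for _ in range(steps):
        x, y, z = lorenz_step(x, y, z)
    return x, y, z

a = run(1.0, 1.0, 1.0, 20000)          # 20 time units of simulation
b = run(1.0 + 1e-6, 1.0, 1.0, 20000)   # same start, perturbed by 1e-6
print(a)
print(b)                               # the two endpoints no longer agree
```

This mirrors Lorenz's accident: a perturbation in the sixth decimal place of one coordinate is enough to put the two runs on visibly different parts of the attractor.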
We do not fully understand chaos theory, or the universe as a whole, right now. This makes it difficult to put chaos into practical applications such as neural networks. Determining whether something is truly chaotic is also hard. There are "tests" for chaos, but they do not diagnose it with absolute certainty. For example, we once thought we had tests proving that the heart was chaotic; when research was done on dogs, the heart turned out not to be chaotic in the mathematical sense. The tests gave a false conclusion. If the tests fail, it is hard to say what chaos is exactly and when we are actually seeing it in nature. It is hard to devise tests for something we obviously do not completely understand. We cannot simply tell at face value whether behavior is just random noise or has some underlying deterministic structure.
Brain physiologists had observed apparently random activity for a long time, but they originally dismissed it as irrelevant noise. Now some believe this is not noise at all, but true chaotic activity essential to brain functioning. Others believe that chaotic behavior is a side effect rather than an integral part of a well-functioning brain. Gail Carpenter and Stephen Grossberg, founders of adaptive resonance theory, believe that what chaos provides can be achieved in artificial neural networks in other ways, and there has been research into more advanced ART (adaptive resonance theory) networks since then. Dr. Walter Freeman, a UC Berkeley neurophysiologist who has written many articles on chaos in the brain, disagrees strongly. He believes chaos is integral to neural network and brain functioning, and there seems to be quite a bit of evidence he may be right. In comparing artificial neural networks to the brain, Freeman and his co-researchers said:
"…pattern recognition systems based on the perceptron... operate by relaxation to one of a collection of equilibrium states, constituting the minimization of an energy function," while on the other hand "…biological pattern recognition systems do not go to equilibrium and do not minimize an energy function. Instead, they maintain continuing oscillatory activity, sometimes nearly periodic but most commonly chaotic." 1
Artificial networks try to capture the brain's capabilities, and hence researchers try to model what happens down to the level of the neuron. An interesting theory is that chaos may be a necessity in neural networks for advanced intelligence, since it may be a requirement for true consciousness in the human brain. Freeman believes that chaos is the difference between the brain and an artificially intelligent machine that can act only in a controlled environment. Thus, if Freeman is right, chaos may be a necessity if we want to build an artificially intelligent machine with human-like intelligence through neural networks.
It is unknown whether microscopic (neuron-level) chaos is critical to macroscopic chaotic behavior, because chaotic behavior can be modeled at the macroscopic level with more traditional models. According to research, macroscopic chaos was found to be an unintentional by-product of neural networks with special properties based on feed-forward and feedback nodes, such as a Hopfield net. If these types of nets have both inhibitory and excitatory links, they can display chaotic behavior macroscopically. It also cropped up surprisingly when the nodes themselves were given the task of excitation or inhibition instead of being neutral as usual. This was done to simulate the "Dale hypothesis" that each neuron in the brain has either an excitatory or an inhibitory nature. This is profound, since the brain also has inhibitory and excitatory connections, and it explains the up-and-down motion of EEGs. When chaos was added to a Hopfield-type net, the net was able to engage in selective learning: it could recognize specific classes of stimulus while ignoring the rest. A neural network built to identify four different types of industrial parts did not identify defective and non-defective parts as well as the same network with chaos added.
Some examples of structures that could be added to a neural network to produce important chaotic behavior at the macroscopic level are: interlayer and intralayer connections; excitatory and inhibitory links; weights on links that can switch between negative and positive; neurodes that can display individually chaotic behavior; and nodes that are either excitatory or inhibitory instead of neutral. This chaotic behavior could lead to selective memorization, faster recognition of learned patterns, recognition of new patterns, creation of new categories for those newly found patterns, and better pattern recognition overall.
Widespread use of chaos in artificial neural networks has not happened; most of it has been demonstrated in limited, specific applications. Still, the evidence seems to point to the notion that Freeman may be right: chaos may be a necessity in building advanced neural networks, and in brain processing, rather than just a side effect or one route among many.
There is also research into many other dynamics intended to mimic more closely how the brain may think intelligently. One example is fractal neural networks, which are organized hierarchically to process information in a modular, hierarchical fashion (see http://www.comp.nus.edu.sg/~inns/IvoWidjaja/report.htm). It is hard to say which approach is most effective in making a network more intelligent. In time, through their use in practical applications, perhaps we will find out.
References:

Freeman, Walter J. (1991) The Physiology of Perception, Scientific American, Vol. 264, (2), pp. 78-85. http://sulcus.berkeley.edu/FreemanWWW/manuscripts/IE1/91.html

Stergiou, Christos and Siganos, Dimitrios, Neural Networks. http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html#Why%20use%20neural%20networks

http://www.gc.ssr.upm.es/inves/neural/ann1/concepts/structnn.htm

http://koti.mbnet.fi/~phodju/nenet/NeuralNetworks/NeuralNetworks.html

Gross, David, The Importance of Chaos Theory in the Development of Artificial Neural Systems. http://www.geocities.com/CapeCanaveral/Lab/3765/chaos/neuro1.html

Bradley, Stewart, Chaos Theory. http://students.bath.ac.uk/ma2bs/chaostheoryhomepage.html

Chaos in the Solar System, ICIC Center for Mathematical Sciences. http://www.icmsstephens.com/chaos1.htm

Extended Kohonen Maps. http://odur.let.rug.nl/~kleiweg/kohonen/kohonen.html#alg

Bowles, Richard, Richard Bowles' Idiot Guide to Neural Networks. http://richardbowles.tripod.com/neural/hopfield/hopfield.htm

Cambel, A.B. (1993) Applied Chaos Theory, Chapter 11.

Widjaja, Ivo, Fractal Neural Network. http://www.comp.nus.edu.sg/~inns/IvoWidjaja/report.htm