Monday, April 11, 2005

Artificial Intelligence in a Nutshell

Contents

1. Why study Artificial Intelligence?
2. What is Artificial Intelligence?
3. The History of Artificial Intelligence
4. Stumbling Blocks and Successes
5. Situated or Embodied Cognition
6. Evolutionary Computation
7. Neural Networks and Neuro-computation
8. Swarm Intelligence
9. Current Status and Future Possibilities of Artificial Intelligence
10. References

1. Why study Artificial Intelligence?

Humans study artificial intelligence for primarily two reasons. Firstly, we wish to understand how we, as humans, think. How can an organ as small as the brain ‘perceive, understand, predict, and manipulate a world far larger and more complicated than itself?’ (Russell & Norvig, 2003).
Secondly, we study artificial intelligence in an attempt to create intelligent phenomena.

2. What is Artificial Intelligence?

Some definitions:
· “The study of mental faculties through the use of computational models” (Charniak & McDermott, 1985).
· “The study of the computations that make it possible to perceive, reason and act” (Winston, 1992).
· “Computational Intelligence is the study of the design of intelligent agents” (Poole et al., 1998).
· “AI is concerned with intelligent behavior in artifacts” (Nilsson, 1998).

Summarising these definitions, one could say that AI is the study of ‘systems that think and act rationally.’

3. The History of Artificial Intelligence

The beginnings of artificial intelligence can be traced back as far as 500 BC, when Greek philosophy proposed a precise, logical set of laws by which rational thought functioned. Mathematics, from roughly 1600 years ago to the present, has contributed primarily through the fields of logic, computation and probability theory (Russell & Norvig, 2003).
Other contributions were made by the fields of economics, neuroscience, psychology (especially cognitive and behavioural psychology), engineering, cybernetics and linguistics. However, 1956 is generally regarded as the official ‘Birth of Artificial Intelligence.’
A workshop held at Dartmouth College in 1956, attended by the primary thinkers and researchers in the field, made plain the need for a clearly defined and separate area of research, which was promptly named artificial intelligence.

Alan Turing (1950) proposed the Turing Test as a way of defining ‘intelligence’: a human interrogator poses a series of questions, over a text-only channel, to both a human and a machine that are hidden from view. The machine is said to be intelligent if its responses are such that the interrogator cannot reliably tell which respondent is the human and which is the machine.
Turing also catalogued a list of ‘X’s’: things that, it was claimed, a machine could never do, such as be kind, resourceful, beautiful, friendly, have initiative, have a sense of humour, tell right from wrong, make mistakes, fall in love, enjoy strawberries and cream, make someone fall in love with it, learn from experience, use words properly, be the subject of its own thought, have as much diversity of behaviour as man, do something really new.
AI researchers, however, soon began to disprove some of those claims by actually getting machines to do several of the X’s mentioned above. John McCarthy, an important figure in AI, called it the “Look, Ma, no hands!” era.

John McCarthy, who had organised the workshop that marked the ‘Birth of Artificial Intelligence’ in 1956, made three major contributions in 1958. He defined the high-level programming language LISP, which became the dominant programming language in the field of AI. Although he now had the tool, he lacked the necessary access to computer processing power, so he developed time sharing in order to obtain it. This is akin to the way a modern operating system, such as Microsoft Windows, lets different applications access the CPU(s) of a machine at pre-allocated times and in pre-allocated proportions, so that each application gets at least some access to the processor and the illusion of multi-tasking is maintained. McCarthy also published a paper called ‘Programs with Common Sense,’ in which he described a then hypothetical computer program called the ADVICE TAKER. It would use knowledge to find solutions to problems after being programmed with general knowledge about the world. It would also be intelligent in the sense that it would be able to ‘learn’ and acquire new knowledge without being reprogrammed (Russell & Norvig, 2003).
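
To make the time-sharing idea concrete, here is a minimal sketch (purely illustrative, not McCarthy's actual system) of round-robin scheduling, in which each task receives a fixed slice of processor time in turn; the task names and slice sizes are invented for the example:

```python
from collections import deque

def round_robin(tasks, slice_units=1):
    """Run each task for a fixed time slice in turn until all are finished.

    `tasks` maps a (hypothetical) task name to the number of work units it
    still needs. The point is only to illustrate time sharing: no single task
    monopolises the processor, so all of them appear to run 'at once'.
    """
    queue = deque(tasks.items())
    schedule = []                      # the order in which slices were granted
    while queue:
        name, remaining = queue.popleft()
        work = min(slice_units, remaining)
        schedule.append((name, work))  # this task 'uses' the CPU for one slice
        if remaining - work > 0:
            queue.append((name, remaining - work))
    return schedule

if __name__ == "__main__":
    # Three imaginary applications sharing one processor, one unit at a time.
    print(round_robin({"editor": 3, "compiler": 2, "mail": 1}))
```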

A collection of works published by Rumelhart & McClelland (1986), called Parallel Distributed Processing, sparked a resurgence of interest in neural networks. The idea behind neural networks was the premise that human cognition and memory rely, to a large extent, on the connections between knowledge structures. It was proposed that this model of knowledge manipulation be incorporated into AI research as an alternative to the logical approach of McCarthy and others (Smolensky, 1988). It was also in contrast to the symbolic models of Newell & Simon (1976). However, it seems that the neural-network and symbolic models are complementary rather than mutually exclusive (Russell & Norvig, 2003).

4. Stumbling Blocks and Successes

“Early experiments in machine evolution (now called genetic algorithms) (Friedberg, 1958; Friedberg et al., 1959) were based on the undoubtedly correct belief that by making an appropriate series of small mutations to a machine code program, one can generate a program with good performance for any particular simple task. The idea, then, was to try random mutations with a selection process to preserve mutations that seemed useful. Despite thousands of hours of CPU time, almost no progress was demonstrated. Modern genetic algorithms use better representations and have shown more success” (Russell & Norvig, 2003).
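
As a concrete (and deliberately toy) illustration of the ‘random mutation plus selection’ loop described above, here is a minimal sketch in Python. It evolves a bit string towards a target rather than mutating machine code, and the problem and parameters are invented for the example:

```python
import random

def evolve_bitstring(target, steps=10000, seed=0):
    """Random mutation plus selection: flip one bit, keep it if fitness improves.

    Fitness is the number of bits matching `target`. This mirrors the
    'try random mutations and preserve the useful ones' idea described above,
    applied to a toy bit string rather than a machine-code program.
    """
    rng = random.Random(seed)
    n = len(target)
    current = [rng.randint(0, 1) for _ in range(n)]
    fitness = sum(c == t for c, t in zip(current, target))
    for _ in range(steps):
        candidate = current[:]
        candidate[rng.randrange(n)] ^= 1     # mutate: flip one randomly chosen bit
        cand_fit = sum(c == t for c, t in zip(candidate, target))
        if cand_fit >= fitness:              # selection: keep non-worse mutants
            current, fitness = candidate, cand_fit
        if fitness == n:
            break
    return current, fitness

if __name__ == "__main__":
    print(evolve_bitstring([1, 0, 1, 1, 0, 0, 1, 0]))
```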

In the period 1969-1979 there was an attempt to develop knowledge-based systems that solved problems via “cookbook recipes.” Feigenbaum et al. (1971) proposed that “all the relevant theoretical knowledge to solve these problems has been mapped over from its general form (the ‘first principles’) to efficient special forms (‘cookbook recipes’).” “Such approaches have been called weak methods, because, although general, they do not scale up to large or difficult problem instances. The alternative to weak methods is to use more powerful, domain-specific knowledge that allows larger reasoning steps and can more easily handle typically occurring cases in narrow areas of expertise. One might say that to solve a hard problem, you have to almost know the answer already. The DENDRAL program (Buchanan et al., 1969) was an early example of this approach.

“The significance of DENDRAL was that it was the first successful knowledge-intensive system: its expertise derived from large numbers of special-purpose rules. Later systems also incorporated the main theme of McCarthy’s Advice Taker approach – the clean separation of the knowledge (in the form of rules) from the reasoning component. With this lesson in mind, Feigenbaum and others at Stanford began the Heuristic Programming Project (HPP), to investigate the extent to which the new methodology of expert systems could be applied to other areas of human expertise. The next major effort was in the area of medical diagnosis. Feigenbaum, Buchanan, and Dr. Edward Shortliffe developed MYCIN to diagnose blood infections. With about 450 rules, MYCIN was able to perform as well as some experts, and considerably better than junior doctors. It also differed from DENDRAL in two major ways. First, unlike the DENDRAL rules, no general theoretical model existed from which the MYCIN rules could be deduced. They had to be acquired from extensive interviewing of experts, who in turn acquired them from textbooks, other experts, and direct experience of cases. Second, the rules had to reflect uncertainty: MYCIN incorporated a calculus of uncertainty called certainty factors, which seemed (at the time) to fit well with how doctors assessed the impact of evidence on the diagnosis” (Russell & Norvig, 2003).
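
To illustrate the two ideas above – rules kept separate from the reasoning component, and conclusions weighted by certainty factors – here is a minimal, invented sketch. The rules and numbers are made up for the example, and the combination formula is only a common simplification, not MYCIN's actual certainty-factor calculus:

```python
# A toy knowledge-based system: the rules (the knowledge) are plain data, and
# the small inference loop below is the separate reasoning component. Certainty
# factors are combined with the common cf_new = cf_old + s * (1 - cf_old)
# simplification; the rules and numbers here are invented for illustration only.

RULES = [
    # (conditions that must all be known, conclusion, certainty factor of the rule)
    ({"gram_negative", "rod_shaped"}, "enterobacteriaceae", 0.7),
    ({"enterobacteriaceae", "hospital_acquired"}, "klebsiella", 0.4),
]

def infer(facts):
    """Forward-chain over RULES, accumulating a certainty factor per conclusion."""
    cf = {fact: 1.0 for fact in facts}       # observed facts are fully certain
    fired = set()
    changed = True
    while changed:
        changed = False
        for idx, (conditions, conclusion, rule_cf) in enumerate(RULES):
            if idx in fired or not all(c in cf for c in conditions):
                continue
            support = min(cf[c] for c in conditions) * rule_cf
            old = cf.get(conclusion, 0.0)
            cf[conclusion] = old + support * (1 - old)   # combine certainty factors
            fired.add(idx)
            changed = True
    return cf

if __name__ == "__main__":
    print(infer({"gram_negative", "rod_shaped", "hospital_acquired"}))
```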

In terms of understanding natural language, the following is interesting. “At Yale, the linguist-turned-AI-researcher Roger Schank emphasised this point, claiming, ‘There is no such thing as syntax,’ which upset a lot of linguists, but did serve to start a useful discussion. Schank and his students built a series of programs (Schank & Abelson, 1977; Wilensky, 1978; Schank & Riesbeck, 1981; Dyer, 1983) that all had the task of understanding natural language. The emphasis, however, was less on language per se and more on the problems of representing and reasoning with the knowledge required for language understanding. The problems included representing stereotypical situations (Cullingford, 1981), describing human memory organisation (Rieger, 1976; Kolodner, 1983), and understanding plans and goals (Wilensky, 1983)” (Russell & Norvig, 2003).
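
Schank and Abelson’s ‘scripts’ are one well-known way of representing such stereotypical situations: a stereotyped sequence of events that lets a program fill in steps a story never states. The sketch below is a toy, invented illustration of the idea, not their actual representation:

```python
# A toy 'restaurant script': a stereotyped sequence of events that the system
# assumes by default, so that steps a story never mentions can still be inferred.
RESTAURANT_SCRIPT = ["enter", "sit_down", "order", "eat", "pay", "leave"]

def fill_in_story(mentioned_events, script=RESTAURANT_SCRIPT):
    """Return each script step marked as explicitly stated or merely inferred."""
    return [(step, "stated" if step in mentioned_events else "inferred")
            for step in script]

if __name__ == "__main__":
    # The story only says the diner ordered and left; the script supplies the rest.
    for step, status in fill_in_story({"order", "leave"}):
        print(f"{step:10s} {status}")
```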

5. Situated or Embodied Cognition

“Many researchers have emphasized the importance of studying cognition in the context of agent-environment interaction and sensorimotor activity. As a consequence, traditional cognitive scientific notions of, for example, internal representation and computation have come under attack, and there is a growing interest in both the bodily/biological mechanisms underlying cognition and the role of the environment (including other agents, artifacts, etc.)” (Ziemke, 2002).

The premise is thus that the physical world influences the computational/cognitive functioning of an entity. It is assumed that cognition without a body (that is, without physical embodiment and interaction) is different from, and incomplete relative to, cognition with a body.



6. Evolutionary Computation

Modern evolutionary computation essentially involves genetic algorithms: procedures that create, modify and select candidate solutions (themselves often algorithms or programs) via a process modelled on biological evolution. These algorithms are then employed in various ways to calculate and process various types of data and/or information.

Heitkötter & Beasley (2000) elaborate: “Evolutionary algorithm is an umbrella term used to describe computer-based problem solving systems which use computational models of some of the known mechanisms of evolution as key elements in their design and implementation. A variety of evolutionary algorithms have been proposed. The major ones are: GENETIC ALGORITHMS, EVOLUTIONARY PROGRAMMING, EVOLUTION STRATEGIES, CLASSIFIER SYSTEMS, and GENETIC PROGRAMMING. They all share a common conceptual base of simulating the evolution of individual structures via processes of SELECTION, MUTATION, and REPRODUCTION. The processes depend on the perceived PERFORMANCE of the individual structures as defined by an ENVIRONMENT.
More precisely, EAs maintain a POPULATION of structures, that evolve according to rules of selection and other operators, that are referred to as ‘search operators’ (or GENETIC OPERATORS), such as RECOMBINATION and mutation. Each individual in the population receives a measure of its FITNESS in the environment. Reproduction focuses attention on high fitness individuals, thus exploiting (cf. EXPLOITATION) the available fitness information. Recombination and mutation perturb those individuals, providing general heuristics for EXPLORATION. Although simplistic from a biologist's viewpoint, these algorithms are sufficiently complex to provide robust and powerful adaptive search mechanisms” (p. 1).
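
The quoted description maps quite directly onto a short program. Below is a minimal, generic sketch of the population / fitness / selection / recombination / mutation loop; the task (maximising the number of 1-bits in a string) and all parameter values are invented for illustration:

```python
import random

def evolutionary_algorithm(genome_length=20, pop_size=30, generations=100, seed=1):
    """A generic EA loop: evaluate FITNESS, SELECT parents, RECOMBINE, MUTATE."""
    rng = random.Random(seed)

    def fitness(individual):
        return sum(individual)               # toy ENVIRONMENT: count the 1-bits

    def select(population):
        a, b = rng.sample(population, 2)     # binary tournament SELECTION
        return a if fitness(a) >= fitness(b) else b

    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(population), select(population)
            cut = rng.randrange(1, genome_length)                   # one-point RECOMBINATION
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (rng.random() < 0.02) for bit in child]  # MUTATION
            offspring.append(child)
        population = offspring

    best = max(population, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    print(evolutionary_algorithm())
```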

7. Neural Networks and Neuro-computation

An Artificial Neural Network (ANN) is an information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurones) working in unison to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurones. This is true of ANNs as well.
Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an “expert” in the category of information it has been given to analyse. This expert can then be used to provide projections given new situations of interest and answer “what if” questions. Other advantages include:
· Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial experience.
· Self-organisation: an ANN can create its own organisation or representation of the information it receives during learning time.
· Real-time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
· Fault tolerance via redundant information coding: partial destruction of a network leads to a corresponding degradation of performance; however, some network capabilities may be retained even with major network damage (Stergiou & Siganos, unpublished).
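
To make the ‘learning by adjusting connection weights from examples’ idea above concrete, here is a minimal sketch of a single artificial neuron trained with the classic perceptron rule on a toy task; the task, learning rate and epoch count are invented for the example, and a real ANN would use many such units arranged in layers:

```python
import random

def train_perceptron(samples, epochs=20, lr=0.1, seed=0):
    """Train one artificial neuron: adjust its weights whenever it misclassifies.

    `samples` is a list of (inputs, target) pairs with target 0 or 1. The weight
    updates play the role of adjusting the 'synaptic connections'.
    """
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = target - output                       # zero when correct
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

if __name__ == "__main__":
    # Toy task: learn the logical AND of two inputs from examples.
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w, b = train_perceptron(data)
    for inputs, _ in data:
        print(inputs, 1 if sum(wi * xi for wi, xi in zip(w, inputs)) + b > 0 else 0)
```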

8. Swarm Intelligence

Swarm Intelligence (SI) is the property of a system whereby the collective behaviours of (unsophisticated) agents interacting locally with their environment cause coherent functional global patterns to emerge. SI provides a basis with which it is possible to explore collective (or distributed) problem solving without centralized control or the provision of a global model.
Researchers are interested in a new way of achieving a form of artificial intelligence called swarm intelligence, or the collective emergent intelligence of groups of simple agents. The term has been used to refer to “any attempt to design algorithms and distributed problem-solving devices inspired by the behavior of social insect colonies and other animal societies” (Theraulaz, 1999).

The Swarm Intelligence approach argues that there may be an alternative way of solving problems, one that operates at a level above our traditional problem-solving processes. The individual agents do not know they are solving the problem, but their collective interaction actually solves it (Perez-Uribe, unpublished).
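
As a concrete (and deliberately simple) illustration, here is a sketch of particle swarm optimisation, one well-known swarm-intelligence technique: each particle follows only its own best-so-far position and the best position found by the swarm, with no central controller, yet the swarm as a whole homes in on the minimum of a function. The function and all parameter values are invented for the example:

```python
import random

def particle_swarm(objective, dim=2, n_particles=20, steps=100, seed=2):
    """Minimise `objective`: each particle is steered only by its own memory and
    the best position found by the swarm, with no central controller."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                    # each particle's personal best
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    swarm_best, swarm_val = best_pos[g][:], best_val[g]

    for _ in range(steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_pos[i][d] - pos[i][d])   # own memory
                             + 1.5 * r2 * (swarm_best[d] - pos[i][d]))   # the swarm's
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < swarm_val:
                    swarm_best, swarm_val = pos[i][:], val
    return swarm_best, swarm_val

if __name__ == "__main__":
    sphere = lambda p: sum(x * x for x in p)          # minimum at the origin
    print(particle_swarm(sphere))
```
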
9. Current Status and Future Possibilities of Artificial Intelligence

From the brief summary of the history and current developments of artificial intelligence provided in this article, it can be concluded that the field of AI has come a very long way in the development of artificially intelligent systems. It is especially evident that AI has made great strides in developing and mapping the materialistic or reductionistic aspects of intelligence. Intelligence per se has thus been viewed as something that can be calculated and mathematically extrapolated: something that is ordered, logical and mathematically decipherable. AI also seems to assume, to a large extent, that a physical interactive body is necessary for intelligent computation and action. Virtually all the models proposed for AI are inherently ‘unconscious’ and materialistic and/or reductionistic, ignoring the body of evidence that seems to indicate that intelligence does not consist merely of logical and unconscious structural rules maintained by some sort of system extended in time and space.
In my view, intelligence in its most basic form can to some extent be reduced to logical laws and theoretical anomalies. However, it is quite apparent that a ‘higher’ level of intelligence exists that seems to defy logic and materialistic determinism. AI may be making great strides in mapping intelligent laws and algorithms onto computerised systems, but it seems a long way off from creating an intelligent system that closely resembles human intelligence. Perhaps AI needs to focus more on the problem of consciousness, as this seems to be the quintessential difference between artificial and actual intelligence. It may be argued that animals are intelligent yet show little evidence of self-consciousness; but is self-consciousness necessary for a ‘near human’ level of intelligence? These are, I believe, some of the problems on which AI still needs to focus major attention if it wishes to rival anything ‘human’ in terms of artificial intelligence in the near future.





10. References

Buchanan, B. G., Sutherland, G. L., & Feigenbaum, E. A. (1969). Heuristic DENDRAL: A program for generating explanatory hypotheses in organic chemistry. In Meltzer, B., Michie, D., & Swann, M. (Eds.), Machine Intelligence 4, pp. 209-254. Scotland: Edinburgh University Press.

Charniak, E. & McDermott, D. (1985). Introduction to Artificial Intelligence. Massachusetts: Addison-Wesley.

Cullingford, R. E. (1981). Integrating knowledge sources for computer “understanding” tasks. IEEE Transactions on Systems, Man and Cybernetics (SMC), 11.

Dyer, M. (1983). In-Depth Understanding. Massachusetts: MIT Press.

Feigenbaum, E. A., Buchanan, B. G., & Lederberg, J. (1971). On generality and problem solving: A case study using the DENDRAL program. In Meltzer, B. & Michie, D. (Eds.), Machine Intelligence 6, pp. 165-190. Scotland: Edinburgh University Press.

Friedberg, R. M. (1958). A learning machine: Part I. IBM Journal of Research and Development, 2, 2-13.

Friedberg, R. M., Dunham, B., & North, T. (1959). A learning machine: Part II. IBM Journal of Research and Development, 3(3), 282-287.

Heitkötter, J. & Beasley, D. (2000). An Overview of Evolutionary Computation. Hitch Hiker's Guide to Evolutionary Computation, 1(8), 442-459.

Kolodner, J. (1983). Reconstructive memory: A computer model. Cognitive Science, 7, 281-328.

Nilsson, N. J. (1998). Artificial Intelligence: A New Synthesis. California: Morgan Kaufmann.

Perez-Uribe, A. (Unpublished) Swarm Intelligence.
http://dsp.jpl.nasa.gov/members/payman/swarm/perez-uribe.pdf, 23 March, 2005.

Poole, D., Mackworth, A. K., & Goebel, R. (1998). Computational Intelligence: A Logical Approach. Oxford: Oxford University Press.

Rieger, C. (1976). An organisation of knowledge for problem solving and language comprehension. Artificial Intelligence, 7, 89-127.

Rumelhart, D. E. & McClelland, J. L. (Eds.). (1986). Parallel Distributed Processing. Massachusetts: MIT Press.

Russell, S. & Norvig, P. (2003). Artificial Intelligence: A Modern Approach (2nd ed.). New Jersey: Prentice Hall.

Schank, R. C. & Abelson, R. P. (1977). Scripts, Plans, Goals, and Understanding. Maryland: Lawrence Erlbaum Associates.

Schank, R. C. & Riesbeck, C. (1981). Inside Computer Understanding: Five Programs Plus Miniatures. Maryland: Lawrence Erlbaum Associates.

Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1-74.

Stergiou, C. & Siganos, D. (Unpublished) Neural Networks.
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html, 23 March, 2005.

Turing, A. (1950). Computing machinery and intelligence. Mind, 59, 433-460.

Wilensky, R. (1978). Understanding goal-based stories. Ph.D. thesis, Yale University, New Haven, Connecticut.

Wilensky, R. (1983). Planning and Understanding. Massachusetts: Addison-Wesley.

Winston, P. H. (1992). Artificial Intelligence. (3rd Edition). Massachusetts: Addison-Wesley.

Ziemke, T. (2002) Situated and Embodied Cognition.
http://www.ida.his.se/ida/~tom/SEC.html, 23 March, 2005.