
Trajectory Planning and Learning of A Redundant Manipulator with Structured Intelligence

Abstract

This paper deals with trajectory planning and motion learning for a redundant manipulator. Recently, behavior engineering for robotic systems has been discussed as a new technological discipline. A robotic system requires a whole structure of intelligence, and acquires skill and knowledge through interaction with a dynamic environment. Consequently, the whole structure determines the potential of the intelligence. This paper proposes a robotic system with structured intelligence. Based on perceptual information, a robot with structured intelligence makes decisions and takes actions at four levels in parallel. In addition, the robot generates its motion through interaction with the environment and, at the same time, gradually acquires skill based on the generated motion. To acquire skill and motion, the robot requires at least internal and external evaluations. This paper applies a virus-evolutionary genetic algorithm to trajectory planning and a neural network to motion learning. Furthermore, we discuss the effectiveness of the approach through computer simulation results.

Keywords: Motion Planning; Motion Learning; Computational Intelligence; Structured Intelligence



Naoyuki Kubota

Dept. of Mechanical Engineering, Osaka Institute of Technology

5-16-1 Omiya, Asahi-ku, Osaka 535, Japan

kubota@med.oit.ac.jp

Takemasa Arakawa, Toshio Fukuda

Dept. of Micro System Engineering, Nagoya University

1 Furo-cho, Chikusa-ku, Nagoya 464-01, Japan

arakawa@robo.mein.nagoya-u.ac.jp

fukuda@mein.nagoya-u.ac.jp



1 Introduction

Intelligent systems have recently been discussed in various fields such as knowledge engineering, computer science, mechatronics and robotics. In general, intelligent systems require capabilities of perception, decision making and action that simulate those of living things. The intelligence of a system emerges from the close linkage of these capabilities. Furthermore, the intelligence of a system evolves through learning and adaptation to dynamic environments. With the progress of computational capability, various methodologies concerning intelligence and life have been successfully developed, such as brain science, soft computing, artificial life and computational intelligence [1-8]. We describe these fields briefly. The aim of artificial intelligence (AI) is to describe and build intelligent agents that perceive their environment by sensors and make decisions and actions [1]. The aim of brain science is to understand the biochemical and physical mechanisms of the human brain and to construct highly interconnected neural networks like the human brain [1-4]. Soft computing is a new concept for information processing; its objective is to realize a new approach for analyzing and creating the flexible information processing of human beings, such as sensing, understanding, learning, recognizing and thinking [5,6]. Artificial life means 'life made by humans rather than by nature', and has three types of approaches: wetware systems from the molecular level, software systems from the cellular level, and hardware systems from the organism level [7]. In the strictest sense, computational intelligence (CI) depends on numerical data and does not rely on knowledge. Bezdek discussed intelligence at three levels: artificial, biological and computational [3,4,8]. Furthermore, Eberhart defined CI as a methodology involving computing [4]. We summarize these views as follows: CI aims to construct intelligence from the viewpoints of biology, evolution and self-organization.
CI tries to construct intelligence by internal description, while classical AI tries to construct intelligence by external (explicit) description. Therefore, skill and knowledge for the intelligent system in CI should be learned or acquired by itself.

The human brain processes information extremely quickly and accurately in a network-like fashion. Neural networks (NNs) and fuzzy systems (FSs) are based on mechanisms of the human brain. An NN can be trained to recognize patterns and to identify incomplete patterns. Its neurodynamics plays the role of a nonlinear mapping from input to output. NNs have been applied to system identification, control and pattern matching (mapping from environmental information to conceptual information). While an NN simulates physiological features of the human brain, an FS simulates its psychological features. Fuzzy theory expresses a degree of truth, represented as the grade of a membership function. Fuzzy controllers have been applied to various systems that are complicated, ill-defined and time-varying. In addition, the genetic algorithm (GA) provides an evolution mechanism for population dynamics, robot societies and artificial life. From the viewpoint of simulated evolution, GAs can maintain genetic diversity in a population to adapt to dynamic environments. Therefore, GAs are often called adaptive systems. To realize higher intelligence in robotic systems, the emerging synthesis of various techniques is required as a whole system. Neuro-fuzzy computing has been developed to overcome the disadvantages of the individual techniques. In general, the neural network part is used for learning and classification, while the fuzzy logic part is used for inference and crisp output. GAs are applied for parameter tuning and structure optimization of NNs and FSs [3,4,6]. To summarize, an intelligent system can quickly adapt to a dynamic environment by NN and FS, and furthermore, the structure of the intelligent system can globally evolve by GA.

On the other hand, current advanced robotics and mechatronics often require intelligent systems in areas critical and dangerous for human beings. To perform given tasks, a robot collects or receives information concerning its external environment and acts on that environment. Brooks proposed the subsumption architecture, and behavior-based artificial intelligence (BBAI) for robotics has subsequently been developed [1,15]. In addition, the bucket brigade algorithm, the temporal difference method and Q-learning have been discussed as methods for learning through interaction with the environment [6]. This kind of BBAI stresses the importance of the interaction between robot and environment, while classical AI is based on the representation and manipulation of explicit knowledge. Furthermore, behavior analysis and training as a methodology for behavior engineering, the MonaLysa architecture, model-based learning, hierarchical intelligent control and other approaches have recently been proposed [16-21]. However, the behavior-based approach needs to design a new controller for each task [1]. The intelligence of a robot depends on the structure of the hardware and software for processing information; that is, the structure determines the potential of the machine intelligence. Therefore, this paper proposes a robotic system with structured intelligence [22]. Based on perceptual information, a structured intelligent robot makes decisions and takes actions at four levels in parallel. In addition, the robot generates its motion through interaction with the environment and, at the same time, gradually acquires skill based on the generated motion. In this paper, we focus on a redundant manipulator and discuss motion planning and learning. First, we apply a virus-evolutionary GA (VE-GA) to the trajectory planning of the redundant manipulator, and next, we apply an NN for motion learning.
Based on the outputs of the learned NN, the initial candidates of trajectories for the redundant manipulator are generated. The VE-GA optimizes the trajectory by binding the already acquired behaviors.

This paper is organized as follows. Section 2 proposes structured intelligence for a robotic system. Section 3 proposes trajectory planning and learning methods for a redundant manipulator by applying the VE-GA and NN. Section 4 discusses the effectiveness of the proposed method through several computer simulation results.

2 Structured Intelligence for Robotic System

2.1 Structured Intelligence

Human beings make decisions and take actions based on sensing information and internal state. In addition, human beings can learn by acquiring or perceiving information concerning reward or penalty from the external environment. Key technologies for intelligence are knowledge representation, inference, search, planning, learning, prediction and so on [1]. Intelligence emerges from the synthesis of various intelligent capabilities of the system. Therefore, we should consider a whole structure of intelligence for processing the information flowing over the hardware and software. We have proposed structured intelligence with three features: 1) neurodynamics, 2) skill and 3) recursive consciousness [22] (Figure 1). This paper does not discuss brain science; the structured intelligence only simulates intelligent aspects of human beings. A human being uses logical and intuitive information processing in the brain at the same time. Skill is defined as generalized knowledge and motion. To represent skill, we apply a fuzzy inference system as the logical information processing [6,9]. Consequently, skill is an explicit representation of human intelligence. However, a system cannot adapt to a dynamic environment by skill alone. Therefore, we require neurodynamics as the intuitive information processing. The nonlinear property of NNs has been applied for representing various relationships to be learned [2,6]. An NN makes intuitive-like outputs based on the learned relationship, although an NN is often used as an approximator. However, this kind of intuitive inference sometimes produces ineffective outputs. To control ineffective outputs, we need a meta-level controller, that is, recursive consciousness.

Figure 1
Architecture of structured intelligence
Figure 2
Architecture of a robot with structured intelligence

The recursive consciousness plays the roles of binding several outputs and generating a new output. We need at least two types of evaluations for binding several outputs. The first is external evaluation through interaction with the environment. The other is internal evaluation for binding outputs recursively from the meta-level. The evolutionary concept is useful for generating new outputs. Therefore, the structured intelligence maintains candidate solutions of outputs, and each candidate solution is evaluated by the internal evaluation and the external evaluation. GAs are applied for generating the outputs.

2.2 Structured Intelligence for Robotic System

An intelligent robot requires the close linkage of perception, decision making and action. The structure of intelligence as a whole system is very important to realize this linkage, since flexibility and learning capability depend on the structure of the whole system. Figure 2 shows a robotic system with structured intelligence. Based on the perceptual information from the environment, the robot makes decisions and takes actions at four levels in parallel. The actions comprise reactive motion (reflex), skilled motion, primitive motion planning and motion planning. When the robot recognizes dangerous quantitative information from the environment, it makes a reactive motion without decision making. At the skilled motion level, the robot performs fundamental motions, such as locomotion, based on simple decision making by selection. Under different environmental conditions, the robot must generate suitable motions. If the robot can simply combine the already acquired skills or motions, it can generate a new motion by the combination. However, if the robot cannot apply the acquired skills and motions, it generates new motions by motion planning based on inverse kinematics and inverse dynamics. Thus, a robot has no skill at the beginning, but gradually acquires skills and motions through interaction with the environment. To acquire skills and motions, the robot requires external evaluation and internal evaluation (Figure 3). The external evaluation is basically used for generating primitive motions through interaction with the environment. The internal evaluation binds the robot's primitive motions recursively. Consequently, the internal evaluation is used for high-level motion planning and decision making from the meta-level. In fact, humans can be conscious of being conscious, recursively; this allows humans to predict and infer in unknown environments. In this way, the robot can acquire skill and knowledge based on the internal and external evaluations through interaction with the environment.

Figure 3
External evaluation and internal evaluation for motion planning

3 Trajectory Planning and Learning for Redundant Manipulator

Trajectory planning is one of the most important and difficult tasks required of robot manipulators. To achieve a given task, a robot manipulator generally performs the following steps: 1) find obstacles (perception), 2) generate a collision-free trajectory (decision making), and 3) actually trace the trajectory (action). In this paper, we focus on motion planning (decision making) based on a map of the work space built from sensory information. The task is for the robot manipulator to perform motion planning given its initial and final configurations. The motion of the robot is basically constrained by its dynamics, its kinematics and the work space including obstacles. Various trajectory planning methods have been proposed to solve motion planning problems [23-25]. Basically, two main approaches have been proposed to generate collision-free trajectories. The first is artificial potential field methods: the robot manipulator moves based on the attractive force from the goal point and the repulsive forces from the obstacles in the work space. However, deadlocks sometimes occur in the iterative search based on the artificial potential field. The other approach is the configuration space (C-space) method. The C-space is transformed from the work space, and therefore the dimension of the C-space equals the degrees of freedom (DOF) of the robot manipulator. As a result, the search space becomes very large. Redundant degrees of freedom enable a manipulator to perform various tasks in a work space that includes a number of obstacles, but collision avoidance problems are difficult to solve in such cases.
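As background, the artificial potential field idea can be sketched in a few lines of Python. This is a hypothetical 2-D illustration: the gains k_att and k_rep and the influence radius are assumptions, not values from the paper.

```python
import math

def attractive_force(pos, goal, k_att=1.0):
    # Pulls toward the goal, proportional to the distance (assumed gain).
    return (k_att * (goal[0] - pos[0]), k_att * (goal[1] - pos[1]))

def repulsive_force(pos, obstacle, radius=1.0, k_rep=1.0):
    # Pushes away from the obstacle, but only inside its influence radius.
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d >= radius or d == 0.0:
        return (0.0, 0.0)
    mag = k_rep * (1.0 / d - 1.0 / radius) / d ** 2
    return (mag * dx, mag * dy)
```

At each step the manipulator would follow the sum of these forces; the deadlocks mentioned above correspond to points where the attractive and repulsive forces cancel.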

We have proposed a hierarchical trajectory planning method on the pseudo-potential space for the work space [25]. This method basically has two layers: a trajectory generator and a configuration generator. The configuration generator (low-level planning) generates collision-free configurations, and the trajectory generator (high-level planning) generates a collision-free trajectory by binding intermediate configurations (see Figures 3 and 4). Furthermore, we apply a virus-evolutionary genetic algorithm to the hierarchical trajectory planning.

Figure 4
Hierarchical trajectory planning for robot manipulator
Figure 5
Virus evolutionary genetic algorithm; VE-GA

3.1 Virus-Evolutionary Genetic Algorithm

Evolutionary computation is a field of simulating evolution on a computer [11]. The stochastic optimization methods simulating the process of evolution are historically divided into three main categories: GA, evolutionary programming (EP) and evolution strategies (ES). GAs have often been applied to combinatorial optimization problems, while EP and ES have mainly been applied to numerical optimization problems [11-14].

We have proposed the virus-evolutionary GA (VE-GA), which simulates coevolution with both horizontal propagation and vertical inheritance of genetic information [20-22]. The VE-GA is composed of two populations: a host population and a virus population (Figure 5). The host population and the virus population are defined as a set of candidate solutions and a substring set of the host population, respectively. We assume that the virus population enables the horizontal propagation of genetic information in the host population, while crossover and selection enable the vertical inheritance from parents to children. Crossover recombines substrings between two host individuals. In selection, host individuals with low fitness are eliminated, and host individuals with high fitness reproduce children. The VE-GA has two virus infection operators, as follows:

(a) Reverse transcription operator (infection): a virus overwrites its substring onto the string of a host individual, generating a new host individual.

(b) Incorporation operator: a virus takes a substring out of the string of a host individual, generating a new virus individual.

Figure 6
Procedure of virus infection in VE-GA

First, all host individuals are reproduced as children before virus infections. Figure 6 shows the procedure of virus infection. Each virus i has a measure of the strength of its infection, fitvirus_i. We assume that fithost_j and fithost'_j are the fitness values of a host individual j before and after the infection, respectively. The value fitvirus_{i,j} denotes the difference between fithost'_j and fithost_j, which equals the improvement obtained by infecting host individual j:

fitvirus_{i,j} = fithost'_j - fithost_j    (1)

fitvirus_i = Σ_{j∈S} fitvirus_{i,j}    (2)

where i is the virus number and S is the set of host individuals infected by virus i. In addition, the infection rate infrate_i is updated according to the above improvement value:

infrate_{i,t+1} = (1 + a) · infrate_{i,t} if fitvirus_i > 0, and infrate_{i,t+1} = (1 - a) · infrate_{i,t} otherwise    (3)

where a is a coefficient greater than zero and t is the generation number. Furthermore, if infrate_{i,t+1} > Max_Infection_Rate, infrate_{i,t+1} is set to Max_Infection_Rate. An increase of infrate_i accelerates the increase of effective schemata by virus infection. Furthermore, each virus has a life force through generations as follows:

life_{i,t+1} = r · life_{i,t} + fitvirus_i    (4)

where r is the life reduction rate. This life force plays the role of evaluating the schema a virus carries against the host population. After all infections, each virus evolves by incorporation. If fitvirus_i is larger than zero, the virus performs incorporation by taking a partially new substring out of one of the infected host individuals (extending incorporation). Otherwise, the virus shortens its genotype by removing several genes (shortening incorporation). In effect, the virus length shortens or extends according to the incorporation probability P_inc, which denotes the probability of taking out one gene from a host individual. After calculating the life force, if life_{i,t+1} takes a negative value, the virus individual takes a new substring out of a randomly selected host individual with the incorporation operator (generating incorporation). Each virus performs this infection process.
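The two update rules can be sketched as follows. This is a minimal Python illustration; the branch on the sign of fitvirus_i in the rate update and the parameter values are assumptions based on the text, not the paper's exact settings.

```python
MAX_INFECTION_RATE = 0.8  # assumed ceiling for the infection rate

def update_infection_rate(infrate, fitvirus, a=0.1):
    # Eq. (3): raise the rate when the virus improved its hosts,
    # lower it otherwise, then clip at the maximum rate.
    infrate = (1.0 + a) * infrate if fitvirus > 0 else (1.0 - a) * infrate
    return min(infrate, MAX_INFECTION_RATE)

def update_life_force(life, fitvirus, r=0.9):
    # Eq. (4): decay the previous life force and add the latest improvement.
    return r * life + fitvirus
```

A virus whose infections keep improving hosts thus infects more often, while a useless virus fades until its negative life force triggers a generating incorporation.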

The basic architecture of the VE-GA is based on a steady-state model. The steady-state GA (SSGA) basically replaces the worst individuals with a pair of individuals generated by crossover and mutation [27]. The SSGA is known as a continuous-alternation model of generations. The procedure of the VE-GA is as follows:

begin
    Initialization
    repeat
        Selection
        Crossover
        Mutation
        Evaluation
        Reproduction
        Virus_infection
        Replacement
    until Termination_condition = True
end.

Initialization randomly generates an initial host population, and then each virus individual is generated as a substring of a host individual. The initial length of each virus individual is predefined as Min_V_Length. 'Delete least fitness' (DLF) is used as the selection scheme, which eliminates the worst and second-worst host individuals. Next, crossover is performed between the selected host individuals with a crossover probability. Mutation is performed with a mutation probability per gene. After the genetic operators, the modified host individuals are evaluated by the fitness function. Next, all host individuals are reproduced as children for virus infections. Each virus infects host individuals in its neighborhood according to its infection rate. After the infections by all viruses, a parent is replaced with the infected host individual (child) if the fitness value of the child is higher than that of the parent. Thus, the host individuals (children) improved by virus infections survive into the next generation. This indicates that the virus infection operator can be regarded as a local hill-climbing operator. These processes are repeated until the termination condition is satisfied. Here we use a maximal number of generations as the termination condition.

Figure 7
Hierarchical trajectory planning with trajectory generator and configuration generator
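The loop above can be sketched end-to-end on a toy bit-string problem. This is a heavily simplified Python illustration with a single fixed-length virus; the population size, rates, virus length and the onemax fitness are assumptions for the sketch, not the paper's settings.

```python
import random

def ve_ga(fitness, n_host=20, length=16, generations=100, p_mut=0.05, seed=0):
    """Minimal VE-GA sketch: a steady-state GA plus one virus that copies
    a substring between hosts (horizontal propagation)."""
    rng = random.Random(seed)
    host = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n_host)]
    virus = {"start": 0, "genes": host[0][:4]}           # one 4-gene virus
    for _ in range(generations):
        # Selection (DLF-like): the two worst are replaced by children
        # of two surviving hosts via one-point crossover and mutation.
        host.sort(key=fitness, reverse=True)
        a, b = rng.sample(host[:-2], 2)
        cut = rng.randrange(1, length)
        c1, c2 = a[:cut] + b[cut:], b[:cut] + a[cut:]
        for child in (c1, c2):
            for i in range(length):
                if rng.random() < p_mut:
                    child[i] ^= 1                        # bit-flip mutation
        host[-2:] = [c1, c2]
        # Virus infection (reverse transcription): overwrite a host substring;
        # keep the infected child only if it improved (local hill climbing).
        j = rng.randrange(n_host)
        infected = host[j][:]
        s = virus["start"]
        infected[s:s + len(virus["genes"])] = virus["genes"]
        if fitness(infected) > fitness(host[j]):
            host[j] = infected
            # Incorporation: take a fresh substring from the improved host.
            virus["start"] = rng.randrange(length - 4)
            virus["genes"] = host[j][virus["start"]:virus["start"] + 4]
    return max(host, key=fitness)

best = ve_ga(sum)  # onemax: fitness is the number of 1 bits
```

With the onemax fitness the loop steadily accumulates 1 bits, and the virus spreads good substrings across hosts faster than crossover alone would.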

3.2 Hierarchical Trajectory Planning

In this subsection, we consider trajectory planning for redundant manipulators made up of only revolving and bending joints in a work space including some static obstacles. Figure 7 shows the hierarchical trajectory planning method with the trajectory generator and configuration generator. The configuration generator generates intermediate configurations of the redundant manipulator between the given initial and final configurations. Here a configuration q is expressed by a set of joint angles:

q = (q_1, q_2, ..., q_n)^T    (5)

where n denotes the DOF of the redundant manipulator. In addition, the position of the end-effector, P, is defined as follows:

P = Σ_{i=1}^{n} T_1(q_1) T_2(q_2) ⋯ T_i(q_i) l_i    (6)

where T_i denotes the rotation matrix of joint i and l_i is the link vector of link i. In this paper, we use only the forward kinematics. The configuration generator has a set of intermediate configurations. An appropriate intermediate configuration is generated based on the distance between the manipulator and the obstacles. To measure this distance, we apply the concept of pseudo-potential space [25]. Here the work space is divided into N×N×N cells. Furthermore, let us assign a pseudo-potential value p to each cell to evaluate its closeness to the obstacles in the work space. A pseudo-potential value of a cell takes a positive integer or zero. The closer a cell is to the obstacles, the larger its pseudo-potential value. A cell with pseudo-potential value zero is comparatively far away from the obstacles.
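A minimal sketch of the forward kinematics and the pseudo-potential grid, reduced to a planar arm and a 2-D grid for brevity (the paper uses a 3-D work space divided into N×N×N cells; the unit link length and grid size here are assumptions):

```python
import math

def forward_kinematics(q, link_len=1.0):
    """Joint positions of a planar serial arm: each joint angle is
    measured relative to the previous link (a 2-D sketch of Eq. (6))."""
    x = y = theta = 0.0
    points = [(0.0, 0.0)]
    for qi in q:
        theta += qi
        x += link_len * math.cos(theta)
        y += link_len * math.sin(theta)
        points.append((x, y))
    return points

def pseudo_potential(obstacle_cells, n, p_max=5):
    """Assign each cell a value that grows toward the obstacles:
    p_max inside an obstacle, p_max - 1 next to one, ... down to 0."""
    pot = [[0] * n for _ in range(n)]
    frontier = set(obstacle_cells)
    for level in range(p_max, 0, -1):
        nxt = set()
        for (i, j) in frontier:
            if 0 <= i < n and 0 <= j < n and pot[i][j] < level:
                pot[i][j] = level
                nxt.update({(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)})
        frontier = nxt
    return pot
```

Sampling points along the segments returned by forward_kinematics can then be looked up in the potential grid to score a configuration's closeness to the obstacles.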

We apply the VE-GA to the generation of intermediate configurations of a redundant manipulator. Let two configurations, pA and pB, be constraints for the intermediate configuration generator. We consider the generation of an intermediate configuration pC between pA and pB. The variables to be optimized are the joint variables of the redundant manipulator.

Figure 8
The representation of candidate solution

The objective is to generate a trajectory realizing the minimum distance from the initial configuration to the final configuration while staying away from the obstacles. To achieve the objective, we use the following fitness function as the external evaluation:

fit = w1·fp + w2·fd + w3·ffr + w4·maxpot + w5·sumpot    (7)

where w1 through w5 are weight coefficients. The first term, fp, denotes the sum of squares of the distance between the end-effector position of pA and that of pC, and the distance between the end-effector positions of pB and pC. The second term, fd, denotes the sum of squares of the differences between each joint angle of pA and that of pC, and between each joint angle of pB and that of pC. The third term, ffr, denotes the sum of an evaluation function using a normal distribution that keeps each joint within its available range. To evaluate the distance between the manipulator and the obstacles, we introduce the concept of pseudo-potential space. Let us take several sampling points on the manipulator. The fourth term, maxpot, denotes the maximal pseudo-potential value over the sampling points. If maxpot equals the predefined maximal pseudo-potential value p, the manipulator collides with obstacles. The last term, sumpot, denotes the sum of the pseudo-potential values over all sampling points.
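A sketch of the external evaluation of Eq. (7). The individual term definitions here are simplified stand-ins inferred from the text, not the paper's exact formulas, and smaller values are assumed better:

```python
import math

def external_fitness(pA, pB, pC, qA, qB, qC, q_min, q_max,
                     potentials, w=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the five penalty terms of Eq. (7)."""
    # f_p: squared end-effector distances from pC to its neighbours.
    f_p = sum((a - c) ** 2 for a, c in zip(pA, pC)) \
        + sum((b - c) ** 2 for b, c in zip(pB, pC))
    # f_d: squared joint-angle differences from qC to its neighbours.
    f_d = sum((a - c) ** 2 for a, c in zip(qA, qC)) \
        + sum((b - c) ** 2 for b, c in zip(qB, qC))
    # f_fr: normal-shaped bumps penalizing joints near their limits
    # (an assumed stand-in for the normal-distribution term).
    f_fr = sum(math.exp(-(qi - lo) ** 2) + math.exp(-(qi - hi) ** 2)
               for qi, lo, hi in zip(qC, q_min, q_max))
    maxpot = max(potentials)   # hitting the grid maximum means collision
    sumpot = sum(potentials)   # overall closeness to the obstacles
    return (w[0] * f_p + w[1] * f_d + w[2] * f_fr
            + w[3] * maxpot + w[4] * sumpot)
```

The potentials argument would be the pseudo-potential values sampled along the manipulator's links for configuration pC.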

A candidate solution includes all joint variables of a configuration of the redundant manipulator. Each joint variable, qi, is represented as a binary code (Figure 8). A virus substring consists of the three characters {0, 1, *} and has the same length as a host individual. The character "*" denotes a "don't care" mark: a virus does not carry out the reverse transcription at the gene positions marked "*". Figure 9 shows an example of the reverse transcription. The reverse transcription overwrites the virus substring on a randomly selected host individual. The incorporation has two types of operations (Figure 10). The first is to copy genes from a host individual with a copy probability per gene. The other is to replace some genes with the character "*" with a cut probability per gene. An initial virus population is generated from the host population using the incorporation operator.

Figure 9
Reverse transcription of a virus to a host individual
Figure 10
Incorporation of a virus from a host individual
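The two virus operators over the {0, 1, *} alphabet can be sketched as follows (a minimal Python illustration; the copy and cut probabilities are assumptions):

```python
import random

WILD = "*"  # "don't care": the virus leaves this host gene untouched

def reverse_transcription(host, virus):
    """Overwrite host genes wherever the virus carries a 0 or 1."""
    return [h if v == WILD else v for h, v in zip(host, virus)]

def incorporation(host, virus, p_copy=0.3, p_cut=0.1, rng=random):
    """Copy some host genes into the virus, and cut some back to '*'."""
    new = []
    for h, v in zip(host, virus):
        if rng.random() < p_copy:
            new.append(h)          # extending: take the host's gene
        elif v != WILD and rng.random() < p_cut:
            new.append(WILD)       # shortening: forget this gene
        else:
            new.append(v)
    return new
```

Extending incorporation corresponds to a high copy probability after a successful infection; shortening corresponds to cutting genes back to "*" when the virus's improvement value is not positive.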

The trajectory generator, as a high-level planner, generates a collision-free trajectory by combining several intermediate configurations generated by the configuration generator. These intermediate configurations are evaluated by the internal evaluation. The internal evaluation has a criterion concerning safety and distance relative to the best trajectory of the trajectory generator. If the criterion is satisfied, the trajectory generator receives the intermediate configuration; that is, good intermediate configurations are selected for the trajectory planning. As the best candidate (trajectory) improves, the criterion for selecting a configuration becomes stricter. Thus, this criterion dynamically changes according to the total evaluation. We also apply the VE-GA to the trajectory generator. The VE-GA in the trajectory generator has a set of candidate solutions including real-valued joint angles. Figure 8 shows validity parameters for intermediate configurations and a set of joint angles of several intermediate configurations as a trajectory candidate in the trajectory generator. The VE-GA initially sets all validity parameters to zero. When an individual receives an intermediate configuration from the configuration generator, the VE-GA sets the corresponding validity bit to one. Therefore, each individual gradually evolves by receiving sets of joint angles of intermediate configurations from the configuration generator. Here two mutation operators are used [22]. The first is a bit mutation operator that changes a randomly selected validity gene to 0. The other is a self-adaptive mutation operator using normal random variables with mean zero:

x'_{j,i} = x_{j,i} + N(0, σ)    (8)

where x_{j,i} denotes joint angle i of individual j. A virus individual has a subset of intermediate positions and propagates among the host individuals to generate a collision-free trajectory quickly.
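The two mutation operators can be sketched as follows (a minimal Python illustration; sigma is an assumed fixed step size rather than a self-adapted one):

```python
import random

def bit_mutation(validity, rng=random):
    """Drop one intermediate configuration: clear a random validity bit."""
    v = validity[:]
    v[rng.randrange(len(v))] = 0
    return v

def self_adaptive_mutation(angles, sigma=0.1, rng=random):
    """Eq. (8): perturb each joint angle with zero-mean normal noise."""
    return [x + rng.gauss(0.0, sigma) for x in angles]
```

The bit mutation removes an intermediate configuration from the trajectory candidate, while the normal perturbation fine-tunes the real-valued joint angles.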

The VE-GA for the trajectory generator also uses equation (7) as the fitness function, but each term in equation (7) is extended to a sum over the trajectory (total evaluation). Here the best individual is the constraint for generating intermediate configurations; that is, the configuration generator generates intermediate configurations based on the best individual in the trajectory generator. Therefore, the hierarchical trajectory planning results in a co-optimization of the trajectory and the intermediate configurations.

3.3 Motion Learning and Motion Binding

The generated trajectory is used for the learning of primitive motions by an NN [26]. Figure 11 shows the architecture of motion planning and learning by the VE-GA and NN. The NN plays the role of storage of motion and knowledge, and the VE-GA plays the role of decision maker for motion planning. In addition, the internal simulator is a forward kinematics model that can evaluate a trajectory. The procedure of the motion planning is shown in Figure 12. First, a task is given to the redundant manipulator. Next, the NN outputs a series of configurations as a trajectory (q_init) for performing the given task. If the trajectory generated by the NN is feasible, it is selected as a neural motion without motion binding (see Figures 2 and 12). At the same time, the VE-GA generates initial candidate solutions of trajectories according to the outputs of the NN. A small normal bias is added to each input to the NN in order to generate various candidate solutions. Next, as the primitive motion planning, the VE-GA changes the combination of the intermediate configurations over several iterations by simple mutation to optimize the trajectory. This optimization process requires little computational time. However, if the binding cannot find a feasible solution within several iterations, the hierarchical trajectory planning is performed until the aspiration level is satisfied. Finally, the VE-GA outputs the best trajectory (q*), and simultaneously the NN is trained by the backpropagation algorithm according to q*. In this paper, we apply a multi-layer NN composed of input, hidden and output layers. The total error E between the teaching signal T_i and the output O_i of the NN is defined as follows:

Figure 11
Architecture of motion planning and learning by VE-GA and NN

Figure 12
Procedure of primitive motion planning and motion planning

Figure 13
Teaching data for NN (STM: short term memory, LTM: long term memory)

E = (1/2) Σ_i (T_i - O_i)²    (9)

O_i = S(Σ_j W_{i,j} a_j),  a_j = S(Σ_k W_{j,k} I_k)    (10)

where I_k is the k-th input value to the NN, a_j is the output of the j-th neuron of the hidden layer, W_{i,j} denotes the weight from the j-th neuron of the hidden layer to the i-th neuron of the output layer, W_{j,k} denotes the weight from the k-th neuron of the input layer to the j-th neuron of the hidden layer, and S is a sigmoid function. The weights of the NN are trained by the following equations:

Figure 14
Update of STM and LTM

ΔW_{i,j} = η (T_i - O_i) S'_i a_j    (11)

ΔW_{j,k} = η Σ_i (T_i - O_i) S'_i W_{i,j} S'_j I_k    (12)

where η is the learning rate, and S'_i and S'_j denote the derivative of the sigmoid function at the i-th output neuron and the j-th hidden neuron, respectively. In this paper, since the sigmoid function is applied, the derivative is S' = S(1 - S).

We use the concepts of short-term memory (STM) and long-term memory (LTM) to build the teaching data (Figure 13). The STM simply keeps the trajectories used in the previous tasks. The LTM keeps the trajectories with the worst errors in the learning of the NN, and is used for learning tasks that are difficult for the NN to train on. The STM and LTM have finite memory lengths; if a memory buffer is full, the oldest trajectory is removed (Figure 14). The teaching data are composed of the trajectories generated by the VE-GA, the STM and the LTM. The learning of the NN is performed after the motion planning by the VE-GA. Consequently, when the neural motion and primitive motion planning are performed, the learning is not performed and only the STM is updated. Thus, the NN is trained with the recent teaching trajectories. The input variables to the NN are the target direction of the end-effector and the current configuration q_{i,t}, and the output variables are Δq_{i,t}. The initial trajectory is generated according to Δq_{i,t}: each joint angle q_{i,t+1} of the next intermediate configuration is calculated by adding Δq_{i,t} to q_{i,t}. Next, the end-effector position of the next configuration is calculated from q_{i,t+1}.
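The STM/LTM bookkeeping can be sketched with bounded queues (a hypothetical Python illustration; the buffer sizes and the LTM error threshold are assumptions):

```python
from collections import deque

class MotionMemory:
    """Finite STM/LTM buffers: the oldest trajectory is dropped when full."""

    def __init__(self, stm_size=10, ltm_size=10):
        self.stm = deque(maxlen=stm_size)   # recent trajectories
        self.ltm = deque(maxlen=ltm_size)   # hard-to-learn trajectories

    def store(self, trajectory, nn_error=None, ltm_threshold=0.1):
        self.stm.append(trajectory)
        # Keep trajectories the NN learned badly in long-term memory.
        if nn_error is not None and nn_error > ltm_threshold:
            self.ltm.append(trajectory)

    def teaching_data(self, new_trajectories):
        # Teaching data: newly planned trajectories plus both memories.
        return list(new_trajectories) + list(self.stm) + list(self.ltm)
```

Using deque with maxlen gives the drop-the-oldest behavior described above without any explicit buffer management.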

Figure 15
Trajectory planning problems of 7 DOF manipulator

4 Numerical Simulation

We show the simulation results of trajectory planning for a redundant manipulator with structured intelligence. First, we consider the trajectory planning problem of a redundant manipulator with 7 degrees of freedom in a work space containing one or two obstacles (Figure 15). The objective is to generate a collision-free trajectory from an initial configuration to a final configuration. The work space is divided into 30×30×30 cells. The pseudo-potential space has a maximum value of 5. Tables 1 and 2 show the simulation parameters of the VE-GA for the hierarchical trajectory planning. The number of intermediate configurations is 8. Therefore, the string length of a candidate solution of the trajectory generator is 56 (7 DOF × 8 intermediate configurations).
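The cell decomposition and pseudo-potential can be sketched roughly as follows (an assumption-laden sketch: the text only states a 30×30×30 grid and a maximum potential of 5, so we assume the potential is 5 at obstacle cells and decays by one per cell of 6-connected distance; the authors' exact definition is not given in this excerpt):

```python
import numpy as np
from collections import deque

def pseudo_potential(occupied, max_pot=5):
    """Breadth-first sweep over a cell-divided work space: obstacle cells
    get max_pot, and the potential decreases by 1 per cell of distance."""
    pot = np.where(occupied, max_pot, 0).astype(int)
    queue = deque(map(tuple, np.argwhere(occupied)))
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0),
                           (0,-1,0), (0,0,1), (0,0,-1)):
            n = (x + dx, y + dy, z + dz)
            if all(0 <= n[i] < occupied.shape[i] for i in range(3)):
                if pot[n] < pot[x, y, z] - 1:   # propagate decaying potential
                    pot[n] = pot[x, y, z] - 1
                    queue.append(n)
    return pot
```

A trajectory's proximity to obstacles can then be penalized by summing the potential over the cells the manipulator occupies, which is one way to make the obtained trajectories keep away from the obstacles.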

Table 1
Simulation parameters of hierarchical trajectory planning
Table 2
Simulation parameters of virus infection

Figure 16 shows the simulation results of task 1 by the hierarchical trajectory planning. In these figures, the side view shows that the redundant manipulator avoids colliding with the obstacles. The redundant manipulator achieves the final position without colliding with obstacles, and the obtained trajectories are kept away from the obstacles. In addition, Figure 17 shows the evolutionary transition of intermediate configurations (side view). At first (No.1), the configuration generator generates various types of intermediate configurations. Though the trajectory generator obtains a collision-free trajectory in No.1, the combination of the intermediate configurations is not optimal. Gradually, the hierarchical trajectory planning obtains good trajectories and configurations through the co-optimization of the configuration generator and the trajectory generator. These simulation results show that the binding of primitive motions by internal evaluations is very important for global motion planning. Next, we conduct two simulation cases of the motion learning of the NN.

  • Case 1 : hierarchical trajectory planning (same task) + motion learning by NN

  • Case 2 : hierarchical trajectory planning + primitive motion binding + motion learning by NN

The simulation parameters of the NN are as follows. The number of inputs is 10, including the direction of the end-effector and each joint angle θ_{i,t} at time step t. The number of outputs is 7, and the number of neurons in the hidden layer is 20. The number of learning iterations is 1000 (relatively small).

The motion planning of the same task is conducted 5 times (trials) in case 1. Figure 18 shows the fitness values of the best trajectory against the number of evaluations in the trajectory generator (case 1). In the first trial, since the NN has no primitive motions, the initial candidate trajectories are poor. Therefore, the VE-GA must generate a trajectory without primitive motions. Consequently, the VE-GA requires many evaluations to generate feasible trajectories. Next, the NN is trained on the teaching data of five feasible trajectories generated by the VE-GA. In case 1, the STM and LTM are not used. From the second trial on, the initial candidate solutions are generated using the outputs of the learned NN, and so the number of evaluations required by the VE-GA gradually decreases. The fourth trial requires only about 40 evaluations. These simulation results show that the NN can learn the trajectories generated by the VE-GA.
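Seeding the VE-GA's initial candidate solutions from the learned NN can be sketched as follows (a sketch: `nn_step` is a hypothetical function returning the NN's joint-angle increments Δθ for a given configuration; the base trajectory follows θ_{t+1} = θ_t + Δθ_t, and small Gaussian noise keeps the initial population diverse):

```python
import numpy as np

def seed_population(nn_step, q_init, n_configs=8, pop_size=10,
                    sigma=0.05, rng=None):
    """Roll the NN forward from the initial configuration to build a base
    trajectory, then perturb it to form the VE-GA's initial population."""
    if rng is None:
        rng = np.random.default_rng(0)
    q = np.asarray(q_init, float)
    base = []
    for _ in range(n_configs):          # 8 intermediate configurations
        q = q + nn_step(q)              # theta_{t+1} = theta_t + delta_theta_t
        base.append(q.copy())
    base = np.concatenate(base)         # string length 7 DOF x 8 = 56
    return [base + sigma * rng.standard_normal(base.size)
            for _ in range(pop_size)]
```

With an untrained NN this seeding is no better than random, which matches the long first trial; once trained, the seeded strings start the search near feasible trajectories.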

In general, there are two adaptive approaches to a dynamic environment. One is incremental learning; the other is to construct teaching data suitable to the environment. We apply the STM and LTM to construct suitable teaching data. The sizes of the STM and LTM are 3 and 2, respectively. The NN is trained on the teaching data of the STM, the LTM, and five feasible trajectories generated by the VE-GA. The number of learning iterations is 3000 in case 2. Table 3 shows the simulation results of case 2, covering tasks 1-3. The given task schedule is written in the second row. The third row denotes the type of decision making: "M" is motion planning, "B" is binding, and "N" is neural motion. Binding changes the combination of intermediate configurations by simple mutation. The maximal number of iterations for binding is 50; after 50 iterations, the hierarchical trajectory planning is performed as motion planning. The first trial requires many iterations for motion planning (515 iterations in the search). The NN is trained on the teaching data generated by motion planning, since the STM and LTM hold no previous trajectories. After learning, the STM and LTM are updated. In the second trial, the given task is the same as in the first trial. Therefore, the NN can generate suitable trajectories. In the simulation results, a feasible trajectory is generated by binding the outputs of the NN, and the number of iterations is very small. After binding, the STM is updated, but the LTM is not, since the LTM is updated only after the learning of the NN. The third task is No.2. This task is achieved by binding the outputs of the NN. Figure 19 shows the generated trajectory for task 2. This simulation result shows that the trajectory for task 2 partially inherits the motion for task 1. Next, task 3 is given as the fourth trial. Since the NN holds no trajectories for task 3, the NN cannot generate feasible trajectories. Figure 20 shows the simulation result of the hierarchical trajectory planning.
Figure 16
Simulation results of trajectory generator (task 1)
Figure 17
Side views of simulation results of configuration generator (task 1)

The NN is trained on the teaching data of the STM, the LTM, and the trajectories for task 3. After learning, the STM is updated in a first-in first-out manner, since the buffer is full (see Figure 14 and Table 3). In the 7th trial, the NN cannot generate feasible trajectories. Figure 21 shows an infeasible trajectory colliding with obstacles. However, this trajectory is useful for the search in the VE-GA, and the number of evaluations for the search is comparatively small (80). In this way, the redundant manipulator can achieve the given tasks with motion learning.
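The binding step (simple mutation of the combination of intermediate configurations, capped at 50 iterations before falling back to full motion planning) can be sketched as follows; `config_pool` and `is_feasible` are hypothetical stand-ins for the NN-generated configurations and the collision check:

```python
import random

MAX_BIND_ITER = 50   # from the text: after 50 iterations, fall back to
                     # the hierarchical trajectory planning

def bind(config_pool, trajectory, is_feasible, rng=None):
    """Binding: change the combination of intermediate configurations by
    simple mutation until the trajectory becomes collision-free."""
    if rng is None:
        rng = random.Random(0)
    for _ in range(MAX_BIND_ITER):
        if is_feasible(trajectory):
            return trajectory, True          # binding succeeded
        i = rng.randrange(len(trajectory))   # mutate one intermediate config
        trajectory = (trajectory[:i] + [rng.choice(config_pool)]
                      + trajectory[i + 1:])
    return trajectory, False                 # trigger full motion planning
```

The returned flag selects the row entry of Table 3: "B" when binding succeeds within 50 iterations, "M" when planning must take over.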

Figure 18
Simulation results of GA optimization
Table 3
Motion planning results of the given task schedule (M: motion planning, B: binding, N: Neural motion)

5 Concluding Remarks

This paper proposes a robotic system with structured intelligence. Furthermore, this paper proposes motion planning and learning by a virus-evolutionary genetic algorithm and a NN with short-term and long-term memory. Simulation results show that motion learning is very important for the robot to adapt to a dynamic environment. In addition, binding the outputs of the NN is effective for quick primitive motion planning. Furthermore, even if the NN cannot completely generate feasible trajectories, motion planning based on the outputs of the NN can quickly generate feasible trajectories suitable to the given tasks. However, the proposed method is somewhat complicated compared with other methods [16-19,22-25]. A single intelligent aspect can be easily realized, but it is very difficult to develop a structured intelligence including many intelligent aspects. Future robotics and mechatronics require this kind of structured intelligence to make robots perform various tasks autonomously.

Figure 19
Simulation result of trajectory planning (task 2) by binding the outputs of NN
Figure 20
Simulation result of hierarchical trajectory planning (task 3)
Figure 21
Simulation result of trajectory planning (the 7th trial, task 1) by binding the outputs of NN

The motion learning by the NN depends on the sizes of the short-term and long-term memories. We will discuss adaptive memory sizes, and will extend the NN to a modular structure. Furthermore, this paper only applied the concept of recursive consciousness to robotics; we must deal carefully with consciousness to discuss the close linkage between human science and robotics.

  • [1] S.J.Russell, P.Norvig. Artificial Intelligence, Prentice-Hall, Inc., 1995.
  • [2] J. A. Anderson, E. Rosenfeld, Neurocomputing - Foundations of Research, The MIT Press (1988).
  • [3] J. M. Zurada, R. J. Marks II, C. J. Robinson. Computational Intelligence - Imitating Life, IEEE Press, 1994.
  • [4] M.Palaniswami, Y.Attikiouzel, R.J.Marks II, D.Fogel, T.Fukuda. Computational Intelligence - A Dynamic System Perspective, IEEE Press, 1995.
  • [5] L. A. Zadeh. Fuzzy Sets, Information and Control, Vol.8, 338-353, 1965.
  • [6] J.-S.R.Jang, C.-T.Sun, and E.Mizutani. Neuro-Fuzzy and Soft Computing, Prentice-Hall, Inc., 1997.
  • [7] C.G.Langton. Artificial Life -An Overview, The MIT Press, 1995.
  • [8] R.J.Marks II. Intelligence: Computational Versus Artificial, IEEE Trans. on Neural Networks, 4 (5): 737-739, 1993.
  • [9] S.V.Kartalopoulos. Understanding Neural Networks and Fuzzy Logic, IEEE Press, 1996.
  • [10] D.E.Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison Welsey, 1989.
  • [11] D.B.Fogel. Evolutionary Computation, IEEE Press, 1995.
  • [12] J.Koza. Genetic Programming, Massachusetts: The MIT Press, 1992.
  • [13] J.Koza. Genetic Programming II, Massachusetts: The MIT Press, 1994.
  • [14] J.Holland. Adaptation in Natural and Artificial Systems, Ann Arbor:University of Michigan Press, 1975.
  • [15] R.Brooks. A Robust Layered Control System for a Mobile Robot, IEEE Journal of Robotics and Automation, RA-2-1, 14-23, 1986.
  • [16] Marco Colombetti, Marco Dorigo, Giuseppe Borghi. Behavior Analysis and Training - A methodology for Behavior Engineering, IEEE Trans. on Systems, Man, And Cybernetics, Part B: Cybernetics,26 (3): 365-380, 1996.
  • [17] J.Y.Donnart, J.A.Meyer, Learning Reactive and Planning Rules in a Motivationally Autonomous Animat, IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, 26 (3): 381-395, 1996.
  • [18] Jun Tani, Model-Based Learning for Mobile Robot Navigation from the Dynamical Systems Perspective, IEEE Trans. on Systems, Man, And Cybernetics, Part B: Cybernetics, 26 (3): 421-436, 1996.
  • [19] T.Fukuda and T.Shibata. Theory and Applications for Neural Networks for Industrial Control Systems, IEEE Trans. on Industrial Electronics, 39 (6): 472-489, 1992.
  • [20] N.Kubota, T.Fukuda, K.Shimojima. Trajectory Planning of Cellular Manipulator System Using Virus-Evolutionary Genetic Algorithm, Robotics and Autonomous Systems 19, 85-94, 1996.
  • [21] N. Kubota, T. Fukuda, K. Shimojima, Trajectory Planning of Reconfigurable Redundant Manipulator Using Virus-Evolutionary Genetic Algorithm. In Proc. of The 22nd International Conference on Industrial Electronics, Control, and Instrumentation, pp 836-841, 1996.
  • [22] N.Kubota, T.Arakawa, T.Fukuda, K.Shimojima. Motion Planning for A Robotic System with Structured Intelligence. In Proc. of the 1997 IEEE International Symposium on Computational Intelligence in Robotics and Automation, 60-65, 1997.
  • [23] T.Lozano-Perez. Automatic Planning of Manipulator Transfer Movements, IEEE Trans. on Systems, Man, and Cybernetics, Vol.11, 681-698, 1981.
  • [24] R.A.Brooks, Planning Collision Free Motions for Pick and Place Operation, Robotics Research, MIT Press, 5/38, 1983.
  • [25] O.Khatib. Real-Time Obstacle Avoidance for Manipulators and Mobile Robots, The International Journal of Robotics Research, 5 (1): 90-98, 1986.
  • [26] T. Fukuda and N. Kubota. Intelligent Robotic System Adaptation, Learning and Evolution, In Proc. of the Third International Symposium on Artificial Life and Robotics, Oita, Japan, I-40-45, 1998.
  • [27] G.Syswerda. A Study of Reproduction in Generational and Steady-State Genetic Algorithms, In Foundations of Genetic Algorithms, San Mateo: Morgan Kaufmann Publishers, Inc., 1991.
  • Publication Dates

    • Publication in this collection
      08 Oct 1998
    • Date of issue
      Apr 1998
    Sociedade Brasileira de Computação