Biopsychically Inspired Cognitive Control for Intelligent Agents Based On Motivated Learning
- Jim Zhu and Xudan Xu, School of Electrical Engineering and Computer Science, Ohio University, USA (zhuj @ ohio.edu)
In this talk we present a biopsychically (biologically and psychologically) inspired control system paradigm and architecture for autonomous (aerial, terrestrial, aquatic, space and extraterrestrial) mobile robotic platforms, hereafter called autonomous agents, or simply agents. This talk will be a forward-looking, out-of-the-box inquiry that poses more questions than answers.
Autonomous and semi-autonomous agents play increasingly significant roles today in scientific exploration, industrial applications, search and rescue missions, security and defense, and people's daily lives. These agents integrate the advances in mechanization, automation, artificial intelligence and machine cognition achieved since the first industrial revolution, yet current autonomous agents are far from mature enough for the missions they are designed for.
Most current autonomous agents are remotely commanded or human supervised, which requires human intervention in situation awareness and assessment, goal setting, decision making and mission planning. This necessitates significant communication bandwidth and power, or else the agents will not be able to respond to a dynamic environment in a timely and effective manner. Moreover, at present most practical machine intelligence is pre-programmed, machine knowledge is static, and machine learning is supervised. As such, the agents are able neither to set goals to pursue when facing complex operating scenarios, nor to innovate or improvise autonomously in dealing with unknown environments or unforeseen situations. While supervised learning increases the onboard knowledge base, it does not increase the wisdom of the agent. The pre-programmed intelligence also lacks distinct cognitive traits, or personalities, to facilitate complementary or symbiotic autonomous collaborations. Compared to their biological counterparts, current agents are still too machine-like. Furthermore, current machine cognitive system design and implementation are quite ad hoc, and lack systematic yet flexible integration with the motion control subsystems.
In this talk we look to Nature for inspiration, and propose an autonomous control system architecture whose hardware is modeled after the human biological Central Nervous System (CNS), and whose software and algorithms are modeled after human psychological processes. In particular, the agent dynamics (the "plant") model consists of not only the typical kinematics and dynamics equations of motion for the mechanical platform, but also a hierarchy of "pain" models driven by the agent's innate needs for Surviving and Thriving (S&V) in the environment. At the bottom of the pain hierarchy are the primitive pains, driven by the needs for energy, shelter, rest and healing, exploring, and finding mates; these pains grow with time when left unattended and motivate the agent to act. Abstract pains can be derived as the agent learns from its experience, thereby forming higher-level goals. This "motivated learning" paradigm forms the foundation of unsupervised autonomy and complex psychological behaviors.
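The idea that primitive pains grow while unattended and thereby motivate action can be illustrated with a minimal sketch. It is not the authors' model: the class `PrimitivePain`, the linear growth rule, the fixed relief amount and the "attend to the largest pain" policy are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class PrimitivePain:
    name: str
    level: float        # current pain level (hypothetical units)
    growth_rate: float  # assumed linear growth per step when unattended

    def step(self, attended: bool, relief: float = 1.0) -> None:
        """Reduce the pain if the agent attends to it, otherwise let it grow."""
        if attended:
            self.level = max(0.0, self.level - relief)
        else:
            self.level += self.growth_rate


def dominant_pain(pains):
    """Assumed policy: the agent acts on the currently largest pain."""
    return max(pains, key=lambda p: p.level)


def run_pains(pains, steps):
    """Return the name of the pain attended to at each step."""
    attended_names = []
    for _ in range(steps):
        target = dominant_pain(pains)
        attended_names.append(target.name)
        for p in pains:
            p.step(attended=(p is target))
    return attended_names


pains = [PrimitivePain("energy", 0.0, 0.5), PrimitivePain("rest", 0.0, 0.25)]
schedule = run_pains(pains, 4)
```

With these hypothetical rates the agent alternates between the two needs, because whichever pain is left unattended grows until it dominates.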
Around the motion and pain dynamics models are several feedback loops with distinct time-scale separations. Starting from the innermost and fastest time scale, there are effector-actuator loops (milliseconds), coordinated motion control (sub-seconds), motion trajectory planning and tracking (seconds), tactical task management (minutes to hours) and strategic mission planning (hours to years). In autonomous flight control systems terminology, these correspond roughly to (closed-loop) control allocation, (attitude) control, guidance, the (unsupervised) task manager and the mission manager. Compared to the human CNS, this multi-loop architecture corresponds roughly to the extremity reflex arcs and spinal cord, the cerebellum, the cerebral motor cortex, and the cerebrum, which includes the primitive motivation center (brain stem to limbic system) and advanced cognitive activities in the frontal and prefrontal lobes, as well as high-level sensory and motor processing in the parietal, occipital and temporal lobes.
This paradigm allows us to think of machine cognition as an adaptive dynamic feedback controller which regulates the primitive pains (state variables) to follow certain "nominal trajectories" that are optimized by the environment. Critical controller dynamic states that lead to successful relief of primitive pains can be formulated into derived abstract pains, which in turn set tactical and strategic goals for the cognitive controller to attain and maintain. In this paradigm, the onboard knowledge base, learned with or without supervision, is treated as a set of equilibrium states in a neural network whose synaptic weights are treated as state variables rather than as parameters, as they are in the conventional artificial neural network paradigm. This new paradigm forces us, as control engineers and scientists, to rethink the complexity of dynamic systems and behaviors, and control law design.
As examples of the novel control system paradigm, it is noted that in addition to the hierarchical CNS control pathways, the biological system can exert control directly on the muscular actuators in an emergency via hormonal signals carried by the blood circulation, which is an energy distribution pathway. This suggests that emergency control signals could be transmitted through, e.g., the electric power distribution network. The biological CNS also has a direct inhibition pathway from the cerebrum to the reflex arcs to override the reflexive local protective reaction in order to save the agent as a whole, which can be borrowed in machine agent control architecture design.
As examples of cognitive control law design, we will present some initial exploratory results of employing human and animal behaviors identified in psychological studies. One is the so-called Goal-Gradient Behavior (GGB), in which the agent's motivation and control effort increase as the goal approaches. This is in contrast to the Conventional Control Behavior (CCB) designed by control engineering principles, where control effort diminishes as the goal approaches. Another related psychological behavior is the Stuck-in-the-Middle Behavior (SMB), where the agent applies a greater effort when first embarking on a goal pursuit, then maintains a lower but steady effort over the course, and finally increases its effort as the goal comes in sight. The GGB is typical of animal predator stalking and human deadline-beating behavior, and the SMB is typical of enduring goal pursuit such as marathon running. We will use a Linear Quadratic Regulator optimal control model with time-dependent and state-dependent cost functions to demonstrate the GGB and SMB psychological behaviors and their advantages in reducing energy and stress as compared to the CCB control strategies for particular goal-striving processes.
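How a time-dependent cost can turn a diminishing effort profile into a goal-gradient one can be sketched with a scalar finite-horizon LQR. This is an illustration under assumptions, not the model presented in the talk: the integrator plant, the horizon `N`, and all weight values are hypothetical, and only time-dependent (not state-dependent) weights are used.

```python
def tv_lqr_gains(a, b, q, r, q_final):
    """Backward Riccati recursion for x[k+1] = a*x[k] + b*u[k].

    Cost: sum_k (q[k]*x[k]**2 + r[k]*u[k]**2) + q_final*x[N]**2.
    Returns the feedback gains K[k] of the optimal law u[k] = -K[k]*x[k].
    """
    n = len(q)
    p = q_final                      # cost-to-go weight at the final step
    gains = [0.0] * n
    for k in reversed(range(n)):
        gains[k] = a * b * p / (r[k] + b * b * p)
        p = q[k] + a * a * p - a * b * p * gains[k]
    return gains


def rollout(a, b, gains, x0):
    """Apply u[k] = -K[k]*x[k]; return |u[k]| per step and the final state."""
    x, effort = x0, []
    for gain in gains:
        u = -gain * x
        effort.append(abs(u))
        x = a * x + b * u
    return effort, x


N = 10            # hypothetical horizon, e.g. steps until a deadline
A, B = 1.0, 1.0   # integrator plant: x is the remaining distance to the goal

# CCB-like case: constant weights, effort shrinks as the state approaches zero.
ccb_gains = tv_lqr_gains(A, B, [1.0] * N, [1.0] * N, 1.0)
ccb_effort, ccb_final = rollout(A, B, ccb_gains, 1.0)

# GGB-like case: no running state cost, a heavy terminal cost, and a control
# weight that falls from N to 1, so effort is cheapest near the deadline.
r_ggb = [float(N - k) for k in range(N)]
ggb_gains = tv_lqr_gains(A, B, [0.0] * N, r_ggb, 100.0)
ggb_effort, ggb_final = rollout(A, B, ggb_gains, 1.0)
```

In the second case the optimal effort at step k is proportional to 1/r[k], so it rises toward the deadline, whereas the constant-weight controller spends most of its effort at the start.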
The talk will conclude with a summary of questions and challenges posed by the proposed biopsychic control paradigm and architecture.
Lifelong education in robotics and mechatronics
- Andreja Rojko, University of Maribor, Slovenia (andreja.rojko @ uni-mb.si)
- Krzysztof Kozłowski, Poznań University of Technology, Poland (krzysztof.kozlowski @ put.poznan.pl)
In the ongoing economic and financial crisis, European companies face high pressure from the market. Many companies are outsourcing their resources to other parts of the world with higher and more stable growth rates and lower workforce costs. The only way to keep companies and jobs in Europe is investment in high-tech products, which requires highly qualified human resources. On the other hand, European countries are confronted with aging societies: the share of older people is increasing, and they are expected to stay in the work process longer. To keep these people up to date with technological progress, a lifelong learning process has started inside and outside industry. Within the framework of its Lifelong Learning Programme, the European Union (EU) has carried out a series of expert studies to define the needs for continuing education and qualification in different technology fields. These studies identified robotics and mechatronics as one of the structural drivers of change in the electro-mechanical industry of the EU and, as such, an especially important target for lifelong education.
As one solution for lifelong education, many EU companies (38 %, EUROSTAT) rely on in-company (in-house) training. However, such training is usually organized by equipment producers and hardly meets the needs of practicing technicians and engineers, who also need some general knowledge and experience in order to cope with ever-increasing workplace demands and technological progress. Statistical data also show that 32 % of the total EU population consults the internet for the purpose of self-directed learning. Learning on demand using technology-enhanced learning methods, where the complete training cycle is delivered via the internet, is therefore a promising method for training in industry. The EU has supported a few projects for training in mechatronics or alternative technologies, such as AIRE, MARVEL and MITS-Mechatronica. One project offers mechatronics training to deaf people, and another, the INNOVET project, offers a concept for mechatronics training for weaker learners. However, although this area seems to be well addressed, only a few attempts have been made so far to bring education and industry together in order to develop distance in-company training with up-to-date content and a modern educational approach.
The presentation will provide an overview of the attempts made in this direction and their results. The MERLAB initiative (Innovative Remote Laboratory in the E-training of Mechatronics), in which distance training in the basics of mechatronics was delivered to professionals in the electro-mechanical field from Austria and Slovenia, will be presented. Further, the approach was developed and tested in detail within the E-PRAGMATIC network (E-Learning and Practical Training of Mechatronics and Alternative Technologies in Industrial Community), which is an association of 13 regular and 4 associated partners from seven European countries. The network’s partners are educational institutions, chambers, enterprises and associations. An extensive needs analysis, carried out with the management of relevant companies and with practicing technicians and engineers, reveals the real knowledge and education needs in this field, as well as other relevant information concerning the state of lifelong education. These data are valuable because they reveal not only the needs of professionals but also give important information to educators in regular programs, as topics of high interest and knowledge missing in industry were identified.
Distance courses in robotics, mechatronics and alternative technologies and their implementation for lifelong learning in industry will be presented. The courses are prepared and delivered by the University of Maribor (Slovenia), Poznań University of Technology (Poland), Delft University of Technology (Netherlands), University of Deusto (Spain), Carinthia University of Applied Sciences (Austria) and University of Applied Sciences Bern (Switzerland). The participants, practicing technicians and engineers, come from more than 40 companies in the electro-mechanical field. Concrete solutions will be presented, including the contents of the courses, multimedia e-learning materials, the learning management systems used, and the application of remote workstations and experiments for the practical part of the training. The applied education methodology will be revised by considering feedback obtained from the training participants. Based on the experience gained, the long-term perspective of such education will be evaluated, and some conclusions that are also relevant for regular education will be drawn.
Variable-, Fractional-Order Discrete PID Controllers
- Piotr Ostalczyk, Technical University of Łódź, Poland (piotr.ostalczyk @ kis.p.lodz.pl)
Fractional calculus is the area of mathematics that handles derivatives and integrals of arbitrary order (fractional or integer, real or complex). Nowadays it is applied in almost all areas of science and engineering. Here one can mention its numerous and successful applications in dynamical systems modelling and control, with an increasing number of studies related to the theory and application of fractional-order controllers, especially PI^uD^v ones. In such controllers u and v denote the integration and differentiation order, respectively. Research activities are focused on developing new analysis and closed-loop system synthesis methods for fractional-order controllers as an extension of classical control theory. In fractional-order PI^uD^v controller tuning there are two additional parameters, u and v. This complicates the controller tuning procedure but leads to new closed-loop system transient responses that are unattainable in classical PID control. The closed-loop system with a fractional controller must satisfy typical requirements, among which one can mention robustness to plant model uncertainties. A well-known example of a practically implemented fractional-order robust controller is the CRONE controller (French: Commande Robuste d'Ordre Non Entier).
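For orientation, the continuous-time fractional-order PID controller is commonly written as the transfer function below. The gain symbols K_P, K_I and K_D are notation assumed here and do not come from the abstract; u and v are the integration and differentiation orders named above.

```latex
C(s) = K_P + K_I\, s^{-u} + K_D\, s^{v}, \qquad u, v > 0
```

Setting u = v = 1 recovers the classical PID controller, which shows why the two orders act as additional tuning parameters.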
The discrete-time counterparts of the fractional-order derivatives and integrals are fractional-order backward differences and sums, which can be applied in a discrete-time PΣ^uΔ^v controller. In a practical microprocessor implementation of such controllers, there remains the problem of a linearly growing "calculation tail" caused by the evaluation of the fractional-order difference and sum. Admitting any differentiation and summation order, one may define discrete order functions v(k) and u(k) and relate them to the difference and sum evaluated at the k-th time instant. Such a variable-, fractional-order backward difference and sum, as a generalisation of the fractional (constant) order backward difference and sum, may be used in the variable-, fractional-order PID (VFOPID) controller.
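A variable-order backward difference can be sketched as follows. This assumes one common Grünwald-Letnikov-type definition, in which the coefficients used at instant k are those of the current order v(k); the talk may use a different variant. The function names, the unit sample period and the example order function are assumptions made for illustration. A sum of order u is obtained by passing the negative order -u.

```python
def gl_coefficients(order, count):
    """First `count` Grünwald-Letnikov coefficients a_i = (-1)**i * binom(order, i).

    Computed with the recursion a_0 = 1, a_i = a_{i-1} * (1 - (1 + order) / i).
    """
    coeffs = [1.0]
    for i in range(1, count):
        coeffs.append(coeffs[-1] * (1.0 - (1.0 + order) / i))
    return coeffs


def vo_backward_difference(samples, order_fn):
    """Variable-order backward difference with unit sample period (assumed).

    At instant k the order is order_fn(k) and the result is
    sum_{i=0..k} a_i(order_fn(k)) * samples[k - i].
    The sum has k + 1 terms, which is the linearly growing calculation tail.
    """
    result = []
    for k in range(len(samples)):
        coeffs = gl_coefficients(order_fn(k), k + 1)
        result.append(sum(coeffs[i] * samples[k - i] for i in range(k + 1)))
    return result


signal = [1.0, 3.0, 6.0, 10.0]
first_difference = vo_backward_difference(signal, lambda k: 1.0)   # order +1
running_sum = vo_backward_difference(signal, lambda k: -1.0)       # order -1
# Hypothetical order function: differentiate for k < 2, then sum.
switched = vo_backward_difference(signal, lambda k: 1.0 if k < 2 else -1.0)
```

For integer orders the coefficients reduce to the familiar ones: order 1 gives the first backward difference and order -1 gives the running sum, so the fractional and variable-order cases interpolate between these actions.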
VFOPID controllers offer new, unlimited possibilities for shaping their transient characteristics while preserving the proportional, derivative and integral actions. This is illustrated by numerous computer simulations and DSP realisations. Several structures of VFOPID controllers are presented. An appropriate choice of the order functions v(k) and u(k) avoids the problems related to the "growing calculation tail" mentioned above. Stability conditions for the closed-loop discrete linear system with the VFOPID controller are discussed. The investigations are illustrated by a practical application of the VFOPID controller.