The University of Edinburgh SENSOPAC Site

Optimal Feedback Control
Relates to SENSOPAC Work Package 3: 3A.1, 3A.2
People involved: Djordje Mitrovic, Stefan Klanke, Sethu Vijayakumar

Role within SENSOPAC

Optimal feedback control is an essential ingredient for:

Planning optimal actions in the presence of redundancy across various levels: actuation dynamics, kinematics and variable stiffness
Incorporating dynamic and kinematic models seamlessly with task constraints
Explaining the trial-to-trial variability observed in natural human movements

This is an essential bridge between the learning work of WP2 and the implementation on the DLR robot, since conventional planning cannot be employed in the presence of redundancies.

With 26 degrees of freedom and variable stiffness in each joint, the integrated hand-arm system developed by the SENSOPAC partner DLR places high demands on any control approach. In parallel to the implementation of classic techniques for inverse kinematics and control (carried out at DLR), we investigate how online learning of the kinematics and dynamics (cf. our corresponding subproject) can be incorporated into the framework of optimal feedback control.

We believe that our approach will result in an improvement over existing adaptive control methods and at the same time will be useful for explaining observed biological motion patterns. We will integrate and test our novel control paradigms in simulation as well as on anthropomorphic robotic platforms (DLR) that embody the challenges of complex, nonlinear dynamics and real-time requirements.

Adaptive optimal control for high dimensional movement systems

Humans and other biological systems are very adept at performing fast, complicated control tasks in spite of large sensorimotor delays, while remaining fairly robust to noise and perturbations. For example, one is able to react quickly and accurately to catch a speeding ball while at the same time being flexible enough to "give in" when obstructed during the execution of a task. This unique combination of speed, accuracy and safety is extremely appealing for any autonomous robotic system.

Classic PID feedback control steers the system along a planned trajectory by generating corrective commands that depend on fine-tuned PID gains, which trade off accuracy against stiffness. High gains make the robot precise but also very stiff and therefore dangerous, i.e., not compliant. Small gains, on the other hand, reduce stiffness and increase the robot's flexibility with respect to external stimuli, but may result in sluggish and imprecise movements. Ideal movements would be accurate yet compliant, but these are conflicting aims within the PID feedback control paradigm. To deal with this problem, a feedforward component based on an accurate inverse dynamics model of the robot arm can be used. The commands required for the desired motion can then be predicted, so that only small correction forces are needed, which increases compliance.
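The trade-off described above can be illustrated with a minimal sketch. It assumes a hypothetical 1-DOF point mass with a perfectly known inverse dynamics model (u = mass * acceleration), a sinusoidal desired trajectory, and PD rather than full PID feedback; the function name `simulate` and all gain values are illustrative choices, not part of the SENSOPAC controllers.

```python
import numpy as np

def simulate(kp, kd, use_feedforward, mass=1.0, dt=0.001, duration=1.0):
    """Track q_des(t) = sin(2*pi*t) with a 1-DOF point mass.

    Returns the RMS tracking error. All parameters are illustrative.
    """
    w = 2.0 * np.pi
    q, qd = 0.0, w  # start exactly on the desired trajectory
    errors = []
    for k in range(int(duration / dt)):
        t = k * dt
        q_des = np.sin(w * t)
        qd_des = w * np.cos(w * t)
        qdd_des = -w**2 * np.sin(w * t)
        # Feedback: corrective force proportional to the tracking error.
        u_fb = kp * (q_des - q) + kd * (qd_des - qd)
        # Feedforward: inverse dynamics of the (assumed exact) model.
        u_ff = mass * qdd_des if use_feedforward else 0.0
        # Plant: point mass, integrated with semi-implicit Euler.
        qdd = (u_fb + u_ff) / mass
        qd += qdd * dt
        q += qd * dt
        errors.append(q_des - q)
    return float(np.sqrt(np.mean(np.square(errors))))

err_fb_low = simulate(kp=10.0, kd=2.0, use_feedforward=False)     # compliant, imprecise
err_fb_high = simulate(kp=1000.0, kd=60.0, use_feedforward=False) # precise, stiff
err_ff_low = simulate(kp=10.0, kd=2.0, use_feedforward=True)      # compliant, precise
print(err_fb_low, err_fb_high, err_ff_low)
```

In this toy setting, the low-gain controller with feedforward tracks far better than the same low gains without it, while keeping the low stiffness. With an inaccurate model, the benefit would shrink accordingly.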

Most motor systems exhibit a high degree of redundancy, and a controller has to choose among many possible trajectories (kinematics) and among a multitude of applicable motor commands (dynamics) to achieve a particular task. So far, this redundancy has mostly been resolved by models based on open loop optimization, in which the sequence of motor commands or the trajectory is directly optimized with respect to some cost function. These models assume deterministic dynamics, whereas real systems typically suffer from perturbations, noise and an inaccurate approximation of the true dynamics. Such effects are usually compensated for by a feedback component (e.g. a PID controller), but this correction is not taken into account in the optimization process.
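As a minimal sketch of open loop optimization, the following assumes a hypothetical discrete-time point mass (double integrator) and a minimum-effort cost, i.e. the sum of squared commands. Many command sequences reach the target; the pseudoinverse picks the one with the smallest effort. The variable names and the mid-movement "push" are illustrative assumptions.

```python
import numpy as np

dt, N = 0.01, 100
A = np.array([[1.0, dt], [0.0, 1.0]])   # state: [position, velocity]
B = np.array([[0.0], [dt]])             # command: acceleration
x0 = np.array([0.0, 0.0])
target = np.array([1.0, 0.0])           # reach position 1 and come to rest

# Stack the deterministic dynamics: x_N = A^N x0 + C @ u, with u of length N.
C = np.hstack([np.linalg.matrix_power(A, N - 1 - k) @ B for k in range(N)])

# Minimum-norm (minimum-effort) command sequence that reaches the target.
u_open = np.linalg.pinv(C) @ (target - np.linalg.matrix_power(A, N) @ x0)

def rollout(u, push=0.0):
    """Replay a fixed command sequence; optionally disturb the velocity midway."""
    x = x0.copy()
    for k in range(N):
        x = A @ x + B[:, 0] * u[k]
        if k == N // 2:
            x = x + np.array([0.0, push])
    return x

x_nominal = rollout(u_open)            # deterministic dynamics: target is reached
x_pushed = rollout(u_open, push=0.5)   # perturbed dynamics: target is missed
print(x_nominal, x_pushed)
```

Because the optimized sequence is simply replayed, the perturbed rollout ends away from the target. This is the gap that a separate feedback component has to close, outside the optimization.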

Recently, a closed loop optimization model, namely optimal feedback control (OFC), has been suggested as an alternative. Here, the gains of a feedback controller are optimized to produce an optimal mapping from states to control signals (the control law), taking real-world uncertainties and noise into account. OFC requires an optimal estimate of the state variables, which is obtained by combining sensory feedback with an efference copy of the motor commands. A forward internal model converts the motor commands into predicted state variables. A key property of OFC is that errors are corrected by the controller only if they adversely affect task performance; otherwise they are left uncorrected (minimum intervention principle). In OFC, no trajectories need to be explicitly planned or represented internally. OFC is also an interesting paradigm from the perspective of biological plausibility. Experimental results show that optimal feedback controllers reproduce observed biological movement patterns and, unlike the open loop approaches, can additionally model the typical trial-to-trial variability. Furthermore, OFC can cope with large sensor delays and other uncertainties, and therefore appears to be a favorable computational approach for us to investigate further.
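The minimum intervention principle can be sketched in the simplest OFC setting: linear dynamics, quadratic cost and a fully observed state (so the state estimation step is omitted here). The toy task, assumed purely for illustration, is to keep the sum of two redundant coordinates at zero, so that only the direction [1, 1] is task-relevant. Feedback gains are obtained by the standard backward Riccati recursion; for this linear-quadratic case, additive Gaussian noise would not change them.

```python
import numpy as np

# Hypothetical redundant system: two coordinates, each directly actuated.
A = np.eye(2)
B = np.eye(2)
c = np.array([[1.0], [1.0]])   # task only cares about x1 + x2
Q = c @ c.T                    # state cost: (x1 + x2)^2
R = 0.1 * np.eye(2)            # control effort cost
N = 50                         # horizon length

def lqr_gains(A, B, Q, R, N):
    """Finite-horizon discrete LQR: returns time-ordered gains K_0 .. K_{N-1}."""
    P = Q.copy()               # cost-to-go at the final step
    gains = []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]         # recursion runs backwards in time

K0 = lqr_gains(A, B, Q, R, N)[0]

# Control law u = -K x, evaluated for two deviations of equal size.
u_relevant = -K0 @ np.array([1.0, 1.0])     # deviation changes x1 + x2
u_irrelevant = -K0 @ np.array([1.0, -1.0])  # deviation leaves x1 + x2 unchanged
print(u_relevant, u_irrelevant)
```

The optimized gains respond to the task-relevant deviation but produce (numerically) zero command for the task-irrelevant one, so variability is allowed to accumulate in the redundant direction. Nonlinear dynamics such as those of a robot arm would need an iterative or approximate method instead of this closed-form recursion.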

Subproject overview