Motion Planning: Task and Motion Planning

The Motion Planning Hierarchy
Robot autonomy is naturally hierarchical. A mobile manipulator tidying a room must decide which object to pick next (task), plan a collision-free path to the object (path planning), compute smooth joint trajectories that respect torque limits (trajectory optimisation), and track those trajectories despite disturbances (control). Motion planning sits at the intersection of all these levels.
Task and Motion Planning (TAMP)
Pure geometric planners assume the task sequence is fixed. But for long-horizon manipulation — “set the table”, “assemble a widget” — the robot must reason over sequences of symbolic actions whose feasibility depends on continuous geometry.
Task and Motion Planning (TAMP) interleaves a symbolic task planner (which selects action sequences) with a geometric motion planner (which checks feasibility and computes a trajectory for each action). If the geometric planner finds no feasible motion for some action in a candidate sequence, it reports the failure to the task planner, which backtracks and explores other symbolic options.
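The interleaving described above can be sketched on a toy problem. Everything in this example is hypothetical: the objects, the `blocked_by` relation, and the `motion_feasible` check, which stands in for a real geometric motion planner. The task level enumerates pick orders, the geometric level rejects picks that are obstructed, and each failed prefix is fed back so the task level never retries it.

```python
from itertools import permutations

def symbolic_candidates(objects):
    """Task level: enumerate candidate action sequences (here, every pick order)."""
    for order in permutations(objects):
        yield [("pick", name) for name in order]

def motion_feasible(action, remaining, blocked_by):
    """Stand-in for a geometric planner (assumption): a pick fails while
    any object that obstructs the target is still on the table."""
    _, name = action
    return not (blocked_by.get(name, set()) & remaining)

def plan(objects, blocked_by):
    failed_prefixes = set()  # feedback from the geometric level to the task level
    for candidate in symbolic_candidates(objects):
        # Skip sequences that start with a prefix already known to be infeasible.
        if any(tuple(candidate[:k]) in failed_prefixes
               for k in range(1, len(candidate) + 1)):
            continue
        remaining = set(objects)
        for k, action in enumerate(candidate):
            if not motion_feasible(action, remaining, blocked_by):
                failed_prefixes.add(tuple(candidate[:k + 1]))
                break  # geometric failure: backtrack to the next symbolic option
            remaining.discard(action[1])
        else:
            return candidate  # every action passed the geometric check
    return None

# Hypothetical scene: the cup is in front of the plate, the plate in front of the fork.
blocked_by = {"plate": {"cup"}, "fork": {"plate"}}
result = plan(["fork", "plate", "cup"], blocked_by)
print(result)
```

In this scene only one order survives the geometric check (cup, then plate, then fork). A real TAMP system would replace the exhaustive enumeration with a symbolic planner and the set lookup with calls to a motion planner.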
Toussaint (2018) formalised TAMP as a logic-geometric program: a joint optimisation over discrete action sequences and continuous motion parameters. The key insight is that symbolic and geometric subproblems are not independent — the feasibility of picking an object depends on the robot’s current pose, which is determined by prior motion plans.
Trajectory Optimisation: iLQR and DDP
Given a fixed task sequence and waypoints, trajectory optimisation computes smooth, dynamically consistent joint trajectories by minimising a cost functional over the entire trajectory.
Differential Dynamic Programming (DDP) optimises trajectories by iterating two steps: a backward pass that computes a second-order approximation of the value function, and a forward pass that updates the trajectory using the computed control gains.
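For reference, the backward pass can be written out as follows. The notation is an assumption, not taken from the text above: the dynamics are \(x_{t+1} = f(x_t, u_t)\), subscripts denote partial derivatives evaluated along the current trajectory, and \(V'\) is the value function at the next time step.

\[
\begin{aligned}
Q_x &= l_x + f_x^\top V'_x, \qquad Q_u = l_u + f_u^\top V'_x, \\
Q_{xx} &= l_{xx} + f_x^\top V'_{xx} f_x + V'_x \cdot f_{xx}, \\
Q_{uu} &= l_{uu} + f_u^\top V'_{xx} f_u + V'_x \cdot f_{uu}, \\
Q_{ux} &= l_{ux} + f_u^\top V'_{xx} f_x + V'_x \cdot f_{ux}, \\
k &= -Q_{uu}^{-1} Q_u, \qquad K = -Q_{uu}^{-1} Q_{ux}.
\end{aligned}
\]

The forward pass then applies \(\hat{u}_t = u_t + k_t + K_t(\hat{x}_t - x_t)\) and rolls the dynamics forward to obtain the new trajectory \(\hat{x}\). The terms containing \(f_{xx}\), \(f_{uu}\) and \(f_{ux}\) are the second-order dynamics derivatives.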
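Iterative Linear Quadratic Regulator (iLQR) is a simplified form of DDP that keeps only first-order derivatives of the dynamics, dropping the second-order dynamics terms that DDP uses in its backward pass. This reduces the computation per iteration at the price of DDP's quadratic convergence near the optimum, although iLQR usually converges well in practice. The cost to minimise over a horizon \(T\) is:
\[
J = \sum_{t=0}^{T-1} l(x_t, u_t) + l_f(x_T)
\]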
where \(l(x_t, u_t)\) is a running cost (e.g., penalising joint velocities and torques) and \(l_f\) is a terminal cost (e.g., distance to goal pose). iLQR alternates between linearising the dynamics around the current trajectory and solving the resulting LQR problem, making it highly efficient for robot arms with known dynamics models.
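A minimal iLQR sketch for a scalar toy system follows, using only the Python standard library. The dynamics, step size, weights and line search are illustrative assumptions, not a model of any particular robot. The running cost penalises control effort and the terminal cost penalises distance to a goal state, matching the roles of \(l\) and \(l_f\) described above.

```python
import math

DT, T = 0.1, 30                  # step size and horizon (hypothetical values)
GOAL, R, W = 1.0, 0.01, 10.0     # goal state, control weight, terminal weight

def f(x, u):
    """Toy scalar dynamics (assumption): command u against a gravity-like drift."""
    return x + DT * (u - math.sin(x))

def f_derivs(x, u):
    """First-order dynamics derivatives (f_x, f_u); iLQR uses nothing higher."""
    return 1.0 - DT * math.cos(x), DT

def rollout(x0, us):
    xs = [x0]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs

def total_cost(xs, us):
    running = sum(0.5 * R * u * u for u in us)        # sum of l(x_t, u_t)
    return running + 0.5 * W * (xs[-1] - GOAL) ** 2   # plus l_f(x_T)

def backward_pass(xs, us):
    """Solve the local LQR problem: feedforward terms k_t and feedback gains K_t."""
    v_x, v_xx = W * (xs[-1] - GOAL), W
    ks, Ks = [0.0] * T, [0.0] * T
    for t in reversed(range(T)):
        fx, fu = f_derivs(xs[t], us[t])
        q_x, q_u = fx * v_x, R * us[t] + fu * v_x     # l_x = 0 for this cost
        q_xx, q_uu, q_ux = fx * v_xx * fx, R + fu * v_xx * fu, fu * v_xx * fx
        ks[t], Ks[t] = -q_u / q_uu, -q_ux / q_uu
        v_x, v_xx = q_x + q_ux * ks[t], q_xx + q_ux * Ks[t]
    return ks, Ks

def forward_pass(xs, us, ks, Ks, alpha):
    """Roll out the updated controls on the true nonlinear dynamics."""
    new_xs, new_us = [xs[0]], []
    for t in range(T):
        u = us[t] + alpha * ks[t] + Ks[t] * (new_xs[t] - xs[t])
        new_us.append(u)
        new_xs.append(f(new_xs[t], u))
    return new_xs, new_us

def ilqr(x0, iters=50, tol=1e-9):
    us = [0.0] * T
    xs = rollout(x0, us)
    costs = [total_cost(xs, us)]
    for _ in range(iters):
        ks, Ks = backward_pass(xs, us)
        alpha = 1.0
        while alpha > 1e-4:      # backtracking line search on the feedforward term
            new_xs, new_us = forward_pass(xs, us, ks, Ks, alpha)
            new_cost = total_cost(new_xs, new_us)
            if new_cost < costs[-1]:
                break
            alpha *= 0.5
        else:
            break                # no improving step found
        improvement = costs[-1] - new_cost
        xs, us = new_xs, new_us
        costs.append(new_cost)
        if improvement < tol:
            break
    return xs, us, costs

xs, us, costs = ilqr(x0=0.0)
print(f"cost {costs[0]:.3f} -> {costs[-1]:.3f}, final state {xs[-1]:.3f}")
```

Because a step is accepted only when it lowers the cost, the recorded costs decrease monotonically. For a multi-joint arm the scalars become vectors and matrices, and the divisions by `q_uu` become linear solves, usually with regularisation.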
Model Predictive Control (MPC)
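Trajectory optimisation plans an entire trajectory offline. Model Predictive Control (MPC) closes the loop by re-solving the optimisation problem at every control step, using only a finite receding horizon \(H\):
\[
\min_{u_t, \dots, u_{t+H-1}} \; \sum_{k=t}^{t+H-1} l(x_k, u_k) + l_f(x_{t+H}) \quad \text{subject to} \quad x_{k+1} = f(x_k, u_k),
\]
where \(x_t\) is the current measured state and \(f\) denotes the dynamics model.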
MPC applies the first action from the optimal sequence, observes the new state, and repeats. Because every solve starts from the measured state, MPC tolerates moderate model mismatch and disturbances, which is a real advantage in physical robotics. The tradeoff is computational cost: for legged robots and manipulation, MPC must solve a nonlinear program within milliseconds at every control step.
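The receding-horizon loop can be sketched on a scalar integrator. All numbers are hypothetical, and the controller's model deliberately differs from the simulated plant. For this unconstrained linear-quadratic case the horizon problem is solved exactly by a Riccati recursion. A real robot would need a nonlinear program solver in place of `solve_horizon`.

```python
B_MODEL = 0.10                        # input gain the controller believes in
B_TRUE, DISTURBANCE = 0.08, -0.01     # what the simulated plant does (assumed mismatch)
Q, R, P_F = 1.0, 0.01, 1.0            # state, control and terminal weights
H = 20                                # receding horizon length
GOAL = 1.0

def solve_horizon(e0):
    """Optimal controls over H steps for the model e+ = e + B_MODEL * u,
    where e is the error to the goal."""
    p, gains = P_F, []
    for _ in range(H):                # backward Riccati recursion
        k = B_MODEL * p / (R + B_MODEL ** 2 * p)
        p = Q + p - B_MODEL * p * k
        gains.append(k)
    gains.reverse()                   # gains[0] now belongs to the first step
    e, us = e0, []
    for k in gains:                   # forward rollout on the model
        u = -k * e
        us.append(u)
        e = e + B_MODEL * u
    return us

def plant(x, u):
    """Simulated 'real' system with a wrong gain and a constant disturbance."""
    return x + B_TRUE * u + DISTURBANCE

def run_mpc(x0, steps):
    x = x0
    for _ in range(steps):
        us = solve_horizon(x - GOAL)  # re-solve from the measured state
        x = plant(x, us[0])           # apply only the first action
    return x

def run_open_loop(x0):
    x = x0
    for u in solve_horizon(x0 - GOAL):  # plan once, then execute blindly
        x = plant(x, u)
    return x

mpc_final = run_mpc(0.0, H)
open_loop_final = run_open_loop(0.0)
print(f"MPC error {abs(mpc_final - GOAL):.3f}, "
      f"open-loop error {abs(open_loop_final - GOAL):.3f}")
```

The open-loop plan accumulates the model error and the disturbance over the whole horizon, while the receding-horizon loop corrects for them at every step and is left with only a small steady-state offset.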
Recent work combines MPC with learned dynamics models (neural MPC) or uses warm-starting from the previous solution to reduce solve time, enabling real-time operation at 100 Hz or higher.
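One common form of warm-starting is to shift the previous control sequence by one step and use it as the initial guess for the next solve. The sketch below shows only that shift. The function name and the choice to repeat the last control are assumptions, and other padding choices (for example a zero control) are equally plausible.

```python
def shift_warm_start(prev_controls):
    """Initial guess for the next solve: drop the action that was just
    applied and repeat the last planned control to keep the horizon length."""
    if not prev_controls:
        return []
    return list(prev_controls[1:]) + [prev_controls[-1]]

previous_solution = [0.8, 0.5, 0.2, 0.0]   # hypothetical controls from the last solve
initial_guess = shift_warm_start(previous_solution)
print(initial_guess)
```

Since consecutive MPC problems differ by a single time step, the shifted sequence is usually close to the new optimum, so an iterative solver needs fewer iterations than from a cold start.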
Whole-Body Motion Planning
Humanoids and mobile manipulators require whole-body motion planning: simultaneously optimising base locomotion, arm motion, and end-effector goals. Whole-body control (WBC) formulates this as a hierarchical quadratic program (QP) that satisfies multiple tasks (Cartesian goals, balance, joint limits) in priority order. The priorities are strict: each lower-priority task is solved only within the set of solutions that are already optimal for every higher-priority task, so it can never degrade them.
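Strict prioritisation can be illustrated with the classical null-space projection, which is a closed-form special case and not a hierarchical QP solver: it handles only equality tasks and cannot express inequality constraints such as joint limits. The sketch assumes two tasks, each with a single-row Jacobian, and the Jacobians and target velocities are made-up numbers.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve_two_level(j1, v1, j2, v2):
    """Joint velocities that meet task 1 exactly and task 2 as well as possible.
    j1, j2: single-row task Jacobians (lists); v1, v2: desired task velocities."""
    # Level 1: minimum-norm solution of j1 . dq = v1.
    s1 = dot(j1, j1)
    dq1 = [v1 * a / s1 for a in j1]
    # Project j2 into the null space of j1, so motion along it leaves task 1 untouched.
    c = dot(j2, j1) / s1
    j2n = [b - c * a for a, b in zip(j1, j2)]
    s2 = dot(j2n, j2n)
    if s2 < 1e-12:
        return dq1                    # task 2 conflicts entirely with task 1
    # Level 2: correct the remaining task-2 error using null-space motion only.
    residual = v2 - dot(j2, dq1)
    return [d + residual * g / s2 for d, g in zip(dq1, j2n)]

# Hypothetical 3-joint example with compatible tasks.
j1, v1 = [1.0, 1.0, 0.0], 0.2         # high priority (e.g. balance)
j2, v2 = [0.0, 1.0, 1.0], -0.1        # low priority (e.g. hand motion)
dq = solve_two_level(j1, v1, j2, v2)

# Conflicting low-priority task: its Jacobian is parallel to j1.
dq_conflict = solve_two_level(j1, v1, [2.0, 2.0, 0.0], 1.0)
print(dq, dq_conflict)
```

When the tasks are compatible both are met. When they conflict, the high-priority task is still satisfied exactly and the low-priority one is left unmet. QP-based formulations generalise this behaviour to inequality constraints.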
References
- Toussaint, M. (2018). Logic-geometric programming: An optimisation-based approach to combined task and motion planning. IJCAI 2018, 1930–1936.
- Mayne, D. Q., et al. (2000). Constrained model predictive control: Stability and optimality. Automatica, 36(6), 789–814.
- Tassa, Y., Erez, T., & Todorov, E. (2012). Synthesis and stabilisation of complex behaviours through online trajectory optimisation. IROS 2012.
- Jacobson, D. H., & Mayne, D. Q. (1970). Differential Dynamic Programming. Elsevier.
- Sentis, L., & Khatib, O. (2005). Synthesis of whole-body behaviors through hierarchical control of behavioral primitives. IJHR, 2(4), 505–518.
- Posa, M., Cantu, C., & Tedrake, R. (2014). A direct method for trajectory optimization of rigid body dynamical systems with contact. IJRR, 33(1), 69–81.
