Motion Planning: Task and Motion Planning


Published: September 07, 2025

TL;DR: Motion planning bridges the gap between what a robot must do (task level) and how it moves to do it (trajectory level). Task and Motion Planning (TAMP) interleaves symbolic reasoning with geometric feasibility checks; trajectory optimisation methods like iLQR and DDP compute locally optimal joint trajectories; and Model Predictive Control closes the loop by replanning at every timestep.

Figure: Motion planning for manipulation. Dexterous manipulation via learned motion planning (Andrychowicz et al., 2019).

The Motion Planning Hierarchy

Robot autonomy is naturally hierarchical. A mobile manipulator tidying a room must decide which object to pick next (task), plan a collision-free path to the object (path planning), compute smooth joint trajectories that respect torque limits (trajectory optimisation), and track those trajectories despite disturbances (control). Motion planning sits at the intersection of all these levels.

Task and Motion Planning (TAMP)

Pure geometric planners assume the task sequence is fixed. But for long-horizon manipulation — “set the table”, “assemble a widget” — the robot must reason over sequences of symbolic actions whose feasibility depends on continuous geometry.

Task and Motion Planning (TAMP) interleaves a symbolic task planner (which selects action sequences) with a geometric motion planner (which checks and computes feasible trajectories for each action). If the geometric planner fails for a candidate action sequence, it provides feedback to backtrack and explore other symbolic options.

Toussaint (2015) formalised TAMP as a logic-geometric program: a joint optimisation over discrete action sequences and continuous motion parameters. The key insight is that symbolic and geometric subproblems are not independent: whether an object can be picked depends on the robot’s current pose, which is determined by prior motion plans.

Key Insight: TAMP is powerful precisely because it refuses to separate "what to do" from "how to do it." Geometric infeasibility is a signal to revise the high-level plan, not just to try harder at the motion level.
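The interleaving described in this section can be sketched with a toy example. This is a minimal illustration, not a real TAMP system: the scene, the `blocked_by` dictionary, and the function names are all hypothetical, and the "geometric planner" is replaced by a simple reachability test. The symbolic layer enumerates pick orders, the geometric layer reports the first infeasible step, and that feedback prunes every other sequence sharing the failed prefix.

```python
from itertools import permutations

def motion_feasible(obj, removed, blocked_by):
    """Stand-in for a geometric planner (assumption): a pick is feasible
    only once every object blocking the approach has been removed."""
    return blocked_by[obj] <= removed

def first_infeasible_step(sequence, blocked_by):
    """Simulate the sequence; return the index of the first failing pick, or None."""
    removed = set()
    for i, obj in enumerate(sequence):
        if not motion_feasible(obj, removed, blocked_by):
            return i
        removed.add(obj)
    return None

def tamp_plan(blocked_by):
    """Enumerate symbolic pick orders, using geometric failures to prune."""
    failed_prefixes = set()
    for seq in permutations(sorted(blocked_by)):
        # Skip any candidate that starts with a prefix already shown infeasible.
        if any(seq[:k] in failed_prefixes for k in range(1, len(seq) + 1)):
            continue
        fail = first_infeasible_step(seq, blocked_by)
        if fail is None:
            return list(seq)
        failed_prefixes.add(seq[:fail + 1])  # geometric feedback to the task level
    return None

# Hypothetical scene: the plate blocks the cup, and the cup blocks the fork.
SCENE = {"cup": {"plate"}, "plate": set(), "fork": {"cup"}}
print(tamp_plan(SCENE))  # ['plate', 'cup', 'fork']
```

Real TAMP solvers replace the exhaustive enumeration with a symbolic planner and the reachability test with an actual motion planner, but the feedback loop has the same shape.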

Trajectory Optimisation: iLQR and DDP

Given a fixed task sequence and waypoints, trajectory optimisation computes smooth, dynamically consistent joint trajectories by minimising a cost functional over the entire trajectory.

Differential Dynamic Programming (DDP) optimises trajectories by iterating two steps: a backward pass that computes a second-order approximation of the value function, and a forward pass that updates the trajectory using the computed control gains.
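For concreteness, the two passes can be written out in the notation commonly used for DDP. The symbols here are assumptions introduced for illustration rather than taken from this post: \(f\) is the discrete dynamics \(x_{t+1} = f(x_t, u_t)\), \(l\) is the running cost, subscripts denote partial derivatives evaluated on the current trajectory, and \(V'\) is the value function at the next timestep. The backward pass computes:

\[
\begin{aligned}
Q_x &= l_x + f_x^\top V'_x \\
Q_u &= l_u + f_u^\top V'_x \\
Q_{xx} &= l_{xx} + f_x^\top V'_{xx} f_x + V'_x \cdot f_{xx} \\
Q_{uu} &= l_{uu} + f_u^\top V'_{xx} f_u + V'_x \cdot f_{uu} \\
Q_{ux} &= l_{ux} + f_u^\top V'_{xx} f_x + V'_x \cdot f_{ux} \\
k &= -Q_{uu}^{-1} Q_u, \qquad K = -Q_{uu}^{-1} Q_{ux} \\
V_x &= Q_x - Q_{ux}^\top Q_{uu}^{-1} Q_u, \qquad V_{xx} = Q_{xx} - Q_{ux}^\top Q_{uu}^{-1} Q_{ux}
\end{aligned}
\]

The forward pass then rolls out the updated trajectory with a line-search step \(\alpha \in (0, 1]\):

\[
\hat{u}_t = u_t + \alpha k_t + K_t (\hat{x}_t - x_t), \qquad \hat{x}_{t+1} = f(\hat{x}_t, \hat{u}_t)
\]

The terms \(V'_x \cdot f_{xx}\), \(V'_x \cdot f_{uu}\) and \(V'_x \cdot f_{ux}\) are the ones that involve second derivatives of the dynamics.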

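Iterative Linear Quadratic Regulator (iLQR) is a simplified form of DDP that keeps only first-order approximations of the dynamics, dropping their second derivatives. This reduces the computation per iteration; in general it gives up DDP's local quadratic convergence, although in practice it often converges in a similar number of iterations. The cost to minimise over a horizon \(T\) is: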

\[ J = \sum_{t=0}^{T-1} l(x_t, u_t) + l_f(x_T) \]

where \(l(x_t, u_t)\) is a running cost (e.g., penalising joint velocities and torques) and \(l_f\) is a terminal cost (e.g., distance to goal pose). iLQR alternates between linearising the dynamics (and quadratising the cost) around the current trajectory and solving the resulting time-varying LQR problem, making it highly efficient for robot arms with known dynamics models.
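A minimal sketch of this loop, under strong simplifying assumptions: the state and control are scalars, the dynamics `f` are a made-up pendulum-like toy system, and the constants (`DT`, `T`, `GOAL`, `R`, `QF`) are illustrative values, none of which come from a real robot. The running cost penalises control effort only and the terminal cost penalises distance to the goal.

```python
import math

DT, T = 0.1, 50                 # step size and horizon (illustrative)
GOAL, R, QF = 1.0, 0.01, 100.0  # goal state, control weight, terminal weight

def f(x, u):
    """Toy scalar dynamics, assumed for illustration only."""
    return x + DT * (u - math.sin(x))

def rollout(x0, us):
    xs = [x0]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs

def cost(xs, us):
    running = sum(0.5 * R * u * u for u in us)
    return running + 0.5 * QF * (xs[-1] - GOAL) ** 2

def backward_pass(xs, us):
    """Solve the LQR problem obtained by linearising f around (xs, us)."""
    vx, vxx = QF * (xs[-1] - GOAL), QF  # derivatives of the terminal cost
    ks, Ks = [0.0] * T, [0.0] * T
    for t in reversed(range(T)):
        fx, fu = 1.0 - DT * math.cos(xs[t]), DT  # first-order dynamics only
        qx = fx * vx                             # running cost has no x term
        qu = R * us[t] + fu * vx
        qxx = fx * vxx * fx
        quu = R + fu * vxx * fu
        qux = fu * vxx * fx
        ks[t], Ks[t] = -qu / quu, -qux / quu     # feedforward and feedback gains
        vx = qx + qux * ks[t]
        vxx = qxx + qux * Ks[t]
    return ks, Ks

def forward_pass(xs, us, ks, Ks, alpha):
    new_xs, new_us = [xs[0]], []
    for t in range(T):
        u = us[t] + alpha * ks[t] + Ks[t] * (new_xs[t] - xs[t])
        new_us.append(u)
        new_xs.append(f(new_xs[t], u))
    return new_xs, new_us

def ilqr(x0, iters=50, tol=1e-9):
    us = [0.0] * T
    xs = rollout(x0, us)
    best = cost(xs, us)
    for _ in range(iters):
        ks, Ks = backward_pass(xs, us)
        for alpha in (1.0, 0.5, 0.25, 0.125):    # backtracking line search
            cand_xs, cand_us = forward_pass(xs, us, ks, Ks, alpha)
            cand_cost = cost(cand_xs, cand_us)
            if cand_cost < best:
                gain = best - cand_cost
                xs, us, best = cand_xs, cand_us, cand_cost
                break
        else:
            break                                # no step size reduced the cost
        if gain < tol:
            break
    return xs, us, best

xs, us, final_cost = ilqr(0.0)
print(f"final state {xs[-1]:.3f}, cost {final_cost:.4f}")
```

For a multi-joint arm the scalars become vectors and matrices and the divisions become linear solves, usually with regularisation on `quu`; the structure of the two passes stays the same.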

Model Predictive Control (MPC)

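Trajectory optimisation on its own produces an open-loop plan: the whole trajectory is computed once, and nothing corrects it if the robot drifts. Model Predictive Control (MPC) closes the loop by re-solving the optimisation problem at every control step from the current state, using only a finite receding horizon \(H\):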

\[ \min_{u_{t:t+H-1}} \sum_{k=t}^{t+H-1} l(x_k, u_k) \quad \text{s.t.} \quad x_{k+1} = f(x_k, u_k), \ \text{constraints} \]

MPC applies the first action from the optimal sequence, observes the new state, and repeats. Because every solve starts from the measured state, this feedback gives MPC a degree of robustness to model mismatch and disturbances, a real advantage in physical robotics. The tradeoff is computational cost: for legged robots and manipulation, MPC must solve a nonlinear program within a few milliseconds at every control step.
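The receding-horizon loop can be illustrated on a deliberately simple case. The sketch below assumes a scalar linear model with a quadratic cost and no constraints, so each "solve" reduces to a finite-horizon LQR recursion rather than a nonlinear program. All constants are made up for illustration: the controller's model (`A_MODEL`) differs from the simulated plant (`A_TRUE`), which also receives a constant disturbance.

```python
A_MODEL, B, Q, R = 1.0, 0.1, 1.0, 0.1  # model used by the planner (assumed)
A_TRUE, DISTURBANCE = 1.05, 0.01       # simulated plant: mismatch plus disturbance

def plan(x, horizon):
    """Finite-horizon LQR plan on the model, starting from state x."""
    p, gains = Q, []
    for _ in range(horizon):           # backward Riccati recursion
        k = (B * p * A_MODEL) / (R + B * p * B)
        p = Q + A_MODEL * p * A_MODEL - (A_MODEL * p * B) * k
        gains.append(k)
    gains.reverse()                    # gains[0] now belongs to the first step
    us = []
    for k in gains:                    # forward rollout on the model
        u = -k * x
        us.append(u)
        x = A_MODEL * x + B * u
    return us

def plant_step(x, u):
    return A_TRUE * x + B * u + DISTURBANCE

def run_mpc(x0, steps=50, horizon=10):
    x = x0
    for _ in range(steps):
        u = plan(x, horizon)[0]        # re-plan from the measured state, apply first action
        x = plant_step(x, u)
    return x

def run_open_loop(x0, steps=50):
    x = x0
    for u in plan(x0, steps):          # plan once, then execute blindly
        x = plant_step(x, u)
    return x

print("MPC final state:      ", round(run_mpc(1.0), 3))
print("open-loop final state:", round(run_open_loop(1.0), 3))
```

With these numbers the open-loop plan drifts away from the origin because the model error and disturbance accumulate, while the receding-horizon controller stays close to it. A small steady-state offset remains, since nothing in this sketch estimates the disturbance.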

Recent work combines MPC with learned dynamics models (neural MPC) or uses warm-starting from the previous solution to reduce solve time, enabling real-time operation at 100 Hz or higher.

Whole-Body Motion Planning

Humanoids and mobile manipulators require whole-body motion planning: simultaneously optimising base locomotion, arm motion, and end-effector goals. Whole-body control (WBC) formulates this as a hierarchical quadratic program (QP) that satisfies multiple tasks (Cartesian goals, balance, joint limits) in priority order. Higher-priority tasks strictly constrain lower-priority ones: a lower-priority task is optimised only among the solutions that leave every higher-priority task unchanged.
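As an illustration of strict priorities, consider two tasks at the velocity level. The notation is assumed for this sketch: \(J_1, J_2\) are the task Jacobians, \(\dot{x}_1, \dot{x}_2\) the desired task velocities, and \(\dot{q}\) the joint velocities. The hierarchy solves one least-squares problem per level:

\[
\begin{aligned}
\dot{q}_1^* &= \arg\min_{\dot{q}} \; \| J_1 \dot{q} - \dot{x}_1 \|^2 \\
\dot{q}_2^* &= \arg\min_{\dot{q}} \; \| J_2 \dot{q} - \dot{x}_2 \|^2 \quad \text{s.t.} \quad J_1 \dot{q} = J_1 \dot{q}_1^*
\end{aligned}
\]

Without inequality constraints, this has the classical null-space solution, where \(J^{+}\) denotes the pseudoinverse:

\[
\dot{q}^* = J_1^{+} \dot{x}_1 + N_1 (J_2 N_1)^{+} \left( \dot{x}_2 - J_2 J_1^{+} \dot{x}_1 \right), \qquad N_1 = I - J_1^{+} J_1
\]

The projector \(N_1\) restricts the second task to motions that do not disturb the first. Inequality constraints such as joint limits have no closed form of this kind, which is one reason the problem is posed as a sequence of QPs in practice.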

References

Toussaint, M. (2015). Logic-geometric programming: An optimization-based approach to combined task and motion planning. IJCAI 2015, 1930–1936.
Mayne, D. Q., et al. (2000). Constrained model predictive control: Stability and optimality. Automatica, 36(6), 789–814.
Tassa, Y., Erez, T., & Todorov, E. (2012). Synthesis and stabilization of complex behaviors through online trajectory optimization. IROS 2012.
Jacobson, D. H., & Mayne, D. Q. (1970). Differential Dynamic Programming. Elsevier.
Sentis, L., & Khatib, O. (2005). Synthesis of whole-body behaviors through hierarchical control of behavioral primitives. IJHR, 2(4), 505–518.
Posa, M., Cantu, C., & Tedrake, R. (2014). A direct method for trajectory optimization of rigid body dynamical systems with contact. IJRR, 33(1), 69–81.


Alessio Borgi
