Preprint

DexCompose: Reusing Dexterous Policies for Multi-Task Manipulation with a Single Hand

Dihong Huang¹, Zhenyu Wei¹, Zhuxiu Xu¹, Yunchao Yao¹, Sikai Li¹, Mingyu Ding¹
¹ University of North Carolina at Chapel Hill

[Figure: DexCompose composes dexterous skills through role-aware finger ownership]

DexCompose separates grasp preservation from downstream interaction through explicit finger ownership and dual residual stabilizers, reducing destructive interference when two pretrained full-hand policies share one dexterous hand.

Abstract

Dexterous manipulation policies can solve individual skills, but composing them to perform multiple tasks with a single hand remains difficult. Different tasks often compete over the same action dimensions, causing destructive interference between preserving an existing manipulation outcome and executing a new one. We propose DexCompose, a role-aware residual composition framework that enables the reuse of pretrained dexterous policies through explicit finger-level action ownership.

Given two pretrained full-hand policies, DexCompose first collects successful post-task states from the first skill and performs release tests over candidate finger masks to estimate which fingers are necessary for maintaining the established held state. It then trains a bounded residual stabilizer for task preservation and a context-aware residual that adapts the frozen downstream policy only within the action subspace assigned to the new task.

We evaluate the framework on 16 composite dexterous manipulation tasks spanning four object-retention skills and four downstream interactions. Our method achieves 77.4% average composite success, and ablation studies confirm that structural action ownership combined with dual asymmetric residuals is more effective than conventional policy chaining or unmasked residual correction.

Method

DexCompose treats policy composition as an action allocation problem at the embodiment level. The framework discovers which fingers must preserve Task A, releases redundant fingers for Task B, and composes the two frozen policies using residual modules that are restricted to their assigned action subspaces.

[Figure: DexCompose pipeline: finger attribution, action masks, and dual residual stabilizers]
Given two frozen single-task policies, DexCompose attributes fingers through release tests, expands the selected finger mask into action-space ownership masks, and executes the composed policy with a Task-A preservation residual and a Task-B adaptation residual.

Finger Attribution

Successful Task-A hold states are replayed while candidate finger subsets are released. Retention and clean-release diagnostics identify which fingers can be reassigned without losing the held object.
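
The release-test idea can be illustrated with a minimal sketch. It assumes a five-finger naming scheme, a hypothetical `retained_after_release` callback standing in for a simulator replay, an assumed retention threshold (`min_retention`), and a "largest releasable subset" selection rule. None of these details are given in the text above, and the clean-release diagnostic is omitted.

```python
from itertools import combinations

# Assumed finger names; the source does not specify a naming scheme.
FINGERS = ("thumb", "index", "middle", "ring", "little")


def retention_rate(hold_states, released, retained_after_release):
    """Fraction of replayed hold states that keep the object after release."""
    outcomes = [retained_after_release(state, released) for state in hold_states]
    return sum(outcomes) / len(outcomes)


def select_released_fingers(hold_states, retained_after_release, min_retention=0.9):
    """Return the largest finger subset whose release keeps retention high.

    min_retention is an assumed threshold, not a value from the source.
    """
    best = frozenset()
    # Release at most four of five fingers so something still holds the object.
    for k in range(1, len(FINGERS)):
        for subset in combinations(FINGERS, k):
            released = frozenset(subset)
            rate = retention_rate(hold_states, released, retained_after_release)
            if rate >= min_retention and len(released) > len(best):
                best = released
    return best


def toy_release_test(state, released):
    """Hypothetical stand-in for a replay: a pinch grasp needs thumb and index."""
    return "thumb" not in released and "index" not in released


hold_states = list(range(8))  # placeholder for saved Task-A hold states
released = select_released_fingers(hold_states, toy_release_test)
preserved = frozenset(FINGERS) - released
print(sorted(released), sorted(preserved))
```

In this toy setting the preserved set is the thumb and index finger, and the remaining three fingers become available to Task B.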

Action Ownership

The selected finger mask defines disjoint action subspaces: Task A controls preserved fingers, while Task B controls the wrist and released fingers.
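
As a rough illustration of how a finger mask might expand into disjoint action-space masks, the sketch below uses a hypothetical 12-dimensional action layout (two dimensions for the wrist and two per finger). This layout is an assumption made for readability and is not the Shadow Hand's actual joint ordering.

```python
# Hypothetical action layout, chosen only for illustration.
ACTION_LAYOUT = {
    "wrist": [0, 1],
    "thumb": [2, 3],
    "index": [4, 5],
    "middle": [6, 7],
    "ring": [8, 9],
    "little": [10, 11],
}
ACTION_DIM = 12


def ownership_masks(preserved_fingers):
    """Task A owns the preserved fingers; Task B owns the wrist and the rest."""
    mask_a = [0] * ACTION_DIM
    for finger in preserved_fingers:
        for idx in ACTION_LAYOUT[finger]:
            mask_a[idx] = 1
    mask_b = [1 - m for m in mask_a]  # complement, so the masks are disjoint
    return mask_a, mask_b


def compose(action_a, action_b, mask_a):
    """Take each action dimension from whichever task owns it."""
    return [a if m else b for a, b, m in zip(action_a, action_b, mask_a)]


mask_a, mask_b = ownership_masks({"thumb", "index"})
composed = compose([1.0] * ACTION_DIM, [-1.0] * ACTION_DIM, mask_a)
print(mask_a)
print(composed)
```

Because the two masks are complements, every action dimension has exactly one owner, which is the property the ownership step relies on.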

Dual Residual Stabilizer

A bounded Task-A residual preserves the held state, and a Task-B residual adapts the downstream policy under the constraints imposed by the maintained grasp.
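
One plausible way to combine the two residuals with the ownership mask is sketched below. The clipping bound `bound_a`, the additive form, and the choice to clip only the Task-A residual are assumptions; the text states that the Task-A residual is bounded but does not say how, or whether the Task-B residual is bounded as well.

```python
def clip(value, bound):
    """Clamp value to the interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def compose_with_residuals(base_a, base_b, res_a, res_b, mask_a, bound_a=0.1):
    """Compose two frozen base actions with masked residual corrections.

    Dimensions owned by Task A use base_a plus a clipped residual.
    All other dimensions use base_b plus the Task-B residual.
    bound_a is a hypothetical bound, not a value from the source.
    """
    action = []
    for a, b, ra, rb, m in zip(base_a, base_b, res_a, res_b, mask_a):
        if m:
            action.append(a + clip(ra, bound_a))
        else:
            action.append(b + rb)
    return action


mask_a = [1, 1, 0, 0]
action = compose_with_residuals(
    base_a=[0.5, 0.5, 0.0, 0.0],
    base_b=[0.0, 0.0, -0.2, 0.3],
    res_a=[0.5, -0.05, 9.0, 9.0],  # the last two entries are masked out
    res_b=[9.0, 9.0, 0.25, -0.5],  # the first two entries are masked out
    mask_a=mask_a,
)
print(action)
```

Each residual only affects the dimensions its task owns, so a large Task-B correction cannot disturb the fingers that maintain the grasp.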

Video Demos

Demos are grouped by the behavior they illustrate: competent single-task base policies, release-test diagnostics for finger ownership, successful composite rollouts across all task pairs, and representative failure or ablation cases.

Primitive Skills

Base Policy Primitives

The primitive policies solve their own tasks before composition, so the challenge is controlled reuse under one shared hand action space.

Retention: GraspBall, PickStick, PickCan, PourMug

Downstream: OpenDoor, PushButton, OpenMicrowave, TurnOnSwitch

Results

Composite tasks: 16
Mean success: 77.4%
Margin over strongest baseline: +15.8 pp
Rollouts per pair: 32 x 8

Experiments are conducted in Isaac Lab with a Shadow Hand. Task A contains four object-retention skills: GraspBall, PourMug, PickCan, and PickStick. Task B contains four downstream interactions: OpenDoor, PushButton, OpenMicrowave, and TurnOnSwitch. A rollout succeeds only when the Task-A object remains retained and Task B is completed.
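
The task grid and the success criterion described above can be written down compactly. In this sketch, the `(retained, completed)` rollout record and the helper names are hypothetical; only the task names and the conjunction rule come from the text.

```python
from itertools import product

TASK_A = ["GraspBall", "PourMug", "PickCan", "PickStick"]
TASK_B = ["OpenDoor", "PushButton", "OpenMicrowave", "TurnOnSwitch"]

# Every retention skill is paired with every downstream interaction.
PAIRS = list(product(TASK_A, TASK_B))


def composite_success(retained, completed):
    """A rollout counts only if the object is kept and Task B finishes."""
    return retained and completed


def success_rate(rollouts):
    """rollouts: list of (retained, completed) booleans for one task pair."""
    wins = sum(composite_success(r, c) for r, c in rollouts)
    return wins / len(rollouts)


example = [(True, True), (True, False), (False, True), (True, True)]
print(len(PAIRS), success_rate(example))
```

Under this rule, completing Task B after dropping the object counts as a failure, as does holding the object without finishing Task B.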

Method | Mean Composite Success (%) | Role in Comparison
Frozen Grasp | 3.09 +/- 0.72 | Sequential execution with frozen grasp-maintaining fingers
Decomposed Action Space | 45.61 +/- 1.94 | Fixed subspace split without release-test validation
Residual Learning | 61.60 +/- 2.31 | Unmasked residual correction over the combined policy
Ours-ZS | 69.23 +/- 1.60 | Finger allocation and Task-A stabilizer without Task-B residual
DexCompose | 77.43 +/- 2.64 | Finger attribution, action ownership, and dual residual stabilization

[Figure: Base-policy preservation analysis for DexCompose]
DexCompose preserves both sides of the composed behavior, achieving A-side and B-side preservation ratios of 0.811 and 0.893 while maintaining the highest composite success.
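
The headline margin over the strongest baseline follows directly from the reported means. The short check below treats Frozen Grasp, Decomposed Action Space, and Residual Learning as the baselines and Ours-ZS as a variant of the method, which is an interpretation of the comparison rather than something stated explicitly.

```python
# Mean composite success (%) as reported in the results.
baselines = {
    "Frozen Grasp": 3.09,
    "Decomposed Action Space": 45.61,
    "Residual Learning": 61.60,
}
dexcompose = 77.43

strongest = max(baselines, key=baselines.get)
margin_pp = round(dexcompose - baselines[strongest], 2)
print(strongest, margin_pp)
```

The result, rounded to one decimal place, matches the reported margin of +15.8 percentage points.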

Ablations and Diagnostics

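Component ablations show that the Task-A residual is essential for preserving the first manipulation outcome: removing it drops mean success from 77.43% to 9.13%. Finger allocation and action masking reduce cross-task interference (54.52% and 59.13% without them, respectively), while the Task-B residual and the transition stage further improve execution robustness (69.23% and 69.02% without them).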

Variant | Mean Success (%) | Interpretation
DexCompose | 77.43 +/- 2.64 | Full method
Ours-ZS | 69.23 +/- 1.60 | Removes Task-B residual
-Finger | 54.52 +/- 2.09 | Removes finger allocation
-Task-A | 9.13 +/- 1.29 | Removes Task-A residual stabilizer
-Mask | 59.13 +/- 1.75 | Removes action masking
-Trans. | 69.02 +/- 2.07 | Removes transition stage

[Figure: Failure-mode breakdown for DexCompose across task combinations]
Failure modes vary by downstream interaction. Transition failures dominate difficult combinations, TurnOnSwitch causes more preservation failures, and OpenMicrowave more often fails from limited downstream progress under constrained grasp configurations.
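
To make the relative importance of each component easier to read, the reported ablation means can be turned into drops relative to the full method. This is only arithmetic on the reported means. It ignores the reported standard deviations, so small gaps such as the one between Ours-ZS and -Trans. should not be over-interpreted.

```python
FULL = 77.43  # DexCompose mean success (%)

# Mean success (%) with one component removed, as reported above.
ablations = {
    "Ours-ZS": 69.23,
    "-Finger": 54.52,
    "-Task-A": 9.13,
    "-Mask": 59.13,
    "-Trans.": 69.02,
}

# Drop in percentage points when each component is removed.
drops = {name: round(FULL - score, 2) for name, score in ablations.items()}
ranked = sorted(drops, key=drops.get, reverse=True)

for name in ranked:
    print(f"{name}: -{drops[name]:.2f} pp")
```

Sorted this way, the Task-A residual stabilizer accounts for the largest drop, followed by finger allocation and action masking.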

BibTeX

@misc{huang2026dexcompose,
  title={DexCompose: Reusing Dexterous Policies for Multi-Task Manipulation with a Single Hand},
  author={Huang, Dihong and Wei, Zhenyu and Xu, Zhuxiu and Yao, Yunchao and Li, Sikai and Ding, Mingyu},
  year={2026},
  note={Preprint},
  url={https://dexcompose.github.io/DexCompose-Page/}
}