Fixing the corner case when the optimizer has no trainable parameters
Summary:
We made the following changes:
1. We have fixed the corner case where the optimizer has no trainable parameters. This can happen when there is more than one optimizer and some of them are frozen during fine-tuning (see the sketch after this list).
2. We have changed the "closure" logic in the "step" function of "ddpoptimizer.py" to make it consistent with "optimizer.py" (see the usage sketch after that file's diff below).
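
As a rough illustration of change 1 (not part of the commit; the module and optimizer names below are hypothetical), the sketch builds a fine-tuning setup in which one of two optimizers only covers frozen parameters. When such an optimizer is wrapped by Opacus, no per-sample gradients are collected for it, so its grad_samples list is empty; with this fix, pre_step returns True and the DPOptimizer steps like a plain optimizer instead of failing.

    # Sketch only: two optimizers, one of which has no trainable parameters.
    import torch
    from torch import nn

    backbone = nn.Linear(16, 16)   # frozen during fine-tuning
    head = nn.Linear(16, 2)        # trainable
    for p in backbone.parameters():
        p.requires_grad = False

    # The backbone optimizer holds only frozen parameters, so Opacus
    # records no per-sample gradients for it; after this fix, its
    # DP-wrapped step() no longer breaks on the empty grad_samples list.
    opt_backbone = torch.optim.SGD(backbone.parameters(), lr=0.1)
    opt_head = torch.optim.SGD(head.parameters(), lr=0.1)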

Differential Revision: D53055273
HuanyuZhang authored and facebook-github-bot committed Jan 24, 2024
1 parent d0290d7 commit a0fee30
Showing 2 changed files with 10 additions and 1 deletion.
opacus/optimizers/ddpoptimizer.py (5 additions, 1 deletion)
@@ -70,8 +70,12 @@ def reduce_gradients(self):
     def step(
         self, closure: Optional[Callable[[], float]] = None
     ) -> Optional[torch.Tensor]:
+        if closure is not None:
+            with torch.enable_grad():
+                closure()
+
         if self.pre_step():
             self.reduce_gradients()
-            return self.original_optimizer.step(closure)
+            return self.original_optimizer.step()
         else:
             return None
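
For change 2, the closure passed to DistributedDPOptimizer.step is now evaluated under torch.enable_grad() before pre_step, and the wrapped optimizer's step() is called without the closure, matching DPOptimizer.step in optimizer.py. A minimal usage sketch follows (not from the commit; the model, criterion, and data names are hypothetical, and a plain SGD optimizer stands in for a DP-wrapped one):

    # Sketch only: closure-style stepping with a hypothetical model and data.
    import torch
    from torch import nn

    model = nn.Linear(4, 2)
    criterion = nn.CrossEntropyLoss()
    inputs = torch.randn(8, 4)
    targets = torch.randint(0, 2, (8,))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # stands in for a DP-wrapped optimizer

    # The closure recomputes the loss and gradients. With the DP-wrapped
    # optimizer, step() now runs the closure once under torch.enable_grad()
    # before pre_step and does not forward it to the wrapped optimizer.
    def closure():
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        return loss

    optimizer.step(closure)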
opacus/optimizers/optimizer.py (5 additions, 0 deletions)
@@ -491,6 +491,11 @@ def pre_step(
             closure: A closure that reevaluates the model and
                 returns the loss. Optional for most optimizers.
         """
+        # The corner case when the optimizer has no trainable parameters.
+        # Essentially the DPOptimizer acts as a normal optimizer.
+        if self.grad_samples is None or len(self.grad_samples) == 0:
+            return True
+
         self.clip_and_accumulate()
         if self._check_skip_next_step():
             self._is_last_step_skipped = True
