Consider lowering the clock frequency (while keeping VCC constant for now). What does this achieve? A fixed task takes longer to complete, so a processor that halts at the end of the task spends less time halted. Each active cycle still costs the same dynamic energy, so the work itself is no cheaper. But if the main clock tree keeps running while the processor is halted, and most of the chip uses local clock gating, then we do save some power, because the main clock tree executes fewer useless clock cycles.
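To make the saving concrete, here is a minimal sketch of the energy spent in one period at constant VCC. It assumes a simplified model that is not from the text: the task needs a fixed number of cycles, the locally gated logic costs `e_logic_j` joules per active cycle, and the ungated main clock tree costs `e_tree_j` joules on every cycle, halted or not. All names and numbers are illustrative.

```python
def halted_time_s(f_hz, period_s, task_cycles):
    """Time spent halted in one period, after the task has finished."""
    run_time_s = task_cycles / f_hz
    if run_time_s > period_s:
        raise ValueError("clock too slow to finish the task within the period")
    return period_s - run_time_s


def energy_per_period_j(f_hz, period_s, task_cycles, e_logic_j, e_tree_j):
    """Energy in one period under the assumed model (VCC held constant).

    The gated logic pays only for the cycles the task needs; the main
    clock tree pays for every cycle in the period, including halted ones.
    """
    halted_time_s(f_hz, period_s, task_cycles)  # rejects clocks that are too slow
    logic_energy_j = task_cycles * e_logic_j
    tree_energy_j = f_hz * period_s * e_tree_j
    return logic_energy_j + tree_energy_j


if __name__ == "__main__":
    # Hypothetical figures: 50 million cycles of work every second.
    for f_hz in (100e6, 50e6):
        halted = halted_time_s(f_hz, 1.0, 50e6)
        energy = energy_per_period_j(f_hz, 1.0, 50e6, 1e-9, 1e-10)
        print(f"{f_hz / 1e6:.0f} MHz: halted {halted:.2f} s, energy {energy:.4f} J")
```

Under these assumptions the logic term does not depend on frequency at all; only the clock-tree term shrinks as the clock slows, which is why the saving from frequency scaling alone is modest.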
This sort of frequency scaling can be software controlled: the software updates the PLL division ratio. The PLL has inertia (it may take, for example, 1 millisecond to settle at the new frequency), but this is similar to the timescale on which an operating system services interrupts, so the clock frequency of a system can be ramped up as load arrives. This is how most laptops now work.
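A software policy of this kind can be sketched as follows. This is only an illustration under stated assumptions, not how any particular operating system does it: the clock is modelled as a hypothetical maximum frequency divided by an integer divider, the policy runs once per timer tick (so no more often than the PLL can settle), and the thresholds and the name `choose_divider` are invented for the example.

```python
def choose_divider(utilisation, current_divider,
                   min_divider=1, max_divider=8,
                   high=0.8, low=0.3):
    """Pick the next PLL divider from the fraction of time spent busy.

    A smaller divider means a faster clock. The divider moves by one
    step per call, so the frequency ramps rather than jumps.
    """
    if utilisation > high and current_divider > min_divider:
        return current_divider - 1  # load has arrived: speed up
    if utilisation < low and current_divider < max_divider:
        return current_divider + 1  # mostly idle: slow down
    return current_divider          # within the band: leave the PLL alone


if __name__ == "__main__":
    F_MAX_HZ = 2.0e9  # hypothetical undivided clock frequency
    divider = 8
    # Hypothetical utilisation samples, one per timer tick.
    for utilisation in (0.1, 0.9, 0.95, 0.9, 0.5, 0.2):
        divider = choose_divider(utilisation, divider)
        print(f"utilisation {utilisation:.2f} -> divider {divider}, "
              f"clock {F_MAX_HZ / divider / 1e6:.0f} MHz")
```

The gap between the `low` and `high` thresholds gives some hysteresis, so the divider is not rewritten on every tick when the load hovers around a single threshold.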
Let's compare this with dynamic clock gating. The table shows the main differences, but the most important difference is still to come: once we have reduced the clock frequency, we can also reduce the supply voltage.