Artem Oppermann Updated on April 29, 2025
Double deep Q-learning reduces overestimated action values in deep Q-learning by splitting the max operation in the target into separate action selection and action evaluation steps, resulting in more stable and accurate learning.
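To make the split concrete, here is a minimal sketch of the two target computations for a single transition. It assumes an online network and a target network whose next-state Q-values are already available as NumPy arrays. The function names, the discount factor, and the example numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard deep Q-learning target: one max both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * float(np.max(next_q_target))

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double deep Q-learning target: selection and evaluation are separated."""
    if done:
        return reward
    # Action selection: the online network picks the greedy action.
    best_action = int(np.argmax(next_q_online))
    # Action evaluation: the target network scores that chosen action.
    return reward + gamma * float(next_q_target[best_action])

# Hypothetical Q-value estimates for the next state (three actions).
next_q_online = np.array([1.0, 2.0, 1.5])
next_q_target = np.array([1.2, 1.1, 3.0])

standard = dqn_target(1.0, next_q_target, gamma=0.9)
double = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.9)
print(standard, double)
```

In this made-up example the target network assigns a high value to the third action, and the standard target takes it at face value. The double variant evaluates only the action the online network prefers, so a single inflated estimate is less likely to propagate into the target.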