Computing perceptual differences between scene colors is key in many computer vision applications such as image segmentation or visual salient region detection. Nevertheless, most of the time we only have access to the rendered image colors, with no means of recovering the true scene colors. The main existing approaches propose either to compute a perceptual distance between the rendered image colors, or to estimate the scene colors from the rendered image colors and then evaluate perceptual distances. However, the first approach yields distances that can be far from the true scene color differences, while the second requires knowledge of the acquisition conditions, which is unavailable in most applications. In this paper, we design a new local Mahalanobis-like metric learning algorithm that approximates a perceptual scene color difference that is invariant to the acquisition conditions and is computed only from rendered image colors. Using the theoretical framework of uniform stability, we provide consistency guarantees on the learned model. Moreover, our experimental evaluation demonstrates its ability to (i) generalize to new colors and devices and (ii) deal with segmentation tasks.
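For intuition only, and using generic notation rather than the paper's own, a Mahalanobis-like distance between two rendered image colors $\mathbf{x}$ and $\mathbf{y}$ is parameterized by a positive semi-definite matrix $\mathbf{M}$:
$$d_{\mathbf{M}}(\mathbf{x}, \mathbf{y}) \;=\; \sqrt{(\mathbf{x} - \mathbf{y})^{\top} \mathbf{M} \,(\mathbf{x} - \mathbf{y})},$$
and a local variant learns a separate matrix per region of color space; here such matrices would be fitted so that the resulting distance approximates the perceptual scene color difference from rendered colors alone.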