Images are typically encoded in the RGB color space, but other color spaces have been developed for different purposes.
We want to predict a colorized picture from a black & white version. Some color spaces have a channel that corresponds directly to black & white, which is convenient: we then only have to predict the two remaining channels.
To define our neural network and loss, we would like a color space where the distance between two colors closely matches visual perception. The CIELAB color space was developed for this purpose, and its L channel corresponds directly to the black & white version.
Note: more recent color spaces such as CIECAM02 have been developed to further improve the correlation with visual perception. However, we currently rely on Pillow, which does not handle these color space transformations.
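For reference, here is a minimal sketch of how the RGB to LAB conversion can be done with Pillow's ImageCms module; the RGBToLAB transform used below is assumed to wrap something along these lines.

from PIL import Image, ImageCms

def rgb_to_lab(img: Image.Image) -> Image.Image:
    "Sketch of an RGB -> LAB conversion using an ICC transform"
    srgb = ImageCms.createProfile('sRGB')
    lab = ImageCms.createProfile('LAB')
    rgb2lab = ImageCms.buildTransformFromOpenProfiles(srgb, lab, 'RGB', 'LAB')
    return ImageCms.applyTransform(img, rgb2lab)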
The A & B channels are stored in the [0, 255] range, with 0 as the neutral value. However, a few tests show that 127 and 128 are the two extrema, so we use AdjustType to recast the values to the [-128, 127] range with those two values as the extrema.
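Here is a minimal sketch of what such a recast could look like (the actual AdjustType transform is assumed to behave similarly): reinterpreting the raw bytes as two's-complement signed integers maps 127 to the maximum and 128 to the minimum, exactly as described above.

import torch

def adjust_type_sketch(ab: torch.Tensor) -> torch.Tensor:
    "Hypothetical stand-in for AdjustType: reinterpret uint8 bytes as int8"
    # 0..127 stay as-is, 128..255 wrap around to -128..-1
    return ab.view(torch.int8)

adjust_type_sketch(torch.tensor([0, 127, 128, 255], dtype=torch.uint8))
# -> tensor([   0,  127, -128,   -1], dtype=torch.int8)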
# Test the transforms
path = untar_data(URLs.IMAGENETTE_160)
items = get_image_files(path)
# Item pipeline: load, resize, convert to LAB, then split into (L, AB)
tls = TfmdLists(items, tfms=[PILImage.create, Resize(128), RGBToLAB(), ToTensor(), Split_L_AB()])
tensor_l, tensor_ab = tuple_l_ab = tls[0]
tensor_l.show(title='Input'), tensor_ab.show(title='Targets')
# Decoding reverses the pipeline to display the original RGB image
tls.decode(tuple_l_ab).show(title='Fully decoded');
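For context, a transform like Split_L_AB could be written along these lines. This is only a sketch assuming the item is a (3, H, W) LAB tensor, not the exact definition used above; the real transform also needs a decodes method to reassemble the channels for display.

from fastai.vision.all import Transform, TensorImage, TensorImageBW

class SplitLABSketch(Transform):
    "Hypothetical stand-in for Split_L_AB: split a LAB tensor into (L, AB)"
    def encodes(self, x: TensorImage):
        # first channel is the black & white input, the other two are the targets
        return TensorImageBW(x[:1]), TensorImage(x[1:])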
set_seed(13)
# Batch transforms: recast A & B bytes, convert ints to floats, then normalize
dls = tls.dataloaders(after_batch=[AdjustType(), IntToFloatTensor(), Normalize.from_stats(mean=[0.5], std=[1])])
b = dls.one_batch()
# Sanity-check the value ranges of inputs and targets after the batch transforms
print(f'L channel (input) going from {b[0].min():.2f} to {b[0].max():.2f}')
print(f'A channel (output) going from {b[1][:,0].min():.2f} to {b[1][:,0].max():.2f}')
print(f'B channel (output) going from {b[1][:,1].min():.2f} to {b[1][:,1].max():.2f}')
dls.show_batch()