Transforms and utilities related to color spaces

Images are typically encoded in the RGB color space. However, other color spaces have been developed for different purposes.

We want to predict a colorized picture from its black & white version. Some color spaces have one channel that corresponds directly to the black & white image, which is convenient as we then only have to predict the remaining 2 channels.

To define our neural network and loss, we would like to use a color space where the difference between two colors is more closely related to visual perception. The CIELAB color space was developed for this purpose, and its L channel corresponds directly to the black & white version.

Note: more recent color spaces such as CIECAM02 have been developed to further improve the correlation with visual perception. However, we currently rely on Pillow, which does not handle these color space transformations.
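As an illustration (not part of this module's API), the underlying RGB to LAB conversion can be done with Pillow through ImageCms; the file path below is just a placeholder:

from PIL import Image, ImageCms

# Build an sRGB -> CIELAB transform with Pillow's ImageCms (LittleCMS)
rgb2lab = ImageCms.buildTransformFromOpenProfiles(
    ImageCms.createProfile('sRGB'), ImageCms.createProfile('LAB'), 'RGB', 'LAB')

img_rgb = Image.open('path/to/image.jpg').convert('RGB')  # placeholder path
img_lab = ImageCms.applyTransform(img_rgb, rgb2lab)       # 'LAB' mode image with 3 uint8 channels
l, a, b = img_lab.split()  # L is the black & white version; A & B carry the color information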

class PILImageLAB[source]

PILImageLAB() :: PILImage

PILImage in the LAB space

class RGBToLAB[source]

RGBToLAB(enc=None, dec=None, split_idx=None, order=None) :: Transform

Convert a PILImage from RGB to LAB space
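As a rough sketch of how such a transform could be built on top of fastai's Transform and Pillow's ImageCms (the class name below is hypothetical and the actual implementation may differ):

from PIL import ImageCms
from fastai.vision.all import Transform, PILImage

_rgb2lab = ImageCms.buildTransformFromOpenProfiles(
    ImageCms.createProfile('sRGB'), ImageCms.createProfile('LAB'), 'RGB', 'LAB')

class RGBToLABSketch(Transform):
    "Convert a PILImage from RGB to LAB space (sketch)"
    def encodes(self, img: PILImage):
        lab = ImageCms.applyTransform(img.convert('RGB'), _rgb2lab)
        return PILImageLAB(lab)  # cast to the LAB image type so later transforms dispatch on it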

class TensorLAB[source]

TensorLAB(x, **kwargs) :: TensorImage

Tensor for images in the LAB space

decodes[source]

decodes(x:TensorAB)

class TensorL[source]

TensorL(x, **kwargs) :: TensorLAB

Tensor containing the L channel of an image

class TensorAB[source]

TensorAB(x, **kwargs) :: TensorLAB

Tensor containing A & B channels of an image
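These tensor types mainly exist so fastai's type dispatch can route batch transforms to the right channels; minimal versions of such subclasses could look like the sketch below (the real classes may add extra display logic, and the names here are hypothetical):

from fastai.vision.all import TensorImage

class TensorLABSketch(TensorImage): pass     # full 3-channel LAB tensor
class TensorLSketch(TensorLABSketch): pass   # L channel only (model input)
class TensorABSketch(TensorLABSketch): pass  # A & B channels (model target)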

class Tuple_L_AB[source]

Tuple_L_AB(x=None, *rest) :: Tuple

Tuple with L channel and A & B channels

class Split_L_AB[source]

Split_L_AB(enc=None, dec=None, split_idx=None, order=None) :: ItemTransform

Split TensorLAB into TensorL (input) & TensorAB (output)
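To illustrate what this transform does, here is a rough sketch assuming a channels-first (3, h, w) TensorLAB (the class name is hypothetical and the actual implementation may differ):

import torch
from fastai.vision.all import ItemTransform

class Split_L_AB_Sketch(ItemTransform):
    "Split a TensorLAB into TensorL (input) & TensorAB (output) (sketch)"
    def encodes(self, x):
        return Tuple_L_AB(TensorL(x[:1]), TensorAB(x[1:]))  # first channel vs remaining two
    def decodes(self, x):
        l, ab = x
        return TensorLAB(torch.cat([l, ab], dim=0))  # rebuild the full LAB tensor for display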

class AdjustType[source]

AdjustType(enc=None, dec=None, split_idx=None, order=None) :: Transform

Cast A & B channels to correct type to have continuous values

The A & B channels are stored in the [0, 255] range, 0 being the neutral value.

However, a few tests show that 127 and 128 are the two extrema, so we use AdjustType to recast the values to the [-128, 127] range, in which 127 remains the maximum and 128 becomes the minimum (-128).
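As a quick standalone illustration of that recast (not the actual AdjustType code):

import torch

ab_uint8 = torch.tensor([0, 1, 127, 128, 255], dtype=torch.uint8)
ab_signed = ab_uint8.to(torch.int16)  # widen first so the shift below doesn't overflow
ab_signed[ab_signed > 127] -= 256     # 128..255 wrap around to -128..-1
print(ab_signed)                      # tensor([   0,    1,  127, -128,   -1], dtype=torch.int16)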

encodes[source]

encodes(x:TensorAB)

decodes[source]

decodes(x:TensorAB)

decodes[source]

decodes(x:TensorAB)

encodes[source]

encodes(x:TensorAB)

decodes[source]

decodes(x:TensorAB)

We normalize only the input channel TensorL, since the output TensorAB is already in the [-0.5, 0.5] range.

# test Transforms
from fastai.vision.all import *  # assumed import; the transforms above come from this module

path = untar_data(URLs.IMAGENETTE_160)
items = get_image_files(path)

# Item transforms: open, resize, convert to LAB, tensorize, then split L from A & B
tls = TfmdLists(items, tfms=[PILImage.create, Resize(128), RGBToLAB(), ToTensor(), Split_L_AB()])
tensor_l, tensor_ab = tuple_l_ab = tls[0]
tensor_l.show(title='Input'), tensor_ab.show(title='Targets')
tls.decode(tuple_l_ab).show(title='Fully decoded');

set_seed(13)
# Batch transforms: recast A & B, scale to floats, then normalize the L input only
dls = tls.dataloaders(after_batch=[AdjustType(), IntToFloatTensor(), Normalize.from_stats(mean=[0.5], std=[1])])
b = dls.one_batch()
print(f'L channel (input) going from {b[0].min():.2f} to {b[0].max():.2f}')
print(f'A channel (output) going from {b[1][:,0].min():.2f} to {b[1][:,0].max():.2f}')
print(f'B channel (output) going from {b[1][:,1].min():.2f} to {b[1][:,1].max():.2f}')
dls.show_batch()
L channel (input) going from -0.50 to 0.50
A channel (output) going from -0.24 to 0.31
B channel (output) going from -0.32 to 0.36