first code update

mv-lab · Jul 7, 2023 · 2253d71
1 parent 5a631d7
commit 2253d71
Showing 11 changed files with 599 additions and 10 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,10 @@
+# Custom
+
+*.ipynb
+!nilut-multiblend.ipynb
+
+dataset/*.png
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]

diff --git a/README.md b/README.md
@@ -1,29 +1,67 @@
-# NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement
+# [NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement](https://arxiv.org/abs/2306.11920)
+
+[![arXiv](https://img.shields.io/badge/arXiv-Paper-<COLOR>.svg)](https://arxiv.org/abs/2306.11920)
+[<a href=""><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="colab demo"></a>]()
+[<a href="https://www.kaggle.com/code/jesucristo/super-resolution-demo-swin2sr-official/"><img src="https://upload.wikimedia.org/wikipedia/commons/7/7c/Kaggle_logo.png?20140912155123" alt="kaggle demo" width=50></a>]()
 
-[![arXiv](https://img.shields.io/badge/arXiv-Paper-<COLOR>.svg)]()
 
 [Marcos V. Conde](https://scholar.google.com/citations?user=NtB1kjYAAAAJ&hl=en), [Javier Vazquez-Corral](https://scholar.google.com/citations?user=gjnuPMoAAAAJ&hl=en), [Michael S. Brown](https://scholar.google.com/citations?hl=en&user=Gv1QGSMAAAAJ), [Radu Timofte](https://scholar.google.com/citations?user=u3MwH5kAAAAJ&hl=en)
 
 
-**TL;DR** NILUT uses neural representations for controllable photorealistic image enhancement.
+**TL;DR** NILUT uses neural representations for controllable photorealistic image enhancement. 🚀 Demo Tutorial and pretrained models available.
+
+
+<img src="media/nilut-intro.gif" alt="NILUT" width="800"> 
+
+----
+
+3D lookup tables (3D LUTs) are a key component for image enhancement. Modern image signal processors (ISPs) have dedicated support for these as part of the camera rendering pipeline. Cameras typically provide multiple options for picture styles, where each style is usually obtained by applying a unique handcrafted 3D LUT. Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient, as storing multiple 3D LUTs is required. For this reason and other implementation limitations, their use on mobile devices is less popular. 
+
+In this work, we propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network. We show that NILUTs are capable of accurately emulating real 3D LUTs. Moreover, a NILUT can be extended to incorporate multiple styles into a single network with the ability to blend styles implicitly. Our novel approach is memory-efficient, controllable and can complement previous methods, including learned ISPs.
+
 
 **Topics** Image Enhancement, Image Editing, Color Manipulation, Tone Mapping, Presets
 
 ***Website and repo in progress.*** **See also [AISP](https://github.com/mv-lab/AISP)** for image signal processing code and papers.
 
+----
 
-<br>
+**Pre-trained** sample models are available at `models/`. We provide `nilutx3style.pt`, a NILUT that encodes three 3D LUT styles (1, 3, 4) with high accuracy.
 
- <img src="nilut-intro.gif" alt="NILUT" width="800"> 
+**Demo Tutorial** In [nilut-multiblend.ipynb](nilut-multiblend.ipynb) we provide a simple tutorial on how to use NILUT for multi-style image enhancement and blending. The corresponding training code will be released soon.
 
- <br>
+**Dataset** The folder `dataset/` includes 100 images from the Adobe MIT 5K Dataset. The images were processed using professional 3D LUTs on Adobe Lightroom. The structure of the dataset is:
 
-----
+```
+dataset/
+├── 001_blend.png
+├── 001_LUT01.png
+├── 001_LUT02.png
+├── 001_LUT03.png
+├── 001_LUT04.png
+├── 001_LUT05.png
+├── 001_LUT08.png
+├── 001_LUT10.png
+└── 001.png
+...
+```
 
-3D lookup tables (3D LUTs) are a key component for image enhancement. Modern image signal processors (ISPs) have dedicated support for these as part of the camera rendering pipeline. Cameras typically provide multiple options for picture styles, where each style is usually obtained by applying a unique handcrafted 3D LUT. Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient, as storing multiple 3D LUTs is required. For this reason and other implementation limitations, their use on mobile devices is less popular. 
+where `001.png` is the unprocessed input image, `001_LUTXX.png` is the result of applying the corresponding LUT, and `001_blend.png` is the example target for evaluating style blending (in this example, the blend combines styles 1, 3, and 4 with equal weights of 0.33).
+The complete dataset includes 100 images `aaa.png` and their enhanced variants for each 3D LUT.
 
-In this work, we propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network. We show that NILUTs are capable of accurately emulating real 3D LUTs. Moreover, a NILUT can be extended to incorporate multiple styles into a single network with the ability to blend styles implicitly. Our novel approach is memory-efficient, controllable and can complement previous methods, including learned ISPs.
 
 ----
 
- **Contact** marcos.conde[at]uni-wuerzburg.de
+Hope you like it 🤗 If you find this interesting/insightful/inspirational or you use it, do not forget to acknowledge our work:
+
+```
+@article{conde2023nilut,
+  title={NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement},
+  author={Conde, Marcos V and Vazquez-Corral, Javier and Brown, Michael S and Timofte, Radu},
+  journal={arXiv preprint arXiv:2306.11920},
+  year={2023}
+}
+```
+
+**Contact** marcos.conde[at]uni-wuerzburg.de
+
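The naming scheme described in the README's **Dataset** section can be sketched as a small path helper. This is only an illustration: `sample_paths` and its parameters are hypothetical names that do not exist in the repository, and the default LUT ids (1, 3, 4) are simply the styles the README says the sample model encodes.

```python
from pathlib import Path


def sample_paths(root, image_id, lut_ids=(1, 3, 4)):
    """Hypothetical helper: build the file paths for one dataset sample.

    Follows the naming shown in the README: `001.png` (input),
    `001_LUTXX.png` (one output per LUT) and `001_blend.png` (blend target).
    """
    root = Path(root)
    stem = f"{image_id:03d}"  # zero-padded image id, e.g. 1 -> "001"
    return {
        "input": root / f"{stem}.png",
        "luts": [root / f"{stem}_LUT{lut:02d}.png" for lut in lut_ids],
        "blend": root / f"{stem}_blend.png",
    }


paths = sample_paths("dataset", 1)
# Targets in the order the evaluation dataset expects: LUT outputs first, blend last.
targets = paths["luts"] + [paths["blend"]]
```

The `targets` list keeps the single-LUT references ahead of the blend reference, which matches the ordering that the `EvalMultiLUTBlending` docstring asks for.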
diff --git a/dataloader.py b/dataloader.py
@@ -0,0 +1,70 @@
+import torch
+from torch import nn
+from torch.utils.data import DataLoader, Dataset
+import numpy as np
+
+from utils import load_img, np_psnr
+
+
+class EvalMultiLUTBlending (Dataset):
+    """
+    Dataset that loads the input image <inp_img> and the reference target images <list_out_img>.
+    The order of the target images must be: ground-truth 3D LUT outputs (the first <nluts> elements in the list), followed by the ground-truth blending results.
+
+    Each reference is loaded together with its corresponding style vector, which is passed as input to the network.
+    Example:
+
+    test_images = EvalMultiLUTBlending('./DatasetLUTs_100images/001.png', 
+                                 ['./DatasetLUTs_100images/001_LUT01.png', 
+                                  './DatasetLUTs_100images/001_LUT03.png', 
+                                  './DatasetLUTs_100images/001_LUT04.png',
+                                  './DatasetLUTs_100images/001_blend.png'], nluts=3)
+            
+    test_dataloader = DataLoader(test_images, batch_size=1, pin_memory=True, num_workers=0)
+    """
+
+    def __init__(self, inp_img, list_out_img, nluts):
+        super().__init__()
+
+        self.inp_imgs = load_img(inp_img)
+        self.out_imgs = []
+        self.error = []
+        self.shape = self.inp_imgs.shape
+        self.nluts = nluts
+
+        for fout in list_out_img:
+            lut = load_img(fout)
+            assert self.inp_imgs.shape == lut.shape
+            assert (self.inp_imgs.max() <= 1) and (lut.max() <= 1)
+            self.out_imgs.append(lut)
+            self.error.append(np_psnr(self.inp_imgs,lut))
+            del lut
+
+        self.references = len(list_out_img)
+
+    def __len__(self):
+        return self.references
+
+    def __getitem__(self, idx):
+        if idx >= self.references: raise IndexError
+
+        style_vector = np.zeros(self.nluts).astype(np.float32)
+
+        if idx < self.nluts:
+            style_vector[idx] = 1.
+        else:
+            style_vector = np.array([0.33, 0.33, 0.33]).astype(np.float32)  # equal blend weights; assumes nluts == 3
+
+        # Convert images to pytorch tensors
+        img = torch.from_numpy(self.inp_imgs)
+        lut = torch.from_numpy(self.out_imgs[idx])
+
+        img = img.reshape((img.shape[0]*img.shape[1],3)) # [hw, 3]
+        lut = lut.reshape((lut.shape[0]*lut.shape[1],3)) # [hw, 3]
+
+        style_vector    = torch.from_numpy(style_vector)
+        style_vector_re = style_vector.repeat(img.shape[0]).view(img.shape[0],self.nluts)
+
+        img = torch.cat([img,style_vector_re], dim=-1)
+
+        return img, lut, style_vector
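The conditioning done in `__getitem__` above can be illustrated without any dependencies. The following is a minimal sketch using plain Python lists instead of tensors; `style_vector_for` and `condition_pixels` are hypothetical names, and generalising the blend weights to `[0.33] * nluts` is an assumption (the committed code hard-codes three weights).

```python
def style_vector_for(idx, nluts, blend_weight=0.33):
    """Return a one-hot vector for a single-LUT reference (idx < nluts),
    or equal weights for a blending reference (idx >= nluts)."""
    if idx < nluts:
        return [1.0 if i == idx else 0.0 for i in range(nluts)]
    # Assumption: every style gets the same weight in the blend target.
    return [blend_weight] * nluts


def condition_pixels(pixels, style):
    """Append the style vector to every RGB pixel: [hw, 3] -> [hw, 3 + nluts]."""
    return [list(rgb) + list(style) for rgb in pixels]


# Two flattened RGB pixels with values in [0, 1].
pixels = [(0.1, 0.2, 0.3), (0.9, 0.8, 0.7)]

one_hot = style_vector_for(1, nluts=3)  # second style only
blend = style_vector_for(3, nluts=3)    # first index past the single-LUT references
net_input = condition_pixels(pixels, one_hot)
```

Each row of `net_input` has `3 + nluts` values, which corresponds to the `torch.cat([img, style_vector_re], dim=-1)` step in the dataset class.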
diff --git a/dataset/.gitkeep b/dataset/.gitkeep
diff --git a/media/cnilut.png b/media/cnilut.png
diff --git a/media/header.png b/media/header.png
diff --git a/nilut-intro.gif → media/nilut-intro.gif b/nilut-intro.gif → media/nilut-intro.gif
diff --git a/models/nilutx3style.pt b/models/nilutx3style.pt
diff --git a/nilut-multiblend.ipynb b/nilut-multiblend.ipynb
diff --git a/requirements.txt b/requirements.txt
@@ -0,0 +1,10 @@
+imageio==2.30.0
+matplotlib==3.7.1
+numpy==1.24.3
+opencv-python==4.7.0.72
+Pillow==9.4.0
+scikit-image==0.20.0
+scipy==1.10.1
+torch==2.0.1
+torchaudio==2.0.2
+torchvision==0.15.2
diff --git a/utils.py b/utils.py
@@ -0,0 +1,95 @@
+"""
+NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement
+
+Utils for training and plotting
+"""
+
+import torch
+import cv2
+from PIL import Image
+import matplotlib.pyplot as plt
+import numpy as np
+import gc
+import time
+from skimage import io, color
+
+
+# Timing utilities
+
+def start_timer():
+    global start_time
+    gc.collect()
+    torch.cuda.empty_cache()
+    torch.cuda.reset_max_memory_allocated()
+    torch.cuda.synchronize()
+    start_time = time.time()
+
+def end_timer_and_print(local_msg):
+    torch.cuda.synchronize()
+    end_time = time.time()
+    print("\n" + local_msg)
+    print("Total execution time = {:.3f} sec".format(end_time - start_time))
+    print("Max memory used by tensors = {} bytes".format(torch.cuda.max_memory_allocated()))
+
+def clean_mem():
+    gc.collect()
+    torch.cuda.empty_cache()
+    torch.cuda.reset_max_memory_allocated()
+
+# Model
+
+def count_parameters(model):
+    return sum(p.numel() for p in model.parameters() if p.requires_grad)
+
+
+# Load/save and plot images
+
+def load_img (filename, norm=True,):
+
+    img = np.array(Image.open(filename))
+    if norm:   
+        img = img / 255.
+        img = img.astype(np.float32)
+    return img
+
+def save_rgb (img, filename):
+    if np.max(img) <= 1:
+        img = img * 255
+
+    img = img.astype(np.float32)
+    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
+
+    cv2.imwrite(filename, img)
+
+def plot_all (images, figsize=(20,10), axis='off'):
+    fig = plt.figure(figsize=figsize, dpi=80)
+    nplots = len(images)
+    for i in range(nplots):
+        plt.subplot(1,nplots,i+1)
+        plt.axis(axis)
+        plt.imshow(images[i])
+
+    plt.show()
+
+# Metrics
+
+def np_psnr(y_true, y_pred):
+    mse = np.mean((y_true - y_pred) ** 2)
+    if(mse == 0):  return np.inf
+    return 20 * np.log10(1 / np.sqrt(mse))
+
+def pt_psnr (y_true, y_pred):
+    mse = torch.mean((y_true - y_pred) ** 2)
+    return 20 * torch.log10(1 / torch.sqrt(mse))
+
+def deltae_dist (y_true, y_pred):
+    """
+    Calculate the DeltaE distance in the LAB color space.
+    Images must be numpy arrays.
+    """
+
+    gt_lab  = color.rgb2lab((y_true*255).astype('uint8'))
+    out_lab = color.rgb2lab((y_pred*255).astype('uint8'))
+    # Per-pixel Euclidean distance in LAB space, averaged over the image
+    l2_lab  = np.sqrt(((gt_lab - out_lab)**2).sum(axis=-1)).mean()
+    return l2_lab
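The PSNR formula used by `np_psnr` and `pt_psnr`, `20 * log10(1 / sqrt(mse))` for images scaled to [0, 1], can be checked by hand on a tiny example. The sketch below is a dependency-free restatement using only the standard library; the function name `psnr` and the sample values are made up for illustration.

```python
import math


def psnr(y_true, y_pred):
    """PSNR in dB for two equal-length sequences of values in [0, 1]."""
    mse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
    if mse == 0:
        return math.inf  # identical signals, same convention as np_psnr
    return 20 * math.log10(1 / math.sqrt(mse))


reference = [0.0, 0.5, 1.0, 0.25]
prediction = [0.1, 0.4, 0.9, 0.35]

# Every value is off by 0.1, so mse is about 0.01 and the PSNR is about 20 dB.
score = psnr(reference, prediction)
```

Because the peak value is assumed to be 1, inputs must be normalised first, which is what `load_img(..., norm=True)` does for 8-bit images.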