Images of the Russian Empire:
    Colorizing the Prokudin-Gorskii Photo Collection
    
        Yueheng Zeng
        Project 1
    
    Overview
    
        The goal of this project is to colorize the Prokudin-Gorskii photo collection. The collection is a series of
        photographs taken by Russian photographer Sergei Mikhailovich Prokudin-Gorskii in the early 20th century. The
        photographs are in black and white, but they were taken using a special camera that captured three separate
        black and white images, each with a different color filter (red, green, or blue). By combining these three
        images, we can reconstruct the original color image.
    
    
    
        The image on the left is the original black and white image, and the image on the right is the colorized image
        produced by this project.
    
    Table of Contents
    
        - Aligning the Low-resolution Images
        - Aligning the High-resolution Images
        - Automatic Cropping
        - Automatic Contrasting
        - Better Color Mapping
        - All Results
    
    Aligning the Low-resolution Images
    
        The alignment process for low-resolution images involves loading the images, splitting them into separate
        channels, detecting edges, and aligning the channels to create colorized images. The process is as follows:
    
    
        - Loading and Splitting Channels: Load the image and split it into three separate channels: blue (B),
          green (G), and red (R). The channels are stacked vertically in a single image, so the image height is
          divided by three and the image is sliced into three equal parts, one per channel.
        - Edge Detection: Use the Canny edge detector to detect edges in the images. This technique helps focus
          on key features of the image (e.g., edges and transitions) rather than pixel intensity values, which
          can be less effective for alignment because of variations in brightness.
        - Alignment Process:
            - The green (G) and red (R) channels are each aligned to the blue (B) channel using a brute-force
              search over a specified range of pixel displacements (I used a range of -15 to 15 pixels) along
              both the horizontal and vertical axes.
            - A cropping step is included in the alignment to avoid border artifacts. I cut off 10% of the
              image from the edges before calculating the distance.
            - For each possible combination of vertical and horizontal displacements, the function shifts the
              green or red channel using np.roll and calculates the squared Euclidean distance between the
              shifted channel and the reference channel (the blue channel). The displacement with the smallest
              distance is kept as the alignment.
    
    
        The image above is the cathedral image before and after alignment.
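        The splitting and brute-force search described in this section can be sketched in NumPy roughly as
        follows. This is a minimal sketch, not the project's actual code: the function names (split_channels,
        crop_interior, align) are hypothetical, the 10% crop is assumed to apply to each edge, and the inputs
        are assumed to be 2-D arrays. In the pipeline described above the arrays passed to align would be Canny
        edge maps, which are not computed here.

```python
import numpy as np


def split_channels(plate):
    """Split a vertically stacked plate into (B, G, R), top to bottom."""
    h = plate.shape[0] // 3
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]


def crop_interior(img, frac=0.10):
    """Drop `frac` of the image from every edge (assumed meaning of the 10% crop)."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]


def align(channel, reference, search=15, frac=0.10):
    """Return the (dy, dx) shift of `channel` that minimises the squared
    Euclidean distance to `reference`, searching -search..search on both axes."""
    channel = np.asarray(channel, dtype=np.float64)
    ref = crop_interior(np.asarray(reference, dtype=np.float64), frac)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((crop_interior(shifted, frac) - ref) ** 2)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

        With these helpers, b, g, r = split_channels(plate) followed by align(g, b) and align(r, b) would give
        the two shifts to apply with np.roll before stacking the channels into a color image.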
    
    Aligning the High-resolution Images
    
        The alignment process for high-resolution images is similar to the process for low-resolution images, but it
        will take significantly longer to run due to the increased image size. To speed up the process, it is common to
        use a pyramid approach, where the image is progressively downsampled to create a series of smaller images that
        are easier to align. The process is as follows:
    
    
        - Create Pyramid: Generate progressively smaller versions of the image by downscaling the original
          channel. I downscaled the image by a factor of 2 at each level until the smallest image had a height
          of around 400 pixels.
        - Recursive Alignment:
            - Start aligning at the lowest resolution (coarsest level).
            - Recursively align each higher level by first aligning the next coarser level.
            - Scale up the offset from the lower level and apply it to the current level before refining the
              alignment.
        - Refine Alignment: At each level, combine the new alignment offset with the scaled offset from the
          previous level to progressively improve accuracy as the resolution increases.
    
        The image above is the lady image before and after alignment.
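        The coarse-to-fine recursion can be sketched as below. This is an illustrative sketch under several
        assumptions: the names best_shift, downscale2, and pyramid_align are hypothetical, downscaling is done
        by 2x2 block averaging (the write-up does not say which resampling method was used), the coarsest
        level searches -15 to 15 pixels as in the low-resolution section, and each finer level refines within
        an assumed window of 2 pixels around the doubled offset.

```python
import numpy as np


def best_shift(channel, reference, center=(0, 0), radius=2):
    """Brute-force search around `center` for the shift with the smallest
    squared distance, ignoring a 10% border on every side."""
    h, w = reference.shape
    dh, dw = h // 10, w // 10
    ref = reference[dh:h - dh, dw:w - dw]
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted[dh:h - dh, dw:w - dw] - ref) ** 2)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best


def downscale2(img):
    """Halve each dimension by averaging 2x2 blocks (assumed resampling)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def pyramid_align(channel, reference, min_height=400, coarse_radius=15):
    """Recursively align the coarser level first, then refine at this level."""
    channel = np.asarray(channel, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if reference.shape[0] <= min_height:
        # Coarsest level: full brute-force search around zero.
        return best_shift(channel, reference, radius=coarse_radius)
    dy, dx = pyramid_align(downscale2(channel), downscale2(reference),
                           min_height, coarse_radius)
    # One coarse pixel covers two pixels here, so double the offset and refine.
    return best_shift(channel, reference, center=(2 * dy, 2 * dx), radius=2)
```

        Because each level only searches a small window around the scaled-up estimate, the cost at full
        resolution stays at a handful of comparisons instead of the full 31 by 31 search.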
    
    Automatic Cropping
    
        The automatic cropping process removes the border artifacts (plate edges and misaligned color fringes)
        that appear in the aligned images. The process is as follows:
    
    
        - Rescale Image: If the image height is larger than 500 pixels, it is downscaled; otherwise, it remains
          at the same resolution.
        - Crop Central Region: A central 60% region of the image is extracted both horizontally and vertically
          for line detection.
        - Line Detection: Vertical (left, right) and horizontal (top, bottom) lines are detected in each color
          channel using Canny edge detection and Hough transforms.
        - Determine Innermost Lines: The coordinates of the innermost lines (left, right, top, bottom) are
          determined.
        - Adjust Crop: A small margin (1% of the image) is removed from each side to avoid edge artifacts.
        - Crop Image: The image is cropped using the coordinates of the innermost lines.
    
        The image above is the lady image before and after cropping.
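        The last three steps (innermost lines, margin, crop) can be illustrated with a small pure-Python
        helper. This sketch does not perform the Canny or Hough steps; it assumes they have already produced
        lists of x positions for vertical lines and y positions for horizontal lines. The function name
        crop_box and the border_frac parameter (which limits border candidates to lines near each side) are
        hypothetical and not taken from the project.

```python
def crop_box(height, width, vertical_xs, horizontal_ys,
             margin=0.01, border_frac=0.15):
    """Return (top, bottom, left, right) crop coordinates.

    Only lines within `border_frac` of a side count as border candidates
    (an assumption); the innermost candidate on each side is used, then
    1% of the image size is trimmed on every side.
    """
    bx, by = width * border_frac, height * border_frac
    left = max((x for x in vertical_xs if x < bx), default=0)
    right = min((x for x in vertical_xs if x > width - bx), default=width)
    top = max((y for y in horizontal_ys if y < by), default=0)
    bottom = min((y for y in horizontal_ys if y > height - by), default=height)
    pad_y, pad_x = round(height * margin), round(width * margin)
    return top + pad_y, bottom - pad_y, left + pad_x, right - pad_x
```

        The result can be used as image[top:bottom, left:right]. If the lines were detected on a downscaled
        copy, their positions would need to be scaled back to the full resolution first.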
    
    Automatic Contrasting
    
        The automatic contrasting process is used to enhance the contrast of the images. The process is as follows:
    
    
        - Crop the Image: The image is cropped to avoid border artifacts.
        - Determine Min and Max Values: The minimum and maximum pixel values within the cropped region are
          calculated. These values represent the darkest and brightest parts of the image within the selected
          region.
        - Scale Pixel Values: The pixel values in the entire image are linearly scaled so that the minimum value
          becomes 0 and the maximum value becomes 1. This improves contrast by stretching the image's intensity
          values across the full range. Pixels whose values fall outside this range (possible only outside the
          cropped region) are clipped to 0 or 1.
    
        The upper-left image is the lady image before automatic contrasting, and the upper-right image is the lady image
        after automatic contrasting.
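        A minimal NumPy sketch of this contrast stretch is shown below. The function name auto_contrast and
        the 10% crop fraction are assumptions (the write-up does not state how much is cropped), and the
        minimum and maximum are taken over all channels together, which may differ from the project's code.

```python
import numpy as np


def auto_contrast(img, frac=0.10):
    """Linearly stretch `img` so the interior min/max map to 0 and 1."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape[:2]
    dh, dw = int(h * frac), int(w * frac)
    interior = img[dh:h - dh, dw:w - dw]  # ignore border artifacts
    lo, hi = interior.min(), interior.max()
    if hi == lo:
        return img.copy()  # flat image: nothing to stretch
    # Values outside [lo, hi] can only come from the border; clip them.
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)
```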
    
    Better Color Mapping
    
        The better color mapping process adjusts the colors of the images so that they look more realistic.
        The process is as follows:
    
    
        - Convert to LAB Color Space: Convert the image from the RGB color space to the LAB color space. The LAB
          color space is designed to mimic human vision and is more perceptually uniform than the RGB color
          space.
        - Adjust the AB Channels: I wrote a program that creates a simple GUI using tkinter that allows
          real-time adjustment of the L (lightness), A (green-red), and B (blue-yellow) channels in the LAB
          color space of an image. Then I performed some experiments to find the best scaling factors for the
          A and B channels to achieve a more realistic color mapping.
          Some results:
            - emir: L: 1, A: 0.4, B: 0.95
            - church: L: 1, A: 0.31, B: 0.91
            - lady: L: 1, A: 0.36, B: 0.87
            - three_generations: L: 1, A: 0.36, B: 0.95
            - tobolsk: L: 1, A: 0.31, B: 0.95
          Conclusion: According to the experiments, it is reasonable to leave the L channel unchanged and scale
          the A and B channels by 0.35 and 0.93, respectively (roughly the averages of the per-image values
          above).
        - Convert Back to RGB: Convert the image back to the RGB color space for display.
         
         
     
    
        The upper-left image is the lady image before better color mapping, and the upper-right image is the lady image
        after better color mapping.
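        The A/B scaling can be sketched with a hand-written sRGB to CIELAB conversion in NumPy. This is an
        illustration only: the write-up does not say which conversion routine the project used, so the
        functions rgb_to_lab, lab_to_rgb, and scale_ab are hypothetical, and the conversion assumes sRGB
        input in the range 0 to 1 with a D65 white point.

```python
import numpy as np

# Linear sRGB -> XYZ matrix (D65); the white point is the XYZ of RGB (1, 1, 1).
_M = np.array([[0.4124564, 0.3575761, 0.1804375],
               [0.2126729, 0.7151522, 0.0721750],
               [0.0193339, 0.1191920, 0.9503041]])
_WHITE = _M.sum(axis=1)
_D = 6.0 / 29.0


def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (last axis = 3) to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    t = (lin @ _M.T) / _WHITE
    f = np.where(t > _D ** 3, np.cbrt(t), t / (3 * _D ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)


def lab_to_rgb(lab):
    """Convert CIELAB back to sRGB in [0, 1], clipping out-of-gamut values."""
    lab = np.asarray(lab, dtype=np.float64)
    fy = (lab[..., 0] + 16.0) / 116.0
    f = np.stack([fy + lab[..., 1] / 500.0, fy, fy - lab[..., 2] / 200.0], axis=-1)
    t = np.where(f > _D, f ** 3, 3 * _D ** 2 * (f - 4.0 / 29.0))
    lin = np.clip((t * _WHITE) @ np.linalg.inv(_M).T, 0.0, 1.0)
    return np.where(lin <= 0.0031308, 12.92 * lin, 1.055 * lin ** (1 / 2.4) - 0.055)


def scale_ab(rgb, a_scale=0.35, b_scale=0.93):
    """Leave L unchanged and scale the A and B channels."""
    lab = rgb_to_lab(rgb)
    lab[..., 1] *= a_scale
    lab[..., 2] *= b_scale
    return lab_to_rgb(lab)
```

        The defaults in scale_ab are the factors from the conclusion above; neutral gray pixels have A and B
        close to zero, so they are left essentially unchanged.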
    
    All Results
    
        All results are shown below:
        
            aligned ⇒ cropped ⇒ auto-contrasted ⇒ better color mapped
        
    
    
        Low-resolution Images
    
    
    
    
    
        High-resolution Images