I wanted a place where I could direct people to see and download pictures I have taken, and I also wanted to have a connected platform and storage solution that I had control over. I was in the middle of rebuilding my personal site, and I thought hey, this sounds like a fun project!
And it was. There are so many angles to take with this gallery that I expect to keep building on it as my needs change and grow.
The first task was designing the pipeline. Ideally, I could upload an image and everything would "just work". AWS Lambda was a perfect fit: an upload to S3 can trigger a Lambda function, and that function can tie everything else together. It handles the more basic work itself, like resizing the images, extracting EXIF data, and compiling metadata, and it invokes external endpoints for capabilities that are outside the scope of a Lambda function.
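To make the trigger concrete, here is a minimal sketch of what the entry point of such a function could look like. It only reads the S3 event and plans the work; the resizing and EXIF steps are left out. The `resized/` prefix, the widths, and the helper names are placeholders for illustration, not the ones the gallery actually uses.

```python
from urllib.parse import unquote_plus

# Hypothetical output widths; the real gallery may use different sizes.
RESIZE_WIDTHS = (320, 1280)


def parse_s3_records(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 events (a space becomes "+").
        key = unquote_plus(s3["object"]["key"])
        pairs.append((s3["bucket"]["name"], key))
    return pairs


def plan_image(bucket, key):
    """Describe the derived objects to create for one uploaded image."""
    return {
        "bucket": bucket,
        "source_key": key,
        # Assumed layout: resized copies live under a separate prefix so that
        # writing them does not look like a fresh upload of an original.
        "resized_keys": [f"resized/{width}/{key}" for width in RESIZE_WIDTHS],
    }


def handler(event, context):
    """Lambda entry point: one plan per uploaded object."""
    return [plan_image(bucket, key) for bucket, key in parse_s3_records(event)]
```

If resized copies are written back to the same bucket, the trigger should be filtered by prefix or suffix, otherwise the function would end up invoking itself.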
At first, I just thought it would be fun to run object/label detection on my photos to see what would be picked up. AWS makes this easy with Rekognition. While using Rekognition, I noticed that it also provides an interesting "dominant colors" feature. This made me think that it would be cool to store these colors and show them to the user when viewing a specific image. But what would be even cooler would be to implement a search by color feature!
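As a rough sketch of that step: to my understanding, Rekognition's `detect_labels` call returns dominant colors when `IMAGE_PROPERTIES` is included in `Features`, under `ImageProperties.DominantColors` in the response. The function names and the shape of the stored records below are my own assumptions, not the gallery's actual code.

```python
def detect_image_features(bucket, key):
    """Ask Rekognition for labels and image properties of an S3 object."""
    # Imported lazily so the parsing helper below can be used without AWS.
    import boto3

    client = boto3.client("rekognition")
    return client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        Features=["GENERAL_LABELS", "IMAGE_PROPERTIES"],
    )


def dominant_colors(response):
    """Pull the dominant colors out of a detect_labels response.

    Returns a list of dicts sorted by how much of the image each color
    covers, largest first. The output shape is an illustrative choice.
    """
    raw = response.get("ImageProperties", {}).get("DominantColors", [])
    colors = [
        {
            "rgb": (c["Red"], c["Green"], c["Blue"]),
            "hex": c.get("HexCode"),
            "percent": c.get("PixelPercent", 0.0),
        }
        for c in raw
    ]
    return sorted(colors, key=lambda c: c["percent"], reverse=True)
```

Storing the pixel percentage alongside each color keeps the option open to weight a match by how much of the image that color actually covers.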
Search by color turned out to be an unexpected rabbit hole. Looking back on it, I should have known better than to think it would be easy. The way that we perceive color is different from how we represent it, and bridging that gap is tricky. I started with a very simple RGB Euclidean distance calculation and quickly found that the color search had a preference for images dominated by blacks. After reading some theory and doing some experimenting, I eventually settled on sticking with Euclidean distance, but weighting color and luminance differently to better align with how humans perceive color. There are more complex algorithms, like CIEDE2000, but I thought it was necessary to strike a balance between simplicity and accuracy. "Settled" is the key word here: it still doesn't work perfectly.
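To illustrate the idea, here is one way a weighted Euclidean distance could be written. It splits each color into a luminance value (using the Rec. 601 luma coefficients) and two color-difference values, then weights the two parts separately. This is a sketch of the general approach rather than the exact formula the gallery uses, and the weights are placeholders that would need tuning by experiment.

```python
import math

# Placeholder weights, not tuned values: raise CHROMA_WEIGHT to make hue
# differences count for more than brightness differences, or the reverse.
LUMA_WEIGHT = 1.0
CHROMA_WEIGHT = 2.0


def to_luma_chroma(rgb):
    """Split an (R, G, B) tuple into luminance and two color differences."""
    r, g, b = rgb
    luma = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma coefficients
    return luma, b - luma, r - luma


def color_distance(a, b):
    """Euclidean distance with separate weights for luminance and color."""
    ya, cba, cra = to_luma_chroma(a)
    yb, cbb, crb = to_luma_chroma(b)
    luma_term = LUMA_WEIGHT * (ya - yb) ** 2
    chroma_term = CHROMA_WEIGHT * ((cba - cbb) ** 2 + (cra - crb) ** 2)
    return math.sqrt(luma_term + chroma_term)


def rank_by_color(query, images):
    """Sort image names by the distance of their stored color to the query.

    `images` maps a name to a single (R, G, B) tuple; a real search would
    likely compare against several dominant colors per image.
    """
    return sorted(images, key=lambda name: color_distance(query, images[name]))
```

For example, `rank_by_color((255, 0, 0), {"sunset": (250, 10, 10), "ocean": (0, 0, 255)})` puts `"sunset"` first. Swapping in CIEDE2000 later would only mean replacing `color_distance`.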
Overall, it was a lot of fun to make this, and I am continuing to come up with ideas for it faster than I can implement them. I will certainly come back to this gallery and keep working on it.