Abstract

This thesis presents original research on deep learning applied to photo editing. Our work focuses on several goals we use as guiding lights, and that distinguish it from rich, contemporaneous research in the same field. We focus on the democratization of photo editing, not only because the methods don’t require steep learning curves to be adopted by end users, but also because they can run on consumer hardware. We strive to make this requirement compatible with arbitrary resolution editing, so users don’t have to sacrifice quality in exchange for speed or accessibility. Furthermore, we are respectful of the work of photographers as artists and professionals, and develop lossless parametric methods whenever possible: they act like traditional filters, rather than replacing captured pixels with generated ones.

Our first contribution is FilterNet, an automatic enhancement method that predicts the values of filters that should be applied to a photo in order to improve it, rather than predicting the edited result. Predicted filters can be applied to arbitrary resolution photos, are easily explainable because they map to photography concepts such as exposure or color temperature, and can be tweaked by the user at will. However, they act globally because the system is trained with whole images.

As an intermediate step to extend FilterNet to local edits, we explore how segmentation methods can be conditioned using text instructions, so users can intuitively indicate the subject they are interested in. Linking visual and textual representations is being extensively researched these days; in particular, generative methods such as diffusion models achieve impressive creative results in this area. Unlike these methods, we don’t attempt to produce final versions of edited photos; instead, we demonstrate that they can be leveraged to predict segmentation masks. Our prototype, CocoGold, is our second contribution.

Combining FilterNet with CocoGold, we create COCONET, a pipeline that predicts filter parameters like FilterNet, but applies them locally to the segmentation masks created with CocoGold.

No research is ever complete. Our progress in the areas we set out to explore opens new directions for the future, which we are excited to pursue. We end this thesis recognizing the limitations of our prototypes and looking forward to the next steps that can get us closer to our vision.

Publisher

Universidad Rey Juan Carlos

Description

Doctoral thesis defended at the Universidad Rey Juan Carlos de Madrid in 2025. Supervisors: José María Cañas Plaza and Jesús Fernández Conde.
