Art is a fascinating but very complex discipline. Creating artistic images is not only time-consuming but also requires significant expertise. If this already holds for 2D works of art, the problem only grows when we move beyond the picture plane, into time (animated content) or 3D space (sculptures and physical environments). This extension raises new issues and challenges, which this paper addresses.
Previous work on 2D stylization has focused on video content, stylizing it frame by frame. The individual frames acquire a high-quality style, but the resulting videos often exhibit flickering artifacts because the generated frames lack temporal coherence. Moreover, these methods do not address the 3D setting, which adds further complexity to the task. Works that do target 3D stylization suffer from poor geometric reconstruction of point clouds or triangle meshes and from a lack of stylistic detail. The reason lies in the differing geometric features of the original and generated spaces, since the style is applied after the linear transformation.
The proposed technique, called Artistic Radiance Fields (ARF), can transfer artistic features from a single 2D image to a real-world 3D scene, resulting in the rendering of a novel artistic view that is faithful to the input style image (Fig. 1).
To this end, the researchers transform a photorealistic radiance field, reconstructed from multiple images of a real-world scene, into a new, stylized radiance field that supports high-quality rendering from novel viewpoints. The results are shown in Fig. 1.
As an example, given as input a set of real-world pictures of an excavator and Van Gogh's famous "The Starry Night" as the "style" to be applied, the result is a colorful excavator with a smooth, painting-like texture.
The ARF pipeline is shown in the figure below (Fig. 2).
The core of this architecture is the combination of the proposed Nearest Neighbor Feature Matching (NNFM) loss with a color transfer step.
NNFM compares feature maps of the rendered and style images, extracted with the well-known VGG-16 Convolutional Neural Network (CNN). These features guide the transfer of complex high-frequency visual details consistently across multiple viewpoints.
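The idea behind the NNFM loss can be sketched as follows: for every feature vector of the rendered image, find its nearest neighbor among the style image's feature vectors under cosine distance, and average those distances. This is a minimal NumPy sketch, not the paper's implementation; the function name `nnfm_loss` and the flat `(N, C)` feature layout are illustrative assumptions, and in practice the features would come from VGG-16 activations.

```python
import numpy as np

def nnfm_loss(render_feats, style_feats):
    """Sketch of a Nearest Neighbor Feature Matching loss.

    render_feats: (N, C) feature vectors from the rendered image.
    style_feats:  (M, C) feature vectors from the style image.
    For each rendered feature, take the cosine distance to its
    nearest style feature, then average over all rendered features.
    """
    # L2-normalize rows so dot products become cosine similarities
    r = render_feats / np.linalg.norm(render_feats, axis=1, keepdims=True)
    s = style_feats / np.linalg.norm(style_feats, axis=1, keepdims=True)
    sim = r @ s.T                       # (N, M) cosine similarities
    best = sim.max(axis=1)              # nearest style neighbor per feature
    return float(np.mean(1.0 - best))   # averaged cosine distance
```

Minimizing this loss pulls every local patch of the rendering toward *some* patch of the style image, without forcing a global spatial alignment between the two.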
Color transfer, in turn, is used to avoid significant color mismatches between the synthesized views and the style image. It applies a linear transformation to the pixels of the input images so that their mean and covariance match those of the pixels in the style image.
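One standard way to realize such a linear color transform is to whiten the content pixels with the inverse square root of their covariance and re-color them with the square root of the style covariance. The sketch below assumes this mean-and-covariance matching formulation; the helper names (`match_color`, `sqrt_m`) and the small `eps` regularizer are illustrative, not taken from the paper's code.

```python
import numpy as np

def match_color(content, style, eps=1e-5):
    """Linearly recolor `content` pixels (N, 3) so that their mean and
    covariance match those of `style` pixels (M, 3)."""
    mu_c, mu_s = content.mean(0), style.mean(0)
    cov_c = np.cov(content.T) + eps * np.eye(3)  # regularize for stability
    cov_s = np.cov(style.T) + eps * np.eye(3)

    def sqrt_m(m):
        # Symmetric matrix square root via eigendecomposition
        vals, vecs = np.linalg.eigh(m)
        return vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

    # Whiten with the content statistics, re-color with the style statistics
    A = sqrt_m(cov_s) @ np.linalg.inv(sqrt_m(cov_c))
    return (content - mu_c) @ A.T + mu_s
```

After this transform the recolored pixels have (approximately) the style image's mean and covariance, so the subsequent feature-matching optimization no longer has to fight a global color shift.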
In addition, the architecture uses a deferred back-propagation method, which allows the loss to be computed on full-resolution images with a reduced load on the GPU. The first step renders the image at full resolution and computes the loss and its gradients with respect to the pixel colors, producing a cached gradient image. These cached gradients are then propagated back, patch by patch, through the rendering process.
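The two stages above can be illustrated on a deliberately tiny "renderer" where each pixel is just a scene parameter times an input value. This is only a sketch of the deferred-gradient idea under that toy assumption (the function `deferred_grad` and the linear pixel model are invented for illustration); a real radiance field would re-render each patch through the volume-rendering pipeline instead.

```python
import numpy as np

# Toy renderer: pixel_i = w * x_i, with a single scene parameter w.
# Deferred back-propagation (sketch): stage 1 renders the FULL image
# with no gradient tracking and caches dLoss/dPixel; stage 2 re-renders
# small patches and pushes the cached gradients through to the parameter,
# so peak memory scales with the patch size, not the full image.

def deferred_grad(w, x, target, patch=64):
    # Stage 1: full-resolution render, gradient w.r.t. pixel colors only
    pixels = w * x
    grad_pixels = 2.0 * (pixels - target) / x.size  # d(MSE)/d(pixel), cached

    # Stage 2: patch-wise re-render, back-propagate the cached gradients
    grad_w = 0.0
    for i in range(0, x.size, patch):
        xp = x[i:i + patch]
        # d(pixel)/dw = xp for this toy renderer; chain rule with the cache
        grad_w += float(np.dot(grad_pixels[i:i + patch], xp))
    return grad_w
```

Because gradient accumulation is linear, summing the patch-wise contributions reproduces exactly the gradient a single full-image backward pass would give, which is what makes the memory saving free of approximation error.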
The ARF method presented in this paper brings several advantages. First, it produces stylized renderings that are almost free of artifacts. Second, stylized images can be generated from novel views using only a few input images, enabling artistic 3D reconstruction. Finally, the deferred back-propagation method significantly reduces GPU memory usage.
This article is a research summary written by Marktechpost Staff based on the research paper 'ARF: Artistic Radiance Fields'. All credit for this research goes to the researchers on this project. Check out the paper, GitHub link, and project page.
Daniel Lorenzi received his M.Sc. in Internet ICT and Multimedia Engineering in 2021 from the University of Padua, Italy. He is a Ph.D. candidate at the Institute of Information Technology (ITEC) at Alpen-Adria-Universität (AAU) Klagenfurt. He currently works at the Christian Doppler Laboratory ATHENA and his research interests include dynamic video streaming, embedded media, machine learning, and QoS/QoE evaluation.