Video Inpainting Based on Deep Learning: An Overview
Abstract: Video is one of the most common forms of media and is now widely used across many fields. The rise of short-video applications such as TikTok has driven rapid iteration in video-related technologies, and video inpainting has become an active topic in video-processing research. Video inpainting uses the pixel information within a frame and the temporal reference information between frames to infer and restore the content of damaged regions. It has broad application prospects in completing corrupted videos, removing objects, and detecting video forgeries. The technique can be traced back to the restoration of old films at the end of the 20th century, a task typically carried out frame by frame by professional technical teams. With the development of digital technology, artificial intelligence techniques have in recent years been applied to video restoration, giving old films new life. Current video inpainting techniques fall into two categories: traditional methods and deep-learning-based methods. Because traditional methods lack an understanding of high-level semantic information, they perform poorly on complex scenes and large missing regions. Deep-learning-based methods, by contrast, have benefited from optimized algorithmic frameworks and improved graphics processor performance, and they markedly improve both the semantic and structural accuracy and the temporal consistency of the inpainted results. This paper first briefly reviews traditional video inpainting methods. It then discusses four types of deep-learning-based video inpainting methods in detail, analyzing their network structures, model parameters, performance, advantages, and limitations. It also introduces the datasets and evaluation metrics commonly used in video inpainting. Finally, it summarizes the open problems in the field and outlines possible directions for future research.