Uses of videos
Barnes and Noble
Current price: $26.99
Size: OS
*Product Information may vary - to confirm product availability, pricing, and additional information please contact Barnes and Noble
Video captioning, the task of describing the content of a video in natural language, is a popular task in both computer vision and natural language processing. Early work focused on generating sentence-level captions for short video clips (Venugopalan et al., 2015). Krishna et al. (2017) proposed the task of dense video captioning, in which the system must first detect event segments and then generate captions. Park et al. (2019) proposed the task of video paragraph captioning: they use ground-truth event segments and focus on generating coherent paragraphs. Lei et al. (2020) follow the same task setting and propose a recurrent transformer model that generates more coherent and less repetitive paragraphs. Because ground-truth event segments are often unavailable in practice, our goal is to generate paragraph captions without ground-truth segments.