Posted on 2023-05-20, 05:40. Authored by Hussain, T; Muhammad, K; Ullah, A; Cao, Z; Baik, SW; de Albuquerque, VHC
The massive amount of video data produced by industrial surveillance networks poses various challenges for exploring these videos in applications such as video summarization, analysis, indexing, and retrieval. Multi-view video summarization (MVS) is particularly challenging because of the sheer size of the data, redundancy, overlapping views, lighting variations, and inter-view correlations. To address these challenges, various low-level features and clustering-based soft computing techniques have been proposed, but they cannot fully exploit MVS. In this article, we achieve MVS by integrating deep-neural-network-based soft computing techniques in a two-tier framework. The first, online tier performs target-appearance-based shot segmentation and stores the shots in a lookup table that is transmitted to the cloud for further processing. The second tier extracts deep features from each frame of a sequence in the lookup table and passes them to a deep bi-directional long short-term memory (DB-LSTM) network to obtain probabilities of informativeness, from which the summary is generated. Experimental evaluation on an MVS benchmark dataset and on industrial surveillance data from YouTube confirms the higher accuracy of our system compared to state-of-the-art MVS methods.
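The two-tier flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the target detector, the feature extractor, the DB-LSTM architecture, or the rule used to turn probabilities into a summary, so everything below is an assumption made for illustration: the per-frame target flags and informativeness probabilities are taken as given inputs (standing in for the detector and DB-LSTM outputs), and the hypothetical functions `segment_shots` and `summarize`, the `threshold` parameter, and the "best frame per shot" selection rule are not taken from the paper.

```python
from typing import Dict, List, Tuple


def segment_shots(has_target: List[bool]) -> Dict[int, Tuple[int, int]]:
    """Tier 1 (assumed form): group consecutive frames in which a target
    appears into shots and return a lookup table
    shot_id -> (first_frame, last_frame), both inclusive."""
    table: Dict[int, Tuple[int, int]] = {}
    start = None
    for i, flag in enumerate(has_target):
        if flag and start is None:
            start = i  # a new shot begins
        elif not flag and start is not None:
            table[len(table)] = (start, i - 1)  # the shot ended on the previous frame
            start = None
    if start is not None:  # a shot runs to the end of the video
        table[len(table)] = (start, len(has_target) - 1)
    return table


def summarize(
    table: Dict[int, Tuple[int, int]],
    probs: List[float],
    threshold: float = 0.5,
) -> List[int]:
    """Tier 2 (assumed selection rule): for each shot, keep the frame with
    the highest informativeness probability, provided it reaches the
    threshold. `probs` stands in for the DB-LSTM output."""
    keyframes: List[int] = []
    for shot_id in sorted(table):
        start, end = table[shot_id]
        best = max(range(start, end + 1), key=lambda i: probs[i])
        if probs[best] >= threshold:
            keyframes.append(best)
    return keyframes


# Hypothetical inputs: target presence per frame and per-frame probabilities.
has_target = [False, True, True, False, True, True, True, False]
probs = [0.1, 0.4, 0.9, 0.2, 0.3, 0.2, 0.45, 0.1]

lookup = segment_shots(has_target)
summary = summarize(lookup, probs)
```

Here the lookup table holds two shots, frames 1 to 2 and frames 4 to 6, and only the first shot contributes a keyframe at the default threshold. Separating the two functions mirrors the split in the abstract between an online tier that only segments and a cloud tier that scores and selects.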
History
Publication title
IEEE Transactions on Industrial Informatics
Volume
16
Pagination
77-86
ISSN
1551-3203
Department/School
School of Information and Communication Technology