NUST Institutional Repository

Towards Large-scale Unsupervised Face Recognition in Videos

dc.contributor.author Khurshid, Atif
dc.date.accessioned 2023-05-02T10:53:51Z
dc.date.available 2023-05-02T10:53:51Z
dc.date.issued 2023
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/32816
dc.description.abstract Video content is ubiquitous in the modern world and there is a growing need for automated methods to extract information from videos. Face-based person retrieval is a particularly interesting task in this domain which involves the use of face recognition to track the appearances of people in video data. It is useful in a variety of applications, from video analytics and indexing to video surveillance and crowd analysis. Feature representation and identity classification are two key components of a face recognition system and largely determine its accuracy. However, existing representation and identification methods are ill-equipped for large-scale, unsupervised video face recognition. Despite open challenges, development of novel methods has been slow while test scores on most benchmark datasets have saturated. The availability of good quality datasets is a prerequisite for any deep learning-based face recognition system. Most face datasets are based on web images of celebrities and do not represent the challenges of video face recognition. The few video face datasets that do exist are curated from short-form video content such as movies and television shows, and generally contain a small number of identities while also being limited to the demographics of international celebrities. This research is focused on the development of a large-scale dataset of face images extracted from videos, in order to renew interest in and promote the development of face representation and identification models capable of large-scale, unsupervised face recognition in videos. We present TVFace, a large-scale dataset of face images extracted from public live streams of international news channels. It consists of 22 subsets, one for each channel, containing a total of 2.6 million face images and 33 thousand identities.
Identity labeling is performed using a clustering-based, semi-automatic annotation framework designed to facilitate manual annotation of large collections of face images. Each image is also annotated for 6 facial attributes (mask, age, gender, ethnicity, expression and pose) using state-of-the-art face analysis models. The dataset can be used for the evaluation of face representation and identity classification components in both image and video domains, as well as for multiple tasks including face verification, identification, and clustering. It effectively represents the challenges of the video domain, such as variations in photometric properties and non-discriminatory facial attributes like pose and expression, while maintaining a diverse demographic distribution. We also design a hierarchical retrieval index for online clustering in order to demonstrate the effectiveness of the proposed dataset in evaluating real-time person retrieval systems. en_US
dc.description.sponsorship Dr. Muhammad Moazam Fraz en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Sciences (SEECS) NUST en_US
dc.title Towards Large-scale Unsupervised Face Recognition in Videos en_US
dc.type Thesis en_US


This item appears in the following Collection(s)

  • MS [375]
