Deep metric learning with improved triplet loss for face clustering in videos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

20 Scopus citations

Abstract

Face clustering in videos partitions a large number of faces into a given number of clusters such that some measure of distance is minimized within clusters and maximized between clusters. In real-world videos, head pose, facial expression, scale, illumination, occlusion and other uncontrolled factors can dramatically change the appearance of faces. In this paper, we tackle this problem by learning a non-linear metric function with a deep convolutional neural network that maps the input image to a low-dimensional feature embedding, using the visual constraints among face tracks. Our network directly optimizes the embedding space so that Euclidean distances correspond to a measure of semantic face similarity. This is technically realized by minimizing an improved triplet loss function, which pushes the negative face away from the positive pair and requires the distance of the positive pair to be less than a margin. We extensively evaluate the proposed algorithm on a set of challenging videos and demonstrate significant performance improvement over existing techniques.
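
The loss described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function name `improved_triplet_loss`, the use of squared Euclidean distances, the way the negative is compared against both members of the positive pair, and the parameters `margin_inter`, `margin_intra` and `weight`. It is one plausible reading of "push the negative away from the positive pair, and keep the positive pair within a margin", not the authors' implementation.

```python
import numpy as np


def improved_triplet_loss(anchor, positive, negative,
                          margin_inter=1.0, margin_intra=0.5, weight=1.0):
    """Illustrative triplet loss with an extra positive-pair term.

    All names and default values are hypothetical; the inputs are
    arrays of shape (batch, dim) holding embedding vectors.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)

    # Squared Euclidean distances between the three embeddings.
    d_ap = np.sum((anchor - positive) ** 2, axis=1)
    d_an = np.sum((anchor - negative) ** 2, axis=1)
    d_pn = np.sum((positive - negative) ** 2, axis=1)

    # Inter-class term (assumed form): the negative should be farther
    # from both members of the positive pair than they are from each other.
    inter = np.maximum(0.0, d_ap - 0.5 * (d_an + d_pn) + margin_inter)

    # Intra-class term (assumed form): the positive pair itself should
    # lie within a margin.
    intra = np.maximum(0.0, d_ap - margin_intra)

    return float(np.mean(inter + weight * intra))
```

A triplet whose positive pair coincides and whose negative is far away incurs no loss under this sketch, while a triplet whose negative sits on top of the anchor is penalized by both terms.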

Original language: English
Title of host publication: Advances in Multimedia Information Processing – 17th Pacific-Rim Conference on Multimedia, PCM 2016, Proceedings
Editors: Enqing Chen, Yun Tie, Yihong Gong
Publisher: Springer Verlag
Pages: 497-508
Number of pages: 12
ISBN (Print): 9783319488899
DOIs
State: Published - 2016
Event: 17th Pacific-Rim Conference on Multimedia, PCM 2016 - Xi’an, China
Duration: 15 Sep 2016 to 16 Sep 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9916 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Pacific-Rim Conference on Multimedia, PCM 2016
Country/Territory: China
City: Xi’an
Period: 15/09/16 to 16/09/16

Keywords

  • Deep metric learning
  • Face clustering in videos
  • Improved triplet loss
