Exploring Transformer for Face Mask Detection

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

The COVID-19 pandemic underscored the importance of face masks in curbing viral transmission, prompting governments worldwide to mandate mask use in public areas. As a result, automated mask detection has attracted growing attention as a way to support these measures and reduce viral spread. In this study, we explore the Swin Transformer architecture for identifying face mask usage, aiming to surpass the performance of existing face mask detection models. We evaluate the proposed model and the comparison models using seven metrics: accuracy, precision, recall, specificity, F1-score, Kappa coefficient, and Matthews correlation coefficient (MCC). Our experiments yield several notable findings. First, MobileNetV2 outperforms the baseline convolutional neural network (CNN) model on all seven metrics on the face mask datasets. Second, among the CNNs, EfficientNetV2 outperforms MobileNetV2, a classic lightweight network, on all metrics. Third, DenseNet outperforms ResNet-50 on all seven metrics. Most significantly, the Swin Transformer emerges as the most effective model, surpassing both MobileNetV2 and EfficientNetV2. The empirical results confirm that our Swin Transformer achieves statistically significant improvements in accuracy, precision, recall, specificity, F1-score, Kappa coefficient, and MCC compared to the other models.
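
The seven evaluation metrics named in the abstract can all be derived from a confusion matrix. The following is a minimal sketch, assuming a binary setting (mask vs. no mask) and the standard textbook definitions; the abstract does not state how the authors computed the metrics or whether the task is binary. The function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Compute seven common metrics from binary confusion-matrix counts.

    Assumption: 'positive' means a mask is present. This is an
    illustrative convention, not one taken from the paper.
    """
    n = tp + fp + fn + tn
    accuracy = _safe_div(tp + tn, n)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)        # also called sensitivity
    specificity = _safe_div(tn, tn + fp)
    f1 = _safe_div(2 * tp, 2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = _safe_div((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    kappa = _safe_div(accuracy - p_chance, 1 - p_chance)

    # Matthews correlation coefficient (MCC).
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_den)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in binary_metrics(tp=90, fp=10, fn=5, tn=95).items():
        print(f"{name}: {value:.4f}")
```

Kappa and MCC both account for class imbalance, which is presumably why they are reported alongside accuracy: a model that always predicts the majority class can score high accuracy while both of these stay at zero.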

Original language: English
Pages (from-to): 118377-118388
Number of pages: 12
Journal: IEEE Access
Volume: 12
State: Published - 2024

Keywords

  • EfficientNet
  • Face mask detection
  • MobileNet
  • Swin Transformer
