AN OVERVIEW OF FACIAL ATTRIBUTE LEARNING
Abstract
Facial attributes are useful for developing applications such as face recognition, search, and surveillance, and are therefore important for many facial analysis tasks. Over the years, many facial attribute learning algorithms have been developed to detect these key attributes automatically. In this paper, we survey typical facial attribute learning methods and identify five major categories of state-of-the-art methods: (1) Traditional Learning, (2) Deep Single-Task Learning, (3) Deep Multi-Task Learning, (4) Imbalanced Data Solver, and (5) Facial Attribute Ontology. These categories range from traditional learning algorithms to deep learning, and include methods that use ontologies to bridge the semantic gap and methods that address data imbalance. For each category, we discuss the basic theory of its algorithms as well as their strengths, weaknesses, and differences. We also compare their performance on standard datasets. Finally, based on the characteristics and contributions of the methods, we present conclusions and directions for future work on facial attribute learning. The survey gives researchers a quick overview to support future human face applications and further studies.
Keywords
deep learning, facial attribute learning, facial attribute ontology, imbalanced data solver, multi-task learning