Multi-Task Representation Learning

Mohamed-Rafik Bouguelia
Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Sweden

Sepideh Pashami
Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Sweden

Slawomir Nowaczyk
Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Sweden

Published in: 30th Annual Workshop of the Swedish Artificial Intelligence Society (SAIS) 2017, May 15–16, 2017, Karlskrona, Sweden

Linköping Electronic Conference Proceedings 137:6, pp. 53–59

Published: 2017-05-12

ISBN: 978-91-7685-496-9

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

The majority of existing machine learning algorithms assume that training examples are already represented with sufficiently good features, in practice ones that are designed manually. This traditional way of preprocessing the data is not only tedious and time-consuming, but also insufficient to capture all the different aspects of the available information. With the big data phenomenon, this issue is only going to grow, as data is rarely collected and analyzed with a specific purpose in mind, and is more often re-used for solving different problems. Moreover, the expert knowledge that allows specialists to come up with good representations for one problem does not necessarily generalize to other tasks. Therefore, much focus has been put on designing methods that can automatically learn features or representations of the data instead of learning from handcrafted features. However, a lot of this work has used ad hoc methods, and theoretical understanding in this area is lacking.
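To make the setting in the title concrete, the following is a minimal sketch (our illustration with synthetic data and made-up dimensions, not code from the paper): two regression tasks are trained jointly through one shared, learned linear feature map, while each task keeps its own predictor on top of it.

import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 10, 3                     # examples per task, input dim, shared-feature dim

# Synthetic data: both tasks depend on the same low-dimensional structure U.
U = rng.normal(size=(d, k))
X1, X2 = rng.normal(size=(n, d)), rng.normal(size=(n, d))
y1 = X1 @ U @ rng.normal(size=k)         # task-1 targets
y2 = X2 @ U @ rng.normal(size=k)         # task-2 targets

W = rng.normal(size=(d, k))              # shared representation, learned jointly
v1, v2 = np.zeros(k), np.zeros(k)        # one predictor (head) per task
lr = 0.01

for step in range(2000):
    r1 = X1 @ W @ v1 - y1                # task-1 residuals
    r2 = X2 @ W @ v2 - y2                # task-2 residuals
    # Gradients of the summed mean squared errors; W gets signal from both tasks.
    gW = (X1.T @ r1)[:, None] @ v1[None, :] / n \
       + (X2.T @ r2)[:, None] @ v2[None, :] / n
    v1 -= lr * W.T @ (X1.T @ r1) / n
    v2 -= lr * W.T @ (X2.T @ r2) / n
    W -= lr * gW

print("final task errors:", np.mean(r1 ** 2), np.mean(r2 ** 2))
print("learned features for new inputs:", (X1[:5] @ W).shape)

Because the shared map W receives gradient signal from both tasks, the learned features capture structure that is useful across tasks, in contrast to the task-specific handcrafted features discussed in the abstract.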

Keywords

Representation Learning, Machine Learning
