When the state of the art is ahead of the state of understanding: Unintuitive properties of deep neural networks

Joan Serrà

doi:10.7203/metode.9.11035

Authors

Joan Serrà, Telefónica R&D, Barcelona (Spain).

DOI:

https://doi.org/10.7203/metode.9.11035

Keywords:

deep learning, machine learning, neural networks, unintuitive properties

Abstract

Deep learning is an undeniably hot topic, not only in academia and industry but also in society and the media. The reasons for its growing popularity are manifold: unprecedented availability of data and computing power, the emergence of some innovative methodologies, minor but significant technical tricks, etc. Curiously, however, the current success and practice of deep learning seem to be uncorrelated with the more theoretical and formal understanding of the field. As a result, the state of the art in deep learning presents a number of unintuitive properties or situations. This text highlights some of these unintuitive properties, points to relevant recent work, and stresses the need to gain further insight into them, whether through empirical or formal methods.

Author biography

Joan Serrà, Telefónica R&D, Barcelona (Spain).

Researcher at Telefónica R&D in Barcelona (Spain), where he works on topics related to machine learning and deep learning. He obtained his PhD in Computer Science from Universitat Pompeu Fabra, Barcelona, in 2011 and worked as a postdoctoral researcher in artificial intelligence at the Artificial Intelligence Research Institute (IIIA-CSIC, 2015). He has been involved in more than ten research projects funded by Spanish and European institutions and has co-authored more than one hundred publications across different disciplines, many of them widely cited and published in top-tier journals and conferences.
