When the state of the art is ahead of the state of understanding: Unintuitive properties of deep neural networks

Joan Serrà

doi:10.7203/metode.9.11035

Authors

Joan Serrà, Telefónica R&D, Barcelona (Spain).

DOI:

https://doi.org/10.7203/metode.9.11035

Keywords:

deep learning, machine learning, neural networks, unintuitive properties

Abstract

Deep learning is an undeniably hot topic, not only in academia and industry but also in society and the media. The reasons for its growing popularity are manifold: an unprecedented availability of data and computing power, the emergence of some innovative methodologies, minor but significant technical tricks, and so on. Curiously, however, the current success and practice of deep learning do not appear to be correlated with its more theoretical and formal understanding. As a result, the state of the art in deep learning presents a number of unintuitive properties or situations. This text highlights some of these unintuitive properties, pointing to relevant recent work and showing the need to learn more about them, whether through empirical or formal methods.

Author biography

Joan Serrà, Telefónica R&D, Barcelona (Spain).

Researcher at Telefónica R&D in Barcelona (Spain), where he works on topics related to machine learning and deep learning. He obtained his PhD in Computer Science from Pompeu Fabra University in Barcelona in 2011 and worked as a postdoctoral researcher in artificial intelligence at the Artificial Intelligence Research Institute (IIIA-CSIC, 2015). He has been involved in more than ten research projects funded by Spanish and European institutions and has co-authored more than one hundred publications across different disciplines, many of them widely cited and published in top-tier journals and conferences.
