Deep Learning: Theoretical and Practical Approach
Zenodo, 2022
Deep Learning Course Book in Persian. This book consists of three parts. The first part provides the necessary prerequisites for deep learning, such as linear algebra, statistics and probability, information theory, data mining, signal processing, and machine learning. The main topics in deep learning, including artificial neural networks, evaluation criteria, optimization methods, representation learning, recurrent neural networks, convolutional neural networks, and generative networks, fall within the scope of the second part. The third part is dedicated to advanced topics in the field; natural language models, the attention mechanism, transfer learning, domain adaptation, and neural architecture search are examples of the subjects covered there.
Title: | Deep Learning: Theoretical and Practical Approach |
---|---|
Author / Contributor: | Sina Ranjbar Kooh Farhadi |
Link: | |
Publication: | Zenodo, 2022 |
Media type: | unknown |
DOI: | 10.5281/zenodo.6672317 |
Keywords: | |
Other: | |