Realization of neural networks with ternary inputs and ternary weights in NAND memory arrays

SanDisk Technologies, LLC
2023
Online Patent

Title:
Realization of neural networks with ternary inputs and ternary weights in NAND memory arrays
Author / Contributor: SanDisk Technologies, LLC
Published: 2023
Media type: Patent
Additional information:
  • Indexed in: USPTO Patent Grants
  • Language: English
  • Patent Number: 11,625,586
  • Publication Date: April 11, 2023
  • Appl. No: 16/653,365
  • Application Filed: October 15, 2019
  • Assignees: SanDisk Technologies LLC (Addison, TX, US)
  • Claim: 1. An apparatus, comprising: an array of non-volatile memory cells configured to store a plurality of ternary valued weights of a neural network, each weight stored in a pair of series connected non-volatile memory cells; and one or more control circuits connected to the array of non-volatile memory cells, the one or more control circuits configured to receive a plurality of inputs for a layer of a neural network, convert the plurality of inputs into a corresponding plurality of voltage patterns, apply the plurality of voltage patterns to the array of non-volatile memory cells to thereby perform an in-array multiplication of the plurality of inputs with the ternary valued weights, and accumulate results of the in-array multiplication.
  • Claim: 2. The apparatus of claim 1, wherein the non-volatile memory cells store data in a ternary format and a weight value of 0 corresponds to both of the memory cells in the pair storing the weight value being in an erased state.
  • Claim: 3. The apparatus of claim 1, wherein the plurality of inputs are ternary valued inputs.
  • Claim: 4. The apparatus of claim 1, wherein the non-volatile memory cells of the array are arranged as NAND strings, each weight stored in a pair of non-volatile memory cells on a common NAND string.
  • Claim: 5. The apparatus of claim 4, wherein the array includes a bit line to which one or more NAND strings, including the common NAND string, are connected, the apparatus further comprising: a register, the register configured to hold a value indicating a number of weights with a value of 0 stored in NAND strings connected to the bit line.
  • Claim: 6. The apparatus of claim 5, wherein the one or more control circuits are connected to receive the number of weights with a value of 0 stored in NAND strings connected to the bit line, the one or more control circuits further configured to: adjust the accumulated results of the in-array multiplication based on the number of weights with a value of 0 stored in NAND strings connected to the bit line.
  • Claim: 7. The apparatus of claim 6, wherein the plurality of inputs are ternary valued inputs and the one or more control circuits are further configured to: determine a number of the one or more voltage patterns that correspond to a zero input pattern; and further adjust the accumulated results of the in-array multiplication based on the number of the one or more voltage patterns that correspond to the zero input pattern.
  • Claim: 8. The apparatus of claim 5, wherein the one or more control circuits are further configured to: determine the number of weights with a value of 0 stored in NAND strings connected to the bit line; and store the number of weights with a value of 0 stored in NAND strings connected to the bit line in the register.
  • Claim: 9. The apparatus of claim 4, wherein: the array of non-volatile memory cells includes a plurality of NAND strings connected to a common bit line; and the one or more control circuits are further configured to concurrently apply the plurality of voltage patterns to the plurality of NAND strings connected to the common bit line and accumulate the results of the in-array multiplication in a multi-bit sensing operation for the common bit line.
  • Claim: 10. The apparatus of claim 4, wherein the array of non-volatile memory cells includes: a plurality of NAND strings connected to a common bit line; and the one or more control circuits are further configured to sequentially apply the plurality of voltage patterns to the plurality of NAND strings connected to the common bit line and accumulate the results of the in-array multiplication in sequential sensing operations.
  • Claim: 11. The apparatus of claim 4, wherein the array of non-volatile memory cells includes: a first plurality of NAND strings each connected to a corresponding bit line; and the one or more control circuits are further configured to concurrently apply a first of the plurality of voltage patterns to the first plurality of NAND strings and independently accumulate a result of the in-array multiplication for each of the first plurality of NAND strings concurrently.
  • Claim: 12. The apparatus of claim 1, wherein the one or more control circuits are further configured to provide accumulated results of the in-array multiplication as inputs for a subsequent layer of the neural network.
  • Claim: 13. The apparatus of claim 12, further comprising: a first plane of non-volatile memory cells including the array of non-volatile memory cells; and a second plane of non-volatile memory cells storing a plurality of ternary valued weights of the subsequent layer of the neural network.
  • Claim: 14. The apparatus of claim 13, wherein each weight of the subsequent layer is stored in a pair of series connected non-volatile memory cells of the second plane.
  • Claim: 15. The apparatus of claim 4, wherein the array of non-volatile memory cells has a three dimensional architecture in which the NAND strings run in a vertical direction relative to a horizontal substrate.
  • Claim: 16. The apparatus of claim 11, wherein the array of non-volatile memory cells comprises a first plane of memory cells and a second plane of memory cells, one or more of the first plurality of NAND strings are in the first plane, and one or more of the first plurality of NAND strings are in the second plane.
  • Patent References Cited: 7324366 January 2008 Bednorz et al. ; 7505347 March 2009 Rinerson et al. ; 8416624 April 2013 Lei et al. ; 8634247 January 2014 Sprouse et al. ; 8634248 January 2014 Sprouse et al. ; 8773909 July 2014 Li et al. ; 8780632 July 2014 Sprouse et al. ; 8780633 July 2014 Sprouse et al. ; 8780634 July 2014 Sprouse et al. ; 8780635 July 2014 Li et al. ; 8792279 July 2014 Li et al. ; 8811085 August 2014 Sprouse et al. ; 8817541 August 2014 Li et al. ; 9098403 August 2015 Sprouse et al. ; 9104551 August 2015 Sprouse et al. ; 9116796 August 2015 Sprouse et al. ; 9384126 July 2016 Sprouse et al. ; 9430735 August 2016 Vali et al. ; 9887240 February 2018 Shimabukuro et al. ; 9965208 May 2018 Roohparvar et al. ; 10127150 November 2018 Sprouse et al. ; 10249360 April 2019 Chang ; 10459724 October 2019 Yu et al. ; 10535391 January 2020 Osada et al. ; 11170290 November 2021 Hoang et al. ; 11328204 May 2022 Choi et al. ; 20140133228 May 2014 Sprouse et al. ; 20140133233 May 2014 Li et al. ; 20140133237 May 2014 Sprouse et al. ; 20140136756 May 2014 Sprouse et al. ; 20140136757 May 2014 Sprouse et al. ; 20140136758 May 2014 Sprouse et al. ; 20140136760 May 2014 Sprouse et al. ; 20140136762 May 2014 Li et al. ; 20140136763 May 2014 Li et al. ; 20140136764 May 2014 Li et al. ; 20140156576 June 2014 Nugent ; 20140136761 July 2014 Li et al. ; 20140294272 October 2014 Madabhushi et al. ; 20150324691 November 2015 Dropps et al. ; 20160026912 January 2016 Falcon et al. ; 20160054940 February 2016 Khoueir et al. ; 20170017879 January 2017 Kataeva ; 20170054032 February 2017 Tsukamoto ; 20170098156 April 2017 Nino et al. ; 20170228637 August 2017 Santoro et al. ; 20180039886 February 2018 Umuroglu et al. ; 20180075339 March 2018 Ma et al. ; 20180082181 March 2018 Brothers et al. ; 20180144240 May 2018 Garbin et al. ; 20180315473 November 2018 Yu et al. ; 20180357533 December 2018 Inoue ; 20190065896 February 2019 Lee et al. 
; 20190087715 March 2019 Jeng ; 20190102359 April 2019 Knag ; 20190108436 April 2019 David et al. ; 20190221257 July 2019 Jeng et al. ; 20190251425 August 2019 Jaffari et al. ; 20190280694 September 2019 Obradovic ; 20200034697 January 2020 Choi et al. ; 20200202203 June 2020 Nakayama et al. ; 20200234137 July 2020 Chen et al. ; 20200301668 October 2020 Li ; 20200311523 October 2020 Hoang et al. ; 20210110244 April 2021 Hoang et al. ; 20210192325 June 2021 Hoang ; 20220100508 March 2022 Pawlowski ; 20220179703 June 2022 Vincent ; 110597555 December 2019 ; 110598858 December 2019 ; 2016/042359 March 2016 ; 10-2019-0094679 August 2019
  • Other References: International Search Report & The Written Opinion of the International Searching Authority dated Sep. 11, 2020, International Application No. PCT/US2020/024625. cited by applicant ; English Abstract of JP Publication No. JP2016/042359 published Mar. 31, 2016. cited by applicant ; Rastegari, Mohammad et al., “XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks,” proceedings ECCV 2016, Aug. 2016, 55 pages. cited by applicant ; Wan, Diwen, et al., “TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights,” ECCV 2018, Oct. 2018, 18 pages. cited by applicant ; Chen, Yu-Hsin, et al., “Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks,” IEEE Journal of Solid-State Circuits, Feb. 2016, 12 pages. cited by applicant ; Sun, Xiaoyu, et al., “Fully Parallel RRAM Synaptic Array for Implementing Binary Neural Network with (+1, −1) Weights and (+1, 0) Neurons,” 23rd Asia and South Pacific Design Automation Conference, Jan. 2018, 6 pages. cited by applicant ; Gonugondla, Sujan K., et al., “Energy-Efficient Deep In-memory Architecture for NAND Flash Memories,” IEEE International Symposium on Circuits and Systems (ISCAS), May 2018, 5 pages. cited by applicant ; Nakahara, Hiroki, et al., “A Memory-Based Realization of a Binarized Deep Convolutional Neural Network,” International Conference on Field-Programmable Technology (FPT), Dec. 2016, 4 pages. cited by applicant ; Takeuchi, Ken, “Data-Aware NAND Flash Memory for Intelligent Computing with Deep Neural Network,” IEEE International Electron Devices Meeting (IEDM), Dec. 2017, 4 pages. cited by applicant ; Mochida, Reiji, et al., “A 4M Synapses integrated Analog ReRAM based 66.5 TOPS/W Neural-Network Processor with Cell Current Controlled Writing and Flexible Network Architecture,” Symposium on VLSI Technology Digest of Technical Papers, Jun. 2018, 2 pages. 
cited by applicant ; Chiu, Pi-Feng, et al., “A Differential 2R Crosspoint RRAM Array With Zero Standby Current,” IEEE Transactions on Circuits and Systems—II: Express Briefs, vol. 62, No. 5, May 2015, 5 pages. cited by applicant ; Chen, Wei-Hao, et al., “A 65nm 1Mb Nonvolatile Computing-in-Memory ReRAM Macro with Sub-16ns Multiply-and-Accumulate for Binary DNN AI Edge Processors,” IEEE International Solid-State Circuits Conference, Feb. 2018, 3 pages. cited by applicant ; Liu, Rui, et al., “Parallelizing SRAM Arrays with Customized Bit-Cell for Binary Neural Networks,” DAC '18, Jun. 2018, 6 pages. cited by applicant ; Courbariaux, Matthieu, et al., “Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1,” arXiv.org, Mar. 2016, 11 pages. cited by applicant ; U.S. Appl. No. 62/702,713, filed Jul. 24, 2018. cited by applicant ; U.S. Appl. No. 16/052,420, filed Aug. 1, 2018. cited by applicant ; U.S. Appl. No. 16/368,347, filed Mar. 28, 2019. cited by applicant ; U.S. Appl. No. 16/405,178, filed May 7, 2019. cited by applicant ; U.S. Appl. No. 16/414,143, filed May 16, 2019. cited by applicant ; U.S. Appl. No. 16/368,441, filed Mar. 28, 2019. cited by applicant ; U.S. Appl. No. 16/653,346, filed Oct. 15, 2019. cited by applicant ; Simon, Noah, et al., “A Sparse-Group Lasso,” Journal of Computational and Graphical Statistics, vol. 22, No. 2, pp. 231-245, downloaded by Moskow State Univ Bibliote on Jan. 28, 2014. cited by applicant ; “CS231n Convolutional Neural Networks for Visual Recognition,” [cs231.github.io/neural-networks-2/#reg], downloaded on Oct. 15, 2019, pp. 1-15. cited by applicant ; Krizhevsky, Alex, et al., “ImageNet Classification with Deep Convolutional Neural Networks,” [http://code.google.com/p/cuda-convnet/], downloaded on Oct. 15, 2019, 9 pages. 
cited by applicant ; Shafiee, Ali, et al., “ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars,” ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Oct. 5, 2016, 13 pages. cited by applicant ; Han, Song, et al., “Learning both Weights and Connections for Efficient Neural Networks,” Conference paper, NIPS, Oct. 2015, 9 pages. cited by applicant ; Jia, Yangqing, “Learning Semantic Image Representations at a Large Scale,” Electrical Engineering and CS, University of Berkeley, Technical Report No. UCB/EECS-2014-93, May 16, 2014, 104 pages. cited by applicant ; Wen, Wei, et al., “Learning Structured Sparsity in Deep Neural Networks,” 30th Conference on Neural Information Processing Systems (NIPS 2016), Nov. 2016, 9 pages. cited by applicant ; Wang, Peiqi, et al., “SNrram: An Efficient Sparse Neural Network Computation Architecture Based on Resistive Random-Access Memory,” DAC '18, Jun. 24-29, 2018, 6 pages. cited by applicant ; Notice of Allowance dated Jan. 24, 2022, U.S. Appl. No. 16/368,347, filed Mar. 28, 2019. cited by applicant ; Notice of Allowance dated Mar. 11, 2020, U.S. Appl. No. 16/414,143, filed May 16, 2019. cited by applicant ; International Search Report & The Written Opinion of the International Searching Authority dated Jul. 30, 2020, International Application No. PCT/US2020/024615. cited by applicant ; Chiu, Pi-Feng, et al., “A Binarized Neural Network Accelerator with Differential Crosspoint Memristor Array for Energy-Efficient MAC Operations,” 2019 IEEE International Symposium on Circuits and Systems (ISCAS), May 2019, Abstract only. cited by applicant ; Sun, Xiaoyu, et al., “Low-VDD Operation of SRAM Synaptic Array for Implementing Ternary Network,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, No. 10, Jul. 2017, Abstract only. 
cited by applicant ; Kim, Hyeonuk, et al., “NAND-Net: Minimizing Computational Complexity of In-Memory Processing for Binary Neural Networks,” 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), Mar. 2019, Abstract only. cited by applicant ; English Abstract of KR Publication No. KR 10-2019-0094679 published Aug. 14, 2019. cited by applicant ; Non-Final Office Action dated Jun. 23, 2022, U.S. Appl. No. 16/653,346, filed Oct. 15, 2019. cited by applicant ; Zheng, Shixuan, et al., “An Efficient Kernel Transformation Architecture for Binary- and Ternary-Weight Neural Network Inference,” DAC '18, Jun. 24-29, 2018, 6 pages. cited by applicant ; Notice of Allowance dated Feb. 20, 2020, U.S. Appl. No. 16/405,178, filed May 7, 2019. cited by applicant ; Resch, Salonik, et al., “PIMBALL: Binary Neural Networks in Spintronic Memory,” ACM Trans Arch. Code Optim., vol. 37, No. 4, Article 111, Aug. 2018, 25 pages. cited by applicant ; Zamboni, Maurizio, et al., “In-Memory Binary Neural Networks,” Master's Thesis, Politecnico di Torino, Apr. 10, 2019, 327 pages. cited by applicant ; Natsui, Masanori, et al., “Design of an energy-efficient XNOR gate based on MTJ-based nonvolatile logic-in-memory architecture for binary neural network hardware,” Japanese Journal of Applied Physics 58, Feb. 2019, 8 pages. cited by applicant ; U.S. Appl. No. 16/722,580, filed Dec. 20, 2019. cited by applicant ; Response to Office Action dated Sep. 8, 2022, U.S. Appl. No. 16/653,346, filed Oct. 15, 2019. cited by applicant ; Non-final Office Action dated Sep. 13, 2022, U.S. Appl. No. 16/722,580, filed Dec. 20, 2019. cited by applicant ; Non-final Office Action dated Sep. 15, 2022, U.S. Appl. No. 16/901,302, filed Jun. 15, 2020. cited by applicant ; U.S. Appl. No. 16/901,302, filed Jun. 15, 2020. cited by applicant ; Baugh, Charles R., et al., “A Two's Complement Parallel Array Multiplication Algorithm,” IEEE Transactions on Computers, vol. C-22, No. 12, Dec. 
1973, 3 pages. cited by applicant ; Hoang, Tung Thanh, et al., “Data-Width-Driven Power Gating of Integer Arithmetic Circuits,” IEEE Computer Society Annual Symposium on VLSI, Jul. 2012, 6 pages. cited by applicant ; Choi, Won Ho, et al., “High-precision Matrix-Vector Multiplication Core using Binary NVM Cells,” Powerpoint, Western Digital Research, downloaded on Jun. 15, 2020, 7 pages. cited by applicant ; Ni, Leibin, et al., “An Energy-Efficient Digital ReRAM-Crossbar-Based CNN With Bitwise Parallelism,” IEEE Journal of Exploratory Solid-State Computational Devices and Circuits, May 2017, 10 pages. cited by applicant ; Zhou, Shuchang, et al., “DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients,” [arXiv.org > cs > arXiv:1606.06160], Feb. 2, 2018, 13 pages. cited by applicant ; U.S. Appl. No. 16/908,864, filed Jun. 23, 2020. cited by applicant ; International Search Report & The Written Opinion of the International Searching Authority dated Jul. 9, 2020, International Application No. PCT/US2020/024612. cited by applicant ; Houxiang Ji, et al., “RECOM: An Efficient Resistive Accelerator for Compressed Deep Neural Networks,” in 2018 Design, Automation & Test in Europe Conference & Exhibition, Mar. 23, 2018, Abstract only. cited by applicant ; Yang, Tzu-Hsien, et al., “Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks,” Computer Architecture, pp. 236-249, Jun. 26, 2019, Abstract only. cited by applicant ; Notice of Allowance dated Jul. 12, 2021, U.S. Appl. No. 16/368,441, filed Mar. 28, 2019. cited by applicant ; Kim, Hyeonuk, et al., “NAND-Net: Minimizing Computational Complexity of In-Memory Processing for Binary Neural Networks,” 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), Mar. 2019. 
cited by applicant ; Ji, H., et al., “ReCOM: An efficient resistive accelerator for compressed deep neural networks,” 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018, pp. 237-240. cited by applicant ; Notice of Allowance dated Oct. 14, 2022, U.S. Appl. No. 16/653,346, filed Oct. 15, 2019. cited by applicant
  • Primary Examiner: Vallecillo, Kyle
  • Attorney, Agent or Firm: Vierra Magen Marcus LLP
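The scheme described in the claims can be sketched in software: a ternary weight is encoded in a pair of series-connected cells (claims 1 and 2, with weight 0 as two erased cells), a ternary input becomes a voltage pattern on the pair's word lines, a string conducts only if both cells conduct, and the sensed conduction count is adjusted using the count of zero-valued weights held in a register (claims 5-8). This is an illustrative behavioral model, not the patent's circuit; the specific voltage symbols, cell-conduction rule, and function names (`encode_weight`, `ternary_mac`, etc.) are assumptions chosen to mirror the claim language.

```python
# Behavioral model of ternary-input / ternary-weight in-array MAC.
import random

V_LOW, V_HIGH = 0, 1  # symbolic read voltages on the pair's two word lines

def encode_weight(w):
    """Ternary weight -> (cell1_programmed, cell2_programmed) flags.
    Weight 0 is both cells erased, as in claim 2."""
    return {+1: (True, False), -1: (False, True), 0: (False, False)}[w]

def input_pattern(x):
    """Ternary input -> voltage pattern applied to the two word lines."""
    return {+1: (V_HIGH, V_LOW), -1: (V_LOW, V_HIGH), 0: (V_LOW, V_LOW)}[x]

def string_conducts(w, x):
    """A NAND string conducts only if every cell conducts: an erased cell
    conducts at either voltage, a programmed cell only at V_HIGH."""
    return all((not prog) or v == V_HIGH
               for prog, v in zip(encode_weight(w), input_pattern(x)))

def ternary_mac(xs, ws):
    """Recover sum(x*w) from the sensed conduction count.
    A conducting string is either a matching nonzero input/weight pair
    (product +1) or a zero weight (conducts for any pattern), so the
    count is corrected by the zero-weight register value and the number
    of nonzero pairs."""
    count = sum(string_conducts(w, x) for x, w in zip(xs, ws))
    zero_w = sum(w == 0 for w in ws)  # value held in the claim-5 register
    nonzero = sum(x != 0 and w != 0 for x, w in zip(xs, ws))
    matches = count - zero_w
    return matches - (nonzero - matches)  # matches minus mismatches

random.seed(0)
xs = [random.choice([-1, 0, 1]) for _ in range(64)]
ws = [random.choice([-1, 0, 1]) for _ in range(64)]
assert ternary_mac(xs, ws) == sum(x * w for x, w in zip(xs, ws))
```

In this toy model the zero-weight adjustment is exact because a weight of 0 (two erased cells) conducts under every input pattern, including the all-`V_LOW` zero-input pattern, so its contribution can simply be subtracted from the sensed count.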
