# A Logic-in-Memory Design with 3-Terminal Magnetic Tunnel Junction Function Evaluators for Convolutional Neural Networks

Sumit Dutta,<sup>1</sup> Saima A. Siddiqui,<sup>1</sup> Felix Büttner,<sup>2</sup> Luqiao Liu,<sup>1</sup> Caroline A. Ross,<sup>2</sup> and Marc A. Baldo<sup>1</sup> <sup>1</sup>Department of Electrical Engineering and Computer Science, <sup>2</sup>Department of Materials Science and Engineering

Massachusetts Institute of Technology Cambridge, MA 02139

Email: sumitd@alum.mit.edu

Abstract—Analog implementations of neuromorphic circuits within digital systems are increasingly becoming attractive due to the high throughput and low energy per operation they offer. Magnetic logic devices based on spin-orbit torque offer a pathway to low-power resistive analog circuits. We present the magnetic tunnel junction (MTJ) function evaluator, a design for a logic device that evaluates nonlinear and linear functions for neural networks. We model the device and extend it into a functional design implementation in a logic-in-memory architecture in a hybrid process with magnetic device layers and 45 nm CMOS.

*Index Terms*—MTJ, logic-in-memory, neural networks, crosspoint array, micromagnetic models, domain walls.

## I. INTRODUCTION

Deep neural networks have become effective in automating tasks such as image and speech recognition, but are increasingly bottlenecked by processing limitations [1]-[3]. While deep neural networks are being optimized for conventional hardware, further reductions in latency and power could arise from dedicated circuits based on emerging nonvolatile devices such as resistive memory or magnetic devices. We propose using magnetic domain wall (DW) devices for logic and memory applications. This technology provides a variable resistor that is read and written electrically [1], and we show here that it can be designed to evaluate an arbitrary function. The variable resistor is a magnetic tunnel junction (MTJ), which has a fixed magnetic layer and a free magnetic layer to provide nominally two resistance states, $R_P$ and $R_{AP}$. When there is a moving DW in the soft free layer under the MTJ, the MTJ resistance value $R_{MTJ}$ can take one of many values between $R_P$ and $R_{AP}$ depending on the position of the DW. The wall is moved by an electrical current through spin-orbit torque (SOT) [4], [5], which requires less current to move a wall than spin-transfer torque (STT) [6]. This technology is compatible with conventional complementary metal-oxide-semiconductor (CMOS) circuits in electrical design and in process technology. We propose a logic-in-memory system to implement an efficient convolutional neural network (CNN) and high-density memory based on magnetic logic components.



Fig. 1. Design of the MTJ function evaluator. (a) Device drawing. (b) Micromagnetic model of the device with a domain wall whose motion, driven by spin-orbit torque, varies locally with the current density along the wire.

## A. Device and Circuit Design Approach

A 3-terminal MTJ, shown in Fig. 1, is a device whose resistance value is set by an electrical current from the input terminal to the ground terminal, and is read by an electrical current through the tunnel junction terminal on top.

The line edge roughness (LER) of a magnetic nanowire leads to a discrete number of DW positions along the nanowire. Pinning may also be intrinsic from wire anisotropy variations [7], [8]. The fundamental limit to the number of DW positions in a nanowire ultimately sets how many resistance values can be used for $R_{MTJ}$. Nevertheless, the discrete domain wall positions are spaced closely enough to provide the resolution needed in this work.

Our logic-in-memory system design borrows several architectural elements from [9], a logic-in-memory system based on resistive random access memory (RRAM). MTJs have a tunable resistance, like memristors in RRAM [10]. Furthermore, neural networks can be designed using MTJs [11]. Here we show that MTJs, often used for data storage, can also be used for the evaluation of nonlinear functions, a significant advantage for systems that seek to integrate logic in memory for the purposes of accelerating hardware implementations of neuromorphic algorithms.

The main contributions of this work are highlighted in the following sections. Section II describes the design of an MTJ function evaluator, a device that can evaluate any custom linear or nonlinear function used in convolutional neural networks. Section III describes a system architecture for a logic-in-memory using the MTJ function evaluator, including the relevant circuits. The paper then reports results from simulation in Section IV and concludes with Section V.

## II. THE MTJ FUNCTION EVALUATOR

Let $x_0(I_{IN})$ be the function that maps an input current $I_{IN}$ to the output domain wall position $x_0$. The output of the MTJ function evaluator is an analog resistance value, which can be read with a small signal or a sense amplifier. The output resistance of our device is given by:

$$R_{MTJ} = R_P\left(\frac{x_0}{L}\right) + R_{AP}\left(1 - \frac{x_0}{L}\right) \tag{1}$$

where  $x_0$  is the final domain wall position in an MTJ of length L. We derive the width w(x) as a function of distance x along the wire from the initial DW position, necessary to satisfy  $x_0(I_{IN})$ , following from equations with the final DW position. We start with the DW velocity function v(x):

$$v(x) = \begin{cases} 0, & J < J_0 \text{ (i.e., } x = 0) \\ \eta \left( J(x) - J_0 \right), & J \ge J_0 \text{ (i.e., } x > 0) \end{cases} \tag{2}$$

where the critical current density for DW motion is  $J_0$ , and  $\eta$  is the proportionality constant between the current density and domain wall velocity. Next, we note that the input current, I, is related to the current density by:

$$\frac{dx}{dt} = \eta \left( J \left( x \right) - J_0 \right) = \eta \frac{I \left( x_0 \right) - I_0}{w \left( x \right) d} \tag{3}$$

where d is the thickness of the nanowire and  $I_0$  is the critical current. If we assume the current is applied in a pulse of width  $t_0$  and if we assume x > 0, it is possible to show that:

$$w(x_0) = \frac{\eta t_0}{d} \left( \frac{dI}{dx_0} \right) \tag{4}$$

Here, $I(x_0)$ is the inverse function of $x_0(I_{IN})$. Because only the derivative of $I(x_0)$ enters (4), any constant offset in $I(x_0)$ need not be known to obtain $w(x)$. The shape with width $w(x)$ is then fabricated.
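The step from (3) to (4) can be filled in as follows. This is a sketch of one route, assuming the DW starts at $x = 0$ when the pulse begins, that the current is held constant for the pulse duration $t_0$, and that $x > 0$ throughout so the second branch of (2) applies. Multiplying (3) by $w(x)\,d$ and integrating over the pulse gives the second line; differentiating both sides with respect to the final position $x_0$ gives the third, which rearranges to (4):

$$\begin{aligned} w(x)\,d\,\frac{dx}{dt} &= \eta\left(I(x_0) - I_0\right) \\ d\int_0^{x_0} w(x)\,dx &= \eta\,t_0\left(I(x_0) - I_0\right) \\ d\,w(x_0) &= \eta\,t_0\,\frac{dI}{dx_0} \end{aligned}$$

The constant $I_0$ drops out in the last step, which is why the offset does not affect the width profile.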

Further, there is a constraint on the desired transfer function that $x_0$ must increase monotonically with $I_{IN}$. However, there are no further constraints on higher order derivatives of the function.

Fig. 2. (a) Transfer characteristic from input current to output domain wall position and resistance, determined in micromagnetic simulation. Normalized data are on the left and bottom axes and actual data are on the top and right axes. (b) Width function required for the shown shifted sigmoid thresholding function. (c) Current density profile due to the width profile.

## A. Function Implementation with a Thresholding MTJ

In a CNN, the thresholding function, or activation function, performed on the dot product output is an arbitrary nonlinear function. We choose a shifted sigmoid function in this work, and derive the width function required for an implementation with a single MTJ function evaluator device.

We have the following general input to output analytical relation for a shifted sigmoid thresholding function, seen in Fig. 2(a):

$$x_0(I) = x_A \tanh\left(\frac{I - I_1}{I_2}\right) + x_B \tag{5}$$

TABLE I Analytical model fit parameters for Fig. 2(a) using the shifted sigmoid equation (5)

| Parameter | Data Fit<br>(Normalized) | Data Fit<br>(Actual) | Ideal         |
|-----------|--------------------------|----------------------|---------------|
| $x_A$     | 0.4542                   | 373 nm               | $\frac{1}{2}$ |
| $x_B$     | 0.4138                   | 348 nm               | $\frac{1}{2}$ |
| $I_1$     | 0.3900                   | 4.47 µA              | $\frac{1}{2}$ |
| $I_2$     | 0.2630                   | 3.37 µA              | $\frac{1}{4}$ |

The MTJ width function required for the shifted sigmoid thresholding function is determined using (4), which requires the derivative of the inverse function of (5). Applying the known result $\frac{d}{du} \operatorname{arctanh} u = (1 - u^2)^{-1}$ for $u \in (-1, 1)$ gives:

$$w(x) = \frac{\eta t_0}{d} \left( \frac{x_A I_2}{x_A^2 - (x - x_B)^2} \right) \tag{6}$$

Table I shows values for the parameters in (5) to best match the micromagnetic modeling results. Note that (5) has an ideal shifted sigmoidal shape if the ideal parameter values in Table I are used. In Fig. 2, the input I and output  $x_0$  are also shown as normalized, yielding a plot of  $x_0(I_{IN})$ .
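As an illustration of how (5) and (6) fit together, the following minimal Python sketch evaluates the shifted sigmoid with the ideal normalized parameters from Table I and computes the corresponding width profile. The function names are hypothetical, and the prefactor $\eta t_0 / d$ is set to 1 as a placeholder because $\eta$ is not specified here, so the widths are in arbitrary units.

```python
import math

# Ideal normalized parameters from Table I for the shifted sigmoid in (5).
X_A, X_B, I_1, I_2 = 0.5, 0.5, 0.5, 0.25


def dw_position(i_in):
    """Eq. (5): normalized final DW position x0 for a normalized input current."""
    return X_A * math.tanh((i_in - I_1) / I_2) + X_B


def input_current(x0):
    """Inverse of (5): normalized current that leaves the DW at position x0."""
    return I_1 + I_2 * math.atanh((x0 - X_B) / X_A)


def width(x, prefactor=1.0):
    """Eq. (6): wire width at x. prefactor stands in for eta*t0/d (assumed 1)."""
    return prefactor * X_A * I_2 / (X_A ** 2 - (x - X_B) ** 2)


if __name__ == "__main__":
    for i_in in (0.25, 0.50, 0.75):
        x0 = dw_position(i_in)
        print(f"I = {i_in:.2f}  x0 = {x0:.3f}  w(x0) = {width(x0):.3f}")
```

The width in (6) grows without bound as $x$ approaches $x_B \pm x_A$, so a fabricated wire can only realize the profile over a truncated range of positions.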

The material parameters listed in Table II are chosen such that the wire exhibits perpendicular magnetic anisotropy (PMA). The device geometry and process information are in Table III. The saturation magnetization is low enough such that the terminal domain wall velocity is not reached [12].

We use Mumax3 to simulate the micromagnetic dynamics in the device, accounting for spin-orbit torque [13]. The parameters for the micromagnetic models are listed in Table II, including relevant SOT parameters for the spin Hall effect (SHE) and the Dzyaloshinskii-Moriya interaction (DMI), which together allow charge currents in the underlying layer to move the domain wall through spin currents. While the domain wall velocity depends on the input current in spin-transfer torque (STT) devices [6], SHE and DMI greatly enhance the efficiency, providing greater domain wall velocity for a given input current. The completed micromagnetic simulations from Mumax3 are then visualized in the Object-Oriented Micromagnetic Framework (OOMMF) [14].

Current advances in devices based on SHE allow the critical current density to be further reduced to make circuits such as that presented here possible [4], [12].

## III. LOGIC-IN-MEMORY SYSTEM DESIGN

We design a system that can perform an analog multiply-accumulate for convolutional neural networks, and store and load data as memory. Our architecture is based on the MTJ function evaluator, which we use in a crosspoint array as a synapse and as a thresholding function.

## A. Crosspoint Array

A CNN has a synaptic function and an activation function. Figure 3(a) shows the synaptic function where a generic analog multiply-accumulate is performed using programmed conductances $G_{i,j}$ in each row *i* and column *j*. Figure 3(b) shows the activation function.

TABLE II MICROMAGNETIC MODEL PARAMETERS FOR 3-TERMINAL MTJS USING CoFeB WITH PERPENDICULAR MAGNETIC ANISOTROPY

| Parameter                                              | Value                                |
|--------------------------------------------------------|--------------------------------------|
| Damping parameter $(\alpha)$                           | 0.01                                 |
| Saturation magnetization $(M_{sat})$                   | $7.96 \times 10^6$ A/m               |
| Out-of-plane anisotropy $(K_z)$                        | $7.82 \times 10^5 \text{ J/m}^3$     |
| Spin Hall angle $(\theta_{SH})$                        | 0.15                                 |
| Exchange constant (A)                                  | $10^{-11} \text{ J/m}$               |
| Simulation mesh cell size                              | 1.5 nm $\times$ 1.5 nm $\times$ 2 nm |
| Interfacial Dzyaloshinskii-Moriya strength $(D_{DMI})$ | -0.5 mJ/m<sup>2</sup>                |

TABLE III Device model parameters for the 45 nm Cadence GPDK CMOS process technology hybrid with an MTJ process technology

| Parameter                                                 | Value      |
|-----------------------------------------------------------|------------|
| **NMOS and PMOS from 45 nm CMOS PDK**                     |            |
| Gate length, $L_{MG}$ (nm)                                | 45         |
| Minimum width, $w_{M0}$ (nm)                              | 120        |
| **3-Terminal MTJ Compatible with 45 nm CMOS**             |            |
| Tunnel magnetoresistance<br>(TMR = $(R_{AP} - R_P)/R_P$ ) | 100%       |
| Resistance limits: $R_{AP}$ , $R_P$ (k $\Omega$ )         | 25.8, 12.9 |
| Critical current to set $R_{AP}$ : $I_{C,AP}$ (µA)        | -7.6       |
| Critical current to set $R_P$ : $I_{C,P}$ (µA)            | 7.6        |
| Length, width of the magnetic layer (nm)                  | 900, 36    |
| Length of the MTJ above the domain wall layer (nm)        | 600        |
| Thickness of the heavy metal, magnetic layers (nm)        | 1, 2       |

In our crosspoint array in Fig. 4, we implement the current summation in (7) with analog voltages  $V_i$  on the inputs, and we implement the thresholding function in (8) by providing an analog voltage  $V_{OUT,j}$  on the output that depends on  $I_j$ :

$$I_j = \sum_i V_i G_{i,j} \tag{7}$$

$$V_{OUT,j} = f(I_j) \tag{8}$$
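The column current summation in (7) followed by the thresholding in (8) can be sketched in a few lines of plain Python. This is only a behavioral illustration: the function names, the example voltages, and the choice of a normalized shifted sigmoid as a stand-in for $f$ are assumptions, and the sketch ignores line impedances, access transistors, and leakage.

```python
import math


def column_currents(voltages, conductances):
    """Eq. (7): I_j = sum_i V_i * G_ij for every column j."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def threshold(i_j, i_1, i_2):
    """Stand-in for f in (8): a shifted sigmoid normalized to [0, 1] (assumed)."""
    return 0.5 * math.tanh((i_j - i_1) / i_2) + 0.5


if __name__ == "__main__":
    # Hypothetical 2x2 array with conductances at the Table III limits.
    g_lo, g_hi = 1 / 25.8e3, 1 / 12.9e3
    G = [[g_hi, g_lo],
         [g_lo, g_hi]]
    V = [0.10, 0.05]  # input voltages in volts (assumed values)
    I = column_currents(V, G)
    # Current scale parameters taken from the "Actual" column of Table I.
    out = [threshold(i, 4.47e-6, 3.37e-6) for i in I]
    for j, (i, o) in enumerate(zip(I, out)):
        print(f"column {j}: I = {i * 1e6:.2f} uA, normalized output = {o:.3f}")
```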

Each 3-terminal MTJ in the crosspoint array has only one access transistor. While current paths could be further decoupled with additional access transistors for each MTJ [15], this work trades that for less area and adds other line impedances to inactive cells to reduce leakage. The WL access transistors ensure accurate writing. The resistive circuit model for 3-terminal MTJs in Fig. 4(a) is based on the model in [16].

## B. System Architecture

The architecture for the logic-in-memory system, including the crosspoint array and its associated circuits, is shown in Fig. 5. The cell arrays are physically identical but separated into memory and CNN subarrays as in [9], but here the CNN data flow is able to cycle fully in the analog domain.

We do not implement column multiplexing in our design, but suggest it as an option for wider memory subarrays. This is because reading our multi-level cell MTJ makes multiple bits of data available at once. In other words, full words come from columns with high bit resolution, requiring less multiplexing.

Fig. 3. General approach to using Ohm's law to perform an analog multiplication for deep convolutional neural networks. (a) The synaptic function performs the dot product. (b) The activation function evaluator performs thresholding based on a generally nonlinear function $f(I_j)$.

Fig. 4. (a) Circuit symbols and models for the synaptic MTJ and thresholding MTJ. (b) Crosspoint array using synaptic MTJs and thresholding MTJs as analog logic and memory elements.

## C. Voltage Drivers

Figure 6 shows the voltage driver circuits using transmission gate multiplexers. A common source amplifier with an active load is used because it provides a high gain magnitude at a low power expense. Since the common source amplifier has negative gain, it has an inverting effect on data. We note from (1) that high $I_{IN}$ results in high $x_0$, in turn resulting in low $R_{MTJ}$. Thus, when a small voltage is read in the voltage divider across $R_{MTJ}$, the output can be made large again by an inverting amplifier, i.e., one with negative gain. The common source amplifier designed and shown in Fig. 6(c) has a sufficient gain of -4.0, which could be higher in magnitude in exchange for more area.

Fig. 5. Architecture of the logic-in-memory system.

Fig. 6. (a) Read-line driver. (b) Bitline driver. (c) Common source amplifier with an active load used in the driver circuits. $RL_i$ and $BL_j$ are seen in Fig. 7.

## D. The Sense Amplifier

Fig. 7. (a) Feed-forward network operation and (b) memory operations: reading (dashed lines only) and writing (solid lines only) weights.

The sense amplifier for analog to digital conversion is the multi-bit sense amplifier in [17], which is used by [9] and is also suitable for this logic-in-memory. This sense amplifier compares a variable resistor to multiple reference resistors with control logic to encode the resistance level in a multi-bit word.
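The idea of comparing a variable resistance against several references can be illustrated with a short behavioral sketch. It is not the circuit of [17]: the evenly spaced reference values, the thermometer-style count, and all function names are assumptions made for illustration, with the resistance limits taken from Table III and the DW position mapped to resistance through (1).

```python
R_P, R_AP = 12.9e3, 25.8e3  # resistance limits from Table III, in ohms


def r_mtj(x0_over_l):
    """Eq. (1): MTJ resistance for a DW at fractional position x0/L."""
    return R_P * x0_over_l + R_AP * (1.0 - x0_over_l)


def reference_resistors(bits):
    """Assumed references: midpoints between 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = (R_AP - R_P) / (levels - 1)
    return [R_P + (k + 0.5) * step for k in range(levels - 1)]


def sense(resistance, references):
    """Encode a resistance as the number of references it exceeds."""
    return sum(resistance > ref for ref in references)


if __name__ == "__main__":
    refs = reference_resistors(3)  # 7 references for a 3-bit word
    for k in range(8):
        frac = 1.0 - k / 7.0  # DW position x0/L for the k-th level
        r = r_mtj(frac)
        print(f"x0/L = {frac:.3f}  R = {r / 1e3:.2f} kOhm  code = {sense(r, refs):03b}")
```

With eight evenly spaced DW positions, each position maps to a distinct 3-bit code, with the code increasing as the resistance rises from $R_P$ toward $R_{AP}$.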

## E. Direct Layer to Layer Connections

Figure 8 shows that a single layer of a deep neural network implemented in a CNN subarray can have its output feed directly into another layer, meaning that no additional overhead is needed for the analog to digital conversion, storage in RAM, and digital to analog conversion as typical in other CNN implementations. The connection for the analog signal between the thresholding function output and the input to another layer is a long interconnect with an amplifier. The amplifier in Fig. 8 is the common source amplifier in Fig. 6(c). In the overall system architecture, one CNN subarray output could be connected with a long interconnect and amplifier to one or more inputs of another CNN subarray. Connections between CNN subarrays are programmed with multiplexers. Direct connections between layers speed up deep CNNs.

## F. System Operation

Fig. 8. Any number of layers can be processed in the analog domain, without having to convert each layer result from analog to digital back to analog.

Figure 7 illustrates how the logic-in-memory system is operated in perceptron mode or in memory mode. In perceptron mode, a feed-forward network layer involves these steps:

1) Reset the thresholding MTJs by setting all $RST_j = 1$ and inject a DW with RSTP.
2) Set $RST_j = 0$ and $WL_i = 0$. Apply input voltages on $RL_i$. The currents add up on $BL_j$. For each column, the current into the thresholding MTJ sets its resistance.
3) Set $BL_j = 0$. Pass the output to the next layer (Fig. 8). If on the final cycle, sense the thresholding MTJ resistances.

Only one cycle is required to operate each CNN layer.

In memory mode, the steps to read a row of bits are: select one row with $RL_i = 0$, set the other $RL_i = Z$, set $WL_i = 0$, and sense the resistance of each MTJ on $BL_j$.

In both modes, the steps to write the synaptic MTJs are: write one row in the matrix, driving  $WL_i = 1$ ;  $RL_i = Z$ ;  $SL_j = 0$ ; repeat for every row, injecting a DW with each  $BL_j$ . Since all weights in a row are written together, CNNs could be trained efficiently in this design.

## IV. DISCUSSION

The logic-in-memory system was modeled in a hybrid process with 45 nm CMOS from the Cadence GPDK and a scaled magnetic tunnel junction process based on [1]. Simulations were performed in Cadence Spectre using the process parameters in Table III and results are in Table IV.

Power is reported directly from simulations. However, latency also accounts for the MTJ switching time. In the micromagnetic models, our pulse length  $t_0$  is 4 ns in Fig. 2, which sets the MTJ switching time. The critical current density requirement is met because the MTJ write currents set the background current density to over  $10^{11}$  A/m<sup>2</sup>, well above values reported in the literature [18].
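The order of magnitude of the quoted current density can be checked with a short calculation. The sketch below assumes, for illustration only, that the full critical write current from Table III flows through the magnetic layer cross-section at its nominal 36 nm width and 2 nm thickness; the real width varies along the wire according to $w(x)$, and some current is shared with the heavy metal layer.

```python
def current_density(current_a, width_m, thickness_m):
    """Return the current density J = I / (w * d) in A/m^2."""
    return current_a / (width_m * thickness_m)


if __name__ == "__main__":
    i_write = 7.6e-6  # critical write current magnitude from Table III, in A
    w = 36e-9         # magnetic layer width from Table III, in m
    d = 2e-9          # magnetic layer thickness from Table III, in m (assumed cross-section)
    j = current_density(i_write, w, d)
    print(f"J = {j:.3e} A/m^2")
```

Under these assumptions the result is roughly $1.06 \times 10^{11}$ A/m<sup>2</sup>, consistent with the figure quoted above.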

In our devices, CoFeB is used for the magnetic layers, Ta forms the heavy metal layer underneath, MgO is used for the insulating layer, and Ru is used for the contacts. These materials are chosen such that the device exhibits PMA behavior. The MTJ resistance-area product varies exponentially with the MgO thickness. The resistivities we assume are based on our previous work in [1] and parameters reported in [4]:

$$\rho_{\rm CoFeB} = 2.1 \times 10^{-6} \ \Omega \text{-m} \tag{9}$$

$$\rho_{\rm Ta} = 1.9 \times 10^{-6} \ \Omega \text{-m} \tag{10}$$

$$\rho_{\rm Ru} = 5.0 \times 10^{-7} \ \Omega \text{-m} \tag{11}$$

TABLE IV Power and delay analysis for the logic-in-memory in Fig. 5

| Operation and metric                       | Value   |
|--------------------------------------------|---------|
| **Feed-forward network operation**         |         |
| Static power for $2 \times 2$ convolution  | 68.6 µW |
| Dynamic power for $2 \times 2$ convolution | 15.4 nW |
| Latency per layer                          | 4 ns    |
| **Weight or memory cell write operation**  |         |
| Static power for $2 \times 2$ array        | 68.6 µW |
| Dynamic power for $2 \times 2$ array       | 10.7 nW |
| Latency                                    | 4 ns    |
| **Memory cell read operation**             |         |
| Static power for $2 \times 2$ array        | 68.6 µW |
| Dynamic power for $2 \times 2$ array       | 129 µW  |
| Latency                                    | 2 ns    |
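A rough energy-per-operation figure can be estimated from Table IV by multiplying total power by latency. This back-of-the-envelope estimate is not a simulated result: it assumes that static and dynamic power are both drawn for the whole latency window, and the function and dictionary names are hypothetical.

```python
def energy_per_op(static_w, dynamic_w, latency_s):
    """Energy estimate in joules: total power times latency."""
    return (static_w + dynamic_w) * latency_s


# (static power in W, dynamic power in W, latency in s) taken from Table IV.
OPERATIONS = {
    "feed-forward": (68.6e-6, 15.4e-9, 4e-9),
    "write": (68.6e-6, 10.7e-9, 4e-9),
    "read": (68.6e-6, 129e-6, 2e-9),
}

if __name__ == "__main__":
    for name, (p_static, p_dynamic, latency) in OPERATIONS.items():
        energy_pj = energy_per_op(p_static, p_dynamic, latency) * 1e12
        print(f"{name}: {energy_pj:.3f} pJ")
```

Under these assumptions, each operation on the $2 \times 2$ array comes out at a fraction of a picojoule, dominated by static power for the feed-forward and write cases.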

The convolutional neural network based on this logic-in-memory system has several advantages over conventional designs. The variable resistance provided by the synaptic MTJ gives high accuracy at a low energy cost; the same scaled CNN on a general-purpose processor consumes at least $50\times$ more energy [3]. Our design overcomes the limitation of binary MTJ designs, noting that the sense amplifiers and voltage drivers are designed with a bit resolution matching the available MTJ DW positions, i.e., $R_{MTJ}$ values. The available DW locations are determined by self-affine statistics or from patterned notches, and we predict 3-bit MTJ resolution based on our pinning models [2], [19]. Our design ensures correct operation even though the TMR of 100% here is modest compared to the highest reported TMR of 604% [1], [20]. The ability to store layer outputs in the inherently nonvolatile thresholding MTJ reduces the need for the significant overhead of storing or converting the activation function results.

## V. CONCLUSION

We present a 3-terminal MTJ device that can evaluate nonlinear functions. The device is applied to a logic-in-memory system design with MTJs and CMOS. The use of MTJ function evaluators increases the density of information in the logic-in-memory system and provides significant speedup for deep convolutional neural networks in the system where many layers are run exclusively in the analog domain.

## ACKNOWLEDGMENT

The authors thank E. Lage, E. F. Martin, D. I. Paul, and E. Rosenberg for discussions. S. Dutta acknowledges support from the National Science Foundation Fellowship.

This project was supported by the National Science Foundation award 1639921, and the Nanoelectronics Research Corporation, a subsidiary of the Semiconductor Research Corporation, through Memory, Logic, and Logic in Memory Using Three Terminal Magnetic Tunnel Junctions, an SRC-NRI Nanoelectronics Research Initiative under Research Task ID 2700.002.

## REFERENCES

- [1] J. A. Currivan-Incorvia, S. Siddiqui, S. Dutta, E. R. Evarts, J. Zhang, D. Bono, C. A. Ross, and M. A. Baldo, "Logic circuit prototypes for three-terminal magnetic tunnel junctions with mobile domain walls," *Nat. Commun.*, vol. 7, p. 10275, Jan. 2016.
- [2] X. Jiang, L. Thomas, R. Moriya, and S. S. P. Parkin, "Discrete domain wall positioning due to pinning in current driven motion along nanowires," *Nano Lett.*, vol. 11, no. 1, pp. 96–100, Dec. 2011.
- [3] G. W. Burr, P. Narayanan, R. M. Shelby, S. Sidler, I. Boybat, C. Di Nolfo, and Y. Leblebici, "Large-scale neural networks implemented with non-volatile memory as the synaptic weight element: Comparative performance analysis (accuracy, speed, and power)," in *IEEE Int. Electron Devices Meet.*, Dec. 2015, pp. 4.4.1–4.4.4.
- [4] L. Liu, C.-F. Pai, Y. Li, H. W. Tseng, D. C. Ralph, and R. A. Buhrman, "Spin-Torque Switching with the Giant Spin Hall Effect of Tantalum," *Science*, vol. 336, no. 6081, pp. 555–558, May 2012.
- [5] K.-S. Ryu, L. Thomas, S.-H. Yang, and S. S. P. Parkin, "Chiral spin torque at magnetic domain walls," *Nat. Nanotechnol.*, vol. 8, no. 7, pp. 527–533, Jun. 2013.
- [6] G. S. D. Beach, M. Tsoi, and J. L. Erskine, "Current-induced domain wall motion," *J. Magnetism and Magnetic Materials*, vol. 320, no. 7, pp. 1272–1281, Apr. 2008.
- [7] T. Koyama, D. Chiba, K. Ueda, K. Kondou, H. Tanigawa, S. Fukami, T. Suzuki, N. Ohshima, N. Ishiwata, Y. Nakatani, K. Kobayashi, and T. Ono, "Observation of the intrinsic pinning of a magnetic domain wall in a ferromagnetic nanowire," *Nat. Mater.*, vol. 10, no. 3, pp. 194–197, Feb. 2011.
- [8] J. S. Urbach, R. C. Madison, and J. T. Markert, "Interface depinning, self-organized criticality, and the Barkhausen effect," *Phys. Rev. Lett.*, vol. 75, no. 2, pp. 276–279, Jul. 1995.
- [9] P. Chi, S. Li, C. Xu, T. Zhang, J. Zhao, Y. Liu, Y. Wang, and Y. Xie, "PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in ReRAM-based Main Memory," in *IEEE Int. Symp. Comput. Archit.*, Jun. 2016, pp. 27–39.
- [10] B. Li, P. Gu, Y. Wang, and H. Yang, "Exploring the Precision Limitation for RRAM-Based Analog Approximate Computing," *IEEE Des. Test*, vol. 33, no. 1, pp. 51–58, Feb. 2016.
- [11] A. Sengupta, Y. Shim, and K. Roy, "Proposal for an All-Spin Artificial Neural Network: Emulating Neural and Synaptic Functionalities Through Domain Wall Motion in Ferromagnets," *IEEE Trans. Biomed. Circuits Syst.*, vol. 10, no. 6, pp. 1152–1160, Dec. 2016.
- [12] E. Martinez, S. Emori, N. Perez, L. Torres, and G. S. D. Beach, "Current-driven dynamics of Dzyaloshinskii domain walls in the presence of in-plane fields: Full micromagnetic and one-dimensional analysis," *J. Appl. Phys.*, vol. 115, no. 21, p. 213909, Jun. 2014.
- [13] A. Vansteenkiste, J. Leliaert, M. Dvornik, M. Helsen, F. Garcia-Sanchez, and B. Van Waeyenberge, "The design and verification of MuMax3," *AIP Adv.*, vol. 4, no. 10, p. 107133, Oct. 2014.
- [14] "OOMMF: Object Oriented MicroMagnetic Framework," 2016. [Online]. Available: http://math.nist.gov/oommf
- [15] A. Sengupta, A. Banerjee, and K. Roy, "Hybrid Spintronic-CMOS Spiking Neural Network with On-Chip Learning: Devices, Circuits and Systems," *Phys. Rev. Appl.*, vol. 6, p. 064003, Nov. 2016.
- [16] J. A. Currivan, Y. Jang, M. D. Mascaro, M. A. Baldo, and C. A. Ross, "Low energy magnetic domain wall logic in short, narrow, ferromagnetic wires," *IEEE Magn. Lett.*, vol. 3, p. 3000104, Apr. 2012.
- [17] J. Li, C. I. Wu, S. C. Lewis, J. Morrish, T. Y. Wang, R. Jordan, T. Maffitt, M. Breitwisch, A. Schrott, R. Cheek, H. L. Lung, and C. Lam, "A novel reconfigurable sensing scheme for variable level storage in phase change memory," in *IEEE Int. Mem. Work.*, May 2011, pp. 1–4.
- [18] E. Martinez, S. Emori, and G. S. D. Beach, "Current-driven domain wall motion along high perpendicular anisotropy multilayers: The role of the Rashba field, the spin Hall effect, and the Dzyaloshinskii-Moriya interaction," *Appl. Phys. Lett.*, vol. 103, no. 7, p. 072406, Aug. 2013.
- [19] S. Dutta, S. A. Siddiqui, J. A. Currivan-Incorvia, C. A. Ross, and M. A. Baldo, "Micromagnetic modeling of domain wall motion in sub-100-nm-wide wires with individual and periodic edge defects," *AIP Adv.*, vol. 5, no. 12, p. 127206, Dec. 2015.
- [20] S. Ikeda, J. Hayakawa, Y. Ashizawa, Y. M. Lee, K. Miura, H. Hasegawa, M. Tsunoda, F. Matsukura, and H. Ohno, "Tunnel magnetoresistance of 604% at 300 K by suppression of Ta diffusion in CoFeB/MgO/CoFeB pseudo-spin-valves annealed at high temperature," *Appl. Phys. Lett.*, vol. 93, no. 8, p. 082508, Aug. 2008.