# Design of parallel prefix adder using multiplexer selection logic

Pavan Thatipamula<sup>1</sup>, Dr. Sunil Singarapu<sup>2</sup>

<sup>1</sup> PG Scholar, Department of Electronics and Communication Engineering, Chaitanya (Deemed to be University) Hanamkonda, Warangal, Telangana-506001 Email: pavanthatipamula6@gmail.com

<sup>2</sup> Associate professor, Department of Electronics and Communication Engineering, Chaitanya (Deemed to be University) Hanamkonda, Warangal, Telangana-506001. Email: sunil.ece@chaitanya.edu.in

## ABSTRACT

The parallel prefix adder (PPA) is the basic functional unit for performing modular arithmetic in various cryptography and pseudorandom bit generator (PRBG) algorithms. The carry-save adder is the area-efficient and widely adopted technique for performing three-operand binary addition in the modular arithmetic used in cryptography algorithms and PRBG methods. However, the longer carry propagation delay in the ripple-carry stage of the carry-save adder seriously degrades the performance of VLSI devices and other cryptography architectures on IoT-based hardware devices, and the existing methods suffer from high computational complexity. Thus, this article proposes a Multiplexer Selection Brent-Kung adder (MSBKA) based parallel prefix adder. Moreover, a PPA using MSBKA is also used for three-operand addition, which significantly reduces the critical path delay at the cost of additional hardware. The simulation results show that the proposed method achieves reduced area, delay, and power consumption compared to traditional adders.

# 1. INTRODUCTION

On a VLSI device, the complexity of the implemented signal processing systems keeps increasing because the scale of integration keeps growing. These signal processing applications not only need a large computation capability but also consume a significant amount of power. In today's VLSI system design, performance and area are still the two most important design considerations; nevertheless, power consumption has emerged as an issue of paramount importance. Two primary drivers contribute to the need for low-power VLSI systems. First, as the operating frequency and processing capacity per chip continue to increase steadily, large currents need to be supplied, and the heat caused by high power consumption has to be removed using appropriate cooling methods. Second, the amount of time that portable electronic devices can operate on a single charge is limited. A design that consumes less power gives these portable devices a much longer operational life.

Addition is a fundamental mathematical operation that, in most cases, has a significant influence on the overall performance of digital systems. Adders are the most common arithmetic circuits used in electronic applications. Multipliers and digital signal processors (DSPs) use them to carry out a variety of algorithms, such as FFT, FIR, and IIR. Adders enter the discussion whenever multiplication is discussed. As is well known, microprocessors are capable of carrying out millions of instructions in a single second. In the design of multipliers, the most crucial factor to take into account is the maximum achievable processing speed. Miniaturization of the device should be prioritized, and power consumption should be kept to a minimum, so that the device can be easily transported. Mobile phones, laptops, and other electronic devices demand a longer battery backup.

# 2. LITERATURE SURVEY

ALUs (arithmetic logic units), HPMs (high-speed multipliers), AMIs (advanced microprocessor designs), and DSPs (digital signal processors) are the four pillars of modern.

The least mean square (LMS) technique has higher stability and quicker convergence than other algorithms, making it ideal for adaptive filters, which have a broad variety of applications but are most often employed in DSP. This architecture features a very simple register layout [1]. That work explains this feature, along with its potential to support large input sampling rates. Constructing a DLMS adaptive filter that has a low adaptation latency and an efficient pipelined design is feasible, and doing so allows for improved convergence performance. When doing calculations using Vedic mathematics, an 8-bit multiplier can be helpful. DSP [2] relies heavily on adders as a fundamental building block. The DLMS adaptive filter can minimise register complexity, enable faster convergence, and achieve a very small adaptation delay, all without requiring a pipelining technique. With zero adaptation delay, the DLMS method offers superior performance compared to other algorithms. The DLMS algorithm needs the least energy per sample (EPS) [3] and the least area compared to other designs. In applications involving electroencephalograms (EEGs), the DLMS method may also be used for interference cancellation [4].

The LMS adaptive algorithm and its variants may also be implemented in a variety of ways beyond those described. Only a few of the many possible applications of these methods for DSP are discussed in [5], although they may be used extensively there. Since the multiplier can become a bottleneck, the DA method may be used to replace the multiplier block of the LMS algorithm and emulate the multiplication operation [6]. Using the radix-8 Booth method significantly reduces the number of partial products that the DA design produces. DA is used for bit-serial computation, so the LMS adaptive filtering approach may use DA when developing its VLSI architecture. In DA, the partial products of the input samples and the filter coefficients are stored in two separate LUTs. To achieve higher performance, the LUTs are multiplexed after being accessed and their contents are added together [7]. This happens after the error has been calculated. By integrating the functionality of two adaptive filters, it is possible to create a DA-based low-complexity pipelined least-mean-square filter, which reduces the effort needed to implement the filter. The convergence performance may be adjusted using adaptive filters; in this case, however, the two ADFs previously used are replaced with a single DA-based ADF. Because odd multiples are symmetrical to one another, an adder tree may be used to add offset words [8]. Only a minimal number of adders is required to create odd multiples from longer words, and a single offset adder tree completes the process. These two methodologies are named PAS and PSA (BLMS-ADF). Compared with the PSA approach, the PAS methodology results in a shorter critical path [9]. In [11] the authors developed approximate adders that have a compact design but are slow, whereas the carry-lookahead adder is faster but consumes more area. A hybrid CLA/CSLA is designed to increase the speed of the addition process. An analysis of the approximate adders using VHDL shows that their speed is almost double that of the conventional RCA.

In [12] the authors developed the HYAA design for video and image processing applications. It reduces the gate depth in the adder structure and implements dual RCAs with carry inputs of 0 and 1, respectively. It works through a simple and efficient gate-level modification that significantly reduces the parameters of the conventional CSLA. Further, HYAA is optimized for the constraints of VLSI designs. To overcome the APD issues in 4-bit, 16-bit, and 32-bit regular CSLAs, a modified CSLA without a multiplexer, a modified CSLA using BEC, and a modified CSLA using approximate adders were developed in [13]; simulations showed better performance for the modified designs compared to state-of-the-art approaches. The architecture was developed and its performance evaluated in terms of APD.

In [14] the authors developed approximate adders with gate-level modifications that require fewer gates to perform the operation. This reduces the area and the total power. The results show that the circuit performs better and is faster than the others, providing a simple and efficient approach to VLSI hardware implementation. The mobile industry is growing rapidly, which calls for arithmetic units with low power and area. A simple and efficient gate-level modification reduces power, area, and delay. The performance of the modified CSLA is compared with that of other adders. Replacing the FA chain with a BEC logic converter changes the delay of the circuit only slightly. The high speed of the CSLA is exploited for arithmetic functions in data processing processors.

In [15] the authors developed quantum adders to reduce power utilization. To reduce circuit consumption, BEC is used in the modified quantum adders instead of the CSLA and RCA, at the cost of a slight increase in delay. In [16] the authors proposed low-error, efficient approximate adders whose delay is linearly proportional to N for N-bit operands, so these adders exhibit the highest delay. They incur more delay than the other adders because of the large number of logic gates and the high fan-in. In [17] the authors proposed SMFA, which exhibits high speed with a compact design but consumes more area. SMFA is also suitable for low-power multipliers. The simulation results show better performance compared to the basic FAs.

In [18] the authors developed a QCA-based FA, which can be used in the design of high-performance modules such as multi-bit adders, multipliers, multiplexers, subtractors, comparators, and registers. Advances in nanofabrication and shrinking device sizes have allowed nearly two billion transistors to be placed on Intel's advanced processor. In [19] the authors developed QCA-based approximate adders; further, multipliers are built using these approximate adders. QCA-based digital logic gates and circuits are significantly quicker than CMOS-based logic gates and circuits designed in the normal static logic style. With aggressive technology scaling to enhance performance and a rising integration level, noise plays a serious role in design parameters such as area, power, and speed. In [20] the authors presented spintronic FAs, a low-power 1-bit FA with improved efficiency. The design overcomes the drawbacks of the hybrid-CMOS logic style and uses CMOS transmission-gate and pass-transistor based XOR-XNOR circuits to generate the sum and carry. This FA is compared with existing hybrid-CMOS 1-bit FAs.

# 3. EXISTING SYSTEM

Carry-lookahead adder (CLA). The time needed to determine the carry bits can be cut considerably by using a carry-lookahead adder, which results in an increase in speed. It can be contrasted with the simpler, though usually slower, ripple-carry adder (RCA), in which the carry bit is calculated alongside the sum bit, but each stage must wait until the previous carry has been calculated before it can begin calculating its own sum and carry bits. In that style of adder, a sum bit is calculated first, followed by the carry bit, and then the next sum bit. The carry-lookahead adder calculates one or more of these carry bits before the sum, which shortens the time spent waiting for the result of the higher-order bits. Variants of this kind include the Kogge-Stone adder and the Brent-Kung adder, to name two well-known examples. Addition on a ripple-carry adder is very similar to addition done with pencil and paper. The two numbers to be added are lined up, and the result is found starting from the rightmost (least significant) digit and moving toward the most significant one. A carry out of a digit position may occur (for example, with pencil-and-paper methods, "9 + 5 = 4, carry 1"). Because of this, every digit position other than the rightmost one must allow for the possibility of having to add a further 1, on account of a carry coming in from the position to its right.





Carry-lookahead logic analyses every bit pair of the binary numbers to be added and determines, for each pair, whether it will generate a carry or propagate a carry. In this way the circuit can "pre-process" the two numbers being added to determine the carries before the actual addition occurs. As a result, no time is wasted waiting for the ripple-carry effect when the actual addition is performed (that is, the time taken for the carry from the first full adder to be passed along to the last full adder).
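The generate/propagate idea described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the circuit evaluated in this paper: the function names `lookahead_carry` and `carry_lookahead_add` and the 4-bit default width are hypothetical choices made for the example.

```python
def lookahead_carry(g, p, cin, i):
    """Carry INTO bit i, written as a flat sum-of-products of g, p and cin.

    No carry depends on another carry, which is the point of lookahead.
    """
    # Term for the external carry-in propagating through bits 0..i-1.
    carry = cin
    for k in range(i):
        carry &= p[k]
    # One term per lower bit j that generates a carry which then
    # propagates through bits j+1..i-1.
    for j in range(i):
        term = g[j]
        for k in range(j + 1, i):
            term &= p[k]
        carry |= term
    return carry


def carry_lookahead_add(a, b, cin=0, width=4):
    """Add two width-bit integers; returns (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # generate: both bits are 1
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # propagate: exactly one bit is 1

    total = 0
    for i in range(width):
        total |= (p[i] ^ lookahead_carry(g, p, cin, i)) << i
    return total, lookahead_carry(g, p, cin, width)
```

For example, `carry_lookahead_add(9, 5)` returns `(14, 0)`, and `carry_lookahead_add(15, 1)` returns `(0, 1)` because the 4-bit sum overflows into the carry-out.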

# 4. PROPOSED SYSTEM

The n-bit PPA-MSBKA is split into k ripple-carry adder blocks of n/k bits each, and all of the adder blocks except the lowest-order one are duplicated. The simplest n-bit PPA-MSBKA is built from three n/2-bit RCAs. The first adder computes the sum of the lower half, whereas the other two compute the upper half: one on the assumption that the carry input is zero, the other on the assumption that it is one. In this way the upper-half computation can start immediately; there is no need to wait for the lower half to complete. When the lower-half sum has been computed and the carry into the next stage is available, the correct upper-half sum is selected by a multiplexer. Due to this duplication approach, the power consumption and area of the adder are roughly doubled with respect to an RCA.
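The three-RCA arrangement described in this paragraph can be sketched behaviourally as below. This is only an illustrative model, assuming an even bit width; `ripple_carry_add` and `mux_select_add` are hypothetical names, and the sketch models the selection logic only, not the Brent-Kung prefix stage of the proposed design.

```python
def ripple_carry_add(a, b, cin, width):
    """Bit-serial ripple-carry addition; returns (sum, carry_out)."""
    carry = cin
    total = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def mux_select_add(a, b, cin=0, width=8):
    """Add using three half-width RCAs and a multiplexer (width must be even)."""
    half = width // 2
    mask = (1 << half) - 1

    # Lower half: a single RCA driven by the real carry-in.
    lo_sum, lo_carry = ripple_carry_add(a & mask, b & mask, cin, half)

    # Upper half: two RCAs computed speculatively for carry-in 0 and 1.
    a_hi, b_hi = (a >> half) & mask, (b >> half) & mask
    hi_if_0 = ripple_carry_add(a_hi, b_hi, 0, half)
    hi_if_1 = ripple_carry_add(a_hi, b_hi, 1, half)

    # Multiplexer: the lower-half carry selects the correct upper result.
    hi_sum, cout = hi_if_1 if lo_carry else hi_if_0
    return lo_sum | (hi_sum << half), cout
```

In software the two upper-half additions run one after the other; in hardware they run in parallel with the lower half, which is where the delay saving comes from and why the area roughly doubles.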



Figure 4.1. Proposed PPA-MSBKA architecture.

The PPA-MSBKA evaluates both possible values of the carry input, 0 and 1, and computes the results in advance. The correct result is then chosen by the multiplexer. To generate the partial sum and carry, the PPA-MSBKA uses dual RCAs with  $C_{in} = 0$  and  $C_{in} = 1$;  the multiplexer then selects the final sum and carry. In the regular PPA-MSBKA, the area is high due to the use of dual RCAs. The PPA-MSBKA structure is presented in Figure 4.1.

The MSBKA resolves the carry delay problem by computing the carry signals in advance from the input signals. A carry is produced in two cases: first, when both input bits are 1, and second, when one of the two bits is 1 and the carry-in is 1. Note that the generate and propagate terms depend only on the input bits and are therefore valid after one and two gate delays, respectively. Using these expressions, the carry of any stage can be computed directly, without waiting for the carry to ripple through all of the previous stages.
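The text does not spell out the prefix network used inside the MSBKA, so the following is a sketch of the textbook Brent-Kung prefix tree rather than the authors' exact circuit. It assumes the bit width is a power of two; `brent_kung_add` and the helper `combine` are hypothetical names introduced for illustration.

```python
def brent_kung_add(a, b, cin=0, width=8):
    """Add two width-bit integers with a Brent-Kung prefix tree.

    Returns (sum, carry_out). Assumes width is a power of two.
    """
    assert width >= 1 and width & (width - 1) == 0, "width must be a power of two"

    p_bit = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # per-bit propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]      # per-bit generate
    p = list(p_bit)  # group propagate, updated in place by the tree

    def combine(i, j):
        # Prefix operator: merge the group ending at i with the adjacent
        # lower-order group ending at j.
        g[i] |= p[i] & g[j]
        p[i] &= p[j]

    # Up-sweep: build group terms over spans of 2, 4, 8, ... bits.
    d = 1
    while d < width:
        for i in range(2 * d - 1, width, 2 * d):
            combine(i, i - d)
        d *= 2

    # Down-sweep: fill in the remaining prefixes from the computed groups.
    d = width // 4
    while d >= 1:
        for i in range(3 * d - 1, width, 2 * d):
            combine(i, i - d)
        d //= 2

    # g[i], p[i] now cover bits i..0, so the carry into bit i+1 is direct.
    carries = [cin & 1] + [g[i] | (p[i] & cin & 1) for i in range(width)]

    total = 0
    for i in range(width):
        total |= (p_bit[i] ^ carries[i]) << i
    return total, carries[width]
```

The up-sweep and down-sweep together take about 2·log2(n) − 1 combining levels for n bits, which is the usual trade-off of the Brent-Kung structure: fewer prefix cells than denser networks such as Kogge-Stone, at the cost of more logic levels.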

## 4.1 Proposed FA

FAs are essential cells in several circuits that are used for arithmetic functions such as addition, subtraction, multiplication, address calculation, and MAC. The FA is one of the basic building blocks of digital VLSI circuits. Several improvements have been made to its construction since its invention. The major goals of those changes are to decrease the area needed to realize the logic, to reduce power consumption, and to increase operating speed. Improving the performance of the FA can significantly affect the performance of the whole system. The XOR gate is the basic building block of the FA circuit, so FA performance can be enhanced by improving XOR gate performance. The FA performs the addition of three single-bit inputs. It takes X, Y, and Cin as inputs and provides Sum and Cout as outputs. The carry input is taken from the previous stage, and the carry output of the present stage is propagated to the input of the next stage.



Figure 4.2. Logic diagram of FA.

The logic diagram of the FA is designed with XOR, AND, and OR gates. The FA differs from the HA by its carry input; the HA has no carry input. So, in multi-bit adders, an HA is applicable in the initial stage only. In subsequent stages an FA is employed, because it accepts the carry propagated from the previous stage. The fundamental operation of the FA is given by the truth table in Table 1. The FA can also be designed by combining two HAs and a single OR gate. The design of the FA using basic gates is depicted in Figure 4.2, and the logic diagrams of the FA built from HAs are presented in Figure 4.3 and Figure 4.4. The basic gates, termed AOI gates, are the AND, OR, and NOT gates. Basically, an FA has three inputs and two outputs, sum and carry, and its logic circuit can be implemented with XOR, AND, and OR gates.
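The construction of an FA from two HAs and one OR gate can be written out as a small behavioural sketch. The function names `half_adder` and `full_adder` are illustrative, and the code models logic values only, not gate delays.

```python
def half_adder(x, y):
    """Half adder: returns (sum, carry) for two single-bit inputs."""
    return x ^ y, x & y


def full_adder(x, y, cin):
    """Full adder built from two half adders and an OR gate."""
    s1, c1 = half_adder(x, y)       # first HA adds the two operand bits
    total, c2 = half_adder(s1, cin)  # second HA adds the carry input
    return total, c1 | c2            # OR gate merges the two carries


def truth_table():
    """All eight rows as (x, y, cin, sum, cout) tuples."""
    return [(x, y, c) + full_adder(x, y, c)
            for x in (0, 1) for y in (0, 1) for c in (0, 1)]
```

Calling `truth_table()` enumerates the same eight input combinations, in the same order, as the FA truth table given in this section.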



Figure 4.3. FA utilizing HA with combination of XOR gates.



Figure 4.4. FA utilizing HA with basic gates.

A binary adder can be built by connecting FAs in cascade, with the carry output of each FA connected to the carry input of the next FA in the chain. The four-bit adder is a common standard component and can be used in various applications involving arithmetic functions. Parallel addition of binary numbers implies that all bits are available for computation at the same time. As in any combinational circuit, the signals must propagate through the gates before the correct sum is available at the outputs. Unless sufficient time is allowed for the signals to propagate through the gates, the outputs will not be correct. The propagation delay of the adder is dominated by the time the carry takes to propagate through the FAs. Within each FA, the signal from the carry input to the carry output passes through an AND gate and an OR gate, which equals 2 gate levels.
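The cascade and its worst-case carry path can be illustrated as follows. This is a sketch under the assumption stated in the text that each FA contributes 2 gate levels to the carry path, so an n-bit chain has about 2n levels; `ripple_adder` and `carry_path_gate_levels` are hypothetical names, and bit lists are least-significant-bit first.

```python
def full_adder(x, y, cin):
    """Single-bit full adder; returns (sum, carry_out)."""
    return x ^ y ^ cin, (x & y) | (cin & (x ^ y))


def ripple_adder(a_bits, b_bits, cin=0):
    """Cascade of full adders over LSB-first bit lists of equal length."""
    carry = cin
    sums = []
    for x, y in zip(a_bits, b_bits):
        s, carry = full_adder(x, y, carry)  # carry-out feeds the next stage
        sums.append(s)
    return sums, carry


def carry_path_gate_levels(n_bits, levels_per_stage=2):
    """Gate levels on the carry chain, assuming 2 levels (AND, OR) per FA."""
    return n_bits * levels_per_stage
```

Adding `[1, 1, 1, 1]` (15) and `[1, 0, 0, 0]` (1) exercises the worst case for four bits: the carry generated in the first stage ripples through every later stage, giving `([0, 0, 0, 0], 1)`.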

Table 1: Truth table of FA.

| X | Y | C<sub>in</sub> | Sum | C<sub>out</sub> |
|---|---|----------------|-----|-----------------|
| 0 | 0 | 0              | 0   | 0               |
| 0 | 0 | 1              | 1   | 0               |
| 0 | 1 | 0              | 1   | 0               |
| 0 | 1 | 1              | 0   | 1               |
| 1 | 0 | 0              | 1   | 0               |
| 1 | 0 | 1              | 0   | 1               |
| 1 | 1 | 0              | 0   | 1               |
| 1 | 1 | 1              | 1   | 1               |

# 5. RESULTS

The Vivado software was used to create every PPA-MSBKA design. This software can generate two kinds of outputs, namely simulation and synthesis. The simulation results make it possible to analyse the PPA-MSBKA architecture in depth with respect to the many combinations of input and output byte levels. Applying a large number of different input combinations and observing the corresponding outputs in simulation allows the behaviour of the design to be checked. From the synthesis results, the area utilization in terms of the number of transistors is obtained. In addition, a timing summary is obtained with respect to the various path delays, and a power summary is prepared from the static and dynamic power consumption. The simulation results of the proposed PPA-MSBKA are shown in Figure 5.1.

The design (area) summary of the proposed method is shown in Figure 5.2. Here, the proposed method uses a very small portion of the slice LUTs, namely 79 of the 17600 available. The timing summary of the proposed method is shown in Figure 5.3. Here, the proposed method has a total delay of 6.904 ns, of which 2.290 ns is logic delay and 2.348 ns is route delay.



Figure 5.1. Simulation results.



Figure 5.2. Design summary.

| Name    | Levels | Routes | High fanout | From  | To       | Total delay | Logic delay | Net delay | Logic % | Net % | Source clock     |
|---------|--------|--------|-------------|-------|----------|-------------|-------------|-----------|---------|-------|------------------|
| Path 11 | 17     | 15     | 6           | a[64] | sum[125] | 11.933      | 2.523       | 9.411     | 21.1    | 78.9  | input port clock |
| Path 12 | 17     | 15     | 6           | a[64] | sum[124] | 11.781      | 2.620       | 9.162     | 22.2    | 77.8  | input port clock |
| Path 13 | 17     | 15     | 5           | a[6]  | sum[57]  | 11.766      | 2.733       | 9.033     | 23.2    | 76.8  | input port clock |
| Path 14 | 17     | 15     | 5           | a[6]  | sum[53]  | 11.737      | 2.972       | 8.766     | 25.3    | 74.7  | input port clock |
| Path 15 | 17     | 15     | 6           | a[64] | sum[122] | 11.716      | 2.479       | 9.237     | 21.2    | 78.8  | input port clock |
| Path 16 | 17     | 15     | 6           | a[64] | sum[121] | 11.640      | 2.566       | 9.074     | 22.0    | 78.0  | input port clock |
| Path 17 | 17     | 15     | 5           | a[6]  | sum[59]  | 11.580      | 2.712       | 8.858     | 23.4    | 76.5  | input port clock |
| Path 18 | 17     | 15     | 5           | a[6]  | sum[54]  | 11.514      | 2.978       | 8.536     | 25.9    | 74.1  | input port clock |
| Path 19 | 17     | 15     | 6           | a[64] | sum[125] | 11.510      | 2.550       | 8.959     | 22.2    | 77.8  | input port clock |
| Path 20 | 17     | 15     | 5           | a[6]  | sum[60]  | 11.501      | 2.725       | 8.776     | 23.7    | 76.3  | input port clock |

Figure 5.3. Time summary.

Figure 5.4 presents the power summary of the proposed PPA-MSBKA; the design consumed 0.065 W in this test. Table 2 compares the performance of the proposed PPA-MSBKA with that of several conventional adders. The proposed PPA-MSBKA achieves a lower LUT count, time delay, and power consumption than conventional approaches such as RCA [20], CSA [22], and CSLA [24]. This is because the PPA-MSBKA uses fewer transistors than the conventional approaches and reduces the time delay without increasing the number of LUTs.



Figure 5.4. Power summary.

Table 2. Performance comparison.

| Metric                | RCA [20] | CSA [22] | CSLA [24] | Proposed PPA-MSBKA |
|-----------------------|----------|----------|-----------|--------------------|
| LUTs                  | 173      | 142      | 89        | 79                 |
| Time delay (ns)       | 11.28    | 8.284    | 7.453     | 6.904              |
| Power consumption (W) | 0.49     | 0.34     | 0.849     | 0.065              |
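The relative savings implied by Table 2 can be computed with a few lines of Python. The sketch below assumes that the wrapped digits in the time-delay row belong to the RCA and CSA entries (11.28 ns and 8.284 ns); the names `TABLE_2`, `percent_reduction`, and `reductions` are hypothetical and used only for this illustration.

```python
# Values copied from Table 2; the RCA and CSA delays assume the wrapped
# digits in the source table belong to those entries.
TABLE_2 = {
    "LUTs": {"RCA": 173, "CSA": 142, "CSLA": 89, "PPA-MSBKA": 79},
    "Time delay (ns)": {"RCA": 11.28, "CSA": 8.284, "CSLA": 7.453, "PPA-MSBKA": 6.904},
    "Power (W)": {"RCA": 0.49, "CSA": 0.34, "CSLA": 0.849, "PPA-MSBKA": 0.065},
}


def percent_reduction(baseline, proposed):
    """Reduction of the proposed value relative to a baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def reductions(table, proposed_key="PPA-MSBKA"):
    """Per-metric percentage reduction of the proposed design vs each baseline."""
    result = {}
    for metric, row in table.items():
        result[metric] = {
            name: percent_reduction(value, row[proposed_key])
            for name, value in row.items()
            if name != proposed_key
        }
    return result
```

Iterating over `reductions(TABLE_2)` gives, for each metric, the percentage by which the proposed design improves on each conventional adder.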

# 6. CONCLUSION

A three-operand binary adder is the fundamental building block for performing modular arithmetic in many cryptographic and pseudorandom number generation algorithms. In the modular arithmetic used in cryptography algorithms and PRBG methods, the carry-save adder is the area-efficient and widely adopted technique for performing three-operand binary addition. The performance of VLSI devices and other cryptography architectures on IoT-based hardware devices is severely impacted by the longer carry propagation delay in the ripple-carry stage of the carry-save adder. Moreover, existing methods have high computational complexity and are therefore inefficient. This article therefore proposed a parallel prefix adder based on the Multiplexer Selection Brent-Kung adder (MSBKA). Furthermore, a PPA based on MSBKA was used for three-operand addition, which significantly decreased the critical path delay at the expense of additional hardware. As the simulation results show, the proposed method significantly reduces area, delay, and power consumption relative to conventional adders.

## REFERENCES

- [1] Swamynathan, S. M., and V. Banumathi. "Design and analysis of FPGA based 32bit ALU using reversible gates." 2017 IEEE International Conference on Electrical, Instrumentation and Communication Engineering (ICEICE). IEEE, 2017.
- [2] Telagam, Nagarjuna, and Nehru Kandasamy. "Low Power Delay Product 8-bit ALU design using decoder and data selector." *Majlesi Journal of Electrical Engineering* 12.1 (2018): 103-108.
- [3] Abiri, Ebrahim, et al. "Optimized gate diffusion input method-based reversible magnitude arithmetic unit using non-dominated sorting genetic algorithm II." *Circuits, Systems, and Signal Processing* 39.9 (2020): 4516-4551.
- [4] Hasan, Mehedi, et al. "Overview and comparative performance analysis of various full adder cells in 90 nm technology." 2018 4th International Conference on Computing Communication and Automation (ICCCA). IEEE, 2018.
- [5] Hasan, Mehedi, et al. "Low Power Design of a Two Bit Magnitude Comparator for High-Speed Operation." 2019 International Conference on Computer Communication and Informatics (ICCCI). IEEE, 2019.

- [6] Naghibzadeh, Armin, and Monireh Houshmand. "Design and simulation of a reversible ALU by using QCA cells with the aim of improving evaluation parameters." *Journal of Computational Electronics* 16.3 (2017): 883-895.
- [7] Shilpa, K. C., et al. "Performance analysis of parallel prefix adder for datapath VLSI design." 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT). IEEE, 2018.
- [8] Ahmadpour, Seyed-Sajad, Mohammad Mosleh, and Saeed Rasouli Heikalabad. "The design and implementation of a robust single-layer QCA ALU using a novel fault-tolerant three-input majority gate." *Journal of Supercomputing* 76.12 (2020).
- [9] Pragati, R. Sugantha, and A. Jawahar. "Design and implementation of FPGA based processor for wireless sensor nodes." 2017 International Conference on Communication and Signal Processing (ICCSP). IEEE, 2017.
- [10] Ahmad, Peer Zahoor, et al. "A novel reversible logic gate and its systematic approach to implement cost-efficient arithmetic logic circuits using QCA." *Data in brief* 15 (2017): 701-708.
- [11] Akurati, Siva Kumar, A. Anita Angeline, and VS Kanchana Bhaaskaran. "ALU design using Pseudo Dynamic Buffer based domino logic." 2017 International Conference on Nextgen Electronic Technologies: Silicon to Software (ICNETS2). IEEE, 2017.
- [12] Pittala, Chandrashekar, and Vallabhuni Vijay. "Design of 1-Bit FinFET Sum Circuit for Computational Applications." International Conference on Emerging Applications of Information Technology. Springer, Singapore, 2021.

- [13] Gao, Mingming, et al. "A new nano design for implementation of a digital comparator based on quantum-dot cellular automata." *International Journal of Theoretical Physics* 60.7 (2021): 2358-2367.
- [14] Karthi, S. P. "Performance Analysis of 16 bit Adders in high speed computing applications." 2019 International Conference on Advances in Computing and Communication Engineering (ICACCE). IEEE, 2019.
- [15] Tiwari, Rajinder, et al. "Performance analysis of reversible ALU in QCA." *Indian Journal of Science & Technology* 10.29 (2017): 01-05.
- [16] Safoev, N., and J. C. Jeon. "Cell Interaction Based QCA Multiplexer for Complex Circuit Design." *Advanced Science Letters* 23.10 (2017): 10097-10101.
- [17] Soysouvanh, S., et al. "Ultrafast all-optical ALU operation using a soliton control within the cascaded InGaAsP/InP microring circuits." *Microsystem Technologies* 25.2 (2019): 431-440.
- [18] Sani, Mojtaba Hosseinzadeh, et al. "An ultrafast all-optical half adder using nonlinear ring resonators in photonic crystal microstructure." *Optical and Quantum Electronics* 52.2 (2020): 1-10.