Reduced Instruction Set Computing (RISC) Processors

Modern RISC processors operate under stringent constraints of power, area, and performance. Silicon implementations commonly balance 32-bit or 64-bit native word sizes with execution throughput requirements of 1-4 instructions per clock cycle. Contemporary designs must manage thermal envelopes between 1-15W for mobile applications while delivering computation capabilities necessary for increasingly complex workloads.

The fundamental challenge in RISC processor design lies in maintaining instruction set simplicity and execution efficiency while extending capabilities to handle specialized computational domains without compromising the clean architectural model.

This page brings together solutions from recent research, including variable-length opcode extensions for expanded addressing, integrated matrix multiplication acceleration within existing vector units, flexible multi-precision SIMD instruction support, and FPGA-integrated designs for user-defined instruction sets. These and other approaches demonstrate how the RISC philosophy continues to evolve while preserving the architectural clarity that enables efficient implementation across diverse computing environments.

1. ARM CPU Core with Integrated Outer Product Engine and Accumulator Array for Scalable Matrix Extensions Execution

MICROSOFT TECHNOLOGY LICENSING LLC, 2025

An implementation of ARM's Scalable Matrix Extensions (SME) instruction set in an ARM CPU core without a separate SME accelerator. The core reuses its existing vector hardware to execute the Streaming SVE (SSVE) instructions of the SME instruction set. An outer product engine and an accumulator array are added inside the CPU core to compute outer products and accumulate the results for matrix multiplication. The outer product engine uses temporal single-instruction multiple-data (SIMD) processing, computing over multiple cycles to reduce memory bandwidth. The CPU core clears the vector registers and accumulator array when entering and exiting streaming mode.
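The outer-product formulation of matrix multiplication mentioned above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (one rank-1 update per step into an accumulator array); the function name is hypothetical and nothing here models the patented hardware or its temporal SIMD scheduling.

```python
def matmul_outer_product(a, b):
    """Compute C = A x B as a sum of rank-1 outer products."""
    m, k, n = len(a), len(b), len(b[0])
    acc = [[0] * n for _ in range(m)]  # accumulator array, cleared before use
    for step in range(k):
        col = [a[i][step] for i in range(m)]  # column `step` of A
        row = b[step]                         # row `step` of B
        # One outer product: every accumulator cell gets one multiply-add.
        for i in range(m):
            for j in range(n):
                acc[i][j] += col[i] * row[j]
    return acc


print(matmul_outer_product([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each step reads only one column of A and one row of B but updates all m×n accumulator cells, which is why outer-product engines are attractive when memory bandwidth is the bottleneck.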

2. Design of Low Power Control Unit for RISC-V Processor Core

Johannes Chan, Chong Li, Warsuzarina Mat Jubadi - Akademia Baru Publishing, 2024

This research work focuses on the development of a low-power decode logic for a RISC-V processor core with specifications. The goal is to create a controller that performs all six groups of instruction formats outlined in the RV32I Base Integer Instruction Set. The control unit is designed to decode a total of 13 instruction sets, allowing for a comprehensive range of operations. A single instruction pipeline approach is implemented in the design to optimize performance. The synthesis of the design is carried out using the 32 nm standard library, resulting in a maximum operating frequency of 666.67 MHz. To further enhance power efficiency, clock gating techniques are employed, leading to a reduction in power consumption by 18.72 % from 112.15 W to 91.45 W. Additionally, the layout of the design is optimized, resulting in an area of 354.74 mm2. The successful development of this low-power decode logic demonstrates its potential for integration into larger RISC-V processor cores. Future enhancements can include expanding the instruction decoding capability to encompass the full range...
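The six RV32I instruction formats (R, I, S, B, U, J) are selected by the 7-bit major opcode in the low bits of each instruction word. A minimal Python sketch of that first decode step follows; the table and function names are illustrative and are not taken from the paper's control unit, and the FENCE and SYSTEM opcodes are left out for brevity.

```python
# Major opcode (bits 6:0) -> instruction format for the RV32I base set.
FORMAT_BY_OPCODE = {
    0b0110011: "R",  # OP      (add, sub, and, ...)
    0b0010011: "I",  # OP-IMM  (addi, andi, ...)
    0b0000011: "I",  # LOAD
    0b1100111: "I",  # JALR
    0b0100011: "S",  # STORE
    0b1100011: "B",  # BRANCH
    0b0110111: "U",  # LUI
    0b0010111: "U",  # AUIPC
    0b1101111: "J",  # JAL
}


def decode(instr):
    """Split a 32-bit instruction word into format and register fields.

    The register fields sit at fixed bit positions; not every field is
    meaningful in every format (for example, U-type has no rs1/rs2).
    """
    return {
        "format": FORMAT_BY_OPCODE.get(instr & 0x7F, "unknown"),
        "rd": (instr >> 7) & 0x1F,
        "funct3": (instr >> 12) & 0x7,
        "rs1": (instr >> 15) & 0x1F,
        "rs2": (instr >> 20) & 0x1F,
    }


print(decode(0x002081B3))  # encoding of: add x3, x1, x2
```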

3. Implementation of a Multiclocked Pipelined Processor Based on RISC-V using RV32I

- REST Publisher, 2024

This study's primary objective is to create a 32-bit pipelined processor based on the open-source RV32I Version 2.0 RISC-V ISA that operates across several clock domains. A processor known as a RISC (Reduced Instruction Set Computer) employs less hardware than a CISC (Complex Instruction Set Computer) in order to reduce the complexity of the instruction set and accelerate the execution time per instruction. In addition, we built this processor with five pipelining layers, which allows for concurrent processing of instructions. All of the procedures are thoroughly explained, supported with the required block diagrams. To guarantee that variable delays, such as clock skew and meta-stability, are avoided inside the stage pipeline registers, multiple clock domains using two clock sources are employed.

4. Synchronization Support in 64-bit Out-Of-Order Superscalar Dual-Core RISC-V Processor

Shubham Yadav, Manish Kumar, S Sajin - IEEE, 2024

This paper discusses the implementation of atomic instructions in a dual-core 64-bit out-of-order superscalar processor based on the open-source RISC-V instruction set architecture. Leveraging the advantage of RISC-V's modularization characteristics, each core implements the RV64IMAFDC extensions and optional supervisor and user mode privilege levels. In this paper, we focus on the A-extension, the atomic instruction set extension. This extension introduces instructions that provide atomic memory operations, enabling synchronization across multiple RISC-V harts within the same memory space. Our goal is to present an efficient execution flow of atomic memory operation instructions and Load-Reserved/Store-Conditional instructions for a dual-core System-on-Chip. We subsequently verify the synchronization capabilities through the execution of a standalone game application on an SoC implemented on a Xilinx Kintex UltraScale KU085 FPGA-based board.
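To make the Load-Reserved/Store-Conditional idea concrete, here is a small behavioral model in Python. It is a sketch under simplifying assumptions (one reservation per hart, exact-address granularity, sequential interleaving); the class and function names are hypothetical and do not describe the paper's execution flow. The one detail taken from the RISC-V specification is that a store-conditional returns zero on success and nonzero on failure.

```python
class ReservationMemory:
    """Toy shared memory with per-hart LR/SC reservations."""

    def __init__(self):
        self.mem = {}
        self.reservations = {}  # hart id -> reserved address

    def load_reserved(self, hart, addr):
        self.reservations[hart] = addr
        return self.mem.get(addr, 0)

    def store(self, addr, value):
        self.mem[addr] = value
        # Any write to the address invalidates reservations held on it.
        for hart, reserved in list(self.reservations.items()):
            if reserved == addr:
                del self.reservations[hart]

    def store_conditional(self, hart, addr, value):
        """Return 0 on success, 1 on failure."""
        if self.reservations.get(hart) != addr:
            self.reservations.pop(hart, None)
            return 1
        self.store(addr, value)  # also clears this hart's reservation
        return 0


def atomic_add(memory, hart, addr, delta):
    """Retry loop: the usual way to build an atomic update from LR/SC."""
    while True:
        old = memory.load_reserved(hart, addr)
        if memory.store_conditional(hart, addr, old + delta) == 0:
            return old
```

If hart 0 and hart 1 both load-reserve the same address and hart 1 stores first, hart 0's store-conditional fails and its retry loop reads the updated value, so no increment is lost.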

5. Matrix Multiplication Instruction with Configurable Vector Register Groups for RISC-V Processors

APPLE INC, 2024

A matrix multiplication instruction for RISC-V processors that enables efficient matrix operations by specifying a target vector register group for the result and source vector register groups for the input matrices, allowing for flexible matrix sizes and vector lengths.

6. Multi-Voltage Design of RISC Processor for Low Power Application: A Survey

Dheeraj Sharma, R Vikram - Deanship of Scientific Research, 2024

Power management is becoming an important aspect as the size of the transistor shrinks. For processor design, Reduced Instruction Set Computer (RISC) architecture is preferable to Complex Instruction Set Computer (CISC) architecture because of its simplicity and availability. To design a low-power RISC processor, a few techniques have been used earlier, such as a) pipelining and b) Common Power Format language to generate the power intent of the RISC processor design. In the present work, a multi-voltage design technique has been used to design a 16-bit RISC processor with low power consumption. In this technique, different supply voltages are provided to different blocks of the design. This technique is implemented with the help of Unified Power Format (UPF). Further, various operations such as ADD, SUB, INVERT, AND, OR, Right Shift, Left Shift, and Less Than are verified in ModelSim for the designed 16-bit RISC processor.

7. Generation of Coverage based Verification Benchmark Programs for RISC-V Processor

Sudeendra Kumar, Adarsh Hegde, V V Likhita - IEEE, 2024

The RISC-V architecture has gained significant popularity as an open and extensible instruction set architecture (ISA) for a wide range of computing applications. This paper introduces a novel approach to enhance coverage in RISC-V processor verification through the design and development of benchmark verification programs. To achieve comprehensive coverage, a CRIG (Constrained Random Instruction Generator) is designed which is capable of generating a diverse set of random instruction encodings while adhering to the RISC-V ISA specifications, enabling the creation of randomized instruction sequences based on constraints. The utilization of coverage groups ensures critical aspects of the processor are adequately tested. Through extensive testing, the paper identifies a benchmark constraint randomized file that showcases exceptional coverage, serving as a valuable reference for future verification projects. The resulting benchmark programs form a comprehensive suite that rigorously tests the RISC-V processor's functionality, providing confidence in its compliance with the RISCV ISA spe...

8. Design a 5-stage pipeline RISC-V CPU and optimise its ALU

Lifu Deng - EWA Publishing, 2024

The RISC-V instruction set has advanced and expanded significantly in recent years. It is an open instruction set architecture (ISA) based on the concept of Reduced Instruction Set Computing (RISC). This article uses Verilog to design a 5-stage pipeline CPU based on RISC-V architecture in Vivado 2022.2. The CPU can execute 38 instructions and optimises its arithmetic logic unit (ALU) by optimising adders, shifters, and multipliers. Next, write a testbench in the simulation software to verify the functionality of the CPU. RTL diagrams and reports are then generated to verify the design structure and evaluate resource allocation. Finally, the CPU successfully executes the instruction and obtains the correct operation result, and the occupation of LUT resources in the shifter part is reduced. This work serves as an important reference for system-on-chip (SoC) and computer design in general. It not only highlights the potential of the RISC-V architecture but also demonstrates the success of optimisation efforts. This paves the way for more powerful and efficient computing systems.

9. RISC-V processor enhanced with a dynamic micro-decoder unit

J. Pottier, Thomas Nieddu, Bertrand Le Gal, 2024

For years, the open-source RISC-V instruction set has been driving innovation in processor design, spanning from high-end cores to low-cost or low-power cores. After a decade of evolution, RISC architectures are now as mature as the CISC architectures popularized by industry giant Intel. Security and energy efficiency are now joining execution speed among the design constraints. In this article, we assess the benefits and costs associated with integrating a micro-decoding unit inspired by CISC processors into a RISC-V core. This unit, added in a specific pipeline stage, should enable dynamic custom instruction sequences execution whose usage could be, for instance to compress binaries, obfuscate behavior, etc.

10. A Performance Modelling-Driven Approach to Hardware Resource Scaling

Alexandre Rodrigues, Leonel Sousa, Aleksandar Ilić - Springer Nature Switzerland, 2024

The continuous demand for higher computational performance and the stagnating developments in the general purpose processor landscape have led to a surge in interest for highly specialized and efficient hardware. Combined with the rising popularity of parameterizable hardware, a new opportunity to optimize these architectures for particular workloads arises, largely driven by the RISC-V Instruction Set Architecture (ISA). This work presents an application-specific optimization methodology for general purpose processors, enabling the development of architectures which are faster and more efficient for their designated workloads. Driven by the Cache-Aware Roofline Model (CARM) insights, the methodology guides the configuration of the memory and computational subsystems of the processor. We apply this methodology to two applications, demonstrating up to a 2.67× performance increase and a 1.34× improvement to energy efficiency.
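As a rough illustration of roofline reasoning, the basic bound caps attainable throughput at the lower of peak compute and memory bandwidth multiplied by arithmetic intensity. The Python sketch below applies that bound once per memory level, in the spirit of a cache-aware roofline. The function names and every number are hypothetical and are not taken from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical machine parameters, chosen only for illustration.
PEAK_GFLOPS = 100.0
BANDWIDTH_GBS = {"L1": 400.0, "L2": 200.0, "DRAM": 20.0}


def roofs(intensity):
    """Attainable GFLOP/s at a given intensity (FLOP/byte), per memory level."""
    return {
        level: attainable_gflops(PEAK_GFLOPS, bandwidth, intensity)
        for level, bandwidth in BANDWIDTH_GBS.items()
    }


print(roofs(0.25))
```

A kernel whose working set only fits in DRAM sits far below the compute roof at this intensity, which suggests spending resources on the memory subsystem rather than on more arithmetic units.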

11. Design of RISCV processor using verilog

E. Jaya, B. Maneesha, G. Sriram - i-manager Publications, 2024

The main goal of this paper is to develop a 32-bit pipelined processor with several clock domains based on the RISCV (open source RV32I Version 2.0) ISA. To minimize the complexity of the instruction set and speed up the execution time per instruction, a RISC (Reduced Instruction Set Computer) processor that uses less hardware than a CISC (Complex Instruction Set Computer) is used. Furthermore, this paper constructed this processor with five levels of pipelining with the aid of necessary block diagrams, and all of the processes are well described. In this paper, a RISCV processor is designed and simulated using Verilog. The design of the RISCV processor provides an alternative for software and hardware design to the computer designers as it provides free and open instruction set architecture (ISA). Besides, the designed RISCV processor will be using 5-stage pipeline techniques to improve the overall performance of the processor. This system is started by implementing several main modules, such as alu, aludec, maindec, imem, dmem, regfile, pc_mux, result_mux, pipeline register (IF/ID,...

12. Out-of-Order Execution of Instructions for In-Order Five-Stage RISC-V Processor

Sushmita Hubballi, Saroja V. Siddamal - Springer Nature Singapore, 2024

In recent years, there have been remarkable advancements in Integrated Circuit (IC) technology, enabling the development of highly sophisticated computer systems on a single chip. Custom System on Chip (SoC) designs, where the processor core(s) and cache represent a smaller portion of the overall chip, have gained widespread popularity. Nowadays, it is challenging to come across an electronic product of any size that does not incorporate a processor. Open-source instruction set architecture, such as RISC-V-based processors, has gained traction in custom SoC design. A processor is the core of an electronic system. In a five-stage pipelined RISC-V processor, instructions are executed in the sequence that they are given. In this work, the architecture suggests an out-of-order execution of instructions when the resources are available yet all instructions are to be blocked owing to a multi-cycle instruction. In this architecture, in comparison with an in-order execution, we notice a difference of 120 ns, i.e., six clock cycles being used efficiently in an out-of-order execution when a mu...

13. Optimizing CNN Computation Using RISC-V Custom Instruction Sets for Edge Platforms

Shihang Wang, Xingbo Wang, Zhiyuan Xu - Institute of Electrical and Electronics Engineers (IEEE), 2024

Benefiting from its custom instruction extension capabilities, the RISC-V architecture can be optimized for many domain-specific applications. In this paper, we propose seven RISC-V SIMD (single instruction multiple data) custom instructions that can significantly optimize the convolution, activation and pool operations in CNN inference computation. More specifically, instruction CONV23 can greatly speed up the operation of F(2×2, 3×3). With the adoption of the Winograd algorithm, the number of multiplications can be reduced from 36 to 16, and the execution time is also reduced from 140 to 21 clock cycles. These custom instructions can be executed in batch mode within the acceleration module where the immediate data can be reused, so the latency and energy overhead associated with excess memory accesses can be eliminated. Using inline assembler in C language, the custom instructions can be called and compiled together with C source code. A revised RISC-V processor, RI5CY-Accel is construct...
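The 36-to-16 reduction can be checked with a short Python sketch of the standard Winograd F(2×2, 3×3) minimal filtering algorithm, which produces a 2×2 output tile from a 4×4 input tile and a 3×3 filter. The transform matrices below are the commonly published ones; the helper names are illustrative, and the code says nothing about how the paper's CONV23 instruction is implemented.

```python
def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


# Standard transform matrices for F(2x2, 3x3).
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]


def winograd_f2x2_3x3(d, g):
    """2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    u = matmul(matmul(G, g), transpose(G))    # filter transform (can be precomputed)
    v = matmul(matmul(BT, d), transpose(BT))  # input transform
    # The only data-dependent multiplications: 16 element-wise products.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))  # output transform


def direct_f2x2_3x3(d, g):
    """Reference sliding-window correlation: 4 outputs x 9 = 36 multiplications."""
    return [[sum(d[i + u][j + v] * g[u][v] for u in range(3) for v in range(3))
             for j in range(2)] for i in range(2)]
```

The input and output transforms use only additions and subtractions, and the filter transform can be done once per filter, so the per-tile multiplication count is the 16 element-wise products.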

14. RISC-V V Vector Extension (RVV) with reduced number of vector registers

Eino Jacobs, Dmitry Utyansky, Muhammad Hassan, 2024

To reduce the area of RISC-V Vector extension (RVV) in small processors, the authors are considering one simple modification: reduce the number of registers in the vector register file. The standard 'V' extension requires 32 vector registers that we propose to reduce to 16 or 8 registers. Other features of RVV are still supported. Reducing the number of vector registers does not generate a completely new programming model: although the resulting core does not have binary code compatibility with standard RVV, compiling for it just requires parameterization of the vector register file size in the compiler. The reduced vector register file allows for still high utilization of vector RVV processor core. Many useful signal processing kernels require few registers, and become efficient at 1:4 chaining ratio.

15. RISC-V Processor for IOT Applications

Rajveer Singh et al. - Auricle Technologies, Pvt., Ltd., 2023

RISC-V is a recently introduced instruction-set architecture (ISA) that offers innovative advantages, including low power consumption, affordability, and scalability. Utilizing an open, non-proprietary Instruction Set Architecture (ISA) enables the creation of on-the-fly design of soft error countermeasures at the microarchitecture level. This may significantly enhance the resilience of Application Specific Standard Products (ASSP) and FPGA implementations. This paper offers a quick overview of the RISC-V architecture. This paper presents a plan to create and execute a 32-bit single-cycle RISC-V processor using Verilog HDL in the Vivado software.

16. How to Design an ISA

David Chisnall - Association for Computing Machinery (ACM), 2023

Over the past decade I've been involved in several projects that have designed either ISA (instruction set architecture) extensions or clean-slate ISAs for various kinds of processors (you'll even find my name in the acknowledgments for the RISC-V spec, right back to the first public version). When I started, I had very little idea about what makes a good ISA, and, as far as I can tell, this isn't formally taught anywhere. With the rise of RISC-V as an open base for custom instruction sets, however, the barrier to entry has become much lower and the number of people trying to design some or all of an instruction set has grown immeasurably.

17. Design of Decoded Instruction Cache

Takero Magara, Nobuyuki Yamasaki - IEEE, 2023

Recent microprocessors improve performance by extracting various levels of parallelism. Among these, out-of-order processors focus on ILP to improve performance. On the other hand, out-of-order processors consume a lot of power because they fetch and decode many instructions. We propose a Decoded Instruction Cache (DIC), in which the control signals generated by decoding RISC instructions are stored as decoded instructions in the DIC. The scheme improves performance and reduces power consumption because the results of fetch and decode can be reused. The DIC also supports multi-threaded execution, so TLP is also improved. When implemented in a multithreaded RISC processor, the DIC improves IPC by 2.39%.

18. An Open-Source FPGA Platform for Shared-Memory Heterogeneous Many-Core Architecture Exploration

Rafael Tornero, David R. Rodriguez, José Maria Martínez - IEEE, 2023

Many-core architectures, especially those with heterogeneous components, are gaining momentum due to the benefits of having an Open Source Instruction Set Architecture (ISA), such as RISC-V. In this paper we present a new computing platform for developing and analysing future heterogeneous architectures where specific custom accelerators are integrated with more standard RISC-V computing cores. The platform implements a coherent shared memory model which simplifies programmability and enables efficient communication support to all heterogeneous components. We detail the network and memory subsystems and provide preliminary evaluation results showing the benefits when using two systolic accelerators managed by two computing cores.

19. RISC processor implementation 32-bit MIPS-based: an approach to teaching and learning

Francisco Silva e Serpa, Alan Marcel Fernandes de Souza, Hélio Fernando Bentzen Pessoa Filho - Uniao Atlantica de Pesquisadores, 2023

This article describes the development of the design of a processor based on the RISC architecture, taking the 32-bit MIPS microprocessor as a basis. The RISC architecture, which stands for Reduced Instruction Set Computer, is characterized by having a reduced instruction set, aiming to optimize the processor's overall performance. The designed MIPS processor follows a 5-stage pipeline, which comprises the instruction fetch, instruction decode, execution, preparation and memory access phases. The main objective of this article is to carry out the structural development of the processor, using the hardware description language. This implies the creation of a Verilog representation that will later be used to generate the extraction of the processor's logic circuit. Furthermore, the project involves generating a timing diagram that illustrates the temporal behavior of processor operations and, ultimately, the physical implementation of the processor core. This work seeks to contribute knowledge in the field of computer architecture, providing a practical implementation of a RISC process...

20. Vectorized Nonlinear Functions with the RISC-V Vector Extension

Eric Bavier, Nicholas Knight, Hugues de Lassus Saint-Geniès - IEEE, 2023

The RISC-V Vector instruction set extension (RVV) provides scalable data-parallel instructions suitable for accurate and performant implementations of numerical algorithms across many application domains [1]. The primary objective of this paper is to share our experience implementing vector C99 <math.h> (libm) functions using RVV. Our contributions are threefold: First, we contributed an RVV port of SLEEF, a multi-platform open-source vector libm. Second, we show that while SLEEF simplifies porting efforts, it also precludes some RVV-specific optimization opportunities. With SiFive's X280 vector processor micro-architecture as a case-study, we highlight RVV features that optimized code can use. We also expand the discussion to how these features might be used differently when optimizing for other cores. Third, we compare the performance of our SLEEF RVV port to our own RVV-native routines. We present results from 1-ulp accurate implementations of Libm functions in a cycle-accurate simulation of the X280 pipeline to show the impact of RVV-enabled optimizations.

21. Design of Double 16_32 –Bit RISC Processor

Nagendra Prasad N, Sujatha Hiremath, Arjumath Farraj - International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023

The SOCs built today offer a high level of functionality, serve a variety of applications, and improve in efficiency and cost. Embedded systems also face area and power consumption constraints in addition to real-time challenges. The main objective is to design and implement a 32-bit high-performance RISC (Reduced Instruction Set Computer) processor architecture. The processor is designed as an instantiation of submodules using Verilog HDL (Hardware Description Language). A 16-bit compatibility mode is introduced that uses the ISA to execute two 16-bit operations at the same time, so the execution unit can switch between one 32-bit operation and two 16-bit operations. The ISA is modified to meet the requirement to execute both 16-bit and 32-bit operations. Each of these instructions is independent of the other and can be executed simultaneously. This enables the RISC-based architecture to speed up 16-bit operations by a factor of 2.
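The difference between one 32-bit operation and two packed 16-bit operations on the same datapath can be shown with a small Python model. This is an illustrative sketch only: the function names are hypothetical and the paper's actual instruction encodings are not reproduced. The key point is that the carry out of the low half must not propagate into the high half.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """Ordinary 32-bit add with wraparound."""
    return (a + b) & MASK32


def add16x2(a, b):
    """Two independent 16-bit adds packed into one 32-bit word."""
    lo = ((a & MASK16) + (b & MASK16)) & MASK16                  # carry is dropped
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo


print(hex(add32(0x0001FFFF, 0x00010001)))    # carry crosses bit 16
print(hex(add16x2(0x0001FFFF, 0x00010001)))  # carry is cut between the halves
```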

22. Design of DMS-RRIP replacement algorithm for L1-cache of RISC-V-based single-core embedded processor

Zining Ma, Honglan Jiang, Hong Peng Li - SPIE, 2023

As a new open-source reduced instruction set, RISC-V has received a large amount of attention. Because of its highly concise instruction set encoding and modular extended instruction sets, the RISC-V instruction set has been widely used in embedded systems. As an indispensable part of the current embedded processor, the cache directly affects the speed of the processor. In cache design, the cache replacement algorithm plays a key role, as it is a necessary component affecting the cache hit rate. Aiming at the L1-Cache of a RISC-V embedded processor, this paper proposes a cache replacement algorithm that dynamically adjusts the storage as per the memory access characteristics of the executing program. The simulation results show that the proposed replacement algorithm effectively improves the cache hit rate, and thus improves the overall performance of the processor.
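For background, the sketch below models one cache set under baseline static RRIP (SRRIP) with hit promotion, the family of policies that the DMS-RRIP name refers to. It is not the paper's algorithm, whose dynamic adjustment is not described in the abstract; the class name and the 2-bit counter width are assumptions made for illustration.

```python
class SrripSet:
    """One cache set managed with baseline SRRIP (hit-priority variant)."""

    def __init__(self, ways, rrpv_bits=2):
        self.max_rrpv = (1 << rrpv_bits) - 1     # "distant re-reference"
        self.tags = [None] * ways
        self.rrpv = [self.max_rrpv] * ways       # empty ways are evicted first

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.tags:
            self.rrpv[self.tags.index(tag)] = 0  # predict near re-reference
            return True
        # Age every line until some way reaches the maximum RRPV.
        while self.max_rrpv not in self.rrpv:
            self.rrpv = [value + 1 for value in self.rrpv]
        victim = self.rrpv.index(self.max_rrpv)
        self.tags[victim] = tag
        self.rrpv[victim] = self.max_rrpv - 1    # insert with a "long" interval
        return False


cache_set = SrripSet(ways=2)
print([cache_set.access(tag) for tag in "ABACAB"])
```

Because new lines are inserted with a long predicted re-reference interval, a line that has been hit (such as A here) survives a stream of single-use lines.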

23. A Deeply Pipelined FMA Unit for High Performance RISC-V Processor

Yu Qi, Mengxue Chen, Guilan Li - IEEE, 2023

The open RISC-V instruction set architecture provides a new innovative platform for integrated circuit design, and the multiplier and adder are the core units of the computing unit. This paper designs a high-performance 64-bit floating-point multiply-add operation unit based on the RISC-V floating-point instruction set, which supports single-precision, double-precision and half-precision calculations, and can realize a deep pipeline design with a maximum execution stack of 10 levels. In the VIVADO simulation environment, the function verification of the floating-point multiplication and addition unit is carried out, and each module can meet the correctness requirements. The results show that under the 28nm CMOS process, the longest critical path delay is 500ps, the slack is 0.09, and the operating frequency is 1.67GHz.

24. Design and Simulation of RISC Processor Using Verilog

P. Sudhanya, Aryan Kumar, Tushar Sharma - IEEE, 2023

Reduced Instruction Set Computer (RISC) Processors have recently become quite an important aspect of keeping up with all the advances in technology. This research presents a comprehensive study on the design and simulation of an improved RISC processor architecture on Field-Programmable Gate Arrays (FPGAs). The objective is to enhance the performance, efficiency, and versatility of RISC processors by incorporating novel design techniques and optimizations. The proposed design mainly focuses on the improved design of the Arithmetic and Logic Unit (ALU). The effectiveness of the proposed design is evaluated through extensive simulations. The results demonstrate significant improvements in performance in time and total power. The findings of this study indicate that the improved RISC processor design offers a promising approach to address the increasing demands of modern computing systems.

25. Design of High Performance Core Micro-Architecture Based on RISC-V ISA for Low Power Applications

Nidhi Jaiswal - International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023

Numerous current and future applications aim to create highly efficient central processing units (CPUs). The RISC V processor micro-architecture is one illustration of a design that satisfies these requirements. The RISC-V Instruction Set Architecture [ISA] provides support for the micro-architecture. The instruction set architecture and the micro-architecture of a processor are two of the most crucial aspects of its design. The multiplier and divider circuits have a relatively high level of hardware complexity compared to other stages of the instruction execution process, which must be taken into account in any core microarchitecture. The construction of an appropriate hardware circuit that is capable of multiplication and division determines the overall size, power, and performance of a core. This core has four stages, and each instruction is completed within those stages, except for loading and storing data. The arithmetic operations can be completed within one clock cycle. On the other hand, the division and multiplication operations are repeated in an effort to sh...

26. Design of High-Performance Core Micro-Architecture Based on 32- Bit RISC-V Instruction Set Architecture [ISA]

K. Ragini, Nidhi Jaiswal - International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023

A wide range of present and future applications strive to develop highly efficient central processing units (CPUs). One particular design that meets these requirements is the RISC V processor micro-architecture. The RISC-V Instruction Set Architecture (ISA) provides the necessary support for this micro-architecture. The instruction set architecture and microarchitecture are crucial components in processor design. Among these components, the multiplier and divider circuits exhibit a relatively high level of hardware complexity compared to other stages of instruction execution. Therefore, it is essential to consider these factors when designing the core micro-architecture. The size, power, and performance of a core are determined by the construction of an appropriate hardware circuit capable of handling multiplication and division operations. The core consists of four phases, with each instruction being executed within these stages, except for data storage and retrieval. Arithmetic operations can be completed within a single clock cycle. However, division and multiplication o...

27. Optimisation of x264 encoder acceleration based on RISC-V vector instructions

Jiaolong Wang, Lei Wang, Peixin Wang - IEEE, 2023

RISC-V, as an emerging open source instruction set architecture, has the advantages of simplicity and modularity. With the increasing maturity and perfection of the related tool chain, the construction of the software ecosystem is receiving more and more attention. For x264, an open source video encoder, many scholars have proposed different optimized implementations based on the characteristics of different architectural instructions, but there are few efficient implementations and optimizations of the x264 algorithm library based on vector instructions for the RISC-V platform. This paper rewrites and optimises the x264 source code in assembly language based on the vector extension instruction version 1.0. After an in-depth study of the characteristics of vector instructions, instruction-level optimisation of the SAD function is carried out and a fast SAD algorithm is proposed. The DCT transform is vector optimised and an efficient access algorithm is designed based on the characteristics of the instruction set in order to solve the instruction redundancy problem caused by discontinuous access...
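For readers unfamiliar with the SAD (sum of absolute differences) kernel that the paper vectorises, a scalar reference version is sketched below in Python. It only defines what the kernel computes; the function name is illustrative, and the paper's RVV assembly and its fast SAD algorithm are not reproduced here.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(block_a, block_b)
        for x, y in zip(row_a, row_b)
    )


print(sad([[1, 2], [3, 4]], [[2, 2], [1, 7]]))
```

Every pixel pair is independent until the final reduction, which is why the kernel maps naturally onto vector subtract, absolute-value, and reduction instructions.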

28. Vectorization Programming Based on HR DSP Using SIMD

Chunhu Xie, Huachun Wu, Jian Zhou - MDPI AG, 2023

Single instruction multiple data (SIMD) vector extension has become an essential feature of high-performance processors. Architectures such as x86, ARM, MIPS, and PowerPC have specific vector extension instruction sets and SIMD micro-architectures. Using SIMD vectorization programming can significantly improve the performance of application algorithms while keeping the hardware overhead low. In addition, other methods can enhance algorithm performance, such as selecting the best SIMD vectorization model for algorithms, ensuring sufficient instruction streams, implementing reasonable and effective cache data prefetching, and aligning data access and storage addresses according to instruction characteristics. The goal of this paper is three-fold. First, we introduce the basic structural characteristics of a general RISC processor, Hua Rui (HR) DSP, with a custom vector instruction set based on compatibility with an MIPS64 fixed-point and floating-point instruction set, as well as a Fei Teng (FT) processor compatible with an ARMv8 instruction set. Second, we summarize the fundamental pr...

29. Design and Physical Synthesis of a 16 BIT RISC Processor

- International Research Journal of Modernization in Engineering Technology and Science, 2023

RISC (Reduced Instruction Set Computer) is a type of microprocessor that uses a Harvard-style data path structure to operate at high speed with a minimal number of instructions. This project describes the design and development of a low-power, 4-stage pipelined CPU. Pipelining increases the reliability and speed of the system. The pipeline comprises fetch, decode, execute, and memory read/write operations. The major goal of the project is to design a 4-stage pipelined RISC processor from RTL to GDSII (physical design). The processor was created in Verilog HDL using Synopsys Fusion Compiler and Xilinx Vivado. Area, power, and delay were calculated with Synopsys Fusion Compiler using standard libraries of 32nm technology.

30. Vitruvius+: An Area-Efficient RISC-V Decoupled Vector Coprocessor for High Performance Computing Applications

Francesco Minervini, Oscar Palomar, Osman Ünsal - Association for Computing Machinery (ACM), 2023

The maturity level of RISC-V and the availability of domain-specific instruction set extensions, like vector processing, make RISC-V a good candidate for supporting the integration of specialized hardware in processor cores for the High Performance Computing (HPC) application domain. In this article, we present Vitruvius+, the vector processing acceleration engine that represents the core of vector instruction execution in the HPC challenge that comes within the EuroHPC initiative. It implements the RISC-V vector extension (RVV) 0.7.1 and can be easily connected to a scalar core using the Open Vector Interface standard. Vitruvius+ natively supports long vectors: 256 double precision floating-point elements in a single vector register. It is composed of a set of identical vector pipelines (lanes), each containing a slice of the Vector Register File and functional units (one integer, one floating point). The vector instruction execution scheme is hybrid in-order/out-of-order and is supported by register renaming and arithmetic/memory instruction decoupling. On a stand-alone synthesis...

31. Design and Implementation of RISC-V ISA (RV32IM) on FPGA

Anmol Singh, Arpit Kumar, Abhishek Singh - Seventh Sense Research Group Journals, 2023

RISC-V, an open-source Instruction Set Architecture, originated from the collaborative efforts of researchers at the University of California, Berkeley, in 2010. It is a basic load-store architecture based on traditional RISC principles, and it provides flexibility through extensions to the base integer set such as multiply, floating-point, and atomic instructions. This paper details the design and implementation of a 5-stage pipelined RV32IM core (base integer set with multiply extension). The design also incorporates a 2-bit branch predictor for increased throughput. Analysis and verification have been performed for proper decoding, pipelined operation, branch prediction, stalling, memory access, and overall functionality. The core was designed in Verilog HDL and simulated on Intel QuestaSim. A DE10-Lite board with a MAX 10 family FPGA was used for hardware synthesis and analysis of the design.
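
To illustrate what a 2-bit branch predictor does, here is a minimal Python sketch of the common saturating-counter scheme. The abstract does not say which 2-bit scheme the core uses, so the state encoding, the initial state, and the names `TwoBitPredictor` and `count_mispredictions` are assumptions made for illustration only.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=1):
        # Assumed initial state: 1 ("weakly not taken").
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor=None):
    """Replay a sequence of branch outcomes and count wrong predictions."""
    if predictor is None:
        predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A loop-closing branch: taken nine times, then falls through once.
loop_branch = [True] * 9 + [False]
shared = TwoBitPredictor()
first_pass = count_mispredictions(loop_branch, shared)
second_pass = count_mispredictions(loop_branch, shared)
```

Because the counter needs two consecutive wrong outcomes to flip its prediction, a single loop exit does not disturb the "taken" prediction for the next run of the same loop.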

32. Three-Dimensional RISC-V Many-Core Processor Architecture with Micro-Nucleus Array and 3DRouter Integration

SHANDONG LINGNENG ELECTRONIC TECHNOLOGY CO LTD, 2023

A RISC-V-based three-dimensional interconnected many-core processor architecture that enhances performance by integrating a micro-nucleus array layer with 3DRouter, an accelerator layer, and a main control layer. The micro-nucleus array layer enables efficient data interaction between micro-cores and the accelerator layer, while the 3DRouter provides a fast data path back to the main core. The architecture leverages RISC-V's open-source nature and simplifies instruction set extensions through a single-pipeline micro-kernel design.

33. Basic RISC-V Instruction Set Architecture: Design and Validation

Paolo Pavan, Pradhan Kamal, G. Govardhan - International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2023

This project's primary goal is to design and implement a simple RISC-V instruction set architecture (ISA), and the paper offers insights into that architecture. The system employs the RISC-V R-type (register) instruction format. Using this format, we designed the fundamental ISA and tested its functionality using Verilog code. RISC-V is an open-source ISA that is available to everyone, with no licence fee. Reduced instruction set computers (RISC) are designed to make the individual instructions given to a computer more manageable. Most ISAs are proprietary and cannot be used or modified without permission from the companies that own them; a free and open-source ISA such as RISC-V therefore helps increase speed and lower system costs. We design a 32-bit ISA using the RISC-V instruction formats.
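
As a concrete view of the R-type format, the following Python sketch splits a 32-bit instruction word into its six fields using the layout from the RISC-V base specification (funct7, rs2, rs1, funct3, rd, opcode). The helper names `decode_r_type` and `encode_r_type` are hypothetical, and the paper's own Verilog is not reproduced here.

```python
def decode_r_type(word):
    """Split a 32-bit RISC-V R-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,          # bits 6:0
        "rd": (word >> 7) & 0x1F,       # bits 11:7, destination register
        "funct3": (word >> 12) & 0x7,   # bits 14:12
        "rs1": (word >> 15) & 0x1F,     # bits 19:15, first source register
        "rs2": (word >> 20) & 0x1F,     # bits 24:20, second source register
        "funct7": (word >> 25) & 0x7F,  # bits 31:25
    }


def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack R-type fields back into a 32-bit instruction word."""
    return (
        (funct7 & 0x7F) << 25
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | (funct3 & 0x7) << 12
        | (rd & 0x1F) << 7
        | (opcode & 0x7F)
    )


# add x3, x1, x2 and sub x3, x1, x2 differ only in funct7.
ADD_X3_X1_X2 = 0x002081B3
SUB_X3_X1_X2 = 0x402081B3
add_fields = decode_r_type(ADD_X3_X1_X2)
sub_fields = decode_r_type(SUB_X3_X1_X2)
```

Keeping the register fields at fixed bit positions is what lets a hardware decoder read the register file before it has finished decoding the operation.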

34. System and Method for Large-Word Operations in RISC Processor Using Special Purpose Execution Unit with Overlapping Variable-Size Registers

INTERNATIONAL BUSINESS MACHINES CORP, 2023

A method and system for supporting large-word operations in a Reduced Instruction Set Computer (RISC) processor using a special purpose execution unit (SPU) with registers of varying sizes that can overlap with CPU registers. The SPU state is synchronized with the CPU state using master bits, enabling efficient execution of operations that require larger word widths than the CPU's native register size.

35. RISC-V ISA Extension Toolchain Supports: A Survey

Yue Gao, Wei Qian, Enfang Cui - ACM, 2023

RISC-V is an emerging open-source instruction set that is modular and scalable. As the RISC-V architecture gradually matures in the field of contemporary chips, the RISC-V software ecosystem is also gradually prospering. Some mainstream toolchains and operating systems have supported the RISC-V architecture since the beginning and now gradually support multiple RISC-V instruction extensions. Although many works are dedicated to advancing RISC-V instruction extensions for scenarios with different computing power requirements, and to exploring the RISC-V software ecosystem, no work has systematically studied and summarized how existing toolchains and operating systems support RISC-V extended instructions. The purpose of this paper is to systematically and comprehensively investigate and summarize the adaptation of toolchains and operating systems to RISC-V extended instructions, including extensions defined in the RISC-V instruction set specification and some custom-defined extensions. In this article, we mainly elaborate on our research from fou...

36. AsteRISC: A Size-Optimized RISC-V Core for Design Space Exploration

Jonathan Saussereau, Camille Leroux, Jean-Baptiste Bégueret - IEEE, 2023

The RISC-V open source instruction set architecture is a promising solution for applications related to low power embedded systems. This paper presents a configurable RISC-V processor architecture providing a compromise between the number of clock cycles required to execute an instruction, the maximum operating frequency, the resource utilization and the power consumption. This architectural flexibility enables the processor to be adapted to fit application constraints, on either FPGA or ASIC targets.

37. RISC CPU Instruction Set with Variable-Length Opcode for Extended Jump Addressing

SHENZHEN FABTHINK TECH LTD, SHENZHEN FABTHINK TECHNOLOGY LTD, 2023

A RISC-based CPU instruction set system that enables longer jump addresses while maintaining 32-bit instruction length. The system employs a variable-length opcode field that can range from 4 to 32 bits, allowing it to combine with other fields to form different instruction formats. This design enables jump instructions to access up to 64 GB of instruction space, compared to the traditional 256 MB limit, while maintaining the benefits of fixed-length instructions.
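
The address arithmetic behind these figures can be sketched in a few lines of Python. The sketch assumes, which the summary does not state, that the traditional 256 MB limit corresponds to a 26-bit jump target field that counts 4-byte instructions, as in classic fixed-format RISC encodings, and it treats MB and GB as binary units. The helper names `jump_reach_bytes` and `extra_target_bits` are hypothetical. The summary does not give the field layout behind the 64 GB figure, so the code only reports how many additional target bits that reach implies.

```python
def jump_reach_bytes(target_bits, instruction_bytes=4):
    """Bytes reachable when the target field indexes aligned instructions."""
    return (1 << target_bits) * instruction_bytes


def extra_target_bits(old_reach_bytes, new_reach_bytes):
    """Additional target bits implied by growing the reach (power-of-two ratio)."""
    ratio = new_reach_bytes // old_reach_bytes
    return ratio.bit_length() - 1


MIB = 1 << 20
GIB = 1 << 30

# Assumed conventional layout: 26-bit target field, 4-byte instructions.
conventional_reach = jump_reach_bytes(26)

# Each extra target bit doubles the reach.
bits_gained = extra_target_bits(256 * MIB, 64 * GIB)
```

Every bit moved from the opcode field to the target field doubles the reachable instruction space, which is the trade a variable-length opcode makes available.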

38. Survey and Comparison of Pipeline of Some RISC and CISC System Architectures

Yan He, Xiangning Chen - IEEE, 2023

An instruction set is the set of instructions a CPU uses to perform computation and control the computer system, and it is the interface between hardware and software. There are two common classes of instruction set: CISC and RISC. Pipelining is widely used in processor design to improve the efficiency of instruction execution. This paper describes the differences between CISC and RISC in pipeline implementation, covers basic pipelining and two advanced forms, superscalar and superpipelining, in detail, and surveys the pipelines of several CISC and RISC processors, including ARM, RISC-V, LoongArch, and x86.

39. Alabama A&M Symmetric Overloaded Minimal Instruction Set Architecture (SOMA)

Patrick Jungwirth, Andrew Scott, Zhigang Xiao - IEEE, 2023

In this paper, the open instruction set architecture for the symmetric overloaded minimal instruction set computer architecture (SOMA) is presented. There are only two instruction classes for the architecture. MISC architectures date back to 1949 with the Manchester Mark 1 developed by The Victoria University of Manchester. The Linux operating system began the trend towards open architectures in 1991. Open instruction set architectures and open hardware started with the open and extendable RISC-V instruction set in 2012. With the development of the RISC-V instruction set architecture, there has been renewed interest in researching microprocessor architecture fundamentals. Unlike high level languages, there is no standard format for assembly languages. The various architectures, x86, MIPS, ARM, RISC-V, et al., each have their own dialects. Multiple addressing modes add to the difficulty. An assembly language format simplifying instruction classes and using simple statements would significantly improve understanding. For the SOMA architecture, register-to-register and integer classes are merg...

40. Architecture Support for Bitslicing

Pantea Kiaei, Thomas B. Conroy, Patrick Schaumont - Institute of Electrical and Electronics Engineers (IEEE), 2023

The bitsliced programming model has been shown to boost the throughput of software programs. However, on a standard architecture, it exerts a high pressure on register access, causing memory spills and restraining the full potential of bitslicing. In this work, we present architecture support for bitslicing in a System-on-Chip. Our hardware extensions are of two types: internal to the processor core, in the form of custom instructions, and external to the processor, in the form of a direct memory access module with support for data transposition. We present a comprehensive performance evaluation of the proposed enhancements in the context of several RISC-V ISA definitions (RV32I, RV64I, RV32B, RV64B). The proposed 14 new custom instructions use 1.5× fewer registers compared to the equivalent functionality expressed using RISC-V instructions. The integration of those custom instructions in a 5-stage pipelined RISC-V RV32I core incurs 10.21% and 12.72% overhead respectively in area and cell count using the SkyWater 130 nm standard cell library. The proposed bitslice transposition unit with DM...
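
For readers unfamiliar with the bitsliced model, the following Python sketch shows the data transposition step and why it creates register pressure. It is a software illustration only and does not model the paper's custom instructions or DMA unit; the function names `to_bitslices`, `from_bitslices`, and `bitsliced_add` are hypothetical.

```python
def to_bitslices(values, width):
    """Transpose: slice j collects bit j of every input value (one lane per value)."""
    slices = [0] * width
    for lane, value in enumerate(values):
        for bit in range(width):
            slices[bit] |= ((value >> bit) & 1) << lane
    return slices


def from_bitslices(slices, lanes):
    """Inverse transpose: rebuild one value per lane from the bit slices."""
    values = [0] * lanes
    for bit, bit_slice in enumerate(slices):
        for lane in range(lanes):
            values[lane] |= ((bit_slice >> lane) & 1) << bit
    return values


def bitsliced_add(a_slices, b_slices):
    """Ripple-carry addition on all lanes at once using only bitwise operations."""
    carry = 0
    result = []
    for a, b in zip(a_slices, b_slices):
        result.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return result  # carry out is dropped, so sums wrap modulo 2**width


xs = [3, 5, 15, 0]
ys = [1, 10, 1, 7]
sums = from_bitslices(bitsliced_add(to_bitslices(xs, 4), to_bitslices(ys, 4)), 4)
```

Each bit position of the data occupies its own word, so even a 4-bit addition keeps several slices live at once. This is the register pressure the abstract refers to, and the transposition loops are the work a dedicated transposition unit would take over.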

41. Design of Risc-V Processing Unit Using Posit Number System

D. Malathi, R. Sneha, M Shanmugapriya - IEEE, 2023

RISC-V is an open instruction set architecture (ISA) that allows the implementation of custom instruction sets. The RISC-V processor was introduced to reduce the instruction set and increase register resource investment. Unlike other ISA designs, RISC-V is offered under an open-source licence, and the chip and device manufacturing industries have given it substantial support. RISC-V is therefore primarily made to be flexibly expandable and configurable for use in a variety of applications. In RISC-V, real numbers can be represented in several arithmetic formats. The standard representation of real numbers in computers is IEEE 754 floating point, which has drawbacks such as rounding, an excess of Not a Number (NaN) representations, and signed zero. A posit processing unit is developed to address these aspects using the posit number format, an alternative to IEEE 754 floating-point representation. Modification to the standard RISC-V ISA which enables conversion of 8 or 16 bit posits to 16-bit IEEE Floating point number to obtain more dynamic range within the giv...

42. RISC-V Processors for Spaceflight Embedded Platforms

S.A. Malone, Patrick Saenz, Patrick E. Phelan - IEEE, 2023

Reduced Instruction Set Computer Five (RISC-V) is an open-source processor instruction set architecture which is rapidly gaining popularity in space applications. Not only can this architecture be implemented in standalone application specific integrated circuit (ASIC) hardware processors, but it can also be configured within field-programmable gate array (FPGA) fabrics. This paper will discuss the benefits and challenges of using RISC-V soft-cores within radiation-tolerant FPGAs for embedded space applications. Spacecraft are limited by size, weight, power, and cost (SWAP-C), and most rely on both FPGAs and discrete microprocessors to meet onboard processing needs. With advances in capacity for radiation-hardened FPGAs, it is now feasible to implement an advanced soft-core microprocessor (or even multiple cores) within the FPGA itself. This has potential to reduce part count and simplify designs greatly, make efficient use of spare FPGA capacity, and reduce the overall SWAP-C of spaceflight computers. The research team tested the latest RISC-V FPGA offerings from Microchip (product name Mi-...

43. An Optimum Design and Implementation of a 16-bit ALU on CADENCE Using RISC-V Architecture

Muhammad Ali Raza, Iraj Shahzad, Hafsa Anwar - IEEE, 2023

The Arithmetic Logic Unit (ALU) is the fundamental component of the Central Processing Unit (CPU) that processes data based on logical and arithmetic operations. The performance of the ALU is measured by the logic delay, power consumption, and chip area based on the architecture used. RISC (Reduced Instruction Set Computer) has become a standard and remarkable platform for people working in the sphere of chip design and programming. The main intent of this paper is to develop an optimum design of a 16-bit RISC-V-based ALU using the CADENCE tool in the EDA Playground along with XILINX ISE Design Suite 14.7. In the proposed design, the ISA acts as the interface between hardware and software and is up to 90% more efficient in terms of processing speed than existing designs in the literature. Furthermore, the design is 33% smaller and 80% more efficient in terms of power consumption. This allows optimization of the processor in terms of improved power consumption and speed.

44. Versatility, Variety, Value

Mark Lippett - Mark Allen Group, 2023

The continued rise of RISC-V instruction-set architecture.

45. Building a Pipelined RISC-V Processor

Bernard Goossens - Springer International Publishing, 2023

In this chapter you build your second RISC-V processor. The microarchitecture of this second version is pipelined. Within a single processor cycle, the updated processor fetches and decodes instruction i, executes instruction i-1, accesses memory for instruction i-2, and writes a result back for instruction i-3.
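
The overlap described above can be tabulated with a short Python sketch. It assumes an ideal pipeline with no stalls, hazards, or branches, which the chapter summary does not claim; the stage labels and the function name `pipeline_schedule` are chosen here for illustration.

```python
STAGES = ("fetch/decode", "execute", "memory access", "write back")


def pipeline_schedule(num_instructions):
    """Return, for each cycle, which instruction index occupies each stage.

    A stage holds None while the pipeline is filling or draining.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            instr = cycle - depth  # the instruction that entered `depth` cycles ago
            row[stage] = instr if 0 <= instr < num_instructions else None
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(5)
for cycle, row in enumerate(schedule):
    print(cycle, row)
```

In cycle 3 the four stages hold instructions 3, 2, 1, and 0, matching the i, i-1, i-2, i-3 pattern. Five instructions finish in 8 cycles instead of the 20 that four sequential steps per instruction would take.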

46. FPGA-Based 128-Bit RISC Processor Using Pipelining

T. Subhashini, M. Kamaraju, K. Babulu - Springer Nature Singapore, 2023

The main aim is to implement a 128-bit RISC processor with pipelining on an FPGA using the von Neumann architecture. With the increasing use of FPGAs in embedded applications, there is a need to support processor designs on FPGA. The proposed processor is a soft processor with a simple instruction set that can be modified as needed because of the reconfigurable nature of the FPGA. Its prominent feature is pipelining, which improves performance considerably so that one instruction is executed per clock cycle. Given the pace of innovation in processor development and the increasing popularity of open-source projects such as the RISC-V ISA (Instruction Set Architecture), there is also a need to understand these designs quickly and to upgrade them, which is easily done on an FPGA at the cost of speed and size compared with commercial ASIC processors; this motivates our work. In this paper, a 128-bit RISC processor is impleme...

47. Design and Implementation of 16-Bit Optimized RISC Processor with Novel Pipelining

Shweta Soni, Pallabi Sarkar, Ribu Mathew - Springer Nature Singapore, 2023

In the modern era, improving processor quality plays a vital role in SoC design. Understanding and designing RISC (Reduced Instruction Set Computer) and ARM-based processors is important in the semiconductor domain, since they are used in devices ranging from smartphones to supercomputers. In this paper, an optimized 16-bit RISC processor is proposed that uses pipelining and clock gating to minimize on-chip power while maximizing throughput. The proposed design is based on an architecture with separate blocks such as the Program Counter, Multiplexer, Instruction Memory, Data Memory, Arithmetic Logic Unit (ALU), Decoders, Registers, Flag Register, and Adders, with pipeline stages added. The processor supports 16 instructions, each 24 bits wide. The register file has a total of 16 registers, each 16 bits wide. A 16-bit ALU is used in the design, which supports a total of 11 operations. It also incorporates a three-bit register, which can detect carry, zero, and parity status of th...
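
A three-bit flag register of this kind can be illustrated for a single 16-bit addition with the Python sketch below. The abstract is truncated before it defines the flags, so the conventions used here are assumptions: carry means a carry out of bit 15, zero means the 16-bit result is zero, and parity is set when the result has an even number of 1 bits. The function name `alu_add16` is hypothetical.

```python
WORD_MASK = 0xFFFF  # 16-bit data path


def alu_add16(a, b):
    """Add two 16-bit operands and derive carry, zero, and parity flags."""
    total = (a & WORD_MASK) + (b & WORD_MASK)
    result = total & WORD_MASK
    flags = {
        "carry": total > WORD_MASK,                 # carry out of bit 15
        "zero": result == 0,
        "parity": bin(result).count("1") % 2 == 0,  # assumed: even parity sets the flag
    }
    return result, flags


wrapped, wrapped_flags = alu_add16(0xFFFF, 0x0001)
small, small_flags = alu_add16(0x0001, 0x0000)
```

The first call wraps around to zero and sets all three flags under these conventions; the second produces a result with a single 1 bit, so no flag is set.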

48. Building a Fetching, Decoding, and Executing Processor

Bernard Goossens - Springer International Publishing, 2023

This chapter prepares the building of your first RISC-V processor. First, a fetching machine is implemented. It is only able to fetch successive words from a code memory. Second, the fetching machine is upgraded to include a decoding mechanism. Third, the fetching and decoding machine is completed with an execution engine to run computation and control instructions, but not yet memory accessing ones.
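
The three steps above (fetch, then decode, then execute without memory access) can be sketched as a tiny Python interpreter for three RV32I instructions: ADDI, ADD, and BNE. This is not the book's implementation; the function names, the choice of instructions, and the sample program are assumptions made for illustration, while the bit layouts follow the RISC-V base specification.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def fetch(code, pc):
    """Step 1: read the 32-bit word at byte address pc from code memory."""
    return code[pc // 4]


def decode(word):
    """Step 2: turn an instruction word into (operation, operands...)."""
    opcode = word & 0x7F
    rd, funct3 = (word >> 7) & 0x1F, (word >> 12) & 0x7
    rs1, rs2 = (word >> 15) & 0x1F, (word >> 20) & 0x1F
    if opcode == 0x13 and funct3 == 0:
        return ("addi", rd, rs1, sign_extend(word >> 20, 12))
    if opcode == 0x33 and funct3 == 0 and (word >> 25) == 0:
        return ("add", rd, rs1, rs2)
    if opcode == 0x63 and funct3 == 1:
        # B-type immediate is scattered: imm[12|10:5] and imm[4:1|11].
        imm = (((word >> 31) & 1) << 12) | (((word >> 7) & 1) << 11) \
            | (((word >> 25) & 0x3F) << 5) | (((word >> 8) & 0xF) << 1)
        return ("bne", rs1, rs2, sign_extend(imm, 13))
    raise ValueError(f"unsupported instruction {word:#010x}")


def execute(instr, regs, pc):
    """Step 3: run a computation or control instruction, return the next pc."""
    op, a, b, c = instr
    if op == "bne":
        return pc + c if regs[a] != regs[b] else pc + 4
    value = regs[b] + (c if op == "addi" else regs[c])
    if a != 0:  # x0 is hardwired to zero
        regs[a] = value & 0xFFFFFFFF
    return pc + 4


def run(code):
    regs, pc = [0] * 32, 0
    while pc // 4 < len(code):
        pc = execute(decode(fetch(code, pc)), regs, pc)
    return regs


PROGRAM = [
    0x00300093,  # addi x1, x0, 3    counter = 3
    0x00000113,  # addi x2, x0, 0    total = 0
    0x00110133,  # add  x2, x2, x1   total += counter
    0xFFF08093,  # addi x1, x1, -1   counter -= 1
    0xFE009CE3,  # bne  x1, x0, -8   loop while counter != 0
]
final_regs = run(PROGRAM)
```

The sample program sums 3 + 2 + 1 into x2 using only computation and control instructions, mirroring the stage at which memory access has not yet been added.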

49. 32-Bit RISC-V CPU Core on Logisim

S. Patil, Premraj V. Jadhav, Siddharth Sankhe, 2023

This project focuses on building a RISC-V CPU core using the Logisim software. RISC-V is significant because it allows smaller device manufacturers to build hardware without paying royalties and allows developers and researchers to design and experiment with a proven and freely available instruction set architecture. RISC-V is suited to a variety of applications, from IoT devices to embedded systems such as disks, CPUs, calculators, and SoCs. RISC-V is an open standard instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. Unlike most other ISA designs, the RISC-V ISA is provided under open-source licenses that do not require fees to use.

50. Building a RISC-V Processor

Bernard Goossens - Springer International Publishing, 2023

In this chapter you build your first RISC-V processor. The microarchitecture of this first version is not pipelined. The IP cycle encompasses the fetch, the decoding, and the execution of an instruction.

51. High-Performance Code Compression Using Adaptive Encoding for RISC Processor

52. Supporting RISC-V Performance Counters Through Linux Performance Analysis Tools

53. W-IQ: Wither-logic based issue queue for RISC-V superscalar out-of-order processor

54. RISC-V Instruction Set Architecture Extensions: A Survey

55. Building a RISC-V Processor with a Multicycle Pipeline
