== Configuration and Integration

=== Hazard3 Source Files

Hazard3's source is written in Verilog 2005, and is self-contained. It can be found here: https://github.com/Wren6991/Hazard3/tree/master/hdl[github.com/Wren6991/Hazard3/blob/master/hdl]. The file https://github.com/Wren6991/Hazard3/blob/master/hdl/hazard3.f[hdl/hazard3.f] is a list of all the source files required to instantiate Hazard3.

Files ending with `.vh` are preprocessor include files used by the Hazard3 source. Two to take note of are:

* https://github.com/Wren6991/Hazard3/blob/master/hdl/hazard3_config.vh[hazard3_config.vh]: the main Hazard3 configuration header. Lists and describes Hazard3's global configuration parameters, such as ISA extension support
* https://github.com/Wren6991/Hazard3/blob/master/hdl/hazard3_config_inst.vh[hazard3_config_inst.vh]: a file which propagates configuration parameters through module instantiations, all the way down from Hazard3's top-level modules through the internals

Therefore there are two ways to configure Hazard3:

* Directly edit the parameter defaults in `hazard3_config.vh` in your local Hazard3 checkout (and then let the top-level parameters default when instantiating Hazard3)
* Set all configuration parameters in your Hazard3 instantiation, and let the parameters propagate down through the hierarchy

=== Top-level Modules

Hazard3 has two top-level modules:

* https://github.com/Wren6991/Hazard3/blob/master/hdl/hazard3_cpu_1port.v[hazard3_cpu_1port]
* https://github.com/Wren6991/Hazard3/blob/master/hdl/hazard3_cpu_2port.v[hazard3_cpu_2port]

These are both thin wrappers around the https://github.com/Wren6991/Hazard3/blob/master/hdl/hazard3_core.v[hazard3_core] module. `hazard3_cpu_1port` has a single AHB5 bus port which is shared for instruction fetch, loads, stores and AMOs. `hazard3_cpu_2port` has two AHB5 bus ports, one for instruction fetch, and the other for loads, stores and AMOs.
The 2-port wrapper has higher potential for performance, but the 1-port wrapper may be simpler to integrate, since there is no need to arbitrate multiple bus masters externally.

The core module `hazard3_core` can also be instantiated directly, which may be more efficient if support for some other bus standard is desired. However, the interface of `hazard3_core` will not be documented and is not guaranteed to be stable.

[[config-parameters-section]]
=== Configuration Parameters

==== Reset state configuration

===== RESET_VECTOR

Address of the first instruction executed after Hazard3 comes out of reset.

Default value: all-zeroes.

===== MTVEC_INIT

Initial value of the machine trap vector base CSR (<>).

Bits clear in <<param-MTVEC_WMASK>> will never change from this initial value. Bits set in <<param-MTVEC_WMASK>> can be written/set/cleared as normal.

Default value: all-zeroes.

==== Standard RISC-V ISA support

[[param-EXTENSION_A]]
===== EXTENSION_A

Support for the A extension: atomic read/modify/write. 0 for disable, 1 for enable.

Default value: 1

[[param-EXTENSION_C]]
===== EXTENSION_C

Support for the C extension: compressed (variable-width) instructions. 0 for disable, 1 for enable.

Default value: 1

[[param-EXTENSION_M]]
===== EXTENSION_M

Support for the M extension: hardware multiply/divide/modulo. 0 for disable, 1 for enable.

Default value: 1

[[param-EXTENSION_ZBA]]
===== EXTENSION_ZBA

Support for Zba address generation instructions. 0 for disable, 1 for enable.

Default value: 0

[[param-EXTENSION_ZBB]]
===== EXTENSION_ZBB

Support for Zbb basic bit manipulation instructions. 0 for disable, 1 for enable.

Default value: 0

[[param-EXTENSION_ZBC]]
===== EXTENSION_ZBC

Support for Zbc carry-less multiplication instructions. 0 for disable, 1 for enable.

Default value: 0

[[param-EXTENSION_ZBS]]
===== EXTENSION_ZBS

Support for Zbs single-bit manipulation instructions. 0 for disable, 1 for enable.

Default value: 0

[[param-EXTENSION_ZBKB]]
===== EXTENSION_ZBKB

Support for Zbkb basic bit manipulation for cryptography.
Requires: <<param-EXTENSION_ZBB>>. (Since Zbb and Zbkb have a large overlap, this flag enables only those instructions which are in Zbkb but aren't in Zbb. Therefore both flags must be set for full Zbkb support.)

Default value: 0

[[param-EXTENSION_ZCB]]
===== EXTENSION_ZCB

Support for Zcb basic additional compressed instructions.

Requires: <<param-EXTENSION_C>>. (Some Zcb instructions also require Zbb or M, as they are 16-bit aliases of 32-bit instructions present in those extensions.)

Note Zca is equivalent to C, as we do not support the F extension.

Default value: 0

[[param-EXTENSION_ZCMP]]
===== EXTENSION_ZCMP

Support for Zcmp push/pop and double-move instructions.

Requires: <<param-EXTENSION_C>>.

Note Zca is equivalent to C, as we do not support the F extension.

Default value: 0

[[param-EXTENSION_ZIFENCEI]]
===== EXTENSION_ZIFENCEI

Support for the fence.i instruction. When the branch predictor is not present, this instruction is optional, since a plain branch/jump is sufficient to flush the instruction prefetch queue. When the branch predictor is enabled (<<param-BRANCH_PREDICTOR>> is 1), this instruction must be implemented.

Default value: 0

[[cfg-custom-extensions]]
==== Custom Hazard3 Extensions

[[param-EXTENSION_XH3BEXTM]]
===== EXTENSION_XH3BEXTM

Custom bit manipulation instructions for Hazard3: `h3.bextm` and `h3.bextmi`. See <>.

Default value: 0

[[param-EXTENSION_XH3IRQ]]
===== EXTENSION_XH3IRQ

Custom preemptive, prioritised interrupt support. Can be disabled if an external interrupt controller (e.g. PLIC) is used. If disabled, and NUM_IRQS > 1, the external interrupts are simply OR'd into mip.meip. See <>.

Default value: 0

[[param-EXTENSION_XH3PMPM]]
===== EXTENSION_XH3PMPM

Custom PMPCFGMx CSRs to enforce PMP regions in M-mode without locking. See <>.

Default value: 0

[[param-EXTENSION_XH3POWER]]
===== EXTENSION_XH3POWER

Custom power management controls for Hazard3. This adds the <> CSR, and the `h3.block` and `h3.unblock` hint instructions.
See <>

Default value: 0

==== CSR support

NOTE: the Zicsr extension is implied by any of <>, <>, <>.

[[param-CSR_M_MANDATORY]]
===== CSR_M_MANDATORY

Bare minimum CSR support e.g. <>. This flag is an absolute requirement for compliance with the RISC-V privileged specification. However, the privileged specification itself is an optional extension. Hazard3 allows the mandatory CSRs to be disabled to save a small amount of area in deeply-embedded implementations.

Default value: 1

[[param-CSR_M_TRAP]]
===== CSR_M_TRAP

Include M-mode trap-handling CSRs, and enable trap support.

Default value: 1

[[param-CSR_COUNTER]]
===== CSR_COUNTER

Include the basic performance counters (`cycle`/`instret`) and relevant CSRs. Note that these performance counters are now in their own separate extension (Zicntr) and are no longer mandatory.

Default value: 0

[[param-U_MODE]]
===== U_MODE

Support the U (user) privilege level. In U-mode, the core performs unprivileged bus accesses, and software's access to CSRs is restricted. Additionally, if the PMP is included, the core may restrict U-mode software's access to memory.

Requires: <>.

Default value: 0

[[param-PMP_REGIONS]]
===== PMP_REGIONS

Number of physical memory protection regions, or 0 for no PMP. PMP is more useful if U-mode is supported, but this is not a requirement.

Hazard3's PMP supports only the NAPOT and (if <<param-PMP_GRAIN>> is 0) NA4 region types.

Requires: <>.

Default value: 0

[[param-PMP_GRAIN]]
===== PMP_GRAIN

This is the _G_ parameter in the privileged spec, which defines the granularity of PMP regions. Minimum PMP region size is 1 << (_G_ + 2) bytes.

If _G_ > 0, `pmpcfg.a` cannot be set to NA4 (attempting to do so will set the region to OFF instead).

If _G_ > 1, the _G_ - 1 LSBs of pmpaddr are read-only-0 when `pmpcfg.a` is OFF, and read-only-1 when `pmpcfg.a` is NAPOT.
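As a worked example of the granularity rule, the following Python sketch computes the minimum region size for a given _G_, and the NAPOT `pmpaddr` encoding defined by the RISC-V privileged specification. The function names `min_region_bytes` and `napot_pmpaddr` are illustrative only and are not part of Hazard3. The encoding shown is the generic RISC-V one, and the sketch does not model Hazard3's read-only bits.

```python
def min_region_bytes(g: int) -> int:
    """Minimum PMP region size in bytes for granularity parameter G."""
    if g < 0:
        raise ValueError("G must be non-negative")
    return 1 << (g + 2)


def napot_pmpaddr(base: int, size: int, g: int = 0) -> int:
    """Generic RISC-V NAPOT encoding of a pmpaddr value (hypothetical helper).

    base: region base address in bytes, aligned to size
    size: region size in bytes, a power of two and at least 8
    g:    PMP granularity; regions smaller than 1 << (g + 2) are rejected
    """
    if size < 8 or size & (size - 1):
        raise ValueError("NAPOT size must be a power of two, at least 8 bytes")
    if size < min_region_bytes(g):
        raise ValueError("region is smaller than the configured granularity")
    if base % size:
        raise ValueError("base must be aligned to the region size")
    # pmpaddr holds address bits [33:2]; the trailing ones encode the size.
    return (base >> 2) | ((size >> 3) - 1)


if __name__ == "__main__":
    for g in range(4):
        print(f"G={g}: minimum region size {min_region_bytes(g)} bytes")
    print(hex(napot_pmpaddr(0x20000000, 4096)))
```

For example, `napot_pmpaddr(0x20000000, 4096)` gives `0x80001ff`: the nine trailing ones select a 2^12-byte region starting at `0x20000000`.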
Default value: 0

[[param-PMP_HARDWIRED]]
===== PMP_HARDWIRED

If a bit is 1, the corresponding region's pmpaddr and pmpcfg registers are read-only, with their values fixed when the processor is instantiated. PMP_GRAIN is ignored on hardwired regions.

Hardwired regions are far cheaper, both in area and comparison delay, than dynamically configurable regions.

Hardwired PMP regions are a good option for setting default U-mode permissions on regions which have access controls outside of the processor, such as peripheral regions. For this case it's recommended to make hardwired regions the highest-numbered, so they can be overridden by lower-numbered dynamic regions.

Default value: all-zeroes.

[[param-PMP_HARDWIRED_ADDR]]
===== PMP_HARDWIRED_ADDR

Values of pmpaddr registers whose PMP_HARDWIRED bits are set to 1. Has no effect on PMP regions which are not hardwired.

Default value: all-zeroes.

[[param-PMP_HARDWIRED_CFG]]
===== PMP_HARDWIRED_CFG

Values of pmpcfg registers whose PMP_HARDWIRED bits are set to 1. Has no effect on PMP regions which are not hardwired.

Default value: all-zeroes.

[[param-DEBUG_SUPPORT]]
===== DEBUG_SUPPORT

Support for run/halt and instruction injection from an external Debug Module, support for Debug Mode, and Debug Mode CSRs.

Requires: <>, <>.

Default value: 0

[[param-BREAKPOINT_TRIGGERS]]
===== BREAKPOINT_TRIGGERS

Number of hardware breakpoints. A breakpoint is implemented as a trigger that supports only exact execution address matches, ignoring instruction size. That is, a trigger which supports type=2 execute=1 (but not store/load=1, i.e. not a watchpoint).

Requires: <>

Default value: 0

==== External interrupt support

[[param-NUM_IRQS]]
===== NUM_IRQS

Number of external IRQs. Minimum 1, maximum 512. Note that if <<param-EXTENSION_XH3IRQ>> (Hazard3 interrupt controller) is disabled then multiple external interrupts are simply OR'd into mip.meip.
Default value: 1

[[param-IRQ_PRIORITY_BITS]]
===== IRQ_PRIORITY_BITS

Number of priority bits implemented for each interrupt in meipra, if EXTENSION_XH3IRQ is enabled. The number of distinct levels is (1 << IRQ_PRIORITY_BITS). Minimum 0, maximum 4. Note that multiple priority levels with a large number of IRQs will have a severe effect on timing.

Default value: 0

[[param-IRQ_INPUT_BYPASS]]
===== IRQ_INPUT_BYPASS

Disable the input registers on the external interrupts, to reduce latency by one cycle. Can be applied on an IRQ-by-IRQ basis.

Ignored if <> is disabled.

Default value: all-zeroes (not bypassed).

==== Identification Registers

[[param-MVENDORID_VAL]]
===== MVENDORID_VAL

Value of the <> CSR. JEDEC JEP106-compliant vendor ID, or all-zeroes. 31:7 is continuation code count, 6:0 is ID. Parity bit is not stored.

Default value: all-zeroes.

[[param-MIMPID_VAL]]
===== MIMPID_VAL

Value of the <> CSR. Implementation ID for this specific version of Hazard3. Should be a git hash, or all-zeroes.

Default value: all-zeroes.

[[param-MHARTID_VAL]]
===== MHARTID_VAL

Value of the <> CSR. Each Hazard3 core has a single hardware thread. Multiple cores should have unique IDs.

Default value: all-zeroes.

[[param-MCONFIGPTR_VAL]]
===== MCONFIGPTR_VAL

Value of the <> CSR. Pointer to configuration structure blob, or all-zeroes. Must be at least 4-byte-aligned.

Default value: all-zeroes.

==== Performance/size options

[[param-REDUCED_BYPASS]]
===== REDUCED_BYPASS

Remove all forwarding paths except X->X (so back-to-back ALU ops can still run at 1 CPI), to save area. This has a significant impact on per-clock performance, so should only be considered for extremely low-area implementations.

Default value: 0

[[param-MULDIV_UNROLL]]
===== MULDIV_UNROLL

Bits per clock for multiply/divide circuit, if present. Must be a power of 2.

Default value: 1

[[param-MUL_FAST]]
===== MUL_FAST

Use single-cycle multiply circuit for MUL instructions, retiring to stage 3.
The sequential multiply/divide circuit is still used for MULH*.

Default value: 0

[[param-MUL_FASTER]]
===== MUL_FASTER

Retire fast multiply results to stage 2 instead of stage 3. Throughput is the same, but latency is reduced from 2 cycles to 1 cycle.

Requires: <>.

Default value: 0

[[param-MULH_FAST]]
===== MULH_FAST

Extend the fast multiply circuit to also cover MULH*, and remove the multiply functionality from the sequential multiply/divide circuit.

Requires: <>

Default value: 0

[[param-FAST_BRANCHCMP]]
===== FAST_BRANCHCMP

Instantiate a separate comparator (eq/lt/ltu) for branch comparisons, rather than using the ALU. Improves fetch address delay, especially if `Zba` extension is enabled. Disabling may save area.

Default value: 1

[[param-RESET_REGFILE]]
===== RESET_REGFILE

Whether to support reset of the general purpose registers. There are around 1k bits in the register file, so the reset can be disabled e.g. to permit block-RAM inference on FPGA.

Default value: 1

[[param-BRANCH_PREDICTOR]]
===== BRANCH_PREDICTOR

Enable branch prediction. The branch predictor consists of a single BTB entry which is allocated on a taken backward branch, and cleared on a mispredicted non-taken branch, a fence.i or a trap. Successful prediction eliminates the 1-cycle fetch bubble on a taken branch, usually making tight loops faster.

Requires: <>

Default value: 0

[[param-MTVEC_WMASK]]
===== MTVEC_WMASK

Mask of which bits in mtvec are writable. Full writability (except for bit 1) is recommended, because a common idiom in setup code is to set mtvec just past code that may trap, as a hardware `try {...} catch` block.

* The vectoring mode can be made fixed by clearing the LSB of MTVEC_WMASK
* In vectored mode, the vector table must be aligned to its size, rounded up to a power of two.

Default value: all writable except for bit 1.

=== Interfaces (Top-level Ports)

Most ports are common to the two top-level wrappers, `hazard3_cpu_1port.v` and `hazard3_cpu_2port.v`.
The only difference is the number of AHB5 manager ports used to access the bus: `hazard3_cpu_1port.v` has a single port used for all accesses, whereas `hazard3_cpu_2port.v` adds a separate, dedicated port for instruction fetch.

==== Interfaces Common to All Wrappers

Global signals

[options="header",cols="1,1,4,4"]
|===
| Width | I/O | Name | Description
4+| Global signals
| 1 | I | `clk` | Clock for all processor logic not driven by `clk_always_on`. Must be the same as the AHB5 bus clock. You should use an external clock gate controlled by `clk_en` if the Xh3power extension is configured.
| 1 | I | `clk_always_on` | Clock for logic required to wake from a low-power state. Connect to the same clock as `clk`, but do not insert an external clock gate.
| 1 | I | `rst_n` | Active-low asynchronous reset for all processor logic. There is no internal synchroniser, so you must arrange externally for reset assertion/removal times to be met. For example, add an external reset synchroniser.
4+| Power control signals
| 1 | O | `pwrup_req` | Power-up request. Disconnect if Xh3power is not configured. Part of a four-phase (Gray code) req/ack handshake for negotiating power or clocks with your system power controller. The processor releases `pwrup_req` on entering a sufficiently deep `wfi` or `h3.block` state, as configured by the `msleep` CSR. It then waits for deassertion of `pwrup_ack`, before reasserting `pwrup_req` when the processor intends to wake from the low-power state.
| 1 | I | `pwrup_ack` | Power-up acknowledged. Tie to 1 if Xh3power is not configured, or if there is no external system power controller. The processor does not access the bus when either `pwrup_req` or `pwrup_ack` is low.
| 1 | O | `clk_en` | Control output for an external top-level clock gate on `clk`. Active-high enable.
| 1 | O | `unblock_out` | Pulses high when an `h3.unblock` instruction executes. Disconnect if Xh3power is not configured.
| 1 | I | `unblock_in` | A high input pulse will release a blocked `h3.block` instruction, or cause the next `h3.block` instruction to immediately fall through.
4+| Debug Module controls
| 1 | I | `dbg_req_halt` | Debugger halt request. Connect to the matching signal on the Debug Module. Tie low if debug support is not configured.
| 1 | I | `dbg_req_halt_on_reset` | Debugger halt-on-reset request. Connect to the matching signal on the Debug Module. Tie low if debug support is not configured.
| 1 | I | `dbg_req_resume` | Debugger resume request. Connect to the matching signal on the Debug Module. Tie low if debug support is not configured.
| 1 | O | `dbg_halted` | Debug halted status. Asserts when the processor is halted in Debug mode. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
| 1 | O | `dbg_running` | Debug running status. Asserts when the processor is running, rather than halted in Debug mode. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
| 32 | I | `dbg_data0_rdata` | Read data bus for mapping Debug Module `dmdata0` register as a CSR. Connect to the matching signal on the Debug Module. Tie to zeroes if debug support is not configured.
| 32 | O | `dbg_data0_wdata` | Write data bus for mapping Debug Module `dmdata0` register as a CSR. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
| 1 | O | `dbg_data0_wen` | Write data strobe for mapping Debug Module `dmdata0` register as a CSR. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
| 32 | I | `dbg_instr_data` | Instruction injection interface. Connect to the matching signal on the Debug Module. Tie to zeroes if debug support is not configured.
| 1 | I | `dbg_instr_data_vld` | Instruction injection interface. Connect to the matching signal on the Debug Module. Tie low if debug support is not configured.
| 1 | O | `dbg_instr_data_rdy` | Instruction injection interface. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
| 1 | O | `dbg_instr_caught_exception` | Exception caught during Program Buffer execution. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
| 1 | O | `dbg_instr_caught_ebreak` | Breakpoint instruction caught during Program Buffer execution. Connect to the matching signal on the Debug Module. Disconnect if debug support is not configured.
4+| Shared System Bus Access
| 32 | I | `dbg_sbus_addr` | Address for System Bus Access arbitrated with this core's load/store access. Tie to zeroes if this feature is not used.
| 1 | I | `dbg_sbus_write` | Write/not-Read flag for System Bus Access arbitrated with this core's load/store access. Tie low if this feature is not used.
| 2 | I | `dbg_sbus_size` | Transfer size (0/1/2 = byte/halfword/word) for System Bus Access arbitrated with this core's load/store access. Tie low if this feature is not used.
| 1 | I | `dbg_sbus_vld` | Transfer enable signal for System Bus Access arbitrated with this core's load/store access. Tie low if this feature is not used.
| 1 | O | `dbg_sbus_rdy` | Transfer stall signal for System Bus Access arbitrated with this core's load/store access. Disconnect if this feature is not used.
| 1 | O | `dbg_sbus_err` | Bus fault signal for System Bus Access arbitrated with this core's load/store access. Disconnect if this feature is not used.
| 32 | I | `dbg_sbus_wdata` | Write data bus for System Bus Access arbitrated with this core's load/store access. Tie to zeroes if this feature is not used.
| 32 | O | `dbg_sbus_rdata` | Read data bus for System Bus Access arbitrated with this core's load/store access. Disconnect if this feature is not used.
4+| Interrupt requests
| `NUM_IRQS` | I | `irq` | If Xh3irq is not configured, this is the RISC-V external interrupt line (`mip.meip`) which you should connect to an external interrupt controller such as a standard RISC-V PLIC. If Xh3irq is configured, this is a vector of level-sensitive active-high interrupt signals which the core's internal interrupt controller can route through the `mip.meip` vector. Tie low if unused.
| 1 | I | `soft_irq` | This is the standard RISC-V software interrupt signal, `mip.msip`. Tie low if unused.
| 1 | I | `timer_irq` | This is the standard RISC-V timer interrupt signal, `mip.mtip`. It should be connected to a standard RISC-V platform timer peripheral (`mtime`/`mtimecmp`) accessible to M-mode software on your system bus. Tie low if unused.
|===

==== Interfaces for 1-port CPU

This wrapper adds a single standard AHB5 manager port, with signals prefixed `ahblm_`. See the AMBA 5 AHB specification from Arm for definitions of these signals.

==== Interfaces for 2-port CPU

This wrapper adds two standard AHB5 manager ports, with signals prefixed `i_` for instruction and `d_` for data. See the AMBA 5 AHB specification from Arm for definitions of these signals.

The I port only generates word-aligned word-sized read accesses. It does not use AHB5 exclusives.

When shared System Bus Access (SBA) is used, the SBA bus accesses are routed through the D port.
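
==== Example: power-up handshake sequence

Returning to the `pwrup_req`/`pwrup_ack` signals in the common interface table, the four-phase handshake can be illustrated with a small behavioural model. This is a Python sketch written from the port descriptions alone, not a translation of the Hazard3 RTL. The `Phase` and `PowerHandshake` names, the `sleep`/`wake` inputs and the one-step power controller in `simulate` are all assumptions made for illustration.

```python
from enum import Enum


class Phase(Enum):
    AWAKE = "awake"            # req=1, ack=1: normal operation
    RELEASING = "releasing"    # req=0, waiting for ack to fall
    ASLEEP = "asleep"          # req=0, ack=0: low-power state
    REQUESTING = "requesting"  # req=1, waiting for ack to rise


class PowerHandshake:
    """Hypothetical model of the processor side of the req/ack handshake."""

    def __init__(self):
        self.phase = Phase.AWAKE
        self.pwrup_req = 1

    def step(self, pwrup_ack, sleep=False, wake=False):
        """Advance one step and return the new pwrup_req value."""
        if self.phase is Phase.AWAKE and sleep:
            self.pwrup_req = 0              # release the request
            self.phase = Phase.RELEASING
        elif self.phase is Phase.RELEASING and not pwrup_ack:
            self.phase = Phase.ASLEEP       # controller has acknowledged
        elif self.phase is Phase.ASLEEP and wake:
            self.pwrup_req = 1              # ask for power/clocks again
            self.phase = Phase.REQUESTING
        elif self.phase is Phase.REQUESTING and pwrup_ack:
            self.phase = Phase.AWAKE
        return self.pwrup_req

    def bus_allowed(self, pwrup_ack):
        """No bus access while either handshake signal is low."""
        return bool(self.pwrup_req and pwrup_ack)


def simulate():
    """Run one sleep/wake cycle against a controller where ack follows req."""
    cpu = PowerHandshake()
    ack = 1
    trace = [(cpu.pwrup_req, ack)]
    events = [{"sleep": True}, {}, {}, {"wake": True}, {}, {}]
    for event in events:
        req = cpu.step(ack, **event)
        if (req, ack) != trace[-1]:
            trace.append((req, ack))
        ack = req  # assumed controller: responds after the processor's step
        if (req, ack) != trace[-1]:
            trace.append((req, ack))
    return trace


if __name__ == "__main__":
    for req, ack in simulate():
        print(f"pwrup_req={req} pwrup_ack={ack}")
```

In this model exactly one of the two signals changes at each step of the trace, which is the Gray code property mentioned in the `pwrup_req` description.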