Some doc updates

2021-07-17 12:58:08 +01:00 · 46f95f859d
parent 14ba030271
commit 46f95f859d
4 changed files with 1472 additions and 1290 deletions
--- a/Readme.md
+++ b/Readme.md
@@ -1,6 +1,29 @@
 # Hazard3

-Hazard3 is a 3-stage RV32IMC processor based on Hazard5. The stages are:
+Hazard3 is a 3-stage RISC-V processor, providing the following architectural support:
+
+* `RV32I`: 32-bit base instruction set
+* `M` extension: integer multiply/divide/modulo
+* `C` extension: compressed instructions
+* `Zicsr` extension: CSR access
+* M-mode privileged instructions `ECALL`, `EBREAK`, `MRET`
+* The machine-mode (M-mode) privilege state, and standard M-mode CSRs
+* Debug support, compliant with RISC-V debug specification version 0.13.2
+
+You can [read the documentation here](doc/hazard3.pdf). (PDF link)
+
+This repository also contains a compliant RISC-V Debug Module for Hazard3, which can be accessed over an AMBA 3 APB port or using the optional JTAG Debug Transport Module.
+
+There is an [example SoC integration](example_soc/soc/example_soc.v), showing how these components can be assembled to create a minimal system with a JTAG-enabled RISC-V processor, some RAM and a serial port.
+
+The following are planned for future implementation:
+
+* Support for `WFI` instruction
+* `A` extension: atomic memory access
+
+Hazard3 is still under development.
+
+# Pipeline

 - `F` fetch
 	- Instruction fetch data phase
@@ -18,46 +41,3 @@ Hazard3 is a 3-stage RV32IMC processor based on Hazard5. The stages are:
 	- Some complex instructions, particularly multiply and divide

 This is essentially Hazard5, with the `D` and `X` stages merged and the register file brought forward. Many components are reused directly from Hazard5. The particular focus here is on shortening the branch delay, which is one of the weak points in Hazard5's IPC.
-
-Merging the decode and execute stages shouldn't have too much effect on overall cycle time, which on Hazard5 is dominated by branch target decode in `D` being presented to the bus. On Hazard3, the branch target decode is much the same, except branch direction is now resolved in parallel with the branch target decode (as the ALU will be physically alongside the branch address adder) and all jumps/branches will be presented in stage 2 of the pipeline. The branch timings on Hazard5, with its static branch prediction, were:
-
- `JAL`: 2 cycles
- `JALR`: 4 cycles (this includes `RET`!)
- Backward branch taken: 2 cycles
- Backward branch nontaken (mispredict): 4 cycles
- Forward branch taken (mispredict): 4 cycles
- Forward branch nontaken: 1 cycle
-
-On Hazard3 the expectation is for all jumps and taken branches to take 2 cycles, and nontaken branches to take 1 cycle.
-
-## Other Architectural Expansion
-
- A extension (at least `ll`/`sc`, AMOs would be nice but are easy to emulate)
- Don't half-ass exceptions -- particularly things like instruction fetch memory fault
- Debug
- Don't half-ass CSRs
- WFI instruction
-
-
-# Exceptions
-
-Exceptions have a number of sources:
-
- Instruction fetch (hresp captured and piped through to decode, flushed if fetch speculation was incorrect)
- Instruction decode (invalid instructions, or exception-causing instructions like ecall)
- CSR address decode
- Load/store address alignment (address phase)
- Load/store bus error (data phase)
- External interrupts
- Internal interrupts (timers etc)
- Debugger breakpoints
- Debugger single-step
-
-Out of these the most troublesome is probably load/store bus error, as it *must* be associated with stage 3 of the pipeline, not with stage 2.
-
-Therefore it may be best to take the exception branch from stage 3, kind of like a branch mispredict. This flush signal would inhibit side-effecting instructions in stage 2. In particular, the case of a load/store in stage 2 with a faulting load/store in stage 3. The sequence of events there is probably:
-
- Cycles m through n (maybe): data phase stall of instruction in stage 3
- Cycle n + 1: first cycle of error response. Error is registered locally.
- Cycle n + 2: Second cycle of error response. Exception branch is generated, and load/store in stage 2 is suppressed based on the registered flag.
-
--- a/doc/hazard3.pdf
+++ b/doc/hazard3.pdf
--- a/doc/sections/debug.adoc
+++ b/doc/sections/debug.adoc
@@ -4,21 +4,21 @@ Hazard3, along with its external debug components, implements version 0.13.2 of

 * Minimal impact on core timing when present
 * No external components which need integrating at the other end of your bus fabric -- just slap the Debug Module onto the core and away you go
-* Maximally efficient block data transfers to target RAM for faster edit-compile-run cycle
+* Efficient block data transfers to target RAM for faster edit-compile-run cycle

 Hazard3's debug support implements the following:

 * Run/halt/reset control as required
 * Abstract GPR access as required
 * Program Buffer, 2 words plus `impebreak`
-* Automatic trigger of abstract command on `data0` access (`abstractauto`) for efficient memory block transfers from the host
+* Automatic trigger of abstract command (`abstractauto`) on `data0` or Program Buffer access for efficient memory block transfers from the host
 * (TODO) Some minimum useful trigger unit -- likely just breakpoints, no watchpoints

-The DM can inject instructions directly into the core's instruction prefetch buffer. The DM writes instructions from the Program Buffer to this interface, as well as writing its own hardcoded instructions to manipulate core state and implement abstract commands.
+The DM can inject instructions directly into the core's instruction prefetch buffer. This mechanism is used to execute the Program Buffer, or used directly by the DM, issuing hardcoded instructions to manipulate core state.

-The DM's `data0` register is implemented as an externally-accessible core CSR in the debug space, so abstract GPR accesses translate to a `csrw data0, x` (read GPR `x`) or `csrr x, data0` (write to GPR `x`). The DM always follows this instruction up with an `ebreak` so that it is notified by the core when the instruction sequence completes, just like the implicit `ebreak` at the end of the Program Buffer.
+The DM's `data0` register is exposed to the core as a debug mode CSR. By issuing instructions to make the core read or write this dummy CSR, the DM can exchange data with the core. To read from a GPR `x` into `data0`, the DM issues a `csrw data0, x` instruction. Similarly `csrr x, data0` will write `data0` to that GPR. The DM always follows the CSR instruction with an `ebreak`, just like the implicit `ebreak` at the end of the Program Buffer, so that it is notified by the core when the GPR read instruction sequence completes.

-The debugger implements memory and CSR access using the Program Buffer, which uses the same instruction injection interface used by the DM to implement abstract GPR access. The `abstractauto` feature allows the DM to execute the program buffer automatically following every abstract GPR access, which can be used for e.g. autoincrementing read/write memory bursts.
+The debug host must use the Program Buffer to access CSRs and memory. This carries some overhead for individual accesses, but is efficient for bulk transfers: the `abstractauto` feature allows the DM to trigger the Program Buffer and/or a GPR transfer automatically following every `data0` access, which can be used for e.g. autoincrementing read/write memory bursts. Program Buffer reads/writes can also be used as `abstractauto` triggers: this is less useful than the `data0` trigger, but takes little extra effort to implement, and can be used to read/write a large number of CSRs efficiently.

 Abstract memory access is not implemented because it offers no better throughput than Program Buffer execution with `abstractauto` for bulk transfers, and non-bulk transfers are still instantaneous from the perspective of the human at the other end of the wire.

@@ -35,7 +35,6 @@ Features implemented by DM (beyond the mandatory):
 Not implemented:

 * Hart array mask selection
-* Halt summary registers
 * Abstract access memory
 * Abstract access CSR
 * Post-incrementing abstract access GPR
@@ -45,9 +44,10 @@ Core behaviour:

 * Branch, `jal`, `jalr` and `auipc` are illegal in debug mode, because they observe PC: attempting to execute will halt Program Buffer execution and report an exception in `abstractcs.cmderr`
 * The `dret` instruction is not implemented (a special purpose DM-to-core signal is used to signal resume)
-* Entering and exiting debug mode does not clear an atomic load reservation; the host may explicitly clear a reservation using a dummy `sc` instruction via the program buffer.
 * The `dscratch` CSRs are not implemented
-* `data0` is implemented as a scratch CSR mapped at `0x7b2` (the location of `dscratch0`), readable and writable by the DM.
+* External `data0` register is exposed as a dummy CSR mapped at `0x7b2` (the location of `dscratch0`), readable and writable by the DM.
+** This is a debug mode CSR, so raises an illegal instruction exception when accessed in machine mode
+** The DM ignores writes unless it is currently executing an abstract command on this core (`hartsel` = this core, `abstractcs.busy` = 1)
 * `dcsr.stepie` is hardwired to 0 (no interrupts during single stepping)
 * `dcsr.stopcount` and `dcsr.stoptime` are hardwired to 1 (no counter or internal timer increment in debug mode)
 * `dcsr.mprven` is hardwired to 0
--- a/doc/sections/introduction.adoc
+++ b/doc/sections/introduction.adoc
@@ -8,12 +8,11 @@ Hazard3 is a 3-stage RISC-V processor, providing the following architectural sup
 * `Zicsr` extension: CSR access
 * M-mode privileged instructions `ECALL`, `EBREAK`, `MRET`
 * The machine-mode (M-mode) privilege state, and standard M-mode CSRs
+* Debug support, fully compliant with version 0.13.2 of the RISC-V external debug specification

 The following are planned for future implementation:

 * Support for `WFI` instruction
-* Debug support
 * `A` extension: atomic memory access
 ** `LR`/`SC` fully supported
 ** AMONone PMA on all of memory (AMOs are decoded but unconditionally trigger access fault without attempting memory access)
-* Some nonstandard M-mode CSRs for interrupt control etc
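
The `debug.adoc` changes in this commit describe abstract GPR access as a `csrw data0, x` or `csrr x, data0` instruction followed by an `ebreak`, with `data0` mapped at CSR address `0x7b2`. As a rough illustration of what such a two-instruction sequence looks like as machine words, here is a minimal Python sketch using the standard RISC-V `Zicsr` encodings. The function names are hypothetical and are not taken from the Hazard3 source, and whether the DM hardware emits exactly these words (for example, with this choice of `csrrw`/`csrrs` expansion for the pseudo-instructions) is an assumption.

```python
# Illustrative sketch only: standard RISC-V encodings for the instructions
# named in the debug.adoc text. The CSR address 0x7b2 comes from the text;
# all function names here are hypothetical, not Hazard3 source.

OPCODE_SYSTEM = 0x73     # major opcode shared by CSR instructions and ebreak
FUNCT3_CSRRW = 0b001     # csrrw: atomic read/write CSR
FUNCT3_CSRRS = 0b010     # csrrs: atomic read and set bits in CSR
DATA0_CSR = 0x7B2        # data0 is mapped at the dscratch0 address
EBREAK = 0x00100073      # fixed encoding of the ebreak instruction


def _csr_instr(csr, rs1, funct3, rd):
    """Pack the fields of a register-form CSR instruction into a 32-bit word."""
    if not (0 <= csr < 4096 and 0 <= rs1 < 32 and 0 <= rd < 32):
        raise ValueError("field out of range")
    return (csr << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | OPCODE_SYSTEM


def csrw(csr, rs1):
    """csrw csr, rs1 is the assembler pseudo-instruction for csrrw x0, csr, rs1."""
    return _csr_instr(csr, rs1, FUNCT3_CSRRW, 0)


def csrr(rd, csr):
    """csrr rd, csr is the assembler pseudo-instruction for csrrs rd, csr, x0."""
    return _csr_instr(csr, 0, FUNCT3_CSRRS, rd)


def gpr_read_sequence(x):
    """Words for reading GPR x into data0: csrw data0, x then ebreak."""
    return [csrw(DATA0_CSR, x), EBREAK]


def gpr_write_sequence(x):
    """Words for writing data0 into GPR x: csrr x, data0 then ebreak."""
    return [csrr(x, DATA0_CSR), EBREAK]


if __name__ == "__main__":
    for name, seq in (("read x5", gpr_read_sequence(5)),
                      ("write x5", gpr_write_sequence(5))):
        print(name, [f"{word:08x}" for word in seq])
```

The trailing `ebreak` in each sequence mirrors the text's point that the DM relies on it to be notified when the injected instructions have completed.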