IRC channel logs

<oriansj>doing NAND2Tetris makes me want to design my own instruction set architecture and CPU

<oriansj>then port stage0 to it

<oriansj>anyone know of any good tools for that sort of work?

<muurkha>without really having done it, I'd guess: yosys, nextpnr, APIO, and Lattice FPGAs

<oriansj>I guess I could figure out the high-level details and encoding without deciding on exactly which bits encode which fields.

<aggi>gigatron ttl got some documentation too

<muurkha>maybe, but I think it's easy to end up with a field with 5 or 9 possible values that way

<oriansj>muurkha: well yes, especially if it is 3 or more bits long

<muurkha>I mean, I think it's helpful to develop which bits encode which things in parallel
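[Editor's note] The point about a field ending up with 5 or 9 possible values can be made concrete with a little arithmetic: a field with n values needs ceil(log2(n)) bits, and anything just past a power of two leaves most of the extra bit's encodings unused. The sketch below is only an illustration; `bits_needed` and `wasted_encodings` are hypothetical helper names, not part of any tool mentioned in the log.

```python
def bits_needed(n_values):
    """Smallest field width (in bits) that can hold n_values distinct values."""
    return max(n_values - 1, 0).bit_length()


def wasted_encodings(n_values):
    """Bit patterns in that field that do not correspond to any value."""
    return (1 << bits_needed(n_values)) - n_values


for n in (4, 5, 8, 9):
    print(n, "values:", bits_needed(n), "bits,", wasted_encodings(n), "unused encodings")
```

Under these definitions, 5 values need a 3-bit field and 9 values need a 4-bit field, which is why settling the bit layout alongside the high-level design helps catch such fields early.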

<oriansj>muurkha: ah very fair

<oriansj>well less dense instructions should certainly make for simpler decode logic

<oriansj>although I don't see a real benefit to allowing instructions like R0, R1 = R2 + R3

<oriansj>4 registers would be suboptimal for M2-Planet, 16 registers would map nicely to hex but 64 registers would be optimal for advanced optimizing compilers.

<oriansj>then I can do [opcode 8][register 6][register 6][immediate 18] and [opcode][register 6][register 6][xop 12][register 6]

<oriansj>which would leave lots of bits for future expansion and ensure the most common displacements would fit in a 256KB cache
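[Editor's note] The 256KB figure follows from the 18-bit immediate: 2^18 byte offsets span 256 KiB. The log does not say whether the displacement is signed or whether it counts bytes, so both of those are assumptions in this sketch.

```python
IMM_BITS = 18  # immediate width from the proposed format

# Assumption: the displacement is counted in bytes.
unsigned_span = 1 << IMM_BITS            # number of distinct byte offsets
unsigned_kib = unsigned_span // 1024     # same span expressed in KiB

# Assumption: if the field were two's complement instead, the same
# span would be split around zero.
signed_min = -(1 << (IMM_BITS - 1))
signed_max = (1 << (IMM_BITS - 1)) - 1

print(unsigned_span, "bytes =", unsigned_kib, "KiB")
print("signed range:", signed_min, "to", signed_max)
```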

<oriansj>wow I must be tired to have missed that. [opcode 10][register 6][register 6][immediate 18] and [opcode 10][register 6][register 6][xop 12][register 6] for 40 bit instructions (not dense at all) but 2^10 should give plenty of 2OPI instructions and the xop should more than cover a boatload of 3OP instructions without needing more than 1 or 2 of the 1024 values in the primary opcode space
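[Editor's note] A minimal sketch of how the two 40-bit layouts could be packed and unpacked, to check that the field widths add up. The log gives only field order and widths, so placing the first field in the most significant bits, treating the immediate as a raw unsigned field, and all function names here are assumptions for illustration.

```python
# Field widths taken from the layouts quoted above.
OPCODE_BITS = 10
REG_BITS = 6
IMM_BITS = 18
XOP_BITS = 12

FORMAT_2OPI = (OPCODE_BITS, REG_BITS, REG_BITS, IMM_BITS)
FORMAT_3OP = (OPCODE_BITS, REG_BITS, REG_BITS, XOP_BITS, REG_BITS)


def pack(values, widths):
    """Pack fields into one integer, first field in the most significant bits."""
    word = 0
    for value, width in zip(values, widths):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word, widths):
    """Split an integer back into fields, most significant field first."""
    values = []
    for width in reversed(widths):
        values.append(word & ((1 << width) - 1))
        word >>= width
    return tuple(reversed(values))


def encode_2opi(opcode, ra, rb, imm):
    return pack((opcode, ra, rb, imm), FORMAT_2OPI)


def encode_3op(opcode, ra, rb, xop, rc):
    return pack((opcode, ra, rb, xop, rc), FORMAT_3OP)


# Both formats come out at 40 bits.
assert sum(FORMAT_2OPI) == sum(FORMAT_3OP) == 40

word = encode_2opi(1, 2, 3, 100)
print(unpack(word, FORMAT_2OPI))
```

Keeping the widths in a single tuple per format makes it cheap to try alternative layouts and see immediately whether the total still fits.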

<muurkha>lots of registers do impose costs on interrupt handling, context switches, and sometimes even subroutine calls

<muurkha>depending on your calling convention

<oriansj>muurkha: well if one can use the registers as either integer or floating point, it ends up reducing the number of wasted registers in any particular code block

<muurkha>maybe, maybe not

<muurkha>remember that doing that means that you have fewer integer registers (or fewer floating point registers) for the same size operand bitfield

<muurkha>because some of the registers you could conceivably be addressing for integer instructions are being used for floating point (sometimes; doesn't matter in a compiler or a kernel generally)
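[Editor's note] The trade-off described here is a counting argument. With separate register files the opcode selects the file, so an n-bit operand field can name 2^n integer registers and another 2^n floating-point registers; with one unified file the same field names 2^n registers in total. The sketch below uses the 6-bit field from the proposal above; the variable names are illustrative only.

```python
FIELD_BITS = 6  # register field width from the proposed encoding

# Separate files: the opcode decides which file the field indexes.
split_int_regs = 1 << FIELD_BITS
split_fp_regs = 1 << FIELD_BITS
split_total = split_int_regs + split_fp_regs

# Unified file: integer and floating-point values share one set of names.
unified_total = 1 << FIELD_BITS

print("separate files:", split_int_regs, "int +", split_fp_regs, "fp =", split_total)
print("unified file:  ", unified_total, "shared")
```

RISC-V takes the separate-file approach with 5-bit fields, giving 32 integer and 32 floating-point registers.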

<muurkha>check out the italicized sections in the "F" chapter of the RISC-V unprivileged instruction spec

<oriansj>true and in superscalar implementations, the register file will probably be duplicated anyway internally to reduce the number of read/write ports.

<muurkha>aliasing between integer and floating-point uses also causes substantial trouble for high-performance implementations

<muurkha>...or so I've heard, as you know by now, I've never designed a high-performance chip

<oriansj>well it incurs a single clock cycle delay between execute writing to a register and when the next execute cluster can read that value.

***Server sets mode: +cnt

<muurkha>oriansj: not sure that's the issue

<stikonas>argh, cc_amd64 does not support dereferencing pointers...

<stikonas>might need to drop to inline assembly

<stikonas>oh, maybe I don't need to do that...

***robin__ is now known as robin

***robin_ is now known as robin


2022-10-13.log