<oriansj>doing NAND2Tetris makes me want to design my own instruction set architecture and CPU
<oriansj>anyone know of any good tools for that sort of work?
<muurkha>without really having done it, I'd guess: yosys, nextpnr, APIO, and Lattice FPGAs
<oriansj>I guess I could figure out the high level details and encoding without deciding on the actual which bits encode which.
<aggi>gigatron ttl got some documentation too
<muurkha>maybe, but I think it's easy to end up with a field with 5 or 9 possible values that way
<oriansj>muurkha: well yes, especially if it is 3 or more bits long
<muurkha>I mean, I think it's helpful to develop which bits encode which things in parallel
<oriansj>well less dense instructions should certainly make for simpler decode logic
<oriansj>although I don't see a real benefit to allowing instructions like R0, R1 = R2 + R3
<oriansj>4 registers would be suboptimal for M2-Planet, 16 registers would map nicely to hex but 64 registers would be optimal for advanced optimizing compilers.
<oriansj>then I can do [opcode 8][register 6][register 6][immediate 18] and [opcode][register 6][register 6][xop 12][register 6]
<oriansj>which would leave lots of bits for future expansion and ensure the most common displacements would fit in a 256KB cache
<oriansj>wow I must be tired to have missed that. [opcode 10][register 6][register 6][immediate 18] and [opcode 10][register 6][register 6][xop 12][register 6] for 40 bit instructions (not dense at all) but 2^10 should give plenty of 2OPI instructions and the xop should more than cover a boatload of 3OP instructions without needing more than 1 or 2 of the 1024 values in the primary opcode space
<muurkha>lots of registers do impose costs on interrupt handling, context switches, and sometimes even subroutine calls
<muurkha>depending on your calling convention
<oriansj>muurkha: well if one can use the registers as either integer or floating point; it ends up reducing the number of wasted registers in any particular code block
<muurkha>remember that doing that means that you have less integer registers (or less floating point registers) for the same size operand bitfield
<muurkha>because some of the registers you could conceivably be addressing for integer instructions are being used for floating point (sometimes; doesn't matter in a compiler or a kernel generally)
<muurkha>check out the italicized sections in the "F" chapter of the RISC-V unprivileged instruction spec
<oriansj>true and in superscalar implementations, the register file will probably be duplicated anyway internally to reduce the number of read/write ports.
<muurkha>aliasing between integer and floating-point uses also causes substantial trouble for high-performance implementations
<muurkha>...or so I've heard, as you know by now, I've never designed a high-performance chip
<oriansj>well it incurs a single clock cycle delay between execute writing to a register and when the next execute cluster can read that value.
***Server sets mode: +cnt
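A rough sketch of how the two 40-bit formats oriansj describes above could be packed and unpacked. The chat only gives field widths, so everything else here is an assumption made for illustration: the MSB-first field order, the unsigned immediate, and the names encode_2opi, encode_3op, decode_2opi, ra, rb and rc.

```python
# Field widths come from the chat; field order and all names are assumptions.
OPCODE_BITS, REG_BITS, IMM_BITS, XOP_BITS = 10, 6, 18, 12
INSTR_BITS = 40


def _check(value, bits, name):
    """Reject values that do not fit in an unsigned field of the given width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return value


def encode_2opi(opcode, ra, rb, imm):
    """Pack [opcode 10][register 6][register 6][immediate 18], MSB first."""
    word = _check(opcode, OPCODE_BITS, "opcode")
    word = (word << REG_BITS) | _check(ra, REG_BITS, "ra")
    word = (word << REG_BITS) | _check(rb, REG_BITS, "rb")
    word = (word << IMM_BITS) | _check(imm, IMM_BITS, "imm")
    return word


def encode_3op(opcode, ra, rb, xop, rc):
    """Pack [opcode 10][register 6][register 6][xop 12][register 6], MSB first."""
    word = _check(opcode, OPCODE_BITS, "opcode")
    word = (word << REG_BITS) | _check(ra, REG_BITS, "ra")
    word = (word << REG_BITS) | _check(rb, REG_BITS, "rb")
    word = (word << XOP_BITS) | _check(xop, XOP_BITS, "xop")
    word = (word << REG_BITS) | _check(rc, REG_BITS, "rc")
    return word


def decode_2opi(word):
    """Unpack a 2OPI word into (opcode, ra, rb, imm), peeling fields off the low end."""
    imm = word & ((1 << IMM_BITS) - 1)
    word >>= IMM_BITS
    rb = word & ((1 << REG_BITS) - 1)
    word >>= REG_BITS
    ra = word & ((1 << REG_BITS) - 1)
    word >>= REG_BITS
    opcode = word & ((1 << OPCODE_BITS) - 1)
    return opcode, ra, rb, imm


# Both layouts add up to the 40 bits mentioned in the chat.
assert OPCODE_BITS + 2 * REG_BITS + IMM_BITS == INSTR_BITS
assert OPCODE_BITS + 3 * REG_BITS + XOP_BITS == INSTR_BITS
```

With these widths an unsigned 18-bit immediate spans 2^18 = 262144 byte offsets, which is where the 256KB figure comes from; a signed displacement would cover half that range in each direction.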
<stikonas>argh, cc_amd64 does not support dereferencing pointers...
***robin__ is now known as robin
***robin_ is now known as robin