IRC channel logs

<Gooberpatrol66> https://pineapple-one.github.io/

<Gooberpatrol66>this thing looks cool for bootstrapping

<lagash>Ha, I was about to post that link!

<stikonas>yes, this one was mentioned here a few times over the last couple of years

<oriansj>the 512KB memory limit however really limits what one can do. The lack of an MMU also means one can't do virtual memory to supplement with disk space to allow bigger programs to run (even at a slower speed)

<oriansj>although SRAM prices these days do make it cost-prohibitive to build 4GB of SRAM

<bauen1>you could always use it to implement an emulator, though performance will suffer a lot

<stikonas>probably even stage0 stuff will run forever...

<muurkha>oriansj: Forth systems did routinely do virtual memory in software

<muurkha>the main words were BLOCK and UPDATE

<muurkha>37 BLOCK would load block 37 from disk (unless it was already resident) and then return you the address of its buffer

<muurkha>UPDATE would mark the last block you had thus loaded as dirty, so that it would be written back to disk if at some point it were evicted

<muurkha>these are 1024-byte blocks

<muurkha>the last two blocks you had requested with BLOCK were always guaranteed to still be resident as long as you hadn't yielded, so you could keep accessing them without invoking BLOCK over and over

<muurkha>this was the entire filesystem

<muurkha>"yielded" here means to yield control of the CPU to another thread, which I/O words normally did

<muurkha>this allowed you to implement a sort of virtual memory with 2KiB of buffer memory and a few additional bytes of state, in a way that scales to machines with a megabyte or more

<muurkha>with only 2KiB I think it had the problem that BLOCK itself had to be defined to not yield, but with 2KiB per thread that problem goes away
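[editor's sketch] The BLOCK/UPDATE scheme described above can be modelled in a few lines of Python. This is only an illustration under stated assumptions, not how any particular Forth implements it: the class name `BlockStore`, the dict standing in for the disk, and the evict-the-least-recently-requested policy are all hypothetical. It keeps two 1024-byte buffers resident, and dirty buffers are written back only when evicted.

```python
BLOCK_SIZE = 1024  # bytes per block, as stated in the discussion


class BlockStore:
    """Hypothetical two-buffer block cache; a dict stands in for the disk."""

    def __init__(self, num_buffers=2):
        self.disk = {}          # block number -> bytes (assumed stand-in for a disk)
        self.buffers = []       # resident buffers, most recently requested last
        self.num_buffers = num_buffers
        self.last = None        # buffer most recently returned by block()
        self.disk_reads = 0     # counts loads from "disk"

    def block(self, n):
        """Like BLOCK: return the buffer for block n, loading it if not resident."""
        for i, buf in enumerate(self.buffers):
            if buf["n"] == n:
                # Already resident: just mark it as the most recent request.
                self.buffers.append(self.buffers.pop(i))
                self.last = buf
                return buf["data"]
        if len(self.buffers) == self.num_buffers:
            self._evict(self.buffers.pop(0))
        data = bytearray(self.disk.get(n, bytes(BLOCK_SIZE)))
        self.disk_reads += 1
        buf = {"n": n, "data": data, "dirty": False}
        self.buffers.append(buf)
        self.last = buf
        return data

    def update(self):
        """Like UPDATE: mark the block most recently returned by block() as dirty."""
        self.last["dirty"] = True

    def _evict(self, buf):
        # Write back only if the buffer was marked dirty.
        if buf["dirty"]:
            self.disk[buf["n"]] = bytes(buf["data"])


store = BlockStore()
src = store.block(37)
src[0:5] = b"hello"
store.update()          # block 37 is now dirty
dst = store.block(38)
dst[:] = src            # copy between two resident blocks, no block() call in between
store.update()          # block 38 is now dirty
store.block(39)         # evicts 37, which is written back
store.block(40)         # evicts 38, which is written back
print(store.disk[37][:5], store.disk[38][:5], store.disk_reads)
```

The copy from `src` to `dst` relies on both buffers staying resident, which is the two-block guarantee mentioned above. With a single buffer, the second `block()` call would have invalidated `src`.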

<muurkha>the overall problem with this is that it's not very transparent; it's very simple to use, but it's more like bank-switching than the VAX-style demand-paging that we're used to

<muurkha> https://forth-standard.org/standard/block/BLOCK explains the mechanics a bit, but is a bit short on the pragmatics of how to use it

<muurkha>also it meant that on a 16-bit machine you could comfortably address 64 megabytes of RAM

<muurkha>(or disk!)
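[editor's sketch] The 64-megabyte figure follows from simple arithmetic, assuming block numbers are unsigned 16-bit cells (an assumption; the log does not say how block numbers are represented) and blocks are 1024 bytes as stated earlier:

```python
BLOCK_SIZE = 1024   # bytes per block, from the discussion
CELL_BITS = 16      # assumed width of a block number on a 16-bit machine

num_blocks = 2 ** CELL_BITS              # 65536 distinct block numbers
total_bytes = num_blocks * BLOCK_SIZE    # bytes reachable through BLOCK
total_mib = total_bytes // 2 ** 20
print(num_blocks, total_bytes, total_mib)
```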

<muurkha>maintaining resident the last two blocks addressed meant that you could use CMOVE (memcpy) to copy data from one disk block to another, which of course alternately accesses the two blocks without intervening calls to BLOCK

<muurkha>if Forth had been designed for numerical computation instead of real-time control, probably the number would be 3 instead of 2, so that you could do things like matrix multiplication

<muurkha>wasm does something vaguely similar; wasm implementations have to, more or less, bounds-check each read or write to "linear memory", and also offset them by a base address. which means that basically every time a C program accesses memory it has to do this stuff, which sounds like it would be super slow
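[editor's sketch] As a rough model of the per-access work described here, the following Python bounds-checks every load and store and then offsets it by a base address. The `LinearMemory` class and its method names are hypothetical and do not correspond to any real wasm engine's API. It only illustrates the "check, then add base" step.

```python
class LinearMemory:
    """Hypothetical sandboxed window of `size` bytes inside a larger host buffer."""

    def __init__(self, host, base, size):
        self.host = host    # the host's memory (a bytearray here)
        self.base = base    # where this linear memory starts in the host
        self.size = size    # number of bytes the guest may touch

    def _check(self, addr):
        # Bounds check performed on every access; a real engine would trap.
        if not 0 <= addr < self.size:
            raise IndexError("trap: out-of-bounds memory access")

    def load8(self, addr):
        self._check(addr)
        return self.host[self.base + addr]

    def store8(self, addr, value):
        self._check(addr)
        self.host[self.base + addr] = value & 0xFF


host = bytearray(4096)
mem = LinearMemory(host, base=1024, size=256)
mem.store8(10, 0x2A)            # lands at host address 1024 + 10
try:
    mem.load8(256)              # one past the end of the window
    trapped = False
except IndexError:
    trapped = True
print(host[1034], mem.load8(10), trapped)
```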

<muurkha>but in fact there is an important exception: wasm has local variables that are allocated outside of linear memory, and the indexes of local variables are static and can thus be checked at compile time

<muurkha>(C local variables whose address is taken must be allocated in linear memory rather than as wasm local variables)

<muurkha>as it turns out, the majority of accesses to memory in C programs are accesses to local variables, and wasm JIT compilers are good enough at hoisting bounds checks out of inner loops that it all works reasonably well

<muurkha>I suspect that you could get reasonable performance out of software-implemented virtual memory with that same strategy

<muurkha>transparent to the user like wasm, with pages like Forth

<muurkha>and with good performance like Forth, by delegating to a compiler the work done by the user in Forth of figuring out when to check for page faults
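[editor's sketch] One way to picture the idea of letting a compiler decide when to check for page faults is to compare a loop that checks residency on every access with one where the check has been hoisted to once per page. Everything here is an illustrative assumption: the `PagedMemory` class, the `checks` counter, and the absence of any eviction policy. The hoisted loop is written by hand to show what such a compiler might emit.

```python
PAGE_SIZE = 1024


class PagedMemory:
    """Hypothetical software-paged memory; pages load on demand, never evicted."""

    def __init__(self, size):
        self.backing = bytearray(size)   # stand-in for disk
        self.resident = {}               # page number -> bytearray copy of the page
        self.checks = 0                  # number of residency checks performed

    def page(self, p):
        """Software 'page fault' check: ensure page p is resident and return it."""
        self.checks += 1
        if p not in self.resident:
            start = p * PAGE_SIZE
            self.resident[p] = bytearray(self.backing[start:start + PAGE_SIZE])
        return self.resident[p]

    def read(self, addr):
        # Naive access path: one residency check per byte read.
        return self.page(addr // PAGE_SIZE)[addr % PAGE_SIZE]


def sum_naive(mem, start, end):
    return sum(mem.read(a) for a in range(start, end))


def sum_hoisted(mem, start, end):
    # The check is hoisted out of the inner loop: one per page touched.
    total = 0
    a = start
    while a < end:
        p, off = divmod(a, PAGE_SIZE)
        n = min(PAGE_SIZE - off, end - a)   # bytes to take from this page
        buf = mem.page(p)
        total += sum(buf[off:off + n])
        a += n
    return total


size = 8 * PAGE_SIZE
m1 = PagedMemory(size)
m1.backing[:] = bytes(i % 251 for i in range(size))
m2 = PagedMemory(size)
m2.backing[:] = m1.backing

naive = sum_naive(m1, 100, 5000)
hoisted = sum_hoisted(m2, 100, 5000)
print(naive == hoisted, m1.checks, m2.checks)
```

Both loops compute the same sum, but the naive one performs a residency check per byte while the hoisted one performs a check per page.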

2023-09-17.log