IRC channel logs

2022-11-02.log


<muurkha>stikonas: do you think a buddy malloc will be simpler than a PDP-11-style first-fit free-list malloc? or is it that the efficiency gain will be worth the extra complexity?
<stikonas>well, I guess first fit might be simpler
<stikonas>maybe it's worth considering that too
<stikonas>given that we don't care too much about efficiency but care about readability
<stikonas>well, actually the simplest thing to do is not to implement any free at all...
<stikonas>just port malloc away from _brk to block style allocator...
<stikonas[m]>I guess that would be the first step in any case, no matter what allocator is used
<oriansj>well we need a bit of complexity to do block tracking (otherwise every 3 byte malloc would end up using a full block [4KB+]) and at that point we might as well make free work.
<stikonas>oriansj: can't we do something even simpler: keep the current linear allocation [new_malloc_ptr = old_malloc_ptr + requested memory] if there is enough memory in the current e.g. 4 KiB block; if there isn't, then request from the OS (either using brk or in UEFI allocate_pages) the minimum number of pages that would fit the current malloc request
<stikonas>so if we do many 3 byte mallocs, they'll all go to the same block
<stikonas>though if you do a 3 byte malloc followed by a 10 KiB malloc, then yes, the 3 byte malloc will end up using a full block once; though if you request another 3 byte malloc, it might fit at the end of the earlier 10 KiB allocation
<stikonas>we can tweak block size a bit
<stikonas>basically in the limit block size -> infinity we go back to the current behaviour
<stikonas>or is something like that too simple?
<stikonas>basically mallocs bigger than the block size could potentially waste up to a block size of memory
<stikonas>so I think in the worst case we would end up wasting half of the memory
<stikonas>though I expect in practice it should be much better
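The scheme stikonas describes can be sketched roughly as follows (a minimal sketch; `bump_malloc`, `request_pages`, and the 4 KiB `BLOCK_SIZE` are all hypothetical names, and the OS page request is faked with the host malloc so the sketch is self-contained):

```c
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096  /* hypothetical block size; tweakable as discussed */

static char *block_ptr = NULL;   /* next free byte in the current block */
static size_t block_left = 0;    /* bytes remaining in the current block */

/* Stand-in for the real page source (brk, or UEFI allocate_pages).
 * Faked with the host malloc here so the sketch runs anywhere. */
static void *request_pages(size_t bytes)
{
    return malloc(bytes);
}

/* Linear (bump) allocator: small requests share the current block; a
 * request too big for what is left grabs the minimum number of whole
 * blocks that fits it, abandoning the old block's remainder. No free. */
void *bump_malloc(size_t size)
{
    if (size > block_left) {
        /* round the request up to a whole number of blocks */
        size_t bytes = ((size + BLOCK_SIZE - 1) / BLOCK_SIZE) * BLOCK_SIZE;
        block_ptr = request_pages(bytes);
        if (block_ptr == NULL)
            return NULL;
        block_left = bytes;
    }
    void *result = block_ptr;
    block_ptr += size;
    block_left -= size;
    return result;
}
```

As in the conversation: two 3 byte requests land back to back in one block, a 10 KiB request gets three whole blocks (12 KiB), and a following 3 byte request fits in the 2 KiB left over at the end of that grab.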
<stikonas>as I think we do very few big mallocs in stage0
<stikonas>it's mostly structs
<stikonas>possibly one exception would be user_stack
<stikonas>hmm, I'll have to think what's better
<stikonas>if tracking is not too complex, maybe it's still worth it
<stikonas>first-fit allocator is actually very similar to this
<stikonas>just a bit of tracking to make free work
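For comparison, the "bit of tracking" a first-fit free list needs can be sketched like this (hypothetical names; a fixed static arena stands in for brk, and there is no chunk splitting or coalescing, to keep it short):

```c
#include <stddef.h>
#include <stdalign.h>

/* PDP-11-style first-fit allocator sketch: each chunk carries a size
 * header; free() pushes the chunk onto a singly linked free list;
 * malloc() walks the list and takes the first chunk big enough. */

typedef struct chunk {
    size_t size;         /* payload size in bytes */
    struct chunk *next;  /* next free chunk (meaningful only when free) */
} chunk;

#define ARENA_SIZE (64 * 1024)
static alignas(max_align_t) char arena[ARENA_SIZE];
static chunk *free_list = NULL;
static size_t arena_used = 0;

void *ff_malloc(size_t size)
{
    /* round up so payloads stay header-aligned */
    size = (size + sizeof(chunk) - 1) & ~(sizeof(chunk) - 1);

    /* first fit: take the first free chunk that is big enough
     * (no splitting, so a smaller request wastes the remainder) */
    chunk **prev = &free_list;
    for (chunk *c = free_list; c != NULL; prev = &c->next, c = c->next) {
        if (c->size >= size) {
            *prev = c->next;
            return c + 1;
        }
    }

    /* nothing fits: carve a fresh chunk off the arena */
    if (arena_used + sizeof(chunk) + size > ARENA_SIZE)
        return NULL;
    chunk *c = (chunk *)(arena + arena_used);
    arena_used += sizeof(chunk) + size;
    c->size = size;
    return c + 1;
}

void ff_free(void *p)
{
    if (p == NULL)
        return;
    chunk *c = (chunk *)p - 1;
    c->next = free_list;
    free_list = c;
}
```

Same linear carving as the block allocator above when the list is empty; the extra tracking is just the header plus one list walk.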
<fossy>oriansj: i agree with stikonas[m] regarding git-lfs
<fossy>i also don't really want the default option to be any kind of centralised system, the closer we can get to the original hosters the better, especially if all tarballs could be compromised in one step
<fossy>for internal usage it makes a lot of sense to have a centralised server tho, so that's a good option imo
<oriansj>I had a weird idea for dependency management. require a specification to namespace. That way upstream code changes to a library can't break one's code.
***civodul` is now known as civodul
<muurkha>I was looking at RISC-V implementations yesterday and I found this one that fits into 1000 LUTs on an iCE40 FPGA and is a little less than 300 lines of code: https://github.com/sylefeb/Silice/blob/master/projects/ice-v/README.md
<muurkha>about 20 million instructions per second on a US$5 FPGA
<muurkha>that's only RV32I though, it doesn't have a user/supervisor mode split or virtual memory
<oriansj>muurkha: only executing from BRAM seems like a rather serious limitation at 128 Kilobits (4KB)
<oriansj>at the high end of the iCE40 family and 0bits on the low end
<muurkha>it's 1 megabit, 128 kilobytes
<oriansj>(correction 16KB of 4Kb blocks)
<muurkha>also 128 kilobits would be 16 kilobytes, which is plenty of RAM for a compiler toolchain if you also have mass storage
<oriansj>the datasheet says Up to 128 kbits sysMEM™ Embedded Block
<muurkha>for the iCE40UP5K?
<oriansj>iCE40™ LP/HX Family Data Sheet
<muurkha>yeah, those are a little smaller
<muurkha>(though some of them actually have more LUTs than the iCE40UP5K I have)
<muurkha>FPGAs are good at moving data in and out of them rapidly, and I think this one can handle a few gigabytes per second of I/O bandwidth, so external mass storage is a significantly more powerful thing to add than, like, the floppy drive on a Commodore PET
<oriansj>umm Up to 120 kb sysMEM™ Embedded Block RAM for iCE40 UltraPlus™ Family
<muurkha>yeah, but it also has 1024 kilobits of single port SRAM
<muurkha>the LP/HX don't
<muurkha>you're looking at the HX because ice-v/README.md talks about the HX1K, right?
<oriansj>bingo
<muurkha>but that's just because it's the smallest FPGA it will run on (well, maybe the UL1K)
<muurkha>the UP5K has the same kind of LUTs, is supported by the same toolchains, and runs at the same speed
<muurkha>it just has a mebibit of SPRAM
<oriansj>but yeah it does appear the UP family can have up to 1024 kb Single Port SRAM in addition; which might be enough to do something useful in
<muurkha>even 16KiB is plenty to do something useful in though
<oriansj>useful is a very broad category
<muurkha>to do a self-hosting development environment in a high-level language
<muurkha>since you can hook it up to an external Flash or even SRAM
<oriansj>but it is probably viable for some of the early bootstrapping steps
<muurkha>you can boot Linux on that FPGA, people have
<oriansj>as 128KB is enough for cc_* with a few tweaks
<muurkha>the stretch goal with that chip is not "run a C compiler"
<muurkha>it's "resynthesize the SoC bitstream from HDL"
<oriansj>with external RAM probably but using only internal; it'll be painful
<oriansj>and a boatload of swapping
<oriansj>unless I missed something important (like Silice running on bare metal); then we are talking a kernel, filesystem and possibly extra abstractions.
<muurkha>well, I haven't done it, but I don't think it's particularly more painful using external RAM than internal. also though 128 KiB is 2-4 times the size of the machines people used to run Turbo Pascal and BDS C on, and those machines ran at 0.3 MIPS instead of 15-20 MIPS