IRC channel logs

<oriansj>well the hyper majority of the complexity in GCC and Clang is just dedicated to taking advantage of everything in an architecture that would result in better performance; I don't think as much effort was made to enhance -Os to produce optimally tiny binaries.

<muurkha>yeah, one of the disappointments in the VAX was that some of the more complex instructions were actually slower than just generating the code to do it step-by-step

<muurkha>even if it was VAX code and not ARM code

<muurkha>if your code is fluffy it hurts both your cache hit rate on big machines and your ability to have functionality on small ones

<muurkha>so there has been a distinctly nonzero amount of effort devoted to code density

<muurkha>I spent a lot of last night studying the Cortex-M0 instruction set

<muurkha>which only implements [most of] Thumb plus a little of Thumb2

<muurkha>it's a lot denser than regular ARM code but also pretty limited

<oriansj>muurkha: I don't need any of thumb or thumb2; just 28 instructions (it would be fewer if they had divide and modulus instructions)

<muurkha>28 instructions?

<oriansj>yes, the processor would only need to support 28 instructions to run the stage0 steps

<oriansj>such as bl label; push {r14}; ldr r0, [r8]; etc

<muurkha>oh, I see

<muurkha>it doesn't support any ARM instructions

<muurkha>only thumb

<oriansj>oh, then I guess I would need to do a thumb-only port to support it

<muurkha>yeah

<muurkha>heh, I wanted to see how GCC implements division on ARM, so I wrote a decimal print function called decout and compiled it. it implemented division by 10 with multiplication by 0x66666667 and a bit of cleanup
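[editor's note] A minimal C sketch of the multiply-by-reciprocal trick described above. The name div10 is hypothetical and this is not GCC's actual output; it assumes that >> on a negative signed value is an arithmetic shift, which holds for GCC and Clang but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: signed division by 10 without a divide instruction.
 * 0x66666667 is ceil(2^34 / 10), so (n * 0x66666667) >> 34 is floor(n / 10)
 * for non-negative n. */
static int32_t div10(int32_t n)
{
    /* Widen first so the 32x32 product cannot overflow. */
    int64_t prod = (int64_t)n * 0x66666667LL;

    /* Assumption: arithmetic right shift, so this rounds toward -infinity. */
    int32_t q = (int32_t)(prod >> 34);

    /* The "bit of cleanup": C division truncates toward zero, so negative
     * inputs need the quotient bumped up by one. */
    q += (n < 0);
    return q;
}
```

On a 32-bit ARM target the widening multiply would plausibly map to a single smull, with the shift and the sign fix-up as the cleanup.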

<muurkha>fine, so I'll rename it to basebout and pass a base parameter b, so it has to be variable at run time so it can't do that

<muurkha>the assembly has a call to basebout.constprop.0 with the same code as before

<muurkha>GCC produced a version of the function specialized for that constant parameter so it wouldn't have to divide at runtime

<muurkha>there we go, bl __aeabi_idivmod
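[editor's note] The original basebout source is not in the log, so the following is only a hypothetical reconstruction of a print-in-any-base routine whose divisor is unknown at compile time. It uses unsigned arithmetic for simplicity, so on a core without hardware divide the runtime call would likely be the unsigned helper __aeabi_uidivmod rather than the signed __aeabi_idivmod seen in the log. The buffer-based signature is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reconstruction: write n in base b (2..36) into buf as a
 * NUL-terminated string and return its length, or 0 for an invalid base.
 * buf must have room for at least 33 bytes. */
size_t basebout(uint32_t n, uint32_t b, char *buf)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[32];               /* 32 digits covers a 32-bit value in base 2 */
    size_t len = 0, i = 0;

    if (b < 2 || b > 36)
        return 0;

    /* b is a runtime value here, so the compiler cannot substitute a
     * multiply-by-reciprocal for the division and remainder. */
    do {
        tmp[len++] = digits[n % b];
        n /= b;
    } while (n != 0);

    /* Digits were produced least-significant first; reverse them. */
    while (len > 0)
        buf[i++] = tmp[--len];
    buf[i] = '\0';
    return i;
}
```

If every call site in the translation unit passes the same constant base, GCC's interprocedural constant propagation can still emit a specialized clone such as basebout.constprop.0. Taking the base from a runtime input (for example a command-line argument) is one way to keep the division truly variable.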

<stikonas>https://matrix.org/blog/2023/07/deportalling-libera-chat/

<oriansj>muurkha: to be fair M2libc has a function for division for armv7l (making use of a couple conditional instructions along the way for performance reasons) https://git.sr.ht/~oriansj/M2libc/tree/main/item/armv7l/libc-full.M1 and it probably isn't as optimal as what GCC has

<oriansj>stikonas: so we need to plumb bootstrappable? ok, how does one do that?

<stikonas>no idea...

<stikonas>but we do have quite a lot of people connecting from matrix here

<stikonas>looks like there will be a more detailed guide in the future

<stikonas>(if we even decide to convert)

<jcmdln>that will make it much harder to lurk here and pretend I understand everything

<jcmdln>s/everything/anything

<muurkha>GCC's is bulky

<muurkha>but yeah probably several times faster

<oriansj>jcmdln: no one here understands everything, but by working together, collectively everything can be understood and achieved. (we are about where I expected it would take me 30 years of work to get to)


2023-07-07.log