IRC channel logs
2021-09-17.log
<stikonas>just compiled my first C program with cc_riscv64, although cc_riscv64 is still very much WIP
<Jeremy_Rand>civodul: I could paste the desired HTTP header text here if you're able to convert that into the Scheme format used; would that be suitable?
<Jeremy_Rand>civodul: is enabling inclusion in the HSTS preload list okay? Advantage is that web browsers will know to use HTTPS even before the first visit (so you get better security against sslstrip attacks), disadvantage is that if you decide to disable HTTPS in the future, you'd have to wait for browser updates to remove you from the preload list.
<Jeremy_Rand>(Asking because that affects which header text I paste)
<stikonas>unless something happens to let's encrypt, there shouldn't be any reason to disable it
<Jeremy_Rand>stikonas: yeah, that's my thinking too. Looks like civodul disconnected; guess I'll wait for them to return before proceeding
<oriansj>Jeremy_Rand: no need. this channel is logged; civodul or rekado will see the logs and we can incorporate your fix
<oriansj>so just post your suggested fix when you think it is ok for others to see
<oriansj>stikonas: I'll be doing the AMD64 hex2 fix later today and hopefully the AArch64 this weekend.
<stikonas>it takes 15 minutes on my AMD64 laptop on qemu
<stikonas>I think after the fix it will be down to maybe 2 minutes
<stikonas>not sure if it can be called a "fix", it's more of a workaround
<oriansj>performance enhancement as it isn't actually fixing anything?
<stikonas>I don't think it matters for non-emulated systems
<stikonas>on risc-v I've also fixed hex1, but it might be harder on other arches
<stikonas>I didn't want to recalculate jumps, but it just happened that I only had to insert 1 noop instruction
<oriansj>well it doesn't really save much time on other architectures to fix hex0 and hex1
<oriansj>oh and the raspberryPI memory mapped performance improvement still is only in the 12% range; So something else might be in play there but I'll look into that later.
<stikonas>to see if you can spot where time is spent
<stikonas>but it's really dramatic on more powerful machines
<oriansj>well ramdisk should eliminate i/o as the bottleneck but I guess I might as well byte the bullet and just strace it (it'll take about an hour)
<stikonas>bottleneck might be visible quite quickly
<xentrac>so, I'm noticing some things in Qfitzah I hadn't noticed before.
<xentrac>the context is that I'm trying to make the executable as small as I can; I'm down below 900 bytes at this point
<xentrac>if I put a value in a callee-saved register, that saves me from having to save and restore it across calls to other subroutines (2 bytes of code per call)
<xentrac>but I have to save and restore the register on entry and exit; this is 2 bytes, plus an extra byte per early return, because I have to put push %ebp or whatever at the top and pop %ebp before every ret
<xentrac>up to a maximum of 1 extra byte per early return, because I can replace pop %ebx; pop %ebp; ret with a simple jump to a shared procedure epilogue that pops each thing
<xentrac>you'd think this would sometimes be a win for space, but it basically never is because I never have enough child function calls with live temporary values. instead what wins is putting all my values in call-clobbered (caller-saved) registers and pushing them and popping them around the calls
<xentrac>which as an extra bonus allows me to move the values from one register to another for free when there's a call in between
<oriansj>a bunch of cacheflush(0x611be900, 0x611be968, 0) = 0
<xentrac>sometimes I'm also doing things like mov 4(%esp), %ecx even though that's 4 bytes; I think that's probably an error
<xentrac>this is sort of going to the opposite extreme from hex0 though: compromising comprehensibility and bug-proneness in pursuit of bumming the code down by a few bytes
<xentrac>I'm thinking I might reassign %ebx, %ebp, and maybe even %esi and %edi, to be call-clobbered (caller-saved) in Qfitzah to take advantage of the more compact instruction encodings that come from the 8086
<xentrac>I'm curious to hear about other people's experiences
<stikonas>ok, should be enough work on cc_riscv64 for now, got easy stuff working (labels, goto, return and asm statements)
<stikonas>and the binary is 6.4 KB now, so I guess I'm about 1/3 done
<stikonas>well, these are really easy, they basically translate directly to assembly instructions
<stikonas>I guess parsing expressions will be where most of the remaining complexity is
<xentrac>I mean if you're doing recursive descent you just refactor things like <sum> "+" <term> | <sum> "-" <term> into things like <term> ("+" <term> | "-" <term>)*
<xentrac>although C has a lot of precedence levels and implicit coercions and so it ends up still being a huge pain
<xentrac>I've actually done an LR-style parser by hand in the past, which might be easier, dunno. using repeated attempted pattern matches on the stack instead of a constant-time table
<stikonas>I don't think cc_* M2_Planet look at precedence levels (but I still need to look at that code)
<xentrac>holy shit, I wrote that 15 years ago
<xentrac>did my grammar-factoring example make sense?
<stikonas>well, I guess. But in any case I'm just following what we already have for C code (M2-Planet/cc_* prototype) and cc_amd64.M1
<stikonas>oh, continue statement is basically no-op in cc_*
<stikonas>M2-Planet has only 1 continue statement but it looks like it's in code that M2-Planet won't hit when building itself
<oriansj>yes it does have precedence levels. The hardest part is the tokenization which is why we have the debug_list being the most important function you write
<oriansj>and the tokenization looks good? then you only have a couple hours of work before you are done.
<stikonas>well, I need to do flow control / local variables and expressions
<stikonas>cc_reader.c is only 200 lines in size...
<oriansj>but the hardest to debug and do correctly
<oriansj>hence why debug_list is absolutely essential until it is done
<stikonas>I did debug it a bit... but it was not too bad
<oriansj>you'll need to debug it until debug_list's output matches what the C code would produce
<oriansj>hence take the C code and have it just dump the list for easier inspection.
<stikonas>well, I didn't try it yet on big programs...
<oriansj>the pointer addresses might be different but the tokens should be identical
<stikonas>but on smaller programs it was working just fine
<oriansj>the real question is M2-Planet building just fine
<stikonas>right now I'm just trying to do one small feature at a time
<stikonas>I suppose tokenization is much harder to get working right when you do it for the first time in assembly
<Jeremy_Rand>Good point oriansj. Okay, so, if you are confident that HTTPS is supported for all subdomains where HTTP exists, the raw HTTP response header that should be sent on TLS port 443 is:
<Jeremy_Rand>Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
<Jeremy_Rand>TCP port 80 should also be changed to do an HTTP redirect to HTTPS, instead of serving the same content that TLS port 443 has; let me know if you need help with that change
<Jeremy_Rand>If you are concerned that there may be unnoticed subdomains where HTTP exists but HTTPS does not, you may wish to temporarily decrease the max-age number at first while you monitor server logs for issues
<Jeremy_Rand>Hope this helps, ping me if you have any issues/questions. Or if I've disconnected accidentally, you can email me at jeremyrand at danwin1210.me to get my attention
<fossy>stikonas[m]: got almost all your guile PR rebased, just testing rn
<stikonas>oriansj: hmm, that strace looks like it's still getting stuck in mprotect
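
Editorial note on xentrac's register accounting: the byte counts quoted in the log can be put into a tiny cost model. This is only a sketch under simplifying assumptions (one register, one live value, 1-byte push/pop encodings, no shared epilogue); the function names are hypothetical and are not taken from Qfitzah.

```python
def callee_saved_cost(early_returns):
    """Bytes spent keeping a value in a callee-saved register.

    Assumes 1 byte for the push at entry, 1 byte for the pop before the
    final ret, and 1 more pop byte before each early return.
    """
    return 2 + early_returns


def caller_saved_cost(calls_with_live_value):
    """Bytes spent pushing and popping a caller-saved register around calls.

    Assumes 2 bytes (push + pop) for every call the value must survive.
    """
    return 2 * calls_with_live_value


def cheaper_choice(calls_with_live_value, early_returns):
    """Report which convention is smaller under this simplified model."""
    callee = callee_saved_cost(early_returns)
    caller = caller_saved_cost(calls_with_live_value)
    if callee < caller:
        return "callee-saved"
    if caller < callee:
        return "caller-saved"
    return "tie"


if __name__ == "__main__":
    # One call with a live value and one early return: 3 bytes vs 2 bytes.
    print(cheaper_choice(1, 1))
    # Three calls with a live value and one early return: 3 bytes vs 6 bytes.
    print(cheaper_choice(3, 1))
```

Under this model a callee-saved register only pays off once a value stays live across two or more calls, which fits xentrac's remark that it basically never wins in small routines.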
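
Editorial note on xentrac's grammar-factoring example: the rewrite of the left-recursive rule <sum> "+" <term> | <sum> "-" <term> into <term> ("+" <term> | "-" <term>)* maps directly onto a loop in a recursive-descent parser. The sketch below is illustrative only; it assumes <term> is a bare non-negative integer, and the names tokenize, parse_sum, parse_term and evaluate are hypothetical, not code from M2-Planet or cc_*.

```python
import re


def tokenize(text):
    """Split input into integer tokens and single non-space characters."""
    return re.findall(r"\d+|\S", text)


def parse_term(tokens, pos):
    """<term> ::= integer (kept trivial for the sketch)."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError("expected an integer at token %d" % pos)
    return int(tokens[pos]), pos + 1


def parse_sum(tokens, pos=0):
    """<sum> ::= <term> ("+" <term> | "-" <term>)*

    The loop replaces the left recursion and still folds the operands
    left to right, so subtraction stays left-associative.
    """
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        operator = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        value = value + right if operator == "+" else value - right
    return value, pos


def evaluate(text):
    """Parse and evaluate a whole string, rejecting trailing tokens."""
    tokens = tokenize(text)
    value, pos = parse_sum(tokens)
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return value


if __name__ == "__main__":
    print(evaluate("10 - 3 - 2"))
```

A naive right-recursive rewrite would parse 10 - 3 - 2 as 10 - (3 - 2); the loop form avoids that, and each further precedence level adds one more function of the same shape.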
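
Editorial note on Jeremy_Rand's HSTS suggestion: the two behaviours described in the log (send the Strict-Transport-Security header on port 443, redirect on port 80) can be sketched as plain functions. This is not the Scheme configuration the server actually uses; the function names, the 301 status choice and the example.org host are assumptions for illustration.

```python
def hsts_header_value(max_age=63072000, include_subdomains=True, preload=True):
    """Build the Strict-Transport-Security value quoted in the log."""
    parts = ["max-age=%d" % max_age]
    if include_subdomains:
        parts.append("includeSubDomains")
    if preload:
        parts.append("preload")
    return "; ".join(parts)


def https_extra_headers(max_age=63072000):
    """Headers to add to every response served over TLS (port 443)."""
    return [("Strict-Transport-Security", hsts_header_value(max_age))]


def http_redirect(host, path):
    """Response for plain HTTP (port 80): redirect instead of serving content.

    A permanent redirect (301) is assumed here.
    """
    return 301, [("Location", "https://%s%s" % (host, path))]


if __name__ == "__main__":
    print(https_extra_headers())
    # example.org is a placeholder host, not the real server name.
    print(http_redirect("example.org", "/logs/"))
```

Passing a smaller max_age at first, as Jeremy_Rand suggests, limits how long browsers remember the policy while server logs are monitored for subdomains that lack HTTPS.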