IRC channel logs

2022-08-26.log

<theruran>aggi: had you tried Plan9 libc? not sure about MINIX
<aggi>theruran: didn't think of those yet
<aggi>noted. the test-case is almost trivial to verify: AS=arm-tcc and see what happens
<theruran>good luck!
<aggi>theruran: won't need it, thanks very much, nonetheless.
<oriansj>aggi: well libcs are pretty tiny and if one is willing to take a tiny performance hit only needs about 20 lines of assembly per architecture
<oriansj>so you could take any libc and remove most of the assembly without much effort
<aggi>oriansj: in dietlibc i counted ~700 .S files...
<aggi>in total, for all architectures
<oriansj>as the only thing in a libc that can't be just a plain C function are syscalls
<ekaitz>and syscalls are pretty simple assembly btw
<oriansj>So write a .S with syscall_zero(), syscall_one(void*) ... syscall_eight(void*, void*, void* ..)
<oriansj>oops forgot a little detail
<aggi>whatever conclusion you draw, the current situation is this: there isn't any libc which AS=arm-tcc could digest
<oriansj>syscall_zero(int number), syscall_one(int number, void* param) ..
<oriansj>then all syscalls become calls to those few functions written in assembly
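[editor's note] The layering oriansj describes can be sketched in C as follows. This is a minimal sketch, not meslibc's actual code: the names my_write and my_getpid are hypothetical, the argument types are chosen for illustration, and syscall_three borrows the host libc's syscall() as a stand-in for the few lines of per-architecture assembly so that the example runs on any Linux host.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write, SYS_getpid */
#include <unistd.h>        /* syscall() */

/* Stand-in for the per-architecture assembly primitive (assumption).
 * In a real port this body would be a handful of instructions that load
 * the syscall number and arguments into registers and trap into the
 * kernel. Here it forwards to the host libc so the sketch is runnable. */
static long syscall_three(long number, long a, long b, long c)
{
    return syscall(number, a, b, c);
}

/* Fewer-argument variants can reuse the wider primitive. */
static long syscall_zero(long number)
{
    return syscall_three(number, 0, 0, 0);
}

/* From here on everything is plain C: each libc entry point is just a
 * call to one of the primitives with the right syscall number. */
long my_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return syscall_three(SYS_write, fd, (long)buf, (long)len);
}

long my_getpid(void)
{
    return syscall_zero(SYS_getpid);
}
```

With this shape, porting to a new architecture means rewriting only the syscall_* primitives; my_write and my_getpid stay untouched.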
<ekaitz>doesn't Mes have a minimal libc too?
<oriansj>ekaitz: yes, in fact I am describing the very thing meslibc does
<aggi>i don't insist on aarch32 and AS=arm-tcc, it's only a cornerstone of proof, if any alternative assembler and/or architecture are feasible
<ekaitz>aggi: I'm late to the party so IDK what are you trying to do with it
<aggi>currently i emit two firmware variants: aarch32 (with aarch64 kernel/uboot), and amd64; reason being to catch some irregularities
<oriansj>aggi: if you are willing to write only 20 lines of assembly you should be able to convert any libc to only have assembly that TCC supports
<aggi>greetings ekaitz: for aarch32 i replaced the entire GNU toolchain with AS/AR/CC/LD=arm-tcc
<aggi>except, libc
<ekaitz>mes's libc should be supported by tcc
<ekaitz>and it's ported to arm if i'm not mistaken
<aggi>could mes libc be used with linux-kernel and toybox userland?
<ekaitz>aggi: that's a question I'm not able to answer
<ekaitz>maybe... try?
<aggi>and, of course, the tcc compiler itself, to create a minimal *nix development system
<oriansj>aggi: mes libc was only designed to be just enough for the building of TCC and nothing more
<aggi>oriansj: which is a good start, given tcc is a relatively complex piece of software already
***nckx_ is now known as nckx
<oriansj>you can get mes and meslibc here: https://gitlab.com/janneke/mes.git
<aggi>thank you!
<ekaitz>aggi: but tcc is very specific... so meslibc will only implement file access and stuff like that
<ekaitz>not anything more complex
<aggi>i would consider a toybox-userspace a proof-of-concept
<oriansj>ekaitz: (it is also used by mes.c so it has some user interaction bits)
<ekaitz>oriansj: cool, thanks for clarifying
<ekaitz>I might take a deeper look into it once I finish with tcc...
<aggi>anyway, i am willing to practice and hack ASM, of course; however i am not willing to struggle with hundreds of vendor-specific and GNU gas-specific extensions
<oriansj>aggi: you don't, one only needs to know about 7-8 assembly instructions
<oriansj>(mostly just setting a few registers and then doing the syscall)
<oriansj>I can even point you at the assembly you would need
<ekaitz>aggi: if you want to learn about that in Hex0 you can see how to make the syscalls
<oriansj>or just look at M2libc: https://github.com/oriansj/M2libc/blob/main/aarch64/linux/unistd.c
<aggi>oriansj: i will try something else first... mes-libc+toybox-userspace, for either i386/amd64/aarch32 and see if any of this passes
<aggi>i take your word for granted, mes-libc can be compiled/linked/assembled with arm-tcc/i386-tcc etc
<oriansj>aggi: I said meslibc can be used to build tcc
<oriansj>it includes M1 assembly which needs tweaking to be something tcc can directly build
<aggi>it's just i had a look at the aarch32 ASM parts inside dietlibc/musl-libc/uclibc/newlib ... didn't seem simple to me which requires "only 7-8 assembly instructions"
<oriansj>aggi: aarch32 syscalls are just the setting of a few registers and then using the syscall instruction
<aggi>syscalls, and the main() entry point required some asm too
<oriansj>in fact most architectures are just the setting of a few registers and doing a syscall instruction (or calling an interrupt)
<aggi>some other parts can probably be re-written in C
<ekaitz>aggi: you need the crt* files too
<ekaitz>and those are normally written in assembly
<aggi>yes, and what i had seen in musl-libc, for the main() entry point with the related aarch32 asm wasn't simple either
<ekaitz>but it shouldn't be too much to understand
<oriansj>actually you mean _start and that is super simple too: https://github.com/oriansj/M2libc/blob/main/aarch64/libc-full.M1
<aggi>the _start entry for aarch32 in musl-libc is _NOT_ simple
<oriansj>you just setup the stack for main so that argc, argv and envp are in order and then feel free to call a C function for doing anything more advanced
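[editor's note] A minimal sketch of the C side of that hand-off, assuming the usual Linux process-entry layout (argc, then the argv pointers, a NULL, then the envp pointers, a NULL). The struct and function names are hypothetical and not taken from M2libc or musl; the assembly _start stub that would pass the initial stack pointer in is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What main() ultimately needs (hypothetical helper type). */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp is the stack pointer as it was on entry to _start. Assumed layout:
 *   sp[0]            argc (stored as one machine word)
 *   sp[1..argc]      argv[0..argc-1]
 *   sp[argc+1]       NULL
 *   sp[argc+2..]     envp entries, terminated by NULL
 * The assembly stub only has to pass sp here; the decoding is plain C. */
struct start_args decode_initial_stack(char **sp)
{
    struct start_args args;

    assert(sp != NULL);
    args.argc = (int)(intptr_t)sp[0];
    args.argv = sp + 1;
    args.envp = args.argv + args.argc + 1;  /* skip argv's NULL terminator */
    return args;
}
```

A real _start would then call main(args.argc, args.argv, args.envp) and pass its return value to the exit syscall.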
<ekaitz>aggi: I feel you with that... these days I'm working on TinyCC and it's not tiny and it's not easy to read either
<ekaitz>aggi: try to avoid frustration and everything will get better with time... assembly looks harder than it actually is
<aggi>as a minimum acceptance-test, simple or not, i would define the following: CC/LD/AR, and AS=arm-tcc and see _any_ practically useful libc pass with it
<aggi>then next, compilation/linking of a toybox-userspace against such a libc
<oriansj>ekaitz: well, if I change up my M3 plans in a serious way. I could probably help you indirectly with all that.
<ekaitz>oriansj: I hope we can get rid of tcc from the chain asap so if you do, that would be actually awesome
<oriansj>well I can only put maybe 0.5-1 hour a night into it so it'll take a bit unless someone can save me from the slow bits
<oriansj>for example figure out the compile order so that it could be built with say a single gcc command
<ekaitz>if you have this plan written somewhere I can try to help but I can't promise anything
<stikonas[m]>Well, setting up the stack can't be done from C...
<stikonas[m]>So some assembly is unavoidable
<stikonas[m]>That's one of the few assembly bits not exposed in C
<stikonas[m]>Another is syscalls
<aggi>granted, in theory; yet in practice in dietlibc (as an example for this), i counted ~700 .S files in total
<oriansj>ekaitz: not a fully written up plan but the general plan is make TCC buildable by M2-Planet while preserving all features needed to build GCC; create linker and assembler that become drop in replacements for binutils
<oriansj>cut out a bunch of steps in the bootstrap
<stikonas[m]>Tcc buildable by m2-planet is still quite ambitious
<ekaitz>oriansj: hm if that involves touching tcc I'm not your man
<stikonas[m]>Need to add quite a few C99 things
<ekaitz>still TCC is not even compilable from mes nowadays so...
<oriansj>ekaitz: how about writing C tests?
<stikonas[m]>I don't think we want to touch tcc too much
<stikonas[m]>Ideally, bootstrap compilers are improved
<stikonas[m]>Rather than more complicated ones simplified
<ekaitz>oriansj: that sounds a little bit better to me :)
<aggi>ekaitz: i did something different; fully removed gcc/binutils and used tcc-toolchain components _only_ for the entire userspace (~600 builds), until i hit aarch32 asm in musl-libc (and C99 _Complex, less problematic than ASM)
<oriansj>ekaitz: well you have seen my code style before right?
<aggi>meaning, if any libc can be repaired, gcc/binutils are not necessary anymore at all (for userspace, kernel is a separate todo still)
<ekaitz>oriansj: probably, but I don't remember... I spent too much time digging in GCC and TCC and i'm brainwashed lol
<ekaitz>aggi: that's great news, gcc is a very complex beast
<aggi>of course, it's still possible to bootstrap/compile gcc/binutils with arm-tcc, except libc, which is the very bad news currently
<stikonas[m]>BTW, meslibc is designed to build more than tcc
<stikonas[m]>Also stuff like make, bash
<stikonas[m]>Even GCC 2.95
<oriansj>ekaitz: well look at this: https://github.com/oriansj/M2-Planet/blob/master/cc_reader.c and tell me if that style of code is something you are willing to look at
<oriansj>as that is about as ugly as code gets with me
<ekaitz>oriansj: looks acceptable: variable names have more than one character
<oriansj>as most of the early work will be just adding comments and understanding to the code
<oriansj>and figuring out what needs tests
<oriansj>(and if I could get your help writing those tests, it'll really speed me up for the second part)
<theruran>aggi: I dunno if I said it, but I appreciate the work you did and your vision without GNU/LLVM tools. I hope you can stick around here at least and share some of your wisdom
<ekaitz>oriansj: it looks good to me, keep me posted with that and I'll help if my life permits
<oriansj>the second part of course being rewriting it into something M2-Planet can build
<theruran>oriansj: your C code looks straightforward to me. I've seen worse
<ekaitz>theruran: it's pretty good actually
<ekaitz>I have to leave you folks, keep me posted oriansj please
<oriansj>ekaitz: I'll see what I can do
<ekaitz>no pressure
<aggi>btw. with the last test-run of arm-tcc toolchain i forgot to set CPP="arm-tcc -E", not sure if gcc preprocessor was picked up occasionally
<aggi>still occupied with cleanup-tasks to _freeze_ the gcc47 c-only toolchain system profile
<oriansj>ekaitz: it is more I don't find some tasks fun and if I see too many, it reduces my motivation to work on something.
<ekaitz>oriansj: I feel you 100%
<aggi>i'll re-run the tcc-toolchain, and then test a minimal mes-libc/toybox/tcc setup for aarch32
<aggi>however, this approach is different to the bootstrappable one, since i intend to skip the gcc/binutils stage entirely for the entire system (including libc,kernel etc)
<oriansj>aggi: good, it means more potential paths forward
<oriansj>(and there is no one true bootstrappable path, just those currently being worked on and those worked on in the past)
<stikonas[m]>Well, some tasks are not really fun but need doing, hex0 jump calculations are not really fun...
<oriansj>hence why it has no competition yet
<oriansj>I'd love to see some competitors which bring fresh new ideas and better solutions to the early stages
<oriansj>but until then, we have something that works and exists on multiple architectures
<oriansj>which we can always refine and improve with time.
<stikonas[m]>Well, different approach would be some small interpreter...
<oriansj>So there is no way for me to lose. If it is the best possible solution => We have a working bootstrap :woot: ; if better solutions are found => we have even more paths to a working bootstrap :woot:
<stikonas[m]>But probably hard to write something small
<oriansj>well we have an example of a tiny FORTH: https://github.com/oriansj/stage0/blob/master/stage2/forth.s which could be self-hosting if someone gave it love
<stikonas[m]>Yes, but that's after M0
<oriansj>well right now it is 4,372bytes but it could with cleverness get down to 900bytes
<oriansj>after which it would be about as auditable as hex0+kaem
<theruran>oriansj: forth.s can be reduced to 900 bytes? how do you figure?
<oriansj>theruran: https://github.com/cesarblum/sectorforth.git
<oriansj>because it has been reduced to 510bytes
<theruran>got it! :)
<oriansj>and with a few additions, it'll be able to read/write files
<oriansj>without having to do super ugly hacks
<theruran>sectorforth cannot be used because it's licensed MIT?
<oriansj>theruran: ???
<theruran>oriansj: I mean, why not just port this to M0 on stage0-posix?
<stikonas[m]>MIT is fine
<oriansj>theruran: people no one has opted to do that work yet
<oriansj>^people^because^
<theruran>OK
<oriansj>There is nothing stopping anyone, except the fact no one has done the work yet
<theruran>I guess, are there any gotchas for converting NASM code?
<oriansj>theruran: nope but that is a bootsector application so you'll have to replace the interrupts with syscalls
<oriansj>and please take a hard look at examples/01-helloworld.f to know what was done to fit in 510bytes
<stikonas[m]>theruran: if you use my new M0 defines, then M0 code is fairly similar to normal assembly
<stikonas[m]>It's not exactly the same of course but close enough
<stikonas[m]>Compare catm.S vs catm.M1 in https://git.stikonas.eu/andrius/stage0-uefi/src/branch/main/amd64/Development
<stikonas[m]>I also have hex2 converted locally but still need to finish hex2.hex1 file before I push
<oriansj>they look very nice stikonas
<stikonas[m]>The only thing is that they are sometimes less explicit
<stikonas[m]>E.g. mov_rax, does not say whether it is 8 bit or 32 bit immediate
<stikonas[m]>But it is clear once you see ! or % after it
<oriansj>unavoidable if one wishes to have close to nasm syntax
<oriansj>assuming one is familiar with hex2
<stikonas[m]>Though if we want to use both, then we'll have to disambiguate
<oriansj>indeed
<stikonas[m]>I might need to use jmp8 and jmp32 in hex0
<oriansj>well assuming you want to keep the size down, yes
<stikonas[m]>Well, that's mostly a concern for bootstrap seeds
<stikonas[m]>Or at most for .hex0 programs
<oriansj>theruran: if you do a FORTH.M1 for stage0-posix, I will of course merge it for anyone who really wants a FORTH
<stikonas[m]>hex2.hex1 is deliberately using 32 bit stuff since that's what hex1 supports
<oriansj>and it probably allows some simplifications of hex1 as well
<oriansj>sorry my brain was thinking about RISC-V hex1 which was much more complicated than most hex1s
<stikonas[m]>Yes, risc-v hex code is annoying
<stikonas[m]>All the immediates are so painful to calculate...
<oriansj>which reminds me, I don't thank you enough for saving me from that
<stikonas[m]>x86 immediates are just hex numbers
<oriansj>just have to remember little endian byte order
<stikonas[m]>But risc-v has a nice fixed list of defines
<stikonas[m]>Since we can construct anything using . operator
<oriansj>and do %0x3b7269c9 instead of %0xc969723b
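[editor's note] The two values in the message above are byte-reversals of each other, which is the little-endian detail being discussed: the least significant byte of a 32-bit immediate comes first in the instruction stream. A small sketch, with hypothetical helper names, that shows the relationship:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as four bytes, least significant byte first,
 * which is how x86 expects a 32-bit immediate to appear in memory. */
void encode_le32(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xff);
    out[1] = (uint8_t)((value >> 8) & 0xff);
    out[2] = (uint8_t)((value >> 16) & 0xff);
    out[3] = (uint8_t)((value >> 24) & 0xff);
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}
```

Encoding 0x3b7269c9 this way yields the byte sequence c9 69 72 3b, and swap32(0xc969723b) gives 0x3b7269c9.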
<stikonas[m]>E.g. mv and r11 are separate defines there
<oriansj>yeah RISC-V M1 defines are surprisingly close to the original RISC-V instructions
<stikonas[m]>Maybe x86 can be done separately in octal...
<oriansj>stikonas[m]: absolutely by anyone willing to do that work
<stikonas[m]>But in hex we have to use big defines like mov_r11,r12
<stikonas[m]>Well, probably not me
<oriansj>kinda have to as x86 is an octal aligned instruction set
<stikonas[m]>Busy with other stuff
<oriansj>and there are some details about x86 encoding which made me nervous enough to just skip that work
<stikonas[m]>And holidays too, so nothing at all from me for 2 weeks
<oriansj>stikonas[m]: well you clearly deserve a beautiful vacation
<oriansj>and should have great fun ^_^
<oriansj>I don't think the entire Illumos community actually validate commits
<oriansj>because if they did, git clone would instantly return: error: object 447603b54aaea470ed1dcdb5c52d0be1d7801f84: badEmail: invalid author/committer line - bad email
<qyliss>Lots of old repositories are like that
<qyliss>rails is another
<qyliss>I've had to turn off validation for certain repositories
<oriansj>qyliss: just seems like a bad idea long term
<qyliss>yeah, but the only way to fix it is to rewrite history
<qyliss>so it's not exactly an easy fix
<oriansj>git replace --edit followed by git filter-repo
<qyliss>you still break the hash for every subsequent commit, right?
<oriansj>qyliss: unfortunately yes
<qyliss>so that's a very disruptive fix
<qyliss>it would be better if git allowed you to allowlist known-problematic historical revs
<qyliss>and then gerrit (in illumos's case) or whatever else could check all new objects going forward
<qyliss>(maybe git even has such functionality already)
<oriansj>qyliss: potentially but I am not familiar with anything that would do that
<oriansj>and yeah if you have a bunch of forks, it'll potentially be a short term pain to clear out the issue but it'll solve it so it doesn't keep showing up forever
<oriansj>as who wants to keep a bug around forever?
<oriansj>especially if you know about it and know how to fix it?
<oriansj>also wouldn't git these days make it impossible to make such commits?
<oriansj>yeah it appears to silently just delete the extra <
<oriansj>git fsck.skiplist also seems like an option
<oriansj>although I don't know if that would be something that would survive a git clone or if something would need to be done to enable that
<oriansj>mind you this has been known and the default since 2015-Sep-29
<oriansj>possibly: remote.fsck.skiplist
***bauen1_ is now known as bauen1
<oriansj>The downside of patching the tools to prevent the creation of problems is one can't replicate the problem in a manner which they can test solutions
<oriansj>at least not in an easy manner