IRC channel logs

2022-09-26.log


<stikonas>ok, M0.efi just pushed https://git.stikonas.eu/andrius/stage0-uefi
<oriansj>stikonas: awesome
<stikonas>one more step (cc_amd64) and I guess after that it will be a lot of M2libc work...
<oriansj>well we will finally have a solid reason to make a proper malloc and free
<stikonas>wouldn't malloc in UEFI be just calling allocate_pool?
<stikonas>and request some memory from UEFI
<oriansj>stikonas: well yes but don't we have to track the allocated memory to free it on exit?
<stikonas>can we not leave it up to the programmer?
<stikonas>free everything that you malloced?
<stikonas>hmm
<stikonas>but maybe that's not what is done in stage0-posix... I don't remember now
<stikonas>in that case maybe we do need to reserve bigger blocks and have proper malloc and free
<stikonas>hmm, yes, even M2-Planet does not run free()...
<oriansj>stikonas: M2libc doesn't clean up after itself at all; so we would have to add it there
<stikonas>well, at the time of writing nobody expected that we would need clean up
<stikonas>after all Linux does that itself
<oriansj>and free(void*) is just a stub that does nothing
<stikonas>and I don't think we thought about other platforms (i.e. UEFI)
<oriansj>indeed
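The bookkeeping oriansj describes (remember every allocation so it can be released on exit) can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names tracked_malloc, tracked_free and tracked_release_all are hypothetical and are not M2libc functions, and the hosted malloc/free stand in for whatever pool allocation and release calls the firmware provides.

```c
#include <stddef.h>
#include <stdlib.h>

/* One bookkeeping node per live allocation (hypothetical layout). */
struct tracked
{
	void* block;
	struct tracked* next;
};

static struct tracked* live_blocks = NULL;

/* Allocate a block and remember it; malloc stands in for the platform allocator. */
void* tracked_malloc(size_t size)
{
	struct tracked* t = malloc(sizeof(struct tracked));
	if(NULL == t) return NULL;
	t->block = malloc(size);
	if(NULL == t->block)
	{
		free(t);
		return NULL;
	}
	t->next = live_blocks;
	live_blocks = t;
	return t->block;
}

/* Release one block and forget it; unknown pointers are ignored. */
void tracked_free(void* p)
{
	struct tracked** link = &live_blocks;
	while(NULL != *link)
	{
		struct tracked* t = *link;
		if(t->block == p)
		{
			*link = t->next;
			free(t->block);
			free(t);
			return;
		}
		link = &t->next;
	}
}

/* Called once on exit: release whatever the program never freed.
   Returns the number of blocks that were still live. */
int tracked_release_all(void)
{
	int released = 0;
	while(NULL != live_blocks)
	{
		struct tracked* t = live_blocks;
		live_blocks = t->next;
		free(t->block);
		free(t);
		released = released + 1;
	}
	return released;
}
```

The linear search in tracked_free keeps the sketch short; a real library would more likely store the bookkeeping in a header in front of each block.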
<oriansj>Oh I did, but I also wanted to move forward to prevent the spread of copies of files and make a standard library for M2-Planet before things have gone too far
<oriansj>So it was punted to a future when we had M2libc as the standard M2-Planet Library
<oriansj>which minus the Mes.c bit which we haven't absorbed yet; we did achieve
<oriansj>So now it is just going to be time to add that missing functionality and if janneke takes some time to make mes.c to be something easy for GCC to compile, we should be able to finish the transition to M2libc
<stikonas>(or if we get that M3 compiler you were talking about... But that's probably longer term)
<oriansj>well it is a huge task, so it is hard for me to get started.
<stikonas>yes, I know...
<stikonas>well, best to split it into smaller subtasks
<oriansj>which should make it easier to support going forward.
<vagrantc>hrm. my proposed talk for bsidespdx didn't make the cut, but i might get a slot if anyone cancels
<oriansj>vagrantc: well if it makes you feel any better, I would have loved to see it ^_^
<oriansj>In related news my proposed talk to the Michigan Cyber Security conference on stage0 was rejected as according to them it was "Not related to Cyber Security"
<oriansj>So I enjoyed a good laugh after reading that; cracked a root beer and now I am just going to work on a simple preprocessor with virtually no expectations on it
<stikonas>C preprocessor?
<stikonas>or more generic?
<sam_>oriansj: lol
<sam_>oriansj: they have no idea
<vagrantc>oriansj: if i don't manage to present, maybe i'll make a video of it or something
<vagrantc>solving fundamental decades old security problems of nearly all computing platforms is just not cyber enough
<oriansj>stikonas: well at this point it is more of an experiment of how simple preprocessing can be made
<oriansj>sam_: but then again I shouldn't be surprised as they have been rejecting my talks for the last 5 years.
<oriansj>vagrantc: I'll be sure to link it in my list of bootstrapping talks
<oriansj>stikonas: the current state is reducing the tokenization down to about 70 lines then doing token fusion to produce the desired token stream. Then implementing language specific preprocessing should be simpler and solve the current M2-Mesoplanet bug of reading all files
<oriansj>I think we might have a fully compliant C tokenizer soonish
<oriansj>although I need to figure out a solution to label: vs ternary ? operator : alt
<oriansj>right now it is in a very basic state: https://git.sr.ht/~oriansj/M3-Preprocess/tree
<oriansj>basically I am sorting out the token fusion stage right now: https://git.sr.ht/~oriansj/M3-Preprocess/tree/main/item/c_stage1.c
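The two-stage idea (a very small tokenizer followed by a fusion pass) might look roughly like the sketch below. It is not taken from M3-Preprocess: the struct layout, the names split_tokens and fuse_tokens, and the short table of two-character operators are all assumptions made for illustration, and allocation failures are not checked.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct token
{
	char* s;
	int glued;            /* 1 when no whitespace separates it from the previous token */
	struct token* next;
};

static struct token* make_token(const char* text, size_t len, int glued)
{
	struct token* t = calloc(1, sizeof(struct token));
	t->s = calloc(len + 1, sizeof(char));
	memcpy(t->s, text, len);
	t->glued = glued;
	return t;
}

/* Stage 1: words (letters, digits, _) stay whole; anything else is one character per token. */
struct token* split_tokens(const char* in)
{
	struct token* head = NULL;
	struct token** tail = &head;
	int glued = 0;
	size_t i = 0;
	while(0 != in[i])
	{
		size_t len = 1;
		if(isspace((unsigned char)in[i]))
		{
			glued = 0;
			i = i + 1;
			continue;
		}
		if(isalnum((unsigned char)in[i]) || '_' == in[i])
		{
			while(isalnum((unsigned char)in[i + len]) || '_' == in[i + len]) len = len + 1;
		}
		*tail = make_token(in + i, len, glued);
		tail = &(*tail)->next;
		glued = 1;
		i = i + len;
	}
	return head;
}

/* A deliberately incomplete table of two-character operators. */
static const char* fusible[] = {"==", "!=", "<=", ">=", "&&", "||", "++", "--", "->", "##", NULL};

/* Stage 2: merge adjacent single-character tokens that form a known operator. */
void fuse_tokens(struct token* head)
{
	struct token* i = head;
	while(NULL != i && NULL != i->next)
	{
		struct token* n = i->next;
		if(n->glued && 1 == strlen(i->s) && 1 == strlen(n->s))
		{
			int k;
			for(k = 0; NULL != fusible[k]; k = k + 1)
			{
				if(fusible[k][0] == i->s[0] && fusible[k][1] == n->s[0])
				{
					char* joined = calloc(3, sizeof(char));
					joined[0] = i->s[0];
					joined[1] = n->s[0];
					free(i->s);
					i->s = joined;
					i->next = n->next;   /* absorb n into i */
					free(n->s);
					free(n);
					break;
				}
			}
		}
		i = i->next;
	}
}
```

With this sketch "a==b + +c" becomes a, ==, b, +, +, c: the two = characters fuse because nothing separates them, while the two + characters stay apart because of the space.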
<oriansj>although if I was aiming for C compatibility with GCC I'd be doing # 0 "foo.c" instead of #FILENAME foo.c 1
<oriansj>well that heuristic works kinda (until someone figures out a better one)
<oriansj>and that is uploaded, with all of its hackiness
<oriansj>if there is something in the C standard that I missed and tokenized completely wrong, please let me know
***attila_lendvai_ is now known as attila_lendvai
<oriansj>and should I do Trigraph sequence replacements in the preprocessor?
<oriansj>and it turns out c labels: can have an arbitrary amount of whitespace between the label and the :
<oriansj>they can literally be on different lines
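A small C function shows why the colon cannot be used to spot a label by looking at a single line. The function name count_up is made up for the example; the label and its colon sit on different lines, and the ternary at the end also starts a line with a colon.

```c
/* Returns n for n >= 0, and 0 for negative n. */
int count_up(int n)
{
	int i = 0;
again
	:
	if(i < n)
	{
		i = i + 1;
		goto again;
	}
	return (i < 0) ? 0
		: i;
}
```

Both colons are legal C, so a tokenizer has to look at the surrounding tokens, not the line layout, to tell the label from the ternary.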
<stikonas[m]>I think trigraphs are not widely used
<stikonas[m]>And are getting removed in newer standards
<stikonas[m]>Probably safe to ignore those
<oriansj>well it is more of a question of are they used in GCC and thus would have to be supported.
<oriansj>and if so, should we just deal with it in the preprocessor or would I have to deal with it in the compiler itself for technical reasons.
<stikonas[m]>GCC seems to disable them by default
<stikonas[m]>So unlikely to use them either
<oriansj>well that is good to hear; they didn't look hard to convert with a single else if(pattern_compress(i, trigraphbracketopen)) i->s = "["; line but I was just more worried about when they can't be preprocessed away.
***attila_lendvai_ is now known as attila_lendvai
<oriansj>in fact, I'll do that and we will support trigraphs provided there is no technical special extra details needed in the compiler to deal with them. (aka if the preprocessor can just replace them, then we will be fine)
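For reference, C defines nine trigraphs (??= ??( ??/ ??) ??' ??< ??! ??> ??-), and replacing them is a plain character substitution. The sketch below works on a character buffer and is not the token-based pattern_compress approach mentioned above; the function name matches the one used in the chat, but the signature and the table layout are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Third character of each trigraph and the character it stands for.
   The table is split this way so the source never spells out a trigraph itself. */
static const char trigraph_third[] = "=(/)'<!>-";
static const char trigraph_value[] = "#[\\]^{|}~";

/* Rewrites s in place and returns the number of replacements made. */
int replace_trigraphs(char* s)
{
	size_t in = 0;
	size_t out = 0;
	int count = 0;
	while(0 != s[in])
	{
		const char* hit = NULL;
		if('?' == s[in] && '?' == s[in + 1] && 0 != s[in + 2])
		{
			hit = strchr(trigraph_third, s[in + 2]);
		}
		if(NULL != hit)
		{
			s[out] = trigraph_value[hit - trigraph_third];
			in = in + 3;
			count = count + 1;
		}
		else
		{
			s[out] = s[in];
			in = in + 1;
		}
		out = out + 1;
	}
	s[out] = 0;
	return count;
}
```

Because the result is never longer than the input, the rewrite can be done in place without extra allocation.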
<oriansj>and with a few more tweaks I can probably make inspecting macro expansion quite simple (so we wouldn't have to break out gdb to figure out what we created)
<stikonas[m]>Maybe make it optional?
<stikonas[m]>Trigraphs can accidentally break things
<stikonas[m]>If some code accidentally uses a sequence of chars that ends up being a trigraph but is always compiled without trigraphs
<oriansj>well it would be trivial to disable as it is isolated to a single function
<oriansj>which is called exactly once in a single function, so create a global, add a flag and put an if(!flag) replace_trigraphs(token);
<oriansj>and the second we hit that bug, I'll add it
<stikonas[m]>Yeah, we can do it only if we hit that bug...
<oriansj>less global state reduces the number of things I need to test when fuzzing
<oriansj>hmmm M2-Planet supports -D and M2-Mesoplanet doesn't; I probably should fix that
<oriansj>and in case anyone missed it: https://paste.debian.net/1255074/ a spinning donut buildable by M2-Mesoplanet (not that it would be essential for anyone's bootstrap)
<oriansj>although it would make a funny post: from hex to spinning donuts
<oriansj>wow, apparently arbitrary whitespace can exist between # and define according to the standard (that seems like a bad idea)
<oriansj>well, as I am fixing up trigraphs, I might as well fix that garbage input as well
<oriansj>because you know #\n\t define seems sooo reasonable
<oriansj>even emacs and vim don't syntax highlight that correctly
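The forms a preprocessor has to accept can be shown in a few lines. As far as the standard reads, spaces, tabs and comments may sit between # and the directive name, while a bare newline ends the directive, so a multi-line form needs a backslash line splice. The macro and function names below are made up for the example.

```c
/* Several spaces between # and the directive name. */
#     define SPACED 42

/* A comment between # and the directive name (comments become one space). */
# /* still a directive */ define COMMENTED 7

/* A backslash-newline splice, so the directive name sits on the next line. */
#\
    define SPLICED 1

int directive_spacing_sum(void)
{
	return SPACED + COMMENTED + SPLICED;
}
```

All three directives reduce to an ordinary #define once line splicing and comment removal have run, which is why handling them early in the token stream is attractive.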
<oriansj>with the extra fun of sorting out the behavior of the # and ## operators
<oriansj>as there seems to be a conflict in the standard in regards to maximal munch of tokens and # ## #
<oriansj>fortunately it doesn't occur in GCC's code at all and ## and # never appear to show up on the same line
<oriansj>So I'll be sticking with GCC's maximal munch behavior
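For readers who have not used them, the basic behavior of the two operators is shown below; it does not touch the # ## # corner case from the standard. # turns a macro argument into a string without expanding it first, and ## pastes two tokens into one. All macro and function names are made up for the example.

```c
#define STRINGIFY(x) #x
#define EXPAND_THEN_STRINGIFY(x) STRINGIFY(x)
#define PASTE(a, b) a ## b
#define VERSION 3

/* PASTE(stage, 0) becomes the single identifier stage0. */
static int PASTE(stage, 0) = 7;

/* # does not expand its argument, so this is the text of the macro name. */
const char* raw_name(void)
{
	return STRINGIFY(VERSION);
}

/* One extra level of macro call lets VERSION expand before # is applied. */
const char* expanded_name(void)
{
	return EXPAND_THEN_STRINGIFY(VERSION);
}

int pasted_value(void)
{
	return stage0;
}
```

Here raw_name returns "VERSION" and expanded_name returns "3", the usual two-level idiom for stringifying a macro's value.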
<stikonas>well, editors are definitely not as good as compilers at interpreting C syntax
<oriansj>fair, C has a good few strange rules
<oriansj>although I think I need to unify the token_list and macro_list types to save myself some pain
<oriansj>but first let's hack things into a useful state
<oriansj>so close, just need to wire up the #include logic to eat the line and inject the newly processed list
<oriansj>and with a minor tweak you can identify each file being expanded and walk through them using --display-token-stage
<oriansj>does anyone know what the long form of -D is?
<stikonas>does gcc even have a long form of -D?
<oriansj>it doesn't look like it from the man page but it does look like it stands for define macro
<oriansj>so I was thinking of doing --define-macro being equal to -D
<stikonas>yeah, it probably does stand for define
<stikonas>because -U undefines a macro, so probably stands for #undef
<oriansj>should we add -U to M3-Preprocess ?
<stikonas>hmm, not sure if we need it
<stikonas>would it be hard to add?
<stikonas>I would imagine it shouldn't be too hard
<oriansj>probably not
<stikonas>though there are some corner cases
<stikonas>what happens if one writes -Ddef -Udef
<oriansj>or does -Udef apply to all defines inside #includes?
<stikonas>probably...
<stikonas>same as -D
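On the -Ddef -Udef question: GCC documents that -D and -U are processed in the order they appear on the command line, so the last one wins. A sketch of that behavior follows. It is not M3-Preprocess code; the macro table, the function names, and the choice to handle only the joined -Dname and -Uname forms (no =value, no separate argument) are all assumptions, and allocation failures are not checked.

```c
#include <stdlib.h>
#include <string.h>

struct macro
{
	char* name;
	struct macro* next;
};

static struct macro* define_macro(struct macro* head, const char* name)
{
	struct macro* m = calloc(1, sizeof(struct macro));
	m->name = calloc(strlen(name) + 1, sizeof(char));
	strcpy(m->name, name);
	m->next = head;
	return m;
}

/* Removes every entry with the given name and returns the new list head. */
static struct macro* undefine_macro(struct macro* head, const char* name)
{
	struct macro** link = &head;
	while(NULL != *link)
	{
		struct macro* m = *link;
		if(0 == strcmp(m->name, name))
		{
			*link = m->next;
			free(m->name);
			free(m);
		}
		else link = &m->next;
	}
	return head;
}

int is_defined(struct macro* head, const char* name)
{
	while(NULL != head)
	{
		if(0 == strcmp(head->name, name)) return 1;
		head = head->next;
	}
	return 0;
}

/* Walks the arguments left to right so that later options override earlier ones. */
struct macro* apply_options(int argc, char** argv)
{
	struct macro* table = NULL;
	int i;
	for(i = 1; i < argc; i = i + 1)
	{
		if(0 == strncmp(argv[i], "-D", 2) && 0 != argv[i][2])
		{
			table = define_macro(table, argv[i] + 2);
		}
		else if(0 == strncmp(argv[i], "-U", 2) && 0 != argv[i][2])
		{
			table = undefine_macro(table, argv[i] + 2);
		}
	}
	return table;
}
```

With this ordering rule -Ddef -Udef leaves def undefined, while -Udef -Ddef leaves it defined.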
<stikonas>I don't see any explicit use of -U in live-bootstrap until close to the end (perl 5.32) where we have gcc anyway
<stikonas>and it was used for -U__DATE__ -U__TIME__
<oriansj>which is needed to make them reproducible
<stikonas>yes, but that's well past the time we start using gcc
<stikonas>so I suspect we won't need -U
<oriansj>so good point and we can skip it until we have to fix it later
<stikonas>you can always revisit it if we need it
<oriansj>indeed
<stikonas>indeed...
<oriansj>^_^
<stikonas>at some point later, perhaps you can try running your preprocessor on some real software and then try to feed output to gcc
<stikonas>?
<stikonas>that might help you notice any missing features that we need
<oriansj>good idea
<oriansj>finally got #include to work correctly
<oriansj>now to figure out why the # 0 file lines disappeared
<oriansj>now one should be able to do: ./bin/M3-Preprocess -D __M2__ --architecture x86 --c-preprocess --include-library-directory M2libc -f foo.c and get meaningful output
<oriansj>now to start filling in the C preprocessor features needed to build musl libc
<oriansj>looks like we have a long way to go: https://paste.debian.net/1255098/
<oriansj>hmmm musl seems to reference alltypes.h which appears to be 100% generated from autotools
<stikonas>hmm, let's check...
<oriansj>and might have to add logic for searching down multiple library paths to find the source code as some is in obj/include/ and others is in include/
<oriansj>easy to check with: ./bin/M3-Preprocess --architecture x86 --c-preprocess --include-library-directory ../musl/include/ -f foo.c
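Searching several library directories is usually done by trying each one in order and taking the first file that opens. The sketch below illustrates that approach; the function name open_include and its parameters are hypothetical and do not describe how M3-Preprocess is or will be written.

```c
#include <stdio.h>

/* Tries each directory in order and returns the first file that opens.
   The path that worked is left in found; found is emptied when nothing matches. */
FILE* open_include(const char* name, const char** dirs, int count, char* found, size_t size)
{
	FILE* f;
	int n;
	int i;
	for(i = 0; i < count; i = i + 1)
	{
		n = snprintf(found, size, "%s/%s", dirs[i], name);
		if(n < 0 || (size_t)n >= size) continue;   /* path did not fit in the buffer */
		f = fopen(found, "r");
		if(NULL != f) return f;
	}
	if(size > 0) found[0] = 0;
	return NULL;
}
```

For the musl case above the directory list would hold both obj/include and include, with the order deciding which copy wins when a header exists in both.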
<stikonas>ok, but it's generated, at least it's not pre-generated