IRC channel logs

2024-02-04.log

<oriansj>sam_: probably because if it is something you haven't done before, and few people have done before, it must be hard. Or in my case, I ignored it as it wasn't needed yet and didn't get into it until something simple (and less than 100KLoC) was found
<oriansj>state -= state; ??? like why even ??
<sam_>yeah, must be
<sam_>it was just odd that people kept seeming to "pass it around" too
<oriansj>sam_: well if they think my unxz.c is complicated, then there is no convincing them
<oriansj>although about 1/8th of the code appears to be just figuring out what state they are in and what state they need to transition into.
<Googulator>LZMA probably also has a reputation of being complex because its _compressor_ is indeed complex
<Googulator>plus, 7-Zip is written in C++
<Googulator>& the existence of a C implementation in the LZMA SDK alongside the C++ one is possibly not well known
<oriansj>ok, this one puzzles me; {statement;} and statement; behave differently?
<oriansj>ok; uint8_t foo; {uint8_t foo = 1;} only sets the foo inside of the {} and unsets back to what foo was outside of the {}
<Googulator>That sounds like correct(-ish) C scoping behavior.
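The block-scoping behavior described above can be sketched in a few lines of C. This is a minimal illustration, not code from unxz.c: the function names are hypothetical, and the outer foo is initialized to 0 here (the snippet in the log leaves it uninitialized) so that reading it afterwards is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the outer foo after an inner block
 * has declared and set its own foo. */
uint8_t outer_foo_after_block(void)
{
    uint8_t foo = 0;         /* outer foo; initialized for this sketch */
    {
        uint8_t foo = 1;     /* a new object that shadows the outer foo */
        (void)foo;           /* silence unused-variable warnings */
    }                        /* the inner foo stops existing here */
    return foo;              /* the outer foo was never modified */
}

/* Small self-check showing the expected result. */
void check_block_scoping(void)
{
    assert(outer_foo_after_block() == 0);
}
```

Nothing is "unset" at the closing brace: the inner declaration creates a second variable, and the outer one simply becomes visible again once the block ends.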
<oriansj>but to depend upon it in your code? like how hard is it to give something a different name?
<Googulator>It's certainly a bad practice, and indeed many compilers warn about it.
<oriansj>but not gcc or clang
<oriansj>or tcc
<Googulator>weird, I definitely remember gcc warning about that by default
<Googulator>although maybe it needs -Wall
<oriansj>not even with -Wextra
<Googulator>oh, you are using just a plain {...} to set up a block
<Googulator>as opposed to for (...) {...} or if (...) {...} or similar
<Googulator>plain {...} in the middle of another block is pretty much exclusively used specifically for scoping purposes, so gcc treats this as deliberate shadowing of foo by a programmer who "knows what he's doing"
<oriansj>well not me but the code I am refactoring into something M2-Planet can build
<Googulator>IIRC you do get a warning if you shadow in a block that's "required by the language", such as an if or a for
<Googulator>but a plain block is treated as the equivalent of explicitly marking a fallthrough in a switch-case statement with /* fallthrough */
<oriansj>well there is uint32_t distance;
<oriansj>inside of an if block whose outer scope also has a uint32_t distance;
<oriansj>and still no warning either.
<Googulator>That sounds like a bug.
<oriansj>in both gcc and clang?
<Googulator>The classical example is this:
<Googulator>bool contains_zero(int *array, int len) {
<Googulator>    bool result = false;
<Googulator>    for (int i = 0; i < len; i++) {
<Googulator>        if (!array[i]) {
<Googulator>            bool result = true;
<Googulator>        }
<Googulator>    }
<Googulator>    return result;
<Googulator>}
<Googulator>If something like this doesn't warn, that's a bug.
<oriansj>but the return bit isn't in the code in question
<oriansj>only the setting of bool result = true;
<oriansj>and as it is not being returned; it isn't triggering that case.
<oriansj>replace return result with another if statement (say using if(result))
<Googulator>This will always return false; but it should also generate a compiler warning, because the inner "bool" is clearly an error
<oriansj>yet somehow zero warnings without the return;
<Googulator>Whether the offending bool is returned, or used in some other way, shouldn't matter
<Googulator>but it could well be that unit tests meant to catch this were written using return for both gcc and clang, and so failed to catch breakage when the usage isn't a return
<Googulator>& of course, we have https://github.com/llvm/llvm-project/issues/48943
<Googulator>"relying on undefined behavior in any part of your code makes the behavior of the code as a whole, and that of any other part, undefined; one shouldn't assume undefined behavior can be contained in just part of the code"
<Googulator>(although I don't see any UB in case of shadowing a variable in a block)
<oriansj>true; it is just a very bad practice but technically 100% legal in ANSI C
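For reference, here is a sketch of the contains_zero example from the log next to a corrected version. The `_shadowed` and `_fixed` suffixes and the `(void)result;` line are additions for this illustration. As far as I know, GCC and Clang report this kind of shadowing only when `-Wshadow` is passed, and that option is not part of `-Wall` or `-Wextra`, which would explain the silence observed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Version from the log: the inner declaration creates a second
 * variable, so the outer result is never changed. */
bool contains_zero_shadowed(const int *array, int len)
{
    bool result = false;
    for (int i = 0; i < len; i++) {
        if (!array[i]) {
            bool result = true;   /* shadows the outer result */
            (void)result;         /* silence unused-variable warnings */
        }
    }
    return result;                /* always false */
}

/* Assigning instead of declaring updates the outer variable. */
bool contains_zero_fixed(const int *array, int len)
{
    bool result = false;
    for (int i = 0; i < len; i++) {
        if (!array[i]) {
            result = true;
        }
    }
    return result;
}
```

Compiling the first function with something like `gcc -Wshadow -c file.c` should produce a diagnostic for the inner declaration, whether or not the outer variable is later returned.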
<fossy>stikonas, my PRs are good now I think
<oriansj>cleaned up to line 1129 https://paste.debian.net/1306240/
<oriansj>hopefully will get done tomorrow
<oriansj>the only thing I can think of is that lzma2 requires an unforgiving state machine that appears to use a 700+ line, 4-level-deep nest of if statements. but it probably could be broken up if someone took time to reason it out.
<oriansj>should I remove the lzma support if we don't need it, or just leave it, as it is only about 40 lines extra?
<oriansj>weirdly none of the lzma logic block is actually used when a .tar.lzma file is unpacked.
<oriansj>wait, nope, the logic for the test was wrong
<Googulator>you mean lzma2?
<Googulator>The problem is, you can't really know in advance whether or not a .xz will contain lzma2 blocks
<Googulator>Plain lzma (as opposed to lzma2) is used in _all_ xz files, so removing that would just render the code useless
<Googulator>fossy: updated https://github.com/fosslinux/live-bootstrap/pull/415 again; please let me know if there's anything left to do in it
<stikonas>also git live-bootstrap seems to have failed with a checksum error...
<Googulator>yeah, I actually included the corrected checksum in my updated PR
<stikonas>maybe let me cherry-pick that diff change
<Googulator>oh, unfortunately it got rolled into the main commit
<stikonas>well, I'll just manually copy it
<stikonas>I need to re-run it locally anyway before pushing
<stikonas>and double check for some obvious issues, like non-reproducible documentation
<oriansj>Googulator: no look at line 1952: https://paste.debian.net/1306287/
<oriansj>(the code is currently in the state that doing unxz input output with xz and lzma files will work)
<oriansj>(still not done cleaning it up and making it more mescc-tools-extra standard in terms of interface)
<Googulator>oh, you mean raw LZMA files
<oriansj>bingo
<Googulator>I guess it's expendable if it turns out to be hard to port to m2-planet
<Googulator>but I wouldn't drop it just to reduce code size; it's C code, so minimizing size isn't a goal in and of itself
<oriansj>unlikely, it is just unneeded extra functionality (which someone might want)
<Googulator>if it were hex0, then yeah, making it as small as possible trumps everything else
<Googulator>but it's C
<stikonas>yeah, I would keep it too...
<Googulator>IMO it's better to keep working code around, even if it's unused at the moment
<Googulator>BTW, there's more breakage in the recent PRs
<Googulator>the one that added an early m4 in particular breaks kernel bootstrapping
<Googulator>due to m4-1.4.10.tar.gz not being included in fiwix-file-list.txt
<stikonas>argh...
<stikonas>at least this is a simple fix...
<Googulator>yeah, a short term fix is easy
<Googulator>long term, I'd like to get rid of fiwix-file-list entirely
<Googulator>and just have make-fiwix-initrd parse builder-hex0's in memory FS directly
<stikonas>yeah, that thing is a bit annoying...
<Googulator>we already have 1:1 FS transfer from Fiwix to Linux
<Googulator>would be great to also have that from builder-hex0 to Fiwix
<stikonas>ok, pushed
<stikonas>though I'm still running bootstrap...
<stikonas>early transitions are harder though...
<Googulator>yeah, we can't do a real "ls" or "find"
<stikonas>I haven't reached anywhere close to stage0-uefi -> Fiwix transition, but it's probably just as tricky...
<Googulator>but we do have a shortcut, since builder-hex0 doesn't protect memory
<Googulator>so you can just read into kernelspace from userspace
<stikonas>that won't work on UEFI though...
<stikonas>at least not until exiting boot services
<Googulator>but UEFI FS support is a lot more extensive than builder-hex0
<stikonas>yeah, that's true...
<stikonas>so a bit easier to deal with
<Googulator>the UEFI shell even has an ls / dir command
<stikonas>yeah, if we want to use it...
<stikonas>though ideally bootstrap should work without it
<Googulator>probably not directly, but it shows that the APIs are there
<stikonas>but UEFI shell is very useful for debugging
<sam_>[02:16:14] <@oriansj> ok; uint8_t foo; {uint8_t foo = 1;} only sets the foo inside of the {} and unsets back to what foo was outside of the {}
<sam_>yeah, this is a block
<Googulator>it is, and that's indeed a documented feature of ANSI C
<Googulator>doesn't mean it's not terribly bad practice that warrants a warning
<Googulator>after all, compilers do warn on implicit return types
<Googulator>also a feature required by ANSI C
<Googulator>maybe not uint8_t foo; {uint8_t foo = 1;} since it's pretty clearly an intentional use of block scoping
<Googulator>but certainly uint8_t foo; if (bar) {uint8_t foo = 1;}
<Googulator>which is almost certainly an error for uint8_t foo; if (bar) {foo = 1;}
<oriansj>well legal C includes a great deal of just bad ideas; like macros and poor default types
<oriansj>*(ptr + index) being the same as ptr[index]
<oriansj>foo = ptr[index(a>b ? -- : ++)]++; being just fine to use is another
<oriansj>or using = as assignment rather than := and using == for equality rather than =
<Googulator>that's just syntax bikeshedding
<Googulator>you could argue that not interpreting != as "factorial equals" is a bad idea in C
<Googulator>but what I meant is programming practices that ANSI C explicitly enables, but which still need warnings
<Googulator>e.g. if (size = 0) is perfectly valid C, but compilers generally warn about it
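The `if (size = 0)` case can be sketched as follows. The function names are hypothetical. GCC with `-Wall` reports the first form with a message along the lines of "suggest parentheses around assignment used as truth value", even though the code is valid C.

```c
#include <assert.h>
#include <stddef.h>

/* Intended to report whether a buffer is empty, but the condition
 * assigns 0 to size and then tests that 0, so the branch is never taken. */
int is_empty_buggy(size_t size)
{
    if (size = 0) {          /* assignment, not comparison */
        return 1;
    }
    return 0;                /* reached for every input */
}

/* Comparison with == gives the intended behavior. */
int is_empty_fixed(size_t size)
{
    if (size == 0) {
        return 1;
    }
    return 0;
}
```

Writing `if ((size = 0))` with doubled parentheses is the usual way to tell the compiler that the assignment is intentional.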
<oriansj>then you realize one can make if(true == 1 && true == 0) {puts("surprise");} work via macros;
<oriansj>mixing of assignment and mutation at the same time is a code smell;
<oriansj>and lack of bounds checking on arrays is responsible for a great many bugs.
<oriansj>but yes I agree compilers and their writers have done a great job improving the situation which the programming language standards writers have created.
<oriansj>and I wish a = b++ = c-- = a; would be a warning.
<Googulator>meanwhile, almost successful transition to Fiwix with no filelist
<Googulator>the only limitation is, the filelist-free make_fiwix_initrd can't pass through zero-length files
<Googulator>because in builder-hex0, they are indistinguishable from directories
<Googulator>so I need to recreate mes/config.h (which is empty) once Fiwix boots, but before trying to build anything
<Googulator>empty directories are also not passed through, for the same reason
<stikonas>Googulator: I'm surprised mes/config.h is used after Fiwix is booted...
<oriansj>agreed
<stikonas>Googulator: your diffutils checksum also failed...
<stikonas>perhaps the build is not yet reproducible...
<Googulator>It's not really "used" as much as it needs to exist because another header includes it
<oriansj>hmmm; sounds like we need to put a #ifdef in there to ignore that file if it isn't used.
<stikonas>potentially it is used in some configurations of meslibc...
<stikonas>I guess that's where configure stuff goes
<Googulator>It is used during build of mes itself, but then replaced with an empty file
<Googulator>Filelist-free Fiwix kexec will also need a small fix to builder-hex0
<stikonas>for zero length files?
<Googulator>The file name table isn't zeroed at the beginning, so unused entries will have garbage data in them instead of zeros
<Googulator>Which makes it impossible to find the last valid file
<Googulator>The obvious fix is to zero the file table before beginning to parse the srcfs
<Googulator>Which is a 14MiB "rep stosb"
<Googulator>hopefully we don't need to run builder-hex0 on any system where that's too slow to be acceptable
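The reasoning about the file name table can be illustrated with a small C sketch. Everything here is an assumption made for illustration: the entry layout, the table size, and the function names are hypothetical and are not taken from builder-hex0, which is written in hex0 rather than C. The `memset` call plays the role of the "rep stosb" mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 16   /* hypothetical; the real table is about 14 MiB */
#define NAME_LEN  32   /* hypothetical name field size */

struct file_entry {
    char name[NAME_LEN];   /* assumption: an empty name marks an unused slot */
    size_t length;
};

static struct file_entry file_table[MAX_FILES];

/* Simulate memory that was never cleared and holds leftover data. */
void scribble_file_table(void)
{
    memset(file_table, 0xAA, sizeof file_table);
}

/* Zero the whole table before parsing, like the proposed fix. */
void clear_file_table(void)
{
    memset(file_table, 0, sizeof file_table);
}

/* Store a file in the first unused slot; returns its index or -1. */
int add_file(const char *name, size_t length)
{
    for (int i = 0; i < MAX_FILES; i++) {
        if (file_table[i].name[0] == '\0') {
            strncpy(file_table[i].name, name, NAME_LEN - 1);
            file_table[i].name[NAME_LEN - 1] = '\0';
            file_table[i].length = length;
            return i;
        }
    }
    return -1;
}

/* Index of the last used entry, scanning until the first empty name. */
int last_valid_file(void)
{
    int last = -1;
    for (int i = 0; i < MAX_FILES && file_table[i].name[0] != '\0'; i++) {
        last = i;
    }
    return last;
}
```

With leftover data in the table, `last_valid_file` runs through every slot because none of them looks empty. After `clear_file_table`, the scan stops right after the last entry that was actually added.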
<Googulator>Zero length files are ignored for now - the only relevant one I found so far was mes/config.h, which I just recreate in an improve step after booting into Fiwix
<stikonas>yeah, I guess recreating is fine
<stikonas>still, builder-hex0 is quite amazing...
<stikonas>and rickmasters too
<stikonas>posix-builder still works much worse despite being written in C
<stikonas>and I didn't have much progress in the last 3 weeks :(
<euleritian>stikonas: Did you see the FOSDEM talk by Ekaitz today? You apparently achieve much more than progress measurable in code!
<stikonas>euleritian: not yet, I missed it though I intend to view recording later
<stikonas>well, we did make quite a bit of progress on riscv bootstrapping
<stikonas>and as Ekaitz said, things often go much faster when you can actively collaborate :)
<stikonas>if one person is stuck, often the other can help to get the process unstuck
<stikonas>euleritian: I did see the slides of the talk though
<oriansj> https://fosdem.org/2024/events/attachments/fosdem-2024-1755-risc-v-bootstrapping-in-guix-and-live-bootstrap/slides/21755/presentation_cjmPXBA.pdf