IRC channel logs

2024-02-06.log

<matrix_bridge><Lance R. Vick> Here is the diff. Looks like a number of things beyond just the symlinks that I get no errors/warnings for. https://dpaste.org/vpwre/raw
<matrix_bridge><Lance R. Vick> will fiddle and see if I can force them to match
<stikonas>hmm, this looks like different commits...
<matrix_bridge><Lance R. Vick> and yet, it's the exact same tar file. one extracted with kaem and one with debian tar
<matrix_bridge><Lance R. Vick> which is bizarre
<stikonas>hmm, maybe not
<stikonas>though some files are missing in one
<stikonas>e.g. checksum-transcriber-1.0.riscv64.checksums
<stikonas>which wouldn't break your build but still bizarre
<matrix_bridge><Lance R. Vick> for some reason the stage0 tar just silently decides to ignore some things it seems
<stikonas>well, it was only tested on a few tarballs
<stikonas>to get to GNU tar
<stikonas>oh, and it might very well be that your tarball is too new
<stikonas>tar has different formats....
<stikonas>and most of the tarballs that untar has to deal with are for old GNU software
<stikonas>so it works well
<stikonas>but your live-bootstrap tarball is probably produced by something much newer
<matrix_bridge><Lance R. Vick> Yeah I am getting it via "wget https://codeload.github.com/lrvick/live-bootstrap/legacy.tar.gz/fc6eeb6bd75ea0d0025a79ea9fe45614bd60ba14 -O live-bootstrap.tgz"
<matrix_bridge><Lance R. Vick> * "ADD https://codeload.github.com/lrvick/live-bootstrap/legacy.tar.gz/fc6eeb6bd75ea0d0025a79ea9fe45614bd60ba14
<matrix_bridge><Lance R. Vick> Will see if I can abuse the OCI ADD built in git or untar functionality then, as it seems stage0 tar is currently a non-starter for extracting GitHub's tar exports.
<matrix_bridge><Andrius Štikonas> you can even create tar files that extract different things depending on implementation
<matrix_bridge><Lance R. Vick> Which has some scary security implications. Two implementations get the same hash, and different extracted directories.
<matrix_bridge><Lance R. Vick> TIL
<matrix_bridge><Andrius Štikonas> yeah, it was mentioned here a couple of years ago
<matrix_bridge><Andrius Štikonas> but it was published somewhere else
<matrix_bridge><Lance R. Vick> my hypothesis is that the files that are dropped seem to be > 16 char filenames
<matrix_bridge><Lance R. Vick> If we want github tar exports of live-bootstrap to be compatible with stage0 tar, will probably have to shorten a few filenames
<matrix_bridge><Lance R. Vick> and drop symlinks
<matrix_bridge><Andrius Štikonas> or better fix stage0 tar
<matrix_bridge><Andrius Štikonas> Lance R. Vick: oh here I finally found it https://www.openwall.com/lists/oss-security/2021/10/03/1
<stikonas>as for long names
<stikonas>I thought only length of full path matters...
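One way to narrow down which entries a minimal untar might be skipping is to look at the raw 512-byte headers. The following is a minimal sketch, assuming the POSIX ustar header layout (name at offset 0, typeflag at offset 156, prefix at offset 345); the function names are hypothetical and this is not stage0's untar code. It flags entries that need more than the plain 100-byte name field: pax headers, GNU long-name records, symlinks, and paths split across the prefix and name fields. Whether any of these explains the files dropped above is not established in the log.

```c
#include <stddef.h>
#include <string.h>

/* Field offsets and widths from the POSIX ustar header layout. */
enum {
    TAR_TYPEFLAG_OFF = 156,
    TAR_PREFIX_OFF   = 345,
    TAR_PREFIX_LEN   = 155
};

/* Length of a fixed-width header field that may lack a NUL terminator. */
static size_t field_len(const unsigned char *field, size_t max)
{
    const unsigned char *end = memchr(field, '\0', max);
    return end != NULL ? (size_t)(end - field) : max;
}

/* Hypothetical helper: returns 1 when the 512-byte header describes an
 * entry that an extractor handling only regular files and directories
 * with a bare name field would get wrong, 0 otherwise. */
static int tar_entry_needs_extended_support(const unsigned char *hdr)
{
    unsigned char type = hdr[TAR_TYPEFLAG_OFF];

    if (type == 'x' || type == 'g') return 1; /* pax extended / global header */
    if (type == 'L' || type == 'K') return 1; /* GNU long name / long link */
    if (type == '2')                return 1; /* symbolic link */

    /* A non-empty prefix means the full path is prefix + "/" + name. */
    if (field_len(hdr + TAR_PREFIX_OFF, TAR_PREFIX_LEN) > 0) return 1;

    return 0;
}
```

Running such a check over each header block of the tarball would show whether the dropped files share one of these traits, which is a quicker test of the filename-length hypothesis than renaming files.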
<stikonas>oriansj: so using my own trivial memory allocator to store memory of forked programs seems to work
<stikonas>I suspect that what was happening with M2libc implementation was
<stikonas>we are allocating some big chunk of memory, then freeing it, then some small malloc asks for memory, so it is given the big chunk, so when the next big chunk is allocated, we again need to grab more from the system
<stikonas>and eventually we run out of memory
<oriansj>yeah, we definitely should add some improvements to untar
<oriansj>stikonas: I am glad to hear that you have a working solution.
<stikonas>well, need to improve my commit a bit before pushing
<stikonas>I converted saved memory for now but it makes sense to also do the same for saved stack and saved program (binary itself)
<stikonas>this was a bug that took a while to figure out...
<stikonas>mostly due to stupid copy/paste error that masked the real reason for the failure
<matrix_bridge><Lance R. Vick> Current debian hack, and commented out kaem script that I expect -should- be able to work once we figure out the bugs with stage0 tar. At that point the stagex live-bootstrap setup will -only- use stage0 as a base image and no binaries from existing distros.
<matrix_bridge> https://git.distrust.co/public/stagex/src/branch/kernel/src/bootstrap/stage1/Containerfile#L273-L292
<matrix_bridge><Lance R. Vick> The download-distfiles was proving to be pretty unreliable with people reproducing, and using a bunch of ADDs not only avoids me having to borrow curl from a debian container etc, but they also pull in parallel with graceful resume and caching.
<matrix_bridge><Lance R. Vick> though it is ugly. I may autogenerate that as a separate build context file in a follow up
<oriansj>Googulator: the fixes to the various stage0-posix builds for the sys/utsname.h change are incoming (it just requires adding -f ./M2libc/sys/utsname.h \ right before unistd.c in the M2-Planet build commands)
<oriansj>but it does appear that there is a regression for wrap.c with x86/linux/sys/stat.c
<oriansj>which doesn't make any sense
<oriansj>as it should load #include <sys/types.h> prior to #include <sys/stat.h>
<oriansj>(god I hate debugging C preprocessor behavior)
<fossy>stikonas, oh, i hate those manpages
<fossy>oh
<fossy>that's why checksum changed then
<fossy>conveniently i did the rebase Feb 1 so i presumed that checksum had changed due to that :-\
<Googulator>fossy: working on eliminating fiwix-file-list.txt, it seems a bug has slipped through the simplify refactor: the Linux initramfs is no longer compressed
<Googulator>(also, gen_init_cpio cannot handle file names with spaces, but luckily the only file with this problem is /High Level Prototypes, easily renamed)
<fossy>Googulator: i guess, does it matter if it's compressed?
<fossy>Googulator: thanks for doing that work too by the way. fiwix-file-list.txt irked me as to its existence
<Googulator>fossy: due to the lack of compression, I'm now hitting the 256MiB kexec limit
<Googulator>after removing fiwix-file-list
<Googulator>restoring compression solves that
<matrix_bridge><Lance R. Vick> Is there any method in kaem/stage0 land to 0 out timestamps in files? Other than the tar issue, it looks like I am going to need to import debian just for "touch" which is sad.
<matrix_bridge><Lance R. Vick> oh! Turns out OCI exporters recently support "rewrite-timestamp=true" so I can drop my "touch"es everywhere.
<Foxboron>Yes, there is a guy that has been working on better supporting reproducible builds in the OCI/container space
<matrix_bridge><Lance R. Vick> it all actually works now. Doing a big refactor to drop all my wgets, sha256sums, touches, line continuations etc. Can get away with heredocs and "from scratch" most of the time now thanks to OCI built-ins
<fossy>Googulator: ah, of course, alright
<Googulator>it's kind of a miracle we didn't hit that limit earlier
<fossy>would be trivial to write a touch i think anyways for stage0
<Googulator>I since solved the 0-length file issue
<stikonas>well, catm already does 50% of what people use touch for
<fossy>yeah, not zeroing timestamps
<fossy>sorry, didnt ping lrvick
<oriansj>well zeroing timestamps would require an additional system call
<oriansj>and we didn't want to add an extra issue for builder-hex0
<stikonas>yeah, though I think in this case it might as well be optional
<stikonas>builder-hex0 will just return success on unknown system call
<oriansj>well I guess we could make a touch for mescc-tools-extra; the biggest lift would be defining the syscall in M2libc
<stikonas>well, but do we still need it?
<stikonas>Lance R. Vick managed to use OCI builtins for that
<oriansj>need is probably a no; want or nice to have is maybe; will any effort be spent on it? Only if someone feels like it.
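For reference, the timestamp-zeroing half of touch is small on a hosted POSIX system. The sketch below assumes `utimensat` is the system call in question, which the log does not state; the function name `zero_timestamps` is hypothetical, and a stage0 version would go through M2libc's own syscall wrappers rather than the libc headers used here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>    /* AT_FDCWD */
#include <stdio.h>    /* perror */
#include <sys/stat.h> /* utimensat */
#include <time.h>     /* struct timespec */

/* Hypothetical helper: set both access and modification time of `path`
 * to the Unix epoch. Returns 0 on success, -1 on failure. */
static int zero_timestamps(const char *path)
{
    struct timespec times[2];

    times[0].tv_sec = 0;  /* access time */
    times[0].tv_nsec = 0;
    times[1].tv_sec = 0;  /* modification time */
    times[1].tv_nsec = 0;

    if (utimensat(AT_FDCWD, path, times, 0) != 0) {
        perror("utimensat");
        return -1;
    }
    return 0;
}
```

This covers only the part that catm does not, namely resetting timestamps on an existing file; creating the file is left to the caller.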
<matrix_bridge><Lance R. Vick> Unrelated: Noticed I have been chattering about stagex stuff in like 5 places. It is now at #stagex:matrix.org (https://matrix.to/#/#stagex:matrix.org) for anyone in matrix land. May try bridging to libera or oftc at some point.
<oriansj>stikonas: I figured out the issue with the M2libc headers and I think we have a problem.
<stikonas>?
<oriansj>basically if a UEFI library includes a base library and it is loaded prior to the base library, and if the resulting binary is not UEFI; the base library is dropped with it
<oriansj>so perhaps it is finally time to make M2-Mesoplanet smarter about libraries and not load files it doesn't plan on using.
<stikonas>hmm, perhaps
<oriansj>and doing so will likely be a breaking change
<oriansj>so, definitely worth a tag or two
<stikonas>would #include logic from mescc be helpful?
<stikonas>might be worth looking at what is done there
<oriansj>well no harm in reading how someone else solved the #include problem
<stikonas>probably in this file https://git.savannah.gnu.org/cgit/mes.git/tree/module/mescc/preprocess.scm
<stikonas>hmm, but it's nyacc based...
<oriansj>yep
<oriansj>and nyacc is a big beast
<oriansj>that returns a proper AST
<oriansj>(unlike M2-Planet/M2-Mesoplanet's simple token list)
<oriansj>if one had a proper AST, the answer is trivial: just prune the tree on the #if/#ifdef/#ifndef statements
<oriansj>no more complexity needed
<oriansj>but we have things like #ifndef _FCNTL_H #define _FCNTL_H which need to be evaluated
<oriansj>which close off those libraries after a single load
<oriansj>so that we don't get duplicates
<oriansj>ok, I think I found an ugly but working solution
<oriansj>but it means we will need to add strstr to M2libc
<oriansj> https://paste.debian.net/1306473/
<oriansj>and both fixes are up
<oriansj>stikonas: hopefully I got the uefi side behavior correct (if not please let me know)
<stikonas>hmm, probably not
<stikonas>there is build error in M2libc/uefi/unistd.c:322 ERROR in create struct
<stikonas>strange...
<stikonas>but it was an update to all submodules
<stikonas>could have been some other update too
<stikonas>oh, probably need to add more stuff to manual build scripts
<stikonas>anyway, I'll check in the evening
<Googulator>stikonas: possibly it's https://github.com/oriansj/M2libc/commit/fb6701a73189afca152ea1154650c315df4e6a93#r138302605
<stikonas>probably
<oriansj>probably just need to #include <sys/utsname.h> in the unistd.h prior to the unistd.c
<Googulator>assuming this is in something built via mesoplanet
<stikonas>no, this will need fixes to kaem script
<stikonas>this is way before mesoplanet
<stikonas>first build of hex2.c
<oriansj>then yes, it would be the -f ./M2libc/sys/utsname.h \ right before unistd.c
<oriansj>stage0-posix-x86 has been fixed
<oriansj>now just to fix the rest
<oriansj>well it appears fuzzing unxz.c is a *very* slow process
<oriansj>so, guess fixing the segfaults in the code is going to take a while and I will just have to figure out the M2-Planet builds issue
<oriansj>another route.
<oriansj>but I'll probably get that done over the next 2 weeks. (assuming I don't get stuck again)
<oriansj>stage0-posix-amd64 has been fixed
<oriansj>and riscv32, riscv64 and aarch64 fixes are up
<Googulator>Does unxz.c work when compiled by something more capable than M2-(Meso)planet?
<Googulator>e.g. by tcc
<stikonas>oriansj: by the way, what do you think of renaming AMD64 dir and file names to amd64 in stage0-posix
<stikonas>I think janneke was complaining that it complicates packaging right now
<stikonas>some of our things have amd64 and some AMD64
<oriansj>Googulator: yes, it builds just fine with clang and gcc (have not tested tcc yet)
<janneke>stikonas: was i? could be, i'm all for uniformity but that's not always easy?
<oriansj>stikonas: I don't have a problem with standardizing things
<stikonas>hmm, maybe somebody else then
<stikonas>but I remember somebody from guix...
<oriansj>as long as there is a single standard to follow; not a problem at all
<janneke>mes even has mostly x86, but also uses i386.scm; when it doesn't immediately hurt it takes some effort to standardize things
<oriansj>but I refuse to accept Intel's name for AMD's instruction set.
<stikonas>well, that's fine when it is internal stuff
<stikonas>but with stage0-posix as a user you might need both
<stikonas>e.g. M1 would need "amd64" in --architecture but file output is in AMD64/bin/
<oriansj>and x86 uses i386 in a few files too
<oriansj>but that is because none of the bits needed floating point or more advanced features of x86
<oriansj>we could string insensitive matching and then AMD64/amd64/Amd64 would all work
<matrix_bridge><Christoph> s/string/case/
<matrix_bridge><Christoph> * s/string/use case/
<Googulator>I don't get why "AMD64" is a better name than "x86-64"... why use a vendor-specific brand name instead of a generic one?
<Googulator>(I agree not "x64" since that's a Microsoftism)
<Googulator>btw, I already see one issue with unxz
<Googulator>in main, we null-check the variable "name"
<Googulator>but it's never initialized
<Googulator>same problem for "dest"
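The pattern being described, a null check on a pointer that was never set, can be avoided by initializing both pointers before parsing. This is a minimal sketch, not the actual unxz.c code: the struct and function names are hypothetical, and it assumes that `--file` and `--output` each take their value as the following argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the two paths discussed in the log. */
struct unxz_args {
    const char *name; /* input file */
    const char *dest; /* output file */
};

/* Returns 0 when both paths were supplied, -1 otherwise. The pointers
 * are set to NULL up front so the later null checks are meaningful. */
static int parse_args(int argc, char **argv, struct unxz_args *out)
{
    int i;

    out->name = NULL;
    out->dest = NULL;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--file") == 0 && i + 1 < argc) {
            out->name = argv[++i];
        } else if (strcmp(argv[i], "--output") == 0 && i + 1 < argc) {
            out->dest = argv[++i];
        }
    }

    return (out->name != NULL && out->dest != NULL) ? 0 : -1;
}
```

With this shape, running the program with no parameters yields a clean error return instead of a dereference of whatever happened to be on the stack.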
<oriansj>Googulator: it is the rule of who creates a thing gets to name it.
<oriansj>AMD created the 64bit standard used on x86 processors and called that standard amd64
<Googulator>didn't AMD call the standard x86_64 and its own implementation of it AMD64(R)?
<Googulator>to me, it's like "paracetamol" vs "Tylenol"
<oriansj>fair enough
<oriansj>well not exactly but close enough: https://web.archive.org/web/20120308030806/http://www.amd.com/us/press-releases/Pages/Press_Release_751.aspx
<Googulator>I remember Microsoft themselves struggled with the naming, but for a different reason
<Googulator>contract with Intel stipulating that "64-bit" must be synonymous with "Itanium" in all communications
<oriansj>well Intel called their Itanium IA64 and their 32bit x86 IA32 and when AMD's K8 was winning they released an x86_64 which was not compatible with AMD64
<Googulator>how was it not compatible?
<oriansj>the sysenter and sysexit calls were different
<oriansj>it is why gentoo still calls it amd64
<stikonas>not just gentoo
<stikonas>Debian also calls it amd64
<stikonas> https://packages.debian.org/sid/amd64/bash/download
<oriansj>another missing feature was the no-execute bit
<oriansj>and until the core2duo they were not compatible 64bit cpus
<stikonas>what happened later to sysexit and sysenter?
<stikonas>I think now everybody uses syscall and sysret
<stikonas>was it just abandoned?
<oriansj>Well effectively, yeah
<stikonas>oriansj: so I'm now retrying stage0-uefi with your submodule updates
<stikonas>now it goes much further but M2-Mesoplanet fails building sha256sum.efi
<stikonas>the error is "unknown host"
<oriansj>basically only the Pentium 4 family had those 64bit instructions and none of the 64bit Linux distros really supported it
<Googulator>precursor to the AVX512 mess I guess
<oriansj>Googulator: AMD and Intel have broken each other's standards multiple times at this point.
<Googulator>AVX512 wasn't even AMD vs Intel
<oriansj>Like 3DNow and Intel using that opcode for something else
<Googulator>it was Intel vs Intel
<stikonas>oh I guess i need to set some extra command line flags now
<stikonas>oriansj: oh, that's because I was passing --operating-system UEFI
<stikonas>and your check is lowercase...
<oriansj>>.< sorry
<oriansj>I shall fix that
<stikonas>yeah, I think in other places it's uppercase
<stikonas>including in cc_spawn.c...
<oriansj>fixed and pushed
<oriansj>I really should make those places case insensitive
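A case-insensitive comparison for option values such as `UEFI`/`uefi` or `AMD64`/`amd64` could look like the sketch below. It assumes a hosted C environment with `<ctype.h>`; the helper name is hypothetical, and this is not the fix that was pushed.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 when the two strings are equal
 * ignoring ASCII case, 0 otherwise. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* Cast to unsigned char: tolower is undefined for negative values. */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```

Used in place of an exact string match, this would accept `--operating-system UEFI` and `--operating-system uefi` alike.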
<Googulator>blood-elf's debug data confuses the hell out of Ghidra
<Googulator>every jump target gets labeled as a function
<oriansj>probably would make a good bug for them to figure out
<oriansj>Googulator: well guess they need to improve that ;-p
<Googulator>After manually deleting the fake functions, it reconstructs the control flow just fine
<Googulator>The segfault when running "unxz" without any parameters is caused by the uninitialized variables in main()
<Googulator>If I do specify --file and --output, interestingly I don't get a segfault, but rather a silent quit
<stikonas>oriansj: ok, it's working now, just need to update checksums
<oriansj>stikonas: thank you
<oriansj>Googulator: probably something wrong with the pointer math as +1 in M2-Planet means just that but +1 in C means different things depending upon the type
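The difference is easy to see in standard C, where `p + 1` advances by `sizeof(*p)` bytes. The sketch below uses hypothetical function names; the byte-granular behaviour attributed to M2-Planet is taken from the message above, not verified here. It shows the C stride for an `int32_t` pointer and the explicit scaling needed when `+1` means one byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte distance covered by "p + 1" for an int32_t pointer in standard C. */
static size_t c_stride_int32(void)
{
    int32_t cells[2] = {0, 0};
    int32_t *p = cells;
    return (size_t)((unsigned char *)(p + 1) - (unsigned char *)p);
}

/* Reads element `i` using byte-granular arithmetic only: the index is
 * scaled by the element size by hand, which is what code must do when
 * "+1" advances a single byte. */
static int32_t read_int32_scaled(const int32_t *base, size_t i)
{
    const unsigned char *bytes = (const unsigned char *)base;
    int32_t value;
    memcpy(&value, bytes + i * sizeof(int32_t), sizeof value);
    return value;
}
```

Code written for one convention and compiled under the other will be off by a factor of the element size on every index, which fits a silent failure rather than an immediate crash.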
<oriansj>and anyone else heard of Haskell's Eveil Mangler perl script?
<oriansj>^Eveil^Evil^
<Googulator>    switch(global->readCur[7])
<Googulator>    {
<Googulator>        /* None */
<Googulator>        case 0: checksumSize = 1;
<Googulator>                break;
<Googulator>        /* CRC32 */
<Googulator>        case 1: checksumSize = 4;
<Googulator>                break;
<Googulator>        /* CRC64, typical xz output. */
<Googulator>        case 4: checksumSize = 8;
<Googulator>                break;
<Googulator>        default: return SZ_ERROR_BAD_CHECKSUM_TYPE;
<Googulator>    }
<Googulator>this is where we break
<Googulator>global->readCur[7] reads from the right location, but with the wrong size
<Googulator>32 bits starting at address readCur + 7
<Googulator>instead of just 8
<Googulator>& then it checks the whole 32-bit value read into ebx against the constants in the switch statement
<Googulator>which won't match
<Googulator>I have a fealing every time we use readCur[...] directly without assigning it to a variable, this is gonna bite us
<Googulator>*feeling
<Googulator>oriansj: would switch((uint8_t) global->readCur[7]) be sufficient to fix this, or will that still result in 32-bit reads?
<Googulator>also, this line is most certainly wrong: global->readEnd[0] = hold;
<Googulator>hold is an int32_t, while readEnd is an array of uint8_ts
<Googulator>the value of hold comes from an fgetc, so presumably hold was meant to be a uint8_t
<Googulator>oriansj: possibly fixed unxz: https://paste.debian.net/1306537/
<Googulator>can't test it right now in M2-Mesoplanet
<Googulator>(perhaps it would be better to fix M2-Planet instead, so global->readCur[7] reads the correct size)
<oriansj>well doing (type) thing is not valid in M2-Planet but I can just do & 0xFF and get effectively the exact same thing
<oriansj>but great work
<oriansj>don't forget to add your name to the copyright header ^_^
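The effect of the `& 0xFF` workaround can be sketched in portable C. This assumes the over-wide read is a 32-bit little-endian load, as on x86; `load32_le` and `checksum_size_for` are hypothetical names used only for illustration, and the case values mirror the switch quoted above.

```c
#include <stdint.h>

/* Simulates the 32-bit little-endian load described above: four bytes
 * starting at p are combined, with p[0] in the low byte. */
static uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Maps the checksum-type byte to a checksum size, masking the value to
 * 8 bits first so stray upper bytes cannot defeat the comparison.
 * Returns -1 for an unknown type. */
static int checksum_size_for(uint32_t raw)
{
    switch (raw & 0xFF) {
        case 0: return 1; /* None */
        case 1: return 4; /* CRC32 */
        case 4: return 8; /* CRC64, typical xz output */
        default: return -1;
    }
}
```

Without the mask, a header whose type byte is 4 but whose next three bytes are non-zero produces a 32-bit value that matches none of the cases; with it, only the intended byte takes part in the comparison.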
<Googulator>What's the best way to test compilation of programs using M2-(Meso)planet?
<Googulator>Something a bit more lightweight than live-bootstrap
<oriansj>Googulator: git clone --recursive https://github.com/oriansj/M2-Mesoplanet.git; make then git clone --recursive mescc-tools and git clone --recursive M2-Planet. Put M2-Planet, M1, hex2 and blood-elf in your ~/bin folder and ensure ~/bin is in your path will work
<oriansj>you can use GCC/clang/gcc to compile them
<oriansj>^gcc^tcc^
<oriansj>then you should be able to just do: M2-Mesoplanet -I M2libc/ -f unxz.c -o unxz
<Googulator>managed to get it done with the stage0-posix repo