IRC channel logs

2024-02-07.log

<stikonas>Googulator: I'm looking at fiwix file list PR, looks good I guess but it's really builder-hex0 specific...
<stikonas>we'll need something different for UEFI then
<Googulator>same for kexec-fiwix
<Googulator>which uses the same technique to locate the initrd
<stikonas>well, yeah...
<stikonas>I guess we'll have to clone/fork the whole file...
<stikonas>well, kexec-fiwix will be completely different anyway
<stikonas>as we need to shut down UEFI services, etc...
<stikonas>but this initrd stuff kind of does 2 things: 1. reads files and 2. creates initrd
<stikonas>2 will be the same I guess...
<stikonas>but perhaps it's infeasible to split 1 and 2...
<stikonas>hmm, "cp" command doesn't work in UEFI. Both posix-runner.efi cp and cp.efi hang...
<stikonas>yet catm works
<oriansj>Googulator: well, the good news is & 0xFF does not change the test result for the GCC-built unxz.c; the bad news is it didn't result in a successful test after the M2-Mesoplanet build. And you included a mistake: fgetc returns an int, not a char.
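The fgetc remark can be illustrated with a minimal sketch. The function name `count_bytes` is hypothetical and does not come from unxz.c. The return value is kept in an `int` so that a data byte of 0xFF stays distinguishable from `EOF`; with a `char`, the loop could either stop early on such a byte or never terminate, depending on whether `char` is signed.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from unxz.c: counts the bytes left in a stream. */
long count_bytes(FILE* f)
{
    long n = 0;
    int c; /* int, not char: fgetc returns an unsigned char value or EOF */

    while ((c = fgetc(f)) != EOF) {
        n = n + 1;
    }
    return n;
}
```

A stream that contains a 0xFF byte is counted in full, because 0xFF arrives as the int value 255 rather than as EOF.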
<Googulator>actually, I've gotten further
<Googulator>as in, a lot further
<oriansj>good
<Googulator>*it actually works*
<oriansj>oooh
<Googulator>I just need to clean it up
<oriansj>ok
<oriansj>oh and in LzmaDec_InitStateReal, the global->reps[0] =global->reps[1] =global->reps[2] =global->reps[3] =1; is undefined behavior in M2-Planet; splitting it up into global->reps[0] = 1; global->reps[1] = 1; global->reps[2] = 1; global->reps[3] = 1; should be the correct equivalent
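A small sketch of the split-up form oriansj describes. The struct `decoder_state` and the function `init_reps` are hypothetical stand-ins, not the real unxz.c definitions. The chained assignment is well defined in standard C, so the split only matters for compilers such as M2-Planet that, per the message above, do not handle it reliably.

```c
#include <stdint.h>

/* Hypothetical stand-in for the decoder state; only reps[] matters here. */
struct decoder_state {
    uint32_t reps[4];
};

/* One plain assignment per statement instead of a = b = c = d = 1; */
void init_reps(struct decoder_state* global)
{
    global->reps[0] = 1;
    global->reps[1] = 1;
    global->reps[2] = 1;
    global->reps[3] = 1;
}
```

Both forms leave all four entries set to 1 under GCC, so the split version can still be used for comparison testing.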
<Googulator>Seems like it still has some issues: small xz files extract fine, but linux-4.9.10.tar.xz is truncated
<Googulator>only the first 0xC001000 of output is written
<Googulator>this issue also happens with the gcc-compiled version
<oriansj>well let's get small file unxz.c committed and then later we can work out the exact bugs
<Googulator> https://github.com/oriansj/mescc-tools-extra/pull/19
<Googulator>linux-4.9.10 extracts successfully after another small modification :)
<Googulator>I corrected what I presumed to be a typo in the copyright header
<Googulator>2021 doesn't seem realistic
<oriansj>merged
<oriansj>you probably want to use: https://paste.debian.net/1306550/ instead of the constant 4 (which would cause the wrong behavior in gcc and make comparison testing harder)
<Googulator>the "p" pointer was used as a workaround for that
<Googulator>it's a uint8_t *, so it behaves the same in m2 and gcc
<Googulator>also, "wrap" currently doesn't build with m2-planet
<Googulator>"Unknown type mode_t"
<oriansj>it should now
<oriansj>(with the updated M2libc)
<Googulator>228MiB init.img for bare metal ;)
<ekaitz>guys, my talk is published: https://fosdem.org/2024/schedule/event/fosdem-2024-1755-risc-v-bootstrapping-in-guix-and-live-bootstrap/
<ekaitz>also stikonas some day you gotta send me an audio pronouncing your name
<ekaitz>hehe
<ekaitz>i was kind of afraid to butcher the surname
<fossy>ekaitz: nice talk, good work!!
<ekaitz>thank you fossy !
<oriansj>Googulator: nice
<Guest25>Hi all
<Guest25>it seems we successfully built a 32-bit musl rootfs with live-bootstrap
<Guest25>but it only works with python script called "rootfs.py"
<Guest25>is it working for amd64?
<Guest25>it reaches until tcc-mes /usr/lib/mes/crt1.o lib/linux/x86_64-mes-gcc/crt1.c and does nothing? Will it continue?
<janneke>Guest25: 64bit is not supported yet
<Guest25>ok, is there any method to build with glibc or we have to build with musl?
<Guest25>also is it possible to build without using python?
<mid-kid>You can build glibc from the finished live-bootstrap system but it's not covered by the scripts
<mid-kid>Similarly, you can bootstrap an amd64 system by building a cross-compiler in a finished live-bootstrap system and then building the compiler for amd64
<mid-kid>But none of this is directly supported by the project - linux from scratch's chapter 5,6 and 7 might help.
<Guest25>ok, thank you. We have a hackathon on February 9 where we try to bootstrap a C compiler, how can we help implement bootstrapping of a 64-bit compiler? Are the problems known?
<mid-kid>I don't remember exactly, it's been some 4ish years since I tried
<mid-kid>and the live-bootstrap has changed a lot
<Guest25>is there any other option than live-bootstrap, because it seems too messy?
<mid-kid>what I do know is that I started with linux from scratch's instructions and that got me going
<mid-kid>Guest25: live-bootstrap is currently the only project that achieves a full bootstrap from a single 512 byte binary.
<mid-kid>Anything else relies on more bootstrap binaries.
<mid-kid>It depends on what you're trying to do really.
<mid-kid>If you just want to build a new linux system/toolchain from source, linux from scratch is that.
<mid-kid>but you need a working, modern C compiler to follow its instructions
<Guest25>i want to have a really trusted environment where I can build a linux from scratch
<Guest25>this is why stage0 is needed for me
<mid-kid>then keep hacking on it I guess
<Guest25>also can someone explain why 64 bit is not supported yet?
<janneke>Guest25: 64bit is not a priority, in fact it's completely optional; like mid-kid says in the cross-compile bootstrap phase we compile from 32bit to 64bit gcc
<Googulator>unxz is looking good, but also bad
<Googulator>tested on bare metal with binutils-2.30 and linux-4.9.10 included as tar.xz
<Googulator>binutils worked fine, but Linux went OOM when trying to extract in Fiwix
<Googulator>even though both are lzma2:26
<ismael>hi
<oriansj>ismael: hello
<oriansj>Googulator: so it is using too much ram?
<Googulator>It seems so.
<Googulator>If it were lzma2:37, I would understand
<ismael>so... where's the gcc 4.7 fork?
<Googulator>but lzma2:26 shouldn't run OOME
<Googulator>*OOM
<Googulator>and it seems to allocate more and more for larger files
<Googulator>peak allocation for linux-4.9.10 is 960MiB
<Googulator>for binutils it's 288MiB
<Googulator>both use the same compression mode (dictionary size), so the allocation should be amortized constant, not growing linearly with size
<Googulator>gcc-built version uses about half the memory, but still scaling linearly with file size
<Googulator>oriansj: m2libc's memmove isn't supposed to allocate memory, right?
<oriansj>Googulator: it doesn't allocate anything
<oriansj>perhaps a defect in unxz.c doing calloc repeatedly somehow?
<Googulator>is there a good tool for graphically viewing core dump files?
<Googulator>oriansj: https://github.com/oriansj/mescc-tools-extra/pull/20
<Googulator>GCC version now shows identical memory consumption between binutils and linux, M2 one shows _some_ growth in Linux, but well within what one could call "amortized constant" behavior
<oriansj>meged
<oriansj>^meged^Merged^
<Googulator>thanks
<Googulator>oriansj: keeping ".lzma" support was the right call
<Googulator>coreutils-6.10 is available only as gz or lzma
<Googulator>& there's a huge saving in the lzma version
<Googulator>Locally updated all pre-network sources to xz / lzma wherever available (plus a few missed gz -> bz2 options), bare metal image dropped from 283MiB to 210MiB
<Googulator>unfortunately I still got OOM when decompressing the Linux kernel sources - now trying an alternative "pipeless" way where unxz outputs to a file, then it quits, and finally tar reads that file
<Googulator>as opposed to piping stdin/out
<stikonas>well, Guest25 is already logged out but in case they read the logs, we are working on 64-bit bootstrap, at least I am trying to get it working
<stikonas>and we aren't that far... tcc-mes already builds, but is a bit buggy... Once that is sorted and tcc-mes runs, it should be not too bad to finish the bootstrap
<stikonas>unlike riscv, x86_64 was fairly well supported by those versions of "old" software that we are building
<Googulator>Does riscv have multiple pointer formats?
<Googulator>as in, reading 32 bits from address 0x1000 accesses a different physical memory cell than reading 8 bits from 0x1000
<Googulator>AKA "word addressing"
<stikonas>no idea... it does have instructions to read 8 bits or 32 bits...
<stikonas>but why would it be a different physical memory cell?
<Googulator>some RISC architectures do that
<stikonas>according to wikipedia, "Almost all modern computer architectures use byte addressing, and word addressing is largely only of historical interest"
<Googulator>0x1000 is the 4096th byte of memory when used for 8-bit addresses, but the 4096th word when used for 32 bit
<Googulator>so riscv64 doesn't do that then?
<stikonas>no, nothing like that
<Googulator>good
<Googulator>because it would complicate the unxz work
<stikonas>it's byte addressed
<stikonas>it's fairly sane at assembly level
<stikonas>only instruction encoding is tricky...
<Googulator>Is it worse than Itanium?
<stikonas>I don't know itanium, but probably
<stikonas>i.e. on x86 constants are just little endian
<Googulator>Itanium's instruction encoding was infamously brain-dead
<stikonas>on riscv constants are using one of the 4 messy encodings with bits shuffled all around the place
<Googulator>"EPIC" encoding... basically VLIW on Krokodil
<stikonas>Googulator: e.g. see here https://i.stack.imgur.com/MUKIE.png
<stikonas>let's say branching instruction
<stikonas>first you have bit 12 of the number, then bits 10 to 5
<Googulator>oh, so it's just immediates that are all weird
<stikonas>then some opcodes/registers
<stikonas>and then bits 1-4, and finally bit 11
<Googulator>that's pretty much normal fare for RISC archs
<stikonas>yeah, but it makes it hard to write hex0 code
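The bit shuffling stikonas describes for branch instructions can be sketched as follows. The function names `riscv_btype_imm` and `riscv_beq` are hypothetical and are not taken from any project mentioned in this log. The layout follows the RISC-V B-type format: offset bit 12 goes to instruction bit 31, offset bits 10 to 5 go to bits 30 to 25, offset bits 4 to 1 go to bits 11 to 8, and offset bit 11 goes to bit 7.

```c
#include <stdint.h>

/* Hypothetical helper: scatter a branch offset into the B-type immediate fields. */
uint32_t riscv_btype_imm(int32_t offset)
{
    uint32_t imm = (uint32_t)offset;
    uint32_t bits = 0;

    bits |= ((imm >> 12) & 0x1)  << 31; /* offset bit 12      -> bit 31      */
    bits |= ((imm >> 5)  & 0x3F) << 25; /* offset bits 10..5  -> bits 30..25 */
    bits |= ((imm >> 1)  & 0xF)  << 8;  /* offset bits 4..1   -> bits 11..8  */
    bits |= ((imm >> 11) & 0x1)  << 7;  /* offset bit 11      -> bit 7       */
    return bits;
}

/* Hypothetical helper: assemble BEQ rs1, rs2, offset (opcode 0x63, funct3 0). */
uint32_t riscv_beq(unsigned rs1, unsigned rs2, int32_t offset)
{
    return riscv_btype_imm(offset)
         | ((uint32_t)(rs2 & 0x1F) << 20)
         | ((uint32_t)(rs1 & 0x1F) << 15)
         | 0x63;
}
```

Offset bit 0 is never stored because branch targets are always a multiple of 2 bytes. Doing this scatter by hand for every branch is what makes hex0 code tedious to write.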
<Googulator>I thought in-memory values were also all weird
<Googulator>at least instructions themselves are not weirdly intertwined with each other, like on Itanium
<Googulator>AFAIK the goal there was to have the CPU execute an entire "block" of 3 intertwined-encoded instructions all at once
<stikonas>yeah, that's horrible
<stikonas>wasn't it too hard for compilers to optimiza?
<stikonas>optimize?
<stikonas>which is why Itanium more or less died...
<Googulator>that was pretty much what killed it indeed
<Googulator>your normal, "development" compiler would just encode each instruction into its own block with 2 NOPs, so each block would really be a single instruction
<Googulator>then, when you were done & ready to build a fully optimized version, you would send it to a mainframe-sized Itanium that would spend a helluva lot of computation on generating optimized code
<Googulator>in reality, the "fully optimizing" compilers never really got written
<Googulator>& in retrospect, that compiler would basically have to have been an AI
<oriansj>Googulator: I am assuming that you are using TCC to compile as M2-Mesoplanet isn't building unxz.c yet.
<Googulator>No, I added unxz.c to the mescc-tools-extra build scripts
<Googulator>& then building the stage0-posix way
<Googulator>it does build using mesoplanet
<oriansj>yes, it builds but the resulting binary when built is not passing my test files which I created when preparing it.
<Googulator>hmm
<Googulator>what test files are you using?
<Googulator>btw, sent another pr: https://github.com/oriansj/mescc-tools-extra/pull/21 (still testing this one)
<fossy>would have liked to know what Guest25 considers messy about live-bootstrap. have been focusing on making it less "messy" in the past few months, and i thought that the process was fairly comprehensible by now...
<oriansj>I made a couple of files via tar cavf and unxz unpacks them perfectly when I build via gcc/clang/tcc but the M2-Mesoplanet builds have not been successful thus far
<Googulator>I have my stage0-posix directory set up like this: https://gist.github.com/Googulator/d2970b95f953376504ebfd509f6bef21
<Googulator>& it seems to work
<Googulator>oops, there's actually a regression in that last PR
<Googulator>broke .lzma support
<oriansj>well it appears we are on the same commits
<Googulator>and you're testing on x86, right?
<Googulator>pushed the regression fix to https://github.com/oriansj/mescc-tools-extra/pull/21
<Googulator>oriansj: are you getting a sha256 of 933731db23ec9adebade815554252dcd68281f5de4f71c0617ccace1ec09a0c7 for unxz?
<Googulator>that's the version I'm testing in live-bootstrap rn
<oriansj>when I do a x86 build I get 29c3eface498ee9a5078dc17101dbfd7ab3f5a96d5c938c76187507fc72e68e5 which works and ecbf9115cac3451afb42756f89b57253a11887819ff5ce8d5c84e0facd8716c0 for my AMD64 build which segfaults on the same file
<Googulator>I didn't test on AMD64 so it could still be broken there
<Googulator>in fact, I'm pretty sure I know what's breaking there
<Googulator>> dicl[diclPos] = (0xFF & symbol) | (0xFFFFFF00 & dicl[diclPos]);
<Googulator>this pattern
<Googulator>on 64-bit archs, it needs to be dicl[diclPos] = (0xFF & symbol) | (0xFFFFFFFFFFFFFF00 & dicl[diclPos]);
<oriansj>you need to do ~0xFF
<Googulator>does ~0xFF work in M2-Planet?
<oriansj>it should
<oriansj>yeah, it loads the immediate 0xFF and then does NOT R0 R0; so yeah, it'll flip all bits to 1 except the bottom 8 which will be zero
<oriansj>do (~0xFF) if you are concerned about ordering
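The difference between the two masks can be shown with a short sketch in standard C. The function names are hypothetical, and `uint64_t` stands in for a 64-bit dictionary word. The literal 0xFFFFFF00 is only 32 bits wide, so on a 64-bit word it also clears the upper 32 bits, while (~0xFF) widens to the full word and clears only the low byte.

```c
#include <stdint.h>

/* Hypothetical: the x86-only pattern; on a 64-bit word it loses bits 32..63. */
uint64_t set_low_byte_fixed_mask(uint64_t word, int symbol)
{
    return (0xFF & symbol) | (0xFFFFFF00 & word);
}

/* Hypothetical: the width-independent pattern suggested in the chat. */
uint64_t set_low_byte(uint64_t word, int symbol)
{
    /* ~0xFF is the int -256, which converts to 0xFFFFFFFFFFFFFF00 here. */
    return (0xFF & symbol) | ((~0xFF) & word);
}
```

For the word 0x1122334455667788 and the symbol 0xAB, the fixed mask yields 0x00000000556677AB, whereas (~0xFF) yields 0x11223344556677AB.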
<Googulator>seems to not break x86 at least
<Googulator>unfortunately AMD64 doesn't even build
<Googulator>fails in m2libc
<oriansj>odd
<Googulator>actually it's failing in hex2-linker-1
<Googulator>./M2libc/amd64/linux/unistd.c:186:ERROR in create_struct
<Googulator> Missing {
<oriansj>(unless you did (0xFFFFFFFFFFFFFF00 which super doesn't work))
<Googulator>I did (~0xFF)
<Googulator>but it doesn't even reach building unxz
<oriansj>are you seeing reading file: <sys/utsname.h> prior to unistd.c?
<oriansj>as the stage0-posix-amd64 should be updated to use the new M2libc file
<Googulator>that was it
<oriansj>yeah, any struct name not recognized is considered a struct definition in M2-Planet
<oriansj>which is why struct foo {..} is the supported pattern not struct {...} foo;
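A minimal sketch of the supported pattern, using a hypothetical `struct point` that is unrelated to M2libc. The tag is defined with `struct point { ... };` before any use. According to the explanation above, M2-Planet would treat a use of an undefined tag as the start of a definition, which is how a missing header can surface as "Missing {".

```c
/* Hypothetical example type: the tag is defined first, as struct foo {..}; */
struct point {
    int x;
    int y;
};

/* Later uses refer to the already-defined tag. */
int point_sum(struct point* p)
{
    return p->x + p->y;
}
```

This form is also valid standard C, so the same source builds with GCC for comparison.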
<Googulator>OK, with (~0xFF), no segfault, but wrong output
<oriansj>well the checksum is expected to change with code changes
<oriansj>or do you mean unxz now is not producing correct output?
<Googulator>yes, the output is wrong
<Googulator>in fact, it seems to be outputting an empty file
<Googulator>on amd64
<oriansj>well at least we found it early
<Googulator>on x86, it works
<oriansj>and the assembly should be very close