IRC channel logs

2021-03-22.log

<stikonas>fossy: unless there are review comments to address, I think https://github.com/fosslinux/live-bootstrap/pull/75 is good now
<stikonas>CI passes after I removed that earlier bash
<pder>stikonas: I think the bash-3.2.57 directory can be removed as part of your gcc branch
<stikonas>oh indeed...
<stikonas>I forgot to clean it up
<stikonas>pder: thanks PR updated
<stikonas>in case you want to build more stuff, and for the avoidance of clashes, I'll do libtool next
***stikonas_ is now known as stikonas
<stikonas>argh, now bash failed in CI again...
<stikonas>maybe something is not fully reproducible...
<fossy>stikonas: it could be the seed kernel differing. What seed kernel are you using
<stikonas>fossy: self-compiled kernel (I'm on Gentoo)
<fossy>I can try Ubuntu kernel tonight
<fossy>mkay
<stikonas>I was thinking maybe the build process is a bit nondeterministic
<stikonas>and sometimes you get one hash, sometimes another
<stikonas>(with bash 5.1)
<fossy>of bash?
<stikonas>yes
<stikonas>bash 5.1 in particular
<stikonas>not e.g. that 3.2.57...
<stikonas>we also had some minor issues like that with autotools, that's why I added MAKEINFO=true everywhere
<fossy>possible, what would be the difference though -- I got the same hash as you when I compiled bash from master
<stikonas>well, maybe it's not too nondeterministic and there are e.g. two outcomes only
<stikonas>so not unlikely that you would also get the same
<stikonas>so last CI build passed
<stikonas>and the only difference is that I removed unused directory
<fossy>possible
<stikonas>I can try to run it a few more times, although not today...
<fossy>I'll run a few times today and see
<stikonas>one possible way to work around such a problem, if we can't fix it, is to retry failed steps
<stikonas>might be reasonably easy to do that with that build function
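The retry idea above could look roughly like the sketch below. The `retry` helper and the `build_step` name in the usage comment are hypothetical and are not taken from live-bootstrap's build function. Note that retrying only hides a nondeterministic step; it does not explain it.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, stopping at the first success.
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        n=$((n + 1))
    done
    return 1
}

# Usage (build_step is a placeholder, not a real live-bootstrap function):
# retry 3 build_step bash-5.1
```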
<pabs3>bauen1: your code review thoughts sound like crev from the Rust community https://github.com/crev-dev/crev https://github.com/crev-dev/cargo-crev
<stikonas>or maybe we need to use older bash version that builds better
<fossy>even 4.x would be great
<stikonas>yeah, 4.x gets us most
<stikonas>don't think there is much in 5 that we need
<stikonas>it has some extra variables defined and more precise timestamps
<stikonas>might be easier to just run single bash build from bootstrapped system for testing
<stikonas>rather than rerun the whole 30 minute thing...
<stikonas>that might help us to determine which hashes are possible and how frequently
<fossy>yeah that was my plan
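One way to see which hashes are possible and how often each occurs is to repeat just the bash build and tally the resulting checksums. This is only a sketch: `tally_hashes` is a hypothetical helper, and the build command passed to it is whatever rebuilds the binary in your setup.

```shell
#!/bin/sh
# Hypothetical helper: run a build command N times and tally the SHA-256 of one output file.
# $1 = number of runs, $2 = file to hash, remaining arguments = build command.
tally_hashes() {
    runs=$1
    target=$2
    shift 2
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Run the build quietly; report a failed run but keep going.
        "$@" >/dev/null 2>&1 || echo "run $i: build command failed" >&2
        # Print only the hash column so identical outputs collapse in uniq.
        sha256sum "$target" | cut -d ' ' -f 1
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}

# Usage (rebuild_bash is a placeholder for the single-package build):
# tally_hashes 10 /after/bin/bash rebuild_bash
```

Each output line is a count followed by a hash, most frequent first.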
<pder>I'll see if I can reproduce the bash checksum fail. Maybe we could do a binary diff and get an idea of what's different.
<stikonas>hmm, 10 short builds and same checksum..
<stikonas>maybe you need full build...
<stikonas>(short meaning just bash)
<pder>do you have a copy of the bash binary that failed?
<stikonas>no, it failed on CI
<stikonas>well, probably worth running a few long builds
<pder>interesting, so through qemu I got a failure with bash checksum
<pder>maybe something to do with autodetection of host in bash configure?
<fossy>lol i enjoy we can say "bison being too new"
<fossy>now
<fossy>stikonas[m]: just to check
<fossy> https://github.com/fosslinux/live-bootstrap/pull/75/files#diff-8afdb3952fba828962f274c6dd9c3fc0e0054c899638bc9e2524c8ec4b6e9a69R22
<fossy>here
<fossy>this is cause Makefile.in is handwritten yeah (it is according to source) so we don't need automake
<fossy>NEAT
<fossy>/after/kexec-tools-2.0.21/build/kexec-tools-2.0.21 # kexec/kexec -l /kernel
<fossy>36/after/kexec-tools-2.0.21/build/kexec-tools-2.0.21 # echo $?
<fossy>0
<fossy>now just to get -e to work
<fossy>which might be harder
<fossy>well my reboot() impl for mes doesn't work :|
*fossy can taste the finish line of this
<fossy>i had the wrong syscall number :P
<bauen1>pabs3: yes, that looks very similar, but i image that for the bootstrap you might not worry about signatures and just use hashes since you're supposed to review all code anyway (not saying you can't use signatures to distribute the code review)
<bauen1>*imagine
<fossy>hm
<fossy>well it now -e's
<fossy>but then qemu just.. exists?
<fossy>exits*?
<pder>fossy, stikonas: do you know of an easy way to obtain a copy of a file from a qemu run of live-bootstrap?
<stikonas>pder: probably mount some block device into it...
<stikonas>but yes, I also got different hash there...
<stikonas>in qemu
<stikonas>what hash have you got by the way?
<stikonas>(mounting probably has to be done on qemu startup)
<pder>I don't know, since the kernel just panics
<stikonas>oh, I just added || true on that bash build line
<stikonas>and then got cf0d590ac6187d3c5b9da63dd2f67296e1c1bed564e7cb3dbcd8075d984d98c6 /after/bin/bash
<pder>how are you computing the hash from qemu?
<stikonas>from interactive shell
<stikonas>sha256sum /after/bin/bash
<stikonas>if you add || true to bash build, then it will consider it successful anyway
<stikonas>so you'll get to shell at the end
<stikonas>but I only ran it once, so I don't know if it's always the same hash
<pder>I'll try that and add a hard drive to qemu so I can mount it and copy it there
<stikonas>maybe there is a difference between qemu and chroot only
<pder>I think we ran into that with bash-2.05
<pder>Had to specify host
<stikonas>I'm on 64 bit kernel though even in qemu
<pder>same here
<pder>I'll see what hash I get with qemu
<pder>I would also like to diff the two binaries
<stikonas>pder: oh, mounting from inside qemu will be non-trivial
<stikonas>we don't have mount utility yet
<stikonas>basically one has to write a small C program to mount things
<stikonas>as per "man 2 mount"
<pder>I was thinking of just copying busybox into our starting image and having qemu include an image
<pder>Then I can use busybox for mkfs, mount, etc
<stikonas>well, that would work too
<pder>netcat might be a useful utility to have in our bootstrap even if it isn't necessary
<pder>but I guess we have no way of configuring the network yet either
<pder>I noticed bootstrap.log is missing everything that is printed on stderr which includes compiler warnings
<pder>Probably should change line 280 to sudo PATH="/after/bin:${PATH}" chroot . /init 2>&1 | tee "$LOGFILE"
<pder>of rootfs.sh
<stikonas>yeah...
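The effect of adding `2>&1` before the pipe can be checked in isolation. In this sketch `noisy_step` is a stand-in for the chroot invocation and the two log files are temporary; neither is part of rootfs.sh.

```shell
#!/bin/sh
# noisy_step is a stand-in for a build step; it writes to both streams.
noisy_step() {
    echo "compiling"
    echo "warning: unused variable" >&2
}

log_stdout_only=$(mktemp)
log_both=$(mktemp)

# Without 2>&1, stderr bypasses the pipe, so tee never sees the warning.
noisy_step | tee "$log_stdout_only" >/dev/null

# With 2>&1, stderr is merged into stdout before the pipe, so tee logs both.
noisy_step 2>&1 | tee "$log_both" >/dev/null
```

The redirection has to come before the `|`, because it applies to the command on the left side of the pipe.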
<stikonas>and 2nd qemu run here resulted in the same checksum
<pder>hmm, my qemu run just now gave me the same checksum as in checksums file
<stikonas>could be one of the two...
<stikonas>independent of qemu or chroot
<stikonas>just some randomness injected somewhere which results in two choices
<pder>now it's just a matter of getting a copy of that binary to inspect the differences
<stikonas>probably some timestamp...
<stikonas>binaries have the same size
<pder>not sure, it seems like if it were a timestamp we would never get the same hash
<stikonas>hmm, indeed...
<pder>I'm hoping for another failure with qemu. I have copied busybox to the initrd and added -hda myimage.img to qemu
<pder>I got the failure, now just how do I copy this bash binary to the host...
<pder>stikonas: same checksum that you had above
<stikonas>so can you mount external drive using busybox?
<pder>I suppose I need to create /dev device manually?
<stikonas>I suppose...
<pder>I passed -hda myimage.img to qemu
<stikonas>with mknod, we have that already from coreutils
<pder>I don't see anything obvious when I run busybox dmesg
<pder>Maybe my kernel doesn't have support, or it's only as a module?
<stikonas>hmm
<stikonas>the other option is to try to get the other checksum from rootfs chroot
<stikonas>because it doesn't look like qemu vs chroot is the only thing that determines it
<pder>so far, I have only seen this checksum fail with qemu. Have you seen it running rootfs.sh chroot?
<stikonas>not sure... I remember adjusting checksum once in chroot... But I don't know if it's this issue or something else
<pder>stikonas: so I think I have a technique to copy /after/bin/bash out of qemu. I run qemu within tmux with a large scrollback buffer. Then I can run busybox uuencode /after/bin/bash - to write to stdout. Then ^b : capture-pane -S 60000 and ^b save-buffer bash.txt
<stikonas>oh ok
<pder>should be capture-pane -S -60000
<stikonas>well, if you can try to find diff between two binaries, that would be interesting to see
<pder>that's what I'm hoping. I already have a copy of bash with the original checksum
<pder>such a strange issue
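Once both binaries are on the host and have the same size, a byte-level comparison is a quick first step before disassembling anything. A minimal sketch, assuming the two copies are ordinary files; `diff_bytes` is a hypothetical wrapper around the standard `cmp -l`.

```shell
#!/bin/sh
# Hypothetical helper: for each differing byte, print its 1-based offset (decimal)
# and the byte value in each file (octal), as produced by cmp -l.
diff_bytes() {
    # cmp exits non-zero when the files differ; that is expected here.
    cmp -l "$1" "$2" || true
}

# Usage (file names are placeholders):
# diff_bytes bash.pass bash.fail
# diff_bytes bash.pass bash.fail | wc -l    # number of differing bytes
```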
<lfam>Hey, I have an off-hand question. Does anyone know if the Guix bootstrap, as it exists currently, is i686? That is, 32-bit Intel-compatible architecture?
<pder>stikonas: https://paste.debian.net/1190539/
<pder>not sure what to make of it. Maybe I need to compare each object file with checksum pass build vs checksum fail build
<stikonas>lfam: yes
<lfam>Thanks stikonas
<stikonas>pder: so just 1 bit...
<stikonas>probably some flag...
<stikonas>lfam: live-bootstrap is also x86 for now
<lfam>By x86, you mean 32-bit Intel?
<stikonas>well, yes
<stikonas>or 32-bit AMD...
<stikonas>but 64 bit can run 32-bit code
<stikonas>so 32-bit bootstrap is suitable for 64-bit machines too
<pder>This might be a better clue: https://paste.debian.net/1190540/
<lfam>Intel owns that IP :)
<stikonas>hmm, I wonder which bash we want...
<lfam>Yes, I know that x86_64 can transparently emulate the old 32-bit mode
<lfam>Thanks for your help!
<stikonas>with 0x200 or 0x10000
<stikonas>well, not much help... just a single question...
<pder>redir.c seems to be the file and the function is do_redirection_internal
<stikonas>pder: maybe worth comparing that function with old bash
<stikonas>although, didn't we get mismatch even when building with tcc...
<stikonas>I thought it failed on first pass, or was it 2nd...
<pder>This is the first issue I've seen like this and this is with gcc
<pder>I was able to capture a tar.gz of the bash build directory and saved it on my machine
<pder>I am wondering if there are any other files that differ in the build directory
<stikonas>probably .o files
<stikonas>from redir.c->redir.o
<pder>stikonas: builtins/pipesize.h differs between the builds. In the checksum passing version PIPESIZE is defined as 65536 and in the checksum failing version it is defined as 512
<pder>now we just need to figure out where that is set
<pder>65536 = 0x10000 and 512 = 0x200 like we saw in the assembly diffs
<stikonas>ok, that looks promising
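The two steps used here, finding which files differ between a passing and a failing build tree and matching a decimal constant against the hex immediates in a disassembly, can be sketched as follows. Both helper names are hypothetical, and the directory names in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: list files that differ between two build trees.
differing_files() {
    # diff exits non-zero when differences exist; that is expected here.
    diff -rq "$1" "$2" || true
}

# Hypothetical helper: print a decimal value as a hexadecimal constant.
to_hex() {
    printf '0x%x\n' "$1"
}

to_hex 65536   # prints 0x10000
to_hex 512     # prints 0x200

# Usage (directory names are placeholders):
# differing_files bash-build.pass bash-build.fail
```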
<stikonas>not sure why pipesize is different...
<stikonas>isn't there some mkbuiltins script
<stikonas>oh, it's mkbuiltins.c
<stikonas>no, that's actually written by psize.sh
<stikonas>looks like race condition
<stikonas>there is sleep 3 there
<stikonas> https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/psize.sh#n37
<pder>maybe having mktemp will help?
<stikonas>it definitely might
<stikonas>is that coreutils?
<stikonas>maybe then need to rebuild coreutils before bash
<stikonas>it's not yet in coreutils 6.3
<stikonas>need something newer
<stikonas>I think we need 7.1
<pder>interesting
<stikonas>it might be good to get newer coreutils anyway
<stikonas>and this time it will be all of them...
<stikonas>not just those we manually picked
<pder>that sounds good. hopefully autotools is not an issue
<stikonas>it is
<stikonas>autoconf is fine, but it needs automake 2.10
<stikonas>which I can build but it is borken
<stikonas>broken (I think due to perl)
<pder>maybe perl needs a rebuild with gcc?
<stikonas>well, maybe can patch it to require older automake
<stikonas>I don't think that's enough
<stikonas>but it might be easy to fix automake requirement
<stikonas>it might be due to dist-xz
<stikonas>in configure.ac
<pder>could we do roughly this order? gcc pass1, gcc pass2, coreutils, bash?
<stikonas>yeah, I think we can
<stikonas>well, I need to fix remaining comments in gcc PR...
<stikonas>and then add coreutils (hopefully not too hard to patch automake)
<stikonas>although, longer term, perl needs to be sorted out
<stikonas>but hopefully, that will be easier with GCC
<pder>I also wondered with perl if we can look at metaconfig which is supposed to generate a Configure script
<pder>another idea is to use your build script that you created for sha256sum and coreutils 6 and update it to use coreutils 7 plus add mktemp
<stikonas>well, I think patching out automake stuff will be easier
<stikonas>but yeah, looking at metaconfig might be possible now
<stikonas>and if not, we can still try to manually build perl 5.8 instead
<stikonas>maybe it will work better
<stikonas>fossy: what is generated in intl/?
<stikonas>I can't find anything there
<stikonas>gcc/po indeed ships some prebuilt mo files