IRC channel logs

2020-10-15.log

<OriansJ>rekado: as in it never gets past python-minimal@3.7.4
<OriansJ>which appears to be required for guix pull to complete successfully
<OriansJ>I posted about it on #guix with a paste.debian link
<OriansJ>xentrac: yes, that is currently the only process known to enable an encrypted /boot on debian
<xentrac>:(
<OriansJ>xentrac: I provide exact steps, so that others can help me fix problems
<OriansJ>It is impossible for me to know how to fix everything myself but it is always possible for me to share exactly the method used which resulted in me arriving at that state, so that others who know differently can fix what they previously didn't know about.
<OriansJ>The big problem is it can take more than 48 hours just to do guix pull and then see it fail
<OriansJ>and it appears that guix developers generally depend upon substitutes to a degree that they never notice these build issues; until after I report them "like a whiny bitch" (direct quote from IRC)
<OriansJ>What is the point of guix challenge if one can't even get the source code needed to challenge the binaries being shipped?
<OriansJ>after 36 hours of dedicated VPS time, I've only managed to get 17.8GB of 85GB of source tarballs
<OriansJ>although it appears that I am only 30% done with all of the links; so we might get to 60GB if the average holds, or about a 71% success rate for valid links (and then we can check the validity of the checksums when this is all done)
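The projection above can be reproduced with a few lines of shell. This is a minimal sketch using only the figures quoted in the log (17.8GB fetched, roughly 30% of links processed, 85GB expected); the function name `project` is made up for illustration.

```sh
#!/bin/sh
# Project the final download size from the progress so far.
# $1 = GB fetched, $2 = fraction of links processed, $3 = total GB expected
project() {
    awk -v got="$1" -v frac="$2" -v total="$3" 'BEGIN {
        projected = got / frac
        printf "projected %.1f GB, %.0f%% of %.0f GB\n", projected, 100 * projected / total, total
    }'
}

project 17.8 0.30 85
```

With the unrounded 59.3GB the ratio comes out at about 70%; rounding up to 60GB first gives the 71% mentioned above.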
<OriansJ>[These server timeouts are really slowing wget down]
<OriansJ>I'm just going to add --timeout=30 to cut that down
<rekado_>OriansJ: I cannot find the direct quote in the IRC logs
<rekado_>upstream source tarballs do disappear, and that is why we cache them all on ci.guix.gnu.org
<rekado_>that’s part of the substitution mechanism
<rekado_>that sources.json thing is pretty new and I’ve never used it. It’s likely buggy.
<OriansJ>rekado_: I am glad that all of the upstream tarballs are being cached at ci.guix.gnu.org but the substitution mechanism doesn't work if one doesn't have binary substitutes enabled.
<civodul>OriansJ: hey! did you see https://issues.guix.gnu.org/28659 ?
<OriansJ>civodul: yes and I noticed you labeled it important on 2 Oct 2017 17:16
<civodul>yes, and it's been important ever since!
<civodul>but you know, just because you rehash things won't lead to a quicker fix
<civodul>it's one issue among many that volunteers have to deal with
<OriansJ>civodul: very true and I am thankful for everyone's efforts
<OriansJ>although I wonder if there is a preference to working on "sexy new features" over the boring work of clearing out bugs
<civodul>surely there is, especially as volunteers, but we do clear old boring bugs too
<OriansJ>indeed
<OriansJ>although I find the privacy discussion odd; as one can leverage torsocks for the downloading of the source tarballs from a single centralized source while preserving privacy
<civodul>yeah, you'd need to run guix-daemon under torsocks
<OriansJ>civodul: easy to do though
<OriansJ>just a single line in guix-daemon.service, which for people on non-guix systems is trivial. Not sure what shepherd would need though
<civodul>torsocks is LD_PRELOAD, i'm not sure how this is robust to fork and all
<OriansJ>civodul: fair
<OriansJ>One can also put guix into a container and tunnel all the traffic through tor
<OriansJ>or a vm
<OriansJ>And honestly, I find the security implications of not being able to download the source a bigger danger than a centralized server knowing you downloaded a source tarball
<OriansJ>because we have tor and cheap VPS providers who take prepaid credit cards
*civodul nods
<OriansJ>if it makes the work easier, pull from a guix server first and it'll be fine. we can deal with privacy from the server later as a different class of bug
<OriansJ>But this behavior we have now is broken for new users, who just downloaded guix and find that nothing beyond guix --version works
<civodul>right, you made your point
<OriansJ>thank you for your time civodul
<OriansJ>oh shit, I just realized how long I have been working on stage0
<OriansJ>from 2016-05-01 til now;
<OriansJ>it has been 4+ years of hacking on bootstrapping gcc from nothing
<OriansJ>and we are stuck at the stage where we have one lisp interpreter we can't bootstrap, which can run MesCC and bootstrap GCC, and another lisp interpreter we can bootstrap but which can't run MesCC
<rain1>morning
<OriansJ>morning rain1
<rain1>what about making a good short term goal that's within reach?
<rain1>something we could do as a team in a couple weeks
<OriansJ>rain1: well that is a hard one; most things could be solved in a day by someone who knows what they are doing
<rain1>maybe it would be better to have information that would help onboard people then?
<OriansJ>the problem is getting that full day of developer time from a developer who actually is familiar with the code base, as basically everything here is virtually solo work
<OriansJ>rain1: sure
<OriansJ>how about mes-m2; getting started on hacking guide
<OriansJ>I even have an open issue for it: https://github.com/oriansj/mes-m2/issues/3
<OriansJ>would you like commit access rain1 ?
<OriansJ>as long as we keep it buildable by M2-Planet at every commit; the second it can run MesCC; the bootstrap will be done
<OriansJ>as one can see https://github.com/oriansj/mes-m2/blob/master/mes_init.c#L149
<OriansJ>we have a bunch of tested and working scheme primitives
<OriansJ>with a bit more done in scheme: https://github.com/oriansj/mes-m2/blob/master/module/mes/boot-0.scm
<OriansJ>I've tried to match guile's behavior in regards to both input and output
<OriansJ>I know that https://github.com/oriansj/mes-m2/blob/master/mes_macro.c needs work but I am not sure how to do it in a guile compatible way yet
<OriansJ>with export MES_CORE=0 it becomes trivial to use cgdb to walk into the interpreter and see exactly how a scheme primitive works with anything you want
<OriansJ>Thus we know whether every primitive behaves correctly; in theory mes-m2 should be reasonably bug-free (or at least easier to remove bugs from)
<nimaje>why does test/test101/hello.sh have a /bin/bash shebang? (in mes-m2); it seems to work fine with a normal /bin/sh shebang
<OriansJ>same reason as the rest of the tests after test 100; copy and paste was used for the header
<OriansJ>it is something that probably could be changed
<OriansJ>as most of the tests over #100 were janneke's original tests and the lower tests were primitive tests I created trying to ensure basic functionality
<nimaje>well, the makefile only executes up to 101 and as freebsd doesn't have sha256sum I currently try to build something via awk and sha256 that checks the answers
<OriansJ>nimaje: oh, I have a solution for that which you can probably steal
<OriansJ> https://github.com/oriansj/mescc-tools/blob/master/sha256.sh
<OriansJ>ng0 previously did porting to the BSDs for mescc-tools
<OriansJ>and as you can see: https://github.com/oriansj/mescc-tools/blob/master/test/test2/hello.sh it can be integrated quite nicely
<OriansJ>It just requires someone to manually update the tests' hello.sh files
<OriansJ>and the make file of course
<nimaje>that sha256_check function doesn't work on freebsd; sum doesn't take arguments and sha256's -c takes a hash to compare the input file against and sha256sum doesn't exist
<nimaje>ok, awk '{ rc=system("sha256 -c "$1" "$2); if (rc != 0) { exit rc } }' test/test.answers should work
<nimaje>oh, there is shasum, why is it not in the see also section of sha256's man page?!
<nikita`>sounds like an addition to the manpage..
<OriansJ>and if the script doesn't work on freebsd, that would sound like a fix is needed for mescc-tools as well
<nimaje>oh, that shasum tool was installed by some perl package
<OriansJ>as mescc-tools is a requirement for MesCC to work
<nimaje>I wonder which docs the person who wrote sha256_check read to think sha256 -r -c "$1" on freebsd is similar to sha256sum -c "$1" on linux
<nimaje>tests for mes-m2 seem to pass on freebsd, tests for mescc-tools fail with ELF binary type "0" not known.
<OriansJ>nimaje: probably a netbsd user who wrote it for netbsd
<nimaje>the first version checks with uname for freebsd, so that person must have found some docs that made them believe it would work
<OriansJ>nimaje: that would be ng0 who wrote the first version
<OriansJ>and I would be the idiot who possibly broke it later
<OriansJ>So I take any blame for it being wrong
<nimaje>yes, removing the os check for sum makes it more broken on freebsd as sum on freebsd only calculates CRC
<nimaje>but sha256 -r -c "$1" on freebsd checks whether the hash of stdin is $1 instead of using $1 as a list of checksums to check
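The mismatch described above (sha256sum -c reads a list of checksums, while FreeBSD's sha256 -c compares against a single digest) can be sidestepped by computing each digest and comparing the strings in the shell. The following is only a sketch of that idea, not the code in mescc-tools' sha256.sh or in the patch linked later; the function names `sha256_of` and `sha256_check_list` are hypothetical, and it assumes the answers file uses the usual "digest  filename" layout.

```sh
#!/bin/sh
# Print only the SHA-256 hex digest of file $1, using whichever tool exists.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d ' ' -f 1      # GNU coreutils
    elif command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"                        # BSD: -q prints the bare digest
    else
        shasum -a 256 "$1" | cut -d ' ' -f 1  # Perl shasum, if installed
    fi
}

# Check every "digest  filename" line of the answers file $1.
# Returns non-zero at the first mismatch.
sha256_check_list() {
    while read -r want file; do
        [ -n "$want" ] || continue            # skip blank lines
        file=${file#\*}                       # drop sha256sum's binary-mode marker
        got=$(sha256_of "$file")
        if [ "$got" != "$want" ]; then
            echo "FAILED: $file" >&2
            return 1
        fi
    done < "$1"
}
```

Called as `sha256_check_list test/test.answers`, it plays the same role as the awk one-liner above but does not depend on how any particular tool interprets -c.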
<OriansJ>nimaje: can you verify that "get_machine --OS" returns the correct results?
<nimaje>get_machine --OS returns FreeBSD
<OriansJ>good
<OriansJ>so we can use ./bin/get_machine --OS and not have the external dependency on uname to enable separate behaviors for FreeBSD
<OriansJ>and the failing with "ELF binary type "0" not known" is because it attempts to run manually generated Linux Binaries
<OriansJ>if you notice: elf_headers/elf32.hex2
<nimaje>well, linux binary labeled as sysv abi; adding brandelf -t Linux test/results/test1-binary and enabling linux emu results in a segfault, but no idea how to debug linux coredumps on freebsd
<OriansJ>nimaje: well because these are simple programs; readelf -h $file (to get the entry address) and then gdb b* $address; followed by si and ni to step through it
<OriansJ>^gdb^gdb $file then using b* $address to start at the entry point after you type run $args^
<OriansJ>where $file is the name of the binary you wish to debug; $address is the Entry point address of $file and $args are the arguments used to run the program
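For systems where readelf is not at hand, the entry address used in the gdb steps above can also be read straight from the ELF header. This is a rough sketch that assumes a little-endian ELF file (as on x86); the function name `elf_entry` is made up for illustration. It relies on e_entry starting at byte 24 of the header in both the 32-bit and 64-bit layouts.

```sh
#!/bin/sh
# Print the entry point (e_entry) of the little-endian ELF file given as $1.
elf_entry() {
    # e_ident[EI_CLASS] is byte 4: 1 = 32-bit, 2 = 64-bit
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d '[:space:]')
    if [ "$class" = 2 ]; then width=8; else width=4; fi

    # e_entry starts at byte 24; combine its bytes, lowest byte first
    addr=0
    bit=0
    for byte in $(od -An -v -tu1 -j24 -N"$width" "$1"); do
        addr=$(( $addr + ($byte << $bit) ))
        bit=$(( $bit + 8 ))
    done
    printf '0x%x\n' "$addr"
}
```

The printed value is what goes in place of $address after b* in the steps above.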
<OriansJ>we can also disable the tests for the BSDs if you don't want to mess with making the generated binaries work on BSDs; as only the checksums of the output files being exactly correct matters
<OriansJ>The key points of mescc-tools are M1 and hex2 should always output the exact same outputs when given the same inputs with the same arguments
<OriansJ>anything that violates it is a priority #1 bug that I will spin up machines/vms/containers/etc just to try to figure out why and make changes to eliminate it.
<OriansJ>If the generated binaries don't work, it is because I wrote something wrong in them (maybe something linux specific that depends upon undefined behavior); which would prompt potential changes in M2-Planet (to make sure those binaries don't trigger such behavior)
<nimaje>so, currently I disable trying to run the test binaries by setting GET_MACHINE_FLAGS=--OS (yes that's hacky) and try to get sha256.sh working as expected
<OriansJ>might I suggest stealing an idea from M2-Planet?
<OriansJ> https://github.com/oriansj/M2-Planet/blob/master/test/test1000/hello-x86.sh#L64
<OriansJ>the string for linux is "Linux"
<OriansJ>which can be combined with get_machine's --override
<OriansJ>in the manner of (export GET_MACHINE_FLAGS="--override x86") or anything honestly (export GET_MACHINE_FLAGS="--override 'I am the very model of a modern major general'")
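Why any flag that changes get_machine's output ends up skipping the binary run can be shown with a small stand-in. Everything here is an assumption for illustration: `get_machine_stub` is a hypothetical replacement for ./bin/get_machine that only models the --override and --OS flags named above, and `should_run_binary` mimics the kind of guard the tests are described as having, not their exact text.

```sh
#!/bin/sh
# Hypothetical stand-in for ./bin/get_machine; models only two flags.
get_machine_stub() {
    case "$1" in
        --override) echo "$2" ;;   # print whatever string was supplied
        --OS)       uname -s ;;    # operating system name
        *)          uname -m ;;    # machine architecture
    esac
}

# Succeeds only when the reported machine matches the expected string $1.
# GET_MACHINE_FLAGS is deliberately left unquoted so it splits into words.
should_run_binary() {
    [ "$(get_machine_stub ${GET_MACHINE_FLAGS:-})" = "$1" ]
}
```

With GET_MACHINE_FLAGS=--OS the stub prints an operating system name, which never equals an architecture string such as amd64, so the guard fails and the binary is not run.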
<nimaje>yes, tests in mescc-tools also have that if, that's why GET_MACHINE_FLAGS=--OS works
<OriansJ>indeed
<nimaje>ok, checksums seem to match for tests of mescc-tools if run on freebsd, I just had to get sha256.sh working correctly https://0x0.st/iGdn.diff
<nimaje>to get those binaries running e_ident[EI_OSABI] would have to be set to 3, which would change all checksums
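The byte in question is e_ident[EI_OSABI], at offset 7 of the file. A small sketch for inspecting it follows; the function names are invented for illustration, and only three of the defined values are mapped (0 is the System V / unspecified ABI, 3 is Linux, 9 is FreeBSD).

```sh
#!/bin/sh
# Print the numeric e_ident[EI_OSABI] value (byte 7) of ELF file $1.
elf_osabi() {
    od -An -tu1 -j7 -N1 "$1" | tr -d '[:space:]'
}

# Print a readable name for the most relevant OSABI values.
elf_osabi_name() {
    case "$(elf_osabi "$1")" in
        0) echo "System V (unspecified)" ;;
        3) echo "Linux" ;;
        9) echo "FreeBSD" ;;
        *) echo "other" ;;
    esac
}
```

Because this byte is part of the file itself, changing it from 0 to 3 necessarily changes the sha256 of every generated binary, which is the checksum concern raised above.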
<OriansJ>true and if that change doesn't break anything; we can make that change (with testing on all systems where it currently works with 0)
<OriansJ>did you want just nimaje as your committer name or something else?
<nimaje>nimaje is fine
<OriansJ>as you desire
<OriansJ>nimaje: your commit has been incorporated; thank you for the patch
<nimaje>ok and it seems like lldb output suggests that the segfault comes from 0x600078: jmp 0x600106 https://0x0.st/iGnQ.txt
<OriansJ>which if you look at test/test1/hex.M1
<OriansJ>that would be the JMP32 %_start
<OriansJ>so when you use gdb does it not arrive at _start?
<OriansJ>also it would be run < test/test1/hex0.hex0 > test/test1/proof1, as a plain run would treat everything you type as input until you press ctrl-d
<OriansJ>(with readline requiring [ENTER] prior to sending the chars unless you put it into RAW mode)
<nimaje>yes, it seems like it doesn't even arrive at _start for some reason
<OriansJ>which means that hex2 is putting the wrong offset or that _start moved for some reason
<OriansJ>but if the checksum matches, hex2 can't be putting the wrong offset
<OriansJ>So why would _start not be at the right place?
<nimaje>(or freebsd's linux emulation does something odd)
<OriansJ>nimaje: well that is the thing: linux emulation would just be emulating the syscalls, not rewriting the binaries themselves
<OriansJ>perhaps something the FreeBSD devs could help us with?
***stikonas_ is now known as stikonas
<nimaje>that's why I put it in (), I don't believe that linux emulation is at fault, but who knows
<OriansJ>nimaje: well there are only a few places where things can break: 1) the M1 definitions are wrong (in this case DEFINE JMP32 E9 ) 2) hex2 put the wrong displacement (matching checksum indicates unlikely) 3) OS tampering with binary (generally unlikely but something definitely not ruled out) or 4) hardware is not behaving according to specification (Would x86 behave differently for FreeBSD vs Linux? I think unlikely)
<OriansJ>ruling out 1 and 4; only leaves 2 and 3 as possible causes
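Cause 2 can be checked by hand. An x86 near jump with opcode E9 is followed by a 32-bit little-endian displacement measured from the end of the 5-byte instruction. The sketch below computes the expected bytes for the jump lldb reported (at 0x600078, to 0x600106); the function name `jmp32_bytes` is made up for illustration.

```sh
#!/bin/sh
# Print the expected encoding of "jmp rel32" located at address $1
# and jumping to address $2.
jmp32_bytes() {
    # displacement is relative to the next instruction (jmp address + 5)
    disp=$(( $2 - ($1 + 5) ))
    # keep the low 32 bits so backward jumps come out as two's complement
    d=$(( $disp & 0xFFFFFFFF ))
    printf 'E9 %02X %02X %02X %02X\n' \
        $(( $d & 255 )) $(( ($d >> 8) & 255 )) \
        $(( ($d >> 16) & 255 )) $(( ($d >> 24) & 255 ))
}

jmp32_bytes 0x600078 0x600106
```

This prints E9 89 00 00 00, which can be compared against the bytes shown for that address in the disassembly.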
<OriansJ>and if you do: ndisasm -k 0x600000,0x78 -o 0x600000 -b 64 ./test/results/test1-binary
<nimaje>ndisasm gives me https://0x0.st/iG5n.txt should be fine?
<OriansJ>well this is what I got https://paste.debian.net/1167341/ ; which has the same sha256sum as yours does; I am guessing identical (and no diff output)
<OriansJ>so based on the disassembly, it is jumping to the correct address and when stepping the binary what actually causes the exception?
<nimaje>it really segfaults at that first jump