IRC channel logs

2021-10-06.log

<stikonas>oriansj: maybe the easiest solution is to create submodule symlinks in $ARCH/ directories, e.g. x86/mescc-tools -> ../mescc-tools and adjust kaem scripts to assume $ARCH is the root dir
<stikonas>(the problem is that when live-bootstrap starts, we have no means of changing current directory, so $ARCH dir has to be the root directory)
<stikonas>then live-bootstrap can basically just copy $ARCH directory into /, replace symlinks with actual directories and would just work
<oriansj>clemens3: the issue with mes is mes hasn't been updated to be compatible with the latest release of mescc-tools (as it required a breaking change for RISC-V support) If you just use one release previous it will work just fine
<oriansj>stikonas: well mescc-tools-extra uses the full kaem, so we certainly have a great many options to sort it out, including just replacing the mescc-tools-extra kaem script
<stikonas>well, yes, replacing mescc-tools-extra kaem script might work too
<stikonas>I'm just trying to think of something that would require least maintenance in the future
<stikonas>hence that symlink idea
<stikonas>i.e. making $ARCH directories fully self-contained
<stikonas>not just split bin files
<stikonas>but that's just one option...
<oriansj>is there a problem with the root of live-bootstrap just being the stage0-posix tarball extracted and the after.kaem replaced?
<oriansj>because I can trivially change the scripts up to not have to do the cd $ARCH bit
<stikonas>well, yes, we basically need to enter x86 directory and run kaem there
<oriansj>So if I remove the need to enter the x86 directory, problem solved?
<stikonas>I think so
<stikonas>we did quite ugly hack before
<oriansj>I'm game for that change after I finish my isolation work
<oriansj>Then all you would have to do is replace after.kaem
<stikonas> https://github.com/fosslinux/live-bootstrap/blob/master/sysa.py#L61
<oriansj>yeah, I am going to eliminate all of that complexity for you
<stikonas>so we have then kaem files in rootfs and all paths will contain $ARCH in kaem?
<stikonas>s/rootfs/repository root/
<oriansj>we first change the bootstrap-seed kaem in the following way: instead of kaem.run, kaem.$ARCH becomes the name of the script it will run
<oriansj>then populate the root of stage0-posix with the initial scripts and the rest can be self contained for each architecture
<stikonas>yeah, that sounds good, although will need some recalculation in kaem-minimal.hex0
<stikonas>shouldn't be too bad as it's just .data section...
<oriansj>unfortunately but then live-bootstrap gets a no touch setup
<stikonas>yes, so longer term it should be better
<oriansj>just overwrite after.kaem and boom go
<oriansj>better some short term pain for longer term benefits
<stikonas>sounds good. Let's also get fossy's opinion on this
<stikonas>but I think we'll have an agreement
<oriansj>in the mean time, I'll get back to work on RISC-V isolation.
<stikonas>hopefully it's not much harder than others... risc-v is slightly different due to kaem-micro...
<oriansj>well lack of ability to edit kaem.run to change the first couple steps is less than ideal
<stikonas>yes...
<stikonas>well, we can always replace bootstrap seed with larger kaem-minimal
<fossy>It would be *very* nice to not need to manipulate that extraction
<fossy><stikonas> well, we can always replace bootstrap seed with larger kaem-minimal
<fossy>wdym?
<fossy>I def want the hex0 kaem as the kaem seed
<fossy>The plan sounds good though generally
<fossy>Ideally the sysa rootfs would just be stage0-POSIX with bootstrap-seeds added and after.kaem and after/ added
<stikonas>fossy: risc-v has a smaller kaem
<stikonas>kaem-micro
<stikonas>which is not really a shell
<stikonas>fossy: https://github.com/stikonas/stage0-posix/blob/master/riscv64/GAS/kaem-micro.S (GAS prototype)
<stikonas> https://github.com/stikonas/stage0-posix/blob/master/riscv64/kaem-micro.hex0
<stikonas>it just hardcodes "hex0-seed hex0_riscv64.hex0 hex0" "hex0 kaem-minimal.hex0 kaem" and finally "kaem"
<stikonas>it's 361 bytes then
<stikonas>but editing those commands is somewhat harder...
<stikonas>you have to ascii encode your commands and adjust some pointers
<oriansj>well right now hard-coded kaem-micro might be smaller and is probably the optimal minimal bootstrap shell. (probably could exec that last command too) but right now the flexibility of a kaem-optional that reads a script will probably better serve us while we sort out some details between stage0-posix and live-bootstrap
<oriansj>So let's keep kaem-micro in our back pocket (possibly find a way to shrink it smaller and port to the other architectures as well) and once stage0-posix is the perfect drop in for live-bootstrap with all of the details worked out, we then will transition to using it for all architectures.
<pabs3>re the developer trust stuff, I think this is a better model (social code review): https://github.com/crev-dev/
<xentrac>than what?
<pabs3>than binary trust of individual developers
<Hagfish>pabs3: good point
<Hagfish>it's probably the other side of the coin when it comes to binary transparency / reproducible builds
<Hagfish>i wonder if there needs to be a system for randomly assigning code review tasks to developers and seeing if they can spot previously-found bugs
<xentrac>that's an interesting idea
<Hagfish>maybe put some financial incentive in or gamify it some way
<Hagfish>could be implemented as a smart contract even
<lfam>What's measured gets managed
<xentrac>although it won't necessarily help you to find classes of bugs that nobody knows about
<xentrac>also, ultimately we should aim higher than just "no bugs we know how to find"
<xentrac>we should aim for "no bugs"
<xentrac>as an exercise I wrote binary search the other morning, not for the first time
<Hagfish>i think we're not even at the stage of "we know whether to trust this reviewer or not"
<xentrac>I wasted about 20 minutes fiddling with the Python REPL and writing buggy versions
<xentrac>finally I sat down with an empty text buffer and worked it out logically
<xentrac>and then... it still didn't work
<Hagfish>heh
<xentrac>but it was obvious why and so I fixed it
<xentrac>which took about 10 minutes
<xentrac>which is similar to my memory of the last time I did this; I spent about 15 minutes staring at the code and editing it before I tested it
<xentrac>and it worked the first time
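For reference, a minimal sketch of the kind of binary search discussed above. This is not xentrac's code; the function name and the "first index whose element is >= target" convention are assumptions chosen for illustration. The comments state the loop invariant, which is the informal proof that the indices stay in bounds.

```python
def binary_search(items, target):
    """Return the smallest index i such that items[i] >= target.

    `items` must be sorted in ascending order. If every element is
    smaller than `target`, len(items) is returned.
    """
    lo, hi = 0, len(items)
    # Invariant: everything before lo is < target,
    #            everything from hi onward is >= target.
    while lo < hi:
        mid = (lo + hi) // 2  # lo <= mid < hi, so items[mid] is in bounds
        if items[mid] < target:
            lo = mid + 1      # items[mid] belongs to the "< target" side
        else:
            hi = mid          # items[mid] belongs to the ">= target" side
    return lo


if __name__ == "__main__":
    data = [1, 3, 5, 7]
    print(binary_search(data, 5))
    print(binary_search(data, 4))
    print(binary_search(data, 8))
```

The half-open range [lo, hi) shrinks on every iteration, so the loop terminates, and `mid` is only read while `lo < hi`, which keeps the array access within bounds.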
<xentrac>ultimately we don't use the Haber-Bosch process because we trust Fritz Haber
<xentrac>which is good, because nobody should trust Fritz Haber. his wife committed suicide to escape being married to him
<Hagfish>the HB process has had plenty of time for people to gain confidence in its failure modes
<xentrac>well, we have verifiable evidence that his process works
<Hagfish>if we could afford to wait a decade before installing a software update, we could probably have more security too
<xentrac>there's only one instance of this I know of in the software world, which is seL4
<xentrac>and of course even seL4 succumbed to Spectre: an out-of-context problem that its verification didn't address
<Hagfish>sure
<Hagfish>we shouldn't hold it against the reviewers of the seL4 code/proofs that they didn't think of Spectre
<xentrac>but nobody has found a case where its proof was invalid, but slipped through Coq
<xentrac>instead of trying and failing to find bugs, they went with trying to show the *absence* of bugs
<xentrac>and it was much more difficult than finding bugs, but they did eventually succeed
<Hagfish>yeah, that's a good way to look at it
<xentrac>similarly, reproducible builds allow us to not trust any one build machine
<xentrac>because we have verifiable evidence that it builds things correctly
<xentrac>it would be very desirable to eliminate single points of corruption where corrupting one guy can sabotage a bunch of people trying to cooperate
<xentrac>so, I think code review is super useful, and even without *formal* proof we can ask people for *informal* proof in code reviews
<xentrac>"how do you know this array index is within bounds?"
<oriansj>I'd be much happier if reproducible builds were the standard.
<xentrac>we've made a lot of progress on that!
<oriansj>depends where one is looking.
<oriansj>In Debian, NixOS and Guix absolutely
*xentrac flattens down his skirt self-consciously
<oriansj>In the world of software in general. LOL nope
<oriansj>there are compilers which can't even build hello world reproducibly
<oriansj>So there is no hope for the tools that depend upon them for builds.
<xentrac>yesterday I was very surprised to learn that my compiler *builds* hello world reproducibly, but once I started printing the address of a global variable as well as "hello, world", it didn't *run* reproducibly
<xentrac>because by default it was built as PIE for ASLR
<xentrac>and -fno-pie didn't help
<xentrac>it just broke the build
<xentrac>by "my compiler" I mean "the version of GCC I am using"
<pabs3>why were you printing the variable address?
<xentrac>to see if it was built as PIE even though I didn't ask for it to be
<xentrac>because someone had just told me that was the default now and I said "no it isn't"
<xentrac>so then I went and tested
<xentrac>guess what, they were right
<pabs3>ah
<xentrac>so I had to say "hmm, yes apparently it is"
<xentrac>and try to scrunch down very small in the IRC channel so nobody would see me
<xentrac>anyway, ASLR potentially adds nondeterminism to execution
<pabs3>everyone makes mistakes, no big deal
<xentrac>yeah, but I had just been an arrogant dick about the mistake, see
<xentrac>that was the embarrassing part
<pabs3>yeah, that isn't the greatest move in any situation :)
<pabs3>even one where you are obviously right and verified that fact
<xentrac>yeah, it's probably been a career-limiting move for me more than once, although zero times I can absolutely verify
<xentrac>but lots of times where that might have been the reason
<xentrac>for not getting one opportunity or another
<xentrac>anyway, enough therapy
<xentrac>just an interesting note about ASLR: if there's data flow from pointer bits or between-sections pointer comparisons to your program, it may end up behaving different from run to run
<xentrac>since on modern Linux everything gets built PIE (checksec can tell you)
<xentrac>dunno if that's actually a thing anybody here is facing this week, but I thought it was an interesting learning, and potentially relevant
<oriansj>I can only imagine the Lisp garbage collection behavior swings
<oriansj>because pointer happy is an understatement
<xentrac>I don't think it typically makes a difference for that
<xentrac>(and unless you have finalizers, you won't have data flow back from the GC into the mutator, ideally)
<oriansj>ideal and reality are both the same in one and completely different in the other.
<oriansj>So close to being done with RISC-V isolation
<oriansj>but it's those last few steps that just were not quite right.
<oriansj>hmmmm there is a bug in stage0-posix for RISC-V where mescc-tools-extra is built 3 times in a row
<oriansj>I suspect there is a bug in kaem-minimal.hex0 but I'll have to dig into that tomorrow
<oriansj>make clean test-all -j4 will attempt to do clean builds for all and sha256sum -c *.answers will show you if all architectures were built correctly
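The check that `sha256sum -c *.answers` performs can be sketched roughly as follows. This assumes the `.answers` files use the usual sha256sum line format (hex digest, whitespace, file name); the function name, the `read_file` callback, and the sample data are hypothetical and exist only to keep the example self-contained.

```python
import hashlib


def check_answers(answer_lines, read_file):
    """Compare sha256sum-style lines against the bytes from read_file(name).

    Returns a dict mapping each file name to True (digest matches) or False.
    """
    results = {}
    for line in answer_lines:
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # "*" marks binary mode in sha256sum output
        actual = hashlib.sha256(read_file(name)).hexdigest()
        results[name] = (actual == expected.lower())
    return results


# Hypothetical in-memory stand-in for built binaries.
fake_files = {"bin/hex0": b"example bytes", "bin/kaem": b"other bytes"}
answers = [
    hashlib.sha256(b"example bytes").hexdigest() + "  bin/hex0",
    "0" * 64 + "  bin/kaem",  # deliberately wrong digest
]
results = check_answers(answers, fake_files.__getitem__)
print(results)
```

For files on disk, `read_file` could be something like `lambda name: pathlib.Path(name).read_bytes()`. The sketch ignores sha256sum's escaping of unusual file names.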
***robin_ is now known as robin
<fossy>Holy hell
<fossy>autogen is the worst on bootstrapping
<fossy>the FIRST COMMIT in the repository has a self-dependency. wtf
***sm2n_ is now known as sm2n
<oriansj>fossy: sounds like we found a new dragon to slay
<oriansj>also another possibility for the triple build: I make too many mistakes when tired and just missed something I did in the kaem scripts.
<stikonas[m]>yes, autogen is a nightmare. I had a look at it too
<fossy>i have a feeling i might have a solution
<stikonas[m]>and it also interfaces quite closely with guile, so we might need somebody more familiar with scheme than fossy or me
<stikonas[m]>fossy oh?
<fossy>the basic autogen package in the oldest tarball only uses one file, which is purely for argument parsing
<stikonas[m]>oh ok, that can be rewritten
<fossy>note there *are* other autogen files in the tree but only one is required for the actual autogen binary
<stikonas[m]>like perl 5.000 also had some perl script but I rewrote it in awk
<fossy>so it would go something like build the autogen binary, rebuild the whole thing, then go up the chain
<stikonas[m]>fossy, how soon do you think we can build it?
<stikonas[m]>before gcc?
<fossy>stikonas[m]: idk much about the guile integration...
<fossy>depends on whether guile can be moved
<stikonas[m]>well, if your method works, we don't need to worry about that
<stikonas[m]>oh that's true
<stikonas[m]>so maybe we just do it after guile
<fossy>yeah and use it for subsequent gccs
<stikonas[m]>and for gcc 10
<stikonas[m]>I think we can build 10 next, although we need newer binutils first
<fossy>i think we can jump straight to gcc 10 now fingers crossed
<fossy>but i think if we were to go to 11 we might have to stop at 10, 11 i think has a dependency on 4.9.4 or newer or something
<stikonas[m]>if I recall correctly, gcc 11 needs newer gcc than 4.7.4
<stikonas[m]>exactly
<fossy>hah thinking the same thing
<stikonas[m]>so I think 10 and then latest (11 or so)
<fossy>i think we are nearly there in terms of having a toolchain bootstrap, then we can look at things like other architectures, building actual linux distros, etc
<stikonas[m]>fossy: can you fix deblobbing first?
<stikonas[m]>we might need to store those scripts in our repo
<fossy>stikonas[m]: oh yeah
<stikonas[m]>upstream does not keep them on download server for too long...
<fossy>i didn't notice because they were in my sources/ folder
<stikonas[m]>so don't delete them for now...
<stikonas[m]>keep a backup
*fossy practices 321 backups
<fossy>sources is included in that
<stikonas[m]>unless latest 4.9.x deblob script works on older kernel
<fossy>this is new
<fossy>when i was making linux kernel literally EVERYTHING was there
<fossy>thousands of versions were on that page
<fossy>oh old/gen6
<fossy>that's where our current script is
<fossy>although i will look into using these new scripts
<fossy>the diff is tiny, but it is probably best to stick with the thing actually for the version from old/gen6
<stikonas[m]>actually, I'm not sure which one is the best
<stikonas[m]>whether old or new...
<stikonas[m]>it might be that it doesn't matter
<stikonas[m]>although, even new versions might move at some point...
<oriansj>54321 backups would be ideal, however finding the 5th person on 4 different Continents storing 3 copies on at least 2 different types of media for the 1 purpose of preserving the data is quite a task.
<xentrac>git is pretty good at that
<clemens3>oriansj: thanks for the feedback.. I assume you mean one release previous of mescc-tools?
<clemens3>i am there 3 commits after 1.3
<clemens3>i will try with Release_1.2.0 of mescc-tools..
<clemens3>hmm, seems something still strange.. I did mescc-tools of Release_1.2.0 but not sure about the submodule..
<clemens3>and mes git checkout master
<clemens3>still similar error
<clemens3>maybe let me know which commits/tags of which project and maybe also how to handle the sub module M2lib
<clemens3>of mescc-tools.. when/if time.. thanks..
<stikonas[m]>clemens3: did you run git submodule update ?
<clemens3>maybe that is the problem
<clemens3>the second time around..
<clemens3>so in mescc-tools cd M2lib
<clemens3>git log
<clemens3>what commit should be there?
<clemens3>i have df1f4..
<stikonas[m]>df1f4 sounds correct
<stikonas[m]> https://github.com/oriansj/mescc-tools/tree/Release_1.2.0
<stikonas[m]>it shows that revision here too ^
<clemens3>so i used that of mescc-tools and sudo make install as well
<clemens3>then mes
<clemens3>v0.22
<clemens3>error
<clemens3>wrote `module/mescc/bytevectors.go'
<clemens3> GUILEC module/mescc/compile.scm
<clemens3>make: *** [GNUmakefile:95: build] Error 1
<stikonas[m]>well, that's probably some guile issue. Your configure probably still misdetects it as guile 2...
<stikonas[m]>but do you have to use guile?
<stikonas[m]>mescc should work with mes
<stikonas[m]>although much slower
<stikonas[m]>gcc for building mes and guile for running mescc might be helpful for development, but if you want to bootstrap things, I would think you want to run mescc with mes
<clemens3>aeh, i have no clue, i am just building it out of curiosity..
<clemens3>using LFS 11.. maybe unusual combination of stuff installed
<clemens3>so guile is used to build mescc?
<clemens3>i try the bootstrap build
<clemens3>but it says the bootstrap build is part of guix..
<clemens3>have nothing to do with guix
<stikonas[m]>well, I wrote my own build script for mes/mescc for bootstrap
<stikonas[m]>(nothing to do with guix)
<stikonas[m]>but I think upstream build system should also be able to do that
<stikonas[m]>you might need M2-Planet build too...
<stikonas[m]>these are the commands we run in live-bootstrap to build some specific version of mes https://github.com/fosslinux/live-bootstrap/blob/master/sysa/mes/mes.kaem
<stikonas[m]>at the moment live-bootstrap uses HEAD^ from https://github.com/oriansj/mes-m2
<stikonas[m]>HEAD of master actually adds support for new mescc-tools
<clemens3>anyway, just feedback..
<clemens3>if need, happy to provide debugging logs..
<stikonas>oriansj: you run mescc-tools-extra more than once because there is "./bin/kaem --verbose --strict --file mescc-tools-full-kaem.kaem" line in mescc-tools-mini-kaem.run
<stikonas>and then another exec ./bin/kaem --verbose --strict --file mescc-tools-extra.kaem
<stikonas>so the kaem files are both chained but also called from kaem.run
<stikonas>so you get kaem.run->mini->full->extra then kaem.run->full-> extra and finally kaem.run ->extra
<stikonas>so need to get rid of either chaning or extra invocations in kaem.run
<stikonas>s/chaning/chaining/
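The triple build follows directly from the call graph stikonas describes. Below is a small model of it, as a sketch only: the short script names (`mini`, `full`, `extra`) are taken from the message above, the dictionaries are a hypothetical simplification of the real kaem scripts, and the model treats every invocation as a plain call that returns to its caller.

```python
from collections import Counter


def count_runs(scripts, entry):
    """Return how many times each script runs when `entry` is executed.

    `scripts` maps a script name to the scripts it invokes, in order.
    """
    runs = Counter()

    def run(name):
        runs[name] += 1
        for child in scripts.get(name, []):
            run(child)

    run(entry)
    return runs


# Scripts are chained (mini -> full -> extra) AND called again from kaem.run.
chained_and_called = {
    "kaem.run": ["mini", "full", "extra"],
    "mini": ["full"],
    "full": ["extra"],
    "extra": [],
}

# One possible fix: keep the chaining, drop the extra invocations in kaem.run.
chained_only = {
    "kaem.run": ["mini"],
    "mini": ["full"],
    "full": ["extra"],
    "extra": [],
}

before = count_runs(chained_and_called, "kaem.run")
after = count_runs(chained_only, "kaem.run")
print(before["extra"], after["extra"])
```

In the first graph `extra` is reached through three paths (kaem.run->mini->full->extra, kaem.run->full->extra, kaem.run->extra), which matches the three builds observed. Removing the chaining instead and keeping the three calls in kaem.run would give the same single run per script.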
<fossy>stikonas: well, the diff between new and old, is that new blobs have been added/old blobs have been removed, in 4.9.x
<fossy>so we do want old, to ensure we catch all the blobs for that version
<stikonas>yeah, ok...