IRC channel logs

2021-05-16.log

<melg8>not a big deal)
<melg8>for now i just need to stop inner procrastination) because one thing is to just translate commands from kaem to nix-kaem) and other - patch c code) even though i dont think it will take long
<melg8>does live-bootstrap check that sha256sum of kaem is right?
<melg8>OriansJ i've tried to add
<melg8>        -f ${m2-libc}/sys/types.h \
<melg8>        -f ${m2-libc}/x86/Linux/sys/stat.h \
<melg8>        -f ${mescc-tools}/Kaem/kaem.h \
<melg8>but now get Unknown type typedef
<melg8>cfvrm5-M2libc/sys/types.h:20:Subprocess error
<OriansJ>melg8: typedef support was added in 921cc86ce64037493a526736ff7c49b3f8475486 and will work assuming you are not using --bootstrap-mode
<OriansJ>So I guess I really need to get that updated stage0-posix out the door this weekend.
<melg8>yea) it's cascading - hard to know which commits to choose so that all happy
<melg8>because now i get Unknown type FILE
<melg8>))
<melg8>xyq0pmry9yxp5351i7fgsaixjidwcir-M2-Planet/test/common_x86/functions/file.c
<OriansJ>gforce_d11977: this might really help: https://en.wikipedia.org/wiki/Hudson_Soft_HuC6280#Memory_mapping
<melg8>so now it's not in test but in m2libc i guess
<OriansJ>melg8: well in cc_* FILE is just an int but in M2-Planet it is a proper struct which enables much faster performance (and far less syscalls)
<OriansJ>hence why M2libc stdio.c is quite different than the old M2-Planet file functions
<OriansJ>(which came from cc_x86)
<melg8>okay, so i better just copy-paste this single function now, so i dont drift to much away from live-bootstrap, and when you update stage0-posix, than update as well, because now i at least can compare two and figure out which hashes mismatch and what i'm doing worng
<melg8>wrong*
<OriansJ>melg8: side effects of me being really behind in all the work I need to ship
<OriansJ>speaking of which rain1 if you have free time I could use your help.
<bauen1>i think a single-instruction computer for the very early stages hex0 (and a few steps up) would be a quite cool proof of concept
<bauen1>but i'd expect that to be too slow for anything after that
<bauen1>however if you can however implement a hash function and editor that is fast enough to be usable in it, then it could be a viable "trusted componente"
<bauen1>s/however//
<gef>bauen1: that's *exactly* how I think about it. It should help to arrive at least up to level m2-planet and perhaps some more with extra hacks.
<gef>As regards hash functions, it is a pain. Or it can be a pain :) When looking at this https://en.wikipedia.org/wiki/Checksum#See_also , some modified variations of BSD & SYSV checksums should be possible and reasonable. On the upside, the benefit of the bootstrappable project is that we wouldn't require cryptographically secure hash functions and you can even use several of them together. That would make the procedure very trustable across a range of
<gef>processors and platforms, without duplication of work.
<bauen1>gef: the use case i have in mind for a trusted device with a trusted i/o + editor + hash does require a cryptographically secure hash (e.g. sha-2)
<OriansJ>gef: perhaps a good first step is add SBN4 support to MesCC or M2-Planet + mescc-tools
<gef>fi. I'd use some hash function vector for files of the type (size, BSD', SYSV', [1-yet-to-identify-has-function])
<bauen1>gef: i got to sleep, but tl;dr: use hashes to make a hash tree of the bootstrap and it's component so you can develop on it / repeat it without redoing an entire code review
<gef>*has -> hash
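A minimal Python sketch of the (size, BSD, SYSV) vector gef describes. It assumes the unmodified textbook BSD and SysV checksums rather than the "modified variations" gef has in mind, and the function names are made up for illustration.

```python
def bsd_checksum(data: bytes) -> int:
    """16-bit BSD checksum: rotate right by one bit, then add each byte."""
    checksum = 0
    for byte in data:
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def sysv_checksum(data: bytes) -> int:
    """16-bit SysV checksum: sum all bytes, then fold the carries back in."""
    total = sum(data) & 0xFFFFFFFF
    folded = (total & 0xFFFF) + (total >> 16)
    return (folded & 0xFFFF) + (folded >> 16)


def fingerprint(data: bytes) -> tuple:
    """Hypothetical vector of weak checks: (size, BSD, SysV)."""
    return (len(data), bsd_checksum(data), sysv_checksum(data))


# The SysV sum ignores byte order, the BSD rotation does not,
# so the vector catches a swap that one component alone would miss.
print(fingerprint(b"ab"))
print(fingerprint(b"ba"))
```

None of these checks is cryptographically secure; the sketch only shows why several weak checks together are stricter than any single one.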
<bauen1>gef: then all you need to do is review the code up to the hash implementation and remember the root of hash tree, then you use it to verify a file containing the tree, and if it matches you can use it to skip code review / verify output of untrusted machines etc...
<gef>bauen1: exactly that. It can bootstrap the steps reasonably well, until you have primitives you can trust more (implementing sha* hash functions can only come with microarchitecture with lengthy microcode implementation - sb4 is just 12 lines, it just tries to win on simplicity and trust)
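The hash tree bauen1 outlines could look roughly like the Python sketch below. It assumes SHA-256 (the sha-2 case bauen1 mentions) and a simple binary tree that duplicates the last node on odd levels; the leaf/node prefixes and function names are illustrative choices, not something any of the projects prescribes.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Prefix 0x00 so a leaf can never be confused with an inner node.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Prefix 0x01 marks an inner node built from two child digests.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(files: list) -> bytes:
    """Root digest over an ordered list of file contents (bytes)."""
    if not files:
        raise ValueError("need at least one file")
    level = [leaf_hash(content) for content in files]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Remembering only the root is enough to detect a change in any file.
trusted = merkle_root([b"hex0 source", b"hex1 source", b"kaem source"])
tampered = merkle_root([b"hex0 source", b"hex1 sourcE", b"kaem source"])
print(trusted.hex())
print(trusted == tampered)
```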
<gef>oriansj: would this python snippet be of interest? it is really very trivial code, it is an OISC after all: https://gitlab.com/roosemberth/single-instruction-machine/-/blob/master/README.md#single-instruction-machine-simulator
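To give a feel for how small an OISC simulator is, here is a Python sketch of the classic three-operand subleq machine. This is an assumption chosen for illustration: it is not the SBN4 variant mentioned above, and it is not the simulator from the linked README.

```python
def run_subleq(memory, pc=0, max_steps=10_000):
    """Run a subleq program: mem[b] -= mem[a]; jump to c if the result <= 0.

    Execution stops when pc leaves the memory (e.g. a jump to -1)
    or after max_steps instructions.
    """
    mem = list(memory)
    steps = 0
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem


# Program computing Y = Y + X with a scratch cell Z:
#   0: Z -= X        (Z becomes -X)
#   3: Y -= Z        (Y becomes Y + X)
#   6: Z -= Z, halt  (clear Z, jump to -1)
X, Y, Z = 9, 10, 11
program = [X, Z, 3,
           Z, Y, 6,
           Z, Z, -1,
           7, 5, 0]  # data cells: X=7, Y=5, Z=0
result = run_subleq(program)
print(result[Y])
```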
<OriansJ>gef: a single point of trust is always a bad idea. Better enable the maximal number of possible ways to falsify the trust of any implementation/stack
<OriansJ>gef: I would not personally implement OISC support but I would merge efforts made to include that support in the various pieces to enable to be another root to check all the other roots.
<gef>oriansj: you can implement that across architectures or even computer languages. But if you wish to have the ability to compare hashes and have cross-checks, you'd have to consider some common layer of agreement. This is what can be achieved easily in this oisc way - and maybe there are better ways to do the same, I'm just offering my angle here. My interest is more on the hardware side of things: microcode implementations which are trustable by design
<gef>. I frown upon the idea that you have all the software in open source and the hardware is as if by magic trustable.
<OriansJ>gef: That common layer of agreement is that a byte is 8bits big bit endian order.
<gef>how do you cross-check different architectures for being equivalently trustable? Or until which point in the bootstrap the hashes start to diverge and you need to treat them as islands of trust?
<gef>(asking openly here, for your insight)
<stikonas>presumably binaries will always have different hashes but you can build cross-compilers (like cc_x86 on arm)
<fossy>goddamn perl embedding dates into binaries
<xentrac>ha ha! naughty perl!
<OriansJ>gef: the binaries never have to match unless they are built towards a uniform target. For example M2-Planet+M2libc building M2-Planet for AArch64 will always be the exact same hash given the exact same source regardless of what platform does the build.
<xentrac>OriansJ: I think I've explained before that computers without bit-serial ALUs do not have bit endianness within bytes
<xentrac>and in bit-serial layers like SPI and teletype, there is in fact variation in bit endianness
<OriansJ>xentrac: yes you have
<OriansJ>but notice I have explicitly picked 1 order as the definition so for those layers that don't do it correctly will need to do bit banging to produce correct results.
<xentrac>picking an order is a good idea, I'm just saying the hardware of things like a 6502 or a Z80 has no preference
<xentrac>an i386 only has a preference because it pretends to have byte-addressable memory
<OriansJ>xentrac: I understand as the order only matters in regards to the transmission of bytes between systems not the actual calculations.
<OriansJ>except in regards to >> 4 and the like behaving in the way defined.
<gef>oriansj: do you foresee any technical limitation for having a single hash for _all_ M2-Planet builds, across all archs beyond AArch64?
<xentrac>right. >> 4 doesn't depend on whether you consider the MSB or LSB as coming first; it shifts four in the direction of the LSB either way
<OriansJ>gef: all M2-Planet builds have universal hashes regardless of the target
<xentrac>but serial ports do care
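xentrac's distinction can be sketched in Python: a shift gives the same value whichever bit is considered "first", while a bit-serial link has to pick an order. The helper names are hypothetical.

```python
def bits_msb_first(byte: int) -> list:
    """Bits of one byte in the order an MSB-first serial link sends them."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]


def bits_lsb_first(byte: int) -> list:
    """Bits of one byte in the order an LSB-first serial link sends them."""
    return [(byte >> i) & 1 for i in range(8)]


def byte_from_msb_first(bits: list) -> int:
    """Reassemble a byte, assuming the first received bit is the MSB."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


value = 0xB4
# The shift is defined on the value, so no bit order is involved.
print(hex(value >> 4))
# On the wire the two conventions differ ...
print(bits_msb_first(value))
print(bits_lsb_first(value))
# ... and a receiver assuming the wrong one reconstructs a different byte.
print(hex(byte_from_msb_first(bits_lsb_first(value))))
```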
<OriansJ>gef: M2-Planet --architecture $arch -f ifile -o ofile will always produce the exact same byte for byte ofile given the exact same ifile and $arch
<OriansJ>the host operating system, libraries, runtimes, time of day, floating point behavior, etc none of that matters. The output must always be byte for byte identical across *ALL* architectures
<xentrac>(unless something is broken)
<OriansJ>xentrac: or unless the bootstrap is compromised in some way.
<xentrac>right
<gef>ok, that's a plus: it helps to cross-check architectures. But porting to a new architecture still requires some work in assembly of several lines, right?
<OriansJ>as M2-Planet only knows about the flags given (or hard-coded in bare metal versions) and the source code given to it. Nothing else
<OriansJ>gef: No, that is only if you want it to generate code for a particular architecture; then all of the architectures gain that ability.
<OriansJ>M2-Planet is just a string process, it reads in C token strings and writes out M1 macro output strings.
<melg8>does live-bootstrap check that sha256sum of kaem is right?
<OriansJ>melg8: well kaem is built and used before the sha256sum tool is; so maybe as a sanity check but stikonas or fossy would know better.
<melg8>i just didnt find it in list of presha256sum
<melg8>btw
<OriansJ>gef: and all the assembly code writing is done in M2libc in the M2-Planet+cc_* C subsets
<melg8>bootstrap_seeds ➤ sha256sum /nix/store/lq24n2pdyk2ap5fj2z5q6z7072h3v3ff-messcc-tools-mini-build/* git:feature/BootstrapNix*
<melg8>b56c8f27f92cee4f81d41edb06c9e0c0b69f390dc78216ed66c477a6adf627f3  /nix/store/lq24n2pdyk2ap5fj2z5q6z7072h3v3ff-messcc-tools-mini-build/hex2
<melg8>72012a7d50996690f498f75738b80f93f1d0640a712ea327c9579995b2f63048  /nix/store/lq24n2pdyk2ap5fj2z5q6z7072h3v3ff-messcc-tools-mini-build/kaem
<melg8>af85c2f30389f1c6ee2e945c442ae34caf4dece48d28b2fa8c76ce458007c63a  /nix/store/lq24n2pdyk2ap5fj2z5q6z7072h3v3ff-messcc-tools-mini-build/M1
<stikonas>live-bootstrap only checked after folder
<stikonas>something that you can improve
<OriansJ>mescc-tools M1, hex2 and blood-elf all behave the exact same way. So the flow from C code to binary and every step inbetween is universally checkable and cross-checkable between all architectures that are supported.
<OriansJ>M1 --architecture $arch -f ifile -o ofile (big byte endian by default add --little-endian if you want little byte endian output)
<OriansJ>hex2 --architecture $arch -f ifile -o ofile (big byte endian by default and --little-endian if you want little byte endian output)
<OriansJ>there is only 1 valid possible output given 1 exact input using 1 exact set of flags for every single step
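As a reminder of what the byte-order choice means for the emitted bytes, a small Python sketch follows. It only illustrates big versus little byte endian for a 32-bit value and makes no claim about how M1 or hex2 implement the flag.

```python
value = 0x12345678

# Big byte endian: most significant byte is written first.
big = value.to_bytes(4, byteorder="big")
# Little byte endian: least significant byte is written first.
little = value.to_bytes(4, byteorder="little")

print(big.hex())
print(little.hex())

# Same input, same flags, same bytes: each choice is fully deterministic.
assert big == value.to_bytes(4, byteorder="big")
assert little == big[::-1]
```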
<OriansJ>So any compromise at any level will be detected by *ALL* noncompromised architectures. So any Trusting trust or Nexus intruder class attack will have to compromise *ALL* architectures including ancient ones and ones that haven't even been invented yet
<OriansJ>So if I invent architecture foo and do the steps of stage0 from hex0 to cc_foo and build the current M2-Planet to check the x86 output it better be exactly identical or we just found your trusting trust attack in the diff
<OriansJ>In short assuming 1 college student in the world escapes having his hardware compromised and spends a weekend implementing the stage0 steps. All of the trusting trust attacks implemented will become obvious for all
<OriansJ>Note this also applies to nation states like Iran, North Korea, etc which have a reason not to keep such attacks secret.
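The cross-check OriansJ describes amounts to comparing digests of the same build step produced on independent hosts. A hedged Python sketch, where the host names and the `group_by_digest` helper are invented for illustration:

```python
import hashlib
from collections import defaultdict


def group_by_digest(outputs: dict) -> dict:
    """Map each SHA-256 digest to the list of hosts that produced it.

    `outputs` maps a host label to the bytes it built for the same
    source, target architecture and flags.
    """
    groups = defaultdict(list)
    for host, blob in sorted(outputs.items()):
        groups[hashlib.sha256(blob).hexdigest()].append(host)
    return dict(groups)


# Hypothetical results of building the same target on three hosts.
outputs = {
    "x86-box": b"\x7fELF...identical output",
    "aarch64-board": b"\x7fELF...identical output",
    "foo-homebrew-cpu": b"\x7fELF...identical output",
}
groups = group_by_digest(outputs)
# More than one group means some host disagrees and the diff needs a look.
print(len(groups))
```

The sketch does not decide which group is the honest one; any disagreement is simply a signal to inspect the diff.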
<melg8>OriansJ can initial set of tools (mescc-tools-mini) manipulate as chmod +x? because now i got stuck on trying to rerun new kaem from nix, which says me that it is not executable (after i catm-ed it to out)
<melg8>I found temporary solution by rebuilding it from new_kaem called from original kaem seed and sending right to the out, but still
<OriansJ>melg8: no but mescc-tools-extra has chmod and I should probably share my current work
<OriansJ> https://github.com/oriansj/mescc-tools-extra
<OriansJ>I haven't finished adding the M2libc builds to the makefile or the untar and ungz and sha2 pieces but that is on my list to get done
<OriansJ>fossy: I've granted you access so feel free to add any pieces you need built by M2-Planet for live-bootstrap
<OriansJ>and you too stikonas
<stikonas>well, live-bootstrap can't yet use M2libc...
<stikonas>or rather I think mes doesn't build yet
<stikonas>but untar and ungz can be added first, we'll need them for mes anyway
<stikonas>fossy: at some point melg8's coreutils might also benefit from your review https://github.com/fosslinux/live-bootstrap/pull/115
<stikonas>especially we should decide whether to use coreutils tarball or git archive snapshot of the tag...
<stikonas>annoyingly importing-gnulib is not idempotent there... It removes some files that are not then present in release tarball
<Hagfish>"So what architecture is the microchip in the Covid vaccine?" "Obviously it's arm-based!"
<fossy> https://github.com/fosslinux/live-bootstrap/pull/117
<fossy>OriansJ: ok neat
<gforce_de1977>melg8: about the 'chmod +x' problem: once I chose to ship the needed 0-byte files in my initial ramdisk already chmod +x marked, this works 8-)
<OriansJ>Hagfish: RNA computation to generate spike proteins to train the immune response to spike proteins that match; which exist on the Covid virus. as RNA isn't binary it isn't something we currently nor plan to support at this time.
<stikonas>fossy: I've reviewed your perl PR...
<stikonas>mostly small issues, although a few will take longer to fix
<melg8>Hi, will mescc-tools-extra from live-bootstrap be migrated to mescc-tools repo?
<stikonas>(mainly I would like to build 32-bit perl, not 64-bit)
<stikonas>I thought no, but I don't remember exact discussions
<stikonas>mescc-tools are tools directly needed to run mescc
<stikonas>mescc-tools-extra are actually tools unrelated to mescc (the name of the repo is a bit misleading) but useful in bootstrap
<stikonas>(that said I'm not saying they can't be merged)
<melg8>but chmod :( and cp :( two essential things
<melg8>btw will we bootstrap download/making sysa image process?) or we just ignore that?
<stikonas>melg8: you make it manually
<stikonas>if you want to bootstrap on real hardware...
<stikonas>although, it's a good question of how much can be made manually there
<stikonas>you can't enter the whole 100MB of source code into your RAM by flipping switches...
<stikonas>anyway, for now we ignore that...
<gef>stikonas: you only need to manually enter the initial boot loader, a trustable has function & the hashes to check against - the rest can be imported via untrusted mechanisms.
<stikonas>although, I deliberately kept cpio uncompressed
<gef>*has -> hash
<stikonas>I guess yes
<gef>effectively, what is being discussed here is a very very minimal trusted execution environment
<melg8>shouldn't it be at least in form of static repo? so anybody could check what's really in there? because now it's like all over the place with different mechanisms involved - in form of git submodules and python loading stuff, and system copying files from one place to other
<melg8>and... for example - nix - just do not allow network access in build step. so all sources should be prepared before that.
<stikonas>melg8: git submodules are temporary thing
<melg8>what will be after?)
<stikonas>OriansJ: has now ported ungz.c and untar.c that I found to M2libc
<stikonas>just like after submodules, upstream tarballs
<stikonas>earlier tcc-0.27 and tar were also git submodules
<stikonas>not anymore, I've now removed them
<stikonas>melg8: we don't have network access either (at least when you run in qemu)
<melg8>but we dont run sysa in qemu
<stikonas>in fact, it's not even that easy to get it, we would have to build ip command
<stikonas>do you mean ./rootfs.py?
<melg8>yea
<stikonas>because sysa itself runs in qemu (unless you run everything in chroot)
<stikonas>well, rootfs.py does need network access to download sources
<stikonas>but that's all it uses from the internet
<stikonas>all sources are hash-checked though
<stikonas>(not from the inside of chroot, but that will be done too)
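Hash-checking a downloaded source can be sketched in Python as below. This is a generic illustration, not the code rootfs.py uses; `sha256_of_stream` and `verify_source` are hypothetical names.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536) -> str:
    """Hex SHA-256 of a binary stream, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_source(stream, expected_hex: str) -> bool:
    """True if the stream's digest matches the pinned value."""
    return sha256_of_stream(stream) == expected_hex.lower()


# A real caller would pass open(path, "rb"); BytesIO keeps the demo offline.
pinned = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
print(verify_source(io.BytesIO(b"abc"), pinned))
print(verify_source(io.BytesIO(b"abd"), pinned))
```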
<melg8>what should do linux distro, which have other way of loading sources?) not via python script)
<stikonas>I guess it's up to distro
<stikonas>e.g. downloading sources is integrated into guix binary
<melg8>what problem i've encountered for example - is - i can load git repos using builtin nix functions, so my minimal extra dependency for nix would be nix itself (with it's closure of stuff) and that's it, i dont even need to use some shell it will do the right thing. But if i need to fetch submodules, or even fetch alot of different packets from all
<melg8>over internet (that's okay) but to put them all in right places - as placed by sysa script - it would need me shell + ln + cp + at least ) ... so if i want nix to be absolute minimum (maybe i just crazy) - best would be just prepared git (or tar) with all sources, which i load 1 time, than can inspect in my system - can check for hashes and from
<melg8>that just run kaem + kaem.run - and it will do the right thing. Or i need to manually regenerate all paths in all build scripts because it's not just ../bin/kaem its /nix/store/h1vkf9k6hqxsw06p1s93mczr710n9fiy-mescc-tools-mini/kaem - where stuff lays.
<stikonas>yeah, I see...
<stikonas>well, like I said in the medium term git submodules will go away
<stikonas>but it depends on M2-Planet + M2libc being able to build mes (I guess not yet released mes 0.24)
<melg8>can live bootstrap do releases, and inside release just have unpacked result of sysa run?
<stikonas>at some point I think we want to do that
<stikonas>well, releases
<stikonas>but not sure about sysa.run...
<stikonas>that's a huge amount of source
<melg8>it loaded by sysa anyway?
<stikonas>well, you need to host that big tarball somewhere
<stikonas>and for now it changes a lot
<melg8>how big it would be?
<stikonas>right now I guess it's about 200MB
<stikonas>will be more later...
<stikonas>and right now it's not that useful yet for other projects
<stikonas>we only have a C compiler
<stikonas>no C++ yet
<melg8>and then we just start using self-unpacking tarball... and instead of binary gcc, we would be using binary seed of 200 MB :)
<stikonas>other projects would also want newer GCC than 4.0.4
<stikonas>well, not really
<stikonas>tarball is not really a seed
<stikonas>tar files are human readable
<melg8>at least it can be viewed
<stikonas>yeah...
<melg8>yea
<stikonas>you don't need any software to read tar
<stikonas>although, it's less user friendly
<stikonas>I guess same is for cpio
<stikonas>that's why I also haven't compressed initramfs
<stikonas>in rootfs.py
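stikonas's point that an uncompressed tar stays inspectable can be checked with a short Python sketch. It assumes the ustar format and a made-up member name; the file name and contents appear verbatim in the archive bytes.

```python
import io
import tarfile

content = b"int main() { return 0; }\n"

# Build an uncompressed ustar archive in memory with a single member.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
    info = tarfile.TarInfo(name="hello.c")  # hypothetical member name
    info.size = len(content)
    archive.addfile(info, io.BytesIO(content))
raw = buffer.getvalue()

# The 512-byte header starts with the plain ASCII file name ...
print(raw[:7])
# ... and the file contents follow unmodified in the next 512-byte block.
print(raw[512:512 + len(content)])
```

A gzip-compressed copy of the same archive would not show either string in the clear, which is the trade-off being discussed.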
<stikonas>anyway, I'm away for some time
<melg8>okay)
<gforce_de1977>stikonas: i'am still struggling with the "autocleanup" thingy (delete sources, after we have a compiled binary). i have some "exclude" rules, e.g. delete bison-3.4.1 only after the third iteration, but for some reason it stops on "automake-1.6.3: preparing source." with the message: "autoconf-2.52: no input file" - see: on bottom: http://intercity-vpn.de/bootstrap/error-on-autocleanup.txt
<gforce_de1977>(maybe you have an idea, what i have deleted and can cause this - i dont see it)
<gforce_de1977>these are the buildsteps as is see it: look for string "stage0 1 17 16" - see: http://intercity-vpn.de/bootstrap/memplot-memhack26-1999M.txt
<gforce_de1977>bauen1: i tried to just patch kernels "initramfs.c" and did a 's/.tv_sec = mtime;/.tv_sec = 65222;/g' - but with no effect. do you have more insight which underlying kernel-functions has to be annoyed to get A) every file mtime/atime = 01-01-2000 or B) every timecall (timekeeping.h?) returns XY?
<stikonas>gforce_de1977: did you remove autoconf-2.52 before you rebuilt second stage?
<fossy>stikonas: good point on the x86_64
<stikonas>gforce_de1977: by the way, what I thought about cleanup, is actually not removing the whole dir but build dirs
<fossy>kinda didn't think of that
<stikonas>fossy: it's worth testing other autotools packages too in your PR
<fossy>i tested autoconf and automake of the newest version using perl 5.32 and they are fine
<stikonas>sometimes --build/target/host=i386-unknown-linux-gnu would result in different hash
<fossy>right
<stikonas>sometimes it matters, sometimes it doesn't...
<stikonas>well, target probably doesn't matter at all except in toolchain (binutils/gcc)
<stikonas>anyway, nice work in general, nice to see newer make, etc...
<stikonas>fossy: oh, I also didn't mention, when you manually install stuff, maybe install into ${DESTDIR}, that will make bauen1's rebase easier
<fossy>oh yea
<stikonas>otherwise bauen1 will forever play catch up...
<fossy>mhm
<bauen1>stikonas: i've actually managed to catch up (for now) :D
<stikonas>oh, that's good
<stikonas>although, no PR yet :)
<bauen1>stikonas: well for a PR i have higher standards than "it works"
<gforce_de1977>stikonas: i think i found my autocleanup-mistake, will try to build again and report...thanks for the input!
<gforce_de1977>bauen1: according to this link, i can maybe abuse current_kernel_time() or clock_gettime() in kernel - what do you think? - https://stackoverflow.com/questions/22579157/kernel-mode-clock-gettime/32003279
<stikonas>gforce_de1977: but it would be much simpler to just delete build dirs, that's what uses up most of the RAM
<gforce_de1977>stikonas: ok, but the tarballs are also using ~150 Megabytes
<stikonas>gforce_de1977: yes, that's true, but it's harder to know when we can remove them. And it also has some benefits that we can inspect source after bootstrap
<gforce_de1977>stikonas: ofcourse the autocleanup-stuff is not suited for debugging things 8-) i'am just trying to play with the RAM limits
<xentrac>attila_lendvai: how's stuff going with maru
<attila_lendvai>xentrac, i haven't worked on it in the last few weeks, been busy with other aspects of life. i need a spark of inspiration for something exciting, like the removal of the libc dependency was.
<attila_lendvai>and i have found an ugly bug with bindings and closures that is deep down at the bottom of the evaluator, and i should really fix it before any other ventures... but fixing these kinds of bugs require a high tolerance for frustration... :)
<attila_lendvai>and i should also spend more time at the bird's eye view perspective, i.e. learn more about MetaII & co., and about type systems, before i spend too much being lost in the forest without a proper vision
<xentrac>i don't know, i think wandering around in the forest is the most important way to learn bushcraft
<xentrac>wrt the bindings and closures stuff you might find the art of the interpreter or essentials of programming languages to be enjoyable reading
<xentrac>but of course they're not necessary
<xentrac>i've found that 'i should really fix x before y' is a terrible way to open the door to inspiration
<melg8>btw, what is maru? and attila_lendvai what are you talking about libc dependency from where?
<gforce_de1977>melg8: see his repo on github and the README
<melg8>what is his nick on github?
<xentrac>libc dependency in maru
<gforce_de1977>melg8: same nick
<xentrac>maru is super inspiring in terms of maximal expressiveness in minimal code (and thus minimal tcb)
<xentrac>attila_lendvai: also 'i should learn more about x before i do y' is usually not really true. occasionally you'll run into some wanker that sneers at you for not knowing things, but it doesn't happen often and usually those people can be ignored
<xentrac>if i read something that explains a better way to do y before i have ever done y, i won't understand it. at best nothing will stick, at worst i'll acquire some kind of cargo-cult pseudo-knowledge i later have to unlearn
<xentrac>if i first do y, then read x, then i am equipped to understand the aspects of x that are applicable to doing y
<xentrac>but i don't really understand them until i do y a second time after having read x
<xentrac>so x y is a worse sequence than y x, and y x y is better than just y x
<xentrac>sometimes even y y x or y y y x is worthwhile ;)
<xentrac>probably anybody who sneers at you in particular can be safely ignored in any case, given what you've already achieved ;)
<melg8>fossy you here? i have a question in regards of your feedback on my mr
<stikonas>melg8: fossy is sleeping right now
<stikonas>probably will be waking up soon
<melg8>okay)
<stikonas>I also commented on those issues
<stikonas>maybe using git archive instead of release tarball will be easier for us
<melg8>i'm testing with https://github.com/coreutils/coreutils/archive/refs/tags/v8.32.tar.gz
<melg8>it has this files there
<melg8>But...
<melg8>  File "/home/user/work/live-bootstrap/lib/utils.py", line 39, in copytree
<melg8>    shutil.copytree(src, os.path.join(dst, file_name), ignore=ignore)
<melg8>  File "/nix/store/q6gfck5czr67090pwm53xrdyhpg6bx67-python3-3.8.9/lib/python3.8/shutil.py", line 555, in copytree
<melg8>    with os.scandir(src) as itr:
<melg8>FileNotFoundError: [Errno 2] No such file or directory: '/home/user/work/live-bootstrap/sysa/v8.32'
<melg8>do we want file named v8.32?)
<fossy>hi
<melg8>hi)
<stikonas>melg8: yes, that's what git archive is...
<stikonas>but maybe savannah is better?
<melg8> https://github.com/fosslinux/live-bootstrap/pull/115 i wrote you some replies there
<stikonas>since it's upstream repository
<stikonas>melg8: and no, we need to rename file, not v8.32.tar.gz
<stikonas>but that's already possible in sysa.py
<melg8>that error gets from get_file
<stikonas>melg8: that's expected
<stikonas>because you are not naming it correctly
<fossy>hm I will do some looking
<stikonas>if you rename it in get_file to coreutils-8.32 it will work fine
<stikonas>(if I remember correctly)
<melg8>you mean when i call get_file in sysa?
<stikonas>melg8: look at that sha-2 package
<stikonas>it does exactly that
<stikonas> self.get_file("https://github.com/amosnier/sha-2/archive/61555d.tar.gz", mkbuild=True,
<melg8>aww, yes, cool
<stikonas> output="sha-2-61555d.tar.gz")
<melg8>then i'll check how it builds
<stikonas>you won't need mkbuild=True but output is what you want
<stikonas>it's just a parameter to rename downloaded file
<stikonas>and it is expected to be put into basename(output) directory rather than basename(url)
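The naming rule stikonas describes can be sketched in Python as follows. This is a guess at the behaviour for illustration only, not the actual live-bootstrap code; `unpack_dir_name` and its suffix list are hypothetical.

```python
import os


def unpack_dir_name(url: str, output: str = None) -> str:
    """Directory name expected after unpacking a downloaded tarball.

    Uses `output` if given, otherwise the last path component of the URL,
    and strips a known tarball suffix.
    """
    file_name = output if output is not None else os.path.basename(url)
    for suffix in (".tar.gz", ".tar.bz2", ".tar.xz", ".tar"):
        if file_name.endswith(suffix):
            return file_name[:-len(suffix)]
    return file_name


github = "https://github.com/coreutils/coreutils/archive/refs/tags/v8.32.tar.gz"
savannah = "https://git.savannah.gnu.org/cgit/coreutils.git/snapshot/coreutils-8.32.tar.gz"

# Without a rename the GitHub tag archive maps to the directory from the traceback.
print(unpack_dir_name(github))
# Renaming the download, or using the savannah snapshot, gives the expected name.
print(unpack_dir_name(github, output="coreutils-8.32.tar.gz"))
print(unpack_dir_name(savannah))
```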
<stikonas>fossy: so I think after your PR we'll only have gzip and patch still using meslibc
<stikonas>melg8: alternatively, if you download from savannah, it already has the right name
<stikonas> https://git.savannah.gnu.org/cgit/coreutils.git/snapshot/coreutils-8.32.tar.gz
<melg8>very nice!
<melg8>fossy, maybe it would be easy fix, by just using right variant of repo, i'll check if it's builds now.