IRC channel logs

2023-11-14.log

<oriansj>Googulator: perfection can not yet be achieved but progress compared to yesterday definitely can be and thank you for helping move it forward.
<stikonas_>and I'm back now :)
<Googulator>alright, just verified with a toy srcfs that kexec-fiwix is indeed working on my hardware
<Googulator>but I get slightly nondeterministic behavior afterwards
<Googulator>which suddenly has me remember something
<Googulator>this used to be my old gaming PC
<Googulator>I probably had it overclocked
<Googulator>& all this nondeterministic behavior feels a lot like a clock/power issue
<Googulator>maybe all it needs is a CMOS reset
<moik>oriansj thanks, that helps a lot
<Googulator>Is it intentional that the Linux kernel config only supports Intel E1000 and E1000E for networking?
<Googulator>During bare metal testing, I managed to boot into sysc, and then died due to network being inaccessible
<Googulator>All of my Ethernet cards are either realtek, marvell, 3com or broadcom unfortunately
<Googulator>E1000(E) seems not to be all that common in actual hardware
<Mikaku>Googulator: I think there is a correlation with QEMU since e1000 seems to be the most common driver
<Googulator>Yeah, qemu is configured by rootfs.py to emulate e1000
<Googulator>But it's counterproductive for bare metal
<Googulator>Would it be OK to unconditionally enable all reasonable Ethernet drivers (i.e. PCI/USB/ISA/LPC/SPI, but excluding things that are just IP blocks and thus won't be encountered on a PC motherboard or card), or does it add too much to the compile time & kernel size?
<Googulator>If the latter, we can keep the current minimalist config for qemu, and use a more extensive one for bare metal
<stikonas>it might blow up the size too much but you have to try, without trying you won't know if you exceed the limits
<Googulator>first try failed with "No rule to make target `drivers/net/ethernet/3com/typhoon.o'" - probably because deblobbing removes that entire driver
<Googulator>ouch... deblobbing also kills RTL8169 Ethernet support
<Googulator>luckily this board also has a Marvell Yukon, but I'm quite sure RTL8169 & derivatives handled by the same driver are the single most common Ethernet chipset people will have access to
<Googulator>ugh, this seems really heavy-handed... only some devices handled by r8169 require firmware, but the deblob script just straight up deletes the entire driver
<Googulator>RTL8169 for example appears to need no firmware
<Googulator>neither does RTL8110SC, which is what this board has
<oriansj>well some have argued that mankind is at war with Firmware and firmware thus far has been winning.
<Googulator>I fully understand not wanting to run unaudited vendor firmware in the trusted bootstrap environment, but in this case, it would be more than enough to just, well, not include the firmware files themselves
<Googulator>which is what Linux already does by default in this case
<Googulator>The only side effect is that those few Realtek Ethernet chips that do require it won't work
<Googulator>from a cursory look at the code, even for most RTL chipsets that do use firmware blobs, they're optional for things like power management
<Googulator>It seems this deblob script was written from a standpoint of "it's illegal for a GPL kernel to even have the capability to load or interface with non-GPL firmware", not from "block untrusted and unauditable code"
<stikonas>Googulator: we were just running deblob script with delete functionality for speed
<stikonas>proper deblob can use sed/awk/perl to do more fine tuned cleanups
<Googulator>in this case, the firmware files are even under a license that permits reverse engineering, and quite possibly even the sources for them are public
<stikonas>but 1. it takes much longer (extra 1h or so)
<stikonas>2. I don't remember if our perl or awk were buggy...
<stikonas>perhaps deblob can be a configurable option
<Googulator>I do consider deblobbing in general to be necessary to ensure the security of the bootstrap environment, but it should be restricted to actually removing any remaining embedded blobs
<Googulator>Even removing firmware file loading capability is unnecessary if we don't include any firmware files anyway
<Googulator>& dropping entire drivers just because they can optionally load firmware... sigh
<Googulator>What's worse, the deblob script says this at the beginning:
<Googulator># This script, suited for the kernel version named below, in kver,
<Googulator># attempts to remove only non-Free Software bits, without removing
<Googulator># Free Software that happens to be in the same file.
<Googulator># Drivers that currently require non-Free firmware are retained, but
<Googulator># firmware included in GPLed sources is replaced with /*(DEBLOBBED)*/
<Googulator># if the deblob-check script, that knows how to do this, is present.
<Googulator># -lxoliva
<Googulator>yeah, that's apparently a lie
<Googulator>oh, I see it now... --force has this side effect
<Googulator>it's not just for ignoring errors
<stikonas>yeah, we use much faster mode....
<stikonas>but it drops far more stuff
<stikonas>perhaps it's fine to switch to longer mode
<stikonas>as long as you can get it to work
<Googulator>Even a similar simple deblob script that only removes drivers that actually embed blobs would be fine
<Googulator>r8169 doesn't
<Googulator>Even in its non-force mode, the script we use currently also tries to remove firmware file loading support, which is completely unnecessary here
<Googulator>Removing entire drivers for embedding blobs is probably fine, but not hunting down file loading capability (which is harmless if the files aren't there)
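The policy described above (drop drivers that embed blobs, keep drivers that only load firmware from files) could be sketched roughly as follows. This is a minimal illustration, not what the real deblob script does: the 64-byte threshold, the regular expressions, and the function names `classify_driver` and `keep_driver` are all assumptions made for the example. `request_firmware` is the kernel's firmware-loading call; everything else here is hypothetical.

```python
import re

# Illustrative heuristics only (assumptions, not the deblob script's logic):
# a brace-initialised run of 64 or more hex byte literals is treated as an
# embedded blob.
HEX_ARRAY = re.compile(r"\{\s*(?:0x[0-9a-fA-F]{2}\s*,\s*){64,}")
# Calls that load firmware from a file at runtime.
FIRMWARE_LOAD = re.compile(r"\brequest_firmware(?:_nowait)?\s*\(")


def classify_driver(source):
    """Classify driver source text as embedding a blob, loading a file, or clean."""
    if HEX_ARRAY.search(source):
        return "embeds-blob"
    if FIRMWARE_LOAD.search(source):
        return "loads-firmware-file"
    return "clean"


def keep_driver(source):
    """Keep everything except drivers that embed a blob in their source."""
    return classify_driver(source) != "embeds-blob"


if __name__ == "__main__":
    embedded = "static const u8 fw[] = { " + "0x00, " * 64 + "};"
    loader = "err = request_firmware(&fw, name, dev);"
    print(classify_driver(embedded), keep_driver(embedded))
    print(classify_driver(loader), keep_driver(loader))
```

Under this sketch a driver that merely calls `request_firmware` is kept, since the call does nothing useful when no firmware files are shipped.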
<stikonas>well, discuss it with fossy...
<stikonas>maybe we can make it configurable
<stikonas>but yes, I think firmware situation won't be getting any better
<stikonas>(unless vendors decide to move it to embedded storage on the devices)
<stikonas>but we won't be seeing almost any free/open firmware
<stikonas>it's often signed/encrypted these days anyway
<Googulator>Worse, the script also kills drivers that use free firmware
<Googulator>Filed GH issues for both the deblob issue and the larger problem of only qemu ethernet drivers being enabled
<Googulator>For now, the deblob issue is not a blocker for me, but probably is a blocker for most people
<Googulator>The config one is a blocker for virtually everyone trying bare-metal, but I already have a fix in the pipeline for that
<Googulator>Is there any reason why we aren't using 4.9.10 also for the linux-headers package?
<Googulator>Having a whole 2nd Linux kernel source tarball in srcfs (one that's newer and much larger than the one that's actually built) just for headers seems like overkill.
<stikonas>Googulator: no reason
<stikonas>I suggested to fossy to downgrade it, or maybe use newer kernel for building (though we can't use 5.x)
<stikonas>but everybody was busy...
<stikonas>so it hasn't been done
<stikonas>since linux package was added, we upgraded binutils
<stikonas>so we are likely to be able to build newer kernel
<stikonas>(and are only limited by GCC)
<stikonas>we also could probably upgrade GCC to 4.6
<Googulator>Would that enable skipping the 4.7 build in sysc?
<stikonas>then maybe immediately build something from 5.x (e.g. this one is happy with GCC 4.6 https://kernel.org/doc/html/v5.1/process/changes.html)
<stikonas>no
<stikonas>we need C++ there
<stikonas>though maybe we can just keep it at 4.6 in sysc too
<stikonas>in sysa we only build C backend
<stikonas>so potentially we can cleanup quite a few things
<stikonas>but that's non-trivial amount of work
<stikonas>(also we think that riscv will eventually target 4.6)
<stikonas>as that is where riscv support was being backported by ekaitz
<Googulator>Also, what is guile used for in the bootstrap?
<Googulator>I was wondering if we could move it behind INTERNAL_CI
<stikonas>Googulator: autogen
<stikonas>which is used for creating top level ./configure script in binutils and gcc
<stikonas>(it's just to rebuild some pre-generated files)
<stikonas>but that's large part of what live-bootstrap is
<Googulator>So neither is used in the Python bootstrap
<stikonas>e.g. we don't run other ./configure scripts either until we have bootstrapped autotools
<stikonas>no, it's not for Python
<stikonas>Python might actually be unused right now
<stikonas>but you'll need it if you want to build glibc
<Googulator>doesn't the final GCC also need Python?
<Googulator>also, Python, despite having so many steps, is relatively fast to bootstrap
<Googulator>much faster than Guile
<Googulator>I'd suggest moving guile and autogen after [ "${INTERNAL_CI}" = "pass2" ] && exit 0
<Googulator>Unless there's some grander plan for using it
<stikonas>Googulator: I'm not aware of python usage in gcc but maybe i'm wrong
<stikonas>that pass2 is determined by timings, it's a bit of a hack to fit in github action ci time limits
<Googulator>Oh OK
<stikonas>but I agree, there is some potential for reordering
<Googulator>I thought it was for a quicker build during development when you don't care about modern GCC in the end
<stikonas>but it has to be before final rebuild of binutils and gcc
<stikonas>no, for quicker build during development use prebuilt packages
<stikonas>there is --early-preseed and --repo options
<stikonas>you can skip most of the early stages
<Googulator>I noticed --early-preseed, but what if you specifically want to test the earlier stages?
<Googulator>preferably with a way to inspect the results
<stikonas>well, then you need to run those
<stikonas>it's basically mes that is slow
<Googulator>Yes, but since nothing writes to the disk until sysc, we won't get a chance to inspect the results until sysc gives us the final Bash prompt, which right now happens after guile :(
<stikonas>on baremetal, yes...
<stikonas>for development you can always do that in chroot/bwrap
<stikonas>though one could create a bash prompt earlier, just need to change scripting
<stikonas>e.g. stop in sysc after bash 5 is built
<stikonas>that ahojlm somehow seems to have released something this year and claims we are slower with full source bootstrap...
<stikonas>(this is from bootstrappable/rb-general mailing list)
<oriansj>oh well. Meaningless claims that depend on non-standard definitions are easy to make. not worth fighting over.
<oriansj>if there was a standard definition and it was generally agreed upon. then who cares who was first; we are having fun.
<oriansj>our greatest success is the people here.
<oriansj>and no one can ever take that away (even if we need to move IRCs again ^_^)
<vagrantc>test suite failures for mes on riscv64: https://buildd.debian.org/status/fetch.php?pkg=mes&arch=riscv64&ver=0.25-1&stamp=1699793557&raw=0
<vagrantc>and i'm not sure what failed on armhf: https://buildd.debian.org/status/fetch.php?pkg=mes&arch=armhf&ver=0.25-1&stamp=1699749262&raw=0
<vagrantc>ekaitz, janneke : ^^
<ekaitz>vagrantc: the groundhog day
<ekaitz>stikonas_: ^^ it's setjmp!
<ekaitz>and something else
<ekaitz>and truncate-shift
<ekaitz>snuik: later tell janneke truncate-shift might be garbage
<snuik>Will do.
<ekaitz>snuik: setjmp... you have to talk with stikonas for that
<snuik>Err...
<ekaitz>snuik: later tell janneke setjmp... you have to talk with stikonas for that
<snuik>Will do.
<ekaitz>snuik: botsnack
<snuik>:-)
<stikonas_>hmm
<stikonas_>which compiler is used there?
<stikonas>we have only tested setjmp on tinycc
<stikonas>and build tested on mescc
<stikonas>I guess nothing else is expected to work on riscv
<vagrantc>stikonas: ah, it is probably gcc ... 13 maybe?
<stikonas>well, we haven't tested riscv64 setjmp at all with gcc...
<stikonas>perhaps we should skip that test for now...
<stikonas>riscv64 was only really tested with mescc and bootstrappable tinycc
<vagrantc>the debian packages of mes build mes with gcc and in some cases (armhf) mescc
<vagrantc>er, armhf and i386
<vagrantc>should riscv64 in theory work with configure --with-bootstrap?
<stikonas>hmm, I think it should
<stikonas>but whether that script is tested
<stikonas>that's another matter...
<vagrantc>well, there's at least one way to test it :)
<stikonas>at least for live-bootstrap scripts that I used for testing all of this, we build mes-m2 and then use it to bootstrap mes
<stikonas>but that's kaem scripts
<stikonas>ekaitz did some testing with bash scripts
<stikonas>so perhaps they do work
<ekaitz>i wouldn't count on that
<stikonas>anyway, that's the first release with riscv64, we can't have everything working :)
<vagrantc>yeah, i enabled it to shake the tree, so to speak. :)
<ekaitz>:)
<stikonas>vagrantc: I think that ahojlm is confusing bootstrapping for reproducible building...
<stikonas>anyway, I probably won't waste any more time on that thread...
<vagrantc>stikonas: yeah, i am pretty much done with that too.
<vagrantc>hence, xkcd.
<jcowan>oriansj: I think it's unlikely that the Korean Empire will attack a second time
<oriansj>stikonas: well, when one seeks credit above all else, rationality tends to be left behind.
<oriansj>and they admitted as much
<Googulator>Meanwhile, went searching for potential parallel NAND flash and USB Mass Storage controller pairings that could work for the "Trusted Flash drive", and found... nothing usable.
<Googulator>An alternative approach may be to use SPI flash, which is available in sizes as small as 128 bytes, and as large as 256MB - and you can multiplex 2 SPI chips on the same bus, only needing to switch the Chip Select line between them with a mechanical switch (for security)
<Googulator>Unfortunately I'm not aware of any pure hardware implementation of a USB Mass Storage controller with an SPI backend, so a microcontroller would be needed, which means _firmware_
<Googulator>But if we can find a microcontroller that can read that firmware from an external SPI flash chip (a 3rd one), and make it sufficiently small, preferably small enough to be bitbanged into a flash chip by hand (like https://www.youtube.com/watch?v=8ZYMrcHm91s but for SPI), it's doable without introducing a potential backdoor path
<Googulator>We don't currently fit on 256MB, but we will if we drop the 2nd Linux source tarball currently used for the headers, and recompress the 4.9.10 tarball as bz2
<oriansj>well SPI flash chips tend to be a little more complicated to program by hand but workable (relative to byte/word wide flash chips)
<oriansj>and firmware isn't an issue if we can create it manually (aka small)
<oriansj>I'd put the upper limit of manually correctly hand toggling a binary at 1KB
<oriansj>(potentially more if lucky but I wouldn't bet on it)
<Googulator>Finally got Linux to start and get an Internet connection, and actually start building on baremetal... and then another curveball. For whatever reason, the keyboard doesn't work immediately after boot - and because consoleblank=0 isn't set, after 10 minutes, the screen goes blank
<Googulator>This time I'm sure it's not frozen, as I can see it download packages in tcpdump
<stikonas>slow but steady progress :)
<stikonas>you might be the first one running this so far on baremetal
<stikonas>I wanted to run it on my next laptop but that's in the future...
<Googulator>I kind of had a feeling this was never actually tested on bare metal...
<Googulator>The USB issue is really weird, as during the short time I had real display, I could plug in a USB flash drive, and it did print on console that it detected a new UAS device
<Googulator>But not if I plug in a keyboard
<Googulator>hmm... maybe UHCI/OHCI is disabled, so it's not detecting USB 1.x devices?
<Googulator>let's try a hub with a TT...
<Googulator>Hub's not working at all, not even USB 2.0 devices are detected
<Googulator># CONFIG_USB_OHCI_HCD is not set
<Googulator># CONFIG_USB_UHCI_HCD is not set
<Googulator># CONFIG_USB_XHCI_HCD is not set
<Googulator>CONFIG_USB_EHCI_HCD=y
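A quick way to spot this kind of gap before booting is to scan the kernel .config for the USB host-controller options quoted above. The following is a minimal sketch: the option names come from the log, while the helper names `parse_kconfig` and `missing_usb_hcds` are made up for the example, and which controllers a given board actually needs is an assumption that depends on its chipset.

```python
import re

# Host-controller options quoted in the log above. Which of them a board
# needs depends on its chipset (assumption for this sketch).
USB_HCD_OPTIONS = (
    "CONFIG_USB_OHCI_HCD",
    "CONFIG_USB_UHCI_HCD",
    "CONFIG_USB_EHCI_HCD",
    "CONFIG_USB_XHCI_HCD",
)


def parse_kconfig(text):
    """Map option name to value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def missing_usb_hcds(text):
    """Return the host-controller options that are neither built in nor modular."""
    opts = parse_kconfig(text)
    return [name for name in USB_HCD_OPTIONS if opts.get(name, "n") not in ("y", "m")]


SAMPLE = """\
# CONFIG_USB_OHCI_HCD is not set
# CONFIG_USB_UHCI_HCD is not set
# CONFIG_USB_XHCI_HCD is not set
CONFIG_USB_EHCI_HCD=y
"""

if __name__ == "__main__":
    for name in missing_usb_hcds(SAMPLE):
        print("missing:", name)
```

With the four lines from the log as input, only EHCI is reported as enabled, which matches the hypothesis that USB 1.x devices such as keyboards are not being picked up.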