IRC channel logs

2021-02-17.log

<stikonas>pder: so we might need to patch perl a bit later...
<stikonas>since it reports version 0.000
<stikonas>I might also need to build newer perls later
<stikonas>for not ancient versions of autoconf/automake
<pder>stikonas: nice work on all the perl stuff. I have been wondering what is currently broken with floats and tcc. I'll try running some tests with tcc-mes and tcc-musl. Also, do you think there is any benefit to trying to build gio's fork of tcc that includes softfloat?
<stikonas>hmm, maybe, you can try
<stikonas>maybe it will fix floats...
<stikonas>anyway, I'm going to bed now...
<stikonas>will work more on perl stuff later...
<pder>cool, thanks for all the work on it
<stikonas>no problem
<stikonas>are you going to try autotools later?
<pder>I was planning on trying to understand this float problem better first
<stikonas>ok
<stikonas>anyway, goodnight
<pder>goodnight
<gforce_d11977>stikonas[m]: fossy: instead of using: sed -e '1,/__END__/ d' keywords.pl | sed '1d' | awk '{print "#define", "KEY_"$0, NR-1}' > keywords.h - we can maybe use a script like this: http://intercity-vpn.de/keywords.sh.txt
<gforce_d11977>if there is interest, i can convert 'opcode.awk' also into a /bin/sh script
<gforce_d11977>stikonas[m]: fossy: i think it is generally a good idea to run e.g. /my/new/program --version (or --help)
<fossy>gforce_d11977: why would we want to not use awk?
<fossy>gforce_d11977: i agree on the testing thing, that has not been added to the bash build harness yet though
<stikonas[m]>fossy: if I understand correctly, writemain is used to create final non mini perl
<fossy>but we use miniperlmain no?
<stikonas[m]>Yes
<stikonas[m]>So without any extensions
<gforce_d11977>fossy: i would generally avoid AWK or SED, or let's say: avoid unreadable oneliners. It's a documentation issue IMHO. In terms of bootstrap, we should avoid using all the tools which are not in good shape (old versions) and stick to portable constructs if somehow possible. for the sed-hacks in the early beginning, maybe we extend 'catm' for that task
<fossy>stikonas[m]: ah i see
<fossy>gforce_d11977: i agree with the avoiding unreadable oneliners
<fossy>what do you mean by sedhacks? we use sed only a couple of times to delete some lines from files?
<fossy>most of the tools, while old, are in pretty fine shape, and work quite well - i don't think that is a reason to not use them
<stikonas[m]>Yeah, those commands we use are fairly basic
<fossy>that line that you quoted just needs a comment or two just to explain what it does imo
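For reference, here is the quoted one-liner with the kind of comments fossy is describing, as a minimal sketch. The function name gen_keywords_h is hypothetical, and the reading that the second sed drops one line directly after __END__ is inferred from the command itself, not from the keywords.pl source.

```shell
# Hypothetical wrapper around the pipeline quoted in the log.
# $1: path to keywords.pl; the generated header is written to stdout.
gen_keywords_h() {
    # 1. Delete everything from line 1 up to and including the
    #    first line that matches __END__.
    # 2. Delete the first remaining line (the line right after __END__).
    # 3. Turn each remaining line into "#define KEY_<line> <index>",
    #    where the index starts at 0 (NR is 1-based, hence NR-1).
    sed -e '1,/__END__/ d' "$1" \
        | sed '1d' \
        | awk '{print "#define", "KEY_"$0, NR-1}'
}
```

It would be called as `gen_keywords_h keywords.pl > keywords.h`, which matches the redirect in the original command.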
<stikonas[m]>And e.g. sed is not too old
<stikonas[m]>We have built sed 4
<stikonas[m]>And opcode.awk should be just as readable as perl script if not more...
<fossy>re: portable constructs, sed and awk are both posix utilities and exist on almost every unix system. but i think that's beside the point here; the entire point of live-bootstrap is precisely that we don't need to rely on portable constructs
<fossy>the bootstrap is a contained end-to-end process, we don't need to care if something is portable; it's a fully closed system
<fossy>while something being portable does make it easier to "fix" if we swapped out say the bootstrap kernel, userland programs don't rely on any of that
<fossy>re: opcode.awk, i see no problem with it, it's one of the more readable awk scripts i have ever seen
<fossy>i also do not understand how awk/sed are documentation issues
<stikonas[m]>Yeah, I haven't used awk much before but I found its documentation alright when I was writing opcode.awk
<stikonas[m]>And writing our own tool is definitely worse in terms of documentation
<stikonas[m]>And especially extending catm
<stikonas[m]>catm is written in hex0
<stikonas[m]>I think OtiansJ can confirm that writing in hex0 is no big fun
<stikonas[m]>*OriansJ (sorry, typing on phone)
<gforce_d11977>fossy: our sed4 is 15 years old
<stikonas[m]>sed is from the 1970s...
<stikonas[m]>So 15 years old is not that old
<stikonas[m]>It's now almost 50 years
<stikonas[m]>(And GCC 2.95 is older than sed 4 too)
<stikonas[m]>Guix used sed 1 for bootstrap which is quite a bit older
<gforce_d11977>you are right, that the bootstrapping itself is a "closed box". At least I wish, that we use more comments.
<fossy>gforce_d11977: i am more than happy to add more comments :)
<fossy>in general
<fossy>well not really "how to use this" comments, that belongs in a readme or something like, but "how this works" comments are good
<gforce_d11977>generally i dont trust our tools yet 8-) that was the main reason to use bash2.05-syntax (which is also 20 years old 8-))) https://lists.gnu.org/archive/html/info-gnu/2002-07/msg00005.html)
<gforce_d11977>really, I'm unsure: Maybe it's better to write small tools in C for those tasks, than suck in e.g. awk or sed. I know that we need it for autotools/automake, but maybe we can live with small stubs till then. I think for now we only use 'sed s/foo/bar/ file' or 'awk print $1'
<gforce_d11977>(but that is something for later and not important for now, dont get me wrong)
<fossy>i am unconvinced
<fossy>1. is there a problem with old working tools 2. what is the issue with awk or sed
<fossy>it's wayyyy easier to do text processing tasks in awk than it is in shell
<fossy>we aren't really looking to build an ecosystem of programs here/our own os
<fossy>i guess i'm thinking quite a lot about auditability here
<fossy>it's not whether it "works", it's a lot more about if a team with some time and effort can sit down and ensure the bootstrap is correct and secure
<fossy>adding a whole lot of our own random tools does not assist with that, nor does using the wrong tool for the job (sh for text processing, for example)
<gforce_d11977>fossy: if we can avoid 'sed' and 'awk' in the early process, we build newer tools with proper 'make' or even autotools/automake. the simple things for which we are using these tools now are easily done with sh (or bash) too
<gforce_d11977>and: it is easier to audit a 20 line c-program, than a big GNU-sed or GNU-awk
<stikonas>gforce_d11977: but these tools are needed for autotools/automake anyway
<stikonas>so you can't avoid auditing them
<stikonas>so by adding another C program you actually make audit harder
<stikonas>also auditing a custom C program is probably harder
<stikonas>gnu sed and awk were looked at by many more people over time
<fossy>^ yes, this exactly
<fossy>sed + awk are used by configure scripts and automake makefiles extensively
<bauen1>the problem with old versions of such programs is that they're no longer getting looked at
<fossy>this is true
<fossy>but i'm not sure that's really a problem? if it was looked at then, will that not hold for that now?
<bauen1>but in any way, getting a "minimal viable project", a toolchain capable of building linux, glibc, gcc should go a long way for raising interest in this project
<bauen1>fossy: there are still bugs in awk/sed ; even in the old versions, just now there is nobody that keeps looking at the source code and tries to find them
<fossy>bauen1: i see what you mean now, yes this is correct
<bauen1>not necessarily a very big issue, but by using such old versions you're effectively soft-forking them, like with tinycc
<fossy>how so
<fossy>?
<bauen1>fossy: if you discover a bug, you need to patch it, etc...
<fossy>m, yes
<stikonas>but then you only need to inspect patches
<bauen1>ideally you could somehow convince "upstream projects" to make their code simpler ...
<stikonas>which we keep to the minimum
<fossy>i get what you're saying though bauen1
<fossy>it's quite hard when upstream is "dead" (when i mean upstream i mean that version)
<stikonas>and newer versions generally can't be used for bootstrap
<stikonas>either because they depend on new libc or because they are too interconnected
<fossy>and i don't think there is a real solution to this either; with the current manpower we have, keeping up with the target (which is always moving) requires constant movement as is
<stikonas>yeah, and we haven't even looked at non-x86 arches
<fossy>replacing older programs through either maintaining them, writing replacements (as gforce_d11977 is describing), or extending libc/gcc/whatever to work with newer versions is just not possible right now for me
<fossy>the only time i would really even consider a replacement of a useful tool such as awk with a custom hand-written script that reimplements an awk script in another language is if it meant a newer version could be used later or allowed the bootstrap to advance significantly more quickly
<bauen1>fossy: getting a "minimal viable product" i.e. a toolchain ready to build a kernel, glibc, gcc, git, ... will raise a lot more interest, up to the point where you might be able to influence upstream, start rewriting tools (e.g. in a safer C), review code, etc...
<fossy>i don't believe that is the case for this particular opcode.awk issue
<fossy>bauen1: i hope so!!!
<stikonas>and in any case, we later build safe tools, most bugs can't propagate to there
<stikonas>only something very specific like TrustingTrust attack can
<stikonas>and we don't have that in binaries because we build them from scratch
<bauen1>stikonas: what about bugs in the source code that *do* propagate ? not sure if there are examples of this
<stikonas>yeah, I'm not sure if there are examples of this...
<stikonas>I think you have to do a lot more work than random bug in order to make something propagate
<stikonas>because it has to survive purely in binary code
<bauen1>stikonas: i was thinking about wrong constants, which propagate more easily
<bauen1>but yeah, i thankfully can't find an example of that yet
<stikonas>probably wrong constants would break things completely, so would be spotted before they are introduced...
<stikonas>I think it has to be more sophisticated
<bauen1>stikonas: that's what i'm not sure about, if e.g. a compiler bug introduces a bug in musl -> potentially a bug in gcc -> ... ; for example not taking a branch under very specific circumstances
<bauen1>the real question would be if it propagates (in any form) past the bootstrap or a gcc recompile would "make it go away"
<bauen1>which does seem rather unlikely
<gforce_d11977>bauen1: the more giant blobs->SLOBS we use, the more attack surface
<stikonas>I think gcc recompile, especially against glibc will make it go away
<bauen1>or rather it should be caught if you recompile gcc twice (or more)
<gforce_d11977>i hope so...
<bauen1>stikonas: well not make it go away, but unless the propagation is "stable" ; which is unlikely for a non-malicious bug i would say, you can compare checksums after a few recompiles
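The "compare checksums after a few recompiles" check could look roughly like the sketch below. It assumes GNU coreutils' sha256sum is available; the function name same_checksum and the example file names in the usage note are hypothetical and not taken from live-bootstrap.

```shell
# Hypothetical helper: succeed only if two build outputs hash identically.
# $1, $2: paths to the binaries produced by two successive rebuilds.
same_checksum() {
    # Hash from stdin so the file name is not part of the output line.
    a=$(sha256sum < "$1")
    b=$(sha256sum < "$2")
    [ "$a" = "$b" ]
}
```

For example, `same_checksum stage2/cc1 stage3/cc1 || echo "outputs differ"` would flag a bug whose effect changes from one rebuild to the next. A "stable" propagation, as bauen1 notes, would not be caught this way.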
<gforce_d11977>(that is stuff for a university: injecting stuff, which survives bootstrapping...)
<bauen1>gforce_d11977: it's not that hard ; i did that to hex0 which hijacks the write syscall, you just need a common pattern ; and then somehow survive the code review
<stikonas>bauen1: but does it actually survive all the way (and this is malicious case)
<stikonas>you have to do some deliberate effort for that, harder to imagine accidentally happening
<bauen1>stikonas: no, i didn't add enough patterns to make it survive very far, but would be doable
<stikonas>well, adding enough patterns blows up initial binary size
<stikonas>you can't add much to 357 bytes (even if you pass code review)
<bauen1>stikonas: why put it in hex0 ? just hide it somewhere later
<bauen1>it means that you can't really affect earlier binaries, but you can still propagate to the final toolchain, possibly undetected
<stikonas>somebody should have added it in 15 year old code then
<stikonas>but it's probably much easier to spot it when it's in source
<stikonas>rather than in binary
<bauen1>yes
<gforce_d11977>what i just want to say: for making the bootstrap, it is a good idea to not only avoid binaries, but also to avoid sourcecode (as much as possible)
<gforce_d11977>bauen1: nice idea! 8-)
<gforce_d11977>(hijacking the write syscall)
<OriansJ>stikonas: well there is always the proposed M3 solution to making the bootstrap simpler; it just is still 2 years of work away from being done.
<OriansJ>or getting MesCC to the level of being able to compile GCC directly ( janneke any guess how long that would take? )
<stikonas>well, if you want to bootstrap autotools you would still need to get other tools running...
<stikonas>that depends on your goals...
<stikonas>and there are different things one can do
<stikonas>and also stuff like bison...
<OriansJ>stikonas: yes and being able to use GCC to do that work directly, would be much easier.
<stikonas>hmm, possibly...
<stikonas>I wonder how many files are there in gcc core
<stikonas>that we need to build
<OriansJ>well my path was largely assume you had the tool you needed, make a working path and then build it to spec
<stikonas>yeah, but my question is how are we building GCC?
<stikonas>compiler and linker is not enough
<OriansJ>yep, bison to generate the grammar
<stikonas>it's all the build system too...
<stikonas>need to compile make too, etc...
<OriansJ>stikonas: we can strip out the build system
<stikonas>well, up to some point
<OriansJ>convert it to a single shell script
<stikonas>not sure if GCC is simple enough
<OriansJ>No
<OriansJ>EVERYTHING
<stikonas>we were stripping build systems in live-bootstrap
<OriansJ>is just a list of commands executed
<OriansJ>It might be a long painful list
<stikonas>well, yeah...
<OriansJ>but it'll be build these things in this order
<stikonas>with a lot of defines...
<stikonas>but yeah...
<OriansJ>which you can record from a cheat run and use
<OriansJ>The Bison generation does have to be done and that bootstrap does have to be solved but autotools, make and the rest are just nice to haves.
<OriansJ>Strip out complexity that isn't absolutely required.
<OriansJ>and in no way is autotools and make complexity that is actually required.
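One way to picture the "record from a cheat run" idea is a small logging wrapper, sketched below. This is an illustration only and not how OriansJ's tooling or live-bootstrap works: record and BUILD_LOG are hypothetical names, and a real capture would also need to preserve quoting, working directories and environment variables.

```shell
# Hypothetical log file that accumulates the flat list of build commands.
BUILD_LOG=${BUILD_LOG:-build-commands.sh}

# Run a command and append it to the log, so the log can later be
# replayed as a plain shell script without the original build system.
record() {
    # "$*" joins the arguments with spaces; quoting is NOT preserved,
    # which is acceptable only for simple commands like the one below.
    printf '%s\n' "$*" >> "$BUILD_LOG"
    "$@"
}
```

A cheat build that invoked every step as `record cc -c foo.c` would leave behind a file that is "just a list of commands executed", which can then be reviewed and run with `sh build-commands.sh`.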
<stikonas>well, make is actually very easy to compile...
<stikonas>but bison is not that simple
<stikonas>it depends on fairly advanced libc, reasonably new m4 and flex
<stikonas[m]>Which is half of live-bootstrap...
<janneke>OriansJ: never really looked at it
<gforce_d11977>yeah, stripping out complexity is a nice goal with a long road - BTW: i hacked a 'sed' replacement in 40 lines: http://intercity-vpn.de/remove_lines.c
<gforce_d11977>(this is for what we use sed at the moment)
<OriansJ>stikonas: Ok but could we remove the need for bison entirely with nyacc or another easier to bootstrap alternative?
<stikonas>I am not familiar enough with nyacc to tell...
<stikonas>but I wouldn't be surprised if with some work it would be possible
<stikonas>gcc only seems to have bison files, so on its own it does not need flex
<dannym>janneke: I found Linux config option OABI_COMPAT that would allow OABI executables to run. Is it turned on on novena? Hard to check because Guix kernels don't have /proc/config :(
<dannym>Maybe overdrive has /proc/config ?
<dannym>At least those:
<dannym>dannym@novena ~/src/guix-wip-arm-bootstrap/guix/gnu/packages/aux-files/linux-libre$ grep OABI_COMPAT *
<dannym>4.14-arm.conf:# CONFIG_OABI_COMPAT is not set
<dannym>4.19-arm.conf:# CONFIG_OABI_COMPAT is not set
<dannym>5.4-arm.conf:# CONFIG_OABI_COMPAT is not set
<dannym>5.9-arm.conf:# CONFIG_OABI_COMPAT is not set
<dannym>(OABI is from before year 2000)
<dannym>Can't we have gcc-2.95 just use EABI syscalls? It's not a lot of difference anyway--just load r7 with the syscall_number and then svc #0 (instead of just svc 0x900...0 + syscall_number)
<stikonas>OriansJ: on the other hand gcc 4.7.4 does not contain essential bison files (although it does contain flex .l file)
<stikonas>so if we want to try to skip old gcc, then we need flex
<stikonas>anyway, even if you bootstrap gcc and have very few other tools, it won't be that much easier than bootstrapping those tools with tcc and then building gcc
<stikonas>tcc can cope with fairly advanced C
<stikonas>if you want to skip tcc entirely, that might be another thing...
<OriansJ>stikonas: except GCC+glibc would eliminate a lot of possible problems
<stikonas>well, compared to mes libc, definitely
<stikonas>compared to tcc+musl, maybe, but I think less so
<stikonas>anyway, for bootstrapping, more paths is better, not worse
<stikonas>especially if they'll end up producing binaries with the same hash
<gforce_d11977>stikonas: important sentence! (more paths are better)
<civodul>dannym: re /proc/config.gz, you can do something like: guix gc -R $(guix gc --derivers $(readlink -f /run/current-system/kernel/bzImage)) | grep '\.conf'
<dannym>civodul: Thanks! CONFIG_OABI_COMPAT is indeed not set.
<dannym>(just tried that command--it works)
<dannym>(on novena, I get /gnu/store/51lywr26bayiq54hc15lhwak24ad6yhk-5.9-arm.conf )
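Once the .conf file has been located, the check dannym did by eye can be scripted. The sketch below assumes the usual kernel config format, where an enabled option appears as `CONFIG_FOO=y` or `CONFIG_FOO=m` and a disabled one as `# CONFIG_FOO is not set`; the function name kconfig_is_set is hypothetical.

```shell
# Hypothetical helper: succeed if a kernel config option is built in (=y)
# or built as a module (=m).
# $1: full option name, e.g. CONFIG_OABI_COMPAT
# $2: path to the kernel .conf file
kconfig_is_set() {
    # Anchoring at line start skips "# CONFIG_... is not set" comments.
    grep -q "^$1=[ym]" "$2"
}
```

With the file from the log, `kconfig_is_set CONFIG_OABI_COMPAT 5.9-arm.conf || echo "not set"` would print "not set", matching the grep output above.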
<fossy><gforce_d11977> what i just want to say: for making the bootstrap, it is a good idea to not only avoid binaries, but also to avoid sourcecode (as much as possible)
<fossy>right, exactly
<fossy>we want to minimize the number of programs required across the entire bootstrap
<fossy>but if say we used your sed replacement, we still require sed later, so effectively all that has been done is added another program
***ChanServ sets mode: +o rekado
<stikonas>and we require sed for string replacement too
<stikonas>(I think as part of flex bootstrap)
<fossy>OriansJ: I guess it's kinda a compromise between time and complexity; we can just strip out autotools and make from every single package and just use kaem, but that would take literally decades, at our current rate of work. For now it seems to be easier to just make autotools work, but eventually it would be nice to not use autotools at all
<fossy>complexity of implementing the complex thing vs removing it
<stikonas>fossy: I think OriansJ was proposing kind of automatic capture of commands to run...
<stikonas>that's why it was called "cheating"
<stikonas>testing perl changes now...
<stikonas>ok, it works, just need to put stuff into the right commits...
<stikonas>hmm, maybe I'll just create a new commit...
<fossy>stikonas: ooh, I see
<fossy>I don't mind that
<stikonas>fossy: ok, I pushed a new commit
<fossy>but then that clearly has to be fixed and audited...
<stikonas>yeah...
<stikonas>and it will be a long script...
<stikonas>and even if you have gcc with glibc
<stikonas>you still then need to bootstrap other tools...
<stikonas>you still need to write custom makefile for make, then build bash
<stikonas>bootstrap bison...
<stikonas>so the work that we are doing now will not disappear
<stikonas>it will be just postponed
<stikonas>well, maybe can drop a few patches
<stikonas>fossy: for these new perl tarballs from git archives I had to add more code to src_unpack
<stikonas>folders were named differently
<stikonas>other than that it worked
<fossy>stikonas: cool cool
<stikonas>I'll later work on newer perls...
<stikonas>it can run some older autoconf/automake, but newer versions need slightly newer perl...
<stikonas>and 5.003 was the best I could build with 5.000
<fossy>stikonas: unfortunately I think we may need different versions of autotools :-/
<stikonas>oh, that's for sure
<stikonas>that's why I'm trying to get newer perl too...
<fossy>and hence different perls
<stikonas>oh, I think they can all run on new perl
<stikonas>there is probably minimal version requirement
<stikonas>but that's all
<fossy>oh yeah probably
<fossy>is Perl backwards compatible?
<stikonas>I saw some versions complaining about needing at least 5.005
<stikonas>I don't know Perl...
<stikonas>but it probably is
<fossy>I'm pretty sure it's not semver
<fossy>Let me google
<fossy>so for the most part it is
<fossy>the only incompatibilities possible would be module removals
<stikonas>well, old automake runs without any modules
<stikonas>I think I tried 1.2
<fossy>right, ok
<stikonas>and I think we might need to install m4 libs/modules
<stikonas>there is stuff in m4-1.4.7/m4
<stikonas>or maybe not, not sure now. Anyway, we can do that when/if we need