IRC channel logs

2022-12-10.log

<alethkit>stikonas: Shame.
<alethkit>I assume you need some combination of BIOS boot/UEFI, so a port of Cosmopolitan libc might be the way to go
<stikonas>I don't remember exact details but it might be something simple enough like can't write to files
<stikonas>never heard of cosmopolitan libc
<alethkit>It's the actually portable executable
<alethkit>a.k.a. the universal x86 binary
<stikonas>ok website says runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + NetBSD + BIOS
<stikonas>I guess they haven't implemented UEFI though
<alethkit>You also get UEFI at the expense of windows
<stikonas>do you?
<alethkit>Yes
<stikonas>UEFI is quite different from windows
<alethkit>Not in terms of the binary format
<stikonas>it has the same calling convention
<alethkit>They're both COFF/PE, IIRC
<stikonas>yes, binary format is same too
<stikonas>but it has no syscalls
<alethkit>It has runtime/boot services which are stable
<stikonas>yes, but boot services are not available in windows
<stikonas>so I don't see how windows support adds uefi support
<alethkit>It doesn't
<alethkit>They're mutually exclusive
<stikonas>I guess you could dynamically determine
<stikonas>whether it's UEFI or windows
<stikonas>and then pick the appropriate code path
<stikonas>so you might have one binary running on both
<stikonas>well, our own M2libc is slowly getting UEFI support too
<stikonas>so M2-Planet can be built for both POSIX and UEFI
<stikonas>and it's exactly the same source code (if you don't count libc bits)
<alethkit>Figuring out where jart is might be helpful
<alethkit>But I think she hangs out on Discord?
<alethkit>stikonas: I wonder if sectorlisp's existing I/O can be expanded for UEFI
<alethkit>It probably beats attempting to port an entire libc
<stikonas>I guess it is possible, but it won't be sector lisp then
<stikonas>UEFI I/O is somewhat tricky
<stikonas>there is no way you can fit it in a sector
<alethkit>I mean, given the fact that UEFI defaults to 4K for a sector
<stikonas>does it?
<alethkit>(For GPT boot, that is)
<stikonas>I was not aware that it cares about sector size
<stikonas>and GPT works fine with 512 byte sectors
<stikonas>so for comparison hex0-seed on amd64 linux including elf header is 292 bytes. But UEFI version is 832 bytes
<stikonas>so extra UEFI overhead is about 540 bytes
<alethkit>But how large is the Linux EFISTUB?
<stikonas>probably much larger
<stikonas>you'll struggle to write anything non trivial that is smaller than hex0
<stikonas>even trivial UEFI application that does nothing would be about 400 bytes
<alethkit>Oh, I'm sure about that
<alethkit>I'm trying to figure out how to bootstrap from UEFI
<alethkit>if only to avoid redundancy in driver code
<stikonas>same way we bootstrap on Linux
<stikonas>start with hex0, reach tcc
<stikonas>use tcc to build some POSIX kernel
<stikonas>(this is the step that is different for UEFI)
<stikonas>and then continue till Linux is bootstrapped
<stikonas>you won't find anything simpler
<stikonas>and this is because: 1. Linux needs GCC
<stikonas>2. GCC needs lots of other tools, such as bison, flex, autotools (so perl, bash)
<stikonas>well, I can imagine bash can be replaced with something else, but you'll struggle to replace bison, flex or perl
<stikonas>so you still follow https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst
<alethkit>~If Linux needs GCC, just bootstrap using Hurd~
<stikonas>unlikely to be any better
<stikonas>Fiwix seems a better choice for bootstrap
<stikonas>Hurd is modular kernel, so I suspect complicated build system, etc..
<stikonas>and rickmasters said that tcc is capable of building fiwix
<stikonas>or at least tcc on musl (we haven't tried bootstrapped tcc yet)
<stikonas>so on UEFI I've now reached step "M2-Planet (v1) compiles kaem" in Part 1
<alethkit>When you say complicated, I assume Guix falls into that area?
<stikonas>Guix build system is definitely complicated enough
<stikonas>and it also needs Guile
<stikonas>and by complicated I mean in terms of bootstrapping purposes
<stikonas>i.e. make is very simple in that sense as you can just build it with a few tcc commands
<stikonas>something like autotools is a bit more complicated, you need to build bash and perl for it to work
<stikonas>something like CMake is much more complicated, as you need g++
<muurkha>replacing bison and flex is kind of a pain but not out of the question
<stikonas>I think guix also needs c++ (indirectly)
<alethkit>How heavily does the bootstrapping process rely on special guile extensions?
<stikonas>we don't use guile till very very late
<sam_>i'm surprised by how much stuff works with byacc and reflex at least
<stikonas>and so far the only package that needs guile is autogen
<alethkit>Wouldn't it be easier to replace bison and flex with handwritten parsers? Or are the grammars not LL?
<stikonas>well, handwritten parser is used to bootstrap bison and flex
<stikonas>I'm not sure about gcc itself
<stikonas>but it's not just gcc
<stikonas>you would have to write a lot of parsers
<stikonas>it's also binutils
<stikonas>a few versions of perl, etc..
<stikonas>oh and bash too (though we first build bash before bison and build that parser using yacc from heirloom tools)
<stikonas>well, the components in live-bootstrap are in principle whatever fossy and I found easiest to build
<stikonas>if you look at what was available at that time
<muurkha>if you wanted to replace bison and flex with handwritten parsers, which wouldn't require grammars to be LL, you could handwrite the parsers for bison and flex's input formats, rather than for the zillion things they're used for
<alethkit>That would make more sense, yes
<stikonas>yes, but there was already bison bootstrap chain that was done
<alethkit>Fair enough
<alethkit>I guess it gives me something to do
<muurkha>yeah, through various versions of bison, no?
<stikonas>so all we had to do is to run those steps (i.e. apply some patches, replace some files and build it)
<stikonas>it's mostly a single version
<muurkha>sorry for stupid questions
<stikonas>we do build some other versions later but that's because bison is not fully compatible between different versions
<muurkha>a handwritten shift-reduce parser is pretty simple to write if it doesn't have to be efficient; the breakthrough of LALR was that it was guaranteed linear time with no backtracking, not that it could parse things people didn't know how to parse before
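[Editor's sketch, not part of the log: a minimal illustration of the "simple but inefficient" handwritten shift-reduce approach mentioned above. The toy grammar and the name `recognize` are hypothetical. Instead of LALR tables, it tries every possible reduction and shift and backtracks on failure, so it can take exponential time.]

```python
# Hypothetical toy grammar: left-recursive sums of "n" tokens.
#   E -> E + T | T
#   T -> n
RULES = [
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("n",)),
]

def recognize(tokens, start="E"):
    """Return True if tokens derive from start, by exhaustive shift/reduce search."""
    def search(stack, pos):
        # Accept when all input is consumed and only the start symbol remains.
        if pos == len(tokens) and stack == (start,):
            return True
        # Try every reduction whose right-hand side sits on top of the stack.
        for lhs, rhs in RULES:
            n = len(rhs)
            if len(stack) >= n and stack[len(stack) - n:] == rhs:
                if search(stack[:len(stack) - n] + (lhs,), pos):
                    return True
        # If no reduction led to success, shift the next token and continue.
        if pos < len(tokens):
            return search(stack + (tokens[pos],), pos + 1)
        return False
    return search((), 0)

print(recognize(["n", "+", "n", "+", "n"]))
print(recognize(["n", "+"]))
```

[The search terminates here only because the toy grammar has no empty right-hand sides and no cycles of unit rules; a general-purpose version would need to guard against both.]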
<stikonas>bootstrap of bison is done using version 3.4.1 which we build 3 times
<stikonas>1. Build bison using a handwritten grammar parser in C. 2. Use bison from previous stage on a simplified bison grammar file. 3. Build bison using original grammar file.
<muurkha>and a handwritten PEG parser generator is probably simpler than that; I wrote an example one in https://github.com/kragen/peg-bootstrap/blob/master/peg.md
<stikonas>well, yes, we wrote parsers manually for e.g. M2-Planet
<muurkha>as a literate program which doubles as an example to PEG parsing
<stikonas>which is surprisingly readable
<muurkha>um, as an introduction to PEG parsing
<muurkha>once you have some systematic way to do backtracking, writing a parser for even fairly hairy languages is not that hard
<alethkit>systematic backtracking?
<alethkit>hmmm
<muurkha>yeah, like in a PEG parser, for example
<muurkha>or in Prolog DCGs
<muurkha>it may take exponential time to run (Packrat guarantees linear time for PEGs, but Prolog DCGs cover the entire class of context-free languages) but that doesn't always matter
<muurkha>this doesn't matter that much for bison and flex since we already have acceptable unbootstrapping paths for them, but it's likely to matter for things like Haskell, GNAT, etc.
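[Editor's sketch, not part of the log and not taken from the peg-bootstrap repository linked above: one way to picture "systematic backtracking" is PEG-style ordered choice, where each alternative is tried from the same input position and the first success wins. The helper names `lit`, `seq`, `choice`, `ref` and the toy grammar are hypothetical. There is no memoization, so this is not a Packrat parser and can take exponential time in the worst case.]

```python
def lit(s):
    """Match the literal string s at position i; return the new position or None."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def seq(*parsers):
    """Match all parsers in order; fail (None) as soon as one of them fails."""
    def parse(text, i):
        for p in parsers:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse

def choice(*parsers):
    """Ordered choice: try each alternative from the same start position,
    backtracking to it after every failure; the first success wins."""
    def parse(text, i):
        for p in parsers:
            j = p(text, i)
            if j is not None:
                return j
        return None
    return parse

def ref(name):
    """Late-bound rule reference, so that rules can be recursive."""
    def parse(text, i):
        return GRAMMAR[name](text, i)
    return parse

# Hypothetical toy grammar:
#   expr <- term "+" expr / term
#   term <- "(" expr ")" / digit
GRAMMAR = {}
GRAMMAR["digit"] = choice(*[lit(d) for d in "0123456789"])
GRAMMAR["term"] = choice(seq(lit("("), ref("expr"), lit(")")), ref("digit"))
GRAMMAR["expr"] = choice(seq(ref("term"), lit("+"), ref("expr")), ref("term"))

def matches(text):
    """True if the whole text is an expr."""
    return GRAMMAR["expr"](text, 0) == len(text)

print(matches("1+(2+3)"))
print(matches("1+"))
```

[On an input such as "7", the first alternative of expr parses a term and then fails on the missing "+", so the parser backtracks to the start and succeeds with the second alternative.]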
<alethkit>What happened to GNU Gash?
<stikonas>alethkit: not still being ported to mes
<alethkit>Pardon?
<stikonas>gash needs guile right now
<stikonas>which is a fairly heavy dependency
<stikonas>I think there is WIP work on gash and a WIP branch of mes where it might run
<stikonas>but nothing is released
<stikonas> https://git.savannah.nongnu.org/cgit/gash.git/log/?h=wip-modular-mes
<stikonas>it might be useful a bit if it runs on mes
<alethkit>ah, right
<stikonas>that way we can build musl a bit earlier and postpone building of heirloom-tools
<stikonas>right now there is licensing problem with heirloom-tools
<stikonas>binaries of heirloom yacc and heirloom lex that live-bootstrap builds are non redistributable
<stikonas>(heirloom tools are licensed under CDDL)
<stikonas>(i.e. same license as ZFS)
<alethkit>ah, the zfs problem
<stikonas>indeed
<stikonas>and meslibc is GPLv3
<alethkit>hmm
<stikonas>if we get gash working then we can reorder and build heirloom tools after musl
<stikonas>and MIT + CDDL is fine
<theruran>muurkha: thanks for the link to your PEG bootstrap! I was planning on using a PEG parser for Ada bootstrap, but it would need to be ported to whatever Scheme we use (Irvise_)
<theruran>hmm, I have this in my browser history: https://github.com/hengestone/parasail_git/blob/master/ada202x_parser/ada202x.y
<muurkha>theruran: sure!
<theruran>yeah, iirc, Ada is an LALR(1) grammar - https://dl.acm.org/doi/10.1145/947825.947832
<muurkha>PEGs are IIRC a superset of LALR(1) grammars
<muurkha>LALR parsers have the advantage over Packrat parsers (more important in 01978 than today) that they only require memory proportional to the syntactic nesting depth, not the input size. also they're usually a lot faster and have better error reporting
<muurkha>but they're a lot more fiddly
<theruran>hmm, from what I've seen of PEG/Packrat in Scheme, they are more elegant in defining the grammar to be parsed. I thought this would be an advantage for bootstrapping readability and (low) effort
<muurkha>usually both LALR parser generators and PEG parser generators take as input something like a slightly ornamented CFG
<stikonas>hmm, would any asm experts know here if there is zero-extended version of movsx rax, DWORD PTR [rax] ?
<theruran>I've seen that most industrial-grade language implementations have custom parsers - yes, probably due to error reporting and other language features that require finer control
<muurkha>stikonas: movsz?
<stikonas>well, the lower ones are movzx
<theruran>well, and speed
<stikonas>but only BYTE and WORD PTR versions are recognized
<muurkha>I guess that demonstrates that I'm not an assembly expert
<muurkha>it sort of makes sense that movzx on the i386 wouldn't have a dword version because there's no more bits to extend the dword version into, but I'm surprised that amd64 didn't add one
<stikonas>well, maybe there isn't one instruction...
<stikonas>maybe I need to load into eax and then zero extend value
<muurkha>it might be easier to zero rax first and then load into eax
<muurkha>rather than trying to zero just the high-order bits without bothering the low-order 32
<stikonas>yes, that's probably easier
<stikonas>I mean if there are two short instructions that do the same job, I wouldn't be surprised if they don't add one long instruction to do it
<muurkha>theruran: possibly also custom parsers run faster
<stikonas>oh it might be that 64-bit instructions that result in 32-bit value are automatically zero extended
<stikonas>and got my test passing on x86_64 now
<stikonas>will do the other arches tomorrow
<stikonas>and then we'll have better support for those fixed size int types
<muurkha>hmm, you mean that if you mov eax, dword ptr [rax] you think it will zero the high bits? i had no idea, I thought it would leave them alone like the analogous 8-bit and 16-bit cases do (?)
<muurkha>*not being an asm expert intensifies*
<stikonas> https://stackoverflow.com/questions/11177137/why-do-x86-64-instructions-on-32-bit-registers-zero-the-upper-part-of-the-full-6
<stikonas>yes, I didn't know that either
<stikonas>well, we both learned something today
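[Editor's sketch, not part of the log: the rule from the linked Stack Overflow question, modelled in Python. On x86-64, a write to a 32-bit register clears the upper 32 bits of the full 64-bit register, while 8-bit and 16-bit writes leave the remaining bits alone. The helper names `write_reg32`, `write_reg16`, `load_u32` and `load_s32` are hypothetical and only mimic the register behaviour; they are not M2-Planet code.]

```python
MASK64 = (1 << 64) - 1

def write_reg32(old, value):
    """Model a write to a 32-bit register such as eax: the upper half becomes zero."""
    return value & 0xFFFFFFFF

def write_reg16(old, value):
    """Model a write to ax: the upper 48 bits of rax are preserved."""
    return (old & ~0xFFFF & MASK64) | (value & 0xFFFF)

def load_u32(memory, addr):
    """Zero-extending 32-bit little-endian load (like mov eax, dword ptr [addr])."""
    return int.from_bytes(memory[addr:addr + 4], "little", signed=False)

def load_s32(memory, addr):
    """Sign-extending 32-bit load (like movsxd), returned as a 64-bit bit pattern."""
    return int.from_bytes(memory[addr:addr + 4], "little", signed=True) & MASK64

memory = bytes([0xFE, 0xFF, 0xFF, 0xFF])  # 32-bit pattern 0xFFFFFFFE, -2 if read as signed
rax = 0x1122334455667788

print(hex(write_reg32(rax, load_u32(memory, 0))))  # 0xfffffffe
print(hex(load_s32(memory, 0)))                    # 0xfffffffffffffffe
print(hex(write_reg16(rax, 0xABCD)))               # 0x112233445566abcd
```

[So a plain 32-bit load already acts as the zero-extending counterpart of the sign-extending load, which is why no separate dword form of movzx is needed.]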
<muurkha>hopefully, it might be too hot here for me to remember it
<muurkha>it was 32° most of the day, with a punishing dewpoint of about 20°
<stikonas>oh, here it's quite cold...
<muurkha>a thunderstorm briefly brought the dewpoint close to a deadly 30° before cooling the air off
<stikonas>bellow freezing point now
<muurkha>that's not ideal either
<stikonas>oriansj: what's the difference between LOAD and LOADU32 on knight?
<muurkha>ACTION bellows, "0°! 0°!"
<stikonas>isn't the register 32-bit on knight
<stikonas>so there is nothing to extend
<stikonas>(I might be wrong though)
<stikonas>anyway, I'm signing off, will read the logs later
<stikonas[m]>I could probably try both LOAD and LOADU32 and see if test passes with both
<stikonas[m]>Or look at vm.c
<stikonas[m]>Looks like they are the same assuming register size is 4 bytes
<alethkit>Hmmm
<alethkit>Looking into it more, futamura projections might be able to be combined with R7RS for a really good bootstrap
<alethkit>as a bonus, we might be able to remove all the other binaries
<alethkit>since we can just have DSLs
<oriansj>stikonas: there would only be a zero extended load of RAX for a 64bit value if the architecture planners thought there might be a 128bit extension to x86
<oriansj>muurkha: no such things as stupid questions here, just details one wishes to know
<oriansj>alethkit: just be sure to do what you find fun ^_^
<alethkit>oriansj: Oh, this is definitely going to be fun
<alethkit>especially since if it actually works, you can leverage it to bootstrap multiple languages (e.g. RPython, MiniML)
<oriansj>alethkit: good, always good to expand our bootstrapping tree
<stikonas>and I'm continuing to fix load instructions of various length in M2-Planet
<stikonas>I think now I've got x86, amd64, riscv32, riscv64 and armv7l working, though armv7l defines are somewhat painful :(
<stikonas>and I messed something up in aarch64, so test0104 (kaem) segfaults
<stikonas>and knight tests are also broken, though that might be earlier regression, since I haven't tested on knight for some time
<alethkit>Oh, we actually need python for meson!
<muurkha>doh!
<stikonas>yes, in that sense meson is harder to bootstrap than cmake
<stikonas>strange, it now seems that aarch64 has been broken for a few commits...
<stikonas>I was sure that I was running tests...
<stikonas>strange
<stikonas>maybe I mistyped --override flag in get_machine...