IRC channel logs

2022-10-21.log

<oriansj>rekado_: worst case is that it is a step in the right direction.
<oriansj>as it'll finally remove the GHC binary from the guix Haskell build chain and reduce it to just some generated C sources; which we can use as a fixed target to hit
<oriansj>(in terms of haskell support and functionality)
<janus>rekado_: which platform are you compiling on? nhc98 doesn't need a lot of patching on redhat 6.1. ghc 4 might have some of the same assumptions about the generated ABI?
<janus>i have some progress on getting ghc 0.29 compiled with nhc98, the main issue is that the interface files (.hi) from nhc have a different syntax. so i am rewriting some of them by hand, which is error prone. will be a problem if i ever get to the linking step :P
<janus>regarding HBC, i saw https://github.com/haskell-implementations/hbc but gave up on it as it is written in LML and i don't know where to get an LML compiler
<rekado_>janus: we have nhc98 in Guix
<rekado_>I’m building for i686-linux
<rekado_>the HBC code on github doesn’t have a license, so I ignored it.
<janus>rekado_: oooh, does the nhc98 in guix run? i got the impression from your article that it didn't
<rekado_>we built it with the generated C “source” files.
<rekado_>in my blog post I tried to do without
<janus>ah right, ok
<janus>even ghc 0.29 is checking for a bunch of haskell compilers in its build system without actually being compatible with anything but ghc
<janus>i tried using it because i thought it could be closer to the point where it was still somewhat portable
<rekado_>in ghc 4 you can set HC (via WithGhcHc in mk/config.mk.in) to nhc98, but GHC does require GHC and nhc98 just segfaults when building the first file.
<rekado_>it also complains about unknown command line options
<rekado_>I don’t think anyone ever tested this
<janus>the ghc website seems to suggest that people were contributing binaries. but they were probably all built with ghc, yeah
<rekado_>here’s my gdb session for the ghc 4 hsc segfault: https://elephly.net/paste/1666332860.html
<rekado_>I don’t know what I’m doing
<rekado_>I’ve built whatever I could with -g and -gdwarf-2
<janus>but a modern compiler, right? how can you trust it to lay out the code in a way they expect
<rekado_>I used GCC 2.95.3
<rekado_>that’s as unmodern as I can afford
<janus>ooh interesting. i tried using gcc 2.95 for ghc-0.29 and it crashed. which is why i use eGCS now
<janus>or actually, i think it was 2.96 (the hacked redhat version of 2.95)
<janus>ghc 0.29 is from Jan '96 if i can trust these timestamps. gcc 2.95 is from july '99.
<janus>things were changing a lot back then in GCC, so it makes sense to me how gcc 2.95 would already be too new
<rekado_>the release announcement for 0.29 is from July 96
<janus>right. still three years of gcc changes, which is my point
<rekado_>yes, I understand
<rekado_>(I just felt like looking it up)
<rekado_>have you tried any of the other GHC releases?
<janus>i have tried ghc-3.02, and i ported happy to work with nhc98. but i gave it up relatively early when it became apparent that ghc 3 had never been compiled with anything else
<janus>ghc 0.29 actually has a lot of HBC-specific code, more than for any other alternative compiler. so that is one argument in favour of HBC
<rekado_>ghc 0.29 has an intimidatingly custom build system
<janus>it has something called Jmakefiles which are used to generate normal Makefiles
<janus>and I think this Jmake build system was the one i didn't get to work on gcc 2.95
<janus>it also has some perl hacks to check which haskell files require which one. when they don't want that perl script to pick up an import, they put a newline after the "import" keyword, hah! so hacky
<janus>and the perl script also has some hardcoded lists of modules to treat a certain way
<rekado_>for GHC 4 I’m using perl 5.6 because that’s old enough to support multi-line matching with $*
<janus>heh good to hear that even newer ghc releases are using the weirdest features :P
<rekado_>re LML: it’s probably Lazy ML, which is presented here: https://dl.acm.org/doi/epdf/10.1145/800055.802038
<rekado_>but the paper only contains a few code snippets, no implementation
<rekado_>the release tarball hbc-2004-06-29.src.tar includes a directory “LML”
<janus>yeah. actually Lennart Augustsson, who worked on HBC, and also on LML, now works with Simon Peyton Jones at Epic Games
<janus>so unlike the authors of the other alternative compilers, he is still around
<janus>even did a Haskell podcast episode that i listened to
<janus>i wish there was some way to challenge him to get his compiler building again :P
<janus>but i can't find his email anywhere
<janus>here are the logs of my current progress on ghc-0.29: https://paste.sr.ht/~janus/883278312e5cb9bf62d16c31f0c9f31ff896dc4f
<janus>the errors at the very end are due to the .hi files being incompatible, and i have already fixed a dozen, but there are many left
<janus>i could push this somewhere if anybody cares...
<rekado_>do you have flex and bison to build the parser from scratch?
<rekado_>(asking because of the yyin / yyout errors)
<janus>i have bison 1.28 and flex 2.5.4
<janus>but i haven't worried about it because there are so many haskell incompatibilities to fix
<rekado_>do you use GhcWithHscBuiltViaC to build from .hc files?
<janus>i think it isn't trying that, i remember trying to unset that, and it didn't seem to make a difference
<janus>but since it is invoking nhc98, i guess it isn't building ViaC?
<janus>i should note that nhc98 does actually build some files in my patched tree successfully. it's just that the custom build system they use starts out with a module that has many dependencies
<janus>so the modules that i can compile, are not visible because the log just shows whatever the custom build system tries
<janus>but since they rely on shipped interface files, they are indirectly actually relying on handwritten/regenerated .hi files that i have overwritten before the build starts
<janus>maybe i should make mkdependHS not error, maybe that would make it choose a better build order...
<janus>since i have adjusted imports to get things building with nhc, it is not surprising that this breaks mkdependHS
<stikonas>oriansj: I'm a bit confused by cc_amd64 arithmetic, any idea what is going on? I have the following asm and C (M2) codes https://paste.debian.net/1257796/
<stikonas>oriansj: note C9 in assembly vs CA in C
<stikonas>(but I only need this for the second line)
<stikonas>is that due to signed arithmetic being applied? Even though we have unsigned type
<rekado_>I’m giving up on GHC 4 for now. Moving on to hbc.
<rekado_>gah, hbc makefile says: “It is impossible to make from scratch. You must have a running lmlc, to recompile it (of course).”
<rekado_>lmlc is hbc
<oriansj>stikonas: well M2-Planet behaves badly when you work on 64bit values because it tries to standardize the behavior between 32bit and 64bit systems (aka would that same code produce the exact same results running on a 32bit system)
<oriansj>and I thought that bit was needed to start a UEFI program, so it would probably belong in the libc.M1 anyway
*rekado_ just built gofer
<rekado_>only just noticed that the gofer license is … weird
<rekado_>“Permission to use […] for any personal or educational use without fee is hereby granted, […]”
<rekado_>is permission granted without fee, or is only permission granted for “use without fee”?
<rekado_>in other words: is this a non-commercial license?
<AwesomeAdam54321>rekado_: I think this is a non-commercial license
<rekado_>crud
<stikonas[m]>oriansj: hmm, I was thinking of putting some of early initialisation into bootstrap.c rather than libc-core.M1
<stikonas[m]>Though maybe I can put protocol guids into .M1
*civodul wonders How Hard it could be to write a stupid Haskell implementation without type checking
<rekado_>civodul: the thought has crossed my mind many times.
<rekado_>we don’t need any of the fancy features. We just trust that the GHC sources are fine.
<civodul>might be interesting to team up with one of the labs/companies and invite them to be as good as OCaml in that regard :-)
<stikonas[m]>Indeed, bootstrap compiler with no checks shouldn't be super hard
<rekado_>lml2hs exist: http://web.archive.org/web/19970305125612/http://www.cs.chalmers.se/pub/haskell/chalmers/hbc/hbc/lml2hs.tar.gz
<rekado_>maybe this could be used to convert the LML sources to HS with Hugs
<rekado_>and then use the haskellified lmlc to build lmlc
<rekado_>and then we’d have what we need for hbc
<rekado_>I patched ghc 4 to print more info as it starts up. The segfault happens on the first module initialization in ghc/rts/RtsStartup.c (initModules).
<rekado_>that’s supposed to call all the init functions for each module in turn, but it already fails on __init_PrelMain.
<rekado_>this seems relevant: https://elephly.net/paste/1666360296.c.html
<rekado_>allocation in the RTS has changed a little bit between GHC 4 and 5.
<rekado_>but I don’t think this is the problem here
<stikonas>oriansj: so I think I'll end up using the following trick to create 64-bit constants in M2: (0x3B7269C9 << 32) + 0x50003F8E + 0x50000000 (with the comment explaining why we have double sum)
<stikonas>restricting to 31-bit constants works and lets me create 128-bit GUIDs
<stikonas>and it seems nicer to offload as much of UEFI initialization to C as possible...
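The constant-splitting trick described above can be sketched in plain C for a hosted compiler. This is an illustration only, not M2-Planet code: the function names `build_from_31bit_parts` and `fits_in_31_bits` are hypothetical, and `<stdint.h>` types are assumed to be available. The point is that every piece stays below 2^31, so none of them has its top bit set when handled as a 32-bit value.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when the value has bit 31 clear,
 * i.e. it can be written as a non-negative 32-bit immediate. */
int fits_in_31_bits(uint32_t value)
{
    return (value & 0x80000000u) == 0;
}

/* Hypothetical helper: combine a high word and two low pieces into one
 * 64-bit value. Each piece is widened to 64 bits before the shift or
 * the addition, so the arithmetic never happens in 32 bits. */
uint64_t build_from_31bit_parts(uint32_t high, uint32_t low_a, uint32_t low_b)
{
    return ((uint64_t)high << 32) + (uint64_t)low_a + (uint64_t)low_b;
}
```

With the values from the log, `build_from_31bit_parts(0x3B7269C9, 0x50003F8E, 0x50000000)` gives `0x3B7269C9A0003F8E`: the two low pieces sum to `0xA0003F8E`, which on its own would not pass `fits_in_31_bits`.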
*rekado_ learns how to use gdb
<oriansj>rekado_: here are some gdb notes xz'd and uuencoded https://paste.debian.net/1257896/
<oriansj>(org-mode text file of course)
<oriansj>stikonas: well as long as it is only done during runtime, I guess that is fine.
<stikonas>oriansj: what do you mean during runtime?
<oriansj>stikonas: as is in that block executed during runtime and wouldn't result in different compile results for 32bit hosts doing a build of the binary
<oriansj>^as is in that block^as in that block is^
<stikonas>oh I see
<stikonas>yeah, we want 32-bit M2-Planet to be able to build 64-bit binaries
<oriansj>and have them be bit for bit identicial
<oriansj>^identicial^identical^
<stikonas>yeah, got it
<rekado_>oriansj: thanks for the notes!
<stikonas>I'll have to double check it, but I think it should be fine. I'm just adding __init and __cleanup functions in addition to main
<stikonas>something like: https://paste.debian.net/1257899/ ( ignore some debug stuff I have there)
<rekado_>stumbled upon more undocumented debug output for the RTS. Tells me where it’s jumping to in its custom stack.
<oriansj>yeah that looks like it should be fine
<stikonas>I'm now thinking how strict should I be while tokenizing load_options into argv[] (i.e. should I assume only single spaces or now)
<stikonas>s/now/not/
<oriansj>and you'll need to also include _exit and decide if that calls clean up as well (which may impact forked processes)
<rekado_>and I noticed that when the RTS is built with -unreg (i.e. with -DUSE_MININTERPRETER) it will always segfault upon *returning* from all init jumps.
<stikonas>yeah, right now I have __exit too but without cleanup
<stikonas>I just have mov_rsp,[rip+DWORD] %__return_address and ret
<rekado_>so I built without -unreg (and with GhcLibWays= instead of GhcLibWays=u) and this uses architecture-dependent assembly code leading to a much later segfault.
<oriansj>(sorry I meant to say :FUNCTION__exit)
<stikonas>yes, that's FUNCTION__exit. C version for now is just goto FUNCTION__exit;
<stikonas>I'll check how M2-Planet and M1 use exit...
<oriansj>well :FUNCTION_exit does cleanup and :FUNCTION__exit does no cleanup and just does a direct syscall to exit
<stikonas>yeah, that makes sense, I'll do cleanup there though maybe after I get command line arguments and fopen working
<stikonas>(and this is still just the bootstrap M2libc...)
<stikonas>took me some time to sort out user stack stack __open_protocol functions but now they are working
<oriansj>well it shouldn't take much extra logic to drop extra whitespace and would make it more robust against bad user input
<stikonas>yeah, I've already added it to argc counting
<stikonas>(haven't worked on argv yet)
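A whitespace-tolerant tokenizer of the kind discussed above might look like the following. This is a minimal sketch under stated assumptions, not the M2libc code: the names `count_args` and `split_args` are hypothetical, only the space character is treated as a separator, and the input is a plain `char` string (real UEFI load options are usually 16-bit characters).

```c
/* Hypothetical helper: count tokens, treating any run of spaces as a
 * single separator and ignoring leading and trailing spaces. */
int count_args(const char *options)
{
    int argc = 0;
    int in_token = 0;
    for (; *options != '\0'; options++) {
        if (*options == ' ') {
            in_token = 0;
        } else if (!in_token) {
            in_token = 1;   /* first character of a new token */
            argc++;
        }
    }
    return argc;
}

/* Hypothetical helper: split the buffer in place. Spaces are
 * overwritten with NUL and the start of each token is stored in argv.
 * Stops once max_args tokens are stored; returns the number stored. */
int split_args(char *options, char **argv, int max_args)
{
    int argc = 0;
    int in_token = 0;
    for (; *options != '\0'; options++) {
        if (*options == ' ') {
            *options = '\0';
            in_token = 0;
        } else if (!in_token) {
            if (argc == max_args) {
                break;      /* no room for another token */
            }
            in_token = 1;
            argv[argc++] = options;
        }
    }
    return argc;
}
```

Both functions use the same two-state scan, so the count from `count_args` matches what `split_args` stores whenever `max_args` is large enough.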
<oriansj>and there is the question if you want to setup envp
<stikonas>well, it's easier to work with cc_* dialect than in assembly or hex
<oriansj>or just set a null
<stikonas>oriansj: I thought about it and I think yes, but not yet, maybe for full libc
<stikonas>nothing uses it yet
<stikonas>but we'll need it for kaem
<oriansj>kaem can work with a NULL
<stikonas>and we need to pack that stuff into load_options
<stikonas>but it needs to pass it to child binaries
<stikonas>oriansj: how do you think we should encode that?
<stikonas>load_options="ENV=env applications.efi arguments..."?
<stikonas>of course that will only work with kaem and not UEFI shell but I guess that's fine
<oriansj>stikonas: I'll need to think about that a bit more
<stikonas>yeah, I haven't thought too much about it yet...
<stikonas>but I think breaking compatibility with UEFI shell at that stage should be fine
<oriansj>we could reserve ::: as a separator for kaem
<stikonas>especially if we are replacing \ with /
<stikonas>we could do that too...
<stikonas>anyway, let's worry about it after I get bootstrap M2libc working
<stikonas>that can be used to build M2.efi
<oriansj>so kaem would do: program arg1 arg2 argN ::: envp1 envp2 envpN
<stikonas>then I can submit it for review
<stikonas>but yes ::: sounds alright
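The `:::` convention proposed above (program arg1 arg2 argN ::: envp1 envp2 envpN) could be handled on the receiving side roughly as follows. This is only a sketch of the idea in hosted C: the log does not settle the design, the names `find_separator` and `split_argv_envp` are hypothetical, and the token array is assumed to be already split and NULL-terminated.

```c
#include <string.h>

/* Hypothetical helper: index of the first ":::" token, or count if
 * there is none. */
int find_separator(char **tokens, int count)
{
    int i;
    for (i = 0; i < count; i++) {
        if (strcmp(tokens[i], ":::") == 0) {
            return i;
        }
    }
    return count;
}

/* Hypothetical helper: split a token array into argv and envp in
 * place. Assumes tokens[count] is NULL. The ":::" slot becomes the
 * NULL that terminates argv, and envp starts right after it. Without
 * a separator, envp points at the final NULL (empty environment).
 * Returns argc. */
int split_argv_envp(char **tokens, int count, char ***envp_out)
{
    int sep = find_separator(tokens, count);
    if (sep < count) {
        tokens[sep] = NULL;
        *envp_out = &tokens[sep + 1];
    } else {
        *envp_out = &tokens[count];
    }
    return sep;
}
```

Reusing the separator slot as the argv terminator means no copying is needed, and a program started without `:::` still sees a valid, empty envp.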