IRC channel logs

2022-08-11.log

<stikonas>some of the work is far less interesting, like changing ; comments into #
<stikonas>(nasm/clang differences)
<stikonas>but I had to use clang since it natively supports UEFI PE32 binaries
<theruran>I am not disputing the need for C compiler and libc to bootstrap *everything written in C*. But to tell me that C is necessary to build an operating system is some bullshit. For one thing, there is Pre-Scheme that was already used to write the Scheme48 VM. For another, there is Ada which is objectively better than C. Another person may argue for Rust, but whatever - C is not the only choice.
<theruran>I find it disturbing the affinity for C here. It is an old language that has proven itself unfit for modern applications. And I wasn't suggesting "replace C with Scheme", I was asking about how to get C out of a branched bootstrap path.
<vagrantc>how bootstrappable are those other languages?
<vagrantc>i mean, that's what it really comes down to
<vagrantc>there's also the question of ... who wants to work on what? :)
<theruran>vagrantc: Ada is not very bootstrappable right now. and I am asking these questions because it makes no sense to me to bring C into the bootstrap path of Ada or Scheme. and you all seem to take it for granted
<theruran>apparently C's "easy readability" did not stop people from causing thousands of CVEs
<vagrantc>different tools for different folks ... demonstrate another bootstrap path without C that works and i'm sure people will welcome it :)
<vagrantc>having multiple independent bootstrap paths is valuable in and of itself
<vagrantc>that said, people have worked a fair amount on a C bootstrap path, and come very far with it, so ... it's kind of a selection bias with the people who've been working on things?
<theruran>oriansj apparently gave me misleading information then, and I was trying to find out why
<vagrantc>i read that as oriansj came to some conclusions based on having looked at options...
<vagrantc>people might still have their disagreements, but at the end of the day, work on what motivates you
<theruran>he said earlier that the Knight assembler is written in C and M1 and hex2, as a response to my questions about lisp.s
<vagrantc>i would imagine it would be welcome to find other implementations
<theruran>which made me even more confused until I checked it myself
<stikonas>I don't think people say that C is the only choice
<stikonas>I was just pointing out that most people are familiar with C and hence that's what is mostly used
<stikonas>nobody would reject stuff just because it's non C
<stikonas>the main rule is that the person who gets to decide is the one who is doing the work
<stikonas>if somebody is willing to do scheme programming then yes, why not
<vagrantc>it would be great to be in a position to have to decide between two bootstrapping paths :)
<stikonas>we can't even decide between 2 arches
<stikonas>the whole chain only works for x86
<oriansj>theruran: I hope I didn't give you the impression that C is the only solution.
<oriansj>It is just the only one we have right now.
<oriansj>It is certainly possible to write kernels in Assembly, Lisp and FORTH, and that has certainly been done in the past
<muurkha>also Pascal or Oberon
<oriansj>in fact the builder-hex0 work was entirely done in hex0 and I am planning on using that as well
<muurkha>stikonas: most Scheme implementations lose less than an order of magnitude of performance compared to C, and to some extent the increased difficulty of maintenance is offset by being able to write a lot less code
<muurkha>I don't know that Common Lisp vs. Scheme makes much of a difference
<oriansj>The only bias encouraged here is against the belief in magic. There isn't a person here that wouldn't celebrate progress (regardless of the language used in achieving it)
<stikonas>mes is at least 2 orders of magnitude slower
<muurkha>you can get tail calls and call/cc easily if you're willing to take an additional performance hit and allocate your activation records on the heap instead of a stack
<stikonas>maybe even 3
<stikonas>building tcc takes almost 10 minutes with mes here
<muurkha>I think there are lots of times where people have written Lisp interpreters in assembly, often single individuals in less than a year
<stikonas>but fractions of a second to rebuild tcc with tcc
<stikonas>but speed is not always important in bootstrapping
<muurkha>they weren't Common Lisp because by the time we had Common Lisp there were better implementation languages available
<muurkha>like MACLISP or C
<muurkha>stikonas: I'm surprised clang has its own assembler! maybe that's why -m32 didn't work, it wasn't gas
<muurkha>theruran: I'm not convinced that Ada is objectively better than C but if you're up for implementing an Ada compiler in assembly, or in a small subset of Ada that can be compiled with assembly, that would be super useful
<muurkha>stikonas: yeah, mes is a very slow Scheme
<stikonas[m]>Well, if clang was using GAS then it wouldn't be able to output uefi applications
<muurkha>really? I thought you could build for UEFI with GCC and gas
<stikonas[m]>I think you need to do additional magic with objcopy
<muurkha>oh, could be
<muurkha>anyway a simple tree-walking Lisp interpreter will generally be about 100× slower than C
<oriansj>heck if I remember correctly even Guix doesn't have a proper bootstrap for Ada yet as no one has yet picked up the task of: extend Ada/Ed to support Ada 95, then bootstrap GNAT
<stikonas[m]>Whereas for clang you just need to set cflags and lflags
<stikonas[m]>Yea, I think Ada is not yet bootstrappable
<oriansj>so even with GCC and all of Guix available we don't have a proper completed Ada bootstrap yet
<muurkha>but a still relatively simple Lisp compiler can get slowdowns of under 10×
<stikonas[m]>10x slowdown is probably fine
<oriansj>muurkha: I don't think speed is a good reason to not use a tool in bootstrapping when there is no better path available yet.
<stikonas[m]>1000x is slightly annoying but still much better than no bootstrap
<stikonas[m]>Exactly
<oriansj>heck, a year long build time to bootstrap GHC would be a dream compared to the "not possible now" build time
<vagrantc>works is better than could work :)
<oriansj>also nothing attracts contributors more than visible progress towards a tangible goal
<oriansj>and there are certainly a good few people here who love Lisp (myself included but I admit 4 years in assembly and hex2 really hurt my lisp programming skills)
<oriansj>including the brilliant mind behind solving the Guile bootstrapping problem: https://github.com/schierlm/guile-psyntax-bootstrapping
<muurkha>oriansj: you can't get a year-long build to finish successfully
<muurkha>but I agree that the potentially higher speed of a possible but not written program is not a good reason to abstain from using something
<muurkha>"works" is better than "could work", or as Christine Lavin said:
<oriansj>muurkha: if you mean in terms of debugging and making operational, you are probably right but I was thinking more in terms of it works but on a slow machine it might take that long.
<muurkha>"The reality of me cannot compete with the dreams you have of her"
<muurkha>yeah, I meant in terms of debugging and making operational
<muurkha>speed helps a lot with bringing stuff up
<muurkha>the great advantage of Forth over C is that when you suspect you forgot to byteswap the data written to the I/O port or whatever, it takes you five seconds to try the fix and find out if that solved the problem, instead of 30 seconds or ten minutes
<muurkha>to recompile your application and reflash the ROM on the board
<oriansj>or in the words of my friend Rachael when she was 16, dream boys are nice and all but you actually show up to put out.
<muurkha>ha
<oriansj>well FORTH is absolutely killer in its niche, no doubts.
<theruran>I am not that concerned about a slow LISP interpreter early in the bootstrapping path. later in the path compilers can be made
<muurkha>I think it might be, but I can't confirm that from personal experience. but my strong suspicion at this point is that that niche is not "programming language" but rather "interactive computing environment"
<theruran>oriansj: you still did not answer my question
<muurkha>theruran: I must have missed your question because I thought oriansj had answered it
<stikonas[m]>Ask again
<muurkha>what was it?
<theruran><@oriansj> well the knight assembler is written in C, M1 and hex2
<theruran>I don't understand. looking at stage0 and looks like lisp.s only needs M0-compact (and the pieces up to that point)
<stikonas[m]>What is knight assembler? Knight VM or M0 for knight
<stikonas[m]>Knight VM is definitely C
<theruran>so I suppose the question is: what Knight assembler? and why is this a response to a question about lisp.s
<stikonas[m]>But that's VM
<theruran>I see that Knight VM is written in C
<theruran>I suppose there is no M0 for x86_64 yet?
<theruran>only M2, which depends on C
<stikonas[m]>There is
<stikonas[m]>POSIX version
<theruran>OK in the stage0-posix tree
<stikonas[m]>And I'll have UEFI version in a week or so
<theruran>but no lisp.s in x86_64 M0, right?
<stikonas[m]>No
<stikonas[m]>Well, that's because nobody write any software for lisp.s
<theruran>you mean, "right." ?
<stikonas[m]>So on knight it was bootstrapped but not used
<stikonas[m]>I meant wrote
<stikonas[m]>And then unfinished bootstrap paths tend not to be done by people doing ports, e.g. the POSIX port
<oriansj>theruran: the first thing that needs to be known is Knight is an instruction set architecture. Like VAX, PDP-11 and IBM 360
<stikonas[m]>Because everybody is doing minimal amount of work to get to goal
<oriansj>the knight vm is just a VM like qemu but which supports the knight instruction set
<theruran>stikonas[m]: you answered no, but you meant yes or 'right'
<stikonas[m]>Oh indeed
<stikonas[m]>I meant right
<stikonas[m]>But I got confused by another typo in the following line
<stikonas[m]>write->wrote
<theruran>OK. so the x86_64 defs are there for M0? just need to translate lisp.s from Knight to x86_64
<stikonas[m]>And adjust syscall numbers
<theruran>alright
<stikonas[m]>But you can consider it part of translation
<oriansj>M0 is assembly but assembly is never portable between instruction set architectures and porting work will need to be done when moving an assembly program between instruction set architectures (sometimes it is easier to just rewrite from scratch)
<stikonas[m]>I heavily used x86/arm assembly when porting stage0-posix to riscv
<stikonas[m]>Assembly itself is not portable
<stikonas[m]>But they all have similar concepts
<oriansj>for example knight has div r0, r1, r2 as valid assembly but x86 and AMD64 would need several instructions (and a couple scratch registers) to approximate that
<stikonas[m]>Well, yes, e.g. push rax would need 2 instructions on risc-v but still main building blocks are similar
<oriansj>(as long as one sticks to register architectures and not something very different like stack or Memory-Memory machines)
<stikonas[m]>Yeah, those might be more different
<stikonas[m]>But x86, amd64, aarch64, risc-v just differ in minor details
<stikonas[m]>And since we have x86 port that is low on registers
<oriansj>and a human reading and rewriting can crib rather effectively
<stikonas[m]>All other ports have plenty of free registers
<stikonas[m]>theruran: any thoughts on what you are going to build after lisp.s port?
<oriansj>theruran: to your specific question M0-compact is just an optimization of M0 which is just a single architecture version of M1
<stikonas[m]>What is simplified in compact version?
<oriansj>stikonas: it optimizes the amount of memory needed to build bigger programs written in M0 assembly
<oriansj>the naive M0 tends to balloon with memory requirements when building things like the output of cc_*/M2-Planet
<stikonas[m]>Oh I think on risc-v I might have done something too
<stikonas[m]>Probably calculated length of string before mallocing
<muurkha>djb's "qhasm" proposal seems like it might be portable enough to be useful between instruction set architectures
<muurkha>also of course SNOBOL4 was written in assembly and was portable between ISAs
<muurkha>you could very reasonably compile div r0, r1, r2 to a short instruction sequence on amd64 as long as you reserve one or two registers as scratch registers for the "assembler"
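A minimal sketch of the idea muurkha describes here, written in Python: a hypothetical macro expander that turns a Knight-style three-operand `div` into a short amd64 instruction sequence while reserving `rax` and `rdx` as scratch registers. The register mapping, the function name `expand_div`, and the choice of signed division are assumptions made for illustration; none of this is taken from stage0.

```python
# Hypothetical mapping from Knight-style register names to amd64 registers.
# rax and rdx are deliberately left out: they are reserved as scratch
# registers because idiv implicitly uses the rdx:rax pair.
REGISTER_MAP = {"r0": "rbx", "r1": "rcx", "r2": "rsi", "r3": "rdi"}


def expand_div(dest, dividend, divisor):
    """Expand 'div dest, dividend, divisor' into amd64 assembly lines."""
    d, n, m = (REGISTER_MAP[r] for r in (dest, dividend, divisor))
    return [
        f"mov rax, {n}",  # load the dividend into the low half of rdx:rax
        "cqo",            # sign-extend rax into rdx
        f"idiv {m}",      # signed divide: quotient in rax, remainder in rdx
        f"mov {d}, rax",  # copy the quotient into the destination register
    ]


if __name__ == "__main__":
    print("\n".join(expand_div("r0", "r1", "r2")))
```

Because the mapping never hands out `rax` or `rdx`, the expansion cannot clobber a live program register, which is the reason for reserving them for the "assembler".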
<oriansj>muurkha: 1's complement and symmetric complement and floating point only architectures disagree.
<muurkha>an i386 port that is low on registers (that's what you mean by x86, right?) could use memory for some of the "registers". certainly that's what you'd do on the 6502
<muurkha>not familiar with symmetric complement, but I'm pretty sure SNOBOL4's implementation worked on both 1's-complement and 2's-complement machines
<oriansj>4bits would be 0000 -> zero, 1111 -> NaN, 1* -> negative 0* -> positive, 1110 -> -1; 0001 -> 1
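One possible reading of the 4-bit encoding oriansj lists, sketched in Python: non-negative values are stored directly, negative values are the bitwise complement of their magnitude, and the all-ones pattern (which would otherwise be negative zero) is NaN. This interpretation and the helper name `decode_symmetric` are assumptions inferred only from the examples in the message above.

```python
import math


def decode_symmetric(bits, width=4):
    """Decode a `width`-bit pattern under the assumed symmetric-complement rules."""
    mask = (1 << width) - 1
    if not 0 <= bits <= mask:
        raise ValueError("bit pattern does not fit in the given width")
    if bits == mask:
        return math.nan           # all ones: NaN instead of negative zero
    if bits >> (width - 1):
        return -(~bits & mask)    # top bit set: negate the complemented bits
    return bits                   # top bit clear: plain non-negative value


if __name__ == "__main__":
    for pattern in (0b0000, 0b0001, 0b1110, 0b1111):
        print(f"{pattern:04b} -> {decode_symmetric(pattern)}")
```

Under this reading the representable range is symmetric, from -7 to 7 for four bits.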
<oriansj>I wonder how string comparison could work on 1's complement without also covering the -0 case; let alone the problem of architectures that don't have a condition register and those that don't have the ability to use registers for conditional jumps
<muurkha>1's complement does suffer the problem that you need separate signed and unsigned comparisons, addition, and subtraction instructions
<muurkha>almost nothing can use registers for conditional jumps, ARM is the only exception I can think of
<oriansj>knight, MIPS, Sparc, Alpha
<muurkha>but when you're programming you handle that by reversing the sense of the condition and jumping over an unconditional jump
<muurkha>well within the powers of a simple macro system
<muurkha>yeah, I don't know the assembly languages of Knight, MIPS, SPARC, or Alpha ;)
<oriansj>at that point you just have multiple different programs written in assembly just sitting together in a single source code base with nothing actually shared.
<theruran>If I can concentrate long enough to write x86_64 lisp.s, then my first thought is to port the next thing that has required C. M1?
<oriansj>theruran: well if you finish this port https://github.com/oriansj/slow-utils you can make MesCC no longer have to depend on any binaries except mes.c
<stikonas[m]>It was M2-Planet
<oriansj>then some work on a scheme compiler would enable theruran to entirely skip the C work
<stikonas[m]>But I guess you don't want to port lisp.s just to use it to build C compiler
<oriansj>as theruran could just compile mescc and slow-utils into a single binary which would be able to self-host
<stikonas[m]>That's an option...
<stikonas[m]>Probably a lot of work but if theruran finds it fun then good
<oriansj>and theruran here is a scheme compiler which could save you some time: http://canonical.org/~kragen/sw/urscheme/
<muurkha>oriansj: the idea is that your "assembler" (or macro package) knows how to assemble things like three-argument div or conditional jump to register for a given platform
<muurkha>that's how SNOBOL4 achieved its incredible degree of portability
<oriansj>muurkha: sounds like SNOBOL4 just did a simple high level language which compiled to native or an O-code sort of machine
<muurkha>well, kind of, but the O-code instructions were implemented as assembler macros
<muurkha>rather than interpreted at runtime
<oriansj>so %define ADD_R0_R1 add rax, rbx sort of thing
<muurkha>yeah
<oriansj>with a human having to figure out the right combination of replacement strings to make it sorta work
<muurkha>yeah. very similar to Ur-Scheme really, but without reordering prefix notation into postfix
<stikonas[m]>So kind of cross platform M0
<stikonas[m]>Can't we already do that with M0?
<stikonas[m]>Just needs suitable definition file
<stikonas[m]>And possibly two passes
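A rough Python sketch of the "definition file" approach being discussed: the program is written with symbolic names like `ADD_R0_R1`, and a per-architecture set of DEFINE lines supplies the bytes. The syntax is only loosely modeled on M0/M1 DEFINE lines, the function name `expand` and the macro name `RETURN` are hypothetical, and the two passes here (collect definitions, then substitute) are just one way to structure it.

```python
def expand(source):
    """Expand DEFINE-style macros in `source` into one hex string."""
    defines = {}
    body = []
    # Pass 1: collect definitions and keep all other tokens in order.
    for line in source.splitlines():
        tokens = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if not tokens:
            continue
        if tokens[0] == "DEFINE":
            if len(tokens) != 3:
                raise ValueError(f"malformed DEFINE: {line!r}")
            defines[tokens[1]] = tokens[2]
        else:
            body.extend(tokens)
    # Pass 2: substitute each symbolic name with its defined hex bytes.
    out = []
    for token in body:
        if token not in defines:
            raise KeyError(f"undefined macro: {token}")
        out.append(defines[token])
    return "".join(out)


# Definitions for one architecture (amd64 encodings).
AMD64_DEFS = """
DEFINE ADD_R0_R1 4801D8   # add rax, rbx
DEFINE RETURN C3          # ret
"""

# The "portable" program refers only to symbolic names.
PROGRAM = """
ADD_R0_R1
RETURN
"""

if __name__ == "__main__":
    print(expand(AMD64_DEFS + PROGRAM))
```

Porting to another architecture would mean swapping in a different definitions block, with a human still choosing replacement sequences that behave the same way.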
<oriansj>hmmm https://www.regressive.org/snobol4/doc/arizona/s4d58.pdf
<muurkha>also https://archive.org/details/macroimplementat0000gris
<oriansj>stikonas: with a boatload more effort, absolutely
<stikonas[m]>Well, yes, it would be harder to write that cross-platform M0
<stikonas[m]>And in one more step we get it via cc_* anyway
<oriansj>well it would have to be M1 because M0 isn't cross-platform and doesn't handle things that are different between ports
<muurkha>worth mentioning that the reason SNOBOL4 was written like this was that C didn't exist yet
<oriansj>and people complained too much
<muurkha>they wrote the SNOBOL4 replacement, Icon (now Unicon), in C
<oriansj>about incompatibility and bugs in versions that they hadn't written
<muurkha>haha
<theruran>what is the difference between M0 and M1?
<unmatched-paren>theruran: iiuc: M1 is a slightly improved version of M0 that adds labels
<unmatched-paren>Ah, wait
<unmatched-paren>i'm conflating hex0/hex1 and M0/M1
<unmatched-paren>ignore me :)
<stikonas>theruran: M1 is a slightly improved version of M0 that is cross-platform
<stikonas>so M1 should support all arches (not just the one that is being bootstrapped)
<stikonas>and M1 should be able to print better error messages
<stikonas>e.g. print if we used DEFINE that is not defined
<stikonas>M0 simply exits with non-zero error code
<stikonas>also M1 supports multiple input files
<stikonas>with M0 it is "M0 in.M1 out.hex2"
<stikonas>with M1 you can have "M1 --architecture x86 --little-endian --file header.M1 --file program.M1 --output output.hex2"
<stikonas>also M0 might have more limitations
<stikonas>e.g. it might only support uppercase hex (like 0xA but not 0xa), etc...
<stikonas>there are also two hex2's: one is written in hex1, the other in C, or you can write it in some other high-level language
<theruran>and M1 is written in a combination of C and M0?
<stikonas>theruran: M1 is written in C
<stikonas>it's then compiled using M2-Planet into M1, hex2 and eventually binary