IRC channel logs

2022-03-29.log

back to list of logs

<muurkha>unmatched-paren: it is true that there are loads of good resources on that page but you only need like 0.1% of the knowledge in them to write a bootstrap compiler
<muurkha>it's easy to get overwhelmed thinking you need to understand undefined behavior and static single assignment and graph coloring and C and LALR(k) parsing and automata theory and strength reduction and sparse conditional constant propagation and floating-point arithmetic rounding modes and multiple ABIs to write a compiler
<muurkha>but actually you don't need to understand even one of those things
<oriansj>indeed muurkha
<muurkha>I mean they're beneficial because you can write a *better* compiler with them
<muurkha>but working compilers can be written without understanding any of them
<oriansj>(assuming for values of correctness or quality of resulting generated binaries when one says *better*)
<muurkha>I know because I did: http://canonical.org/~kragen/urscheme
<muurkha>except that I did understand some things about automata theory, strength reduction, and C when I wrote it, I just mostly didn't use them
<muurkha>I think you do need to understand enough of the semantics of your target machine, and how to implement your desired language semantics in terms of them
<oriansj>one can understand cc_* and M2-* with just basic programming literacy
<muurkha>and unless your syntax is at Forth or assembly-language levels of simplicity, you need to know at least one way to write a parser. recursive descent is probably the easiest way, but knowing how to use it to handle common language constructs like operator precedence requires learning about manipulations of context-free grammars
<muurkha>Ur-Scheme doesn't support things like that though ;)
***bauen1_ is now known as bauen1
<unmatched-paren>muurkha: yeah, that page is on the website of the author of the QBE compiler backend (which i'm using)
<unmatched-paren>i suppose QBE handles most of those things...
<unmatched-paren>and the grammar i need to parse is pascal, which seems to lend itself quite well to recursive-descent parsing
<unmatched-paren>although i STILL can't figure out which language to write the thing in!
<unmatched-paren>i don't enjoy C, i like the robustness static typing gives me (so lisps are out), and i vastly prefer functional programming, but none of the languages that satisfy all three (ocaml, haskell, idris, fsharp, scala) are completely bootstrappable!
<AwesomeAdam54321>unmatched-paren: There's a programming language called Idris that fulfils those 3 conditions
<unmatched-paren>Idris1: written in haskell, Idris2: written in idris but bootstrappable with idris1
<AwesomeAdam54321>I didn't realise that
<unmatched-paren>there's typed racket... but it's not a first-class part of racket, and it seems to lack certain functional features like matching against tagged unions
<unmatched-paren>and i'm slightly wary of joining the racket community: https://beautifulracket.com/appendix/why-i-no-longer-contribute-to-racket.html
<nimaje>well, just having more parts of a bootstrapping graph would be useful, so just write in some language you like
<unmatched-paren>i guess so
<unmatched-paren>but i don't think the haskell, scala, or fsharp problems are going to be fixed anytime soon (for all three it looks like a new compiler/interpreter is the only option)
<unmatched-paren>the only one close to being done is ocaml
<unmatched-paren>but the last commit to https://github.com/Ekdohibs/camlboot was in September last year, so i'm not entirely confident it's ever going to be finished unless someone else picks it up
<unmatched-paren>i could use ocaml 4.07, but there are probably a few libraries that don't work with it
<unmatched-paren>... i'll just use ocaml 4.07, it probably won't be much different to working with the latest version :P
*unmatched-paren suddenly wonders whether we could built Idris1 with Hugs...
<unmatched-paren>probably not, but i could try :)
***alMalsamo is now known as lumberjack123
<lumberjack123>unmatched-paren: Why do you prefer static typing over dynamic typing?
<lumberjack123>I never heard of Idris... wikipedia says you transpile it to C or Javascript? Is that the only way?
<unmatched-paren>lumberjack123: i like that static typing gives you an extra layer of robustness, and apparently it also helps optimization; anyway, anything you can do with dynamic types can be accomplished with generics and ADTs
<unmatched-paren>idris also apparently has code generators for the JVM, LLVM, and dotnet
<unmatched-paren>but they're third-party
<unmatched-paren>so far, it looks like i'll need to use a pretty old version of idris1
<unmatched-paren>and idris has a lot of dependencies (as is to be expected of a typical haskell project) which may or may not work under hugs
<unmatched-paren>lumberjack123: actually, idris2 looks like it can only currently compile to Scheme and C
<unmatched-paren>one of the dependencies uses a language extension that doesn't seem to be supported by hugs; i think i've lost already :p
<lumberjack123>unmatched-paren: Can you expound upon this "extra layer of robustness"?
<oriansj>unmatched-paren: well you always could just limit yourself to the subset of haskell that Hugs supports
<unmatched-paren>lumberjack123: imo it gives you an extra barrier against mistakes, since static typing catches incorrect types at compile time (you don't even have to call the function to get an error; the compiler will downright reject your code.) most functional languages have pretty advanced type-inferrence engines, so it's generally not too verbose
<unmatched-paren>in dynamic languages, you could accidentally call a function with the wrong parameters, and you'd never know because in your testing you never got to that particular point in the control flow
<unmatched-paren>of course, this is highly subjective. some people just prefer dynamic -.o.-
<oriansj>or via example should "0x1234" + 5 be "0x12345" or 0x1239 or an error because the user needs to type cast prior to resolve that detail and when will we be checking that.
<unmatched-paren>oriansj: limiting myself to a small subset of haskell means i can't use basically any libraries
<unmatched-paren>happy, for one, and i'd rather not build a parser from scratch
<nimaje>oriansj: "0x1234" + 5 evaluating to 0x1239 is a example of weak typing, not dynamic typing
<oriansj>nimaje: thank you for that refinement
<oriansj>unmatched-paren: fair enough; it is still ok to just use GHC and wait for someone else to solve the GHC bootstrap problem completely (assuming I didn't miss someone already solving that) as last I heard good progress has been made.
<muurkha>unmatched-paren: yes, Pascal was carefully designed to support recursive-descent parsing because that's what Wirth likes
<muurkha>OCaml 4.07 is just fine
<muurkha>it's true that static typing helps optimization, a great deal. without static typing, every time you execute a primitive operation like addition or array indexing, you need to first run a dynamic type check to see if the operands are a suitable type, and possibly to do polymorphic dispatch (floating-point addition vs. integer addition, for example)
<muurkha>'should "0x1234" + 5 be "0x12345" or 0x1239 or an error' is not really about static vs. dynamic typing; it's about implicit type coercion. it's an error in OCaml, which is statically typed, yes, but it's also an error in Python, Scheme, and Smalltalk, which are dynamically typed, and it's allowed in C, which is statically typed
<muurkha>unmatched-paren: the hostility the author evidently received from Felleisen is unfortunate, and it would be nice to find an ambit where people never maltreat each other, but sadly I don't think those exist except at the level of five people or less
<muurkha>in a larger group that lasts years you will always find some degree of pecking order, and those higher in the pecking order will always include some diversity of people, including some who are more aggressive than others
<muurkha>Felleisen's apology is nice: https://felleisen.org/matthias/Thoughts/Apology.html