IRC channel logs

2019-11-02.log

back to list of logs

<oriansj>greetings twinshadow
<twinshadow>oriansj: hey
<oriansj>twinshadow: what fun things are you working on?
<oriansj>greetings theruran
<theruran>o/
<oriansj>theruran: working on something fun I hope
<theruran>mostly lurking :) but yeah, interested in bootstrapping a LISP system
<oriansj>theruran: good; lots of fun pieces to do that
<theruran>reading Lisp as the Maxwell's Equations of Software
<oriansj>you might like this then: https://github.com/oriansj/mes-m2
<oriansj>The First portable bootstrappable Scheme
<oriansj>(We are slowly working to make it fully mes.c compatible)
<theruran>yeah! this is awesome!
<oriansj>and I do mean from quite small bootstrap binaries: https://github.com/oriansj/mescc-tools-seed (357 bytes)
<theruran>I want to implement a KLambda machine to host Shen lisp
<oriansj>ambitious, I like it
<theruran>yeah, in a memory-safe language :)
<oriansj>one would hope real metal
<theruran>we'll see how my research plan develops. I need to meet with my advisor again to discuss what to do next
<oriansj>but generally a virtual machine first: https://github.com/oriansj/knight-vm
<oriansj>Then FPGA
<oriansj>Then custom asic
<theruran>True! Steele and Sussman outlined the VM in their SCHEME-79 and SCHEME-81 architectures. A VM of this Lisp microprocessor should be made first
<theruran>err not the VM, they implemented a real chip. but they provide the state machine for the Lisp processor
<oriansj>no worries, I am familiar with their papers
<oriansj>Though I have found tagged memory harder to bootstrap from zero
<theruran>great, now I know how to ask ;)
<theruran>who*
<oriansj>But once one has a proper C compiler or lisp compiler; the bootstrap gains some lovely benefits for free.
<theruran>oriansj: because of the extra juggling needed at the hex or assembly level?
<oriansj>it is more restrictions and things to keep track of that make the process harder (but not impossible)
<theruran>I see, yeah
<theruran>I like this guy's project: http://akkartik.name/post/mu-2019-2
<theruran>It's a different direction than where I'm headed but fun to learn how he is approaching the bootstrapping process.
<oriansj>nice guy; lots of fun discussions on this channel
<oriansj>essentially when working at the hex or binary level, everything takes alot longer and it harder to change. Think of making a picture with individual pieces of colored sand; simple and small pictures can be done with reasonable effort but as the complexity of the picture and the size of it goes up it can exceed the amount of effort people are psychologically able to perform.
<theruran>but if your architecture is just low-level Lisp expressions then no need for bit-fiddling, right?
<oriansj>theruran: how do you plan on encoding your instructions? your integers? your floats?
<theruran>as in, how to tell the Lisp microprocessor what to do?
<oriansj>bit-fiddling is just part of all real software assemblers and compilers
<theruran>from what I understood of their architecture, the hardware takes care of that level of detail. it even has a garbage collector. the authors say the line between hardware and software can be drawn anywhere
<oriansj>well using ascii encoded strings for numbers in hardware is far from a wise idea
<oriansj>0xFF would consume 5bytes instead of 1
<theruran>so a bootstrapping compiler will still be needed to translate ASCII-encoded Lisp programs into their hex instruction addresses and values?
<oriansj>it would also take hundreds of more clock cycles to perform a simple calculation that could be done in a single cycle in a better format
<oriansj>Do you wish ADD instruction to consume 3 bytes to express something that could fit in 1 byte?
<theruran>I don't know. I don't think there is an ADD instruction
<oriansj>aka do you encode ADD as 41 44 44 ... or just 04 ..
<oriansj>well one could easily do a Turing Tarpit instruction set
<theruran>I imagine the Lisp processor is pretty slow by today's standards, but it may not matter.
<oriansj>theruran: it could also be impressively fast if done correctly
<theruran>would love to get there because I think it would mean much safer computers and easier to extend
<oriansj>well it isn't the hardware that makes it easy to extend, it is the software.
<oriansj>OpenGenera on DEC Alpha proved that
<theruran>I am betting on Lisp and you know all the reasons already. Abstract C Machines should be considered harmful.
<oriansj>but properly made hardware can certainly add safety properties that software just can't provide.
<oriansj>I think of them as a tool; used wisely you can do impressive things but act foolish and the price must be paid
<theruran>without formal verification tools to do the sanity checking for us, they are too dangerous for mere mortals to use
<oriansj>So what? Not every program has to be correct nor secure.
<oriansj>If one needs the properties of C and assembly; there is no way to have automatic formal verification.
<oriansj>Now one can follow a process that results in C and assembly code that have formally proven properties
<oriansj>but compilers can easily violate those assumptions
<oriansj>especially when optimizing undefined behavior
<theruran>in my opinion, we must raise the bar and create correct programs. It should not be relegated to military defense and aerospace applications. Normal users deserve working software.
<theruran>I agree that not every program needs to be verified, but those should be at the upper-most level: basically user scripts.
<oriansj>theruran: good; now how are you going to help make that reality?
<theruran>by doing the formal verification work, either of legacy systems or building a new one without the same problems
<theruran>the tools are much better nowadays. No need to learn some obscure logic programming and proof assistants
<oriansj>now what does progress towards that goal look like
<oriansj>what is missing or needs to be fixed first
<theruran>one simple way to start would be to verify minilisp.c
<oriansj>good
<theruran>another way I am considering is modeling Steele and Sussman's Lisp microprocessor state machine using abstract state machines
<oriansj>I look forward to seeing it
<theruran>thanks oriansj! it may take me many more months. I am still getting used to being a student researcher :)
<theruran>will be sure to share my progress here
<oriansj>theruran: just remember, one person can change the world an inch at a time
<oriansj>keep pushing
<theruran>
<oriansj>we are always here to help those who want to make the world a better place
<oriansj>learn, share, teach and create something worth sharing
<theruran>thank you. glad to have found this community because you are working on stuff that will upend the status quo
<oriansj>theruran: it isn't everyday you get to solve a 61 year old open Computer Science problem and have this much fun with your friends doing it
<theruran>wow :D
<oriansj>The Trusting Trust attack is literally from a 1958 paper on subverting Soviet technical infrastructure and soon it will be solved.
<theruran>I guess I didn't realize it wasn't solved yet
<theruran>well, at a practical level definitely not. but experimentally yes
<oriansj>well the DDC paper provided a roadmap
<oriansj>Guix provided a solution to the GCC and up bit
<oriansj>Mes.c is providing the scheme to GCC bit
<theruran>true. what's the DDC paper?
<oriansj>The paper that started it all: https://dwheeler.com/trusting-trust/
<oriansj>It was a formal proof that allows one for the first time know that they have defeated the trusting trust attack.
<oriansj>It is what inspired the Repoducible builds project
<oriansj>that started the huge task of making gcc builds reproducible
<theruran>Oh, I've been on this page before but I don't think I've read this paper yet. Diverse Double Compiling is more robust that what is outlined in Trusting Trust?
<oriansj>It is a way to prove if binaries contain such an attack
<theruran>gotcha
<oriansj>IE: If the result is bit-for-bit identical with the untrusted executable, then the source code accurately represents the executable.
<oriansj>and by bootstrapping off a single 357 byte binary; we have a path to a trusted root compiler
<oriansj>be sure to watch the talk, it is well worth it
<theruran>will do. thanks
<oriansj>So all that remains is the close the gap between mes-m2 and mes.c
<oriansj>oh and if you want to help; there are scheme programs that still need writing
<theruran>so the project is very close to completing its mission? very exciting!
<theruran>what Scheme programs must be written?
<oriansj>well correct me if I am wrong janneke but we still need scheme versions of: bzip2, coreutils, gawk, grep, gzip, patch, sed,
<oriansj>tar, xz
<oriansj>after which guix's bootstrap seed will be just guile
<oriansj>and as mes.c grows to fix into guile's place; then the transistion will be end to end complete
<theruran>bootstrapping Guile from a hex seed and low-level Lisp intermediate language?
<theruran>do the Scheme versions need to implement all of the same features?
<oriansj>well see there is a problem in guile; it can't be built from human writen source right now
<theruran>oh, so writing it in mes-m2 Scheme
<oriansj>as one needs guile to convert the human written source into something that can be currently built; thus mes.c needs to grow up enough to do that step
<oriansj>mes-m2 is just the name for a fork of mes.c with the goal of being buildable by M2-Planet fast
<oriansj>and now that it can be built via M2-Planet and all tests pass
<oriansj>I have the task of making it become closer and closer to mes.c
<oriansj>everything from hex0->M2-Planet->mes-m2 is done already
<oriansj>it is just writing C code to get mes-m2 up to mes.c; so that is able to self-host MesCC
<theruran>is it C or a subset of C?
<oriansj>MesCC is a C99 compiler
<oriansj>M2-Planet is a highly restricted subset of C
<oriansj>Written in a more restricted subset of C
<oriansj>that I implemented in assembly
<oriansj>So this code: https://github.com/oriansj/M2-Planet/blob/master/cc_core.c supports more C functionality than it uses
<oriansj>and the assembly version is availble in multiple forms: https://github.com/oriansj/mescc-tools-seed/blob/master/x86/cc_x86.M1 https://github.com/oriansj/mescc-tools-seed/blob/master/x86/NASM/cc_x86.S
<theruran>it's really impressive work everyone!
<theruran>what dialect of Scheme is used to write the remaining pieces?
<oriansj>well we are sticking to a common core subset to make the transistion from guile to Mes.c as small as possible
<theruran>where?
<oriansj>for example, I am rewriting mescc-tools in scheme https://github.com/oriansj/slow-utils so that mescc-tools don't have to be in the binary total
<oriansj>The bash replacement is being done here: https://gitlab.com/rutger.van.beusekom/gash.git
<theruran>how similar is Scheme Shell (scsh) to Gash, I wonder?
<oriansj>well Gash is a Bash drop in compatible
<theruran>the code looks good!
<oriansj>good programmers
<theruran>everything here is being built on much sounder foundations. when the bootstrap is complete it will feel incredibly empowering to use such systems
<oriansj>More people should be on tomorrow morning; so I am off to bed and hopefully we can chat more later. ^_^
<theruran>peace!
<janneke>oriansj: mostly correct, you said:
<janneke>well correct me if I am wrong janneke but we still need scheme versions of: bzip2, coreutils, gawk, grep, gzip, patch, sed,tar, xz
<janneke>we already have awk, coreutils, grep, sed, tar
<janneke>we don't need bzip2, patch
<janneke>iwbn if awk and sed were more feature-complete
<janneke>and we have a broken compress, a gzip would be nice
<janneke>when i say: "we don't need", i mean that we can bootstrap C versions of those tools before we need them
<janneke>oriansj, theruran gash and geesh were merged and gash core moved here: https://savannah.gnu.org/projects/gash
<janneke>gash-core-utils has been moved to a new WIP feature branch, now temporarily living here: https://gitlab.com/janneke/gash
<janneke>this will all be cleaned-up and released some time soon
<oriansj>janneke: thank you for correcting me; I am glad to see the eliminations of binaries from the guix bootstrap in such a good state
<oriansj>The next Guix release and announcement is going to be quite impressive
<janneke>oriansj: well yes, this has not been released but we have it prototyped
<janneke>so some rough edges all around
<oriansj>currently doing a big shift in stage0 to make the amount of duplication go down a good bit
<oriansj>which requires me to clean up some old bios level hex pieces
<xentrac>janneke: thank you for the update!
<janneke>xentrac: yw
<janneke>oriansj: samplet is working on the next gash release and will probably do the next announcement of the guix bootstrap
<xentrac>what about xz?
<janneke>his work will probably be mes-0.19 based, whereas i am prototyping with mes-0.21rc
<janneke>xentrac: xz is quite probematic, i succeeded to build it with gcc 4.6 or 4.7
<janneke>it won't build with tcc or gcc-2.95
<janneke>*problematic
<janneke>however, we can avoid using it until we have gcc-4.x
<janneke>it would be a feature if we either got rid of xz entirely, or get some good bootstrap-oriented maintenance on it
<xentrac>I mean do we need a scheme version of it?
<oriansj>need no; would it be nice to have, yes
<dddddd>Hello booters!
<oriansj>how goes the aarch64 work?
<dddddd>fine, msub for % is working. I have some simplification for if that I need to check on the metal and port to the rest of the control structures.
<dddddd>slow but steady progress last weekends
<oriansj>nice
<oriansj>hopefully you are having lots of fun
<oriansj>I probably prepare for mescc-tools-seed Release v1.1 by including mes-m2 into the build process
<janneke>xentrac: a scheme version of gzip has higher priority for me than xz
<janneke>guix uses xz to store patched tarballs
<janneke>we can work around that by applying patches in the build recipe
<janneke>another workaround could be to tell guix to use gzip for patched bootstrap source tarballs, that needs some work
<oriansj>well tar files would be fine as compression isn't required at these sizes
<oriansj>provided of course Mescc can compile gzip
*xentrac isn't required at these sizes
<oriansj>* xentrac is clearly wrong *
<xentrac>well that goes without saying
<oriansj>Nothing worth having came from a man who wasn't willing to be wrong
<oriansj>keep being awesome xentrac
*xentrac blushes
<oriansj>janneke: my current goal: make it possible to just git clone --recursive mescc-tools-seed and kaem --verbose strict and have the future guix seed built
<oriansj>So I am integrating the mes-m2 build into it and will include the initial self-host steps required for mes.c
<oriansj>but as of right now mes-m2 isnt at the self-host stage; I'll just put what I have
<janneke>oriansj: good plan
<janneke>oriansj: the scheme-only bootstrap uses uncompressed, plain tar balls for mes, tcc, make and gzip
<janneke>for now, that's pretty ok'ish
<xentrac>if we had a scheme gzip, could we dispense with all four of those?
<janneke>another options is a scripted build for gzip, and use tar.gz for make already
<oriansj>well we always can do kaem scripts
<xentrac>heh
<janneke>xentrac: yes, we could use regular .tar.gz balls
*janneke goes to prepare dinner
*xentrac wishes he hadn't compressd his dinner with xz
<oriansj>it just depends on which gets done first: someone writes gz in scheme or someone writes a kaem.run script for gzip's build
<xentrac>the second one sounds a lot easier
<xentrac>but the first one sounds a lot cooler
<oriansj>well nothing impressive ever was easy
<xentrac>apparently Dijkstra thought up Dijkstra's algorithm in 20 minutes
<oriansj>So open call for anyone who wants to work on either one of them ^_^
<oriansj>xentrac: but yet there are entire classes of programmers who take months to get it
<xentrac>I think probably the hard part is becoming the kind of person for whom thinking up Dijkstra's algorithm is easy
<xentrac>I remember a Feynman story about drumming
<xentrac>He was doing this bongo ballet thing or something, and they had a composer who had written a score for it
<xentrac>and he and the other guys playing the drums gradually learned to play the score, measure by painstaking measure
<xentrac>by practicing for weeks and weeks they finally got it
<xentrac>then one day he showed the score to an experienced drummer who could sight-read
<xentrac>and the guy said, "oh, it's like this?" and knocked out the rhythm immediately on his drum set
<xentrac>Feynman was very impressed that the drummer could do in seconds what had taken him weeks
<xentrac>For the drummer it was easy; for Feynman it was impressive. But becoming the drummer wasn't easy
<oriansj>indeed
<xentrac>(my favorite Feynman stories are the Feynman stories where Feynman isn't just telling you an anecdote about how smart Feynman is)
<oriansj>have you seen the Feynman talk about AI?
<xentrac>no?
<oriansj>janneke: and the initial x86 port of mes-m2 passed all tests!!!
<oriansj>xentrac: https://www.youtube.com/watch?v=EKWGGDXe5MA
<oriansj>and mescc-tools-seed v1.1 is now up
<oriansj>with mes-m2 builds for both AMD64 and x86 that pass the tests
<xentrac>thansk!
<janneke>oriansj: brilliant work, as always!
<oriansj>janneke: I was thinking of a behavior change in mes.c and mes-m2; if the ENV variable MES exists fallback to the MES_BOOT behavior of mes.c; otherwise use the easier to test behavior of mes-m2
<oriansj>or we could do the reverse with another variable name
<janneke>oriansj: ah nice -- yes, let's use another variable..or wait
<janneke>the mes build system uses `MES' in scripts to set the mes executable
<janneke>oriansj: you have seen more of those scripts than you would like, i guess :-/
<janneke>and it's a bit of a mess
<oriansj>how about MES_CORE
<janneke>i have also been using `GUILE' and 'SCHEME' to set the scheme interpreter variable
<janneke>that needs to be cleaned up i guess
<janneke>yeah, MES_CORE is perfect
<oriansj>if set; use the new behavior for testing
<oriansj>otherwise load the boot.scm per the mes.c way of doing things and exit when done
<oriansj>I'll just throw up a minimal version so that you can play with it
<oriansj>and it looks like I made a mistake with env_lookup
<oriansj>janneke: the primitive version is up and I'll start the process of getting mes-m2 more mes.c friendly
<janneke_>\o/
***janneke_ is now known as janneke