IRC channel logs

2020-01-25.log

<Hagfish>that's interesting, thanks
<oriansj>Hagfish: always here to help ^_^
<opal>just wanted to let you guys know this project looks cool
<opal>i'll probably lurk for a while til i have fewer things to do
<opal>always easier to be interested in new stuff than it is to finish old stuff lol
<oriansj>opal: very true, especially when the old stuff requires a lot of work
<opal>yeah seems very time-consuming assuming the best possible scenario, but things get complicated with unmaintained stuff that may not work on newer setups
<opal>still solving the "turtles all the way" dilemma seems noble
<oriansj>opal: actually we are heading to less work as we go along
<opal>thats good
<oriansj>also some things you might wish to know: https://github.com/oriansj/talk-notes/blob/master/Current%20bootstrap%20map.pdf and https://github.com/oriansj/talk-notes
<opal>aa more links
<opal>lol
<oriansj>everything you want to know about current bootstrapping efforts but were afraid to ask
<opal>got way too many tabs and bookmarks but yeah i will check those out
<opal>thanks
<oriansj>Morning
<rain1>hey
<oriansj>looks like static code analysis shows I should change a couple primitives to catch calloc failures.
<oriansj>which means I either have to leverage require in more primitives or hand-roll the same logic in each of them (bigger diff but fewer dependencies)
<oriansj>which is probably why the C standard includes assert
<oriansj>now it would be easy to add assert logic into M2-Planet's code generation; however cc_* wouldn't be able to compile any asserts and thus M2-Planet wouldn't be able to leverage that new functionality (at least not until v2.0)
<oriansj>any thoughts?
<dddddd>I guess you mean without changes to the cc_* family. Too much trouble to change?
<oriansj>dddddd: indeed, not a small change either.
<stikonas>so is the longer term plan cc_* bootstraps M2-Planet v1 and then it bootstraps M2-Planet v2?
<dddddd>I'm thinking crazy here, so bear with me. As we kind of know the requirements of the programs we're going to compile with cc_*, would it make sense to do a big _checked_ allocation upfront and then use that space? Feels rigid, sure... but maybe that requires fewer changes?
<dddddd>It's not a general purpose assert-like functionality that's added. To be honest, I don't know what kind of changes you're thinking about and I guess the calloc example coloured my thoughts. Could you elaborate a bit about which parts of cc_* need to change and what kind of changes are required?
<oriansj>stikonas: exactly.
<oriansj>dddddd: I am just thinking of what is the best way to make M2-Planet v1.x as reliable as possible; while making as minimal changes to cc_* as possible (ideally no changes)
<oriansj>right now I am leveraging require in a handful of spots
<oriansj>but some of the places where issues could occur are in the common libraries, such as numerate_number.c and file_print.c
<oriansj>and string.c
<oriansj>and should I make them also require require (ironic) for brevity's sake or roll custom error catching in each?
<oriansj>This unfortunately isn't an engineering problem, so much as a user facing problem. Do we make leveraging string.c dependent on also having to import require.c, which a programmer might not want?
<dddddd>sure, I was asking the changes in cc_* that you're trying to avoid, to understand it's (not small) scope.
<dddddd>*about the
<oriansj>dddddd: essentially in order to support assert, we would have to add tracking of what file and what line each token came from and then use that info to generate the assert string
<oriansj>now that is already in M2-Planet and would be very little work to add but in cc_* it would require changes to the reader in a big way
<dddddd>*its
<dddddd>oh, so for error reporting, instead of just catching.
<oriansj>well assert is 2 properties 1) erroring out if condition is false and 2) displaying where that error occurred
<oriansj>The first is trivial to add, the second less so
<oriansj>if one wanted to just do the first and not the second, we would just need to add assert to the libc.M1 of each architecture.
<oriansj>The second requires a function to rewrite the assert(a != b) into _assert(a != b, "file.c: line 123\n") and having _assert be the function in libc.M1
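The rewrite described above, turning assert(a != b) into _assert(a != b, "file.c: line 123\n"), can be sketched roughly as follows. This is only an illustration in Python, not M2-Planet's or cc_*'s code: the helper names rewrite_asserts and _rewrite_line are hypothetical, and the sketch assumes each assert call fits on one source line and does not appear inside string literals or comments.

```python
def _rewrite_line(line, filename, lineno):
    """Rewrite every assert(...) call found on a single source line."""
    marker = "assert("
    result = []
    i = 0
    while i < len(line):
        start = line.find(marker, i)
        if start == -1:
            result.append(line[i:])
            break
        # Skip identifiers that merely end in "assert", e.g. my_assert(
        if start > 0 and (line[start - 1].isalnum() or line[start - 1] == "_"):
            result.append(line[i:start + len(marker)])
            i = start + len(marker)
            continue
        # Walk forward to the parenthesis that closes the assert call.
        depth = 1
        j = start + len(marker)
        while j < len(line) and depth > 0:
            if line[j] == "(":
                depth += 1
            elif line[j] == ")":
                depth -= 1
            j += 1
        if depth != 0:
            # Unbalanced on this line: leave the rest untouched.
            result.append(line[i:])
            break
        condition = line[start + len(marker):j - 1]
        # The location is emitted as a C string literal with a literal \n.
        location = '"%s: line %d\\n"' % (filename, lineno)
        result.append(line[i:start])
        result.append("_assert(" + condition + ", " + location + ")")
        i = j
    return "".join(result)


def rewrite_asserts(source, filename):
    """Attach a file/line string to each assert, numbering lines from 1."""
    lines = source.split("\n")
    return "\n".join(
        _rewrite_line(line, filename, lineno)
        for lineno, line in enumerate(lines, start=1)
    )
```

The sketch needs to know the file name and line number of each call, which is the per-token tracking said above to be missing from the cc_* reader.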
<dddddd>I see, I was missing the lack of reader support. So the reader needs to save information about the assert location (but I guess that not for every token).
<oriansj>dddddd: well, every token is actually easier than just a handful of them
<dddddd>I guessed so, just trying to nail the requirement.
<oriansj>So, since the reader is the hardest part of cc_* to tweak (very very touchy); it isn't something I would want to change if I can avoid it.
<oriansj>as it is effectively 1/3 to 1/2 of the development effort of writing cc_*
<dddddd>I see, so M2-Planet compiled with cc_* would lack assert. Then self-compiled M2-Planet is the first one with a proper assert.
<oriansj>exactly, which means duplicates of the primitives (with and without assert to catch bugs)
<oriansj>or I don't use assert in the primitives and just leverage require instead
<oriansj>or do one off error catching in each
<dddddd>assert in low level functions, pointing to themselves without a stack trace, doesn't seem very useful to the user, does it?
<oriansj>hence, the question do we make some of the primitives also require require.c or just roll custom error catching in each
<oriansj>I have no strong feeling about it and honestly, I'm half ready to just flip a coin to decide. Thus does anyone have a strong preference or is a coin flip good enough?
<dddddd>What would the user want to avoid the (indirect) dependency on require.c?
<dddddd>*Why (sorry, rewrited badly)
<dddddd>If there're good reasons, I guess we should err on their side even if that means we write more custom error catching.
<oriansj>well assert would be the most common primitive they would want instead of require;
<oriansj>one does not need require if one has assert
<oriansj>assert is to require as a C macro is to a C function;
<oriansj>(M2-Planet lacking C macros at this time of course)
<oriansj>There are ultimately no good or bad reasons for either way of solving it; hence why I struggle. If there were a definite advantage to either, it would have been implemented already
<oriansj>Ok, coin flip says custom error catching. I'll be updating the primitives shortly
<oriansj>well that is 40 updated checksums to validate
<dddddd>Thanks for sharing. I think this conversation helped me to better understand interactions between cc_*, different versions of M2-Planet (and its lib) and the user.
<oriansj>dddddd: one thing to remember is that there is one huge reason why debug_list is the most important function in cc_*
<oriansj>tokenizing C in assembly sucks.
<oriansj>after that the rest is trivial
<oriansj>hence why debug_list is always the first function written in cc_* and commenting it out is the last thing you do.
<dddddd>debug-head of M1.scm comes to mind, helpful too
<markjenkinsznc>My M1.py got viable enough to get out of my work in progress branch and make it to knightpies master https://github.com/markjenkins/knightpies/blob/75bcd63b0f671cbeb9a3f2c8d6d93ce773fbb50d/M1.py
<markjenkinsznc>I can now $ ./M1.py stage0/High_level_prototypes/defs stage0/stage0/stage0_monitor.s > stage0_monitor.hex2
<markjenkinsznc>and $ ./hex2tobin.py stage0_monitor.hex2 stage0_monitor
<markjenkinsznc>and $ sha256sum stage0_monitor gets the same result as $ grep stage0_monitor stage0/test/SHA256SUMS
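The check above (sha256sum of the built binary against the matching SHA256SUMS entry) could be scripted along these lines. This is a sketch under assumptions: the function names expected_digest and matches are hypothetical, and SHA256SUMS is assumed to use the usual sha256sum layout of a hex digest followed by a path on each line.

```python
import hashlib


def expected_digest(sums_text, name):
    """Return the digest recorded for a file whose basename is `name`."""
    for line in sums_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        digest, path = parts
        # sha256sum may prefix the path with "*" in binary mode.
        basename = path.strip().lstrip("*").split("/")[-1]
        if basename == name:
            return digest.lower()
    return None


def matches(data, sums_text, name):
    """True if the SHA-256 of `data` equals the recorded digest for `name`."""
    want = expected_digest(sums_text, name)
    return want is not None and hashlib.sha256(data).hexdigest() == want
```

With the built file read as bytes and stage0/test/SHA256SUMS read as text, matches(data, sums_text, "stage0_monitor") would stand in for the manual sha256sum and grep comparison.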
<markjenkinsznc>Still to come: I didn't really do much on my M1 string token handling as stage0_monitor.s doesn't use "string", so I need to look at that more carefully. Bugs are probably present.
<markjenkinsznc>Need to support little-endianness and differences in number of bytes for 0x hex atoms.
<markjenkinsznc>Hex2tobin will need to support more symbolic offset modes than those used for knight ISA (@ and $) and also differences in value size and endianness.
<markjenkinsznc>Some refactoring to enable M1tobin.py .
<markjenkinsznc>Test suite support against all supportable stage0 .s files and those found in stage0 git submodules, including interpreting the knight ISA binary M0-macro.hex2 as a point of comparison. (useful when no SHA256sum is present) Then I can do a release and start implementing knight ISA instructions used by stage2/cc_x86.s to make a simple wrapper stage0_cc_x86.py that will try to compile M2-planet.
<oriansj>markjenkinsznc: very very nice
<markjenkinsznc>in case anyone is wondering, writing M2-planet.py (e.g. a rewrite-port of M2-planet to pure python, not interpreting stage0 knight ISA binaries) is not on my roadmap :).
<oriansj>markjenkinsznc: completely fair M2-Planet has grown into a quite large program
<oriansj>150KB compiled
<markjenkinsznc>and I think mescc (c compiler in scheme) is already a compelling c compiler in a high level language
<oriansj>with the ports to AArch64 and RISC-V only to make it even bigger
<oriansj>markjenkinsznc: exactly
<markjenkinsznc>though if anyone wants to do an M2-planet.py port, or if i feel like it some day, I'd suggest making version one target compiling the M2 C subset to only one output architecture to start, and what I'd suggest doing as a first output architecture is targeting a stack machine of some kind instead of a register machine, as https://www.craftinginterpreters.com/ makes a good case for. Then target compiling to a register machine after that.
<markjenkinsznc>idea being, stack machines are easier to do code generation for
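To illustrate why stack machines are considered easy code generation targets, here is a minimal sketch. It is not taken from M2-Planet or any project mentioned here: the tuple-based expression tree and the instruction names push, add, sub and mul are made up for the example. Code generation is a plain post-order walk with no register allocation.

```python
def compile_expr(node):
    """Compile an int or an (op, left, right) tuple to stack instructions."""
    if isinstance(node, int):
        return [("push", node)]
    op, left, right = node
    # Post-order: both operands end up on the stack, then the operator runs.
    return compile_expr(left) + compile_expr(right) + [(op,)]


def run(program):
    """Execute the instruction list on a tiny stack machine."""
    operations = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
    }
    stack = []
    for instruction in program:
        if instruction[0] == "push":
            stack.append(instruction[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(operations[instruction[0]](left, right))
    return stack.pop()
```

For example, run(compile_expr(("add", 1, ("mul", 2, 3)))) evaluates 1 + 2 * 3.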
<dddddd>For a M2-Planet.py I'd try to modularize the code generation by arch, to avoid the chain of repeated conditionals everywhere.
<oriansj>markjenkinsznc: ummm a single architecture version of M2-Planet is cc_* (with only a handful of additions)
<oriansj>also M2-Planet is outputting a C state machine; which is a stack machine
<markjenkinsznc>stack machine also provides a good point of comparison if you start output to a register machine as a second output target, if your parser is broken then both outputs will be broken, but if your x86 code generator is broken but your stack machine code generator is fine then you'll see
<oriansj>quite similar to the AT&T Hobbit processor
<markjenkinsznc>very cool to learn this detail about M2-planet internals
<markjenkinsznc>difference between M2-planet.py with x86 output and what I've called a future stage0_cc_x86.py is the former is pure python in implementation, the latter would interpret knight ISA binary cc_x86.s
<markjenkinsznc>anyway, got to go, nice to learn about this Hobbit thing Jeremiah
<oriansj>always good talking to you Mark
<oriansj>hmm looks like cleanup for test25 and test26 were never turned on; easy to fix
<oriansj>for History buffs, The AT&T hobbit processor is most famous for being the first CPU in BeOS computers and so cheap that Be Inc decided to put 2 into their first machine
<oriansj>It also invented stack look-ahead in hardware and never stood a chance to be performance competitive with any register based design.
<oriansj>What is interesting in the history of Processors; is the best surviving ISAs are those that are unCISCy and unRISCy at the same time
<oriansj>as if good stack operations and C benchmarks; have forced a particular style of thinking about ISAs.
<NieDzejkob>where are the commands supported by the stage0 monitor documented? I looked around some of the repositories yesterday and couldn't find anything
<oriansj>NieDzejkob: stage0_monitor doesn't have any commands
<oriansj>you type hex0 into it and it just writes to 2 files; tape_01 gets exactly what you typed and tape_02 gets the processed output
<oriansj>It solves the bootstrap problem of needing a text editor and an assembler in the same binary
<oriansj>(correction tape_02 is the binary and tape_01 is what you typed)
<oriansj>(why did I just say that backwards twice?)
<oriansj>tape_01 is the binary
<oriansj>tape_02 is your input
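A rough Python sketch of that behaviour follows: one pass over the typed text yields both the assembled binary (tape_01) and a verbatim copy of the input (tape_02). It is an illustration only, not the stage0 monitor's code. The function name monitor is hypothetical, and the sketch assumes hex0 treats # and ; as comments running to end of line and ignores any other non-hex character. The real monitor writes to its tapes as you type, whereas this returns both results at the end.

```python
def monitor(typed):
    """Return (tape_01, tape_02): the assembled binary and the raw input."""
    binary = bytearray()
    high_nibble = None
    in_comment = False
    for ch in typed:
        if in_comment:
            # Comments run to the end of the line (assumed hex0 rule).
            if ch == "\n":
                in_comment = False
            continue
        if ch == "#" or ch == ";":
            in_comment = True
            continue
        if ch in "0123456789abcdefABCDEF":
            nibble = int(ch, 16)
            if high_nibble is None:
                high_nibble = nibble
            else:
                # Two hex digits make one output byte.
                binary.append((high_nibble << 4) | nibble)
                high_nibble = None
        # Whitespace and any other character is ignored.
    return bytes(binary), typed.encode("ascii")
```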
<oriansj>(we assume, you'll have to manually input the source in the event of a complete wipe)
<oriansj>Hence why M2-Planet and above have a lot of error reporting and catching to make that job easier
<oriansj>Hence why every piece up to M2-Planet only adds up to 8053 lines
<oriansj>with cc_* being about 4,974 of that
<oriansj>and SET.s exists; to allow correcting sources
<oriansj>a 473 line (1072byte) bare bones line editor
<NieDzejkob>ah, yes, the Shitty Expensive Typewriter, I recall looking at how it works, but none of my conclusions
<NieDzejkob>ah, the high level prototype has a printf with help
<oriansj>yep, although I probably should create a high level prototype for the stage0_monitor so that it'll be instantly obvious how it works
<oriansj>and clean up the stage2 High level prototypes
<oriansj>and now that I finally cleared out the last ?alloc warning that static code analysis has for M2-Planet; I think I'll take 10 and be back at it
<oriansj>and patches are up
<dddddd>I just rebased my M2-Planet patch over de6eb338d52d, adapted test100/hello-aarch64.sh to the new situation (removal of test100/hello.sh, changes in makefile and require.c requirement) and added chdir, fchdir and access (based on faccessat syscall because for AArch64 there's no access syscall).
<oriansj>dddddd: sounds great
<oriansj>very nicely done as always dddddd
<oriansj>and new High level prototypes are up
<oriansj>now to clean up stage2's high level prototypes
<dddddd>Thanks. What's the program which triggered the addition of the new lib functions? Maybe one of fossy? (the idea being tests a bit, just in case).
<dddddd>*test
<oriansj>dddddd: kaem.c
<oriansj>also mes-m2 will be leveraging them when I get to adding mes.c's logic about searching for boot-0.scm
<fossy>dddddd: the addition of a cd builtin in m2-planet
<oriansj>that is entirely for kaem
<fossy>cd builtin in kaem sorry
<oriansj>and I am going to add a troll into stage2/High_level_prototypes/
<oriansj>and I'll be importing the C sources for cc_* which should be quite useful for anyone wanting to write their own.
<dddddd>There's no kaem repo anymore and its home is mescc-tools now, right?
<oriansj>dddddd: correct
<oriansj>as kaem is part of the mescc-tools and can be depended upon for all bootstrapping of all platforms
<oriansj>Thus it needs to become powerful enough to be used for the most expected shell scripts in the early bootstrap process
<dddddd>OK, README at mescc-tools-seed links to the deprecated repo. That confused me for a moment, when my mirror attempt failed.
<oriansj>dddddd: sounds like something we need to fix so that other people don't hit the same mistake
<oriansj>If someone completes that before I finish preparing the cc_x86 c sources; I'll merge it in.
<oriansj>and cc_x86's sources are ready and patches are up
<fossy>oriansj: would you be interested in adding getcwd() support to M2-Planet? that would allow things like `cd -` to work
<fossy>because otherwise I can't get the initial path
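The reason `cd -` depends on getcwd() can be shown with a small sketch: before each chdir, the builtin has to ask for the current directory so it can return there later. This is a Python illustration, not kaem's implementation, and the class name CdBuiltin is hypothetical.

```python
import os


class CdBuiltin:
    """Track the previous directory so that "cd -" can return to it."""

    def __init__(self):
        self.previous = None

    def cd(self, target):
        if target == "-":
            if self.previous is None:
                raise ValueError("cd -: no previous directory")
            target = self.previous
        # Without a getcwd equivalent there is no way to learn where we
        # started, so the very first "cd -" would have nothing to go back to.
        here = os.getcwd()
        os.chdir(target)
        self.previous = here
        return os.getcwd()
```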
<oriansj>fossy: well it does appear to just be another syscall to support; so entirely possible
<fossy>yes
<oriansj>I'll try to get it done tonight
<oriansj>but first I need to fix mescc-tools-seed's README
<fossy>oh yeah, that. sorry i wasn't aware at that point that kaem was in mescc-tools
<oriansj>fossy: no worries, we are all learning here everyday
<oriansj>ok patches are up and now to figure out how to do getcwd
<oriansj>looks like only the char *getcwd(char *buf, size_t size); form is provided by the kernel with the char *getwd(char *buf); form just being a libc wrapper. which is a minor problem as M2-Planet does not support 2 different functions with the same name.
<oriansj>fossy: does getcwd(calloc(4096, sizeof(char)), 4096); sound like a restriction you are willing to deal with?
<fossy>yes
<dddddd>getcwd != getwd so no same name it seems
<oriansj>ok, I'll make getcwd be a straight asm function and have getwd as just a wrapper function
<oriansj>and I'll expose a PATH_MAX constant which you'll be able to leverage fossy
<oriansj>(say 4096 for simplicity)
<fossy>cool, thanks
<oriansj>dddddd: I'll take care of x86, AMD64 and ARMv7l
<oriansj>might as well throw in get_current_dir_name to complete the triplet
<dddddd>noted, oriansj. I'll do my part, of course. I'm just a bit unsure without tests. Maybe an upgrade to test25 with new kaem and more complete kaem.run?
<oriansj>dddddd: understandably; fossy will you please help by making a test for us
<fossy>sure thing, for chdir() family and getcwd(), getwd()?
<oriansj>fossy: correct
<dddddd>access() is untested too, I think.
<oriansj>throw it onto https://paste.debian.net/ when you are done and dddddd will merge the test in
<dddddd>Sounds good, oriansj. New kaem.c plus a script from fossy.
<oriansj>during which time I'll be porting getcwd syscall into stage0's --posix-mode and disassembler
<oriansj>and my quick prototypes are up
***dddddd_ is now known as dddddd
***deesix_ is now known as deesix