IRC channel logs

2021-06-09.log

***dekenevs is now known as mitzman
<oriansj>gio: You are absolutely right that M1 is a quite primitive Assembly language and it is quite reasonable to not like it for doing more advanced work. I just still haven't found time to make a proper binutils compatible assembler/linker M2-Planet buildable yet
<oriansj>although the macro issue has been addressed thanks to yt; who added a rather standard C preprocessor to M2-Planet
<oriansj>but I do remember that the M2-Planet version you wrote in G, gio, was a pretty close copy and would be able to bootstrap the current generation of M2-Planet+M2libc;
<oriansj>So from boot to GCC would be possible by simply borrowing solutions from live-bootstrap.
<oriansj>I also remember when akkartik was working on his assembler that became mu https://github.com/akkartik/mu/
<oriansj>It is kinda interesting how 5 years ago the things we knew were going to be the case didn't happen and the things we thought might be impossible ended up being what we did.
<oriansj>but yeah, the tiny human bootstrapping process is crazy high in compute and memory requirements.
<xentrac>what were the things we knew were going to be the case
<oriansj>xentrac: probably a lisp written in assembly was a big one that I remember well (heck I even wrote that lisp in assembly)
<oriansj>and that doing C in assembly would be impossible (turned out to be easier than FORTH or Lisp)
<xentrac>oh, yeah
<xentrac>nothing is impossible
<xentrac>well, lots of things can't be written in any language
<xentrac>but i mean anything you can write in one language you can write in another
<xentrac>c is easier to write a lisp in than assembly tho
<xentrac>i guess it's easier to point that out now that we already have experience with it ;)
<oriansj>xentrac: also that writing a C compiler in assembly basically ends up being something so obvious once you know about state machines and looking at basic tokenization.
<oriansj>fossy: Do you prefer I put kaem enhancements in mescc-tools or mescc-tools-extra?
<xentrac>yeah, i guess i should try it but i suspect maybe it's not so obvious to people who aren't you ;)
<oriansj>xentrac: well look at the debug function in cc_x86; the relationship of "if you see string x, replace it with string pattern y" isn't a hard jump
<oriansj>Then a little bit of type tracking added on and boom a rather rich C subset
<oriansj>fossy: I ask because once you decide, I'll merge an enhancement to kaem and finish off the stage0-posix+mescc-tools-extra integration script
<vancz>wait mes is called mes because "#+SUBTITLE: Maxwell Equations of Software"? that is sooo coool
<oriansj>essentially: https://paste.debian.net/1200473/ will be the standard for all of the pieces and currently kaem converts ${BLOOD_FLAG} into " " and blood-elf doesn't handle that as valid input;
<oriansj>The fix is trivial: https://paste.debian.net/1200474/
<oriansj>since Kaem is your baby fossy, it is your decision: mescc-tools or mescc-tools-extra as the home for kaem.
<oriansj>and the initial sha256sum, ungz and untar M2-Planet+M2libc kaem build script for mescc-tools-extra is up;
<oriansj>just need to add the remaining pieces and get fossy
<oriansj>to make a decision on Kaem's home
<oriansj>then I'll wrap it all up with the latest M2-Planet v1.8.0 release in stage0-posix
<oriansj>with the after.kaem hook for live-bootstrap to leverage
***bandali_ is now known as bandali
<xentrac>vancz: yeah, alan kay's phrase for the lisp metacircular interpreter. not sure i totally agree but it's certainly evocative
<vancz>ah
<xentrac>there are a variety of simple, minimal sets of node types that give you a turing-complete interpreter if you already have some kind of flexible linked data structure
<xentrac>and the insight that stunned alan kay was that anywhere you could implement the lisp eval function or something similar, you had access to unlimited computational power
<vancz>aha
<vancz>give me a loop and a branch and shall
<vancz>*I shall move the instruction pointer
<xentrac>heh, pretty much
<xentrac>the interesting thing about the lisp primitive set is that it gives you flexible linked data structures (or rather inherits them from the host environment, which is one of the limitations of specification-by-metacircular-interpreter)
<xentrac>and similarly with recursive functions
<xentrac>in the lisp 1.5 form, it doesn't give you the kind of general-purpose 'object-orientation' that you get from closures in scheme, but it's straightforward to tweak the metacircular interpreter to add that
<gio>Melg8[m]: I don't really remember the details, but that code was even more experimental than the rest. Basically I was just playing with low level stuff. My advice is not to try to make it work, it just won't. If you want to make something at that point, you need to study how Linux loads and implement that part from scratch.
<gio>Also notice that in the end you won't probably want to load Linux from a classic bzImage, because you don't really want to write the code to create a bzImage at boot time. Rather, you'll want to directly link and load the code in memory.
<gio>But that also requires a good understanding of how Linux links and loads, which is not trivial. And that is all to be written.
<gio>I trust it won't be much different from iPXE in its concept, but still there is some non-trivial coding to be done.
<fossy>oriansj: i'm going to suggest mescc-tools because it is an integral part of the bootstrap process
<Melg8[m]>@gio, thanks, i observed some strange (at least for me) behavior when i tried to debug what is happening within that memcpy instruction, at execution of which it looks like a reboot. So, i tried to check what's inside the addresses after malloc (i've assumed that memcpy is somehow corrupting memory, so i started digging in that direction). So far what i observed is: real_mode contains some interesting values on allocation. I don't want you to spend a lot
<Melg8[m]>of time on this, but at a glance, do these values after malloc look fine to you? (http://paste.debian.net/1200482/ - you are looking at a dump from https://gitlab.com/giomasce/asmc/-/blob/master/http/continue2.c#L176 real_mode ) i mean, sure - malloc should allocate without zeroing out, but where it points - isn't that some already used memory part (like virtual fs or something)? And another thing i've observed: i tried to call in a loop
<Melg8[m]>mallocs, to check what they contain, and found that
<Melg8[m]>@gio the addresses are distributed weirdly: http://paste.debian.net/1200483/. So my question is - can there be some major flaw in how malloc works? or have you tested that it's fine, and i'm digging in the wrong direction? (i've tried to enable USE_CHECKED_MALLOC, but it doesn't seem to add/change anything)
<Melg8[m]>Btw after copy and reallocation this part looks fine (at least with 55 aa at the end) http://paste.debian.net/1200485/
<gio>Melg8[m]: It's not strange that malloc gives you a bunch of memory that was already used and then discarded with free. I have a rather naive malloc implementation, that just stores freed blocks in a few lists (differentiated by block size) and then hands them back out when they are requested.
<gio>So the pattern given by malloc essentially depends on the patterns by which free is called, in reverse.
<gio>I am quite confident that my implementation of malloc is correct, and has never given me problems; though of course bugs can always surface. But your observations do not look strange to me.
<gio>Also USE_CHECKED_MALLOC only applies to G's malloc, so it's normal it doesn't change anything for you.
<gio>Again, I think my code doesn't work just because it doesn't work, there is nothing to fix. There is correct code to be written. You won't reach anything by fiddling with continue2.c.
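The allocator behaviour gio describes above (freed blocks kept in a few lists keyed by block size, then handed back in reverse order of freeing, old contents included) can be illustrated with a minimal C sketch. This is not asmc's actual malloc: the names `toy_malloc` and `toy_free`, the four size classes, the fixed arena, and passing the size to `toy_free` are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE  4096
#define NUM_CLASSES 4      /* hypothetical size classes: 16, 32, 64, 128 bytes */
#define MIN_BLOCK   16

/* Fixed backing store; the union only forces a reasonable alignment. */
static union {
    void *align_ptr;
    long double align_ld;
    unsigned char bytes[ARENA_SIZE];
} arena;
static size_t arena_used = 0;

/* One singly linked list of freed blocks per size class. */
static void *free_lists[NUM_CLASSES];

/* Map a request size to the smallest class that fits, or -1 if too large. */
static int size_class(size_t size) {
    size_t capacity = MIN_BLOCK;
    for (int c = 0; c < NUM_CLASSES; c++) {
        if (size <= capacity) return c;
        capacity *= 2;
    }
    return -1;
}

void *toy_malloc(size_t size) {
    int c = size_class(size);
    if (c < 0) return NULL;
    if (free_lists[c] != NULL) {
        /* Reuse the most recently freed block; its old bytes are NOT cleared. */
        void *p = free_lists[c];
        memcpy(&free_lists[c], p, sizeof(void *));   /* head = p->next */
        return p;
    }
    /* Nothing to reuse: carve a fresh block off the arena. */
    size_t capacity = (size_t)MIN_BLOCK << c;
    if (arena_used + capacity > ARENA_SIZE) return NULL;
    void *p = arena.bytes + arena_used;
    arena_used += capacity;
    return p;
}

/* Assumption: the caller passes the original size, so no block header is needed. */
void toy_free(void *p, size_t size) {
    int c = size_class(size);
    assert(p != NULL && c >= 0);
    memcpy(p, &free_lists[c], sizeof(void *));       /* p->next = head */
    free_lists[c] = p;
}
```

Freeing `a` and then `b` makes the next two same-sized requests return `b` and then `a`, and apart from the link pointer stored at the start of the block, the bytes written by the previous owner are still there. That matches the kind of "already used" contents seen in the dumps.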
<Melg8[m]>@gio now i'm just learning about your system and about how the kernel loads, and continue2.c is the only thing i've got) at least i can tweak it)
<xentrac>happy st. columba day
<gio>Melg8[m]: This is a useful resource: https://0xax.gitbooks.io/linux-insides/content/Booting/
<gio>And then you need to read the code.
<gio>Also, reading how iPXE loading works in asmc might be useful for understanding the kind of things you need to do, though Linux is a different beast anyway.
<Melg8[m]>@gio thanks! btw how do you debug this? did you use some qemu related debugging? or just dump out data with prints, or some other way? (isolated parts build?)
<gio>Melg8[m]: Hard question! My usual debugging technique is to try really hard to directly write correct code. Other than that, mostly a lot of printing. I used low level techniques in the beginning, but at this level I believe they're already too much hassle.
<gio>In principle you can attach a gdb to qemu, but then you have to make sense of stuff, and it's easily more work than printing stuff.
<Melg8[m]>@gio how hard it would be to port your g language to other arch? what parts it would require to change?
<gio>Melg8[m]: Well, the compiler is written in x86 assembly, so you basically have to write it from scratch. You can use the x86 one as a base, of course, but you cannot directly reuse code.
<gio>Also, notice that G code is not portable, has
<gio>*as it assumes word size and endianness.
<gio>So you would have to update the G code as well.
<Melg8[m]>i see
<gio>I have sometimes thought a little bit about how to make it a little more arch-independent, but for the moment that's the status.
<Melg8[m]>elaborate) please) even if i will not do it right away, at least for the future would be nice to know your ideas
<gio>The big problem, if I recall correctly, is that you have to manually write offsets in the G code, because the language itself doesn't know about structs.
<gio>So you could add some primitive to the language to multiply a number by the word size at compile time.
<gio>This might solve a good amount of problems, but I'd have to think better to see if there are others.
<gio>Also, support for 64 bits operation might be included in the language itself, instead of added later as it is now. But you don't really want to complicate the language, that's the most important thing.
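gio's idea of a compile-time "multiply by the word size" primitive can be illustrated with a C analogy. This is not G code and not a proposal for G's syntax: the `WORDS` macro, the field names, and the `store_word`/`load_word` helpers are hypothetical. The sketch only shows how struct-less code can address fields by index instead of by hand-written byte offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the proposed primitive:
 * scale a field index by the machine word size at compile time. */
#define WORDS(n) ((size_t)(n) * sizeof(uintptr_t))

/* A record without struct support: fields are identified by index only. */
enum { NODE_NEXT = 0, NODE_VALUE = 1, NODE_FIELDS = 2 };

/* Store one machine word at a byte offset from base. */
void store_word(void *base, size_t byte_offset, uintptr_t value) {
    assert(byte_offset % sizeof(uintptr_t) == 0);
    memcpy((unsigned char *)base + byte_offset, &value, sizeof value);
}

/* Load one machine word from a byte offset from base. */
uintptr_t load_word(const void *base, size_t byte_offset) {
    uintptr_t value;
    assert(byte_offset % sizeof(uintptr_t) == 0);
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}
```

Code written as `store_word(node, WORDS(NODE_VALUE), 42)` compiles to offset 4 on a 32-bit target and offset 8 on a 64-bit one, whereas a literal `4` would be wrong on the latter. Endianness assumptions are a separate problem that this does not address.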
***mitzman is now known as kitzman
<Melg8[m]>@gio didn't you use assembler implementation of g to compile another version of g written in g?
<gio>No, there isn't any G written in G.
<Melg8[m]>would it solve this? or would that be just more complications?
<Melg8[m]>so like g in assembler - is platform specific, but g in g is not
<gio>Right now G as a language is platform dependent, so it is not something you fix by just changing the compiler.
<gio>In principle adding another layer might help solve the problem. But I wouldn't make a layer of G again, it wouldn't make sense. Maybe create a new language G1, which is some kind of intermediate between G and C. This might make sense, though I'd have to think about it more.
<gio>Also, this doesn't remove the problem that G-on-assembly is still platform dependent.
<Melg8[m]>siraben: here you go https://gitlab.com/giomasce/asmc/-/blob/master/G_LANGUAGE.md
<oriansj>Melg8[m]: there is an extremely good reason why cc_* supports structs. It makes the writing of portable C code easy.
<oriansj>that transition point from assembly to a higher level language is always 100% non-portable in terms of the individual implementations (hence cc_x86, cc_armv7l, cc_aarch64, etc) but if one puts in a bit of effort that higher level language is "effectively portable (tm)"
<oriansj>So although there are a great deal of restrictions on M2-Planet's code (cc_* compatibility and cross-platform portability without macros)
<oriansj>Once one has a cc_* for their architecture and builds M2-Planet, everything else after it becomes standard. One needs only update M2libc for the target architecture and add basic support in M2-Planet+M1+hex2 (or in bonko insane architectures blood-elf)
<oriansj>C for all of its many flaws and imperfections, still is a pretty great portable assembly language; when you remember to think of it in those terms.
<oriansj>also when you want a clean memory block, never do malloc; only calloc ensures you will get a clean memory block. malloc is for cases where you plan on immediately overwriting the block of memory you are given with something you'll properly delimit or properly segment.
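oriansj's malloc/calloc rule of thumb can be shown with a small C sketch. The helper names `new_counters` and `copy_string` are made up for illustration: `calloc` is used when the caller relies on the block starting out zeroed, and `malloc` is used only where every byte of the block is overwritten immediately.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Needs a clean block: the counters must start at zero, so use calloc. */
int *new_counters(size_t count) {
    return calloc(count, sizeof(int));
}

/* Overwrites the whole block right away (including the terminating NUL),
 * so the stale contents malloc may hand back never become visible. */
char *copy_string(const char *source) {
    assert(source != NULL);
    size_t length = strlen(source) + 1;
    char *copy = malloc(length);
    if (copy != NULL) {
        memcpy(copy, source, length);
    }
    return copy;
}
```

Reading from a `malloc`ed block before writing to it yields indeterminate values, which is why the leftover data in a freshly allocated buffer is not in itself a sign of a broken allocator.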
<oriansj>fossy: thank you for your decision. I'll simplify mescc-tools-extra accordingly
<oriansj>ok and with that we have a new mescc-tools Release 1.2.0 vagrantc you will want to update Debian with the latest release
<oriansj>Let us see if I can finish this all before I have to go to work
<siraben>Melg8: ooh interesting, looks Forth-like
<Melg8[m]>i saw llvm ir in that) but dk what is easier - to implement small C subset compiler - or LLVM IR compiler - in assembly
<oriansj>Melg8[m]: cc_x86 is only 3257 lines of assembly and comments, which can be done in less than 24 hours. what would the LLVM IR compiler take?
<oriansj>and the stage0-posix x86 and AMD64 updates are up but the AArch64 will have to wait till after work and I'll make it a stage0-posix release
***Server sets mode: +cnt
***dongcarl7 is now known as dongcarl
***dekenevs is now known as kitzman
***smartin1 is now known as smartin
***ericonr- is now known as ericonr
***edef_ is now known as edef
<stikonas>oriansj: I need https://github.com/oriansj/mescc-tools-extra/pull/2 for live-bootstrap
<stikonas>fossy: so I mostly got live-bootstrap working with stage0-posix
<stikonas>for now I'm keeping same mes, to avoid changing too many things in one go
<stikonas>but I'll remove all fletcher stuff...