IRC channel logs

2020-12-22.log

<fossy>OriansJ: i was just utilising catm as you had suggested to copy files around in early stages of the bootstrap
<fossy>however i have run into a problem: it does not copy over permissions, so executable bit is lost
<fossy>is this something i should just implement separately or is there a way to get catm to do it
<OriansJ>fossy: we might just have to get some unix shell commands into a form that M2-Planet can build
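The lost executable bit can be handled by a small copy helper that reads the source file's mode and applies it to the destination. The following is only a sketch in ordinary POSIX C, not code from catm and not written in the M2-Planet subset; `copy_with_mode` is a hypothetical name introduced for illustration.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of catm): copy src to dst and carry over
 * the permission bits, including the executable bit.
 * Returns 0 on success, -1 on any failure. */
int copy_with_mode(const char *src, const char *dst)
{
    char buf[4096];
    struct stat st;
    ssize_t n = 0;
    int ok = -1;

    int in = open(src, O_RDONLY);
    if (in < 0) return -1;
    if (fstat(in, &st) != 0) { close(in); return -1; }

    /* Create the destination with a restrictive mode first. */
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) { close(in); return -1; }

    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) { /* write(2) may write fewer bytes than asked */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) goto done;
            off += w;
        }
    }
    if (n < 0) goto done;

    /* fchmod is not affected by the umask, so the bits are copied as-is. */
    if (fchmod(out, st.st_mode & 07777) == 0) ok = 0;

done:
    close(in);
    close(out);
    return ok;
}
```

Calling `copy_with_mode("a", "b")` on an executable `a` should leave `b` executable as well; whether the bootstrap can afford `fstat` and `fchmod` at that stage is a separate question.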
<OriansJ>deesix: not all labels for gotos have following statements
<OriansJ>mihi: interesting choice for buffers on read; I would have thought read the whole file into a buffer and skip syscalls for all files opened for read and the buffering on writes just until newline or fflush.
<OriansJ>mihi: the reason for the failing test on AMD64 that you posted; only 32bits was written but 64bits was read
<OriansJ>doing something like this: https://paste.debian.net/1177902/ would fix that bug but introduce another bug if you did -1
<deesix>OriansJ, I think a label without a statement is not a thing: "Statements may carry label prefixes". The grammar for labeled_statement looks like the ANSI C (identifier : statement) one. What am I missing?
<OriansJ>mihi: thank you for fixing the x86 fclose syscall and the test kaem envp behavior. merged
<OriansJ>deesix: void bar() { goto foo; global = global + 1; foo:} is valid C code
<deesix>OriansJ, I already tested something like that and got "error: label at end of compound statement". Let me check that one.
<OriansJ>minor correction int main() { goto foo; global = global + 1; foo:;} with GCC 10
<deesix>Oh, that final ; is a statement, I think.
<deesix>expression-statement without the optional part.
<xentrac>yes, ; is a statement
<xentrac>that's why it's valid
<OriansJ>I think this is why I am really bad at doing a proper C compiler.
<xentrac>the usual case where this aspect of the C syntax is annoying is when you want to put a label on a declaration
<xentrac>foo: int x = 3;
<xentrac>it's not you, OriansJ, it's C
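The two label cases discussed above can be written out in full. This is a minimal sketch for compilers following the pre-C23 grammar, where a label must prefix a statement; the function `baz` is a hypothetical addition to show the declaration case.

```c
int global = 0;

/* Label at the end of a compound statement: the lone ';' is the empty
 * expression-statement that the label prefixes. */
void bar(void)
{
    goto foo;
    global = global + 1; /* skipped by the goto */
foo:
    ;
}

/* Label in front of a declaration: a declaration is not a statement,
 * so an empty statement goes between the label and the declaration. */
int baz(void)
{
    goto foo;
foo:
    ;
    int x = 3;
    return x;
}
```

After calling `bar()`, `global` is still 0 because the increment is jumped over, and `baz()` returns 3.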
<OriansJ>I really cheated hard in M2-Planet for gotos and labels
<OriansJ>just converted label: to :label and made goto a naked jump (good luck if you assigned variables)
<OriansJ>but a 20 hour hack of a C compiler that seems to be my eternal project
<OriansJ>where all functions arguments are void* and all returns are void*
<OriansJ>The types only mean size, signed and nothing else. And actually get the size wrong on all targets. long ->32bits on 32bit platforms; int ->64bits on 64bit platforms
<OriansJ>mihi: looking close enough to your buffering code. It looks like it could/should be shared by all of the architectures.
<OriansJ>time, I guess, to finally start treating libc seriously in M2-Planet
<OriansJ>but it is late and I have work bright and early. Good night.
<fossy>OriansJ: ok, I will work on that at some point
<mihi>OriansJ, that buffering approach is not my idea, the same concept is used by Rust's bufio and Java's buffered streams. Only simplification I made is that I always flush the buffer on seek instead of checking if the seek offset is inside the already read buffer.
<mihi>Making it auto-flush on newline is certainly doable, and also we could change the stdout/stderr buffering from "no buffering" to "buffer until newline" if we make sure that the stream is either flushed or every printed message ends with a newline.
<mihi>I also agree that it will be possible to share some code across architectures, but I have no (easy) way of testing ARM architecture and since it does not work on AMD64, the only arch left is i386...
<mihi>so my approach would be to first implement/test for all architectures and refactor common code once that is done and proven to work.
<mihi>About read(2) ing the whole file in a single syscall I see a few disadvantages:
<mihi>1) While it may be possible to get the size beforehand using lseek or stat, this will not work for all kinds of file descriptors
<mihi>2) reading the whole file will block until the hard disk has read all the sectors, even if the program stops after having read half of the file. Also modern OSes have read-ahead caches, which only help if the program does some processing already while the file is read.
<mihi>you may work around this by using mmap(2), but that would make it more complex, won't work for all file types, and would require every supported kernel to support mmap.
<mihi>and last but not least 3) you would need some size limit as I don't think it is a good idea to completely load gigabyte large files on the first read call.
<mihi>last a disclaimer: I never had a look how actual libc implementations implement their read buffering, but I would assume it to be a lot more complex :)
<OriansJ>mihi: well right now the largest file M2-Planet is compiling is 524,034 bytes; so setting a limit at 20MB seems reasonable and set a flag if fully read. Then we can fall back to the flush and lseek behavior if it is too large.
<mihi>OriansJ, so if I understand you correctly, you would prefer: (1) on fopen, try to use lseek to get the file size and if we are successful, use the file size clamped to 20MB as buffer size. If unsuccessful or the file is still empty, use a fixed buffer size as now. Store the file size. (2) When reading, try to fill the whole buffer. I don't think we need an extra flag for it, when I implement detection of seeks
<mihi>inside the buffer (which will need the file size in case of SEEK_END seeks). (3) When switching from read to write or vice versa (don't think it happens in M2-Planet right now), I would still flush the whole buffer. (4) writes will invalidate the file size. (5) separate file.c into architecture-independent and architecture-specific parts
<mihi>(6) auto-flush on newlines?
<mihi>meanwhile I checked libbionic (embedded libc) and their io buffering implementation is very similar to ours. One exception is that it bypasses the buffer when reading large chunks and the buffer is empty, but we are only reading one byte at a time anyway :)
<mihi>s/ours/mine/
<mihi>but anyway, I still think the biggest obstacle is to get it working on 64-bit too :)
<mihi>on 64-bit M2-Planet, to be exact
<siraben>OriansJ: what's the largest file M2-Planet is compiling?
<pder>siraben: I now have precisely building with M2-Planet. I am just working on modifying precisely.hs to output M2-Planet compatible code. Do you have any thoughts on sample code to test the compiler?
<pder>btw precisely.c is 759222 bytes
<siraben>pder: sure, I'll do some tests and see what precisely.hs accepts
<siraben>of course, you should attempt self-compilation
<pder>Thank you, yes bin/precisely < precisely.hs works
<pder>I am just wondering what a minimal program looks like
<pder>Maybe we want a wrapper script that compiles the hs to C and compiles with M2-Planet to an executable?
<siraben>pder: try, `main = putStrLn "hello, world!"`
<siraben>I think at that point it's layout sensitive
<siraben>so lots of simple haskell programs will work, provided the functions are defined
<siraben>pder: and when you compile precisely with itself, does the produced output also work?
<pder>siraben, I still need some more modifications to precisely.hs to generate M2-Planet code, so I can compile it repeatedly
<pder>but I am close
<pder>so the minimal hello world gives me missing: putStrLn
<siraben>Ok, so that's because it's not defined, hm.
<siraben>I'll paste something that should
<siraben>oh at least that's some error message!
<siraben>what happens when you do some ill-typed thing like `main = 3 + "hello"`?
<pder>"missing +"
<pder>Do we need some sort of header that defines all of these primitives?
<siraben>it seems like it, I will make a simple program now and see
<siraben>pder: this is a good instance where interactive development with GHC helps
<pder>so GHC loads Prelude on startup, so I assume we need something equivalent
<siraben>Ok none of the wrap files are working with precisely
<siraben>hm
<siraben>pder: http://ix.io/2J99
<siraben>./precisely < sample.hs > out.c && gcc -O3 -o out out.c && ./out
<pder>Thanks. How did you arrive at that?
<siraben>pder: I started with precisely.hs and commented out huge swaths of it and replaced main with putStr "hello\n"
<siraben>then cleaned up and so on, iterated using precisely itself actually
<siraben>it told me what was missing and when types mismatched
<siraben>the next step would be to fix a wrap so that one can use GHC
<siraben>then a reasonable prelude
<pder>Ah ok, do you think we should have something equivalent to Prelude that is autoloaded?
<siraben>Hm, if blynn-compiler had modules/was able to include files it would be easier
<siraben>yes definitely
<siraben>The user should only need to write `main = putStrLn "hello, world!"` instead of that preamble
<pder>Anything missing we can extend beyond precisely
<siraben>As in extra stages? I think precisely is the last stage.
<siraben>Didn't we have crossly as well?
<siraben>Also are you sure commit 734f695e9539a23e4336e4e17006a3bd96171636 is alright?
<siraben>I wonder why blynn made div and mod like that in the first place
<pder>I wondered that too. I tried to keep anything I was not certain of in separate commits so they are easy to see and possibly revert
<pder>Yeah, I meant extra stages beyond precisely if we needed extra features like modules
<siraben>Ah crossly lets you compile to wasm, nice
<siraben>So now we have a bootstrap to a wasm version of blynn-compiler
<pder>I changed div and mod because M2-Planet did not give the same results as gcc with that code but I didn't look closely at why
<siraben>ah, right.
<siraben>so we might have to propose a "modularly" stage, I'll open an issue upstream and see what happens
<siraben> https://github.com/blynn/compiler/issues/2
<pder>In commit 53af09a76492ac6 I tried to do the least invasive thing which should make the 64bit types behave like 32 bit types. Does this seem ok?
<siraben>Oh also disabled GCC optimizations? Did it change the behavior?
<pder>No change in behavior- just for debugging purposes. When I push the final branch I will undo it
<siraben>OK, great
<siraben>I think it should be good, we'll need to test it to be sure
<pder> https://github.com/oriansj/blynn-compiler/pull/11
<pder>OriansJ: the final stage of blynn-compiler now builds with M2-Planet
<pder>siraben: Your sample hello world works building with M2-Planet
<siraben>pder: thank you for your work!
<siraben>pder: it would be good to fix the CI as well, I'll take a look in the morning and suggest changes
<siraben>pder: segfault: https://github.com/oriansj/blynn-compiler/pull/11/checks?check_run_id=1595992801
<pder>siraben: looks like the tests are using an old version of M2-Planet. from the logs it is using e5befc4feed411f55303c
<fossy>pder: amazing work!!!!
<fossy>and siraben
<fossy>siraben: what's the next step? an r5rs interpreter?