IRC channel logs

2022-03-08.log


<muurkha>usually filesystems have to worry a lot about disk sequencing, consistency on power outages, consistency of the buffer cache, fragmentation vs. block allocation efficiency, that kind of thing. those problems mostly go away if you can get away with just writing the entire consistent filesystem image to disk when someone says `sync`
<oriansj>muurkha: we also can dictate no power outages; do no buffering, ignore fragmentation and just stick to 4K blocks
<muurkha>yeah
<oriansj>performance doesn't matter nor do we have to be efficient in any regard other than simplicity
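
A sketch of the whole-image sync strategy just described, under the same assumptions (no buffering, no power loss mid-write; the function name is illustrative):

    #include <unistd.h>

    /* "sync" for a filesystem this simple: the whole in-memory image is
       the single source of truth, so putting a consistent state on disk
       is just one sequential write of the entire image. */
    void fs_sync(int disk_fd, const char *image, long image_size) {
        lseek(disk_fd, 0, SEEK_SET);        /* start of the disk */
        write(disk_fd, image, image_size);  /* assumes no power loss */
    }
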
<muurkha>well, if the resulting system isn't usable, people won't use it to look at the source
<muurkha>so you're back to trusting mainstream Linux or whatever to not lie to you about what the source says
<muurkha>but that's still a lot better than where we are now
<oriansj>well nothing will be there to stop us from building a better version in the M2-Planet C subset and just switching to that too
<oriansj>also I am planning on using BootOS (possibly with some tweaks), making a GFK file system by hand, and writing a couple of boot sector programs for manually putting the pieces onto the disk
<oriansj>but first things first, I need to solve this kernel problem and shrink the TCB more and into a form that I can then snuff out
<muurkha>Green Fluorescent Knowledge?
<muurkha>Grand Forks? Ghostface Killah?
<oriansj>muurkha: good forking knight filesystem
<muurkha>heh
<oriansj>figured I might as well make a filesystem made to be hand edited and trivial in assembly to support
<muurkha>the UCSD p-System filesystem didn't support fragmented files at all
<muurkha>if you wanted to write a file that wouldn't fit into any of the free spaces on the disk, you had to run a defragmenter program to defragment the disk
<muurkha>for a usable bootstrapped system I've been thinking of directly accessing NAND Flash without an FTL
<oriansj>page based filesystem seems relatively simple to do and then I can just ignore being fragmented
<muurkha>not sure why they made that choice but possibly one reason was so that the whole filesystem catalog would fit in one sector (this was floppy-disk era, so the disk might be up to a megabyte)
<muurkha>another reason might be that poor performance due to fragmentation was a much worse problem on a floppy than on a hard disk or an SSD
<muurkha>random seek times were close to a second on floppies
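
For illustration, a contiguous-allocation catalog like the p-System's can be one fixed-size record per file; this struct is a hypothetical sketch, not the actual p-System (or GFK) layout:

    #include <stdint.h>

    /* Catalog entry for a contiguous-allocation filesystem: each file is
       a single run of blocks, so an entry needs only a name, a starting
       block, and a length.  Files can never fragment by construction;
       reclaiming scattered free space is the defragmenter's job. */
    struct catalog_entry {
        char     name[16];     /* fixed-width name, easy to hand-edit */
        uint32_t first_block;  /* first 4K block of the file */
        uint32_t block_count;  /* occupies [first_block, first_block + block_count) */
        uint32_t byte_length;  /* exact size, for the tail of the last block */
    };

At 28 bytes per entry, a single 512-byte sector holds 18 of these, which is consistent with the whole catalog fitting in one sector on a floppy-sized disk.
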
***alMalsamo is now known as lumberjack123
<oriansj>that I certainly believe and they had to care about being fast and efficient
<oriansj>I get to waste GB of disk and RAM to make my work easier
<muurkha>the p-System was ridiculously inefficient as I recall
<muurkha>which is why we all ended up using CP/M and AppleDOS and stuff instead
<muurkha>but it did have to run in 64 KiB RAM, or even less
<oriansj>inefficient by those standards is crazy efficient by modern standards
<muurkha>well, I mean, it could do many fewer arithmetic operations or memory accesses per second than JS code JIT-compiled with V8
<muurkha>or per clock cycle, or even per instruction executed
<oriansj>well one wouldn't expect a pascal interpreter running in 64KB to be as efficient as a 1+GB JavaScript interpreter.
<muurkha>exactly
<oriansj>much like how -Os probably will have less performance than -O3
<muurkha>you'd be surprised how often that isn't true
<muurkha>the bigger reason is that it really was an interpreter, and V8 isn't, it's a JIT compiler
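
A toy version of the overhead in question: every virtual instruction pays for a fetch and an unpredictable branch before any useful work happens, which is precisely what a JIT compiles away (this loop is illustrative, not actual p-code):

    /* Minimal bytecode dispatch loop.  The *code++ fetch and the switch
       branch are pure interpreter overhead, paid on every single
       virtual instruction. */
    enum { OP_HALT, OP_INC, OP_DEC };

    int run(const unsigned char *code) {
        int acc = 0;
        for (;;) {
            switch (*code++) {
            case OP_INC:  acc++; break;
            case OP_DEC:  acc--; break;
            default:      return acc;  /* OP_HALT or unknown opcode */
            }
        }
    }
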
<oriansj>as often as the hot path stops fitting in the L1 cache due to code expansion
<oriansj>but if your hot path is big enough that -O3 can cause it to not fit in 64KB, then you have other issues besides your compiler flags
<muurkha>as you optimize code your hot path tends to get bigger and bigger
<muurkha>because you're trying to eliminate the tiny hotspots. decrease inequality, you might say. redistribute execution time
<muurkha>sometimes you can't, but often you can
<muurkha>but also -Os often produces surprisingly fast code on its own
<muurkha>quite aside from cache effects
<oriansj>muurkha: well fast code these days is mostly just reducing the number of syscalls and not touching the disk much
<muurkha>that depends a lot on what you're doing. certainly there are programs for which that is true
<oriansj>even M2-Planet generated code got a 10x performance increase by just improving the M2libc caching behavior
<muurkha>Amdahl's law applies
<muurkha>if your code is spending 90% of its time making syscalls, you can make it up to 10× faster by reducing the number of syscalls
<muurkha>but you can never make it 11× faster
<muurkha>if you can reduce the amount of work done in syscalls by 100× then your code is now spending 8% of its time in syscalls
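
Those figures are Amdahl's law worked out; a quick check (the helper function is illustrative):

    #include <stdio.h>

    /* Amdahl's law: overall speedup when a fraction p of the runtime is
       sped up by a factor s. */
    double amdahl(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    int main(void) {
        printf("%.2f\n", amdahl(0.90, 100.0)); /* ~9.17x: 90% of time in
                                                  syscalls, syscall work
                                                  cut 100x */
        return 0;
    }

The remaining syscall share is 0.009/0.109, about 8%, matching the figure above.
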
<oriansj>well we went from 1-2 syscall(s) per byte to 1 syscall per file
<oriansj>as fgetc was doing just read(fd, &stack, 1); pop and return
<oriansj>and fputc was just doing push; write(fd, &stack, 1); and return
<muurkha>right, if you're spending 99% of your time in syscalls, it's straightforward to get a 10× performance increase
<oriansj>now we just open, lseek (to get size), brk (to allocate enough memory), lseek (to reset), read and close
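
A sketch of that sequence, with malloc standing in for the raw brk call since the actual M2libc code isn't quoted here:

    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Slurp a whole file with a handful of syscalls instead of one
       read() per byte.  Caller frees the buffer. */
    char *slurp(const char *name, long *size) {
        int fd = open(name, O_RDONLY);
        if (fd < 0) return NULL;
        *size = lseek(fd, 0, SEEK_END);   /* lseek to get the size */
        lseek(fd, 0, SEEK_SET);           /* lseek to reset */
        char *buf = malloc(*size);        /* brk in the M2libc version */
        if (buf) read(fd, buf, *size);    /* one read for the whole file */
        close(fd);
        return buf;
    }
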
<muurkha>but not all code is spending 99% of its time in syscalls
<oriansj>completely agree muurkha
<muurkha>some code is spending 99% of its time on something else, like running the p-System interpreter (probably closer to 90%)
<muurkha>and in the modern world you can often also make code go faster by not spending 80% of your cores on being idle
<muurkha>even if that means doing somewhat more computation rather than less
<oriansj>well unfortunately prolog never got to a generally useful state and performance became a human problem
<muurkha>and of course interleaving disk I/O with CPU work is sort of similar in that it also requires you to have multiple different threads that can proceed as soon as they get data from the disk
<muurkha>I think Prolog is about as generally useful as other programming languages? it's better suited for some things and worse for others, but the difference is just not big enough to make the mental paradigm shift compelling
<muurkha>maybe similar to APL/J/K?
<muurkha>nowadays SSD I/O has a latency on the order of 10-100 μs, which is enough smaller than hard-disk 8000 μs or floppy-disk 1000000 μs that the optimal tradeoffs change
<muurkha>in-RAM caching that was a performance win with spinning rust can be a performance loss with SSD, for example
<oriansj>well I gave up on the APL/J/K route simply because they never managed to make a C compiler with it
<muurkha>and the bandwidth is high enough that the memory-to-memory copy implied by the read() and write() interface starts to become a significant performance loss, as it did long ago for spewing raster data to the display
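
One standard way around that copy is mmap, which backs the mapping directly with the kernel's page-cache pages instead of copying into a user buffer; a minimal read-only sketch:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map a file instead of read()ing it: no memory-to-memory copy into
       a user buffer, the pages are shared with the page cache. */
    const char *map_file(const char *name, size_t *size) {
        int fd = open(name, O_RDONLY);
        if (fd < 0) return NULL;
        struct stat st;
        fstat(fd, &st);
        *size = st.st_size;
        void *p = mmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);                        /* mapping survives the close */
        return p == MAP_FAILED ? NULL : (const char *)p;
    }
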
<muurkha>yeah, I feel like there must be *something* there with APL/J/K but it obviously wasn't the kind of silver bullet that going from assembly to C, or punchcards to screen editors, was
<muurkha>like, it doesn't cut development effort much for anything besides very small systems, and it might actually increase it
<muurkha>also true of Forth, dynamic languages like Python and JS and arguably Lisps
<oriansj>well going from assembly to C is a huge jump in productivity; even if it doesn't prevent you from doing more fun constructs
<muurkha>yup
<muurkha>I thought going from C to Python was a similarly huge jump in productivity, but then I measured it, and I had a dismaying epiphany
<oriansj>going from C to lisp doesn't bring any advantage until you get comfortable with macros
<muurkha>I don't think that's true; Lisp is similar to Python in a lot of ways
<muurkha>because of dynamic typing Lisp and Python code is implicitly parametrically polymorphic, and it's easy to do ad-hoc polymorphism too
<oriansj>but that might just be the perspective shift that learning macros forces upon you; that in retrospect will also make your C programming more efficient too
<muurkha>and data structure literals make it easy to do embedded DSLs in either Lisp or Python without macros
<muurkha>also, GC
<muurkha>so I think you do get some improved productivity by going from C to Lisp, even without macros or fexprs
<muurkha>especially for small programs
<muurkha>it's just a lot smaller than I used to think
<muurkha>(the productivity boost)
<oriansj>well garbage collection certainly is great for userspace code productivity, but unfortunately the deciding factor these days is library support.
<muurkha>again, though, the boost over C from all the goodness in Python is less than an order of magnitude, even in programs that take only hours to write
<muurkha>for me
<muurkha>and for larger programs I suspect it may actually be negative
<oriansj>I tend to pick the language I use after I look at what I want to do.
<muurkha>in theory Prolog should be ideal for writing compilers
<muurkha>but oddly enough its use for that is almost nil nowadays. Haskell and ML are a bit higher
<oriansj>well the biggest factor in the use of any language is the number of people working to make it better.
<muurkha>if that were true then COBOL or FORTRAN would have been the language of choice since their inception
<muurkha>or possibly Intel assembly
<muurkha>or IBM 360 assembly
<muurkha>JS probably has the biggest library nowadays: npm
<muurkha>GNU Prolog generates substantially faster code and uses less memory than V8 for most purposes, but many more compilers are written in JS
<oriansj>muurkha: I have serious doubts on the idea that JavaScript has more developers working to make the language better than other languages
<muurkha>maybe, but it certainly has a bigger library than all but a few, probably more than all
<oriansj>I'll grant you it probably has the most people working on scratching their own programming itch but its bootstrapping state is seriously f$*ked
<muurkha>oh, agreed
<muurkha>it probably has more people working on improving optimizing compilers for it too. I wrote a raytracer in JS a few years ago
<muurkha>I mentioned it a bit in https://news.ycombinator.com/item?id=30378986
<muurkha>when I wrote it in 02017 it took 6.8 seconds to render a particular scene in Node.js (V8)
<muurkha>user meetups323 reports that now it takes 0.25 seconds
<muurkha>that kind of thing trades off heavily against having to optimize your own code
<oriansj>eventually one gets to the point where things are just fast enough to not care
<muurkha>no, that doesn't happen
<muurkha>I mean, you can decide to not care, sure
<muurkha>but having your code run faster always opens up new possibilities, whether you take them or not
<muurkha>fuzzing, Monte Carlo simulation, gradient descent, etc.
<oriansj>fair. But then again people optimizing for performance are not likely to go for interpreted languages or even dynamically typed languages.
<oriansj>type information is too useful for optimization and JIT is no substitute for good old compilation
<muurkha>LuaJIT and Java HotSpot are competitive with all but the best traditional AOT compilers for statically typed languages
<muurkha>so I don't agree that JIT is no substitute for good old compilation, if average performance is your objective
<muurkha>average speed
<muurkha>good old compilation can definitely beat JIT for worst-case performance and memory usage though
<muurkha>dynamic typing and interpreters definitely have a performance cost, but sometimes it's worth it
<lumberjack123>Sorry dumb question what is "literate programming"?
<lumberjack123> https://en.wikipedia.org/wiki/Literate_programming
<lumberjack123>nvm ^_^
<muurkha>np :)
<fossy>stikonas[m]: can you send your perl-5.6.2_0.tar.gz, qemu or chroot. don't even know where to start looking
<stikonas[m]>fossy: OK, will do so
<stikonas>fossy: https://stikonas.eu/files/perl-5.6.2_0.tar.gz
<oriansj>lumberjack123: no such thing as a dumb question. Only questions that lack sufficient shared context, and clarifying helps bridge that gap.
<oriansj>for example today I am learning about setting up a multiboot elf binary and starting to use qemu and that grub-file --is-x86-multiboot can provide a sanity check
<oriansj>and dear lord, putting a handful of bytes in the right place is annoying using linker scripts
<oriansj>but I guess I am spoiled rotten with hex2 doing exactly as I tell it in the order I tell it to do it
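
For reference, the handful of bytes in question is the Multiboot v1 header, which GRUB scans for in the first 8 KiB of the image; a sketch in C (the ".multiboot" section name is a common convention, not mandated):

    /* Multiboot v1 header: magic, flags, and a checksum chosen so the
       three fields sum to zero mod 2^32.  This is exactly what
       grub-file --is-x86-multiboot checks for. */
    #define MB_MAGIC 0x1BADB002u
    #define MB_FLAGS 0u

    __attribute__((section(".multiboot"), used, aligned(4)))
    static const unsigned int multiboot_header[3] = {
        MB_MAGIC,
        MB_FLAGS,
        -(MB_MAGIC + MB_FLAGS)            /* checksum */
    };

The linker script's remaining job is to place that section within the first 8192 bytes of the loaded image, which is the byte-placement annoyance described above.
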