IRC channel logs

2024-12-12.log


<Pellescours>I understand the problem a bit better now: with rumpdisk, doing a cp of a big folder (with a lot of small files) (~900M
<Pellescours>maybe less), it hangs
<youpi>that maps to the usual hang I've been seeing, happening while unpacking .debs
<youpi>having more memory, or telling ext2fs to sync more often, reduces the issue
<youpi>possibly it's just when memory becomes scarce that it misbehaves
<Pellescours>In my reproducible test, I see some rumpdisk threads getting an exception with code KERN_INVALID_ADDRESS. I just reproduced the case in a normally built debian VM (32bit also) and I don’t see the exception but I see the freeze too
<Pellescours>Ah, by running htop at the same time I see the cache memory (yellow) increasing and when it’s close to the top, it freezes
<solid_black>hi
<solid_black>I wonder, what is it that htop reports as cache memory
<solid_black>something from vm_statistics?
<solid_black>must be vm_cache_statistics->cache_count rather
<solid_black>which is vm_object_external_pages, which is something like the number of physical pages that represent external VM object's pages
<solid_black>so yes, if that fills up the whole bar, that's when pageout would start to happen
<solid_black>Pellescours: can you see exactly what the faulting thread is doing when it gets the exception?
<solid_black>let's just fix the pageout bugs, sounds like they've been plaguing us for too long
<solid_black>or perhaps you can tell me how to reproduce this?
<solid_black>is it just causing pageout in any way w/ rumpdisk?
<solid_black>hi ZhaoM!
<ZhaoM>solid_black: hi
<solid_black>I saw RTC got merged?
<solid_black>congrats
<solid_black>what are you going to hack on next?
<ZhaoM>Thanks
<ZhaoM>Now I'm browsing the TODO list on the website
<ZhaoM>even though there are still some minor things about RTC
<ZhaoM>like adding wiki and a proper RTC_UIE_ON implementation
<ZhaoM>And I found there are so many entries on the open issue page
<ZhaoM>Maybe it's worth to check if all of them are up-to-date
<solid_black>my impression was many of them aren't
<ZhaoM>Then I may spend some time on it before I find something I'm keen to hack on :)
<gfleury>Hi
<solid_black>hi gfleury
<gfleury>hi solid_black
<gfleury>ZhaoM: https://darnassus.sceen.net/~hurd-web/contributing/ start with this under TODO List
<solid_black>oh, getting rid of executable stacks would be very cool indeed
<gfleury>Those are the most pressing, but ask before starting because some are already fixed
<solid_black>if you take up SA_NOCLDWAIT, please talk to me about how that interacts with my plans for new signals
<gfleury>solid_black: you want to reimplement signals
<solid_black>kind of, yes
<solid_black>the good news is the huge monster of signal handling code path in glibc stays the same
<solid_black>but there'd be new RPCs and new glibc APIs
<solid_black>it's related to the GDB vs msgport issue (see the email I sent on April 2nd two years ago), and to killing suid processes
<gfleury>Nice
<solid_black>I have way too many plans and way too little motivation :)
<solid_black>right now I'm motivated to investigate the pageout issues youpi and Pellescours have been facing
<solid_black>so let's do that please
<solid_black>I should also really get back to my little Alpine-based distro
<solid_black>'cause it's that time of year again
<solid_black>in completely unrelated news, my layout rework is finally starting to land in gtk
<gfleury>plz fix pageout issue to make the system stable. I think you are in good position to do it
<ZhaoM>gfleury: OK
<gfleury>congrats on your gtk work. Sometimes I wonder how you manage so much complex work like that
<solid_black>htop doesn't even list any processes here
<gfleury>same here
<solid_black>rustix expects c::sendmsg() to return isize, but it returns i32
<solid_black>it certainly is declared as returning ssize_t in glibc
<youpi>it's being solved
<youpi>already fixed in the rust-libc repo
<solid_black>ah
<youpi>the dep just needs to be updated now
<solid_black>ACTION looks
<solid_black>now I'm getting "the following trait bounds were not satisfied: `Poller: AsRawFd` which is required by `Arc<Poller>: AsRawFd`" in calloop
<solid_black>which is likely about it using poll vs epoll
<youpi>which version of rustix are you building?
<youpi>0.38.37-1 builds fine in debian
<solid_black>I 'cargo update'd rustix
<solid_black>so rustix now builds
<solid_black>the new error is in calloop
<youpi>ok
<youpi>that possibly needs patching vs poll/epoll indeed
<youpi>the crazy tendency of the rust ecosystem to just put #cfg everywhere instead of just detecting the availability of interface hits again
<solid_black>yes, it has #[cfg(unix)] currently
<youpi>I know it's meant for being able to cross-compile, but that's not a good reason. You can detect the availability of most interfaces while cross-compiling too
<solid_black>I wonder whether it's possible to just add something like 'where Poller: AsRawFd', instead of more cfgs
<solid_black>from a quick check, that doesn't seem to work, the compiler sees that the condition doesn't depend on any of the impl's own generics
<Pellescours>solid_black: I was not able to determine it, but probably it's possible
<Pellescours>the exception is maybe because I have no swap on my test environment
<Pellescours>but basically get a directory of small files (around 1G of small files), I took locales that I duplicated to have a big enough folder and just copy it to another folder
<Pellescours>if you have a gdb with a breakpoint set at exception() you should catch it
<Pellescours>basically: mkdir test; cp -r /usr/share/locales test/locales-1; cp -r /usr/share/locales test/locales-2; cp -r /usr/share/locales test/locales-3; cp -r /usr/share/locales test/locales-4; cp -r /usr/share/locales test/locales-5
<Pellescours>and then after this preparatory work you just cp test test2 and see the freeze
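
The reproduction steps above can be sketched as a small script. This is only an illustration under assumptions: the helper names `make_small_files` and `build_and_copy` are hypothetical, and the script generates dummy files instead of copying the locale data used in the log, so file counts and sizes would need to be scaled up (towards roughly 1G) to put real pressure on the page cache.

```python
import shutil
import tempfile
from pathlib import Path


def make_small_files(root: Path, count: int, size: int = 512) -> None:
    """Create `count` small files under `root` (stand-in for the locale data)."""
    root.mkdir(parents=True, exist_ok=True)
    payload = b"x" * size
    for i in range(count):
        (root / f"file-{i:06d}").write_bytes(payload)


def build_and_copy(workdir: Path, copies: int = 5, files_per_copy: int = 100) -> int:
    """Build test/locales-1..N, then copy test to test2; return files copied."""
    test = workdir / "test"
    for n in range(1, copies + 1):
        make_small_files(test / f"locales-{n}", files_per_copy)
    # The recursive copy is the step reported in the log to freeze on rumpdisk.
    shutil.copytree(test, workdir / "test2")
    return sum(1 for p in (workdir / "test2").rglob("*") if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(build_and_copy(Path(tmp)))
```

Running it while watching htop, as described earlier in the log, would be one way to check whether the cache figure climbs before the hang.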
<Pellescours>ZhaoM: while being in clocks, maybe implement the CLOCK_MONOTONIC in gnumach ? I don't know how hard it can be to implement and what benefit it brings (maybe not that worth for now)
<ZhaoM>Pellescours: Just scanned through some documentation about monotonic clock. It seems interesting. I will try it.
<solid_black>I haven't looked into the design for monotonic clock in particular, but in general: please don't always rush to bring unix concepts into gnumach
<solid_black>if it makes sense in the kernel, sure, let's do it
<solid_black>but "unix has it" is not good enough of a reason to implement a feature into mach
<ZhaoM>Got it
<solid_black>for the monotonic clock in particular, I don't immediately see the opposite thing, why a microkernel would need to keep anything but a monotonic clock
<solid_black>is our current /dev/time CLOCK_REALTIME?
<solid_black>it is, yes
<solid_black>can we get rid of it?
<youpi>that would break existing applications
<youpi>better extend it
<solid_black>is mach_msg timeout current measured against that realtime clock for instance?
<solid_black>s/current/currently/
<youpi>it's relative
<youpi>so it doesn't matter very much
<solid_black>doesn't it? what if the wall-clock is changed due to e.g. changing a timezone, or an NTP sync occurring
<youpi>I mean the specification
<youpi>then the implementation might be buggy, I don't know
<solid_black>the specification is relative, sure
<solid_black>but if, as implemented, it uses the existing "realtime" Mach clock, then it must be affected by time jumps
<youpi>mach uses ticks for timeouts
<solid_black>hm, indeed, so it does look relative
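
The point about tick-based timeouts can be illustrated with a toy model. This is not gnumach code; `TickClock` and `TickTimeout` are hypothetical names, and the 100 Hz rate is just an assumed tick frequency. The sketch only shows why a timeout stored as an expiry tick is unaffected when the wall clock is stepped.

```python
class TickClock:
    """Toy model: a forward-only tick counter plus a settable wall clock."""

    def __init__(self, hz: int = 100) -> None:
        self.hz = hz
        self.ticks = 0          # incremented once per timer interrupt
        self.wall_offset = 0.0  # seconds added to uptime to get wall-clock time

    def tick(self, n: int = 1) -> None:
        self.ticks += n

    def wall_time(self) -> float:
        return self.ticks / self.hz + self.wall_offset

    def set_wall_time(self, seconds: float) -> None:
        # An NTP step or manual date change only moves the offset, never the ticks.
        self.wall_offset = seconds - self.ticks / self.hz


class TickTimeout:
    """Relative timeout stored as an expiry tick."""

    def __init__(self, clock: TickClock, timeout_ms: int) -> None:
        self.clock = clock
        # Round up to whole ticks so the timeout never fires early.
        self.expiry = clock.ticks + (timeout_ms * clock.hz + 999) // 1000

    def expired(self) -> bool:
        return self.clock.ticks >= self.expiry
```

A timeout compared against `wall_time()` instead would expire immediately after a forward jump, which is the failure mode being asked about.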
<solid_black>supporting pthread clockwait / condattr_setclock basically requires gsync timeout to be measurable against arbitrary clocks
<ZhaoM>go to bed. Good night guys.
<solid_black>night
<Pellescours>I assumed that implementations of clock must be done in the kernel as, the existing implementation is already there
<solid_black>in light of what I wrote about gsync / clockwait above, it probably has to indeed, but I need to think more
<solid_black>but if it wasn't for that and if we didn't already have a realtime clock in the kernel, I could imagine /dev/time being implemented by a /hurd/time server for example
<solid_black>and doing an NTP time sync (or manually setting date) would do an RPC to that
<solid_black>and it would host the various adjtimex logic for adjusting wall clock gradually to avoid discontinuous changes
<Pellescours>so basically it woumd be better to have clock monotonic in the kernel and clock realtime userland ?
<Pellescours>would*
<damo22>something we need to fix about clocks is that minimum granularity for kernel mach clock timer is 100Hz
<damo22>so you basically never get a clock value between comb filter
<damo22>so if you measure the time and measure it again too fast, you get exactly the same value
<Pellescours>that's due to hpet timer ?
<damo22>no i think its because the implementation of clock timers measures the stored value that is updated once per clock interrupt
<Pellescours>i know that linux uses complicated timers, hard to get it right because it's based on cpu frequency
<Pellescours>ah I see
<damo22>i could be wrong but last i checked that was the case
<damo22>hpet should be okay, if we query it every time the time is requested
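
The granularity problem described here can be modelled in a few lines. This is a simulation under assumptions, not the gnumach implementation: `ToyTimer` is a hypothetical name, the hardware counter stands in for an HPET-like source, and the 100 Hz period matches the figure quoted above.

```python
class ToyTimer:
    """Toy model of a clock whose stored value is refreshed once per interrupt."""

    def __init__(self, hz: int = 100) -> None:
        self.period_ns = 1_000_000_000 // hz
        self.hw_ns = 0      # free-running hardware counter (assumed HPET-like)
        self.stored_ns = 0  # value updated only by the clock interrupt

    def advance(self, ns: int) -> None:
        """Let `ns` nanoseconds pass; the stored value snaps to tick boundaries."""
        self.hw_ns += ns
        self.stored_ns = (self.hw_ns // self.period_ns) * self.period_ns

    def read_stored(self) -> int:
        # What a caller sees if the clock just returns the stored value.
        return self.stored_ns

    def read_hardware(self) -> int:
        # What a caller sees if the hardware is queried on every request.
        return self.hw_ns
```

Two reads of `read_stored()` taken 3 ms apart inside the same 10 ms period return the same value, while `read_hardware()` shows the elapsed time.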
<damo22>youre talking about ntp and high level stuff but we really need to fix the low level implementation
<Pellescours>yeah, but first having rump working correctly (disk and net)
<Pellescours>i need to retry the netbsd sources upgrade. But I need to switch back to linux in kernel drivers otherwise it's impossible to have something stable with heavy write
<jab>hey hurd people, so I'm tempted to tell phoronix about the Hurd's latest Qoth: qoth Q3 2024, which I guess is not merged into the web git repo. Weird I thought it was.
<youpi>iirc q3 was missing a few updates before committing ?
<jab>gotcha. I thought it was just a few typos type thing. I'll tweak it and re-send it. Also darnassus website is lagging behind the web repo.
<jab>eg: https://git.savannah.gnu.org/cgit/hurd/web.git/commit/?id=078b6861aaeaeafd88914de292331c1164b7e676
<jab> https://darnassus.sceen.net/~hurd-web/open_issues/
<jab>I'm not seeing Damien's audio system plan that I added (he wrote most of it).
<azert>sneek: later tell jab I understand the wish to share good news with phoronix, but beware of the demoralization shills over there! Don’t fall for their tricks, don’t reply to the inevitable out of context wrong criticisms expressed with weird standing of superiority and paternalism
<sneek>Got it.
<Pellescours>What generates gnumachUser.c in rumpkernel? Is it mig?
<Pellescours>Ah found, and yes
<Pellescours>I have an error compiling rumpdisk
<Pellescours>gnumachUser.c:1791:37: error: comparison of integer expressions of different signedness: 'unsigned int' and 'int' [-Werror=sign-compare]
<Pellescours>(gcc 14.2.0)
<Pellescours>Ah I see that it was fixed in mig latest
<gfleury>sneek later tell youpi: i just sent on glibc ML move of __pthread_sigstate, __pthread__sigstate and pthread_sigmask into libc.
<sneek>Got it.
<Pellescours>probably the debian package needs to be updated to include this fix
<Pellescours>I did it, I finally updated the netbsd sources and the associated debian patches. It was actually not hard once I learned to use quilt correctly to update the patches.
<Pellescours>But now I have an error at compilation, the opt_ahcisata.h is not here anymore