IRC channel logs

2025-03-26.log

<Pellescours>what is the problem? I don’t remember
<damo22>Pellescours: i dont recall exactly but packets are getting lost somewhere
<damo22>you can build my hurd rumpnet version and try
<damo22>Pellescours: https://git.zammit.org/hurd-sv.git/log/?h=rumpnet
<Pellescours>the hurd console is now fixed thank youuu
<Pellescours>youpi: I got a strange behavior (a default pager read was called while I don’t have a swap partition and the RAM was almost entirely free). I investigated the vm code a bit, and I found this spot https://salsa.debian.org/hurd-team/gnumach/-/blob/master/vm/vm_object.c?ref_type=heads#L2161
<Pellescours>the doc says "for an internal object", does it mean that object.internal must be TRUE?
<youpi>internal objects have object.internal set to TRUE, yes
<Pellescours>(I tried to put an assert(object.internal == TRUE) but the assertion fails when rumpdisk starts)
<youpi>about default pager called without swap, note that I fixed a couple things lately, make sure to look at master
<youpi>internal objects do exist, and can have a memory object
<youpi>even without swap
<Pellescours>I’m running on master, I’m on a cross-compiled 64 bit hurd
<youpi>the problem comes only when the kernel tries to swap them out
<youpi>that's where I put some fixes
<youpi>and since then I have way less issues on 64b buildds
<Pellescours>yeah I understand that internal objects can exist, I was just wondering why it tried to read data from the default pager (while I have no swap)
<youpi>it's odd that it'd be trying to read if it never swapped out
<youpi>it'd be useful to look at the backtrace to check what led to doing that
<Pellescours>I don’t have a backtrace, all writes just stop and a message appears in the console
<Pellescours>"(default pager): data_request read error, lost data"
<youpi>what message?
<Pellescours>When I checked in the source code what could have called the seqnos_memory_object_data_request I reached this place https://salsa.debian.org/hurd-team/gnumach/-/blob/master/vm/vm_fault.c?ref_type=heads#L652
<Pellescours>(maybe I’m mistaken and it’s not this place that called the method)
<youpi>you can put an assert there
<Pellescours>but if it’s this place, I’m wondering how it was possible to reach this codepath. And that’s why I got the internal field question
<youpi>checking whether the pager is the default pager (unexpected) or another pager, e.g. ext2fs's libpager (expected)
<youpi>put an assert, and the backtrace will tell you how it got there :)
<youpi>indeed, you could also just assert on object->internal, rather than checking the pager pointer
<Pellescours>I first tried putting an assertion (object->internal != TRUE) in the vm_page_fault (before the memory_object_data_request), and it did not fail but I got the default pager message again
<Pellescours>then I tried putting an assert_backtrace(1 == 0) in the fail path of the pager data_request (just after the message) and the stacktrace only contains the seqnos_memory_object_data_request call
<Pellescours>I’m not able to understand what calls my default pager
<Pellescours>damo22: do you have a way to use rump on one card while still using dde on the other?
<Pellescours>youpi: I put some logs in mach-defpager to see how the method is called. It is called 7 times at the moment of the "crash": 6 times it takes the `if (no_block)` and the `if (external)` branches (the part with the comment saying "it ask for unswapping data" in default read), and the last one hits the `if(offset >= ds.dpager.limit)` (this one causes the error message)
<youpi>Pellescours: ah, so it did try to write to swap actually
<youpi>and it's only upon trying to re-read that it realizes that it didn't write
<youpi>that's indeed how the protocol currently works, unfortunately
<Pellescours>so adding more RAM should work around it I suppose
<youpi>yes
<youpi>or not starting mach-defpager at all
<youpi>so mach knows it shouldn't try to swap out
<youpi>disabling all tmpfs mounts will allow that
<Pellescours>Ah ok, but isn’t there a way for the defpager to say it rejects the write?
<youpi>currently, no
<Pellescours>I see some return (PAGER_ABSENT), maybe the pager should be able to determine that it does not have a backing store and return it?
<youpi>also, the kernel doesn't know how much room mach-defpager has on the swap
<Pellescours>so the interface between the defpager and the kernel should be improved if we want to be able to do that, I see.
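
[Editor's sketch] The behavior youpi describes above (a page-out that reports nothing back, with the loss only noticed on re-read) can be illustrated with a toy model. This is not the Mach or mach-defpager interface: the class and method names below are hypothetical, and only the `offset >= limit` check and the error text are taken from the log.

```python
class ToyDefaultPager:
    """Toy stand-in for a default pager (hypothetical, not the real API)."""

    def __init__(self, limit):
        # Number of page slots in the backing store; 0 models "no swap".
        self.limit = limit
        self.store = {}

    def data_write(self, offset, data):
        # Page-out: nothing is returned, so the caller cannot
        # tell whether the page was actually stored.
        if offset < self.limit:
            self.store[offset] = data
        # Otherwise the page is silently dropped.

    def data_request(self, offset):
        # Page-in: only here does the missing backing store show up.
        if offset >= self.limit:
            raise OSError("data_request read error, lost data")
        return self.store.get(offset)


if __name__ == "__main__":
    pager = ToyDefaultPager(limit=0)   # assumption: no swap configured
    pager.data_write(0, b"page")       # appears to succeed
    try:
        pager.data_request(0)
    except OSError as exc:
        print("re-read failed:", exc)
```

In this model, letting `data_write` report a failure would be one way to surface the problem at page-out time, which is roughly the interface improvement discussed here.
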
<Pellescours>I will try not starting the defpager, but the swap request is really weird. I doubled my RAM (16G for the VM), only 400M is reported as used, and there was still a swap request
<youpi>possibly it's just because some low-memory segment is getting full
<Pellescours>I ensured that the defpager is not started, I do not have a tmpfs, and I still have the issue
<youpi>still the issue?
<youpi>if the defpager was really not started, you wouldn't get the error message from it :)
<Pellescours>right
<Pellescours>I renamed the mach-defpager to something else and now it works
<Pellescours>and the "real" problem appears, it’s during the compilation of nss (mozilla certificates), it runs the gyp program, and now I have an OSError (Input/output error) from /lib/python3.12/multiprocessing/synchronize.py (_multiprocessing.SemLock call)
<Pellescours>does this python script work on debian hurd64 `import multiprocessing; multiprocessing.Pool(1)`? I want to know if it’s my VM bootstrapping that needs fixing or if it’s a bug in the 64bit version of hurd
<youpi>it terminates fine here
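
[Editor's sketch] A narrower probe than `multiprocessing.Pool(1)`, assuming the failure really is in semaphore creation as the reported traceback suggests: it creates a single `multiprocessing.Lock()`, which is backed by `_multiprocessing.SemLock`. The function name `probe_semlock` is made up for illustration.

```python
import multiprocessing


def probe_semlock():
    """Try to create and use one multiprocessing lock.

    Returns (True, None) on success, or (False, error_text) on failure.
    """
    try:
        # Lock() goes through multiprocessing/synchronize.py and
        # ends up creating a _multiprocessing.SemLock.
        lock = multiprocessing.Lock()
    except (OSError, ImportError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    # Acquire and release once to check the lock is usable.
    with lock:
        pass
    return True, None


if __name__ == "__main__":
    ok, err = probe_semlock()
    print("SemLock OK" if ok else f"SemLock failed: {err}")
```

If this probe fails in the VM but `Pool(1)` terminates fine on a Debian hurd64 install, that would point more towards the local bootstrap than towards a general 64-bit bug, though it does not prove it.
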