IRC channel logs

<Arsen>on the topic of grub, I was surprised there was no effort to port multiboot2 to amd64 :(

<damo22>rrq, does gnumach print anything?

<damo22>rrq: this is how i laid out my hurd disk:

<damo22>Device Boot Start End Sectors Size Id Type

<damo22>/dev/wd0s1 * 2048 1953791 1951744 953M ef EFI (FAT-12/16/32)

<damo22>/dev/wd0s2 1953792 22925312 20971521 10G 83 Linux

<damo22>/dev/wd0s3 22927360 127784959 104857600 50G 83 Linux

<damo22>/dev/wd0s4 127784960 232642559 104857600 50G 83 Linux

<damo22>s2 is i386, s3 is for code, and s4 is for amd64, s1 is not needed but i can test efi
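Note: the layout pasted above can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming 512-byte sectors (the log does not state the sector size); the helper names `sector_count` and `size_mib` are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: logical sector size is not stated in the log

# (name, start, end, sectors) copied from the table pasted in the log
PARTITIONS = [
    ("s1", 2048, 1953791, 1951744),
    ("s2", 1953792, 22925312, 20971521),
    ("s3", 22927360, 127784959, 104857600),
    ("s4", 127784960, 232642559, 104857600),
]


def sector_count(start, end):
    """Start and end are both inclusive, hence the +1."""
    return end - start + 1


def size_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Partition size in MiB for a given sector count."""
    return sectors * sector_bytes / 2**20


for name, start, end, sectors in PARTITIONS:
    # Each row's Sectors column should equal end - start + 1.
    assert sector_count(start, end) == sectors
    print(f"{name}: {size_mib(sectors):.1f} MiB")
```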

<rrq>no output; grub's multiboot drops a note that "no console is available for the OS" ... I'll need to get a gdb-server session going

<rrq>the disk has only 2 partitions, EFI and ext2; the multiboot loading is done by the grub-efi module referring into (hd0,2) which is the ext2 filesystem

<rrq>.. the same multiboot loading that works with bios boot (though that's on a different image)

<nikolar>that's because (usually) efi doesn't come with a vga style console

<nikolar>you need to use the framebuffer in those cases

<rrq>right. with "you" you mean "gnumach" ?

<nikolar>sure

<damo22>rrq: how did you specify the efi firmware in qemu? i am having trouble loading it

<damo22>i tried -bios /path/to/edk2-x86_64-code.fd

<rrq>"-drive if=pflash,readonly=on,format=raw,file=OVMF_CODE_4M.fd -drive if=pflash,readonly=off,format=raw,file=OVMF_VARS_4M.fd"

<rrq>the fd files copied from /usr/share/OVMF/

<rrq>esp the VARS should be local since it gets modified

<rrq>(I'm on Devuan (debian))
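Note: the two pflash drives quoted above can be assembled programmatically. This is a minimal sketch, not rrq's efirun.sh. The function names and the `workdir` parameter are hypothetical, and it copies only the VARS file, on the assumption that the read-only CODE file can be used in place.

```python
import shutil
from pathlib import Path

# Location named in the log for Devuan/Debian; other distros may differ.
OVMF_DIR = Path("/usr/share/OVMF")


def pflash_args(code_fd, vars_fd):
    """Build the two -drive options quoted in the log."""
    return [
        "-drive", f"if=pflash,readonly=on,format=raw,file={code_fd}",
        "-drive", f"if=pflash,readonly=off,format=raw,file={vars_fd}",
    ]


def prepare_firmware(workdir, ovmf_dir=OVMF_DIR):
    """Copy the writable VARS file into workdir and return the QEMU options."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    local_vars = workdir / "OVMF_VARS_4M.fd"
    if not local_vars.exists():
        # The firmware modifies VARS, so each VM gets its own copy.
        shutil.copy(Path(ovmf_dir) / "OVMF_VARS_4M.fd", local_vars)
    return pflash_args(Path(ovmf_dir) / "OVMF_CODE_4M.fd", local_vars)
```

The returned list can be appended to the rest of a QEMU command line, in place of a `-bios` option.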

<damo22>ok

<rrq>.. I use my efirun.sh in https://git.rrq.selfhost.au/git/rrq/prep-hurd-image.git

<rrq>it also has grub.cfg at top level in the project

<damo22>my grubefi is hanging probing disks

<damo22>error unable to find boot device, error variable prefix is not set

<rrq>you dropped the -bios argument I guess? and the VARS file is writable?

<damo22>ok it eventually probed

<damo22>there is something missing because it crashes

<damo22>but i dont get the console log

<rrq>is there a way to figure out where qemu jumps to the "kernel" (gnumach)?

<rrq>in gdb I mean

<damo22>i dont know

<damo22>you could load grub in gdb?

<rrq>using the gdb-server mode of qemu

<damo22>yeah

<damo22>instead of loading gnumach in the gdb client

<damo22>load grub

<rrq>tracing the machine instructions ... was not too hard when I had the source

<damo22>then it will have all the symbols from grub

<rrq>yeah.. I'm still debian so getting to symbols is a long arm's length away

<damo22>do you think its not reaching the gnumach entrypoint?

<rrq>contrary: I want to start tracing there

<damo22>you could find a very early symbol in gnumach and put a break point on it

<damo22>and load gnumach

<damo22>i had trouble though because of segmentation

<rrq>mmm right; though it's indirectly through qemu's emulation isn't it

<damo22>ie, it would not let me insert the breakpoint

<damo22>you can "watch spl_init"

<damo22>then when that variable becomes available it will break when it changes

<damo22>thats about the best ive got so far

<damo22>heres another qemu trick:

<damo22>-chardev socket,id=net0,host=127.0.0.1,port=9999,ipv4=on,server=on,telnet=on -monitor chardev:net0 \

<damo22>so when you launch qemu, you also have to launch a telnet session to localhost 9999

<damo22>but the benefit is you get the qemu monitor in a separate terminal

<rrq>... mmm I could load my own stub instead of gnumach and induce a segmentation violation... should give a hint of the memory address concerned

<rrq>ah yes, I'm using separate unix sockets for console and monitor, and vnc for graphics (which I don't need to connect)

<rrq>... and then I need my own gnumach compile to work through that one. Guess I won't have a weekend :)

<damo22>rrq: you can look at the forgejo script in master, it cross builds gnumach

<rrq>ta

<damo22>youpi: if a setuid program is owned by root, and its purpose is to perform some privileged op and then drop privileges to a lower user, is there a way to make the program indeed switch to that user so kill works as that user?

<damo22>if so, isnt that a bug in the setuid program?

<damo22>if it doesnt do that

<damo22>youpi: this is the backtrace from x86_64 smp with -smp 6, it drops to kdb shell with a trap and i caught it in gdb:

<damo22> https://pastebin.com/raw/T354F5zc

<damo22>is there something wrong with hpc clock in smp?

<damo22>syscall64 -> mach_msg_trap -> ipc_kobject_server -> _Xhost_get_uptime64 ...

<damo22>#warning This needs fixing

<damo22>clock: Note that the hpc support is not smp safe yet

<damo22>hmm what needs to be done there?

<damo22>i think we only want to update last_hpc_read on cpu0

<damo22>otherwise it updates too frequently

<damo22>that didnt fix it though

<mightysands>Hey, so I just took a look at that git repo. https://git.savannah.gnu.org/git/hurd/incubator.git and...it's empty

<mightysands>I cloned it, and there's nothing but a README file with two lines...

<mightysands>I'm guessing it's not empty, because it downloaded about 70Mb of stuff, so is this just an instance of it being hidden behind various git commands ?

<mightysands>.git/refs/tags is empty, so not sure what my options are

<mightysands>Ah, nvm. The info I was looking for was in the .git/packed-refs file :)

<azeem>damo22: the setuid thing is an old issue with the Hurd, not the setuid program IMO, see https://lists.gnu.org/archive/html/bug-hurd/2006-08/msg00063.html - or do you mean something else?

<azeem>also, https://lists.gnu.org/archive/html/bug-hurd/2026-02/msg00022.html

<damo22>i dont see how its an issue with hurd, if the setuid program doesnt change owner

<damo22>if X runs as root forever, it runs as root

<damo22>its not a bug with hurd

<azeem>maybe I misunderstood your problem then

<damo22>but if X only needs root to start something and then drop to user, it could be killed by user

<azeem>I think this is what happens

<azeem>it *should* be *killable* by user, but right now it isn't

<azeem>for a simple example, see the attachments in https://lists.gnu.org/archive/html/bug-hurd/2026-02/msg00022.html

<damo22>i dont think it should be, if it still needs root for doing privileged things

<azeem>you think it should not drop privs?

<damo22>i dont think it can

<damo22>because the gpu probably needs root to access it

<azeem>I don't have an opinion on the specific X case

<azeem>I think on Linux this might be solved by adding the user running X to some group that has access to /dev/drm or something?

<damo22>in a case where root is only needed initially, it should always drop privs

<damo22>and this should be killable by user

<azeem>yes, but it isn't on the Hurd, agreed

<rrq>well, X can run as non-root; in newest scheme it uses "device mediation" so that root opens them and then passes the opened file descriptor

<rrq>the opening is via elogind or seatd

<damo22>in your perl case, it doesnt look like perl is doing the right thing

<azeem>well, it maybe doesn't do the right thing for the Hurd, but I think what it does is legal

<damo22>i dont think its a hurd bug

<azeem>look, if there's a process running under non-uid X and user X can't send signals to it, I call that a bug

<azeem>non-root uid*

<damo22>does it really run under uid X

<azeem>unfortunately the work-around I found (running POSIX::setuid()) breaks the postgresql-common testsuite on Linux at another place, I need to investigate this further

<azeem>damo22: well, ps says so

<damo22>hmm maybe its a bug with ps info

<damo22>and perl

<azeem>there's a C reproducer as well in that message

<damo22>what is the correct sequence of syscalls to change the uid of a process?

<azeem>I don't think POSIX defines that

<azeem>but I think the sequence that Perl/that C program do is legit, and it works on at least Linux and FreeBSD

<damo22>isnt it just setuid() ?

<damo22>These wrapper functions (including the one

<damo22> for setuid()) employ a signal-based technique to ensure that when one thread changes credentials, all of the other

<damo22> threads in the process also change their credentials. For details, see nptl(7)

<damo22>maybe in your test C program you can check the return value is 0

<azeem>ok, they always return 0, both on Hurd and Linux

<damo22>ok yeah, something is wrong

<damo22> root 936 933 p1 0:00.00 sudo ./atest

<damo22> demo 937 936 p1 0:00.00 ./atest

<damo22>$ kill -9 937

<damo22>-bash: kill: (937) - Operation not permitted

<damo22>$

<azeem>aye

<youpi>damo22: the setuid bug is not in the setuid program, it just tells proc what it needs to know. It's proc that doesn't let the user kill the setuid-ed process

<damo22>ok

<damo22>im more interested in the clock bug because its blocking smp64

<damo22>how to make it smp safe?

<youpi>damo22: don't confuse real uid and effective uid

<youpi>setuid only sets the effective uid, not the real

<youpi>see man getuid vs man geteuid

<azeem>ps shows the effective uid right?

<youpi>probably

<damo22>+ /*

<damo22>+ * Only update high precision read on cpu0 once per clock interrupt

<damo22>+ */

<damo22>+ last_hpc_read = hpclock_read_counter();

<damo22> }

<damo22>- last_hpc_read = hpclock_read_counter();

<damo22>this should help but it doesnt make any difference

<youpi>that doesn't mean it's not a good thing

<damo22>right, you left a #warning when NCPUS > 1 to make hpc smp safe, but i dont know what is needed

<azeem>10:08 < youpi> setuid only sets the effective uid, not the real

<youpi>damo22: possibly nothing but checking that it's alright

<azeem> https://man7.org/linux/man-pages/man2/setuid.2.html says "if the calling process is privileged, the real UID and saved set-user-ID are also set."

<youpi>I mean the setuid bit

<azeem>ah!

<damo22>any thread can call host_get_uptime64 on cpuN when cpu0 is in the middle of a clock interrupt

<youpi>that kind of thing, yes

<damo22>i dont know how i managed to catch the cpu running and executing HPET32

<damo22>unless that read_mapped_uptime () loop gets stuck for a long time

<youpi>(actually it's auth which knows the uids, and the joe-user process has to convince proc that it's allowed to kill the setuid-ed process)

<azeem>what about the less-dramatic case where the executable does not have the setuid bit, but is run by root and then changes uid to joe-user and then they cannot kill it either? Is that the same issue or a subtly different one?

<youpi>iirc currently proc's check is about the effective uid, so that case would work

<damo22>i removed the entire hpc setup and it didnt change the failing tests

<damo22>so something else is broken

<damo22>but only on x86_64

<azeem>youpi: ok, but the Debian Postgres wrapper changes both when it starts the server, so then the postgres user can no longer send signals to it

<azeem>I'll see whether I can find something that works

<azeem>hrm, so https://github.com/Perl/perl5/blob/blead/dist/IO/lib/IO/Socket.pm#L294 seems to be meant "if we try to run getpeername on the socket, but don't get anything defined back, do this"?

<youpi>both real and effective uids are set to the postgres, and the postgres user can't kill it? I don't see how that can be possible

<azeem>yeah IIRC

<azeem>$ sudo ~/test-uid-HUP.pl

<azeem>switching to uid 101 and gid 101

<azeem>euid: 101, uid: 101, gid: 101 0

<youpi>and 101 can't kill it?

<azeem>postgres@debian:~$ id

<azeem>uid=101(postgres) gid=105(postgres) Gruppen=105(postgres),104(ssl-cert)

<azeem>postgres@debian:~$ ps -ef |grep ^postgres.*HUP | grep -v grep

<azeem>postgres 27038 27037 p4 0:00.01 /bin/perl /home/demo/test-uid-HUP.pl

<azeem>postgres@debian:~$ LANG=C kill -s HUP 27038

<azeem>-bash: kill: (27038) - Operation not permitted

<azeem>yeah

<youpi>it does work here

<azeem>weird

<youpi>the gid doesn't match, though

<azeem>uh

<azeem>well I think the setgid doesn't matter here, but if I change it to 105, I still get Operation not permitted

<youpi>even with a gid mismatch I do get to kill it

<youpi>$ sudo ./test

<youpi>r 1001 e 1001 s 1001

<youpi>$ id

<youpi>uid=1001(youpi)

<youpi>$ kill -s HUP 8611

<azeem>how did you do setuid?

<damo22>are you on hurd? :P

<youpi>setuid(1001);

<youpi>damo22: sure

<azeem>right, the perl code runs setregid(105, -1); setreuid(101, -1); setresuid(-1, 101, -1) on Linux

<youpi>ah, so it doesn't change the effective uid

<youpi>which is what kill() currently checks

<youpi>ah no with the last setresuid it does

<youpi>let me check the result of these

<azeem>yeah, I run geteuid() and it is set as well

<youpi>ok, it leaves the suid as 0

<youpi>and in that case I indeed cannot kill it

<youpi>even if both real and effective are set to 1001

<youpi>I don't see why proc would refuse just because of suid, but that's it for now, apparently
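Note: the three user IDs discussed above can be inspected directly. This is a minimal sketch, assuming a platform where Python exposes `os.getresuid()`. The function names are hypothetical. `apply_wrapper_sequence` replays only the call sequence azeem quoted and is not the actual Perl wrapper code.

```python
import os


def describe_uids():
    """Return the real, effective and saved user IDs of this process."""
    real, effective, saved = os.getresuid()
    return {"real": real, "effective": effective, "saved": saved}


def apply_wrapper_sequence(uid, gid):
    """Replay the sequence quoted in the log; it needs root to succeed.

    An argument of -1 means "leave this ID unchanged".
    """
    os.setregid(gid, -1)
    os.setreuid(uid, -1)
    os.setresuid(-1, uid, -1)


def ids_still_root(uids):
    """List which of the three IDs are still 0 after a privilege drop."""
    return [name for name, value in uids.items() if value == 0]
```

Run as root, `apply_wrapper_sequence(101, 105)` followed by `ids_still_root(describe_uids())` should show whether the saved uid is left at 0, which is what youpi reports for this sequence. A plain `setuid()` call by a privileged process changes all three IDs according to the setuid(2) text quoted earlier.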

<youpi>azeem: btw, are you in Münich atm?

<azeem>youpi: no, Occitanie

<azeem>I'll be back in Munich from March

<youpi>ok :) I'll be in münich in the coming days

<azeem>damn :)

<azeem>if you're interested, I can setup a Debian meetup

<azeem>btw. Jeff Bailey was in Munich the other day as well, but we didn't manage to meet either

<azeem>"the other day" being last September or so

<yelninei>does mmap work differently when something is PROT_WRITE and PROT_WRITE | PROT_READ ? The first one crashed when memcpy'ing something to that region while this seems fine on linux

<azeem> https://github.com/Perl/perl5/blob/blead/README.hurd is "Last Updated: Fri, 29 Oct 1999 22:50:30 +0200"

<nexussfan>Wow

<youpi>yelninei: in principle that can, yes

<youpi>yelninei: while the hardware can't prevent the read when the write is allowed, if the page is not resident the system can catch it

<yelninei>youpi: However I would assume that there is no problem with only PROT_WRITE and writing something there (with memcpy)

<yelninei>But there is also a logical problem in the program as the file gets opened with O_RDWR but mmaped only with PROT_WRITE

<youpi>one would have to see the actual crash detail (i.e. assembly), to be sure what happens

<youpi>memcpy implementations can be very tricky

<youpi>in principle it would only write, but possibly it gets smart with trying to prefetch output data etc.

<yelninei>i guess just also adding PROT_READ does not hurt anybody but this took me a while to figure out
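Note: the workaround yelninei settles on, requesting both read and write access for a mapping that is only written to, can be sketched as follows. This is an illustration under assumptions, not the program being debugged. The function names are hypothetical and the payload must be non-empty.

```python
import mmap
import os
import tempfile


def write_via_mmap(path, payload):
    """Write payload to path through a shared mapping.

    The file is opened O_RDWR and mapped PROT_READ | PROT_WRITE, so the
    copy works even if the copy routine or the system reads the page.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.ftruncate(fd, len(payload))  # the mapping needs a non-empty file
        with mmap.mmap(fd, len(payload), flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as region:
            region[:] = payload
            region.flush()
    finally:
        os.close(fd)


def roundtrip(payload):
    """Write payload into a temporary file via mmap and read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_via_mmap(path, payload)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)
```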

<azeem>13:43 < azeem> This perl code (basically IO::Socket::UNIX->new(Type=>SOCK_STREAM(), Peer=>$path); followed by send()) works fine on Linux but fails with "Cannot determine peer address" on the Hurd: https://paste.debian.net/hidden/c64a1389 is that a known issue?

<azeem>I've opened a Perl bug now, I think this is something they should work around: https://github.com/Perl/perl5/issues/24195

2026-02-14.log