IRC channel logs

2026-02-14.log


<Arsen>on the topic of grub, I was surprised there was no effort to port multiboot2 to amd64 :(
<damo22>rrq, does gnumach print anything?
<damo22>rrq: this is how i laid out my hurd disk:
<damo22>Device Boot Start End Sectors Size Id Type
<damo22>/dev/wd0s1 * 2048 1953791 1951744 953M ef EFI (FAT-12/16/32)
<damo22>/dev/wd0s2 1953792 22925312 20971521 10G 83 Linux
<damo22>/dev/wd0s3 22927360 127784959 104857600 50G 83 Linux
<damo22>/dev/wd0s4 127784960 232642559 104857600 50G 83 Linux
<damo22>s2 is i386, s3 is for code, and s4 is for amd64, s1 is not needed but i can test efi
<rrq>no output; grub's multiboot drops a note that "no console is available for the OS" ... I'll need to get a gdb-server session going
<rrq>the disk has only 2 partitions, EFI and ext2; the multiboot loading is done by the grub-efi module referring into (hd0,2) which is the ext2 filesystem
<rrq>.. the same multiboot loading that works with bios boot (though that's on a different image)
<nikolar>that's because (usually) efi doesn't come with a vga style console
<nikolar>you need to use the framebuffer in those cases
<rrq>right. with "you" you mean "gnumach" ?
<nikolar>sure
<damo22>rrq: how did you specify the efi firmware in qemu? i am having trouble loading it
<damo22>i tried -bios /path/to/edk2-x86_64-code.fd
<rrq>"-drive if=pflash,readonly=on,format=raw,file=OVMF_CODE_4M.fd -drive if=pflash,readonly=off,format=raw,file=OVMF_VARS_4M.fd"
<rrq>the fd files copied from /usr/share/OVMF/
<rrq>esp the VARS should be local since it gets modified
<rrq>(I'm on Devuan (debian))
<damo22>ok
<rrq>.. I use my efirun.sh in https://git.rrq.selfhost.au/git/rrq/prep-hurd-image.git
<rrq>it also has grub.cfg at top level in the project
<damo22>my grubefi is hanging probing disks
<damo22>error unable to find boot device, error variable prefix is not set
<rrq>you dropped the -bios argument I guess? and the VARS file is writable?
<damo22>ok it eventually probed
<damo22>there is something missing because it crashes
<damo22>but i dont get the console log
<rrq>is there a way to figure out where qemu jumps to the "kernel" (gnumach)?
<rrq>in gdb I mean
<damo22>i dont know
<damo22>you could load grub in gdb?
<rrq>using the gdb-server mode of qemu
<damo22>yeah
<damo22>instead of loading gnumach in the gdb client
<damo22>load grub
<rrq>tracing the machine instructions ... was not too hard when I had the source
<damo22>then it will have all the symbols from grub
<rrq>yeah.. I'm still debian so getting to symbols is a long arm's length away
<damo22>do you think its not reaching the gnumach entrypoint?
<rrq>contrary: I want to start tracing there
<damo22>you could find a very early symbol in gnumach and put a break point on it
<damo22>and load gnumach
<damo22>i had trouble though because of segmentation
<rrq>mmm right; though it's indirectly through qemu's emulation isn't it
<damo22>ie, it would not let me insert the breakpoint
<damo22>you can "watch spl_init"
<damo22>then when that variable becomes available it will break when it changes
<damo22>thats about the best ive got so far
<damo22>heres another qemu trick:
<damo22>-chardev socket,id=net0,host=127.0.0.1,port=9999,ipv4=on,server=on,telnet=on -monitor chardev:net0 \
<damo22>so when you launch qemu, you also have to launch a telnet session to localhost 9999
<damo22>but the benefit is you get the qemu monitor in a separate terminal
<rrq>... mmm I could load my own stub instead of gnumach and induce a segmentation violation... should give a hint of the memory address concerned
<rrq>ah yes, I'm using separate unix sockets for console and monitor, and vnc for graphics (which I don't need to connect)
<rrq>... and then I need my own gnumach compile to work through that one. Guess I won't have a weekend :)
<damo22>rrq: you can look at the forgejo script in master, it cross builds gnumach
<rrq>ta
<damo22>youpi: if a setuid program is owned by root, and its purpose is to perform some privileged op and then drop privileges to a lower user, is there a way to make the program indeed switch to that user so kill works as that user?
<damo22>if so, isnt that a bug in the setuid program?
<damo22>if it doesnt do that
<damo22>youpi: this is the backtrace from x86_64 smp with -smp 6, it drops to kdb shell with a trap and i caught it in gdb:
<damo22> https://pastebin.com/raw/T354F5zc
<damo22>is there something wrong with hpc clock in smp?
<damo22>syscall64 -> mach_msg_trap -> ipc_kobject_server -> _Xhost_get_uptime64 ...
<damo22>#warning This needs fixing
<damo22>clock: Note that the hpc support is not smp safe yet
<damo22>hmm what needs to be done there?
<damo22>i think we only want to update last_hpc_read on cpu0
<damo22>otherwise it updates too frequently
<damo22>that didnt fix it though
<mightysands>Hey, so I just took a look at that git repo. https://git.savannah.gnu.org/git/hurd/incubator.git and...it's empty
<mightysands>I cloned it, and there's nothing but a README file with two lines...
<mightysands>I'm guessing it's not empty, because it downloaded about 70Mb of stuff, so is this just an instance of it being hidden behind various git commands ?
<mightysands>.git/refs/tags is empty, so not sure what my options are
<mightysands>Ah, nvm. The info I was looking for was in the .git/packed-refs file :)
<azeem>damo22: the setuid thing is an old issue with the Hurd, not the setuid program IMO, see https://lists.gnu.org/archive/html/bug-hurd/2006-08/msg00063.html - or do you mean something else?
<azeem>also, https://lists.gnu.org/archive/html/bug-hurd/2026-02/msg00022.html
<damo22>i dont see how its an issue with hurd, if the setuid program doesnt change owner
<damo22>if X runs as root forever, it runs as root
<damo22>its not a bug with hurd
<azeem>maybe I misunderstood your problem then
<damo22>but if X only needs root to start something and then drop to user, it could be killed by user
<azeem>I think this is what happens
<azeem>it *should* be *killable* by user, but right now it isn't
<azeem>for a simple example, see the attachments in https://lists.gnu.org/archive/html/bug-hurd/2026-02/msg00022.html
<damo22>i dont think it should be, if it still needs root for doing privileged things
<azeem>you think it should not drop privs?
<damo22>i dont think it can
<damo22>because the gpu probably needs root to access it
<azeem>I don't have an opinion on the specific X case
<azeem>I think on Linux this might be solved by adding the user running X to some group that has access to /dev/drm or something?
<damo22>in a case where root is only needed initially, it should always drop privs
<damo22>and this should be killable by user
<azeem>yes, but it isn't on the Hurd, agreed
<rrq>well, X can run as non-root; in newest scheme it uses "device mediation" so that root opens them and then passes the opened file descriptor
<rrq>the opening is via elogind or seatd
<damo22>in your perl case, it doesnt look like perl is doing the right thing
<azeem>well, it maybe doesn't do the right thing for the Hurd, but I think what it does is legal
<damo22>i dont think its a hurd bug
<azeem>look, if there's a process running under non-uid X and user X can't send signals to it, I call that a bug
<azeem>non-root uid*
<damo22>does it really run under uid X
<azeem>unfortunately the work-around I found (running POSIX::setuid()) breaks the postgresql-common testsuite on Linux at another place, I need to investigate this further
<azeem>damo22: well, ps says so
<damo22>hmm maybe its a bug with ps info
<damo22>and perl
<azeem>there's a C reproducer as well in that message
<damo22>what is the correct sequence of syscalls to change the uid of a process?
<azeem>I don't think POSIX defines that
<azeem>but I think the sequence that Perl/that C program do is legit, and it works on at least Linux and FreeBSD
<damo22>isnt it just setuid() ?
<damo22>These wrapper functions (including the one
<damo22> for setuid()) employ a signal-based technique to ensure that when one thread changes credentials, all of the other
<damo22> threads in the process also change their credentials. For details, see nptl(7)
<damo22>maybe in your test C program you can check the return value is 0
<azeem>ok, they always return 0, both on Hurd and Linux
<damo22>ok yeah, something is wrong
<damo22> root 936 933 p1 0:00.00 sudo ./atest
<damo22> demo 937 936 p1 0:00.00 ./atest
<damo22>$ kill -9 937
<damo22>-bash: kill: (937) - Operation not permitted
<damo22>$
<azeem>aye
<youpi>damo22: the setuid bug is not in the setuid program, it just tells proc what it needs to know. It's proc that doesn't let the user kill the setuid-ed process
<damo22>ok
<damo22>im more interested in the clock bug because its blocking smp64
<damo22>how to make it smp safe?
<youpi>damo22: don't confuse real uid and effective uid
<youpi>setuid only sets the effective uid, not the real
<youpi>see man getuid vs man geteuid
<azeem>ps shows the effective uid right?
<youpi>probably
<damo22>+ /*
<damo22>+ * Only update high precision read on cpu0 once per clock interrupt
<damo22>+ */
<damo22>+ last_hpc_read = hpclock_read_counter();
<damo22> }
<damo22>- last_hpc_read = hpclock_read_counter();
<damo22>this should help but it doesnt make any difference
<youpi>that doesn't mean it's not a good thing
<damo22>right, you left a #warning when NCPUS > 1 to make hpc smp safe, but i dont know what is needed
<azeem>10:08 < youpi> setuid only sets the effective uid, not the real
<youpi>damo22: possibly nothing but checking that it's alright
<azeem> https://man7.org/linux/man-pages/man2/setuid.2.html says "if the calling process is privileged, the real UID and saved set-user-ID are also set."
<youpi>I mean the setuid bit
<azeem>ah!
<damo22>any thread can call host_get_uptime64 on cpuN when cpu0 is in the middle of a clock interrupt
<youpi>that kind of thing, yes
<damo22>i dont know how i managed to catch the cpu running and executing HPET32
<damo22>unless that read_mapped_uptime () loop gets stuck for a long time
<youpi>(actually it's auth which knows the uids, and the joe-user process has to convince proc that it's allowed to kill the setuid-ed process)
<azeem>what about the less-dramatic case where the executable does not have the setuid bit, but is run by root and then changes uid to joe-user and then they cannot kill it either? Is that the same issue or a subtly different one?
<youpi>iirc currently proc's check is about the effective uid, so that case would work
<damo22>i removed the entire hpc setup and it didnt change the failing tests
<damo22>so something else is broken
<damo22>but only on x86_64
<azeem>youpi: ok, but the Debian Postgres wrapper changes both when it starts the server, so then the postgres user can no longer send signals to it
<azeem>I'll see whether I can find something that works
<azeem>hrm, so https://github.com/Perl/perl5/blob/blead/dist/IO/lib/IO/Socket.pm#L294 seems to be meant "if we try to run getpeername on the socket, but don't get anything defined back, do this"?
<youpi>both real and effective uids are set to the postgres, and the postgres user can't kill it? I don't see how that can be possible
<azeem>yeah IIRC
<azeem>$ sudo ~/test-uid-HUP.pl
<azeem>switching to uid 101 and gid 101
<azeem>euid: 101, uid: 101, gid: 101 0
<youpi>and 101 can't kill it?
<azeem>postgres@debian:~$ id
<azeem>uid=101(postgres) gid=105(postgres) Gruppen=105(postgres),104(ssl-cert)
<azeem>postgres@debian:~$ ps -ef |grep ^postgres.*HUP | grep -v grep
<azeem>postgres 27038 27037 p4 0:00.01 /bin/perl /home/demo/test-uid-HUP.pl
<azeem>postgres@debian:~$ LANG=C kill -s HUP 27038
<azeem>-bash: kill: (27038) - Operation not permitted
<azeem>yeah
<youpi>it does work here
<azeem>weird
<youpi>the gid doesn't match, though
<azeem>uh
<azeem>well I think the setgid doesn't matter here, but if I change it to 105, I still get Operation not permitted
<youpi>even with a gid mismatch I do get to kill it
<youpi>$ sudo ./test
<youpi>r 1001 e 1001 s 1001
<youpi>$ id
<youpi>uid=1001(youpi)
<youpi>$ kill -s HUP 8611
<azeem>how did you do setuid?
<damo22>are you on hurd? :P
<youpi>setuid(1001);
<youpi>damo22: sure
<azeem>right, the perl code runs setregid(105, -1); setreuid(101, -1); setresuid(-1, 101, -1) on Linux
<youpi>ah, so it doesn't change the effective uid
<youpi>which is what kill() currently checks
<youpi>ah no with the last setresuid it does
<youpi>let me check the result of these
<azeem>yeah, I run geteuid() and it is set as well
<youpi>ok, it leaves the suid as 0
<youpi>and in that case I indeed cannot kill it
<youpi>even if both real and effective are set to 1001
<youpi>I don't see why proc would refuse just because of suid, but that's it for now, apparently
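The two ways of switching uid compared above can be sketched in C. This is a minimal sketch, not the reproducer attached to the bug-hurd message: the function names drop_partial, drop_full, print_ids and ids_all_equal are hypothetical, and it assumes glibc's setresuid/getresuid/setresgid extensions. drop_partial repeats the call sequence azeem quotes for the Perl code, which never sets the saved uid explicitly; drop_full sets real, effective and saved ids together. Whether drop_full avoids proc's refusal on the Hurd is an assumption suggested by youpi's setuid(1001) test, not something the log confirms.

```c
#define _GNU_SOURCE /* for setresuid(), setresgid(), getresuid() */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print real, effective and saved uid. Returns 0 on success, -1 on error. */
int print_ids(const char *label)
{
    uid_t r, e, s;

    if (getresuid(&r, &e, &s) != 0) {
        perror("getresuid");
        return -1;
    }
    printf("%s: r %d e %d s %d\n", label, (int)r, (int)e, (int)s);
    return 0;
}

/* Call sequence quoted in the log: the saved uid is never set explicitly. */
int drop_partial(uid_t uid, gid_t gid)
{
    if (setregid(gid, (gid_t)-1) != 0)
        return -1;
    if (setreuid(uid, (uid_t)-1) != 0)
        return -1;
    return setresuid((uid_t)-1, uid, (uid_t)-1);
}

/* Alternative: set real, effective and saved ids in one call each.
 * The group is changed first, while the process is still privileged. */
int drop_full(uid_t uid, gid_t gid)
{
    if (setresgid(gid, gid, gid) != 0)
        return -1;
    return setresuid(uid, uid, uid);
}

/* Returns 1 if real, effective and saved uid all equal uid,
 * 0 if any differs, -1 if the ids cannot be read. */
int ids_all_equal(uid_t uid)
{
    uid_t r, e, s;

    if (getresuid(&r, &e, &s) != 0)
        return -1;
    return r == uid && e == uid && s == uid;
}
```

Run as root, a small main() could call drop_partial(101, 105) followed by print_ids("partial") to see which saved uid is left behind, then repeat in a fresh process with drop_full to compare.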
<youpi>azeem: btw, are you in Munich atm?
<azeem>youpi: no, Occitanie
<azeem>I'll be back in Munich from March
<youpi>ok :) I'll be in Munich in the coming days
<azeem>damn :)
<azeem>if you're interested, I can setup a Debian meetup
<azeem>btw. Jeff Bailey was in Munich the other day as well, but we didn't manage to meet either
<azeem>"the other day" being last September or so
<yelninei>does mmap work differently when something is PROT_WRITE and PROT_WRITE | PROT_READ ? The first one crashed when memcpy'ing something to that region while this seems fine on linux
<azeem> https://github.com/Perl/perl5/blob/blead/README.hurd is "Last Updated: Fri, 29 Oct 1999 22:50:30 +0200"
<nexussfan>Wow
<youpi>yelninei: in principle that can, yes
<youpi>yelninei: while the hardware can't prevent the read when the write is allowed, if the page is not resident the system can catch it
<yelninei>youpi: However I would assume that there is no problem with only PROT_WRITE and writing something there (with memcpy)
<yelninei>But there is also a logical problem in the program as the file gets opened with O_RDWR but mmaped only with PROT_WRITE
<youpi>one would have to see the actual crash detail (i.e. assembly), to be sure what happens
<youpi>memcpy implementations can be very tricky
<youpi>in principle it would only write, but possibly it gets smart with trying to prefetch output data etc.
<yelninei>i guess just also adding PROT_READ does not hurt anybody but this took me a while to figure out
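The workaround yelninei settles on can be sketched as follows. This is a minimal sketch under assumptions: write_via_mmap is a hypothetical helper, not the program under discussion, and the file handling around the mapping is invented for illustration. The only point taken from the log is that the file is opened O_RDWR and the mapping requests PROT_READ | PROT_WRITE, so that a memcpy implementation that also touches the destination for reading does not fault.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy len bytes (len must be > 0) into path through a shared mapping.
 * Returns 0 on success, -1 on error. */
int write_via_mmap(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* The file must be at least len bytes long before it is mapped. */
    if (ftruncate(fd, (off_t)len) != 0) {
        perror("ftruncate");
        close(fd);
        return -1;
    }

    /* Request read access as well as write access, matching O_RDWR. */
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return -1;
    }

    memcpy(map, data, len);

    /* Flush the mapping to the file before tearing it down. */
    int rc = msync(map, len, MS_SYNC);
    if (rc != 0)
        perror("msync");
    if (munmap(map, len) != 0) {
        perror("munmap");
        rc = -1;
    }
    if (close(fd) != 0) {
        perror("close");
        rc = -1;
    }
    return rc;
}
```

Changing the protection argument to PROT_WRITE alone would give a variant for comparing behaviour between systems; the log only reports that this variant crashed on the Hurd, not why.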
<azeem>13:43 < azeem> This perl code (basically IO::Socket::UNIX->new(Type=>SOCK_STREAM(), Peer=>$path); followed by send()) works fine on Linux but fails with "Cannot determine peer address" on the Hurd: https://paste.debian.net/hidden/c64a1389 is that a known issue?
<azeem>I've opened a Perl bug now, I think this is something they should work around: https://github.com/Perl/perl5/issues/24195