IRC channel logs

2021-07-10.log

<damo22>hmmm when i dump_processes() in startup at the end of launch_core_servers() i get:
<damo22>pid2 /hurd/startup
<damo22>pid4 /hurd/proc
<damo22>pid8 exec
<damo22>pid9 /hurd/auth
<damo22>it seems pci-arbiter(5) and rumpdisk(6) are missing
<damo22>although a normal boot shows:
<damo22>pid2 /hurd/startup
<damo22>pid4 /hurd/proc
<damo22>pid7 /hurd/auth
<damo22>and sometimes exec
<damo22>im quite frustrated with this seemingly small problem
<damo22>ive traced it back through various calls and all i can see is that proc_getprocargs is returning nonzero when you cat /proc/6/stat
<damo22>and get_vector() is dying early and returning an empty buffer, its like the args are not populated
<damo22>you cant call proc_mark_important() on a process that isnt a child of startup unless you are root, therefore startup cannot proc_child() rumpdisk
<damo22>s/therefore/and
<damo22>startup cannot call proc_child() on rumpdisk because it doesnt know which task to use
<damo22>youpi: how do you make a process authenticated with auth so p->id is owned by root?
<damo22>im hitting EPERM when i try to set proc_mark_important
<damo22>in libmachdev
<damo22>i think one of the problems with the start up is that the bootstrap processes dont have a proper uid=0 credential
***cadmium.libera.chat sets mode: +o ChanServ
<youpi>damo22: AIUI from S_proc_mark_important, the idea is to first make it child of startup, and then use proc_mark_important
<damo22>i did it by reauthenticating proc
<damo22>procserver port
<damo22>now the process appears as root owned
<damo22>but ext2fs' pid is being set with "rumpdisk" as exe
<damo22>ess_tasks list is not populated yet with essential tasks inside launch_core_servers
<damo22>do pids get renumbered at any time?
<youpi>"i did it": making it child of startup ? That's unrelated to reauthentication
<youpi>I mean this in S_proc_mark_important: p->p_parent != startup_proc
<damo22>"it" being, passed the test in the condition, by passing the check_uid ==0
<youpi>I mean the comment in S_proc_mark_important seems to imply that for children of startup, they would not have uid == 0 at this point, and it'd rather be the parent == startup_proc test that would succeed
<damo22>ok
<damo22>well i dont know how to do that
<youpi>I guess it's simply the proc_child call?
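[Editor's sketch] The permission rule discussed above (a caller passes either by being root or by being a direct child of startup) can be illustrated in isolation. This is a simplified model, not the code from the proc server: `struct proc_sketch`, its fields, and `mark_important_sketch` are hypothetical names invented for illustration, and the `is_root` flag stands in for whatever uid check the real server performs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified process record (assumption: the real
   proc server keeps far more state than this). */
struct proc_sketch
{
  struct proc_sketch *parent;   /* who proc_child() attached us to */
  bool is_root;                 /* stand-in for a uid == 0 check */
  bool important;               /* set once marking succeeds */
};

/* Mark P as important if it is root-owned OR a child of startup.
   Bootstrap servers have typically not set an owner yet, so for
   them the parent test is the one expected to succeed. */
static int
mark_important_sketch (struct proc_sketch *p,
                       const struct proc_sketch *startup_proc)
{
  if (p == NULL)
    return EOPNOTSUPP;
  if (!p->is_root && p->parent != startup_proc)
    return EPERM;
  p->important = true;
  return 0;
}
```

Under this model, becoming a child of startup first is enough to pass the check without any reauthentication.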
<damo22>ok, but how do i get a list of the tasks
<damo22>ess_tasks is empty
<youpi>at which point?
<youpi>check in the bootstrap.html page at which point you are
<damo22>after the fsys_init() call in launch_core_servers
<youpi>to know what was called, what was not
<damo22>is fsys_init asynchronous? like are things happening in different threads or does it block?
<youpi>see the beginning of diskfs_S_fsys_init
<youpi>even if fsys_init is synchronous, the diskfs call replies early
<youpi>so it ends up being asynchronous
<damo22>so how do i ensure i wait for the fsys_init chain to complete before checking ess_tasks
<youpi>you watch for the startup_essential_task, see the code there that looks for the calls
<damo22> startup_essential_task (startup, mach_task_self (), MACH_PORT_NULL, diskfs_server_name, host);
<damo22>what do i do with that?
<damo22>do i have to call fsys_init_reply (reply, replytype, 0); in libmachdev's trivfs_S_fsys_init() so it can send things to startup?
<youpi>well, you have the name so you can check what it is, and do the proc_child, mark important, etc.
<youpi>that's what it does for exec for instance
<damo22>ah you mean inside S_startup_essential_task() i can special case rumpdisk and pci-arbiter?
<youpi>it's done for exec so I guess it won't cause trouble to do the same
<damo22>it seems like a hack to do that because we may have more tasks to add to bootstrap
<damo22>but i can start with that
<youpi>you could do the converse: exclude doing it for the names you know about
<damo22>ok
<damo22>problem: if there is no pci-arbiter, how can you wait for it
<damo22>like if you launch the system with different bootstrap processes, you need to know which ones to expect
<youpi>damo22: what do you want to wait for it for?
<damo22>if (authinit && execinit && procinit) { launch_system
<damo22>it also sends the reply for startup
<damo22>startup_essential_task_reply
<youpi>diskfs_S_fsys_init waits for its own fsys_init call before calling startup_essential_task
<youpi>or something like this
<youpi>so you don't have to wait for all of them, just one of them, which happens to call startup_essential_task only once the bootstrap before it is over
<youpi>I guess you could just add fsinit to the list
<damo22>i think S_startup_essential_task is crazy, it allows for the three messages to arrive in any order, and then sets flags until they are all complete
<youpi>well the initialization is crazy: bootstrap processes may have their own dependencies, so the order in which they can eventually be initialized may depend on other stuff
<youpi>so it's better for startup to not impose any strict order
<youpi>and just let them initialize in the order they prefer
<youpi>that relaxes constraints for the rest of the bootstrap process
<damo22>ok
<damo22>but the problem i face now, is that i dont know which bootstrap fstasks to wait for
<damo22>i can hardcode it to pci-arbiter and rumpdisk
<damo22>but then it wont work if you run without the arbiter for example
<youpi>as I said you don't need to explicitly wait for a list
<youpi>you can assume that fs is the earliest in the chain you have to wait for
<damo22>ok
<damo22>so if ext2fs is completed, i can assume everything that launched it is complete
<youpi>yes
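[Editor's sketch] The idea from the exchange above is that startup accepts essential-task announcements in any order, sets a flag per server, and proceeds once all awaited flags are set, with "fs" added as the only extra thing to wait for. It can be modelled as below. This is an illustrative sketch, not the contents of startup.c: `struct boot_state` and `note_essential_task` are hypothetical, and the name strings are assumed to match what each server reports.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical bookkeeping: one flag per server that must have
   announced itself before the rest of the system is launched. */
struct boot_state
{
  bool authinit, execinit, procinit, fsinit;
  bool launched;
};

/* Record one announcement.  Returns true only for the single call
   that completes the awaited set, whatever order the calls came in. */
static bool
note_essential_task (struct boot_state *st, const char *name)
{
  if (!strcmp (name, "auth"))
    st->authinit = true;
  else if (!strcmp (name, "exec"))
    st->execinit = true;
  else if (!strcmp (name, "proc"))
    st->procinit = true;
  else if (!strcmp (name, "fs"))
    st->fsinit = true;
  /* Any other name (e.g. pci-arbiter, rumpdisk) is accepted but
     never waited for: the filesystem only announces itself after
     the servers that bootstrapped it have finished. */

  if (st->launched
      || !(st->authinit && st->execinit && st->procinit && st->fsinit))
    return false;

  st->launched = true;
  return true;
}
```

Because nothing here waits on pci-arbiter or rumpdisk by name, the same logic would work for a boot chain that omits either of them.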
<damo22>!! nice
<damo22>lrwxrwxrwx 0 4294967295 root 11 Jan 1 1970 /proc/5/exe -> pci-arbiter
<damo22>lrwxrwxrwx 0 4294967295 root 8 Jan 1 1970 /proc/6/exe -> rumpdisk
<damo22>lrwxrwxrwx 0 4294967295 root 2 Jan 1 1970 /proc/7/exe -> fs
<damo22>that is correct but /proc/6/stat is still an error
<damo22>getting closer
<damo22>-startup-+-@\032P\001
<damo22>in pstree
<damo22>that is supposed to be rumpdisk
<damo22>maybe i should post what i have that fixes the "exe"
<damo22>getprocargs is failing still
<damo22>youpi: are you sure that proc_set_arg_locations() is being called for machdev_argv in glibc if i call _hurd_init(...)
<youpi>trivfs_S_fsys_init has procserver != NULL, right? it then calls _hurd_init, which calls _hurd_new_proc_init, which calls __proc_set_arg_locations
<youpi>mach_print() would allow you to be sure what happens
<damo22>ok, with my changes, the machine shuts down at least after the rpc times out for pci-arbiter shutdown notification
<damo22>the problem with the notification is that the disk access is terminated when rump shuts down, and then the arbiter cannot be contacted?
<youpi>why can't it be contacted? what actually happens?
<damo22>or something is happening after the rump shuts down
<damo22>i will get shutdown log
<damo22>rump ends with:
<damo22>(ipc/mig) server died
<damo22>startup: notifying pci-arbiter of reboot...
<damo22>hang
<damo22>then the rpc times out and it reboots
<youpi>you can use the mach debugger to check the state of pci-arbiter
<damo22>41 tasks remain
<damo22>including the arbiter
<damo22>but rumpdisk is gone
<damo22>interesting, it does not kill all tasks
<damo22>but there is no more disk access
<damo22>maybe its because the proc/6/stat is still broken
<youpi>I'd say first fix stat, in case that happens to fix other things
<damo22>i am trying to yes
<damo22>i might send in this patch that fixes the proc/?/exe at least
<damo22>do we need to fiddle with rumpdisk arg locations due to this?
<damo22> /* The kernel task has a bootstrap port set. Perhaps it is its proc
<damo22> server port from another Hurd. If so, propagate the kernel
<damo22> argument locations from that Hurd rather than diddling with the
<damo22> kernel task ourselves. */
<damo22>startup/startup.c
<damo22>it calls proc_get_arg_locations (kbs, &kargv, &kenvp); and then sets them in another process with proc_set_arg_locations (proc, kargv, kenvp);
<youpi>you are not running a sub-hurd
<damo22>ok