IRC channel logs
2020-07-26.log
***Server sets mode: +nt
<damo22> i need to call arrange_shutdown_notification() very late in the piece, otherwise it cant find _SERVERS_STARTUP ... it needs to be roughly at the same time as _diskfs_init_completed() but i cant find the right hook
<damo22> i think it needs to be after proc and auth have ports so i can call proc_mark_important
<youpi> ah, right, you need that as well
<youpi> diskfs_S_fsys_init just needs to be made to forward the call in the diskfs_exec_server_task != MACH_PORT_NULL case as well
<damo22> /* Give our parent (the real bootstrap filesystem) an fsys_init RPC of its own, as init would have sent it. */ <--- do you mean this
<youpi> yes, but in the other part of the if as well
<damo22> i think it needs to be before the HURD_PORT_USE call ?
<damo22> i tried after and i got invalid port
<damo22> nope then it crashes on assert(proc) in my arrange notification
<damo22> do i call fsys_init ( .. , execprocess ,...
<damo22> Hurd server bootstrap: ext2fs[part:2:device:/dev/wd0] exec startup proc auth.
<damo22> rumpdisk: ../../libmachdev/startup.c:38: arrange_shutdown_notification: Assertion 'proc' failed.
<damo22> ext2fs: ../../libdiskfs/boot-start.c:545: diskfs_S_fsys_init: Unexpected error: (ipc/mig) server died.
<damo22> why would proc be not ready to call getproc() in trivfs_S_fsys_init() ?
<damo22> youpi: i wanted to investigate GPU for opencl so i bought a Radeon RX5700 (navi) but looks like the AMD compute stack is a complete mess
<AlmuHS> youpi: I'm testing the tty locked error in my T60. I've just compiled the gnumach's upstream, without my SMP patches, and the problem also appears in It
<AlmuHS> so, maybe the problem is not in my SMP patch, and it's a upstream error
<youpi> damo22: you need to call the parent fsys_init after the exec_init call, since otherwise exec won't have called startup_essential_task yet, and thus all of auth, proc, exec will still be stuck in their respective startup_essential_task
<AlmuHS> I can try to upgrade, to check it the error remains
<AlmuHS> but, before It, I have to fsck ;)
<AlmuHS> are there any progress in ext3 or later supporting?
<damo22> youpi: i thought i did that in my last patches
<damo22> HURD_PORT_USE ( ... , exec_init (...) )
<damo22> does that actually call exec_init and block on it?
<damo22> or is there a possible race condition that fsys_init can be called before exec_init is completed?
<damo22> does that cause expr to be evaluated twice?
<damo22> doesnt exec_init() return a kern_return_t error code? so why is it HURD_PORT_USE (... , err)
<damo22> The code that is written could be equivalent to this, is it correct?
<damo22> err = exec_init (port, authhandle, execprocess, MACH_MSG_TYPE_COPY_SEND);
<damo22> HURD_PORT_USE (&_diskfs_exec_portcell, err);
<AlmuHS> youpi: after upgrading, the tty freezed problem continues
<AlmuHS> but a good notice: now the NIC works with upstream's gnumach
<damo22> i sent in 2x patches to the ml but its still not quite working
<damo22> AlmuHS: i think we all appreciate the bug reports you make but keep in mind that without hardware drivers, IMHO hurd is not quite ready to be run on real hardware yet, ie try to expect it would not work by default. Eg im not sure why you try to have a working X11 system right now when we still dont even have disk properly working... just my 2 cents
<AlmuHS> before loading the tty, It shows this error
<AlmuHS> damo22: this error is not about X11, it's the default tty
<damo22> i booted hurd on X200 and also experienced a frozen tty
<AlmuHS> in Debian's GNU Mach, I can use the tty without problems
<damo22> if you disable the hurd-console i think it works
<AlmuHS> in the Debian's GNU Mach, I can use the hurd-console without problems (by now)
<AlmuHS> I go to reboot and try the Debian's GNU Mach
<damo22> if youre running a debian system, you need the debian patches on gnumach
<AlmuHS> but I need a quickly way to add the debian patches to upstream gnumach
<AlmuHS> because my patch is developed over upstream
<damo22> you should be testing smp in qemu its far easier
<AlmuHS> yes, but I have to test in real hardware too
<AlmuHS> to be sure that all works properly
<damo22> i dont even bother testing rump on real hardware yet
<damo22> we can do many many iterations of debug and boot on qemu
<damo22> one thing you can do is buy a usb<->sata cable and use a second hand 2.5" hard drive that you can install hurd on, and boot both in qemu and swap into real hardware to boot the same disk
<damo22> that will save you reinstalling hurd a million times
<youpi> damo22: fsys_init *calls* exec_init. And the exec_init call will not return until S_exec_init returns
<youpi> typeof() doesn't evaluate its expression, it's just replaced by the type
<youpi> the HURD_PORT_USE macro returns the value of the expr
<youpi> AlmuHS: concerning the "tty freeze", what is it actually? is only the tty frozen etc. ? without at least some investigation on your side I can't say much more than "works for me"
<youpi> at least try to disable the hurd console, to check whether that's where the concern is
<youpi> so you say that depending you're using the debian or upstream kernel, you get a frozen tty on the hurd console or not?
<youpi> did you merge the latest upstream master into your tree?
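[Editor's note on the HURD_PORT_USE exchange above] The two answers from youpi (typeof() does not evaluate its operand, and the macro yields the value of expr) can be illustrated with a small GNU C sketch. This is not the real macro from the Hurd headers: PORT_USE, struct port_cell, port_get, port_free and fake_exec_init are hypothetical stand-ins, and it assumes a compiler with statement expressions and __typeof__ (gcc or clang). It shows the expression running exactly once, inside the scope where `port` is defined, which is also why hoisting the call out of the macro (as in the "err = exec_init (port, ...)" variant) would leave `port` undeclared.

```c
/* Simplified stand-in for a HURD_PORT_USE-style macro (GNU C extensions).
   All names here are hypothetical; this is not the Hurd implementation.  */
struct port_cell { int port; int users; };

static inline int port_get (struct port_cell *c) { c->users++; return c->port; }
static inline void port_free (struct port_cell *c) { c->users--; }

/* __typeof__ (expr) only names the type of expr; it does not run it.
   expr is evaluated once, in the initializer, while `port` is in scope.
   The last expression of the ({ ... }) block is the macro's value.  */
#define PORT_USE(cell, expr)                    \
  ({ struct port_cell *const p_ = (cell);       \
     const int port = port_get (p_);            \
     __typeof__ (expr) result_ = (expr);        \
     port_free (p_);                            \
     result_; })

int exec_init_calls;   /* counts how often the fake RPC actually ran */

/* Stand-in for an RPC such as exec_init: 0 on success, -1 otherwise.  */
int fake_exec_init (int port)
{
  exec_init_calls++;
  return port == 42 ? 0 : -1;
}

int run_once (void)
{
  struct port_cell cell = { 42, 0 };
  /* `port` below is the variable declared inside PORT_USE.  */
  return PORT_USE (&cell, fake_exec_init (port));
}
```

Calling run_once() returns the value of fake_exec_init and leaves exec_init_calls at 1, so the expression is not evaluated twice.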
<youpi> there's nothing left in the debian gnumach that'd be needed for anything to work
<youpi> i.e. upstream has everything
<youpi> yes, but that's not needed to have a working system
<damo22> i closed my hurd vm, i will check tomorrow regarding exec_init
<damo22> i didnt know fsys_init calls exec_init
<youpi> « diskfs' diskfs_S_fsys_init gets called, it thus knows that proc and auth are ready, and can call exec_init. It initializes the default proc and auth ports to be given to processes. »
<youpi> that's the very code you were patching over
<damo22> that is exec_init and then fsys_init is called after
<youpi> well, I was talking about libdiskfs' fs_init
<youpi> which currently calls exec_init(), and you made it to call the parent's fsys_init() after that
<youpi> I just meant that yes, once exec_init() is called, exec can continue its boot down to startup_essential_task and startup will then let auth/proc/exec work
<damo22> right, but does that explain why getproc() == 0
*youpi hadn't processed his mails yet, now answering
<damo22> sorry, i dont mean to bug you i bug-hurd
<damo22> theres no rush to answer, i cant do much tonight my brain is dead
<AlmuHS> youpi: the tty doesn't reply to keyboard
<AlmuHS> but it's possible to access the system via ssh
<AlmuHS> now executing fsck. After this I will try to disable hurd-console
<AlmuHS> youpi: ok, the problem is with hurd-console. Mach console works properly
<AlmuHS> my smp patch works properly. Using Mach console, I can use the tty with It
<AlmuHS> and a good news: now I have network connection using upstream's gnumach
<AlmuHS> btw, in hurd-console, are the any usage for the mouse?
<youpi> iirc it is just shown, and clicking doesn't do anything useful
<youpi> concerning the hurd console hang, it'd be useful to investigate in the hurd console code: is it actually getting keypresse from the kernel or not, etc.
<youpi> there's no reason for such a simple thing to break
<AlmuHS> but currently I don't know what search in the code
<youpi> truth is: I don't know either, nobody knows, but you have the hardware which has the problem, so you'll have to find out
<youpi> a few things I know, however, is that it's the console-client part you're interested in
<AlmuHS> yes, but where starts to search?
<youpi> and it so happens that there's a pc-kbd.c file in there
<youpi> there is a console/ and a console-client/ directory
<youpi> shouldn't it be obvious that it could just be there?
<youpi> it'll tell you that it's in the hurd package
<AlmuHS> hurd-console shows a timeout error
<AlmuHS> telling that could not receive return value from daemon process
<AlmuHS> what daemon are hurd-console waiting?
<youpi> see the documentation that marcus wrote about the console
<youpi> the console is split into a daemon that does the actual processing, and a client that shows the output and gets the input
<youpi> I damn don't remember by heart of course
<youpi> but it very very very probably is on the hurd wiki
<youpi> that very very veryre y re eryre yveryv ery looks so yes
<AlmuHS> meanwhile, now I'm sure that my smp patch works properly. There are not any panic or similar
<youpi> but we're not actually running anything on other cpus than cpu0 :)
<AlmuHS> because my patch is only enumeration
<AlmuHS> to test my code, you only have to disable the #if NCPUS > 1 directive, in the smp_init() call
<AlmuHS> in next steps i will add some code to enable the processors, but step by step
<AlmuHS> oops, the upstream gnumach also hangs during rebooting
***Server sets mode: +nt
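[Editor's note on the "#if NCPUS > 1" remark above] A minimal sketch of what such a guard does, assuming the call site looks roughly like this. startup_sketch and the counting smp_init below are hypothetical stand-ins, not the gnumach sources. With NCPUS equal to 1 the preprocessor drops the call entirely, which is why the guard has to be disabled by hand to exercise the enumeration code in a uniprocessor build.

```c
/* Hypothetical stand-in for a guarded smp_init() call site.  */
#ifndef NCPUS
#define NCPUS 1        /* assumed uniprocessor build, as with mach_ncpus = 1 */
#endif

int smp_init_calls;    /* counts how often the stand-in smp_init ran */

/* Stand-in for the real smp_init: only records that it was called.  */
void smp_init (void)
{
  smp_init_calls++;
}

void startup_sketch (void)
{
#if NCPUS > 1
  /* Only compiled in for multiprocessor builds.  */
  smp_init ();
#endif
}
```

With the default above, startup_sketch() never reaches smp_init, so smp_init_calls stays at 0; removing the guard (or building with NCPUS greater than 1) makes the call happen.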
<AlmuHS> youpi: i disabled the #if NCPUS > 1 in my code over real hardware, and now hangs in exec server during booting
<youpi> remember that there's apparently a race on exec startup
<youpi> did you try at least like a dozen times?
<youpi> apparently sometimes it's happening quite a lot
<AlmuHS> but with the rest of kernel is less common
<youpi> never understimate the power of "works only by luck"
<AlmuHS> ok, in the 7th attempt it works, XD
<AlmuHS> my real machine is a Thinkpad T60. A legendary model of IBM
<AlmuHS> in Qemu my patch works with less problems ;)
<AlmuHS> right, in Qemu most things works with so few problems, of course, XD
<AlmuHS> youpi: ok, then you can check my code with more safety, XD
<AlmuHS> reading the hurd console tutorial, i've discovered that the console error is in the client, in /bin/console
<Pellescours> AlmuHS: I tries to compile with your patches with NCPU =2 and I got a compilation errror, CPU_NUMBER not defined. did I do something wrong ?
<Pellescours> I got ../i386/i386/cswitch.S:42: Error: invalid character '(' in mnemonic
<Pellescours> and in i386/i386/cpu_number.h the macro CPU_NUMBER is defined only if NCPUS==1
<Pellescours> But if I compile with NCPUS=1 the patch don't break gnumach
<AlmuHS> Pellescours: yes, it's expected. There are many missing functions in NCPUS > 1
<AlmuHS> in next patches I will try to solve this
<AlmuHS> to try my patch you have to compile with mach_ncpus = 1, and disable the #if NCPUS > 1 macro in smp_init() call
<AlmuHS> the next patch will implements cpu_number() and CPU_NUMBER()
<AlmuHS> if the patch works, you might see a table with the CPUs and IOAPIC discovered, during the booting
<Pellescours> I saw it, but it was too fast and I'm sot able to see kernel log (/var/log/dmesg is not updated and /dev/klog is empty)
<Pellescours> using the trick of qemu -s -S ... and gdb I was able to stop it and I can see the logs :)
<AlmuHS> don't forget set more than one cpu in Qemu script ;)
<youpi> and use console:com0 on the gnumach command line
<AlmuHS> I never did that tests, thanks Pellescours!!
<AlmuHS> in real hardware, I only tested with 2 cpus. There are not Core2Quad for laptop ;)
<AlmuHS> my patch assume xAPIC. For x2APIC can be necessary some configurations
<AlmuHS> the Core iX processors using x2APIC
<Pellescours> I will try wit -cpu host to see (it's an intel core i5)
<Pellescours> AlmuHS: is there a way to see the ACPI cpu data (number found ...) at runtime or not yet ?
<AlmuHS> I have a Thinkpad T410, with an i5 1st gen, but I don't sure if I can get network connection in it, to download the sources and try there
<AlmuHS> I'll try there in a weeks. But the current tests over Qemu and 32-bit CoreDuo are enough by now
<AlmuHS> about tty freeze, i've found the code block in which the error are showd
<AlmuHS> but I don't found the daemon_fork definition
<AlmuHS> but, anyways, It seems a variant of classical POSIX's fork()
<AlmuHS> the question is... why fails the fork?
<AlmuHS> but the daemon doesn't send the signal
<youpi> possibly it crashes after starting
<youpi> and you can try to run it by hand without the --daemonize option, to see what happens
<youpi> you just need to pass the same parameters as configured in /etc/default/hurd-console
<youpi> anyway, for the starting up details, see /etc/init.d/hurd-console
<youpi> usually I have /bin/console --daemonize -d current_vcs -c /dev/vcs -d vga -d pc_kbd --keymap fr pc_kbd -d pc_mouse --protocol=ps/2 pc_mouse
<youpi> possibly just drop --daemonize from that
<AlmuHS> like this? # console -d vga -d pc_mouse --repeat=mouse -d pc_kbd --repeat=kbd -d generic_speaker -c /dev/vcs
<AlmuHS> I've removed --daemonize option in /etc/init.d/hurd-console. I go to try It
<AlmuHS> the exec freeze is notably more common in my smp microkernel. why?
<jrtc27> you have more code, so some operations take slightly longer and make the race condition more likely to be triggered?
<AlmuHS> about tty, without daemonize it seems don't start
<jrtc27> yes, it won't, because you need to daemonize in an init script
<jrtc27> "and you can try to run it ***by hand*** without the --daemonize option, to see what happens"
<youpi> AlmuHS: "don't start", as in?
<jrtc27> youpi: AlmuHS just patched the init script to not daemonize
<youpi> did you try to run from ssh to get the output?
<jrtc27> so of course it's going to just hang
<jrtc27> and hopefully sysvinit times out or something and just kills it, idk
<youpi> I only mentioned the init script to get how it's usually started
<AlmuHS> i can't access via ssh, the ssd server didn't start too
<youpi> yet another reason to start by *HAND*
<AlmuHS> now I have to solve this from chroot. :-/
<youpi> that's not really a problem, you can put whatever you need to hack in it
<AlmuHS> the problem is that I've locked the tty when I removed the --daemonize option
<AlmuHS> and ssh starts after console, so ssh doesn't works
<AlmuHS> so I have to edit the file from chroot or a live to recover the console
<youpi> just put back the --daemonize option in there
<AlmuHS> yes, but I can access the machine, it's the problem
<youpi> don't you have network access?
<youpi> put back the --daemonize option
<AlmuHS> but I need to access via live or similar, because i have not tty or ssh
<AlmuHS> i need to access to edit the file
<AlmuHS> do you understand the problem now?
<youpi> you can reboot in emergency mode to get access to the disk
<youpi> or with the installer CD and mount the partition by hand
<youpi> you just need to put back the --daemonize option
<AlmuHS> not really, simply the rescue mode offers this option
<AlmuHS> youpi: i tested with this line: console -d vga -d pc_mouse --repeat=mouse -d pc_kbd --repeat=kbd -d generic_speaker -c /dev/vcs
<AlmuHS> i go to disable hurd-console, and try again from Mach console
<AlmuHS> youpi: ok, starting hurd-console manually from mach console, it works
<AlmuHS> i go to reboot to be sure I've started the correct kernel
<AlmuHS> It seems the problem is in the mouse driver
<AlmuHS> when I add the mouse option, hurd-console keeps blank at start
<AlmuHS> youpi: finally solved. Disabling mouse driver, hurd-console works properly in upstream's gnumach
<AlmuHS> youpi: excuse me the insistence. Pellescours has tested my SMP patch in many scenarios (disabling ACPI, in a x86_64 host...) and It works properly
<jrtc27> what do you hope to gain from it?
<jrtc27> do you want something to happen as a result, or is it just informative?
<AlmuHS> because some days ago i was afraid about possible errors in my SMP patch. And now, after the tests, I can confirm that the patch works well
<AlmuHS> now, the patch review can be centered in the code, not in if the patch works
<AlmuHS> btw, jrtc27 , has you read the patch?
<jrtc27> no, I stopped after seeing the lack of care wrt code formatting
<jrtc27> and for me at least I find that makes it much harder to concentrate on the code itself
<jrtc27> both in that it's mentally more taxing when things aren't uniform, and that I spend my time noticing all the ways in which it's wrong
<AlmuHS> i can recheck again. But I don't find a fixed reference for the formatting. In many files are different styles
<jrtc27> the predominant style in GNU Mach is the mach style
<jrtc27> the GNU style exists in some files, but IMO that's a mistake
<jrtc27> your code is neither of the two, but an inconsistent mix of them plus some mistakes on top of that
<AlmuHS> even in GNU files, in the same file there are many different styles
<AlmuHS> so I don't know what is the correct
<jrtc27> right, but, you're not even using that
<jrtc27> at least pick something and do it correctly
<jrtc27> but I'd argue that mach should follow the mach style
<jrtc27> because that's what most of the code does
<AlmuHS> if each file has a style, what style must I to use?
<jrtc27> you should always match the surrounding style