IRC channel logs

2020-07-26.log


***Server sets mode: +nt
<damo22>where is this deployed? http://git.savannah.gnu.org/cgit/hurd/web.git
<jrtc27> https://www.gnu.org/software/hurd/
<jrtc27>gets mirrored to http://darnassus.sceen.net/cgit/hurd-web.git/
<damo22>ty
<damo22>i need to call arrange_shutdown_notification() very late in the piece, otherwise it cant find _SERVERS_STARTUP ... it needs to be roughly at the same time as _diskfs_init_completed() but i cant find the right hook
<youpi>damo22: fsys_init, really
<youpi>err
<youpi>fsys_startup
<damo22>trivfs_S_fsys_init
<damo22>startup is too early
<damo22>i think it needs to be after proc and auth have ports so i can call proc_mark_important
<youpi>ah, right, you need that as well
<youpi>diskfs_S_fsys_init just needs to be made to forward the call in the diskfs_exec_server_task != MACH_PORT_NULL case as well
<damo22> /* Give our parent (the real bootstrap filesystem) an fsys_init RPC of its own, as init would have sent it. */ <--- do you mean this
<youpi>yes, but in the other part of the if as well
<damo22>ok
<damo22>i think it needs to be before the HURD_PORT_USE call ?
<damo22>i tried after and i got invalid port
<damo22>nope then it crashes on assert(proc) in my arrange notification
<damo22>do i call fsys_init ( .. , execprocess ,...
<damo22>Hurd server bootstrap: ext2fs[part:2:device:/dev/wd0] exec startup proc auth.
<damo22>rumpdisk: ../../libmachdev/startup.c:38: arrange_shutdown_notification: Assertion 'proc' failed.
<damo22>ext2fs: ../../libdiskfs/boot-start.c:545: diskfs_S_fsys_init: Unexpected error: (ipc/mig) server died.
<damo22>why would proc be not ready to call getproc() in trivfs_S_fsys_init() ?
<damo22>youpi: i wanted to investigate GPU for opencl so i bought a Radeon RX5700 (navi) but looks like the AMD compute stack is a complete mess
<AlmuHS>youpi: I'm testing the tty locked error in my T60. I've just compiled the gnumach's upstream, without my SMP patches, and the problem also appears in It
<AlmuHS>so, maybe the problem is not in my SMP patch, and it's a upstream error
<youpi>damo22: you need to call the parent fsys_init after the exec_init call, since otherwise exec won't have called startup_essential_task yet, and thus all of auth, proc, exec will still be stuck in their respective startup_essential_task
<youpi>AlmuHS: ah
<AlmuHS>I can try to upgrade, to check it the error remains
<AlmuHS>**check if
<AlmuHS>but, before It, I have to fsck ;)
<AlmuHS>are there any progress in ext3 or later supporting?
<damo22>youpi: i thought i did that in my last patches
<damo22>HURD_PORT_USE ( ... , exec_init (...) )
<damo22>/* fsys_init to follow */
<damo22>does that actually call exec_init and block on it?
<damo22>or is there a possible race condition that fsys_init can be called before exec_init is completed?
<damo22>__typeof(expr) result = expr;
<damo22>does that cause expr to be evaluated twice?
<damo22>doesnt exec_init() return a kern_return_t error code? so why is it HURD_PORT_USE (... , err)
<damo22>The code that is written could be equivalent to this, is it correct?
<damo22> err = exec_init (port, authhandle, execprocess, MACH_MSG_TYPE_COPY_SEND);
<damo22> HURD_PORT_USE (&_diskfs_exec_portcell, err);
<AlmuHS>youpi: after upgrading, the tty freezed problem continues
<AlmuHS>but a good notice: now the NIC works with upstream's gnumach
<AlmuHS>a good news
<damo22>goodnight
<damo22>i sent in 2x patches to the ml but its still not quite working
<AlmuHS>goodnight damo22
<damo22>:)
<damo22>AlmuHS: i think we all appreciate the bug reports you make but keep in mind that without hardware drivers, IMHO hurd is not quite ready to be run on real hardware yet, ie try to expect it would not work by default. Eg im not sure why you try to have a working X11 system right now when we still dont even have disk properly working... just my 2 cents
<AlmuHS> https://pasteboard.co/Jjr7hlU.jpg
<AlmuHS>before loading the tty, It shows this error
<AlmuHS>damo22: this error is not about X11, it's the default tty
<damo22>i booted hurd on X200 and also experienced a frozen tty
<damo22>that was a while ago
<AlmuHS>in Debian's GNU Mach, I can use the tty without problems
<damo22>if you disable the hurd-console i think it works
<AlmuHS>in the Debian's GNU Mach, I can use the hurd-console without problems (by now)
<AlmuHS>I go to reboot and try the Debian's GNU Mach
<damo22>if youre running a debian system, you need the debian patches on gnumach
<damo22>so just use the debian kernel!
<AlmuHS>but I need a quickly way to add the debian patches to upstream gnumach
<AlmuHS>because my patch is developed over upstream
<damo22>you should be testing smp in qemu its far easier
<AlmuHS>yes, but I have to test in real hardware too
<AlmuHS>to be sure that all works properly
<damo22>i dont even bother testing rump on real hardware yet
<damo22>its not ready
<damo22>we can do many many iterations of debug and boot on qemu
<damo22>one thing you can do is buy a usb<->sata cable and use a second hand 2.5" hard drive that you can install hurd on, and boot both in qemu and swap into real hardware to boot the same disk
<damo22>that will save you reinstalling hurd a million times
<youpi>damo22: fsys_init *calls* exec_init. And the exec_init call will not return until S_exec_init returns
<youpi>typeof() doesn't evaluate its expression, it's just replaced by the type
<youpi>the HURD_PORT_USE macro returns the value of the expr
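To illustrate youpi's two points, here is a minimal sketch in GNU C. It does not reproduce the real HURD_PORT_USE definition: EVAL_ONCE, fake_rpc, call_count and demo_eval_once are hypothetical names made up for this example. The macro is a statement expression (a GCC/Clang extension), so `__typeof (expr)` only supplies the type, `expr` itself runs once, and the macro yields its value.

```c
#include <assert.h>

/* Hypothetical counter, used only to observe how often the expression runs. */
static int call_count = 0;

/* Hypothetical stand-in for an RPC such as exec_init: it just returns a code. */
static int fake_rpc (void)
{
  call_count++;
  return 42;
}

/* Hypothetical macro, NOT the real HURD_PORT_USE.
   __typeof (expr) is replaced by the type of expr without evaluating it;
   the initializer evaluates expr exactly once; the last statement of the
   statement expression is the value of the whole macro. */
#define EVAL_ONCE(expr) \
  ({ __typeof (expr) result_ = (expr); result_; })

/* Returns whatever the wrapped expression returned. */
static int demo_eval_once (void)
{
  int err = EVAL_ONCE (fake_rpc ());
  return err;
}
```

Calling demo_eval_once () from a main function and then inspecting call_count shows a single call to fake_rpc, with its return value passed through the macro.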
<youpi>AlmuHS: concerning the "tty freeze", what is it actually? is only the tty frozen etc. ? without at least some investigation on your side I can't say much more than "works for me"
<youpi>at least try to disable the hurd console, to check whether that's where the concern is
<youpi>ah, so you did
<youpi>so you say that depending on whether you're using the debian or upstream kernel, you get a frozen tty on the hurd console or not?
<youpi>did you merge the latest upstream master into your tree?
<youpi>there's nothing left in the debian gnumach that'd be needed for anything to work
<damo22>except ramdisk?
<youpi>i.e. upstream has everything
<youpi>yes, but that's not needed to have a working system
<damo22>i closed my hurd vm, i will check tomorrow regarding exec_init
<damo22>i didnt know fsys_init calls exec_init
<youpi>« diskfs' diskfs_S_fsys_init gets called, it thus knows that proc and auth are ready, and can call exec_init. It initializes the default proc and auth ports to be given to processes. »
<youpi>on the bootstrap.html page
<youpi>that's the very code you were patching over
<damo22> http://git.zammit.org/hurd-sv.git/tree/libdiskfs/boot-start.c?id=0d1f942e54fae5fa53c9089e48c0670436266eb1#n533
<damo22>that is exec_init and then fsys_init is called after
<youpi>well, I was talking about libdiskfs' fs_init
<youpi>*fsys_init
<youpi>which currently calls exec_init(), and you made it to call the parent's fsys_init() after that
<youpi>I just meant that yes, once exec_init() is called, exec can continue its boot down to startup_essential_task and startup will then let auth/proc/exec work
<damo22>right, but does that explain why getproc() == 0
<damo22>in my tree
*youpi hadn't processed his mails yet, now answering
<damo22>sorry, i dont mean to bug you i bug-hurd
<damo22>theres no rush to answer, i cant do much tonight my brain is dead
<damo22>thanks, i'll sleep on it
<AlmuHS>youpi: the tty doesn't reply to keyboard
<AlmuHS>but it's possible to access the system via ssh
<AlmuHS>now executing fsck. After this I will try to disable hurd-console
<AlmuHS>youpi: ok, the problem is with hurd-console. Mach console works properly
<AlmuHS>my smp patch works properly. Using Mach console, I can use the tty with It
<AlmuHS>and a good news: now I have network connection using upstream's gnumach
<AlmuHS>btw, in hurd-console, is there any use for the mouse?
<youpi>iirc it is just shown, and clicking doesn't do anything useful
<youpi>concerning the hurd console hang, it'd be useful to investigate in the hurd console code: is it actually getting keypresses from the kernel or not, etc.
<youpi>there's no reason for such a simple thing to break
<AlmuHS>but currently I don't know what search in the code
<AlmuHS>I need to check any log
<youpi>truth is: I don't know either, nobody knows, but you have the hardware which has the problem, so you'll have to find out
<youpi>a few things I know, however, is that it's the console-client part you're interested in
<AlmuHS>yes, but where starts to search?
<youpi>and it so happens that there's a pc-kbd.c file in there
<youpi>by just looking
<youpi>really
<youpi>look
<youpi>there is a console/ and a console-client/ directory
<youpi>shouldn't it be obvious that it could just be there?
<AlmuHS>where is hurd-console sources?
<youpi>see dpkg -S
<AlmuHS>ok
<youpi>it'll tell you that it's in the hurd package
<youpi>and thus the hurd source
<AlmuHS>hurd-console shows a timeout error
<AlmuHS>telling that could not receive return value from daemon process
<AlmuHS>what daemon are hurd-console waiting?
<youpi>see the documentation that marcus wrote about the console
<youpi>the console is split into a daemon that does the actual processing, and a client that shows the output and gets the input
<AlmuHS>where is this docs?
<youpi>I damn don't remember by heart of course
<youpi>but it very very very probably is on the hurd wiki
<AlmuHS> https://www.gnu.org/software/hurd/hurd/console.html
<AlmuHS>this?
<youpi>that very very veryre y re eryre yveryv ery looks so yes
<AlmuHS>meanwhile, now I'm sure that my smp patch works properly. There are not any panic or similar
<AlmuHS>can someone check my code?
<youpi>but we're not actually running anything on other cpus than cpu0 :)
<AlmuHS>because my patch is only enumeration
<AlmuHS>to test my code, you only have to disable the #if NCPUS > 1 directive, in the smp_init() call
<AlmuHS>in next steps i will add some code to enable the processors, but step by step
<AlmuHS>oops, the upstream gnumach also hangs during rebooting
***Server sets mode: +nt
<AlmuHS>youpi: i disabled the #if NCPUS > 1 in my code over real hardware, and now hangs in exec server during booting
<AlmuHS> https://pasteboard.co/JjscwnL.jpg
<youpi>remember that there's apparently a race on exec startup
<youpi>did you try at least like a dozen times?
<youpi>apparently sometimes it's happening quite a lot
<AlmuHS>4 or 5
<youpi>even that is not that much
<AlmuHS>but with the rest of kernel is less common
<youpi>possibly *only* by luck
<youpi>yes, that is very possible
<youpi>never underestimate the power of "works only by luck"
<AlmuHS>ok, in the 7th attempt it works, XD
<AlmuHS>my real machine is a Thinkpad T60. A legendary model of IBM
<AlmuHS>in Qemu my patch works with less problems ;)
<AlmuHS>right, in Qemu most things works with so few problems, of course, XD
<AlmuHS>youpi: ok, then you can check my code with more safety, XD
<AlmuHS>reading the hurd console tutorial, i've discovered that the console error is in the client, in /bin/console
<AlmuHS> https://web.archive.org/web/20111025035056/http://uwhug.org.uk/index.pl?Hurd_Console_Tutorial
<Pellescours>AlmuHS: I tried to compile with your patches with NCPU =2 and I got a compilation error, CPU_NUMBER not defined. did I do something wrong ?
<youpi>did you run autoreconf?
<Pellescours>yep
<Pellescours>I got ../i386/i386/cswitch.S:42: Error: invalid character '(' in mnemonic
<Pellescours>this correspond to CPU_NUMBER(%eax)
<Pellescours>and in i386/i386/cpu_number.h the macro CPU_NUMBER is defined only if NCPUS==1
<Pellescours>But if I compile with NCPUS=1 the patch don't break gnumach
<AlmuHS>Pellescours: yes, it's expected. There are many missing functions in NCPUS > 1
<AlmuHS>in next patches I will try to solve this
<AlmuHS>to try my patch you have to compile with mach_ncpus = 1, and disable the #if NCPUS > 1 macro in smp_init() call
<Pellescours>okay, thanks, I try this so
<AlmuHS>the next patch will implements cpu_number() and CPU_NUMBER()
<AlmuHS>but first I was to commit this
<AlmuHS>**I want
<Pellescours>I understand
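The build failure Pellescours reports follows from a macro that only exists in one preprocessor branch: when the guard is false, the name is left unexpanded, and the assembler then sees `CPU_NUMBER(%eax)` as if it were an instruction. A minimal sketch of that pattern in C, assuming nothing about the actual contents of i386/i386/cpu_number.h (DEMO_CPU_NUMBER and demo_current_cpu are hypothetical names, and the NCPUS default below is only for this example):

```c
#include <assert.h>

/* Hypothetical default for this example only; in gnumach the value of
   NCPUS comes from the build configuration. */
#ifndef NCPUS
#define NCPUS 1
#endif

#if NCPUS == 1
/* Uniprocessor case: the current CPU is always CPU 0. */
#define DEMO_CPU_NUMBER() 0
#endif
/* There is deliberately no #else branch: with NCPUS > 1 the name
   DEMO_CPU_NUMBER stays undefined, so code that uses it directly
   would fail to build. */

/* Returns the CPU number when the macro exists, or -1 when it does not. */
static int demo_current_cpu (void)
{
#ifdef DEMO_CPU_NUMBER
  return DEMO_CPU_NUMBER ();
#else
  return -1;
#endif
}
```

With the default NCPUS of 1, demo_current_cpu () returns 0; building the same file with -DNCPUS=2 takes the fallback path and returns -1, which stands in for the missing multiprocessor implementation AlmuHS mentions.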
<AlmuHS>if the patch works, you might see a table with the CPUs and IOAPIC discovered, during the booting
<Pellescours>I saw it, but it was too fast and I'm not able to see kernel log (/var/log/dmesg is not updated and /dev/klog is empty)
<Pellescours>using the trick of qemu -s -S ... and gdb I was able to stop it and I can see the logs :)
<AlmuHS>don't forget set more than one cpu in Qemu script ;)
<youpi>use a serial console
<youpi>qemu -serial stdio
<youpi>and use console:com0 on the gnumach command line
<Pellescours>AlmuHS: in qemu this also working with -no-acpi :D
<AlmuHS>LOL
<Pellescours>It can't find acpi but it don't crash
<AlmuHS>I never did that tests, thanks Pellescours!!
<AlmuHS>the -no-acpi test
<Pellescours>It works with 2 cpus, it found the cpus
<AlmuHS>I tested even with 4 cpus
<AlmuHS>in real hardware, I only tested with 2 cpus. There are not Core2Quad for laptop ;)
<AlmuHS>my patch assume xAPIC. For x2APIC can be necessary some configurations
<Pellescours>I just tried with -smp 64 and it works
<AlmuHS>the Core iX processors using x2APIC
<AlmuHS>ok, thanks again
<Pellescours>I will try with -cpu host to see (it's an intel core i5)
<Pellescours>It seems to works too
<Pellescours>AlmuHS: is there a way to see the ACPI cpu data (number found ...) at runtime or not yet ?
<AlmuHS>I have a Thinkpad T410, with an i5 1st gen, but I don't sure if I can get network connection in it, to download the sources and try there
<AlmuHS>I'll try there in a weeks. But the current tests over Qemu and 32-bit CoreDuo are enough by now
<AlmuHS>about tty freeze, i've found the code block in which the error is shown
<AlmuHS>It seems that the fork fails http://git.savannah.gnu.org/cgit/hurd/hurd.git/tree/console-client/console.c#n696
<AlmuHS>but I don't found the daemon_fork definition
<jrtc27>libdaemon
<jrtc27>(not a hurd package)
<jrtc27>(uh, s/package/library)
<AlmuHS>but, anyways, It seems a variant of classical POSIX's fork()
<AlmuHS>the question is... why fails the fork?
<AlmuHS>ok, the fork doesn't fails
<AlmuHS>but the daemon doesn't send the signal
<youpi>possibly it crashes after starting
<youpi>and you can try to run it by hand without the --daemonize option, to see what happens
<youpi>you just need to pass the same parameters as configured in /etc/default/hurd-console
<AlmuHS>hurd-console ... --daemonize ?
<youpi>ah, there are a few other
<youpi>without** --daemonize
<AlmuHS>yes
<youpi>anyway, for the starting up details, see /etc/init.d/hurd-console
<youpi>usually I have /bin/console --daemonize -d current_vcs -c /dev/vcs -d vga -d pc_kbd --keymap fr pc_kbd -d pc_mouse --protocol=ps/2 pc_mouse
<youpi>possibly just drop --daemonize from that
<AlmuHS>ok, wait a minutes
<AlmuHS>like this? # console -d vga -d pc_mouse --repeat=mouse -d pc_kbd --repeat=kbd -d generic_speaker -c /dev/vcs
<AlmuHS>I've removed --daemonize option in /etc/init.d/hurd-console. I go to try It
<jrtc27>that's not going to work surely
<AlmuHS>the exec freeze is notably more common in my smp microkernel. why?
<jrtc27>you have more code, so some operations take slightly longer and make the race condition more likely to be triggered?
<AlmuHS>could be
<jrtc27>races are very very fragile
<AlmuHS>about tty, without daemonize it seems don't start
<jrtc27>yes, it won't, because you need to daemonize in an init script
<jrtc27>"and you can try to run it ***by hand*** without the --daemonize option, to see what happens"
<AlmuHS>it's true
<youpi>AlmuHS: "don't start", as in?
<youpi>what does it tell?
<AlmuHS>blank screen
<jrtc27>youpi: AlmuHS just patched the init script to not daemonize
<youpi>did you try to run from ssh to get the output?
<jrtc27>so of course it's going to just hang
<youpi>errgl sure
<jrtc27>and hopefully sysvinit times out or something and just kills it, idk
<youpi>run by *hand*
<youpi>not from the init script
<jrtc27>yeah
<AlmuHS>XD
<youpi>I only mentioned the init script to get how it's usually started
<AlmuHS>i can't access via ssh, the ssd server didn't start too
<AlmuHS>*ssh
<youpi>yet another reason to start by *HAND*
<AlmuHS>yes
<AlmuHS>now I have to solve this from chroot. :-/
<youpi>from chroot?
<AlmuHS>or from live
<AlmuHS>it's a physical machine
<youpi>that's not really a problem, you can put whatever you need to hack in it
<youpi>the build tree, gdb, etc.
<AlmuHS>the problem is that I've locked the tty when I removed the --daemonize option
<AlmuHS>and ssh starts after console, so ssh doesn't works
<AlmuHS>so I have to edit the file from chroot or a live to recover the console
<youpi>just put back the --daemonize option in there
<youpi>so it starts normally
<youpi>or disable the daemon
<AlmuHS>yes, but I can't access the machine, it's the problem
<youpi>don't you have network access?
<AlmuHS>yes, but ssh doesn't start
<youpi>again
<youpi>again
<youpi>again
<AlmuHS>?
<youpi>put back the --daemonize option
<youpi>so it'll start
<youpi>and thus ssh will start
<youpi>and thus you'll have access
<AlmuHS>but I need to access via live or similar, because i have not tty or ssh
<AlmuHS>i need to access to edit the file
<AlmuHS>do you understand the problem now?
<youpi>you can reboot in emergency mode to get access to the disk
<youpi>or with the installer CD and mount the partition by hand
<AlmuHS>I'm using the rescue mode
<AlmuHS>of DVD-1
<youpi>ok, but no need to chroot
<youpi>you just need to put back the --daemonize option
<AlmuHS>not really, simply the rescue mode offers this option
<AlmuHS>ok, file edited
<AlmuHS>tty recovered
<AlmuHS>youpi: i tested with this line: console -d vga -d pc_mouse --repeat=mouse -d pc_kbd --repeat=kbd -d generic_speaker -c /dev/vcs
<AlmuHS>but nope
<AlmuHS>i executed It by hang
<AlmuHS>*hand
<AlmuHS>but the tty is blank again
<AlmuHS>and I've lost the ssh access
<AlmuHS>i go to disable hurd-console, and try again from Mach console
<AlmuHS>youpi: ok, starting hurd-console manually from mach console, it works
<AlmuHS>i go to reboot to be sure I've started the correct kernel
<AlmuHS>ok, it's correct
<AlmuHS>It seems the problem is in the mouse driver
<AlmuHS>when I add the mouse option, hurd-console keeps blank at start
<AlmuHS>youpi: finally solved. Disabling mouse driver, hurd-console works properly in upstream's gnumach
<AlmuHS>youpi: excuse me the insistence. Pellescours has tested my SMP patch in many scenarios (disabling ACPI, in a x86_64 host...) and It works properly
<jrtc27>what's your point?
<AlmuHS>my point?
<AlmuHS>what do you refers?
<jrtc27>the message you just sent
<jrtc27>why did you send it?
<jrtc27>what do you hope to gain from it?
<jrtc27>do you want something to happen as a result, or is it just informative?
<AlmuHS>because some days ago i was afraid about possible errors in my SMP patch. And now, after the tests, I can confirm that the patch works well
<AlmuHS>It's informative
<AlmuHS>now, the patch review can be centered in the code, not in if the patch works
<AlmuHS>**focused, not centered
<AlmuHS>btw, jrtc27 , has you read the patch?
<jrtc27>no, I stopped after seeing the lack of care wrt code formatting
<AlmuHS>i fixed the formatting
<jrtc27>no you didn't though
<jrtc27>it's still very sloppy
<jrtc27>and for me at least I find that makes it much harder to concentrate on the code itself
<jrtc27>both in that it's mentally more taxing when things aren't uniform, and that I spend my time noticing all the ways in which it's wrong
<AlmuHS>i can recheck again. But I don't find a fixed reference for the formatting. In many files are different styles
<jrtc27>the predominant style in GNU Mach is the mach style
<jrtc27>the GNU style exists in some files, but IMO that's a mistake
<jrtc27>your code is neither of the two, but an inconsistent mix of them plus some mistakes on top of that
<AlmuHS>even in GNU files, in the same file there are many different styles
<AlmuHS>so I don't know what is the correct
<jrtc27>right, but, you're not even using that
<jrtc27>at least pick something and do it correctly
<jrtc27>but I'd argue that mach should follow the mach style
<jrtc27>even though it's GNU Mach
<jrtc27>because that's what most of the code does
<AlmuHS>if each file has a style, what style must I to use?
<jrtc27>you should always match the surrounding style