IRC channel logs
2024-02-11.log
<damo22>youpi: when it hangs and i cant enter kdb, all cpus are in HLT state with 0x246 eflags
<youpi>damo22: is the timer interrupt not working any more?
<youpi>does 0x246 include the interrupt flag?
<damo22>i think 0x200 is interrupt flag?
<damo22>icrh is being set with apic ids for ipis
<damo22>and yes timer interrupt is working
<damo22>pit timer on cpu0 and lapic timers on each AP
<damo22>although, how does cpu0 receive timer interrupts?
<damo22>does it route through the lapic?
<damo22>i verified all the timer interrupts are being called
<youpi>ok so interrupts basically work, the question is then why the keyboard interrupt doesn't work to trigger kdb
<damo22>let me check if i compiled with --enable-kdb
<damo22> pin 1 0x0000000000010031 dest=0 vec=49 active-hi edge masked fixed physical
<damo22>must be that kbdopen is not called
<damo22>how can kd represent the screen if it refers to keyboard events?
<damo22>Processor set runq: count(0) low(4)
<damo22>Processor #0 runq: count(0) low(32)
<damo22>Processor #1 runq: count(0) low(32)
<damo22>Processor #2 runq: count(0) low(32)
<damo22>Processor #3 runq: count(0) low(32)
<damo22>Processor #4 runq: count(0) low(32)
<damo22>Processor #5 runq: count(0) low(32)
<youpi>so it's not a scheduler bug, but a missing wakeup somewhere
<damo22>eg in thread_run and friends, it reads the cpu number before disabling interrupts, what if it gets interrupted in between reading the cpu number and turning off interrupts to handle the code
<damo22>i guess it will return here eventually with the right stack?
<youpi>it looks odd to be calling current_processor() outside splsched() indeed
<damo22>yeah can it be possibly getting the wrong value for cpu number?
<damo22>or caching it, and then returning on a different cpu
<youpi>I don't remember if the gnumach kernel is preemptible
<youpi>userland is preemptible of course
<damo22>does that mean interrupts cannot interrupt gnumach process?
<youpi>they just can't preempt them
<youpi>preempt = run another thread
<youpi>i.e. when you're in kernel mode, you're sure to stay running, until you call something that might block
<damo22>how do you hand off to another thread?
<youpi>well, the blocking primitive will block
<youpi>i.e. tell the scheduler to run something else
<damo22>but i mean if a timer interrupt happens before splsched() will it guarantee to return with the same cpu and kernel stack?
<youpi>threads don't magically change cpu :)
<damo22>what sort of missing wakeup would i be looking for
<youpi>no idea, but you probably rather want to look for what is actually supposed to be running
<damo22>the cpus all end up in machine_idle
<youpi>sure, if there is no thread to run, cpus will be idle
<youpi>question is: what is your system actually doing?
<youpi>what did you expect it to be doing?
<youpi>are you having a shell, something?
<youpi>very possibly you just have a userland interlock
<damo22>its in the bootup process running INIT
<youpi>and thus the kernel is not at fault at all
<youpi>and it's just userland being faulty
<youpi>so check what userland is doing, which point it is at
<youpi>you can also check the state of ext2fs
<youpi>possibly it's stuck for whatever reason
<youpi>but you can see in the backtrace what it's doing
<youpi>it's like agatha christie novels
<youpi>making hypotheses is premature until you actually have an idea where you're aiming at
<damo22>is there any way to make all the init tasks bound on cpu0 but let some tasks like gcc run on APs?
<damo22>or does everything inherit from init task
<youpi>everything inherits from the init task
<youpi>but you can probably change the binding later when you want
<damo22>make everything run on cpu0 and select some things to run on APs only
<damo22>ok when i do that, it runs, reallllly slow
<damo22>i changed it so APs are in a separate pset, but i think the scheduler is putting threads into the alternate pset as well but they are never run
<damo22>youpi: i think ive solved it for now
<damo22>i put APs into a separate processor set and they are disabled by default
<damo22>so smp boots with all APs but only executes on BSP
<damo22>i think we can enable them using processor_set RPCs
<damo22>ive mailed in a few small patches that enable this
<damo22>we might be able to spawn a shell that runs only on APs
<damo22>idling in a processor set that is unused
<janneke>could it be that the latest hurd release (v0.9.git20231217) needs an unreleased gnumach?
<janneke>69620634858b2992e1a362e33c95d9a8ee57bce7
<janneke>x86_64: Support 8 byte inlined port rights to avoid message resizing.
<janneke>start-translator-long.c:42:3: error: unknown type name ‘mach_port_name_inlined_t’
<janneke> 42 | mach_port_name_inlined_t control_port;
<youpi>which "release" of gnumach do you have?
<youpi>possibly I just forgot to push latest tags
<youpi>it seems there was no 2023 tag indeed
<janneke>i'm using v1.8+git20230410, which is my latest afaics
<youpi>see the mig changes that Flavio introduced lately
<youpi>which indeed makes incompatible API changes
<youpi>anyway, I have pushed the latest 2023 tags
<janneke>thanks, and good to know latest hurd needs something closer to gnumach master
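
Note on the two raw values quoted early in the log (the 0x246 eflags and the pin 1 entry 0x0000000000010031): the sketch below decodes them by hand. It is not taken from gnumach; the function names and the dictionary layout are made up for illustration, and only the bit positions come from the x86 EFLAGS layout and the I/O APIC redirection table entry format.

```python
# Illustrative decoder, not gnumach code. Names are hypothetical;
# bit positions follow the x86 EFLAGS register and the I/O APIC
# redirection table entry layout.

EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
    8: "TF", 9: "IF", 10: "DF", 11: "OF",
}

DELIVERY_MODES = {
    0: "fixed", 1: "lowest-priority", 2: "SMI",
    4: "NMI", 5: "INIT", 7: "ExtINT",
}


def decode_eflags(value):
    """Return the names of the named flags set in an EFLAGS value.

    Bit 1 is reserved and always reads as 1, so it is not listed.
    """
    return [name for bit, name in sorted(EFLAGS_BITS.items())
            if value & (1 << bit)]


def decode_ioapic_entry(entry):
    """Split a 64-bit I/O APIC redirection entry into its fields."""
    return {
        "vector": entry & 0xFF,                                   # bits 0-7
        "delivery": DELIVERY_MODES.get((entry >> 8) & 0x7, "reserved"),
        "dest_mode": "logical" if entry & (1 << 11) else "physical",
        "polarity": "active-lo" if entry & (1 << 13) else "active-hi",
        "trigger": "level" if entry & (1 << 15) else "edge",
        "masked": bool(entry & (1 << 16)),
        "dest": (entry >> 56) & 0xFF,                             # bits 56-63
    }


if __name__ == "__main__":
    print(decode_eflags(0x246))
    print(decode_ioapic_entry(0x0000000000010031))
```

Under these assumptions, 0x246 has bit 9 (0x200, the interrupt flag) set, which agrees with damo22's guess, and the pin 1 entry decodes to the same fields printed in the log, including the mask bit being set.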