<youpi>llvm really is a beast to build
<youpi>to determine what is happening
<youpi>by putting prints in the various cases
<youpi>I would have put like two dozen
<AlmuHS>in thread_setrun(), there is an XXX
<AlmuHS> * XXX should replace queue with a boolean in this case.
<youpi>AlmuHS: in thread_select, what is interesting is the myprocessor->runq.count > 0 case
<youpi>like I said: thread_setrun puts it on cpu0's run, thread_select finds it there
<youpi>cpu0's runq is myprocessor->runq
***Server sets mode: +nt
<AlmuHS>now I have a bad backache, and I want to sleep
<AlmuHS>you can open an issue in my repo, or send me an email with the notes
<youpi>gnu_srs: ergl, the mahler box crashed exactly while I was upgrading the gnumach kernel, and it can't reboot
<youpi>could you reboot it with a debian installer CD image?
<mabox>The guy did suggested at Cristosan.com
<damo22>youpi: why do disable_irq and enable_irq in linux/dev/arch/i386/kernel/irq.c both call "cli"?
<gnu_srs2>youpi: I think mahler is OK again. Can you confirm?
<damo22>hey AlmuHS i am implementing IOAPIC
<damo22>im doing it naively for now, just by assuming all interrupts are active low level triggered
<damo22>seems to be most compatible with PCI cards
<AlmuHS>we also need to program the Local APIC timer
<damo22>i think we need to parse the MADT/APIC table for legacy interrupt overrides
<AlmuHS>but MADT shows very little information about lapic, I remember
<AlmuHS>check the ACPI specification for this
<AlmuHS>I only read the APIC ID of IOAPIC, I remember
<damo22>but there might be some other fields that say which interrupts are which polarity
<AlmuHS>in acpi_rsdp.c, I have an ioapic array
<AlmuHS>have you found the MADT section in the ACPI specification?
<damo22>im writing the boilerplate code for ioapic
<AlmuHS>and it's not as clear as the Intel guides
<AlmuHS>APIC info is in chapter 10, in the Intel guides
<damo22>because destination field for redirection says 4 bits for apicid and top 4 bits for processor if its a logical destination
<damo22>but we are going to use fixed destinations so no processor markings
<AlmuHS>the ioapic structures in apic.c are the default from the Mach 4 header, I preferred not to touch them
<AlmuHS>you can add a new struct to apic.h
<AlmuHS>I kept the old Mach struct because I don't understand very well how the IOAPIC structures work
<damo22>i can send you a paste of what i have
<youpi>damo22: restore_flags is a popf, which contains the interrupt flag
<damo22>im still replacing pic with ioapic stuff
<damo22>because interrupt handling is a nightmare for that
<AlmuHS>ok, it might be patched later, but it's a good decision for now
<AlmuHS>pretty code. My code is a bit dirty ;)
<AlmuHS>youpi: which printf did you say I had to add last night? I lost my mobile IRC history
<youpi>(01:16:03) Samuel Thibault: AlmuHS: in thread_select, what is interesting is the myprocessor->runq.count > 0 cas
<youpi>(01:16:05) Samuel Thibault: e
<youpi>(01:16:29) Samuel Thibault: like I said: thread_setrun puts it on cpu0's run, thread_select finds it there
<youpi>(01:16:38) Samuel Thibault: cpu0's runq is myprocessor->runq
<youpi>(01:16:41) Samuel Thibault: not pset->runq
<AlmuHS>ok. Now I go to lunch. After this, I will try this
<damo22>youpi: why does the PIC mask get changed at all?
<damo22>is it rescheduling interrupts by masking them out at will?
<damo22> movl $(SPL0),EXT(curr_ipl) /* set ipl */
<damo22> movl EXT(pic_mask)+SPL0*4,%eax /* get PIC mask */
<damo22> SETMASK() /* program PICs with new mask */
<damo22>im confused why it adds SPL0*4 wont that always evaluate to adding zero?
<damo22>my problem is i dont understand what spl 0-7 are for
<damo22>they seem to be interrupt priorities
<damo22>i think this is legacy crap that can go away?
<youpi>damo22: changing the PIC mask is a way to mask interrupts rather than using cli. various spl levels have various PIC masks
<youpi>SPL0*4 will always be zero, 0
<youpi>putting it away may not be that simple
<youpi>using SPL0*4 is just a way to remind that pic_mask is an array indexed by spl levels
<damo22>in the ioapic you can mask interrupts as well
<damo22>but i checked all the spl levels, the old pic masks are not changing except 2 IRQs
<damo22>doesnt that mean the pic masks are only changing for irq1 and irq13?
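The point about `pic_mask` being an array indexed by spl level can be sketched in plain C. This is only an illustration under stated assumptions: the spl numbering other than `SPL0 == 0`, the mask values, and the helper names `pic_mask_offset` and `mask_for_spl` are all made up here and are not taken from gnumach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical spl numbering for illustration; only SPL0 == 0 comes
 * from the discussion. */
enum { SPL0 = 0, SPL1 = 1, SPL7 = 7, NSPL = 8 };

/* One 32-bit mask per spl level. The contents are invented; entries
 * not listed are zero-initialized. */
static const uint32_t pic_mask[NSPL] = {
    [SPL0] = 0x0000,
    [SPL1] = 0x0002,
    [SPL7] = 0xffff,
};

/* Byte offset of the entry for a level. With 4-byte entries this is
 * the "SPLn*4" term of the assembly operand, so it is 0 for SPL0. */
static size_t pic_mask_offset(int spl)
{
    return (size_t)spl * sizeof pic_mask[0];
}

/* C equivalent of loading the mask for a level: plain array indexing. */
static uint32_t mask_for_spl(int spl)
{
    assert(spl >= 0 && spl < NSPL);
    return pic_mask[spl];
}
```

Written this way, `SPL0*4` adds nothing to the address, but the operand keeps the same shape as the loads for the other levels.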
<damo22>so i think i will eliminate all this garbage
<youpi>please don't call that "garbage" :)
<youpi>mit people have worked hard to get this working on old hardware which had way more limitations than nowadays'
<youpi>note however that spl7 is hardcoded to an all-mask
<youpi>look at kdstart() for instance
<youpi>these spl levels are not "just for nothing", really
<youpi>in kdstart it uses splsoftclock() to avoid the clock ticking
<youpi>without completely masking everything
<damo22>but with an apic timer won't all this be simpler?
<youpi>I just mean that "throwing away" can't be done lightly
<youpi>possibly these spl levels can be squashed into just one "just disable interrupts" level because nowadays it's not that urgent to respond to an interrupt
<youpi>theoretically code running at a given spl level could expect to be able to wait for an interrupt of a high spl level
<youpi>I don't think that happens in practice
<youpi>perhaps you could first try to add all interrupts at SPL1
<youpi>to check that it still works
<damo22>so SPL0 means the interrupt is ignored?
<youpi>spl0 means all interrupts are unmasked
<youpi>and spl7 means all interrupts are masked
<damo22>so a system could run with all interrupts working at spl0
<damo22>and at spl7 nothing interrupts the cpu(s)
<damo22>i cant think anymore its been a big week
<AlmuHS>excuse me. Today I have an unstable internet connection
<youpi>AlmuHS: it would be useful to not only print the thread address, but also its task name: thread->task->name
<youpi>to check what actually gets running
<youpi>here we see alternation between two threads
<youpi>it'd be useful to know from what task
<youpi>also, to get more logs, you can pass -serial stdio to qemu, and console=com0 to gnumach
<AlmuHS>oh, ok. I was searching for a thread name
<AlmuHS>It shows an error: qemu-system-i386: console=com0: drive with bus=0, unit=0 (index=0) exists
<youpi>console=com0 is for gnumach, not qemu
<youpi>/etc/default/grub's GRUB_CMDLINE_GNUMACH variable
<youpi>like I said, the content of the parameter to be passed to gnumach is console=com0
<youpi>so you have to put console=com0 in the variable, not only part of it
<AlmuHS>GRUB_CMDLINE_GNUMACH="console=com0"
<youpi>I fail to understand how you could hope that anything could work
<youpi>I said you have to pass console=com0 to gnumach
<youpi>so you have to pass only that
<AlmuHS>yes, but I didn't understand that it was a parameter
<AlmuHS>I thought that it was a configuration
<youpi>for grub it's not a parameter, it's part of the gnumach command line
<youpi>thus my saying "pass console=com0 to gnumach"
<youpi>it's just text passed to the kernel
<youpi>it's printed on stdio of qemu
<youpi>I forgot to mention: of course after modifying the grub configuration you need to run update-grub
<AlmuHS>wait, I go to pastebin the output
<AlmuHS>oh, fuck, &> doesn't redirect the output to a file
<AlmuHS>youpi: I paste a partial output (my terminal lost the start)
<youpi>as I said, getting the names would be useful
<youpi>redirecting to a file should be possible with proper shell redirection
<AlmuHS>sorry, the file is too long to pastebin
<AlmuHS>youpi: here you can see the pastebin
<youpi>AlmuHS: you could truncate the file before pastebining it
<youpi>AlmuHS_: it's a char*, you need to use %s
<youpi>but also print the thread address
<youpi>I said "not only print the thread address, but also its task name"
<youpi>really, make sure to read what we write
<AlmuHS_>added. But I can't push yet, my WiFi is down
<youpi>really, you need to understand about the consequences of what you are doing: if you keep %d in the format, for sure it will keep printing a number
<AlmuHS>I've just recovered WiFi. I go to push
<AlmuHS>thread f5e3a938 with name ext2fs selected in cpu 0
<youpi>details matter, please pastebin the log
<AlmuHS>I need to improve pset prints, but I'll pastebin anyway
<AlmuHS>I forgot to add slot_num when I print "choose pset". I go to improve it
<youpi>also put a print in the else part of pset->runq.count == 0
<AlmuHS>is there any way to identify the pset (as a name)?
<youpi>there is only one pset for now, containing all processors
<youpi>also, print th->sched_pri and th->state
<youpi>in thread_setrun, in the part with XXX Don't do this remotely to master
<youpi>try to remove the (processor != master_processor) && part
<youpi>it shouldn't be a problem nowadays
<AlmuHS>remove the else block or only the check?
<youpi>(like I said. I don't see how else it could be understood if one really reads what I have written)
<AlmuHS>I understood that you wanted me to disable the block which starts with this condition
<youpi>I don't see how you could understand that
<youpi>my "part" word only designated "(processor != master_processor) &&"
<youpi>i.e. only that needed to be removed
<youpi>if I wanted the whole block to be removed, I'd have said so
<youpi>see the patch I have just committed to master actually
<youpi>which makes me realize: aston() is not actually where you'd send an IPI. cause_ast_check is
<youpi>AlmuHS: that commit is in the hurd repo
<youpi>we are talking about the gnumach repo
<youpi>that's the latest commit and it talks about what I mentioned, so yes, sure
<AlmuHS>my master branch is a remote to upstream
<AlmuHS>compiling. (excuse spanish expression)
<youpi>like I mentioned above, also print th->sched_pri and th->state
<youpi>the fact that I mentioned something else after that doesn't mean that it is not useful any more
<youpi>each time you get one, i.e. after the calls to choose_thread(), current_thread(), choose_pset_thread(), or dequeue_head()
<youpi>also, you could use control-alt-d to get the kernel debugger (you build with --enable-kdb, right?), and type "show all threads" there to get a list of threads
<AlmuHS>I didn't compile with --enable-kdb (I think), but I can compile with this
<AlmuHS>I have a compilation error. This call is not defined: cpu_interrupt_to_db
<AlmuHS>ddb/db_mp.c:155: cpu_interrupt_to_db(i);
<AlmuHS>this error only appears when I try to compile with --enable-kdb
<AlmuHS>youpi: It seems I can't use kdb in my smp kernel. There is another not-implemented function, I think
<youpi>ah, that will be needed for proper debugging on the different processors indeed. For now you can just add to ./i386/i386/db_interface.c a function that does nothing
<youpi>that's when you press control-alt-d ?
<youpi>you can use gdb to know where that is
<AlmuHS>I thought that it was a bad trace, but I commented the risky trace and the dump appears again
<youpi>AlmuHS: perhaps thread->task is not always set, so use thread->task ? thread->task->name : "no name"
<youpi>you can use gdb to get the precise line number
<youpi>l * ( thread_select + 0x205 )
<youpi>(the first line of the backtrace tells these)
<AlmuHS>0xc1030155 is in thread_select (../kern/sched_prim.c:593).
<AlmuHS>588 simple_unlock(&pset->runq.lock);
<AlmuHS>593 if (thread->policy == POLICY_TIMESHARE) {
<AlmuHS>595 myprocessor->quantum = pset->set_quantum;
<youpi>so most probably it's the thread variable which has become bogus
<youpi>what I don't understand is: you only added prints?
<youpi>perhaps try to go back through the commits
<youpi>to determine what exactly triggered the issue
<youpi>possibly breaking down the commit in separate lines
<youpi>(again, putting prints in the else part of if (pset->runq.count == 0) in thread_select would allow to make sure where that "thread" value comes from)
<AlmuHS>but now I'm doing checkout in old commits, to find the problem
<AlmuHS>youpi: the problem seems to happen after resync with upstream
<AlmuHS>nope, it's in the commit previous to this one
<AlmuHS> printf("current thread is %d with name %s , priority %d and state %d\n", thread, thread->task ? thread->task->name : "no name", thread->state);
***Glider_IRC_ is now known as Glider_IRC
<AlmuHS>nope, this print is not the problem
***Out`Of`Control is now known as Viper
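The branches of thread_select mentioned in this discussion (the `myprocessor->runq.count > 0` case and the `pset->runq.count == 0` check with its else part) can be modelled outside the kernel to see which queue a selected thread would come from. This is a toy sketch, not gnumach's code: the structures keep only a `count` field, and `pick_source` and `pick_source_name` are hypothetical helpers invented for the illustration.

```c
#include <stddef.h>

/* Stripped-down stand-ins for the kernel structures: only the queue
 * counters that the discussion refers to. */
struct run_queue { int count; };
struct processor_set { struct run_queue runq; };
struct processor {
    struct run_queue runq;        /* per-processor run queue */
    struct processor_set *pset;   /* set this processor belongs to */
};

enum pick_source { PICK_LOCAL_RUNQ, PICK_PSET_RUNQ, PICK_NONE_QUEUED };

/* Decide where the next thread would be taken from, in the order the
 * discussion describes: local run queue first, then the pset queue. */
static enum pick_source pick_source(const struct processor *myprocessor)
{
    if (myprocessor->runq.count > 0)
        return PICK_LOCAL_RUNQ;      /* thread was queued on this cpu */
    if (myprocessor->pset->runq.count == 0)
        return PICK_NONE_QUEUED;     /* nothing queued anywhere */
    return PICK_PSET_RUNQ;           /* the "else part" of the check */
}

/* Label for debug prints, one per branch. */
static const char *pick_source_name(enum pick_source s)
{
    switch (s) {
    case PICK_LOCAL_RUNQ:  return "myprocessor->runq";
    case PICK_PSET_RUNQ:   return "pset->runq";
    default:               return "no queued thread";
    }
}
```

Printing one such label per branch, next to the thread address, is one way to tell which path produced a given thread value.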
<gnu_srs2>AlmuHS: Nice log. Maybe you still have problems with the format of the print statements, negative values just seem wrong?
<AlmuHS>probably, these values are not set at those moments
<AlmuHS>in the pset case, the pset has no ID
<AlmuHS>but really, I can print these with %x
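The formatting problem raised here can be illustrated with standard C. An address printed with %d can come out negative, and every conversion in the format string needs its own argument. The sketch below assumes the hosted C library's `snprintf`; the kernel's own printf may support a different set of conversions. `struct task` and `struct thread` are hypothetical stand-ins holding only the fields named in the discussion, and `format_thread` is an invented helper.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * that the debug print uses. */
struct task { const char *name; };
struct thread { struct task *task; int sched_pri; int state; };

/* Format one debug line: %p for the address, %s for the task name
 * (guarded against a missing task), %d for the priority and %x for
 * the state bits. Four conversions, four arguments. */
static int format_thread(char *buf, size_t len, const struct thread *th)
{
    const char *name =
        (th->task && th->task->name) ? th->task->name : "no name";

    return snprintf(buf, len, "thread %p name %s pri %d state 0x%x",
                    (const void *)th, name,
                    th->sched_pri, (unsigned int)th->state);
}
```

Keeping the guard on `th->task` in one helper means each print site does not have to repeat the null check.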