IRC channel logs

2019-11-01.log

<youpi>llvm really is a beast to build
<AlmuHS>I added some prints
<AlmuHS>I'm exhausted
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/3b2799af8d4f7fdf8f51e8db7b655d736850312d
<youpi>I'd say put way more prints
<youpi>to determine what is happening
<youpi>by putting prints in the various cases
<AlmuHS>even more?
<youpi>I would have put like two dozens
<AlmuHS>uff...
<AlmuHS>in thread_setrun(), there is an XXX
<AlmuHS> * XXX should replace queue with a boolean in this case.
<AlmuHS>line 1364
<AlmuHS>I added more traces
<AlmuHS>I'm so tired
<AlmuHS> https://pasteboard.co/IEAEd9N.png
<AlmuHS>I go to sleep
<AlmuHS>sleeping
<youpi>AlmuHS: in thread_select, what is interesting is the myprocessor->runq.count > 0 case
<youpi>like I said: thread_setrun puts it on cpu0's runq, thread_select finds it there
<youpi>cpu0's runq is myprocessor->runq
<youpi>not pset->runq
***Server sets mode: +nt
<AlmuHS>ok, tomorrow I'll check it
<AlmuHS>now I have a hard backache, and I want to sleep
<AlmuHS>you can open an issue in my repo, or send me an email with the notes
<AlmuHS>and tomorrow I will continue
<youpi>gnu_srs: ergl, the mahler box crashed exactly while I was upgrading the gnumach kernel, and it can't reboot
<youpi>could you reboot it with a debian installer CD image?
<mabox>The guy did suggested at Cristosan.com
<damo22>youpi: why does disable_irq and enable_irq in linux/dev/arch/i386/kernel/irq.c both call "cli"
<damo22>shouldn't one of them be sti?
<gnu_srs2>youpi: I think mahler is OK again. Can you confirm?
<damo22>hey AlmuHS i am implementing IOAPIC
<AlmuHS>oh! :)
<damo22>im doing it naively for now, just by assuming all interrupts are active low level triggered
<damo22>seems to be most compatible with PCI cards
<AlmuHS>I didn't implement any trigger
<AlmuHS>only IDT
<damo22>im doing it
<AlmuHS>oh, ok
<AlmuHS>we need also to program the Local APIC timer
<damo22>yeah
<damo22>i think we need to parse the MADT/APIC table for legacy interrupt overrides
<AlmuHS>It's possible
<AlmuHS>but MADT shows very little information about lapic, I remember
<AlmuHS>check ACPI specification for this
<damo22>not lapic, ioapic
<AlmuHS>oh, really
<AlmuHS>I only read the APIC ID of IOAPIC, I remember
<damo22>that is all i need really
<damo22>but there might be some other fields that say which interrupts are which polarity
<AlmuHS>in acpi_rsdp.c, I have an ioapic array
<damo22>yeah, i am using it
<damo22>i borrowed the address from it
<damo22>so i can talk directly to hw
<AlmuHS>have you found the MADT section in the ACPI specification?
<damo22>not yet
<AlmuHS>wait, maybe I had a screenshot
<damo22>im writing the boilerplate code for ioapic
<AlmuHS>5.2.1 chapter
<damo22>void ioapic_configure(void)
<AlmuHS>perfect
<AlmuHS>acpi specification is here: http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf
<AlmuHS>MADT is chapter 5.2.1
<AlmuHS>5.2.12, exactly
<damo22>i hate that doc
<AlmuHS>yes, me too
<damo22>too much crap in ACPI
<AlmuHS>yes
<AlmuHS>and it's not as clear as the Intel guides
<damo22>apic_id is max 4 bits?
<damo22>so you can only have 16 ioapics
<AlmuHS>yes
<damo22>good
<AlmuHS>the Intel guides say this
<AlmuHS>APIC info is in chapter 10, in Intel guides
<AlmuHS>other heavy guide: https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf
<damo22>because destination field for redirection says 4 bits for apicid and top 4 bits for processor if its a logical destination
<AlmuHS>where?
<damo22>but we are going to use fixed destinations so no processor markings
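The redirection entry described here (fixed delivery to a physical destination, active low, level triggered) can be sketched in C as a 64-bit value. This is only an illustration based on the Intel 82093AA I/O APIC layout: the macro names and the ioapic_make_entry() helper are hypothetical and are not taken from the feat-ioapic branch.

```c
#include <stdint.h>

/* Bit layout of a 64-bit I/O APIC redirection table entry (Intel 82093AA).
 * All names below are illustrative, not from GNU Mach. */
#define IOAPIC_VECTOR_MASK      0xffULL        /* bits 0-7: interrupt vector */
#define IOAPIC_DELIVERY_FIXED   (0ULL << 8)    /* bits 8-10: delivery mode = fixed */
#define IOAPIC_DEST_PHYSICAL    (0ULL << 11)   /* bit 11: physical destination mode */
#define IOAPIC_ACTIVE_LOW       (1ULL << 13)   /* bit 13: pin polarity = active low */
#define IOAPIC_LEVEL_TRIGGERED  (1ULL << 15)   /* bit 15: trigger mode = level */
#define IOAPIC_MASKED           (1ULL << 16)   /* bit 16: interrupt masked */
#define IOAPIC_DEST_SHIFT       56             /* bits 56-63: destination field */

/* Build an entry that routes `vector` to the CPU with the given APIC ID. */
uint64_t ioapic_make_entry(uint8_t vector, uint8_t apic_id, int masked)
{
    uint64_t entry = vector & IOAPIC_VECTOR_MASK;

    entry |= IOAPIC_DELIVERY_FIXED | IOAPIC_DEST_PHYSICAL;
    entry |= IOAPIC_ACTIVE_LOW | IOAPIC_LEVEL_TRIGGERED;
    if (masked)
        entry |= IOAPIC_MASKED;

    /* In physical mode only the low 4 bits of the destination carry the APIC ID. */
    entry |= (uint64_t)(apic_id & 0x0f) << IOAPIC_DEST_SHIFT;
    return entry;
}
```

For example, ioapic_make_entry(0x30, 1, 0) gives 0x010000000000a030: vector 0x30, polarity and trigger bits set, destination APIC ID 1. On real hardware the value would still have to be written as two 32-bit halves through the I/O APIC's index and data registers.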
<AlmuHS> https://software.intel.com/sites/default/files/managed/a4/60/325384-sdm-vol-3abcd.pdf#G15.83096
<AlmuHS>the ioapic structures in apic.h are the defaults from the Mach 4 header, I preferred not to touch them
<AlmuHS>oh, but this is from Min_SMP https://github.com/AlmuHS/GNUMach_SMP/blob/smp/imps/apic.h#L43-L47
<AlmuHS>some of my code is from Min_SMP, an experimental toolbox to try SMP https://github.com/AlmuHS/Min_SMP/blob/master/include/ioapic.h
<damo22>i had to implement more structs
<damo22>for the redirection entries
<AlmuHS>ok, you can do it
<damo22>i mean routing entries
<AlmuHS>the default Mach IOAPIC struct is this: https://github.com/AlmuHS/GNUMach_SMP/blob/smp/imps/apic.h#L36-L40
<AlmuHS>If you need, no problem
<AlmuHS>you can add new struct to apic.h
<AlmuHS>I kept the old Mach struct because I don't understand very well how the IOAPIC structures work
<damo22>i can send you a paste of what i have
<damo22>it doesnt compile yet
<damo22>im assuming 1 ioapic
<AlmuHS>ok
<youpi>damo22: restore_flags is a popf, which contains the interrupt flag
<youpi>gnu_srs2: yes, thanks!
<damo22> https://paste.fedoraproject.org/paste/wCXBo29fdzJ7DZeuIs7X7A/raw im still working on it but its looking something like this
<AlmuHS>damo22: can you share your repo?
<damo22>ok i pushed to a feature branch
<damo22> https://github.com/zamaudio/GNUMach_SMP/tree/feat-ioapic
<damo22>its still broken
<damo22>im still replacing pic with ioapic stuff
<AlmuHS>ok
<damo22>also linux devices are removed
<damo22>because interrupt handling is a nightmare for that
<AlmuHS>ok, it can be patched later, but it's a good decision for now
<AlmuHS>pretty code. My code is a bit dirty ;)
<AlmuHS>youpi: which printf did you say last night I had to add? I lost my mobile IRC history
<youpi>(01:16:03) Samuel Thibault: AlmuHS: in thread_select, what is interesting is the myprocessor->runq.count > 0 case
<youpi>(01:16:29) Samuel Thibault: like I said: thread_setrun puts it on cpu0's run, thread_select finds it there
<youpi>(01:16:38) Samuel Thibault: cpu0's runq is myprocessor->runq
<youpi>(01:16:41) Samuel Thibault: not pset->runq
<AlmuHS>ok. Now I go to lunch. After this, I will try this
<damo22>youpi: why does the PIC mask get changed at all?
<damo22>is it rescheduling interrupts by masking them out at will?
<damo22> movl $(SPL0),EXT(curr_ipl) /* set ipl */
<damo22> movl EXT(pic_mask)+SPL0*4,%eax /* get PIC mask */
<damo22> SETMASK() /* program PICs with new mask */
<damo22>im confused why it adds SPL0*4 wont that always evaluate to adding zero?
<damo22>my problem is i dont understand what spl 0-7 are for
<damo22>they seem to be interrupt priorities
<damo22>but they are mostly zero
<damo22>i think this is legacy crap that can go away?
<youpi>damo22: changing the PIC mask is a way to mask interrupts rather than using cli. various spl levels have various PIC masks
<youpi>SPL0*4 will always be zero, yes
<youpi>that's expected for spl0
<youpi>putting it away may not be that simple
<youpi>using SPL0*4 is just a way to remind that pic_mask is an array indexed by spl levels
<damo22>in the ioapic you can mask interrupts as well
<damo22>but i checked all the spl levels, the old pic masks are not changing except 2 IRQs
<damo22>intpri[] is mostly zeroes
<damo22>doesnt that mean the pic masks are only changing for irq1 and irq13?
<youpi>yes
<youpi>(from what I remember)
<damo22>so i think i will eliminate all this garbage
<youpi>please don't call that "garbage" :)
<youpi>mit people have worked hard to get this working on old hardware which had way more limitations than nowadays'
<youpi>note however that spl7 is hardcoded to an all-mask
<damo22>it hardly runs in a vm now
<youpi>for other reasons
<youpi>look at kdstart() for instance
<youpi>these spl levels are not "just for nothing", really
<youpi>in kdstart it uses splsoftclock() to avoid the clock ticking
<youpi>without completely masking everything
<damo22>but with an apic timer won't all this be simpler?
<youpi>possibly
<youpi>I just mean that "throwing away" can't be done lightly
<damo22>sure
<youpi>possibly these spl levels can be squashed into just one "just disable interrupts" level because nowadays it's not that urgent to respond to an interrupt
<youpi>theoretically code running at a given spl level could expect to be able to wait for an interrupt of a high spl level
<youpi>I don't think that happens in practice
<youpi>perhaps you could first try to add all interrupts at SPL1
<youpi>to check that it still works
<damo22>so SPL0 means the interrupt is ignored?
<youpi>spl0 means all interrupts are unmasked
<youpi>and spl7 means all interrupts are masked
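The idea of a mask table indexed by spl level, and why the SPL0*4 offset is zero, can be sketched in plain C. The table below is a hypothetical stand-in, not the kernel's real pic_mask: only the two end points (spl0 masks nothing, spl7 masks everything) come from the discussion, the 0xffff value assumes 16 PIC lines, and levels 1 to 6 are left as zero placeholders.

```c
#include <stddef.h>
#include <stdint.h>

/* Only the two end points of the spl range are named in this sketch. */
enum { SPL0 = 0, SPL7 = 7, NSPL = 8 };

/*
 * Hypothetical stand-in for the pic_mask table: one 32-bit mask per spl
 * level, where a set bit masks that IRQ line. Levels 1-6 are placeholders.
 */
static const uint32_t pic_mask_sketch[NSPL] = {
    [SPL0] = 0x0000,   /* spl0: no interrupt masked */
    [SPL7] = 0xffff,   /* spl7: all 16 PIC lines masked (illustrative value) */
};

/* Byte offset of one entry: what "pic_mask + SPLn*4" computes in assembly. */
size_t pic_mask_offset(int spl)
{
    return (size_t)spl * sizeof pic_mask_sketch[0];
}

/* C equivalent of loading pic_mask[spl] into a register. */
uint32_t pic_mask_for(int spl)
{
    return pic_mask_sketch[spl];
}
```

With 4-byte entries, pic_mask_offset(SPL0) is 0 and pic_mask_offset(SPL7) is 28, so writing SPL0*4 in the assembly documents the indexing even though it adds nothing to the address.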
<damo22>so a system could run with all interrupts working at spl0
<damo22>and at spl7 nothing interrupts the cpu(s)
<damo22>i cant think anymore its been a big week
<damo22>will resume tomorrow
<AlmuHS>ok, I've added more traces
<AlmuHS> https://pasteboard.co/IEGTXEN.png
<AlmuHS>youpi:
<AlmuHS>excuse me. Today I have an unstable internet connection
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/353f49ce643e8aef5078a3f64f44f9a09c978852
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/063935e3c4ad78333e6baa241d4aeead914204f5
<youpi>AlmuHS: it would be useful to not only print the thread address, but also its task name: thread->task->name
<youpi>to check what actually gets running
<youpi>here we see alternation between two threads
<youpi>it'd be useful to know from what task
<youpi>also, to get more logs, you can pass -serial stdio to qemu, and console=com0 to gnumach
<AlmuHS>oh, ok. I was searching for a thread name
<AlmuHS> http://dpaste.com/1HDR1J5
<AlmuHS>It shows an error: qemu-system-i386: console=com0: drive with bus=0, unit=0 (index=0) exists
<AlmuHS>youpi:
<youpi>console=com0 is for gnumach, not qemu
<AlmuHS>how can I do It?
<youpi>configure it in grub
<youpi>/etc/default/grub's GRUB_CMDLINE_GNUMACH variable
<AlmuHS>oh, ok
<AlmuHS>GRUB_CMDLINE_GNUMACH=com0 ?
<youpi>you need the console= part
<AlmuHS>in the same file?
<youpi>on the same line
<youpi>like I said, the content of the parameter to be passed to gnumach is console=com0
<youpi>so you have to put console=com0 in the variable, not only part of it
<AlmuHS>console=com0
<AlmuHS>GRUB_CMDLINE_GNUMACH=console
<AlmuHS>?
<AlmuHS>oh, I understand
<AlmuHS>GRUB_CMDLINE_GNUMACH="console=com0"
<AlmuHS>?
<youpi>yes
<AlmuHS>ok
<youpi>I fail to understand how you could hope that anything could work
<AlmuHS>?
<youpi>anything _else_ I meant
<youpi>I said you have to pass console=com0 to gnumach
<youpi>so you have to pass only that
<youpi>no more no less
<AlmuHS>yes, but I didn't understand that It was a parameter
<AlmuHS>I thought that it was a configuration
<youpi>for grub it's not a parameter, it's part of the gnumach command line
<AlmuHS>yes
<youpi>thus my saying "pass console=com0 to gnumach"
<youpi>it's not a parameter
<youpi>it's just text passed to the kernel
<AlmuHS>yes, I know
<AlmuHS>how can I open the console?
<youpi>it's printed on stdio of qemu
<AlmuHS>really
<youpi>yes
<youpi>I forgot to mention: of course after modifying the grub configuration you need to run update-grub
<AlmuHS>yes, I know this
<AlmuHS>wait, I go to pastebin the output
<AlmuHS>oh, fuck, &> doesn't redirect the output to a file
<AlmuHS>all like this: https://pastebin.com/YVP3ZH2i
<AlmuHS>youpi: I paste a partial output (my terminal lost the start)
<youpi>as I said, getting the names would be useful
<youpi>redirecting to a file should be possible with proper shell redirection
<AlmuHS>wait, I'm trying to wgetpaste
<AlmuHS> https://drive.google.com/file/d/1hpB-4L7jLpXlzC_UiZnlbBGAkefpUWiU/view?usp=sharing
<AlmuHS>sorry, the file is too long to pastebin
<AlmuHS>now https://pastebin.com/niaR6YU4
<AlmuHS>youpi: here you can see the pastebin
<AlmuHS>I added the thread names, but they show up encoded https://github.com/AlmuHS/GNUMach_SMP/commit/f65c31b8b3dac2ab0d9cacc13b295b2d1e6a5caa
<AlmuHS_>my wifi is down
<youpi>AlmuHS: you could truncate the file before pastebining it
<AlmuHS_>yes, I did It
<youpi>AlmuHS_: it's a char*, you need to use %s
<AlmuHS_>oh, ok
<youpi>but also print the thread address
<youpi>I said "not only print the thread address, but also its task name"
<youpi>i.e. both
<youpi>really, make sure to read what we write
<youpi>details are important
<AlmuHS_>added. But I can't push yet, my WiFi is down
<AlmuHS_>I'm speaking from mobile
<youpi>really, you need to understand about the consequences of what you are doing: if you keep %d in the format, for sure it will keep printing a number
<AlmuHS>I've just recovered WiFi. I go to push
<AlmuHS> https://github.com/AlmuHS/GNUMach_SMP/commit/29a46989c0ec0adad54921293df1ed6c9e2177fd
<AlmuHS>ok, now I see
<AlmuHS>the thread is ext2fs
<AlmuHS>thread f5e3a938 with name ext2fs selected in cpu 0
<AlmuHS>youpi: the thread is ext2fs
<youpi>details matter, please pastebin the log
<AlmuHS>ok, wait 2 minutes
<AlmuHS>I need to improve pset prints, but I'll pastebin anyway
<AlmuHS>youpi: https://pastebin.com/CabNs4An
<AlmuHS>I forgot to add slot_num when I print "choose pset". I'm going to improve it
<youpi>also put a print in the else part of pset->runq.count == 0
<AlmuHS>ok
<AlmuHS>is there any way to identify the pset (e.g. by name)?
<youpi>no, but there is no need to
<youpi>there is only one pset for now, containing all processors
<AlmuHS>yes, It's true
<youpi>also, print th->sched_pri and th->state
<youpi>in thread_setrun, in the part with XXX Don't do this remotely to master
<youpi>try to remove the (processor != master_processor) && part
<youpi>it shouldn't be a problem nowadays
<AlmuHS>ok, I go to find It
<AlmuHS>remove the else block or only the check?
<youpi>only that part of the check
<youpi>(like I said. I don't see how else it could be understood if one really reads what I have written)
<AlmuHS>I understood that you wanted me to disable the block which starts with this condition
<AlmuHS>that's why I asked
<youpi>I don't see how you could understand that
<youpi>my "part" word only designated "(processor != master_processor) &&"
<youpi>i.e. only that needed to be removed
<youpi>if I wanted the whole block to be removed, I'd have said so
<AlmuHS>ok, XD
<youpi>see the patch I have just committed to master actually
<AlmuHS>this? http://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?id=84e19ba0671b6d2a1740f14bc033ea9bcdc188e1
<youpi>which makes me realize: aston() is not actually where you'd send an IPI. cause_ast_check is
<youpi>AlmuHS: that commit is in the hurd repo
<youpi>we are talking about the gnumach repo
<AlmuHS>oops
<AlmuHS> http://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/?id=c69f7f3b5cdc6cf9367507478578dc5c875c2b74
<youpi>that's the latest commit and it talks about what I mentioned, so yes, sure
<AlmuHS>I can resync from here
<AlmuHS>my master branch is a remote to upstream
<AlmuHS>done
<AlmuHS_>wifi down again
<AlmuHS>por fin! (at last!)
<AlmuHS>I go to push
<AlmuHS>compiling. (excuse spanish expression)
<youpi>like I mentioned above, also print th->sched_pri and th->state
<youpi>the fact that I mentioned something else after that doesn't mean that it is not useful any more
<AlmuHS>where?
<youpi>each time you get one, i.e. after the calls to choose_thread(), current_thread(), choose_pset_thread(), or dequeue_head()
<youpi>inside thread_select
<AlmuHS>ok
<youpi>also, you could use control-alt-d to get the kernel debugger (you build with --enable-kdb, right?), and type "show all threads" there to get a list of threads
<AlmuHS>I didn't compile with --enable-kdb (I think), but I can compile with this
<AlmuHS>editing
<AlmuHS>compiling
<AlmuHS>I have a compilation error. This call is not defined: cpu_interrupt_to_db
<AlmuHS>ddb/db_mp.c:155: cpu_interrupt_to_db(i);
<AlmuHS>It's a linker error, sorry
<AlmuHS>this error only appears when I try to compile with --enable-kdb
<AlmuHS>youpi: It seems I can't use kdb in my smp kernel. There is another unimplemented function, I think
<youpi>ah, that will be needed for proper debugging on the different processors indeed. For now you can just add to ./i386/i386/db_interface.c a function that does nothing
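A do-nothing placeholder of the kind suggested here could look like the sketch below. The signature is an assumption inferred from the call cpu_interrupt_to_db(i) in ddb/db_mp.c; the real prototype in the tree may differ.

```c
/*
 * Placeholder so that the kernel links with the debugger enabled.
 * A real implementation would interrupt the given CPU so that it enters
 * the debugger. The int parameter is assumed from the call site.
 */
void cpu_interrupt_to_db(int cpu)
{
    (void)cpu;  /* intentionally does nothing for now */
}
```

This only satisfies the linker: with such a stub the debugger would not be able to stop the other processors.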
<AlmuHS>ok
<AlmuHS>compiling
<AlmuHS>solved
<AlmuHS>kernel trap :(
<AlmuHS> http://dpaste.com/2Q9M0DP
<AlmuHS>youpi:
<youpi>that's when you press control-alt-d ?
<youpi>you can use gdb to know where that is
<youpi>gdb gnumach
<youpi>l * 0xc1021b76
<AlmuHS>ok
<AlmuHS>It's without kdb
<AlmuHS>I thought that it was a bad trace, but I commented out the risky trace and the dump appears again
<youpi>AlmuHS: perhaps thread->task is not always set, so use thread->task ? thread->task->name : "no name"
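The trace being asked for (thread address plus task name, priority and state, with a guard for a missing task) can be sketched in user-space C. The structs are simplified, hypothetical stand-ins for the kernel's thread and task types, and the sketch uses the standard snprintf; gnumach's own printf may accept a different set of conversions, so the format string should be checked before reuse.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct task_sketch {
    const char *name;
};

struct thread_sketch {
    struct task_sketch *task;   /* may be NULL */
    int sched_pri;
    int state;
};

/* Return the task name, or a fixed string when the thread has no task. */
const char *thread_task_name(const struct thread_sketch *th)
{
    return th->task ? th->task->name : "no name";
}

/* One conversion per argument: %p for the address, %s for the name,
 * %d for the integers. */
int format_thread_trace(char *buf, size_t len,
                        const struct thread_sketch *th, int cpu)
{
    return snprintf(buf, len,
                    "thread %p with name %s, priority %d, state %d on cpu %d",
                    (const void *)th, thread_task_name(th),
                    th->sched_pri, th->state, cpu);
}
```

Keeping the number and type of arguments in step with the conversions in the format string matters here, since a mismatch makes printf read whatever happens to follow on the stack.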
<AlmuHS>ok, I'll try It
<AlmuHS>dump continues: http://dpaste.com/02A5X8P
<youpi>you can use gdb to get the precise line number
<youpi>gdb gnumach
<AlmuHS>yes, I'm working on it
<youpi>l * ( thread_select + 0x205 )
<youpi>(the first line of the backtrace tells these)
<AlmuHS>why this?
<AlmuHS>oh, ok
<AlmuHS>0xc1030155 is in thread_select (../kern/sched_prim.c:593).
<AlmuHS>588 simple_unlock(&pset->runq.lock);
<AlmuHS>589 }
<AlmuHS>590 }
<AlmuHS>591
<AlmuHS>592 #if MACH_FIXPRI
<AlmuHS>593 if (thread->policy == POLICY_TIMESHARE) {
<AlmuHS>594 #endif /* MACH_FIXPRI */
<AlmuHS>595 myprocessor->quantum = pset->set_quantum;
<AlmuHS>596 #if MACH_FIXPRI
<AlmuHS>597 }
<AlmuHS>your line returns this
<youpi>so most probably it's the thread variable which has become bogus
<youpi>what I don't understand is: you only added prints?
<AlmuHS>yes
<youpi>perhaps try to come back in the commits
<AlmuHS>ok
<youpi>to determine what exactly triggered the issue
<youpi>possibly breaking down the commit in separate lines
<AlmuHS>oh, I go to do this
<youpi>(again, putting prints in the else part of if (pset->runq.count == 0) in thread_select would allow to make sure where that "thread" value comes from)
<AlmuHS>ok
<AlmuHS>but now I'm doing checkout in old commits, to find the problem
<AlmuHS>to find the origin of the dump
<AlmuHS>youpi: the problem seems to happen after the resync with upstream
<AlmuHS>here https://github.com/AlmuHS/GNUMach_SMP/commit/3960163dd83452fd572b9834f99819322fbb1fa9
<AlmuHS>I'm rechecking
<AlmuHS>nope, it's in the one before this
<AlmuHS>youpi: the dump starts in this commit https://github.com/AlmuHS/GNUMach_SMP/commit/269bb89e1db1866c1bf70f1c5b0591f23c5e5322
<AlmuHS>the problem is in one of the prints
<AlmuHS> printf("current thread is %d with name %s , priority %d and state %d\n", thread, thread->task ? thread->task->name : "no name", thread->state);
<AlmuHS>this print seems to cause the dump
***Glider_IRC_ is now known as Glider_IRC
<AlmuHS>nope, this print is not the problem
<AlmuHS>youpi: I solved the dump. It was a lapse https://github.com/AlmuHS/GNUMach_SMP/commit/b31613161ab3b3b64179bcccf33fd0c7b37c0691
<AlmuHS> https://pastebin.com/EhS1whr3
***Out`Of`Control is now known as Viper
<gnu_srs2>AlmuHS: Nice log. Maybe you still have problems with the format of the print statements, negative values just seem wrong?
<gnu_srs2>e.g. pset -1054665344 unlocked
<AlmuHS>probably these values are not set at that point
<AlmuHS>in the pset case, the pset has no ID
<AlmuHS>but really, I can print these with %x
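The effect of switching from %d to %x can be checked with a small C sketch. The value quoted above, -1054665344, has the same 32 bits as 0xc1231580, which would be consistent with a pointer having been printed with %d. The helper names are illustrative only.

```c
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 32-bit value as the unsigned bit pattern behind it. */
uint32_t as_unsigned_bits(int32_t value)
{
    return (uint32_t)value;
}

/* Format the same value in hexadecimal, as a %x trace would show it. */
int format_hex(char *buf, size_t len, int32_t value)
{
    return snprintf(buf, len, "0x%08x", (unsigned int)as_unsigned_bits(value));
}
```

For instance, as_unsigned_bits(-1054665344) returns 0xc1231580.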
<AlmuHS>new log: http://paste.debian.net/1112993/