IRC channel logs
2024-02-05.log
<damo22>so it makes sense that the INIT SIPI sequence causes the AP to load code from the reset vector, unless you change the RTC cmos setting to jump through an alt address
<damo22>that would explain why my code seems to reboot
<damo22>because i basically kexec'd coreboot on the AP
<damo22>but when i change the CMOS setting, the ESR fails with error 4 or c
<damo22>i also updated the warm reset vector at 0x467
<damo22>linux$ git grep TRAMPOLINE_PHYS_HIGH
<damo22>arch/x86/include/asm/apic.h:#define TRAMPOLINE_PHYS_HIGH 0x469
<damo22>arch/x86/kernel/smpboot.c: *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) = start_eip >> 4;
<damo22>linux$ git grep TRAMPOLINE_PHYS_LOW
<damo22>arch/x86/include/asm/apic.h:#define TRAMPOLINE_PHYS_LOW 0x467
<damo22>arch/x86/kernel/smpboot.c: *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = start_eip & 0xf;
<damo22>i took out the printfs in the critical path and got:
<damo22>hmmm startup sent illegal vector
<damo22>it seems that INIT DEASSERT is not supported with LEVEL triggered on my board.... it says so in the BKDG. that is also the old way before SIPI was supported
<damo22>woot ext2fs: part:2:device:sd0: No such device or address on AMD
<damo22>i think that happened with that recent patch for fpu
<damo22>Pellescours: that log above is the culmination of a weeks work, debugging all that on hardware :)
<youpi>(I wouldn't be surprised if some bugs are left in the fpu-on-smp case)
<damo22>in master we are doing bad things with APIC
<damo22>i will clean this up and mail in fixes
<damo22>hmm i should also test smp on qemu
<damo22> ../kern/slab.c:966: kmem_cache_alloc_from_slab: Assertion `bufctl != NULL' failed.panic {cpu1}
<damo22>i get that error from qemu with smp 2
<AlmuHS>damo22: the ICR struct includes some RO fields, because it must include all register fields. But the IPI function only ask the write fields
<damo22>AlmuHS: yes, but you cannot write a 0 to some read only field
<damo22>actually the code is uninitialised
<damo22>so it could put anything in there
<damo22>the best way is to read the current value and write it with the same value when you write it
<damo22>also, we forgot to use ".r" on the register write
<damo22>so it clobbered most of the block
<damo22>it refers to the first 32 bits of the icr only
<AlmuHS>i remember that i added this as a simple padding
<damo22>yes but the padding is mapped to hardware registers
<damo22>so if you write the whole block, you set all the registers i think
<AlmuHS>you could check this reading ICR in GDB
<AlmuHS>reading the ICR status after IPI
<damo22>it could explain why the IPIs were lagging or doing something strange in smp
<AlmuHS>in my T410, when i test SMP last year, the machine reboots after SIPI
<AlmuHS>but t410 is 64-bit, so it could be using x2APIC by default
<AlmuHS>yeah, but i tried this machine so long months ago
<AlmuHS>if you modify icr struct, be careful with longs and padding
<AlmuHS>it must be exactly than specs, including reserved spaces
<AlmuHS>it's not necessary including an array for padding, this topic is solved in the struct with the manual padding (the fields without name)
<damo22>i did not modify the struct, i just added a name to a missing field and set it
<AlmuHS_>in start_other_cpus(), why do you disable lapic?
<AlmuHS_>you need lapic to send IPI. it's not?
<damo22>not quite, the INIT and SIPI can still work with IOAPIC interrupts disabled
<damo22>LAPIC_ENABLE flag of lapic is kind of equivalent to masking all the ioapic individually
<damo22>so i made a separate function just to toggle that bit
<AlmuHS_>ok, the name function is not clear about that
<damo22>its good to turn it off because it prevents ioapic interrupts from getting in the way of smp startup
<damo22>so then i turn it on after the APs are up
<youpi>damo22: it would be very useful to put that in comments
<youpi>because I have seen this kind of function calls getting moved here and there according to the specific issue each and everyone was having, without taking into account all situations
<youpi>so we really need comments to know *why* some call is done here and not there
<AlmuHS_>btw, could you send me the patch in which you modified the scheduler to remove the cpu pause? The patch will prevent boot stucks
<youpi>otherwise you'll probably see some random guy some years later who will move them again and break things
<AlmuHS_>this issue is not solved yet in upstream, so i continue applying this patch to get boot successfully
<damo22>AlmuHS_: but perhaps try master + my patch series from the ML
<damo22>you might not need this extra patch anymore
<AlmuHS_>yes, i am pulling master after your repository
<gnu_srs>youpi: dhcpcd use a lot of entries and structs from linux/netlink.h and linux/rtnetlink.h. They are not defined in the header files in /usr/include.
<gnu_srs>They are available and used in libdde and pfinet though. Can I use the definitions in these files? In which additional library are they available, or not?
<AlmuHS_>but i want to check this patch to understand the problem
<youpi>gnu_srs: we don't expose a netlink interface, so the definitions will most probably not be useful
<gnu_srs>Used very much are: RTM_NEWADDR and RTM_DELADDR
<youpi>that can be replaced with ioctls
<damo22>AlmuHS_: i have a branch with master plus my patches, its zam/fix-ioapic
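
The ICR fix damo22 describes in this log (read the current value, change only the fields software is meant to set, and store through the 32-bit register member instead of the whole padded block) can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the gnumach code: `struct apic_reg`, `icr_low_write` and the `ICR_*` macros are hypothetical names, the bit positions follow the xAPIC ICR low dword layout, and the caller passes a pointer to ordinary memory so the sketch can be exercised in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register slot: xAPIC registers are 32 bits wide and sit on
 * 16-byte boundaries, so each slot is one register plus 12 bytes of padding. */
struct apic_reg {
    volatile uint32_t r;      /* the 32-bit register itself */
    volatile uint32_t pad[3]; /* padding up to the next register slot */
};

/* ICR low dword fields that software is meant to set (xAPIC layout). */
#define ICR_VECTOR_MASK      0x000000ffu /* bits 0-7 */
#define ICR_DELIVERY_MASK    0x00000700u /* bits 8-10 */
#define ICR_DEST_MODE        0x00000800u /* bit 11 */
#define ICR_LEVEL_ASSERT     0x00004000u /* bit 14 */
#define ICR_TRIGGER_LEVEL    0x00008000u /* bit 15 */
#define ICR_SHORTHAND_MASK   0x000c0000u /* bits 18-19 */

/* Two delivery modes used during AP startup. */
#define ICR_DELIVERY_INIT    0x00000500u
#define ICR_DELIVERY_STARTUP 0x00000600u

/* Delivery status is read-only; software should not try to set it. */
#define ICR_DELIVERY_STATUS  0x00001000u /* bit 12 */

#define ICR_WRITABLE_MASK (ICR_VECTOR_MASK | ICR_DELIVERY_MASK | \
                           ICR_DEST_MODE | ICR_LEVEL_ASSERT |    \
                           ICR_TRIGGER_LEVEL | ICR_SHORTHAND_MASK)

/* Read-modify-write of the ICR low dword.
 * Read-only and reserved bits are written back with the value just read,
 * and the store goes to .r only, so the padding is never touched. */
static void icr_low_write(struct apic_reg *icr_low, uint32_t fields)
{
    assert(icr_low != NULL);

    uint32_t v = icr_low->r;         /* read the current value */
    v &= ~ICR_WRITABLE_MASK;         /* keep read-only and reserved bits */
    v |= fields & ICR_WRITABLE_MASK; /* merge in the requested fields */
    icr_low->r = v;                  /* single 32-bit store to the register */
}
```

For example, with a fake register whose read-only delivery status bit is set (`r == 0x00001000`), calling `icr_low_write(&fake, ICR_DELIVERY_STARTUP | 0x08)` leaves `r == 0x00001608` and the three padding words unchanged. Assigning a whole `struct apic_reg` instead would also store to the padding, which is the clobbering discussed above.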