IRC channel logs

2022-10-22.log

<damo22>youpi: so my problem now is that when the 251 interrupt happens on the AP for TLB shootdown it seems %esp gets set to 0:
<damo22>CPU#1
<damo22>EAX=00000000 EBX=00000000 ECX=00000002 EDX=f4846f80
<damo22>ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
<damo22>EIP=ca17be42 EFL=00010206 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
<damo22>and the instruction pointer gets messed up
<damo22>but i think i fixed locore.S to handle multiple int stacks
<damo22>maybe i just need to wait a little bit before BSP launches first actual task
<damo22>youpi: i think im getting an interrupt during an interrupt
<damo22>and i get a general protection fault
<damo22>youpi: when i run the kernel with no userspace, it runs to the end but the AP gets eip=0
<damo22>how do i ensure there is not a null thread to execute
<damo22>or that it should never switch to a null thread
<damo22>youpi: if there is contention on a memory address by multiple cpus, do you need to use some special lock xchg instruction to read it?
<damo22>db{0}> trace/tu
<damo22>apic_get_current_cpu(0,0,bffffecf,bf,82)
<damo22>all_intrs(0,0,0,0,0)+0xf
<damo22>>>>>> user space <<<<<
<damo22>0x0()
<damo22>db{0}>
<damo22>Pellescours: im convinced the handoff of the first AP thread is getting some null value
<damo22>and crashes the cpu
<damo22>kernel: Page fault (14), code=0
<damo22>Stopped at stack_handoff+0x23: movl 0(%ecx),%eax
<damo22>stack_handoff(f4825e78,f4843930,0,c100ada8,286)+0x23
<damo22>Bad frame pointer: 0xf4839fc8
<Pellescours>a problem in the task itself or in the i386 specific code ?
<damo22>most likely i386
<damo22>maybe its an interrupt happening during switching
<damo22>and corrupts the stacks
<Pellescours>solution: put cli everywhere, haha
<damo22>im not sure if the lapic timer is masked by cli
<youpi>damo22: the AP need to switch to their own idle thread
<youpi>damo22: the contention on memory is not the problem. coherency is
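[Note on the `lock xchg` question above: on x86 a plain aligned 32-bit load is already atomic, so the lock prefix is not needed just to read a shared word. It matters for read-modify-write operations, and `xchg` with a memory operand is implicitly locked. A minimal sketch of a test-and-set spinlock using C11 atomics follows. The `demo_spin_*` names are hypothetical and are not gnumach's own lock primitives.]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type for illustration only. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

/* Put the lock into the unlocked state. */
static inline void demo_spin_init(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Atomically set the flag and return true if it was previously clear.
 * On x86 compilers typically emit an xchg for this operation. */
static inline bool demo_spin_try_lock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is acquired. */
static inline void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_try_lock(l))
        ; /* busy-wait */
}

/* Release the lock; the release ordering publishes earlier writes. */
static inline void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

[The acquire/release orderings are what make data written under the lock by one CPU visible to the next CPU that takes it.]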
<damo22>youpi: i can see the idle_threads are being created for all cpus, but is it bad to thread_resume them before the cpu is up?
<youpi>I don't know what "is up" means here
<youpi>but I don't see how "bad" it could be, they're there to be used
<damo22>meaning, if the cpu is not set to machine_slot[cpu].running = TRUE yet
<damo22>and cannot yet read memory correctly
<damo22>it seems the start_other_cpus is called very late
<damo22>inside a kernel thread on the BSP
<youpi>depends what the rest of the kernel does with that running flag
<youpi>the AP need to read memory correctly before switching to their idle thread, sure
<youpi>but I don't really understand what you're at
<youpi>AP just need to initialize themselves, and when they're ready, switch to their idle thread
<damo22>bsp has to send init ipi and sipi to wake up the APs
<damo22>then they start running themselves
<damo22>they eventually all call cpu_launch_first_thread()
<youpi>ah ok it's choose_thread which will happen to choose the idle thread of that cpu
<youpi>(unless there's some thread to run)
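[Note on the INIT IPI and SIPI wake-up described above: the BSP sends both through the local APIC interrupt command register (ICR). A hedged sketch of how the two ICR halves are typically composed follows. The `demo_*` names and the trampoline address are assumptions for illustration and are not taken from gnumach. The actual register writes and the delays between IPIs are left out.]

```c
#include <stdint.h>

/* ICR low word fields (x86 local APIC layout). */
#define DEMO_ICR_DELIVERY_INIT    (5u << 8)   /* delivery mode 101b: INIT */
#define DEMO_ICR_DELIVERY_STARTUP (6u << 8)   /* delivery mode 110b: Start-up */
#define DEMO_ICR_LEVEL_ASSERT     (1u << 14)  /* level: assert */

/* ICR high word: destination APIC ID in bits 24..31 (xAPIC mode). */
static inline uint32_t demo_icr_high(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* ICR low word for an INIT IPI. */
static inline uint32_t demo_icr_low_init(void)
{
    return DEMO_ICR_DELIVERY_INIT | DEMO_ICR_LEVEL_ASSERT;
}

/* ICR low word for a Start-up IPI. The vector field carries the 4 KiB
 * page number of the real-mode trampoline, so the trampoline has to be
 * page-aligned and below 1 MiB. */
static inline uint32_t demo_icr_low_sipi(uint32_t trampoline_phys)
{
    return DEMO_ICR_DELIVERY_STARTUP | DEMO_ICR_LEVEL_ASSERT |
           ((trampoline_phys >> 12) & 0xffu);
}
```

[For a hypothetical trampoline at physical address 0x7000, `demo_icr_low_sipi(0x7000)` gives 0x4607. The high word is written first, because writing the low word is what sends the IPI.]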
<damo22>i ran it with -smp 1 and it reached rumpdisk, but got general protection fault at all_intrs+0xf apic_get_current_cpu... i think its because it received a timer interrupt during a timer interrupt
<damo22>do i need to mask the lapic timer instead of calling cli
<damo22>i think the timer interrupts are happening too fast
<damo22>the handling of interrupts cant keep up
<Guest67>damo22: what do Darwin and Linux do in the timer interrupts to avoid that? Can you just program the timer to be much slower?
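[Note on masking the LAPIC timer versus `cli`, raised above: as far as the x86 architecture goes, `cli` clears the IF flag and so holds off all maskable interrupts on that CPU, including a LAPIC timer interrupt using fixed delivery. The mask bit in the LVT timer register is a separate, per-source switch. A minimal sketch of manipulating that bit on a register value follows. The `demo_*` names and the 0xEF vector are hypothetical, and the memory-mapped register access itself is not shown.]

```c
#include <stdbool.h>
#include <stdint.h>

/* LVT timer register fields (x86 local APIC layout). */
#define DEMO_LVT_VECTOR_MASK    0xffu       /* bits 0..7: interrupt vector */
#define DEMO_LVT_MASKED         (1u << 16)  /* bit 16: interrupt masked */
#define DEMO_LVT_TIMER_PERIODIC (1u << 17)  /* bits 17..18 = 01b: periodic */

/* Return the LVT value with the timer interrupt masked. */
static inline uint32_t demo_lvt_timer_mask(uint32_t lvt)
{
    return lvt | DEMO_LVT_MASKED;
}

/* Return the LVT value with the timer interrupt unmasked. */
static inline uint32_t demo_lvt_timer_unmask(uint32_t lvt)
{
    return lvt & ~DEMO_LVT_MASKED;
}

/* Report whether the mask bit is set in the given LVT value. */
static inline bool demo_lvt_timer_is_masked(uint32_t lvt)
{
    return (lvt & DEMO_LVT_MASKED) != 0;
}
```

[Masking only this source leaves other interrupts deliverable, whereas `cli` holds off every maskable interrupt until `sti` or an `iret` that restores IF.]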