***Server sets mode: +nt
<youpi>AlmuHS: as I mentioned, entering the debugger to print the list of threads would be insightful
<youpi>if you tend to forget, take the habit of taking notes
<AlmuHS>couldn't I check the cpus' execution with gdb?
<AlmuHS>sometimes, I use gdb to see what thread is executing on each cpu
<AlmuHS>mmm... but with gdb I need to know where to put the breakpoint
<AlmuHS>I'm going to sleep. I'll continue tomorrow
<youpi>from your information, I'm thinking that perhaps gnumach is stuck between only 2 of the 3 threads of ext2fs
<youpi>that for some reason the scheduler in the bound_processor case doesn't actually properly share cpu time evenly between threads
<youpi>and thus the third thread of ext2fs can't progress, thus keeping the whole boot stuck
<youpi>I'll read the thread_quantum_update() function, it seems to behave differently for bound_processor queues and pset queues
<AlmuHS>ok. I'm thinking about the cpu 1 configuration too
<AlmuHS>I have many doubts about whether the APs' stack is assigned properly
<AlmuHS>I was talking with jrtc27 about this
<damo22>btw, the lapic is addressed at the same location on all CPUs
<youpi>damo22: when we bind threads on cpu0, the system doesn't boot
<damo22>but you have to know which cpu is accessing the lapic
<youpi>(cpu1 never takes threads yet anyway)
<youpi>damo22: "at the same location": you mean that the cpu itself catches memory accesses where its lapic is?
<AlmuHS>damo22: yes, because the lapic pointer is common to all cpus.
<AlmuHS>The lapic pointer shows the local apic of the processor which is running the thread
<damo22>the lapic has a common address on all cpus
<youpi>no, things go well if we don't bind threads (and they get executed on cpu0 only since cpu1 doesn't take them)
<AlmuHS>damo22: I've just explained that
<AlmuHS>when you access the lapic pointer, you can see the Local APIC of the processor which is running your thread
<AlmuHS>each processor has its own lapic, but the pointer is common, for this reason
<damo22>it will highly simplify the interrupt handling
<AlmuHS>I don't understand interrupts, sorry :(
<damo22>all i know is what i read in the code
<damo22>i think it's valid to call "apic_local_unit.someregister.r = value" for setting lapic regs
<youpi>it looks good, provided that stack_ptr is a variable containing the address of the stack to be used
<youpi>that should be working for now
<damo22>to flag end of an interrupt, is that done in the lapic?
<AlmuHS>in lapic, I haven't configured anything
<damo22>how does setting "apic_local_unit.eoi.r = 0" know which interrupt it is ending?
<AlmuHS>I only used the ICR register to send IPIs
<damo22>does the lapic only handle one interrupt at a time?
<AlmuHS>read the Intel guides, chapter 10, which explains how the APIC works
<AlmuHS>we are working with xAPIC, assuming Pentium 4 or later. Ignore x2APIC or Pentium III configurations
<damo22>it says there are two ways to enable/disable lapic
<damo22>but the first way cannot be reenabled
<damo22>it's not enabled, you need to set the spurious vector to 0x1ff
<AlmuHS>you can print the spurious vector after reserving the lapic
<AlmuHS>you must ignore the whole MP Tables chapter
<AlmuHS>but the rest of the chapters have useful info
<AlmuHS>you can read it as an overview, and then review with the Intel guide
<damo22>i want to run my ideas past someone
<youpi>AlmuHS: one doubtful line is myprocessor->quantum = min_quantum; in thread_select()
<AlmuHS>youpi: I remember this line. Hasn't put any print there?
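On the two LAPIC writes discussed above (the spurious vector value 0x1ff and "apic_local_unit.eoi.r = 0"), a minimal sketch follows. It assumes a reduced, hypothetical model of the xAPIC register page, with names (lapic_regs_t, lapic_enable, lapic_eoi) made up for illustration; it is not gnumach's real apic_local_unit layout. The value 0x1ff is the software-enable bit (bit 8) combined with vector 0xFF. As for which interrupt an EOI ends: the write carries no vector, the local APIC itself tracks in-service interrupts and clears the highest-priority one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of the xAPIC register page. Each register is
 * 32 bits wide and aligned on 16 bytes. This is NOT gnumach's real
 * apic_local_unit definition, only enough of it to show the two writes. */
typedef struct {
    volatile uint32_t r;
    uint32_t pad[3];
} lapic_reg_t;

typedef struct {
    lapic_reg_t reserved0[11]; /* 0x000 - 0x0AF */
    lapic_reg_t eoi;           /* 0x0B0: end-of-interrupt register */
    lapic_reg_t reserved1[3];  /* 0x0C0 - 0x0EF */
    lapic_reg_t spurious;      /* 0x0F0: spurious interrupt vector register */
} lapic_regs_t;

/* Check the model against the architectural offsets at compile time. */
static_assert(offsetof(lapic_regs_t, eoi) == 0x0B0, "EOI offset");
static_assert(offsetof(lapic_regs_t, spurious) == 0x0F0, "SVR offset");

#define LAPIC_SVR_SW_ENABLE 0x100u /* bit 8: APIC software enable */
#define LAPIC_SVR_VECTOR    0x0FFu /* spurious vector number 0xFF */

/* Software-enable the local APIC: 0x100 | 0xFF == 0x1FF. */
static inline void lapic_enable(lapic_regs_t *lapic)
{
    lapic->spurious.r = LAPIC_SVR_SW_ENABLE | LAPIC_SVR_VECTOR;
}

/* Signal end of interrupt. The written value is 0; the LAPIC decides
 * which in-service interrupt is completed, not the caller. */
static inline void lapic_eoi(lapic_regs_t *lapic)
{
    lapic->eoi.r = 0;
}
```

Because the same address reaches the local APIC of whichever processor executes the access, a single pointer to such a structure can be shared by all cpus, which is the point AlmuHS and damo22 make above.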
<youpi>I'm saying that perhaps this line should be removed
<youpi>because it refills the processor's quantum each time it switches between threads bound to it
<youpi>while it should be the hardclock which refills it
<damo22>1. Set ISA interrupts active high edge triggered
<damo22>2. Set PCI interrupts active low level triggered
<damo22>3. Enable the lapic spurious vector to start receiving interrupts
<damo22>Create a mask function that masks/unmasks interrupts
<damo22>Create an EOI function that sends end of interrupt
<AlmuHS>youpi: now I remember that the cpu frequency is not well calculated. Some time ago, I saw a calculation of this, and it was very dirty
<AlmuHS>but I don't remember where I saw it
<youpi>that depends what the calculation is used for
<youpi>bogomips, for instance, are completely suited to their own use
<AlmuHS>the delay function is so so dirty
<AlmuHS>#define DELAY(n) { volatile int N = cpuspeed * (n); while (--N > 0); }
<youpi>the implementation actually is correct
<youpi>it's establishing the value of "cpuspeed" which is not
<youpi>but again it's what I mentioned above: bogomips
<AlmuHS>but cpuspeed might not be a fixed value, I think
<youpi>sure, that's what the comment says
<AlmuHS>maybe, when the APIC timer is configured, we can use it to define a real delay
<damo22>how many clock cycles does it take to decrement a register?
<AlmuHS>damo22: it depends on the architecture, I think
<damo22>with its own cpu op code to read it
<damo22>not ideal for small delays most likely
<AlmuHS>the cpu has a clock, obviously. It's a synchronous system
<youpi>yes, rdtsc is the instruction to read the tsc (time stamp counter)
<youpi>which counts in terms of the processor's smallest unit of time
<youpi>that indeed allows to have very precise time measurement
<AlmuHS>the timer is needed to calculate the quantum, isn't it?
<youpi>... provided that you know precisely the cpu speed
<youpi>the pit is needed for that yes
<damo22>the TSC just counts number of cycles?
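The point made above is that DELAY() itself is fine and that the weak part is the "cpuspeed" factor, and likewise that a TSC reading only becomes a time once the clock rate is known. A small sketch of both conversions follows. The function names are hypothetical, and the measurements (loop count, elapsed microseconds, TSC rate) are assumed to come from some reference clock such as the PIT; nothing here is taken from gnumach.

```c
#include <assert.h>
#include <stdint.h>

/* Loop iterations per microsecond, in the spirit of the "cpuspeed" factor
 * used by DELAY(). Both arguments are measurements supplied by the caller:
 * how many iterations ran, and how long that took on a reference clock. */
static inline uint32_t estimate_cpuspeed(uint64_t loops, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);
    uint64_t per_us = loops / elapsed_us;
    if (per_us > UINT32_MAX)
        per_us = UINT32_MAX;
    /* Never return 0, otherwise DELAY() would degenerate into a no-op. */
    return per_us > 0 ? (uint32_t)per_us : 1;
}

/* Convert a TSC cycle count to nanoseconds for a known, fixed TSC rate.
 * Whole seconds and the remainder are handled separately so the product
 * cannot overflow as long as tsc_hz stays below about 18 GHz. */
static inline uint64_t tsc_cycles_to_ns(uint64_t cycles, uint64_t tsc_hz)
{
    assert(tsc_hz > 0);
    uint64_t secs = cycles / tsc_hz;
    uint64_t rem = cycles % tsc_hz;
    return secs * 1000000000ull + rem * 1000000000ull / tsc_hz;
}
```

Both helpers take the rate as an input on purpose: if the cpu frequency changes at run time, the estimate has to be refreshed, which is the concern raised in the conversation.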
<damo22>so you still need to know the speed of the clock
<AlmuHS>but damo22 asked about the cpu pit, I think
<youpi>the cpu doesn't have a timer
<youpi>you still need to know when it changes
<youpi>sometimes it's even the motherboard which changes it
<damo22>yes but the memory controller will have to change too
<damo22>so the register in the chipset will be accurate
<youpi>damo22: yes but you need to know _when_ it changes
<youpi>to accommodate e.g. too high temperature
<damo22>maybe APIC timer is calibrated to this already
<AlmuHS>apic has a LVT with some interrupts, such as temperature
<youpi>so it's not disturbed by such changes
<AlmuHS>lapic structure, taken from the Intel guide
<AlmuHS>the lapic table is too long to screenshot ;)
<youpi>AlmuHS: did you try to comment out the line I mentioned?
<youpi>in his workflow he needs to push before testing
<AlmuHS>yes, I use git to transfer to the VM
<AlmuHS>I can share with my contributors too
<youpi>I'm wondering though why not build gnumach outside the VM and just transfer the built kernel
<damo22>cross build is difficult you need a cross-hurd toolchain
<youpi>for a kernel you don't need a cross toolchain
<youpi>here you just need a 32bit environment
<youpi>./configure --host=i686-gnu CC='gcc -m32' LD='ld -melf_i386'
<AlmuHS>I do this to test compilation.
<AlmuHS>But, for the definitive compilation, I preferred to compile inside Hurd, to avoid architecture compilation problems
<AlmuHS>for example, the damo22 patches only compile inside the Hurd VM
<youpi>it's mentioned on the hurd wiki, in the "building" page of gnumach
<damo22>AlmuHS: they are debian patches not mine
<AlmuHS>when I tried to compile the code with damo22 patches from Debian GNU/Linux, it didn't generate a required library
<youpi>really, gnumach should be compilable on any GNU-ish system
<youpi>I do this all the time when hacking gnumach
<AlmuHS>I can compile my gnumach from Debian GNU/Linux
<damo22>maybe the include paths are different
<youpi>there is no include path involved
<AlmuHS>but, adding the debian patches, it doesn't generate this library
<youpi>since it's a freestanding build
<AlmuHS>It doesn't exist as a file, it's automatically generated during compilation
<damo22>do you need mig to compile gnumach?
<damo22>AlmuHS: maybe you are missing mig
<AlmuHS>I have installed it in Debian GNU/Linux
<AlmuHS>if not, I couldn't compile anything from gnumach
<AlmuHS>I couldn't compile upstream gnumach if I didn't have mig
<damo22>maybe check the build log to see if there was a failure
<AlmuHS>mig/testing,now 1.8-5 i386 [installed]
<AlmuHS>this file, inside Debian GNU/Hurd, is generated without any problem
<damo22>is Makefile.in supposed to be committed?
<damo22>i thought it is generated from Makefile.am
<AlmuHS>Makefrag.am:565: kern/experimental.server.h \
<youpi>AlmuHS: did you try to comment out these prints? (possibly it's actually booting but you cannot see that)
<youpi>(or it's getting too slow due to the prints)
<AlmuHS>ok, I'm going to disable all these prints
<damo22>AlmuHS: i think you need to run "autoreconf -fi" in the source tree before you start
<AlmuHS>is the way to enable smp support
<damo22>but NCPUS =2 does not work for 4 cores?
<AlmuHS>originally, the number of cores would be set in mach_ncpus, which defines NCPUS
<AlmuHS>changing all of this was too complex
<AlmuHS>so I preferred that, if the user sets mach_ncpus > 1, NCPUS is overridden to 255
<AlmuHS>the maximum number of cores allowed in xAPIC
<AlmuHS>this way, furthermore, we can enable or disable smp easily
<AlmuHS>youpi: after commenting out the prints, gnumach still freezes after ext2fs
<AlmuHS>I'm exhausted, now I really need to go to sleep
<AlmuHS>I keep my IRC online on my mobile, to avoid losing messages
<damo22>i made a shit ton of changes to gnumach and it compiled almost first shot! what the hell
<damo22>ahh failed to link, two missing symbols
<damo22>can gnumach access physical addresses at 0xfec00000?
<damo22>or does it need to be vm mapped?
<damo22>i think it needs to be vm mapped like the lapic
<damo22>lol i forgot to call the main routine i wrote
<youpi>damo22: due to the address shifting, it needs to be vm mapped indeed
<damo22>youpi: i have a kernel with smp + apic + linux devices, but it hangs in linux_init
<youpi>do you need linux' device drivers?
<damo22>i just fixed something i am recompiling now
<damo22>netdde is not disconnected from linux irqs
<youpi>I hadn't realized that, we need to fix that
<damo22>i started it but commented it out
<youpi>I guess that was to hopefully implement irq sharing
<youpi>but AIUI that's not working anyway
<youpi>so we don't need that actually
<youpi>init_IRQ programs the PIT as well
<damo22>so if i disabled the pic, i need to disable the pit too?
<youpi>perhaps the pit is also connected to the apic
<youpi>I'm just raising that this could be an issue
<damo22>yeah i didnt even think about it
<youpi>some driver probe functions might need linux timers to be working
<youpi>i.e. the PIT interrupt handlers to be properly called for linux timers to work
<youpi>I guess you could try to disable all drivers except ahci
<youpi>ahci will at least need its irq working
<damo22>if the pit timer is not working will the system function?
<youpi>because time will not advance
<youpi>but I believe that the ide/ahci driver can work even if you don't manage to plug the pit timer to the linux timer
<youpi>what mach needs is hardclock() to be called regularly
<damo22>i did not initialise the LAPIC to a good working state
<damo22>the PIT is a separate circuit that can be used to measure the LAPIC timer frequency
<damo22>if the pit is a separate circuit, then the hardclock can be based on the pit
<youpi>yes, though in the long run with smp we will need the lapic to provide it, because all processors need to have such regular interruption
<youpi>(that's what is missing for cpu1 to take tasks)
<damo22>if the cpu frequency changes, does the lapic timer fluctuate in speed?
<damo22>if i calibrate the apic timer with the pit, will i need to do it regularly when the cpu frequency changes?
<damo22>since the pit runs off a clock that is always 1,193,182Hz i can tell it to sleep for a known amount of time and then measure the number of ticks of the apic timer that elapsed
<damo22>all i have left is to write a function that tells the PIT to sleep for 10ms
<damo22>the apic timer should now interrupt regularly on irq0, it doesnt need to be triggered manually
<damo22>omg it booted with Linux drivers and APIC timer
<AlmuHS>did you manage to configure the APIC timer?
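The calibration damo22 describes (let the PIT count a known interval, then see how many APIC timer ticks elapsed) comes down to a little integer arithmetic, sketched here. The helper names are hypothetical, the APIC tick count is assumed to have been read by the caller, and the 100 Hz rate in the usage note is only an example value, not a statement about gnumach's configuration.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_HZ 1193182u /* PIT input clock, as quoted in the conversation */

/* Reload value that makes the 16-bit PIT counter expire after `ms`
 * milliseconds. Returns 0 when the interval does not fit in 16 bits. */
static inline uint32_t pit_count_for_ms(uint32_t ms)
{
    uint64_t count = (uint64_t)PIT_HZ * ms / 1000;
    return (count >= 1 && count <= 0xFFFF) ? (uint32_t)count : 0;
}

/* APIC timer frequency in Hz (after whatever divider is programmed),
 * from the ticks that elapsed while the PIT counted pit_count periods. */
static inline uint64_t apic_timer_hz(uint64_t apic_ticks, uint32_t pit_count)
{
    assert(pit_count > 0);
    return apic_ticks * PIT_HZ / pit_count;
}

/* Initial-count value for a periodic interrupt at `hz` interrupts per second. */
static inline uint32_t apic_initial_count(uint64_t timer_hz, uint32_t hz)
{
    assert(hz > 0);
    return (uint32_t)(timer_hz / hz);
}
```

For the 10 ms interval mentioned above, pit_count_for_ms(10) gives 11931, which fits the 16-bit counter; a 100 ms interval would not fit, so the helper returns 0. With the measured frequency, apic_initial_count(timer_hz, 100) would give the reload value for a 100 Hz periodic tick.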
<damo22>the interrupts are not quite right for the disk driver
<AlmuHS>you can redirect the output to the terminal
<AlmuHS>youpi explained to me how to do it yesterday
<AlmuHS>test it in Qemu for better hardware control
<AlmuHS>to rule out incompatibility problems
<AlmuHS>the Mach SATA driver is originally from Linux
<damo22>i rebased back onto my old branch without your thread_info changes
<damo22>i have not tried with your changes
<damo22>i wanted to start from a smp that worked
<AlmuHS>my thread_info patch is not related to this
<damo22>but it stopped my smp booting on x200
<damo22>i tried with your latest branch i think
<AlmuHS>but wip is a bit unstable, because I use it to synchronize with my VM
<damo22>but you merged wip with zamaudio-gcc9-smp-debian
<damo22>which one should i use to test my apic code?
<AlmuHS>zamaudio branch is only for testing
<damo22>i added the debian patches on top of smp branch
<AlmuHS>If I merged zamaudio back into wip, it was an accident
<AlmuHS>I haven't merged my thread_info changes into smp branch
<AlmuHS>but I can turn over if necessary
<damo22>one for testing and one for developing
<AlmuHS>for testing smp I have a T60 and an R60e
<damo22>but i use the developer one as a dumb terminal into the testing one
<AlmuHS>I also have a T410, but Hurd doesn't have NIC drivers for it
<damo22>i think netdde should support that
<damo22>because i was using the apic timer
<damo22>im cheating by using the initrd from the installer, so i dont need a disk
<damo22>but the screen disappears after it boots
<damo22>the linux ahci is almost working
<AlmuHS>../linux/src/include/linux/compiler-gcc.h:96:1: fatal error: linux/compiler-gcc9.h: No such file or directory
<AlmuHS> 96 | #include gcc_header(__GNUC__)
<damo22>you need to check out feat-ioapic
<damo22>the default clone doesnt give the right branch
<AlmuHS>what is the path of compiler-gcc9.h ?
<damo22>linux/src/include/linux/compiler-gcc9.h
<damo22>ive done tons of work in branch feat-ioapic
<damo22>git diff gcc9-smp-debian..feat-ioapic
<damo22>i found a couple of bugs in your code too
<damo22>missing prototypes which could cause missing symbols
<AlmuHS>configure: error: failed to patch using `config.status.dep.patch'.
<AlmuHS> You have a serious problem. Please contact <bug-hurd@gnu.org>.
<AlmuHS>Your branch is up to date with 'origin/feat-ioapic'.
<AlmuHS>pruebas@debian:~/GNUMach_SMP_zamaudio/build$ make clean
<AlmuHS>linux/src/drivers/scsi/.deps/liblinux_a-aha152x.Po:1: *** missing separator. Stop.
<damo22>im not sure you can do make clean after autoreconf
<AlmuHS>same error with "make gnumach.gz"
<damo22>i think enable_irq and disable_irq have extra prototypes that are causing issues
<damo22>its definitely interrupt related
<AlmuHS>in the first lines of the log, we can see many "IRQ probe failed" messages
<damo22> /* Mmmm.. multiple IRQs.. don't know which was ours */
<AlmuHS>all the failed IRQs are disk related
<damo22>that is because the only driver using an irq is the disk at that point
<damo22>maybe the timer is not sending the EOI
<AlmuHS>youpi: tonight, after disabling all prints, the system still freezes in ext2fs
<AlmuHS>checking the log, I've just found this line
<AlmuHS>probing scsi 18/22: Western Digital WD-7000 Failed initialization of WD-7000 SCSI card!
<damo22>jz means jump if condition is zero?
<youpi>if last computation resulted in zero
<AlmuHS>I've just managed to compile my smp kernel with your patches from linux
<youpi>AlmuHS: the WD detection failure is not a problem, you most probably don't have a WD card :)
<AlmuHS>but we still have the same ext2fs freeze
<youpi>what you need to find out is why gnumach only executes two of the three threads of ext2fs
<AlmuHS>if I could find a good breakpoint, I could use gdb to debug
<AlmuHS>but I don't know where to put the breakpoint
<youpi>well, where the thread is being chosen?
<youpi>i.e. what we have been observing, thread_select()
<youpi>and more precisely, choose_thread()
<AlmuHS>damo22: tonight, I was saying that I couldn't compile my smp microkernel with your patches, from my Debian GNU/Linux
<AlmuHS> 1 Thread 1.1 (CPU#0 [running]) 0xc1007798 in delay (n=1000000)
<AlmuHS> at ../i386/i386/loose_ends.c:46
<AlmuHS>* 2 Thread 1.2 (CPU#1 [running]) choose_thread (
<AlmuHS> myprocessor=0xc122022c <processor_array+588>) at ../kern/sched_prim.c:1519
<AlmuHS>youpi: It seems that, at this point, it's cpu 1 which is executing choose_thread()
<AlmuHS>meanwhile cpu 0 is in a big delay
<AlmuHS>ok, after a couple of next, cpu 1 is in cpu_launch_first_thread
<AlmuHS>and it seems that cpu 1 calls startrtclock()
<AlmuHS>cpu 1 never wakes up after machine_idle
<damo22>have you ever got into a weird state where it compiles but wont boot at all? maybe i need to clobber build
<damo22>btw ../i386/i386/pcb.c:276:2: warning: #warning SMP support missing (avoid races with io_perm_modify). [-Wcpp]
<damo22>AlmuHS: i suggest looking at my diff and there are two things you could try to apply from that diff
<damo22>because if it has an int return value it will corrupt the stack when you call it from asm i think
<AlmuHS>this #warning SMP is not the only one
<AlmuHS>there are many functions which have the same warning
<AlmuHS>It's a check to make sure this function finishes correctly
<damo22>also you are missing +extern void interrupt_stack_alloc(void);
<AlmuHS>if cpu_setup() runs well, this if might not apply
<damo22>if you grep for cpu_ap_main, you are calling it from asm, and it has no return value expected
<AlmuHS>but there are some cases in which cpu_setup() could fail
<damo22>yeah but you are not handling the error
<damo22>also, it has no default return value
<damo22>it hits end of function with no return value
<damo22>i fixed it by making it return void
<AlmuHS>I can replace it with a "return (cpu_setup());"
<damo22>if you do that, you need to handle the return value in asm
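The two options being weighed for cpu_ap_main() can be sketched side by side. Everything below is hypothetical: cpu_setup_stub() stands in for the real cpu_setup(), the cpu argument and MAX_CPUS are made up for illustration, and the real entry point's signature is not shown in the log. With the usual i386 C calling convention an int result comes back in %eax, so the int variant only helps if the assembly caller actually tests that register; the void variant has to deal with the failure itself.

```c
#include <assert.h>

#define MAX_CPUS 255 /* hypothetical limit, for illustration only */

static int ap_setup_failed; /* set when an AP could not be brought up */

/* Hypothetical stand-in for the real per-cpu setup: 0 on success, -1 on failure. */
static int cpu_setup_stub(int cpu)
{
    assert(cpu >= 0);
    return cpu < MAX_CPUS ? 0 : -1;
}

/* Variant A: void entry point. The assembly caller expects no result,
 * so the failure is recorded here instead of being returned. */
static inline void ap_main_void(int cpu)
{
    if (cpu_setup_stub(cpu) != 0)
        ap_setup_failed = 1;
}

/* Variant B: int entry point with a return statement on every path.
 * The assembly caller must then check the result after the call. */
static inline int ap_main_status(int cpu)
{
    return cpu_setup_stub(cpu);
}
```

What both variants avoid is the shape described in the log: a function declared to return int that reaches its closing brace without a return statement, which leaves the caller with an unspecified value.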
<AlmuHS>I remember that return value is stored in %eax
<damo22>so is it supposed to clobber eax?
<AlmuHS>wait, I'm reading about how call works
<damo22>if your asm caller code needs to keep eax, then you will be destroying its value on return
<damo22>if you dont care about the return value, make it a void function
<AlmuHS>but then I can't know if it fails
<damo22>damn i got into a bad state it fails to print GNUMach at boot
<AlmuHS>yes, the return value is stored in %eax
<damo22>so the question is, what is in eax before it calls that function and is it supposed to update eax
<damo22>because you are calling it from asm
<AlmuHS>ok, I'm going to try changing it to void
<youpi>damo22: in x86, the returned value is in a register, not on the stack
<damo22>i destroyed the console by trying to print too early
<AlmuHS>btw, in latest Debian gnumach, the recovery terminal has disappeared
<AlmuHS>when fsck fails, the system continues booting
<AlmuHS>or in latest Debian hurd, I don't know exactly
<AlmuHS>you can do fsck after login, but it's risky
<youpi>gnumach is not involved in that step
<youpi>most probably it's e2fsprogs or initscripts which got a bug
<AlmuHS>this timezone difference is dangerous to talk
<AlmuHS>i changed cpu_ap_main() to void, hoping that this would solve the enabling with ncpus > 2, but no
<AlmuHS>but I don't see any important thing
<youpi>AlmuHS: it is expected that cpu1 stays stuck on machine_idle, since there is no interrupt to get out of the hlt instruction. I guess the mere attachment of gdb to it interrupts hlt, however, and that's why you see it call thread_select.
<youpi>in your traces, take the habit of typing bt, to know where you are
<youpi>here I'm wondering what makes cpu0 call delay
<youpi>what would be useful to debug in choose_thread, is where the three threads of ext2fs are
<youpi>i.e. when you are within choose_thread on cpu0, first make sure with "show all threads" that the three threads are in state R, and then inspect the runq, to check whether the three threads are indeed there on the runq
<AlmuHS>gdb only shows 2 threads, one for each cpu
<AlmuHS>do you suggest that I open gdb and kdb at the same time?
<youpi>first interrupt with kdb, check with trace that you are inside a not so dirty place
<youpi>and that the three ext2fs threads are there
<youpi>then you can use gdb (without even exiting kdb)
<youpi>and inspect the per-processor queues
<youpi>(note that what gdb presents you as "thread" is just the cpus themselves, not the gnumach threads)
***Glider_IRC__ is now known as Glider_IRC
***janneke_ is now known as janneke
<damo22>well ahci probably doesnt use irq 11 anymore