IRC channel logs
2023-02-14.log
<damo22> is this the correct state for idle threads?
<damo22> maybe the scheduler thinks the cpus are all running and doesnt select anything for them to run
<gnu_srs1> youpi: Can you resend the mail about go/unix.TIOCGETA I cannot find it.
<youpi> damo22: the idle thread is always running, yes
<youpi> gnu_srs1: about golang build issues, note that the ‘unix.TIOCGETA’ issue is already solved, it's just that debian got a newer package that dropped the cherry-picked patch
<damo22> i put printf in the idle_thread_continue loop, only one of them is running on one cpu
<damo22> but the rest of the cores are idle
<youpi> do they get clock interrupts ?
<youpi> idle should be getting interrupted by the clock at least
<damo22> 1 acpi (f59a8dd0): (f5998bd8) ..SO..(thread_bootstrap_return) i need to figure out what happens after this
<youpi> did you set debug_all_traps_with_kdb?
<damo22> boolean_t debug_all_traps_with_kdb = FALSE;
<youpi> better enable it to see such issue
<damo22> start acpi: kernel: Invalid opcode (6), code=0
<damo22> that makes more sense, its somehow hitting an invalid opcode and crashing the task
<damo22> all the changes we have for smp are upstream
<almuhs> have you checked the bound processor topic?
<damo22> (20:06:16) damo22: start acpi: kernel: Invalid opcode (6), code=0
<damo22> (20:06:16) damo22: Stopped at 0x80eb880: ???
<damo22> i turned on debug_all_traps_with_kdb and now it throws a debug trap
<damo22> i havent looked at bound_processor yet
<damo22> it seems to be switching to a bogus thread?
<almuhs> when youpi1 and me was checking, some years ago, ext2fs has a thread that never was assigned to a cpu
<damo22> almuhs: it hits an invalid opcode and crashes the task
<almuhs> then you probably have to debug it using gdb
<damo22> i dont know how to debug more than kernel task with gdb
<almuhs> you will have to set a remote debugging session with gdb, and put a breakpoint in this
<youpi1> it won't be live, but at least you know what's at that address
<youpi1> (possibly the binary is just not getting properly mapped into memory)
<damo22> 80eb880: 66 0f 6e c0 movd %eax,%xmm0
<almuhs> then you have to find the function who execute this instruction
<youpi1> damo22: possibkly it's the fpu state which is not properly managed among cpus
<youpi1> notably, enabling mmx etc. instructions
<youpi1> that probably has to be done on APs as well
<almuhs> maybe setting a simpler cpu in qemu we can to check it this is the problem
<youpi1> which is suposed to be called on each cpu
<youpi1> almuhs: without fpu, userland will just not work
<youpi1> it has always assumed to be there
<almuhs> yes, i refered to an cpu which has not mmx instructions ?
<damo22> almuhs: we would have to recompile userland with no mmx
<almuhs> wait me 10 minutes. My class has finished and i have to left the classroom
<youpi1> not only no mmx, userland will still try to use floats
<damo22> ext2fs: part:2:device:sd0: No such device or address
<youpi1> now you need to use rumpdisk etc.
<almuhs> i can try it in a real machine, but i need prepare some things
<almuhs> what processor is executing each process
<damo22> it hangs randomly while bootstrapping the first tasks
<almuhs> i remember a similar problem some months ago
<almuhs> it was related with stack reserve, if i remember well
<youpi1> it'd probably be good to make an extensive review of the code to check that things are locked
<damo22> with -smp 2 it got to Hurd server bootstrap: ext2fs[part:2:device:wd0] exec startup proc auth.
<damo22> and hanging there with 200% load
<almuhs> my error of some months ago was when i was using the harddisk
<almuhs> by example, using apt, some times i had kernel panic
<damo22> Stopped at pmap_put_mapwindow+0xd2: jmp pmap_put_mapwindow+0xbe
<damo22> pmap_put_mapwindow(c10a84cc,0,1000,f8458760,0)+0xd2
<damo22> pmap_zero_page(6f4c4000,f5efb6a8,f599bd3c,f6119a50,0)+0x6d
<damo22> vm_page_zero_fill(f8458760,f6119a50,0,c105e44e,f5997ec8)+0x19
<damo22> vm_fault_page(f6119a50,0,3,0,0,f599bddc,f599bde0,f599bde4,0,0,f599bddc,f599bdd0)
<youpi1> damo22: did you take the addition of simple_lock there?
<damo22> @@ -258,6 +259,7 @@ cpu_setup(int cpu)
<damo22> machine_slot[cpu].cpu_subtype = CPU_SUBTYPE_AT386;
<damo22> machine_slot[cpu].cpu_type = machine_slot[0].cpu_type;
<damo22> cpu_launch_first_thread(THREAD_NULL);
<damo22> * d6ff5ba7 (HEAD -> master, zammit/master, origin/master, origin/HEAD) linux: Fix non-SMP build
<damo22> i havent committed the init_fpu() change yet
<almuhs> i will try to compile upstream with smp
<damo22> i dont think the (pmap == kernel_pmap) change is very good, it sends many many cpu updates, is there a way to reduce them?
<youpi1> if the kernel mapping changes, all cpus have to be aware of it
<youpi1> aka: don't blame the change, blame the code that triggers the case
<almuhs> maybe it's necessary some optimization
<damo22> but if it sets the cpus_active and cpus_using on the pmap, cant it already know
<youpi1> just to make sure: do you know about TLB?
<damo22> i think its updating even though nothing is changing
<youpi1> then look for the callers of signal_cpus
<youpi1> it's supposed to be called only when something changed
<almuhs> damo22: what are the new configure flags for smp?
<damo22> in pmap_put_mapwindow, should there be a PMAP_READ_LOCK()/UNLOCK() around the PMAP_UPDATE_TLBS() ?
<youpi1> probably better replace the separate slock with the mere PMAP_WRITE_LOCK/UNLOCK
<almuhs> youpi: i have many syntax error when i try to "make gnumach.gz" from Debian GNU/Linux
<gnu_srs1> youpi1: Can you just resend the mail to me to bug-hurd about unix.TIOCGETA I cannot find it.
<youpi1> what I pasted above is all I can remember
<youpi1> almuhs: "many syntax error" is not enough of a bug report
<damo22> almuhs: autoreconf -fi && mkdir build && cd build && ../configure --enable-apic --enable-kdb --enable-ncpus=8 && make gnumach.gz
<youpi1> probably you need to upgrade your mig
<almuhs> yes. I forgot to add testing repositories in my new machine
<almuhs> i have testing repositories, i don't know why apt only offers the version from stable
<youpi1> you can force with apt install mig/testing
<damo22> i'll send in the patch for init_fpu tomorrow, bedtime
<almuhs> waiting to finish Debian GNU/Hurd installation
<almuhs> i'm testing in a real machine (thinkpad r60e) without rumpdisk. It's really slow
<almuhs> i can enable rumpdisk in a real machine: the latest Debian GNU/Hurd installation image crashes when try to boot from dvd
<almuhs> i go to upgrade my installation, and repeat the test
<Pellescours> I think before having smp + rumpdisk, we need to ensure that interrupts on apic works correctly
<Pellescours> when I last tested the rump+apic, irq were lost and I got messages similar to what almuhs just had
<almuhs> now i go to try again in real hardware
<fowler> On Debian I have deb-src lines, but for some packages like netdde I'm unable to do an apt-get source. I can install the binary version 0.0.20200330-11 no issues, but the source doesn't seem to be in the package repos for some odd reason. Can anyone reproduce this?
<youpi1> that's a shortcoming of debian-ports indeed
<youpi1> you can however run "debcheckout netdde" to get the git repo
<fowler> Thanks for that. Will do, thanks :)
<fowler> W: Unable to locate package netdde
<youpi1> ah, perhaps debcheckout uses deb-src too
<youpi1> anyway it's just at the same place other hurd debiain packages
<youpi1> https://salsa.debian.org/hurd-team/netdde.git
<fowler> The reason I'm poking around at this is because I've done something similar to netdde before. Prof Andrew Tanenbaum hired me 10 years ago to port a linux kernel driver to Minix3. He wanted a gigabit card working on the system for real hardware workstations and chose the Broadcom 5752. The linux kernel driver was ~ 13,000 LOC, but I got it working and managed to get really decent speeds out of it.
<fowler> It was a really fun project and I've been playing with Hurd recently so I thought I'd take a look at the current driver ports
<youpi1> netdde is not really "current", we'll rather head to rump
<youpi1> we haven't worked on the network part of it since we have netdde which is fine enough for various needs, but long-run we'll rather use rump for network too
<fowler> The concept of the rump kernel is definitely appealing, but taking linux drivers "off the shelf" and reusing them looks like it will support a lot more hardware.
<youpi1> except that "reusing" is terribly not easy
<fowler> I do agree, afterall, I do have some experience in matter :D
<fowler> The "terribly no easy" part may be enjoyable to me. I did enjoy the challenge the last time
<youpi1> and then you have the maintenance part
<youpi1> I've seen such projects die one after the other
<fowler> Oh yes, very true of course. Such is the life of a volunteer developer and a volunteer dev team. People kind of expect the project to be taken on by the dev team after they've done their "fun part" which is in many ways not fair or practical
<fowler> For now I'm just poking around at things to find something challenging and fun and possibly practical. It may be that I end up doing nothing and it may be that I end up doing something challenging, fun AND practical.