IRC channel logs

2024-04-05.log

<almuhs>i suspect that rumpdisk has race conditions
<almuhs>in smp with damo22 patch, sometimes freeze during the boot, in ext2fs step
<almuhs>or even before, during disk detection
<Pellescours>No, it’s not a race condition, it triggers in non-SMP mode. It’s related to the pager
<Pellescours>But, there may be race condition also
<youpi>smp probably makes the issue more probable
<almuhs>now i'm compiling rumpkernel with default debian sources from here https://people.debian.org/~sthibault/tmp/unreleased/rumpkernel_0~20211031+repack-4.dsc . When I boot the VM with my modified rumpdisk, the disk is not detected. So I can check if the problem is mine or is in the default code
<almuhs>i'm compiling the default code to be sure that the problem is not from this code
<Pellescours>Running a stress program (stressing either disk or memory) with rumpdisk makes the VM freeze completely. I started to investigate with solid_black but I don’t know what’s the solution here
<almuhs>while i compile the code, i found some TODO and "function not implemented"
<almuhs>i found an error in the compilation process https://pastebin.com/eEfGAXuF
<youpi>it should be there in i386/include/mach/i386/machine_types.defs:type rpc_phys_addr_array_t = array[] of rpc_phys_addr_t;
<almuhs>in gnumach source code?
<almuhs>there is in the def, but not in the headers
<almuhs>pruebas@debian-hurd:~/rumpkernel-0~20211031+repack$ grep -rn rpc_phys_addr_array_t /usr/include/i386-gnu/mach/
<almuhs>/usr/include/i386-gnu/mach/gnumach.defs:209: out pages : rpc_phys_addr_array_t);
<almuhs>in gnumach's upstream source code, the type is defined in i386/include/mach/i386/machine_types.defs:105:type rpc_phys_addr_array_t = array[] of rpc_phys_addr_t;
<almuhs>and in i386/include/mach/i386/vm_types.h:97:typedef rpc_phys_addr_t *rpc_phys_addr_array_t;
<youpi>yes, and you need both installed
<almuhs>copying the files manually?
<youpi>depends how you did it for gnumach.defs, you probably want to do the same
<almuhs>currently I have the latest debian version of gnumach
<almuhs>**gnumach.defs
<youpi>yes, but you need the newer versions of the other files that gnumach.defs needs
<almuhs>but, if i copy the newer versions, maybe there are compatibility issues
<youpi>possibly
<youpi>you can try to copy/paste only the lines you need
<almuhs>ok, after copying the definitions of rpc_phys_addr_t, I got rumpkernel to compile and compiled hurd with it. Now I am sure that the latest debian rumpkernel sources work correctly
<almuhs>other day i will try again with my new patches sources to add the prints
<almuhs>notice that I had to "make clean" before "./configure --enable-static (...) && make rumpdisk" to recompile rumpdisk.static
<youpi>sneek: tell almuhs later that one can cd rumpdisk ; make rumpdisk.static to build a static version
<sneek>almuhs, youpi says: later that one can cd rumpdisk ; make rumpdisk.static to build a static version
<youpi>eh?
<youpi>sneek: later tell almuhs one can cd rumpdisk ; make rumpdisk.static to build a static version
<sneek>Okay.
<solid_black>hi
<solid_black>Pellescours: but we did complete the investigation, didn't we
<solid_black>we don't have a good story about paging w/ rumpdisk, at least not one that I'm aware of
<solid_black>pageout, I mean
<solid_black>unless we can guarantee that rumpdisk can write pages being paged out without itself needing dynamic memory, I guess
<Pellescours>solid_black: yes
<solid_black>youpi: ping?
<youpi>pong
<solid_black>hi
<solid_black>I've been meaning to ask, what are *your* plans concerning aarch64-gnu?
<solid_black>are you going to review things? (the glibc port, the gnumach port)
<solid_black>should I just dump the gnumach port as one large patch?
<solid_black>do you have time for this?
<youpi>for the asm part I can't really review, I don't know much arm
<youpi>for the gnumach part I'm fine with committing it, provided you plan to support it :)
<youpi>for the glibc part, I can commit the parts that are very similar to the linux versions
<youpi>for more involved parts of glibc it's difficult for me to review
<solid_black>the situation is, I'm kind of out of breath with this
<solid_black>and I want more people to look at it, build it, play with it, etc
<solid_black>and looks like the only way that is going to happen is if it's upstream
<youpi>we can commit the gnumach part for a start, since it's already working quite well
<youpi>we're missing the gcc part, though, it needs to be pinged
<solid_black>the glibc part is actually not that complicated, and you don't need much asm knowledge
<solid_black>also Maxim K. of Linaro / glibc told me they (Linaro? glibc?) were going to review it
<youpi>ok, so we're "just" missing the gcc part
<youpi>oh good
<solid_black>but he evidently knew very little about the Hurd/Mach internals (i.e. didn't know what MIG is)
<youpi>sure, that's fine
<solid_black>apparently Linaro is intrigued
<youpi>that part I can probably review
<youpi>it's rather the setcontext/swapcontext and such asm tricky stuff that I'm not at ease with, but he would be
<solid_black>setcontext I don't think I have implemented :D
<solid_black>longjmp, I did
<youpi>heh :D
<youpi>but you get the idea
<solid_black>by copying the Linux version and adding the Hurdish onstack bits
<solid_black>but it needs more testing for sure
<youpi>sure but we can commit something that works already quite well, before going through more testing
<solid_black>for the gnumach port, can you review a GitHub branch, or would you rather I post it as a patch to bug-hurd?
<youpi>it'd be simpler to get a series on the list
<solid_black>'series' is complicated, I don't really know how to split this
<youpi>a big patch is fine
<solid_black>well some things like the device tree parser could be split out I guess
<youpi>as long as it's just additions
<youpi>(or trivial e.g. #if changes)
<solid_black>what about Debian, do you have any plans for adding the arch / building packages / etc?
<youpi>I actually already had submitted the request for arch addition
<solid_black>I saw that yeah
<youpi>the reply was that it needs to be committed upstream at least a bit :)
<solid_black>but it went nowhere
<youpi>it's not "nowhere"
<youpi>it's just that debian doesn't want to maintain patches
<youpi>so it insists on work being upstreamed, before downstreaming the support
<solid_black>why do all these projects insist on other bits being upstreamed
<solid_black>you have to start *somewhere* with upstreaming
<youpi>yes, and that's the toolchain
<solid_black>good thing at least binutils didn't want us to have a fully functioning system first
<youpi>sure
<youpi>since they can test some support without needing a system, actually
<solid_black>Linaro wants to know how they could test it on their CI
<solid_black>I told Maxim that eventually, there'd be Debian, and they could run it in qemu and run the testsuite
<solid_black>it sounded like this wasn't a requirement for upstreaming into glibc, but imagine it was
<solid_black>then it'd be a deadlock between debian and glibc
<youpi>glibc is fine with not running the testsuite itself
<youpi>they just want to be able to cross-build and run the cross-build test
<youpi>which thus only depends on the toolchain
<youpi>really, it's just toolchain -> glibc -> distrib
<solid_black>when reviewing, are you going to bootstrap/build all of this on your end?
<youpi>not for just the review before commit
<youpi>but as soon as debian adds the arch, I'll try to bootstrap the arch, and that'll bring the buildability question indeed :)
<solid_black>ok
<solid_black>so I should 1. ping Thomas again 2. post v3 of glibc port 3. try to post Mach port patches
<youpi>2 and 3 can be in parallel
<solid_black>I only have a single brain core :)
<solid_black>no SMP here
<solid_black>it's really that the way forward is to 1. build/bootstrap more things in userland 2. for people other than myself to review / build / play with aarch64 gnumach, and suggest/do improvements
<solid_black>to make hardware support more serious
<solid_black>oh, I forgot the important / problematic part
<youpi>sure, but that's after the initial work is committed
<solid_black>bootstrap / multiboot
<youpi>so gcc then glibc and gnumach
<solid_black>the way it's bolted on currently in the aarch64 branch probably breaks i386
<youpi>(though I'm fine with committing gnumach before gcc)
<solid_black>somebody just needs to look at it and make it work :|
<youpi>ah, that's a problem :)
<solid_black>also, unrelated, reminder that v2 of the gnumach arbitrary write exploit is still unfixed
<solid_black>we need to do something about that
<youpi>I don't have details about it
<solid_black>that's because I haven't shared them, sure :)
<solid_black>the good thing is, we can land public API in Mach (similar to the patch I posted back in January), and that should be enough for glibc/hurd
<youpi>indeed, we can already commit that if it's ready and doesn't break x86
<solid_black>it is ready (missing aarch64_debug_state, but that's only going to be needed for GDB), and doesn't break anything
<solid_black>and no incompatible changes are expected
<solid_black>well, also any potential SVE / SME state, but we won't support that for now
<youpi>solid_black: you mean the january series is unchanged?
<solid_black>no, there have been minor changes
<solid_black>it's very similar, but still incompatible
<solid_black>e.g. I've renamed EXC_AARCH64_FP_ID to EXC_AARCH64_IDF
<solid_black>because I mostly started by copying what Apple provided (plus some consulting of the Arm manual)
<solid_black>and now I've done *a lot* more reading of the ARM ARM, and got a lot more experience with aarch64 in practice
<solid_black>so I decided it makes more sense to call the various bits what ARM docs call them, rather than what would make more sense to someone who's not familiar with ARM
<solid_black>here's something that should be possible, and that I will maybe do some time when we have a more complete system: basic Linux syscall emulation, in userland, off-task
<solid_black>so a different task would set itself as exception handler, and get syscalls (EXC_AARCH64_SVC) and faults, and decode and translate them to Hurd-native
<youpi>you mean porting qemu-system-user ?
<solid_black>no
<youpi>why not?
<solid_black>I mean, you could do that too, that's unrelated
<youpi>that's in the end less work, too
<solid_black>maybe
<solid_black>what I'm talking about wouldn't do any recompilation, it would run the binary as-is, just translating syscalls
<youpi>that's what you would get with qemu+kvm
<solid_black>kvm is a huge other beast
<youpi>if you run the binary as-is, you need to catch system calls etc.
<youpi>doing so without using svm/vmx is probably quite a beast
<solid_black>yes, that's exactly what I'm talking about
<youpi>when svm/vmx is exactly meant for that
<solid_black>no it's not, that's my point
<youpi>it's not?
<solid_black>Mach traps very very intentionally made negative
<youpi>so you catch illegal instruction faults?
<solid_black>not by my, by original Mach designers
<solid_black>erhg, I can't type
<solid_black>s/very very/were very/
<solid_black>s/by my/by me/
<solid_black>I have an exception type, EXC_AARCH64_SVC, that gets raised when you try to perform an SVC instruction that doesn't look like a valid Mach trap
<solid_black>the linuxinator task would handle that by emulating the syscall
<youpi>anyway, my point is: porting qemu-user makes sense and is useful on the long run. Doing something similar is probably very fun, but not clear it'll have the same impact long-term-wise
<youpi>(we could also do the converse, btw: port qemu-user to support the gnumach calls)
<solid_black>sure, I'm not saying this is better than a potential qemu port, or will be useful, or anything
<solid_black>just that it should be possible, and fun, and that I might do it
<solid_black>supporting Mach calls under qemu-user is probably a lot more complicated
<solid_black>whether on Linux or on Mach
<solid_black>does Debian enable PAC & BTI for aarch64?
<youpi>I don't know
<solid_black> https://wiki.debian.org/ToolChain/PACBTI
<solid_black>that sounds like it does since recently
<almuhs>my ahcisata_pci.c patched works
<sneek>Welcome back almuhs, you have 1 message!
<sneek>almuhs, youpi says: one can cd rumpdisk ; make rumpdisk.static to build a static version
<almuhs> https://pasteboard.co/GnAkz6XRu8lu.png
<almuhs> https://pasteboard.co/yyuZIxi51Hvf.png
<almuhs> https://pasteboard.co/5HSPIFrwXsPa.png
<almuhs>now i will copy pci_map.c patch
<solid_black>wrote way too many words in the commit message of the VM entry deadlock commit patch, as requested
<solid_black>now you'll have to read through it :D
<youpi>your future self will thank you so much 10yrs from now when you'll dig back into this question ;)
<Pellescours>:D
<almuhs>pci_map.c patch works too
<almuhs>now i will have a few more info when i try rumpdisk in real machines. I only have to copy the deb files and the rumpdisk.static and copy all in the harddisk to test
<almuhs>**copy and install
<solid_black>...and of course I made many typos in said commit message
<youpi>smp is supposed to be working with rumpdisk, right?
<youpi>it's getting stuck at:
<youpi> Enter evaluation : _SB.PCI0.SC0._ADR (Integer)
<youpi> Exit
<solid_black>it works for me, but I have 0 idea how to debug any rumpdisk issues
<almuhs>in my case, that error appears when i use an IDE disk with noide
<almuhs>or when i forget to change "wd0 noide" to "hd0" in GRUB when I have an IDE disk in /dev/hd0
<youpi>I might be doing that indeed
<solid_black>uhh, what?
<solid_black>you're *supposed* to put in noide to use rumpdisk
<solid_black>noide doesn't mean "I hate IDE", it means to disable Mach's Linux drivers, to give rumpdisk a chance to drive the IDE
<almuhs>piixide (the IDE driver in rumpdisk) is buggy
<youpi>ok, after disabling the ide CD, it does boot
<almuhs>i don't know, anyone told me here some time ago
<youpi>(almost)
<almuhs>some weeks ago, someone sent a patch which solve CD with size 0
<youpi>does pfinet work in smp ?
<almuhs>so in upstream this problem is solved, i think. But in debian's gnumach, it's necessary to put in a cdrom in the drive to boot
<almuhs>dhcp works in smp
<almuhs>and i got network
<solid_black>does it?
<youpi>my boot gets stuck after Starting system message bus: dbus.
<almuhs>yes
<solid_black>do you mean full smp, or damo22's mode where everything runs on a single cpu?
<youpi>usually next step is Starting internet superserver: inetd.
<almuhs>yes, sometimes freeze during the boot
<youpi>solid_black: I'm using the current master, iirc that runs on a single cpu
<solid_black>yes
<almuhs>with old damo22 patch which modifies the scheduler, it needs "only" 2 or 3 attempts to get boot
<solid_black>so there's almost no reason for that to work differently than non-SMP
<solid_black>(other than enabling the SMP codepaths in Mach)
<solid_black>there's no additional concurrency introduced etc
<almuhs>without this patch and enabling all cpus during the boot, it took more than 20 attempts to get it to boot
<almuhs>i don't know if latest patches to fix some race conditions improve it a bit
<youpi>ok, I don't think the SC0 issue was about ide: it is still hanging without it
<youpi>I just got lucky right after disabling the ide cd
<solid_black>with damo22's patch reverted and my patches that I posted, my system booted to dhcp, and then hung
<solid_black>with all cpus being idle
<almuhs>try again
<almuhs>try a couple attempts
<solid_black>and I didn't think to ask kdb about where they're blocked
<solid_black>also I don't yet fully understand how kdb works, so aarch64 doesn't really support it yet
<almuhs>i simply turn on the VM, fix filesystem and try again
<solid_black>yeah, I'm so not going to do that
<solid_black>I should make snapshots instead, so fs as stored doesn't get corrupted
<youpi>note: the SC0 issue seems related to interrupts
<youpi> Enter evaluation : _SB.PCI0.SC0._ADR (Integer)
<youpi> Exit irq handler [9]: new delivery port f64cf3d0 entry f5639ec0
<youpi>which I'm really not surprised of
<youpi>the interrupt delivery path most probably didn't get completely cleaned
<youpi>I disabled the hurd console, it went further, almuhs: do you use the hurd console?
<youpi>do apic-enabled kernels work fine (without smp), for a start, actually?
<almuhs>yes, i use hurd-console
<almuhs>because i configure keymap in spanish
<almuhs>to test smp with minimal race conditions, i keep this patch https://git.zammit.org/gnumach-sv.git/commit/?h=fixes&id=0fe92b6b52726bcd2976863d344117dad8d19694
<almuhs>and disable the other patch which set all process to cpu0
<almuhs>it's a temporary solution
<almuhs>sometimes i need a few attempts to get boot
<almuhs>with upstream code in smp, with the patch which assign all to cpu0, i usually get boot without problem
<youpi>better avoid these patches which only bury the issue, to get the issue all the time and fix it :)
<youpi>assigning all to cpu0 is not yet the default in upstream?
<solid_black>it is
<almuhs>yes
<youpi>I don't understand why you say "with the patch which assign all to cpu0" then
<almuhs>upstream smp
<youpi>what do you mean by upstream smp ?
<almuhs>compiling upstream with NCPUS > 1
<solid_black>commit aadb433981b086bfb4e082757fed1154582d5497 fwiw
<almuhs>gnumach's upstream source code, compiled with --enable-ncpus > 1
<almuhs>i use this script to compile gnumach, btw https://github.com/AlmuHS/gnumach_dev_scripts/blob/main/compile_scratch.sh
<youpi>(btw it'd have been good to submit a patch that disables the linux group and enables apic, instead of letting me have to figure it out)
<almuhs>i put this flags under your instructions
<almuhs>so many time ago
<youpi>that doesn't mean that we want to have to do that longtermwise
<almuhs>btw, this script also works changing the SRC_PATH to the directory where i git clone the gnumach's savannah directory
<youpi>ok, disabled inetutils-inetd too, and now it boots
<youpi>as well as lightdm
<almuhs>:)
<youpi>ah it seems it's lightdm which poses problem
<almuhs>lightdm never worked for me
<almuhs>since many years ago
<youpi>I'm not saying it doesn't work, I'm saying it hangs the boot
<almuhs>oh, ok
<youpi>while in up it doesn't pose any problem
<youpi>(it doesn't work either, but doesn't bother the boot)
<solid_black>speaking of the Hurd console
<youpi># update-rc.d enable ssh
<youpi>hangs...
<solid_black>from what I've read, it doesn't sound like there's an easy way to get a text-mode console in the aarch64 world (or anywhere but x86)
<youpi>solid_black: but there's a serial port, isn't there?
<youpi>so gnumach at least has a console
<almuhs>youpi: but, which is your kernel configuration? upstream's gnumach?
<youpi>and then you "just" need a frame buffer client for the console :)
<youpi>almuhs: yes
<solid_black>so so far, it sounds like Mach console is going to always be going to a serial port, and the Hurd console would be rendering characters to any graphics displays
<almuhs>ok
<solid_black>but this also means that you won't see kernel boot-up logs scroll by
<almuhs>i go to make git pull and compile again upstream
<youpi>solid_black: not really a problem for non-dev users :)
<youpi>ok, ssh does pose problem too in smp boot
<youpi>still, I'd really say to fix the rumpdisk booting issue first, since it's a very deterministic one
<almuhs>it's very important to get rumpdisk detect real hdd
<Pellescours>for the IDE rumpdisk issue, note that it’s not related to apic nor SMP. If you use IDE with PIC and 1 cpu, the problem will occur
<almuhs>upstream's gnumach compiled with my script, and in a VM with AHCI and rumpdisk, work successfully
<almuhs>this latest test has been done with a real HDD connected to my VM. I go to put this HDD in a Thinkpad T440p, to check the results of my rumpdisk prints (probably it will not detect the hdd)
<youpi>Pellescours: I never got this issue, in which condition are you reproducing it?
<youpi>almuhs: but with damo22's workaround, right?
<almuhs>not, with upstream compiled with my script: https://pasteboard.co/OIrBGIKe8EN4.jpg
<youpi>ok, but then try to reproduce it with qemu, and there you'll be in a situation to fix it
<almuhs>in qemu it works successfully
<almuhs>with the same hdd
<solid_black>oh, you mean _real hardware_
<youpi>"same hdd", with qemu?
<almuhs>yes
<youpi>I don't understand
<youpi>how can it be the same hdd
<almuhs>i used a real harddisk
<youpi>with qemu?
<almuhs>i connected this harddisk to qemu, installed Debian GNU/Hurd with everything necessary
<youpi>ok, so I have to be explicitly clear: how do you plug it?
<youpi>that's my point
<almuhs>and after this, i disconnected the harddisk and put it in a Thinkpad T440p
<almuhs>with a USB-SATA adapter
<youpi>no, to qemu I mean
<youpi>gnumach doesn't care how it's plugged to the computer
<almuhs>file=/dev/sdb
<youpi>ok
<youpi>through an ahci controller?
<youpi>with media=disk?
<almuhs> https://github.com/AlmuHS/gnumach_dev_scripts/blob/main/qemu-hurd.sh
<youpi>all details do matter
<almuhs>change FILE=/dev/sdb
<almuhs>and chmod 777 /dev/sdb
<almuhs>to get qemu has permission to access
<solid_black>so that it's fully accessible even to most sandboxed code :)
<almuhs>yes, i only need it to install the system from qemu
<almuhs>after this, i got to run the debian-hurd-2023 installer with noide , and as this way, install the system in the disk from qemu
<almuhs>once installed, i boot the system adding noide to GRUB config
<almuhs>and once booted, i manage this as a common VM
<almuhs>configure debian repositories, apt update, apt upgrade, apt full-upgrade
<Pellescours>youpi: if you take the debian VM, you boot it with rumpdisk and your disk in IDE mode, you’ll have the "Enter evaluation : _SB.PCI0.SC0._ADR (Integer)…" errors
<youpi>I'm not using ide mode
<almuhs>then i compiled other gnumach (upstream, smp with my patch, upstream-smp) and upload this gnumach to the VM. Make update-grub to add all to the list
<youpi>but if you guys see it as a way to reproduce the issue, then use it
<youpi>and you'll be able to fix it
<almuhs>youpi: don't forget to add -M q35 in qemu
<almuhs>qemu-system-i386 -M q35 -m $MEMORY $OPTIONS -smp $NCPUS
<almuhs>latest line in my script
<youpi>ah, if you hide options, I can't see them
<almuhs>if this option is not set, gnumach detects the disk as IDE, even if you are using ahci flags
<youpi>that rather makes it hang earlier
<youpi>[ 1.0200050] vendor 8086 product 100e (ethernet network, revision 0x03) at pci0 dev 3 function 0 not configured
<youpi>[ 1.0200050] ahcisata0 at pci0 dev 4 function 0: vendor 8086 product 2922 (rev. 0x02)
<youpi>stuck there
<youpi>kvm -cpu host -smp 2 -m 4G -chardev stdio,signal=off,id=stdio -serial chardev:stdio -machine q35 -device ahci,id=ahci1 -drive id=boot,if=none,format=raw,media=disk,file=/root/boot,cache=writeback -drive id=root,if=none,format=raw,media=disk,file=/home/hurd,cache=writeback -device ide-hd,drive=boot,bus=ahci1.0 -device ide-hd,drive=root,bus=ahci1.1 -vga std -net tap,script=/root/ifup-start-hurd -net nic,model=e1000 -net nic,model=e1000 -gdb tcp:127.0.0.1:12345
<youpi>[ 1.0200050] WARNING: DELAY ESCAPED
<almuhs>other thing: i had to use a raw image
<youpi>ah :)
<youpi>so it *really* looks like an irq routing issue
<youpi>it's a raw image
<youpi>but again, again, again
<youpi>the more you use circumventions to hide a bug
<youpi>the more difficult it will be to fix the bug
<youpi>rather just fix the bug in a situation where it happens all the time
<youpi>instead of pushing it away, only for it to bite you later in a situation which will be *way* more difficult to debug with whatnot processes doing all kinds of stuff altogether
<almuhs>"just". I don't know how rumpdisk works
<youpi>I don't either
<youpi>so it's no reason :)
<almuhs>btw, i use qemu, not kvm directly
<youpi>it's the converse :)
<youpi>kvm is a lightweight frontend to just run qemu -kvm
<almuhs>yes, i know
<almuhs>but i see many options in your line
<youpi>not that much more than yours
<youpi>(and most of them completely unrelated)
<youpi>ok, now it did pass the driver hang, by luck
<youpi>still stuck at starting sshd
<almuhs>my network options are only set to be able to disable dhcp, and know a correct address to set
<almuhs>because some time ago, i found that the boot usually freeze in dhcp step. Now it seems solved, but i keep this option
<almuhs>now i'm using dhcp without problem
<almuhs>but in the first smp patches sent by damo22, pfinet freezed at dhcp
<youpi>it very quickly hangs whatever I do
<almuhs>have you tried to install the system from the debian installer instead of using the img file?
<youpi>again
<youpi>there's no point trying to find situations where it works
<youpi>to make progress, one has to fix the situations where it doesn't work
<youpi>otherwise we'll keep staying in a niche situation where you have to know that you have to align all stars^Woptions to get things work only by sheer luck
<youpi>for a start, was it checked whether interrupts get routed to the BSP only or also on the AP?
<youpi>and if the latter case, if things happen correctly?
<almuhs>which are working in i/o is damo22
<almuhs>*who
<almuhs>in smp, i only worked in cpu startup and configuration, and IPI sending. But the APIC configuration and most of I/O is from damo22, because it's a weakness for me
<youpi>not a weakness
<youpi>just not knowing about it *yet*
<youpi>it's just about reading stuff, putting prints to see what happens, and working it out
<youpi>seeing what you have already achieved, you can do that, it just takes time
<almuhs>i work better in a more deterministic environment
<almuhs>and the i/o is not deterministic for me
<youpi>I wonder why you started working on smp, which is the most non-deterministic world :)
<youpi>i/o is completely deterministic, on the other hand
<youpi>there's just one hard drive
<almuhs>i started in smp by chance, XD. I found it contradictory that a thread-based system was only using a unique cpu
<almuhs>btw, i need to learn how to debug a hurd server meanwhile it's running
<youpi>we have been using unique-cpu systems for decades before multicore went usual ;)
<almuhs>the patch which i sent some years ago to fill the last-processor field in the stat file doesn't work: last-processor stays at 0 even when debugging the processor table from gnumach shows a different value
<almuhs>debugging gnumach from gdb in my smp environment, i noticed that last_processor many times is different to 0. But, i don't know if proc or procfs, continue reading 0 in last_processor
<almuhs>in other words: when last_processor in gnumach is different than 0, in hurd appears as 0
<almuhs>in the /proc/PID/stat
<almuhs>so i need to debug proc and procfs to follow the sequence of this data, to find where is the fail
<youpi>simplest is to use mach_print()
<almuhs>but it can be useful connect gdb or any similar to the servers
<youpi>yes but not all kinds of servers
<youpi>not proc for instance
<youpi>for procfs you'd have to be careful since then you cannot access /proc any more
<almuhs>i need to check the status of last_processor and related struct
<almuhs>the files which have been modified in these patches https://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?qt=grep&q=last_processor
<almuhs>mmm... maybe i have to compile these servers with a gnumach-smp ?
<almuhs>like we have just done with rumpdisk
<youpi>no
<almuhs>then it's a bug
<youpi>the headers don't change whether smp is enabled or not
<Pellescours>I just realized that the ton of "Enter evaluation…" and "Exit evaluation…" messages are debug messages from rumpkernel. That surprised me because they did not appear at the beginning (when I brought piixide to rumpdisk 2 years ago). But there is still the lost interrupt
<Pellescours>piixide0:0:0: lost interrupt
<Pellescours>type: ata tc_bcount: 512 tc_skip: 0
<Pellescours>and iirc the piixide driver always had this issue, but iirc the issue appeared only when I was shutting down the VM, not at the boot
<almuhs>i had seen these same issues in real hardware
<almuhs>when I configure my old Thinkpad in compatibility mode
<Pellescours>I’m asking if this issue of lost interrupt is in gnumach/hurd or in the rumpkernel piixide driver, because ahcisata behaves correctly
<almuhs>here in a T60 https://pasteboard.co/han2dvzvExQE.jpg
<almuhs> https://pasteboard.co/VPq9dJBGiYnu.jpg
<almuhs>sometimes, in some models, like R61i, when I configure SATA in AHCI mode in the BIOS, the HDD continues being detected as IDE
<almuhs>and in T60 i think that there is the same issue
<Pellescours>Maybe updating the netbsd source to latest will fix piixide
<almuhs>added to this, in real hardware, i think that the problem is that rumpdisk is not detecting the interface correctly. That the detection problem is not from the disk, instead it doesn't detect the interface
<Pellescours>if you try to boot netbsd on this hardware, you may get some answers. If the disk is not detected, then the driver needs to be patched. Otherwise it’s our integration that is buggy
<almuhs>it's a good idea
<almuhs>Pellescours: what version of netbsd i have to try? the latest?
<almuhs>latest is 10.0
<Pellescours>yes the netbsd version we imported for rumpdisk is from 2021
<almuhs>10.0 is 2024 march
<Pellescours>try this one first
<almuhs>try 10.0?
<Pellescours>yes
<almuhs>ok
<gnucode>I've got a qoth for q1 almost done. Anyone want to volunteer to proof read?
<gnucode>sneek: later tell solid_black that I am about to submit a qoth for Q1 of 2024. I would like to mention your Alpine Hurd distribution. Do you have a git repo somewhere?
<sneek>Okay.
<almuhs>hi. After trying NetBSD 10.0 i386 in a Thinkpad T440p, it detects the HDD correctly
<almuhs>it shows as wd0
<almuhs>Pellescours: https://pasteboard.co/cwohZvRRDseW.jpg
<gnucode>that's interesting.
<almuhs>and the T440p is too modern to support "compatibility mode"
<gnucode>I still haven't gotten the Hurd to run on the T410. Not that I've tried very hard.
<almuhs>with the gnumach's IDE driver, setting the SATA interface in compatibility mode, you can install Hurd in the T410 without problem. The problem is with rumpdisk