IRC channel logs

<almuhs>i suspect that rumpdisk has race conditions

<almuhs>in smp with damo22 patch, sometimes freeze during the boot, in ext2fs step

<almuhs>or even before, during disk detection

<Pellescours>No, it’s not a race condition, it triggers in non-smp mode. It’s related to the pager

<Pellescours>But, there may be race condition also

<youpi>smp probably makes the issue more probable

<almuhs>now i'm compiling rumpkernel with default debian sources from here https://people.debian.org/~sthibault/tmp/unreleased/rumpkernel_0~20211031+repack-4.dsc . When I boot the VM with my modified rumpdisk, the disk is not detected. So I can check if the problem is mine or is in the default code

<almuhs>i'm compiling the default code to be sure that the problem is not from this code

<Pellescours>Running a stress program (stressing either disk or memory) with rumpdisk makes the VM freeze completely. I started to investigate with solid_black but I don’t know what the solution is here

<almuhs>meanwhile i compile the code, i found some TODO and "function not implemented"

<almuhs>i found an error in the compilation process https://pastebin.com/eEfGAXuF

<youpi>it should be there in i386/include/mach/i386/machine_types.defs:type rpc_phys_addr_array_t = array[] of rpc_phys_addr_t;

<almuhs>in gnumach source code?

<almuhs>there is in the def, but not in the headers

<almuhs>pruebas@debian-hurd:~/rumpkernel-0~20211031+repack$ grep -rn rpc_phys_addr_array_t /usr/include/i386-gnu/mach/

<almuhs>/usr/include/i386-gnu/mach/gnumach.defs:209: out pages : rpc_phys_addr_array_t);

<almuhs>in gnumach's upstream source code, the type is defined in i386/include/mach/i386/machine_types.defs:105:type rpc_phys_addr_array_t = array[] of rpc_phys_addr_t;

<almuhs>and in i386/include/mach/i386/vm_types.h:97:typedef rpc_phys_addr_t *rpc_phys_addr_array_t;

<youpi>yes, and you need both installed

<almuhs>copying the files manually?

<youpi>depends how you did it for gnumach.defs, you probably want to do the same

<almuhs>currently I have the latest debian version of gnumach

<almuhs>**gnumach.defs

<youpi>yes, but you need the newer versions of the other files that gnumach.defs needs

<almuhs>but, if i copy the newer versions, maybe there are compatibily issues

<youpi>possibly

<youpi>you can try to copy/paste only the lines you need
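
The "copy only the lines you need" approach can be sketched in C as below. The array typedef is the line almuhs quoted from i386/include/mach/i386/vm_types.h. The definition of rpc_phys_addr_t here is only a placeholder assumption so that the fragment compiles on its own; the real one has to be taken from the gnumach tree, and the matching .defs line is still needed for MIG.

```c
#include <stdint.h>

/* ASSUMPTION: placeholder so this fragment compiles standalone.
   The real rpc_phys_addr_t is defined by the gnumach headers and
   may have a different underlying type. */
typedef uint64_t rpc_phys_addr_t;

/* Quoted in the log from i386/include/mach/i386/vm_types.h:97 */
typedef rpc_phys_addr_t *rpc_phys_addr_array_t;
```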

<almuhs>ok, after copying the definitions of rpc_phys_addr_t, I managed to compile rumpkernel and compile hurd with it. Now I am sure that the latest rumpkernel debian sources work correctly

<almuhs>other day i will try again with my new patches sources to add the prints

<almuhs>notice that I had to "make clean" before "./configure --enable-static (...) && make rumpdisk" to recompile rumpdisk.static

<youpi>sneek: tell almuhs later that one can cd rumpdisk ; make rumpdisk.static to build a static version

<sneek>almuhs, youpi says: later that one can cd rumpdisk ; make rumpdisk.static to build a static version

<youpi>eh?

<youpi>sneek: later tell almuhs one can cd rumpdisk ; make rumpdisk.static to build a static version

<sneek>Okay.

<solid_black>hi

<solid_black>Pellescours: but we did complete the investigation, didn't we

<solid_black>we don't have a good story about paging w/ rumpdisk, at least not one that I'm aware of

<solid_black>pageout, I mean

<solid_black>unless we can guarantee that rumpdisk can write pages being paged out without itself needing dynamic memory, I guess

<Pellescours>solid_black: yes

<solid_black>youpi: ping?

<youpi>pong

<solid_black>hi

<solid_black>I've been meaning to ask, what are *your* plans concerning aarch64-gnu?

<solid_black>are you going to review things? (the glibc port, the gnumach port)

<solid_black>should I just dump the gnumach port as one large patch?

<solid_black>do you have time for this?

<youpi>for the asm part I can't really review, I don't know much arm

<youpi>for the gnumach part I'm fine with committing it, provided you plan to support it :)

<youpi>for the glibc part, I can commit the parts that are very similar to the linux versions

<youpi>for more involved parts of glibc it's difficult for me to review

<solid_black>the situation is, I'm kind of out of breath with this

<solid_black>and I want more people to look at it, build it, play with it, etc

<solid_black>and looks like the only way that is going to happen is if it's upstream

<youpi>we can commit the gnumach part for a start, since it's already working quite well

<youpi>we're missing the gcc part, though, it needs to be pinged

<solid_black>the glibc part is actually not that complicated, and you don't need much asm knowledge

<solid_black>also Maxim K. of Linaro / glibc told me they (Linaro? glibc?) were going to review it

<youpi>ok, so we're "just" missing the gcc part

<youpi>oh good

<solid_black>but he evidently knew very little about the Hurd/Mach internals (i.e. didn't know what MIG is)

<youpi>sure, that's fine

<solid_black>apparently Linaro is intrigued

<youpi>that part I can probably review

<youpi>it's rather the setcontext/swapcontext and such asm tricky stuff that I'm not at ease with, but he would be

<solid_black>setcontext I don't think I have implemented :D

<solid_black>longjmp, I did

<youpi>heh :D

<youpi>but you get the idea

<solid_black>by copying the Linux version and adding the Hurdish onstack bits

<solid_black>but it needs more testing for sure

<youpi>sure but we can commit something that works already quite well, before going through more testing

<solid_black>for the gnumach port, can you review a GitHub branch, or would you rather I post it as a patch to bug-hurd?

<youpi>it'd be simpler to get a series on the list

<solid_black>'series' is complicated, I don't really know how to split this

<youpi>a big patch is fine

<solid_black>well some things like the device tree parser could be split out I guess

<youpi>as long as it's just additions

<youpi>(or trivial e.g. #if changes)

<solid_black>what about Debian, do you have any plans for adding the arch / building packages / etc?

<youpi>I actually already had submitted the request for arch addition

<solid_black>I saw that yeah

<youpi>the reply was that it needs to be commited upstream at least a bit :)

<solid_black>but it went nowhere

<youpi>it's not "nowhere"

<youpi>it's just that debian doesn't want to maintain patches

<youpi>so it insists on work being upstreamed, before downstreaming the support

<solid_black>why do all these projects insist on other bits being upstreamed

<solid_black>you have to start *somewhere* with upstreaming

<youpi>yes, and that's the toolchain

<solid_black>good thing at least binutils didn't want us to have a fully functioning system first

<youpi>sure

<youpi>since they can test some support without needing a system, actually

<solid_black>Linaro wants to know how they could test it on their CI

<solid_black>I told Maxim that eventually, there'd be Debian, and they could run it in qemu and run the testsuite

<solid_black>it sounded like this wasn't a requirement for upstreaming into glibc, but imagine it was

<solid_black>then it'd be a deadlock between debian and glibc

<youpi>glibc is fine with not running the testsuite itself

<youpi>they just want to be able to cross-build and run the cross-build test

<youpi>which thus only depends on the toolchain

<youpi>really, it's just toolchain -> glibc -> distrib

<solid_black>when reviewing, are you going to bootstrap/build all of this on your end?

<youpi>not for just the review before commit

<youpi>but as soon as debian adds the arch, I'll try to bootstrap the arch, and that'll bring the buildability question indeed :)

<solid_black>ok

<solid_black>so I should 1. ping Thomas again 2. post v3 of glibc port 3. try to post Mach port patches

<youpi>2 and 3 can be in parallel

<solid_black>I only have a single brain core :)

<solid_black>no SMP here

<solid_black>it's really that the way forward is to 1. build/bootstrap more things in userland 2. for people other than myself to review / build / play with aarch64 gnumach, and suggest/do improvements

<solid_black>to make hardware support more serious

<solid_black>oh, I forgot the important / problematic part

<youpi>sure, but that's after the initial work is committed

<solid_black>bootstrap / multiboot

<youpi>so gcc then glibc and gnumach

<solid_black>the way it's bolted on currently in the aarch64 branch probably breaks i386

<youpi>(though I'm fine with committing gnumach before gcc)

<solid_black>somebody just needs to look at it and make it work :|

<youpi>ah, that's a problem :)

<solid_black>also, unrelated, reminder that v2 of the gnumach arbitrary write exploit is still unfixed

<solid_black>we need to do something about that

<youpi>I don't have details about it

<solid_black>that's because I haven't shared them, sure :)

<solid_black>the good thing is, we can land public API in Mach (similar to the patch I posted back in January), and that should be enough for glibc/hurd

<youpi>indeed, we can already commit that if it's ready and doesn't break x86

<solid_black>it is ready (missing aarch64_debug_state, but that's only going to be needed for GDB), and doesn't break anything

<solid_black>and no incompatible changes are expected

<solid_black>well, also any potential SVE / SME state, but we won't support that for now

<youpi>solid_black: you mean the january series is unchanged?

<solid_black>no, there have been minor changes

<solid_black>it's very similar, but still incompatible

<solid_black>e.g. I've renamed EXC_AARCH64_FP_ID to EXC_AARCH64_IDF

<solid_black>because I mostly started by copying what Apple provided (plus some consulting of the Arm manual)

<solid_black>and now I've done *a lot* more reading of the ARM ARM, and got a lot more experience with aarch64 in practice

<solid_black>so I decided it makes more sense to call the various bits what ARM docs call them, rather than what would make more sense to someone who's not familiar with ARM

<solid_black>here's something that should be possible, and that I will maybe do some time when we have a more complete system: basic Linux syscall emulation, in userland, off-task

<solid_black>so a different task would set itself as exception handler, and get syscalls (EXC_AARCH64_SVC) and faults, and decode and translate them to Hurd-native

<youpi>you mean porting qemu-system-user ?

<solid_black>no

<youpi>why not?

<solid_black>I mean, you could do that too, that's unrelated

<youpi>that's in the end less work, too

<solid_black>maybe

<solid_black>what I'm talking about wouldn't do any recompilation, it would run the binary as-is, just translating syscalls

<youpi>that's what you would get with qemu+kvm

<solid_black>kvm is a huge other beast

<youpi>if you run the binary as-is, you need to catch system calls etc.

<youpi>doing so without using svm/vmx is probably quite a beast

<solid_black>yes, that's exactly what I'm talking about

<youpi>when svm/vmx is exactly meant for that

<solid_black>no it's not, that's my point

<youpi>it's not?

<solid_black>Mach traps very very intentionally made negative

<youpi>so you catch illegal instruction faults?

<solid_black>not by my, by original Mach designers

<solid_black>erhg, I can't type

<solid_black>s/very very/were very/

<solid_black>s/by my/by me/

<solid_black>I have an exception type, EXC_AARCH64_SVC, that gets raised when you try to perform an SVC instruction that doesn't look like a valid Mach trap

<solid_black>the linuxinator task would handle that by emulating the syscall
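
A minimal sketch of the idea described above, not solid_black's implementation. It assumes, as a simplification, that any negative call number is treated as a Mach trap and that everything else would be delivered as EXC_AARCH64_SVC to an off-task handler. The names classify_svc, emulate_linux_syscall and the emu_* handlers are hypothetical; the Linux numbers used (write = 64, getpid = 172) are the aarch64 ones, and the handlers are stubs rather than real translations to Hurd RPCs.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical routing decision made when an SVC instruction traps. */
enum svc_route { SVC_MACH_TRAP, SVC_RAISE_EXCEPTION };

/* ASSUMPTION: negative numbers are Mach traps, anything else is not. */
enum svc_route classify_svc(long number)
{
    return number < 0 ? SVC_MACH_TRAP : SVC_RAISE_EXCEPTION;
}

typedef long (*linux_handler_t)(const long args[6]);

/* Stub: pretend every byte was written; a real handler would call io_write. */
static long emu_write(const long args[6])
{
    return args[2];
}

/* Stub: made-up pid; a real handler would ask the proc server. */
static long emu_getpid(const long args[6])
{
    (void)args;
    return 1234;
}

static const struct {
    long number;
    linux_handler_t handler;
} linux_table[] = {
    { 64, emu_write },   /* Linux aarch64 write */
    { 172, emu_getpid }, /* Linux aarch64 getpid */
};

/* What the exception handler task would do with a decoded syscall.
   Returns the value to place in the result register, Linux style:
   a negative errno on failure. */
long emulate_linux_syscall(long number, const long args[6])
{
    for (size_t i = 0; i < sizeof linux_table / sizeof linux_table[0]; i++)
        if (linux_table[i].number == number)
            return linux_table[i].handler(args);
    return -ENOSYS;
}
```

With this layout, adding support for another Linux call only means adding one table entry and one handler; the faulting task's binary is never modified.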

<youpi>anyway, my point is: porting qemu-user makes sense and is useful on the long run. Doing something similar is probably very fun, but not clear it'll have the same impact long-term-wise

<youpi>(we could also do the converse, btw: port qemu-user to support the gnumach calls)

<solid_black>sure, I'm not saying this is better than a potential qemu port, or will be useful, or anything

<solid_black>just that it should be possible, and fun, and that I might do it

<solid_black>supporting Mach calls under qemu-user is probably a lot more complicated

<solid_black>whether on Linux or on Mach

<solid_black>does Debian enable PAC & BTI for aarch64?

<youpi>I don't know

<solid_black> https://wiki.debian.org/ToolChain/PACBTI

<solid_black>that sounds like it does since recently

<almuhs>my ahcisata_pci.c patched works

<sneek>Welcome back almuhs, you have 1 message!

<sneek>almuhs, youpi says: one can cd rumpdisk ; make rumpdisk.static to build a static version

<almuhs> https://pasteboard.co/GnAkz6XRu8lu.png

<almuhs> https://pasteboard.co/yyuZIxi51Hvf.png

<almuhs> https://pasteboard.co/5HSPIFrwXsPa.png

<almuhs>now i will copy pci_map.c patch

<solid_black>wrote way too many words in the commit message of the VM entry deadlock commit patch, as requested

<solid_black>now you'll have to read through it :D

<youpi>your future self will thank you so much 10yrs from now when you'll dig back into this question ;)

<Pellescours>:D

<almuhs>pci_map.c patch works too

<almuhs>now i will have a few more info when i try rumpdisk in real machines. I only have to copy the deb files and the rumpdisk.static and copy all in the harddisk to test

<almuhs>**copy and install

<solid_black>...and of course I made many typos in said commit message

<youpi>smp is supposed to be working with rumpdisk, right?

<youpi>it's getting stuck at:

<youpi> Enter evaluation : _SB.PCI0.SC0._ADR (Integer)

<youpi> Exit

<solid_black>it works for me, but I have 0 idea how to debug any rumpdisk issues

<almuhs>in my case, that error appears when i use an IDE disk with noide

<almuhs>or when i forget change "wd0 noide" to "hd0" in GRUB when I have a IDE disk in /dev/hd0

<youpi>I might be doing that indeed

<solid_black>uhh, what?

<solid_black>you're *supposed* to put in noide to use rumpdisk

<solid_black>noide doesn't mean "I hate IDE", it means to disable Mach's Linux drivers, to give rumpdisk a chance to drive the IDE

<almuhs>piixide (the IDE driver in rumpdisk) is buggy

<youpi>ok, after disabling the ide CD, it does boot

<almuhs>i don't know, anyone told me here some time ago

<youpi>(almost)

<almuhs>some weeks ago, someone sent a patch which solve CD with size 0

<youpi>does pfinet work in smp ?

<almuhs>so in upstream this problem is solved, i think. But in debian's gnumach, it's necessary to put in a cdrom in the drive to boot

<almuhs>dhcp works in smp

<almuhs>and i got network

<solid_black>does it?

<youpi>my boot gets stuck after Starting system message bus: dbus.

<almuhs>yes

<solid_black>do you mean full smp, or damo22's mode where everything runs on a single cpu?

<youpi>usually next step is Starting internet superserver: inetd.

<almuhs>yes, sometimes freeze during the boot

<youpi>solid_black: I'm using the current master, iirc that runs on a single cpu

<solid_black>yes

<almuhs>with old damo22 patch which modifies the scheduler, it needs "only" 2 or 3 attempts to get boot

<solid_black>so there's almost no reason for that to work differently than non-SMP

<solid_black>(other than enabling the SMP codepaths in Mach)

<solid_black>there's no additional concurrency introduced etc

<almuhs>without this patch and enabling all cpus during the boot, it needed more than 20 attempts to boot

<almuhs>i don't know if latest patches to fix some race conditions improve it a bit

<youpi>ok, I don't think the SC0 issue was about ide: it is still hanging without it

<youpi>I just got lucky right after disabling the ide cd

<solid_black>with damo22's patch reverted and my patches that I posted, my system booted to dhcp, and then hung

<solid_black>with all cpus being idle

<almuhs>try again

<almuhs>try a couple attempts

<solid_black>and I didn't think to ask kdb about where they're blocked

<solid_black>also I don't yet fully understand how kdb works, so aarch64 doesn't really support it yet

<almuhs>i simply turn on the VM, fix filesystem and try again

<solid_black>yeah, I'm so not going to do that

<solid_black>I should make snapshots instead, so fs as stored doesn't get corrupted

<youpi>note: the SC0 issue seems related to interrupts

<youpi> Enter evaluation : _SB.PCI0.SC0._ADR (Integer)

<youpi> Exit irq handler [9]: new delivery port f64cf3d0 entry f5639ec0

<youpi>which I'm really not surprised of

<youpi>the interrupt delivery path most probably didn't get completely cleaned

<youpi>I disabled the hurd console, it went further, almuhs: do you use the hurd console?

<youpi>do apic-enabled kernels work fine (without smp), for a start, actually?

<almuhs>yes, i use hurd-console

<almuhs>because i configure keymap in spanish

<almuhs>to test smp with minimal race conditions, i keep this patch https://git.zammit.org/gnumach-sv.git/commit/?h=fixes&id=0fe92b6b52726bcd2976863d344117dad8d19694

<almuhs>and disable the other patch which set all process to cpu0

<almuhs>it's a temporary solution

<almuhs>sometimes i need a few attempts to get boot

<almuhs>with upstream code in smp, with the patch which assign all to cpu0, i usually get boot without problem

<youpi>better avoid these patches which only bury the issue, to get the issue all the time and fix it :)

<youpi>assigning all to cpu0 is not yet the default in upstream?

<solid_black>it is

<almuhs>yes

<youpi>I don't understand why you say "with the patch which assign all to cpu0" then

<almuhs>upstream smp

<youpi>what do you mean by upstream smp ?

<almuhs>compiling upstream with NCPUS > 1

<solid_black>commit aadb433981b086bfb4e082757fed1154582d5497 fwiw

<almuhs>gnumach's upstream source code, compiled with --enable-ncpus > 1

<almuhs>i use this script to compile gnumach, btw https://github.com/AlmuHS/gnumach_dev_scripts/blob/main/compile_scratch.sh

<youpi>(btw it'd have been good to submit a patch that disables the linux group and enables apic, instead of letting me have to figure it out)

<almuhs>i put this flags under your instructions

<almuhs>so many time ago

<youpi>that doesn't mean that we want to have to do that longtermwise

<almuhs>btw, this script also works changing the SRC_PATH to the directory where i git clone the gnumach's savannah directory

<youpi>ok, disabled inetutils-inetd too, and now it boots

<youpi>as well as lightdm

<almuhs>:)

<youpi>ah it seems it's lightdm which poses problem

<almuhs>lightdm never worked for me

<almuhs>since many years ago

<youpi>I'm not saying it doesn't work, I'm saying it hangs the boot

<almuhs>oh, ok

<youpi>while in up it doesn't pose any problem

<youpi>(it doesn't work either, but doesn't bother the boot)

<solid_black>speaking of the Hurd console

<youpi># update-rc.d enable ssh

<youpi>hangs...

<solid_black>from what I've read, it doesn't sound like there's an easy way to get a text-mode console in the aarch64 world (or anywhere but x86)

<youpi>solid_black: but there's a serial port, isn't there?

<youpi>so gnumach at least has a console

<almuhs>youpi: but, which is your kernel configuration? upstream's gnumach?

<youpi>and then you "just" need a frame buffer client for the console :)

<youpi>almuhs: yes

<solid_black>so so far, it sounds like Mach console is going to always be going to a serial port, and the Hurd console would be rendering characters to any graphics displays

<almuhs>ok

<solid_black>but this also means that you won't see kernel boot-up logs scroll by

<almuhs>i go to make git pull and compile again upstream

<youpi>solid_black: not really a problem for non-dev users :)

<youpi>ok, ssh does pose problem too in smp boot

<youpi>still, I'd really say to fix the rumpdisk booting issue first, since it's a very deterministic one

<almuhs>it's very important to get rumpdisk detect real hdd

<Pellescours>for the IDE rumpdisk issue, note that it’s not related to apic nor SMP. If you use IDE with PIC and 1 cpu, the problem will occur

<almuhs>upstream's gnumach compiled with my script, and in a VM with AHCI and rumpdisk, work successfully

<almuhs>this latest test has been done with a real HDD connected to my VM. I go to put this HDD in a Thinkpad T440p, to check the results of my rumpdisk prints (probably it will not detect the hdd)

<youpi>Pellescours: I never got this issue, in which condition are you reproducing it?

<youpi>almuhs: but with damo22's workaround, right?

<almuhs>not, with upstream compiled with my script: https://pasteboard.co/OIrBGIKe8EN4.jpg

<youpi>ok, but then try to reproduce it with qemu, and there you'll be in a situation to fix it

<almuhs>in qemu it works successfully

<almuhs>with the same hdd

<solid_black>oh, you mean _real hardware_

<youpi>"same hdd", with qemu?

<almuhs>yes

<youpi>I don't understand

<youpi>how can it be the same hdd

<almuhs>i used a real harddisk

<youpi>with qemu?

<almuhs>i connected this harddisk to qemu, installed Debian GNU/Hurd with everything necessary

<youpi>ok, so I have to be explicitly clear: how do you plug it?

<youpi>that's my point

<almuhs>and after this, i disconnected the harddisk and put it in a Thinkpad T440p

<almuhs>with a USB-SATA adapter

<youpi>no, to qemu I mean

<youpi>gnumach doesn't care how it's plugged to the computer

<almuhs>file=/dev/sdb

<youpi>ok

<youpi>through an ahci controller?

<youpi>with media=disk?

<almuhs> https://github.com/AlmuHS/gnumach_dev_scripts/blob/main/qemu-hurd.sh

<youpi>all details do matter

<almuhs>change FILE=/dev/sdb

<almuhs>and chmod 777 /dev/sdb

<almuhs>to get qemu has permission to access

<solid_black>so that it's fully accessible even to most sandboxed code :)

<almuhs>yes, i only need it to install the system from qemu

<almuhs>after this, i got to run the debian-hurd-2023 installer with noide , and as this way, install the system in the disk from qemu

<almuhs>once installed, i boot the system adding noide to GRUB config

<almuhs>and once booted, i manage this as a common VM

<almuhs>configure debian repositories, apt update, apt upgrade, apt full-upgrade

<Pellescours>youpi: if you take the debian VM, you boot it with rumpdisk and your disk in IDE mode, you’ll have the "Enter evaluation : _SB.PCI0.SC0._ADR (Integer)…" errors

<youpi>I'm not using ide mode

<almuhs>then i compiled other gnumach (upstream, smp with my patch, upstream-smp) and upload this gnumach to the VM. Make update-grub to add all to the list

<youpi>but if you guys see it as a way to reproduce the issue, then use it

<youpi>and you'll be able to fix it

<almuhs>youpi: don't forget to add -M q35 in qemu

<almuhs>qemu-system-i386 -M q35 -m $MEMORY $OPTIONS -smp $NCPUS

<almuhs>latest line in my script

<youpi>ah, if you hide options, I can't see them

<almuhs>if this option is not set, gnumach detects the disk as IDE, even if you are using ahci flags

<youpi>that rather makes it hang earlier

<youpi>[ 1.0200050] vendor 8086 product 100e (ethernet network, revision 0x03) at pci0 dev 3 function 0 not configured

<youpi>[ 1.0200050] ahcisata0 at pci0 dev 4 function 0: vendor 8086 product 2922 (rev. 0x02)

<youpi>stuck there

<youpi>kvm -cpu host -smp 2 -m 4G -chardev stdio,signal=off,id=stdio -serial chardev:stdio -machine q35 -device ahci,id=ahci1 -drive id=boot,if=none,format=raw,media=disk,file=/root/boot,cache=writeback -drive id=root,if=none,format=raw,media=disk,file=/home/hurd,cache=writeback -device ide-hd,drive=boot,bus=ahci1.0 -device ide-hd,drive=root,bus=ahci1.1 -vga std -net tap,script=/root/ifup-start-hurd -net nic,model=e1000 -net nic,model=e1000 -gdb tcp:127.0.0.1:12345

<youpi>[ 1.0200050] WARNING: DELAY ESCAPED

<almuhs>other thing: i had to use a raw image

<youpi>ah :)

<youpi>so it *really* looks like an irq routing issue

<youpi>it's a raw image

<youpi>but again, again, again

<youpi>the more you use circumventions to hide a bug

<youpi>the more difficult it will be to fix the bug

<youpi>rather just fix the bug in a situation where it happens all the time

<youpi>instead of pushing it away, only for it to bite you later in a situation which will be *way* more difficult to debug with whatnot processes doing all kinds of stuff altogether

<almuhs>"just". I don't know how rumpdisk works

<youpi>I don't either

<youpi>so it's no reason :)

<almuhs>btw, i use qemu, not kvm directly

<youpi>it's the converse :)

<youpi>kvm is a lightweight frontend to just run qemu -kvm

<almuhs>yes, i know

<almuhs>but i see many options in your line

<youpi>not that much more than yours

<youpi>(and most of them completely unrelated)

<youpi>ok, now it did pass the driver hang, by luck

<youpi>still stuck at starting sshd

<almuhs>my network options are only set to be able to disable dhcp, and know a correct address to set

<almuhs>because some time ago, i found that the boot usually freeze in dhcp step. Now it seems solved, but i keep this option

<almuhs>now i'm using dhcp without problem

<almuhs>but in the first smp patches sent by damo22, pfinet freezed at dhcp

<youpi>it very quickly hangs whatever I do

<almuhs>have you tried to install the system from the debian installer instead of using the img file?

<youpi>again

<youpi>there's no point trying to find situations where it works

<youpi>to make progress, one has to fix the situations where it doesn't work

<youpi>otherwise we'll keep staying in a niche situation where you have to know that you have to align all stars^Woptions to get things work only by sheer luck

<youpi>for a start, was it checked whether interrupts get routed to the BSP only or also on the AP?

<youpi>and if the latter case, if things happen correctly?

<almuhs>which are working in i/o is damo22

<almuhs>*who

<almuhs>in smp, i only worked in cpu startup and configuration, and IPI sending. But the APIC configuration and most of I/O is from damo22, because it's a weakness for me

<youpi>not a weakness

<youpi>just not knowing about it *yet*

<youpi>it's just about reading stuff, putting prints to see what happens, and working it out

<youpi>seeing what you have already achieved, you can do that, it just takes time

<almuhs>i work better in a more deterministic environment

<almuhs>and the i/o is not deterministic for me

<youpi>I wonder why you started working on smp, which is the most non-deterministic world :)

<youpi>i/o is completely deterministic, on the other hand

<youpi>there's just one hard drive

<almuhs>i started in smp as a chance, XD. I notice contradictory that a thread-based system only was using a unique cpu

<almuhs>btw, i need to learn how to debug a hurd server meanwhile it's running

<youpi>we have been using unique-cpu systems for decades before multicore went usual ;)

<almuhs>the patch which i sent some years ago to fill last-processor field in the stat file doesn't works: last-processor keeps to 0 even when debugging the processor table from gnumach it shows other different value

<almuhs>debugging gnumach from gdb in my smp environment, i noticed that last_processor many times is different to 0. But, i don't know if proc or procfs, continue reading 0 in last_processor

<almuhs>in other words: when last_processor in gnumach is different than 0, in hurd appears as 0

<almuhs>in the /proc/PID/stat
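
To check what userland actually sees, the field can be read back from the stat line directly. The sketch below assumes that the Hurd's procfs follows the Linux proc(5) layout, in which the "processor" value is field 39; stat_last_processor is a hypothetical helper name. It skips past the last ')' because the command name in field 2 may itself contain spaces or parentheses.

```c
#include <stdlib.h>
#include <string.h>

/* Return field 39 ("processor" in the Linux proc(5) layout) of a
   /proc/PID/stat line, or -1 if the line is too short or malformed.
   ASSUMPTION: procfs on the Hurd uses the same field order. */
long stat_last_processor(const char *stat_line)
{
    /* Field 2 is "(comm)"; everything after the last ')' is
       space-separated, starting with field 3 (state). */
    const char *p = strrchr(stat_line, ')');
    if (p == NULL)
        return -1;
    p++;

    int field = 2;
    while (*p != '\0') {
        while (*p == ' ')
            p++;
        if (*p == '\0' || *p == '\n')
            break;
        field++;
        if (field == 39) {
            char *end;
            long value = strtol(p, &end, 10);
            return end == p ? -1 : value;
        }
        /* Skip over the current token. */
        while (*p != '\0' && *p != ' ')
            p++;
    }
    return -1;
}
```

Feeding it the contents of /proc/PID/stat for a thread that gnumach reports on another processor would show whether the 0 comes from procfs output or from further up the chain.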

<almuhs>so i need to debug proc and procfs to follow the sequence of this data, to find where is the fail

<youpi>simplest is to use mach_print()
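
A small sketch of print-style tracing with mach_print(), under stated assumptions: that on GNU/Hurd mach_print(const char *) is made available by including <mach.h>, and that a formatted wrapper is wanted. debug_trace is a hypothetical helper, not part of the Hurd or glibc. On other systems a fallback writes to stderr so the fragment can still be compiled and tried.

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef __GNU__
#include <mach.h>   /* ASSUMPTION: declares mach_print() on GNU/Hurd */
#else
/* Fallback for non-Hurd systems, only so the sketch can be tried out. */
static void mach_print(const char *s)
{
    fputs(s, stderr);
}
#endif

/* Format a message into buf and hand it to mach_print().
   Returns what vsnprintf returns: the length the full message needs. */
int debug_trace(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    mach_print(buf);
    return n;
}
```

A call such as debug_trace(buf, sizeof buf, "last_processor=%d\n", value) placed in proc or procfs would put the value on the kernel console without needing a debugger attached to the server.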

<almuhs>but it can be useful connect gdb or any similar to the servers

<youpi>yes but not all kinds of servers

<youpi>not proc for instance

<youpi>for procfs you'd have to be careful since then you cannot access /proc any more

<almuhs>i need to check the status of last_processor and related struct

<almuhs>the files which have been modified in these patches https://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?qt=grep&q=last_processor

<almuhs>mmm... maybe i have to compile this servers with a gnumach-smp ?

<almuhs>like we have just done with rumpdisk

<youpi>no

<almuhs>then it's a bug

<youpi>the headers don't change whether smp is enabled or not

<Pellescours>I just realized that the ton of "Enter evaluation…" and "Exit evaluation…" messages are debug messages from rumpkernel. That surprised me because they did not appear at the beginning (when I brought piixide to rumpdisk 2 years ago). But there is still the lost interrupt

<Pellescours>piixide0:0:0: lost interrupt

<Pellescours>type: ata tc_bcount: 512 tc_skip: 0

<Pellescours>and iirc the piixide driver always had this issue, but iirc the issue appeared only when I was shutting down the VM, not at the boot

<almuhs>i had seen this same issues in real hardware

<almuhs>when I configure my old Thinkpad in compatibility mode

<Pellescours>I’m wondering whether this lost interrupt issue is in gnumach/hurd or in the rumpkernel piixide driver, because ahcisata behaves correctly

<almuhs>here in a T60 https://pasteboard.co/han2dvzvExQE.jpg

<almuhs> https://pasteboard.co/VPq9dJBGiYnu.jpg

<almuhs>sometimes, in some models, like R61i, when I configure SATA in AHCI mode in the BIOS, the HDD continues being detected as IDE

<almuhs>and in T60 i think that there are the same issue

<Pellescours>Maybe updating the netbsd source to latest will fix piixide

<almuhs>added to this, in real hardware, i think that the problem is that rumpdisk is not detecting the interface correctly. That the detection problem is not from the disk, instead it doesn't detect the interface

<Pellescours>if you try to boot NetBSD on this hardware, you may get some answers. If the disk is not detected, then the driver needs to be patched. Otherwise it’s our integration that is buggy

<almuhs>it's a good idea

<almuhs>Pellescours: what version of netbsd i have to try? the latest?

<almuhs>latest is 10.0

<Pellescours>yes the netbsd version we imported for rumpdisk is from 2021

<almuhs>10.0 is 2024 march

<Pellescours>try this one first

<almuhs>try 10.0?

<Pellescours>yes

<almuhs>ok

<gnucode>I've got a qoth for q1 almost done. Anyone want to volunteer to proof read?

<gnucode>sneek: later tell solid_black that I am about to submit a qoth for Q1 of 2024. I would like to mention for Alpine Hurd distribution. Do you have a git repo somewhere?

<sneek>Okay.

<almuhs>hi. After try NetBSD 10.0 i386 in a Thinkpad T440p, it detects the HDD correctly

<almuhs>it shows as wd0

<almuhs>Pellescours: https://pasteboard.co/cwohZvRRDseW.jpg

<gnucode>that's interesting.

<almuhs>and the T440p is too modern to support "compatibility mode"

<gnucode>I still haven't gotten the Hurd to run on the T410. Not that I've tried very hard.

<almuhs>with the gnumach's IDE driver, setting the SATA interface in compatibility mode, you can install Hurd in the T410 without problem. The problem is with rumpdisk

2024-04-05.log