IRC channel logs
2022-02-07.log
<youpi>look in config.log for what actually happens
<Pellescours>not really helpful, but I forced the flag to true to see and…
<Pellescours>/usr/bin/ld: /root/hurd/build/pci-arbiter/../../pci-arbiter/device_map.c:44: undefined reference to `pci_device_map_legacy'
<Pellescours>library version is not the latest one compared to what hurd expects
<youpi>(00:23:59) Pellescours: It’s just saying: no
<youpi>the config.log created by ./configure in the hurd tree?
<youpi>so it's just the pkg-config test that fails
<youpi>check pkg-config --exists pciaccess
<youpi>and check that you indeed have pciaccess.pc
<Pellescours>I don’t have pkg-config installed, I remember now. I already had this issue with another VM
<Pellescours>youpi: from what I see, it seems that the process is not dead, I can still see my logs coming from pci_device_open() func
<youpi>ah probably it's just the bug about bootstrap not getting properly named
<youpi>so possibly what got bogus is libpciaccess accessing pci-arbiter through device_open("pci")
<Pellescours>yeah, and because pci-arbiter is not properly named another pci-arbiter starts
<youpi>no, the start doesn't care about naming
<youpi>it tries to device_open("pci")
<youpi>see libpciaccess' initialization function
<Pellescours>but I did a settrans -g /servers/bus/pci to prevent the new one from booting and the old one is not providing pcifs tree
<Pellescours>so it’s fine if 2 pci-arbiter processes are running?
<youpi>it should be attaching itself to /servers/bus/pci
<youpi>but that's a different story than having device_open("pci") working
<youpi>which is what rumpdisk needs
<youpi>it's fine if the second is using libpciaccess to access the first
<youpi>it's not if it's using libpciaccess to poke with x86 ports
<Pellescours>So, I only have the pci-arbiter that I launch at boot to start (settrans -g command), and I can see the logs for the pci_device_open.
<Pellescours>I was able to start rump (I can see the logs) but rump was not able to see my ahcisata disk
<Pellescours>my logs show that at some point, it fails to do device_open when pci_device_open is called
<youpi>possibly e.g. mapping BARs doesn't work or such
<youpi>that can be a debugging starting point indeed
<Pellescours>it succeeds in doing it during the startup but when I start rump the calls fail
<youpi>with more prints you can probably determine what exactly fails
<youpi>and also print on the other side, in the arbiter, in the device_open RPC
<youpi>(for a start, making sure that it's that that gets called)
<youpi>in a word: check the whole path, to work out where exactly things go wrong
<Pellescours>during boot pci_device_open is called a lot of times, most succeed but some failed with error "no such device", then when I start rump a first call succeeds and then the 2 others fail with the same error "no such device"
<youpi>note that pci_device_open is a forwarder
<youpi>when calling device_open("time"), that goes through it
<youpi>those are expected, make sure to print the name to rule them out from your debugging
<youpi>or do you mean after the if which is after that?
<youpi>(that code is bizarre, why setting err, it could just use strncmp in the if itself)
<youpi>(there are two in that function)
<youpi>please really always be as explicit as possible
<youpi>otherwise it's just guesswork
<Pellescours>the first one. I added a mach_print(name) to know which failing calls are for the pci
<youpi>the first one is precisely *not* for pci questions
<youpi>but for device_open("time") and whatnot
<youpi>which happen to go through pci-arbiter just because it interposes the device master port
<Pellescours>the 2 calls that fail after rump starts are for wd0 and disk:wd0, which is normal because rump did not find the sata disk
<Pellescours>So the open of the pci device succeeds, I’m adding logs to libpciaccess to understand
<Pellescours>I don’t know if it’s useful, but when I do lspci with the "normal" boot, I get a result. and when I do lspci with my pci-arbiter started at boot (and no other arbiter) I get an error
<youpi>IIRC lspci uses /servers/bus/pci
<Pellescours>I don’t know if lspci reads the /servers/bus/pci tree or if it does… raced
<youpi>which is about pci-arbiter attaching itself to the filesystem, that wouldn't be related to rumpdisk
<Pellescours>I built libpciaccess but I don’t see any .o nor .so :/
<damo22>ouch, looks like a lot of issues here in the backlog of irc
<damo22>i need to go out, i will be back to take a look in ~4 hours
<damo22>hi, did you get any further with your debugging?
<damo22>i havent tried upgrading my hurd system yet to latest packages
<damo22>from reading the above, it sounds like a libpciaccess problem, or invocation of libpciaccess in pci-arbiter?
<damo22>the q35 machine by default has an AHCI controller, i hacked my qemu to not include the ahci controller by default and then added one manually so i could refer to it in the bus= specification
<damo22>are you sure you are attaching your disk to the correct controller?
<damo22>ok i will load my image and check what version of libpciaccess i am using
<damo22>so should i attempt an upgrade of my system?
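Editorial sketch, not part of the log: youpi's remarks above about pci_device_open being a forwarder (non-PCI names such as "time" pass through it, and the name check could sit directly in the if) might look roughly like the toy below. Every name here (toy_device_open, toy_master_open) is hypothetical and the real pci-arbiter code is not reproduced; the sketch only shows printing the requested name and testing it with strncmp inside the condition.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the device master port: it only knows "time". */
int toy_master_open(const char *name)
{
    return strcmp(name, "time") == 0 ? 0 : ENODEV;
}

/* Hypothetical forwarder: handles "pci" itself and passes every other
   name on.  Returns 0 on success or an errno-style code. */
int toy_device_open(const char *name, int (*forward)(const char *))
{
    assert(name != NULL && forward != NULL);

    /* Print the name so forwarded requests can be told apart from PCI ones. */
    fprintf(stderr, "device_open(\"%s\")\n", name);

    /* sizeof "pci" is 4, so the terminating NUL is compared as well and
       only the exact string "pci" matches. */
    if (strncmp(name, "pci", sizeof "pci") != 0)
        return forward(name);

    return 0; /* the request was for the arbiter itself */
}
```

With this toy, a request for "wd0" is forwarded and comes back as ENODEV ("No such device"), which is the kind of failure that is unrelated to PCI and should be ruled out while debugging.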
<youpi>you'll probably get the break, yes
<damo22>ext2fs: part:2:device:wd0: No such device or address
<damo22>I noticed in libpciaccess you are calling device_close() on pci_port:
<damo22>does that still allow the enumeration to occur later?
<youpi>but in principle the obtained root port should be working fine
<youpi>whether the device open that we used to obtain it is still open or not
<damo22>i need to check if this commit is present in Version: 0.16-3+hurd.1 of libpciaccess0
<damo22>* 740d2f2 (origin/master, origin/HEAD) hurd: Restore initialization order
<youpi>otherwise netdde wouldn't work at all
<youpi>that was the point of that commit :)
<damo22>hmm, thats interesting then, i can only see 3 lines changed in pci_system_hurd_create() and they seem harmless
<damo22>but the rest of the changes, the mapping, i have no idea about that
<youpi>possibly that broke the use in rump, no idea
<youpi>does rump's libpciaccess usage print warnings if pci functions fail?
<youpi>never leave an error silent :)
<damo22>its probably a regression in pci-userspace
<damo22>i can add more debug prints in the debug mode
<damo22>so did the api change for libpciaccess region mapping?
<youpi>normally it shouldn't have, but see Joan's changes
<youpi>perhaps try to grab version 0.16-1+hurd.8 from snapshot.debian.net
<youpi>that one integrated the map patch
<youpi>from what I can see, there is no source difference between 0.16-1+hurd.8 and 0.16-3+hurd.1
<youpi>and that patch is the only difference between 0.16-1+hurd.7 and 0.16-1+hurd.8
<youpi>it'd be really good if people were testing their changes against various scenarii
<damo22>im a bit confused, pci_device_hurd_map_range() uses _SERVERS_BUS_PCI to read the pci device tree, but when its a bootstrap filesystem, that wont exist right?
<damo22>do we ever use the hurd access method during bootstrap?
<youpi>that's indeed very probably the problem
<youpi>Joan introduced that so that delegation can work through /servers/bus/pci
<youpi>but indeed at bootstrap that way can't work
<youpi>perhaps pci_device_hurd_map_range should try pci_device_x86_map_range first, and revert to /servers/bus/pci if that fails
<youpi>similarly to pci_system_hurd_create that tries the pci device first
<youpi>note however the comment that Joan had on the list
<youpi>he said that my "fix" patch broke his usage
<youpi>because that ordering doesn't match his case either
<youpi>really, it's just a matter of taking into account the different situations
<youpi>please, people, avoid staying focused on your situation only, and make sure that everyone's situation works :)
<youpi>it shouldn't happen that I'd be the one doing it, since it'd mean having to magically find out the time to do it
<damo22>how can we automate testing this stuff
<damo22>if there was a couple of unit tests, we could ensure nothing breaks before sending in patches
<youpi>debootstrapping a filesystem, running some commands to set up what should be done
<youpi>and then you can run it in qemu
<youpi>yes, unit tests can also help
<youpi>but nothing replaces actual end results
<youpi>the problem with unit tests is getting the situation properly
<youpi>here, the bootstrap situation is really not easy to obtain
<youpi>which you cannot reasonably do on a running system
<youpi>and it's the eventual situation that matters anyway, so even if unit testing can be helpful to pinpoint, the actual eventual situation is what really needs to be checked
<damo22>i will send an email to the list to start a discussion about this problem
<damo22>maybe Joan can clarify his use case and we can fix it
<youpi>his use case is simply going through /servers/bus/pci
<youpi>so that setting permissions on these files is enough to give somebody access to a pci card
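Editorial sketch, not part of the log: the ordering youpi floats above (try the direct x86 mapping first, fall back to /servers/bus/pci if that fails) can be illustrated with a generic try-then-fall-back helper. Nothing below is libpciaccess code. The map_fn type, map_range_with_fallback and the toy_* back ends are hypothetical stand-ins, and the log itself notes that this ordering did not suit Joan's use case, so treat it as an illustration of the idea rather than a proposed fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical signature for one way of mapping a PCI region.
   Returns 0 on success or an errno-style code on failure. */
typedef int (*map_fn)(void *dev, void **out_addr);

/* Try the direct method first; use the filesystem-based method only
   when the direct one reports an error. */
int map_range_with_fallback(void *dev, map_fn direct, map_fn via_fs,
                            void **out_addr)
{
    assert(direct != NULL && via_fs != NULL && out_addr != NULL);

    if (direct(dev, out_addr) == 0)
        return 0;

    /* Direct access failed (for example, no permission to touch the
       hardware), so go through the filesystem node instead. */
    return via_fs(dev, out_addr);
}

/* Toy back ends, present only to exercise the ordering. */
int toy_region;

int toy_direct_ok(void *dev, void **out)   { (void)dev; *out = &toy_region; return 0; }
int toy_direct_fail(void *dev, void **out) { (void)dev; (void)out; return EPERM; }
int toy_fs_ok(void *dev, void **out)       { (void)dev; *out = &toy_region; return 0; }
int toy_fs_missing(void *dev, void **out)  { (void)dev; (void)out; return ENOENT; }
```

In the bootstrap case discussed in the log, the filesystem node does not exist yet, which corresponds to pairing toy_direct_ok with toy_fs_missing: the call still succeeds because the filesystem path is never consulted. When both methods fail, the caller sees the error from the fallback, and a real implementation would want to log both errors.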