IRC channel logs

2019-11-17.log

***Server sets mode: +nt
<damo22>it almost detects the disk on real hw, but i think the timeout i added to work around VM is causing issues on real hw
<damo22>i will try removing the timeout i added and see if it works on real hw
<youpi>damo22: which kind of timeout? usually vms are faster than real hw :)
<damo22>inside the rump ahci driver there is "tsleep" waiting for ahci command to complete
<damo22>on vm it never completes
<damo22>so i made it 100ms
<damo22>then it works
<youpi>maybe you can check inside the qemu code what gets triggered and what doesn't work there
<youpi>which could give the reason why
<damo22>im not sure what im looking for there
<damo22>but first i want to try on real hw with reverted change
<damo22>if it fixes the problem on real hw i will look into qemu bug
<damo22>its very confusing because i have statically compiled rump libs into libmachrumpdev, so every time i change it i have to remember to recompile that
<damo22>and repack the initrd
<damo22>otherwise im looking at old code
<damo22>running old code*
<youpi>that's a common mistake, running the old code
<youpi>I usually increment a printf in the different pieces, to make sure what I'm running
<damo22>storeio got killed
<damo22>just when it was about to mount
<damo22>you cant set storeio as active translator?
<damo22>it seems to die
<damo22>anyway, there is still a problem with my storeio
<damo22>rumpdisk seems to work fine
<damo22>i repacked an updated libstore.so.0.3 into the initrd with my changes which work on VM
<youpi>setting it as active translator should work
<damo22>maybe my libstore.so.0.3 has missing libs
<damo22>in the initrd
<damo22>/lib/ld.so.1 --list /lib/libstore.so.0.3 has no missing symbols
<damo22>since rumpdisk runs the rump_init at device_init time, i can see the log of it identifying the disk every time it starts even running it as a non-translator
<damo22>it reliably identifies the disk with the reverted timeout
<damo22>but it doesnt mount at all for some reason
<damo22>it just hangs
<damo22>in a vm i can use a second controller so i have access to my code
<damo22>but on real hw i cant do that
<damo22>with the same static rumpdisk binary, it has different behaviour on vm and hw
<damo22>at least now i can gdb it on vm
<damo22> http://paste.debian.net/plain/1116590
<damo22>it seems like there are two threads waiting on a sleep, and whatever is managing the threads got stuck
<damo22>because when i give it 2 virtual cpus in rump, it doesnt get stuck anymore, but fails to identify
<damo22>"atath" and "ahcicmd"
<damo22>are both tsleeping
<damo22>seemingly with the same priority
<damo22>ive seen code that bumps the priority for some tsleeps, so i will give that a go
<damo22>perhaps (PRIBIO + 1) | PNORELOCK
<damo22>i have no idea what im doing but it looks right
<damo22>ive got it working in the VM again, now i need to test if this works on hw
<damo22>hmm i can cat the block device but cant mount it
<damo22>on hw
<damo22>i can cat the block device to /dev/null
<damo22>then i get a deallocation of a bogus port and real time signal 0
<damo22>actually dd
<damo22>maybe i should remove some of my mach_prints
<damo22>its going berserk
<damo22>youpi: master-user_level_drivers branch crashes on netdde
<damo22>do i need to use the debian one with that last patch cherry picked?
<damo22>mach_print is really annoying because you need to be able to expand values in it
<damo22>it should accept variable length arguments
<damo22>i am considering writing mach_printf
<damo22>heh i did it in one line
<damo22>#define mach_printf(fmt, ...) printf(fmt, ##__VA_ARGS__)
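Note on the one-liner above: it forwards to printf, and `##__VA_ARGS__` is a GNU extension. If the goal is to keep sending output through a routine that takes one plain string, a minimal sketch could format into a buffer first. Everything named here (`sink_print`, `sketch_printf`, the 256-byte buffer) is a hypothetical stand-in, not code from the Hurd tree; `sink_print` just writes to stdout so the sketch runs anywhere.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a print routine that accepts one plain string.
   It also keeps a copy of the last message so the result can be inspected. */
static char last_message[256];

static void sink_print(const char *s)
{
    strncpy(last_message, s, sizeof last_message - 1);
    last_message[sizeof last_message - 1] = '\0';
    fputs(s, stdout);
}

/* Format into a fixed-size buffer, then hand the finished string on.
   Output longer than the buffer is truncated by vsnprintf.
   Returns the length the full formatted string would have had. */
static int sketch_printf(const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    sink_print(buf);
    return n;
}
```

A call such as `sketch_printf("port %d at %s\n", 131, "wd0")` expands the values before the string reaches the sink, which is the behaviour asked for above.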
<user_oreloznog>Hello Hurd ! o/
<damo22>im really confused, isnt pci-arbiter supposed to take control of all the ioports?
<damo22>or at least the pci-cfg ones
<youpi>damo22: you need the debian patch over it to put the interface into the experimental mig ids
<youpi>damo22: pci-arbiter doesn't need to access all ports, only the configuration ports
<youpi>for now it *has* to leave ports ready for userland pci drivers to use them
<youpi>at some point we might make pci-arbiter a proxy for the io perm permission, but we are not there yet
<damo22>is it safe to upgrade my packages?
<damo22>i guess i should back up my /
<damo22>i think pci cfg2 is not a good one to protect because some hw uses it
<damo22>c000- cfff
<damo22>youpi: how can i see why rumpdisk/ext2fs is hanging on mounting a partition, but it seems to read fine as a raw block device
<damo22>i am very limited by initrd
<damo22>i tried squashing a 1.6G rootfs in there down to 120MB with a dev environment but it ran out of memory at ramdisk uncompression
<damo22>it now mounts fine on VM
<damo22>im getting a lot of noisy mach_prints deallocating a bogus port 131 and 141
<damo22>but it mounts
<youpi>these deallocation warnings are very concerning
<youpi>it means some code is going really wrong, and that really needs to be fixed
<youpi>otherwise it'll release completely unrelated ports
<youpi>leading to closing files that shouldn't be etc.
<damo22>i dont understand ports enough to know what im doing
<youpi>ports are not very far from fd
<youpi>to track what is deallocating a bogus port, see mach_port_deallocate
<damo22>mach_device{ } is a struct with a port at the beginning and next is the device emul struct
<youpi>set mach_port_deallocate_debug to 1 to trigger the debugger in that case
<youpi>and there you can get a backtrace
<youpi>yes, that's expected
<youpi>and for userland a port is a number that references to that structure
<youpi>just like file descriptors refer to an open file
<damo22>it seems that when i overload that struct with a new one i have to put the port first
<damo22>so it matches up
<youpi>yes
<damo22>and i can add more elements
<damo22>at the bottom
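The layout rule discussed here (port first, extra fields appended at the end) can be sketched in plain C. The type and field names below are hypothetical stand-ins rather than the real Hurd definitions; the point is only that C guarantees a pointer to a struct converts to a pointer to its first member, so code that only knows the base layout keeps working.

```c
#include <stddef.h>

/* Hypothetical stand-in for a port name. */
typedef unsigned int port_name_t;

/* Base layout that existing code expects: port first, then the emul data. */
struct base_device {
    port_name_t port;   /* must stay the first member */
    void *emul_ops;     /* placeholder for the device emulation struct */
};

/* Extended device: the base layout comes first, new fields go at the end. */
struct my_block_device {
    struct base_device dev;  /* offset 0, so both views share one address */
    char name[16];
    int open_count;
};

/* View the extended struct through the base layout. */
static struct base_device *as_base(struct my_block_device *d)
{
    return &d->dev;
}

/* Recover the extended struct; valid only if b really is its first member. */
static struct my_block_device *from_base(struct base_device *b)
{
    return (struct my_block_device *)b;
}
```

Putting anything in front of `dev` would shift the port away from offset 0 and break code that assumes the base layout.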
<damo22>but when i create a device port do i just do it once per instance of the device?
<damo22>or every time it opens
<youpi>see the existing drivers
<damo22>ok
<youpi>xen/block.c's device_open returns the port again
<damo22>libmachdev seems broken
<damo22>is it even used?
<youpi>€ ldd /hurd/netdde | grep mach
<youpi> libmachdev.so.0.3 => /lib/i386-gnu/libmachdev.so.0.3 (0x01007000)
<damo22>oh
<youpi>there is just one opener in that case (pfinet), so I wouldn't be surprised that it doesn't correctly handle multiple device_open
<youpi>see libmachdev's device_open for network devices
<youpi>it does search_nd()
<youpi>to return the same if it's not the same open
<damo22>thanks
<youpi>and block's device_open has a TODO comment about returning the same
<damo22>i saw that
<damo22>i'll take a look at the network one
<damo22>ah nice
<damo22>if the device has the same name, i can consider it the same device
<damo22>and only open it once
<damo22>i'll need to investigate the strange ports another time, but i have it opening only once per block dev
<damo22>it speeds it up enormously on further mounts
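The "same name means same device, open it only once" idea can be sketched as a small lookup table. This is an illustration under assumptions, not the libmachdev code: `open_once`, `search_by_name`, the integer `handle` standing in for a device port, and the fixed table size are all hypothetical.

```c
#include <string.h>

#define MAX_DEVS 8

struct open_dev {
    char name[16];
    int handle;   /* hypothetical stand-in for the device port */
    int in_use;
};

static struct open_dev table[MAX_DEVS];
static int next_handle = 100;
static int real_opens;  /* counts how often the expensive open actually ran */

/* Return the table entry for an already opened device, or NULL. */
static struct open_dev *search_by_name(const char *name)
{
    for (int i = 0; i < MAX_DEVS; i++)
        if (table[i].in_use && strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Open a device by name, reusing the existing handle on repeated opens.
   Returns -1 if the name is too long or the table is full. */
static int open_once(const char *name)
{
    struct open_dev *d;

    if (strlen(name) >= sizeof table[0].name)
        return -1;

    d = search_by_name(name);
    if (d)
        return d->handle;  /* already open: hand back the same handle */

    for (int i = 0; i < MAX_DEVS; i++) {
        if (!table[i].in_use) {
            strcpy(table[i].name, name);  /* length checked above */
            table[i].handle = next_handle++;
            table[i].in_use = 1;
            real_opens++;  /* the real device probe would happen here */
            return table[i].handle;
        }
    }
    return -1;
}
```

Opening "wd0" twice returns the same handle and runs the expensive path once, which matches the speedup on further mounts described above. A real driver would also need locking and a matching close path.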
<youpi>"another time" the problem might not appear
<youpi>which is a problem because that won't necessarily mean the problem went away
<youpi>and instead you'd have strange behaviors without knowing where that comes from
***Glider_IRC__ is now known as Glider_IRC
***Emulatorman___ is now known as Emulatorman
<rs410ga>is this the appropriate place to ask for some GRUB support?