<damo22>youpi: how do i do thread_wakeup and assert_wait() in userspace?
<damo22>i tried using a pair of semaphores for synchronising open/close
<damo22>but i think they are running in the same thread
<damo22>its definitely a synchronisation issue, i get a log like this:
<damo22>it should not be calling rump_sys_open the second time because i set a bool flag to stop it, but it looks like it hasnt been set by the time it calls device_open the second time
<damo22>i cant see if there are multiple threads with device_open
<damo22>dammit 100GB is too big for the ide driver
<damo22>arghh the offset LBA 316M is too big to start the partition for ide
<damo22>once i rearrange my partitions so i can boot off ide and have an AHCI controller as well i'll be set
<damo22>youpi: what is the maximum offset i can have for an ide disk?
<damo22>106811391 is my last sector for / i think its too large
<youpi>damo22: lba28 support should get you 2^28 * 512 bytes, so 128GB
<youpi>I'd be surprised if the driver doesn't support lba28
<damo22>i think i corrupted my symlinks when i chowned them
<youpi>damo22: thread_wakeup and assert_wait can be implemented with condition variables
<youpi>chown shouldn't be corrupting a symlink
<youpi>and not even a translator entry
<damo22>when i rsynced my home dir and chowned -R everything back to the demo user on linux it broke all symlinks when booted into hurd
<youpi>I don't think rsync knows about translators (i.e. xattr nowadays), but a symlink wouldn't use a translator
<damo22>well i deleted all symlinks and restored them somehow and everything is working on a 50GB /
<youpi>damo22: for thread_wakeup and assert_wait, you'd need a pthread_mutex_t, a pthread_cond_t, and an int variable. assert_wait would take the mutex, set the variable to 0, release the mutex; thread_wakeup would lock the mutex, set the variable to 1, signal the condition, unlock the mutex; thread_block would lock the mutex, and while the variable is 0, call cond_wait, then unlock the mutex
<damo22>thanks, not sure if i need this though, i can't tell if device_open is running again in a new thread or not
<youpi>you can add pthread_self() to printfs to know which thread does what
<damo22>rump is hanging in qemu on wd0 at atabus0 drive 0
<damo22>i'll have to add the timeout again
<damo22>k_handle...irq handler 10: release dead delivery 1 unacked irqs
<damo22>do you have a timeout somewhere?
<damo22>disks might take a while to handle the irq
<youpi>usually drivers have a timeout when they expect an irq to signal the end of the operation, yes
<youpi>but it's usually quite long, like a second
<damo22>but youre doing thread_set_timeout(hz)
<damo22>so it will think the irq is stuck and clear it?
<youpi>damo22: no, in intr_thread I only look at userland processes which died
<youpi>since we are not yet using a send-once port that could immediately notify of such a death
<damo22>if ((!e->dest || e->dest->ip_references == 1) && e->unacked_interrupts)
<damo22>how does it know its an aborted process
<youpi>ip_references == 1 is the reference gnumach acquired when it created the port
<youpi>i.e. the process reference doesn't exist any more
<damo22>it couldnt be a just opened one?
<youpi>no, because the process reference is already there when we create the port
<youpi>the gnumach ref is a second one, not the first one
<damo22>is there a chance i am not setting e->dest
<youpi>no it's allocated when the port is created
<damo22>so then something died in my rump driver?
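To make youpi's recipe above concrete, here is a minimal C sketch of it; the function names mirror the Mach primitives being emulated, and the single global event variable is an assumption (a real driver would likely keep one mutex/cond/flag per wait channel):

    /* Minimal sketch of youpi's condition-variable recipe.
       One global event is assumed for simplicity.  */
    #include <pthread.h>

    static pthread_mutex_t event_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t event_cond = PTHREAD_COND_INITIALIZER;
    static int event_fired;

    /* assert_wait: take the mutex, set the variable to 0, release it.  */
    void
    assert_wait (void)
    {
      pthread_mutex_lock (&event_lock);
      event_fired = 0;
      pthread_mutex_unlock (&event_lock);
    }

    /* thread_wakeup: lock, set the variable to 1, signal, unlock.  */
    void
    thread_wakeup (void)
    {
      pthread_mutex_lock (&event_lock);
      event_fired = 1;
      pthread_cond_signal (&event_cond);
      pthread_mutex_unlock (&event_lock);
    }

    /* thread_block: lock, wait while the variable is 0, unlock.  */
    void
    thread_block (void)
    {
      pthread_mutex_lock (&event_lock);
      while (event_fired == 0)
        pthread_cond_wait (&event_cond, &event_lock);
      pthread_mutex_unlock (&event_lock);
    }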
<youpi>oh, I completely misunderstood what you said above
<youpi>when I said drivers usually have a timeout, I meant inside the rump disk driver
<youpi>there is no timeout inside the user-intr support for irqs
<youpi>the message above definitely means that userland somehow dropped the port
<damo22>k_handle...irq handler 10: release dead delivery 1 unacked irqs k_done
<youpi>either by dying, or by closing the port
<damo22>in the handler, it closed the port somehow
<youpi>only one thread dying in a process wouldn't close the port
<youpi>since it's the process which owns the port
<damo22>i dont know how its possible i got this message in the handler
<damo22> /* For now netdde calls device_intr_enable once after registration. Assume
<damo22> * it does so for now. When we move to IRQ acknowledgment convention we will
<damo22>the first time the interrupt handler is called it is considered unacked?
<youpi>userland will have to ack each interrupt
<damo22>do i need to change my pci-userspace code for the new gnumach?
<youpi>currently netdde also explicitly enables the interrupt after registering it
<youpi>I was thinking about dropping that, but perhaps we want to keep it
<youpi>no, the current gnumach only keeps its existing behavior
<youpi>I just added the comment to make explicit what netdde is currently doing
<damo22>im calling device_intr_enable() twice
<youpi>no, just once before running the server loop, and after each interrupt, which is what netdde does
<damo22>i dont understand where this could be failing
<youpi>is it perhaps just talking about a previous instance of your translator?
<youpi>Mmm, no, you wouldn't have had the irq fired if it wasn't cleaned up yet
<damo22>im running this with no translator, just opening the block device
<youpi>perhaps other code is deallocating a random port
<youpi>and at some point by bad luck it's the delivery port
<youpi>I remember you posting warnings about deallocating bogus ports
<youpi>perhaps you could print the port number of the delivery port and print along every port deallocation in the code
<youpi>by translator I just meant your rump process
<damo22>wd0: drive supports 16-sector PIO transfers, LBA48 addressing
<damo22>wd0: 100 MB, 203 cyl, 16 head, 63 sec, 512 bytes/sect x 204800 sectors
<damo22>wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
<damo22>stupid translators were still running
<damo22>i need to force it to go away twice
<damo22>ok now im getting a different problem
<damo22>irq handler 10: release a dead delivery port
<damo22>k_handle... irq handler 10: release dead delivery 1 unacked irqs
<damo22>its reproducible in that order every time
<youpi>damo22: just to make sure what is really happening, I have pushed more debugging to the master-user_level_drivers gnumach branch
<youpi>in case the kernel messages you are seeing are not actually about the process you are running, but a previous one
<damo22>settrans -fga /dev/rump /bin/sh -c 'exec >> /root/rump.log 2>&1 && /usr/bin/env RUMP_VERBOSE=1 RUMP_NCPU=1 /hurd/rumpdisk'
<damo22>wd0(ahcisata0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
<damo22>/hurd/rumpdisk: must be started as a translator
<damo22>but somehow when i run it with exec it gets killed
<damo22>oh crap !!!! my libstore.so is out of date
<damo22>ok why is block.opened_count reset to 0 the next time it opens?
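For reference, the enable/ack ordering youpi describes (enable once before the server loop, then re-enable after each handled interrupt, as netdde does) looks roughly like the sketch below. The three helpers are hypothetical wrappers around the gnumach RPC, a mach_msg receive on the delivery port, and the driver's own work; only the ordering comes from the discussion:

    /* Sketch of the interrupt-loop ordering described above: enable the
       irq once before entering the message loop, then re-enable after
       each delivered interrupt.  intr_enable(), wait_for_intr_message()
       and handle_interrupt() are hypothetical wrappers; only the
       ordering is taken from the discussion.  */
    extern void intr_enable (void);
    extern void wait_for_intr_message (void);
    extern void handle_interrupt (void);

    void
    intr_server_loop (void)
    {
      intr_enable ();               /* once, before the server loop */
      for (;;)
        {
          wait_for_intr_message (); /* block until the irq is delivered */
          handle_interrupt ();      /* driver-specific processing */
          intr_enable ();           /* re-enable/ack after each interrupt */
        }
    }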
<damo22>the static struct is not holding state
<damo22>it resets every time device_open gets called
<damo22>if it spawns a new thread when it opens the device how do you synchronise the state?
<youpi>damo22: by "watch", I mean the "watch" command
<youpi>which catches changes in the variable
<youpi>so you get to know *what* is changing it
<youpi>about synchronization, the question is too vague for me to provide any useful answer beyond "use mutex, cond, etc."
<damo22>in block.c i have static struct block_data block;
<damo22>why would the state be resetting to zero for that global when device_open is called again
<damo22>its dying and the translator is resetting it?
<damo22>gdb isnt reattaching so its not a different pid
<damo22> at ../../libports/manage-one-thread.c:122
<damo22>122 while (err != MACH_RCV_TIMED_OUT);
<damo22>$8 = {port = {class = 0x0, refcounts = {references = {hard = 0, weak = 0}, value = 0}, mscount = 0, cancel_threshold = 0, flags = 0, port_right = 0, current_rpcs = 0x0, bucket = 0x0, hentry = 0x0, ports_htable_entry = 0x0}, device = {emul_ops = 0x1218a60 <rump_block_emulation_ops>, emul_data = 0x121f9e0 <rumpblock>}, mode = 3, rump_fd = 3, media_size = 104857600, block_size = 512, opened_count = 6, opening = true, closing = fal
<damo22>seems to be losing the port info
<damo22>do i create a new port for it every open?
<damo22>how do i tell mach that the device succeeded to open?
<damo22> *devicePoly = MACH_MSG_TYPE_MOVE_SEND;
<damo22>root@zamhurd:~# mount /dev/wd0s1 /mnt -t ext2fs
<gnu_srs>damo22: Does that mean your rump disk driver works?
<damo22>pretty much, it needs a lot of cleaning up
<damo22>this is running in a VM at the moment
<damo22>when i figure out what the bogus ports are from i will try on real hw
<damo22>the problem is, i made it work by reverting block.c to an older version but now i have a lot of cleanup to do
<damo22>i still dont know how my latest block.c is broken
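One conventional way to keep such a static struct block_data consistent when libports may run device_open from several threads, along the lines of youpi's "use mutex, cond, etc.": the field names below are taken from the gdb dump above, while the locking discipline and the do_rump_open() helper are assumptions.

    /* Sketch: guard the shared device state with a mutex so concurrent
       device_open calls from libports threads cannot race on it.
       do_rump_open() is a hypothetical helper standing in for the
       rump_sys_open path.  */
    #include <pthread.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct block_data
    {
      int rump_fd;
      uint64_t media_size;
      uint32_t block_size;
      int opened_count;
      bool opening;
      bool closing;
    };

    extern int do_rump_open (const char *name, struct block_data *bd);

    static struct block_data block;
    static pthread_mutex_t block_lock = PTHREAD_MUTEX_INITIALIZER;

    int
    block_open (const char *name)
    {
      int err = 0;

      pthread_mutex_lock (&block_lock);
      if (block.opened_count == 0)
        err = do_rump_open (name, &block);  /* first open only */
      if (!err)
        block.opened_count++;               /* later opens just count */
      pthread_mutex_unlock (&block_lock);
      return err;
    }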