IRC channel logs

2025-06-16.log

<damo22>the problem with rumpnet is the immediate reading of a single packet, i need to turn that off and let it queue up multiple packets
<damo22>im getting 1.7MB/s now
<damo22>and errors with timeouts
<damo22>woot 4.4MB/s
<damo22>but slow with smp
<damo22>EMACH_SEND_TIMED_OUT
<damo22>getting lots of these errors on rcvd packets
<damo22>youpi: where is the mach_msg_server for network rcv packets?
<damo22>heh, i got it to work with 1.5MB/s on smp 8
<azert>damo22: is rumpnet multithreaded?
<damo22>yes there is a rcv thread
<damo22>and a main thread
<damo22>plus whatever rump does
<azert>Would switching to multiple rcv threads solve the timeout problems?
<damo22>not sure
<damo22>it seems to hang if i make the mach_msg timeout nonzero
<azert>Indeed it depends on where the bottleneck is
<damo22>like if i send a lot of packets
<damo22>it hard locks up the smp system, even ddb does not work
<damo22>kdb*
<damo22>send packets to the system*
<damo22>actually i was scping from host and pulling packets back to the host
<azert>Interrupts are handled on the rcv thread, right?
<damo22>not really
<damo22>im not sure actually
<damo22>all the receive thread does is deliver the messages coming from the nic
<damo22>via mach_msg
<damo22>i just got 267MB copied at 1.8MB/s and then it locked up
<azert>Does it lock up because the message queue is full?
<damo22>i didnt get a backtrace
<damo22>just frozen
<azert>points to the gnumach being stuck
<damo22>yes
<azert>and gnumach sends a message for each interrupt, right ?
<damo22>i think so
<azert>I hope you can solve this without having to mess with scheduling
<damo22>i set the timeout to 1 second for each mach_msg() it delivers per packet
<damo22>im not getting timeouts anymore but i get the hangs
<azert>I would keep the timeout to zero
<damo22>why does the timeout of zero actually cause timeouts
<azert>The kernel shouldn’t block for any reason
<azert>Then I’d try to solve the timeouts
<azert>those are performance related
<damo22>i get EMACH_SEND_TIMED_OUT
<damo22>when the timeout is set to 0
<azert>maybe moving to a multithreaded rcv helps
<azert>It’s better to get a timeout than a hang
<damo22>yes it is, but i dont understand why
<damo22>shouldnt a non-zero timeout cause a timeout, not an infinite timeout
<azert> https://web.mit.edu/darwin/src/modules/xnu/osfmk/man/mach_msg.html says: If the destination port's queue is full, several things can happen. If the message is sent to a send-once right (msgh_remote_port carries a send-once right), then the kernel ignores the queue limit and delivers the message. Otherwise the caller blocks until there is room in the
<azert>queue, unless the MACH_SEND_TIMEOUT option is used. If a port has several blocked senders, then any of them may queue the next message when space in the queue becomes available, with the proviso that a blocked sender will not be indefinitely starved. These options modify MACH_SEND_MSG. If MACH_SEND_MSG is not also specified, they are ignored.
<azert>If the kernel blocks, you get the hang
<damo22>i dont understand MACH_SEND_TIMEOUT option
<damo22>what options should i use to ensure the kernel never blocks
<azert>I understand from the docs that the timeout should be zero
<damo22>MACH_SEND_MSG|MACH_SEND_TIMEOUT
<damo22>with a timeout of 0
<azert>and I think you need to be as quick as possible to handle the interrupts on the rumpnet side
<azert>yep
<youpi>damo22: why are you setting MACH_SEND_TIMEOUT?
<youpi>along with timeout=0, that requests immediate send
<damo22>i dont know, i copied the code from the other net driver
<azert>youpi: the kernel blocks
<youpi>then it's the reception part that needs fixing
<youpi>but don't be surprised to get EMACH_SEND_TIMED_OUT if you set MACH_SEND_TIMEOUT, it's meant for it
<damo22>yes, i am trying to understand the mach_msg() call
<youpi>which other net driver?
<damo22>i think the problem is there
<damo22>its been a long time, i dont recall
<damo22>i think i got it from netdde but im not sure
<youpi>netdde doesn't set it
<youpi>eth-multiplexer happens to set it, I don't know why
<youpi>that being said, losing packets is fine, that tells tcp to back-off
<damo22>so do i just use MACH_SEND_MSG with a timeout of 0?
<youpi>and ignore the errors which are merely due to too fast bandwidth for the software stack, yes
<damo22>ok cool
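
As a concrete illustration of that conclusion, here is a minimal sketch of a per-packet delivery that never blocks the receive thread: MACH_SEND_MSG|MACH_SEND_TIMEOUT with a zero timeout, dropping the packet when the destination queue is full (the EMACH_SEND_TIMED_OUT seen above is the errno-space name for MACH_SEND_TIMED_OUT). The message layout, destination port and msgh_id are made up for the sketch, and a real GNU Mach message would also carry a mach_msg_type_t descriptor for the inline data; only the mach_msg() options follow the discussion above.

    /* Sketch only: simplified message layout, not the real packet message format.  */
    #include <mach.h>
    #include <string.h>

    struct packet_msg
    {
      mach_msg_header_t head;
      char data[1500];                /* assumed MTU-sized inline buffer */
    };

    static kern_return_t
    deliver_packet (mach_port_t dest, const void *buf, mach_msg_size_t len)
    {
      struct packet_msg msg;

      if (len > sizeof msg.data)
        return KERN_INVALID_ARGUMENT;

      msg.head.msgh_bits = MACH_MSGH_BITS (MACH_MSG_TYPE_COPY_SEND, 0);
      msg.head.msgh_size = sizeof msg;
      msg.head.msgh_remote_port = dest;
      msg.head.msgh_local_port = MACH_PORT_NULL;
      msg.head.msgh_id = 2999;        /* arbitrary id, for the sketch only */
      memcpy (msg.data, buf, len);

      /* MACH_SEND_TIMEOUT with a timeout of 0: if the destination queue is
         full, return immediately instead of blocking the receive thread.  */
      kern_return_t kr = mach_msg (&msg.head,
                                   MACH_SEND_MSG | MACH_SEND_TIMEOUT,
                                   sizeof msg, 0, MACH_PORT_NULL,
                                   0 /* timeout, in milliseconds */,
                                   MACH_PORT_NULL);

      if (kr == MACH_SEND_TIMED_OUT)
        return KERN_SUCCESS;          /* queue full: drop the packet, TCP backs off */
      return kr;
    }
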
<damo22>rumpnet is taking 125% cpu
<damo22>on a smp 6 system
<damo22>while copying at 1.8MB/s
<youpi>may I remind you that smp is *difficult*
<youpi>don't bother trying to optimize for smp first
<youpi>to get actual parallelism on smp, you need a multi-channel network card, distributed irqs, distributed message queuing and whatnot
<youpi>going smp or multithreaded is never a simple solution to performance
<damo22>ok
<damo22>dang it froze again on smp
<damo22>around 293MB
<damo22>on UP it copies at 4.8MB/s
<youpi>is the cpu at 100% ?
<damo22>  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
<damo22>  552 root      17  -3  394544 264224      0 R  33.4 12.8   0:53.31 rumpnet.s+
<damo22>    7 root       6 -14  343076 211352      0 R  27.2 10.2   0:49.84 rumpdisk
<damo22>  803 demo      20   0  171776   4624      0 R  17.9  0.2   0:32.12 sshd-sess+
<damo22>  549 root       5 -15  187348   2468      0 S   6.3  0.1   0:09.90 pfinet
<damo22>   14 root      -1 -21  170352    948      0 S   6.0  0.0   0:07.46 pflocal
<damo22>
<damo22>yes
<damo22>its using 109% on the host
<azert>youpi: maybe deliver_intr in gnumach also needs a timeout of zero?
<youpi>no
<youpi>you do *not* want to lose interrupt delivery message
<youpi>otherwise the interrupt count will get bogus
<azert>what happens when the queue is full?
<youpi>it'll block
<youpi>that's fine
<azert>how is that fine? gnumach will block, right?
<damo22>2GB copied at 5.0MB/s with UP so far
<youpi>the interrupt is masked until userland unblocks it
<damo22>no hangs with UP
<azert>maybe gnumach could write down some state, bail out and retry later instead
<azert>ah ok understood
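
For the record, a toy model of the cycle youpi describes, in illustrative C rather than the actual gnumach code: the irq line stays masked from the moment the interrupt message is handed to userland until the driver acknowledges it, so a delivery that blocks on a full queue cannot lose interrupts or corrupt the interrupt count.

    /* Toy model of user-level interrupt delivery as described above;
       illustrative only, not the gnumach implementation.  */
    #include <stdbool.h>
    #include <stdio.h>

    struct irq_line { int irq; bool masked; };

    static void mask_irq (struct irq_line *l)   { l->masked = true;  printf ("irq %d masked\n", l->irq); }
    static void unmask_irq (struct irq_line *l) { l->masked = false; printf ("irq %d unmasked\n", l->irq); }

    /* Stand-in for queueing the interrupt notification to the driver; in
       gnumach this is a message send that may block when the queue is full.  */
    static void send_intr_message (struct irq_line *l) { printf ("intr message for irq %d queued\n", l->irq); }

    /* Hardware interrupt on a line a userspace driver registered for.  */
    static void hardware_interrupt (struct irq_line *l)
    {
      mask_irq (l);             /* no further interrupts arrive on this line */
      send_intr_message (l);    /* blocking here is harmless: the line stays
                                   masked, so the count cannot go bogus */
    }

    /* The driver acknowledges once it has serviced the device; only then
       is the line unmasked again.  */
    static void interrupt_acknowledged (struct irq_line *l) { unmask_irq (l); }

    int main (void)
    {
      struct irq_line nic = { 11, false };
      hardware_interrupt (&nic);
      interrupt_acknowledged (&nic);
      return 0;
    }
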
<azert>damo22: can you profile rumpnet?
<damo22>i dont know if we compile rump with profiling enabled
<damo22>it should be possible
<azert>would be cool
<youpi>it was working at some point for ext2fs etc.
<youpi>that's indeed useful to track down overhead
<damo22>why would wire_task_self() stop vm_pages_phys from returning 0
<youpi>because it forces *all* pages to be always allocated
<damo22>is that why it uses so much ram?
<youpi>yes
<youpi>just like rumpdisk
<youpi>rumpnet can probably just wire down the buffer for which it wants to dma
<damo22>right
<damo22>why cant pci-userspace do that in dmalloc
<damo22>whenever it asks for dma it could wire that down
<youpi>because nobody implemented it?
<youpi>in rumpdisk we have to wire down the whole process anyway
<damo22>ah because it provides the disk for swap memory
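
A sketch of what wiring just one DMA buffer (rather than the whole task, as wire_task_self() does) could look like. The vm_wire RPC, its argument order, and the need for the privileged host port are assumptions based on the classic Mach interface; check gnumach's actual declarations before relying on this.

    /* Sketch: wire only the pages backing one DMA buffer so they stay
       resident with stable physical addresses.  vm_wire's signature and
       privilege requirements are assumed here, not verified.  */
    #include <mach.h>
    #include <mach/vm_param.h>
    #include <hurd.h>

    static error_t
    wire_dma_buffer (void *buf, vm_size_t size)
    {
      mach_port_t host_priv;
      error_t err = get_privileged_ports (&host_priv, NULL);
      if (err)
        return err;

      vm_address_t start = trunc_page ((vm_address_t) buf);
      vm_address_t end = round_page ((vm_address_t) buf + size);

      err = vm_wire (host_priv, mach_task_self (), start, end - start,
                     VM_PROT_READ | VM_PROT_WRITE);

      mach_port_deallocate (mach_task_self (), host_priv);
      return err;
    }
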
<damo22>2.5GB file matches sha256sums on each end so the copy worked
<damo22>is there any reason we cant wire down all calls to vm_allocate_contiguous in gnumach?
<damo22>is there a case where you want a chunk of contiguous memory but dont mind if its swapped out?
<youpi>they are already made non-pageable
<damo22>are you sure? i dont see that
<youpi>vm_map_pageable
<damo22> kr = vm_map_pageable(map, vaddr, vaddr + size, VM_PROT_READ | VM_PROT_WRITE, TRUE, TRUE);
<damo22>so only VM_PROT_NONE memory is pageable?
<damo22>why do we unwire the requested memory just below that
<youpi>see d6030cdfc49e9aa10819a5438a5ae313a4538f42
<damo22>ok so why are you not unwiring the memory pages between vm_page_atop(size) and npages
<damo22>since they were extra pages we didnt need
<youpi>because we didn't wire them?
<damo22>ah ok
<damo22>so why are the desired pages wired in the first place if you just unwire them after?
<youpi>I don't know, I didn't write that code
<damo22>so do we have a contradiction in the memory management? non-pageable memory must be unwired, but contiguous requests for physical memory need to be non-pageable and wired down?
<youpi>"non-pageable memory must be unwired" ??
<youpi>d6030cdfc49e9aa10819a5438a5ae313a4538f42 is not saying that
<damo22>the commit adds a call to vm_page_unwire() for the pages just allocated, that memory is non-pageable (wired)
<damo22>so im confused why we need to unwire the pages
<damo22>ah we are "releasing one wiring" of the pages
<damo22>vm is hard for me
<youpi>vm is hard
<youpi>for everybody on any OS :)
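
The "one wiring" phrasing refers to the per-page wire count in the Mach VM layer: vm_page_wire() bumps a counter on the page, vm_page_unwire() drops it, and only when the count reaches zero does the page go back on the pageable queues. A toy model of that bookkeeping (illustrative only, not the gnumach source; the two-wirings scenario in main() is just one reading of the discussion above):

    #include <assert.h>
    #include <stdbool.h>

    struct vm_page
    {
      unsigned wire_count;        /* number of outstanding wirings */
      bool pageable;              /* on the active/inactive queues */
    };

    static void vm_page_wire_model (struct vm_page *p)
    {
      if (p->wire_count++ == 0)
        p->pageable = false;      /* first wiring: leave the pageable queues */
    }

    static void vm_page_unwire_model (struct vm_page *p)
    {
      assert (p->wire_count > 0);
      if (--p->wire_count == 0)
        p->pageable = true;       /* last wiring released: pageable again */
    }

    int main (void)
    {
      struct vm_page p = { 0, true };
      vm_page_wire_model (&p);    /* say, wired when the pages are allocated */
      vm_page_wire_model (&p);    /* and wired again through vm_map_pageable */
      vm_page_unwire_model (&p);  /* releasing one wiring keeps it resident... */
      assert (!p.pageable);
      vm_page_unwire_model (&p);  /* ...only the last unwire makes it pageable */
      assert (p.pageable);
      return 0;
    }
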
<azert>If I understand correctly, I’d revert that commit and instead make sure that wired memory doesn't get passed around by device_read
<youpi>why?
<youpi>that'd bring yet more data copy
<youpi>while it should be feasible to get wiring right
<azert>Is it feasible? Can you make wiring and copy-on-write work together?
<youpi>why not?
<azert>In theory, it can be done
<azert>But to maximize performance, using shared memory would still be a win
<damo22>if a task wires a memory region, gnumach keeps a ref count of how many tasks wired the page?
<damo22>does gnumach call vm_page_wire() against its own map? it shouldnt right?
<damo22>unless the pages are for gnumach to use