IRC channel logs

2024-12-15.log

<damo22>youpi: it seems i can send a packet but i dont get any back
<damo22>is there any way to confirm the packet actually hit the wire?
<youpi>depends how you start your vm
<damo22>-net nic
<youpi>so you're using the libslirp in-qemu tcp/ip stack, so you can't see the frames
<youpi>you can use -net tap instead of -net user
<youpi>that'll create a tap0 on the host that you can tcpdump on
<damo22>lovely
<youpi>it's more complex to configure, but for looking at outgoing frames that'll be fine
<Arsen>fwiw you can tcpdump -net user too: https://wiki.qemu.org/Documentation/Networking#Network_Monitoring
<Arsen>tap is vastly superior ofc but this is OK too - I used it for netstack debugging
<damo22>01:07:18.179404 ARP, Request who-has 10.0.2.3 tell 10.0.2.15, length 28
<damo22>01:07:18.179439 ARP, Reply 10.0.2.3 is-at 52:55:0a:00:02:03 (oui Unknown), length 50
<damo22>hmm looks like im getting a reply from the wire but it doesnt call device_read
<damo22>sending packet: [ OK ]
<damo22>qemu-system-i386: Slirp: Failed to send packet, ret: -1
<damo22>i think slirp failed to send the reply packet
<youpi>slirp just calls a qemu function for that
<damo22>it seems the wire has the arp request sent by device_write() and the arp request from the host, but theres no device_read
<damo22>arp reply*
<youpi>how did you plug device_read?
<damo22>using bpf
<damo22>let me push my latest code
<youpi>actually pfinet doesn't seem to be using device_read, but device_set_filter
<youpi>and expecting NET_RCV_MSG_ID messages
<damo22>ohh
<youpi>see ethernet_demuxer
<damo22>can we change it?
<youpi>well, device_read by itself doesn't really make sense
<youpi>since we're not reading a given piece of a disk
<youpi>so it does make sense that we're getting an unexpected message, not an RPC
<damo22>ok
<youpi>and see libmachdevdde/net.c's netif_rx_handle
<damo22>l4dde26_register_rx_callback(netif_rx_handle);
<damo22>how do i make it call the receive when there is a packet
<youpi>that's to be seen with rump
<youpi>I guess there's a bpf function to make it call a function when a packet is seen by the filter
<youpi>or perhaps it's a matter of reading from /dev/bpf or such
<damo22>you just read /dev/bpf
<damo22>so do i need a new thread that keeps reading it?
<youpi>then so be it, make a loop that keeps reading it and sending a rcvmsg message
<damo22>ok
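[Editor's sketch] The read loop suggested above could be structured roughly as follows. A BSD-style bpf read() may return several captured frames in one buffer, each preceded by a small header, so the loop has to walk the records one by one. This is only a sketch under assumptions: `struct rec_hdr`, `REC_ALIGN`, `walk_records` and `sum_lengths` are hypothetical stand-ins for `struct bpf_hdr` and `BPF_WORDALIGN` from `<net/bpf.h>`, and the step that would send the NET_RCV_MSG_ID message to pfinet is left to the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the fields of struct bpf_hdr that matter here; the real
   definition lives in <net/bpf.h> and also carries a timestamp. */
struct rec_hdr {
  uint32_t caplen;   /* bytes of the frame actually captured */
  uint32_t datalen;  /* original length of the frame on the wire */
  uint16_t hdrlen;   /* size of this header, including padding */
};

/* Round up to the record alignment, in the spirit of BPF_WORDALIGN. */
#define REC_ALIGN(x) (((x) + sizeof (long) - 1) & ~(sizeof (long) - 1))

typedef void (*frame_cb) (const uint8_t *frame, size_t len, void *cookie);

/* Walk one buffer as returned by a single read() and hand each frame to
   DELIVER.  Returns the number of frames delivered. */
static size_t
walk_records (const uint8_t *buf, size_t n, frame_cb deliver, void *cookie)
{
  size_t off = 0, count = 0;

  assert (deliver != NULL);
  while (off + sizeof (struct rec_hdr) <= n)
    {
      struct rec_hdr h;
      memcpy (&h, buf + off, sizeof h);   /* avoid unaligned access */
      if (h.hdrlen < sizeof h || off + h.hdrlen + h.caplen > n)
        break;                            /* truncated or corrupt record */
      deliver (buf + off + h.hdrlen, h.caplen, cookie);
      count++;
      off += REC_ALIGN ((size_t) h.hdrlen + h.caplen);
    }
  return count;
}

/* Example callback: a real one would build and send the receive message;
   this one only adds up the frame lengths into *cookie. */
static void
sum_lengths (const uint8_t *frame, size_t len, void *cookie)
{
  (void) frame;
  *(size_t *) cookie += len;
}
```

A dedicated thread would call read() on the bpf descriptor in a loop and pass each returned buffer to `walk_records`, with a callback that forwards the frame to whoever registered for it.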
<damo22>youpi: if a packet comes in on the bpf fd, i already know its for the right device but i dont have a way to look it up
<damo22>to find the port
<damo22>ah i might need to query bpf for the interface
<Pellescours>isn't it the goal of pfinet to do that ?
<Pellescours>you are talking at ethernet level, so you don't worry about ip ports
<damo22>no, i mean to identify which NIC
<damo22>since the network stack can share interfaces
<Pellescours>ah
<damo22>the port meaning the mach msg port
<damo22>i think i know what to do now
<Pellescours>yeah ok
<Pellescours>damo22: with the mac address ?
<Pellescours>keeping a correspondence table port <-> mac
<damo22>youpi: which mach_port_t do i give to deliver_msg?
<damo22>is it the one that *devp is set to?
<damo22>when i open the device?
<damo22>ah its the reply_port of the write
<damo22>how does that work for packets that dont come from a write??
<Pellescours>how does netdde deal with it ?
<damo22>it seems like netdde keeps track of every written packet and stores the reply_port of each one, then every received packet is matched to the list of tracked packets to find which reply port to send to?
<damo22>why cant pfinet just have a device_open port and expect to get packets on that port?
<damo22>im not sure how udp works then, since packets can just arrive unannounced
<Pellescours>or even for tcp where you are a server and waiting for clients to connect
<Pellescours>ah but it's pfinet that connects to the device, so the connection between the 2 services is already there, and so you know the ports
<damo22>what ports?
<damo22>it doesnt seem to use the device port for receiving packets
<damo22>it uses the reply_port on the device_write
<damo22>i guess the device port might time out and it will need to reconnect? so instead they made it get a new one every time a write is made?
<damo22>im still confused
<Pellescours>i'm not sure I follow
<damo22>ah it gets the port from device_set_filter!
<Pellescours>when device write is called, you have a reply port as parameter, so you don't need to manage that, right ?
<damo22>im very close but something about receiving packets is broken
<damo22>what is mach_msg supposed to return?
<damo22>it returns MACH_MSG_SUCCESS
<damo22>so i dont know what to do now, my branch is pushed to rumpnet
<damo22>Pellescours: you can try it if you want
<damo22>receiving packets is currently broken
<damo22>DHCPDISCOVER on /dev/wm0 to 255.255.255.255 port 67 interval 14
<damo22>sending packet: [ OK ] 342
<damo22>rcvd a packet : [ OK ] 342
<damo22>but nothing
<damo22>i think im getting the written packet back from the wire
<damo22>lol
<damo22>i probably need a bpf filter that discards outgoing packets
<damo22>an ioctl can do it
<damo22>youpi: is this receiving the written packet?
<damo22>sending packet: [ OK ] 342
<damo22>ff ff ff ff ff ff 52 54 00 12 34 56 08 00 rcvd a packet : [ OK ] 342
<damo22>i think it is
<damo22>i dont think netbsd's bpf has an instruction for filtering direction on the packet at the filtering level
<damo22>it has an ioctl
<damo22>BIOCSDIRECTION
<damo22>but if i change the direction flag while its reading, it will read the written packet before it gets a chance to revert to incoming only
<damo22>ah, "and not src <me>"
<youpi>there is no reason for matching send/recv packets
<youpi>with a tcp download for instance, you can receive a lot of datagrams while sending only very few ack datagrams
<damo22>yeah i think i was misunderstanding pfinet
<damo22>the problem i have now, is that the broadcast packet comes back to me but the reply is missing, even though its on the wire
<damo22>tcpdump shows the reply exists:
<damo22>08:51:36.747329 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:12:34:56 (oui Unknown), length 300
<damo22>08:51:36.747367 IP 10.0.2.2.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 274
<youpi>what does <me> mean exactly?
<damo22>mac address
<damo22>DHCPDISCOVER on /dev/wm0 to 255.255.255.255 port 67 interval 8
<damo22>sending packet: [ OK ] 342
<damo22>ff ff ff ff ff ff 52 54 00 12 34 56 08 00 rcvd a packet : [ OK ] 342
<damo22>DHCPDISCOVER on /dev/wm0 to 255.255.255.255 port 67 interval 10
<damo22>.... repeat
<damo22>i seem to receive the packet i just sent
<damo22>but no other packets
<youpi>is that not supposed to be rather ether src <me> ?
<youpi>ether src host <me> even
<damo22>its not a command, bpf filter needs to be constructed manually
<youpi>ok but are you sure you are telling it you are matching an *ethernet* source address ?
<damo22> BPF_STMT(BPF_LD|BPF_W|BPF_ABS, 8), /* get 4 low bytes of src */
<damo22> BPF_STMT(BPF_LDX|BPF_IMM, 0x56341200), /* get 4 low bytes of me */
<damo22> BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_X, 0, 2, 1), /* src == me ? */
<damo22> BPF_STMT(BPF_RET+BPF_K, (uint32_t)-1), /* accept */
<damo22> BPF_STMT(BPF_RET+BPF_K, 0), /* drop */
<youpi>ethernet addresses are 6 bytes...
<damo22>yes
<damo22>im matching the low 4 bytes
<damo22>because its difficult to write a full version
<damo22>but someone told me it is expected that the broadcast packet will arrive to self
<youpi>headers are using big-endian, not little-endian
<youpi>yes, broadcasts arrive to self, that's expected
<youpi>the source still is yourself
<youpi>but fields are in big-endian in networking
<damo22>yes, my mac is 52:54:00:12:34:56
<youpi>is bpf little-endian or big-endian ?
<youpi>when it's loading the 4 bytes
<youpi>in the packet it really is 0x00 0x12 0x34 0x56
<damo22>i dont know, i need to check
<damo22>im guessing it casts the bytes to an int
<youpi>yes but I would be *really* surprised if it was using little-endian, when basically all network protocols are using big-endian
<damo22>return (cp[0] << 24) | (cp[1] << 16) | (cp[2] << 8) | np[0];
<damo22>err, i think its using be32dec()
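[Editor's sketch] The endianness point can be checked by hand on the frame header quoted earlier in the log. Assuming the word load assembles the bytes most significant first, as be32dec() does, the four bytes at offset 8 come out as 0x00123456 and not 0x56341200. `load_be32` and `sample_frame` are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble four consecutive bytes into a 32-bit value, most significant
   byte first (network byte order). */
static uint32_t
load_be32 (const uint8_t *frame, size_t len, size_t off)
{
  assert (off + 4 <= len);
  const uint8_t *p = frame + off;
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* The 14 header bytes printed in the log: broadcast destination,
   source 52:54:00:12:34:56, EtherType 0x0800. */
static const uint8_t sample_frame[14] = {
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0x52, 0x54, 0x00, 0x12, 0x34, 0x56,
  0x08, 0x00
};
```

With this frame, `load_be32 (sample_frame, sizeof sample_frame, 8)` yields 0x00123456, so a comparison constant written as 0x56341200 would never match under that assumption.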
<damo22>but even if i dont use a fancy filter, if i allow all packets i should receive them
<damo22>but im not getting them
<damo22>ive just pushed my latest code to git.zammit.org
<youpi>it doesn't seem to be there?
<damo22>* ec2d40aa (HEAD -> rumpnet, zammit/rumpnet) rumpnet: Use /dev/bpf
<youpi>which repo?
<damo22>hurd-sv.git
<damo22> https://git.zammit.org/hurd-sv.git/log/?h=rumpnet
<youpi>is (uint32_t)-1 really recognized as "give me everything you have"?
<damo22>i think so yes
<youpi>also, in pfinet/ethernet.c we have a heading {NETF_IN|NETF_BPF, 0, 0, 0}, /* Header. */
<youpi>"thinking" is not enough ;)
<damo22>that is not allowed in netbsd's bpf
<damo22>NETF_* does not exist
<damo22> /* If we passed all the tests, ask for the whole packet. */
<damo22> BPF_STMT (BPF_RET + BPF_K, (u_int)-1),
<damo22> /* Otherwise, drop it. */
<damo22> BPF_STMT (BPF_RET + BPF_K, 0),
<damo22>i borrowed that from bpf.c in some netbsd code
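[Editor's sketch] For reference, a filter that drops frames whose Ethernet source is 52:54:00:12:34:56 and accepts everything else could look like the program below, matching all six bytes with two immediate comparisons. It is a sketch, not the code in the rumpnet branch. The `bpf_insn` structure and the `BPF_*` macros are local stand-ins mirroring the classic definitions in `<net/bpf.h>` so that the example compiles on its own, and `run_filter` is a hypothetical mini interpreter covering only the four opcodes used. Jump offsets count from the instruction after the jump, and the macro argument order is (code, k, jt, jf). Read that way, the BPF_JUMP quoted earlier in the log has jt = 2, which would land past the last instruction of that five-instruction program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for <net/bpf.h>; on a real system include that header instead. */
struct bpf_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };
#define BPF_LD   0x00
#define BPF_JMP  0x05
#define BPF_RET  0x06
#define BPF_W    0x00
#define BPF_H    0x08
#define BPF_ABS  0x20
#define BPF_JEQ  0x10
#define BPF_K    0x00
#define BPF_STMT(code, k)         { (uint16_t) (code), 0, 0, (k) }
#define BPF_JUMP(code, k, jt, jf) { (uint16_t) (code), (jt), (jf), (k) }

static const struct bpf_insn not_from_me[] = {
  BPF_STMT (BPF_LD | BPF_W | BPF_ABS, 8),                  /* 0: A = last 4 bytes of src */
  BPF_JUMP (BPF_JMP | BPF_JEQ | BPF_K, 0x00123456, 0, 2),  /* 1: equal -> 2, else -> 4 */
  BPF_STMT (BPF_LD | BPF_H | BPF_ABS, 6),                  /* 2: A = first 2 bytes of src */
  BPF_JUMP (BPF_JMP | BPF_JEQ | BPF_K, 0x5254, 1, 0),      /* 3: equal -> 5, else -> 4 */
  BPF_STMT (BPF_RET | BPF_K, (uint32_t) -1),               /* 4: accept the whole frame */
  BPF_STMT (BPF_RET | BPF_K, 0),                           /* 5: drop our own frame */
};

/* Interpret PROG over PKT; returns the number of bytes to keep (0 = drop). */
static uint32_t
run_filter (const struct bpf_insn *prog, size_t n,
            const uint8_t *pkt, size_t len)
{
  uint32_t a = 0;
  size_t pc = 0;

  assert (prog != NULL && pkt != NULL);
  while (pc < n)
    {
      const struct bpf_insn *i = &prog[pc++];
      switch (i->code)
        {
        case BPF_LD | BPF_W | BPF_ABS:      /* big-endian 32-bit load */
          if ((size_t) i->k + 4 > len)
            return 0;
          a = ((uint32_t) pkt[i->k] << 24) | ((uint32_t) pkt[i->k + 1] << 16)
              | ((uint32_t) pkt[i->k + 2] << 8) | pkt[i->k + 3];
          break;
        case BPF_LD | BPF_H | BPF_ABS:      /* big-endian 16-bit load */
          if ((size_t) i->k + 2 > len)
            return 0;
          a = ((uint32_t) pkt[i->k] << 8) | pkt[i->k + 1];
          break;
        case BPF_JMP | BPF_JEQ | BPF_K:     /* offsets relative to next insn */
          pc += (a == i->k) ? i->jt : i->jf;
          break;
        case BPF_RET | BPF_K:
          return i->k;
        default:
          return 0;                         /* opcode not covered by the sketch */
        }
    }
  return 0;                                 /* fell off the end: drop */
}
```

Running the program over the broadcast header printed in the log returns 0 (drop), while a frame from 52:55:0a:00:02:03, the address seen in the ARP reply, returns (uint32_t)-1 (accept).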
<azert>maybe I’m going to say something stupid, but I think rumpnet could just expose the /dev/bpf interface with its ioctls, and it would be up to pfinet to use it properly
<damo22>no need
<youpi>but then pfinet would have to know whether it's talking to a gnumach device, a netdde device, or a rumpnet device
<damo22>i think the mach device interface in userspace is a simpler solution
<azert>pfinet could have a few backends
<azert>bpf is a clean and flexible interface api
<youpi>azert: but then the *user* would have to know, which would be even less practical
<azert>yes
<damo22>azert: i am not convinced that its very good at all
<damo22>have a look at the code i had to write to receive a packet
<damo22>and its still broken
<damo22>if netbsd was more feature complete, we would not need bpf at all, we could just use AF_LINK socket
<damo22>bpf multiplexes all the nics
<damo22>and you have to select which one youre talking to
<damo22>good night
<Pellescours>These 2 lines are useless (i.e a no-op) right ? https://github.com/etienne02/gnumach/blob/master/vm/memory_object.c#L878
<youpi>Pellescours: it's odd indeed, better check in the git history where that's coming from
<youpi>note that it's not strictly no-op, since may_cache might be > 1
<Pellescours>it comes from 97 commit named « initial source »
<Pellescours>1997
<Pellescours>(so the first commit on git)
<youpi>so it's probably intended to be so, e.g. to normalize the value into {0,1}
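[Editor's sketch] The normalization idea can be illustrated in isolation. The function below is a hypothetical shape for such a check, not the actual lines in vm/memory_object.c: any non-zero input collapses to exactly 1, so the statement only looks like a no-op when the value is already 0 or 1.

```c
#include <assert.h>

/* Hypothetical shape of the check: any non-zero value collapses to 1. */
static int
normalize_flag (int may_cache)
{
  if (may_cache)
    may_cache = 1;
  return may_cache;
}

/* Small self-check showing that the statement is not a no-op. */
static void
normalize_flag_demo (void)
{
  assert (normalize_flag (0) == 0);
  assert (normalize_flag (1) == 1);
  assert (normalize_flag (7) == 1);   /* 7 becomes 1, so the value changes */
}
```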
<azert>damo22: it is not that bsd is not feature complete, come on! It is deliberate
<azert>raw sockets are an inferior interface than bpf. Even Linux made the switch
<azert>think of an interface on promiscuous mode, bpf is Turing complete
<Pellescours>youpi: should a vm_object_allocate call have an associated vm_object_deallocate? I’m trying to understand the flow but in memory_object_lock_request https://github.com/etienne02/gnumach/blob/master/vm/memory_object.c#L632 we may allocate a new object (called new_object) but I don’t see an associated deallocation
<Pellescours>(I see that at some point we set the reference to null but without a deallocation before)
<Pellescours>damo22: maybe I setup something wrong but when trying to build your branch I get "multiple definitions of « rumpns_rss_getkey »;"
<Pellescours>(while building hurd on branch rumpnet, after having built rumpkernel on branch develop)
<youpi>Pellescours: that can be deallocated by userland through merely dropping the port
<Pellescours>It’s weird: when the qemu window grabs my cursor and I move it, it prints gibberish (while in the gnumach debugger)
<solid_black>hi
<solid_black>Pellescours: please feel free to ask me questions about VM objects / memory objects
<solid_black>perhaps I'll understand them myself while answering :D
<Pellescours>sure, I tried damo’s rumpnet branch (but I’m not able to compile it) and I’m searching for the origin of the lock when doing "heavy" copies with rumpdisk
<Pellescours>I see a DEBUG flag in the vm_pageout, I should try to enable it and see if anything appear
<solid_black>in other news, feliĉan Zamenhofan tagon :)
<Pellescours>it should not be what causes my hang (as I am not using smp, so locking is a no-op), BUT, I may have found a deadlock in the pageout thread
<Pellescours>pageout_scan calls vm_page_balance with lock vm_page_queue_free_lock held https://github.com/etienne02/gnumach/blob/master/vm/vm_pageout.c#L425
<Pellescours>this calls vm_page_balance_once() which calls vm_page_seg_balance which calls vm_page_seg_balance_page
<Pellescours>and here is the deadlock https://github.com/etienne02/gnumach/blob/master/vm/vm_page.c#L983 because we re-do a lock on this same lock
<youpi>where is the first acquisition of the lock?
<Pellescours>in vm_pageout
<youpi>?
<Pellescours>ah no
<Pellescours>Ah no my bad, I misread, it’s this function that takes the lock and returns with it held
<Pellescours>vm_page_balance should not be called with this lock being held, but once it returns, then it’s held
<Pellescours>youpi: is this safe to do https://github.com/etienne02/gnumach/blob/master/vm/vm_page.c#L939 ? Both are unsigned so in one particular case a + b may overflow to exact 0, isn’t it better to do "if(a > 0 || b > 0)" ?
<youpi>well, if they become so large that they overflow, we have a way more important problem somewhere
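[Editor's sketch] The wrap-around case raised in the question can be shown with 32-bit values. The counter type used in vm/vm_page.c is not stated in the log, so `uint32_t` is an assumption here, and both function names are made up. Two unsigned values that each equal 2^31 sum to exactly 0 modulo 2^32, so a test on the sum misses them while a test on each operand does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Test on the sum: can be fooled when a + b wraps around to 0. */
static bool
nonzero_by_sum (uint32_t a, uint32_t b)
{
  return (uint32_t) (a + b) != 0;
}

/* Test on each operand: immune to wrap-around. */
static bool
nonzero_by_or (uint32_t a, uint32_t b)
{
  return a > 0 || b > 0;
}

/* Self-check covering the ordinary case and the wrap-around case. */
static void
overflow_demo (void)
{
  assert (nonzero_by_sum (1, 2) && nonzero_by_or (1, 2));
  assert (!nonzero_by_sum (0x80000000u, 0x80000000u));   /* sum wraps to 0 */
  assert (nonzero_by_or (0x80000000u, 0x80000000u));
}
```

As noted in the reply above, counters that large would point to a bigger problem elsewhere, so the difference is mostly of theoretical interest.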
<damo22>Pellescours: one of my commits needs to be dropped, the one relating to rss_getkey
<damo22>then it compiles
<Pellescours>ah ok
<Pellescours>I’ll see that later I’m gonna go to bed
<Pellescours>For the paging issue, I’m still trying to determine if it’s a gnumach or a hurd problem. I can easily reproduce it but debugging is not easy... (I don’t really know where to look)
<Pellescours>I’m wondering if adding some tests to isolate components may help (like one that only involve rumpdisk that read a lot. one that write a lot, one that only involve ext2fs, ...)