IRC channel logs

2026-06-16.log

<anemofilia>I've made a get request that returned me a vu8 as body, how am I supposed to convert it to a png?
<anemofilia>nevermind, got it
<ArneBab_>old: thank you! I was AFK until late yesterday evening and didn’t check IRC afterwards.
<sneek>Welcome back ArneBab_, you have 1 message!
<sneek>ArneBab_, old says: in case you missed it: https://codeberg.org/guile/guile/pulls/201
<ArneBab_>old: I don’t see obvious problems, but with the change so deep inside the ports, I think proper review of https://codeberg.org/guile/guile/pulls/201/files should have a second pair of eyes with more experience in the C side of suspendable ports, like wingo_
<old>I'm pretty confident with the code itself
<old>I was mostly wondering if that fixed your issue
<old>I can ping wingo and wait a few weeks
<dsmith>heh
<dsmith>sneek, botsnack
<sneek>:)
<ArneBab_>old: then I’ll check my issue and report back.
<old>thx :-)
<dariqq>is there a way to have srfi-19 accept HH:MM format for the timezone? This seems supported by strptime
<ArneBab_>old: I now pulled your commit into my list of local guile patches (it’s kind of crazy that I have 12 commits on top of Guile origin/main for 9 features of which I need 2 for my local setup)
<ArneBab_>old: but I still see the problem. gdb backtrace shows: #4 0x00007fe385b14291 in port_poll (port=port@entry=0x7fe379cfc6e0, events=events@entry=4, timeout=timeout@entry=-1) at /home/arne/eigenes/Programme/guile/libguile/ports.c:1447
<ArneBab_>old: https://paste.debian.net/hidden/62caa256
<ArneBab_>old: line 1447 is the rv = poll (pollfd, nfds, timeout); within the while loop you added (before the switch (err)).
<old>could it be then that asyncs are disabled while doing the poll?
<ArneBab_>how could I check that?
<ArneBab_>ACTION is trying to rebuild guile with make clean just to be sure it’s no artefact
<old>go to the frame of port_poll, the variable `t' ought to be present if not optimized away
<old>(gdb) frame 4
<old>(gdb) p t->block_asyncs
<ArneBab_>(gdb) p t->block_asyncs
<ArneBab_>$2 = 0
<ArneBab_>old: does that mean asyncs are not blocked?
<ArneBab_>(gdb) p (*t) → https://paste.debian.net/hidden/6f010717
<ArneBab_>(sorry that it took a while: had to wait until Guile finished recompiling)
<old>paste entry could not be found eh
<ArneBab_>that’s strange … I pasted again: https://paste.debian.net/hidden/18cba9cc
<dsmith-work>old, those speedups are impressive!
<old>ArneBab_: asyncs are not blocked and pending_asyncs is 0x304, which is '()
<old>so no asyncs hm
<old>oh but wait a second here
<old>port-port is only called as the default waiter
<old>as if you did not install suspendable ports ?
<old>dsmith-work: which speedups?
<old>s/port-port/poll-port
<dsmith-work>old the source location stuff
<old>ohh right
<old>the memory usage dropping is great indeed
<old>but time wise, oddly enough we are not "there" yet
<old>but I did shave a couple of seconds off bootstrapping Guile and compiling the whole thing with this
<old>unfortunately the technique for source location compression is quite limited on 32-bit machines
<dsmith-work>Hmm, I'm at work, and don't have the email in front of me, but ISTR something like 17s, down from over 2 minutes. About 10x
<ArneBab_>old: suspendable-ports should be installed when importing fibers, right?
<old>dsmith-work: hmm I think you have misread.
<ArneBab_>ah, no, on run-fibers, #:install-suspendable-ports? is the default, so it should be installed.
<old>the 17 s row was for compiling a module X
<old>and the 2 minutes row was for compiling a module Y
<old>and the columns were for with and without the compression
<old>the best result I had was for compiling gnu/packages/python-xyz.scm, which is 41708 lines
<old>from 2m23 to 2m17
<old>so 6 s on 2 mins. Less than 5%
<ArneBab_>old: I’m piping chars through sockets created via socketpair PF_UNIX SOCK_STREAM 0 and then set as nonblocking via (fcntl socket F_SETFL (logior O_NONBLOCK (fcntl socket F_GETFL)))
<ArneBab_>old: is something missing there?
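For reference, the setup described above (socketpair PF_UNIX SOCK_STREAM 0, then O_NONBLOCK via fcntl) corresponds roughly to the following C. This is only a sketch of the system calls involved, not the code from the linked repository; the helper names are made up for illustration, and setting both ends nonblocking is an assumption.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set O_NONBLOCK on fd while keeping the other
   status flags, like (logior O_NONBLOCK (fcntl socket F_GETFL)).
   Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: create a connected pair of Unix stream sockets
   and make both ends nonblocking (assumption: both ends are wanted).
   Returns 0 on success, -1 on error. */
static int make_nonblocking_pair(int sv[2])
{
    if (socketpair(PF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (set_nonblocking(sv[0]) == -1 || set_nonblocking(sv[1]) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    return 0;
}
```

With a pair set up this way, a read on an empty end or a write to a full end returns -1 instead of blocking, so any blocking has to come from an explicit wait such as poll.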
<old>ArneBab_: when you spawn a fiber, fiber should indeed set the parameter for you
<old>are you in a fiber?
<ArneBab_>yes
<old>hm
<old>the fact that port-poll is called is problematic then
<old>this is only called when suspendable ports are installed but the wait is not done inside a fiber
<ArneBab_>The reason may be that I check for char-ready?
<ArneBab_>maybe because I’m redirecting stdout and stderr and stdin -- could those still be outside the fiber?
<old>it would help if we had a backtrace in Guile I think
<ArneBab_>the problem is that I can’t get in there :-(
<old>ehh we really need a GDB integration to get a backtrace of the VM from GDB
<old>are you using a custom scheduler in fiber?
<ArneBab_>I don’t think so, I just use run-fibers
<old>hmm
<ArneBab_>and spawn-fiber
<old>right hm
<old>it would be interesting to get the value of (current-write-waiter) for your fibers
<old>when they spawn
<ArneBab_>how do I get that? Just after calling spawn-fiber?
<ArneBab_>the code is available at https://hg.sr.ht/~arnebab/terras-heritage/browse/enter/helpers.w?rev=tip#L667
<ArneBab_>and https://hg.sr.ht/~arnebab/terras-heritage/browse/enter/websocket.w?rev=tip#L285
<ArneBab_>(I’ll gladly turn it to parenthesized form if that helps you)
<old>something like this: https://paste.sr.ht/~old/ec3df2a3dcafe436c28d616535606312ee7834ba
<old>I can actually reproduce locally I remember
<ArneBab_>should I do that before run-fibers?
<old>yes before
<old>it should not matter really we should see the same value everywhere
<ArneBab_>I see ;;; (#<procedure wait-for-readable (port)> #<procedure wait-for-writable (port)>) regardless of whether the run blocks or works. The error message is that the output port gets closed.
<ArneBab_>(many of those)
<old>hm
<ArneBab_>sorry: the error message of the closed output port is OK: that’s from the previous run when reloading the page (the browser closes the socket then)
<old>okay let's try something else
<old>we will force a backtrace
<ArneBab_>should I remove the print waiter code again?
<old>yes
<old> https://paste.sr.ht/~old/a3c9d1842034b7ebd631ed734c4ec960a69532fc
<old>this should throw to whoever is calling poll-port and blocking
<ArneBab_>(there is a moduls-set instead of module-set -- fixed it in the previous code, too)
<ArneBab_>I think I also have to remove my with-exception-handler blocks …
<old>yes sorry I mistyped
<ArneBab_>no problem, just wanted to tell you so you don’t wonder how it could work
<ArneBab_>And thank you very much for your help!
<ArneBab_>I don’t see the error yet …
<ArneBab_>(no backtrace) -- will search for more places where it might get eaten
<old>if all else fails, open a file at $HOME/trace.txt and dump the backtrace there
<old>my pleasure to help btw
<ArneBab_>so instead of throw 'error args, open-output-file /tmp/trace.txt and write there?
<old>yup
<old>perhaps open with append if you get the error multiple times
<old>and then call backtrace (but first change current-error-port)
<ArneBab_>ah, with-error-to-port?
<old>yes
<old>or just parameterize current-error-port
<dsmith-work>old, Ahh  thanks
<ArneBab_>old: the error doesn’t seem to get thrown
<ArneBab_>I see scm_i_with_continuation_barrier in here -- can with-exception-handler cause the ports to be non-suspendable?
<ArneBab_>(in the gdb backtrace, I checked that again)
<ArneBab_>the throw isn’t triggered -- is there another port-poll?
<old>hmm
<old>you are using guile main?
<ArneBab_>yes -- plus a few patches
<old>I don't see it being used anywhere else
<ArneBab_>I’ll try popping all the patches except for yours
<old>oh hold on
<old>scm_i_write_bytes and scm_i_read_bytes
<old>are calling it also
<ArneBab_>ah, so it runs through C and can’t be intercepted from Scheme?
<old>right
<ArneBab_>would it help you to get a REPL into the running guile?
<old>not anymore it's from the C side
<old>eh
<old>you mention it was a socket right?
<old>is the fd getting polled the socket?
<old>IIRC, you were hanging on POLLOUT, so writing
<old>if you have gdb up again
<old>check what is the value of the fd we are waiting on
<old>and check what is that fd
<old>so something like:
<old>(gdb) frame 0
<ArneBab_>that’s at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
<old>(gdb) call (int) port_write_wait_fd(port)
<ArneBab_>No symbol "port_write_wait_fd" in current context.
<ArneBab_>do I need to take another frame?
<old>hmm
<old>it was inlined
<old>can you check pollfd[0].fd ?
<ArneBab_>#0 __syscall_cancel_arch ()
<ArneBab_>do you mean frame 4?
<ArneBab_>at port_poll?
<old>in the frame of port_poll
<old>yes sorry
<ArneBab_>p pollfd[0].fd is 26
<ArneBab_>p pollfd ⇒ {{fd = 26, events = 4, revents = 0}, {fd = 3, events = 1, revents = 0},
<ArneBab_> {fd = 997101568, events = 32681, revents = 0}}
<old>okay now
<old>(gdb) info proc
<old>you should get the pid of your process
<old>then:
<ArneBab_>28824
<old>(gdb) shell ls /proc/PID/fd | grep 26
<old>so (gdb) shell ls /proc/28824/fd | grep 26
<ArneBab_>that returns 26
<ArneBab_>lrwx------ 1 arne users 64 16. Jun 22:50 26 -> socket:[106977518]
<old>ls -l ;)
<ArneBab_>^ with ls -l
<old>okay so it's indeed a socket
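The same check as ls -l /proc/PID/fd can be done programmatically with readlink. A minimal Linux-only sketch; the helper names are hypothetical, and the "socket:[inode]" form is the one shown in the ls -l output above.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: store the target of /proc/<pid>/fd/<fd> in buf
   as a NUL-terminated string. Returns 0 on success, -1 on error.
   Linux-specific: relies on the /proc filesystem. */
static int fd_target(long pid, int fd, char *buf, size_t size)
{
    char path[64];
    ssize_t n;

    if (size == 0)
        return -1;
    snprintf(path, sizeof path, "/proc/%ld/fd/%d", pid, fd);
    n = readlink(path, buf, size - 1);   /* readlink does not NUL-terminate */
    if (n == -1)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* Return 1 if the link target has the "socket:[inode]" form, else 0. */
static int target_is_socket(const char *target)
{
    return strncmp(target, "socket:[", 8) == 0;
}
```

For the process above this would be called as fd_target(28824, 26, buf, sizeof buf), subject to the usual permission checks on /proc.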
<old>so is it possible that the other side has shut down the connection and we are hanging there on writing?
<ArneBab_>how can I check that?
<old>POLLRDHUP
<old>in port-poll
<old>you can add that flag
<old>pollfd[nfds].events = events & (POLLOUT | POLLPRI);
<old>becomes: pollfd[nfds].events = events & (POLLOUT | POLLPRI | POLLRDHUP);
<old>this should detect a hangup from the other half of the connection and resume the poll
<old>if _this_ does not work, I'm really stuck hehe
<ArneBab_>Do you mean in ports.c?
<old>yes
<ArneBab_>same for POLLIN?
<ArneBab_>pollfd[nfds].events = events & (POLLIN | POLLPRI); ?
<old>hm
<old>no just on the POLLOUT since it's this path that we are triggering
<ArneBab_>rebuilding guile to check whether pollfd[nfds].events = events & (POLLOUT | POLLPRI | POLLRDHUP); works
<ArneBab_>it still blocks
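One detail worth checking in the line quoted above: a bitwise AND can only clear bits, so if the caller passes events = 4 (POLLOUT on Linux, as in the gdb output), widening the mask does not by itself request POLLRDHUP. The sketch below shows the difference. It is not the Guile source; all helper names are hypothetical, and whether this explains the hang is an open question.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc exposes POLLRDHUP under _GNU_SOURCE */
#endif
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef POLLRDHUP
#define POLLRDHUP 0x2000       /* assumption: the value used on x86 Linux */
#endif

/* The mask as quoted in the log: only filters the bits already in events. */
static short mask_only(short events)
{
    return events & (POLLOUT | POLLPRI | POLLRDHUP);
}

/* Variant that really asks for POLLRDHUP whenever POLLOUT is requested. */
static short mask_and_add_rdhup(short events)
{
    short e = events & (POLLOUT | POLLPRI);
    if (e & POLLOUT)
        e |= POLLRDHUP;
    return e;
}

/* Demo: create a Unix socketpair, close one end, poll the other end
   once without blocking, and return its revents (-1 on error). */
static int revents_after_peer_close(short events)
{
    int sv[2];
    struct pollfd p;
    int rv;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    close(sv[1]);                      /* the peer goes away */
    p.fd = sv[0];
    p.events = events;
    p.revents = 0;
    rv = poll(&p, 1, 0);               /* timeout 0: never blocks */
    close(sv[0]);
    return rv == -1 ? -1 : p.revents;
}
```

Note that POLLHUP is reported by poll whether or not it was requested, so a fully closed peer should wake the poll even without POLLRDHUP; a poll that stays blocked suggests the other end is still open.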
<old>hm
<ArneBab_>did you get it reproduced on your system?
<ArneBab_>(you may need to reload the browser tab a few times until you hit the problem)
<old>you could use lsof
<old>to check the status of the socket
<old>yes I did before
<old>(I'm working on other stuff in parallel so I can't now)
<ArneBab_>no problem -- thanks a lot (again) that you help me anyway!
<old>you can do the same manipulation with GDB to get the socket number in /proc/PID/fd
<old>and then check
<old>the file /proc/net/tcp
<old>and grep for the number of the socket
<old> socket:[106977518]
<old>that number
<old>if using a unix socket, try /proc/net/unix instead
<old>you can always use ss also
<old>ss -ia | grep 106977518
<old>that ought to give you the status of that socket
<ArneBab_>I get lsof: status error on /proc/2185/fd/socket:[107025026]: No such file or directory
<old>alas, my knowledge of networking is limited
<old>hm what about ss -ia | grep 107025026
<ArneBab_>not found -- but I assume that it’s not a network socket, but a socketpair created from guile
<old>ahh right
<old>could it be the socket is full ?
<old>nobody is reading the other end
<old>and you deadlock
<ArneBab_>deadlock sounds plausible …
<ArneBab_>(like this feels)
<ArneBab_>but I don’t understand why …
<ArneBab_>I’ve been tracing this until I hit the "sleep doesn’t return".
<ArneBab_>maybe I should merge stderr handling and stdout handling into the same fiber -- but it feels wrong to not find the problem's source.
<old>who's reading the other half of the socket ?
<old>and also, who's writing?
<old>to me this sounds like reader starvation and you end up with deadlock in C
<ArneBab_>do you mean the reader does not get new chars?
<old>well the reader is not favored to get them
<old>in cooperative mode, you might end up with a writer writing lots of bytes and, I haven't looked at the code, but ending up in C waiting for the port to accept more bytes to write
<old>ideally, the writer would be preempted before that and let the reader go
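The "socket is full and nobody is reading" hypothesis can be illustrated in isolation. In this sketch (hypothetical helper names, nonblocking Unix socketpair assumed as described earlier) a writer fills the socket buffer; after that, poll for POLLOUT reports nothing until the other end is drained. With timeout -1, as in the port_poll frame, the same poll would wait for as long as no reader runs.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a Unix stream socketpair with both ends nonblocking. */
static int nonblocking_pair(int sv[2])
{
    int i;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    for (i = 0; i < 2; i++) {
        int flags = fcntl(sv[i], F_GETFL);
        if (flags == -1 || fcntl(sv[i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }
    return 0;
}

/* Write until the kernel refuses more data (EAGAIN/EWOULDBLOCK).
   Returns the number of bytes written, or -1 on any other error. */
static long fill_until_full(int fd)
{
    char chunk[4096];
    long total = 0;
    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = write(fd, chunk, sizeof chunk);
        if (n > 0) { total += n; continue; }
        if (n == -1 && errno == EINTR) continue;
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) return total;
        return -1;
    }
}

/* Read and discard everything currently buffered.
   Returns the number of bytes read, or -1 on error. */
static long drain(int fd)
{
    char chunk[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, chunk, sizeof chunk);
        if (n > 0) { total += n; continue; }
        if (n == 0) return total;                       /* EOF */
        if (errno == EINTR) continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) return total;
        return -1;
    }
}

/* Return 1 if fd is writable right now, 0 if not, -1 on error. */
static int writable_now(int fd)
{
    struct pollfd p;
    p.fd = fd;
    p.events = POLLOUT;
    p.revents = 0;
    if (poll(&p, 1, 0) == -1)
        return -1;
    return (p.revents & POLLOUT) ? 1 : 0;
}
```

A typical sequence is nonblocking_pair(sv), then fill_until_full(sv[0]), after which writable_now(sv[0]) stays 0 until drain(sv[1]) has run. If reader and writer share one scheduler thread and the writer blocks in C, the reader never gets that chance.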
<ArneBab_>for a socket, #<input-output: socket N> gives me the fd, right?
<old>bbl, gotta go make dinner :-)
<ArneBab_>Enjoy :-)
<ArneBab_>(I’ll be afk in about an hour, gotta sleep ☺)
<ArneBab_>timezones …
<ArneBab_>Thank you again!
<ekaitz>mwette: ping ping