IRC channel logs

2024-02-08.log


<rlb>civodul: made a little more progress on the test-signal-fork hang. Looks like it may be the parent that's hanging when primitive-fork tries to stop the signal thread. It tries to grab the signal_delivery_thread mutex, but it's already held by start_signal_delivery_thread, which never returns from scm_spawn_thread -- don't know why yet.
<rlb>(...so when it hangs start_signal_delivery_thread is racing with primitive-fork)
<civodul>rlb: hey! so a race condition as the thread is started and trying to be stopped around the same time?
<rlb>Yes, I think so - it looks like start_ has the mutex, primitive-fork is blocked trying to get it, and start_ never returns from scm_spawn_thread.
<rlb>(In the parent.)
<rlb>I'll keep poking at it - just wanted to mention what I'd seen, in case it rang any bells.
<rlb>It's also pretty easy to reproduce here via "make -j5" with https://codeberg.org/rlb/guile/src/branch/rev-parallel-tests
<rlb>(4 core machine)
<rlb>Imagine other -j's might work fine too, that's just what I've been using.
<rlb>Oh, and the parent signal_delivery_thread never prints a message after that (have an fprintf first thing), so presumably we never get that far either.
<rlb>...and call-with-new-thread appears to block on the wait-condition-variable at the end, though the lambda that sets the thread results does finish. I wondered about some kind of status update race there, but haven't seen anything wrong with the cv/mutex use yet.
<civodul>hmm!
<civodul>i’m at work right now, dealing with things that are maybe even less fun than a deadlock
<civodul>i’ll try to make time next week to investigate
<civodul>glad you managed to get this much debugging info already!
<rlb>Oh, wait, do we need to check whether the thread status has already been delivered *before* we wait?
<rlb>i.e. if that goes first, will we wait forever?
<rlb>(if the status is already set)
<rlb>ACTION considers that more carefully... 
<rlb>And no worries, or rush :)
<rlb>s/that/the status delivery/
<rlb>nvm
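A minimal sketch of the general pattern rlb is asking about above: if the status can be delivered before the waiter reaches the condition variable, the waiter has to test the status under the mutex before it waits, because a signal sent while nobody is waiting is simply lost. This is generic pthreads C and not Guile's code; `struct handshake`, `child_main`, and `wait_after_child_finished` are hypothetical names, and nothing here claims that call-with-new-thread gets this wrong.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical handshake between a new thread and its creator. */
struct handshake
{
  pthread_mutex_t lock;
  pthread_cond_t cond;
  bool delivered;   /* predicate, only read or written with lock held */
  int value;        /* stand-in for the "thread status" being handed over */
};

/* Child: publish the status under the lock, then signal. */
static void *
child_main (void *arg)
{
  struct handshake *h = arg;

  pthread_mutex_lock (&h->lock);
  h->value = 42;
  h->delivered = true;
  pthread_cond_signal (&h->cond);   /* has no effect if nobody waits yet */
  pthread_mutex_unlock (&h->lock);
  return NULL;
}

/* Force the "status delivered first" ordering by joining the child
   before waiting.  An unconditional pthread_cond_wait would block
   forever here; checking the predicate first returns immediately. */
static int
wait_after_child_finished (void)
{
  struct handshake h;
  pthread_t child;
  int rc, value;

  h.delivered = false;
  h.value = 0;
  rc = pthread_mutex_init (&h.lock, NULL);
  assert (rc == 0);
  rc = pthread_cond_init (&h.cond, NULL);
  assert (rc == 0);

  rc = pthread_create (&child, NULL, child_main, &h);
  assert (rc == 0);
  rc = pthread_join (child, NULL);  /* the signal has already been sent */
  assert (rc == 0);
  (void) rc;

  pthread_mutex_lock (&h.lock);
  while (!h.delivered)              /* also guards against spurious wakeups */
    pthread_cond_wait (&h.cond, &h.lock);
  value = h.value;
  pthread_mutex_unlock (&h.lock);

  pthread_cond_destroy (&h.cond);
  pthread_mutex_destroy (&h.lock);
  return value;
}
```

Calling `wait_after_child_finished ()` from a program built with `-pthread` returns the child's value without blocking, since the loop sees `delivered` already set and never calls `pthread_cond_wait`.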
<dsmith>sneek, botsnack
<sneek>:)
<apteryx>is there a convenient interface to use the equivalent of ,backtrace in a program?
<apteryx>the C defined display-backtrace doesn't accept a width argument