IRC channel logs

2026-03-27.log

<apteryx>any idea what I could try instrumenting (with pk or similar) in guile itself to understand a problem (dependency cycle?) when loading guile modules? it hangs, ever eating memory
<old>apteryx: primitive-load-path ?
<apteryx>old: that'd be equivalent to the load-hooks
<apteryx>I think it's not a circular module dependency problem after all, though I'm not sure.
<apteryx>what I see with the load hook is the same in the working vs non-working scenario
<czan>Can you reproduce it in a REPL, and interrupt it to see what's executing?
<apteryx>that's a good idea
<czan>It could be worth doing it a few times and seeing what the traces have in common, to try to get a sense of what the actual problem is.
<apteryx>I was looking at gdb backtraces, attaching to the guile process earlier, but it didn't reveal anything useful (to my eye at least)
<dsmith>apteryx, For debugging weird filesystem issues, the Swiss Army Chainsaw (commonly known as "strace") is often very revealing.
<apteryx>dsmith: agreed, strace is often my go-to tool; the problem I was trying to debug though would not show disk activity, just guile looping (uses 100% cpu, eating memory steadily) -- a cycle
<graywolf>Hello :) Maybe a naive question, but since 3.0.11 has different byte-code from 3.0.9, why does it not have its own ccache directory? There already is *some* version number in .cache/guile/ccache/3.0-LE-8-4.6, so why was it not bumped?
<rlb>ekaitz: the guile-fibers tests appear to have hung overnight in dynamic-wind-star.scm.
<rlb>Did you run guile-fibers make check against an installed version of the fixed guile?
<rlb>Oh, and this time I was using the debian guile-fibers orig.tar.gz, which is I believe 1.4.2, fwiw.
<rlb>But there were also no segfaults through a good number of tests. I don't recall for sure whether it would have died before that test previously, but I suspect so.
<ieure>At least it didn't hang in ford-windstar.jpg https://upload.wikimedia.org/wikipedia/commons/5/57/2001-2003_Ford_Windstar_Limited.jpg
<dsmith>apteryx, Oi!
<ekaitz>rlb: weird
<ekaitz>rlb: 32 or 64 bit?
<ekaitz>i guess 64, right?
<ekaitz>my question the other day to old might be related with this
<ekaitz>it might be that some tests hang
<ekaitz>but i only experienced the hang after removing the fences
<ekaitz>which doesn't mean they didn't happen before
<rlb>ekaitz: right, riscv64, and I just started the tests again, and they got one line further, but may have hung --- I'll give them some time. Someone else is now hammering the machine too ;)
<rlb>(same line, just now two of them)
<ekaitz>that's very weird
<rlb>I can pretty easily test without the git since I also have the debian guile installed there -- I'll do that next and see if it hangs over a few tries.
<rlb>"the jit" :)
<ekaitz>okay yeah please
<rlb>And now that I have it built, I assume it might be pretty easy to test variations if you have any you'd like me to try, i.e. if I can just cherry-pick or whatever without provoking a full bootstrap (shouldn't need one?).
<ekaitz>yeah, great!
<rlb>A full bootstrap likely won't be possible in sane time, now, until the other person is finished with their work.
<rlb>Of course I guess you can do the same thing, but doesn't help unless you can reproduce the hang...
<ekaitz>i'm think about what could produce a hang like that
<ekaitz>thinking*
<rlb>fwiw, what I did was "make install" guile to ~/opt/guile, then unpack debian's fibers 1.4.2 orig.tar.gz, use an "env wrapper" to "../guile-opt-env autogen/configure/make/make-check", and that last step is where it hangs.
<ekaitz>yeah, i do something similar but through guix
<rlb>I suppose another step I could take is to try it from a fibers checkout at 1.4.2 --- I *doubt* the debian orig.tar.gz has been altered in any way that would matter, but...
<rlb>(I think last time I might have used a checkout.)
<ekaitz>i used a checkout
<ekaitz>master branch
<rlb>hmm, then suppose you could have fibers fixes I don't.
<ekaitz>or that I didn't find the hangs
<ekaitz>by chance
<rlb>right
<ekaitz>i could try to use the release code and see what happens
<ekaitz>gimme a sec
<ekaitz>rlb: there are just a few commits from master to 1.4.2
<ekaitz>i'll launch tests for 1.4.2 in my machine
<ekaitz>rlb: all fail in 1.4.2
<ekaitz>straight away
<ekaitz>in master they don't, at least not just when you run them
<rlb>odd --- as mentioned, 1.4.2 gets pretty for for me.
<rlb>"far for"
<ekaitz>i even passed the "expensive" test
<ekaitz>whatever that is
<dsmith>Soo. A question to ask is "What's different?". Is one machine more heavily loaded?
<ekaitz>i get this
<ekaitz>> In procedure struct-ref/immediate: Argument 2 out of range: 10
<ekaitz>maybe i'm testing wrong with 1.4.2
<rlb>I think the one I'm using wasn't loaded the first time (probably), but is now.
<rlb>ekaitz: that didn't go well --- forgot I hadn't upgraded guile in the chroot, so it still had the broken packages; trying again after upgrading.
<rlb>(at least the previous segfaults are still reproducible there, I suppose...)
<ekaitz>:)
<ekaitz>that's why I use guix
<rlb>well, if I don't actually run the right command, I suspect it wouldn't work there either :)