IRC channel logs

2020-06-17.log


***drakonis1 is now known as drakonis
***dsmith-w` is now known as dsmith-work
<dsmith-work>wingo: current master (has arm patch) compiled and built just fine on a stock Raspbian buster rpi.
<dsmith-work>Probably was using the new compiler instead of jit?
<dsmith-work>I mean, with the new compiler, is the jit still used as heavily?
<dsmith-work>However, seems to be stuck in srfi-18.test
<dsmith-work>Running again. All tests pass.
<dsmith-work>Not with JIT_THRESHOLD=0 though.
<dsmith-work>Fails at different places on repeated runs too.
<dsmith-work>So, how do I just run say, "test-language"
<dsmith-work>Like this? ./meta/uninstalled-env test-suite/standalone/test-language
<dsmith-work>Hmm. Can't find q.scm when I do that.
<mwette>maybe meta/guile test-suite/standalone/test-language
<dsmith-work>test-suite/standalone/test-language is a shell script.
<mwette>oh
<mwette>are you trying to run the srfi-18 test?
<dsmith-work>No. It's failing earlier in the "standalone" tests.
<dsmith-work>With JIT_THRESHOLD=0
<dsmith-work>mwette: to recap, first "make check" hung on the srfi-18 test. Second "make check" finished with no errors.
<dsmith-work>GUILE_JIT_THRESHOLD=0 make check has several failures in the "standalone" tests.
<dsmith-work>And not always the same tests fail.
<dsmith-work>So I wanted to just run one of the failing tests.
<dsmith-work>test-language was one.
<dsmith-work>But I can't seem to run it at all.
<dsmith-work>regardless of the jit threshold
<dsmith-work>And this is on an armv7 rpi3
<mwette>Yea, trying to narrow down is what I'd do.
<mwette>Are you building in the source dir or in separate build dir?
<dsmith-work>In source
<dsmith-work>And here is what makes it hard. "GUILE_JIT_THRESHOLD=0 ./meta/guile" worked 11 times and failed with a segfault on the 12th.
***wxie1 is now known as wxie
<mwette>that's a tough one
<mwette>I can execute the contents of test-language one at a time: meta/guile -c "(exit (= 3 (apply + '(1 2))))" --language=elisp
<mwette>but I think that may be pulling modules from already installed stuff.
<mwette>$ meta/guile -c '(display %guile-build-info) (newline)'
<mwette>((buildstamp . 2020-06-17 02:09:19) (CFLAGS . -pthread) (LIBS . -lcrypt -lm) (libguileinterface . 2:1:1) (guileversion . 3.0.2.139-3c32) (extensiondir . /opt/local/lib/guile/3.0/extensions) (pkgincludedir . /opt/local/include/guile) (pkglibdir . /opt/local/lib/guile) (pkgdatadir . /opt/local/share/guile) (includedir . /opt/local/include) (mandir . /opt/local/share/man) (infodir . /opt/local/share/info) (ccachedir
<mwette>. /opt/local/lib/guile/3.0/ccache) (libdir . /opt/local/lib) (localstatedir . /opt/local/var) (sharedstatedir . /opt/local/com) (sysconfdir . /opt/local/etc) (datadir . /opt/local/share) (libexecdir . /opt/local/libexec) (sbindir . /opt/local/sbin) (bindir . /opt/local/bin) (exec_prefix . /opt/local) (prefix . /opt/local) (top_srcdir . /home/mwette/repo/sv/guile) (srcdir . /home/mwette/repo/sv/guile/libguile))
<mwette>
<mwette>so now I see why meta/uninstalled-env
<mwette>$ meta/build-env guile -c '(display %load-path) (newline)' => build dirs
<dsmith-work>So some env isn't getting set right I think.
<dsmith-work>./meta/build-env sh -x test-suite/standalone/test-language
<dsmith-work>The failure is at
<dsmith-work>+ guile --no-auto-compile -l /module/ice-9/q.scm -c 1
<dsmith-work>;;; Stat of /module/ice-9/q.scm failed:
<dsmith-work>;;; In procedure stat: No such file or directory: "/module/ice-9/q.scm"
<dsmith-work>That "/module/ice-9/q.scm" doesn't look right.
<dsmith-work>The script has -l "$top_srcdir/module/ice-9/q.scm"
<dsmith-work>So $top_srcdir isn't set.
<dsmith-work>It's set, but not exported
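A minimal sketch of the "set, but not exported" effect described above. The path `/tmp/example-src` is a made-up placeholder, not the real value of `top_srcdir`; the child `sh -c` stands in for the test script that `meta/build-env` runs.

```shell
#!/bin/sh
# Start from a clean slate so an inherited top_srcdir cannot skew the demo.
unset top_srcdir

# Hypothetical value; the real one comes from the build environment.
top_srcdir=/tmp/example-src

# Set but not exported: the child shell expands $top_srcdir to nothing.
unexported=$(sh -c 'printf "%s" "$top_srcdir/module/ice-9/q.scm"')

# After export, the child shell inherits the variable.
export top_srcdir
exported=$(sh -c 'printf "%s" "$top_srcdir/module/ice-9/q.scm"')

echo "without export: $unexported"
echo "with export:    $exported"
```

The first line printed is the same bogus "/module/ice-9/q.scm" path that showed up in the failing stat call.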
<mwette>something else too; I added export top_srcdir and top_builddir but then get error wrt "/test-language.el" : No such file
<mwette>added to meta/build-env ; then ran meta/build-env .../test-language
<mwette>this is super-hack to debug; but adding srcdir=$top_srcdir/test-suite/standalone export srcdir to meta/build-env; then it works
<mwette>the Makefile in standalone/ sets srcdir for the script
<mwette>dsmith-work: cd test-suite/standalone; make TESTS=test-language check-TESTS
<dsmith-work>Ahh.
<dsmith-work>mwette: Thanks muchly.
<dsmith-work>Ok, for that particular test, I can't get it to fail when run on its own.
<dsmith-work>When run from make check, it seems to fail about 70-80 %
<dsmith-work>(with the jit threshold set to 0)
<dsmith-work>Ah well. Too sleepy to look any more
***wxie1 is now known as wxie
<mwette>sneek: later tell dsmith-work maybe the system needs to be loaded to stress memory or filesystem
<sneek>Okay.
<tohoyn>sneek, botsnack
<sneek>:)
***rekado_ is now known as rekado
<rekado>I think I found one of the bigger problems in wip-elisp
<rekado>an earlier rebase in 2016 of the commit “intern arbitrary constants” introduced two copies of compile-bytecode
<rekado>it may not matter much in this case, but perhaps there are more rebase errors on wip-elisp
<rekado>bah, this is confusing.
<rekado>in one of the commits for interning arbitrary constants a new “to-file?” field was added to “make-assembler”, but it is always #t, so none of the conditional code that checks the field ever runs
***wxie1 is now known as wxie
***wxie1 is now known as wxie
***wxie1 is now known as wxie
<rekado>when building Guile from source, can I bypass the bootstrap and use the fast Guile that is already installed on my system?
<wingo>rekado: i am not precisely sure when that is possible, to GUILE_FOR_BUILD using a Guile not of the same version
<wingo>i guess that during a stable series, the issue would be that the .go file you have loaded for module MOD may differ from the .scm file that you are compiling for MOD, and that some private detail from the different versions may differ in an incompatible way
<wingo>but i don't know when precisely there would be problems
<wingo>maybe there is no issue
<manumanumanu>ahoy hoy!
<manumanumanu>Is there a reason why predicates like integer? don't return the integer they test? I mean, I understand that the standard says it should, but having them return the tested value (if truthy) would be great for cond => clauses and would make all predicates composable
<wingo>fwiw you can get similar effects with `match`
<wingo>e.g. `(match (foo) ((? integer? x) x))`
<manumanumanu>wingo: but you still don't have composability. I'm rolling my own, mostly to test my emacs text editing ability. Managed to get a list of all predicates from the procedure index in under one minute, half of which my poor poor computer was chugging through emacs macros :H
<wingo>yeah, but it is a consistent point on the design space :)
<janneke>manumanumanu: yeah, having pair? return the list would be "handy"
<wingo>procedure ends with ? -> it returns a boolean
<manumanumanu>sure, but since we only have one false value, returning a truthy value would be just fine
<janneke>i've been using non-idiomatic "pair??" and "number??" functions a lot...but all that code is not generally usable, so i stopped doing it
<wingo>we have three false values, depending on which predicate you are looking for :)
<wingo>#nil, #f, '()
<manumanumanu>#f #nil and?
<manumanumanu>oh
<wingo>false in elisp
<wingo>honestly i don't use pair? any more at all; all hail match
<manumanumanu>janneke: anyway, I'm creating a module called composable-predicates that returns the tested value if true. I doubt I'll use it for other things than the repl
<janneke>manumanumanu: ah, yes for the repl that could be nice
<janneke>wingo: hmm, that's one area where guile is no match for mes
<janneke>eh, mes is no match for guile
<manumanumanu>janneke: the only problem is I am starting to get used to my repl niceness :D I have found myself writing my lambda shorthand in code so many times #%(let ((a (abs %1))) (/ a %2)) => (lambda (%1 %2) ...). and I have started to feel tempted to use my for loops more than once.
<janneke>for loops? you must be kidding?
*janneke hates off-by-one errors caused by not filter-mapping with a passion
<janneke>it's one of the things that initiated my moving away from python
<manumanumanu>janneke: they are racket-styled ones. I did a rewrite. They are really just syntactic sugar for a named-let (mostly) left fold
<manumanumanu>(for/list ((i (in-naturals)) (e (in-list lst))) (cons i e)) => '((0 . e0) (1 . e1) ...)
<manumanumanu>generic and extensible and I haven't found a case where the code isn't as good as a named let.
<janneke>manumanumanu: ah, i don't know racket; suppose it's a matter of taste, experience, or perhaps cranial wiring
<manumanumanu> https://hg.sr.ht/~bjoli/guile-for-loops
*janneke 's head is wired better for (map cons '(e0 e1) (iota (length '(e0 e1))))
<janneke>i suppose the computer doesn't care all that much
<manumanumanu>janneke: sure. I write most of my code like that as well. The for-loops are generic, though. You can do (in-string ...) and (in-generator ...). I could add an (in-file ...) although that would have about 10% overhead
<manumanumanu>and since it is really just a fold, there are many variants: for/sum, for/foldr, for/and etc.
<manumanumanu>in fact, all loops are just syntactic sugar for for/fold and for/foldr (the latter is only used for for/stream, but could be used to make for/list be non-tail recursive like guile's map).
<manumanumanu>But now I see I never added for/hash(q,v) :D
*RhodiumToad isn't really a fan of looping constructs that try to be too generic
<manumanumanu>RhodiumToad: "too generic" is pretty broad. I decided to not implement arbitrary transformation support, but i suspect your limit comes earlier :D
<manumanumanu>RhodiumToad: I am somewhat curious. What are your objections? My for loops aren't generic in that they blindly accept any sequence. You have to specify (in-list ...) etc except for literals. I believe there is a benefit to having a unified syntax for iterating over sequences, especially once you otherwise would have used costly combinations of (map blah (filter blah (zip blah lst lst2)))
<civodul>hmm the backtrace i reported for i686 at https://paste.debian.net/1152366/ persists
<civodul>after rebuilding bootstrap/* prebuilt/*
<civodul>it's really the prebuilt .go files that cause problems
<civodul>because building from a fresh checkout is ok
***terpri__ is now known as terpri
<dsmith-work>Wednesday Greetings, Guilers
<sneek>Welcome back dsmith-work, you have 1 message!
<sneek>dsmith-work, mwette says: maybe the system needs to be loaded to stress memory or filesystem
<dsmith-work>mwette: So with repeated runs, ran 5 times no error, failed on 6th.
<dsmith-work>like:
<dsmith-work>while GUILE_JIT_THRESHOLD=0 make TESTS=test-language check-TESTS; do :;done
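A small variation on the loop above that also counts how many runs passed before the first failure, which makes the "failed on 6th" kind of report easier to collect. The helper name `run_until_failure` and the `flaky` stand-in are hypothetical, introduced only for this sketch.

```shell
#!/bin/sh
# run_until_failure CMD...: rerun CMD until it fails, then report
# how many runs succeeded before that failure.
run_until_failure() {
    passes=0
    while "$@"; do
        passes=$((passes + 1))
    done
    echo "failed after $passes successful run(s)"
}

# Stand-in for the flaky test: succeeds twice, then fails.
# The counter lives in a file so it survives subshells.
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky() {
    n=$(cat "$counter_file")
    n=$((n + 1))
    echo "$n" > "$counter_file"
    [ "$n" -le 2 ]
}

result=$(run_until_failure flaky)
echo "$result"
rm -f "$counter_file"
```

Against the real test, the call would presumably look like `run_until_failure env GUILE_JIT_THRESHOLD=0 make TESTS=test-language check-TESTS`, run from test-suite/standalone.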
<civodul>hmm bisect points at cb8cabe85f535542ac4fcb165d89722500e42653 but i'm skeptical
<mwette>dsmith-work: Is that more-or-less repeatable? If you start a big compile job in the background, then how many times?
<chrislck>if there's talk of augmenting integer? and pair?, how about augmenting and=> too to accept multiple procs - https://paste.debian.net/1152476/
<mwette>bad memory?
<dsmith-work>mwette: Probably not. I hope not!
<mwette>maybe add: guile -c '(use-modules (system foreign)) (display (scm->pointer (lambda () #t))) (newline)' and see if there is pattern? Just throwing out ideas here.
<dsmith-work>The addresses of the jit code do change from run to run, at least they did some weeks ago when I started looking into this.
<dsmith-work>So I have a nice core file. Can't seem to get gdb to use it with meta/gdb-uninstalled-guile
<dsmith-work>Doh! Cause it's a shell script!
<dsmith-work>Ahh, that script is running gdb with --args
<dsmith-work>../../meta/uninstalled-env ../../libtool --mode=execute gdb ../../libguile/guile core
<dsmith-work>Program terminated with signal SIGSEGV, Segmentation fault.
<dsmith-work>#0 0x76e965cc in scm_is_string (x=0x0) at strings.h:293
<dsmith-work>Having a SCM be NULL is "not good".
<dsmith-work>Ok, another segfault. Totally different location.
<dsmith-work>Both backtraces suggest corrupt stack
<mwette>dsmith-work: good luck! -- gotta go now
<dsmith-work>Is it normal for __pthread_cond_wait to have mutex=0x0 ?
<RhodiumToad>no
<dustyweb>hi #guile
<dustyweb>!
<civodul>howdy dustyweb!
<dustyweb>civodul: how goes the hacks?
<civodul>good!
<civodul>though i was hoping to release 3.0.3 and found an issue with pre-built .go files
<civodul>& you?
<civodul>do we have CapTP for Guile yet? :-)
<rekado>bah, I know too little to make sense of Guile Emacs crashes
<rekado>I now get segfaults right away, during the Guile Emacs build
<rekado>0x00000000005f4385 in calloc (nmemb=<error reading variable: DWARF-2 expression error: Loop detected (257).>, size=size@entry=1) at gmalloc.c:1510
<rekado>a backtrace isn’t helpful; it just indicates that calloc is called in a loop
<civodul>does Emacs define its own calloc or something?
<RhodiumToad>it might. emacs is a bit weird about memory
<civodul>yeah
***leoprikler_ is now known as leoprikler
<dsmith-work>What I'm seeing has got to be some kind of stack overwriting I think.
<dsmith-work>Hey. That reminds me. Saw some warnings fly past. About int and pointer not the same size.
<RhodiumToad>what are you building?
<dsmith-work>RhodiumToad: Guile
<RhodiumToad>on arm?
<RhodiumToad>I can have a look if you like
<dsmith-work>Sorry don't have a log
<dsmith-work>Now rebuilding on 32bit intel
<dsmith-work>vm-engine.c: In function 'vm_regular_engine':
<dsmith-work>../libguile/scm.h:176:23: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
<dsmith-work> # define SCM_PACK(x) ((SCM) (x))
<dsmith-work> ^
<dsmith-work>Like that
<RhodiumToad>hm
<RhodiumToad>what version exactly are you compiling?
<dsmith-work>v3.0.2-141-g2e2e13c40
<dsmith-work>Current master
<dsmith-work>RhodiumToad: https://paste.debian.net/1152510/
<RhodiumToad>well those are obvious bugs
<RhodiumToad>some of them, anyway
<RhodiumToad>hm
<RhodiumToad>so the hash things look harmless but the compiler can't really prove that
<RhodiumToad>the others... hard to tell from the code
*RhodiumToad tries a build
<civodul>dsmith-work: i noticed those warnings on i686 but apparently the issue has always been there
<civodul>looks fishy tho
*RhodiumToad still waiting for configure, it takes a while on the raspberry pi
*RhodiumToad checks that the turbo setting is indeed on
<rekado>civodul: yes, calloc is defined in src/gmalloc.c
<civodul>so it might be ELF shenanigans
<civodul>does it use the old glibc malloc hooks?
<civodul>well dunno, just vague ideas
<civodul>dsmith-work: if you have time, could you try "make dist" on x86_64 and then build from that tarball on 32-bit Intel?
<RhodiumToad>date: illegal time format
<RhodiumToad>echo ' { "buildstamp", "'`date -u +'%Y-%m-%d %T' -d @$$BUILD_DATE`'" }, \' >> libpath.tmp <== non-portable use of `date`
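For context on the non-portable `date` call: `-d @SECONDS` is a GNU extension, while BSD `date` takes the epoch value through `-r SECONDS`. One common workaround, sketched here as an assumption and not as what the Guile build system actually does, is to try the GNU form first and fall back to the BSD form. The function name `format_epoch` is made up for this example.

```shell
#!/bin/sh
# format_epoch SECONDS: print a UTC timestamp for an epoch value.
# Tries the GNU date syntax first, then the BSD one.
format_epoch() {
    date -u -d "@$1" +'%Y-%m-%d %T' 2>/dev/null ||
        date -u -r "$1" +'%Y-%m-%d %T'
}

# Epoch 0 gives a known reference value on either implementation.
stamp=$(format_epoch 0)
echo "$stamp"
```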
<RhodiumToad>bleh. compile failed
<RhodiumToad>ld: error: ./.libs/libguile-3.0.so: undefined reference to GC_get_suspend_signal [and 3 more]
<RhodiumToad>bah
<RhodiumToad>missed a configure option
***heisenberg-25 is now known as habush
<RhodiumToad>tum te tum, I think this is going to take "a while"
<RhodiumToad>BOOTSTRAP GUILEC ice-9/eval.go -- 21 minutes and counting
<dsmith-work>civodul: Sure. Building now.
<dsmith-work>RhodiumToad: Yes. Takes. A. While.
<RhodiumToad>well it's 6 minutes into psyntax-pp.go now
<RhodiumToad>unfortunately, it seems to be limited to one thread
<dsmith-work>Shouldn't those ptr<->int casts be going through uintptr_t ?
<RhodiumToad>wouldn't really help, I think
<RhodiumToad>however, it looks like I get different warnings than you do.
<dsmith-work>Probably compiler differences
<RhodiumToad>yes.
<RhodiumToad>(I'm using clang)
<dsmith-work>Also, that was from x86, not armv7
<RhodiumToad>ah
<dsmith-work>but still 32bit
<dsmith-work>Builds faster.
<dsmith-work>Like that joke about the guy looking for his keys in kitchen, but lost them in the basement, because the light was better there.
<RhodiumToad>hm, at least it's using more threads now
*RhodiumToad checks cpu temp
<dsmith-work>Erh?
<dsmith-work>Hadn't thought of that.
<dsmith-work>Could the poor little thing be just overheating and having fits?
<RhodiumToad>unlikely
<dsmith-work>Agreed
<dsmith-work>Sure would be nice to have that rewindable syscall thingy civodul used.
<dsmith-work>But it's Intel only.
<dsmith-work>civodul: make dist completed. Now building on 32bit box
<dsmith-work>civodul: look familiar:
<dsmith-work>ice-9/boot-9.scm:1669:16: In procedure raise-exception:
<dsmith-work>In procedure bytevector-u32-native-set!: Argument 3 out of range: 654452552
<dsmith-work>Makefile:1927: recipe for target 'ice-9/eval.go' failed
<RhodiumToad>dsmith-work: just so I can try and reproduce, what's the exact sequence of things you're doing?
<dsmith-work>On 64bit: ./autogen.sh && ./configure && nice make -j5 && nice make dist
<dsmith-work>On 32bit: ./configure && nice make
<dsmith-work>After copying over a tarball and unpacking it
<civodul>dsmith-work: ah ha!
<civodul>we have something
<civodul>thanks for testing!
<dsmith-work>Yes?
<civodul>i've been looking at it but it's tricky because there's one level of indirection
<dsmith-work>BTW: that number 654452552 is the same. 0x27022748
<RhodiumToad>so the issue is with bytecode built on 64bit and then run on 32bit?
<dsmith-work>It looked like it might be part of a string at first.
<dsmith-work>RhodiumToad: Yes.
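The decimal-to-hex correspondence quoted above can be checked with plain shell arithmetic. This sketch only verifies the conversion and splits the value into bytes; it does not say anything about where the value comes from.

```shell
#!/bin/sh
# The out-of-range argument reported by bytevector-u32-native-set!.
value=654452552

# Decimal to zero-padded 32-bit hex.
hex=$(printf '%08x' "$value")
echo "hex: 0x$hex"

# Individual bytes, most significant first.
b3=$(( (value >> 24) & 0xff ))
b2=$(( (value >> 16) & 0xff ))
b1=$(( (value >> 8) & 0xff ))
b0=$(( value & 0xff ))
bytes=$(printf '%02x %02x %02x %02x' "$b3" "$b2" "$b1" "$b0")
echo "bytes: $bytes"
```

Two of the four bytes (0x27 and 0x48) fall in the printable ASCII range, which fits the remark that the value looked like it might be part of a string.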
<dsmith-work>civodul: What do you have?
<civodul>dsmith-work: i just meant that you've been able to reproduce the bug, so it's real
<civodul>very real
<civodul>so yes, the issue seems to be with bytecode built on 64-bit
<civodul>thing is, .go files in prebuilt/ are built with -O2, not with the new compiler
<civodul>so in theory, that hasn't changed
<civodul>"rm prebuilt/32-bit-little-endian/system/vm/assembler.go" allows it to proceed
<dsmith-work>So should these prebuilt .go files be reproducible?
<dsmith-work>I mean, it should not matter if make dist is run on 32bit or 64bit, the .go files should be the same.
<dsmith-work>(for the 4 cases {big,little}{32,64})
<civodul>yes
<civodul>another interesting issue: in the tarball, there are no symlinks under prebuilt/
<civodul>whereas in the repo there's 32-bit-little-endian -> i686...
<RhodiumToad>gmake[2]: *** [Makefile:2279: language/cps/compile-bytecode.go] Abort trap (core dumped)
<RhodiumToad>#3 0x20145048 in subxi (_jit=0xbfbfd790, r0=6, r1=0, i0=<optimized out>) at ./lightening/lightening/lightening.c:309
<RhodiumToad>that doesn't look right
***rekado_ is now known as rekado
<rekado>there have been lots of changes to src/gmalloc.c, so I’ll just have to rebase the guile-emacs changes to a newer version of Emacs.
<civodul>uh
***terpri__ is now known as terpri
<RhodiumToad>yeesh, so many macros I can't even figure out where this is
<RhodiumToad>the line number is off, somehow
<RhodiumToad>contents of *_jit seem highly dubious
<RhodiumToad>looks like it aborted in get_temp_gpr, with _jit->temp_gpr_saved++ being equal to 200
<RhodiumToad>when it expected 0 or 1
<dsmith-work>RhodiumToad: what arch?
<RhodiumToad>armv7
<RhodiumToad>RPI2b
<RhodiumToad>bleh. that wild pointer for _jit may be a debugger artifact
<RhodiumToad>or it may be a register corruption
<dsmith-work>Ok, I don't understand what's going on there.
<dsmith-work>If that's called more than once it aborts?
<RhodiumToad>the problem is that it's somehow losing track of j->jit, I think