IRC channel logs

<sneek>Welcome back chrislck :)

<stis>Been spending time on improving python-on-guile's speed. really nice

<stis>But i think that we could improve guile's hash for integers, there is some experience that indicates that integers should be translated as is, I think python does it

<stis>Now python-on-guile's list comprehensions are 10x faster than cpython

***sneek_ is now known as sneek

<pinoaffe>stis: wow, that's cool!

<wingo>civodul: wrt wip-tree-il-sourcev i think it is a great change but i wonder about having it in 3.0 -- i.e. there are other tree-il producers (fewer consumers tho)

<wingo>string->symbol improvement is great tho! you will need to check though not only for start == 0 but also that the stringbuf length is the string length

<civodul>oh right
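
A minimal sketch of the check wingo describes, in C. The struct layouts and the name `string_covers_whole_buffer` are hypothetical and are not Guile's actual internals; the sketch only assumes that a string is a view (start offset plus length) onto a possibly shared buffer. It shows why testing `start == 0` alone is not enough: a prefix substring also starts at offset 0.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified layout: NOT Guile's real stringbuf/string. */
struct stringbuf
{
  size_t length;        /* number of characters stored in the buffer */
  const char *chars;
};

struct string
{
  struct stringbuf *buf;  /* backing buffer, possibly shared */
  size_t start;           /* offset of this string within buf */
  size_t length;          /* number of characters in this string */
};

/* True only when the string spans the entire buffer: it must start at
   offset 0 AND be exactly as long as the buffer.  */
static bool
string_covers_whole_buffer (const struct string *s)
{
  return s->start == 0 && s->length == s->buf->length;
}
```

With a 5-character buffer, a string with start 0 and length 3 passes the `start == 0` test but fails the length test, so it is correctly rejected.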

<civodul>wingo: i'd expect producers to be unaffected: they can still provide alists, and that should be fine

<civodul>the code is prepared to deal with both (which is not ideal, but keeps compatibility)

<civodul>WDYT?

<wingo>civodul: ah yes i see! very clever

<wingo>how does that work for the compilers?

<wingo>they never inspect the src ?

<civodul>you mean for the layers above treeil?

<wingo>below

<civodul>ah yes, they do

<civodul>some access the 'src' field directly

<wingo>i mean the tree-il->cps and tree-il->bytecode compilers

<civodul>that's why it goes down the layers as sourcev

<wingo>i saw you patched up some analysis/error passes in tree-il

<civodul>yes

<wingo>but the cps compiler never inspects the src?

<wingo>interesting

<civodul>because these lower layers don't call tree-il-src, they can receive either a sourcev or an alist

<wingo>neat

<civodul>the CPS compiler just passes src around without inspecting it AFAICS

<wingo>lgtm then!

<civodul>alright, i'll push shortly

<civodul>thanks for taking a look!

<wingo>np, nice hack!

<sneek>dsmith: Greetings!

<dsmith>wingo: G'day

<wingo>greets :)

<dsmith>wingo: In srfi-14.c, SCM_CHARSET_DATA is now #define'ed twice.

<dsmith>;^}

<wingo>civodul: wrt not respecting 'positions read option, probably we should change that, possibly via adding a kwarg to read-syntax...

<wingo>read options are a terrible interface tho ;)

<civodul>wingo: it is!

<civodul>a keyword argument sounds good

<wingo>i think you would want to thread it through to a -g0 command-line argument

<civodul>yes

<civodul>though it's a bit different from gcc's -g0 because you can have macros that fiddle with source location

<wingo>gcc's -g0 is the same as omitting -g, right ?

<wingo>-g0 should also prevent most .debug_info emission

<civodul>yes, so it goes beyond just disabling source location info in the reader

<civodul>but yes, that'd be useful

<civodul>wingo: pushed the two commits with the change you proposed

<civodul>for good measure i rebuilt lokke against a fresh Guile build from Git, and everything goes well

<civodul>(this gives a good example of an external project that uses the tree-il interface)

<civodul>guix build guile-next --with-git-url=guile-next=/data/src/guile-3.0 lokke --with-input=guile@3.0.7.1=guile-next

<mwette>wingo, civodul: Can you elaborate on the change to source property handling? nyacc's parser passes these through: provided by lexer, and eventually handed off to tree-il.

<mwette>Some peeps are using this, I believe.

<wingo>mwette: i think the change should continue to allow alists as source properties

<mwette>wingo: KK, thanks

<wingo>it simply additionally allows a 3-element vector #(filename line column)

<wingo>psyntax now passes the vector format to tree-il instead of the alist format

<wingo>syntax objects use the 3-element vector internally, so that saves on translation costs, and also saves on memory

<mwette>Great. And the api will be exposed to users?

<mwette>Not sure I'd be using it, but it sounds like it might be faster.

<mwette>... in the lexer->parser->analysis stages.

<wingo>it's exposed to users in two ways. one, in that the domain of the make-lexical-ref etc procedures is expanded to also allow 3-vectors as source

<wingo>two, in that *users* of tree-il might see 3-vectors instead of alists -- but, if they use the tree-il-src accessor they still get the alists

<wingo>you would only get the 3-vector if you are using a raw `match` on the record

<mwette>Got it. Thanks.

<wingo>so, hopefully not a very visible change, for users

<ArneBab>manumanumanu: nice!

<jpoiret>i'm coming back to https://debbugs.gnu.org/cgi/bugreport.cgi?bug=52835 to finally have a proper fix that does proper error handling for the dup/dup2s

<jpoiret>on Linux, dup2 can error with EBUSY, but it's not in POSIX, how should we handle this? is there any preprocessor flag i can use for #ifdef?

<jpoiret>i guess the generic __linux__ should be okay :)

<civodul>jpoiret: i suppose it wouldn't hurt to handle EBUSY unconditionally, no?

<jpoiret>ah, you're right, i was overcomplicating things
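
A minimal sketch of handling EBUSY unconditionally, as suggested above. The helper name `dup2_retrying` is hypothetical and this is not the patch for bug 52835. It assumes that retrying is an acceptable response both to EBUSY (which the Linux man page describes as a race with open and dup) and to EINTR. No `#ifdef __linux__` is needed because `EBUSY` is defined by `<errno.h>` on POSIX systems even though POSIX does not list it for dup2.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call dup2, retrying while it fails with a
   transient error.  EBUSY is checked on every platform; where dup2
   never returns it, the extra comparison is simply never true.  */
static int
dup2_retrying (int oldfd, int newfd)
{
  int ret;

  do
    ret = dup2 (oldfd, newfd);
  while (ret == -1 && (errno == EINTR || errno == EBUSY));

  /* On failure, errno still holds the non-transient error code.  */
  return ret;
}
```

A real fix might prefer to bound the number of retries or report EBUSY to the caller instead; the loop above is only one possible policy.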

<jpoiret>when i do `./configure --enable-mini-gmp`, numbers.h still tries to import gmp.h

<jpoiret>include*

<jpoiret>weird, a `make clean` seems to have done it

<rlb>civodul: hah, well, not sure about *good* example, but it's an example, and thanks for testing it. (The conversation had reminded me I should try against a newer main.)

<rlb>...and stis' mention of python comprehensions made me vaguely wonder about the possibility of a shared "lazy" infrastructure, i.e. python has generators/comprehensions, clj has the sequence interface, etc. Though I'd guess that it might well be that perf and/or semantic considerations might still lead you to independent implementations.

<civodul>rlb: srfi-41 is quite good i think

<sneek>dsmith-work: Greetings :D

<dsmith-work>UGT Greetings, Guilers

<dsmith-work>sneek: botsnack

<sneek>:)

<dsmith-work>!uptime

<sneek>I've been running for one month and 6 days

<sneek>This system has been up 28 weeks, 5 days, 3 hours, 8 minutes

<wingo>o/

<civodul>\o

<wingo>civodul: hey! just a reflection, i think we should use SCM_UNLIKELY less

<wingo>like, often we don't actually know what's likely or not and often the perf difference of taking the branch or not is close to 0

<wingo>anyway, just a thought re: future c code

<wingo>civodul: how are we doing on 3.0.8 blockers? are there any?

<civodul>yes, you're probably right on SCM_UNLIKELY, it's maybe the new "inline", which was the new "register"
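
For context, a sketch of what a branch hint of this kind usually looks like in C. `MY_UNLIKELY` and `sum_nonnegative` are hypothetical names; the macro is assumed to be built on the GCC/Clang `__builtin_expect` builtin, which is the common way such hints are implemented. The hint never changes the result, only how the compiler lays out the branch, which is why a wrong guess about likelihood buys nothing.

```c
#include <stddef.h>

/* Hypothetical branch-hint macro, falling back to a plain expression
   on compilers without __builtin_expect.  */
#if defined (__GNUC__)
# define MY_UNLIKELY(e) __builtin_expect (!!(e), 0)
#else
# define MY_UNLIKELY(e) (e)
#endif

/* Sum N values, returning -1 as soon as a negative one is seen.
   The hint claims negatives are rare; the function behaves the same
   whether or not that claim is true.  */
static long
sum_nonnegative (const long *v, size_t n)
{
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (MY_UNLIKELY (v[i] < 0))
        return -1;
      sum += v[i];
    }

  return sum;
}
```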

<civodul>re blockers, i think we're pretty good

<civodul>i'm still doing a bit of performance work though

<civodul>in part because of https://issues.guix.gnu.org/53506

<civodul>i think it's important, but it's very incremental and slow progress, so it shouldn't block everything either

<wingo>civodul: when is that bug from?

<wingo>i.e. it happens with 3.0.7 ?

<wingo>not downplaying it, just trying to understand

<civodul>yes, that's with 3.0.7

<civodul>but the memory consumption issue has been here "forever"

<wingo>are you quite sure that it's bignums?

<civodul>not 100% sure

<wingo>i mean the tree-il -> bytecode compiler doesn't go through slot-allocation, right?

<civodul>right, but the assembler uses integers too

<wingo>for what

<wingo>like, ephemeral things, right?

<wingo>max 64 bits

<wingo>that shouldn't lead to huge heap sizes

<civodul>hmm not sure actually, you know better than me :-)

<civodul>but look, it takes 25s and ~500MiB to compile gnu/packages/crates-io.scm (which is 3MiB of text) at -O1, on x86_64

<wingo>haha i dunno, i know some things but i don't see 3 GB heaps ever

<civodul>i can imagine this could get worse on i686 if there are more integers overflowing to bignums

<wingo>very respectfully i doubt that bignums are the issue here

<civodul>but even without it, it's resource-intensive

<civodul>yeah, maybe you're right

<wingo>i could be wrong!

<civodul>so far i've been profiling on x86_64 as a first approach

<wingo>so wrt text size vs max heap size. of course there is a high macro factor, right?

<wingo>like each source token may be multiplied by many via the package macro, right? or is that a rather thin thing

<wingo>(also, weird that in that bug report you were getting errors towards the end of the compilation process; is there some aspect of compilation that's "sticky"?)

<wingo>like does compiling module A lead to allocations staying around permanently?

<wingo>sorry for the ignorant questions, i am just trying to understand :P

<civodul>these are good questions and i don't always have the answers

<civodul>so yes, macro expansion multiplies the number of tokens, etc.

<civodul>still, we have a 100x factor and i wonder how we could do better

<civodul>source location info plays an important role; this can be seen by replacing 'annotate' with 'identity' in read-syntax

<civodul>as for allocations staying around: the heap never shrinks (libgc is built with --disable-munmap)

<civodul>woo

<wingo>disable-munmap is fine, just weird that in that bug the error occurred at like 95%

<wingo>how likely is it that the "hardest" file to compile would happen at the end? (does that compilation process use threads?)

<wingo>with source location info being in syntax objects and not in the weak table we eliminate some weirdness in the garbage collector. less need for the weak side-table, so things are more predictable

<wingo>how are they when compared to that bug report? can you send an update?

<wingo>with guile from main

<wingo>i ask not to annoy you but because i am interested in the problem :)

<wingo>civodul: i guess you have tried but -O0 isn't any better ?

<wingo>probably not because the terms are larger but who knows

<civodul>heh np! :-)

<civodul>so source location info in syntax objects led to reduced heap usage compared to source props

<civodul>and i agree that it eliminates the gc weirdness, so way to go!

<civodul>i'm not sure i kept the figures though

<civodul>wingo: just sent an update to this issue above: "main" reduces heap and run time by 20%, compared to 3.0.7, when compiling that large file with -O1

<civodul>which is good news

<mwette>So not sure relevant, but a while ago (maybe 3.0 timeframe) I was always running into time/memory issues with large sxml-match forms. My guess was

<mwette>3.0 release timeframe

<civodul>mwette: ah no, i'm looking at resource usage of the compiler

<mwette>It was the compiler. It would grind away for 30+ minutes and then crash.

<civodul>fun fact: with -O0, the compiler uses twice as much memory as at -O1

<mwette>My guess was that sxml-match was asking for very deeply nested let's

<mwette>I first solved by breaking up into multiple sxml-match forms, then by writing my own version.

<civodul>oh interesting

<civodul>it could be similar here in that there's a macro that expands to big letrecs

<dsmith-work>"bigrec" ?

<dsmith-work>An analog to bignum

<daviid>wingo, civodul: fwiw, just pulled main, compiled ... make check fails - FAIL: test-out-of-memory [although i have 8GB of mem, 5GB free mem at the time the test was run] - https://paste.gnome.org/poejqqdw9

2022-02-07.log