IRC channel logs

<rlb>I have been surprised in a number of cases where we didn't have any tests...

<rlb>Less surprised when we didn't have non-ascii and/or crazier unicode tests...

<rlb>(Have started introducing more of those via https://codeberg.org/rlb/guile/src/branch/utf8/test-suite/test-suite/data.scm )

<mwette>Apparently, pcre is recommended.

<rlb>where did you mean?

<mwette>it was an ai response to web search for "does linux regex library work on utf-8 strings"

<rlb>Ahh, right --- pcre *could* work. (For now I've used that for clj's regex support in lokke.)

<rlb>ACTION looks to remember what the current state is in utf8...

<rlb>mwette: oh, right, I think that it's probably just "whatever regex(3) does" as usual -- i.e. I think we just convert our string (used to be latin1/utf-32, now always utf-8) to a "locale string" and then hand that to regexec(3), etc.
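
A minimal C sketch of the "hand a locale string to regex(3)" path described above. The helper name `locale_regex_matches` is hypothetical and this is not Guile's actual code; it only shows the POSIX `regcomp`/`regexec`/`regfree` sequence. How multibyte UTF-8 text is treated (for example, whether `.` consumes a whole character) depends on the active locale and the libc, so a caller would presumably run `setlocale(LC_ALL, "")` first.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of Guile.
   Returns 1 if `pattern` (POSIX extended syntax) matches anywhere in
   `text`, 0 if it does not, and -1 if the pattern fails to compile.
   `text` is assumed to already be encoded for the current locale. */
int locale_regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* REG_NOSUB: only a yes/no answer is wanted, so no match array. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

Plain ASCII patterns behave the same in any locale; the locale only starts to matter once the pattern or the text contains multibyte characters.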

<mwette>OK. So not a big issue.

<cow_2001>why would you check if something's "pair?" and then that it's not "null?"? if it's a "pair?" it cannot be "null?", right?

<cow_2001> https://codeberg.org/guile/guile/pulls/117/files#diff-9310b651841e9094160358f22da8ed6d4faaf31c

<rlb>sounds right to me offhand, i.e. I'd imagine that not null is no longer needed.

<rlb>Without knowing the context there, I'm guessing it's dropping list? for performance.

<mwette>a string is neither

<rlb>Looks like ice-9 match doesn't support multiple values right now?

<cow_2001>turns out that if you have whitespace+ in non-terminals, they add up with the whitespace+ in the terminals ~;~

<cow_2001>i sprinkled whitespace* all over the place and changed some to whitespace+ and stuff stopped working

<cow_2001>it is at times like these i wish i had a proper desk i could smash my face into

<ArneBab>old: now there

<ArneBab>old: I don’t know explicit flonum benchmarks. Maybe you could adapt the nbody benchmark from the benchmarksgame: https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/nbody-racket-1.html https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/nbody-racket-2.html

<old>merged the bump of gnulib version

<old>one will need to call autogen

<mwette>ty old

<rlb>...open-input-string doesn't need to copy the contents if the source string is read-only.

<JohnCowan>Float benchmarks would tend to measure only hardware performance and type inference

<old>I'm mostly wondering if adding flonum? as a primcall in tree-il would help these benchmarks

<old>Anyway. It seems to help in some cases. Now flonum? and fixnum? are both primitives understood by tree-il

<old>However, I keep wondering if this is the correct type of optimization we want.

<old>Instead of marking these rnrs functions as primitives, it would be interesting if the compiler at the CPS level could infer the type and predicate check automatically

<old>For example, flonum? now gets compiled to heap-object? + flonum?

<old>However, if one writes flonum? by hand: (define (my-flonum? x) (and (real? x) (inexact? x)))

<old>then no primcall is emitted and CPS fails to see that the check is equivalent to flonum?

<old>instead the compiler emits: fixnum? + heap-object? + heap-number? + compnum? + flonum?

<old>Not sure what kind of optimization pass this would be called, I'm no compiler expert yet

<rlb>Is there something like (char->number c) matching (string->number c)? Say you have a char, and you want to convert it to a number (as string->number would) without having to allocate a string.

<rlb>ACTION suspects not

<old>rlb: you mean a digit to number ?

<old>Or the ordinal value of a character

<rlb>right #\4 -> 4

<old>like 0 -> 48 in ASCII

<rlb>as (string->number "4") would

<old>ah okay

<old>hm

<rlb>Though I just realized that in this case it's probably not important.

<old>well in C we have the c - '0' hack :-)

<rlb>i.e. the case in question only cares about ascii

<old>but in Scheme Idk

<rlb>string->number iirc may be locale-aware

<rlb>(which is why I was wondering)

<rlb>(Arabic digits, etc.)

<old>I'm wondering if the c - '0' hack works in all encodings

<old>in that, is the distance between the 0 character and some other digit always that digit ?

<old>for all encodings

<old>hm

<old>i suspect not
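
A small C sketch of the digit-subtraction hack discussed above, written as `c - '0'`. The C standard requires the ten decimal digits to have consecutive values in the execution character set, so the subtraction holds for '0'..'9' in any conforming C implementation. It does not carry across digit scripts: other Unicode decimal digits, such as the Arabic-Indic ones at U+0660..U+0669, sit in their own runs with their own zero. Both function names are hypothetical, and the second one deliberately covers only two digit blocks as an illustration.

```c
/* Value of a basic digit character, or -1 if c is not '0'..'9'. */
int ascii_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}

/* Illustrative only: handles just two of Unicode's decimal digit blocks.
   Each block is contiguous, but each has a different zero code point. */
int codepoint_digit_value(unsigned long cp)
{
    if (cp >= 0x0030UL && cp <= 0x0039UL)   /* DIGIT ZERO..NINE */
        return (int)(cp - 0x0030UL);
    if (cp >= 0x0660UL && cp <= 0x0669UL)   /* ARABIC-INDIC DIGIT ZERO..NINE */
        return (int)(cp - 0x0660UL);
    return -1;
}
```

For the ASCII-only case mentioned earlier, the first function is all that is needed and involves no string allocation.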

<rlb>I *think* srfi-207 means just 0-9a-fA-F, even though its reference implementation uses string->integer in at least one place.
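
If only 0-9a-fA-F are meant, a per-character conversion might look like the sketch below. It is an assumption-laden illustration, not code from SRFI 207 or its reference implementation, and `hex_digit_value` is a hypothetical name. It uses a lookup string rather than range checks because C only guarantees contiguity for the decimal digits, not for the letters.

```c
#include <string.h>

/* Returns 0..15 for 0-9, a-f, A-F, and -1 for anything else. */
int hex_digit_value(char c)
{
    static const char digits[] = "0123456789abcdefABCDEF";
    const char *p;
    int i;

    if (c == '\0')              /* strchr would match the terminator */
        return -1;

    p = strchr(digits, c);
    if (p == NULL)
        return -1;

    i = (int)(p - digits);
    return i < 16 ? i : i - 6;  /* 'A'..'F' sit at indexes 16..21 */
}
```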

<rlb>ACTION was just attempting to remove some more repeated string-refs.

2026-05-09.log