IRC channel logs

2024-06-22.log

<rlb>wingo: did find the recently posted pre-scheme plan (though nothing guile-specific), and while I have no idea if it'd be useful, if it might, I suspect I'd be more than happy to try to help with a utf-8 implementation more like the one we might be contemplating for guile, (i.e. sparsely indexed w/variable stride -- step 5 in the plan...).
<rlb>flatwhatson: oh, I hadn't made the connection yet :)
<flatwhatson>rlb: very interested in string compatibility with guile! my ideas there are only loosely sketched at the moment
<rlb>flatwhatson: ok, well not sure it matters, but I have it working (via C) in the proposed utf-8 branch, though it's really the idea that might or might not be useful. In memory, it's effectively length-prefixed, read-only (the base buffers), and has a trailing "sparse index" that indexes every nth char, where the stride (n) is (compile-time) configurable, and the index offset value type (8-bit, 16-bit, ...) is selected based on the length of the string.
<rlb>fwiw - https://codeberg.org/rlb/guile/src/branch/utf8
<rlb>The index is trailing because (as Russ pointed out), any number of operations don't need it. And it's all inline (length, data, index) for cache friendliness (I hope).
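The layout rlb describes above (length prefix, UTF-8 data, trailing sparse index) can be sketched roughly as follows. This is a Python illustration of the idea only, not the C code from the utf8 branch. The stride of 32, the 32-bit little-endian length prefix, and every function name here are assumptions made for the example.

```python
import struct

STRIDE = 32  # assumed value; the log only says the stride is compile-time configurable


def build_sparse_index(data: bytes, stride: int = STRIDE) -> list[int]:
    """Return the byte offsets of characters stride, 2*stride, ... in UTF-8 data."""
    index = []
    char_pos = 0
    for byte_pos, b in enumerate(data):
        if b & 0xC0 != 0x80:  # not a continuation byte, so a character starts here
            if char_pos and char_pos % stride == 0:
                index.append(byte_pos)
            char_pos += 1
    return index


def offset_format(byte_len: int) -> str:
    """Pick the narrowest unsigned struct code able to hold offsets into the data."""
    if byte_len <= 0xFF:
        return "B"  # 8-bit offsets
    if byte_len <= 0xFFFF:
        return "H"  # 16-bit offsets
    if byte_len <= 0xFFFFFFFF:
        return "I"  # 32-bit offsets
    return "Q"      # 64-bit offsets


def pack_string(text: str, stride: int = STRIDE) -> bytes:
    """Lay out length, data, and index inline, with the index trailing."""
    data = text.encode("utf-8")
    index = build_sparse_index(data, stride)
    fmt = offset_format(len(data))
    header = struct.pack("<I", len(data))  # assumed 32-bit length prefix
    trailer = struct.pack(f"<{len(index)}{fmt}", *index)
    return header + data + trailer


def char_byte_offset(data: bytes, index: list[int], k: int, stride: int = STRIDE) -> int:
    """Byte offset of character k (0-based, k must be less than the character count)."""
    slot = k // stride
    # The single branch: strings shorter than the stride never consult the index.
    byte_pos = index[slot - 1] if slot else 0
    remaining = k - slot * stride
    while remaining:
        byte_pos += 1
        while byte_pos < len(data) and data[byte_pos] & 0xC0 == 0x80:
            byte_pos += 1  # skip continuation bytes of the current character
        remaining -= 1
    return byte_pos
```

A lookup jumps to the nearest indexed character at or before k, then scans forward at most stride - 1 characters, so the stride trades index size against scan length. Operations that only need the bytes (copying, hashing, comparing) can ignore the trailing index entirely.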
<rlb>Might or might not have some new utf-8 specific tests you could borrow too, I guess? Would have to refresh.
<rlb>"git log -p test-suite" on that branch will show some of them -- added some test data that included all the utf-8 sizes, etc., but it's not finished.
<rlb>e.g. https://codeberg.org/rlb/guile/src/branch/utf8/test-suite/test-suite/data.scm#L100-L101
<rlb>(among other bits in that file and elsewhere)
<flatwhatson>thanks, this is all useful! it might be 2 months or so before i'm digging seriously into strings, and i'm not sure yet what to do about indexing.
<flatwhatson>i think for prescheme the indexing needs to be optional (maybe a separate indexed-string type), it seems less relevant for embedded use-cases and i don't want to compromise too much on those in the core language
<rlb>Right, only if it helps, of course, though note that the index can also be omitted until the string hits a certain length by tuning the stride, as long as a single branch is ok, even when there's no index.
<rlb>Also fwiw, the current proposal actually has two flavors of string: ascii, and utf-8 (not entirely ascii). The former, of course, never has/needs an index.
<rlb>(that somewhat mirrors the current guile strings which are latin-1 or utf-32)
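As a rough illustration of the two points above (an ascii flavor that never needs an index, and a utf-8 flavor whose index stays empty until the string outgrows the stride), here is a small Python sketch. The function name, the returned dictionary, and the convention that only characters stride, 2*stride, ... get an entry are assumptions for the example, not details taken from the branch.

```python
def describe_layout(text: str, stride: int = 32) -> dict:
    """Report which string flavor applies and how many index entries it would carry."""
    data = text.encode("utf-8")
    n_chars = len(text)
    if len(data) == n_chars:
        # Every character is one byte, so character k lives at byte k: no index needed.
        return {"flavor": "ascii", "index_entries": 0}
    # Entries exist only for characters stride, 2*stride, ... below n_chars,
    # so any string of at most `stride` characters has an empty index.
    return {"flavor": "utf-8", "index_entries": (n_chars - 1) // stride}


short = describe_layout("héllo")            # utf-8 flavor, shorter than the stride
longer = describe_layout("é" * 100)         # utf-8 flavor, long enough to be indexed
plain = describe_layout("hello, world")     # ascii flavor
```

Raising the stride pushes the point at which an index first appears further out, which is one way an embedded-oriented build could avoid the index for most strings.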
<rlb>Anyway, feel free to ping me, if you think I might be able to help somehow.
<flatwhatson>thanks rlb, will do!
<cow_2001>if you add black background to the pictures it looks a wee bit spooky and grotesque https://kaka.farm/pub/guile-mascot/
<cow_2001>yeah, i just added a black background rectangle under the image in inkscape :D
<mwette>cow_2001: I like
<cow_2001>not my observation. it was lain of pleroma that first did it