IRC channel logs
2024-06-22.log
<rlb> wingo: did find the recently posted pre-scheme plan (though nothing guile-specific), and while I have no idea if it'd be useful, if it might, I suspect I'd be more than happy to try to help with a utf-8 implementation more like the one we might be contemplating for guile, (i.e. sparsely indexed w/variable stride -- step 5 in the plan...).
<rlb> flatwhatson: oh, I hadn't made the connection yet :)
<flatwhatson> rlb: very interested in string compatibility with guile! my ideas there are only loosely sketched at the moment
<rlb> flatwhatson: ok, well not sure it matters, but have it working (via C) in the proposed utf-8 branch, but it's really the idea that might or might not be useful. In memory, it's effectively length-prefixed, read-only (the base buffers) and has a trailing "sparse index" that indexes every nth char, where the stride (nth) is (compile-time) configurable, and the index offset value type (8-bit, 16-bit, ...) is selected based on the length
<rlb> The index is trailing because (as Russ pointed out), any number of operations don't need it. And it's all inline (length, data, index) for cache friendliness (I hope).
<rlb> Might or might not have some new utf-8 specific tests you could borrow too, I guess? Would have to refresh.
<rlb> "git log -p test-suite" on that branch will show some of them -- added some test data that included all the utf-8 sizes, etc., but it's not finished.
<rlb> (among other bits in that file and elsewhere)
<flatwhatson> thanks, this is all useful! it might be 2 months or so before i'm digging seriously into strings, and i'm not sure yet what to do about indexing.
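[editor's note] The sparse-index idea rlb describes above (record the byte offset of every nth character, with an offset width chosen from the buffer length) can be sketched roughly as follows. This is only an illustration in Python, not the code from the proposed Guile utf-8 branch, which is written in C. The names STRIDE, offset_typecode, build_index and char_at are hypothetical, and the index is kept as a separate array here rather than stored inline after the data.

```python
from array import array

STRIDE = 4  # hypothetical value; the log only says the stride is configurable


def offset_typecode(byte_len):
    """Pick the narrowest unsigned array type that can hold any byte offset."""
    for code in "BHIQ":
        if byte_len < 1 << (8 * array(code).itemsize):
            return code
    raise ValueError("buffer too large")


def build_index(data, stride=STRIDE):
    """Return (index, char_count) for a UTF-8 byte string.

    index[k] is the byte offset of character (k + 1) * stride.
    """
    index = array(offset_typecode(len(data)))
    chars = 0
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a character starts here
            if chars and chars % stride == 0:
                index.append(pos)
            chars += 1
    return index, chars


def char_at(data, index, i, stride=STRIDE):
    """Return character i (assumed in range): jump via the index, then scan."""
    slot = i // stride
    pos = index[slot - 1] if slot else 0  # the first `stride` chars need no index
    for _ in range(i - slot * stride):    # skip at most stride - 1 characters
        pos += 1
        while data[pos] & 0xC0 == 0x80:
            pos += 1
    end = pos + 1
    while end < len(data) and data[end] & 0xC0 == 0x80:
        end += 1
    return data[pos:end].decode("utf-8")
```

With this layout a character lookup touches one index entry and then scans fewer than `stride` characters, so a larger stride trades a smaller index for a longer scan. A pure-ASCII string would never need the index at all, since character i is simply byte i.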
<flatwhatson> i think for prescheme the indexing needs to be optional (maybe a separate indexed-string type), it seems less relevant for embedded use-cases and i don't want to compromise too much on those in the core language
<rlb> Right, only if it helps, of course, though note that the index can also be omitted until the string hits a certain length by tuning the stride, as long as a single branch is ok, even when there's no index.
<rlb> Also fwiw, the current proposal actually has two flavors of string ascii, and utf-8 (not entirely ascii). The former, of course, never has/needs an index.
<rlb> (that somewhat mirrors the current guile strings which are latin-1 or utf-32)
<rlb> Anyway, feel free to ping me, if you think I might be able to help somehow.
<cow_2001> yeah, i just added a black background rectangle under the image in inkscape :D
<cow_2001> not my observation. it was lain of pleroma that first did it