IRC channel logs

<sarna>hey! I'm trying to install guile-hall on a mac and I'm a bit stuck. it seems like guile-config somehow generated a wrong path to pkg-config - all invocations fail with `error: ("/opt/homebrew/opt/pkg-config/bin/pkg-config" "--modversion" "guile-3.0")`. the real path to pkg-config is `/opt/homebrew/bin/pkg-config`

<sarna>is this a known problem or should I open an issue on the guile-config repo?

<spk121>hi! If there are any French around... should this be an acceptable result for this test?

<spk121>FAIL: i18n.test: monetary-amount->locale-string: French 8-bit: positive inexact zero - arguments: (expected-value "0,00 +EUR" actual-value "0,00 Ç")

<wingo>ACTION not french yet ;)

<wingo>looks weird to me tho

<spk121>wingo: yeah. I'll keep it as FAIL.

<wingo>dumb question -- should we be continuing to use libunistring? it is not much maintained, right?

<spk121>wingo: as I recall, back in the day, the discussion was libunistring vs ICU or GLib. libunistring won for being GNU-friendly and for not being GLib. I was pushing for ICU because I'd used it before

<spk121>but to do it again now, I'd seriously consider cuneicode https://thephd.dev/cuneicode-and-the-future-of-text-in-c

<spk121>or rewrite it all in scheme ;-)

<wingo>both of those possibilities sound v interesting :)

<shawnw>That reminds me; I need to come up with a space-efficient way to store properties from the UCD in scheme.

<wingo>there may be some prior art in chez scheme or so

<shawnw>I have a perl script that populates a sqlite database with *everything*, but it ends up being like 70mb. And scheme (well, Racket) code for storing specific properties in vectors of character ranges, which isn't too bad, but I want something more general and preferably with no external dependencies (like sqlite)

<rekado>with the streaming web server I still get the rare “fport_write: Broken pipe” error when writing to the HTTP response port

<rekado>the web server is here: https://elephly.net/paste/1686919402.scm.html

<rekado>the body procedure only uses put-bytevector and force-output on the port

<cbaines>rekado, that can happen if the client hangs up before the response is sent, which includes reverse proxies like nginx if they deliver a gateway timeout (504) to the client

<rekado>cbaines: if it’s really just the client that’s okay. I was a little worried by the error message, thinking that maybe the port disappears even when the connection is not terminated by the client.
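
A minimal sketch of the situation cbaines describes, in Python rather than Guile, assuming a POSIX system: once the reader has gone away, a write fails with EPIPE ("Broken pipe"), and a streaming writer can treat that as a normal end of the response. The helper name `stream_body` is hypothetical, and a pipe stands in for the client connection.

```python
import os

def stream_body(fd, chunks):
    """Write chunks to fd; return how many were written before the reader went away."""
    written = 0
    for chunk in chunks:
        try:
            # Small chunks only: this sketch does not handle partial writes.
            os.write(fd, chunk)
        except BrokenPipeError:
            # EPIPE: the other end is closed, so stop streaming quietly.
            break
        written += 1
    return written

# Simulate a client that hangs up before anything is sent.
read_fd, write_fd = os.pipe()
os.close(read_fd)
sent = stream_body(write_fd, [b"first", b"second"])
os.close(write_fd)
print("chunks sent before hang-up:", sent)
```

CPython ignores SIGPIPE at startup, which is why the failed write shows up as an exception here instead of terminating the process.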

<rlb>wingo, spk121: hmm, I'd wondered about cuneicode when I saw the article the other day, and more broadly speaking, at least on my end, now might be a good time if we thought we might want to switch. i.e. I'm having to write a bunch of code that has to use libunistring carefully, and could consider doing that with "something else" instead, so we don't have to be careful twice...

<rlb>(I've also used icu...)

<rlb>Though I suspect we'll likely just stick with libunicode for now.

<rlb>s/unicode/unistring/

<rlb>...iirc, scdoc just wrote its own (for utf-8 basics, to avoid any deps, but I believe we need more than the basics, including conversions to other charsets).

<rlb>wingo: I'd also wondered about (eventually) wanting some of the processing in scheme code, assuming we intended to move more and more stuff there over time.

<rlb>ACTION notices that cuneicode isn't in debian atm.

<civodul>ACTION feels like a newbie when using the GitLab web interface for code review

<civodul>dthompson: hopefully i managed to do it right :-)

<rlb>:)

<dthompson>civodul: I see everything. thank you!

<civodul>yay!

<dthompson>I'm glad things look mostly okay. my C is rusty

<spk121>rlb: FYI, conversions that require persistent state over multiple string conversions (BIG5-HK, ISO-2022-JP, teletext) are already broken in Guile. A solution that did 8-bit codepages and various UTFs and UCSs would suffice.

<spk121>Although, maybe improving things for legacy asian encodings should be a goal? Honestly don't know much about Unicode acceptance in China and Japan.
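
To illustrate why ISO-2022-JP needs state that persists across conversions, here is a small Python sketch using the standard `codecs` module. It shows the general problem only and says nothing about where Guile loses the state. Converting each piece independently emits a fresh pair of escape sequences per piece, while an incremental encoder carries the shift state from one piece to the next.

```python
import codecs

text = "\u3042\u3044"  # two hiragana characters

# Whole string converted in one call.
one_shot = text.encode("iso2022_jp")

# Stateless: every piece is converted on its own, so each one gets its
# own shift-in and shift-out escape sequences.
stateless = b"".join(ch.encode("iso2022_jp") for ch in text)

# Stateful: the incremental encoder remembers the current character set
# between calls and only resets it when told the input is final.
encoder = codecs.getincrementalencoder("iso2022_jp")()
stateful = encoder.encode(text[0]) + encoder.encode(text[1], final=True)

print(len(one_shot), len(stateless), len(stateful))
```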

<rlb>ACTION neither

<rlb>Unless we think the compilers are going to be "sufficiently smart", I've also wondered if we might want vector-optimized versions of some of the relevant utf-8 operations on at least the most common platforms...

<rlb>(whether via a library or ourselves...)

<rlb>(...and of course preferably via a library, even if we help.)

<RhodiumToad>not sure how vectorizable many utf-8 operations are

<rlb>iirc, you can do some clever things, but would have to refresh.

<rlb>i.e. I know I've seen discussion of it before, but haven't delved -- I see this example discusses mostly validation and transcoding (https://github.com/simdutf/simdutf). In any case, just wondered.

<RhodiumToad>ACTION not really a simd expert, but has done a few things

<RhodiumToad>I did a uint64 to decimal string output function that was faster than the one in the intel optimization manual, but still not enough of an improvement over a really good non-vectorized version to be worth it

<rlb>Does seem like the considerations can be complex -- even heard that on some intel cpus, using avx 512 "too much" might actually slow things down (because thermals). In any case, I'm no expert either.

<RhodiumToad>avx2 is the most I've played with

<RhodiumToad>did a numeric IPv4 address to 32-bit binary conversion in 65 clocks, with full validation

<RhodiumToad>(and no unpredictable branches)

<RhodiumToad>(that is, the only branches were forward branches taken only on validation error)

<rlb>interesting

<RhodiumToad>the hard part of course was coping with the variable lengths; but that could be reduced to two table lookups in a very small table

<sarna>hey guys please be honest, how feasible is using guile on macos? I can't install guix, and I failed to install hall (for now at least) :( are there more parts expected to break?

<rlb>ACTION reworks list->string to deal with the fact that with utf-8, concurrent list mutations can change the final stringbuf size (i.e. can't naively scan up front anymore).

<rlb>(well, you can keep that approach, just have to make sure to bail out if the *byte*-length changes in the write pass)
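
The "scan up front, then bail out if the byte length changes" approach could look roughly like the following Python sketch. It is not Guile's list->string; `chars_to_utf8` and the `between_passes` hook (a stand-in for a concurrent mutation) are hypothetical names used only for illustration.

```python
def chars_to_utf8(chars, between_passes=None, max_retries=8):
    """Encode a mutable list of one-character strings to UTF-8 in two passes."""
    for _ in range(max_retries):
        # Pass 1: measure the byte length and allocate the buffer.
        expected = sum(len(c.encode("utf-8")) for c in chars)
        buf = bytearray(expected)

        if between_passes is not None:
            between_passes(chars)  # simulated concurrent mutation

        # Pass 2: write, bailing out as soon as the bytes no longer fit.
        pos = 0
        fits = True
        for c in chars:
            encoded = c.encode("utf-8")
            if pos + len(encoded) > expected:
                fits = False
                break
            buf[pos:pos + len(encoded)] = encoded
            pos += len(encoded)

        # A shorter result than measured also counts as a change.
        if fits and pos == expected:
            return bytes(buf)
    raise RuntimeError("list kept changing between passes")

# A mutation that swaps an ASCII char for a two-byte char after the scan.
state = {"done": False}

def mutate_once(lst):
    if not state["done"]:
        state["done"] = True
        lst[1] = "\u00e9"

result = chars_to_utf8(list("abc"), mutate_once)
print(result)
```

The character count stays at three throughout, which is why checking the byte length rather than the list length is what catches the change.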

<dalepsmith>sarna: Most devs are on some Linux. I think things like windows and macos don't get much love.

<spk121>sarna: on macos, I think your main option is the packages on pkgsrc. The basic guile libraries are there

<spk121>(I'm not a mac user, tho)

<rlb>Looks like it's also in homebrew: https://formulae.brew.sh/formula/guile

<rlb>ACTION has used homebrew indirectly in CI (for bup, lokke, etc.), but is no expert, and remote log-based debugging "isn't ideal"...

<sarna>spk121: you mean https://www.pkgsrc.org/ ? first time I hear about it

<sarna>rlb: yeah I installed guile via homebrew after I failed to install it from source :) it works fine, but there's no hall formula (or whatever they call it)

<sarna>and hall fails to install due to the issue I mentioned earlier

<sarna>if it's just that I'll have to install libraries in a bit more manual way - I'm totally fine with it. but I wouldn't be fine with something like ocaml on windows, where you need a cursed setup with mingw and whatnot

<sarna>actually I broke my custom emacs install in the process, hope I can get it back.. '^ ^

<dalepsmith>dadinn: If you are coming from matrix, you need to negotiate with the nickserv

<dalepsmith>!uptime

<dsmith>sneek, botsnack

<sneek>:)

<dsmith>!uptime

<sneek>I've been serving for one month and 26 days

<sneek>This system has been up 10 weeks, 6 days, 2 hours, 19 minutes

<rlb>wingo: currently I have ascii (potentially) mutable stringbufs and utf-8 immutable stringbufs. Strings just have a char count and pointer to the stringbuf (or string for shared). That allows mutations to be atomic via single change of the stringbuf pointer, but it also means that a substring that refers to a range of ascii-only chars from a utf-8 string(buf) won't (trivially) "know" that it's ascii-only, and so won't automatically take the optimized paths, and won't allow in-place mutation. Is that a meaningful problem?

<rlb>(Dropping in-place mutability for ascii strings would eliminate one of those irregularities.)

<rlb>I'm also assuming that we probably want to preserve the "atomic via just a single pointer swap" mutations, which rules out some ways ascii strings could just know they're ascii even if the underlying stringbuf isn't entirely.

<rlb>At the moment, I'm just proceeding under the assumption that scm_i_is_ascii_string() may return false negatives, and if the optimization is important enough, then the code needs to find the string's stringbuf start and end byte and compare the difference to the string char count instead, i.e. don't rely on the scm_i_is_ascii_string() check.
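
The fallback check described above (compare the substring's byte span to its char count) can be sketched in Python as follows. The helpers `byte_offset` and `is_ascii_range` are hypothetical and are not part of Guile; the sketch assumes the buffer holds valid UTF-8.

```python
def byte_offset(buf, char_index):
    """Return the byte offset of the char_index-th character in valid UTF-8."""
    seen = 0
    for i, b in enumerate(buf):
        if (b & 0xC0) != 0x80:  # not a continuation byte: a character starts here
            if seen == char_index:
                return i
            seen += 1
    return len(buf)  # char_index is one past the last character

def is_ascii_range(start, end, char_count):
    """A UTF-8 range is ASCII-only exactly when it uses one byte per character."""
    return end - start == char_count

buf = "h\u00e9llo".encode("utf-8")  # 5 characters, 6 bytes

# Substring covering characters 2..5 ("llo"): ASCII-only inside a non-ASCII buffer.
tail = is_ascii_range(byte_offset(buf, 2), byte_offset(buf, 5), 3)

# Substring covering characters 0..2, which includes the two-byte character.
head = is_ascii_range(byte_offset(buf, 0), byte_offset(buf, 2), 2)

print(tail, head)
```

Finding the byte offsets is a linear scan here, which is the cost rlb alludes to: the check is only worth doing when the optimized path matters enough.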

<sneek>Welcome back dsmith!!

<dalepsmith>dadinn: Yeah, let me find some links... https://libera.chat/guides/faq#can-i-connect-with-matrix https://kparal.wordpress.com/2021/06/01/connecting-to-libera-chat-through-matrix/

2023-06-16.log