IRC channel logs

<spk121>does anyone know a way to test if the JIT is working correctly? I'm trying to figure out if my MinGW JIT hack is running right

<civodul>spk121: you can set GUILE_JIT_THRESHOLD=0 to force jitting everything

<spk121>civodul: so if I do that and run ./check-guile and it passes, does that mean that JIT is working?

<civodul>yup, i think so!

*civodul -> zZz

<civodul>happy hacking! :-)

<spk121>huh. I did not expect to be able to get that working

<civodul>well done

<rlb>Is the 'h' in scm_struct_init documented anywhere? (Maybe I missed it.)

***apteryx_ is now known as apteryx

<fnstudio>hi, suppose i want to glue a few strings together into a url, eg "<schema>://<host>:<port>/<path>/"

<fnstudio>is there a guile-y way of doing this?

<fnstudio>i'd like to avoid string-append since a few bits are sort of scaffolding that'd make the use of string-append very verbose

<rekado>fnstudio: (web uri) has build-uri to build URIs

<fnstudio>rekado: oh great, i'll look at that then, thanks!

<fnstudio>hi, i'm struggling with a http-post request: https://paste.debian.net/1189371/

<fnstudio>there's a header that i'm supposed to add to the request that doesn't seem to be recognised/accepted

<fnstudio>and the request generates a "Bad value for header" error

<mdevos>fnstudio: maybe replace the symbol test with the string "test"?

<mdevos>I dunno

<fnstudio>mdevos: hey thanks, i'd actually tried that but it didn't help, but thanks for suggesting it

<mdevos>I found something relevant, let me post a link ...

<mdevos> https://git.savannah.gnu.org/cgit/guix.git/tree/guix/scripts/publish.scm#n665

<fnstudio>let me see

<mdevos>--> declare-header!

<fnstudio>mdevos: hm true, it must be that

<fnstudio>although i've just tried with "declare-opaque-header!" and that didn't work, i'm going to try with "declare-header!" but that means i need to dig a little bit deeper and get a grasp of what the parser/validator/writer should be

<fnstudio>thanks a lot, that's definitely put me on a good track

<iv-so>what is current state of elisp in guile?

<fnstudio>mdevos: this seems to work now https://paste.debian.net/1189380/

<fnstudio>i can see the request to be generated and sent to the server

<fnstudio>so i'm now past that error

<fnstudio>although i got into another one immediately downstream of that

<fnstudio>the server doesn't seem to like the request and the connection is closed straightaway

<fnstudio>with a 401

<fnstudio>it works with curl

<fnstudio>this is no longer about guile though, it might be that i'm missing a header or something

<ArneBab>fnstudio: can you watch the network with wireshark?

<fnstudio>ArneBab: yes, that's what i did, i was able to compare the two requests, curl vs guile

<fnstudio>that gave me a hint

<fnstudio>as the curl one contained an authorization header

<fnstudio>i thought that guile would transform the userinfo argument into an auth header (by calculating the hash of username + password)

<fnstudio>(or whatever... i might have misphrased the above, but you get what i mean)

<fnstudio>but i was wrong

<fnstudio>and i can actually read: "But since passwords do not belong in URIs, the RFC does not want to condone this practice"

<fnstudio> (https://www.gnu.org/software/guile/manual/html_node/URIs.html)

<fnstudio>so the userinfo is actually just the username

<fnstudio>so i added the auth header "manually" and that... did the trick!

<fnstudio>so i now have a working request, filed under small wins :)

<fnstudio>i actually copied the authorization header over from the curl tcpdump

<fnstudio>as a test

<fnstudio>so i now need to actually build it programmatically in guile

<wingo>good evening

<wingo>spk121: nice work getting jit enabled on mingw!!!!

<wingo>works with GUILE_JIT_THRESHOLD=0 then ?

<ArneBab>wingo: do you know offhand how the Lilypond-folks could switch between optimization levels? On the lilypond list there are now benchmarks of Guile 1.8 vs. 2.2 vs. 3.0.6 that do not look all bad — 3.0.6 is still slower for them than 1.8, but less so than 2.2

<wingo>ArneBab: these benchmarks, i guess they are about time to load a file from source and then read and local-eval some expressions, is that right?

<ArneBab> https://lists.gnu.org/archive/html/lilypond-devel/2021-03/msg00054.html

<wingo>i assume that compile time is not part of the benchmarks

<ArneBab> https://lists.gnu.org/archive/html/lilypond-devel/2021-03/msg00049.html

<ArneBab>I’m not sure — it could well be, because Lilypond has lots of code embedded in ly-files.

<wingo>right. so what is happening here is a few things at once. it would not appear that the change to the reader is significant. the biggest difference between 1.8 and 2.x/3.x is that we run psyntax eagerly on the input, then eval, instead of just starting eval directly. and in 1.8 eval is in C, and 2.x/3.x it is in scheme

<mdevos>wingo: have you received my mail on a (fixed) wait-until-port-readable/writable patch for guile-fibers? <https://lists.gnu.org/archive/html/guile-user/2021-03/msg00035.html> (I don't need a response to it yet, just checking :-) as some people have inboxes with >4K unread mails).

<ArneBab>But 3.x is much faster than 2.2 again

<ArneBab>It’s just not yet at the level of 1.8 again

<wingo>depending on whether eval is called from c or scheme will affect lilypond's experience; calls from scheme faster than calls from c

<rlb>Hmm, I'm likely misunderstanding something, but i18n.c's SCM_STRING_TO_U32_BUF uses scm_i_string_wide_chars which strings.h says doesn't guarantee null termination, but it then calls u32_strcoll, which requires null termination?

<rlb>s/requires/depends on/

<wingo>ArneBab: anyway i guess good that 3.0 on the same order as 1.8, and good to know reader isn't big issue. might make sense to think about time-to-eval in future but no significant perf regressions in 3.0.6 afaiu for lilypond

<wingo>though i guess they don't test 3.0.5, so i can also assume that our general direction towards better perf will be fine for them, given that they don't track it on the day-to-day

<ArneBab>it’s still not back at the performance of 1.8, but I think we’re getting closer to enabling them to switch.

<wingo>hard to beat the latency of a pure-C interpreter with lazy macros

<wingo>lazy macros. what a bonkers thing

<ArneBab>3.0.5.116-85433 "eating a little more memory than with guile-1.8.8 but far less than guile-2.2.6"

<fnstudio>(any obvious module for base64-encoding/decoding?)

<wingo>fnstudio: there is one in guile-lib

<ArneBab>fnstudio: I only have some base32 encoding

<wingo>~/src/guile-1.8$ ./pre-inst-guile

<wingo>guile> ((if 42 if list) #f 10 'hey)

<wingo>hey

*wingo shakes head

<fnstudio>wingo: excellent, thanks, looking that up in guile-lib now

<civodul>wingo: wat?!

<wingo>civodul: :)

<mdevos>wingo: likewise?!

<wingo>the good old days weren't so good!

<civodul>what if, in fact, this captured the essence of programming?...

<civodul>fnstudio: guile-gcrypt has base32 and base64

<ArneBab>civodul: I didn’t know …

<fnstudio>civodul: super, i'll give that a try immediately, thanks

<wingo>incidentally john shutt of the kernel language died recently. i always thought fexprs were bonkers but enjoyed reading him in the golden days of lambda-the-ultimate

<civodul>oh

<wingo>yeah. fellow traveller of weird languages; things are more boring without him

<wingo>mdevos: no attachment on that mail?

<rlb>wrt i18n.c and u32_strcoll, fundamentally I'm wondering if we might have a potential buffer overrun there.

<wingo>rlb: humm i think we probably do

<wingo>i agree with your reasoning

<rlb>OK, and I don't see any way to fix it (trivially) with libunistring -- don't think they have non-null-terminated versions.

<wingo>really!

<wingo>that's weird

<wingo>anyway, ok.

<wingo>i guess we mangle SCM_STRING_TO_U32_BUF then

<wingo>are you on it? :)

<rlb>(As an aside, I ended up fully converting strings.c to utf-8 -- in a very blunt manner (would need clean up), and saw that when I started working on the other code that calls into strings.h.)

<wingo>!!!

<rlb>I'm not, but I could be -- obviously a very expensive approach would be to just duplicate the wide strings and null terminate them.

<wingo>yeah but it's reasonable imo

<rlb>Another would be to put a secret null at the end of all wide strings? Could we get away with that?

<wingo>i.e. it fixes the problem. for latin1 strings it's already like that

<rlb>i.e. just stop not null-terminating strings.

<wingo>can't put a secret null in, doesn't work for shared substrings

<rlb>ooooooooooooooooooooooooooh

<rlb>:)

<rlb>of course -- I should know better.

<rlb>been staring at that code off and on for days, of course.

<wingo>:)
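
Note: a minimal sketch of the blunt fix discussed above, i.e. copy the wide characters and append a terminator before handing them to a routine that scans for one. The helper name copy_with_terminator is hypothetical and is not part of Guile or libunistring; the sketch only assumes the wide characters are stored as uint32_t code points with a known length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Guile or libunistring function):
   duplicate LEN code points and append a 0 terminator.
   Returns NULL on allocation failure or size overflow.
   The caller owns the result and must free() it.  */
static uint32_t *
copy_with_terminator (const uint32_t *chars, size_t len)
{
  assert (chars != NULL || len == 0);

  /* Guard the (len + 1) * sizeof (uint32_t) computation.  */
  if (len >= SIZE_MAX / sizeof (uint32_t))
    return NULL;

  uint32_t *buf = malloc ((len + 1) * sizeof (uint32_t));
  if (buf == NULL)
    return NULL;

  if (len > 0)
    memcpy (buf, chars, len * sizeof (uint32_t));
  buf[len] = 0;                 /* the terminator the scanner relies on */
  return buf;
}
```

The cost is one allocation and one copy per call, which is the expense mentioned above; it also works for shared substrings, since the terminator lives only in the private copy.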

<wingo>how do you see the latin1 / utf-32 / utf-8 tradeoffs?

<rlb>Oh, and not sure if this is OK either (and could be changed), but I got rid of SH_STRINGs, i.e. strings just point into their buffers, and always know their fully computed offset.

<wingo>i.e. is utf-8 a clear winner for you?

<rlb>Well, I think it's a good bit easier to deal with in some ways, since I changed it to have ascii and non-ascii strings, the former are fixed-width, but *all* bytes are utf8.

<rlb>i.e. we can pass the string bytes from either string to the u8_... functions.

<rlb>The biggest cost I've seen so far is (unsurprisingly) any use of set_x

<rlb>or ref

<rlb>say in a loop.

<wingo>yeah

<rlb>Code like that will need to change (where possible) to use traversals/transformers/folders of various sorts.
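
Note: to illustrate why ref or set by index is costly on UTF-8 data, here is a rough sketch of mapping a character index to a byte offset. It has to scan from the start of the buffer, so a loop of refs becomes quadratic, whereas for an all-ASCII string the byte offset simply equals the character index. The name utf8_char_to_byte_offset is hypothetical (not a Guile or libunistring function), and the sketch assumes the input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the byte offset of the CHAR_INDEX-th
   code point in BYTES (LEN bytes of valid UTF-8), or LEN if the
   index is past the end.  Runs in time proportional to the offset.  */
static size_t
utf8_char_to_byte_offset (const uint8_t *bytes, size_t len,
                          size_t char_index)
{
  assert (bytes != NULL || len == 0);

  size_t i = 0;
  while (i < len && char_index > 0)
    {
      i++;                      /* step past the lead byte */
      /* Continuation bytes have the bit pattern 10xxxxxx.  */
      while (i < len && (bytes[i] & 0xC0) == 0x80)
        i++;
      char_index--;
    }
  return i;
}
```

A traversal that carries the byte position along with it avoids the repeated rescans, which is the kind of rewrite suggested above.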

<rlb>and sorry, didn't get rid of shared strings, just changed them to hold a pointer to the original *string* not the string's buffer since non-ascii strings now have byte-start and byte-length members.

<fnstudio>(thanks civodul, gcrypt did it!)

<rlb>That's a trade off I made for perf in the common cases, i.e. non-ascii strings keep the computed byte offsets too.

<rlb>(into the buffers)

<rlb>And of course, imagine it's all buggy -- can't run any of it yet, and need to review for remember_upto_here use (once I remember the rules), etc.

<wingo>sounds tricky from a threadsafety POV wrt string-set!

<rlb>yeah, I'm not even sure I understand what our expectations are there yet, and/or what we promised before.

<rlb>unless it's an ascii string and an ascii char, then you have to just copy/rewrite it.

<wingo>races are possible, crashes are not, is the basic thing. means that if you need to change the "shape" of a shared mutable string, you need to be able to allocate a new one and atomically swap it in; from that POV, any byte offset would be invalidated

<rlb>wingo: I think maybe any of the libunistring functions with "str" in the name require null termination, so this could find other places we might want to review: git grep -E 'u[138][^_]+_.*str'

<wingo>logic is in scm_i_string_ensure_mutable_x
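
Note: one way to picture "allocate a new one and atomically swap it in", as a sketch only. The types buf and str and the functions make_buf and str_splice are hypothetical; they do not reflect Guile's actual stringbuf layout or what scm_i_string_ensure_mutable_x does. The sketch uses C11 atomics and assumes old buffers are reclaimed by a garbage collector, so it never frees them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable byte buffer.  */
struct buf
{
  size_t len;
  uint8_t bytes[];
};

/* Hypothetical string object: one atomic pointer to its buffer.  */
struct str
{
  _Atomic (struct buf *) buf;
};

static struct buf *
make_buf (const uint8_t *bytes, size_t len)
{
  struct buf *b = malloc (sizeof *b + len);
  if (b == NULL)
    return NULL;
  b->len = len;
  if (len > 0)
    memcpy (b->bytes, bytes, len);
  return b;
}

/* Replace OLD_N bytes at START with the NEW_N bytes at REPL by
   building a fresh buffer and publishing it with one atomic store.
   Returns 0 on success, -1 on a bad range or allocation failure.  */
static int
str_splice (struct str *s, size_t start, size_t old_n,
            const uint8_t *repl, size_t new_n)
{
  assert (s != NULL);

  struct buf *old = atomic_load (&s->buf);
  if (start > old->len || old_n > old->len - start)
    return -1;

  size_t tail = old->len - start - old_n;
  struct buf *fresh = malloc (sizeof *fresh + start + new_n + tail);
  if (fresh == NULL)
    return -1;

  fresh->len = start + new_n + tail;
  memcpy (fresh->bytes, old->bytes, start);
  if (new_n > 0)
    memcpy (fresh->bytes + start, repl, new_n);
  memcpy (fresh->bytes + start + new_n,
          old->bytes + start + old_n, tail);

  /* Readers see either the old or the new buffer, never a mix.
     The old buffer is left for the (assumed) garbage collector.  */
  atomic_store (&s->buf, fresh);
  return 0;
}
```

Two concurrent splices can still lose one update, which is a race, but a reader always loads a pointer to a fully built buffer. Any byte offset computed against the old buffer does not carry over to the new one.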

<rlb>Hmm, I'll have to think about what that might mean for sharing -- i.e. we can't update all that atomically the way I have it when there's sharing? Not sure. I'll think about it.

<wingo>yeah something to mull over

<rlb>Maybe we can't do that without an immutable "shared offset" table or something?

<rlb>that we can swap in atomically, i.e. one pointer would have to cover both the stringbuf and all the shared offsets

<rlb>or we'll have to "do something else"

<wingo>generally speaking character offsets can persist; byte offsets are tricky

<rlb>shared strings and multibyte...

<wingo>for utf8

<rlb>right I meant the byte offsets

<wingo>so if you can detect that the underlying object changed, you can invalidate a byte offset

<rlb>but if we don't cache those, it's going to be much uglier perf-wise.

<rlb>hmm, yeah, maybe that's better -- come up with a lazy approach

<wingo>and if you need to make an atomic change that might alter the byte size of a character, either you realloc or serialize through a mutex

<rlb>but may still have some tricky data races if we're not careful...

<rlb>mutex would make it much easier (of course)

<wingo>but tricky without string buffers as an indirection

<wingo>nb, i am not arguing for string buffers

<civodul>fnstudio: actually weinholt is the original author of the base64 module and the one to thank :-)

<rlb>oh, I kept string buffers.

<wingo>just pointing out an aspect of the current arrangement

<wingo>rlb: mutation-sharing substrings are not an important perf case -- imo anyway

<wingo>so simple and correct >>> complicated and faster

<wingo>in that regard

<rlb>suppose we could just use more of a sledgehammer for the first version then, and optimize later if it turns out to be worth it.

<rlb>i.e. think that's roughly what you said.

<wingo>sure. as long as we can avoid a mutex for most instances of string-set!

<fnstudio>right, thank you civodul for the tip and weinholt for a library that's just been extremely useful (gcrypt)

<wingo>or maybe that's not even a requirement, dunno

<rlb>(...looks like most of the cases that might involve null termination questions are in vasnprintf.c and i18n.c)

<rlb>well in any case, ascii strings don't have that problem, so that's another chunk of the time we won't be affected either way.

<wingo>well, goldurnit, should probably fix that before 3.0.6

<wingo>but is my only current blocker in that regard fwiw

<wingo>the null-termination issue i mean

<rlb>well, it's been that way for a *long* time.

<rlb>I'd guess.

<wingo>yeah sure but now that i know it, i hate it :P

<rlb>Heh - I might be able to help, but depends on the timetable - I might or might not have time in the much shorter term.

<rlb>Anyway, and wrt the utf8 stuff it's all so far a *hack* - I just clobbered code freely (deleting, rearranging, reformatting), in ways that might be pretty eyebrow-raising for an upstream submission, so aside from actually getting it working, I might also have to do a *lot* of clean up, deprecations, properly documented removals, etc.

<rlb>(So still might not go anywhere...)

<wingo>yeah regarding utf8, is definitely in the 3.2 category

<wingo>not a 3.0 thing

<rlb>Oh, *no doubt* :)

<wingo>:)

<rlb>btw, I'd started using utf8_t* for the bytes (as does libunistring), but of course a lot of apis have char* -- in terms of the public strings.h api, would we stick with char* and coercions, or...?

<wingo>good question :P

<wingo>publicly, no change IMO

<wingo>there is very little good that would come from change to the string C API :P

<wingo>(initial reaction, obvs)

<wingo>internally -- hoo dunno. initial reaction is uint8_t* is sufficient, not char*, but not sure about utf8_t*; though note that FOO_t is reserved by the c standard

<wingo>so would hesitate to define one locally

*wingo zzz

<rlb>we already get one via libunistring.

2021-03-14.log