<spk121>does anyone know a way to test if the JIT is working correctly? I'm trying to figure out if my MinGW JIT hack is running right
<civodul>spk121: you can set GUILE_JIT_THRESHOLD=0 to force jitting everything
<spk121>civodul: so if I do that and run ./check-guile and it passes, does that mean that JIT is working?
<spk121>huh. I did not expect to be able to get that working
<rlb>Is the 'h' in scm_struct_init documented anywhere? (Maybe I missed it.)
***apteryx_ is now known as apteryx
<fnstudio>hi, suppose i want to glue a few strings together into a url, eg "<schema>://<host>:<port>/<path>/"
<fnstudio>i'd like to avoid string-append since a few bits are sort of scaffolding that'd make the use of string-append very verbose
<rekado>fnstudio: (web uri) has build-uri to build URIs
<fnstudio>rekado: oh great, i'll look at that then, thanks!
<fnstudio>there's a header that i'm supposed to add to the request that doesn't seem to be recognised/accepted
<fnstudio>and the request generates a "Bad value for header" error
<mdevos>fnstudio: maybe replace the symbol test with the string "test"?
<fnstudio>mdevos: hey thanks, i'd actually tried that but it didn't help, but thanks for suggesting it
<mdevos>I found something relevant, let me post a link ...
<fnstudio>although i've just tried with "declare-opaque-header!" and that didn't work, i'm going to try with "declare-header!" but that means i need to dig a little bit deeper and get a grasp of what the parser/validator/writer should be
<fnstudio>thanks a lot, that's definitely put me on a good track
<iv-so>what is current state of elisp in guile?
<fnstudio>i can see the request to be generated and sent to the server
<fnstudio>although i got into another one immediately downstream of that
<fnstudio>the server doesn't seem to like the request and the connection is closed straightaway
<fnstudio>this is no longer about guile though, it might be that i'm missing a header or something
<ArneBab>fnstudio: can you watch the network with wireshark?
<fnstudio>ArneBab: yes, that's what i did, i was able to compare the two requests, curl vs guile
<fnstudio>as the curl one contained an authorization header
<fnstudio>i thought that guile would transform the userinfo argument into an auth header (by calculating the hash of username + password)
<fnstudio>(or whatever... 
i might have misphrased the above, but you get what i mean)
<fnstudio>and i can actually read: "But since passwords do not belong in URIs, the RFC does not want to condone this practice"
<fnstudio>so the userinfo is actually just the username
<fnstudio>so i added the auth header "manually" and that... did the trick!
<fnstudio>so i now have a working request, filed under small wins :)
<fnstudio>i actually copied the authorization header over from the curl tcpdump
<fnstudio>so i now need to actually build it programmatically in guile
<wingo>spk121: nice work getting jit enabled on mingw!!!!
<wingo>works with GUILE_JIT_THRESHOLD=0 then ?
<ArneBab>wingo: do you know offhand how the Lilypond-folks could switch between optimization levels? On the lilypond list there are now benchmarks of Guile 1.8 vs. 2.2 vs. 3.0.6 that do not look all bad -- 3.0.6 is still slower for them than 1.8, but less so than 2.2
<wingo>ArneBab: these benchmarks, i guess they are about time to load a file from source and then read and local-eval some expressions, is that right?
<wingo>i assume that compile time is not part of the benchmarks
<ArneBab>I’m not sure -- it could well be, because Lilypond has lots of code embedded in ly-files.
<wingo>right. so what is happening here is a few things at once. it would not appear that the change to the reader is significant. the biggest difference between 1.8 and 2.x/3.x is that we run psyntax eagerly on the input, then eval, instead of just starting eval directly. 
and in 1.8 eval is in C, and 2.x/3.x it is in scheme
<ArneBab>But 3.x is much faster than 2.2 again
<ArneBab>It’s just not yet at the level of 1.8 again
<wingo>depending on whether eval is called from c or scheme will affect lilypond's experience; calls from scheme faster than calls from c
<rlb>Hmm, I'm likely misunderstanding something, but i18n.c's SCM_STRING_TO_U32_BUF uses scm_i_string_wide_chars which strings.h says doesn't guarantee null termination, but it then calls u32_strcoll, which requires null termination?
<rlb>s/requires/depends on/
<wingo>ArneBab: anyway i guess good that 3.0 on the same order as 1.8, and good to know reader isn't big issue. might make sense to think about time-to-eval in future but no significant perf regressions in 3.0.6 afaiu for lilypond
<wingo>though i guess they don't test 3.0.5, so i can also assume that our general direction towards better perf will be fine for them, given that they don't track it on the day-to-day
<ArneBab>it’s still not back at the performance of 1.8, but I think we’re getting closer to enabling them to switch.
<wingo>hard to beat the latency of a pure-C interpreter with lazy macros
<wingo>lazy macros. what a bonkers thing
<ArneBab>3.0.5.116-85433 "eating a little more memory than with guile-1.8.8 but far less than guile-2.2.6"
<fnstudio>(any obvious module for base64-encoding/decoding?)
<wingo>fnstudio: there is one in guile-lib
<ArneBab>fnstudio: I only have some base32 encoding
<wingo>~/src/guile-1.8$ ./pre-inst-guile
<wingo>guile> ((if 42 if list) #f 10 'hey)
<fnstudio>wingo: excellent, thanks, looking that up in guile-lib now
<wingo>the good old days weren't so good!
<civodul>what if, in fact, this captured the essence of programming?...
<civodul>fnstudio: guile-gcrypt has base32 and base64
<fnstudio>civodul: super, i'll give that a try immediately, thanks
<wingo>incidentally john shutt of the kernel language died recently. 
i always thought fexprs were bonkers but enjoyed reading him in the golden days of lambda-the-ultimate
<wingo>yeah. fellow traveller of weird languages; things are more boring without him
<wingo>mdevos: no attachment on that mail?
<rlb>wrt i18n.c and u32_strcoll, fundamentally I'm wondering if we might have a potential buffer overrun there.
<wingo>rlb: humm i think we probably do
<rlb>OK, and I don't see any way to fix it (trivially) with libunistring -- don't think they have non-null-terminated versions.
<wingo>i guess we mangle SCM_STRING_TO_U32_BUF then
<rlb>(As an aside, I ended up fully converting strings.c to utf-8 -- in a very blunt manner (would need clean up), and saw that when I started working on the other code that calls into strings.h.)
<rlb>I'm not, but I could be -- obviously a very expensive approach would be to just duplicate the wide strings and null terminate them.
<wingo>yeah but it's reasonable imo
<rlb>Another would be to put a secret null at the end of all wide strings? Could we get away with that?
<wingo>i.e. it fixes the problem. for latin1 strings it's already like that
<rlb>i.e. just stop not null-terminating strings.
<wingo>can't put a secret null in, doesn't work for shared substrings
<rlb>ooooooooooooooooooooooooooh
<rlb>of course -- I should know better.
<rlb>been staring at that code off and on for days, of course.
<wingo>how do you see the latin1 / utf-32 / utf-8 tradeoffs?
<rlb>Oh, and not sure if this is OK either (and could be changed), but I got rid of SH_STRINGs, i.e. strings just point into their buffers, and always know their fully computed offset.
<wingo>i.e. is utf-8 a clear winner for you?
<rlb>Well, I think it's a good bit easier to deal with in some ways, since I changed it to have ascii and non-ascii strings, the former are fixed-width, but *all* bytes are utf8.
<rlb>i.e. we can pass the string bytes from either string to the u8_... functions. 
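The "duplicate the wide string and null terminate it" fix that rlb and wingo settle on above could look roughly like the sketch below. This is not Guile's code: `dup_u32_with_nul` is a hypothetical helper name, and it uses plain `malloc` where Guile would presumably use its own allocator. It only shows the shape of the copy that would have to happen before handing a buffer to a function that scans for a terminating zero.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Guile or libunistring): copy LEN
   UTF-32 code units from CHARS into a fresh buffer with one extra
   element, and set that extra element to 0.  Returns NULL if the
   size would overflow or the allocation fails.  The caller frees
   the result.  */
static uint32_t *
dup_u32_with_nul (const uint32_t *chars, size_t len)
{
  uint32_t *copy;

  /* Guard the (len + 1) * sizeof multiplication against overflow.  */
  if (len > SIZE_MAX / sizeof *copy - 1)
    return NULL;

  copy = malloc ((len + 1) * sizeof *copy);
  if (copy == NULL)
    return NULL;

  if (len > 0)
    memcpy (copy, chars, len * sizeof *copy);
  copy[len] = 0;   /* the terminator the "str" functions scan for */
  return copy;
}
```

The cost is one allocation and copy per call, which is the expense rlb mentions. A shared substring can be copied the same way, since only its own `len` units are read.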
<rlb>The biggest cost I've seen so far is (unsurprisingly) any use of set_x
<rlb>Code like that will need to change (where possible) to use traversals/transformers/folders of various sorts.
<rlb>and sorry, didn't get rid of shared strings, just changed them to hold a pointer to the original *string* not the string's buffer since non-ascii strings now have byte-start and byte-length members.
<rlb>That's a trade off I made for perf in the common cases, i.e. non-ascii strings keep the computed byte offsets too.
<rlb>And of course, imagine it's all buggy -- can't run any of it yet, and need to review for remember_upto_here use (once I remember the rules), etc.
<wingo>sounds tricky from a threadsafety POV wrt string-set!
<rlb>yeah, I'm not even sure I understand what our expectations are there yet, and/or what we promised before.
<rlb>unless it's an ascii string and an ascii char, then you have to just copy/rewrite it.
<wingo>races are possible, crashes are not, is the basic thing. means that if you need to change the "shape" of a shared mutable string, you need to be able to allocate a new one and atomically swap it in; from that POV, any byte offset would be invalidated
<rlb>wingo: I think maybe any of the libunistring functions with "str" in the name require null termination, so this could find other places we might want to review: git grep -E 'u[138][^_]+_.*str'
<wingo>logic is in scm_i_string_ensure_mutable_x
<rlb>Hmm, I'll have to think about what that might mean for sharing -- i.e. we can't update all that atomically the way I have it when there's sharing? Not sure. I'll think about it.
<rlb>Maybe we can't do that without an immutable "shared offset" table or something?
<rlb>that we can swap in atomically, i.e. one pointer would have to cover both the stringbuf and all the shared offsets
<rlb>or we'll have to "do something else"
<wingo>generally speaking character offsets can persist; byte offsets are tricky
<rlb>shared strings and multibyte... 
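To make wingo's "character offsets can persist; byte offsets are tricky" concrete, here is a small sketch, assuming valid UTF-8 input. `utf8_byte_offset` is a made-up name for illustration, not a function from Guile or libunistring. It maps a character index to a byte offset by skipping continuation bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: return the byte offset at which character number
   CHAR_IDX starts in the valid UTF-8 buffer BUF of LEN bytes, or LEN
   if the buffer holds fewer characters than that.  Continuation bytes
   have the bit pattern 10xxxxxx, so a character ends where the next
   byte is not of that form.  */
static size_t
utf8_byte_offset (const uint8_t *buf, size_t len, size_t char_idx)
{
  size_t i = 0;

  while (i < len && char_idx > 0)
    {
      i++;                                      /* lead byte */
      while (i < len && (buf[i] & 0xC0) == 0x80)
        i++;                                    /* continuation bytes */
      char_idx--;
    }
  return i;
}
```

For the bytes `61 C3 A9 7A` ("a", "é", "z"), character 2 starts at byte 3. After replacing "é" with "e", giving `61 65 7A`, it is still character 2 but now starts at byte 2. A cached byte offset for a shared substring beginning at "z" would therefore be stale, while the character offset stays valid.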
<rlb>right I meant the byte offsets
<wingo>so if you can detect that the underlying object changed, you can invalidate a byte offset
<rlb>but if we don't cache those, it's going to be much uglier perf-wise.
<rlb>hmm, yeah, maybe that's better -- come up with a lazy approach
<wingo>and if you need to make an atomic change that might alter the byte size of a character, either you realloc or serialize through a mutex
<rlb>but may still have some tricky data races if we're not careful...
<rlb>mutex would make it much easier (of course)
<wingo>but tricky without string buffers as an indirection
<wingo>nb, i am not arguing for string buffers
<civodul>fnstudio: actually weinholt is the original author of the base64 module and the one to thank :-)
<rlb>oh, I kept string buffers.
<wingo>just pointing out an aspect of the current arrangement
<wingo>rlb: mutation-sharing substrings are not an important perf case -- imo anyway
<wingo>so simple and correct >>> complicated and faster
<rlb>suppose we could just use more of a sledgehammer for the first version then, and optimize later if it turns out to be worth it.
<rlb>i.e. think that's roughly what you said.
<wingo>sure. as long as we can avoid a mutex for most instances of string-set!
<fnstudio>right, thank you civodul for the tip and weinholt for a library that's just been extremely useful (gcrypt)
<wingo>or maybe that's not even a requirement, dunno
<rlb>(...looks like most of the cases that might involve null termination questions are in vasnprintf.c and i18n.c)
<rlb>well in any case, ascii strings don't have that problem, so that's another chunk of the time we won't be affected either way.
<wingo>well, goldurnit, should probably fix that before 3.0.6
<wingo>but is my only current blocker in that regard fwiw
<wingo>the null-termination issue i mean
<rlb>well, it's been that way for a *long* time. 
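The "allocate a new one and atomically swap it in" idea wingo describes for shape-changing writes can be sketched with C11 atomics as below. Everything here is hypothetical (`struct buf`, `struct str`, `buf_new`, `str_splice`) and much simpler than Guile's real stringbuf logic in `scm_i_string_ensure_mutable_x`. The sketch also assumes something like a garbage collector reclaims old buffers, so it never frees the buffer it replaces.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: a buffer that is never modified once published,
   and a string that points at the current buffer.  */
struct buf { size_t len; uint8_t bytes[]; };
struct str { _Atomic (struct buf *) buf; };

/* Allocate a buffer holding a copy of LEN bytes; NULL on failure.  */
static struct buf *
buf_new (const uint8_t *bytes, size_t len)
{
  struct buf *b = malloc (sizeof *b + len);
  if (b == NULL)
    return NULL;
  b->len = len;
  if (len > 0)
    memcpy (b->bytes, bytes, len);
  return b;
}

/* Replace OLD_N bytes at START with NEW_N bytes from REPL by building
   a whole new buffer and publishing it with one atomic store.  Readers
   see either the old or the new buffer, never a half-edited one.
   Returns 0 on success, -1 on bad range or allocation failure.  */
static int
str_splice (struct str *s, size_t start, size_t old_n,
            const uint8_t *repl, size_t new_n)
{
  struct buf *old = atomic_load (&s->buf);
  struct buf *fresh;
  size_t tail;

  if (start > old->len || old_n > old->len - start)
    return -1;
  tail = old->len - start - old_n;

  fresh = malloc (sizeof *fresh + start + new_n + tail);
  if (fresh == NULL)
    return -1;
  fresh->len = start + new_n + tail;
  memcpy (fresh->bytes, old->bytes, start);
  if (new_n > 0)
    memcpy (fresh->bytes + start, repl, new_n);
  memcpy (fresh->bytes + start + new_n, old->bytes + start + old_n, tail);

  /* Old buffer is deliberately not freed here: other threads may still
     be reading it (assumed to be reclaimed by a GC).  */
  atomic_store (&s->buf, fresh);
  return 0;
}
```

Two concurrent writers can still lose one of the updates, which matches "races are possible, crashes are not". Any byte offset computed against the old buffer has to be treated as invalid once the pointer changes, and that is the sharing problem rlb raises.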
<wingo>yeah sure but now that i know it, i hate it :P
<rlb>Heh - I might be able to help, but depends on the timetable - I might or might not have time in the much shorter term.
<rlb>Anyway, and wrt the utf8 stuff it's all so far a *hack* - I just clobbered code freely (deleting, rearranging, reformatting), in ways that might be pretty eyebrow-raising for an upstream submission, so aside from actually getting it working, I might also have to do a *lot* of clean up, deprecations, properly documented removals, etc.
<rlb>(So still might not go anywhere...)
<wingo>yeah regarding utf8, is definitely in the 3.2 category
<rlb>btw, I'd started using utf8_t* for the bytes (as does libunistring), but of course a lot of apis have char* -- in terms of the public strings.h api, would we stick with char* and coercions, or...?
<wingo>there is very little good that would come from change to the string C API :P
<wingo>internally -- hoo dunno. initial reaction is uint8_t* is sufficient, not char*, but not sure about utf8_t*; though note that FOO_t is reserved by the c standard
<wingo>so would hesitate to define one locally
<rlb>we already get one via libunistring.