IRC channel logs

<mwette>Anyone else get this? My own built 3.0.9 libguile-3.0.so on ubuntu has no scm_c_take_typed_bytevector. But if I rebuild with --disable-lto it does have it. To check: nm $prefix/libguile-3.0.so | grep take_typed_bytevector

<old>Does LTO optimize away symbols that are not used?

<old>not sure what kind of optimization the linker does with LTO

<mwette>it can do stuff like beta reduction between compilation units, IIRC

<old>but public symbol like scm_c_take_typed_bytevector (I assume it has default visibility) should not be optimized away right?

<mwette>I see now. it's internal. See SCM_INTERNAL explanation in scm.h

<mwette>I'm going to have to use C->scheme->C route.

<old>mwette: for me it is extern, so basically does nothing

<chsasank>hey folks, is it possible for me to call guile functions from another language like python?

<mwnaylor>I don't know about directly, but a C module can be written as a bridge. https://www.gnu.org/software/guile/manual/html_node/Programming-in-C.html https://docs.python.org/3/extending/extending.html

<mwnaylor>The C module calls the guile interface, wrapped in a function that exposes the interface to be called from Python.

<chsasank>is this a good example of this? https://gist.github.com/ncweinhold/9724254

<mwnaylor>Yes, that would be the first half.

<mwnaylor>Something like this would be the next half. https://www.digitalocean.com/community/tutorials/calling-c-functions-from-python

<mwnaylor>Hopefully your implementation will go as smoothly as the examples show.

<mwnaylor>What in guile do you expect to call from python?

<mwnaylor>I am discussing this from a theoretical point of view; I haven't attempted what you are doing. You might get some good feedback on the channel #python.
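
The Python half of the bridge described above could look roughly like the sketch below. It is only an illustration: libc and its strlen function stand in for the hypothetical bridge library that would wrap the Guile calls, and the helper name c_strlen is made up for this example.

```python
import ctypes
import ctypes.util

# Load the C library. libc is only a stand-in here for the hypothetical
# bridge library that would wrap the Guile interface.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Call the C function strlen on a Python bytes object."""
    return libc.strlen(data)
```

A real bridge would load its own shared object instead of libc and declare the argument and return types of each exported wrapper in the same way.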

<rlb>civodul: regarding the discussion about apis like getenv/setenv that require the ability to handle arbitrary binary, do we already have some efficient way to change the locale to latin-1 for just the current dynamic extent (i.e. per-thread, etc.), or would that be the next notable thing we'd need if that's at least the medium term plan?

<rlb>I looked around a bit, but didn't see one (may have missed it).

<rlb>I ask in part because I just had to write some trivial C wrappers for some functions (including those) to support a tool I'm toying with.

<rlb>(or I thought I did)

<civodul>rlb: hi! at the C level, there’s scm_{to,from}_latin1_string

<civodul>otherwise there’s the ‘%default-port-encoding’ fluid, but changing it is a bit more expensive

<rlb>Hmm, so how do I get a unicode incompatible value back from (getenv "foo") thread-safely?

<rlb>i.e. I need to set the locale to latin-1 briefly, but just for that thread/region, I think?

<ArneBab>old: it might be interesting to focus the fuzzing on limit values — getting close to the boundaries of a range for example.

<rlb>civodul: ...and if we don't have something like that yet, then seems like we might need to add it for this approach.

<ArneBab>old: ⇒ the most valuable part of the fuzzing might be to navigate the search space efficiently.

<civodul>rlb: i think scm_{to,from}_latin1_string is the way, but fluids are thread-safe too

<rlb>I'm not sure I follow -- how do I use that to help with (getenv "foo") from an arbitrary scheme thread, if say the value of foo can't be encoded in the current locale?

<rlb>i.e. I need to thread-safely change the locale to latin-1, make the call, then change back, I think?

<rlb>(...and more broadly, of course unless you know "foo", there's no way to know whether or not you have to do that)

<rlb>(but that's a separate higher-level concern)

<rlb>civodul: put another way, I was trying to figure out how I can now, or how we might want to make it possible for me to safely call getenv.

<civodul>if scm_getenv does “return scm_from_latin1_string” instead of “return scm_from_locale_string”, then it’ll do what you want, no?

<rlb>Oh...

<rlb>I didn't realize we were entertaining the idea of not respecting the current locale as the default for the relevant syscalls. i.e. are you saying getenv would never return anything but latin-1?

<rlb>ACTION may be substantially misunderstanding

<old>ArneBab: my goal is to mimic a drunk user

<old>Just doing the worst thing you could think of with the API

<old>typically this will test invalid inputs, but also test the internal state of the library

<mwette>old: This reminds me of a story my grad advisor told me. He was shown a new CAE tool someone had developed. At the command line he typed in gibberish, hit the return key, and the program crashed.

<old>mwette: typically what I want to avoid ^^

<mwette>ACTION has demo of creating binary guile w/ embedded modules (from .go files): github dot com slash mwette slash guile-saapp

<ieure>Why obfuscate the URL?

<mwette>habit

<ieure>What a strange habit.

<mwette> https://github.com/mwette/guile-saapp

<civodul>mwette: nice!

<civodul>i’d love to have a way to embed bytecode in libguile, so you can have a statically-linked Guile that can do minimal stuff without accessing the file system

<civodul>(such as in an initrd)

<dthompson>civodul: that would be cool

<dthompson>would be nice to have that before native compilation is a thing

<ArneBab>mwette: very cool!

<rlb>civodul: so just to double-check, were you suggesting that (getenv ...), etc. might eventually only produce results encoded via latin-1, and not the current locale?

<civodul>no no, i’m just saying how this could be achieved :-)

<civodul>but hmm

<rlb>civodul: with the current code? If so, that's what I was asking -- is there a thread-safe way to briefly change the locale for the current thread only?

<civodul>no sorry, i guess i need to page that back in

<rlb>If not, then I was thinking *that's* what we'd need for a latin-1 strategy.

<civodul>what i had in mind in Brussels was to split %default-port-encoding into two fluids

<civodul>or at least have a new fluid for the encoding of “OS strings”

<rlb>OK, right -- I think we're talking about the same thing. To follow a latin-1 strategy, we'd have to have all the relevant functions respect a locale fluid.

<rlb>or similar

<civodul>yes

<civodul>i could imagine %default-file-name-encoding, which would default to locale encoding

<civodul>now, extending that to getenv, getpw, etc. etc. is tricky

<civodul>well, there could be one fluid for everything that goes through the name service switch

<rlb>I suppose we could have a new "to/from" string function pair that respects another "override" fluid, and scatter those in all the right places.

<civodul>yes

<rlb>i.e. scm_getenv would call scm_to/from_maybe_bytes() and then maybe_bytes would respect a fluid override or, whatever...

<rlb>i.e. general idea, not specific details.

<civodul>yes

<rlb>Then you could say (with-something-something (getenv "foo")) and get back latin-1 :)

<civodul>right

<civodul>maybe one fluid for file names, one for “NSS names”, one for “process-related things”?

<civodul>or just a single fluid for “OS strings”?

<civodul>this is tricky

<rlb>Offhand, I'd think it should just be for any function that's returning or receiving values that are actually just bytes in the end.

<civodul>yes, but maybe that’s too vague? how would you know what’s affected?

<rlb>The other option *might* be to consider python-style byte-smuggling, but I'd *really* want to think about that harder first.

<civodul>also, one might want to special-case file names

<civodul>right :-)

<rlb>wrt knowing what's affected -- well, in the limiting case, you just need to know what the underlying call specifies, but you really need to know that anyway to write correct code, error-wise already?

<rlb>i.e. getenv might just crash you right now, with no (thread-safe) recourse...

<rlb>but of course, ideally, we'd also document in the info pages (and/or docstrings) which calls have arguments that might not actually be strings.

<rlb>(underneath, and so might require using this latin-1 facility)

<rlb>(if you need to handle arbitrary results)

<rlb>i.e. if you're writing tar, or cp, or...

<civodul>ACTION nods

<rlb>It's basically much of posix (at least as implemented in linux, *bsd, etc.), really?

<rlb>i.e. paths, user names, group names, xattrs, etc.

<rlb>they're all just null terminated bytes.

<civodul>yes, the OS interface in general

<rlb>The python-style approach does have an advantage in that it's finer-grain, i.e. the locale-changing approach means all args have to be the same on this front when the function takes multiple args...

<rlb>And neither approach is without cost with the utf-8 conversion because they're both multi-byte, i.e. non-ascii latin-1 won't be single-byte anymore.
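
For comparison, the two approaches can be tried out in Python itself, where the byte-smuggling is the "surrogateescape" error handler. This is only a sketch of the behaviour being discussed, not a proposal for Guile's API; the helper names are made up.

```python
def smuggle(raw: bytes) -> str:
    """Decode as UTF-8, mapping each undecodable byte to a lone surrogate."""
    return raw.decode("utf-8", errors="surrogateescape")


def unsmuggle(text: str) -> bytes:
    """Reverse smuggle(): turn the lone surrogates back into the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


def via_latin1(raw: bytes) -> str:
    """Decode as latin-1: every byte becomes exactly one code point."""
    return raw.decode("latin-1")


def utf8_size(text: str) -> int:
    """Number of bytes the string needs when stored as UTF-8."""
    return len(text.encode("utf-8"))
```

Both routes get the original bytes back, but a latin-1 string with non-ASCII characters grows when stored as UTF-8: the five bytes b"caf\xe9\xff" decode to five characters that need seven bytes in UTF-8.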

<rlb>Of course the other approach we discussed is to just allow #u8() or strings to all the relevant functions, which I suppose could be done via scattering a scm_to/from_os_string()ish in all the right places.

<rlb>But there, you'd still have the issue of picking return value types.

<rlb>Anyway -- just started wondering about it because I hit it again, and had to write more trivial C wrappers.

<rlb>I can keep doing that, and it's fairly easy for me, but it's of course not ideal for people in general.

<rlb>(Oh, and I suppose not high priority unless we can figure out what we want, in which case, I might hack on it.)

<rlb>Higher might be the thread fix.

<civodul>yes, i’ve had to work around it in Guix too (non-locale-encoded file names specifically), not great

<civodul>oh yes, the thread fix!

<civodul>you had a patch for ‘join-thread’?

<rlb>Yep - https://codeberg.org/rlb/guile/src/branch/rev-parallel-tests can try it, and the parallel tests there too if you like. The proposed deadlock fix is the last two commits there (the parallel test changes are before that.)

<civodul>thanks

<civodul>it’s late for today but i should really schedule time for it

<rlb>sounds good

<graywolf>Out of curiosity, do people still use 2.0 version of guile?

<Arsen>ACTION has a dependency on 0.9

<Arsen>I think

<graywolf>aaaah, so it seems that guile-2.0 actually requires the .scm extension while loading files. (load-from-path "foo") does *not* load "foo.scm".

<graywolf>But based on documentation that sounds like a bug

<graywolf>Hm..

<graywolf>Well whatever, time to make some symlinks

<mwette>redhat 8 provides guile 2.0

<civodul>ACTION would assume that 2.0 has practically disappeared

<dthompson>the real question is: when we gettin 3.0.10? 😈

<graywolf>I hope not before the copy-on-write copy-file is merged

<graywolf>:)

2024-02-23.log