IRC channel logs

2023-06-18.log

<rlb>All tests passing after completely removing UTF-32 string(buf)s (i.e. all stringbufs are now ASCII/UTF-8 in memory and .go files, and there are no more even just temporary wide stringbufs in strings.c).
<rlb>I have a lot of clean up to do, the sparse indexes to add, and there are still a bunch of places where we need to switch to a more suitable algorithm (e.g. don't use string-set! iteration), but that may not be as difficult -- I hope :)
<spk121>wingo: RE new custom ports. for 'make-custom-binary-input-port' and friends, 'port-position' and 'seek' both return a 32-bit signed integer on 32-bit systems.
<spk121>seek calls scm_seek() which calls custom_port_seek(), which packs port position from SCM into off_t. But then scm_seek() immediately unpacks off_t back into SCM
<spk121>If you could skip the middle step, you could avoid an unnecessary truncation
<RhodiumToad>off_t rather than scm_off_t ?
<RhodiumToad>er, scm_t_off
<spk121>scm_t_off. which is long int, which is 32-bits on MinGW32
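The truncation spk121 describes can be reproduced outside Guile. The sketch below is not Guile's code: `narrow_off` is a hypothetical stand-in for a 32-bit `scm_t_off`, and `round_trip` only mimics the "pack into the offset type, then unpack again" step from the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit scm_t_off; not Guile's real typedef. */
typedef int32_t narrow_off;

/* Pack a 64-bit position into the narrow type and unpack it again,
   mirroring the SCM -> offset -> SCM round trip described above. */
int64_t round_trip(int64_t pos)
{
    assert(sizeof(narrow_off) < sizeof(int64_t));
    narrow_off packed = (narrow_off) pos;  /* out-of-range values do not survive */
    return (int64_t) packed;
}

/* Returns 1 when the position comes back unchanged, 0 when it was truncated. */
int survives_round_trip(int64_t pos)
{
    return round_trip(pos) == pos;
}
```

Positions up to `INT32_MAX` survive; anything past 2 GiB does not, which is why skipping the intermediate offset type would avoid the problem for custom ports.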
<RhodiumToad>what, MinGW32 doesn't support large files?
<RhodiumToad>(that said, on freebsd at least gen_scmconfig gets the wrong size for scm_t_off on 32-bit architectures)
<spk121>MinGW32 has func pairs like fseek() and _fseeki64()
<RhodiumToad>and guile isn't being compiled with GUILE_USE_64_CALLS ?
<spk121>configure does pick up GUILE_USE_64_CALLS, but, doesn't pick up HAVE_STAT64, which makes scm_t_off an int.
<RhodiumToad>(the bug in the *BSD case is that there are no *64 calls because off_t is natively 64 bits even on 32-bit systems, so GUILE_USE_64_CALLS is off, but off_t is not the same size as either an int or a long)
<RhodiumToad>does MinGW32 in fact lack a stat64?
<spk121>it has a _stat64, with an underscore
<RhodiumToad>does it support largefile compilation flags to provide a 64-bit off_t and make the *64 functions the default?
<spk121>I don't see anything like that in the UCRT C-library docs.
<RhodiumToad>looks like mingw-w64 supports it but not the original mingw32
<spk121>Yeah. I started with mingw32 with the MSVCRT library because it used to sorta work. But I really want to move straight onto mingw64 with the UCRT library, so I can have UTF-8 and other nice things.
<spk121>But, as of a couple hours ago, I have a tree where mingw32 is apparently fixed with threading and JIT.
<rlb>wingo: in some cases, e.g. utf8_to_codepoint, a while back I changed the code to just use libunistring functions instead of our own code. But now given the earlier discussion, I wondered if that was what we'd want (also wondered if there might be reasons we had that code in there that I didn't know about)...
<cow_2001>is commandline + config files handling always pita?
<cow_2001>okay, now i remember why i used system* instead of open-pipe*. system* blocks. open-pipe* doesn't. on the flipside, system* makes the script running unresponsive to ^C until system* returns.
<cow_2001>i want to run children one after another in a loop, but never more than one child at a time
<cow_2001>system* blocks, so that's good, but it also makes the parent unresponsive to ^C
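For comparison, here is how the same trade-off looks at the C level. POSIX `system()` ignores SIGINT in the caller while the child runs, and Guile's `system*` appears to behave similarly. The sketch below (the name `run_child_blocking` is made up for illustration) blocks until the child exits but leaves the parent's signal handling untouched, so ^C still reaches the parent.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and wait for it to finish.
   Returns the child's exit status, 127 if exec failed, -1 on other errors.
   Unlike system(), SIGINT is not ignored in the parent while waiting. */
int run_child_blocking(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached when exec fails */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)         /* retry if a signal interrupted the wait */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling this in a loop runs one child at a time, since each call returns only after its child has exited.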
<ArneBab>how to do the equivalent of mkdir -p with guile?
<cow_2001>(system* "mkdir" "-p" some-path) ;p
<ArneBab>that’s what I wanted to avoid
<ArneBab>:-)
<daviid>ArneBab: in /gnu/services/file-sharing.scm
<daviid>*in guix, actually it seems defined in several places
<daviid>like in the above, (guix gexp) and (guix build utils)
<daviid>fwiw https://git.savannah.gnu.org/cgit/guix.git/tree/guix/build/utils.scm#n392
<ArneBab>daviid: thank you! That’s currently too system-dependent, but that should be fixable — and I think it should move to Guile. I think several of these utilities should. But the license is currently GPLv3+ — do you think there’s a chance to get an LGPL version of these?
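The technique behind `mkdir -p` is small enough to sketch without shelling out. The C version below is an illustration only, not the Guix `mkdir-p` linked above: it creates each path prefix in turn and treats "already exists" as success. The name `mkdir_p` is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create path and any missing parent directories.
   Returns 0 on success, -1 on failure with errno set. */
int mkdir_p(const char *path, mode_t mode)
{
    assert(path != NULL);
    size_t len = strlen(path);
    if (len == 0) { errno = ENOENT; return -1; }

    char *buf = malloc(len + 1);
    if (buf == NULL)
        return -1;
    memcpy(buf, path, len + 1);

    int result = 0;
    /* Start at buf + 1 so a leading '/' is never cut off on its own. */
    for (char *p = buf + 1; ; p++) {
        if (*p != '/' && *p != '\0')
            continue;
        char saved = *p;
        *p = '\0';                              /* temporarily end the prefix here */
        if (mkdir(buf, mode) != 0 && errno != EEXIST) { result = -1; break; }
        *p = saved;
        if (saved == '\0')
            break;
    }

    if (result == 0) {
        /* EEXIST is also reported for non-directories, so check the final target. */
        struct stat st;
        if (stat(path, &st) != 0) result = -1;
        else if (!S_ISDIR(st.st_mode)) { errno = ENOTDIR; result = -1; }
    }

    int saved_errno = errno;
    free(buf);
    errno = saved_errno;
    return result;
}
```

A second call with the same path succeeds as well, matching the idempotent behavior of `mkdir -p`.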
<cow_2001>did a tiny bit on https://git.sr.ht/~kakafarm/guile-clipboard-speaker/ today
<cow_2001>it's that text to speech script thing
<cow_2001>rip https://davidson.weizmann.ac.il/online/sciencehistory/%D7%94%D7%90%D7%99%D7%A9-%D7%A9%D7%97%D7%9C%D7%91-%D7%A2%D7%A7%D7%A8%D7%91%D7%99%D7%9D-%D7%95%D7%92%D7%99%D7%93%D7%9C-%D7%9E%D7%93%D7%A2%D7%A0%D7%99%D7%9D
<dadinn[m]>I am thinking whether there is a way in Guile to identify that two ports are not the same? In particular, whether two ports both refer to STDIO?
<rlb>If you're ok missing redirections, then possibly (fileno port)?
<dadinn[m]>fileno is what I was thinking too... I just need to be sure that the two ports in use are not both the STDIO
<dadinn[m]>there is also port->fdes though I am not sure how that file-descriptor could be used then
<rlb>I'd assume that's the same number offhand, but haven't looked carefully.
<rlb>(I'd assumed you wouldn't use the fds, just compare them.)
<dadinn[m]>maybe I could compare the 2 fdes with eq?
<dadinn[m]>awesome... let me give this a try! ;)
<rlb>(eqv? (fileno port-x) (fileno port-y))
<rlb>I'd avoid eq? for this.
<dadinn[m]>ah, fileno would be the same too?
<dalepsmith>Can't two fd's refer to the same thing? (because dup)
<rlb>There should only be one fileno per port.
<dadinn[m]>rlb: you've mentioned redirection
<rlb>Right -- if you call dup you can cause two fds to point to the same underlying stream.
<rlb> https://en.wikipedia.org/wiki/File_descriptor
<rlb>(decent diagram there too)
<dalepsmith>rlb: Thanks. I couldn't find the words for "underlying stream".
<rlb>certainly
<dadinn[m]>so comparing fileno is more reliable in case of redirection than comparing the fds?
<dadinn[m]>or the other way around?
<rlb>fileno is the fd.
<rlb>(at least I'm assuming that's what it is, offhand)
<rlb>i.e. if you get an integer for an input/output "stream" on POSIXish operating systems, it's almost certainly the file descriptor.
<rlb>What do we know, if anything, about the general cost of scm_gc_realloc()? Asking because that affects reasoning about algorithm choices wrt the utf-8 migration.
<dadinn[m]>(let* ((port1 (open-input-pipe "sleep 3"))... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/20005c516162312f09c3f78903ca58ae4621b982>)
<dadinn[m]>this returns (25 26 27). shouldn't the 1st and 3rd be the same? 🤔
<rlb>For example, in some cases, is it better to allocate one known-size, throwaway utf-32 array and then allocate the utf-8 flavor, precisely, once at the end, or to just build a single utf-8 array (no temp utf-32), with the possibility of a "few" reallocs, depending on how you do it? I'd imagine you'd more or less always have one realloc then, for say string-tabulate, if you want to make sure the size is exactly right at the end (if that's worth doing for "nearly right" cases).
<rlb>(Right now I'm just using the utf-32 temp array...)
<rlb>(well at least in one of the interim states)
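The two strategies rlb weighs can be put side by side in a small sketch. This is not Guile's strings.c: plain `malloc`/`realloc` stand in for `scm_gc_realloc`, and `char_proc`, `tabulate_via_utf32`, `tabulate_direct`, and the two sample procedures are all names made up for illustration. The direct version counts how often it had to grow, which is the cost in question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t (*char_proc)(size_t index);   /* hypothetical tabulate callback */

/* Encode one valid code point into out[0..3]; returns the bytes written. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) { out[0] = (unsigned char) cp; return 1; }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (cp >> 18));
    out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* Strategy A: throwaway UTF-32 array, then one exactly sized UTF-8 allocation. */
unsigned char *tabulate_via_utf32(char_proc proc, size_t n, size_t *out_len)
{
    unsigned char scratch[4];
    size_t bytes = 0, len = 0;
    uint32_t *tmp = malloc((n ? n : 1) * sizeof *tmp);
    if (tmp == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {            /* proc is called once per index */
        tmp[i] = proc(i);
        bytes += utf8_encode(tmp[i], scratch);  /* measure only */
    }
    unsigned char *buf = malloc(bytes ? bytes : 1);
    if (buf != NULL) {
        for (size_t i = 0; i < n; i++)
            len += utf8_encode(tmp[i], buf + len);
        *out_len = len;
    }
    free(tmp);
    return buf;
}

/* Strategy B: one UTF-8 buffer sized for the all-ASCII case, doubled on
   demand, shrunk to fit at the end. *grows counts the growth reallocs. */
unsigned char *tabulate_direct(char_proc proc, size_t n, size_t *out_len, size_t *grows)
{
    unsigned char scratch[4];
    size_t cap = n ? n : 1, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL) return NULL;
    *grows = 0;
    for (size_t i = 0; i < n; i++) {
        size_t k = utf8_encode(proc(i), scratch);
        if (len + k > cap) {
            size_t new_cap = cap;
            while (new_cap < len + k) new_cap *= 2;
            unsigned char *grown = realloc(buf, new_cap);
            if (grown == NULL) { free(buf); return NULL; }
            buf = grown; cap = new_cap; (*grows)++;
        }
        memcpy(buf + len, scratch, k);
        len += k;
    }
    if (len > 0 && len < cap) {                 /* final shrink to the exact size */
        unsigned char *exact = realloc(buf, len);
        if (exact != NULL) buf = exact;
    }
    *out_len = len;
    return buf;
}

/* Sample callbacks: all-ASCII text, and U+20AC (three UTF-8 bytes each). */
uint32_t ascii_proc(size_t i) { return (uint32_t) ('a' + i % 26); }
uint32_t euro_proc(size_t i)  { (void) i; return 0x20AC; }
```

With all-ASCII input the direct version never reallocates; with wider characters it grows a logarithmic number of times and then shrinks once, whereas the UTF-32 version always pays for the temporary array.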
<rlb>Dup allocates a new fd (see the diagram on the wikipedia page), but it refers to the same file structure in the kernel as the original fd, i.e. there's a level of indirection.
<dalepsmith>dadinn: What rlb said
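rlb's point about `dup` can be checked directly in C. In the sketch below (function names are made up for illustration), `dup` returns a different descriptor number, so comparing numbers alone would call the two descriptors different, while comparing the device and inode reported by `fstat` shows they refer to the same file. This is one possible approach, not something proposed in the discussion; device/inode comparison is reliable for regular files and terminals, and its behavior for pipes and sockets varies by system.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if both descriptors refer to the same file (same device and
   inode), 0 if they do not, and -1 if fstat fails on either one. */
int same_underlying_file(int fd_a, int fd_b)
{
    assert(fd_a >= 0 && fd_b >= 0);
    struct stat a, b;
    if (fstat(fd_a, &a) != 0 || fstat(fd_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Returns 1 if dup(fd) yields a new number that still refers to the same
   file, 0 if not, and -1 if dup fails. */
int dup_changes_number_only(int fd)
{
    int copy = dup(fd);
    if (copy < 0)
        return -1;
    int ok = (copy != fd) && same_underlying_file(fd, copy) == 1;
    close(copy);
    return ok;
}
```

In Guile terms, the integers to pass in would be whatever `fileno` returns for each port.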