IRC channel logs
2022-11-26.log
<rlb>apt install frotz :)
<rlb>Now we just need a guile dialect?
<robin>cow_2001, common lisp-ese for cow_2001++ :) documentation updates are really important
<robin>ZIL? like...zork implementation language?
<cow_2001>i'm not 100% the send-email went through, though. may be pending moderation.
<rlb>wingo: have any idea how common it might be for gc_realloc() to actually avoid a full copy (for either an increase or a shrink)? Just wondering if it's worth trying to use it for scm_read_extended_symbol(). I did originally, to build the symbol stringbuf "in place", but I'm not sure that's worth it complexity or cost-wise.
<rlb>(In other places, written later, I use a hybrid alloca/overflow-to-heap approach to build a copy of the final result locally, which is better for anything that's small enough for the alloca.)
<old>cow_2001: I don't see your patches on the archive. Maybe some misconfiguration with gmail?
<old>robin: Looks like S-exp had a baby with XML
<cow_2001>i mean, i am looking at things right now ~_~
<cow_2001>what i saw was me sending the patches to myself
<old>Ah, maybe. I'm not sure if the ML is restricted and needs approval from a moderator first
<old>It does not say so on the website
<rlb>wingo: I just noticed what looks like string-implementation related code in compile-cps (at least), and wonder if I'm going to get stuck there as soon as I change the strings.c representation? And right now I definitely don't understand (for example) the string-ref "primcall converter" there well enough to handle it.
<sneek>Welcome back tohoyn, you have 1 message!
<sneek>tohoyn, daviid says: nice, could you add this to the application example, the latest version, the one I sent you back with a few changes that would need to be 'in' so we can later add the example all together to the g-golf examples gtk4 ... either paste or email, but take a few minutes to explain what you first tried that didn't work, so I still look at it (and learn from your attempts ...) thanks
<rlb>ACTION found the cps docs.
<tohoyn>daviid: sometimes the message dialogs in the example program are hidden behind the application window and they are only displayed when the application window is clicked. The C version of the example program behaves similarly. Do you have any idea why this happens?
<apteryx>ah, nevermind, I was exec'ing the script via 'guile my-script.scm' ^^'
<spk121>apteryx: you didn't add the full link to the paste
<apteryx>needed to ./my-script.scm for the shebang to be used
<ArneBab>apteryx: you are using /usr/bin/env in addition to the meta-switch. You have to use one or the other.
<daviid>tohoyn: no, i don't know, you'd have to ask in #gtk (which moved to libera.chat this 25th of november) - it could be your compositor as well, or the backend engine ,,, no idea
<rlb>wingo: I ended up wondering if I could comment out the string-ref primcall-converter for now, and that seems to work. i.e. imagining it's an optional optimization we can rework later.
<rlb>(...and I suspect a replacement will be a bit more complex given the need to consult the index for non-ascii strings)
<daviid>sneek: later tell tohoyn message dialogs are in adw (libadwaita)
<wingo>rlb: regarding realloc: i think increase will always copy. decrease will copy in some gcs and not in others
<wingo>rlb: yes the primcall converter is optional. probably the right thing in a utf-8 world is an intrinsic instead
<wingo>for string-ref. i wish string-set! were not a thing :)
<wingo>you will also want to provide access to the raw utf-8 as an immutable bytevector
<rlb>wingo: hmm, that makes sense wrt bytevector. Do you mean eventually, or immediately (i.e. things won't work)?
<wingo>eventually. because you will want to implement srfi-13 in terms of a utf-8 fold over the bytevector
<rlb>Ohh, well so far, I'd just been manually implementing srfi-{13,14} in C, i.e. converting all the existing C code. But there's plenty still to do because I've left any number of the set_x/ref based versions alone, and that'll be catastrophic wrt perf, of course.
<rlb>ACTION has just been using libunicode, plus or minus a couple of add-on macros.
<rlb>Nice wrt the decoder.
<rlb>I also realized wrt realloc that I had an approach in my previous (faster, more hackish) attempt that may be plausible, i.e. you build the string in the (unrevealed) stringbuf, and then when you finish() it, we know the old and new sizes and so can decide whether it's worth a final gc realloc or not.
<rlb>(Though that doesn't get you one fairly obvious optimization as easily as another approach I'd been using, i.e. being able to build the result in an alloca buffer if it's small enough, so you can just do one final precise allocation in the common cases -- e.g. for most symbols a 64-byte alloca should be more than enough).
<rlb>(But those cases can just use hybrid code.)
<wingo>any chance to use an immutable bytevector as the stringbuf?
<wingo>like to be able to adopt a bytevector as-is
<rlb>wingo: hmm, thinking...
<wingo>just thinking that a bytevector buffer can be a decent string builder
<wingo>especially if we provide a "shrink" primitive
<rlb>wingo: and excellent timing -- I *just* got back to the point (more carefully than the previous attempt) of having revoked all the external wide access so that I can start moving strings.c over to utf8 via the indexed bufs.
<rlb>So right now the "stringbuf" is a heap object with two flavors, ascii and non-ascii (utf8). The ascii flavor has two words, type and char count, followed by the ascii chars *in-line*. The non-ascii flavor has three words, type, char count, byte count, followed by the (variable element width) offset index and then the aligned utf-8 bytes, also in-line.
<rlb>A UTF8 bit in the first word (a la the existing WIDE bit) tells you which flavor you have.
<rlb>And they're immutable.
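For readers following along, the two stringbuf flavors rlb describes could be pictured roughly as below. This is only a sketch under assumptions: every `sketch_` name, the flag value, and the use of malloc instead of the GC allocator are made up for illustration and are not taken from Guile's strings.c. The layout of the offset index is not spelled out in the log, so the sketch leaves it as an opaque payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag bit in the first (type) word; not Guile's real value. */
#define SKETCH_UTF8_FLAG ((uintptr_t) 1 << 8)

/* ASCII flavor: two header words, then the chars in-line. */
struct sketch_ascii_buf {
  uintptr_t type;       /* type tag, UTF8 bit clear */
  size_t char_count;    /* equals the byte count for ASCII */
  uint8_t chars[];      /* char_count ASCII bytes, in-line */
};

/* Non-ASCII flavor: three header words, then index and bytes in-line. */
struct sketch_utf8_buf {
  uintptr_t type;       /* type tag, UTF8 bit set */
  size_t char_count;
  size_t byte_count;
  uint8_t payload[];    /* offset index, then aligned UTF-8 bytes
                           (index layout unspecified in this sketch) */
};

/* Both flavors start with the type word, so the flavor can be read
   from the first word alone. */
static int sketch_is_utf8(const void *buf) {
  uintptr_t type;
  memcpy(&type, buf, sizeof type);
  return (type & SKETCH_UTF8_FLAG) != 0;
}

/* Build an ASCII-flavor buffer with one exact-size allocation.
   Returns NULL on allocation failure; caller frees with free(). */
static struct sketch_ascii_buf *sketch_make_ascii(const char *s) {
  size_t n = strlen(s);
  for (size_t i = 0; i < n; i++)
    assert((unsigned char) s[i] < 0x80);   /* ASCII-only precondition */
  struct sketch_ascii_buf *b = malloc(sizeof *b + n);
  if (!b) return NULL;
  b->type = 0;
  b->char_count = n;
  memcpy(b->chars, s, n);
  return b;
}
```

Keeping the header and the bytes in one allocation is what gives the "no pointer chase" property mentioned above; adopting an existing bytevector as-is would need a different flavor that stores a reference instead.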
<rlb>Also now thinking about moving the index to the end for the reasons mentioned the other day.
<rlb>I put everything in-line to avoid a pointer chase, and to improve overall cache locality. For smaller strings, the whole thing might fit in a cache line?
<rlb>If that's worth preserving, we *could* have a third variant of "taken" stringbuffer, I suppose.
<rlb>i.e. where one of the words is just a bytevector reference.
<rlb>But we'd need adapter code to paper over getting the uint8_t* to the bytes for all three flavors. I suppose I could see about adding that in the new flavor I'm starting, so we'll have the option fairly easily...
<wingo>i guess my big question is, what is the story for writing high-performance string processing in scheme
<rlb>i.e. I might be able to lay the groundwork for that option as I go this time.
<wingo>is it ports? does this enable good string ports? or is it direct access to a utf8 bytevector?
<wingo>i assume it's not string-ref
<wingo>i think bytevector access will be fastest though ports are nice too
<rlb>I guess first I'd need to know what kind of things we're trying to focus on wrt perf. For now, I was going under the assumption that the underlying stringbufs would need to be immutable, and set! would just be prohibitively expensive for anything other than ascii strings.
<wingo>yes. my assumption is we need an internal "string builder" kind of interface
<wingo>and that generally it is ok to make string-set! arbitrarily expensive
<rlb>And that we'd also need an offset index (variable width so it's as cheap as possible), and that also suggests immutability.
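A "string builder" of the kind wingo mentions, combined with the small-buffer-then-heap idea rlb brings up, might be sketched as follows. All names here are hypothetical, a fixed 64-byte array inside the builder stands in for the alloca, and plain malloc/realloc stand in for the GC allocator. Arithmetic overflow checks are omitted, and the builder must not be copied by value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_CAP 64   /* stand-in for a 64-byte alloca */

struct sketch_builder {
  uint8_t small[SKETCH_INLINE_CAP]; /* used until the content outgrows it */
  uint8_t *heap;                    /* NULL while bytes still fit in small[] */
  size_t len;                       /* bytes appended so far */
  size_t cap;                       /* capacity of the active buffer */
};

static void sketch_builder_init(struct sketch_builder *b) {
  b->heap = NULL;
  b->len = 0;
  b->cap = SKETCH_INLINE_CAP;
}

static uint8_t *sketch_builder_bytes(struct sketch_builder *b) {
  return b->heap ? b->heap : b->small;
}

/* Append n bytes; returns 0 on success, -1 on allocation failure. */
static int sketch_builder_append(struct sketch_builder *b,
                                 const void *src, size_t n) {
  assert(b->len <= b->cap);
  if (n > b->cap - b->len) {
    size_t new_cap = b->cap;
    while (n > new_cap - b->len)
      new_cap *= 2;                 /* doubling: amortized O(1) per byte */
    uint8_t *p = realloc(b->heap, new_cap); /* acts as malloc when NULL */
    if (!p) return -1;
    if (!b->heap)
      memcpy(p, b->small, b->len);  /* first spill: copy the inline bytes */
    b->heap = p;
    b->cap = new_cap;
  }
  memcpy(sketch_builder_bytes(b) + b->len, src, n);
  b->len += n;
  return 0;
}

/* Finish: one exact-size copy, then release any heap scratch space.
   Returns NULL on allocation failure; caller frees the result. */
static uint8_t *sketch_builder_finish(struct sketch_builder *b,
                                      size_t *out_len) {
  uint8_t *result = malloc(b->len ? b->len : 1);
  if (!result) return NULL;
  memcpy(result, sketch_builder_bytes(b), b->len);
  *out_len = b->len;
  free(b->heap);
  sketch_builder_init(b);
  return result;
}
```

In this sketch the result always costs one final bulk copy, which matches the trade-off discussed for in-line stringbufs: precise allocations for readers, paid for at construction time.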
<wingo>but with the string builder interface we get amortized O(1) append of strings or chars and relatively cheap buffer-to-string conversion
<wingo>append of strings or chars to the string builder buffer i mean, and O(n) in case of appending a string to the buffer of course
<rlb>Right now (only via internal functions) you can alloc a utf8 string of a given size, then start filling in its internal uint8_t*, realloc as you like, and then finish() -- builderish?
<rlb>Though in most internal cases, you might do better with a hybrid alloca approach (as mentioned).
<rlb>i.e. build exactly what you want, and then convert to a string at the end with just one copy.
<rlb>I guess it depends...
<rlb>But with the current "in-line" stringbufs you couldn't use bytevectors and get the most efficient end-result, i.e. there'd always be a final bulk copy.
<rlb>Of course if you're size-doubling or whatever, you might end up with a final copy anyway if you didn't want to leave a lot of empty space...
<rlb>I suppose as yet, implicitly I was favoring reads over writes/mutation, i.e. more precise allocations, and the in-lining for cache locality.
<rlb>But that somewhat disfavors writes/construction costs?
<haugh>old, I just realized one of the best reasons to use (ice-9 format) is the output port and none of our fivehead syntax ideals include that
<old>Well you can do `(display #@"1+1=@(+ 1 1)" port)`
<old>Instead of: `(format port "1+1=~a" (+ 1 1))`
<old>The alternative is doing something like `fstring` in my blog post. I could extend the syntax so one can pass a second parameter that is a port
<old>But with a reader, it's not possible to pass a port
<old>So I was thinking of something like: #@specification@"my string"
<haugh>I will admit that if all you want is interpolation then these do start to look like separate solutions
<old>Weird syntax, but the '@' are necessary to keep paredit happy in emacs.
<old>The specification would be anything really .. it would even be a variable name for a port for example
<old>But I think that separating output from the formatting is easier
<old>Example of a specification: #@trim;port=my-port;literal@" @foo" -> print `@foo` (no whitespace) in port named `my-port`
<old>But really this looks more and more like a hack
<old>Other example: #@port=(current-output-port)@"@(foo)" -> print the value of the expression `(foo)` to port from expression `(current-output-port)`
<haugh>Maybe trying to use the same reader extension for multiline heredocs and for in-line interpolation is the problem.
<old>But there are only a few characters that don't mess with paredit
<old>And guix has 3 of them
<old>Thus the abuse of @@
<haugh>I am now officially taking the stance that this is a third-party tooling issue and should therefore be handled in the third-party domain
<haugh>This is not configurable behavior in paredit?
<haugh>what if our module exports heredoc-reader and interpolation-reader, where interpolation-reader is a heredoc wrapper with interpolation enabled and trimming disabled. Then we could agree on the "string literal" interpolation syntax separately from the parent reader syntax and just let the user do their own read-hash-extend calls.
<haugh>I recall you're not big on lots of required user config though.
<old>Right. A single #:use-module should suffice.
<old>But there could be 4 variants of #CHR
<old>So #"x^2=@(* x x)" for simple interpolation
<old>#@"..." trimming. And #|"..." literal
<old>s/literal/trimming no interpolation
<old>literal is simply "..."
<old>#| is actually nested block comments, but we might hijack it
<cow_2001>is it okay talking about racket here? i've tried looking at the beautiful racket book but i do not find the book very beautiful. hard on the eyes and there are no *.info files. ;p