IRC channel logs

***sneek_ is now known as sneek

<nalaginrut>morning guilers~

<b4283>salut

<nalaginrut>heya

<lloda>morning

<lloda>wingo, could you have a look at the array patches?

<wingo>morning!

<wingo>lloda: i started looking at them last night

<wingo>i got a few patches in, will continue tonight

<lloda>great :D

<civodul>Hello Guilers!

<wingo>morning civodul :)

<lloda>wingo: there's this incompatible change where I made vector = simple vector. I hesitated, but I pushed forward b/c I couldn't get much agreement on the ml. You be the judge.

<wingo>ok, will do

<lloda>and thanks!

<wingo>thank you and again, apologies for the delay!

<mark_weaver>I also tried to review your patches long ago, but I got hung up because of many changes that I wasn't sure about.

<b4283>i was just using the assoc-list api, and got stuck because i thought assq-set! always mutates the list

<mark_weaver>I don't remember the details, but at least some of the changes seemed questionable to me.

<lloda>they probably are, I don't disagree

<b4283>is it intentional that users must the common gateway "set!" because of compiler optimization reasons?

<b4283>must *use*

<mark_weaver>b4283: well, for starters, there's no way to mutate an empty list.

<mark_weaver>and if there's no matching association to mutate, then it gets added to the front, which again cannot be done by mutation.

<mark_weaver>so it's not really about compiler optimization at all.

<mark_weaver>it's just due to the nature of scheme lists. you must always 'set!'.

<b4283>it's confusing to have a ! in assq-set

<b4283>but i guess that's required by r?rs anyways

<mark_weaver>well, it does mutate the existing alist in some cases, so it's needed.

<mark_weaver>if the association is already in the list, then it's mutated.

<mark_weaver>if it's not, then there's nothing to mutate. it's just added to the front.
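
(Annotation, not part of the log.) A minimal Guile sketch of the behaviour mark_weaver describes, assuming the documented semantics of assq-set!. The variable name settings is made up for illustration.

```scheme
;; Start from an empty alist: there is no pair to mutate, so the
;; result has to be captured with set!.
(define settings '())
(set! settings (assq-set! settings 'colour 'red))
;; settings is now ((colour . red))

;; The key already exists, so the existing pair is mutated in place.
;; Using set! anyway is harmless and keeps the code correct in both cases.
(set! settings (assq-set! settings 'colour 'blue))
;; settings is now ((colour . blue))

;; A new key is consed onto the front; without set! the variable
;; would still point at the old, shorter list.
(set! settings (assq-set! settings 'size 10))
;; settings is now ((size . 10) (colour . blue))
```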

<b4283>mark_weaver: i get it, thanks for the explanation

<mark_weaver>np!

<mark_weaver>lloda: can you explain the nature of the incompatibility? I don't quite know what you mean by "vector = simple vector". I understand the distinction, but I don't know what that equation means.

<mark_weaver>what procedures are affected?

<mark_weaver>(in the public API, that is)

<lloda>it'll take me a minute to recall it

<wingo>i can push an up-to-date patchset...

<mark_weaver>lloda: okay, thanks

<mark_weaver>b4283: we probably should make it more clear in the manual.

<wingo>mark_weaver, lloda: lloda-array-cleanup in master

<mark_weaver>we should probably emphasize that you should always use 'set!' in combination with those destructive alist procedures.

<mark_weaver>wingo: thanks, I'll take a look tomorrow (need to sleep soon).

<wingo>i removed the patches to compile-assembly.scm; they might need corresponding fixes to the new compiler, who knows

<wingo>i am relying on the tests to help me figure that out

<wingo>s/master/savannah git/ :)

<lloda>yes, iirc those things are tested.

<wingo>lloda: do you have commit access?

<lloda>I don't

<wingo>hum, we should flip that bit anyway

<lloda>the changes to compile-assembly.scm were in reading array literals.

<lloda>wingo: I'll have a look

<mark_weaver>I have patches in r7rs-wip for compile-assembly.scm to handle cyclic literals.

<wingo>lloda: go into savannah if you would and request to be added to the guile group

<lloda>will do, thx

<wingo>that will at least let you push branches to savannah, which is easier for everyone :)

<wingo>and i think bugfixes are welcome as well, though mark or ludo will correct me ;) just send them to the list and if you get no response within a week or so, push them

<wingo>mark_weaver: cyclic literals, interesting; i wonder if master can handle those...

<mark_weaver>yeah, I was curious about that too.

<wingo>should be an easy fix, if not... the linker should handle it almost automagically

<lloda>wingo: request sent.

<wingo>lloda: done, thanks!

<lloda>mark_weaver: simple vectors are like bytevectors or uniform vectors, but for the SCM type.

<mark_weaver>lloda: I don't know what that means.

<wingo>they are one-dimensional packed arrays of SCM values

<b4283>mark_weaver: the manual already clarified that "the only safe way to use it is to through set!", just that i missed it in the first place :/

<wingo>there is SCM_IS_SIMPLE_VECTOR, etc...

<lloda>right

<lloda>it's a unique type.

<wingo>and they have their own tc7

<wingo>yes

<lloda>however, 'vector's can be simple vectors or certain kinds of arrays also

<lloda>so whenever you use a vector-ref, vector? etc, checks are made to see if the array passes as a vector 'functionally'

<lloda>I mean, if it's an array, and if that array passes as a vector

<lloda>so 'vector's are not a unique type.

<wingo>lloda: what was your thinking when you made vector-ref only work on simple vectors?

<mark_weaver>so your patches would make the 'vector-ref', 'vector-set!', and maybe 'vector?' procedures work only on simple vectors, not arrays, is that right?

<lloda>yes.

<mark_weaver>I think that's a very good thing.

<wingo>are you treating only the array interface as the all-singing polymorphic interface?

<lloda>the thinking was that vector = uniform-vector = bytevector, just with a different element type.

<wingo>buf, confusing question :)

<mark_weaver>it means that we can generate much simpler code for the vector ops.

<wingo>yep

<lloda>singing and polymorphic go well together

<mark_weaver>especially when we have native compilation, that will be good.

<lloda>and yes.

<wingo>interesting

<lloda>array is then the only polymorphic type.

<wingo>i've never known when any of those things should be polymorphic, and that sounds like a fine rule to me

<mark_weaver>so the array operations will work on vectors, but not vice versa, right?

<lloda>yes.

<mark_weaver>that sounds great.
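
(Annotation, not part of the log.) A small Guile sketch of the rule agreed on here: the array procedures accept plain vectors, while a rank-1 array that is not a simple vector is only reachable through the array interface. The names v, base and view are illustrative, and the last comment describes the proposed behaviour from the discussion, not something this snippet checks.

```scheme
;; A plain (simple) vector is accepted by the generic array interface.
(define v (vector 10 20 30))
(array-ref v 1)        ; => 20
(array-rank v)         ; => 1

;; A rank-1 array that is NOT a simple vector: a shared view onto
;; every other element of a 6-element vector.
(define base (vector 0 1 2 3 4 5))
(define view (make-shared-array base (lambda (i) (list (* 2 i))) 3))
(array-ref view 1)     ; => 2, array-ref is the polymorphic accessor

;; Under the change discussed above, (vector-ref view 1) would be an
;; error, because view is an array but not a simple vector.
```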

<mark_weaver>and now I'm trying to remember what I found questionable. I guess I'll have to look through the patches again :)

<mark_weaver>lloda: well, thanks for your patience on this. I'm truly sorry that you've had to wait so long. it's just a daunting review job, that's all.

<lloda>nah, you're right it's messy code.

<mark_weaver>I'm going to try to take a close look in the next week though.

<wingo>mark_weaver: do you want to take the review?

<wingo>i was going to start on it but we shouldn't duplicate work

<mark_weaver>well, it's probably good for both of us to review it.

<wingo>as you like, it doesn't matter to me

<mark_weaver>please do review it, wingo. but also please give me a chance to review it before pushing, if you don't mind.

<wingo>mark_weaver: ok, i'll hold off functional changes for more review, but i will push bug-fixes and test things and similar

<mark_weaver>sounds good.

<wingo>still it seems like double-review is too much, but whatever

<mark_weaver>well, I think you're probably more familiar with that area of the code. but at the same time, I want to remember what I found questionable. I might not do a full review.

<mark_weaver>it's possible that I had 'stable-2.0' too much in mind when I tried to review it last time, and that the vector==simple-vector thing worried me. I hope that's the case.

*mark_weaver --> zzz

<wingo>sleep well :)

<b4283>there's a nice song about sleeping well

<civodul>lloda: welcome to the Savannah group ;-)

<lloda>thanks! :)

<lloda>I've built lloda-array-cleanup. make check passes. I've tested my programs against it. Everything seems to work except for this message:

<lloda>;;; WARNING: compilation of [...]

<lloda>;;; ERROR: don't know how to intern #2f64()

<lloda>similar errors for other literals.

<lloda>I was on 2.0.9 before, so this is new to me.

***DerGuteM1 is now known as DerGuteMoritz

<civodul>"don't know how to intern"?!

<civodul>just add it to the symbol hash table!

<civodul>:-)

<civodul>that's on master?

<lloda>it's lloda-array-cleanup which is on top of master. I want to blame master b/c I had the extra patches before on top of 2.0.9 and not this issue. But I haven't tested master yet.

<wingo>yes, probably the new compiler doesn't handle that for whatever reason

<wingo>also there is no symbol hash table in master; only a weak set :)

<civodul>wingo: right, but still, it does know how to intern things, doesn't it? :-)

<wingo>yes :)

<civodul>heheh

<wingo>but as in stable-2.0, symbols are gc'd, so it's not really interning i guess

<wingo>dunno

<wingo>hard to tell :)

<jmd>Calling (link "x/y/z" "w/x/y/z") I get the error ERROR: In procedure link:

<jmd>ERROR: No such file or directory

<jmd>which file or directory does it think does not exist?

<civodul>you can't tell

<civodul>the syscall doesn't provide more info

<jmd>Oh. Then is there a mkdir which will create the necessary subdirs?

<b4283>mkdir -p
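
(Annotation, not part of the log.) b4283's answer is the shell command. Guile's core mkdir procedure creates one directory level per call, so from Scheme something like the following sketch would be needed before calling link. The helper name mkdir-p* is hypothetical, and the check-then-create step is not safe against concurrent creation of the same directories.

```scheme
;; Hypothetical helper: create DIR and any missing parent directories.
(define (mkdir-p* dir)
  (let loop ((parts (string-split dir #\/))
             (path #f))
    (unless (null? parts)
      (let ((next (if path
                      (string-append path "/" (car parts))
                      (car parts))))
        ;; An empty NEXT comes from a leading "/"; skip it, and skip
        ;; anything that already exists.
        (unless (or (string-null? next) (file-exists? next))
          (mkdir next))
        (loop (cdr parts) next)))))

;; For jmd's case the missing piece is most likely the parent of the
;; new link name, so one would create it first:
;;   (mkdir-p* (dirname "w/x/y/z"))
;;   (link "x/y/z" "w/x/y/z")
```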

<wingo> http://blog.frama-c.com/index.php?post/2013/02/26/Portable-arithmetic-operations

<wingo> http://en.wikipedia.org/wiki/SipHash

<wingo>i wonder if we should use that

<wingo>probably so.

<civodul>systemd and Rust have it, so i guess we must

<wingo>hehe

<wingo>it's to prevent hash collision attacks

<wingo>if we switch to utf-8 strings we can hash directly over the utf-8 bytes

<wingo>right now we have to be careful to hash over codepoints, since a given string can have multiple representations
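
(Annotation, not part of the log.) To illustrate why hashing over codepoints gives a representation-independent result, here is a sketch of an FNV-1a-style hash folded over the characters of a string. It is not SipHash and not Guile's internal hasher; the name codepoint-hash and the constants are chosen for illustration only.

```scheme
;; FNV-1a-style hash computed over codepoints rather than over the
;; bytes of whatever buffer happens to back the string.
(define (codepoint-hash str)
  (string-fold
   (lambda (ch h)
     (modulo (* (logxor h (char->integer ch)) 16777619) 4294967296))
   2166136261
   str))

;; WHOLE contains a non-Latin-1 character (U+03BB), so its backing
;; store may be wide; PART shares that store but holds only ASCII.
(define whole (string-append (string (integer->char 955)) "hello"))
(define part (substring/shared whole 1 6))

;; Same codepoints, same hash, whatever the internal encoding is.
(= (codepoint-hash part) (codepoint-hash "hello"))   ; => #t
```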

<wingo> https://131002.net/siphash/

<wingo>it would be nice to pre-compute hashes for strings and symbols that we residualize into object files...

<wingo>i reckon in that case we can just use a well-known seed

<wingo>of course statically creating a hash table would be nice, too :)

<wingo>prolly won't happen tho

<mark_weaver>I've been thinking about statically creating the symbol table for symbols in core guile for a while, to speed up startup.

<mark_weaver>but so far it's just a thought. never really looked into it.

<wingo>it's possible; probably the biggest gains though would be pre-computing hashes and pre-allocating variable cells

<wingo>at least according to valgrind

<mark_weaver>I'm not sure why pre-allocating the variable cells would be more important than preallocating the hash table chain cells.

<wingo>the symbol table doesn't have chain cells

<wingo>it's a weak set

<mark_weaver>oh, interesting. I must look at that code sometime.

<mark_weaver>for that matter, in order to make 'equal?' and 'write' handle cycles without a severe performance regression, I'm going to need hash tables that don't allocate anything in the common case.

*wingo gets grumpy whenever he thinks about cycles

<mark_weaver>fortunately, in both cases, the elements are removed from the hash table in LIFO order, like a stack, which makes it much simpler.

<mark_weaver>I can do the thing where I put the elements directly in the array, and if that bucket is already full, I scan until I find a free entry.
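
(Annotation, not part of the log.) A sketch of the scheme mark_weaver outlines: keys stored directly in a vector, linear scanning on collision, and removal restricted to LIFO order so that clearing a slot never breaks a probe sequence. All names are hypothetical, #f is reserved as the empty marker, and this is plain linear probing, not the robin hood variant mentioned for the weak tables.

```scheme
;; An empty set is a vector of #f; keys are compared with eq?.
(define (make-probe-set size) (make-vector size #f))

;; Add KEY. Returns #t if it was already present, #f if newly added.
(define (probe-set-add! set key)
  (let ((size (vector-length set)))
    (let loop ((i (hashq key size)) (n 0))
      (cond ((= n size) (error "probe set is full"))
            ((not (vector-ref set i)) (vector-set! set i key) #f)
            ((eq? (vector-ref set i) key) #t)
            (else (loop (modulo (+ i 1) size) (+ n 1)))))))

;; Remove KEY, which must be the most recently added key still in the
;; set. Nothing was inserted after it, so no other key's probe
;; sequence passes through its slot and the slot can simply be cleared.
(define (probe-set-remove-last! set key)
  (let ((size (vector-length set)))
    (let loop ((i (hashq key size)) (n 0))
      (cond ((= n size) #f)
            ((not (vector-ref set i)) #f)
            ((eq? (vector-ref set i) key) (vector-set! set i #f) #t)
            (else (loop (modulo (+ i 1) size) (+ n 1)))))))
```

No allocation happens on add or remove once the vector exists, which is the property wanted for cycle detection in equal? and write.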

<wingo>you might check out the weak table implementation then -- it uses an open-coded robin hood hashing scheme

<wingo>and a set might suffice

<mark_weaver>will do@

<mark_weaver>s/@/!/

<wingo>with 2/3 of the memory usage of a table

<wingo>(the weak set and table implementations store the raw hash value also)

<wingo>i guess we could have deterministic stringbuf hashes, but a string's hash or a symbol's hash would have to be mixed with a per-invocation private key

<mark_weaver>when are stringbufs hashed?

<mark_weaver>bare stringbufs, that is.

<wingo>they aren't, right now

<mark_weaver>good.

<wingo>dunno, just tossing around ideas on how to pre-compute hashes while not being a vulnerability

<mark_weaver>btw, what is the symbol table keyed on nowadays in master?

<wingo>the hash of the stringbuf ;)

<mark_weaver>I seem to recall it used to be keyed on the symbol itself, which I always thought was ridiculous.

<wingo>the hash of the stringbuf backing the string

<mark_weaver>ah, so we do hash bare stringbufs.

<wingo>or more precisely, the hash of the codepoints composing the symbol

<wingo>but that hash is not a component of a stringbuf.

<mark_weaver>oh, this needs utf8

<wingo>well, right now it goes character by character because there are different encodings

<wingo>but if we had utf-8 strings it could just hash the utf-8 bytes

<mark_weaver>yeah, that would be a big win.

<wingo>yes

<wingo>though there are utf-8 specialized string hashers in master

<wingo>just not as fast as hashing bytes

<mark_weaver>after 2.0.10 is out the door, I have two guile priorities: fixing the thread-safety of module autoloading, and utf-8 strings come after that I think.

<mark_weaver>one question: why do we have to go character by character, anyway?

<mark_weaver>s/character/codepoint/

<mark_weaver>the encoding that a string is in is deterministic, based on the contents of the string.

<wingo>shared substrings

<mark_weaver>if every character is latin-1, then it's latin-1.

<mark_weaver>oh, right :)

<wingo>:)

<wingo> http://www.python.org/dev/peps/pep-0456/

<mark_weaver>I wonder if shared substrings is actually a win in practice. somehow, I doubt it.

<wingo>depends on your use-case, i would think

<wingo>i want to do subbytevector now...

<mark_weaver>sure. if you take huge substrings of huge strings, then definitely a win. but how often is that, I wonder.

<wingo>without it, my file upload code has to store double the memory of the upload

<mark_weaver>it means an extra indirection on every string.

<mark_weaver>*nod*

<wingo>i think it's probably a win; marius took it out around 2006 or so but had to put it back in due to users complaining

<mark_weaver>okay

<wingo>and v8 has like 20 kinds of strings

<mark_weaver>yikes

<wingo>so, i don't want 20 kinds of strings, but at least one project has deemed it important enough to invest lots of time on it

<wingo>ropes, substrings, byte strings, ucs-16 strings, etc etc

<wingo>but their strings are immutable, so that's a difference

<mark_weaver>I'm going to want to allow the underlying bytes of a utf-8 string to be exposed as a bytevector.

<wingo>yeah!

<wingo>er

<wingo>wait :)

<wingo>is that a good idea?

<mark_weaver>a large number of efficient utf-8 algorithms depend on working by byte.

<wingo>ok

<mark_weaver>in fact, that's one of utf-8's greatest strengths.

<wingo>ok let's do it; probably it pays off

<civodul>hm?

<civodul>exposing the internal representation?

<wingo>the thing you lose is some type-based optimization things

<mark_weaver>yeah.

<wingo>civodul: for implementing algorithms in scheme

<civodul>i understand, but it has to remain an internal API

<mark_weaver>for example, searching can be done by bytes in utf-8. same for regexp searches.

<wingo>it's definitely a privileged function

<wingo>humm

<civodul>string->utf8 could return a COW bytevector

<wingo>we don't have COW bytevectors

<wingo>and i don't think we want them

<civodul>yes, that's the problem :-)

<mark_weaver>what about immutable bytevectors?

<civodul>prolly too expensive, yes

<wingo>i think they would make too many things slow

<wingo>mark_weaver: yes that could work

<mark_weaver>yeah, I don't know of an algorithm that needs write access to the bytes.

<wingo>it still has a runtime cost (bytevector-u8-ref not implying that bytevector-u8-set! is valid) but that's probably ok

<mark_weaver>as soon as you need to write, then you might need to change the length and that's a mess anyway.

<wingo>ok let's do immutable bytevectors then

<wingo>we can do string->utf8/read-only

<mark_weaver>cool.

<wingo>though with immutable strings it could be that the backing store is in fact mutable

<mark_weaver>well, we can arrange to make them immutable.

<wingo>but i think we can describe that adequately in the manual

<wingo>well you might want to provide read-only capabilities to a piece of memory, but that memory might change

<mark_weaver>when utf8 strings are mutated, that will be a slow path anyway.

<mark_weaver>the string will have to be broken up into blocks, and then reassembled when converted to utf-8.

<wingo>yeah

<wingo>we could pessimize string-set!

<wingo>try to do something sensible but not care too much about it

<mark_weaver>right. we had a thread on this topic years ago, and I proposed a scheme that makes 'string-set!' constant-time but with a large constant.

<civodul>mark_weaver: you had posted arguments in favor of utf-8 internally, no?

<civodul>i can't find it

<mark_weaver>yes.

<mark_weaver>let me see.

<wingo>logarithmic string-set! is fine with me too

<mark_weaver>yeah, if we went logarithmic (for string-ref too), then we could probably do things like efficient concatenation as well.

<mark_weaver>and efficient insertions/deletions.

<mark_weaver>I'll give it some thought.

<wingo>good luck

<wingo>it's a terrible design space

<mark_weaver>civodul: "O(1) accessors for UTF-8 backed strings", March 2011

<civodul> http://trac.sacrideo.us/wg/wiki/StringRepresentations

<civodul>mark_weaver: that's a message that says it's doable, not a message that says it's worthwhile :-)

<mark_weaver>yeah, I'm looking for the right message. that's not quite the right one.

<civodul>but yeah, the above has interesting arguments

<civodul>the "StringRepresentations" page

<civodul>irregex does crazy things, indeed

<mark_weaver>well, I'm having trouble finding the message where I gave the best arguments. but basically: (1) it's a single representation, so binary string operations could be optimized without handling 4 different cases, (2) it would allow us to use libunistring for many of our string operations, (3) UTF-8 operations can typically be done byte-wise without difficulty, which makes things much faster and simpler.

<wingo>is libunistring maintained?

<mark_weaver>and of course, conversion to/from utf-8 is very fast, and that's the common case these days.

<mark_weaver>I don't know, but if it isn't, that should probably be fixed, no?

<mark_weaver>unicode has a lot of intricate algorithms, and it seems worthy of a library rather than each project duplicating that work.

<mark_weaver>handling the bare utf-8 code points is simple. but things like case conversion is nasty.

<civodul>wingo: it's dormant, but not unmaintained, i'd say

<mark_weaver>and the normalization algorithms.

<civodul>mark_weaver: thanks for the list

<mark_weaver>and at some point, people will want to be able to really deal with *characters* as opposed to code points for some things.

<mark_weaver>in fact, what people really think of as characters, e.g. corresponding to a single glyph when rendered, are actually multiple code points.

<mark_weaver>it makes my head hurt to think about this stuff.

<mark_weaver>I'd rather leave it to a library :)

<mark_weaver>in fact the scheme standards kind of got it wrong to equate characters with code points.

<mark_weaver>anyway, going afk for a while. happy hacking!

<jemarch>hi

<civodul>'lo!

<tupi>hello guilers

<tupi>wingo: whenever you feel like it, could you add me to guile-gnome, thanks

<wingo>i see that your assignment came through, great

<wingo>i think that's why i didn't do it in the past

<wingo>done

<wingo>tupi: ^

<tupi>no problem, thanks!

<wingo>please ask before pushing things :)

<tupi>of course, wilco!

<tupi>wingo, guile-gnome web pages are under git as well?

<wingo>tupi: they are under cvs actually; and i think they are generated from something in the guile-gnome source tree

<wingo>ah no they are all in cvs

<wingo>generated from .scm files

<wingo>there is a makefile there

<wingo>there is also a script somewhere to update the docs

<tupi>ok, where is the cvs server ?

<wingo>see the "use cvs" link on the upper right hand side of the savannah project

<tupi>tx

<mark_weaver>R.I.P. Pete Seeger :-(

<wingo>indeed

<wingo>heh, it seems that guile unrolls a factorial loop entirely if the number is 23 or less

<wingo>,optimize (let lp ((n 23)) (if (zero? n) 1 (* n (lp (1- n)))))

<wingo>$6 = 25852016738884976640000

*wingo bows before peval

<mark_weaver>hah, nice :)

<add^_>Greetings Guilers!

<davexunit>hey add^_

<add^_>Hey davexunit :-D

<add^_>What's up?

<add^_>sneek: seen ijp

<sneek>I last saw ijp on Jan 15 at 02:10 pm UTC, saying: who is also apparently at utah.

<add^_>hm, ok, not too long ago then, just haven't seen him in quite a while.

<davexunit>add^_: not too much. just working and stuff. frustrated with corporate software development at the moment.

<add^_>Oh?

<add^_>What happened?

<add^_>err, why are you frustrated?

<dsmith-work>"corporate software development"

<davexunit>add^_: adding hacks on top of hacks to please stakeholders.

<add^_>ah

<add^_>Sounds "meh".

<add^_>I get the geist

<add^_>maybe it's "gist" in English

<davexunit>I have to implement something far more complicated than necessary for the sake of small user experience complaints.

<add^_>:-S

<add^_>That sucks :-/

<add^_>Oh well...

<add^_>Life in corporations is like that I suppose..

<davexunit>eyah

<davexunit>yeah*

<add^_>On another note, I'll be buying a Kinesis keyboard soon :-D

<add^_>Well, most likely

<davexunit>oh neat. I hear those are good for RSI

<add^_>Yes

<add^_>Well

<add^_>For not getting it xD

<davexunit>yeah

<add^_>I wonder how hard it'll be to get used to emacs with that :-P

<add^_>A friend who has one says that it would probably take a month of use to get used to (just the keyboard)

<add^_>not with emacs

<add^_>Sounds pretty much like getting used to dvorak..

<add^_>But dvorak was annoying in emacs for me. Although I've heard people who thinks emacs is even easier with dvorak...

<davexunit>I haven't tried dvorak or any other keyboard layout

<add^_>If you already can touchtype or whatever it's called, on qwerty, it's probably not necessary to switch...

<taylanub>Go with Colemak if you'll switch. Less different than Dvorak, and just as good or a tiny bit better.

<add^_>:-P

<add^_>taylanub: is that what you use?

<taylanub>yes

<add^_>Cool

<add^_>So, what's the logic behind where the keys are placed?

<taylanub>Not sure, some complicated algorithms and hand-tuning.

<add^_>It seems like all the vowels are on homerow, like on dvorak

<add^_>which is good

<taylanub>(Of course, like any algorithm, they're only complicated until you know them.)

<add^_>oh, nope, not all of them

<add^_>lol

<add^_>I think there are algorithms that are still complicated even when you know them :-P

<taylanub>I once made a script for an IRC client, I think it was WeeChat, that printed the number of home-row keys for QWERTY, Dvorak, and Colemak, at the end of the sent line, for every line I sent.

<taylanub>IIRC Colemak mostly had a small edge over Dvorak, and QWERTY a huge gap behind.

<taylanub>(Of course home-row hits are just one very specific criteria.)

<add^_>Well, it's an important one

<add^_>I don't feel like trying yet another keyboard layout though :-/ Maybe I should..

<taylanub>Colemak goes well with Emacs BTW, from what I can tell, though I never used Emacs with anything else than Colemak so I have no comparison.

<taylanub>Also, w and f are aside, so don't accidentally C-x C-w a file instead of C-x C-f'ing it!!

<add^_>lol

<add^_>There will always be those kinds of things ;-)

<taylanub>Yeah, it actually prompts for a confirmation, you have to spell out y e s and hit RET, for some reason I did it as a reflex that one time. :\

<add^_>Ouch

<add^_>lol

<add^_>well, testing out dvorak again right now... Takes forever to type

<add^_>lol

<add^_>for me

<add^_>I think I'll get used to it again soon enough

<add^_>yeah

<add^_>oops, miss chatt

<add^_>..

2014-01-28.log