IRC channel logs

2014-01-28.log


***sneek_ is now known as sneek
<nalaginrut>morning guilers~
<b4283>salut
<nalaginrut>heya
<lloda>morning
<lloda>wingo, could you have a look at the array patches?
<wingo>morning!
<wingo>lloda: i started looking at them last night
<wingo>i got a few patches in, will continue tonight
<lloda>great :D
<civodul>Hello Guilers!
<wingo>morning civodul :)
<lloda>wingo: there's this incompatible change where I made vector = simple vector. I hesitated, but I pushed forward b/c I couldn't get much agreement on the ml. You be the judge.
<wingo>ok, will do
<lloda>and thanks!
<wingo>thank you and again, apologies for the delay!
<mark_weaver>I also tried to review your patches long ago, but I got hung up because of many changes that I wasn't sure about.
<b4283>i was just using the assoc-list api, and got stuck because i thought assq-set! always mutates the list
<mark_weaver>I don't remember the details, but at least some of the changes seemed questionable to me.
<lloda>they probably are, I don't disagree
<b4283>is it intentional that users must the common gateway "set!" because of complier optimization reasons?
<b4283>must *use*
<mark_weaver>b4283: well, for starters, there's no way to mutate an empty list.
<mark_weaver>and if there's no matching association to mutate, then it gets added to the front, which again cannot be done by mutation.
<mark_weaver>so it's not really about compiler optimization at all.
<mark_weaver>it's just due to the nature of scheme lists. you must always 'set!'.
<b4283>it's confusing to have a ! in assq-set
<b4283>but i guess that's required by r?rs anyways
<mark_weaver>well, it does mutate the existing alist in some cases, so it's needed.
<mark_weaver>if the association is already in the list, then it's mutated.
<mark_weaver>if it's not, then there's nothing to mutate. it's just added to the front.
<b4283>mark_weaver: i get it, thanks for the explanation
<mark_weaver>np!
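The behaviour mark_weaver describes can be checked at a Guile REPL. This is a minimal sketch, assuming a stock Guile where `assq-set!` and `assq-ref` need no extra imports; the variable names `settings` and `other` are made up for illustration.

```scheme
;; Start from an empty alist: there is no pair to mutate yet.
(define settings '())

;; Key absent: assq-set! returns a new list with the pair added at the
;; front, so the result has to be stored back with set!.
(set! settings (assq-set! settings 'color 'red))

;; Key present: the existing pair is mutated in place and the same list
;; is returned. Using set! here is harmless and keeps the idiom uniform.
(set! settings (assq-set! settings 'color 'blue))

(assq-ref settings 'color)   ; blue

;; Without set!, the new association is lost.
(define other '())
(assq-set! other 'size 10)   ; return value discarded
other                        ; still ()
```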
<mark_weaver>lloda: can you explain the nature of the incompatibility? I don't quite know what you mean by "vector = simple vector". I understand the distinction, but I don't know what that equation means.
<mark_weaver>what procedures are affected?
<mark_weaver>(in the public API, that is)
<lloda>it'll take me a minute to recall it
<wingo>i can push an up-to-date patchset...
<mark_weaver>lloda: okay, thanks
<mark_weaver>b4283: we probably should make it more clear in the manual.
<wingo>mark_weaver, lloda: lloda-array-cleanup in master
<mark_weaver>we should probably emphasize that you should always use 'set!' in combination with those destructive alist procedures.
<mark_weaver>wingo: thanks, I'll take a look tomorrow (need to sleep soon).
<wingo>i removed the patches to compile-assembly.scm; they might need corresponding fixes to the new compiler, who knows
<wingo>i am relying on the tests to help me figure that out
<wingo>s/master/savannah git/ :)
<lloda>yes, iirc those things are tested.
<wingo>lloda: do you have commit access?
<lloda>I don't
<wingo>hum, we should flip that bit anyway
<lloda>the changes to compile-assembly.scm were in reading array literals.
<lloda>wingo: I'll have a look
<mark_weaver>I have patches in r7rs-wip for compile-assembly.scm to handle cyclic literals.
<wingo>lloda: go into savannah if you would and request to be added to the guile group
<lloda>will do, thx
<wingo>that will at least let you push branches to savannah, which is easier for everyone :)
<wingo>and i think bugfixes are welcome as well, though mark or ludo will correct me ;) just send them to the list and if you get no response within a week or so, push them
<wingo>mark_weaver: cyclic literals, interesting; i wonder if master can handle those...
<mark_weaver>yeah, I was curious about that too.
<wingo>should be an easy fix, if not... the linker should handle it almost automagically
<lloda>wingo: request sent.
<wingo>lloda: done, thanks!
<lloda>mark_weaver: simple vectors are like bytevectors or uniform vectors, but for the SCM type.
<mark_weaver>lloda: I don't know what that means.
<wingo>they are one-dimensional packed arrays of SCM values
<b4283>mark_weaver: the manual already clarified that "the only safe way to use it is through set!", just that i missed it in the first place :/
<wingo>there is SCM_IS_SIMPLE_VECTOR, etc...
<lloda>right
<lloda>it's a unique type.
<wingo>and they have their own tc7
<wingo>yes
<lloda>however, 'vector's can be simple vectors or certain kinds of arrays also
<lloda>so whenever you use a vector-ref, vector? etc, checks are made to see if the array passes as a vector 'functionally'
<lloda>I mean, if it's an array, and if that array passes as a vector
<lloda>so 'vector's are not a unique type.
<wingo>lloda: what was your thinking when you made vector-ref only work on simple vectors?
<mark_weaver>so your patches would make the 'vector-ref', 'vector-set!', and maybe 'vector?' procedures work only on simple vectors, not arrays, is that right?
<lloda>yes.
<mark_weaver>I think that's a very good thing.
<wingo>are you treating only the array interface as the all-singing polymorphic interface?
<lloda>the thinking was that vector = uniform-vector = bytevector, just with a different element type.
<wingo>buf, confusing question :)
<mark_weaver>it means that we can generate much simpler code for the vector ops.
<wingo>yep
<lloda>singing and polymorphic go well together
<mark_weaver>especially when we have native compilation, that will be good.
<lloda>and yes.
<wingo>interesting
<lloda>array is then the only polymorphic type.
<wingo>i've never known when any of those things should be polymorphic, and that sounds like a fine rule to me
<mark_weaver>so the array operations will work on vectors, but not vice versa, right?
<lloda>yes.
<mark_weaver>that sounds great.
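The asymmetry agreed on here (array operations accept vectors, but not the other way round) can be illustrated from the array side. This is a sketch using standard Guile array procedures; the names `v` and `view` are made up. Whether `vector-ref` accepts `view` depends on the Guile version and on the patches under discussion, so the sketch leaves that call out.

```scheme
;; A simple vector: a packed one-dimensional sequence of SCM values.
(define v (vector 'a 'b 'c 'd))

;; The array interface accepts it, since every vector is also an array.
(array? v)           ; #t
(array-ref v 1)      ; b

;; A strided view over V that picks every second element. It is a
;; rank-1 array sharing storage with V, but it is not a simple vector.
(define view (make-shared-array v (lambda (i) (list (* 2 i))) 2))

(array-rank view)    ; 1
(array-ref view 1)   ; c
(array->list view)   ; (a c)
```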
<mark_weaver>and now I'm trying to remember what I found questionable. I guess I'll have to look through the patches again :)
<mark_weaver>lloda: well, thanks for your patience on this. I'm truly sorry that you've had to wait so long. it's just a daunting review job, that's all.
<lloda>nah, you're right it's messy code.
<mark_weaver>I'm going to try to take a close look in the next week though.
<wingo>mark_weaver: do you want to take the review?
<wingo>i was going to start on it but we shouldn't duplicate work
<mark_weaver>well, it's probably good for both of us to review it.
<wingo>as you like, it doesn't matter to me
<mark_weaver>please do review it, wingo. but also please give me a chance to review it before pushing, if you don't mind.
<wingo>mark_weaver: ok, i'll hold off functional changes for more review, but i will push bug-fixes and test things and similar
<mark_weaver>sounds good.
<wingo>still it seems like double-review is too much, but whatever
<mark_weaver>well, I think you're probably more familiar with that area of the code. but at the same time, I want to remember what I found questionable. I might not do a full review.
<mark_weaver>it's possible that I had 'stable-2.0' too much in mind when I tried to review it last time, and that the vector==simple-vector thing worried me. I hope that's the case.
*mark_weaver --> zzz
<wingo>sleep well :)
<b4283>there's a nice song about sleeping well
<civodul>lloda: welcome to the Savannah group ;-)
<lloda>thanks! :)
<lloda>I've built lloda-array-cleanup. make check passes. I've tested my programs against it. Everything seems to work except for this message:
<lloda>;;; WARNING: compilation of [...]
<lloda>;;; ERROR: don't know how to intern #2f64()
<lloda>similar errors for other literals.
<lloda>I was on 2.0.9 before, so this is new to me.
***DerGuteM1 is now known as DerGuteMoritz
<civodul>"don't know how to intern"?!
<civodul>just add it to the symbol hash table!
<civodul>:-)
<civodul>that's on master?
<lloda>it's lloda-array-cleanup which is on top of master. I want to blame master b/c I had the extra patches before on top of 2.0.9 and not this issue. But I haven't tested master yet.
<wingo>yes, probably the new compiler doesn't handle that for whatever reason
<wingo>also there is no symbol hash table in master; only a weak set :)
<civodul>wingo: right, but still, it does know how to intern things, doesn't it? :-)
<wingo>yes :)
<civodul>heheh
<wingo>but as in stable-2.0, symbols are gc'd, so it's not really interning i guess
<wingo>dunno
<wingo>hard to tell :)
<jmd>Calling (link "x/y/z" "w/x/y/z") I get the error ERROR: In procedure link:
<jmd>ERROR: No such file or directory
<jmd>which file or directory does it think does not exist?
<civodul>you can't tell
<civodul>the syscall doesn't provide more info
<jmd>Oh. Then is there a mkdir which will create the necessary subdirs?
<b4283>mkdir -p
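b4283's answer is the shell command. From Scheme, the same effect can be had with a small recursive helper. This is a sketch only: `mkdir-p` is a hypothetical name defined here, and the sketch assumes no such helper is already in scope. It uses `dirname`, `file-exists?` and `mkdir`.

```scheme
(define (mkdir-p dir)
  ;; Hypothetical helper: create DIR and any missing parent directories.
  ;; Not robust against races, nor against a non-directory file that
  ;; already sits somewhere along the path.
  (unless (or (member dir '("" "." "/"))
              (file-exists? dir))
    (mkdir-p (dirname dir))
    (mkdir dir)))
```

For jmd's case, calling `(mkdir-p (dirname "w/x/y/z"))` before `(link "x/y/z" "w/x/y/z")` rules out a missing target directory as the cause; if the error persists, the source path is the one that does not exist.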
<wingo> http://blog.frama-c.com/index.php?post/2013/02/26/Portable-arithmetic-operations
<wingo> http://en.wikipedia.org/wiki/SipHash
<wingo>i wonder if we should use that
<wingo>probably so.
<civodul>systemd and Rust have it, so i guess we must
<wingo>hehe
<wingo>it's to prevent hash collision attacks
<wingo>if we switch to utf-8 strings we can hash directly over the utf-8 bytes
<wingo>right now we have to be careful to hash over codepoints, since a given string can have multiple representations
<wingo> https://131002.net/siphash/
<wingo>it would be nice to pre-compute hashes for strings and symbols that we residualize into object files...
<wingo>i reckon in that case we can just use a well-known seed
<wingo>of course statically creating a hash table would be nice, too :)
<wingo>prolly won't happen tho
<mark_weaver>I've been thinking about statically creating the symbol table for symbols in core guile for a while, to speed up startup.
<mark_weaver>but so far it's just a thought. never really looked into it.
<wingo>it's possible; probably the biggest gains though would be pre-computing hashes and pre-allocating variable cells
<wingo>at least according to valgrind
<mark_weaver>I'm not sure why pre-allocating the variable cells would be more important than preallocating the hash table chain cells.
<wingo>the symbol table doesn't have chain cells
<wingo>it's a weak set
<mark_weaver>oh, interesting. I must look at that code sometime.
<mark_weaver>for that matter, in order to make 'equal?' and 'write' handle cycles without a severe performance regression, I'm going to need hash tables that don't allocate anything in the common case.
*wingo gets grumpy whenever he thinks about cycles
<mark_weaver>fortunately, in both cases, the elements are removed from the hash table in LIFO order, like a stack, which makes it much simpler.
<mark_weaver>I can do the thing where I put the elements directly in the array, and if that bucket is already full, I scan until I find a free entry.
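The idea in the last two messages can be sketched as a fixed-size open-addressing set with linear probing. Because elements leave in LIFO order, the most recently added element can be removed by simply clearing its slot: no earlier element's probe sequence passed through that slot, since it was still empty when they were inserted. This is only an illustration, not the code mark_weaver had in mind. All names (`make-seen-set`, `seen-slot`, `seen?`, `seen-push!`, `seen-pop!`, `has-cycle?`) are made up, the capacity is fixed with no resizing, and `#f` is assumed never to be stored as an element.

```scheme
(define (make-seen-set size)
  ;; One flat vector; #f marks a free slot. Nothing else is allocated.
  (make-vector size #f))

(define (seen-slot table obj)
  ;; Index of OBJ's slot, or of the first free slot on its probe path.
  (let ((size (vector-length table)))
    (let probe ((i (hashq obj size)) (tries 0))
      (let ((entry (and (< tries size) (vector-ref table i))))
        (cond ((= tries size) (error "seen-set is full"))
              ((or (not entry) (eq? entry obj)) i)
              (else (probe (modulo (+ i 1) size) (+ tries 1))))))))

(define (seen? table obj)
  (eq? (vector-ref table (seen-slot table obj)) obj))

(define (seen-push! table obj)
  (vector-set! table (seen-slot table obj) obj))

(define (seen-pop! table obj)
  ;; Only valid when OBJ is the most recently pushed element still
  ;; present (LIFO order); then clearing its slot breaks no probe path.
  (vector-set! table (seen-slot table obj) #f))

(define (has-cycle? obj)
  ;; Walk pairs depth-first, keeping the pairs on the current path in
  ;; the set. Meeting one of them again means the structure is cyclic.
  (let ((table (make-seen-set 64)))
    (let walk ((x obj))
      (cond ((not (pair? x)) #f)
            ((seen? table x) #t)
            (else
             (seen-push! table x)
             (let ((result (or (walk (car x)) (walk (cdr x)))))
               (seen-pop! table x)
               result))))))
```

Shared but acyclic structure is not reported, because each pair is popped again when the walk leaves it.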
<wingo>you might check out the weak table implementation then -- it uses an open-coded robin hood hashing scheme
<wingo>and a set might suffice
<mark_weaver>will do@
<mark_weaver>s/@/!/
<wingo>with 2/3 of the memory usage of a table
<wingo>(the weak set and table implementations store the raw hash value also)
<wingo>i guess we could have deterministic stringbuf hashes, but a string's hash or a symbol's hash would have to be mixed with a per-invocation private key
<mark_weaver>when are stringbufs hashed?
<mark_weaver>bare stringbufs, that is.
<wingo>they aren't, right now
<mark_weaver>good.
<wingo>dunno, just tossing around ideas on how to pre-compute hashes while not being a vulnerability
<mark_weaver>btw, what is the symbol table keyed on nowadays in master?
<wingo>the hash of the stringbuf ;)
<mark_weaver>I seem to recall it used to be keyed on the symbol itself, which I always thought was ridiculous.
<wingo>the hash of the stringbuf backing the string
<mark_weaver>ah, so we do hash bare stringbufs.
<wingo>or more precisely, the hash of the codepoints composing the symbol
<wingo>but that hash is not a component of a stringbuf.
<mark_weaver>oh, this needs utf8
<wingo>well, right now it goes character by character because there are different encodings
<wingo>but if we had utf-8 strings it could just hash the utf-8 bytes
<mark_weaver>yeah, that would be a big win.
<wingo>yes
<wingo>though there are utf-8 specialized string hashers in master
<wingo>just not as fast as hashing bytes
<mark_weaver>after 2.0.10 is out the door, I have two guile priorities: fixing the thread-safety of module autoloading, and utf-8 strings come after that I think.
<mark_weaver>one question: why do we have to go character by character, anyway?
<mark_weaver>s/character/codepoint/
<mark_weaver>the encoding that a string is in is deterministic, based on the contents of the string.
<wingo>shared substrings
<mark_weaver>if every character is latin-1, then it's latin-1.
<mark_weaver>oh, right :)
<wingo>:)
<wingo> http://www.python.org/dev/peps/pep-0456/
<mark_weaver>I wonder if shared substrings is actually a win in practice. somehow, I doubt it.
<wingo>depends on your use-case, i would think
<wingo>i want to do subbytevector now...
<mark_weaver>sure. if you take huge substrings of huge strings, then definitely a win. but how often is that, I wonder.
<wingo>without it, my file upload code has to store double the memory of the upload
<mark_weaver>it means an extra indirection on every string.
<mark_weaver>*nod*
<wingo>i think it's probably a win; marius took it out around 2006 or so but had to put it back in due to users complaining
<mark_weaver>okay
<wingo>and v8 has like 20 kinds of strings
<mark_weaver>yikes
<wingo>so, i don't want 20 kinds of strings, but at least one project has deemed it important enough to invest lots of time on it
<wingo>ropes, substrings, byte strings, ucs-16 strings, etc etc
<wingo>but their strings are immutable, so that's a difference
<mark_weaver>I'm going to want to allow the underlying bytes of a utf-8 string to be exposed as a bytevector.
<wingo>yeah!
<wingo>er
<wingo>wait :)
<wingo>is that a good idea?
<mark_weaver>a large number of efficient utf-8 algorithms depend on working by byte.
<wingo>ok
<mark_weaver>in fact, that's one of utf-8's greatest strengths.
<wingo>ok let's do it; probably it pays off
<civodul>hm?
<civodul>exposing the internal representation?
<wingo>the thing you lose is some type-based optimization things
<mark_weaver>yeah.
<wingo>civodul: for implementing algorithms in scheme
<civodul>i understand, but it has to remain an internal API
<mark_weaver>for example, searching can be done by bytes in utf-8. same for regexp searches.
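A sketch of the byte-wise search mark_weaver mentions. UTF-8 is self-synchronizing, so the encoding of a valid needle can only match at a character boundary of the haystack, and no decoding is needed. Guile does not expose a string's internal bytes, so this sketch copies both strings with `string->utf8` from `(rnrs bytevectors)`; it illustrates the algorithm, not the performance gain. `utf8-search` is a made-up name, and it returns a byte offset, not a character index.

```scheme
(use-modules (rnrs bytevectors))

(define (utf8-search haystack needle)
  ;; Byte offset of the first occurrence of NEEDLE in HAYSTACK, or #f.
  (let* ((h (string->utf8 haystack))
         (n (string->utf8 needle))
         (hlen (bytevector-length h))
         (nlen (bytevector-length n)))
    (let outer ((start 0))
      (cond ((> (+ start nlen) hlen) #f)
            ;; Compare byte by byte from START; succeed once all of
            ;; NEEDLE's bytes have matched.
            ((let inner ((j 0))
               (or (= j nlen)
                   (and (= (bytevector-u8-ref h (+ start j))
                           (bytevector-u8-ref n j))
                        (inner (+ j 1)))))
             start)
            (else (outer (+ start 1)))))))

;; "na" + U+00EF + "ve cafe", built with integer->char so the source
;; file's encoding does not matter. U+00EF takes two bytes in UTF-8.
(define text (string-append "na" (string (integer->char #xEF)) "ve cafe"))

(utf8-search text "cafe")    ; 7 (byte offset; the character index is 6)
(utf8-search text "tea")     ; #f
```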
<wingo>it's definitely a privileged function
<wingo>humm
<civodul>string->utf8 could return a COW bytevector
<wingo>we don't have COW bytevectors
<wingo>and i don't think we want them
<civodul>yes, that's the problem :-)
<mark_weaver>what about immutable bytevectors?
<civodul>prolly too expensive, yes
<wingo>i think they would make too many things slow
<wingo>mark_weaver: yes that could work
<mark_weaver>yeah, I don't know of an algorithm that needs write access to the bytes.
<wingo>it still has a runtime cost (bytevector-u8-ref not implying that bytevector-u8-set! is valid) but that's probably ok
<mark_weaver>as soon as you need to write, then you might need to change the length and that's a mess anyway.
<wingo>ok let's do immutable bytevectors then
<wingo>we can do string->utf8/read-only
<mark_weaver>cool.
<wingo>though with immutable strings it could be that the backing store is in fact mutable
<mark_weaver>well, we can arrange to make them immutable.
<wingo>but i think we can describe that adequately in the manual
<wingo>well you might want to provide read-only capabilities to a piece of memory, but that memory might change
<mark_weaver>when utf8 strings are mutated, that will be a slow path anyway.
<mark_weaver>the string will have to be broken up into blocks, and then reassembled when converted to utf-8.
<wingo>yeah
<wingo>we could pessimize string-set!
<wingo>try to do something sensible but not care too much about it
<mark_weaver>right. we had a thread on this topic years ago, and I proposed a scheme that makes 'string-set!' constant-time but with a large constant.
<civodul>mark_weaver: you had posted arguments in favor of utf-8 internally, no?
<civodul>i can't find it
<mark_weaver>yes.
<mark_weaver>let me see.
<wingo>logarithmic string-set! is fine with me too
<mark_weaver>yeah, if we went logarithmic (for string-ref too), then we could probably do things like efficient concatenation as well.
<mark_weaver>and efficient insertions/deletions.
<mark_weaver>I'll give it some thought.
<wingo>good luck
<wingo>it's a terrible design space
<mark_weaver>civodul: "O(1) accessors for UTF-8 backed strings", March 2011
<civodul> http://trac.sacrideo.us/wg/wiki/StringRepresentations
<civodul>mark_weaver: that's a message that says it's doable, not a message that says it's worthwhile :-)
<mark_weaver>yeah, I'm looking for the right message. that's not quite the right one.
<civodul>but yeah, the above has interesting arguments
<civodul>the "StringRepresentations" page
<civodul>irregex does crazy things, indeed
<mark_weaver>well, I'm having trouble finding the message where I gave the best arguments. but basically: (1) it's a single representation, so binary string operations could be optimized without handling 4 different cases, (2) it would allow us to use libunistring for many of our string operations, (3) UTF-8 operations can typically be done byte-wise without difficulty, which makes things much faster and simpler.
<wingo>is libunistring maintained?
<mark_weaver>and of course, conversion to/from utf-8 is very fast, and that's the common case these days.
<mark_weaver>I don't know, but if it isn't, that should probably be fixed, no?
<mark_weaver>unicode has a lot of intricate algorithms, and it seems worthy of a library rather than each project duplicating that work.
<mark_weaver>handling the bare utf-8 code points is simple. but things like case conversion are nasty.
<civodul>wingo: it's dormant, but not unmaintained, i'd say
<mark_weaver>and the normalization algorithms.
<civodul>mark_weaver: thanks for the list
<mark_weaver>and at some point, people will want to be able to really deal with *characters* as opposed to code points for some things.
<mark_weaver>in fact, what people really think of as characters, e.g. corresponding to a single glyph when rendered, are actually multiple code points.
<mark_weaver>it makes my head hurt to think about this stuff.
<mark_weaver>I'd rather leave it to a library :)
<mark_weaver>in fact the scheme standards kind of got it wrong to equate characters with code points.
<mark_weaver>anyway, going afk for a while. happy hacking!
<jemarch>hi
<civodul>'lo!
<tupi>hello guilers
<tupi>wingo: whenever you feel like it, could you add me to guile-gnome, thanks
<wingo>i see that your assignment came through, great
<wingo>i think that's why i didn't do it in the past
<wingo>done
<wingo>tupi: ^
<tupi>no problem, thanks!
<wingo>please ask before pushing things :)
<tupi>of course, wilco!
<tupi>wingo, guile-gnome web pages are under git as well?
<wingo>tupi: they are under cvs actually; and i think they are generated from something in the guile-gnome source tree
<wingo>ah no they are all in cvs
<wingo>generated from .scm files
<wingo>there is a makefile there
<wingo>there is also a script somewhere to update the docs
<tupi>ok, where is the cvs server ?
<wingo>see the "use cvs" link on the upper right hand side of the savannah project
<tupi>tx
<mark_weaver>R.I.P. Pete Seeger :-(
<wingo>indeed
<wingo>heh, it seems that guile unrolls a factorial loop entirely if the number is 23 or less
<wingo>,optimize (let lp ((n 23)) (if (zero? n) 1 (* n (lp (1- n)))))
<wingo>$6 = 25852016738884976640000
*wingo bows before peval
<mark_weaver>hah, nice :)
<add^_>Greetings Guilers!
<davexunit>hey add^_
<add^_>Hey davexunit :-D
<add^_>What's up?
<add^_>sneek: seen ijp
<sneek>I last saw ijp on Jan 15 at 02:10 pm UTC, saying: who is also apparently at utah.
<add^_>hm, ok, not too long ago then, just haven't seen him in quite a while.
<davexunit>add^_: not too much. just working and stuff. frustrated with corporate software development at the moment.
<add^_>Oh?
<add^_>What happened?
<add^_>err, why are you frustrated?
<dsmith-work>"corporate software development"
<davexunit>add^_: adding hacks on top of hacks to please stakeholders.
<add^_>ah
<add^_>Sounds "meh".
<add^_>I get the geist
<add^_>maybe it's "gist" in English
<davexunit>I have to implement something far more complicated than necessary for the sake of small user experience complaints.
<add^_>:-S
<add^_>That sucks :-/
<add^_>Oh well...
<add^_>Life in corporations is like that I suppose..
<davexunit>eyah
<davexunit>yeah*
<add^_>On another note, I'll be buying a Kinesis keyboard soon :-D
<add^_>Well, most likely
<davexunit>oh neat. I hear those are good for RSI
<add^_>Yes
<add^_>Well
<add^_>For not getting it xD
<davexunit>yeah
<add^_>I wonder how hard it'll be to get used to emacs with that :-P
<add^_>A friend who has one says that it would probably take a month of use to get used to (just the keyboard)
<add^_>not with emacs
<add^_>Sounds pretty much like getting used to dvorak..
<add^_>But dvorak was annoying in emacs for me. Although I've heard people who think emacs is even easier with dvorak...
<davexunit>I haven't tried dvorak or any other keyboard layout
<add^_>If you already can touchtype or whatever it's called, on qwerty, it's probably not necessary to switch...
<taylanub>Go with Colemak if you'll switch. Less different than Dvorak, and just as good or a tiny bit better.
<add^_>:-P
<add^_>taylanub: is that what you use?
<taylanub>yes
<add^_>Cool
<add^_>So, what's the logic behind where the keys are placed?
<taylanub>Not sure, some complicated algorithms and hand-tuning.
<add^_>It seems like all the vowels are on homerow, like on dvorak
<add^_>which is good
<taylanub>(Of course, like any algorithm, they're only complicated until you know them.)
<add^_>oh, nope, not all of them
<add^_>lol
<add^_>I think there are algorithms that are still complicated even when you know them :-P
<taylanub>I once made a script for an IRC client, I think it was WeeChat, that printed the number of home-row keys for QWERTY, Dvorak, and Colemak, at the end of the sent line, for every line I sent.
<taylanub>IIRC Colemak mostly had a small edge over Dvorak, and QWERTY a huge gap behind.
<taylanub>(Of course home-row hits are just one very specific criterion.)
<add^_>Well, it's an important one
<add^_>I don't feel like trying yet another keyboard layout though :-/ Maybe I should..
<taylanub>Colemak goes well with Emacs BTW, from what I can tell, though I never used Emacs with anything else than Colemak so I have no comparison.
<taylanub>Also, w and f are aside, so don't accidentally C-x C-w a file instead of C-x C-f'ing it!!
<add^_>lol
<add^_>There will always be those kinds of things ;-)
<taylanub>Yeah, it actually prompts for a confirmation, you have to spell out y e s and hit RET, for some reason I did it as a reflex that one time. :\\
<add^_>Ouch
<add^_>lol
<add^_>well, testing out dvorak again right now... Takes forever to type
<add^_>lol
<add^_>for me
<add^_>I think I'll get used to it again soon enough
<add^_>yeah
<add^_>oops, miss chatt
<add^_>..