IRC channel logs

2017-03-02.log

***logicmoo is now known as dmiles
<wingo>yo
<wleslie>sup?
<wleslie>guix for the release is a great idea, btw
<lloda`>wingo: so I've reduced the other day's profile to
<lloda`>  %     cumulative    self
<lloda`> time    seconds     seconds    calls   procedure
<lloda`> 42.28      1.90        1.83   2000000  vector-ref
<lloda`> 19.85      0.94        0.86   1000000  *
<lloda`> 19.85      0.86        0.86            anon #x7f27fe1788c0
<lloda`> 18.01      0.88        0.78   1000000  vector-set!
<lloda`>
<lloda`>this looks good but I can't find what #x7f27fe1788c0 is.
<lloda`>#:display-style 'tree shows it being called by all vector-ref, vector-set!
<lloda`>and also * and the top function. everything calls it, it seems.
<lloda`>the disassembly shows a bunch of 'anonymous procedure at #x...' but not the address above.
<wingo>looks like vector-ref / vector-set aren't being inlined
<wingo>also * isn't being inlined
<wingo>are you using apply to inline vector-ref etc? or doing some kind of table-based dispatch?
<lloda`>the calls are ((struct-ref ra n) ...) where (struct-ref ra n) is vector-ref in this case
<wingo>oddly you will do better with a manual eta conversion :/
<wingo>vector-ref -> (lambda (v n) (vector-ref v n))
<wingo>i think anyway
<lloda`>ahh
<lloda`>I'll try that, thx
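
A minimal sketch of the manual eta conversion wingo describes, in Guile Scheme. The names ref-direct, ref-eta and sum-with are hypothetical stand-ins for whatever (struct-ref ra n) returns in lloda's code. Both forms compute the same result; whether the wrapped form is actually faster depends on the compiler and the call site, and wingo himself hedges the suggestion.

```scheme
;; Hypothetical accessors: the primitive passed as a plain value,
;; and the same primitive wrapped in a lambda (manual eta conversion).
(define ref-direct vector-ref)
(define ref-eta (lambda (v n) (vector-ref v n)))

;; A loop that receives its accessor as an argument, so the call
;; (ref v i) goes through a procedure value, not a known primitive.
(define (sum-with ref v)
  (let loop ((i 0) (acc 0))
    (if (= i (vector-length v))
        acc
        (loop (+ i 1) (+ acc (ref v i))))))

(sum-with ref-direct (vector 1 2 3))
(sum-with ref-eta (vector 1 2 3))
```
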
<wingo>holy crap my blog spam rate is enormous
<lloda`>wingo: the conversion didn't really work. I mean now I have the lambdas in the profile instead of the vector-*, but it looks much the same, same anon too.
<lloda`>sorry about the spam :-/
<wingo>hum
<wingo>just means i have to write a classifier in scheme :)
<wingo>i have a good collection of spam & ham
<lloda`>so I have written most of the array functions in Scheme and it isn't that bad. array-slice-for-each is faster in Scheme. array-map! is slower by a factor of 2 to 8 depending on how many arguments there are. array-copy! is 300 times slower, but the C uses memcpy :p.
<lloda`>This is with the type dispatch done at the last moment, as above (I could move it out of the loop with more macros) but that's how C does it, anyway
<lloda`>but what do you mean with a classifier?
<lloda`>ah, the spam ;p
<wleslie>yees. I was going to comment that the basic observation about the promises/callback approach isn't really about performance at all, but then I decided that was about as relevant to the post as the many comments about BMI and facebook likes.
<wleslie>no use picking nits in a hurd of camels.
<wingo>i have locally reverted around 3K comments, i think there are still some 1-2K left
<avoine>ecraven: may I ask what are you using to produce the benchmark.html file in your benchmark?
<paroneayea>do I remember there being a deduplicate procedure baked into guile?
<davexunit>paroneayea: delete-duplicates
<davexunit>srfi-1 I think?
<paroneayea>davexunit: aha, thanks :)
<paroneayea>I was looking through srfi-1 but I must have passed over it
<davexunit>np!
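
For reference, a short sketch of delete-duplicates from SRFI-1 in Guile. The sample lists are made up for illustration. The procedure keeps the first occurrence of each element and compares with equal? unless another predicate is passed.

```scheme
(use-modules (srfi srfi-1))

;; First occurrences are kept, in their original order.
(define unique-numbers (delete-duplicates '(1 2 1 3 2)))

;; An optional second argument supplies the equality predicate;
;; string-ci=? treats "Guile" and "guile" as duplicates.
(define unique-words
  (delete-duplicates '("Guile" "guile" "scheme") string-ci=?))
```
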
<ArneBab_>we might want to be here: http://stackoverflow.com/documentation/scheme/851/getting-started-with-scheme#t=201703021821098898047
<wingo>moo
<amz3>o/
<OrangeShark>o/
<snape>is a tuple the same thing as a dotted pair?
<mejja>a dotted pair is a cons cell
<snape>I could not find any Lisp/scheme official writing about it
<wingo>tuple isn't really a scheme concept
<wingo>there are a few data structures you can use to compose values
<wingo>if you don't want that data structure to have a type and you have 2 values then a pair is ok
<wingo>otherwise a list or a vector are usual
<wingo>a disjoint type, i mean
<snape>I have two values, each of them being a string
<snape>see https://lists.gnu.org/archive/html/guix-devel/2017-03/msg00051.html
<snape>I'm not sure whether I should choose cons or list.
<wingo>choose cons if it's just two. choose vector if >2 and number is fixed. choose list otherwise
<wingo>or choose vector for 2; that's fine too
<wingo>cons is a little weird :)
<snape>got it. Thanks you :)
<snape>*Thank
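
A small sketch of the three choices wingo lists, using made-up string values. All names here are hypothetical. It also shows one reason a pair can feel "a little weird": a dotted pair of two strings is a pair but not a proper list.

```scheme
;; Exactly two values: a pair (dotted pair / cons cell).
(define entry (cons "name" "value"))
(define entry-key (car entry))
(define entry-val (cdr entry))

;; A fixed number of values greater than two: a vector.
(define triple (vector "name" "value" "comment"))
(define triple-last (vector-ref triple 2))

;; A varying number of values: a list.
(define items (list "a" "b" "c" "d"))
(define item-count (length items))

;; The pair is not a proper list, because its cdr is a string
;; and not another pair or the empty list.
(define entry-is-pair (pair? entry))
(define entry-is-list (list? entry))
```
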
<wingo>neat, i just trained a bayesian classifier
<mthl>wingo: what does your classifier do? :)
<wingo>tells whether a blog comment is spam :)
<amz3>:)
<mthl>wingo: I am curious, how did you get a training set?
<wingo>mthl: from my own blog comments, each comment is a git commit, i have to revert the spam
<wingo>so i know the reverted ones are spam; there are still some spam in the kept set but i hope to use this classifier to identify them too
<wingo>only a few thousand on either side tho so i don't know how well this will work
<mthl>nice :)
<mthl>I guess you used Guile for this?
<wingo>yeah
<wingo>i have a 5% false negative rating now which is not so great i guess
<wingo>and what appears to be a 20% false positive rating but i know there is spam in the ham set so maybe that's all actually spam...
<amz3>I am working on guildhall website
<wingo>it's an overtrained classifier tho; i didn't subsample
<amz3>how did you get it done? I mean where did you look up the algorithm?
<wingo>it's a "naive bayes classifier"
<wingo>mostly from https://en.wikipedia.org/wiki/Naive_Bayes_classifier tho that page has many digressions
<wingo> https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering has some nice simplifications
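
A rough sketch of a naive Bayes spam check in Guile, in the spirit of the pages linked above. This is not wingo's code: the tokenizer, the add-one smoothing, the choice to score only the words present in a comment, and every procedure name are assumptions made for illustration. It also assumes both training sets are non-empty.

```scheme
(use-modules (srfi srfi-1))

;; Crude tokenizer: lowercase the text and keep runs of letters.
(define (tokenize text)
  (string-tokenize (string-downcase text) char-set:letter))

;; For one class, count how many documents contain each word.
(define (train docs)
  (let ((counts (make-hash-table)))
    (for-each
     (lambda (doc)
       (for-each
        (lambda (w) (hash-set! counts w (+ 1 (hash-ref counts w 0))))
        (delete-duplicates (tokenize doc))))
     docs)
    counts))

;; log P(word present | class), with add-one smoothing so that
;; unseen words do not produce log(0).
(define (log-likelihood counts n-docs word)
  (log (/ (+ 1 (hash-ref counts word 0))
          (+ 2 n-docs))))

;; Log-odds that TEXT is spam: the prior log-odds plus, for each
;; distinct word, the difference of the per-class log-likelihoods.
(define (spam-score spam-docs ham-docs text)
  (let ((spam (train spam-docs))
        (ham (train ham-docs))
        (ns (length spam-docs))
        (nh (length ham-docs)))
    (fold (lambda (w acc)
            (+ acc
               (- (log-likelihood spam ns w)
                  (log-likelihood ham nh w))))
          (log (/ ns nh))
          (delete-duplicates (tokenize text)))))

;; Positive log-odds means "more likely spam than ham".
(define (spam? spam-docs ham-docs text)
  (> (spam-score spam-docs ham-docs text) 0))

;; Tiny made-up training sets, for illustration only.
(define sample-spam '("buy cheap pills now" "cheap pills online"))
(define sample-ham '("nice post about guile" "guile compiler post"))
```

With these toy sets, (spam? sample-spam sample-ham "cheap pills") is true and (spam? sample-spam sample-ham "guile post") is false. A real filter would retrain rarely and cache the counts, not rebuild the tables on every call as this sketch does.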
<amz3>so i was wrong
<wingo>this is great, the classifier found tons of ham that was actually spam