IRC channel logs

***logicmoo is now known as dmiles

<wingo>yo

<wleslie>sup?

<wleslie>guix for the release is a great idea, btw

<lloda`>wingo: so I've reduced the other day's profile to

<lloda`>%      cumulative  self

<lloda`>time   seconds     seconds  calls    procedure

<lloda`> 42.28 1.90        1.83     2000000  vector-ref

<lloda`> 19.85 0.94        0.86     1000000  *

<lloda`> 19.85 0.86        0.86              anon #x7f27fe1788c0

<lloda`> 18.01 0.88        0.78     1000000  vector-set!

<lloda`>

<lloda`>this looks good but I can't find what #x7f27fe1788c0 is.

<lloda`>#:display-style 'tree shows it being called by all vector-ref, vector-set!

<lloda`>and also * and the top function. everything calls it, it seems.

<lloda`>the disassembly shows a bunch of 'anonymous procedure at #x...' but not the address above.

<wingo>looks like vector-ref / vector-set aren't being inlined

<wingo>also * isn't being inlined

<wingo>are you using apply to inline vector-ref etc? or doing some kind of table-based dispatch?

<lloda`>the calls are ((struct-ref ra n) ...) where (struct-ref ra n) is vector-ref in this case

<wingo>oddly you will do better with a manual eta conversion :/

<wingo>vector-ref -> (lambda (v n) (vector-ref v n))

<wingo>i think anyway

<lloda`>ahh

<lloda`>I'll try that, thx
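
For readers following along, a minimal sketch of the eta conversion wingo suggests. The names `ref-direct` and `ref-eta` are made up for illustration and are not from lloda`'s code. Both procedures behave the same; the only difference is that the wrapped version spells out the call to vector-ref in the source, which may or may not give the compiler something to inline.

```scheme
;; Hypothetical names, for illustration only.

;; The primitive passed around as a first-class value.
(define ref-direct vector-ref)

;; The eta-expanded form: a lambda that calls the primitive explicitly.
(define ref-eta (lambda (v n) (vector-ref v n)))

(define v (vector 10 20 30))

(display (ref-direct v 1)) (newline)   ; prints 20
(display (ref-eta v 1)) (newline)      ; prints 20
```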

<wingo>holy crap my blog spam rate is enormous

<lloda`>wingo: the conversion didn't really work. I mean now I have the lambdas in the profile instead of the vector-*, but it looks much the same, same anon too.

<lloda`>sorry about the spam :-/

<wingo>hum

<wingo>just means i have to write a classifier in scheme :)

<wingo>i have a good collection of spam & ham

<lloda`>so I have written most of the array functions in Scheme and it isn't that bad. array-slice-for-each is faster in Scheme. array-map! is slower by a factor of 2 to 8 depending on how many arguments there are. array-copy! is 300 times slower, but the C uses memcpy :p.

<lloda`>This is with the type dispatch done at the last moment, as above (I could move it out of the loop with more macros) but that's how C does it, anyway

<lloda`>but what do you mean with a classifier?

<lloda`>ah, the spam ;p

<wleslie>yees. I was going to comment that the basic observation about the promises/callback approach isn't really about performance at all, but then I decided that was about as relevant to the post as the many comments about BMI and facebook likes.

<wleslie>no use picking nits in a hurd of camels.

<wingo>i have locally reverted around 3K comments, i think there are still some 1-2K left

<avoine>ecraven: may I ask what are you using to produce the benchmark.html file in your benchmark?

<paroneayea>do I remember there being a deduplicate procedure baked into guile?

<davexunit>paroneayea: delete-duplicates

<davexunit>srfi-1 I think?

<paroneayea>davexunit: aha, thanks :)

<paroneayea>I was looking through srfi-1 but I must have passed over it

<davexunit>np!
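
A short sketch of delete-duplicates from SRFI-1, as davexunit points out. The sample lists are invented. The procedure keeps the first occurrence of each element, compares with equal? by default, and accepts an optional equality predicate as its second argument.

```scheme
(use-modules (srfi srfi-1))

;; Default comparison is equal?, so it works on numbers and strings alike.
(define nums (delete-duplicates '(1 2 1 3 2 4)))
(define strs (delete-duplicates '("a" "b" "a")))

;; An explicit predicate, here a case-insensitive string comparison.
(define ci (delete-duplicates '("Foo" "foo" "bar") string-ci=?))

(write nums) (newline)   ; prints (1 2 3 4)
(write strs) (newline)   ; prints ("a" "b")
(write ci) (newline)     ; prints ("Foo" "bar")
```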

<ArneBab_>we might want to be here: http://stackoverflow.com/documentation/scheme/851/getting-started-with-scheme#t=201703021821098898047

<wingo>moo

<amz3>o/

<OrangeShark>o/

<snape>is a tuple the same thing as a dotted pair?

<mejja>a dotted pair is a cons cell

<snape>I could not find any Lisp/scheme official writing about it

<wingo>tuple isn't really a scheme concept

<wingo>there are a few data structures you can use to compose values

<wingo>if you don't want that data structure to have a type and you have 2 values then a pair is ok

<wingo>otherwise a list or a vector are usual

<wingo>a disjoint type, i mean

<snape>I have two values, each of them being a string

<snape>see https://lists.gnu.org/archive/html/guix-devel/2017-03/msg00051.html

<snape>I'm not sure whether I should choose cons or list.

<wingo>choose cons if it's just two. choose vector if >2 and number is fixed. choose list otherwise

<wingo>or choose vector for 2; that's fine too

<wingo>cons is a little weird :)

<snape>got it. Thanks you :)

<snape>*Thank
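
A small sketch of the three choices wingo lists, using invented string values. It also shows why "cons is a little weird": a pair made with cons holds its second value directly in the cdr, while a two-element list holds another pair there.

```scheme
;; Exactly two values: a pair (printed as a dotted pair).
(define p (cons "key" "value"))

;; A fixed number of values: a vector, indexed from 0.
(define v (vector "one" "two" "three"))

;; A variable number of values: a list.
(define l (list "key" "value"))

(write p) (newline)                ; prints ("key" . "value")
(write (cdr p)) (newline)          ; prints "value"
(write l) (newline)                ; prints ("key" "value")
(write (cdr l)) (newline)          ; prints ("value"), a list, not the string
(write (vector-ref v 2)) (newline) ; prints "three"
```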

<wingo>neat, i just trained a bayesian classifier

<mthl>wingo: what does your classifier do? :)

<wingo>tells whether a blog comment is spam :)

<amz3>:)

<mthl>wingo: I am curious, how did you get a training set?

<wingo>mthl: from my own blog comments, each comment is a git commit, i have to revert the spam

<wingo>so i know the reverted ones are spam; there are still some spam in the kept set but i hope to use this classifier to identify them too

<wingo>only a few thousand on either side tho so i don't know how well this will work

<mthl>nice :)

<mthl>I guess you used Guile for this?

<wingo>yeah

<wingo>i have a 5% false negative rating now which is not so great i guess

<wingo>and what appears to be a 20% false positive rating but i know there is spam in the ham set so maybe that's all actually spam...

<amz3>I am working on guildhall website

<wingo>it's an overtrained classifier tho; i didn't subsample

<amz3>how did you get it done? I mean where did you look up the algorithm?

<wingo>it's a "naive bayes classifier"

<wingo>mostly from https://en.wikipedia.org/wiki/Naive_Bayes_classifier tho that page has many digressions

<wingo> https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering has some nice simplifications
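
For the curious, a toy naive Bayes spam check in Guile along the lines of those pages. This is not wingo's code: every procedure name and the four training comments are made up, and it assumes equal priors for spam and ham plus add-one smoothing. Treat it as a sketch of the idea, not a usable filter.

```scheme
(use-modules (srfi srfi-1))

;; Lowercase the text and keep maximal runs of letters as tokens.
(define (tokenize text)
  (string-tokenize (string-downcase text) char-set:letter))

;; Train one class: returns (word-count-table . total-token-count).
(define (train docs)
  (let ((table (make-hash-table)))
    (for-each
     (lambda (doc)
       (for-each (lambda (w)
                   (hash-set! table w (+ 1 (hash-ref table w 0))))
                 (tokenize doc)))
     docs)
    (cons table (hash-fold (lambda (k v acc) (+ v acc)) 0 table))))

;; Number of distinct words seen across all models.
(define (vocabulary-size . models)
  (let ((seen (make-hash-table)))
    (for-each
     (lambda (m)
       (hash-for-each (lambda (k v) (hash-set! seen k #t)) (car m)))
     models)
    (hash-fold (lambda (k v n) (+ n 1)) 0 seen)))

;; Sum of log P(word | class), with add-one (Laplace) smoothing.
(define (log-likelihood model vocab-size words)
  (let ((table (car model))
        (total (cdr model)))
    (fold (lambda (w acc)
            (+ acc (log (/ (+ 1 (hash-ref table w 0))
                           (+ total vocab-size)))))
          0
          words)))

;; Equal priors assumed, so only the likelihoods are compared.
(define (spam? spam-model ham-model vocab-size text)
  (let ((words (tokenize text)))
    (> (log-likelihood spam-model vocab-size words)
       (log-likelihood ham-model vocab-size words))))

;; Invented training data.
(define spam-model (train '("buy cheap pills now" "cheap pills cheap offer")))
(define ham-model  (train '("guile compiler inlines vector-ref"
                            "nice post about the compiler")))
(define vocab (vocabulary-size spam-model ham-model))

(write (spam? spam-model ham-model vocab "cheap pills")) (newline)   ; prints #t
(write (spam? spam-model ham-model vocab "compiler post")) (newline) ; prints #f
```

Working in log space avoids multiplying many small probabilities together, and the smoothing keeps a word that was never seen in one class from forcing that class's probability to zero.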

<amz3>so i was wrong

<wingo>this is great, the classifier found tons of ham that was actually spam


2017-03-02.log