IRC channel logs

<chrislck>manumanumanu: for my previous code I had simplified https://stackoverflow.com/questions/7313563/flatten-a-list-using-only-the-forms-in-the-little-schemer/7324493

<chrislck>combine all 4 snippets into one, using srfi-1 forms as appropriate, and it becomes clear how it works

<chrislck>but I admit passing the named let as a parameter to fold was a noob genius that made the code beautifully simple

<manumanumanu>jcowan: Which is of course not strange once you think of it (at least for a language with proper TCO). Converting it to something else than regular recursive calls seems like a waste of life :) I was just surprised, since I had never seen it before.

<roelj>Does anyone have a small example on how to use 'flock'?

<zig>hello #guile

<roelj>What's an efficient way to do thread-safe file access in Guile? (I want to read a list from a file and cons a value to it)

<roelj>And do that without losing a value from multiple threads.

<wingo>do you want to update the file?

<roelj>yes

<wingo>what would the contents of the file look like before and after, if before had 2 items and after had 3?

<wingo>if the items were numbers, say

<roelj>((D E F) (A B C)) -> ((G H I) (D E F) (A B C))

<roelj>Oh, sorry, numbers: (2 1) -> (3 2 1)

<wingo>in that case, i would create a new file -- i see that guile's api here is not great

<wingo>see the private call-with-output-file/atomic from (system base compile) actually

<wingo>i think that's what you want

<roelj>Ah, let me see

<roelj>So, IIUC, it creates a temporary file, does its I/O and puts it in the target place.
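
The temporary-file approach roelj describes can be sketched roughly as below. This is a minimal illustration and not the private call-with-output-file/atomic from (system base compile); the names write-datum-atomically and read-datum are hypothetical. It assumes the temporary file and the target live on the same file system, so that the rename replaces the target in one step.

```scheme
;; Hypothetical helper: write DATUM to PATH by way of a temporary file
;; in the same directory, then rename it over the target.
(define (write-datum-atomically path datum)
  (let* ((template (string-copy (string-append path ".XXXXXX")))
         (port (mkstemp! template)))   ; mkstemp! fills in the X's in place
    (write datum port)
    (newline port)
    (close-port port)
    ;; Readers see either the old file or the new one, never a partial write.
    (rename-file template path)))

;; Hypothetical helper: read the single datum stored in PATH.
(define (read-datum path)
  (call-with-input-file path read))
```

If two read-modify-write updates run at the same time, both renames succeed and the later one replaces the earlier one, so one update is lost unless the callers also serialize themselves.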

<roelj>What happens when two actions are performed on it simultaneously? Only one will end up in the target (the last action)?

<cehteh>roelj: would storing the stuff in a sqlite be an option?

<roelj>cehteh: Currently my program does not have sqlite as dependency, so I'd prefer a solution without external dependencies :)

<roelj>Maybe a log structure would work better. Is 'write' thread-safe?

<wingo>roelj: ah :) that is a more tricky issue

<wingo>write is not safe in that sense

<wingo>i agree i would use a database of some kind

<wingo>sqlite maybe, or git, or something like that

<manumanumanu>Have a .lock file then.

<wingo>yes, file locking can work too
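
A small sketch of the lock-file idea using Guile's flock, which also touches on roelj's earlier question about how to use it. The helper names call-with-file-lock and push-value! are hypothetical, as is the ".lock" suffix. The sketch assumes every caller opens the lock file itself; on Linux a flock lock belongs to the open file description, so this should exclude other processes, and probably other threads of the same process, but that is worth testing on the target platform.

```scheme
;; Hypothetical helper: run THUNK while holding an exclusive lock on
;; LOCK-PATH. Each call opens the lock file anew.
(define (call-with-file-lock lock-path thunk)
  (let ((lock-port (open-file lock-path "a")))
    (dynamic-wind
      (lambda () (flock lock-port LOCK_EX))   ; blocks until the lock is free
      thunk
      (lambda ()
        (flock lock-port LOCK_UN)
        (close-port lock-port)))))

;; Hypothetical helper: cons VALUE onto the list stored in PATH.
(define (push-value! path value)
  (call-with-file-lock (string-append path ".lock")
    (lambda ()
      (let* ((datum (if (file-exists? path)
                        (call-with-input-file path read)
                        '()))
             (old (if (eof-object? datum) '() datum)))
        ;; Rewrites the file in place: safe against concurrent callers
        ;; that use the same lock, but not against a crash mid-write.
        (call-with-output-file path
          (lambda (port)
            (write (cons value old) port)
            (newline port)))))))
```

Because the kernel drops a flock lock when the process exits, a crashed program should not leave the file permanently locked, unlike a lock that is signalled only by the existence of a file.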

<zig>fwiw, sqlite is a lot of ceremony because of the single-writer thing. SQLite recommends avoiding direct manipulation of files on the file system, as the file system does not do what you really expect it to do. (sure, a software vendor will not say that its very own software is useless).

<manumanumanu>until your users have an accident which crashes the program and they then can't access the files :D

<manumanumanu>maybe store a unique "session id" in the lock file

<manumanumanu>and warn if the lock file is there, but with an old session id

<manumanumanu>roelj: or use a dedicated writer thread that accepts input through a queue.
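
The dedicated-writer idea might look something like the following sketch, built on (ice-9 threads) and (ice-9 q). All names here (submit!, take!, run-writer, stop-token) are hypothetical, and persist stands in for whatever procedure actually writes the state to disk.

```scheme
(use-modules (ice-9 threads) (ice-9 q))

(define queue (make-q))
(define queue-mutex (make-mutex))
(define queue-ready (make-condition-variable))

;; A unique object that tells the writer to stop.
(define stop-token (list 'stop))

;; Called from any thread: hand VALUE to the writer.
(define (submit! value)
  (with-mutex queue-mutex
    (enq! queue value)
    (signal-condition-variable queue-ready)))

;; Block until a value is available, then return it.
(define (take!)
  (with-mutex queue-mutex
    (let loop ()
      (if (q-empty? queue)
          (begin
            (wait-condition-variable queue-ready queue-mutex)
            (loop))
          (deq! queue)))))

;; Only the writer thread touches the state, so no update can be lost.
;; PERSIST is called with the new state after every update.
(define (run-writer initial-state persist)
  (let loop ((state initial-state))
    (let ((value (take!)))
      (if (eq? value stop-token)
          state
          (let ((new-state (cons value state)))
            (persist new-state)
            (loop new-state))))))
```

The writer would be started once with call-with-new-thread; other threads then call submit!, and submitting stop-token lets join-thread return the final state.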

<manumanumanu>I'm full of ideas this morning!

<manumanumanu>I am just trying to postpone the inevitable: going back to editing XSLT and XML DTDs.

<zig>that is database work.

<cehteh>sqlite isn't the best for that, but if it would be an option then it's likely the easiest, as it does all the synchronization in a bulletproof (but low-performing) way

<roelj>Thanks for the ideas :)

<cehteh>if it's not, then as manumanumanu said, a single writer thread is a good idea

<cehteh>otherwise you end up in locking and synchronization hell

<zig>cehteh: wiredtiger (or sqlite lsm extension) is a little bit lower level but much more versatile.

<cehteh>i won't go into details, but rolling that on your own with low-level posix/filesystem calls will be a pita

<zig>+1

<roelj>Hehe, I'm beginning to see the complexity of it

<cehteh>you can do a global lock thing and single writer

<cehteh>beware of deadlocks

<cehteh>and figure out how to prevent data races :D

<roelj>Reinvent a database :D

<cehteh>basically

<cehteh>or just log writing, append only, still with locking (so that you write full records)

<cehteh>once in a while reconstruct the original file
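
The append-only variant could be sketched like this. The names log-append! and log-replay are hypothetical, and the sketch assumes one S-expression per record and that all writers take the same flock lock on the log file.

```scheme
;; Append one complete record to the log while holding an exclusive lock.
(define (log-append! log-path record)
  (let ((port (open-file log-path "a")))
    (flock port LOCK_EX)
    (write record port)
    (newline port)
    (force-output port)        ; flush the record before releasing the lock
    (flock port LOCK_UN)
    (close-port port)))

;; Rebuild the list from the log, newest record first.
(define (log-replay log-path)
  (call-with-input-file log-path
    (lambda (port)
      (let loop ((state '()))
        (let ((record (read port)))
          (if (eof-object? record)
              state
              (loop (cons record state))))))))
```

Reconstructing the original file then amounts to writing the result of log-replay somewhere and starting a fresh log.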

<zig>it also depends whether the data is bigger than memory or not (oom killer is still around)

<wingo>made a little thing for wasm work: https://gitlab.com/wingo/guile-wasm

<wingo>i suppose we could compile wasm files into .go files

<manumanumanu>wingo: cool. What does "decode it into a scheme representation" mean? Actual scheme code?

<manumanumanu>But I guess that would not be a small side project :D :D

<manumanumanu>Answered my own question!

<zig>neat

<zig>TIL: there is popcount function in wasm

<zig>:/

<manumanumanu>everybody needs it!

<zig>manumanumanu: really? when is it useful?

<zig>I think, i stumbled upon it while computing the hamming distance or something like that.

<zig>hamming distance between two bit strings.

<manumanumanu>I use it when I work with bitvectors.

<manumanumanu>but mostly, it is used in HAMTs

<manumanumanu>at least, that is where I have seen it most

<zig>oh it makes sense.

<zig>hamming distance in C: __builtin_popcount(x ^ y) (ref: https://en.wikipedia.org/wiki/Hamming_distance#Algorithm_example)

<zig>it is the count of substitutions required to make x == y

<manumanumanu>Is x a power of 2? (= 1 (logcount x))

<zig>I used it in a (novel?) algorithm with simhash to compute and query similar or duplicate documents (ref: https://hyper.dev/blog/fuzzbuzz-hash-algorithm.html)

<wingo>i usually use (zero? (logand n (- n 1)))

<wingo>though that is true for 0 also
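
The bit tricks from this exchange, collected as a small sketch. The procedure names are hypothetical, and the inputs are assumed to be non-negative integers, since logcount counts 0-bits rather than 1-bits for negative numbers.

```scheme
;; Hamming distance between two bit strings held as non-negative integers:
;; the number of bit positions in which they differ.
(define (hamming-distance x y)
  (logcount (logxor x y)))

;; manumanumanu's test: exactly one bit is set.
(define (power-of-two?/logcount x)
  (= 1 (logcount x)))

;; wingo's test, with an extra check so that 0 is rejected.
(define (power-of-two?/logand n)
  (and (> n 0)
       (zero? (logand n (- n 1)))))
```

For example, (hamming-distance #b1011 #b1101) is 2, and both predicates accept 8 while rejecting 6 and 0.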

<zig>manumanumanu: no it is the simhash of something where simhash is a locality-sensitive hash.

<zig>I only have an implementation in another programming language.

<manumanumanu>wingo: the satisfaction of doing something with bitfiddling is enormous, isn't it? :D

<wingo>:)

<manumanumanu>When I finally understood that constant time hex encode I was euphoric

<wingo>i bitfiddle all the time & find it a bit gnarly, personally :P

<manumanumanu>For me, a lowly classical musician, it is still novel. Every time i think of solving anything using it, it is like an endeavour! "Will it work? Will it be fast? Will I survive?"

<jcowan> https://vaibhavsagar.com/blog/2019/09/08/popcount/ is pretty comprehensive

<manumanumanu>A friend of mine who is bitmucking professionally (embedded C stuff mostly) has popcount as a litmus test for programming languages. If it has popcount, it can probably be used for serious work. If it hasn't he can't be bothered.

<manumanumanu>wingo: that logand thing is much prettier. I expect popcount to use at least some cycles...

<wingo>if you like this sort of thing, you might enjoy the book "hacker's delight"

<manumanumanu>I will check it out, thanks

<jcowan>To each their own, as the devil said when he painted his tail sky-blue.

<jcowan>It's especially interesting that gcc and clang will *detect* an implementation of popcount they are compiling and replace it with the one instruction

<roptat>I think I already asked, but I can't remember how to put a string to lower case?

<manumanumanu>string-downcase

<manumanumanu>?

<manumanumanu>or string-downcase! if you want to maybe-mutate it.
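
A short illustration of the two procedures. The variable names are made up for the example; the string-copy is there because string literals may be read-only, in which case string-downcase! would fail on them.

```scheme
;; string-downcase returns a new string and leaves its argument alone.
(define lowered (string-downcase "Hello GUILE"))

;; string-downcase! changes its argument in place, so give it a mutable copy.
(define greeting (string-copy "Hello GUILE"))
(string-downcase! greeting)
```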

<zig>Thanks!

<manumanumanu>jcowan: it is surprising how many compiler optimizations are just regular pattern matching (x % 2) == 0 being the most famous one.

<manumanumanu>It has tricked millions of programmers into believing there is nothing slow with modulo...

<lloda>roptat: you can type ,a case at the REPL to find this sort of thing

<roptat>thanks, for some reason I couldn't find it in the manual, I must not have searched properly

<lloda>wish there was an override of sorts for format

<lloda>like (format p "BEGIN~pEND" (lambda (p) (write-things-to-p)))

<lloda>unnecessary I guess

<zig>there is call-with-output-string https://www.gnu.org/software/guile/manual/html_node/String-Ports.html#index-call_002dwith_002doutput_002dstring

<lloda>that'd work
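
What lloda asks for can be approximated with call-with-output-string, roughly as follows. The names write-things and framed are hypothetical; the ~a directive simply splices in the string that the procedure wrote to the temporary port.

```scheme
;; Writes some output to PORT, like lloda's write-things-to-p.
(define (write-things port)
  (write '(1 2 3) port)
  (display " and " port)
  (write "four" port))

;; Capture that output as a string and splice it into the format string.
(define (framed)
  (format #f "BEGIN~aEND" (call-with-output-string write-things)))
```

The cost compared with a real format directive is an intermediate string for the captured output.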

*zig got greenlight for the one and only test directly related to full-text search in guile-babelia.

<stis>hey guilers!

<str1ngs>hello stis

<str1ngs>wingo: does guile-wasm mean guile scheme can be compiled to wasm?

<wingo>nope, just a toolkit for working with wasm files

<str1ngs>ahh okay that's what I thought.

<str1ngs>I wonder though if a C-to-wasm compiler could compile guile to wasm. that would be interesting

<zig>I tried emscripten with another scheme, I gave up, IF I work on something related to that I will target wasm directly instead of compiling the whole compiler.

<str1ngs>zig: right, compiling direct to wasm would be ideal

<nly>hi

***jao is now known as Guest36927

<str1ngs>hello nly

<nly>hey str1ngs

<nly>:)

<nly>str1ngs what do you think of this static website? https://o-nly.github.io/resonance/ generated using wintersmith

<str1ngs>nly: looks really nice. are you aware of https://dthompson.us/projects/haunt.html ?

<nly>yup

<nly>does nomad have a website yet?

<str1ngs>nly: btw I have a project that maybe can be combined with https://o-nly.github.io/resonance/

<str1ngs>nly: not yet. about the time I start working on documentation I'll probably create a website

<str1ngs>nly: the project is https://github.com/mrosset/home

<nly>if you want i could try to setup a website using wintersmith or maybe haunt is even better? what say?

<str1ngs>nly: I think I'd rather have some documentation first. it's definitely on my todo list though

<nly>maybe outsource some work :)

<nly>page not found https://github.com/mrosset/home

<str1ngs>nly: ahh it was private, try now

<nly>i saw it

<zig>I am confused both of you nly and str1ngs are working on a browser?

<str1ngs>zig: right it's called nomad which is an extensible web browser using guile

<nly>at this point i'd like to point you to a website for nomad zig

<nly> http://savannah.nongnu.org/projects/nomad

<nly>but it's not made yet

<nly>by website i mean just someplace with all the relevant links, irc, source, docs etc.

<zig>ah ok you are working on the same project?

<nly>i made the bookmarks and shroud module

<nly>yes

<zig>oh ok

<str1ngs>nly: I think ideally I'd like to create the manual with texinfo. then provide that online

<str1ngs>nly: like proper GNU projects do

<zig>what branch do you recommend to read?

<str1ngs>zig: feature-release is the stable branch

<str1ngs>zig: though there are newer branches that are moving the API towards gobject introspection.

<zig>ok

<nly>str1ngs that would be best, but it's a start

<str1ngs>nly: if you wanted to work on something I'd say the manual would be a good start

<str1ngs>at least the autotools framework to generate a manual

<nly>ok, that seems reasonable

<str1ngs>nly: here is the guide line https://www.gnu.org/prep/standards/standards.html#Documentation

<str1ngs>nly: maybe we use texinfo to generate the web documentation html and then haunt or something for the landing website.

<str1ngs>nly: then we can also provide the html documentation internally, why not it's a web browser after all

<zig>"The size of the cache is the single most important tuning knob for a WiredTiger application. Ideally the cache should be configured to be large enough to hold an application's working set."

<zig>:/

<zig> https://source.wiredtiger.com/3.2.0/tune_cache.html

***sneek_ is now known as sneek

<zig>that is... sad to read.

<weinholt>zig, it would be true for any cache, no?

<zig>(a post about the future of databases http://www.cs.cmu.edu/~pavlo/blog/2015/09/the-next-50-years-of-databases.html)

<zig>what is the working set? to me there is "what is requested often" that I will call hot data and "what is requested less often" that I will call cold data. The cache should keep around only the hot data. Hence it needs statistics about the whole database on what is queried often or not.

<zig>So the cache assuming, it is a fraction of the total memory used by the database, should not be the whole database, it should only be the data that is requested often.

<zig>but at the moment, I am loading what I downloaded from scheme websites, html only is around 3GB. But it is more than 8GB in memory.

<zig>AND I just figured, that I made a typo (!) and the database is using mmap.

<zig>that typo was not caught because I forgot to fix a bug reported during the SRFI process (related to keywords / options that must be checked).

<zig>For the search engine usecase, it is very difficult to figure what is the working set.

<zig>also the cache must be big enough to keep the data of the whole transaction in memory.

<weinholt>zig, are you working to bring the search engine back up?

<zig>weinholt: yes

<weinholt>nice

<weinholt>it will be like the old days of the internet: a search engine that actually finds interesting stuff :)

<str1ngs>is there a link to this search engine?

<jcowan>zig: A working set is all the data you have had to access in the last n time units, for some suitable choice of n.

<zig>str1ngs: the code is at https://git.sr.ht/~amz3/guile-babelia (only scheme, no goops ;)

<zig>weinholt: hopefully..

<zig>jcowan: fwiw, there is n in time units that can be configured in wt. Instead one set the size of the cache, and min and max number of threads dedicated to eviction

<zig>here is the current config string: mmap=false,eviction=(threads_max=3,threads_min=2),log=(enabled=true,file_max=1024MB),create,cache_size=5120MB

<jcowan>you mean "there is no n"?

<zig>I have been trying since 20:00

<jcowan>Anyway, that's an abstract definition, not an implementation

<zig>NO n yes.

<zig>there is no n.

<jcowan>That wouldn't be a cache config parameter anyway: it's what the application knows about its own history.

<zig>I am somewhat "happy" that wt documentation acknowledges that tuning eviction is difficult.

<jcowan>so for example a LRU cache is a good approximation in many cases, but not (for example) when the data are being accessed in a loop larger than the cache size, in which case evicting the least-recently-used is *exactly* the wrong thing.
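
jcowan's looping case can be reproduced with a toy simulation. This is only an illustrative sketch: lru-hits is a hypothetical name, the cache is a plain list of keys with the most recently used first, and capacity is assumed to be at least 1.

```scheme
(use-modules (srfi srfi-1))

;; Count the cache hits for a list of key accesses under LRU eviction.
(define (lru-hits capacity accesses)
  (let loop ((cache '()) (accesses accesses) (hits 0))
    (if (null? accesses)
        hits
        (let* ((key (car accesses))
               (hit? (memv key cache))
               (others (delete key cache))
               ;; On a miss with a full cache, drop the last (least
               ;; recently used) key to make room.
               (kept (if (< (length others) capacity)
                         others
                         (take others (- capacity 1)))))
          (loop (cons key kept)
                (cdr accesses)
                (if hit? (+ hits 1) hits))))))

;; Three passes over the keys 0..4.
(define looped (append-map (lambda (_) (iota 5)) (iota 3)))

(lru-hits 4 looped)   ; → 0, every access evicts the key needed next
(lru-hits 5 looped)   ; → 10, everything after the first pass hits
```

With a cache one entry smaller than the loop, LRU always evicts exactly the key that is about to be requested, so the hit count drops to zero instead of degrading gradually.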

<jcowan>What we really want is a cache that caches all the data we are about to use.

<jcowan>"Prediction is very difficult, especially about the future."

<str1ngs>zig: cool thanks. oh I don't mind goops. but then I've been trying to find an alternative data structure. I guess records is the scheme way to do things?

<str1ngs>zig: I'm kinda used to C and go structs
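
For someone coming from C or Go structs, a SRFI-9 record is probably the closest equivalent. A minimal sketch follows; the <document> type and its field names are invented for the example and are not taken from guile-babelia.

```scheme
(use-modules (srfi srfi-9))

;; A record type with a constructor, a predicate, two accessors,
;; and a modifier for the title field only.
(define-record-type <document>
  (make-document url title)
  document?
  (url document-url)
  (title document-title set-document-title!))

(define doc (make-document "https://example.org/" "Example"))
(set-document-title! doc "Renamed")
```

Fields without a modifier, such as url here, cannot be changed after construction.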

<str1ngs>zig: this looks very interesting, thanks for the link

<zig>yw

<zig>I guess I am not knowledgeable enough to rule out oop from scheme. I never used goops.

<zig>at least r7rs has nothing like oop.

<str1ngs>one thing I like about goops is the generic methods

***apteryx_ is now known as apteryx

<str1ngs>zig: make check works but I had to hack stemmer.scm for my computed stemmer

<str1ngs>I'm using your channel btw

<jcowan>R7RS-large will, if I have anything to say about it, provide generic functions but not classes.

<jcowan>(that is, you can *have* classes, they just won't be in the standard)
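
Generic functions without user-defined classes can already be tried in GOOPS by dispatching on the built-in types. The sketch below only illustrates that idea; it says nothing about what R7RS-large will specify, and the generic name summary is hypothetical.

```scheme
(use-modules (oop goops))

(define-generic summary)

;; Methods are selected by the class of the argument; <integer>,
;; <string> and <top> are classes GOOPS provides for built-in values.
(define-method (summary (x <integer>))
  (string-append "integer " (number->string x)))

(define-method (summary (x <string>))
  (string-append "string " x))

;; Fallback for anything without a more specific method.
(define-method (summary (x <top>))
  "something else")
```

Calling (summary 3) picks the <integer> method and (summary "a") the <string> method, while a symbol falls through to the <top> method.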

<str1ngs>jcowan: thanks I will look into that

2019-11-21.log