IRC channel logs

2019-11-03.log

<spk121>.
<f-a>Hello, I am a new guile user
<f-a>is there a way to autocomplete an expression in the repl?
<f-a>like (nu<tab> for (null?
<chrislck>do you have readline ?
<f-a>I am sure it involves some readlin- I do chrislck
<chrislck>you'll want to type nu<tab>
<f-a>ho, it lists suggestions, silly me
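For reference, tab completion in the REPL depends on readline being activated. A minimal sketch of the usual setup, assuming Guile was built with readline support and that the lines go into the user's ~/.guile init file:

```scheme
;; ~/.guile -- read by the interactive REPL at startup.
;; Assumption: this Guile build includes the readline module.
(use-modules (ice-9 readline))
(activate-readline)
```

With this in place, typing a prefix such as nu and pressing tab lists the matching bindings.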
***heroux_ is now known as heroux
<spk121>OK. Let's get a new release of guile-gi out the door. Trying to figure out what to put in NEWS is always the last task.
***ng0_ is now known as ng0
<civodul>spk121: woohoo!
<mwette>spk121: I checked out your stuff -- looking good. (I think MOP is usually overkill but this I want to try out.)
<spk121>mwette: originally we tried it without, but it was just easier (from an implementer's point of view)
<mwette>spk121: good to know your approach -- thanks
<davidl>Given a guile string variable that holds a big json document, what is the fastest way you can convert it to a list of lists representing the json list of objects?
<davidl>I have tried using the jansson C json library to walk through and create a list and return the list of lists as SCM to my guile program but that still takes a very long time to run for just a couple hundred megabytes of json.
<davidl>or, given a filename, what is the fastest way you can create a guile list of lists representing a json list of objects that is contained in the file? I am open to using the FFI and writing C code to read directly from a file just to get performance in building the guile lists. If anyone can explain even broadly what needs done I'd be grateful. My current attempt is still very slow.
<spk121>There are a couple of JSON libraries, but I don't know which is best for your use case. Sorry
<davidl>I have tried both sjson and json and both are slow in their json-string->scm methods.
<stis>davidl: have you tried to find which function does the most work in these libraries?
<davidl>stis: no I have not
<stis>also how much time do you spend per atom in the messages?
<stis>what is your expectation?
<davidl>not sure I understand what you mean by "per atom in the messages". I expect that parsing 200MB of json text to a scheme datastructure should not take more than like 10-15 seconds.
<davidl>using python3.7 I could read, transform and write back to disk 200MB in 5.5 seconds.
<davidl>with guile the same operation takes around 40 seconds after leaving the read and writing to disk to a C extension module.
<davidl>most of those 40 seconds are spent on the read-json-from-string and write-json-to-string procedures.
<stis>yes, the question is whether the time goes into the parsing or into the creation of atoms. If it is in the parsing, you can do better. It is more difficult if it is in the creation of the specific numbers, strings, etc. in the messages
<davidl>I believe it is in the guile creation of lists and atoms.
<stis>if you only have substrings you could store just indices instead of creating substrings. I found that making strings is expensive
<stis>but you should first do a proper analysis of which operation really takes the time.
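One way to do the analysis stis suggests is Guile's bundled statistical profiler. The following is a minimal sketch, not a measurement of davidl's program: `build-list-of-strings` is a hypothetical stand-in for the real workload, such as a call into one of the JSON parsers.

```scheme
(use-modules (statprof))

;; Hypothetical stand-in for the real workload: builds a list of
;; N freshly allocated strings, one cons cell per element.
(define (build-list-of-strings n)
  (let loop ((i 0) (acc '()))
    (if (= i n)
        (reverse acc)
        (loop (+ i 1) (cons (number->string i) acc)))))

;; statprof takes a thunk, runs it under the sampling profiler and
;; prints a per-procedure time breakdown afterwards.
(statprof (lambda () (build-list-of-strings 1000000)))
```

At the REPL, the `,profile` meta-command wraps an expression in the same profiler.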
<davidl>stis: ok thanks. this will take some time to figure out... :/
<davidl>stis: what do you mean by "store just indices" by the way?
<stis>instead of creating a string of a substring you store a pair of numbers indicating start and end in the original string.
<stis>Also, you can do 10-15 million cons allocations per second, so that is one limit
<stis>cons allocations are pretty fast in guile
<davidl>ok, thx, maybe I'll try something like that eventually.
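The index-pair idea stis describes can be sketched as follows. This is an illustration only, assuming a simple delimiter-separated string rather than real JSON; `field-spans` and `span->string` are hypothetical helper names, not part of any library mentioned in this log.

```scheme
;; Return a list of (start . end) pairs delimiting the fields of STR
;; separated by the character DELIM. No substrings are created here.
(define (field-spans str delim)
  (let loop ((start 0) (acc '()))
    (let ((end (string-index str delim start)))
      (if end
          (loop (+ end 1) (cons (cons start end) acc))
          (reverse (cons (cons start (string-length str)) acc))))))

;; Turn one span into a string only when that field is actually needed.
(define (span->string str span)
  (substring str (car span) (cdr span)))

(define doc "alpha,beta,gamma")
(define spans (field-spans doc #\,))
;; spans => ((0 . 5) (6 . 10) (11 . 16))
(span->string doc (cadr spans)) ; => "beta"
```

Each field still costs a cons cell or two, but string objects are created only for the fields the program goes on to use.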
<amz3>davidl: write-json-to-string is a mistake
<amz3>even if you don't rely on guile port, you can write the datastructure directly to the file descriptor
<amz3>no need for an intermediate big string.
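What amz3 describes can be sketched as below: the writer emits each piece to an output port as it walks the data, so no document-sized string is ever built. `write-value` is a hypothetical, deliberately tiny writer for nested lists of strings and numbers, not the API of guile-json or sjson, and the file name is made up. It relies on `write` for string quoting, which follows Scheme syntax and is not guaranteed to match JSON escaping for every character.

```scheme
;; Hypothetical minimal writer: nested lists become JSON arrays,
;; strings and numbers are written as-is. Output goes to PORT
;; piece by piece instead of into one big string.
(define (write-value x port)
  (cond
   ((list? x)
    (display "[" port)
    (let loop ((items x) (first? #t))
      (unless (null? items)
        (unless first? (display "," port))
        (write-value (car items) port)
        (loop (cdr items) #f)))
    (display "]" port))
   ((string? x) (write x port))    ; adds quotes, Scheme-style escapes
   ((number? x) (display x port))
   (else (error "unsupported value" x))))

(define data '(("a" 1) ("b" 2.5) ()))

;; Stream straight to a file; "out.json" is an assumed name.
(call-with-output-file "out.json"
  (lambda (port) (write-value data port)))
```

The JSON libraries may already provide a writer that takes a port; their documentation is the place to check before writing one by hand.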
<amz3>davidl: also, I was under the impression that you would give up. You really want your (tiny?) script(s) to be written in Guile :D
<amz3>davidl: what does the json parser output? hash-tables? or alists? this might have a performance impact downstream in the algorithm.
<davidl>amz3: I do want to do my work with guile whenever I can :-) the scripts are probably not going to be tiny in the end actually, which is why I wanted to switch from bash and jq to a "real" programming language like guile, which also has guile-bash which allows for a smooth transition or just certain parts to run as guile... but apparently performance didn't really improve with guile.. yet.
<davidl>amz3: reg json parser outputs; the guile-json library uses alist, and sjson uses "atlists" which are just regular 2-element lists with indices and values as elem 1 and 2 respectively.
<davidl>sjson does allow for parsing the json-strings to "fash", which is some hash data-structure.
<davidl>amz3: how do I write a datastructure directly to a file descriptor? (or file)