IRC channel logs


<dsmith-work>Happy Friday, Guilers!!
***ng0_ is now known as ng0
<lloda>should or should not (when #t (define a 0) a) work?
<jcowan>It should not.
<jcowan>when takes a sequence of expressions, not a definition. If you want definitions, you can say (when #t (let () (define a 0) a)), or just (when #t (let ((a 0)) a)).
<b4283>didn't notice there was "when" and "unless" until today
<jcowan>Yes. Some implementations specify the value of when/unless, but IMO one should not rely on it.
<jcowan>R6RS says if the test is true, it returns the value of the last expression, but if it is false, it returns an unspecified value -- which might by chance be the same as the value of the last expression.
<jcowan>So use when/unless for effect only.
<davidl>When I'm reading and writing to disk with guile, even guile 2.9.x, it takes ~10 times longer than in python3.7 - does that make sense or is it probably something with my implementation?
<davidl>I'm using read-string from ice-9 rdelim and put-string from ice-9 textual ports
<davidl>python takes 5-6 seconds to read from file, transform it and write it back, guile takes ~55 seconds and it's all about the I/O.
<amz3>if you use raw files for storing data, that is probably the bottleneck. For actual data work, use a system built for data, i.e. a database. Even in Python, you would not use a plain file for data work, except for prototyping.
<amz3>davidl: ^
<davidl>amz3: so if I transfer my 195Megabyte json-file to a database, would I see similar performance between a python-script and a guile-script loading and writing back data from/to the db?
<amz3>davidl: well, in guile you don't have GIL so...
<amz3>195 is small.
<davidl>amz3: I know, that's why I'm so surprised
<davidl>even my jq script doing the same thing is faster than guile
<davidl>what is GIL?
<amz3>jq is a command-line tool written in C? it is not comparable to guile, is it?
<amz3>Global Interpreter Lock. Guile has none, which means you can use multiple threads to do computation.
<amz3>unlike Python.
<amz3>well, in python you can use multiprocessing, but still, you would need to serialize in the main thread using pickle or something to transfer that to the other process, so it is unlikely to be faster using Python even in that case.
<amz3>IF you do some kind of computations.
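The serialization cost amz3 mentions can be made visible with a small sketch. Python's multiprocessing pickles objects to move them between processes; the snippet below only performs that pickle round trip in a single process, so it illustrates the copy, not a full worker setup. The `roundtrip` helper and the sample records are hypothetical.

```python
import pickle


def roundtrip(obj):
    """Serialize and deserialize obj the way it would be copied to a worker.

    Hypothetical helper: returns the reconstructed copy and the payload size
    in bytes.
    """
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(payload), len(payload)


# Made-up sample data standing in for parsed JSON objects.
records = [{"id": i, "name": f"item-{i}"} for i in range(1000)]
copy, size = roundtrip(records)
```

The copy is equal to the original but is a distinct object, so every hand-off to another process pays for a full serialize and deserialize of the data.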
<davidl>What are the strengths of guile compared to python? What can I possibly code "manually" to get a similar performance for simple I/O like this 195M file?
<amz3>GIL is the reason I started working with Guile.
<amz3>davidl: you can bind use C FFI to bind the glib primitives.
<amz3>davidl: you can bind *using* C FFI the glib primitives.
<amz3>maybe they are not glib but anyway.
<amz3>davidl: what are you planning to do with guile? other than reading and writing files?
<davidl>yes, raw json text files
<davidl>reading them, small transformations which are fast enough, then writing them back as files.
<davidl>amz3: generally speaking I am doing some API programming projects.
<davidl>both json and xml, but currently just json stuff.
<amz3>davidl: and then what do you do once the file is written to disk?
<amz3>is it a process that runs continuously or is it a command line one-shot-of-some-sort command?
<davidl>It will run basically like a cronjob. I have a json-list of json-objects in a text-file that gets downloaded regularly. I read the file and for each object I do a transformation and then I write the result to another file.
<amz3>then why do you need performance if it is a background job?
<davidl>amz3: it is just one of a multitude of background jobs and this is only one part of many computations that will happen for this module.
<amz3>davidl: you should invest the time to write the same computation in python and guile and benchmark that. That is a real-world benchmark (unlike microbenchmarks).
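A minimal sketch of what the Python side of such a benchmark might look like, timing read, parse, transform, and write separately so it is clear which phase dominates. The transformation and all function names here are hypothetical placeholders; davidl's actual transformation is not shown in the log.

```python
import json
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def read_text(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def transform(obj):
    # Hypothetical per-object transformation: add a marker field.
    return {**obj, "processed": True}


def transform_all(objects):
    return [transform(o) for o in objects]


def write_json(path, data):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def run_pipeline(src, dst):
    """Read a JSON list from src, transform each object, write it to dst.

    Returns a dict of per-phase timings in seconds.
    """
    timings = {}
    text, timings["read"] = timed(read_text, src)
    objects, timings["parse"] = timed(json.loads, text)
    results, timings["transform"] = timed(transform_all, objects)
    _, timings["write"] = timed(write_json, dst, results)
    return timings
```

Calling `run_pipeline("input.json", "output.json")` (placeholder file names) returns the four timings. Splitting the Guile version into the same phases would make the two runs directly comparable.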
<davidl>well, just reading the string into a scm-list ("atlist" in sjson) takes longer than the whole I/O and transformation in python :/
<amz3>I have to believe you.
<amz3>It is not what I find out in my own benchmarks.
<davidl>Im gonna try guile-json instead of sjson and read directly from an input port instead of to a string first.
<davidl>amz3: so just the part running (define myjson (json->scm (open-input-file "195megabytejson.json"))) takes 35 seconds with guile 2.2.6 using guile-json.
<davidl>amz3: what have you found in your benchmarks?
<amz3>I found that parsing s-expr using guile took 63ms whereas in python it takes 247ms
<amz3>davidl: ^
<amz3>the benchmark does include the time taken to read the file
<amz3>davidl: can you put your benchmark into a git repository so I can have a look at it?
<amz3>davidl: I have plenty of json to test with so no need to include that in the repository.
<amz3>davidl: but if you can good :)
<amz3>the benchmark does NOT include the time taken to read the file
<amz3>$ cat cpython-cx-read/time.log guile-cx-read/time.log
<amz3>CPython is 3 times slower on this benchmark.
<amz3>and still slower when counting the file.
<amz3>(much slower than 185Mb)
<amz3>(much smaller than 185Mb)
<ArneBab>davidl: read-performance of ports isn’t perfect in Guile
<ArneBab>Python on the other hand is deferring writing right off to C
<ArneBab>davidl: the speedups mark managed to get into string-replace-substring just by optimizing the code were really impressive (easily factor 10)
<ArneBab>so you might be able to golf this down quite a bit
<ArneBab>davidl: and if there’s anything Python is really optimized at, then it’s reading line by line (with open(<filename>, 'r') as f: for line in f: print(line))
*jcowan wonders if some of the JSON processing in Python is happening in C
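On jcowan's question: CPython's standard json module does ship an optional C accelerator (the private `_json` extension module) with a pure-Python fallback. A minimal sketch for checking which path a given interpreter uses, assuming CPython's module layout: the attributes `c_make_scanner`, `c_scanstring`, and `c_make_encoder` are implementation details rather than public API, and the helper name is hypothetical.

```python
import json.decoder
import json.encoder
import json.scanner


def json_c_accelerators():
    """Report which parts of the json module are backed by C code.

    Relies on CPython implementation details: each attribute is set to None
    when the _json extension could not be imported. getattr guards against
    interpreters that lack the attribute entirely.
    """
    return {
        "scanner": getattr(json.scanner, "c_make_scanner", None) is not None,
        "scanstring": getattr(json.decoder, "c_scanstring", None) is not None,
        "encoder": getattr(json.encoder, "c_make_encoder", None) is not None,
    }
```

If all three entries are true, parsing and encoding run mostly in C, so a Python-versus-Guile JSON comparison is partly measuring C against Scheme.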