IRC channel logs

2015-08-09.log

<amz3>I try to build parser combinator based on GLL. All my text is turned into a mishmash of strings and parens
<amz3>terminals are parsed correctly but after that... mishmash
<paroneayea>okay, nice, I think I've read enough to be able to implement this, assuming I can make my way around the FFI API
<amz3>paroneayea: I made a little macro to help with ffi, maybe you'll find it useful, I published it on the ML it has "wiredtiger" in the subject
<paroneayea>amz3: cool, thanks! :)
<paroneayea>ACTION taking a break and playing some Dungeon Crawl Stone Soup for a few
<amz3>I had a quick look at the header, you won't need an improved version since there don't seem to be any function pointers in the structs
<amz3>little markdown parser is on its way :)
<linas>Seems I have a utf8-bug in guile-2.1 .. or I'm crazy.
<linas>using display to send text on a socket fails:
<linas>so, for example: tell netcat to listen on port 7777 -- "nc -l 7777"
<linas>then (define sss (socket PF_INET SOCK_STREAM 0))
<linas>(connect sss AF_INET (inet-pton AF_INET "127.0.0.1") 7777)
<linas>(display "SmålandSmåland\n" sss)
<linas>which results in netcat receiving garbage. I'm kind of irked about this, it's just wrong.
<daviid>how do i write a if HAVE_this and HAVE_that then ... endif in a Makefile.am? I see "if test <a single test>; then ..." but i can't find an 'and' or any complex example, any hint?
<linas>(set-port-encoding! sss "utf-8") fixes this behavior but it seems to be just a plain-bad behavior
<daviid>linas, i'd do (setlocale LC_ALL "") instead, but I thought all ports were utf-8 by default in guile-2.2, don't know, you'll have to check with one of our maintainers
<linas>daviid, thanks, but setlocale LC_ALL does not fix the bug .. and I am running 2.2 out of a recent git pull
<linas>re your automake question
<linas>I suspect logical-and is not supported, for compatibility reasons (too hard to convert to regular makefile?) just guessing, though.
<linas>I take that back, (set-port-encoding! sss "utf-8") does not fix the issue, either.
<linas>I'm stumped. Better not use display, I guess ...
<daviid>linas: yeah wrt automake, I can't even find an example of such tests
<mark_weaver>linas: what kind of garbage are you getting on the other end after (set-port-encoding! sss "utf-8") ?
<linas>the same ...
<linas>I am experimenting, and also getting this:
<linas>(use-modules (rnrs bytevectors)) (string->utf8 "Småland")
<linas>which prints #vu8(83 109 63 108 97 110 100)
<linas>which is clearly wrong, since the third letter å cannot possibly be decimal 63
<linas>mark_weaver -- so I assume, right now that the above rnrs bug might be the same bug, seen a different way!?
<mark_weaver>63 is "?", so the problem is happening when the string "Småland" is being read into the interpreter. it must not be reading as UTF-8.
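The byte values under discussion can be checked directly. This is a minimal sketch, assuming Guile 2.0 or later; the name `smaland` is only for illustration, and the å is built from its code point so the result does not depend on how the snippet itself gets decoded:

```scheme
(use-modules (rnrs bytevectors))

;; "Småland", with å built from its code point U+00E5.
(define smaland
  (string #\S #\m (integer->char #xE5) #\l #\a #\n #\d))

;; A correct UTF-8 encoding spends two bytes on å: 195 165 (hex c3 a5).
(string->utf8 smaland)
;; => #vu8(83 109 195 165 108 97 110 100)

;; If å was already replaced by "?" while the text was being read,
;; the encoder only ever sees the "?" and emits byte 63.
(string->utf8 "Sm?land")
;; => #vu8(83 109 63 108 97 110 100)
```

The second result matches the bytevector linas pasted, which is consistent with the character being lost on input rather than in `string->utf8`.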
<mark_weaver>linas: how are you typing in these code snippets to guile?
<mark_weaver>s/into the interpreter/by guile/
<linas>cut n paste, just like here.
<mark_weaver>into what? guile running in a bare terminal?
<mark_weaver>or in emacs?
<linas>well, for this demo, in the real bug, its crap that comes in from a C++ socket, to a java server, etc etc.
<linas>guile running in a bare terminal
<mark_weaver>okay, the default encoding for ports in guile is based on the current locale setting
<linas>current locale is En_US utf8
<linas>this is something that used to work great in guile-1.8 5 years ago
<mark_weaver>what is the value of the LANG environment variable where guile is run?
<mark_weaver>well, guile-1.8 has no concept of string encodings at all, everything was just raw bytes.
<linas>$ env |grep LANG
<linas>LANG=en_US.UTF-8
<mark_weaver>linas: are you setting the locale in your application that links with libguile?
<linas>(setlocale LC_ALL "")
<linas>did you try the code snippet above? its short ...
<linas>I set the locale both in the c code that calls guile, and again, within guile. Belt and suspenders
<mark_weaver>yes, it works for me in guile-2.0
<linas>ahh!
<mark_weaver>ACTION compiles master
<linas>I'm running guile 2.1 from a git pull from maybe June this summer
<linas>well, I'm stumped. The short demo seems to work fine in guile 2.0.9, but the server that is having the bug is running guile-2.0.9
<mark_weaver>linas: it also works for me running bipt's guile for guile-emacs.
<mark_weaver>which is based on master
<linas>I guess maybe I should try a newer pull from master.
<mark_weaver>linas: just for kicks, can you try LANG=en_US.utf8
<mark_weaver>(ending with "utf8" instead of "UTF-8")
<mark_weaver>it could also be due to issues with your terminal program, depending on what encoding it uses to talk to the subprocesses.
<mark_weaver>i.e. it might not be set up for UTF-8
<mark_weaver>you might try running: echo "Småland" > testfile
<linas>export LANG=en_US.utf8 didn't change anything. I've got to try git master ..
<mark_weaver>and then looking at the output of 'hexdump -c testfile'
<mark_weaver>from within that same terminal program
<mark_weaver>sorry, I meant hexdump -C
<linas>yeah, tried hexdump earlier
<linas>Doubly confusing, now: the guile-2.0.9 shell works great, but an app linked to guile-2.0.9 mangles the text.
<paroneayea>mark_weaver: hey! ping
<paroneayea>mark_weaver: any idea how to get an array of strings across the FFI? :)
<paroneayea>I've got this:
<paroneayea>(define PQexecParams (pointer->procedure '* (dynamic-func "PQexecParams" libpq) (list '* int '* '* '* '* int)))
<paroneayea>this should more or less work like:
<paroneayea>(PQexecParams conn (string->pointer "SELECT * FROM animals WHERE name = $1") 1 #nil (list->array '("monkey")) #nil #nil 0)
<mark_weaver>linas: the 'guile' program calls 'setlocale' automatically. when libguile is linked to a program, libguile doesn't set the locale. it's up to your program to do so.
<linas>Ahh but using (set-port-encoding! sss "utf-8") does fix things on my guile-2.0.9-linked server!
<paroneayea>however, I don't know how to do the list->array bit, or the nils or maybe even the integers if that doesn't happen automagically ;)
<linas>mark_weaver, yes, I understand. I have a work-around for my guile-2.0.9 server, and will retest with latest guile-git master shortly
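The effect of `set-port-encoding!` can be observed without a socket. The following is a rough sketch, assuming Guile 2.0 or later with `(ice-9 binary-ports)`; `display->bytes` and `smaland` are hypothetical names used only here, and an in-memory port stands in for the socket:

```scheme
(use-modules (ice-9 binary-ports))

;; Write STR with `display' to an in-memory port whose encoding is set
;; explicitly, and return the bytes that came out.
(define (display->bytes str encoding)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port encoding)
      (display str port)
      (force-output port)
      (get-bytes))))

;; "Småland", with å built from its code point U+00E5.
(define smaland
  (string #\S #\m (integer->char #xE5) #\l #\a #\n #\d))

(display->bytes smaland "UTF-8")
;; => #vu8(83 109 195 165 108 97 110 100)

(display->bytes smaland "ISO-8859-1")
;; => #vu8(83 109 229 108 97 110 100)
```

Setting the encoding per port makes the output independent of whether the embedding program remembered to call `setlocale`, which is why it can serve as a work-around in a program linked against libguile.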
<mark_weaver>paroneayea: so, from the C side, what should this array of strings look like? like an argv array?
<linas>better go help paroneayea who thankfully is not using guile-dbi
<paroneayea>linas: trying to learn enough to make an FFI alternative for postgres bindings :)
<paroneayea>mark_weaver: like this:
<paroneayea> const char *params[1];
<paroneayea> params[0] = "monkey";
<paroneayea>
<paroneayea>mark_weaver: so yeah, same kind of thing
<mark_weaver>paroneayea: in your call to 'pointer->procedure', it looks like the second argument should be an integer
<mark_weaver>but in your example call, the second argument is the result of 'string->pointer'
<paroneayea>mark_weaver: why should it be an integer?
<paroneayea>it's returning a pointer...
<paroneayea>to a PGresult
<paroneayea>postgres struct type
<mark_weaver>paroneayea: because the final list passed to 'pointer->procedure' is (list '* int '* '* '* '* int)
<mark_weaver>looks like you need another '* inserted at the front of that list.
<mark_weaver>but anyway, for the argv thing:
<mark_weaver>so, an easy way to do it is to model the array of pointers as a struct.
<mark_weaver>(with one field per element)
<paroneayea>:O
<paroneayea>mark_weaver: you can do that?
<paroneayea>mark_weaver: btw for reference, http://pamrel.lu/e0002/
<paroneayea>mark_weaver: are structs basically just arrays in C this whole time and I never knew? ;P
<mark_weaver>and then you can do something like: (define (list->array ls) (make-c-struct (make-list (+ 1 (length ls)) '*) (append (map string->pointer ls) (list %null-pointer))))
<mark_weaver>(untested)
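A sketch of the struct-as-array idea with a round-trip check, assuming Guile 2.0 or later and `(system foreign)`. The name `strings->c-array` is hypothetical. One assumption worth flagging: as far as I can tell, the memory behind `string->pointer` is released once the pointer object becomes unreachable, so this version returns the pointer objects next to the array to let the caller keep them alive:

```scheme
(use-modules (system foreign))

;; Build a NULL-terminated array of C strings, modelled as a struct with
;; one pointer field per element. Returns (array . pointers); holding on
;; to the pair keeps the string memory reachable while C uses the array.
(define (strings->c-array strings)
  (let ((pointers (map string->pointer strings)))
    (cons (make-c-struct (make-list (+ 1 (length pointers)) '*)
                         (append pointers (list %null-pointer)))
          pointers)))

(define params (strings->c-array '("monkey" "banana")))

;; Read the three slots back to check the layout.
(define slots (parse-c-struct (car params) (list '* '* '*)))

(map pointer->string (list (car slots) (cadr slots)))
;; => ("monkey" "banana")

(null-pointer? (caddr slots))
;; => #t
```

The `(car params)` pointer is what would be passed in the `paramValues` position of a wrapped C function.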
<paroneayea>:O
<mark_weaver>paroneayea: if that's too hacky for you, then the other way is to use bytevectors
<paroneayea>mark_weaver: I was trying to understand how the bytevectors stuff worked...
<mark_weaver>well, the basic idea would be to make a bytevector of the required length, which would hold the array of pointers, and use 'bytevector->pointer' to pass it to the wrapped C function.
<mark_weaver>so (sizeof '*) will tell you the size of pointers
<mark_weaver>and then you would use 'pointer-address' to get the numerical value of each pointer returned by 'string->pointer'
<paroneayea>okay :)
<paroneayea>ACTION pretends to be smart enough to follow all this ;)
<paroneayea>I think I'm catching on-ish but I need to play with it
<mark_weaver>and put those into the bytevector using 'bytevector-u64-native-set!' or the appropriate one depending on the size of pointers
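The bytevector route might look roughly like this. It is a sketch assuming Guile 2.0 or later; `strings->pointer-array` is a hypothetical name, and `bytevector-uint-set!` with `(sizeof '*)` is used here instead of a fixed-width setter so that the pointer size does not have to be hard-coded:

```scheme
(use-modules (system foreign) (rnrs bytevectors))

;; Returns (array-pointer bytevector pointer-objects). The last two are
;; returned only so the caller can keep them reachable.
(define (strings->pointer-array strings)
  (let* ((pointers (map string->pointer strings))
         (width (sizeof '*))
         ;; One extra, zero-filled slot serves as the terminating NULL.
         (bv (make-bytevector (* width (+ 1 (length pointers))) 0)))
    (let loop ((rest pointers) (offset 0))
      (unless (null? rest)
        (bytevector-uint-set! bv offset (pointer-address (car rest))
                              (native-endianness) width)
        (loop (cdr rest) (+ offset width))))
    (list (bytevector->pointer bv) bv pointers)))

(define params (strings->pointer-array '("monkey" "banana")))

;; The first slot holds the address of the C string "monkey".
(pointer->string (dereference-pointer (car params)))
;; => "monkey"
```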
<mark_weaver>but frankly, I would just do the 'make-c-struct' thing.
<mark_weaver>libffi, which is what our dynamic FFI is based on, doesn't have support for arrays at all
<mark_weaver>so in practice, we have to model them as structs sometimes.
<paroneayea>mark_weaver: so can a struct of all strings be used interchangeably with an array of all strings in C? :O
<mark_weaver>and from my recollections of reading the C standards on the layout of structs, I think this is safe and portable.
<mark_weaver>or at least as portable as guile
<mark_weaver>paroneayea: yes, I believe so
<paroneayea>whee :)
<paroneayea>The More I Know!
<mark_weaver>well, I should clarify (backpedal?) a bit
<mark_weaver>modern C compilers are allowed to make various assumptions about what can be accessed via a pointer, if that pointer is not a void* or char*
<mark_weaver>so, for example, if you do (*p = 0) from some code, where p is a pointer to a struct, the compiler can assume that nothing but a struct of that type was changed.
<mark_weaver>this is from memory, so I might be slightly off on some of the details
<paroneayea>mark_weaver: well, far above my head, but like so many things in my life, filed into orgmode until the time where I can fully understand it!
<mark_weaver>so I think it's not quite true that "all strings be used interchangeably with an array of all strings in C"
<mark_weaver>however, I'm sure that libffi has that angle covered in any number of ways.
<paroneayea>mark_weaver: okay, awesome :)
<paroneayea>mark_weaver: thank you!
<paroneayea>mark_weaver: this is a lot to wrap my head around, but I will do my best to do so!
<linas>gahh this is 3.5 times more annoying. My app breaks when I do (define x "Ćićolina\n") I get an encoding error. My brain hurts.
<linas>actually scm_eval_string(scm_from_utf8_string(" (define x \"Ćićolina\n\")"))); which is nuts because this works great on alternate fridays
***michel_mno_afk is now known as michel_mno
***michel_mno is now known as michel_mno_afk
<amz3>héllo :)
<paroneayea> http://snap.berkeley.edu/ fun! scratch reinvented to be schemey
<nalaginrut>paroneayea: you mean snap is written with Scheme now?
<amz3>nalaginrut: it's written in javascript
<paroneayea>yeah still seems to be js
<amz3>I've written another markdown parser nalaginrut, this time there is no dependency :D
<paroneayea>I was told more schemey, got perhaps disproportionately excited ;)
<nalaginrut>amz3: do you use LALR finally? or other magic?
<nalaginrut>;-)
<amz3>nalaginrut: magic magic it has to be magic
<amz3>they say it's schemey but it has for loops, I'm not sure i understand the sense of schemey
<amz3>« It also features first class lists, first class procedures, and continuations. »
<amz3>it's a good start for a modern language
<amz3>I think they have a strong point with visual programming
<amz3>I mean I like it
<amz3>nalaginrut: I started with the first parser combinator for GLL (racket) then made my way to get it to work for me
<amz3>It still needs a bit more testing, but if this parser combinator thing is as powerful as it is said, I'm hopeful that this will work flawlessly.
<amz3>I will convert my blog to it
<nalaginrut>amz3: what is GLL? ;-)
<amz3>some library/tutorial in racket that explains how to build a parser combinator
<amz3>it's on github
<amz3> https://github.com/epsil/gll
<amz3>the first parser combinator is enough to do what I need
<amz3>I'm wondering how big markdown parser is using LALR
<amz3>This week, I made a lot of progress
<nalaginrut>oh~it's LL
<nalaginrut>alright, it seems a very good article to read
<nalaginrut>ACTION added to bookmark
<amz3>it's a good article, especially if you use racket
<amz3>and scheme in general
<amz3>for instance, I did not bother to implement the `memo` procedure
<amz3>I tried to build the second parser with continuation and LL but failed
<amz3>there is trampoline in the third parser :)
<amz3>or is it introduced in the second
<nalaginrut>amz3: good, we may delay `memo' to the future, I'm pleased to see more parsing tools of Guile, for multi-lang
<taylanub>paroneayea: if you ever do something more complicated with C arrays/structs/whatever and bytevectors, you might want to check <https://github.com/TaylanUB/scheme-bytestructures>. (the README is quite convoluted so maybe just check the examples.) it suffers from lack of usage though; while the library is constructed very logically with maximal generality, it's not fine-tuned to make the most
<taylanub>common usage patterns most convenient (because I don't know what the most common usage patterns are).
<paroneayea>taylanub: thanks!
<paroneayea>o/
<davexunit>hey paroneayea
<paroneayea>hi davexunit!
<paroneayea>huh!
<paroneayea>is libffi efficient enough where doing (pointer->procedure) in every invocation is fast enough?
<paroneayea>I figured you'd want to do that once per commonly used function and (define) it somewhere you can access
<davexunit>you only want to do that conversion once
<davexunit>and store the result
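The convert-once pattern can be sketched with a libc function standing in for a database call. This assumes Guile 2.0 or later on a system where `(dynamic-link)` with no argument exposes libc symbols such as `strlen`; `c-strlen` and `c-string-length` are hypothetical names:

```scheme
(use-modules (system foreign))

;; Look the symbol up and wrap it once, at definition time ...
(define c-strlen
  (pointer->procedure size_t
                      (dynamic-func "strlen" (dynamic-link))
                      (list '*)))

;; ... then reuse the stored wrapper on every call.
(define (c-string-length str)
  (c-strlen (string->pointer str)))

(c-string-length "monkey")
;; => 6
```

Keeping the wrapper in a top-level `define` means the symbol lookup and the libffi call-interface setup happen a single time rather than on each invocation.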
<paroneayea>that's what I figured...
<paroneayea>ACTION was just looking at guile-sqlite3 and it was done per invocation
<davexunit>bad times.
<davexunit>are you sure, though?
<davexunit>surely wingo wouldn't do that unless there was good reason to do it.
<paroneayea>davexunit: that's why I asked! ;p
<paroneayea>ACTION dances
<paroneayea>things are starting to work-ish :)
<paroneayea>maybe time to move this into a git repo and out of my ~/sandbox/ directory :)
<paroneayea>Should I call my guile FFI powered postgres bindings library something boring like "guile-pg-ffi" or should I call it "heffalump"?
<davexunit>heffalump makes me lol
<davexunit>gotta go, though! happy hacking
<amz3>paroneayea: where did you publish your bindings?
<paroneayea>amz3: I haven't published them yet
<paroneayea>they're only starting to work
<paroneayea>I'm brand new to the FFI and postgres's C bindings simultaneously
<amz3>good
<paroneayea>so it has taken me a while :)
<amz3>you know C ?
<paroneayea>only barely enough :)
<linas>mark_weaver et al. After pulling from git last night, I still get utf8 bugs. The easiest one is this one:
<linas>(use-modules (rnrs bytevectors)) (setlocale LC_ALL "") (string->utf8 "Småland")
<linas>which returns an incorrect byte sequence for the circle-a (should be hex c3 a5 U+00E5)
<linas>similar results for (string->utf8 "Hòa Phú Phú Tân")
<daviid>linas: I'd stick to 2.0 for now if I was you :), any reason why you're using 2.2? wingo said, a while ago, there will be another +- 12 releases of 2.1 before it becomes stable... and wingo is working in his retreat, he is +- uncommunicable atm ...
<amz3>retreat? he will leave guile?
<daviid>no :) i meant like in his world
<daviid>sneek: seen wingo
<sneek>I last saw wingo on Jun 24 at 08:05 pm UTC, saying: and how long prerelease will take, no idear.
<amz3>ah ok ^^
<linas>I forget why I started playing with guile-git. There was some rational reason, though :-)
<linas>One problem is that my app, when linked to guile-2.0.9 had a set of bugs that go away when linked to guile-git
<daviid>well it's good to use the source, but checkout the stable-2.0 branch
<daviid>2.0.9 is really old now
<linas>and right now, I get utf8 bugs, no matter which one I link to .. except that they are different bugs, almost opposite of one another
<davexunit>amz3: remember that time I said I had a parser combinator implementation hanging around? here it is: http://paste.lisp.org/display/153344
<linas>thanks daviid because I'm slowly getting tied in knots over this.
<daviid>you should use the stable-2.0 git branch: 2.0.11 which has a lot of patches compared to the tarball
<amz3>davexunit: thx I will have a look at it. hopefully it doesn't have the same problem as mine
<davexunit>amz3: and what problem is that?
<daviid>linas: there are no related utf-8 bugs in stable afaict, but if, you'd immediately receive support from mark_weaver, I think
<amz3>davexunit: once I start combining parser, the parse result value becomes a mess of nested list and ordering is chaotic
<davexunit>your parser should transform lists into the relevant data structure
<amz3>I need to solve the problem on an example by example basis (almost)
<davexunit>I have a sequence parser that returns a list
<daviid>linas: when you git clone guile, you're in master [as for any git clone]. the problem is that the guile scheme wrt git, is master is the devel branch, you have to checkout the stable-2.0 to work with stable
<davexunit>but by lifting a list->foo procedure into the parser, I can transform as needed
<linas>yes right daviid
<amz3>davexunit: what is lifting, please, I saw that term already but... don't remember
<davexunit>lifting is the process of transforming a non-monadic function into a monadic one.
<davexunit>so, in the case of our parsers, let's say we parse a sequence of characters
<davexunit>our parse-sequence parser returns a list of characters, but we would like the final result to be a string
<davexunit>there's a list->string procedure, but the interface isn't right.
<davexunit>parsers take the character stream as an argument
<davexunit>so, this is where the lift comes in
<davexunit>are you familiar with the term "return" in the context of monads?
<amz3>no
<davexunit>in my library, (parse-return "foo") would return a parser that *always* returns "foo", regardless of the character stream given to it.
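The idea of lifting can be sketched with a toy representation. This is not the code from davexunit's paste: it is a hypothetical, minimal model in which a parser is a procedure from a list of characters to either `(value . rest)` on success or `#f` on failure, and the procedures below merely borrow the names used in the conversation:

```scheme
;; A parser that always yields VALUE and consumes nothing.
(define (parse-return value)
  (lambda (input) (cons value input)))

;; A parser for one character satisfying PRED.
(define (parse-char pred)
  (lambda (input)
    (and (pair? input)
         (pred (car input))
         (cons (car input) (cdr input)))))

;; Zero or more repetitions of PARSER, yielding a list of results.
(define (parse-many parser)
  (lambda (input)
    (let loop ((input input) (acc '()))
      (let ((result (parser input)))
        (if result
            (loop (cdr result) (cons (car result) acc))
            (cons (reverse acc) input))))))

;; Lift an ordinary procedure so it applies to a parser's result.
(define (parse-map proc parser)
  (lambda (input)
    (let ((result (parser input)))
      (and result (cons (proc (car result)) (cdr result))))))

;; list->string knows nothing about parsers; parse-map adapts it.
(define parse-word
  (parse-map list->string (parse-many (parse-char char-alphabetic?))))

(parse-word (string->list "foo bar"))
;; => ("foo" #\space #\b #\a #\r)
```

Here `list->string` never sees the character stream; the lifted version threads the remaining input through unchanged and only transforms the parsed value.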
<amz3>I mean I don't even know monads
<daviid>davexunit: cool! is this code on your git host maybe?
<davexunit>amz3: okay. I'm not great at thinking in terms of monads myself, but I get the rough idea. I'd recommend reading up on them a bit to understand how my code is working.
<davexunit>I think my code is rather straight forward, so hopefully it can assist your understanding.
<davexunit>daviid: I will upload it, but so far it's only in that paste.
<amz3>I should do a bit of reading indeed
<daviid>ok, let me know please, and thanks for sharing it
<linas>problem is, I have multiple production servers running on multiple machines :-(
<davexunit>daviid: here you go: https://git.dthompson.us/guile-parser-combinators.git
<daviid>thanks! cloning now :)
<davexunit>:)
<amz3>davexunit: how do you build the ast ?
<amz3>there is no parse-node pseudo parser that "annotate" the stream ?
<davexunit>amz3: I don't build an AST.
<amz3>how do you interpret the output then?
<davexunit>the user can build the representation that they need
<davexunit>so they could build an AST
<amz3>how ?
<paroneayea>davexunit: boo, "winnie the pooh" never entered the public domain
<davexunit>by writing a parser that produces the right AST object
<paroneayea>guess I can't use heffalump
<davexunit>paroneayea: boo indeed
<davexunit>I still blame Steamboat Willie for all of this.
<paroneayea>could call it proboscis
<paroneayea>or maybe I should give up and give it a boring name
<davexunit>boring name is fine
<davexunit>guile-postgres
<amz3>davexunit: that was my question, I need to write parser that produce that ast, that's where the problem the parser I have
<amz3>where the problem is
<amz3>I try to build an ast in a semi-automatic way, using parse-each and parse-any, but it doesn't work well, I think I need another kind of parsers
<amz3>also what do you think of (parse-if) and (parse-not)
<amz3>I had to build those to make the markdown parser
<amz3>actually I don't know otherwise
<amz3>to give you an example of how building ast is difficult in my program. Writing one-or-more in terms of zero-or-more doesn't result in the same ast
<davexunit>amz3: what are the semantics of parse-if and parse-not?
<davexunit>my code has parse-maybe
<amz3>(parse-if predicate parser): parser consumes the stream only if predicate is valid. in guile-log it is called (f-and)
<amz3>(parse-not parser) fails if parser succeeds, and succeeds if parser fails
<amz3>without consuming stream
<amz3>I don't understand the purpose of maybe
<amz3>parse-maybe
<davexunit>amz3: if the parse fails, it returns a default value instead.
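One possible reading of these three combinators, written against a toy model rather than either author's library. It assumes a parser is a procedure from a list of characters to `(value . rest)` or `#f`, and it treats the `predicate` of `parse-if` as a lookahead parser; both choices, and the two-argument form of `parse-maybe`, are assumptions made for illustration:

```scheme
;; A parser for one character satisfying PRED.
(define (parse-char pred)
  (lambda (input)
    (and (pair? input)
         (pred (car input))
         (cons (car input) (cdr input)))))

;; Succeeds, consuming nothing, exactly when PARSER fails.
(define (parse-not parser)
  (lambda (input)
    (if (parser input) #f (cons #t input))))

;; Runs PARSER only if the lookahead PREDICATE succeeds on the same input.
(define (parse-if predicate parser)
  (lambda (input)
    (and (predicate input) (parser input))))

;; Falls back to DEFAULT, consuming nothing, when PARSER fails.
(define (parse-maybe parser default)
  (lambda (input)
    (or (parser input) (cons default input))))

(define digit (parse-char char-numeric?))
(define letter (parse-char char-alphabetic?))
(define input (string->list "a1"))

((parse-not digit) input)         ; => (#t #\a #\1)
((parse-if letter letter) input)  ; => (#\a #\1)
((parse-maybe digit #\0) input)   ; => (#\0 #\a #\1)
```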
<paroneayea>mark_weaver: heya!
<paroneayea>mark_weaver: if you don't mind helping with one more thing, I'm very close on this...
<paroneayea>oh!
<paroneayea>actually
<paroneayea>nm :D
<paroneayea>I may have it after all!
<linas>daviid I get the same utf8 errors with the guile-git stable-2.0 branch as well.
<linas>so, for my server app, it seems guile git master is the one with the least utf8 craziness in it.
<linas>which is a great reason to use git master in the first place ...
<daviid>linas don't use master and report a bug for stable if you think it is a bug,
<daviid>imo
<davexunit>amz3: here's a couple of parsers for markdown headers: http://paste.lisp.org/display/153348
<daviid>I'm using utf-8 daily, never had a single bug in years, but i don't use socket and ... just current-*-ports and files
<davexunit>amz3: in terms of ASTs, they return the SXML representation of the markdown
<amz3>directly? that is nice
<amz3>I have a post-processing step
<davexunit>parse-map does the necessary post-processing
<amz3>yep parse-map seems very useful
<amz3>I'll have a better look tomorrow
<davexunit>I should probably write parse-match
<amz3>I fixed my ast processing somehow ^^
<amz3>nite!
<davexunit>that is sugar for (parse-map (match-lambda ...) ...)
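Such sugar could be written as a small macro. The sketch below assumes `(ice-9 match)` and a toy parser model (a procedure from input to `(value . rest)` or `#f`); `parse-map` and `parse-return` are re-defined here only to make the example stand alone, and the `header` parser with its SXML-like result is made up for illustration:

```scheme
(use-modules (ice-9 match))

;; Apply PROC to the value produced by PARSER.
(define (parse-map proc parser)
  (lambda (input)
    (let ((result (parser input)))
      (and result (cons (proc (car result)) (cdr result))))))

;; A parser that always yields VALUE and consumes nothing.
(define (parse-return value)
  (lambda (input) (cons value input)))

;; Sugar for (parse-map (match-lambda clause ...) parser).
(define-syntax-rule (parse-match parser clause ...)
  (parse-map (match-lambda clause ...) parser))

;; Destructure a (level text) result into an SXML-like element.
(define header
  (parse-match (parse-return '(2 "Title"))
    ((level text)
     (list (string->symbol (string-append "h" (number->string level)))
           text))))

(car (header '()))
;; => (h2 "Title")
```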
<davexunit>later
<linas>daviid, yes, except that master is the least buggy of all the different versions of guile I played with.
<linas>Isolating bugs is a huge effort.
<linas>I've already lost 2-3 days trying to track this one down.
<linas>and it's hitting production servers such that they have to be rebooted a few times a day.
<linas>It's simpler just to run on master, than to try to debug stable
<daviid>linas, I really don't understand, using guile-2.1 for production servers ???
<linas>its the lest buggy version of guile out there
<linas>least
<daviid>i'd never do that i'm sorry
<linas>what can I say
<linas>I got lazy, put several production servers on the old, stable guile, and promptly got hit
<daviid>and wingo is not available, you won't have support before nobody knows when [not blaming him here, he just works on the compiler and has recently changed plans ..]
<linas>problem is the guy running them didn't tell me
<daviid>linas at all cost, production servers, run guile-2.0, and send a tiny example, you will receive support
<linas>yes, but as I said, guile-2.0 is buggier than master
<daviid>no way
<linas>has been like that for years.
<daviid>prove it :)
<linas>I've filed multiple bug reports. Some have gotten fixed
<linas>some are bugs that I was told would never be fixed in stable, because they are not worth the effort
<daviid>real bugs always get fixed, but some people send a bug when in fact it's an rfi, not the same thing ...
<daviid>bbl
<linas>stable has multiple real bugs in it that have not been fixed
<linas>and I am not the only one to hit them
<linas>Sheesh
<daviid>ping maintainers then
<linas>and I do, every now and then.
<daviid>but running guile-2.1 in production is just crazy, imo
<linas>Look, I've been doing this for what, 15 years now?
<linas>its not crazy because
<daviid>i know :)
<linas>it doesn't leak memory
<linas>it doesn't crash with weirdo GC bugs
<linas>and it does utf8 mostly a whole lot better
<linas>oh, and its significantly faster
<daviid>ok
<daviid>the last 1 is expected of course, but ... do as you wish
<daviid>linas: didn't mean to criticize you by the way, good luck!