<amz3>I'm trying to build a parser combinator based on GLL. All my text is turned into a mishmash of strings and parens
<amz3>terminals are parsed correctly but after that... mishmash
<paroneayea>okay, nice, I think I've read enough to be able to implement this, assuming I can make my way around the FFI API
<amz3>paroneayea: I made a little macro to help with ffi, maybe you'll find it useful, I published it on the ML, it has "wiredtiger" in the subject
<paroneayea>ACTION taking a break and playing some Dungeon Crawl Stone Soup for a few
<amz3>I had a quick look at the header, you won't need an improved version since there don't seem to be any function pointers in the structs
<amz3>little markdown parser is on its way :)
<linas>Seems I have a utf8-bug in guile-2.1 .. or I'm crazy.
<linas>using display to send text on a socket fails:
<linas>so, for example: tell netcat to listen on port 7777 -- "nc -l 7777"
<linas>then (define sss (socket PF_INET SOCK_STREAM 0))
<linas>(connect sss AF_INET (inet-pton AF_INET "127.0.0.1") 7777)
<linas>(display "SmålandSmåland\n" sss)
<linas>which results in netcat receiving garbage. I'm kind of irked about this, it's just wrong.
<daviid>how do i write an if HAVE_this and HAVE_that then ... endif in a Makefile.am? I see "if test <a single test>; then ..." but i can't find an 'and' or any complex example, any hint?
<linas>(set-port-encoding! sss "utf-8") fixes this behavior but it seems to be just plain-bad behavior
<daviid>linas, i'd do (setlocale LC_ALL "") instead, but I thought all ports were utf-8 by default in guile-2.2, don't know, you'll have to check with one of our maintainers
<linas>daviid, thanks, but setlocale LC_ALL does not fix the bug .. and I am running 2.2 out of a recent git pull
<linas>I suspect logical-and is not supported, for compatibility reasons (too hard to convert to regular makefile?) just guessing, though.
<linas>I take that back, (set-port-encoding! sss "utf-8") does not fix the issue, either.
<linas>I'm stumped. Better not use display, I guess ...
<daviid>linas: yeah, wrt automake, I can't find even one example of such tests
<mark_weaver>linas: what kind of garbage are you getting on the other end after (set-port-encoding! sss "utf-8") ?
<linas>I am experimenting, and also getting this:
<linas>(use-modules (rnrs bytevectors)) (string->utf8 "Småland")
<linas>which prints #vu8(83 109 63 108 97 110 100)
<linas>which is clearly wrong, since the third letter å cannot possibly be decimal 63
<linas>mark_weaver -- so I assume, right now, that the above rnrs bug might be the same bug, seen a different way!?
<mark_weaver>63 is "?", so the problem is happening when the string "Småland" is being read into the interpreter. it must not be reading as UTF-8.
<mark_weaver>linas: how are you typing in these code snippets to guile?
<linas>cut n paste, just like here.
<linas>well, for this demo. in the real bug, it's crap that comes in from a C++ socket, to a java server, etc etc.
<linas>guile running in a bare terminal
<mark_weaver>okay, the default encoding for ports in guile is based on the current locale setting
<linas>current locale is En_US utf8
<linas>this is something that used to work great in guile-1.8 5 years ago
<mark_weaver>what is the value of the LANG environment variable where guile is run?
<mark_weaver>well, guile-1.8 had no concept of string encodings at all, everything was just raw bytes.
<mark_weaver>linas: are you setting the locale in your application that links with libguile?
<linas>did you try the code snippet above? it's short ...
<linas>I set the locale both in the c code that calls guile, and again, within guile. Belt and suspenders
<linas>I'm running guile 2.1 from a git pull from maybe June this summer
<linas>well, I'm stumped. The short demo seems to work fine in guile 2.0.9, but the server that is having the bug is running guile-2.0.9
<mark_weaver>linas: it also works for me running bipt's guile for guile-emacs.
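One way to separate the two suspects mark_weaver mentions (the reader mis-decoding the literal versus `string->utf8` mis-encoding the string) is to build the string from an explicit code point, so that no non-ASCII byte ever passes through the reader. The following is only a sketch; the names `smaland` and `same-as-reference?` are made up for illustration.

```scheme
(use-modules (rnrs bytevectors))

;; Build "Småland" without typing a non-ASCII character:
;; U+00E5 is LATIN SMALL LETTER A WITH RING ABOVE.
(define smaland
  (string-append "Sm" (string (integer->char #xE5)) "land"))

;; UTF-8 encodes U+00E5 as the two bytes C3 A5 (195 165).
(string->utf8 smaland)
;; => #vu8(83 109 195 165 108 97 110 100)

;; Hypothetical helper: compare any string against the reference.
(define (same-as-reference? str)
  (equal? (string->utf8 str) (string->utf8 smaland)))

;; (same-as-reference? "Småland") is #t only if the reader decoded the
;; pasted literal correctly. If it is #f and the third character is #\?,
;; the damage happened on input, not in string->utf8.
```

If the first expression prints the expected bytes while the pasted literal still yields 63, that points at the input port's encoding, which is what mark_weaver suspects.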
<linas>I guess maybe I should try a newer pull from master.
<mark_weaver>it could also be due to issues with your terminal program, depending on what encoding it uses to talk to the subprocesses.
<linas>export LANG=en_US.utf8 didn't change anything. I've got to try git master ..
<mark_weaver>and then looking at the output of 'hexdump -c testfile'
<linas>Doubly confusing, now: the guile-2.0.9 shell works great, but an app linked to guile-2.0.9 mangles the text.
<paroneayea>mark_weaver: any idea how to get an array of strings across the FFI? :)
<paroneayea>(define PQexecParams (pointer->procedure '* (dynamic-func "PQexecParams" libpq) (list '* int '* '* '* '* int)))
<paroneayea>(PQexecParams conn (string->pointer "SELECT * FROM animals WHERE name = $1") 1 #nil (list->array '("monkey")) #nil #nil 0)
<mark_weaver>linas: the 'guile' program calls 'setlocale' automatically. when libguile is linked to a program, libguile doesn't set the locale. it's up to your program to do so.
<linas>Ahh but using (set-port-encoding! sss "utf-8") does fix things on my guile-2.0.9-linked server!
<paroneayea>however, I don't know how to do the list->array bit, or the nils or maybe even the integers if that doesn't happen automagically ;)
<linas>mark_weaver, yes, I understand. I have a work-around for my guile-2.0.9 server, and will retest with latest guile-git master shortly
<mark_weaver>paroneayea: so, from the C side, what should this array of strings look like? like an argv array?
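The workaround linas describes, setting the encoding on the port itself instead of relying on the locale, can be tried without a socket. This is a sketch under assumptions: `display->bytes` is a made-up helper, and an in-memory bytevector port stands in for the socket `sss`.

```scheme
(use-modules (rnrs bytevectors) (rnrs io ports))

;; An embedding program has to install the locale itself; the 'guile'
;; executable does this on startup, libguile does not.  The call is
;; guarded because setlocale signals an error for an unknown locale.
(false-if-exception (setlocale LC_ALL ""))

;; Hypothetical helper: display STR on a fresh port whose encoding is
;; set explicitly, and return the bytes that were written.
(define (display->bytes str encoding)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (set-port-encoding! port encoding)
      (display str port)
      (get-bytes))))

(define smaland
  (string-append "Sm" (string (integer->char #xE5)) "land"))

(display->bytes smaland "UTF-8")
;; => #vu8(83 109 195 165 108 97 110 100)

;; For the socket from the conversation the same call would be:
;;   (set-port-encoding! sss "UTF-8")
```

Setting the encoding per port keeps the output independent of whatever locale the host program did or did not install, which matches the difference linas saw between the guile shell and his linked application.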
<linas>better go help paroneayea who thankfully is not using guile-dbi
<paroneayea>linas: trying to learn enough to make an FFI alternative for postgres bindings :)
<mark_weaver>paroneayea: in your call to 'pointer->procedure', it looks like the second argument should be an integer
<mark_weaver>but in your example call, the second argument is the result of 'string->pointer'
<mark_weaver>paroneayea: because the final list passed to 'pointer->procedure' is (list '* int '* '* '* '* int)
<mark_weaver>looks like you need another '* inserted at the front of that list.
<mark_weaver>so, an easy way to do it is to model the array of pointers as a struct.
<paroneayea>mark_weaver: are structs basically just arrays in C this whole time and I never knew? ;P
<mark_weaver>and then you can do something like: (define (list->array ls) (make-c-struct (make-list (+ 1 (length ls)) '*) (append (map string->pointer ls) (list %null-pointer))))
<mark_weaver>paroneayea: if that's too hacky for you, then the other way is to use bytevectors
<paroneayea>mark_weaver: I was trying to understand how the bytevectors stuff worked...
<mark_weaver>well, the basic idea would be to make a bytevector of the required length, which would hold the array of pointers, and use 'bytevector->pointer' to pass it to the wrapped C function.
<mark_weaver>and then you would use 'pointer-address' to get the numerical value of each pointer returned by 'string->pointer'
<paroneayea>ACTION pretends to be smart enough to follow all this ;)
<paroneayea>I think I'm catching on-ish but I need to play with it
<mark_weaver>and put those into the bytevector using 'bytevector-u64-native-set!' or the appropriate one depending on the size of pointers
<mark_weaver>but frankly, I would just do the 'make-c-struct' thing.
<mark_weaver>libffi, which is what our dynamic FFI is based on, doesn't have support for arrays at all
<mark_weaver>so in practice, we have to model them as structs sometimes.
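The bytevector route mark_weaver outlines could look roughly like the sketch below. The procedure name `strings->pointer-array` is invented here, and `bytevector-uint-set!` with `(sizeof '*)` is used in place of `bytevector-u64-native-set!` so that the pointer size is not hard-coded; treat it as an illustration, not as the recommended binding code.

```scheme
(use-modules (system foreign) (rnrs bytevectors))

;; Hypothetical helper: build a NULL-terminated array of C strings,
;; argv style.  Returns two values: the pointer to the array, and the
;; list of string pointers, which the caller should keep referenced
;; until the C call has returned so they are not collected early.
(define (strings->pointer-array strings)
  (let* ((ptr-size (sizeof '*))
         (pointers (map string->pointer strings))
         ;; One extra slot, zero-filled, serves as the NULL terminator.
         (bv (make-bytevector (* ptr-size (+ 1 (length strings))) 0)))
    (let loop ((ps pointers) (offset 0))
      (unless (null? ps)
        (bytevector-uint-set! bv offset (pointer-address (car ps))
                              (native-endianness) ptr-size)
        (loop (cdr ps) (+ offset ptr-size))))
    (values (bytevector->pointer bv) pointers)))

;; Reading the first slot back from the array:
(call-with-values (lambda () (strings->pointer-array '("monkey" "ape")))
  (lambda (array keep-alive)
    (pointer->string (dereference-pointer array))))
;; => "monkey"
```

The returned array pointer would take the place of `(list->array '("monkey"))` in the `PQexecParams` call. As mark_weaver says, the `make-c-struct` version is shorter; the bytevector version mainly makes the memory layout explicit.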
<paroneayea>mark_weaver: so can a struct of all strings be used interchangeably with an array of all strings in C? :O
<mark_weaver>and from my recollections of reading the C standards on the layout of structs, I think this is safe and portable.
<mark_weaver>modern C compilers are allowed to make various assumptions about what can be accessed via a pointer, if that pointer is not a void* or char*
<mark_weaver>so, for example, if you do (*p = 0) from some code, where p is a pointer to a struct, the compiler can assume that nothing but a struct of that type was changed.
<mark_weaver>this is from memory, so I might be slightly off on some of the details
<paroneayea>mark_weaver: well, far above my head, but like so many things in my life, filed into orgmode until the time where I can fully understand it!
<mark_weaver>so I think it's not quite true that "all strings be used interchangeably with an array of all strings in C"
<mark_weaver>however, I'm sure that libffi has that angle covered in any number of ways.
<paroneayea>mark_weaver: this is a lot to wrap my head around, but I will do my best to do so!
<linas>gahh this is 3.5 times more annoying. My app breaks when I do (define x "Ćićolina\n"), I get an encoding error. My brain hurts.
<linas>actually scm_eval_string(scm_from_utf8_string(" (define x \"Ćićolina\n\")")); which is nuts because this works great on alternate fridays
<nalaginrut>paroneayea: you mean snap is written with Scheme now?
<amz3>nalaginrut: it's written in javascript
<amz3>I've written another markdown parser nalaginrut, this time there is no dependency :D
<paroneayea>I was told more schemey, got perhaps disproportionately excited ;)
<amz3>nalaginrut: magic magic it has to be magic
<amz3>they say it's schemey but it has for loops, I'm not sure I understand the sense of schemey
<amz3>« It also features first class lists, first class procedures, and continuations. »
<amz3>it's a good start for a modern language
<amz3>I think they have a strong point with visual programming
<amz3>nalaginrut: I started with the first parser combinator for GLL (racket) then made my way to get it to work for me
<amz3>It still needs a bit more testing, but if this parser combinator thing is as powerful as it is said, I'm hopeful that this will work flawlessly.
<amz3>I will convert my blog to it
<amz3>some library/tutorial in racket that explains how to build a parser combinator
<amz3>the first parser combinator is enough to do what I need
<amz3>I'm wondering how big a markdown parser is using LALR
<amz3>This week, I made a lot of progress
<amz3>it's a good article, especially if you racket
<amz3>for instance, I did not bother to implement the `memo` procedure
<amz3>I tried to build the second parser with continuation and LL but failed
<amz3>there is a trampoline in the third parser :)
<amz3>or is it introduced in the second
<nalaginrut>amz3: good, we may delay `memo' to the future, I'm pleased to see more parsing tools for Guile, for multi-lang
<taylanub>paroneayea: if you ever do something more complicated with C arrays/structs/whatever and bytevectors, you might want to check <https://github.com/TaylanUB/scheme-bytestructures>. (the README is quite convoluted so maybe just check the examples.) it suffers from lack of usage though; while the library is constructed very logically with maximal generality, it's not fine-tuned to make the most
<taylanub>common usage patterns most convenient (because I don't know what the most common usage patterns are).
<paroneayea>is libffi efficient enough that doing (pointer->procedure) in every invocation is fast enough?
<paroneayea>I figured you'd want to do that once per commonly used function and (define) it somewhere you can access
<paroneayea>ACTION was just looking at guile-sqlite3 and it was done per invocation
<davexunit>surely wingo wouldn't do that unless there was good reason to do it.
<paroneayea>maybe time to move this into a git repo and out of my ~/sandbox/ directory :)
<paroneayea>Should I call my guile FFI powered postgres bindings library something boring like "guile-pg-ffi" or should I call it "heffalump"?
<amz3>paroneayea: where did you publish your bindings?
<paroneayea>I'm brand new to the FFI and postgres's C bindings simultaneously
<linas>mark_weaver et al. After pulling from git last night, I still get utf8 bugs. The easiest one is this one:
<linas>(use-modules (rnrs bytevectors)) (setlocale LC_ALL "") (string->utf8 "Småland")
<linas>which returns an incorrect byte sequence for the circle-a (should be hex c3 a5 U+00E5)
<linas>similar results for (string->utf8 "Hòa Phú Phú Tân")
<daviid>linas: I'd stick to 2.0 for now if I was you :), any reason why you're using 2.2? wingo said, a while ago, there will be another +- 12 releases of 2.1 before it becomes stable... and wingo is working in his retreat, he is +- uncommunicable atm ...
<amz3>retreat? he will leave guile?
<sneek>I last saw wingo on Jun 24 at 08:05 pm UTC, saying: and how long prerelease will take, no idear.
<linas>I forget why I started playing with guile-git. There was some rational reason, though :-)
<linas>One problem is that my app, when linked to guile-2.0.9, had a set of bugs that go away when linked to guile-git
<daviid>well it's good to use the source, but check out the stable-2.0 branch
<linas>and right now, I get utf8 bugs, no matter which one I link to .. except that they are different bugs, almost opposite of one another
<linas>thanks daviid because I'm slowly getting tied in knots over this.
<daviid>you should use the stable-2.0 git branch: 2.0.11 which has a lot of patches compared to the tarball
<amz3>davexunit: thx I will have a look at it. hopefully it doesn't have the same problem as mine
<daviid>linas: there are no related utf-8 bugs in stable afaict, but if there are, you'd immediately receive support from mark_weaver, I think
<amz3>davexunit: once I start combining parsers, the parse result value becomes a mess of nested lists and the ordering is chaotic
<davexunit>your parser should transform lists into the relevant data structure
<amz3>I need to solve the problem on an example by example basis (almost)
<davexunit>I have a sequence parser that returns a list
<daviid>linas: when you git clone guile, you're in master [as for any git clone]. the problem is that in guile's git scheme, master is the devel branch, you have to check out stable-2.0 to work with stable
<davexunit>but by lifting a list->foo procedure into the parser, I can transform as needed
<amz3>davexunit: what is lifting, please, I saw that term already but... don't remember
<davexunit>lifting is the process of transforming a non-monadic function into a monadic one.
<davexunit>so, in the case of our parsers, let's say we parse a sequence of characters
<davexunit>our parse-sequence parser returns a list of characters, but we would like the final result to be a string
<davexunit>there's a list->string procedure, but the interface isn't right.
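A toy version of the lifting idea davexunit describes is sketched below. It is not davexunit's library: the representation (a parser is a procedure from a list of characters to either a `(value . rest)` pair or `#f`) and the names `parse-char-if`, `parse-sequence`, `parse-lift`, and `parse-word` are all assumptions made for the example.

```scheme
;; A parser takes a list of characters and returns (value . rest)
;; on success, or #f on failure.

;; Match one character satisfying PRED.
(define (parse-char-if pred)
  (lambda (stream)
    (and (pair? stream)
         (pred (car stream))
         (cons (car stream) (cdr stream)))))

;; Apply PARSER zero or more times and collect the results in a list.
(define (parse-sequence parser)
  (lambda (stream)
    (let loop ((stream stream) (acc '()))
      (let ((result (parser stream)))
        (if result
            (loop (cdr result) (cons (car result) acc))
            (cons (reverse acc) stream))))))

;; Lift an ordinary procedure so that it maps parsers to parsers:
;; the new parser applies PROC to the value of a successful parse
;; and passes failure through untouched.
(define (parse-lift proc)
  (lambda (parser)
    (lambda (stream)
      (let ((result (parser stream)))
        (and result
             (cons (proc (car result)) (cdr result)))))))

;; list->string cannot be applied to a parser directly, but its
;; lifted form can.
(define parse-word
  ((parse-lift list->string)
   (parse-sequence (parse-char-if char-alphabetic?))))

(parse-word (string->list "foo bar"))
;; => ("foo" #\space #\b #\a #\r)
```

The value part of the result is now the string "foo" instead of a list of three characters, while the remaining stream is left for whichever parser runs next.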
<davexunit>parsers take the character stream as an argument
<davexunit>are you familiar with the term "return" in the context of monads?
<davexunit>in my library, (parse-return "foo") would return a parser that *always* returns "foo", regardless of the character stream given to it.
<amz3>I mean even monads, I don't know them
<daviid>davexunit: cool! is this code on your git host maybe?
<davexunit>amz3: okay. I'm not great at thinking in terms of monads myself, but I get the rough idea. I'd recommend reading up on them a bit to understand how my code is working.
<davexunit>I think my code is rather straightforward, so hopefully it can assist your understanding.
<davexunit>daviid: I will upload it, but so far it's only in that paste.
<amz3>I should do a bit of reading indeed
<daviid>ok, let me know please, and tx for sharing it
<linas>problem is, I have multiple production servers running on multiple machines :-(
<amz3>davexunit: how do you build the ast ?
<amz3>there is no parse-node pseudo parser that "annotates" the stream ?
<amz3>how do you interpret the output then?
<davexunit>the user can build the representation that they need
<paroneayea>davexunit: boo, "winnie the pooh" never entered the public domain
<davexunit>by writing a parser that produces the right AST object
<davexunit>I still blame Steamboat Willie for all of this.
<paroneayea>or maybe I should give up and give it a boring name
<amz3>davexunit: that was my question, I need to write a parser that produces that ast, that's where the problem is with the parser I have
<amz3>I try to build an ast in a semi-automatic way, using parse-each and parse-any, but it doesn't work well, I think I need another kind of parser
<amz3>also what do you think of (parse-if) and (parse-not)
<amz3>I had to build those to make the markdown parser
<amz3>actually I don't know how to do it otherwise
<amz3>to give you an example of how building the ast is difficult in my program. Writing one-or-more in terms of zero-or-more doesn't result in the same ast
<davexunit>amz3: what are the semantics of parse-if and parse-not?
<amz3>(parse-if predicate parser): parser consumes the stream only if predicate is valid. in guile-log it is called (f-and)
<amz3>(parse-not parser) fails if parser succeeds, and succeeds if parser fails
<amz3>without consuming the stream
<amz3>I don't understand the purpose of maybe
<davexunit>amz3: if the parse fails, it returns a default value instead.
<paroneayea>mark_weaver: if you don't mind helping with one more thing, I'm very close on this...
<linas>daviid I get the same utf8 errors with the guile-git stable-2.0 branch as well.
<linas>so, for my server app, it seems guile git master is the one with the least utf8 craziness in it.
<linas>which is a great reason to use git master in the first place ...
<daviid>linas don't use master and report a bug for stable if you think it is a bug,
<daviid>I'm using utf-8 daily, never had a single bug in years, but i don't use sockets and ... just current-*-ports and files
<davexunit>amz3: in terms of ASTs, they return the SXML representation of the markdown
<amz3>I have a post-processing step
<davexunit>parse-map does the necessary post-processing
<amz3>yep parse-map seems very useful
<amz3>I'll have a better look tomorrow
<amz3>I fixed my ast processing somehow ^^
<davexunit>that is sugar for (parse-map (match-lambda ...) ...)
<linas>daviid, yes, except that master is the least buggy of all the different versions of guile I played with.
<linas>Isolating bugs is a huge effort.
<linas>I've already lost 2-3 days trying to track this one down.
<linas>and it's hitting production servers such that they have to be rebooted a few times a day.
<linas>It's simpler just to run on master, than to try to debug stable
<daviid>linas, I really don't understand, using guile-2.1 for production servers ???
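The combinators discussed above by amz3 and davexunit (parse-return, parse-not, parse-if, and maybe) can be sketched on a toy parser representation. This is neither amz3's nor davexunit's code: a parser is assumed to be a procedure from a list of characters to a `(value . rest)` pair or `#f`, the "predicate" of parse-if is assumed to be another parser used purely as lookahead, and `parse-char` and `parse-maybe` are names picked for the example.

```scheme
;; A parser takes a list of characters and returns (value . rest)
;; on success, or #f on failure.

;; Always succeed with VALUE, consuming nothing.
(define (parse-return value)
  (lambda (stream) (cons value stream)))

;; Match exactly the character C.
(define (parse-char c)
  (lambda (stream)
    (and (pair? stream)
         (char=? (car stream) c)
         (cons c (cdr stream)))))

;; Succeed (consuming nothing) exactly when PARSER fails.
(define (parse-not parser)
  (lambda (stream)
    (if (parser stream)
        #f
        (cons #t stream))))

;; Run PARSER only if PREDICATE succeeds at the same position.
;; PREDICATE's own result and consumption are discarded.
(define (parse-if predicate parser)
  (lambda (stream)
    (and (predicate stream)
         (parser stream))))

;; Try PARSER; on failure succeed with DEFAULT, consuming nothing.
(define (parse-maybe parser default)
  (lambda (stream)
    (or (parser stream)
        (cons default stream))))

((parse-return "foo") (string->list "xyz"))
;; => ("foo" #\x #\y #\z)
((parse-not (parse-char #\a)) (string->list "abc"))
;; => #f
((parse-maybe (parse-char #\a) 'none) (string->list "bc"))
;; => (none #\b #\c)
```

In this reading, maybe is what lets an optional piece of syntax contribute a placeholder value to the result instead of making the whole parse fail.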
<linas>it's the least buggy version of guile out there
<linas>I got lazy, put several production servers on the old, stable guile, and promptly got hit
<daviid>and wingo is not available, you won't have support before nobody knows when [not blaming him here, he just works on the compiler and has recently changed plans ..]
<linas>problem is the guy running them didn't tell me
<daviid>linas at all cost, production servers, run guile-2.0, and send a tiny example, you will receive support
<linas>yes, but as I said, guile-2.0 is buggier than master
<linas>has been like that for years.
<linas>I've filed multiple bug reports. Some have gotten fixed
<linas>some are bugs that I was told would never be fixed in stable, because they are not worth the effort
<daviid>real bugs always get fixed, but some people send a bug when in fact it's an rfi, not the same thing ...
<linas>stable has multiple real bugs in it that have not been fixed
<linas>and I am not the only one to hit them
<linas>and I do, every now and then.
<daviid>but running guile-2.1 in production is just crazy, imo
<linas>Look, I've been doing this for what, 15 years now?
<linas>it doesn't crash with weirdo GC bugs
<linas>and it does utf8 mostly a whole lot better
<linas>oh, and it's significantly faster
<daviid>the last 1 is expected of course, but ... do as you wish
<daviid>linas: didn't mean to criticize you by the way, good luck!