<mwette>I checked out guile from sv and created a branch wip-load-lang. Should I submit it as a patch or what?
<stis>vijaymarupudi: Great blog post, thanks!
<leinad>Is there some Guile lib/module that does XML pretty-printing? To be more precise: I have SXML and I would like to serialize it to properly indented XML.
<leinad>Thanks, lampilelo, I will have a look
<lampilelo>leinad: and you can also use some C library through the FFI if there's nothing ready-to-use
<leinad>True, but I think it's a bit heavyweight to pull in a C library for something this trivial :)
<lampilelo>it's not very hard to make a wrapper over a few functions
<leinad>I'd rather write some small procedure since I already have an SXML representation of the XML
<leinad>I like Guile's FFI interface, though
<leinad>It's cool to be able to reach out for C code in only a few lines of Scheme code :)
<civodul>leinad: i see what you mean, but there's no such thing as "pretty-printed XML" because whitespace is significant
<leinad>Oh, I see... However, if whitespace is significant, why is xml->sxml stripping CRLF to LF? Is this then a bug?
<civodul>leinad: whitespace is significant but crlf vs. cr is another story i guess :-)
<leinad>so it's the textual ports API that reads in the CRLF as LF?
<leinad>do you have any suggestion on how to read a file like the one mentioned on the mailing list and transform it to sxml without losing the CRLFs?
<civodul>if you read it using the binary port API, you get the exact bytes
<civodul>now if you want to parse xml and then do sxml->xml, you'll somehow need to set the "EOL style" of the output port to the right value
***roptat is now known as Guest3438
***Guest3438 is now known as roptat
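The difference civodul describes between the binary view and the parsed view can be checked with a minimal sketch like the one below. The sample string and the names `sample`, `raw-bytes` and `tree` are made up for illustration; an in-memory bytevector port stands in for a real file.

```scheme
(use-modules (sxml simple)
             (rnrs bytevectors)
             (ice-9 binary-ports))

;; Hypothetical sample: one element whose only content is a CRLF.
(define sample "<foo>\r\n</foo>")

;; Binary view: read everything through a binary port, byte for byte.
(define raw-bytes
  (get-bytevector-all
   (open-bytevector-input-port (string->utf8 sample))))

;; Parsed view: feed the same text to xml->sxml through a string port.
(define tree (call-with-input-string sample xml->sxml))

;; The byte list still contains 13 (CR) followed by 10 (LF).
(display (bytevector->u8-list raw-bytes))
(newline)

;; The parsed tree is expected to keep only the LF: (*TOP* (foo "\n"))
(write tree)
(newline)
```

For a file on disk, the same bytes should be obtainable with `(call-with-input-file "some-file.xml" get-bytevector-all #:binary #t)`, where the file name is a placeholder.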
<leinad>civodul: from what I understand about xml->sxml's source, using binary ports to preserve CRLF is not that trivial :(
<leinad>internally it uses the SSAX parser, which in turn uses its own port handling, from what I understand from glancing over the sources
<leinad>Am I missing something obvious you had in mind?
<leinad>Apparently it is correct behaviour to strip the CR from a CRLF according to the XML Recommendation, so xml->sxml works perfectly
<wingo>leinad: fwiw i see the same as you, (call-with-input-string "<foo>\r\n</foo>" xml->sxml) -> (*TOP* (foo "\n"))
<wingo>but weirdly i don't see where the \r is getting trimmed
<vijaymarupudi>leinad, could you replace the CRLFs in the original string with a unique element, say <crlf></crlf>, and then replace it again in the output?
<leinad>vijaymarupudi: hmm that might work :) thanks
<leinad>wingo: the SSAX implementation in Guile is in accordance with the XML Recommendation. At least a comment in sxml/upstream/SSAX.scm before the ssax:read-char-data procedure says so :D
<leinad>so it is indeed the application I am feeding my generated XML into which is misbehaving :/
<leinad>yes, but also annoying: the XML contains some JavaScript with //-style comments, which doesn't go well with this behaviour, as you can imagine ;-)
<wingo>ah yes, html parsing is a mess
<wingo>someone recently wrote a spec-compliant html parser in racket, we should port that over
<wingo>the htmlprag that we have is well-meaning but out of date
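vijaymarupudi's placeholder idea could look roughly like the sketch below. It is only one possible reading of the suggestion: the helper names `protect-crlf`, `restore-crlf` and `round-trip` are invented here, and the placeholder is swapped back in the SXML tree rather than in the serialized text, to avoid depending on how sxml->xml writes empty elements. It assumes the document has no real `crlf` element and that the CRLFs sit in ordinary element content; a CRLF inside an attribute value, comment or CDATA section would need different handling.

```scheme
(use-modules (sxml simple)
             (ice-9 string-fun))

;; Replace every CRLF in the raw text with a placeholder element
;; before the parser gets a chance to normalize it.
(define (protect-crlf str)
  (string-replace-substring str "\r\n" "<crlf></crlf>"))

;; Walk the SXML tree and turn each (crlf) node back into a CRLF string.
(define (restore-crlf node)
  (cond ((equal? node '(crlf)) "\r\n")
        ((pair? node) (map restore-crlf node))
        (else node)))

;; Parse with placeholders, restore them, then serialize to a string.
(define (round-trip str)
  (let ((tree (call-with-input-string (protect-crlf str) xml->sxml)))
    (call-with-output-string
      (lambda (port)
        (sxml->xml (restore-crlf tree) port)))))

;; Hypothetical sample input with a CRLF between two text runs.
(define sample "<foo>a\r\nb</foo>")

;; Writes the round-tripped document as a string literal, so any
;; surviving CR shows up as \r in the output.
(write (round-trip sample))
(newline)
```

Doing the restoration on the tree keeps the CRLF as plain string data, which sxml->xml writes out without touching line endings on a string port.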