IRC channel logs

2024-03-02.log

<rschulman>SignalWalker: Did you do a syrup parsing library?
<SignalWalker>yeah
<SignalWalker>it's also pretty messy :V
<rschulman>Ah, well shoot. There goes some time spent. :)
<rschulman>Would you like assistance in your efforts? I'd love to help.
<SignalWalker>sure!
<SignalWalker>honestly, there's a very good chance your parsing library's designed better than mine
<SignalWalker>i'll give you the link to the repo in a sec, gotta finish something first
<SignalWalker> https://github.com/SignalWalker/rexa
<SignalWalker>this is not my finest work; almost nothing is documented and there's a lot of discarded artifacts of design changes lying around
<SignalWalker>and the syrup parsing library is basically a weird bad version of serde because i wanted to see what implementing that was like :V
<rschulman>Mine is unlikely to be better. I've just been using winnow to parse syrup into the existing `preserves` rust library types.
<rschulman>And even then, I only have the fundamental types done so far.
<SignalWalker>i see, i see
<SignalWalker>i used nom to parse the fundamentals, and then i wrote a couple goofy derive macros to parse records, etc. into structs
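
A minimal sketch of what "parsing the fundamentals" with nom can look like, assuming nom 7.x. The Value enum and parser names below are hypothetical (not taken from either repo), and the encodings in the comments follow the syrup README: t/f booleans, netstring-style bytestrings, and bracketed lists.

    use nom::{
        branch::alt,
        bytes::complete::take,
        character::complete::{char, digit1},
        combinator::{map, map_res, value},
        multi::many0,
        sequence::{delimited, terminated},
        IResult,
    };

    #[derive(Debug, Clone, PartialEq)]
    enum Value {
        Bool(bool),
        Bytes(Vec<u8>),
        List(Vec<Value>),
    }

    // booleans are a single byte: `t` or `f`
    fn boolean(input: &[u8]) -> IResult<&[u8], Value> {
        alt((
            value(Value::Bool(true), char('t')),
            value(Value::Bool(false), char('f')),
        ))(input)
    }

    // bytestrings are length-prefixed, netstring-style: `3:cat`
    fn bytestring(input: &[u8]) -> IResult<&[u8], Value> {
        let (input, len) = map_res(terminated(digit1, char(':')), |digits: &[u8]| {
            std::str::from_utf8(digits)
                .map_err(|_| ())
                .and_then(|s| s.parse::<usize>().map_err(|_| ()))
        })(input)?;
        let (input, bytes) = take(len)(input)?;
        Ok((input, Value::Bytes(bytes.to_vec())))
    }

    // lists are zero or more values between `[` and `]`
    fn list(input: &[u8]) -> IResult<&[u8], Value> {
        map(delimited(char('['), many0(syrup_value), char(']')), Value::List)(input)
    }

    fn syrup_value(input: &[u8]) -> IResult<&[u8], Value> {
        alt((boolean, bytestring, list))(input)
    }

    fn main() {
        // e.g. `[t3:cat]` parses to a list of a boolean and a bytestring
        let (rest, parsed) = syrup_value(b"[t3:cat]").unwrap();
        assert!(rest.is_empty());
        assert_eq!(
            parsed,
            Value::List(vec![Value::Bool(true), Value::Bytes(b"cat".to_vec())])
        );
    }
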
<rschulman>"a weird bad version of serde" :D
<SignalWalker>i really need to learn more about designing parsers tbh
<Zarutian_iPad>syrup is a bit hacked together format btw
<rschulman>What I've thrown together so far:
<rschulman> https://github.com/rschulman/syrup-rs
<SignalWalker>nice
<SignalWalker>i've never heard of preserves; why use it?
<rschulman>Syrup is a serialization method for the preserves data model.
<rschulman>See the README for syrup for details: https://github.com/ocapn/syrup
<SignalWalker>wow i can't believe i somehow missed that
<rschulman>:)
<rschulman>What's a bit odd is that preserves already has two other serialization methods, which they've implemented serde on. I haven't thought through how to also do serde for it in a separate library.
<rschulman>I will say that it does feel like syrup (and preserves for that matter) were written by folks who don't work in type safe languages. :)
<cwebber>rschulman: what about json?
<rschulman>cwebber: Hmm, I think json suffers from the same. Basically, arrays/lists can have items of diverse types. :)
<rschulman>It's totally fine, I'm mostly just poking mild fun.
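
To make the "diverse types" point concrete, here is a hypothetical serde_json snippet: a mixed array only deserializes into the dynamic Value type, not into a homogeneous Vec<i64>.

    // hypothetical illustration: a json array with items of diverse types
    fn main() {
        let mixed = r#"[1, "two", true]"#;
        // a homogeneous static type can't hold it...
        assert!(serde_json::from_str::<Vec<i64>>(mixed).is_err());
        // ...so a static consumer falls back to serde_json's dynamic Value type
        let values: Vec<serde_json::Value> = serde_json::from_str(mixed).unwrap();
        assert_eq!(values.len(), 3);
    }
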
<cwebber>rschulman: also note that type safety means different things: dynamic languages used to be praised for being "strongly typed", more so than, say, C at the time, because an operation wouldn't *succeed* at corrupting memory, etc., because runtime checking would protect it. so I think it would be more accurate to say statically typed
<cwebber>and this same criticism has been blasted at json, and it's true, but json is also the #1 encoding format in the world
<cwebber>and this especially comes up once you start moving into protocol space. See also: open world vs closed world systems
<cwebber>there are no right answers, both have different challenges
<cwebber>closed world systems are cleaner but can't be extended without a centralized committee
<cwebber>open world systems are flexible but people may say some things you don't fully understand
<rschulman>Er yes, sorry, I did indeed mean statically typed.
<cwebber>syrup is a perfectly good approach for the same part of the design space that json is designed to fill, along this particular axis
<rschulman>And yeah, there are no right answers, like everything in life. :)
<cwebber>and json ends up being fine for many things
<cwebber>sometimes something like protocol buffers or cap'n proto is better, but the main use cases of those systems in practice tend to be a closed system
<cwebber>not a bad decision!
<cwebber>but not the design we're going for
<rschulman>I generally tend to fall toward the other end of the axis... I like that cap'n proto and protobufs have a document that lays out the expectations and from which code can be generated.
<rschulman>But this ain't my rodeo, so I'm happy to hack on a rust version of syrup!
<rschulman>:)
<cwebber>yes, and it's worth considering how the upgrade path works in both
<cwebber>what happens when new information needs to be added? how is it coordinated?
<rschulman>True.
<rschulman>I think both approaches have pros and cons in that situation.
<cwebber>and also exploratory networked programming is generally not prioritized
<cwebber>yes upgrading information is challenging,
<cwebber>and there are different pros and cons
<cwebber>but ocapn is primarily built for a sea of cooperative systems with no centralized coordination of what the possibility space is
<cwebber>that's the reason for prioritizing the type of design ocapn has
<cwebber>not an accident, a decision point in the design space of tradeoffs :)
<rschulman>Makes sense
<cwebber>anyway, don't mean to sound defensive, just mean to explain
<rschulman>Not at all
<cwebber>excited about your implementation :)
<cwebber>and there are ways to tame an open system
<rschulman>And like I said earlier, I was just poking fun.
<cwebber>for a more static approach
<cwebber>np, I just take addressing it seriously
<cwebber>because I have had to address it in earnest many times
<cwebber>especially over the course of activitypub
<rschulman>Haha, I'm sure.
<calmclam>cwebber: If I understand correctly and to restate, you are saying that the more constraints a data protocol has around the shape of the data it may contain, the less likely it is to tolerate multiple uncoordinated systems utilizing it? Even if the more structured protocol supports backwards/forwards compatibility?
<cwebber>calmclam: That's part of it. But regarding the last sentence, imagine you want to build an open world system that can be extended in a number of ways unforeseen by the initial authors without a central committee mediating those extensions. How does your forwards/backwards compatibility mechanism work?
<cwebber>for example, let's say we have the vanilla ActivityPub network with the base set of ActivityStreams messages, as defined.
<cwebber>now two different use cases want to attach information relevant to their needs to outgoing messages
<cwebber>both of these are developed in parallel in a decentralized system
<cwebber>what does the extension mechanism look like
<cwebber>describe in detail, because Cap'n Proto and Protocol Buffers, with their default upgrade mechanism, don't support this, and the question is, why? :)
<cwebber>whereas activitypub allows for defining new terms and attaching them very loosely to the object
<cwebber>which has been done repeatedly since the spec was initially launched
<cwebber>same with xmpp, rss/atom, etc
<cwebber>podcasting wouldn't have happened without this feature, for instance.
<cwebber>explain how you would like to accomplish this task, and what your "more structured" system looks like, and what the forwards/backwards compatibility paths are
<cwebber>this isn't meant to shoot it down
<calmclam>I see. So the problem is that while the more structured system may support backwards/forwards compatibility, it's not possible for _independent_ entities to move the protocol forward without coordination, as they wouldn't then be able to talk to each other.
<cwebber>yes.
<calmclam>And so you necessarily end up with centralized control over the protocol, whether you want it or not.
<cwebber>yep.
<cwebber>And we're explicitly designing a decentralized system... OCapN has nautical metaphors because it's a vast sea of decentralized coordinating entities :)
<cwebber>it's possible to "tie down" a more loose system
<cwebber>eg in json, many statically compiled systems parse the structured data read from json into specific statically typed records
<cwebber>and people use things like json-schema, etc
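
A minimal sketch of that "tie down" move, assuming serde and serde_json (the Pet struct is a hypothetical example type): the wire data arrives dynamically shaped, and the application immediately narrows it into a closed, statically typed record.

    use serde::Deserialize;

    #[derive(Debug, Deserialize)]
    struct Pet {
        species: String,
        name: String,
    }

    fn main() -> Result<(), serde_json::Error> {
        let wire = r#"{"species": "cat", "name": "Tabatha"}"#;
        // parse into the loose, dynamic representation first...
        let loose: serde_json::Value = serde_json::from_str(wire)?;
        // ...then tie it down to the static record
        let pet: Pet = serde_json::from_value(loose)?;
        println!("{pet:?}");
        Ok(())
    }
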
<calmclam>Ok, thanks
<cwebber>likewise you can also turn a very "highly structured" system into something much more loose, otherwise json libraries which can parse arbitrary json data couldn't exist at all for haskell etc
<cwebber>Gödel'ing it, effectively ;)
<cwebber>not everything written in a highly structured system stays that way. If you implement a faithful NES emulator, within the abstraction of the running NES layer, it'll still have the goofy memory model of the NES, even though it was written within a lower abstraction system that doesn't have that property
<cwebber>Gerry Sussman had a yellow and black police tape over his office door at one point when I visited that said
<cwebber>"ABSTRACTION BARRIER: DO NOT CROSS"
<cwebber>and I think of it often :)
<calmclam>lol
<calmclam>I recently watched a video ("Simple Made Easy", Rich Hickey) that made the argument that less structured data was simpler than more structured data, and I've been struggling a bit to understand that. I guess this is one example of that.
<rschulman>Hm, could you get away with having a more structured protocol at the ocapn level, while the content of the messages passed from object to object is where the less structured data comes into play?
<cwebber>rschulman: yes, that's possible
<cwebber>rschulman: however I don't think it's really that strong of a benefit
<cwebber>since the biggest area where people who aren't library authors run into this is in how usage of the system works
<cwebber>and the problem is the same there
<rschulman>yeah, shifting the layer at which the question occurs.
<cwebber>yeah
<cwebber>see also https://xkcd.com/277/ ;)
<rschulman>There may be something to be said for having the lower levels be stricter, but I'm not sure.
<cwebber>that's me, hopping on the roof of the car
<cwebber>rschulman: the main advantage is if it provides a compression mechanism, but one nice thing about Syrup is it's very very easy to implement
<cwebber>this is another reason json is popular even in very "structured" APIs that don't change much
<rschulman>"its dead simple" 4thewin
<cwebber>and the reason why x.509 / PEM / ASN.1 / BER / DER are widely loathed
<rschulman>I mean, I think those are probably loathed for reasons beyond just being statically typed... :)
<cwebber>however it's true that we could use something like protocol buffers or cap'n proto, which provide *nice* abstractions for structured data. though last I checked, I was surprised to find out that Syrup expressions are typically *more compact* than cap'n proto ones
<cwebber>eg integers are often expressed in fewer bytes
<cwebber>iirc
<cwebber>but that's a vague memory of a conversation with zenhack, bless his soul
<cwebber>at any rate.
<cwebber>we could still compact the protocol by reducing the amount of symbol-tagging
<cwebber>and still use syrup
<cwebber>but for now a more verbose one is easier for developers to debug while exploring the protocol
<cwebber>and such an optimization can be done when the protocol's abstract stuff congeals
<cwebber>at any rate, the main topic on the table isn't whether to use syrup or something more structured
<cwebber>it's whether or not to use syrup or json
<cwebber>which would be less efficient by a large margin
<cwebber>and would require a side-table for binary data
<cwebber>but would have the benefit that some people wouldn't need to implement syrup
<cwebber>and using some existing tools like jq
<cwebber>it also wouldn't have canonicalization without using a separate tool, so it might not *actually* benefit from using developers' existing tools, since they may need entirely new ones
<cwebber>but that's the main debate we've been having
<rschulman>Yeah, I think for this purpose syrup feels like a better solution than json.
<rschulman>And honestly, I think the only thing I was "complaining" about was that arrays can have values of different types in them anyway. :)
<cwebber>:)
<rschulman>Which is a problem with json just as much, for that matter.