IRC channel logs

<daviid>any dynamic ffi pro here ?

<peterbrett>daviid: ther's no difference

<peterbrett>daviid: The C standard technically requires that in a function prototype, an empty argument list is declared as "void"

<peterbrett>daviid: Most compilers are tolerant of just using () instead of (void), unless they're in "pedantic" mode

<peterbrett>So, if you see "int foo(void);", it means exactly the same thing as "int foo();"

<peterbrett>daviid: Does that help?

<daviid>peterbrett: ok thanks, so I guess using the dyn ffi, (list) is the right thing to do

<daviid>using it crashes guile anf geiser though :)

<peterbrett>That seems bad!

<daviid>let me paste something, just in case somwone wants to help, it really is s hort code

<peterbrett>I think you want (pointer-procedure '* (dynamic-func "gtk_clutter_window_new" %libclutter-gtk) '())

<daviid>ah, let me try that before to paste then

<daviid>oh but that should be the same thig

<daviid>'() and (list) [which is safer if i am not mistaken]

<daviid>that worked! I wonder why

<peterbrett>No idea, I just read the manual and typed it back to you :P

<daviid>wingo: let me ask the most improbable quizz ever :) is there a way to mix our guile-gnome[clutter] bindings and 'some stuff' using the dynamic ffi? I guess not but ...

<daviid>I guess it is impossible

<daviid>yep, not only would it require to convert pointers to [gobject] gtk and clutter class instances, but it actually won't even let 'try':

<daviid>(use-modules (gnome-2) (oop goops) (gnome gtk) (gnome clutter) (clutter-gtk))

<daviid> (define w (gtk-clutter-window-new)) -> GLib-GObject-WARNING **: cannot register existing type 'GtkWidget'

<daviid>followed by GLib-GObject-CRITICAL **: g_type_add_interface_static: assertion 'G_TYPE_IS_INSTANTIATABLE (instance_type)' failed

<daviid>I was dreaming a bit too much I guess :)

***siel_ is now known as siel

<mark_weaver>sneek: later tell wingo: I did "make clean && make" at tag v2.1.2 on GuixSD i686, and still see the compile failure. Here's the tail of the build log: http://paste.lisp.org/+6KPR

<sneek>Will do.

<civodul>Hello Guilers!

<artyom-poptsov>Hello civodul

<Phlogistique>yo

<amz`>héllo :)

***octo_ is now known as octophore

<civodul>wingo, mark_weaver: time to announce a potluck?

<wingo>a bit late, but sure :)

<sneek>wingo, you have 1 message.

<sneek>wingo, mark_weaver says: I did "make clean && make" at tag v2.1.2 on GuixSD i686, and still see the compile failure. Here's the tail of the build log: http://paste.lisp.org/+6KPR

<civodul>wingo: i think we're on time since traditionally it's announced a week before :-)

<civodul>could you send a message?

<wingo>sure, please remind me if i don't get to it today, big deadline at work tomorrow

<civodul>ok

<amz`>potluck?

<paroneayea>ooh, another guile potluck

<paroneayea>motivation to get more 8sync stuff done? :)

<davexunit>wingo: for your latest blog post, I read everything after the point at which you mention Alice's Restaurant in Arlo Guthrie's voice.

<wingo>davexunit: :)

<davexunit>has anyone ever considered the idea of using something like persistent hash tables to implement immutable record types?

<wingo>doesn't sound like a win to me, unless a record has many many fields

<davexunit>wingo: I find myself in the situation where I have a record with a bunch of fields, say a data type representing the player in a video game, and it's changing it's position 60 times per second.

<davexunit>its*

<davexunit>some fields change a lot, others don't.

<davexunit>usually, most don't.

<davexunit>so, knowing this, I create a type hierarchy

<davexunit>I make a data type for the mostly static stuff, so that creating a new <player> or whatever doesn't have to allocate so much.

<davexunit>as I was doing this, I thought, "could a specialized record type implementation for this sort of thing remove this tedium?"

<davexunit>maybe this idea is silly, I haven't done the relevant implementation and benchmarking.

<mark_weaver>davexunit: you can make a specialized record type with bytevectors

<mark_weaver>I'm unsure whether using types for this purposes is a good idea

<davexunit>mark_weaver: I guess what I had in mind was using a persistent data structure so that changing the value of 1 field was cheap

<mark_weaver>sounds like a cons cell

<davexunit>maybe it could be represented that way.

<mark_weaver>although if the type of that 1 field is not an immediate, you could do better

<davexunit>my constant fight with trying to use pure functions to describe simulations is GC pressure.

<mark_weaver>not represented as an immediate in SCM values, I mean

<davexunit>so finding ways to reduce allocation is valuable to me.

<mark_weaver>*nod*

<davexunit>I frequently have data types that have a "position" field that's advanced by the object's velocity 60 times per second.

<davexunit>it's often the only field that is changing so frequently, but I pay the penalty of allocating an entire new struct that shares nothing with the one it inherits from.

<davexunit>my trick thus far is to disassociate object position (or whatever is changing so often) from the other details of that object.

<davexunit>would be nice to formalize the trick somehow.

<mark_weaver>davexunit: I have an idea:

<mark_weaver>instead of storing the current position, record a (time, position) pair for some time in the past

<mark_weaver>which combined with the velocity, can be used to compute the current position at any future time.

<mark_weaver>well, an approximation of the time.

<davexunit>(in my case, the "time" is the number of game frames, not real time)

<davexunit>I have thought of doing something similar to this approach.

<davexunit>the (time, position) pair would be invalidated each time the velocity changes

<davexunit>so, that could potentially be a win for things that don't often change their velocity

<mark_weaver>more generally, coordinates could be represented by polynomials in time.

<mark_weaver>x0 + v*t can represent uniform velocity

<davexunit>it's difficult to not advance in discrete steps because things are rather dynamic

<davexunit>a player moving on screen can't be described as a function over time.

<mark_weaver>a + b*t + c*t^2 can represent uniform acceleration

<mark_weaver>and more generally still, you could represent a position as a function of time

<davexunit>that is also attractive, but has been difficult to make work.

<mark_weaver>but I'm not sure if this is good enough to be helpful

<davexunit>it's helpful.

<davexunit>but frequently, objects are dynamic in that they change how they move based on analyzing the world

<davexunit>a monster running after a player may have a constant speed, but it changes its direction based upon the player position.

<davexunit>for example.

<mark_weaver>if you think in terms of trajectories through space-time, then in any case where you know what the trajectory is *likely* to be for some period of time, you could avoid having to update the object until something changes that invalidates that predicated path.

<davexunit>right, so perhaps that "function over time" would change as necessary

<mark_weaver>the question is, how many of your objects spend a significant amount of time following a path that could be predicted likely in some time in the past

<mark_weaver>but the nice thing is, you don't have to be sure the prediction will be correct.

<mark_weaver>as long as the predicted trajectory holds, you can avoid updating the object

<mark_weaver>when something changes that invalidates the prediction, then you must update

<davexunit>yeah, so that could be a net-win in terms of allocation.

<davexunit>would probably need to memoize these procedures, in practice, since it's easy to foresee multiple call sites asking for the player's current position.

<davexunit>but that's neither here nor there.

<mark_weaver>so if you have a wheel that is turning, you can predict the circular paths of the points on that object.

<davexunit>right

<davexunit>I actually already do some of this in the form of "tweening", where I interpolate between two known states, an initial and a final, over a finite period of time.

<davexunit>so thanks mark_weaver, this has been a good brainstorm.

<mark_weaver>you're welcome! looking forward to seeing how it develops :)

<davexunit>we shall see.

<davexunit>this may come in handy in additional places, notably when rendering.

<davexunit>there's an effect known as "temporal aliasing", when you are rendering in-between two ticks of the simulation. if you render the state as it was at the last tick, the simulation appears to jump or jitter when the next frame comes around.

<davexunit>to fix it, you are supposed to do a linear interpolation between the current state and the previous state to smooth the animation.

<mark_weaver>one more thing: for solid objects, the positions of all the vertices on that object can be computed based on a relatively small number of variables, representing the position and orientation of the object.

<mark_weaver>for objects with many vertices, this could be a big win.

<mark_weaver>dunno, maybe this is already done and obvious.

<davexunit>yeah. right now I work with simple, axis-aligned bounding boxes.

<mark_weaver>*nod*

<mark_weaver>davexunit: how do you represent the position and orientation of solid objects in sly?

<davexunit>mark_weaver: for position I have a few vector types for 2, 3, and 4 dimensions.

<davexunit>for rotations, quaternions.

<mark_weaver>excellent!

<davexunit>and for doing render transformations, 4x4 matrices.

<davexunit>a vector can be used to make a translation matrix

<davexunit>quaternion for a rotation, etc.

<davexunit>or even just a simple theta value for simple rotations

<davexunit>I honestly haven't used quaternions much yet, but I know that they are necessary for interpolating rotations across multiple axes.

<davexunit>and they also avoid a phenomenon known as "gimbal lock" when trying to represent things like first-person camera viewpoints

<mark_weaver>quaternions are definitely the way to go

<mark_weaver>I guess I would consider replacing every quickly-changing-and-mostly-predictable field with a function of time that returns that field.

<davexunit>yeah

<mark_weaver>and here's an ugly hack you could do to reduce the number of allocations:

<mark_weaver>you could represent the object as a single function of time that returns multiple values: the commonly changing coordinates, and a record containing the relatively-infrequently-changed fields.

<mark_weaver>this could be used to reduce the number of allocations to just 1 per change of this function.

<mark_weaver>but I'm not sure this is worth it.

<davexunit>I could play around with that.

<mark_weaver>the cleaner way to support this kind of thing is to essentially allow *inlining* of records into other records, like nested structs in C.

<davexunit>yeah

<davexunit>but this is okay for now.

<davexunit>I anticipate that I will want to reduce computational overheard for computing the same f(t) multiple times, for which I can use a specialized form of memoization.

<davexunit>(allocation overhead, too, depending on the type of object being computed)

<davexunit>there's lots of work to be done here, now it's a matter of finding the time.

<mark_weaver>davexunit: I'm not sure the same f(t) will come up often enough to make memoization a win

<davexunit>it will depend.

<mark_weaver>except maybe for things that aren't moving at all

<mark_weaver>even for a single coordinate at uniform velocity, that's a 2 dimensional space of (x0,v)

<davexunit>on a single update, I often ask for the player position several times in different places.

<davexunit>did the player collide with a wall? did the player collide with an enemy projectile? did they collide with an enemy?

<mark_weaver>sure, but I assumed that the f(t) would be stored in the player record, so it wouldn't need to be created each time the player position is queried

<davexunit>it could be.

<mark_weaver>that would defeat the point

<davexunit>but I thought the idea was to only store 'f', not the position.

<mark_weaver>the whole point is that the f(t) would be stored in the objects, so that it wouldn't need to recomputed except when the object's trajectory changes in a non-predictable way.

<mark_weaver>oh, I see what you mean

<mark_weaver>sorry, I'm so used to mathematicians writing f(t) when they mean f, that I misinterpreted you

<davexunit>ohhhh

<davexunit>sorry

<davexunit>notation fail

<mark_weaver>no need to apologize, I should remember that I'm talking to schemer :)

<davexunit>what I was trying to say is that I will definitely ask for (f current-time) multiple times, in which case it would be good to cache it somehow.

<mark_weaver>well, the *really* hacky thing to do would be to not allocate f(t) at all, but instead return its individual fields as multiple return values from f

<mark_weaver>and I'm not recommending that, but it would be awesome if we had some way to do that without writing the source code that way.

<davexunit>as in, not allocate a new instance of a record type?

<mark_weaver>the more reasonable thing would be to embed a single-element cache inside f itself.

<davexunit>yeah, that was roughly my idea.

<davexunit>cache the most recently queried thing.

<davexunit>because that's the access pattern, typically.

<davexunit>successive calls with the same argument.

<mark_weaver>although I must say that introducing the cache is a very unfortunate complication

<davexunit>might not need it.

<davexunit>I won't worry about it until I see what this looks like when it's actually written.

<mark_weaver>it occurs to me that if f(t) needs to be sampled and thus allocated on every frame, then there's no point to doing any of this f(t) business at al

<mark_weaver>it's only a win if you can avoid the allocation

<mark_weaver>and so actually, you really want f(t) for every scalar

<mark_weaver>so that f(t) itself is a scalar, and can be unboxed

<mark_weaver>hmm

<davexunit>well, the overhead we'd be eliminating here is having to allocate an entire <big-struct>

<davexunit>and instead allocating only a small thing like a vector.

<mark_weaver>but if f(t) is a double, then it will need to be boxed because f(t) can't be inlined

<davexunit>if 'f' changes infrequently, then <big-struct> can be allocated infrequently

<mark_weaver>it only works well if f(t) is an immediate

<davexunit>allocating <big-struct> 30 times per second would already by 2x better.

<davexunit>instead of having to allocate the vector *and* <big-struct> 60 times per second.

<mark_weaver>is the cost of allocating <big-struct> significantly more than the cost of allocating <small-vector>? I think maybe not.

<davexunit>I'd have to benchmark it, but it would be one less allocation per update

<davexunit>and the benefits may be great for lots of objects.

<davexunit>sometimes I have over 1000 objects.

<mark_weaver>but in any case, if you know you'll need to allocate <small-vector> from f(t) in most frames, then you might as well make the object representation be <small-vector> containing f(t), with an extra field pointing to the <big-struct> of infrequently-updated field.s

<mark_weaver>and that avoids a lot of complication, and caching.

<davexunit>okay, so that puts me roughly back where I am now.

<mark_weaver>I think the f(t) trick is only a win if f(t) itself is an immediate

<mark_weaver>but: maybe f(t) *can* be an immediate in many useful cases.

<mark_weaver>if f(t) is a position on the screen, then 62-bit integers should be more than sufficient.

<mark_weaver>those integers could be converted to/from floats if desired, in such a way that the conversions could be inlined into the code that needs to do math of these positions

<mark_weaver>such that all of the floats are unboxed

<davexunit>here's an example: I created a <bullet> data type representing a projectile. it contains only 2 fields: the axis-aligned bounding box (the thing that changes each frame), and the "properties".

<davexunit>the properties are an instance of a second type called <bullet-type> that has the static features of the projectile.

<mark_weaver>davexunit: they would be represented as fixnums only when being passed to/from a function that must be represented as a real closure.

<davexunit>mark_weaver: hmm, maybe, but I would definitely need some sugar to cover this all up. :)

<wingo>mark_weaver: inlining of records into other records is like locality in a way

<mark_weaver>yeah, that's the question: if there a way to do it in such a way that the higher-level code is elegant

<mark_weaver>wingo: right

<wingo>so if your gc is good you can avoid that on the compiler level

<mark_weaver>wingo: the question is, can we perform some optimization that decouples records from heap objects, analogous to how procedures have been decoupled from closures in guile

<mark_weaver>to allow records to be inlined into other records without affecting the user code, roughly like procedures can be inlined into other procedures

<davexunit>at the end of the day, I want a way to make immutable data cheaper, and I (naively) thought that a persistent data structure could help by sharing unchanged fields.

<wingo>mark_weaver: i think on a fundamental level, that optimization is called bump-pointer allocation, and relocation that preserves allocation order

<mark_weaver>one could perhaps implement a new record system on top of bytevectors that allows nesting similar to nested structs in C

<wingo>of course there are things the compiler can do but it's really squirrely

<davexunit>another, somewhat related thought of mine, was how I could use something like object pooling for a large particle simulation.

<vanila>I have a question about coding a system with a couple components in it, say one component is a 'stack' type

<davexunit>something where I could still create a functional API on top, but at the low-level was working with a fixed-size, pre-allocated object pool.

<mark_weaver>BDW-GC would need to run in a mode that supports interior pointers

<vanila>i can implement it in 3 different ways, but then how could you choose one when you load up the code?

<davexunit>vanila: I don't quite understand the question.

<mark_weaver>I don't know, I'm thinking out loud. it could be that's there's no way to make this work reasonably, or it's beyond our abilities

<vanila>sorry its a bit hard to explain

<davexunit>mark_weaver: thanks for the brainstorming session.

<mark_weaver>davexunit: sure, it's been fun to think about :)

<wingo>davexunit: i think your problem is that you have lots of allocations that die young. you want a generational gc :)

<mark_weaver>or even just a GC where allocation is much cheaper, like a moving GC

<davexunit>wingo: yeah, I suppose so.

<davexunit>for my own purposes, though, I would be happy to hack together something like an object pool, but I haven't found a way to make it not completely awkward.

<wingo>mark_weaver: not sure tbh. have you measured bdw-gc's throughput on repeated allocations of a fixed size? i found it to be pretty fast

<wingo>problem is less the throughput than the usual latency of a gc

<wingo>clearly we need to introduce "placement new" into scheme, lol :)

<mark_weaver>for a game, I think that I'd be sufficiently concerned with worst-case latency that generational GC wouldn't solve the problem to my satisfaction.

<wingo>(define (alloc-loop n) (let lp ((x #f) (n n)) (unless (zero? n) (lp (cons #f #f) (1- n)))))

<wingo>> ,time (alloc-loop #e1e8)

<wingo>;; 0.531861s real time, 0.536956s run time. 0.006589s spent in GC.

<wingo>not bad, allocating and freeing some 2e8 pairs a second

<wingo>er

<wingo>heh

<wingo>i fell into the microbenchmark trap :)

<wingo>friggin compiler optimized out the cons

<davexunit>lol

<mark_weaver>:)

<wingo>(define (alloc-loop n) (let lp ((x #f) (n n)) (if (zero? n) x (lp (cons #f #f) (1- n)))))

<wingo>> ,time (alloc-loop #e1e8)

<wingo>$1 = (#f . #f)

<wingo>;; 1.931374s real time, 3.738833s run time. 2.477520s spent in GC.

<wingo>run time greater than real time because parallel marking

<wingo>so, 500K pairs/s.

<wingo>this is on my laptop, i think 2 cores and 2 hyperthreads/core

<davexunit>what's the overhead for delimited continuations like?

<wingo>er

<davexunit>I use them pretty extensively, for scripting.

<wingo>i can't math

<wingo>#e5e7 pairs/s :)

<wingo>800 mb/s

<davexunit>I need to get my code organized enough to do some more serious benchmarking.

<wingo>davexunit: i don't know tbh. depends on the size of stack that they capture but there are some things that could be lighter weight

<wingo>there is an essential part of the cost that is o(n)

<davexunit>okay.

<wingo>but guile probably isn't optimizing as much as it could

<wingo>probably your n is not big, dunno

<wingo>like when you suspend your continuation there's probably not a whole lot of stuff on the stack

<wingo>dunno, i am speculating

<mark_weaver>wingo: the thing is, davexunit is looking at uglifying his code to reduce the number of allocations by some small constant factor, and I'm pretty sure we could reduce the cost of allocation by a much larger factor if allocations were simply bumping an allocation pointer register in the VM, say.

<wingo>in that case there are some fixed costs that dominate i think, that we could optimize beter

<wingo>mark_weaver: have you seen the thread-local free-list stuff in the vm?

<wingo>i think the freelists that the vm has are pretty cheap for allocation; dunno tho

<mark_weaver>I confess I didn't look carefully.

<davexunit>wingo: thanks. I'm trying to come up with a functional way of representing scripted objects in a game. I could either use continuations or a monadic combinator approach. ;)

<wingo>davexunit: benchmark, if it matters :) otherwise pick the poison that tastes better :0

<wingo>:) rather

<davexunit>:)

<mark_weaver>okay, I guess I should do my homework before continuing this discussion.

<davexunit>I have homework to do as well.

<wingo>mark_weaver: yeah i dunno, it's pretty weird stuff. i suspect the issue is more pause times than throughput but i don't have the science to back that, either :)

<davexunit>pauses can be an issue, but they are less of an issue if they happen infrequently

<davexunit>if the GC is constantly running it's painful.

<wingo>davexunit: have you measured your allocation rates?

<davexunit>wingo: not properly.

<davexunit>does guile provide an easy way to do this?

<wingo>there's ,stat at the repl

<davexunit>I'd have to come up with something clever for that, at the REPL

<wingo>built on (gc-stats), in the manual i think

<davexunit>since Sly runs a game loop

<davexunit>okay

<davexunit>so I can wrap my game loop in a form that will gather stats

<wingo>or periodically print out (gc-stats)

<wingo>it's an alist

<davexunit>ah, yeah, that sounds fine.

<davexunit>I have a lot to do.

<davexunit>I haven't had much time to tinker with silly games.

<wingo>heh yeah

<davexunit>the last big thing I did was make sprite rendering cheap with a batch rendering implementation, but that was mostly just an OpenGL hack.

<davexunit>but now I can render 1000+ sprites without trouble

<davexunit>the next thing should probably be to rewrite the matrix multiplication routine in pure Guile and see what the float unboxing buys me.

<wingo>ACTION z

IRC channel logs

2016-02-08.log