IRC channel logs

2014-11-03.log


<davexunit>hey guilers. does anyone have any thoughts about how to do fast linear algebra with guile? In particular, I'm interested in 4x4 matrix multiplication for 3D graphics.
<davexunit>I wrote an implementation using f32 arrays, but it's extremely slow.
<davexunit>I'm thinking I may have to write the procedures in C to avoid boxing/unboxing floats from guile arrays.
<ijp>hmm, for 4x4 there is no advantage to a fancy algorithm
<ijp>yeah getting rid of the boxing is probably the easiest way to go faster, but mark_weaver is probably the man to ask
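As a point of reference for the "write it in C" idea above, here is a minimal sketch of a 4x4 multiply over raw, unboxed floats. The name `mat4_mul` and the column-major layout are assumptions for illustration, not anything taken from davexunit's code; the layout is chosen because that is how OpenGL conventionally expects matrices.

```c
#include <assert.h>

/* Hypothetical helper: out = a * b for 4x4 matrices stored as 16
   contiguous floats in column-major order (row r, column c lives at
   index c*4 + r). */
void mat4_mul(float *out, const float *a, const float *b)
{
    /* out is written while a and b are still being read, so it must
       not alias either input. */
    assert(out != a && out != b);

    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++)
                sum += a[k * 4 + r] * b[c * 4 + k];
            out[c * 4 + r] = sum;
        }
    }
}
```

As ijp says, at this size the plain triple loop is all there is to it; any gain would come from the floats never being boxed, not from the algorithm.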
<ijp>davexunit: you could try using wingo's leaf compiler
<davexunit>ijp: yeah, the problem there is that it's x86_64 only and it's not really something wingo is interested in supporting. it was just a neat hack.
<adhoc>davexunit: can you do blocks of 4x4 m-mults to pass out as a larger job to a C lib
<adhoc>the biggest issue is the packaging time versus individual array mult time, IME
<davexunit>maybe that's what I need to do.
<adhoc>so your back end library can compute some in parallel if it can
<davexunit>the other thing is that maybe I should give up on immutable matrices
<davexunit>and re-use the same memory
<adhoc>are you doing time series stuff ?
<davexunit>I'm trying to render a scene graph with reasonable performance
<davexunit>and currently matrix multiplication is by a huge margin the biggest bottleneck
<davexunit>I could speed things up a little bit with memoization, but I end up having to do a ton of matrix mults anyway because every renderable object needs to get multiplied by the camera matrix, which is often moving constantly.
<adhoc>right
<adhoc>there is a point at which you get it all working and then optimize bits of it
<davexunit>I have it mostly working.
<davexunit>the only thing not working in my scene graph right now is allowing cameras to be nodes in the graph.
<adhoc>i wonder if you do initial setup and pass things into a C lib?
<davexunit>I'm trying to stick to pure guile.
<adhoc>fair enough
<davexunit>and if I do use a c lib, I only want to use it for very low level things.
<adhoc>that reminds me ... damn this backup is *still* going =/
<davexunit>if there was a good library that I could wrap quickly to do this I would give it a try
<adhoc>yeah, all that kind of stuff is in fortran ;)
<davexunit>another option is to use an untyped array instead of a f32 array, but then I still have to convert that to an f32 array in order to pass a pointer to it in an OpenGL call
<davexunit>perhaps I could get away with using a single f32 array instance as the buffer, since the OpenGL context only works in a single thread anyway.
<adhoc>perhaps larger arrays into which you pass an offset ?
<adhoc>larger single object array
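A rough sketch of the batching idea adhoc describes, applied to the camera case davexunit mentions: all model matrices sit back to back in one large float array, and a single C call multiplies each of them by the camera matrix. The function name, argument order, and column-major layout are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch helper: for each i in [0, n),
   out[i] = camera * models[i].
   models and out each hold n 4x4 column-major matrices packed back to
   back, so matrix i starts at float offset i * 16. */
void mat4_mul_camera_batch(float *out, const float *camera,
                           const float *models, size_t n)
{
    /* The output buffer must be distinct from both inputs. */
    assert(out != camera && out != models);

    for (size_t i = 0; i < n; i++) {
        const float *m = models + i * 16;
        float *o = out + i * 16;
        for (int c = 0; c < 4; c++) {
            for (int r = 0; r < 4; r++) {
                float sum = 0.0f;
                for (int k = 0; k < 4; k++)
                    sum += camera[k * 4 + r] * m[c * 4 + k];
                o[c * 4 + r] = sum;
            }
        }
    }
}
```

With this shape the cost of crossing into C is paid once per frame rather than once per object, which is the packaging-versus-multiply trade-off adhoc raises.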
<ijp>davexunit: how were you taught matrix multiplication? were you just given the definition and expected to memorise it, or did you get the function from vectors to vectors explanation?
<ijp>composition of functions*
<adhoc>ijp: i was wondering about vectors
<davexunit>ijp: I honestly can't remember.
<davexunit>but basically, I can *never* remember how to do matrix multiplication. I have to look it up every time.
<adhoc>davexunit: you don't use it enough ;)
<ijp>I remember it these days, but only because I got the latter explanation (eventually)
<davexunit>well I never have to do it manually
<davexunit>the computer does it, so I'm free to forget the process
<ijp>davexunit: but how do you know when to use it, if you don't understand it?
<ijp>I suppose game programming has its own pedagogy here
<davexunit>admittedly I'm not very good at dealing with matrices, but I know enough to use them to translate, rotate, scale vectors
<adhoc>ijp: game programming is learning which corners you can cut ;)
<ijp>davexunit: okay then, so you know that multiplication is combining transformations
<davexunit>I should learn it better. I *did* understand it better when I was taking linear algebra in college.
<adhoc>we used to use Pi = 3
<davexunit>yeah, I know how to compose transformations to get the desired effect.
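To make the "multiplication is combining transformations" point concrete, here is a small worked sketch. It applies a scale and then a translation to a vector one step at a time, and compares that with applying the single product matrix. All names are hypothetical, matrices are assumed to be column-major, and the product T*S is written out by hand rather than computed.

```c
#include <assert.h>

/* Hypothetical helper: out = m * v for a 4x4 column-major matrix m
   and a 4-component vector v. */
void mat4_apply(float *out, const float *m, const float *v)
{
    assert(out != v);  /* out is written while v is still being read */
    for (int r = 0; r < 4; r++) {
        float sum = 0.0f;
        for (int c = 0; c < 4; c++)
            sum += m[c * 4 + r] * v[c];
        out[r] = sum;
    }
}

/* Returns 1 if applying S and then T gives the same result as applying
   the single combined matrix T*S, and 0 otherwise. */
int composition_holds(void)
{
    /* S scales by 2 on every axis; T translates by (1, 2, 3). */
    const float S[16]  = {2,0,0,0, 0,2,0,0, 0,0,2,0, 0,0,0,1};
    const float T[16]  = {1,0,0,0, 0,1,0,0, 0,0,1,0, 1,2,3,1};
    /* The product T*S, worked out by hand: scale first, then translate. */
    const float TS[16] = {2,0,0,0, 0,2,0,0, 0,0,2,0, 1,2,3,1};
    const float v[4]   = {1, 1, 1, 1};

    float scaled[4], stepwise[4], combined[4];
    mat4_apply(scaled, S, v);         /* S v     */
    mat4_apply(stepwise, T, scaled);  /* T (S v) */
    mat4_apply(combined, TS, v);      /* (T S) v */

    for (int i = 0; i < 4; i++)
        if (stepwise[i] != combined[i])
            return 0;
    return 1;
}
```

The order matters: scaling after translating would double the offset as well, so S*T is a different matrix from T*S.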
<adhoc>it's much much faster ;)
<davexunit>or some rational number approximation that guile doesn't have to box :)
<adhoc>right
<adhoc>sounds like the boxing is half your battle
<adhoc>davexunit: you know how i was stuck on the shri 43 thing because i'm running a guild too old
<ijp>adhoc: I remember one of my maths teachers (about 2nd year of middle school as americans would reckon it) gave us a story about decorating a cake with 22 orange segments around the circumference, and 7 across the diameter :)
<adhoc>i wonder if there is a better way to get a vector library
<ijp>vectors as in vanilla arrays, or vectors as in cross products and whatnot
<davexunit>adhoc: for some background, I periodically bring up the difficulties of boxed floats here. unfortunately, it's a tough problem to solve. mark/wingo have come up with solutions, but they introduce other issues.
<adhoc>ijp: i worked for a games company a while back, we had some interesting ways to deal with speed in our core libraries
<ijp>oh right, by shri you meant srfi
<adhoc>ijp: yes
<ijp>even the author of srfi 43, our good friend Riastradh, doesn't really recommend it these days
*adhoc could have done with more sleep
*davexunit realizes that he is using srfi-43 in one module
<adhoc>ijp: whats the recommended alternative ?
<ijp>adhoc: to use a loop macro instead
<davexunit>I replaced f32 arrays with untyped arrays, and copied them to an f32 array before sending it to opengl. no noticeable improvement. bummer. :(
<davexunit>at least it was easy to test.
<davexunit>I thought maybe reducing the floating point math would have a greater impact.
<davexunit>would object pooling even make sense given the presence of the gc?
<ijp>sure
<davexunit>I was thinking of pre-allocating a large amount of matrices
<davexunit>and using a guardian to add them back to the pool when unused
<adhoc>davexunit: i was wondering if the gc clean up is hurting you?
<adhoc>how do you find out how much time the gc is taking ?
<ijp>adhoc: gcprof
<adhoc>ok
<davexunit>yeah my current stress test is GCing many times per second
<davexunit>each run taking ~13ms
<davexunit>that doesn't add up to the 12FPS I'm getting, though
<davexunit>about 3 GCs per second from what I can see
<davexunit>hmm, when printing GC stats, does anyone know if the complete collection time includes the marking time?
<davexunit>because I see this: World-stopped marking took 12 msecs (9 in average)
<davexunit>then: Complete collection took 14 msecs
<davexunit>I'm out of ideas.
<mark_weaver>davexunit: it may be better to wait until we have native code generation with unboxed floats in core guile, although I can understand if you are growing impatient.
<davexunit>mark_weaver: you are probably right.
<davexunit>I did some more digging, and it seems that floats aren't really the issue in this case. it's the amount of memory allocation in general.
<davexunit>I adjusted my test such that no array had floats in it, and performance was just as bad.
<davexunit>so, I need to reduce memory allocation, but it's difficult when each matrix is considered immutable.
<davexunit>I could make my procedures work via mutation instead of creating new objects, but that defeats my goal of giving the user a functional API.
<davexunit>weird, my computer crashed. :(
<mark_weaver>memory allocation is much more expensive than I'd like with bdwgc.
<adhoc>davexunit: games and animation is a balance between speed and accuracy
<adhoc>oh, and safety
<adhoc>the sheer number of crash bugs we had logged against our PS2 games in the early 2000's was mental
<adhoc>divide by zeros and off by one errors were enormous, even in C++, where there is supposed to be some checking in the compiler and support libraries
<ijp>adhoc: are you a brit?
<adhoc>ijp: no
<ijp>commonwealth?
*adhoc worked in london for a few years
<adhoc>ijp: effectively
<ijp>so that explains the use of "mental" there
<adhoc>too much BBC tv in the 80's & 90's ?
<ijp>I never hear US people use it
<adhoc>ijp: no, because they speak another language
<davexunit>maybe I should look into pre-allocating matrices and re-using them...
<mark_weaver>davexunit: I don't see how you'll know when you can reuse a matrix without doing something like GC yourself.
<mark_weaver>if you're going to go that route, it's probably better to just give up immutability
<davexunit>yeah
<davexunit>makes sense
<davexunit>well, perhaps I can devise a way to give the user a functional API, but use mutable matrices behind the scenes.
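One way to picture "functional API, mutable matrices behind the scenes" is sketched below in C. This is only an illustration of the shape, with hypothetical names: a mutating core that writes into caller-provided storage, and a thin wrapper that takes and returns values and never modifies its inputs. In C the returned struct needs no heap allocation; whether something equivalent is achievable in Guile without allocating is exactly the open question in this conversation.

```c
#include <assert.h>

/* Hypothetical value type: a 4x4 column-major matrix. */
typedef struct { float m[16]; } mat4;

/* Mutating core: overwrites *dst with a * b. No allocation happens. */
void mat4_mul_into(mat4 *dst, const mat4 *a, const mat4 *b)
{
    assert(dst != a && dst != b);  /* dst must not alias an input */
    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++)
                sum += a->m[k * 4 + r] * b->m[c * 4 + k];
            dst->m[c * 4 + r] = sum;
        }
    }
}

/* Functional face: takes values, returns a new value, and leaves the
   caller's matrices untouched. The mutation is confined to the local
   result. */
mat4 mat4_mul(mat4 a, mat4 b)
{
    mat4 result;
    mat4_mul_into(&result, &a, &b);
    return result;
}
```

Callers that care about speed can use `mat4_mul_into` with a buffer they own, while everyone else sees only the value-returning `mat4_mul`.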
<mark_weaver>I'm reluctant to say it, but you might want to try porting your code over to racket in the meantime, and see if you can make it work well there.
<mark_weaver>I'd like to add an option for a precise moving GC to Guile at some point, but in truth it may be a while until we get there.
<davexunit>hmmmmm. okay. advice heeded.
<mark_weaver>right now, we have to lock a mutex for every block allocated, which is really not good.
<mark_weaver>I'd be curious to see if you can make a functional game library work with a better GC.
<davexunit>someone demonstrated just that at this year's racketcon
<ijp>davexunit: are you working on a game using sly at the moment, or just sly itself?
<mark_weaver>I think Guile will get there at some point, but it may be a while
<davexunit>ijp: I have a game in mind, but I'm working on small demos at the moment.
<nalaginrut>morning guilers~
***sethalve_ is now known as sethalves
<civodul>Hello Guilers!
<ArneBab>moin nalaginrut, civodul
<nalaginrut>heya
<wingo>morning
<dsmith-work>Morning Greetings, Guilers
***dsmith-work is now known as dsmith
***dsmith is now known as dsmith-work
<dsmith-work> [10:30]
<dsmith-work>ERC> [10:30]
<dsmith-work>ERC> /msg alis help list
<dsmith-work>Heya Guilers. Anyone up to an offtopic make question? Is there a way to inform make that an executable depends on a linker script (in the current directory)? WITHOUT it being passed to the linker as a file?
<dsmith-work>The linker script needs to be passed with the -T option.
<dsmith-work>Hmm. Maybe I can "edit" the dependencies....
<dsmith-work>Hah! works,
<dsmith-work>
<dsmith-work>yey. The wonders of IRC
<taylanub>good old rubber-duck debugging :)
<dsmith-work>I spent about 2-3 hours on Friday reading ld and make docs trying to come up with a solution so that if I edit the .gld file, the exec will get re-linked.
<dsmith-work>And mere seconds after I pose the question here, something comes to me..