Arc Forumnew | comments | leaders | submitlogin
A faster rainbow
9 points by conanite 2985 days ago | 6 comments
I switched rainbow from a continuation-passing-style implementation to a stack-oriented VM. This alone sped things up about 2x. Most of this effort was due to frustration with welder, an arc IDE, being excruciatingly slow. It used to take anything between 4 and 5 seconds to refresh (re-parse) arc.arc - that number is now down to under 400ms.

The stack-oriented VM runs faster than the CPS-based approach, partly because it allocates fewer objects per function call, but also because it does a lot more work compiling code in advance. The entire history of rainbow, in fact, is about getting more work done in the compilation phase. Rainbow is the first time I've implemented a language, and I think I'm beginning to understand the basics now.

The stack machine affords more opportunities for optimisation, as it's easier to spot inefficiencies when your code is pushing and popping the same values over and over again. Here are some examples:

The expansion of

  (or a b)
ends with something like

  (if gs24 gs24 nil)
which is exactly the same as

So when rainbow generates stack-instructions for conditionals, it checks for this case, and emits the raw symbol where possible. Of course, we could also modify 'or not to do this, but I think it should be a goal of an interpreter to optimise away this kind of inefficiency if it permits a more concise and elegant expression of an idea.

Another example:

  (do a b c)
expands to

  ((fn () a b c))
, an invocation of a closure with no arguments, no parameters, that is not assigned to any variable. Simply inlining this (instead of creating a closure then invoking it as rainbow did formerly), knocked 20ms off my then-380ms test case. It was a big win.

Another example: the invocation handler by default requested N parameters off the top of the argument stack, where N varied depending on the expression under evaluation. It turns out that the java equivalent of

  (cons (pop args) (cons (pop args) (cons (pop args) nil)))
for a 3-arg invocation is way faster than requesting the list of the last N args each time, mostly, I suspect, because the conditional loop of that request is eliminated.

It bothered me too that the VM would push args onto the stack only to have them popped off again directly, when these args were either literals or symbol lookups, ie their values were immediately accessible. Passing these via the stack seemed like unnecessary overhead, but on the other hand, there is an explosive combination of possible invocation types, as the java code for retrieving values is different for bound symbols, free variables, and literals.

  (self)          ; invoke a bound variable with zero args
  (map car xs)    ; (free-var free-var bound-var)
  (is x #\space)  ; (free-var bound-var literal)
  ... several hundred more
I hand-wrote handlers for what I presumed were the most common cases, and the results were good. So yesterday, I dumped the hand-coded versions, and hacked out an arc script to generate handlers for all combinations of invocations involving free variables, bound variables, and literals, for invocations up to 3 args. That makes about 170 different cases, and combined these optimisations knocked another 20ms off my test case. Note that this complicates the build somewhat: you need to build rainbow in order to run the arc script to generate the optimisation code. So to get the optimisations, run the build twice. I'll describe this more in the readme file.

So, with everything combined, my test-case

  (bm 100 (index-source a)) ; a is the text content of arc.arc
returned 303ms on the most recent run, compared to 313ms for arc3 on mzscheme. (It looks like rainbow is a little faster in this case, but I haven't upgraded to arc3.1 yet so I can't say for sure)

The other good news is that there are many more unexploited optimisations waiting. The expansion of 'aif, for example, could be treated with a special-case conditional instruction that doesn't pop its argument. There could be many more invocation handlers for cases where some arguments are the results of other invocations.

Useful to know: the two most significant performance gains were the following

1. run "java -server" instead of default client-VM java (at least on macosx 10.4 or 10.5, java5 or java6). They could have renamed this switch - "java -fast"

2. Quit firefox. Seriously. This speeds things up about 10% for me.

Just to recap, here's what rainbow does:

* a fully compatible arc3 interpreter that runs arc code unmodified

* still supports tail-call optimisation and first-class continuations

* for certain cases, appears to run at least as fast as arc3

* offers an easy interface to java libraries

* the whole stack-oriented VM was committed a day ago, so it's still very fresh, it hasn't had a thorough testing. All 393 automated tests pass, and stuff generally appears to work, but I haven't tried running news.arc on it yet.

And thanks to a suggestion from palsecam, the rainbow repl quits on ctrl-D (on macosx). Some people sent patches, thank you all!

2 points by palsecam 2985 days ago | link

Conanite, now this is really awesome!

Rainbow is a great project, and like for Arc, the performance was/is an important issue (at least, not making people fall asleep waiting for the arc prompt to show is a minimum).

Lots, lots of people here complained/complain about this point, and it's good to see that at least you don't ignore them. Performance matters in the real world, Google made his empire based on this point.

Apparently, the optimization stuff here is real, and this is great. I mean, no offense, but gaining ~ 10% on 1 unique function ('in), and w/ actually making the operation incorrect (for the nil case), is what I'd called premature useless optimization. But this is real, useful optimization. Super!

Thanks a lot!

palsecam, going to `git clone' rainbow ;-)


2 points by conanite 2985 days ago | link

fall asleep waiting for the arc prompt

this is a disadvantageous side-effect of using "java -server" - startup time is significantly longer. And I need to not run tests automatically on startup, that delays the prompt, too.

premature useless optimization

hey, I like my 'in :))


1 point by palsecam 2985 days ago | link

> fall asleep waiting for the arc prompt

This was more about the official Arc on MzScheme than Rainbow here. Rainbow is a special case.

> not run tests automatically on startup

Yes, I remember I disabled them when I was using Rainbow for the last time.

Oh, and just FTR, there were a failed test, which I think was about a little problem in internationalization handling. There was a message about a decimal number not being OK, because my locale is fr_FR and the correct decimal separator in France is "," and not ".".

(but this is really not a big deal. Plus, i18n is terribly complex to get right. Plus, most of us (people not having English as their mother tongue) are so used to get fucked we don't even bother with this kind of things anymore.)

> hey, I like my 'in :))

Hihi, I see you've submitted a new "correct" version on the concerned thread, this is really cool.

Anyway, don't take me too seriously. I just expose my views quickly & in a direct manner, and I don't think they're (at least always ;-D) correct. I'm glad you defend yours, and I respect you a lot for that. But sorry, the "useless" here, at least, was too much.

And on the end I suppose we think the same here: (performance) improvements are a good thing to do. I was actually glad to see the "A faster 'in" thread, I prefer that 100x than seeing a thread which would argue Arc is perfect in its current form and people are morons not to think so.


2 points by conanite 2984 days ago | link

Ok, the github repository is alright now, here's the url:

I found another couple of speedups so my test case is down to about 290ms now.


1 point by conanite 2985 days ago | link

oops, there's an error with my github commit, please check back later!


2 points by jazzdev 2972 days ago | link

Yeah, this is awesome. Great work!