Arc Forumnew | comments | leaders | submit | eds's commentslogin
2 points by eds 6634 days ago | link | parent | on: CL-Arc: Arc Compiler in Common Lisp

I was thinking about working on a meta-circular native code compiler for Arc, but that is way beyond my abilities at the moment, so I came up with something I think I can actually accomplish.

At the very least, if I write this CL-Arc compiler, I'll be able to write games in Arc using existing CL libraries for SDL.

-----

4 points by almkglor 6634 days ago | link

Why not? Every higher-level language is just a macro on the assembly language of a computer. Model your native code as a list like (__asm (mov 42 ax) (ret)). Consing is just a call to a predefined cons: (__asm (push d) (push a) (call cons) (ret)) - or by applying the lessons of Lambda The Ultimate X (__asm (push d) (push a) (jmp cons)). ^^

Okay, okay, I was actually planning to do something like that a long time ago, haha. But I haven't (yet) found any holes in the basic concept of modelling every function call (f x) as a macroexpansion on (__funcall f x), which expands to an (__asm) form with the function call. Then for functions themselves (fn (x) (f1 x) (f x)), transform the last function call to a tail call: (__asm_let x (esp 4) (__funcall f1 x) (__tail_funcall f x)). Stuff like that ^^.

-----

1 point by eds 6633 days ago | link

If you want to start writing an Arc native code compiler (in Arc), it just brings up a lot of difficult issues that I'm not sure how to deal with. A reader, GC, continuations, tail recursion, etc. all have to be implemented in Arc. You stop getting those for free once you remove the Scheme runtime from the picture.

So quite frankly, I have a lot to learn before I'll be able to take on a project like that.

-----

1 point by almkglor 6633 days ago | link

Encapsulation my friend, encapsulation. Just make a general sketch and leave the details a bit. Then write the details. Besides, you've already done it! Just s/CL/Arc/ your posted article, then s/compile to Arc/compile to assembly/

Sure GC is nontrivial, but Boehm's GC is not bad at all. And if you really need continuations/tail recursion than make everything continuation passing style (you'll probably need to anyway). And I'm sure raymyers' treeparse can help in the reader department.

(IMO the difficulty here in itself is probably the assembly code you'll emit for a given piece of code, not reader/GC/conts/tailrec)

Remember, CL is by itself not so similar to Scheme that you can directly use its reader, as well as its execution model, in your final product. You'll write your own Arc reader in CL anyway (in CL 'arc == 'ARC, in arc 'arc != 'ARC), so you might as well (tada!) write it in Arc and compile it down to assembly. You'll need conts and no, you can't trust CL enough to handle tailrecs.

(Hmm. Maybe I shouldn't be advising you, maybe I should be doing this myself to steal your thunder ^^)

-----

3 points by stefano 6632 days ago | link

CL reader is highly extendable and you can tell it to be case sensitive: (setf (readtable-case readtable) 'sensitive), if I remember correctly. It's possible, I think, to use it to read Arc code.

-----

1 point by eds 6632 days ago | link

Is this portable?

But assuming it works in some form or other, it should remove most of the need to write a custom reader. Though there is still the issue of (for example) complex number syntax, etc.

-----

3 points by stefano 6632 days ago | link

It's in the standard (CLTL2). You can fix everything, because you can tell the reader to use your own functions on particulars characters, as an example this is a piece of code that lets you use Arc [... _ ...] syntax in Common Lisp:

(defun read-square (stream c) (declare (ignore c)) (let ((body (read-delimited-list #\] stream t))) `#'(lambda (_) ,body)))

(set-macro-character #\] #'(lambda (x y) (declare (ignore x y)) (values))) (set-macro-character #\[ #'read-square)

-----

1 point by eds 6633 days ago | link

So would you use the existing (C/C++) implementation of Boehm GC? If so then doesn't that make this not completely implemented in Arc? If not then that's one more piece to write (although I guess it isn't too difficult to translate code that's already been written in another language).

Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters.

You have a good point about the reader, I should probably add that to my proposal. And writing it in Arc would be an interesting exercise.

And can't I trust CL about tail recursion? Most decent implementations do tail recursion, right? And can't I tell people to stay away from those that don't?

But suppose I can't trust CL to do tail recursion. What am I supposed to do about it?

-----

1 point by almkglor 6633 days ago | link

> So would you use the existing (C/C++) implementation of Boehm GC? If so then doesn't that make this not completely implemented in Arc?

And neither is Linux completely implemented in C, and C compilers written in C are not completely implemented in C, because bits and pieces of the libraries they link their code to are written in assembly.

> Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters.

The Lambda the Lutimate papers are a good place to start if you're interested - they include some hand-written assembly code equivalents to Scheme/Lisp code, largely function calls and prefix/suffix. Given that the most basic axioms of Arc include (fn ...) and a function call syntax, this would be quite of interest.

> But suppose I can't trust CL to do tail recursion. What am I supposed to do about it?

Use 'prog and 'go? ^^ Lambda is the lutimate!!

-----

1 point by sacado 6632 days ago | link

"Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters."

I found a good link / tutorial about how to compile a subset of Scheme to C language. The compiler is about 800 lines of Gambit Scheme (blank lines included) and even deals with tail-recursion and continuations ! (well, that's based on the lambda papers...)

No GC, but you can use Boehm and have one for free.

-----

1 point by stefano 6632 days ago | link

You can trust CL to do tail recursion. I'm not sure if the standard requires it but every decent implementation must provide it.

-----

2 points by eds 6632 days ago | link

I think I'll just say that CL-Arc is only compatible with the subset of CL implementation that do tail recursion. (Which happens to be all the implementations I would consider using anyways.)

-----

1 point by kens2 6634 days ago | link

Garbage collection?

-----

3 points by almkglor 6634 days ago | link

The only time you need garbage collection is when you're allocating new memory. This is of course abstracted away into the 'cons procedure you end up calling at each (cons a d) - basically 'cons triggers gc if necessary. You may then very well just use the someone-Boehm garbage collector for C, which will (I think!) helpfully look at registers and stack for you.

The someone-Boehm GC (reportedly) works well with C - I'm reasonably sure that it will work well with assembly.

-----

1 point by sacado 6634 days ago | link

only cons ? What about bignums ? And strings ?

-----

2 points by stefano 6633 days ago | link

If you use the Boehm GC, you can handle everything simply by calling GC_malloc every time you need memory.

-----

1 point by sacado 6633 days ago | link

Yes, you're right. Sorry about that.

Anyway, maybe the right way to do so is by destructuring a Scheme implementation ? Starting from a given implementation, you write your compiler from scratch, but use the facilities of the chosen implementation for the reader and the GC. Then, once it's working, you gradually remove the scaffolding by implementing these things by hand...

-----

1 point by almkglor 6633 days ago | link

Well, if you're going to end up implementing something like my unrolled-lists ideas, then everything can very well be a cons cell underneath. Including bignums and strings.

-----

4 points by stefano 6633 days ago | link

PicoLisp (http://www.software-lab.de/ref.html#cell) uses cons cells to implement everything, from bignums to strings.

-----

1 point by sacado 6633 days ago | link

Hmm, looks like an interesting beast... I'll have a look at it some day...

-----

1 point by eds 6634 days ago | link | parent | on: CL-Arc: Arc Compiler in Common Lisp

Wow, did you come up with that yourself? I'd like to quote it if you don't mind.

-----

1 point by almkglor 6634 days ago | link

Err, yeah. I tend to pattern match on lots of things - I've got some weird takes on Buddhist philosophy somewhere on the boards too.

-----

5 points by eds 6634 days ago | link | parent | on: Can't see the bug in my short routine

It's because 'quote doesn't necessarily create a new list every time (at least not in function definitions). If you do

  (def foo ()
    '(a b c))
(foo) will return the same list every time, so if you modify the list in or before your next call, you'll get the modified instead of the original value.

-----

2 points by map 6634 days ago | link

That helps. Thanks.

-----

2 points by eds 6634 days ago | link | parent | on: CL-Arc: Arc Compiler in Common Lisp

This is a draft of my proposal for an Arc compiler in CL as a part of Google's Summer of Code. Any feedback would be appreciated.

-----

1 point by eds 6633 days ago | link

I posted a slight revision to my proposal, fixing some errors in grammar.

-----

2 points by eds 6635 days ago | link | parent | on: Optional outer parentheses / exterior symbols

>> In the case of Python, yes, it is prejudice.

> No, it's common sense. Call me old-fashioned, but I prefer being able to tell if code is syntactically correct just by looking at it instead of having to perform a hex dump.

Are you trying to say that source code (either with or without meaningful indentation) requires a hex dump to read? Or are you talking about object code (in which case I am pretty sure indentations is entirely irrelevant).

>> Need I remind you what happened when Larry Wall tried to combine all the great ideas from a bunch of different languages into one?

> Yes, it got hugely popular. You got a problem with that?

I don't know about you but Perl's conglomerate syntax, combined with all that extra magic, makes Perl pretty unmanageable in my opinion after a couple hundred lines of code.

From pg's essay Arc at 3 Weeks:

"We're also going to try not to make the language look like a cartoon character swearing. [...] (Dan Giffin recently observed that if you measure Perl programs by the number of keys you have to press, they don't seem so short.)"

-----

2 points by eds 6637 days ago | link | parent | on: Dual Installation of MzScheme

I don't know about dual mzscheme installations, but I do know that the stable Anarki branch from the git repo is compatible with versions up to 372 (and newer, with some slight modifications).

The repo can be accessed at git://github.com/nex3/arc.git , or if that isn't convenient for you I have a tarball up at http://blackthorncentral.net/files/nex3-arc-stable.tar.gz .

-----

5 points by eds 6639 days ago | link | parent | on: Create your own collection: cached-table

Seriously, this Arkani naming thing is starting to bug me, I think we should call it Arki. Or another name that doesn't take a couple of seconds of mental parsing to figure out you're talking about the wiki, not Anarki.

-----

7 points by almkglor 6639 days ago | link

Err, Anarki is a wiki, right? ^^ LOL!

Okay, okay, I'll do s/Arkani/Arki/g ^^.

Edit: done!

-----

2 points by eds 6639 days ago | link | parent | on: Alternate form of fn

It is true though that whenever you overload an operator you depend on the programmer understanding more about a single operation. Thus by definition overloading increases the complexity of a language.

Please see (and refute if you like) my comment at http://arclanguage.org/item?id=5213 .

There is some significant complexity in setting a variable. Note the current existence of four (or more) different forms of 'fn: 'fn (anonymous), 'afn (capturing 'self as a local name), 'rfn (using a user defined local name), and 'def (using a user defined global name). All these forms exist for the purpose of creating functions, but they each has a different way of setting varaibles.

How do you capture that complexity if you want to start combining them into a single operation? Do you just ignore everything except 'fn and 'def and say the user can type the extra character for 'afn and 'rfn? That seems rather inconsistent.

-----

1 point by tokipin 6639 days ago | link

> > It is true though that whenever you overload an operator you depend on the programmer understanding more about a single operation. Thus by definition overloading increases the complexity of a language.

that's if we're looking at it from the perspective of the operator. but if we look at it from the perspective of an expression, i think it is simpler. to illustrate: one of the nice examples on the front page of ruby-lang.org is the following (keeping their comments):

  # Ruby knows what you
  # mean, even if you
  # want to do math on
  # an entire Array
  cities  = %w[ London
                Oslo
                Paris
                Amsterdam
                Berlin ]
  visited = %w[Berlin Oslo]
 
  puts "I still need " +
       "to visit the " +
       "following cities:",
       cities - visited
if we look at it from the perspective of the operators, we see we've overloaded + and - for non-numeric types. however, if we look at it from the perspective of the programmer as they are writing those lines, we see that the programmer just wants to say "remove the cities i've visited from the cities i have to visit," and "cities - visited" is almost a direct translation of that. it makes sense. the fact that that operator happens to be overloaded from another domain is irrelevant. the context secures the role of the operator

-----

1 point by eds 6640 days ago | link | parent | on: Alternate form of fn

"You want brevity? You can't handle the brevity :)"

http://arclanguage.org/item?id=3355

And as I said before (http://arclanguage.org/item?id=5189),

  (fn (() . args) body)
actually works under the current implementation of Arc, it just throws away the first argument.

-----

2 points by eds 6640 days ago | link | parent | on: Arc Summer of Code Project Ideas

An Arc native code compiler would be very nice, but unfortunately I don't really know anything about compiler implementation. So for me it would probably too long a haul for one summer.

Still thinking about Arc Psyco... or a CL implementation of Arc....

-----

1 point by sacado 6640 days ago | link

Then I vote for Psyco :) !

Seriously, I think a CL implementation would be a good thing to have, so as to have access to all the CL library and world, but that would also mean having a somewhat competing implementation, and I'm not abig fan of that. Mainly because time spent on maintaining this implementation is not time spent on the "canonical" one.

But that's just my opinion and the CL implementation could be very valuable to others around here... And maybe a fully compatible CL Arc is possible (I mean one where you run your "canonical" Arc code unmodified, what ever this arc code is, should it deal with hard things to port such as FFI or ccc...), I don't know...

-----

1 point by eds 6640 days ago | link

How would one go about dealing with ccc in a CL implementation of Arc? (Isn't this the biggest difficultly with writing a version of Arc in CL?)

-----

3 points by almkglor 6640 days ago | link

I think pg's On Lisp contains some samples on transforming code to continuation passing style, which would allow us to create ccc. It would also mean bashing the code quite a bit, and CL doesn't necessarily optimize functions that are passed around and called in tail position, unlike Scheme which requires such to be optimize.

-----

1 point by sacado 6640 days ago | link

Yes, it is not required by the norm, but as far as I know, ACL, SBCL, CMUCL and CLisp all implement tail call optimizations (for the latter, only on compiled functions though). Maybe that could be considered enough to rely on it.

-----

1 point by eds 6640 days ago | link

Perhaps I'll have to go back and read that part of On Lisp some time. (I stopped after mvdo because the macros were hurting my head. But perhaps I've developed a thicker skin since then.)

-----

2 points by kens1 6640 days ago | link

Peter Norvig has a simple Scheme interpreter (tail recursive, with call/cc) implemented in Common Lisp at http://norvig.com/paip/README.html - it would probably be a reasonable basis. One option would be to implement Scheme in Common Lisp, and then simply run the existing Arc ac.scm on top of that. But it would probably make more sense to cut out the middle layer and modify Norvig's code to run Arc instead of Scheme.

-----

More