1 point by akkartik 4218 days ago | link | parent

Thanks a lot for the comments! In particular, you've got me thinking about the numeric filename prefixes. The fatal errors have been getting onerous for me as well; thanks for drawing my conscious attention to them.

Thanks for reporting the crazy load bug. I'll fix it later today.

Yes, loading wart is painfully slow. It's gotten 2x slower since the infix changes. Startup isn't doing anything special; it's just that the wart 'prelude' is a lot more code (since it's most of the language) than we typically type into the repl. I'll work on this some and get back to you.

You're right that relying on load order for generic dispatch makes some cases weird. On the flip side, it's powerful enough to permit multiple dispatch, like the regex idea in http://arclanguage.org/item?id=16833 that permits either arg to be a regex. When I experimented with generic functions in the past I was always unsure whether to dispatch on the first or last arg; functions like keep, for example, like the last arg to be generic. I think the problem isn't predicate dispatch, but checking predicates in load order. Is there a way to make the clauses load-order independent? Or an elegant way to influence load order without moving code around? Hmm, I'm going to think about this.

Regarding 6: I really don't intend to make fully-parenthesized lisp some sort of second class citizen. Feel absolutely free to stick with it for code you write. And I'd love to hear if it makes reading existing code more painful.

Regarding 5: I intended to show that construct_macro_call is an internal detail. I would move it later if I could, but it's used when extending def later in the file. I could a) just unindent it where it is, b) move it after mac and unindent it, c) move it to the bottom of the file, and not use the new mac to extend def (because that wouldn't work yet). Somehow I don't like any of those choices. What do you think?

Regarding 7: It didn't occur to me to watch keypresses on shift. That wasn't a motivator at all for eliminating parens or anything else. (I've got parens swapped with square brackets in my local environment for lisp.) In general, wart is more concerned with reading than with writing. But I'm going to think about it.

Regarding 9: Yes, this is a little weird when you come from other lisps. Wart doesn't consider ' to be an abbreviation for quote. It's the other way around; ' is part of the core syntax (it's processed during tokenization) along with backquote and unquote. The name quote is just for manipulating code. Things like if (car.form = quote), etc. I used to use it in my inliner/partial evaluator, but the inliner is not even in there anymore, and there's really no reason for code-manipulation stuff to be available so early on. I'm just going to take the long forms out. [Edit: done]

Another difference with traditional lisp: 'a is read as (' . a) rather than (' a). Similarly for backquote and unquote. This allows me to sidestep complexities like what happens in (' a b) or ,@,@x that I haven't found a use for yet.
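
For concreteness, here's the difference in reader output (the wart side follows the description above):

  ; traditional lisps: quote forms read as proper lists
  'a  =>  (quote a)   ; i.e. (quote . (a . nil)), so the payload is cadr.form

  ; wart: the quote marker pairs directly with a single form
  'a  =>  (' . a)     ; a dotted pair, so the payload is just cdr.form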



2 points by fallintothis 4217 days ago | link

Is there a way to make the clauses load-order independent?

The Magpie stuff Pauan linked is really interesting. From my layman's intuition, though, it seems able to linearize clauses only because the clauses are limited. As long as we know what sort of clauses we're working with, we can define a partial order on them. The same also happens to hold true for "typical" generic methods that dispatch based on class, because classes readily form a poset (using the subclass relationship). I have a feeling that the answer would be somewhere around "intractable" for arbitrary code, especially if the code didn't have formal semantics we could reason about. I mean, arbitrary code might never halt, so it'd be impossible to linearize those cases to the "end of the line" just because (in general) the Halting Problem is undecidable. But hey, I'm not a doctor.

At least the order that definitions appear in the code easily linearizes the clauses, despite the elephant-in-the-room of loading multiple files.

Regarding 6: I really don't intend to make fully-parenthesized lisp some sort of second class citizen.

That's good. I didn't mean to mischaracterize. I think that might inadvertently be the impression (or even outcome?) from so much as having a no-parens rule: the thought goes from "let's do Lisp with fewer parentheses" to "let's do Lisp with no parentheses", slippery slope fallacy though that is. Plus, at a glance, their elision seems common in your source.

And I'd love to hear if it makes reading existing code more painful.

I think that's a big reason I don't like removing them. There are fewer visual cues for grouping expressions. Not as big a deal in something where indentation is all you have, like Python. A little weirder when my head's trying to fill in where parentheses should rightly be. Blah blah, curmudgeon, blah.

Regarding 5: What do you think?

It only has the one call site; could you just inline it using afn?
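
For anyone unfamiliar with the idiom: afn is Arc's anaphoric fn, which binds the function to self, so a single-use recursive helper can live right at its call site. A generic sketch (not wart's actual construct_macro_call):

  ; instead of a named (def helper ...) plus one call,
  ; recurse anonymously where the result is needed
  ((afn (n acc)
     (if (is n 0)
         acc
         (self (- n 1) (* n acc))))
   10 1)  ; => 3628800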

Regarding 7: It didn't occur to me to watch keypresses on shift.

It's more of a heuristic than a solid metric, but I definitely notice it. E.g., I get annoyed working in camelCase languages.

Regarding 9: I'm just going to take the long forms out.

So, how do I manipulate quotes in macroexpansions, as needed in qq.wart?

-----

2 points by akkartik 4206 days ago | link

"how do I manipulate quotes in macroexpansions, as needed in qq.wart?"

Yes, you're right that I need to bring back my definitions of quote, etc. for qq.wart. But there was more to it than that. Here is -- I think -- a relatively faithful port of qq-expand including the optimizer: http://gist.github.com/3971932#file_080qq.wart. I say 'relatively' because wart doesn't support multiple args to quote/unquote/quasiquote, so a lot of your hardest tests become moot. Also, the backquote doesn't expand to a call to quasiquote in wart.

Sorry this took so long. Even when you provided such thorough tests it was hard to make myself focus on this exercise; there are always alternatives with more instant gratification. But in the end it was a fun exercise in recreating the 'theory' of a (tiny) codebase (http://alistair.cockburn.us/ASD+book+extract%3A+%22Naur,+Ehn...)

---

Rather than start with your port I went back to your arc sources and tried to track my process in an effort to generalize lessons. I was able to split your recursive functions into clauses without much effort[1]. For example:

  ; Arc
  (def qq-cons (expr1 expr2)
    ; assume expr2 is non-splicing
    (let operator (if (splicing expr1) 'dotted-list 'cons)
      (if (no optimize-cons*)
           (list operator expr1 expr2)
          (and (~splicing expr1) (literal expr1) (no expr2))
           (list 'quote (list (eval expr1)))
          (no expr2)
           (list 'list expr1)
          (atom expr2)
           (list operator expr1 expr2)
          (caris expr2 'list)
           (dotted-list 'list expr1 (cdr expr2))
          (and (quoted-non-splice expr1) (quoted-non-splice expr2))
           (list 'quote (cons (cadr expr1) (cadr expr2)))
          (list operator expr1 expr2))))

  # Wart; an early incorrect version
  def qq_cons(expr1 expr2)
    # assume expr2 is non-splicing
    let operator (if splicing?.expr1 'dotted_list 'cons)
      (list operator expr1 expr2)

  def qq_cons(expr1 expr2) :case (and Optimize_cons quoted_non_splice?.expr1
                                                    quoted_non_splice?.expr2)
    (list quote (list cdr.expr1 cdr.expr2))

  def qq_cons(expr1 expr2) :case (and Optimize_cons (carif.expr2 = 'list))
    (dotted_list 'list expr1 cdr.expr2)

  def qq_cons(expr1 expr2) :case (and Optimize_cons atom?.expr2)
    let operator (if splicing?.expr1 'dotted_list 'cons)
      (list operator expr1 expr2)

  def qq_cons(expr1 expr2) :case (and Optimize_cons no.expr2)
    (list 'list expr1)

  def qq_cons(expr1 expr2) :case (and Optimize_cons ~splicing?.expr1
                                                    literal?.expr1
                                                    no.expr2)
    (list quote (list eval.expr1))

The key was to peel the cases off in reverse order.

As in the past, wart is more verbose than arc, but in exchange you get to move the cases around. For example, all but the first clause can be in a separate file so you don't care about them when Optimize_cons is unset.

---

But there was more to this than a simple transliterated port, because quote/unquote/backquote are represented differently in wart than in other lisps. I had to track down and update all the places in your version that made this foundational assumption.

Since the backquote is baked into wart's reader rather than expanding to quasiquote like in other lisps, I traced through all top-level calls to qq-expand as a baseline to compare against:

  ; change in original arc version
  (mac quasiquote (expr)
    (ret ans qq-expand.expr
      (ero expr " => " ans)))

Once I had this output in hand I could start porting your tests. I started with a pass at just visually catching as many cases of treating the cdr of quote, quasiquote, etc. as a list as I could[2]; you might enjoy seeing what I was missing at that point, the 5 tokens that took a week of debugging :)

Two forms turned out to be quite useful in my debugging:

  # http://arclanguage.org/item?id=11140
  after_fn qq_expand_list(expr)
    (prn "qq_expand_list " expr "
  => " result)
  # ..and so on for qq_transform, qq_cons, qq_append.

  # Easily turn debug prints on and off.
  def xprn args
    nil

---

References are to snapshots of the same gist:

[1] https://gist.github.com/3971932/f85f4da34d79d7d6cb1c9b01ce60.... Try diffing this version against your port.

[2] https://gist.github.com/3971932/72d159184afaa68e42920801b75e...

-----

2 points by Pauan 4217 days ago | link

"As long as we know what sort of clauses we're working with, we can define a partial order on them."

You could require that the user provide an order when defining a new pattern.

---

My idea with Nulan is that functions shouldn't change: they should be static algorithms. Thus, Nulan doesn't allow you to overload functions like you can in Arc or Wart. Instead, if you wish to overload a function, you define a new gensym which can be stuck into any object to overload it:

  $def %foo (uniq)

  $def foo
    { %foo F } -> ...
    ...        -> ...

Now, if you call the "foo" function with an object that has a %foo key, it'll call the first clause. Otherwise it'll call the second clause.

This gives complete control to the function to decide whether it can be overloaded or not, and exactly how the overloading happens. I think this is the most flexible approach I've ever seen.

As a more concrete example, objects can have a %len key to define a custom (len ...) behavior, an %eval key for custom evaluation behavior, %call for custom call behavior, %pattern-match for custom pattern match behavior, etc.

And because of the way pattern matching works, you can even match against multiple gensyms at once:

  $def %foo (uniq)
  $def %bar (uniq)

  $def foo
    { %foo F
      %bar G } -> ...
    { %foo F } -> ...
    { %bar G } -> ...
    ...        -> ...

The above defines different behavior when an object has both the %foo and %bar keys, and also when it has just %foo or just %bar.

---

"So, how do I manipulate quotes in macroexpansions, as needed in qq.wart?"

I think hardcoding symbols (like in quasiquote) is a terrible idea, so I would just not define quasiquote. Instead, I'd use Arc/Nu quasisyntax, which doesn't use quote at all:

  `(foo bar qux)   -> (list foo bar qux)
  `(foo (bar) qux) -> (list foo (list bar) qux)

If you want quote, just use unquote:

  `(foo ,'bar qux) -> (list foo (quote bar) qux)

Also, I wouldn't define quasiquote as a macro; I'd make it part of the reader.
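
To make the expansion rule concrete, here's a minimal Arc sketch of such an expander (a hypothetical helper, not Arc/Nu's actual implementation): symbols stay unquoted, sublists become nested list calls, and unquote splices its argument in as-is.

  (def qs-expand (expr)
    (if (atom expr)
          expr
        (is (car expr) 'unquote)
          (cadr expr)
        (cons 'list (map qs-expand expr))))

  ; (qs-expand '(foo (bar) qux))          => (list foo (list bar) qux)
  ; (qs-expand '(foo (unquote 'bar) qux)) => (list foo (quote bar) qux)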

-----

3 points by fallintothis 4216 days ago | link

You could require that the user provide an order when defining a new pattern.

Interesting idea, but the proper mechanism for it eludes me.

- You could have the user give a precedence level, since integers are totally ordered, but that's a terrible idea---magic constants in every generic declaration. Any more restricted domain (e.g., specifying high, medium, or low precedence) gets a little vague.

- You could have a simpler mechanism where each new rule is added to a known, fixed location in the linearization. So basically the order is a double-ended queue and generic declarations have keywords for "push me to the front" or "push me to the back" (see the sketch after this list). But that's probably a bit too basic, and still relies on declaration order (in fact, complicating it).

- You could have the user specify a previously-declared chunk of code to execute first, like

  def g(x)
    (prn "default")

  def g(x) :case (odd? x)
    (prn "odd")

  def g(x) :case (x = 1) :before (odd? x)
    (prn "one")

But that's too tightly coupled and requires code duplication. Plus, if you're already duplicating code that sits right next to a definition, why not just reorder the definitions so a simpler order-they're-read mechanism works?

- I'm really scraping the bottom of the barrel now... Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial order. I really don't see that happening.
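
Here's a minimal Arc sketch of the double-ended-queue idea above (all names hypothetical): clauses live in a list, a declaration says which end to join at, and dispatch tries them front to back.

  (= clauses* nil)

  (def add-clause (test f (o where 'front))
    (let clause (list test f)
      (if (is where 'front)
          (push clause clauses*)
          (= clauses* (join clauses* (list clause))))))

  (def dispatch (x)
    ; first clause whose test passes wins
    (aif (find [(car _) x] clauses*)
         ((cadr it) x)
         (err "no applicable clause for" x)))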

When using generics myself, I don't want to think about these sorts of things too hard. That's what I like about class-based generic dispatch: I just have to think about the function one class at a time, and the right function will be applied to the right instances using their simple, implicit order. For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it.

-----

4 points by rocketnia 4216 days ago | link

"You could have the user specify a previously-declared chunk of code to execute first"

I think this is similar to Inform 7's approach, where rules can be referred to by name. Generally, every rule in a library has a name, but individual stories omit them unless necessary.

---

"- I'm really scraping the bottom of the barrel now...Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial-order. I really don't see that happening."

That's the approach I took in Lathe a long time ago.[1] Under my approach, the partial order is itself extensible, and it even determines the ordering of its own extensions. I quite like the expressiveness of this approach, but this expressiveness is barely meaningful for in-the-large programming: In order for the precedence rule programmer to have enough information to make judgments by, they need to require all other programmers to annotate their extensions with appropriate metadata. That or they need to maintain their own up-to-date metadata describing common libraries! Instead of programming language battles, we get framework battles full of boilerplate.

Since finishing that system, I've been pondering the "framework" conventions I'd like to use myself, and I've been trying to fold those decisions into the core of a language design.

Whereas Magpie and many other multimethod systems make a big deal about subclasses, I try to avoid any kind of programming where almost all X's do Xfoo, but some X's are also Z's so they do Zfoo instead. By the same principle, I'd avoid the [odd? _] vs. [= 1 _] case altogether. In fact, as long as I succeed in formulating the non-overlapping designs I like, I avoid the need for precedence rules altogether... but it's still an option I keep in mind just in case.

Currently, I favor supporting extensibility by way of sealer/unsealer pairs and first-class (multi)sets.

Sealer/unsealer pairs work when each extension is an all new range of possibilities. In Arc, I'd just represent these as tagged values, and I wouldn't bother to pursue the secure encapsulation associated with sealer/unsealer pairs. In linear logic, the additive operators describe this kind of combination.

First-class (multi)sets work for when each extension is a new participant in a single computation. In Arc, a list is a good representation for this. In linear logic, the multiplicative operators describe this kind of combination.

When precedence is necessary, it can be modeled explicitly as a transformation of a (multi)set into a more structured model. I think any programmer who makes an extensible tool should carefully choose a transformation that makes sense for their own purposes--whether it's really "precedence" or something else.
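
In Arc terms, a minimal sketch of that last point (all names hypothetical): keep the extensions in a flat list, and make the "precedence" an explicit, tool-chosen transformation of that list at the point of use.

  (= handlers* nil)  ; the unordered (multi)set of extensions

  (def add-handler (rank f)
    (push (list rank f) handlers*))

  (def run-handlers (x)
    ; the explicit transformation into a more structured model:
    ; here a simple sort by rank, but it could be anything
    (each (rank f) (sort (compare < car) handlers*)
      (f x)))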

---

"For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it."

That's my take on it, too. My examples of multi-rule functions have included factorial and exponentiation-by-squaring, but those are unrealistic. There's no reason for an extension to interfere in that process, so it might as well be the body of a single declaration.

When I was using predicate dispatch more often, I often discovered that I could satisfy most of my use cases with just two extension annotations:

- This is the real meaning of the function, and all the other cases are just for auto-coercion of parameters into the expected format. Use this extension first. (This came up when implementing 'iso in terms of 'is, and it also makes sense for coercion functions themselves, such as 'testify; see the sketch after this list.)

- This extension is the last resort. Use it last. (Maybe it throws an informative error, or maybe it returns false when everything else returns true. This also works for all-accepting coercion functions like 'testify, but I'm suspicious of this kind of design. A call to 'testify always does Xfoo, except when it does Zfoo.)
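
For reference, arc3.1's testify already has exactly this two-clause shape in miniature: the "real meaning" clause passes functions through, and the all-accepting last resort coerces everything else to an equality test.

  (def testify (x)
    (if (isa x 'fn) x [is _ x]))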

---

[1] Unfortunately, right now arc-Lathe's version of the precedence system isn't something I maintain, and Lathe.js's version has no easy-looking examples because I no longer store extensions in mutable global state. Lathe.js's "hackable" utilities are now higher-order utilities that take all their once-global dependencies as explicit parameters.

-----

1 point by Pauan 4216 days ago | link

"they need to require all other programmers to annotate their extensions with appropriate metadata."

If Nulan had multimethods, I'd probably just require programmers to add additional metadata to the object in addition to the %pattern-match gensym. But Nulan doesn't have linearization or multimethods, so I avoid that whole mess!

---

"Whereas Magpie and many other multimethod systems make a big deal about subclasses"

And I have to agree: objects are amazing, but classes and subclasses are terrible. In fact, I'm of the opinion that probably all hierarchical relationships are too restrictive. Hence Nulan's object system, which is completely flat but has pretty much all the benefits of classes/prototypes.

Basically, the behavior that is common to all objects (of a particular kind) is put into a function, and behavior that is specific to a particular object is put into the object itself. And immutability gives you wonderfully clean semantics for prototype/class behavior, without all the complication and baggage.

-----

2 points by Pauan 4216 days ago | link

Well, I was thinking in terms of Nulan, where you define new patterns with a custom "%pattern-match" property. It wouldn't be hard to add in a "%pattern-match-order" property or such, though I admit I haven't really thought through how such a property would work...

Obviously that kind of system wouldn't work in wart where predicate dispatch is based on arbitrary function calls. Hence my idea in Nulan of not allowing function mutation. I would write the above code like this:

  $def g
    1        -> (prn "one")
    (odd? X) -> (prn "odd")
    ~        -> (prn "default")

That is, it first tries 1, then (odd? X), then the ~ wildcard, which matches anything. This is equivalent to the following Arc code:

  (def g (x)
    (if (is x 1)
          (prn "one")
        (odd? x)
          (prn "odd")
        (prn "default")))

If you want to make a function extensible, you would use a property on an object, like I described here: http://arclanguage.org/item?id=16851

-----

1 point by akkartik 4216 days ago | link

Yeah, I went through a similar thought process.

One possibility is to pin a clause at the front:

  def g(x) :priority-case (x = 1)
    (prn "one")

It generalizes to your high/medium/low categories, but perhaps it's useful if you permit exactly one priority clause (newer ones overwrite it).

-----

1 point by akkartik 4217 days ago | link

I think I'm getting lost in the $def foos and the %foos. Can you show how you would make, say, len aware of queues?

-----

2 points by Pauan 4217 days ago | link

I could switch to Arc syntax if you like. Looking at the Arc implementation of queues...

  (def queue () (list nil nil 0))
  (def qlen (q) (q 2))

Well, Arc uses mutation, and I wouldn't, but here's how you would define it in Nulan:

  $def queue; ->
    [[] [] 0
      @{ %len: X -> (X 2) }]

And here's the same thing, but with Arc syntax[1]:

  (def queue ()
    (array nil nil 0
      @(dict %len (fn (x) (x 2)))))

The above is basically exactly the same as Arc, except it uses an array rather than a cons, and it has a custom %len property. If you don't want the different parts of the queue to be public, you could use gensyms like this:

  (w/uniq (len left right)
    (def queue ()
      (dict %len  (fn (x) (x len))
            len   0
            left  nil
            right nil)))

Rather than returning an array of 3 elements, it returns an object, which has a "len", "left", and "right" property.

In either case, the object that is returned has a %len property, which is a function that computes the length. The "len" function would then be defined like this:

  (def len (x)
    ((x %len) x))

That is, it first looks up the %len property in the object, and then calls it with itself as the first argument.
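
So, assuming the gensym version of queue above, usage would look something like this (a sketch in the same Arc-style notation):

  (= q (queue))  ; a fresh object carrying its own %len function
  (len q)        ; looks up (q %len), calls it on q  => 0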

---

* [1]: You might be wondering what's going on here... well, in Nulan, a "list" is just like a JavaScript array: it's an ordinary object which uses numeric keys and has a custom %len property. In particular, that means that all the tools that work on objects work on arrays too.

The @ syntax is used for splicing. So, for instance, to merge 3 objects together, you'd say { @foo @bar @qux }. And because arrays are objects too, you can splice them.

So what we're doing is, we first create an array of 3 elements, and we then splice in a new object. This new object has a custom %len property, which overrides the %len property of the array.

Alternatively, we could have done it like this:

  (set (array nil nil 0) %len ...)

But I think it's better to use the splicing notation, especially when you use [] for arrays and {} for objects. By the way, these two are actually equivalent:

  { @[[] [] 0]
    %len ... }

  [[] [] 0
    @{ %len ... }]

In the first case we take an empty object, splice in an array, and then assign a %len property. In the second case, we take an array and splice in an object that has a %len property.

In either case, the object that is returned has the same keys and values.

-----

2 points by Pauan 4217 days ago | link

Sorry to hijack your wart thread, but... I just realized something. I wanted to allow for iterating over the keys of an object, but that caused issues, as I discussed with rocketnia earlier (http://arclanguage.org/item?id=16823).

Anyways, JavaScript gets around this problem by allowing you to define a property on an object that is "non-enumerable". But any kind of system that I add in that lets you define "non-enumerable" properties is going to be big and complicated.

Instead, I found a very simple way to create enumerable objects in a way that is completely customizable, and doesn't even need to be built-in:

  (= %keys (uniq))
  
  (def iterable-object ()
    (let keys []
      { %set  (fn (x k v)
                (pushnew k keys))
        %rem  (fn (x k)
                (pull k keys))
        %keys (fn (x) keys) }))

Here's what's going on. First, we create the %keys gensym; the property stored under it is supposed to be a function that returns a list of the keys in the object.

The function "iterable-object" returns a new object that has custom %set, %rem, and %keys properties:

%set is called when assigning a property to the object. It takes the key and pushes it into the "keys" array.

%rem is called when deleting a property from the object. It takes the key and removes it from the "keys" array.

%keys just returns the "keys" array.

Now, these objects are capable of being enumerated, which means they could be passed to "map", for instance. But here's the fun part: you can completely control which properties are enumerated and which aren't.

In this case, the object will only enumerate properties that are added or removed after the object is created. So any properties defined previously are still hidden. At least, depending on how I implement splicing and %set...
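
As a sketch of the intended behavior (in the same Arc-style notation, and assuming assignment and deletion route through %set and %rem, with a keys function that invokes %keys, analogous to len above):

  (= obj (iterable-object))
  (set obj 'a 1)  ; %set fires: 'a is pushed onto the keys array
  (set obj 'b 2)
  (keys obj)      ; => (a b) -- only keys added after creation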

---

What the above basically means is... "every computer problem can be solved by the addition of more properties on an object" :P

-----

1 point by Pauan 4216 days ago | link

After fidgeting with the syntax, here's what I got:

  $def iterable-object ->
    [ %set  -> x k o n
              [ @x %keys -> ~ (pushnew (keys x) k) ]
      %rem  -> x k o
              [ @x %keys -> ~ (pull (keys x) k) ]
      %keys -> ~ {} ]

I actually realized that swapping [] and {} is way better, meaning that [ foo bar ] is (dict foo bar) and { foo bar } is (array foo bar). There are two reasons for this:

1) {} is closer to () than [] is, which is really nice in macros:

  $mac $accessor -> n v
    $uniq %a
      {$def n -> %a
        {{%a v} %a}}

2) I found that {} didn't stand out enough, but [] does.

---

By the way, in case you're curious about the Nulan syntax... $ is prefixed to vau/macros, which is why it's "$def" rather than "def".

-> is the function macro, which means (-> x y z ...) is equivalent to (fn (x y z) ...) in Arc.

~ is the "wildcard syntax" which matches anything, just like _ in Haskell.

[ foo bar ] translates to (dict foo bar), and { 1 2 3 } translates to (array 1 2 3).

@ is for splicing, which means that [ @foo @bar @qux ] merges three objects into one. If you want to update an object with new properties, it's idiomatic to say [ @foo ... ].

Gensyms are prefixed with %.

-----

1 point by Pauan 4216 days ago | link

Which, if translated into JavaScript, would look something like this...

  var iterableObject = function () {
    var a = {};
    a.set = function (x, k, o, n) {
      var b = Object.create(x);
      b.keys = function () {
        return pushnew(keys(x), k)
      };
      return b  // return the extended object, as in the Nulan version
    };
    a.rem = function (x, k, o) {
      var b = Object.create(x);
      b.keys = function () {
        return pull(keys(x), k)
      };
      return b
    };
    a.keys = function () {
      return []
    };
    return a
  }

-----

2 points by rocketnia 4216 days ago | link

"Instead, I found a very simple way to create enumerable objects in a way that is completely customizable, and doesn't even need to be built-in"

That's my approach too. Or it would be, if I actually got around to building a language out of Cairntaker. :)

-----

2 points by akkartik 4217 days ago | link

:) All very interesting, thanks. The len example was very clear. Hijack away!

-----

3 points by Pauan 4218 days ago | link

"Is there a way to make the clauses load-order independent?"

Magpie has a precedence system with its multimethods:

http://magpie-lang.org/multimethods.html#linearization

-----

2 points by akkartik 4218 days ago | link

I've fixed the load bug.

I also managed to return performance to pre-infix levels by the simple expedient of turning on compiler optimizations; on my machine it yields a 2x speedup. Startup time is now 2.6s, and tests take 35s to run.

Finally, the interpreter no longer dies when it encounters unbound vars, or when trying to invoke a non-function.

Update 9 hours later: After a little more optimization, wart will now start up in less than a second if the binary is up to date. The first run after downloading (build+run time) takes 20s. Tests run in 10s (excluding build time).

I tried to be clever and assume frequent building when running tests, but it turns out wart now does enough work that optimizations always improve build+test time.

http://github.com/akkartik/wart/compare/5707dcc022...904be9f...

-----