Arc Forumnew | comments | leaders | submitlogin
An update on wart
3 points by akkartik 1325 days ago | 8 comments
2 months ago I showed this bug in wart:

  arc> (iso :x :x)
  nil
(http://arclanguage.org/item?id=18287)

This is now fixed. In the process I had to overhaul how wart binds params in fn. It should be a lot more robust, but the code has gotten a lot more complected. I'll try to explain why.

In broad terms, traditional lisp evaluators handle lambda binding as follows:

  Eval all args -> bind sequentially.
When I implemented pervasive keyword args for all params back in 2011 I changed this to:

  Reorder all args -> Eval all args -> bind sequentially.
Basically it works by first 'pinching out' all keyword args into a table, then successively binding args while looking for candidates in the table.

However, wart also added quoted parameters at some point. Useful as a primitive to build macros out of. You could quote a parameter to suppress evaluation of corresponding args at call time. But I kept having to add special cases to support this in the second, eval step above. Eventually, in early 2013 I switched to:

  Reorder all args -> Bind in param sequence, eval'ing as necessary.
A year later I started noticing more issues. The culprit was parameter aliases, which let me give params distinct names for keyword args. In wart, aliases are fully general and support haskell-like as-patterns (http://www.haskell.org/tutorial/patterns.html):

  def (foo (xs | (head ... tail))
Here foo will receive an arg called xs, whose car is bound to head, and cdr to tail (the ellipses are like dotted lists in regular scheme/arc). Very useful feature for building generic functions and so on; it let me very straightforwardly overlay features of def and mac atop one another (https://github.com/akkartik/wart/blob/96d90427d9/047generic.wart). However, the cost was additional complexity that I thoroughly under-estimated. What happens if you alias some quoted and some unquoted params, in various orders? In many cases I was silently interpreting the cons inside the alias as a destructured param, which required first eval'ing it, causing the above bug. There were other issues, like this:

  def (foo ... ('params | (x y)))
    some body

  (foo 1 :x 2)
Should params be (1 :x 2) or (2 1)? Should the answer change based on whether it's quoted or not?[1] This and other issues basically convinced me that reordering args in advance was no longer defensible. Instead, I am now at this monstrosity:

  In param sequence, bind keyword or position args eval'ing as necessary.
All structure is gone. But I have a lot more tests and things seem to work much better. Hopefully structure will return in time as I gradually work through possible refactorings.

Anyway, I'm still unconvinced that these features are all a good idea as I said before. But at least I have a working prototype to judge over time.

A summary of code changes: https://github.com/akkartik/wart/compare/8bcd5212c...96d90427d9

[1] The answer to this question is now always (1 :x 2) regardless of whether or not params is quoted. So I can afford to not have to reorder args when binding it. One new constraint is that param aliases can only include conses and as-params at the end. So this is illegal:

  (a | (b c) | d)
But these are legal and supported:

  (a | d | (b c))

  (a | (b ... (c | (d ... (e | (f g))))))


2 points by rocketnia 1325 days ago | link

In case it helps (or just drives a discussion), I've ironed out my own design spec for the features you're going for.

It's not quite the same as what you're describing. For one thing, it doesn't include the new restrictions you're talking about.

For another thing, it includes a new pattern type. Symbols have a lot of duties in your patterns because they deal with evaluation and keywords. I found it easiest to separate symbol patterns into two different types of patterns, only one of which deals with keywords. The non-keyword patterns I represent as strings (just to distinguish them from symbols). It would be easy to replace this with a special form or with some other literal type.

...Oh, I guess you don't have an explicit keyword parameter syntax either. I assumed you did, so it's in here too. (It could actually help clarify what's going on here. The symbol pattern deals with keywords and does just a bit more on top of that, so I like the side-by-side comparison.)

---

The pattern language has several kinds of patterns. A pattern operates on a single unevaluated s-expression (henceforth "expr") and an evaluation environment, and it outputs a map of variable bindings. It also outputs an evaluation result, if it evaluates the expr as part of its processing. (The evaluation result is only used for (<a> | <b>) patterns, to prevent duplicate evaluations.)

These are the available patterns:

  <str>  ; where <str> is a string
Evals the current expr and binds the variable named <s> to the result value.

  <s>  ; where <s> is a symbol
Runs the pattern <str> on the current expr, where <str> is the string name of the symbol <s>.

Uses those variable bindings and that evaluation result.

(That may look simple enough, but a completely different symbol pattern applies when matching a list element. See below.)

  '<a>
Runs the pattern <a> on a quoted version of the current expr.

Uses all variable bindings produced by <a>.

  ()
Evals the current expr. If the result isn't nil, raises an error.

  ; any cons cell
Does several special things depending on the contents of the pattern's cons cell:

  (<a> | <b>)
Runs pattern <a> on the current expr.

If that pattern returned an evaluation result, runs pattern <b> on a quoted version of that result value. Otherwise, runs pattern <b> on the current expr.

Uses all variable bindings produced by <a> and <b>, preferring the bindings in <b>.

Uses the evaluation result of <a> or <b> if available, preferring the result of <b>.

  (<before...> <k> <v> ... <rest>)
    ; where <before...> is any number of non-keyword, non-symbol
    ; elements; <k> is a keyword; and <rest> could be anything,
    ; especially an improper list
This pattern treats the current expr as an unevaluated improper list, looking for the first occurrence of <k> with another element after it (which we will refer to as the value expr).

If the keyword binding is found: Matches <v> against the value expr. Matches <rest> against an alteration of the current expr, changed to remove the keyword and the value expr. Uses all variable bindings produced by <v> and <rest>, preferring the bindings in <rest>.

If the keyword binding is not found: Matches the pattern <rest> against the entire expr. Uses those variable bindings.

  (<before...> <s> ... <rest>)
    ; where <before...> is any number of non-keyword, non-symbol
    ; elements; <s> is a symbol; and <rest> could be anything,
    ; especially an improper list
Let <k> be the keyword with the same name as the symbol <s>, and let <str> be this name as a string.

This pattern treats the current expr as an unevaluated improper list, looking for the first occurrence of <k> with another element after it (which we will refer to as the value expr).

If the keyword binding is found: Matches <str> against the value expr. Matches <rest> against an alteration of the current expr, changed to remove the keyword and the value expr. Uses all variable bindings produced by <str> and <rest>, preferring the bindings in <rest>.

If the keyword binding is not found: Matches the pattern (<str> ... <rest>) against the entire expr. Uses those variable bindings.

  ; for any other kind of cons cell...
  (<a> ... <b>)
Evals the current expr. If the result isn't a cons cell, raises an error. Otherwise, runs the patterns <a> and <b> on quoted versions of the result's car and cdr.

Uses all variable bindings produced by <a> and <b>, preferring the bindings in <b>.

-----

1 point by rocketnia 1324 days ago | link

I found another mistake I made. In the two rules that use <before...> and <rest>, that <before...> should be used whenever the rule matches the remainder of the expr.

In particular, for symbols to work as positional args whenever there isn't a keyword, the symbol rule needs to fall back to (<before...> <str> ... <rest>), reflecting the original location of the symbol in the pattern. I just had it falling back to (<str> ... <rest>) for some reason.

---

  (<before...> <k> <v> ... <rest>)
    ; where <before...> is any number of non-keyword, non-symbol
    ; elements; <k> is a keyword; and <rest> could be anything,
    ; especially an improper list
This pattern treats the current expr as an unevaluated improper list, looking for the first occurrence of <k> with another element after it (which we will refer to as the value expr).

If the keyword binding is found: Matches <v> against the value expr. Matches <rest> against an alteration of the current expr, changed to remove the keyword and the value expr. Uses all variable bindings produced by <v> and <rest>, preferring the bindings in <rest>.

If the keyword binding is not found: Matches the pattern <rest> against the entire expr. Uses those variable bindings.

  (<before...> <s> ... <rest>)
    ; where <before...> is any number of non-keyword, non-symbol
    ; elements; <s> is a symbol; and <rest> could be anything,
    ; especially an improper list
Let <k> be the keyword with the same name as the symbol <s>, and let <str> be this name as a string. This pattern treats the current expr as an unevaluated improper list, looking for the first occurrence of <k> with another element after it (which we will refer to as the value expr).

If the keyword binding is found: Matches <str> against the value expr. Matches <rest> against an alteration of the current expr, changed to remove the keyword and the value expr. Uses all variable bindings produced by <str> and <rest>, preferring the bindings in <rest>.

If the keyword binding is not found: Matches the pattern (<str> ... <rest>) against the entire expr. Uses those variable bindings.

---

After this fix:

---

  (<before...> <k> <v> ... <rest>)
    ; where <before...> is any number of non-keyword, non-symbol
    ; elements; <k> is a keyword; and <rest> could be anything,
    ; especially an improper list
This pattern treats the current expr as an unevaluated improper list, looking for the first occurrence of <k> with another element after it (which we will refer to as the value expr).

If the keyword binding is found: Matches <v> against the value expr. Matches the pattern (<before...> ... <rest>) against an alteration of the current expr, changed to remove the keyword and the value expr. Uses all variable bindings produced by <v> and (<before...> ... <rest>), preferring the bindings in the latter.

If the keyword binding is not found: Matches the pattern (<before...> ... <rest>) against the entire expr. Uses those variable bindings.

  (<before...> <s> ... <rest>)
    ; where <before...> is any number of non-keyword, non-symbol
    ; elements; <s> is a symbol; and <rest> could be anything,
    ; especially an improper list
Let <k> be the keyword with the same name as the symbol <s>, and let <str> be this name as a string. This pattern treats the current expr as an unevaluated improper list, looking for the first occurrence of <k> with another element after it (which we will refer to as the value expr).

If the keyword binding is found: Matches <str> against the value expr. Matches the pattern (<before...> ... <rest>) against an alteration of the current expr, changed to remove the keyword and the value expr. Uses all variable bindings produced by <str> and (<before...> ... <rest>), preferring the bindings in the latter.

If the keyword binding is not found: Matches the pattern (<before...> <str> ... <rest>) against the entire expr. Uses those variable bindings.

-----

2 points by rocketnia 1324 days ago | link

Oops, in one of those rules I said <s> where I meant <str>. Here's the corrected version:

  <str>  ; where <str> is a string
Evals the current expr and binds the variable named <str> to the result value.

-----

1 point by akkartik 1324 days ago | link

The next one refers to s and str as well. Is that intended?

Also, I don't follow the distinction between the string case and the symbol case.

-----

2 points by rocketnia 1324 days ago | link

"The next one refers to s and str as well. Is that intended?"

Yes. Only <s> appears in the pattern itself, but the meaning of <str> is explained by "...where <str> is the string name of the symbol <s>."

---

"Also, I don't follow the distinction between the string case and the symbol case."

Did you see this part? -- "(That may look simple enough, but a completely different symbol pattern applies when matching a list element. See below.)"

A symbol which occurs in a list acts as a hybrid keyword/positional argument. A symbol by itself acts just like a string.

---

Sorry I'm being terse with this stuff. I found it hard to express my points in informal English, which is why I went with a spec document style in the first place.

Let's see...

My motive is to show you a way your desired features can be accomplished in an extensible ravioli style, rather than one big chunk of spaghetti. What you want is achievable with a simple enough collection of parser combinators, although sometimes their interfaces may be unusual due to expression evaluation being part of the parsing process.

For instance, in the system I gave, the parser combinators use a few unusual design patterns:

- The combinators must take in an evaluation environment.

- The combinators frequently quote a value so they can pass it off to another combinator that may eval it again. (I think I've seen you use this style before, actually.)

- The combinators must sometimes return the evaluated expression, just so that the (<a> | <b>) pattern can avoid double evaluation.

Another important point I'm making is that you probably don't need to reorder the argument list for the purposes of keyword arguments. My system instead (effectively) reorders the parameter pattern so the keywords are processed first.

I got this system a bit wrong because I don't support multiple keyword aliases for the same parameter. My (<a> | <b>) pattern instead binds multiple parameter variables to the same argument value, so I got that backwards.

-----

2 points by akkartik 1324 days ago | link

This seems really useful. I got the motivation off the bat, but I still don't understand the spec. Can you give an example session showing how functions are defined, and how they are called?

-----

2 points by rocketnia 1324 days ago | link

Okay, I'll give that a try. Let's say we're defining a function and using it like this:

  > (def foo '(a b)
      (list a b))
  > (foo (+ 1 1) (+ 2 2))
  ((+ 1 1) (+ 2 2))
Let's vary some things here:

  '(a b)             ; the parameter list
  (a b)              ; the args to (list ...)
  ((+ 1 1) (+ 2 2))  ; the args to (foo ...)
  ((+ 1 1) (+ 2 2))  ; the result
  
  (baz)
  (baz)
  ((+ 1 1))
  ; Error! Can't make the function call (2).
  
  '(a b c)
  (a b c)
  ((+ 1 1) :c (+ 3 3) (+ 2 2))
  ((+ 1 1) (+ 2 2) (+ 3 3))
  
  '(a :b b c)
  (a b c)
  ((+ 1 1) (+ 2 2))
  ; Error! Unbound variable b in expression (list a b c).
So I have some more kinks to work out, lol. I finally appreciate your predicament. :)

The (baz) issue is caused by the complexity of the fact that in a normal function call, each of the arguments should be evaluated, but the list of arguments shouldn't. So if the traditional Arc parameter list (baz) is a pattern, it should destructure an unevaluated list in such a way that the baz pattern inside operates on an evaluated version of the list's first element.

It would be a lot simpler if we wrote this argument list differently:

  > (def foo (baz) ...)  ; doesn't work
  > (def foo `(,baz) ...)
  > (def foo (arg-list baz) ...)
      ; where (arg-list ...) is a pattern special form
I think that last one is promising, because then we can just write (baz) and the macroexpansion of (def ...) can insert that into (arg-list baz) for us. Then we get to have slight differences between how parameter lists work and how list destructuring works.

As for the "unbound variable b" issue, I think that could be solved by having each pattern take an optional current expr, rather than a mandatory one. My exact implementation of this would probably vary substantially based on what else I was trying to do. I might want a guarantee of which variables will be bound by a pattern long before there's an actual argument list to match it against, for compilation purposes and such.

-----

2 points by rocketnia 1325 days ago | link

Whoops, I forgot alias support entirely! Ah well, I'll just leave it this way for now.

-----