Arc Forum
3 points by rocketnia 4458 days ago

Turns out 'varscope has an inconsistent corner case, the way I was originally thinking about it.

  (mac foo () "macro")
  (mac id-mac (x) x)
  (varscope v
    (id-mac (v foo (fn () "function")))
    (foo))
Should this result in "macro" or "function"? What if we change it up like this?

  (mac foo () "macro")
  (mac id-mac (x) x)
  (varscope v
    (id-mac (v id-mac (fn (x) nil)))
    (id-mac (v foo (fn () "function")))
    (foo))
I'd rather have 'varscope work in a compilation phase (no dependence on fexprs), and I'd rather not make the order of compilation matter, so I'm going to make a very hackish decision: The body of a (varscope ...) form should be compiled as though it put no variables in scope. Macros from the surrounding scope will work even if the local scope shadows them at run time.
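Concretely, under that rule the first example would come out like this (a sketch, assuming the design above):

  (mac foo () "macro")
  (varscope v
    (v foo (fn () "function"))
    ; The body compiles as though v put nothing in scope, so 'foo
    ; still expands as the outer macro, and this returns "macro".
    (foo))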

By no coincidence, this design compromise is compatible with the hypothetical implementation below. I actually only realized this flaw once I was documenting that implementation. :-p

(By complete coincidence(?), this is similar to Arc 3.1's bug where local variables don't shadow macros. In the hypothetical language(s) I'm talking about, function parameters and let-bound variables would hopefully still shadow macros, so it wouldn't be quite the same.)

---

"IIUC you propose to write "varscope" at the top, and "var" further down. Doesn't that mean that whenever you define a variable, you need to write "varscope" before it?"

Close. You can have more than one variable per varscope. But I'm guessing you knew that. :-p

That means when you define a variable, you don't necessarily need to define a varscope if a suitable one already exists. But I don't expect even a single 'varscope to appear very often in code; instead I expect convenience macros to take care of it.

---

"The way that's handled in bullet ... it's ok to use the same variable name in subsequent for loops: the previous instance is not alive by the time current is reached."

That's a good example of when a macro could take care of establishing a varscope.

  (for <init> <condition> <step>
    <...body...>)
  ==>
  (varscope var
    <init>
    (while <condition>
      <...body...>
      <step>))
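With this expansion, each sibling loop establishes its own varscope, so reusing a variable name across loops is fine. A sketch, assuming (var i 0) is the declaration form provided by the varscope's label:

  (for (var i 0) (< i 3) (++ i)
    (pr i))
  ; This second loop expands to a separate (varscope var ...), so
  ; this i is a fresh variable, unrelated to the one above.
  (for (var i 0) (< i 5) (++ i)
    (pr i))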
For the analogous case in JS (or rather a hypothetical JS-like language whose semantics are based on 'varscope), the only thing that establishes a varscope is the "function () {}" syntax. Sibling loops of the form "for ( var i = ..." use the same i because they don't establish a new scope for themselves.

---

"Or do you also have "lets" that work as in scheme?"

Yes, I would have them both. It's hard not to have 'let since it can just be defined as a macro over 'fn. Whether one would be emphasized over the other, I'm not sure.
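For reference, this is essentially the classic reduction of 'let to 'fn (Arc's actual 'let goes through 'with and supports destructuring, but the core idea is the same):

  (mac let (var val . body)
    `((fn (,var) ,@body) ,val))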

---

"Also, you kind of lose me in the last two paragraphs."

I was talking about a compiler for varscope bodies. It'll help to back up a bit....

In a language where macros return compiled code rather than code to compile, traditional macros are simple to implement as sugar:

  (mac when (condition . body)
    `(if ,condition (do ,@body)))
  ==>
  (def-syntax when (condition . body) gensym123_static-env
    (compile `(if ,condition (do ,@body)) gensym123_static-env))
If you find this shockingly similar to Kernel-style fexprs, hey, me too. :-p

  (mac when (condition . body)
    `(if ,condition (do ,@body)))
  ==>
  (def-fexpr when (condition . body) gensym123_dynamic-env
    (eval `(if ,condition (do ,@body)) gensym123_dynamic-env))
IMO, the compile phase is just an fexpr eval phase whose result is used as code. Arc macros, which don't have access to the environment, are limited in the same way as pre-Kernel fexprs.

So what I'm suggesting is that in addition to 'compile, we have a second compilation function that lets us compile the body of a (varscope ...) or (module ...) form.

In fact, here's exactly how I'd use it. I'll call it 'compile-w/vars.

  ; I'm assuming 'compile-w/vars, 'def-syntax, and 'compile exist in
  ; Arc.
  ;
  ; I'm also assuming 'mc exists in Arc as an anonymous macro syntax (so
  ; that 'mc is to 'mac as 'fn is to 'def).
  ;
  ; Last but not least, I'm assuming (nocompile <code>) exists in Arc as
  ; a way to embed compiled code inside uncompiled code. When
  ; (nocompile <code>) is compiled in a static environment <env>, it
  ; should associate any free variables in <code> with variables bound
  ; in <env>. To make this happen, both 'compile-w/vars and 'compile
  ; should accept code even if it has free variables, and compiled code
  ; should be internally managed in a format that allows for this kind
  ; of augmentation.
  
  (def-syntax varscope (label . body) env
    ; In case you're not familiar, this is a destructuring use of 'let.
    (let (new-body vars)
           ; NOTE: We compile the body in the *outer* environment, not
           ; the local environment the varscope establishes.
           (compile-w/vars
             label (mc (var val)
                     `(= ,var ,val))
             `(do ,@body) env)
      (compile `(with ,(mappend [do `(,_ nil)] vars)
                  (nocompile ,new-body))
               env)))
  
  (def-syntax anon-module-w/var (label . body) env
    (w/uniq g-table
      (let (new-body vars)
             ; NOTE: We compile the body in the *outer* environment, not
             ; the local environment the module establishes.
             (compile-w/vars
               label (mc (var val)
                       ; Set both a variable and a table entry.
                       `(= (,g-table ',var) (= ,var ,val)))
               `(do ,@body) env)
        (compile `(with (,g-table (obj) ,@(mappend [do `(,_ nil)] vars))
                    (nocompile ,new-body)
                    ,g-table)
                 env))))
  
  ; This anaphorically binds 'var as the module's variable declaration
  ; form.
  (mac module (name . body)
    `(= ,name (anon-module-w/var var ,@body)))
As before, I release this code for anyone to use--or rather to derive actual working code from. :-p No need for credit.
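Hypothetically, using 'module might then look like this (assuming the sketch above works, and that the resulting table is callable on a key, as Arc tables are):

  (module point
    (var x 1)
    (var y 2)
    (var sum (fn () (+ x y))))
  
  (point 'x)       ; 1
  ((point 'sum))   ; 3, since the fn closes over the module's x and y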

---

"So, yes, it's just a var binding."

That's a var binding even though it uses 'set? I'm confused.

---

"no, I didn't know what [fexprs] were before I read your post"

What post is that?

I had a half-written reply that started with "Come to think of it, you probably do have fexprs," and went on to explain why I suspected it, what they were, and what you might get if you embraced it or rejected it. Should I still post it? It sounds like you understand it already, but it wouldn't do to have a time paradox. :-p

Anyway, since you're an fexpr fan now, I would like to emphasize the other side: The translation of my (point ...) example into imperative code is straightforward to do during a compilation phase, and fexprs get in the way of compilation phases. :)

It may be possible to force one's way through fexprs during a compilation phase too, but I expect that algorithm to look like a cross between a) constant-folding and b) static type inference with dependent types (since eval's return type depends on the input values). Rather than simply using recursion to compile subexpressions, the algorithm would do something more like using recursion together with concurrency, so that some subgoals could wait for information from other subgoals. To complicate things further, if the program uses a lot of mutable variables, the algorithm might not be able to treat them as constants, and it might not get very far unless you run it at run time as a kind of JIT.

I find this pretty intimidating myself. I've made steps toward at least the constant-folding part of this (which I expect will be sufficient for almost all fexpr programs written in a reasonable style), but I've gotten bogged down in not only the difficulty but also my own apathy about fexprs.



1 point by Pauan 4458 days ago

"The translation of my (point ...) example into imperative code is straightforward to do during a compilation phase, and fexprs get in the way of compilation phases. :)"

Why not have both? As in, have a way of saying "this should all be done at compile-time" that doesn't involve fexprs at all, or involves a special variant of fexprs. Reminds me of a discussion somewhere about micros (as opposed to macros)...

-----

3 points by rocketnia 4458 days ago

"Reminds me of a discussion somewhere about micros (as opposed to macros)..."

This probably isn't what you mean, but...

One of my oldest toy languages (Jisp) had syntactic abstractions I called "micros," and I discuss them here: http://arclanguage.org/item?id=10719

tl;dr: My micros are fexprs that not only leave their arguments unevaluated but also leave them unparsed. The input to a micro is a string.

Much like how I just said macroexpansion in the compilation phase was like fexpr evaluation, what I've pursued with Penknife and Chops is like a compilation phase based on micro evaluation.

---

"Why not have both? As in, have a way of saying "this should all be done at compile-time" that doesn't involve fexprs at all, or involves a special variant of fexprs."

One way to have both is to have two fexpr evaluation phases, one of which we call the compile phase. This is staged programming straightforwardly applied to an fexpr language... and it's as easy as wrapping every top-level expression in (eval ... (current-environment)).

However, that means explicitly building all the code. If you want to call foo at the REPL, you can't just say (foo a b c), you have to say (list foo a b c).

With quasiquote it's much easier for the code that builds the code to look readable. So suppose the REPL automatically wraps all your code in (eval `... (current-environment)). Entering (foo a b c) will do the expected thing, and we can say (foo a ,(bar q) c) if we want (bar q) to evaluate at compile time.

Now let's fall into an abyss. Say the REPL automatically detects the number of unquote levels we use in a command, and for each level, it wraps our code in (eval `... (current-environment)) to balance it. Now (foo a b c) will do the expected thing because it's wrapped 0 times, (foo a ,(bar q) c) will do the expected thing because it's wrapped once, and so on. We have as many compile phases as we need.
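For instance, a command with one unquote level would be rewritten like so (a sketch of the REPL behavior just described):

  (foo a ,(bar q) c)
  ==>
  (eval `(foo a ,(bar q) c) (current-environment))
Here (bar q) runs in the earlier phase, when the quasiquote is evaluated, and its result is spliced into the code the outer eval then runs.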

The price is one reserved word: unquote. This would be the one "special variant of fexprs."

-----

1 point by rocketnia 4456 days ago

"The price is one reserved word: unquote. This would be the one "special variant of fexprs.""

Possible correction: If any kind of 'quasiquote is ever going to be in the language, it should probably have special treatment so that its own unquotes nest properly with the REPL's meaning of unquote. An alternative is to use a different syntax for non-REPL unquotes (e.g. ~ instead of ,).

Also note that ,foo could be a built-in syntax that doesn't desugar to anything at all (not even using the "unquote" name), instead just causing phase separation in a way that's easy to explain to people who understand 'quasiquote.

-----

1 point by Pauan 4456 days ago

"Also note that ,foo could be a built-in syntax that doesn't desugar to anything at all (not even using the "unquote" name), instead just causing phase separation in a way that's easy to explain to people who understand 'quasiquote."

Yes, I currently think that all syntax should be at the reader level, rather than trying to use macros to define syntax. As an example of what I'm talking about, in Nu, [a b c] expands into (square-brackets a b c) letting you easily change the meaning of [...] by redefining the square-brackets macro.
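For example, Arc's [... _ ...] shorthand falls out of a single macro definition under this scheme (this is roughly how Nu does it, if I understand right):

  (mac square-brackets body
    `(fn (_) ,body))
  
  ([+ _ 1] 2)  ; reads as ((square-brackets + _ 1) 2), giving 3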

Or the fact that 'foo expands into (quote foo) letting you change the meaning of the quote operator... or the fact that `(foo ,bar) expands into (quasiquote (foo (unquote bar))), etc.

I used to think that was great: hey look I can easily change the meaning of the square bracket syntax! But now I think it's bad. I have both conceptual and practical reasons for thinking this.

---

I'll start with the conceptual problems. In Lisp, there are essentially three major "phases": read-time, compile-time, and run-time. At read-time Lisp will take a stream of characters and convert it into a data structure (often a cons cell or symbol), compile-time is where macros live, and run-time is where eval happens.

Okay, so, when people try to treat macros as the same as functions, it causes problems because they operate at different phase levels, and I think the same exact thing happens when you try to mix read-time and compile-time phases.

---

To discuss those problems, let's talk about practicality. quasiquote in particular is egregiously bad, so I'll be focusing primarily on it, though quasisyntax also suffers from the exact same problems. Consider this:

  `(,foo . ,bar)
You would expect that to be the same as (cons foo bar), but instead it's equivalent to (list foo 'unquote 'bar). And here's why. The above expression is changed into the following at read-time:

  (quasiquote ((unquote foo) . (unquote bar)))
And as you should know, the . indicates a cons cell, which means that the above is equivalent to this:

  (quasiquote ((unquote foo) unquote bar))
Oops. This has caused practical problems for me when writing macros in Arc.
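The workaround I've ended up with is to avoid the dotted tail altogether. A sketch:

  ; Instead of `(,foo . ,bar), write the unambiguous
  (cons foo bar)
  ; or, when bar is known to be a proper list,
  `(,foo ,@bar)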

---

Another problem with this approach is that you're hardcoding symbols, which is inherently unhygienic and creates inconsistent situations that can trip up programmers. Consider this:

  `(,foo (unquote ,bar))
You might expect that to result in the list (list foo (list 'unquote bar)) but instead it results in the list (list foo bar), because the symbol unquote is hardcoded.

---

Yet another problem is that it requires you to memorize all the hard-coded names for all the syntax. You have to remember never to define a function/macro called quote, never to define one called unquote, never to define one called square-brackets, etc... which means this will break:

  ; oops, redefined the meaning of the quote syntax
  (let quote ...
    'foo)
When the number of syntactic forms is small, that's not really a high price to pay, but it is still a price.

---

Also, this whole "read syntax expands into macros" thing is inconsistent with other syntax. For instance, (1 2 3) isn't expanded by the reader into (list 1 2 3). That is, if you redefine the list function, the meaning of the syntax (1 2 3) doesn't change. But if you redefine the quote macro, then suddenly the syntax 'foo is different.

The same goes for strings. Arc doesn't expand "foo" into (string #\f #\o #\o) either. So redefining the string function doesn't change the meaning of the string syntax. So why are we doing this for only some syntax but not others?

---

All of the above problems go away completely when you just realize that read-time is a separate phase from compile-time. So if you want to change the meaning of the syntax 'foo the solution isn't to redefine the quote macro. The solution is to use a facility designed for dealing with syntax (such as reader macros).

This is just like how we separate compile-time from run-time: you use functions to define run-time stuff, macros to define compile-time stuff, and reader macros to define read-time stuff.

This also means that because the only way to change the syntax is via reader macros (or similar), the language designer is encouraged to provide a really slick, simple, easy-to-use system for extending the syntax, rather than awful kludgy reader macros.
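As a sketch of what such a facility might look like (the name def-reader-syntax is hypothetical; this is not an existing Arc API):

  ; Hypothetical: when the reader hits the character #\', it calls
  ; the given fn on the input port and uses its result as the
  ; datum that was read.
  (def-reader-syntax #\'
    (fn (port)
      (list 'quote (read port))))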

-----