Arc Forumnew | comments | leaders | submit | rocketnia's commentslogin
3 points by rocketnia 1 day ago | link | parent | on: Ask AF: Advantages of alists?

Can you give an example of where the notation is better with tables than with alists? Maybe there's something you can do about it, e.g. writing a macro, using `defcall` or `defset`, or extending `sref`.

reply

2 points by kinnard 1 day ago | link

I'm running into trouble with both actually.

  (= tpipe {todo 
              '({id 1 cont "get eggs" done '(nil)}
                {id 23 cont "fix toilet" done '(nil)})
            week '(nil)
            today '({id 23 cont "Build something that works in arc" done '(nil)})
            done '({id 23 cont "Research Ordered Associative Arrays" done '(2019 1 21)})})

  (= apipe '((todo ({id 1 cont "get eggs" done '(nil)} 
                    {id 23 cont "fix toilet" done '(nil)})) 
             (week (nil)) 
             (today ({id 23 cont "Build something that works in arc" done '(nil)}))
             (done ({id 23 cont "Research Ordered Associative Arrays" done (2019 1 21)}))))

  arc> ((alref apipe 'todo) 0)
  '(%braces id 1 cont "get eggs" done '(nil))
vs

  arc> tpipe!todo.1
  '(%braces id 23 cont "fix toilet" done '(nil))
But neither works like so

  arc> tpipe!todo.1!id

  arc> ((alref apipe 'todo) 0) 'id)
One must employ (list) or quasiquotation

    (= npipe {todo 
                (list {id 1 cont "get eggs" done '(nil)}
                      {id 23 cont "fix toilet" done '(nil)})
              week '(nil)
              today (list {id 23 cont "Build something that works in arc" done '(nil)})
              done (list {id 23 cont "Research Ordered Associative Arrays" done '(2019 1 21)})})

    (= unqpipe {todo 
                `(,{id 1 cont "get eggs" done '(nil)}
                  ,{id 23 cont "fix toilet" done '(nil)})
                week '(nil)
                today `(,{id 23 cont "Build something that works in arc" done '(nil)})
                done `(,{id 23 cont "Research Ordered Associative Arrays" done '(2019 1 21)})})

This is less of a problem with tpipe since in practice items would get pushed into a "pipe".

Ideal world with key and index access:

  arc> pipe!todo.1!id
  23

  arc> pipe.0.1!id
  23
EDIT: I think tables might be better overall and that the order-necessitating functionality may not be worth the trouble of alists.

reply

2 points by rocketnia 1 day ago | link

I recommend not expecting `quote` or `quasiquote` to be very useful outside the context of metaprogramming.

Quotation is helpful for metaprogramming in a tangible way, because we can easily refactor code in ways that move parts of it from being quoted to being unquoted or vice versa.

And quotation is limited to metaprogramming in a tangible way, because we can only quote data that's reasonably maintainable in the same format we're maintaining our other code in. For instance, an Arc `quote` or `quasiquote` operation is itself written inside an Arc source code file, which is plain text, so it isn't very useful for quoting graphics or audio data.

We can of course use other functions or macros to construct those kinds of data. That's essentially Arc's relationship with tables. When we've constructed tables, we've just used (obj ...) and (listtab ...) and such.

Adding tables to the language syntax is doable, but it could have some quirks.

  ; Should this cause an error, or should it result in the same thing as
  ; '(let i 0 `{,++.i "foo"}) or '(let i 0 `{,++.i "foo"})? Each option
  ; is a little surprising, since any slight edit to the code, like
  ; replacing one ++.i with (++ i 1), would give us valid code to
  ; construct a two-entry table, and this code very well might have
  ; arisen from a slight edit in the opposite direction.
  '(let i 0
    `{,++.i "foo" ,++.i "bar"})
  
  ; Should this always result in 2, or should it result in 1 if "foo"
  ; comes last in the table's iteration order?
  (let x 0
    `{"foo" ,(= x 1) "bar" ,(= x 2)}
    x)
Personally, I tend to go the other way: I prefer to have as few kinds of data as possible in a language's quotable syntax.

A macroexpander needs to extract two things from the code at any given time: The name of the next macro to expand, and the region of syntax the macro should operate on. Symbols help encode macro names. Lists and symbols together help encode regions of plain text. I think it's for these reasons that symbols and lists are so essential to Arc's syntax.

Arc has other kinds of data that commonly occur in its quotable syntax, like strings and numbers, but it doesn't need them nearly as much as symbols and lists. Arc could expand symbols like |"Hello, world!"| and |123| as ssyntax, or it could simply let everyone put up with writing things like (string '|Hello, world!|) and (int:string '|123|) each time.

Tables would fit particularly well into a language's quotable syntax if they somehow helped encode regions of syntax. For instance, if a macro body consisted of all the files in a directory, then a table could be an appropriate represention of that file collection.

reply

1 point by akkartik 13 hours ago | link

I'm having a lot of trouble parsing this comment.

> I recommend not expecting `quote` or `quasiquote` to be very useful outside the context of metaprogramming.

My immediate reaction is to disagree. A lot of the reason Lisp is so great is that quasiquotation is orthogonal to macros/metaprogramming.

    > ; Should this cause an error, or should it result in the same thing as
    > ; '(let i 0 `{,++.i "foo"}) or '(let i 0 `{,++.i "foo"})?
Those two fragments are the same?

In general it feels unnecessarily confusing to include long doc comments in code fragments here. We're already using prose to describe the code before and after.

Code comments make sense when sharing a utility that you expect readers to copy/paste directly into a file to keep around on their disks. But I don't think that's what you intend here?

Finally, both your examples seem to be more about side effects in literals? That is a bad idea whether it's a table literal or not, and whether it uses quasiquoting or not. Do you have a different example to show the issue without relying on side-effects?

reply

3 points by i4cu 1 day ago | link

> EDIT: I think tables might be better overall and that the order-necessitating functionality may not be worth the trouble of alists.

I would agree. At least your examples seem to point to that.

I would suggest you flatten your data and add indexes:

  (= data (obj 1 (obj cont "get eggs")
               2 (obj cont "fix toilet")
               3 (obj cont "Build something that works in arc")  
               4 (obj cont "Research Ordered Associative Arrays")))
Then:

  (= todo  '(1 2))
  (= today '(3))
  (= done  '(4))
And use the indexes to lookup the records. You'll notice by doing it this way you're able to control the order and don't need alists to do so.

Trying to go down the road of deeply nested tables or alists will only lead you to pain and suffering (at least in arc).

Edit: wow, I think arc needs 'sets' too :)

reply

1 point by kinnard 1 day ago | link

The main issue with alists is that the special syntax doesn't work and the notation is so verbose . . . I don't know if the efficiency issues would even come to bear for me.

EDIT: nesting is not behaving as I expect but that may be a product of my own misunderstanding.

reply

2 points by i4cu 1 day ago | link

I would be careful not to structure data/logic to accommodate a special syntax.

i.e while using:

  pipe!todo.1!id 
is certainly fancy, writing a function is just as effective and most likely more performant since it doesn't require read de-structuring:

(fetch pipe todo first id)

So I'm suggesting you shape your syntax usage around your data, not your data around your syntax. You can always write a macro to obtain your desired level of brevity.

reply

2 points by kinnard 17 hours ago | link

Shaping syntax around data rather than data around syntax sounds like the move, I'm probably just not used to having that option.

reply

3 points by shawn 1 day ago | link

Thanks for pointing this out. I’ve pushed a fix. Can you confirm?

reply

2 points by kinnard 1 day ago | link

Nice. I'm getting this error:

  $ arc
  ac.rkt:347:43: tablist: unbound identifier
  in: tablist
  location...:
   ac.rkt:347:43
  context...:
   raise-unbound-syntax-error
   for-loop
   [repeats 2 more times]
   finish-bodys
   for-loop
   finish-bodys
   lambda-clause-expander
   loop
   [repeats 66 more times]
   module-begin-k
   expand-module16
   expand-capturing-lifts
   expand-single
   temp74_0
   compile16
   temp68_2
   ...

reply

3 points by shawn 1 day ago | link

Hmm. I know why. My mistake.

Un momento; fix incoming.

The general idea behind the fix is that quoted literals need to be treated as data. Arc now has two new functions for this purpose: quoted and unquoted.

The fact that (quote {a 1}) now becomes a hash table is a little strange. I’m not entirely sure that’s correct behavior. It depends whether (car '({a 1})) should yield a hash table. It seems like it should, which is reified in the code now.

EDIT: Ok, I've force-pushed the fixed commit. (Sorry for the force-push.)

If you `git reset --hard HEAD~1 && git pull` it should work now.

reply

3 points by kinnard 1 day ago | link

Works great! The only step further that comes to mind to me is:

  arc> '(pipe "water")
  '(pipe "water")

  arc> "she"
  "she
  
  arc> 23
  23

  arc> {pipe "water"}
  {pipe "water"}
rather than

  arc> {pipe "water"}
  '#hash((pipe . "water"))

reply

3 points by shawn 21 hours ago | link

I tried improving anarki's repl experience at https://github.com/arclanguage/anarki/pull/145

It sort of went well, but mostly not. :)

Personally, I found I prefer racket's pretty-printing with the horrible hash tables compared to something like {pipe "water" a 1 b 2 c ...} because if you try to evaluate `items` or `profs` you won't have a clue what the data is without pretty-printing.

And it turns out I suck at writing pretty-printers. Someone else do it!

reply

3 points by shawn 1 day ago | link

Was already working on that. It's clear it's time.

Few minutes. Maybe an hour.

reply

3 points by shawn 1 day ago | link

Close: https://gist.github.com/shawwn/03b936d37e4cd83ca6652bb03c527...

not bad for precisely 59 minutes.

Brb, transferring to starbucks.

reply


I almost never use (coerce param 'sym) when I can say sym.param instead. I've always thought Arc doesn't really do enough with the `coerce` function to justify having it in the language; individual functions like `sym` and `string` already do its job more concisely.

---

In practice when I used to write weakly typed utilities in Arc, I tended to find `zap` very nice:

  (def do-something-with-a-symbol (param)
    (zap sym param)
    ...)
If you're unfamiliar with `zap`, (zap sym param) is essentially (= param (sym param)).

I prefer strong typing these days, but I've sometimes thought this `zap` technique could be refined to put the coercions in the argument list directly. Arc has (o arg default) for optional arguments, and we could imagine a similar (z coercion arg) for automatically zapping an argument as it comes in:

  (def do-something-with-a-symbol ((z sym param))
    ...)
---

Something else that's been on my mind is that it could be useful to have higher-order coercion combinators.

Racket code can use `->` to build a contract for a function out of contracts for its arguments and its result. The result of (-> string? string? symbol?) is a contract that verifies a value is a two-argument function and then replaces it with a function that delegates to that one, but which verifies that the arguments are strings and that the result is a symbol.

The same thing could be done for coercions: The result of (as-> string string sym) could be a coercion function that coerces its argument into a two-argument function that delegates to the original value, but which first coerces the arguments to strings and then coerces the result to a symbol.

  (def as-> args
    (let (result-coercion . arg-coercions) rev.args
      (zap rev arg-coercions)
      (fn (orig)
        (fn args
          (result-coercion:apply orig
            (map (fn (coercion arg) coercion.arg)
              arg-coercions args))))))
  
  (= symbol+ ((as-> string string sym) +))

  arc> (symbol+ 'foo 'bar)
  'foobar
Similarly, in Racket, `(listof symbol?)` is a contract that checks that a value is a list of symbols, and for coercions we could imagine a corresponding `(aslistof sym)` operation for use in your `(map [coerce _ 'sym] ...)` example.

Sometimes Arc's weak typing suffers from making poor guesses as to whether `nil` is intended as a symbol or as a list (not to mention as a boolean), and it takes some adjusting to work around it:

  arc> (symbol+ 'foo 'nil)
  'foo
  
  arc> (symbol+ 'nil 'nil)
  '||
  
  arc> (= symbol+ ((as-> [string:or _ "nil"] [string:or _ "nil"] sym) +))
  #<procedure: as->>
  
  arc> (symbol+ 'nil 'nil)
  'nilnil

reply

2 points by kinnard 1 day ago | link

I overlooked the existence of `sym` which does make my request altogether superfluous. Higher-order coercion combinators will take some digestion on my part!

reply


"it's a limitation of racket that you can't `read` a table you've written"

Eh? You definitely can, in Racket. :) The only problem I know of is that when you write a mutable table, you read back an immutable one.

---

"You're right that Arc still can't read tables written via `write`."

Arc's supported this for as long as I can remember. The only problem is that they come back immutable.

Here's Arc 3.2 on Racket 7.0:

  arc> (= foo (fromstring (tostring:write:obj a 1 b 2) (read)))
  #hash((a . 1) (b . 2))
  arc> (= foo!a 3)
  Error: "hash-set!: contract violation\n  expected: (and/c hash? (not/c immutable?))\n  given: '#hash((a . 1) (b . 2))\n  argument position: 1st\n  other arguments...:\n   'a\n   3"
And here's the latest Anarki:

  arc> (= foo (fromstring (tostring:write:obj a 1 b 2) (read)))
  '#hash((a . 1) (b . 2))
  
  arc> (= foo!a 3)
  3
  
  arc> foo
  '#hash((a . 3) (b . 2))
Oh, I guess it works!

It looks like that's thanks to a March 8, 2012 commit by akkartik (https://github.com/arclanguage/anarki/commit/547d8966de76320...)... which, lol... Everything I was saying in a couple of recent threads about replacing the Arc reader to read mutable tables... I guess that's already in place. :)

reply

3 points by kinnard 2 days ago | link

It'd be cool if this worked with curly brace syntax. You could read in (and write) a file that looked like this:

  {'id 3 
   'c {
      'name "james c clarke" 
      'age 23 
      'addr "1724 Cox Ave. NY, NY 90210"
     }
  }
I think it'd make reading and writing a much better experience.[1]

[1] http://arclanguage.org/item?id=20803

EDIT: I guess this would be called "table literals"?

reply

1 point by kinnard 1 day ago | link

Does the necessity of quasiquote + unquote feel natural?

  (= tpipe {todo 
              '({id 1 cont "get eggs" done '(nil)}
                {id 23 cont "fix toilet" done '(nil)})
            week '(nil)
            today '({id 83 cont "Build something that works in arc" done '(nil)})
This won't work:

  arc> tpipe!todo.1!id
One must employ (list) or quasiquotation

    (= npipe {todo 
                (list {id 1 cont "get eggs" done '(nil)}
                      {id 23 cont "fix toilet" done '(nil)})
              week '(nil)
              today (list {id 83 cont "Build something that works in arc" done '(nil)})
              done (list {id 44 cont "Research Ordered Associative Arrays" done '(2019 1 21)})})

    (= unqpipe {todo 
                `(,{id 1 cont "get eggs" done '(nil)}
                  ,{id 23 cont "fix toilet" done '(nil)})
                week '(nil)
                today `(,{id 83 cont "Build something that works in arc" done '(nil)})
                done `(,{id 44 cont "Research Ordered Associative Arrays" done '(2019 1 21)})})

reply


"The best part is that the server should be able to reboot without losing the closures."

Might want to remember to wipe them out if you make certain changes to the code, so that you don't have to think about the effects of running old code and new code in the same system.

Pretty nifty that elisp has that built in. :)

reply


(Edit: Oops, I replied out of order and didn't read shawn's comment with the elisp examples before writing this.)

I suspect what pg means by it is primarily that it's tricky to do in Racket (though I'm not sure if it'd be because there are too few options or too many).

Essentially, I think it's easy to display a closure by displaying its source code and all the captured values of its source code's free variables. (Note that this might be cyclic since functions are often recursive.)

But there is something tricky about it, which is: What language is the source code in? In my opinion, a language design is at its most customizable when even its built-in syntaxes are indistinguishable from user-defined ones. So the ideal displayed format of a function would ideally involve some particular set of user-definable syntaxes. Since a language designer can't anticipate what user-defined syntaxes will exist, clearly this decision should ultimately be up to a user. But what mechanism does a user use to express this decision?

As a baseline, there's at least one straightforward choice the user can make: The one that expresses the code in terms of Arc special forms (`fn`, `assign`, `if`, `quote`, etc.) and Arc functions that aren't implemented in Arc (`car`, `apply`, etc.). In a language that isn't aiming for flawless customizability, this is enough.

Now suppose we try to generalize this so a user can choose any set of syntaxes to express things in terms of -- say, "I want the usual language, but with `=` treated as an additional built-in." If the code contains `(= x (rfn ...))`, then the macroexpander at some point needs to expand the `rfn` without expanding the `=`. That's not really viable in terms of Arc's macroexpander since we don't even know `(rfn ...)` is an expression in this context until we process `=`. So this isn't quite the right generalization; the right generalization is something trickier.

I suppose we can have every function printed in terms of its pre-macroexpansion source code along with printing all the macros and other macroexpansion-time state it happened to rely on as it macroexpanded the first time. That would narrow down the problem to how to print the built-in functions. And we could solve that problem in the language design by making it so nothing ever captures a built-in function as a first-class value, only as a late-bound reference to the global variable it's accessible from.

Or we could have the user specify their own macroexpander and make it so that whenever a function is printed, if the current macroexpander hasn't expanded that function yet, it does so now (just to determine how the function is printed, not how it behaves). This would let the user specify, for instance, that `assign` expands into `=` and `=` expands into itself, rather than the other way around.

These ideas are incomplete, and I think making them complete would be pretty tricky.

In Cene, I have a different take on this: If a function is printable (and not all are), then it's printable because it's a callable struct with a tag the printer knows about. It would be printed as a struct. The function implementation wouldn't be printed. (The user could look up the source code information based on the struct tag, but that's usually not printable.) There may be some exceptions at the REPL where information is printed that usually isn't available, because the REPL is essentially a debugging context and the debugger sees all. (Racket's struct inspectors express a similar debugger-sees-all principle, but I haven't seen the REPL take advantage of it.)

reply

2 points by shawn 3 days ago | link

You're hitting on a problem I've been thinking about for years. There are a few reasons this is tricky, notably related to detecting whether something is a variable reference or a variable declaration.

  (%language arc
    (let in (instring "  foo")
      (%language scm
        (let-values (((a b c) (port-next-location in)))
          (%language el
            (with-current-buffer (generate-new-buffer "bar")
              (insert (prin1-to-string c))
              (current-buffer)))))))

To handle this example, you'll need to know whether each form is a function call, a variable definition, a list of definitions (let-values), a function call, and which target the function is being called for.

For example, an arc function call needs to expand into `(ar-apply foo ...)`

And due to syntax, you can't just handle all the cases by writing some hypothetical very-smart `ar-apply` function. If your arc compiler targets elisp, it's tempting to try something like this:

  (ar-apply let (ar-apply (ar-apply a (list 1))) ...
which can collapse nicely back down to

  (let ((a 1)) ...)
in other words, it's tempting to try to defer the "syntax concern" until after you've walked the code and expanded all the macros. Then you'd collapse the resulting expressions back down to the language-specific syntax.

But it quickly becomes apparent that this is a bad idea.

Another alternative is to have a "standard language" which all the nested languages must transpile tO:

  (%let in (%call instring "  foo")
    (%let (a b c) (%call port-next-location in)
      (|with-current-buffer| (%call generate-new-buffer "bar")
        (%call insert (prin1-to-string c)
          (%call current-buffer)))))
Now, that seems much better! You can take those expressions and easily spit out code for scheme, elisp, arc, or any other target. And from there it's just a matter of adding shims on each runtime.

The tricky case to note in the above example is with-current-buffer. It's an elisp macro, meaning it has to end up in functional position like (with-current-buffer ...) rather than something like (funcall #'with-current-buffer ...)

There are two ways to deal with this case. One is to hook into elisp's macroexpand function and expand the macros as you go. Emacs calls this eager macroexpansion, and there are some cases related to autoloading (I think?) that make this not necessarily a good idea.

The other way is to punt, and have the user indicate "this is an elisp form; do not mess with it."

The idea is that if the symbol in functional position is surrounded by pipe chars, then the compiler should leave its position alone but compile the arguments. So

   (|with-current-buffer| foo
     (prn (|buffer-string|)))
That works quite nicely, untll you try this:

  (|let| ((|a| 1) (|b| 2))
    (+ |a| |b|))
Then you'll be in for a nasty surprise: not only does it look visually awful and annoying to write, but it won't work at all, because it'll compile to something like this:

  (let (ar-funcall2 (a 1) (b 2))
    (ar-funcall2 _+ a b))

I am not sure it's possible to escape the "syntax concern". Emacs itself had to deal with it for user macros. And the solution is unfortunately to specify the syntax of every form explicitly:

https://www.gnu.org/software/emacs/manual/html_node/elisp/In...

  (def-edebug-spec let
       ((&rest
         &or symbolp (gate symbolp &optional form))
        body))
Ugh, augh, grawr. You can see how bad it would be to curse the user with having to do this for every macro they write.

Yet I am not sure it's possible to escape this fate. And it seems to work well in emacs.

Hopefully some of that might be helpful on your quest. The goal is worth pursuing.

reply

3 points by rocketnia 3 days ago | link

I can't speak to elisp, but the way macro systems work in Arc and Racket, the code inside a macro call could mean something completely different depending on the macro. Some macros could quote it, compile it in their own way, etc. So any code occurring in a macro call generally can't be transformed without changing the meaning of the program. Trying to detect and process other macro calls inside there is unreliable.

I have ideas in mind for how macro systems can express "Okay, this macro call is over; everything beyond this point in the s-expression is an expression." But that doesn't help with Arc or Racket, whose macro systems aren't designed for that.

So something like your situation, where you need to walk the code before knowing which macroexpander to subject each part of it to, can't reliably treat the code as code. It's better to treat the code as a meaningless soup of symbols and parentheses (or even as a string). You can walk through the data and find things like `(%language ...)` and treat those as escape sequences.

(What elisp is doing there looks like custom escape sequences, which I think is ultimately a more concise way of doing things if new macro definitions are rare. It gets into a middle ground between having s-expression soup and having a macro system that's designed for letting code be walked like this.)

Processing the scope of variables is a little difficult, so my escape sequences would be a bit more verbose than your example. It's not like we can't take a Racket expression and infer its free variables, but we can only do that if we're ready to call the Racket macroexpander, which isn't part of the approach I'm describing.

(I heard elisp is lexically scoped these days. Is that right?)

This is how I'd modify the escape sequences. This way it's clear what variables are passing between languages:

  (%language arc ()
    (let in (instring "  foo")
      (%language scm ((in in))
        (let-values (((a b c) (port-next-location in)))
          (%language el ((c c))
            (with-current-buffer (generate-new-buffer "bar")
              (insert (prin1-to-string c))
              (current-buffer)))))))
Actually, instead of just (in in), I might also specify a named strategy for how to convert that value from an Arc value to a Racket value.

Anyhow, once we walk over this and process the expressions, we can wind up with generated code like this:

  ; block 1, Arc code
  (fn (__block2 __block3)
    (let in (instring "  foo")
      (__block2 __block3 in)))
  
  ; block 2, Scheme code
  (lambda (__block3 in)
    (let-values (((a b c) (port-next-location in)))
      (__block3 c)))
  
  ; block 3, elisp code
  (lambda (c)
    (with-current-buffer (generate-new-buffer "bar")
      (insert (prin1-to-string c))
      (current-buffer)))
We also collect enough metadata in the process that we can write harnesses to call these blocks at the right times with the right values.

This is a general-purpose technique that should help with any combination of languages. It doesn't matter if they run in the same address space or anything; that kind of detail only changes what options you have for value marshalling strategies.

I think there's a somewhat more convenient approach that might be possible between Arc and Racket, since their macroexpanders both run in the same process and can trade off with each other: We can have an Arc macro that expands its body as Racket code (essentially Anarki's `$`) and a Racket macro that expands its body as Arc code. But there are some difficulties designing the latter, particularly in terms of Racket's approach to hygiene and its local macros, two things the Arc macroexpander has zero concept of. When we switch from Racket to Arc and back to Racket, the hygiene information and local macro scopes will probably be obliterated.

In your arcmacs project, I guess you might also be able to have an Arc macro that expands its body as elisp code, an elisp macro that expands its body as Racket code, etc. :-p So maybe that's the approach you really want to take with `%language` and I'm off on a tangent with this "escape sequence" interpretation.

reply

2 points by rocketnia 4 days ago | link | parent | on: Creating Languages in Racket

"[Racket] has every feature you could want. And yet it is very difficult to do even the simplest things. For example, when an error is raised in Anarki, you'll see a stack trace that points to ac.rkt rather than the actual location within the arc file that caused the error."

Is that really "the simplest things"? :-p It seems to me Arc goes out of its way to avoid interleaving source location information on s-expressions the way Racket does. Putting it back, without substantially changing the way Arc macros are written, seems to me like it would be pretty complicated, and that complication would exist whether Arc was implemented in Racket or not. (I think aw's approach is promising here.)

---

It's fun to see Lumen is designed for compiling to other languages even when it's self-hosted. That's always how I figured Arc or pretty much any of my languages would have been written once they were self-hosted.

I've been trying to write Racket libraries that (among other benefits) let me implement Cene in Racket the way I'd like to implement Cene in Cene, which should make it an easier process to port it into a self-hosting Cene implementation. But I certainly don't have it working yet, the way Lumen clearly is. :)

reply

3 points by shawn 12 hours ago | link

Is that really "the simplest things"? :-p It seems to me Arc goes out of its way to avoid interleaving source location information on s-expressions the way Racket does. Putting it back, without substantially changing the way Arc macros are written, seems to me like it would be pretty complicated, and that complication would exist whether Arc was implemented in Racket or not.

Surprisingly it's possible to get pretty close. The trick is to read-syntax rather than read, and then have ac map over syntax objects properly. At that point it's a matter of using eval-syntax rather than eval.

The takeaway is that the error messages are much, much nicer. I'm talking "the error is at line 213 of app.arc in function init-userinfo" nicer.

It's not perfect. And to get to perfect, you'd have to do as you say and rework how macros behave. But it's maybe 90% of the benefit with little work.

reply


That does seem like a conundrum.

Where do you actually need source location information in order to get Arc function names to show up in the profiler?

Would it be okay to track it just on symbols, bypassing all this list conversion and almost all of the Arc built-ins' unwrapping steps (since not many operation have to look "inside" a symbol)?

If you do need it on cons cells, do you really need it directly on the tail cons cells of a macro body? I'd expect it to be most useful on the cons cells in functional position. If you don't need it on the tails, then it's no problem when the `apply` strips it.

Oh, you know what? How about this: In `load`, use `read-syntax`, extract the line number from that syntax value, and then use `syntax->datum` and expand like usual. While compiling that expression, turn `fn` into (let ([fn-150 (lambda ...)]) fn-150) or (procedure-rename (lambda ...) 'fn-150), replacing "150" here with whatever the source line number is. Then the `object-name` for the function will be "fn-150" and I bet it'll appear in the profiling data that way, which would at least give you the line number to work with.

If you want, and if that works, you can probably have `load` do a little bit of inspection to see if the expression is of the form (mac foo ...) or (def foo ...), which could let you create a more informative function name like `foo-150`.

There's something related to this in `ac-set1`, which generates (let ([zz ...]) zz) so that at least certain things in Arc are treated as being named "zz". Next to it is the comment "name is to cause fns to have their arc names while debugging," so "zz" was probably the Arc variable name at some point.

reply

3 points by aw 14 days ago | link

> Would it be okay to track it just on symbols

mmm, not sure. It'd probably be easier to start with a working version (even if slow) and then remove source information from lists and see if anything breaks.

> In `load`, use `read-syntax`, extract the line number from that syntax value

erm, so all functions forms compiled during the eval of that expression would get named "fn-150"?

reply

3 points by rocketnia 14 days ago | link

"erm, so all functions forms compiled during the eval of that expression would get named "fn-150"?"

That's what I mean, yeah. Maybe you could name them with their source code if you need to know which one it is, if it'll print names that wide. :-p This isn't any kind of long-term aspiration, just an idea to get you the information you need.

reply


"`var` can't be a macro since a macro can only expand itself -- it doesn't have the ability to manipulate code that appears after it"

It can't be a regular Arc macro, but Arc comes with at least two things I'd call macro systems already: `mac` of course, but also `defset`. Besides those, I keep thinking of Arc's destructuring syntax and reader syntax as macro systems, but that's just because they could easily be extended to be.

I would say it makes sense to add another macro system for local definitions. As a start, it could be a system where the code (do (var a 1) (var b 2) (+ a b)) would expand by calling the `var` macro with both the list (a 2) and the list ((var b 2) (+ a b)).

But I think that's an incomplete design. All it does is help with indentation, and I feel like a simpler solution for indentation is to have a "weak opening paren" syntactic sugar where (a b #/c d) means (a b (c d)):

  (def foo ()
    (let a (...)
    #/do (do-something a)
    #/let b (... a ...)
    #/do (do-something-else b)
    #/let c (... b ...)
      etc...))
The "#/" notation is the syntax of my Parendown library for Racket (and it's based on my use of "/" in Cene). I recently wrote some extensive examples in the Parendown readme: https://github.com/lathe/parendown-for-racket/blob/master/RE...

(For posterity, here's a link specifically to the version of that readme at the time of this post: https://github.com/lathe/parendown-for-racket/blob/23526f8f5...)

While I think indentation reduction is one of the most important features internal definition contexts have, that benefit is less important once Parendown is around. I think their remaining benefit is that they make it easy to move code between the top level and local scopes without modifying it. I made use of this property pretty often in JavaScript; I would maintain a programming language runtime as a bunch of top-level JavaScript code and then have the compiler wrap it in a function body as it bundled that runtime together with the compiled code.

In Racket, getting the local definitions and module-level definitions to look alike means giving them the same support for making macro definitions and value definitions at every phase level, with the same kind of mutual recursion support.

They go to a lot of trouble to make this work in Racket, which I think ends up making mutual recursion into one of the primary reasons to use local definitions. They use partial expansion to figure out which variables are shadowed in this local scope before they proceed with a full expansion, and they track the set of scope boundaries that surround each macro-generated identifier so that those identifiers can be matched up to just the right local variable bindings.

The Arc top level works much differently than the Racket module level. Arc has no hygiene, and far from having a phase distinction, Arc's top level alternates between compiling and running each expression. To maintain the illusion that the top level is just another local scope, almost all local definitions would have to replicate that same alternation between run time and compile time, meaning that basically the whole set of local definitions would run at compile time. So local definitions usually would not be able to depend on local variables at all. We would have to coin new syntaxes like `loc=` and `locdef` for things that should interact with local variables, rather than using `=` and `def`.

Hm, funny.... If we think of a whole Arc file as being just another local scope, then it's all executed at "compile time," and only its `loc=` and `locdef` code is deferred to "run time," whenever that is. This would be a very seamless way to use Arc code to create a compilation artifact. :)

reply

1 point by akkartik 16 days ago | link

"While.. indentation reduction is one of the most important.. that benefit is less important once Parendown is around."

One counter-argument that short-circuits this line of reasoning for me: the '#/' is incredibly ugly, far worse than the indentation saved :) Even if that's just my opinion, something to get used to, the goal with combining defining forms is to give the impression of a single unified block. Having to add such connectors destroys the illusion. (Though, hmm, I wouldn't care as much about a trailing ';' connector. Maybe this is an inconsistency in my thinking.)

reply

2 points by rocketnia 15 days ago | link

"the '#/' is incredibly ugly, far worse than the indentation saved"

I'd love to hear lots of feedback about how well or poorly Parendown works for people. :)

I'm trying to think of what kind of feedback would actually get me to change it. :-p Basically, I added it because I found it personally helped me a lot with maintaining continuation-passing style code, and the reason I found continuation-passing style necessary was so I could implement programming languages with different kinds of side effects. There are certainly other techniques for managing continuation-passing style, side effects, and language implementation, and I have some in mind to implement in Racket that I just haven't gotten to yet. Maybe some combination of techniques will take Parendown's place.

---

If it's just the #/ you object to, yeah, my preferred syntax would be / without the # in front. I use / because it looks like a tilted paren, making the kind of zig-zag that resembles how the code would look if it used traditional parens:

  (
    (
      (
  /
  /
  /
This just isn't so seamless to do in Racket because Racket already uses / in many common identifiers.

Do you think I should switch Parendown to use / like I really want to do? :) Is there perhaps another character you think would be better?

---

What's that about "the impression of a single unified block"? I think I could see what you mean if the bindings were all mutually recursive like `letrec`, because then nesting them in a specific order would be meaningless; they should be in a "single unified block" without a particular order. I'm not recommending Parendown for that, only for aw's original example with nested `let`. Even in Parendown-style Racket, I use local definitions for the sake of mutual recursion, like this:

  /...
  /let ()
    (define ...)
    (define ...)
    (define ...)
  /...

reply

1 point by akkartik 15 days ago | link

My objection is not to the precise characters you choose but rather the very idea of having to type some new character for every new expression. My personal taste (as you can see in Wart):

a) Use indentation.

b) If indentation isn't absolutely clear, just use parens.

Here's another comment that may help clarify what I mean: https://news.ycombinator.com/item?id=8503353#8507385

(I have no objection to replacing parens with say square brackets or curlies or something. I just think introducing new delimiters isn't worthwhile unless we're getting something more for it than just minimizing typing or making indentation uniform. Syntactic uniformity is Lisp's strength.)

I can certainly see how I would feel differently when programming in certain domains (like CPS). But I'm not convinced ideas that work in a specific domain translate well to this thread.

"What's that about "the impression of a single unified block"? I think I could see what you mean if the bindings were all mutually recursive like `letrec`, because then nesting them in a specific order would be meaningless; they should be in a "single unified block" without a particular order. I'm not recommending Parendown for that, only for aw's original example with nested `let`."

Here's aw's original example, with indentation:

    (def foo ()
      (var a (...))
      (do-something a)
      (var b (... a ...))
      (do-something-else b)
      (var c (... b ...))
      etc...)
The bindings aren't mutually recursive; there is definitely a specific order to them. `b` depends on `a` and `c` depends on `b`. And yet we want them in a flat list of expressions under `foo`. This was what I meant by "single unified block".

(I also like that aw is happy with parens ^_^)

Edit: Reflecting some more, another reason for my reaction is that you're obscuring containment relationships. That is inevitably a leaky abstraction; for example error messages may require knowing that `#/let` is starting a new scope. In which case I'd much rather make new scopes obvious with indentation.

What aw wants is a more semantic change where vars `a`, `b` and `c` have the same scope. At least that's how I interpreted OP.

reply

2 points by rocketnia 14 days ago | link

"What aw wants is a more semantic change where vars `a`, `b` and `c` have the same scope. At least that's how I interpreted OP."

I realize you also just said the example's bindings aren't mutually recursive, but I think if `a` and `c` "have the same scope" then `b` could depend on `c` as easily as it depends on `a`, so it seems to me we're talking about a design for mutual recursion. So what you're saying is the example could have been mutually recursive but it just happened not to be in this case, right? :)

Yeah, I think the most useful notions of "local definitions" would allow mutual recursion, just like top-level definitions do. It was only the specific, non-mutually-recursive example that led me to bring up Parendown at all.

---

The rest of this is a response to your points about Parendown. Some of your points in favor of not using Parendown are syntactic regularity, the clarity of containment relationships, and the avoidance of syntaxes that have little benefit other than saving typing. I'm a little surprised by those because they're some of the same reasons I use Parendown to begin with.

Let's compare the / reader macro with the `withs` s-expression macro.

I figure Lisp's syntactic regularity has to do with how long it takes to describe how the syntax works. Anyone can write any macro they want, but the macros that are simplest to describe will be the simplest to learn and implement, helping them propagate across Lisp dialects and thereby become part of what makes Lisp so syntactically regular in the first place.

- The / macro consumes s-expressions until it peeks ) and then it stops with a list of those s-expressions.

- The `withs` macro expects an even-length binding list of alternating variables and expressions and another expression. It returns the last expression modified by successively wrapping it lexical bindings of each variable-expression pair from that binding list, in reverse order.

Between these two, it seems to me `withs` takes more time and care to document, and hence puts more strain on a Lisp dialect's claim to syntactic regularity. (Not much more strain, but more strain than / does.)

A certain quirk of `withs` is that it creates several lexical scopes that are not surrouded by any parentheses. In this way, it obscures the containment relationship of those lexical scopes.

If we start with / and `withs` in the same language, then I think the only reason to `withs` is to save some typing.

So `withs` not only doesn't pull its weight in the language, but actively works against the language's syntactic regularity and containment relationship clarity.

For those reasons I prefer / over `withs`.

And if I don't bother to put `withs` in a language design, then whenever I need to write sequences of lexical bindings, I find / to be the clear choice for that.

Adding `withs` back into a language design isn't a big deal on its own; it's just slightly more work than not doing it, and I don't think its detriment to syntactic regularity and containment clarity are that bad. I could switch back from / to `withs` if it was just about this issue.

This is far from the only place I use / though. There are several macros like `withs` that I would need to bring back. The most essential use case -- the one I don't know an alternative for -- is CPS. If I learn of a good enough alternative to using / for CPS, and if I somehow end up preferring to maintain dozens of macros like `withs` instead of the one / macro, then I'll be out of reasons to recommend Parendown.

reply

2 points by akkartik 14 days ago | link

"A certain quirk of `withs` is that it creates several lexical scopes that are not surrounded by any parentheses. In this way, it obscures the containment relationship of those lexical scopes."

That's a good point that I hadn't considered, thanks. Yeah, I guess I'm not as opposed to obscuring containment as I'd hypothesized ^_^

Maybe my reaction to (foo /a b c /d e f) is akin to that of a more hardcore lisper when faced with indentation-based s-expressions :)

Isn't there a dialect out there that uses a different bracket to mean 'close all open parens'? So that the example above would become (foo (a b c (d e f]. I can't quite place the memory.

I do like far better the idea of replacing withs with a '/' ssyntax for let. So that this:

  (def foo ()
    /a (bind 1)
    (expr 1)
    /b (bind 2)
    (expr 2)
    /c (bind 3)
    (expr 3))
expands to this:

  (def foo ()
    (let a (bind 1)
      (expr 1)
      (let b (bind 2)
        (expr 2)
        (let c (bind 3)
          (expr 3)))))
It even feels like a feature that '/' looks kinda like 'λ'.

But I still can't stomach using '/' for arbitrary s-expressions. Maybe I will in time. I'll continue to play with it.

reply

2 points by rocketnia 14 days ago | link

"Isn't there a dialect out there that uses a different bracket to mean 'close all open parens'? So that the example above would become (foo (a b c (d e f]. I can't quite place the memory."

I was thinking about that and wanted to find a link earlier, but I can't find it. I hope someone will; I want to add a credit to that idea in Parendown's readme.

I seem to remember it being one of the early ideas for Arc that didn't make it to release, but maybe that's not right.

---

"I do like far better the idea of replacing withs with a '/' ssyntax for let."

I like it pretty well!

It reminds me of a `lets` macro I made in Lathe for Arc:

  (lets
    a (bind 1)
    (expr 1)
    b (bind 2)
    (expr 2)
    c (bind 3)
    (expr 3))
I kept trying to find a place this would come in handy, but the indentation always felt weird until I finally prefixed the non-binding lines as well with Parendown.

This right here

    (expr 1)
    b (bind 2)
looked like the second ( was nested inside the first, so I think at the time I tried to write it as

       (expr 1)
    b  (bind 2)
With my implementation of Penknife for JavaScript, I finally started putting /let, /if, etc. at the beginning of every line in the "block" and the indentation was easy.

With the syntax you're showing there, you at least have a punctuation character at the start of a line, which I think successfully breaks the illusion of nested parens for me.

---

It looks like I implemented `lets` because of this thread: http://arclanguage.org/item?id=11934

And would you look at that, ylando's macro (scope foo @ a b c @ d e f) is a shallow version of Parendown's macro (pd @ foo @ a b c @ d e f). XD (The `pd` macro works with any symbol, and I would usually use `/`.) I'm gonna have to add a credit to that in Parendown's readme too.

Less than a year later than that ylando thread, I started a Racket library with a macro (: foo : a b c : d e f): https://github.com/rocketnia/lathe/commit/afc713bef0163beec4...

So I suppose I have ylando to thank for illustrating the idea, as well as fallintothis for interpreting ylando's idea as an implemented macro (before ylando did, a couple of days later).

reply

2 points by rocketnia 8 days ago | link

"Isn't there a dialect out there that uses a different bracket to mean 'close all open parens'? So that the example above would become (foo (a b c (d e f]. I can't quite place the memory."

Oh, apparently it's Interlisp! They call the ] a super-parenthesis.

http://bitsavers.trailing-edge.com/pdf/xerox/interlisp/Inter...

  The INTERLISP read program treats square brackets as 'super-parentheses': a
  right square bracket automatically supplies enough right parentheses to match
  back to the last left square bracket (in the expression being read), or if none
  has appeared, to match the first left parentheses,
  e.g.,    (A (B (C]=(A (B (C))),
           (A [B (C (D] E)=(A (B (C (D))) E).
Here's a document which goes over a variety of different notations (although the fact they say "there is no opening super-parenthesis in Lisp" seems to be inaccurate considering the above):

http://www.linguistics.fi/julkaisut/SKY2006_1/2.6.9.%20YLI-J...

They favor this approach, which is also the one that best matches the way I intend for Parendown to work:

"Krauwer and des Tombe (1981) proposed _condensed labelled bracketing_ that can be defined as follows. Special brackets (here we use angle brackets) mark those initial and final branches that allow an omission of a bracket on one side in their realized markup. The omission is possible on the side where a normal bracket (square bracket) indicates, as a side-effect, the boundary of the phrase covered by the branch. For example, bracketing "[[A B] [C [D]]]" can be replaced with "[A B〉 〈C 〈D]" using this approach."

That approach includes what I would call a weak closing paren, 〉, but I've consciously left this out of Parendown. It isn't nearly as useful in a Lispy language (where lists usually begin with operator symbols, not lists), and the easiest way to add it in a left-to-right reader macro system like Racket's would be to replace the existing open paren syntax to anticipate and process these weak closing parens, rather than non-invasively extending Racket's syntax with one more macro.

reply

1 point by akkartik 14 days ago | link

What are the semantics of `lets` exactly? Do you have to alternate binding forms with 'body' expressions?

reply

2 points by rocketnia 13 days ago | link

No, you can interleave bindings and body expressions however you like, but the catch is that you can't use destructuring bindings since they look like expressions. It works like this:

  (lets) -> nil
  (lets a) -> a
  (lets a b . rest) ->
    If `a` is ssyntax or a non-symbol, we treat it as an expression:
      (do a (lets b . rest))
    Otherwise, we treat it as a variable to bind:
      (let a b (lets . rest))
The choice is almost forced in each case. It almost never makes sense to use an ssyntax symbol in a variable binding, and it almost never makes sense to discard the result of an expression that's just a variable name.

The implementation is here in Lathe's arc/utils.arc:

Current link: https://github.com/rocketnia/lathe/blob/master/arc/utils.arc

Posterity link: https://github.com/rocketnia/lathe/blob/e21a3043eb2db2333f94...

reply

3 points by aw 14 days ago | link

> What aw wants is a more semantic change where vars `a`, `b` and `c` have the same scope. At least that's how I interpreted OP.

Just to clarify, in my original design a `var` would expand into a `let` (with the body of the let extending down to the bottom of the enclosing form), and thus the definitions wouldn't have the same scope.

Which isn't to say we couldn't do something different of course :)

Huh, I thought I had fixed the formatting in post, but apparently it didn't get saved. Too late to edit it now.

reply


Yeah, that 1830 is the number of characters since the start of the stream, like you say. (I think it's specifically 1 plus the number of bytes, as documented at [1].)

Racket errors tend to report whatever location information they can, and Racket ports track four things: The name of the file, the line number, the column number, and the overall position in the stream. But line-counting isn't enabled by default on Racket streams, so the error you see doesn't contain the line number information.

I just pushed a commit[2] to Anarki that enables line-counting on the stream when an Arc files is loaded, which should mean you can now see the line and column of a reader error. A reader error on line 17, column 6 should now be printed like so:

  sample-file.arc:17:6: read: expected a closing "
[1] (https://docs.racket-lang.org/reference/linecol.html#(def._((...)

[2] https://github.com/arclanguage/anarki/commit/208620eca083650...

reply

4 points by zck 12 days ago | link

What a nice change! Super glad to have this.

reply

4 points by kinnard 17 days ago | link

Wow.

reply

3 points by rocketnia 19 days ago | link | parent | on: Lexically scoped macros?

A situation where Common Lisp's `macrolet` or Racket's `let-syntax` would be handy came up here a few months ago. I dropped in with some thoughts on how we could add it to Arc: http://arclanguage.org/item?id=20561

I think what I describe there pretty much meshes with what you and waterhouse are describing here. :)

---

"And then "anarki-test" would be evaluated at compile time? Hmm..."

Yeah, I think basically every instance of local macros I've seen involves evaluating the expression at compile time.

That's even what Racket's `let-syntax` does, although in Racket's case it involves a little more detail since Racket enforces a strict separation between compile-time and run-time side effects. When Racket evaluates the expression at compile time (usually, in phase level 1) it first expands that expression in the phase level corresponding to the compile time of the compile time (phase level 2), and if that expression contains another `let-syntax`, then it starts expanding an expression in phase level 3 and so on.

Arc evaluates and expands everything in one phase, as far as Racket is concerned. It would make sense for `lexical-macro` to do its evaluation in the same phase too.

I notice Common Lisp's `macrolet` allows[1] inner `macrolet` macros to depend on outer ones, like this:

  (lexical-macro (foo) ''world
    (lexical-macro (bar) `(sym:+ "hello-" ,(foo))
      (bar)))
  ; could return the symbol 'hello-world
To support that, when `lexical-macro` evaluates the expression, it should expand that expression in the same macro binding environment the `lexical-macro` call was expanded in.

[1] http://www.lispworks.com/documentation/HyperSpec/Body/s_flet...

---

Here's a different take on the local macro concept by almkglor 10 years ago: http://arclanguage.org/item?id=3085

That implementation, `macwith`, is still around on Anarki's arc2.master branch, in the file lib/macrolet.arc: https://github.com/arclanguage/anarki/blob/af5b756e983807ba6...

It works by traversing the body s-expressions and expanding all occurrences it finds, leaving anything else alone. I think this means it tends to break if the code ever contains a list that looks like a call to that macro but isn't supposed to be, such as a quoted occurrence like '(my-macro), a binding occurrence like (fn (my-macro) ...), or even another `macwith` binding occurrence of the same macro name.

I don't prefer this to the other approach we're talking about, since after all it's within easy reach of the macro system to support local macros itself rather than with an error-prone code-walker like this.

However, `macwith` is a macro that makes sense in its own right. It's easy to work around many of the places the code walker runs into false positives, and if `macwith` came with support for an escape sequence, there would be easy workarounds in many other cases too.

If I tried to propose a particular escape sequence design for `macwith` right here and now, I could be here for a while. I've been working for two years to make a macro system suitable for factoring out escape sequence syntaxes into libraries, and my favorite designs for `macwith` escape sequences would be the ones that solved all the same problems I'm building that system for.

reply

More