Arc Forumnew | comments | leaders | submit | rocketnia's commentslogin

I'm excited to have something I've put together this week that seems highly relevant to discussions we've had here. :) Basically, Fexpress is an fexpr evaluator, but the fexprs are an extended kind that can also offer up compilation functionality in addition to interpretation.

There are only so many things that are "internal" in this project, because all the bits and pieces might be useful to user-defined fexprs, so you can get a pretty thorough tour of the techniques from the docs: https://docs.racket-lang.org/fexpress/index.html

I was merciless about cutting down the scope of this project so that I wouldn't overcomplicate it and make it incomprehensible to people. I specifically made certain questionable choices in the API just so that the code could be a little easier to read.

- Link to the code: https://github.com/rocketnia/fexpress/blob/main/fexpress-lib...

- Link to the current version of the code, as of writing this: https://github.com/rocketnia/fexpress/blob/3e91862efaeb61c71...

Until this week, the Fexpress repo consisted of only a non-starter attempt I made 10 years ago, so it's a bit nostalgic to finally build it out into a real project. I wrote up a blog post about this journey and where the ideas might go from here: https://rocketnia.wordpress.com/2021/08/18/fexpress-compilat...

-----

2 points by akkartik 64 days ago | link

I love the first sentence in the doc:

"As far as feasible, it macroexpands expressions ahead of time instead of just interpreting everything."

Still reading.

-----

1 point by akkartik 63 days ago | link

"However, there’s a certain kind of seamlessness Fexpress won’t attempt: Racket’s and can’t be passed into Racket’s map, and sometimes this surprises people who expect macros to act like functions. In languages with fexprs as the default abstraction, it tends to be easy to implement and and map in such a way that this interaction succeeds."

Do you have an fexpr-aware version of map at the moment? Or and? Or do you have to switch both?

-----

2 points by rocketnia 62 days ago | link

The point I'm making in that paragraph is that there are reasons for `map` and `and` not to work together even in an fexpr-capable language. So I'm not sure what you're asking for; I'd say Racket's `map` and `and` already are "fexpr-aware."

Are you asking for clarification on how "this interaction succeeds" in a language where fexprs are more pervasive?

Let's see... I found an online PicoLisp runner.

As an initial test, `prin` prints and returns its string argument, and anything non-NIL counts as truthy:

  : (and (prin "hello^J") NIL (prin "world^J"))
  hello
  -> "hello^J"
So here's `mapcar` mapping a function over two lists:

  : (mapcar '((a b) (+ a b)) (list 1 2 3) (list 4 5 6))
  -> (5 7 9)
Here it is mapping `and` over two lists, where "the interaction succeeds" in the sense I was referring to:

  : (mapcar and (list T NIL) (list NIL (prin "oops^J")))
  oops
  -> (NIL NIL)
Here it is mapping a custom fexpr over two lists so we can see what unevaluated arguments it observes:

  : (mapcar
      '(args
        (println args)
        (and (eval (car args)) (eval (cadr args))))
      (list T NIL)
      (list NIL (prin "oops^J")))
  oops
  ($177775247560141 $177775247560143)
  ($177775247560141 $177775247560143)
  -> (NIL NIL)
(PicoLisp doesn't evaluate rest arguments like `args` here, even though it does evaluate every other argument in the argument list, and that's by design. This is how we can write custom fexprs.)

It appears that what our fexpr observes is a list of what PicoLisp calls "anonymous symbols" (probably like gensyms), which apparently are bound to the already-evaluated argument values.

Incidentally, notice how this `mapcar` call fully evaluates the list arguments, including the part that prints "oops." That makes these two programs arguably inconsistent:

  : (mapcar and (list T NIL) (list NIL (prin "oops^J")))
  oops
  -> (NIL NIL)
  : (list (and T NIL) (and NIL (prin "oops^J")))
  -> (NIL NIL)
When the transformation passed to `mapcar` is a pure function, refactoring a `mapcar` call this way makes no difference. When the transformation is a side-effecting procedure, it does make a slight difference because one transformation call might perform its effects before the next transformation's arguments have started to be evaluated. And here, when the transformation is `and`, this refactoring makes a difference because `and` usually changes how often its arguments' effects are performed, but `mapcar` doesn't let it have that control.

That more or less makes sense operationally, but it means that in PicoLisp, I could imagine `mapcar` being more "fexpr-aware" than it is now. A more fexpr-aware design for `mapcar` would either:

- Allow `and` to affect the evaluation of `mapcar`'s own arguments. Now we get to implement a way to evaluate a list-valued expression without evaluating the list elements, so that we can pass the elements along to `and`. And once we're done with that, what should happen when we map fexprs that aren't even procedure-like, such as (mapcar let ...) or (mapcar setq ...)? Is there even a point to doing any of this?

- With full awareness of fexprs, make the decision to explicitly disallow non-procedure fexprs from being used with `mapcar`. That way, we don't even open up that can of worms.

Racket's `map` essentially accomplishes the latter.

-----

2 points by rocketnia 62 days ago | link

By the way, if you'd be interested in a version of `and` that can be used as both a Racket macro with short-circuiting behavior and a first-class procedure without it, you can do that in Racket (using essentially a symbol macro):

  (define (my-and-procedure . args)
    (andmap identity args))
  
  (define-syntax (my-and stx)
    (syntax-parse stx
      [_:id #'my-and-procedure]
      [(_ args:expr ...) #'(and args ...)]))
This basically reproduces the same halfheartedly fexpr-aware behavior as PicoLisp, but without any fexprs:

  > (map my-and (list #t #f) (list #f (displayln "oops")))
  oops
  '(#f #f)
  > (list (my-and #t #f) (my-and #f (displayln "oops")))
  '(#f #f)
In the context of Fexpress, this punning can be taken to another level, letting the first-class version of `my-and` be both a procedure and an Fexpress fexpr at the same time, while the second-class calls to it have Racket macro behavior. I've put together a test to demonstrate it:

- Code link: https://github.com/rocketnia/fexpress/blob/main/fexpress-tes...

- Code link pinned to the current commit as of this writing: https://github.com/rocketnia/fexpress/blob/e33c07a4e8252793a...

In the Fexpress proof of concept, there's no way to convert between Fexpress fexprs and Racket macros in either direction. That makes this punning pretty pointless, except perhaps as a technique for publishing libraries that can be used roughly the same way by both Racket and Fexpress programmers.

-----

1 point by akkartik 61 days ago | link

Sorry, I do know that fexpr lisps can combine `map` and `and`. It sounds like you're proposing a different approach in Fexpress, so I was wondering what sort of vocabulary you are imagining. Were you thinking you'd have something called `map/f` that can take non-functions as its first argument? And so there'd be fexpr-aware variants for all higher-order functions? (Thinking harder, an `and/f` doesn't really make sense here.)

-----

2 points by rocketnia 61 days ago | link

"Were you thinking you'd have something called `map/f` that can take non-functions as its first argument?"

No, I was thinking `map` should only take a procedure.

I don't think there's a particular way to "map" an fexpr that stands out as a compelling abstraction. I propped up a certain ambitious "evaluate a list-valued expression without evaluating the list elements" vision for `map` above, but this is really more of a lazy function map than an fexpr map.

The `map` operation is basically about the list data structure. Since it's a data structure that holds its elements eagerly, we can transform it with an eager function. If it held its elements lazily, we could use a lazy function. If it held its arguments unevaluated-ly, and if the elements didn't even have to be expressions, then the abstractions we could map over these lists might be fexpr-like, but I don't know of a way to make that make sense.

If you're thinking "but then what do you do with an fexpr once you have one?" I don't know. I've only found so many compelling uses for fexprs, which have to do with places where the code of the call site is needed at run time, possibly because run time and compile time are tangled up:

- Defining custom syntaxes in mutually recursive ways that involve invoking some of the syntaxes before some of the others have finished being defined.

- Making macro systems where redefinitions of the macros automatically propagate to change the behavior of previously defined things that call them.

- Writing abstractions that take care of their own JIT behavior. In order to strategically recompile their use sites according to usage statistics collected at run time, they need access to both the source code and the ability to observe run-time conditions.

In each case, fexprs would be serving as a language front-end layer, not as internal building blocks for general-purpose computation. I think source code is second-class in an essential way; although we can write macro abstraction layers that fabricate source code as they go along, this makes it harder to report informative errors in terms of the user's original code. Macros and fexprs take source code as input, so they're tied to the language front-end in a way that procedures aren't.

And when the source code isn't something we're abstracting over much, then neither is the environment of macros or fexprs that are in scope there. (Like, the environment is more or less an annotation on the source code, and if we focus on having only one original source codebase to annotate this way, then we only have one set of annotations to produce.) So a lot of the best techniques for macros or fexprs are likely to be self-contained abstractions that keep their first-class manipulations of macros and fexprs behind the scenes, rather than whole programming styles which make pervasive use of macros or fexprs as first-class values.

-----

2 points by akkartik 61 days ago | link

Thank you, that is very helpful! And this also matches some of my growing ambivalence with macros.

When I was building Wart[1], I repeatedly ran into situations that seemed very difficult to debug. Looking back, I tend to debug by simulating computation in my head, and macros made it easy to create code that was more complicated than I could visualize[2]. When I built Mu[3] I hoped that backing away from first-class macros and pervasive support for traces[4] would help. That's still in early days, but the evidence is quite ambivalent so far.

[1] https://github.com/akkartik/wart

[2] https://www.goodreads.com/quotes/273375-everyone-knows-that-...

[3] https://github.com/akkartik/mu

[4] https://archive.org/details/akkartik-mu-2021-06-23

-----

2 points by rocketnia 55 days ago | link

I think macros can get pretty complicated fast. I think in some ways it's at least as hard to write well polished macros (with good error messages, editor support, debugging steppers, pretty printers, and so on) as it is to build a well polished language without macros. I think the reason it feels easy to write macros in many languages is because many languages don't set a high bar for a polished experience.

I don't think taking macros out is an answer I'm very interested in, though. I want people to be able to use my stuff without having to write code the same way I do, and vice versa. Some kind of syntactic extensibility is important to me.

And I don't think taking macros out even really eliminates the "tied to the front end" non-compositionality issues. Every abstraction that can be used has a user experience, and macros just take more control of that experience than functions do. Every time a programmer tries to report a friendly error message from a function, they're trying to take some control over that experience, but the capabilities of a function only give them so much information to work with.

A language without macros can supply a one-size-fits-all kind of polished experience, which may indeed be a much better experience than what most macros provide in practice. But a language like that also imposes a practical limit on how much more polished it can get, especially for domain-specific applications that don't have the attention of the language developers.

reply

1 point by akkartik 54 days ago | link

To be clear I still use macros. I'm skeptical of the benefit of fexprs, but it totally makes sense to support them as a sharp-edged, high-power capability intended to be used rarely.

reply

2 points by rocketnia 54 days ago | link

There's not much use I have for fexprs. I just had certain ideas about how they fit in with other concepts and how they could be made compatible with a compilation-favoring workflow.

I consider it basically a mistake to use the same syntax for macro calls and function calls... but not because they're different things. When we want to invoke a function like a macro, we can just conceptualize it as a macro that expands into a function call.

I think it's the same for fexprs. Macro calls can more or less expand into fexpr calls by simply preserving all the code they received under a `quote` in their expansion result. This doesn't quite get us a lexical-scope-respecting version of fexprs unless we also capture the local lexical environment, and that's not necessarily possible depending on the macro system, but it's not much of a stretch:

- Racket's macroexpander keeps track of the set of local variables in scope, but it doesn't quite expose it to user-defined macros.

- Arc's macroexpander `ac` keeps track of the set `env` of Arc local variables in scope, but it doesn't expose it to user-defined macros.

- In Guile, the local environment can be captured using (procedure-environment (lambda () '())).

- Fexpress currently lets compilation-friendly fexprs do this using the `depends-on-env?` field of a `compilation-result?` data structure. If a subexpression depends on the lexical environment this way, (fexpress-clambda ...) forms surrounding that subexpression expand differently so that they create a run-time representation of the lexical scope. This way, no one step in the code has to traverse or build up the whole environment, so the cost is spread out. Since Fexpress's variables are immutable, variable accesses don't even have to go through this run-time environment; they can be Racket variable accesses.

Anyhow, I basically consider fexprs to be one of the things a macro-capable language is theoretically capable of, even if people don't commonly prefer to use that functionality and macro systems don't always quite offer it.

I've been interested in exploring the concept of typed macros in general for the purposes of designing module systems which have both macros and typed API signatures. And I've had typed fexprs on my mind as an optimization approach since the Eight thread way back when, and I expect these two trains of thought to be on their way to the same destination.

With Fexpress, I explored the typed fexpr side, and I kept a really narrow focus on types for optimization rather than API boundary delineation so that I wouldn't make it any more complicated than it had to be. I don't want to be like "oh, by the way, in this complex type system for macros, you can find fexprs if you look for them," because some people might recoil at that. :-p And if I said something like that, but I didn't actually have an fexprs-by-default language ready to show, that would be quite a tease.

The actual languages I build with these ideas will probably not have fexpr calls as the default behavior for calling a local variable. Personally, I prefer not to have any default; it's not too much work to write out `funcall` unless I have a whole DSL's worth of local variables I'm invoking. But if it's a typed language, a lot of the local variables will probably have Lisp-1-style behavior anyway, because a variable of function type could be known to have function-call-like macro behavior and so on. And if some of the types people use turn out to declare that their variables have fexpr-like call behavior, I'll consider that to be an interesting outcome. The types, if they do any kind of soundness enforcement (unlike Fexpress's unsound hints), can keep those fexprs from sneaking into other people's code unless they're ready to receive them.

I know API enforcement wouldn't necessarily be your favorite feature in those designs. :-p But I think that's basically the picture of where Fexpress-style fexprs slot into my long-term goals, basically as one part of the possibility space that macros have more room to explore with the help of a type system.

reply


If I understand right, I think those are supposed to be errors still. The documentation for `expand-user-path` says "In addition, on Unix and Mac OS, a leading ~ is treated as user’s home directory and expanded[...]" but it doesn't say there's any corresponding behavior on Windows.

-----

3 points by zck 112 days ago | link

Ah, interesting. That's a good point that it's Unix/MacOS-specific. My hope was that it should make Arc interpret paths the same way the default shell on the system does. On Linux, ~/ is the home directory of the current user, so it makes sense that Anarki would work that way on Linux. But Windows doesn't work that way, so it makes sense Anarki wouldn't either.

I don't know of any similar path tricks on Windows, But, if this change hasn't broken anything on Windows, I would be satisfied with that. Can you verify it still works to open an existing file? Are there any other special ways to refer to files or directories on Windows?

-----

2 points by rocketnia 106 days ago | link

"On Linux, ~/ is the home directory of the current user, so it makes sense that Anarki would work that way on Linux. But Windows doesn't work that way, so it makes sense Anarki wouldn't either."

Yeah, I don't know much about what a Windows user would expect ~ to do. I would say `expand-user-path` leaving ~ alone on Windows is probably as good a behavior as any. It coincides with Command Prompt, where ~ just refers to a file or folder named "~". In PowerShell, ~ seems to be expanded to C:\Users\[username]\ somehow, so there's potentially an alternative design there.

---

"I don't know of any similar path tricks on Windows, But, if this change hasn't broken anything on Windows, I would be satisfied with that. Can you verify it still works to open an existing file?"

On Windows 10 64-bit with Racket 8.1, I've at least run the tests.arc unit tests, the unit-test.arc/tests.arc unit tests, and build-web-help.arc, and they seem to work.

---

"Are there any other special ways to refer to files or directories on Windows?"

Racket deals with a lot more Windows path features than I've ever learned about or encountered, and there's some gritty documentation of that here: https://docs.racket-lang.org/reference/windowspaths.html

I hardly know where to begin learning about and testing those features, and the documentation makes it look like Racket has explored hat rabbit hole pretty thoroughly already, so I'm inclined to suggest we just piggyback on Racket's work here.

-----


At one point, years ago, I relied on every lambda returning a newly allocated object. I think I brought it up as a bug in Rainbow that the [] expressions in my Arc code were generating the same lambda-lifted result instead of creating unique objects each time.

But I think this was never something I could rely on in the original, Racket-based Arc either. The Racket docs don't specify whether a lambda expression returns a new object or reuses an existing one, and in fact they explicitly allow for the possibility of reusing one:

"Similarly, evaluation of a lambda form typically generates a new procedure object, but it may re-use a procedure object previously generated by the same source lambda form."

That sentence has been in the reference since the earliest versions I can find online:

The latest docs, currently for 8.1 (2021): https://docs.racket-lang.org/reference/Equality.html#%28part...

5.0 (2010): https://download.racket-lang.org/docs/5.0/html/reference/eva...

4.0.0.1 (2008): https://web.archive.org/web/20080620030058/http://docs.plt-s...

3.99.0.13 (2008): https://web.archive.org/web/20080229164003/http://docs.plt-s...

I'm not sure if the bug you're referring to had to do with making the same mistake I was making back then, or if you're talking about making the opposite assumption (expecting two procedures to be equal), but it's probably best not to expect stability either way.

-----

2 points by zck 140 days ago | link

I was doing a deep-`iso` comparison I called `same`, because in Arc, hash tables are not the same. In Arc3.1:

  arc> (iso (obj) (obj))
  nil
  arc> (iso (obj 1 2) (obj 1 2))
  nil
So I made a function that iterated over the key/value pairs and compared them. In some unit tests, I used that on a hash table that contained functions. Yesterday, I refactored the unit tests to instead make sure the keys were right.

Interestingly, it seems that Anarki can compare hash tables:

  arc> (iso (obj) (obj))
  't
  
  arc> (iso (obj 1 2) (obj 1 2))
  't
Anyway, this should fix the unit test tests in the Anarki repo.

-----


There are a number of system calls like (system "rm ...") in news.arc and other places. These run shell commands, and most of them only work on a POSIX shell. I'm sorry to say, I'm not sure anyone has ever gone through all these calls and made them portable so that news.arc can run on Windows.

If you use the Anarki master branch, I think a few of those calls have been replaced with more portable code, but not all of them. I've at least been able to get a server started up and showing content before.

I used to use Arc on Windows quite a bit, and I tried the web server a couple of times, but I never really used the web server. Nowadays I still use Windows, but I do a lot of my programming projects inside a Linux Mint VM using VirtualBox. I think running news.arc inside a VM like that would be one of the easier ways to get it working. Of course, learning to set up a VM like that could be a lot to figure out, but at least that's something a lot of people out there have already written up tutorials for.

Another possibility is to look for one of the many Hacker News clones that people have whipped up in other languages. You probably won't get to use Arc that way, but some of those clones may be better architected and better maintained than news.arc.

-----

2 points by Jonesk79 171 days ago | link

I have found this and it looks promising. https://coderrocketfuel.com/courses/hacker-news-clone

My only concern is that it won't use the news.arc algorithm. In that case, I think I'd take the time to learn how to set up a virtual environment.

-----


I've brought up an idea a couple of times here: If (quasiquote ...) and (unquote ...) match up and nest like parentheses, then what if we generalize that analogy to higher dimensions?

After years of going back to the drawing board over and over, I finally have a working, reasonably elegant (if slow), and most importantly, documented implementation of hypersnippets and hyperbrackets in Punctaffy. I'm really excited to be able to give examples of how these can be used for things other than quasiquotation.

The "Introduction to Punctaffy" section introduces the idea of degree-2 hyperbrackets. There are precious few examples where degree-3 hyperbrackets would come in handy (much less higher-degree ones), but the infrastructure is working for those as well.

There are still a number of directions I hope Punctaffy will take from here, but it's finally past the "sorry, I haven't written up good documentation, so I don't know how to explain" stage that it's been stuck in for quite a while. :) If you're interested, I've written up some in-depth discussion of those future ideas in the "Potential Applications" sections. One of those (the interaction between 'unsyntax and ellipses in the Racket syntax template DSL) would even be a practical application of degree-3 hyperbrackets.

This documentation totals to something like 30,000 words. Most of those words are the reference documentation of Punctaffy's hypersnippet-shaped data structures (hypertees and hypernests). Don't worry about reading through all of it, but I'd be very happy to see if anyone could make it through the introduction. After 4 years of not having much to show people, I really hope that introduction will help. :)

-----

3 points by shader 219 days ago | link

Thanks for sharing! I always like ideas that integrate previously distinct concepts into a single recursive hierarchy.

I am curious what makes it slow? Is it just an implementation detail, or something fundamental to the algorithms / concepts?

-----

3 points by rocketnia 217 days ago | link

There are a number of factors going into what makes it slow (and what might make it faster). I'm particularly hopeful about points 4 and 5 here, basically the potential to design data structures which pay substantially better attention to avoiding redundant memory use.

By the way, thanks for asking! I'm reluctant to talk about this in depth in the documentation because I don't want to get people's hopes up; I don't know if these optimizations will actually help, and even if they do, I don't know how soon I'm going to be able to work on building them. Still, it's something I've been thinking about.

1. Contracts on the whole

Punctaffy does a lot of redundant contract checking. I have a compile-time constant that turns off contract checking[1], and turning it off gives a time reduction in the unit tests of about 70%, reducing a time like 1h20m to something more like 20m. That's still pretty slow, but since this is a quick fix for most of the issue, it's very tempting to just publish a contractless variation of the library for people who want to live on the edge.

2. Whether the contracts trust the library

Currently, all the contracts are written as though they're for documentation, where they describe the inputs and the outputs. This constrains not just that the library is being used with the correct inputs but that it's producing the correct outputs. Unless the library is in the process of being debugged, these contracts can be turned off.

(Incidentally, I do have a compile-time constant that turns on some far more pervasive contract-checking within Punctaffy to help isolate bugs.[2] I'll probably want it to control these output-verifying contracts as well.)

3. The contract `hypernest/c`

One of the most fixable aspects here is that my `hypernest/c` contract checks a hypernest's entire structure every time, verifying that all the different branches zip together. If I verified this at the time a hypernest was constructed, the `hypernest/c` contract could just take it for granted that every hypernest was well-formed.

4. Avoidable higher-dimensional redundancy in the data structure

Of course, even without contracts, 20 minutes is still a pretty long time to wait for some simple tests to compile. I don't want to imagine the time it would take to compile a project that made extensive use of Punctaffy. So what's the remaining issue?

Well, one clue is a previous implementation of hypersnippets I wrote before I refactored it and polished it up. This old implementation represented hypertees not as trees that corresponded to the nesting structure, but as plain old lists of brackets. Every operation on these was implemented in terms of hyperstacks, and while this almost imperative style worked, it didn't give me confidence in the design. This old implementation isn't hooked up to all the tests, but it's hooked up to some tests that correspond to ones that take about 3 minutes to run on the polished-up implementation. On the old list-of-brackets implementation, they take about 7 seconds.

I think there's a reason for that. When I represent a hole in a hypertee, the shape of the hole itself is a hypertee, and the syntax beyond each of the holes of that hole is a hypertee that fits there. A hypertee fits in a hole if its low-degree holes match up. That means that in the tree representation, I have some redundancy: Certain holes are in multiple places that have to match up. Once we're dealing with, say, degree-3 hypertees, which can have degree-2 hypertees for holes, which can have degree-1 hypertees for holes, which have a degree-0 hypertee for a hole, the duplication compounds on itself. The data structure is using too much space, and traversing that space is taking too much time.

I think switching back to using lists of brackets and traversing them with hyperstacks every time will do a lot to help here.

But I have other ideas...

5. Avoidable copying of the data structure

Most snippets could be views, carrying an index into some other snippet's list of brackets rather than carrying a whole new list of brackets of their own. In particular, since Punctaffy's main use is for parsing hyperbracketed code, most snippets will probably be views over a programmer's hand-written code.

6. The contract `snippet-sys-unlabeled-shape/c`

There's also another opportunity that might pay off a little. Several of Punctaffy's operations expect values of the form "snippet with holes that contain only trivial values," using the `snippet-sys-unlabeled-shape/c` contract combinator to express this. It would probably be easy for each snippet to carry some precomputed information saying what its least degree of nontrivial-value-carrying hole is (if any). That would save a traversal every time this contract was checked.

This idea gets into territory that makes some more noticeable compromises to conceptual simplicity for the sake of performance. Now a snippet system would have a new dedicated method for computing this particular information. While that would help people implement efficient snippet systems, it might intimidate people who find snippet systems to be complicated enough already.

It's not that much more complicated, so I suspect it's worth it. But if it turns out this optimization doesn't pay off very well, or if the other techniques already bring Punctaffy's performance to a good enough level, it might not turn out to be a great tradeoff.

---

[1] `debugging-with-contracts-suppressed` at https://github.com/lathe/punctaffy-for-racket/blob/399657556...

[2] `debugging-with-contracts` at https://github.com/lathe/punctaffy-for-racket/blob/399657556...

-----

2 points by shader 199 days ago | link

I want to think more about this and continue the conversation, but I'm worried the reply window will close first.

Since I presume your input space is relatively small (the AST of a program, which usually only has a few thousand nodes), it sounds like you have some sort of state-space explosion. Your comment about recursive matching of hypertees sounds like the biggest problem. Just a shot in the dark (having not studied what you're doing yet), but is there any chance you could use partial-order reduction, memoization, backtracking, etc. to reduce the state-space explosion?

I could be wrong, but most of the other optimizations sounded like they address constant factors, like contract checking. But then I don't know much about how contracts work; I guess the verification logic could be rather involved itself.

If the window closes, maybe we could continue at #arc-language:matrix.org

-----

3 points by rocketnia 193 days ago | link

I'm sorry, I really appreciate it, but right now I have other things I need to focus on. I hope we can talk about Punctaffy in the future.

-----

3 points by akkartik 237 days ago | link

Thanks for doing this. https://docs.racket-lang.org/punctaffy/motivation.html in particular really helps understand where you're going with this.

-----


I appreciate your frankness, lol. Anarki maintenance happens whenever we feel like it. :-p I have been feeling an inkling of an Arc urge lately, hence this post, but I have other priorities right now that this is basically procrastination from.

What do you mean by getting GitHub to build docs from the trunk branch? That's what that part of the CI script is there to simulate; I think there's still no actual way to configure GitHub Pages to build from anything but a `gh-pages` branch or a `<username>.github.io` repository.

-----

2 points by akkartik 293 days ago | link

A few years ago GitHub added the ability to specify a branch for a project. See the GitHub pages section of https://github.com/arclanguage/anarki/settings

I have a site at http://akkartik.github.io/mu but it's served directly from the main branch. There's no gh-pages branch at https://github.com/akkartik/mu

-----

2 points by rocketnia 292 days ago | link

Oh, I see, you're right. I checked that settings page, but the big blank "social preview" image made me think I was at the end of the page, so I forgot I could scroll down. That's interesting.

-----


Did you build all these dialogue boxes yourself too? Very snazzy. :D

-----

2 points by akkartik 341 days ago | link

https://github.com/akkartik/mu/blob/3d31467c0d/apps/tile/box... :D

-----


Not that it helps you much now, but the best documentation for tagged types (including the `rep` operation) is probably this: http://arclanguage.github.io/ref/type.html

You can get to it from the "Type operations" link in the table of contents. Of course, you might've been drawn to the "Templates" page instead.

And unfortunately, that documentation describes Anarki's stable branch. Making documentation of similar quality for Anarki's master branch might be quite a bit more work. So even if you were familiar with tagged types already, the knowledge that template instances were a tagged type on Anarki master might be hard to discover.

I suppose a more ideal form of the documentation would describe on the "Templates" page that template instances were a tagged type with a certain representation. It could describe the direct implications of that, but it would also have a link to the "Type operations" page for more background.

-----

2 points by krapp 521 days ago | link

>I suppose a more ideal form of the documentation would describe on the "Templates" page that template instances were a tagged type with a certain representation. It could describe the direct implications of that, but it would also have a link to the "Type operations" page for more background.

That would have been tremendously helpful, can we do that? Who actually maintains the documentation, can it be updated?

-----

1 point by akkartik 522 days ago | link

Perhaps we should maintain the docs only in plain-text and only within the repo, just because of the operational overheads of managing multiple branches. At least that way they won't seem to lie to a newcomer. I'm curious to hear what others think.

-----

2 points by krapp 521 days ago | link

There is already a wiki that isn't getting much use[0].

[0]https://github.com/arclanguage/anarki/wiki

-----


"But in general, having incompatible types easily share functions without sharing too much is an open problem: https://en.wikipedia.org/wiki/Expression_problem A language can easily add a method to many types, or add a new type to many methods. But we don't yet know how to achieve both sides."

I'm trying to follow, but I think you and I must have different understandings of the expression problem. That article lists several known solutions to the expression problem. The solution Anarki uses is `defextend`.

What do you mean by "sharing too much"?

Is Anarki's `defextend` technique already encouraging a bloated codebase, or is there some other technique you're thinking of that would do that?

-----

2 points by akkartik 522 days ago | link

Yeah, I suppose you could say the problem is 'solved'. I think of it as a trade-off with costs. We don't know how to achieve zero cost.

For example, I absolutely agree with you that 2 lines per method to extend every table method to some new type constitutes a solution for us. But if we had a thousand such types and a thousand such methods, it may seem like less of a solution. But then `defextend` would be the victim rather than cause of bloat.

-----

3 points by rocketnia 521 days ago | link

Ah, you're imagining us having to write and maintain 1000×1000 individual `defextend` forms someday? Yeah, that does seem like a problem that would not feel solved once we got to it. :-p

I don't think that aspect of the expression problem is solvable in a language design. Instead, it's an ongoing conversation in the community. Sometimes the intent of one feature and the intent of another feature interact, leading people to do a nonzero amount of work to figure out the intent of the two features put together. That work is an essential part of what the community is trying to accomplish together, so it's a cost that can't be eliminated. The intent has to be reflected in the code somewhere, so there will be a nonzero amount of code that serves feature-coordinating purposes.

Regardless, I'm optimistic that although the amount of code will be nonzero, it'll still have a manageable size. To the extent we have any kind of consistency around these feature interaction decisions, those consistent principles can develop into abstractions. The only way we'll have 1000×1000 individual intersections to maintain is if we as a community culture are already maintaining 1,000,000 compelling and distinct justifications for them. :)

-----

1 point by akkartik 521 days ago | link

Indeed. I'm curious: does my interpretation of the expression problem miss what the papers tend to focus on?

-----

2 points by rocketnia 520 days ago | link

Well... That's a good question.

I haven't read any more than a few papers on it, and maybe only one of those in depth (which I'll mention below). Mostly I'm going by forum threads, wiki articles, and the design choices certain languages make (like Inform's multimethods and Haskell's type classes).

As far as I understand the history, Philip Wadler's work basically defined the strict parameters of the expression problem and explored solutions for it. Separate compilation and the avoidance of dynamic casts were big deals for Wadler for some reason.

That work was focused on Java, where it's easy to define new classes that implement existing interfaces but impossible to implement new interfaces on existing classes.

The solution I'm most familiar with for Java-style languages is the use of object algebras, as described in Oliveira and Cook's "Extensibility for the Masses: Practical Extensibility with Object Algebras" (https://www.cs.utexas.edu/~wcook/Drafts/2012/ecoop2012.pdf). In this approach, when you extend the system with a new type, you define a new interface with a generic type parameter and a factory method for building that type, and you have that interface inherit all the existing factory methods. So you don't have to solve the unsolvable task of implementing a new interface for an existing class, because you're representing your types as type parameters and methods, not simply as classes.

So I think the main subject of research was how best to represent an extensible program's types and functions in a language like Java where the most obvious choices weren't expressive enough. I think it's more of a "how do we allow extensions to be made at all" problem than a "how do we make all the extensions maintainable" problem.

But then, I've really barely scratched the surface of the research, so I could easily be missing stuff like that.

-----


"to find out about it"

Sorry, to find out about which part?

-----

2 points by rocketnia 522 days ago | link

Ah, I think I understand: The fact that you can call `rep` on a template instance.

-----

More