I'm really glad you linked to that thread. I did remember it, and I meant to link there myself.
> 2) It would be extremely bad to write code that depends on unquoted symbols getting treated as quoted symbols when they happen to be unbound.
Yes. It would be a sort of inadvertent variable capture and could be the cause of some insidious bugs. Not any more insidious than the bugs you can get from an abuse of unhygienic macros or mutation though.
> (It'd probably be a good idea to make a global-assignment thing called something like "var" warn when assigning to an already-bound variable; currently this is called "safeset", and "def" and "mac" are implemented with it.)
That is a good idea! How about warning you when you assign to a variable that existing code depends on? For example:
> (def foo () (a b c))
> (= a 5)
Warning: Oh no, what are you doing? You're either
being really clever or you forgot that your previous
code depends on this variable being unbound. Why
didn't you just use the ' operator? It's only one
character, for god's sake.
Of course, at this point you're not really screwed. You can save yourself by either resetting the variable with (= a 'a) or redefining foo to use explicit quotation.
> If, say, you use (a b c) to mean the list '(a b c) in your code, and then you test things out at the REPL, you'd better be careful not to give "a" a value, or it'll break your code. I think the cognitive overhead of worrying about that far outweighs the cost of using the ' operator.
I think the warning we just talked about could go a long way toward eliminating this cognitive overhead. Don't worry about it, just deal with it if you get the warning. Or don't use it in the first place because...
The proposed changes are backwards-compatible with Arc 3.1, since all they attempt to do is provide sensible defaults for things that presently raise errors. I want to emphasize that the goal here is not to eliminate or replace quote and quasiquote. Rather it's to enhance these operators by giving you the ability to use them implicitly sometimes, instead of requiring that you be explicit in cases where it's the only useful meaning possible.
My new way of thinking about this is quote inference, a la type inference.
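To make the idea, and the hazard discussed above, concrete, here is a minimal sketch in stock Arc 3.1. The helper `infer-quote` and the variable `zzz-unbound` are hypothetical names used only for illustration; the real proposal would live in the evaluator rather than in a function like this.

```arc
; Hypothetical helper, not part of Arc 3.1: treat a symbol the way
; "quote inference" would, falling back to the symbol itself when it
; has no global binding.
(def infer-quote (s)
  (if (bound s) (eval s) s))

(infer-quote 'zzz-unbound)   ; unbound, so the symbol itself comes back

(= zzz-unbound 5)            ; any later global assignment...

(infer-quote 'zzz-unbound)   ; ...silently changes the result to 5
```

The second call is the case the proposed warning is meant to catch: nothing in the calling code changed, only the global environment did.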
> How about warning when you assign to a variable that existing code depends on?
Sometimes you do that on purpose, though. E.g. you define a function that relies on a second function, and define the second function later. (rocketnia gives an example, but I'll proceed with this one anyway.)
> (def mod-expt (a n m)
    (fast-expt-* a n 1 (fn (x y) (mod (* x y) m))))
> (def fast-expt-* (a n one *)
    (xloop (a a n n tt one)
      (if (is n 0)
          tt
          (even n)
          (next (* a a) (/ n 2) tt)
          (next a dec.n (* a tt)))))
Perhaps you could make "def" smart so it wouldn't set off the warning. What if you happened to give a function the same name as an unquoted symbol, though? Maybe you'd be careful not to do that. And what if you used a global variable that you planned to define later? The warning would be inappropriate. Perhaps you'd learn to ignore it. Or perhaps you could name your global variables in a certain way and have the warning thing recognize it. And tell everyone who uses Arc to name global variables the same way, or to at least come up with a naming scheme that can be mechanically understood by the warning thing--and to stick to it. Bah humbug.
> The proposed changes are backwards-compatible with Arc 3.1, since all they attempt to do is provide sensible defaults for things that presently raise errors.
It is nice when you can introduce something without breaking old things. However, I think this thing is bad: it's fragile and shallow, and I think most programmers would just not use it and resent the time it took to understand it.
Imagine if, say, whenever mathematical operations (e.g. sqrt) were called with a list argument, then, instead of throwing an error, the function was instead applied to the car of the list. (Mapping it over the list is more likely to be useful; applying it to the average is also possible.) Or if the expression (a < b) evaluated to (< a b) when "a" evaluated to a number or anything else passable to the < function. Or if, whenever you used the variable "it" inside a then-expression in a call to "if", and "it" was otherwise unbound, it bound "it" to the if-expression (as in "aif")? Or why not all of these at once, and more?
In principle, you might be able to ignore extra little "features" like this. I think it'd annoy me, though--in the case of "sqrt" et al. being applied to lists, I'd probably think about it every time I dealt with math and lists (which I do a lot). It adds one more case to deal with when mentally evaluating any mathematical function call. The best thing that could happen is that I'd never use it, or encounter it in anyone else's code, and my mind would be freed of the impulse to worry. But even if it was never used in correct code, I'd still have to think about it whenever I made a mistake and had to diagnose a problem.
By the way, perhaps you are just looking for something you'd use at the REPL, instead of a new language feature. Maybe a REPL with capabilities like Clisp's:
> (+ 1 achtung)
*** - SYSTEM::READ-EVAL-PRINT: variable ACHTUNG has no value
The following restarts are available:
USE-VALUE :R1 Input a value to be used instead of ACHTUNG.
STORE-VALUE :R2 Input a new value for ACHTUNG.
ABORT :R3 Abort main loop
Break 1 >
It'd make it easy to put in the symbol as the value of the unbound variable. You could tweak it so that would be the default option, and you'd just have to press return again or something; or you could even make it set the unbound variable to the symbol by default. This would be on your customized REPL, of course. :-P
> Not any more insidious than the bugs you can get from an abuse of unhygienic macros or mutation though.
There's an entire style of programming devoted to minimizing and isolating mutation, and languages exist which try to disallow it entirely. A lot of work has gone into implementing hygienic macros. But these things are useful enough that it's difficult to get rid of them entirely (mutation more so; many languages don't have macros at all). This idea, it seems to me, would make every variable reference (in foreign code) and every global assignment (in code you write) a potential headache, and the payoff seems to me almost zero.
Hmm... it seems that each of us has certain conveniences we want and certain sacrifices we're willing to make in order to allow for the conveniences. But which things are the conveniences and which are the sacrifices is different depending on our personal preferences. Taking your example:
> Sometimes you do that on purpose, though. E.g. you define a function that relies on a second function, and define the second function later. (rocketnia gives an example, but I'll proceed with this one anyway.)
> (def mod-expt (a n m)
(fast-expt-* a n 1 (fn (x y) (mod (* x y) m))))
> (def fast-expt-* (a n one *)
I would have always defined fast-expt-* before mod-expt and not the other way around. I think it was aw's essay on linearizing code dependencies that finally persuaded me this is a good guarantee to have. When I'm reading my code, I value the confidence of knowing that everything below line n is unnecessary for getting everything above line n to work.
I can't remember the last time I intentionally wrote code that didn't conform to this principle. And if you're willing to write your code this way, then it eliminates the largest problems you all have identified with quote inference. But you and rocketnia seem to place value on being able to reference functions before they've been defined. So while I would be willing to trade that ability for implicit quotation, it seems you would prefer the reverse.
I do usually put definitions of dependencies first (fast-expt-* is in fact before mod-expt in my Arc file), but sometimes I shuffle my code around, and I'd be annoyed if it complained when I did that. I like having the freedom to do it, though I don't use it all that much. But mutually recursive functions are a good example too--it's impossible to have both functions come after their dependency.
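A minimal sketch of the mutual-recursion case, in stock Arc 3.1. The names `my-even` and `my-odd` are hypothetical, chosen only to avoid clashing with the built-in `even` and `odd`.

```arc
; Whichever definition comes first necessarily mentions a function that
; does not exist yet. Arc 3.1 accepts this because global references
; are only looked up when the function is actually called.
(def my-even (n) (if (is n 0) t   (my-odd  (- n 1))))
(def my-odd  (n) (if (is n 0) nil (my-even (- n 1))))

(my-even 10)   ; returns t
```

Under a rule that warns on every reference to a not-yet-defined name, the first `def` would trigger the warning no matter how the two are ordered.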
> But you and rocketnia seem to place value on being able to reference functions before they've been defined. So while I would be willing to trade that ability for implicit quotation, it seems you would prefer the reverse.
It is true that I'd prefer being able to permute my definitions over having implicit quotation, but that's by far not my only reason for disliking the idea. As I said, it's a fragile, shallow add-on feature that would confuse me (by giving me weird results for erroneous code) and that would make code harder for me to reason about (whenever I see a variable reference, either that refers to something that's been defined, OR it refers to a symbol! and I can establish the latter only by ensuring that it's not defined anywhere).
By the way, why do you think this feature is a good idea? I find two quotes that seem to suggest your reasoning:
1. "My sense is that something like this would rate highly in both complexity of implementation and convenience for programming." How do you get this sense? Do you write a lot of code that uses quoted symbols? Let's take a count in my big fat Arc file:
$ grep -o "'" a | wc -l
$ egrep -o "[^' ()]" a | wc -l
$ egrep -o "[a-zA-Z-+$*/]+" a | wc -l
Even discounting whitespace and parentheses, quotes account for about 0.3% of the characters I type. If we count symbols, about 1.25% of the symbols I use are quoted. Are you working on something that uses a bazillion quoted symbols--symbols which would have to be quoted individually (e.g. the list '(a b c d) just requires one quote)?
2. "I just think we're missing out on such valuable real estate here!" See the paragraphs in my grandparent post beginning with "Imagine if" and "In principle". Tacking things on just because you can isn't a good idea.
> But mutually recursive functions are a good example too--it's impossible to have both functions come after their dependency.
You can do it by extending one of them:
(def foo ())
(def bar () (foo))
(extend foo () t (bar))
I'm not saying this is a better way to define mutually recursive functions. Just pointing out that it's possible.
> add-on feature
Not sure what you mean by "add-on". Is xs.0 list access an add-on? This is whatever that is.
> why do you think this feature is a good idea?
It could unify the "." and "!" ssyntaxes to some degree by allowing table access with h.k instead of h!k in most cases. alists that you presently have to express with '((x 1) (y 2)) could be contracted to (x.1 y.2).
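For reference, a small sketch of how the two ssyntaxes behave in stock Arc 3.1 today. The table `h` and the variable `k` are illustrative names, not anything from the proposal itself.

```arc
(= h (obj x 1 y 2))

h!x               ; expands to (h 'x): the key is quoted, returns 1

(let k 'x
  h.k)            ; expands to (h k): the key is evaluated, also returns 1
```

As I understand the idea, `h.x` with `x` unbound would fall back to the meaning of `h!x`, which is what would let one syntax cover most table lookups.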
I haven't worked out all the useful applications of this yet but I'm finding it interesting and think there's some potential.
> fragile, shallow [...] Tacking things on just because you can isn't a good idea.
I'm generally disappointed by your flamey response to my interest in exploring a core language possibility that could make arc programs shorter. Your ideas and even complete disagreement are very welcome, but your overall tone is insulting. Perhaps I misread you.
> I'm generally disappointed by your flamey response... your overall tone is insulting.
I'm sorry, my intent wasn't to insult you. (I'm glad you explained that, though.) I thought my words were clear. Let me explain:
I called the idea "fragile" because it would be easy to break code that depended on it--just by defining a new variable. You suggested a thing that would warn upon defining a previously-used-as-unquoted-symbol variable, but rocketnia and I brought up cases where the warning would be a false positive. I considered a more sophisticated warning system--one that had some kind of mechanical procedure for guessing whether an unbound variable was supposed to be an unquoted symbol or a function to be defined later--one that required the programmer to follow one naming scheme for unquoted symbols and another for functions to be defined later. My conclusion was that for this system to work, the programmer would basically have to tiptoe around it and be very careful, or else things would break. Hence, I thought the word "fragile" was appropriate, and used it.
I called it "shallow" because the maximum benefit, in the best possible case, is that you don't have to type the ' character most of the time. Contrast this with, say, learning to use lists when you're used to naming every single variable. Not only does it become easier to, say, compute the sum of the cubes of five numbers, but you can write a single function to compute the sum of the cubes of any number of numbers! And then you can make matrices--again, referencing them with a single variable--and writing a single function to deal with n x n matrices for all n! Without lists or other compound data structures, it'd seem really hard and annoying just to deal with 2 x 2 matrices, and 10 x 10 would seem impossible. There are deep and rich benefits to using compound data structures. But the only thing this unquoted symbols idea can possibly be good for is letting you omit the ' character; I therefore thought the word "shallow" was a good descriptor.
Regarding "Tacking things on just because you can isn't a good idea". Your motivation seemed like it might be, at least partially, something like this: "We can get strictly greater functionality out of Arc if we take some things that are errors and assign them a meaning in the language. Therefore, we should do it. Let's start looking for error cases we can replace with functionality!" Your comment "I just think we're missing out on such valuable real estate here!" added weight to this interpretation. And so I attacked it with a reductio ad absurdum, giving several examples of how one might "add functionality" in this way. I hoped to show that the line of reasoning "You get strictly greater functionality, therefore it can't be a bad idea" was wrong. I summed it up by saying "Tacking things on just because you can isn't a good idea."
> I'm not saying this is a better way to define mutually recursive functions. Just pointing out that it's possible.
And there are other ways to do it. But it would be impossible to do it by just writing (def foo () (bar)) and (def bar () (foo)) and putting them in the right order. Hence, this idea would make such programs more complex/verbose. (Eh, perhaps you could set up a warning system and teach it to recognize mutual recursion. I think learning about this would distract the programmer somewhat, which isn't by itself a deal-breaker but is an undesirable aspect. I suspect there are more cases yet to be covered; and you'd still have to order your definitions properly--I don't think a compiler without psychic abilities could always tell what you were going to define later; and even if you were warned when you made a function with a conflicting name, it'd be annoying to have to give your function a new name, or to change the code that used that name.)
> alists that you presently have to express with '((x 1) (y 2)) could be contracted to (x.1 y.2).
Incidentally, I do find it annoying to type out a lot of such things, and I have a routine for dealing with that. Perhaps you'd find it sufficient? So whenever I want to make a big alist, I do something like this:
(tuples 2 '(Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6
Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12))
;instead of '((Jan 1) (Feb 2) ...)
Note that I've redefined "(tuples xs n)" as "(tuples n xs)". It is much better this way. :-} I could also use "pair", I suppose.
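For anyone following along with an unmodified Arc 3.1, here is a short sketch of the same trick using the stock argument order (list first, size second). The variable name `months*` is just an illustration.

```arc
; Build an alist from a flat list of keys and values.
(= months* (pair '(Jan 1 Feb 2 Mar 3)))       ; ((Jan 1) (Feb 2) (Mar 3))

; Stock Arc 3.1 tuples takes the list first and the size second.
(tuples '(Jan 1 Feb 2 Mar 3) 2)               ; same result as pair here

(alref months* 'Feb)                          ; returns 2
```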
Oh, and, by the way, if ssyntax were implemented at the reader level--which I think it should be; I think the current situation is just a hack that will be changed eventually--you could write '((x 1) (y 2)) as '(x.1 y.2). [I see you address this in a sister post.]
And this example suggests that your intent goes beyond merely having unbound symbols evaluate to themselves. In my post I cite at the top of this thread, I addressed problems with trying to have ((x 1) (y 2)) evaluate as '((x 1) (y 2)).
> make arc programs shorter.
That is a worthy goal, one I'd forgotten about. It's good that you brought it up. I suppose the shortness of a program is kind of a good static measure, whereas objections like "It'd confuse me" are usually only temporary, and the programmer gets used to it.
But I do believe that a) using it would create either horrible risks of breaking things or annoying false-positive compiler warnings, b) therefore I'd never use it, so it wouldn't actually make my programs any shorter, and c) it would, inevitably, make debugging harder--instead of UNBOUND-VARIABLE errors I'd get diverse results, depending on precisely what happens when a symbol is put in the place of the unbound variable.
Now, (c) also applies to having xs.0 list/table/string reference work. But (a) and (b) don't. I do use it, and relying on it doesn't cause any problems like the fragility I've described. And the payoffs are pretty good. Many things are significantly shorter--e.g. m.i.j for reaching into a matrix, instead of (aref m i j) or, worse, (vector-ref (vector-ref m i) j). The error-obfuscation objection still applies, but I think the benefits override the objection.
 I was thinking you could define them in the same lexical context:
 In fact, arc3.1 doesn't even provide "hash-ref" or "string-ref" functions, so you kinda have to use (x n). "list-ref" at least could be implemented by the user in terms of car and cdr.
 I was going to add: "And since I don't need to specify the type of the data structure, sometimes I or my code can forget that detail. I could change matrices to be implemented as nested hash tables or vectors, and m.i.j would still be correct." However, this part could be done with a unified "ref" function that reached into all data structures.
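A minimal sketch of both points, assuming stock Arc 3.1. The matrix `m`, the table `tm`, and the function name `deep-ref` are hypothetical; `deep-ref` stands in for the unified "ref" mentioned above and is not an existing Arc function.

```arc
(= m '((1 2) (3 4)))

(with (i 1 j 0)
  m.i.j)                       ; expands to ((m i) j), returns 3

; Hypothetical unified accessor: reach into nested lists, tables or
; strings by calling each structure on the next key in turn.
(def deep-ref (x . keys)
  (if (no keys)
      x
      (apply deep-ref (x (car keys)) (cdr keys))))

(deep-ref m 1 0)               ; returns 3

(= tm (obj a (obj b 42)))
(deep-ref tm 'a 'b)            ; returns 42, same call shape for tables
```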
Is there a general interest in moving ssyntax functionality to the reader?
In the Arc runtime project, that was my assumption behind my choosing my matching library to implement the reader in Arc. The matching library is way more powerful than what would be needed to simply replace the Racket reader as-is; the goal is that when people want to experiment with different kinds of syntaxes or with extending ssyntax to work in more cases it will be easy to do.
Yeah, I like being able to reference a function before it's defined. (Macros annoy me a little for not allowing that.) For me it's a matter of the concept of "definition" being an ambient thing, where something counts as being defined if it's defined anywhere, even later on. It's like how, in a mathematical proof or prose argument, a broad claim may be reduced into a bunch of littler inferences, some handled one-by-one systematically and some left to the reader to fill in. I've read (or tried to read) a bunch of mathematical papers or books that start out building lemma after lemma and climax in a theorem, and those might even be in the majority, but sometimes I have to approach them backwards, and then I have to backtrack to figure out what their terminology means, and it's pretty frustrating.
In education, lots of the time new topics are built upon the foundations the old topics provided, but sometimes they're built upon established motivations and provide all-new foundations, like an analysis course justifying the calculus courses that came before it, or a mechanics course casting Newton's laws in a new light.
For me, the motivation comes pretty early on relative to the implementation. I could decide to put the main control flow algorithm at the top to set the stage for the rest, or I could decide to arrange things according to the order they'll be applied--or in fact I might like having them in the reverse order, the order in which they're needed to get the result from more and more convenient starting positions. That last strategy is probably closest to dependencies-come-first coding, but I don't want to be limited to it, even if I risk choosing a frustratingly haphazard strategy.
Yeah, I agree: I like to see the 'business end' of code up front. aw's article made some good points I'm still mulling over, but upgrading things seems like such a rare event compared to the day-to-day use of code. Especially if I manage to keep up my resolution to never rely on any libraries :)
 http://github.com/awwx/ar now keeps tests in a separate file. Does that weaken the case for defining things before using them? Perhaps you could define your tests bottom-up but write your code top-down, or something.
I still want to try out a test harness that analyzes dependencies and runs tests bottom-up: http://arclanguage.org/item?id=12721. That way you could write your tests in any order and they'd execute in the most convenient order, with test failures at low levels not triggering noisy failures from higher-level code.
Not by design, as it happens. I wrote some new tests for code written in Arc, and stuck them into a separate file because I hadn't gotten around to implementing a mechanism to load Arc code without running the tests.
Though I do view writing dependencies-first as a form of scaffolding. You may need or want scaffolding for safety, or because you're working on a large project, or because you're in the midst of rebuilding.
Does that mean that you always need to use scaffolding when you work on a project? Of course not. If you're getting along fine without scaffolding, then you don't need to worry about it.
Nor, just because you might need scaffolding in the future, does it mean that you have to build it right now. For example, if I had some code that I wanted to rebase to work on top of a different library, and it wasn't in dependency order, and it looked like the rebasing work might be hard, I'd probably put my code into dependency order first to make the task easier. But, if I thought the rebasing was going to be easy, I might not bother. If I ran into trouble, then perhaps I'd backtrack, build my scaffolding, and try again.
> Especially if I manage to keep up my resolution to never rely on any libraries :)
I have effectively the same resolution, but only 'cause of Not Invented Here syndrome. :-p Nah, I use plenty of libraries; they just happen to be the "libraries" that implement Arc. I use all kinds of those. :-p
> http://github.com/awwx/ar now keeps tests in a separate file. Does that weaken the case for defining things before using them?
That file is loaded after the things it depends on, right?
> ...you could write your tests in any order and they'd execute in the most convenient order, with test failures at low levels not triggering noisy failures from higher-level code.
I'm not sure I understand. Do you mean if I define 'foo and then call 'foo in the process of defining 'bar (perhaps because 'foo is a macro), then the error message I get there will be less comprehensible than if I had run a test on 'foo before trying to define 'bar?
In any case, aw's post mostly struck me as a summary of something I'd already figured out but hadn't put into words: If a single program has lots of dependencies to manage, it helps to let the more independent parts of the program bubble together toward the top, and--aw didn't say this--things which bubble to the top are good candidates for skimming off into independent libraries. If you're quick enough to skim them off, the bubbling-to-the-top can happen mentally.
Lathe has been built up this way from the beginning, basically. It's just that the modules are automatically managed, and it acts as a dependency tree with more than one leaf at the "top," rather than something like yrc or Wart with a number on every file.
I'm interested in making a proper unit test system for Lathe, so we may be looking for the same kinds of unit test dependency management, but I'm not sure yet about many things, like whether I want the tests to be inline or not.
Well, Lathe has an examples/ directory, which I've ended up using for unit tests. It's kind of interesting. Lathe's unit tests have become just like its modules over time, except that they print things to tell you about their status. Being a module, an example automatically loads all its dependencies, and you can load it up and play around with the things defined in it at the REPL, which is occasionally useful for debugging the example itself. But it's pretty ad-hoc right now, and I don't, for instance, write applications so that they load examples as they start up, like you might do.
"Do you mean if I define 'foo and then call 'foo in the process of defining 'bar (perhaps because 'foo is a macro), then the error message I get there will be less comprehensible than if I had run a test on 'foo before trying to define 'bar?"
If bar depends on foo (foo can be function or macro), and some tests for foo fail, then it's mostly pointless to run the tests for bar.
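A rough sketch of that behaviour, assuming stock Arc 3.1. Everything here (`deps*`, `tests*`, `run-tests`) is a hypothetical mini-harness, not the one from the linked item, and it assumes the names are already passed in dependency order rather than sorting them itself.

```arc
; bar depends on foo; foo's test is deliberately failing.
(= deps*  (obj bar '(foo))
   tests* (obj foo (fn () (is (+ 1 1) 3))
               bar (fn () t)))

; Run each named test, skipping any whose dependency already failed
; or was skipped. Returns the names that did not pass, in order.
(def run-tests (names)
  (let failed nil
    (each name names
      (if (some [mem _ failed] (deps* name))
          (do (prn name ": skipped") (push name failed))
          (no ((tests* name)))
          (do (prn name ": FAILED") (push name failed))
          (prn name ": ok")))
    (rev failed)))

(run-tests '(foo bar))   ; reports foo as FAILED and bar as skipped
```

The point is only that bar's test never runs, so a single low-level failure produces a single failure report plus a list of skips.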
"That file is loaded _after_ the things it depends on, right?"
Yeah well, you gotta load code before you can run the tests for it :)
My understanding of aw's point was this: if you load your code bottom-up, then you can test things incrementally as you define them, and isolate breakage faster. Defining the tests after their code is irrelevant to the argument because it's hard to imagine an alternative.
If you put your tests in a separate file and run them after all the code has been loaded, you can still order them bottom-up. So to answer my own question, no, keeping the tests in a separate file doesn't weaken aw's argument :)
There is a small difference: if you've loaded only the code up to the point of the definition which is being tested when you run the test (either by writing tests in the same source code file as the definitions, or by using some clever test infrastructure), then you prove that your definitions aren't using anything defined later.
Of course you can probably tell whether code is in prerequisite order just by looking at it, so this may not add much value.
Something I've been thinking about, though I haven't implemented anything yet, is that there's code, and then there's things related to that code such as prerequisites, documents, examples, tests, etc. The usual practice is to stick everything into the source code file: i.e., we start off with some require's or import's to list the prerequisites, doc strings inline with the function definition, and, in my case, tests following the definition because I wanted the tests to run immediately after the definition.
But perhaps it would be better to be able to have things in separate files. I could have a file of tests, and the tests for my definition of "foo" would be marked as tests for "foo".
Then, for example, if I happened to want to run my tests in strict dependency order, I could load my code up to and including my definition of foo, and then run my tests for foo.
One reason to favor the dependencies-first approach in Arc is that we have mutability.
If you're only doing single assignment and side effect-free programming, then your code doesn't have a significant order. But insofar as your program is imperative and performing mutations, the order is significant.
A consequence of this is that if you want to be able to take advantage of imperative features, you're making it harder by ordering your code any other way. I say this because even if your code is purely functional right now, when you try to insert some imperative code later, the order is going to start mattering more. And it's going to start seeming tangled and confused if it doesn't build up in the order of execution (at least it does for me).
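A tiny illustration of the difference, in stock Arc 3.1. The names `rate*`, `scale`, `first-result` and `second-result` are made up for the example.

```arc
(= rate* 2)

; A pure definition: it could sit anywhere in the file.
(def scale (x) (* x rate*))

(= first-result (scale 10))    ; evaluated while rate* is 2

(= rate* 3)                    ; a mutation: from here on, position matters

(= second-result (scale 10))   ; the same call, after the mutation

(list first-result second-result)   ; returns (20 30)
```

Moving the `def` around changes nothing, but swapping either assignment to `rate*` with a call to `scale` changes the results.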
So dependencies-first programming plays especially well with imperative code. I'm also particularly interested in it at this moment because I'm working on a refined auto-quote mechanism that could be hard to take advantage of if you're not programming this way. ;)
> (def maximin (x) (check x number (apply max (map minimax x))))
> (def minimax (x) (check x number (apply min (map maximin x))))
Warning: Oh no, what are you doing?
> The proposed changes are backwards-compatible with Arc 3.1, since all they attempt to do is provide sensible defaults for things that presently raise errors.
They're not compatible with Arc programmers who want to get those errors. Not all errors signify places where extensions can roam free. For instance, extending the 'err function itself would be particularly silly. Where and how clearly to draw the line is a matter of opinion, but I think waterhouse and I are both in the "we want errors" camp here. ^^;
And I'm making Penknife so I can explore things like this in their own quarantined namespaces where they can coexist with my other experiments in the same program. ^^ I'm not sure if my strategy's any good though; I fear it won't put enough pressure on certain experiments to see what their real strengths are. It kinda banks on the idea that people other than me will use it and apply their own pressure to their own experiments.
A strategy I'm not comfortable with? Is this also hedging? :-p Nah, in this case I'm not also taking any other strategy I'd prefer to succeed. Or is that necessary? Maybe I'm hedging, but not diversified. Probably I'm just in a bad place. :-p
"I fear it won't put enough pressure on certain experiments to see what their real strengths are. It kinda banks on the idea that people other than me will use it and apply their own pressure to their own experiments."
Even better if you could get others to put pressure on your experiments.
"Even better if you could get others to put pressure on your experiments."
Well, that's both sides of the issue I'm talking about. I do want people to put pressure on each other's experiments, but I expect to promote that by reducing the difficulty involved in migrating from one experiment to another. Unfortunately, I expect that'll also make it easier for people not to put any more than a certain amount of pressure on any one experiment.
Or are you talking about my experiments in particular? :)
No, you understood me right. If you make it too easy to fragment the language the eyeballs on any subsurface get stretched mighty thin.
This might actually be one reason lisp has fragmented: it's too easy to implement and fork, and so it has forked again and again since the day it was implemented. Suddenly ruby's brittle, hacky, error-prone parser becomes an asset rather than a liability. It's too hard to change (all those cases where you can drop parens or curlies at the end of a function call), and it keeps the language from forking.