Arc Forum
Why S-expressions are hard to read: concrete, objective reasons
4 points by nburns 4551 days ago | 51 comments
I've had great experiences writing Scheme. But the readability problems are real, and it seems like every time I take a break, I have to retrain my eye to start writing Scheme again. I think the problem is fundamental. S-expressions are hard for the eye to parse. Things other programming languages do, like using a variety of grouping symbols, punctuation, syntax highlighting, etc -- these communicate meaning through multiple channels at once, using more of the eye's natural bandwidth. You can take in the meaning a lot faster.

Surprisingly, the principles of readability have been pretty much fully explained. Here is an example of a book on graphic design, which gives the flavor for non-experts (like me):

http://www.peachpit.com/store/non-designers-design-book-9780321534040

I only read the first few chapters, actually, but I learned that readability is not actually all that personal and subjective.

The proposals for improving on S-expressions that I've seen look like they're on the right track. I just wanted to contribute that the problems being solved are not unique, and aren't just about people being unfamiliar with Lisp.



3 points by rocketnia 4541 days ago | link

In the spirit of finding actual "concrete, objective reasons" to discuss, here's another link: http://programmers.stackexchange.com/questions/178307/has-th...

I'm not sure what to take away from it, unfortunately.

-----

1 point by nburns 4528 days ago | link

From that link --

"One could argue that a "usability test" of Fortran II leaded to a complete new language: BASIC, which was designed to be more usable (especially for beginners) than its predecessor."

Funny. I learned BASIC first, and then the second language I learned was C, when I was about 15. I found C much more understandable than BASIC. Things that had seemed mysterious in BASIC suddenly became crystal clear in C. Like the fact that a character is the same as an integer holding its ASCII code, a string is the same thing as an array of characters -- and a pointer is the same thing as a memory address. The puzzle pieces all fit. The thinking seems to go that the more you abstract away the hardware, the more understandable the language will be. But it seems to me that a concrete concept is easier to understand than an abstract one. The hardware gives you a concrete frame of reference.
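The identities described here are C's, but they can be poked at from Python's low-level facilities too. This is a sketch, not C itself: `bytes` indexes to integer character codes, and `ctypes` exposes a pointer as a plain integer address.

```python
import ctypes

# A character is an integer holding its ASCII code:
assert ord("A") == 65

# A (byte) string is an array of those integers:
s = b"cat"
assert s[0] == 99 and list(s) == [99, 97, 116]

# A pointer is a memory address, i.e. an integer, and that integer can
# be used to read the value back:
n = ctypes.c_int(7)
addr = ctypes.cast(ctypes.pointer(n), ctypes.c_void_p).value
assert isinstance(addr, int)
assert ctypes.c_int.from_address(addr).value == 7
```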

-----

1 point by rocketnia 4526 days ago | link

I'm not sure why you bring up C. A look on Wikipedia tells me FORTRAN II was made in 1958, then BASIC in 1964, and then C in 1969-1973. FORTRAN II looks like it was clearly in the same design family as BASIC, with line numbers, multiple GOTO variants, capitalized English words for syntax, and a necessity to munge obscure memory addresses to invoke advanced functionality. :)

Just like you, the first language I learned was BASIC--specifically Applesoft BASIC--and the second was C. I liked C better because its program code was more modular (no need to push around line numbers when combining programs or inserting code) and it had variables that were local to the current procedure call, which made recursion much more useful.

Then I learned JavaScript, and I no longer had to worry about choosing specific numbers for my array sizes or fumbling with pointer operators. Then I was formally taught Java, and that's when I finally felt capable of writing just about any program: Run time allocation was now easy even when the lifetimes didn't fit a recursion hierarchy (i.e. I couldn't stand malloc() before), and the notion of behavior as part of data made it easy to pursue higher-order designs.

---

"But it seems to me that a concrete concept is easier to understand than an abstract one. The hardware gives you a concrete frame of reference."

Although I briefly programmed in C and I've occasionally read machine code, I don't recall ever considering computation hardware to be a very good frame of reference. Mathematics is where I find confidence, and user experience is where I find tangible feedback.

C tells an elaborate story of a world where memory is mostly one-dimensional, mutable, and uniformly cheap to access, where this memory contains all execution state (even the stack), and where execution takes place in a sequence of discrete steps. In our present-day world of cloud and mobile computing (not to mention the future), where we use networks, caches, distributed code sandboxes, predictive branching, cryptography, etc., this metaphor of computation is a joke at spatially large scales and only an approximation at small scales.

I suspect C feels close to the hardware because (1) historically it's been much closer, (2) CPU-scale architecture design has continued to pander to it, and (3) its elaborate story provides programmers with a chance to discover escape hatch after escape hatch, until they're trained to believe that C closes no doors to them.

---

"a character is the same as an integer holding its ASCII code, a string is the same thing as an array of characters"

If the text you need to represent is always unformatted, canned English, then a sequence of ASCII codes might be the only representation you need. However, I hardly consider all text to fall into that category. I think text can be as hard as natural language and typography, which can be as hard as UI.

---

"a pointer is the same thing as a memory address"

Eh? Who says it isn't? :)

-----

2 points by nburns 4524 days ago | link

>> Then I learned JavaScript, and I no longer had to worry about choosing specific numbers for my array sizes or fumbling with pointer operators.

JavaScript is easier in that way because it has garbage collection. Garbage collection makes programming easier. But you can't add garbage collection to every programming language. There are trade-offs when the language abstracts away details like memory management.

>> I suspect C feels close to the hardware because... CPU-scale architecture design has continued to pander to it

I think this gets to the issue of the Von Neumann architecture, and the fact that it isn't the only possible way to design a CPU. I'm not well educated on this subject... However, I don't think you can say that C is the reason for the Von Neumann architecture. I think it's the other way around.

>> "a pointer is the same thing as a memory address"

>> Eh? Who says it isn't? :)

I meant that pointers are the same as integers: integers that hold memory addresses, which is what they are from the standpoint of the CPU.

-----

2 points by Pauan 4528 days ago | link

"The thinking seems to go that the more you abstract away the hardware, the more understandable the language will be. But it seems to me that a concrete concept is easier to understand than an abstract one. The hardware gives you a concrete frame of reference."

The problem is with leaky abstractions:

http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...

The reason C "feels right" is because it matches the hardware, as you said.

But that doesn't mean high-level is bad: you can have high-level hardware. I'm sure on a Lisp machine, Lisp would feel right at home, whereas C wouldn't.

Unfortunately, I predict we'll be stuck with our von Neumann hardware for quite some time, which will severely hinder the progress of our software. No such thing as a free lunch, eh?

-----

3 points by nburns 4527 days ago | link

I didn't say high-level is bad. The biggest counter-example I can think of is pure mathematics. Clearly, mathematics is incredibly useful.

It just seems to me that, more often than not, the orthodoxy with respect to programming is wrong. Or, at best, partly wrong.

Another thing about programming languages designed for beginners is that they tend to be very strongly typed, e.g. BASIC and Pascal. This is either because the designers thought it made things easier, or perhaps because they thought it instills some useful lesson. As far as making things easier, that doesn't seem to be true: think of how many people who would flunk out of CS school are managing to write JavaScript and PHP.

Leaky abstractions is exactly right. At some point, all abstractions seem to break down. This is why I'm not sure the OO principle of encapsulation or information hiding is a good thing. The more you hide the implementation of something, the more you are forcing the use of an abstraction, and the harder it will be to go around the abstraction when, inevitably, you have to.

I don't think there is much you can say about programming languages without qualification. The range of programs you can write is too vast to generalize.

How do you feel about Lisp versus SML? I had to learn SML my second year in college, but at that time in my educational career I wasn't doing much studying, and I only half-learned it. I think some form of Lisp might have made a better introduction to functional programming, because you wouldn't have to deal with SML's type system at the same time.

-----

2 points by Pauan 4527 days ago | link

Yes, I pretty much agree with what you are saying.

---

"How do you feel about Lisp versus SML?"

I've never used SML and have only read a little about it. It looks a lot like Haskell, which I don't have much experience with either.

But from what I've seen, I don't like static type systems. I think making them optional is fantastic, but I don't like having them shoved down your throat.

I think you should be able to write your program in care-free Ruby/Arc style, and then once things have settled down, go back in and add type annotations for speed/safety. But you shouldn't have to use the type system right from the start.

The problem is that a lot of the people who find static type systems useful are also the kind of people who like safety a lot, so they want the type system on all the time. Not just to protect their code, but to prevent other programmers from making mistakes.

I don't like that mindset. Which is why I prefer languages like Ruby and Arc, even with their flaws. I don't think any restriction should be added in to prevent stupid people from doing stupid things. I think the language should only add in restrictions if it helps the smart people to do smart things. And for no other reason.

So as long as the type system helps smart people to do smart things, and doesn't get in the way too much, then sure, I think it's great. But if it gets in the way, or it's done to prevent stupid people from doing stupid things... no thanks.

-----

2 points by Pauan 4527 days ago | link

In that line of reasoning, I've been thinking about adding in a static type checker to Nulan. But I want it to use a blacklist approach rather than a whitelist.

What I mean by that is, if it can be guaranteed at compile time that a program is in error, then it should throw a well formatted and precise error that makes it easy to fix the problem.

But if there's a chance that the program is correct, the type system should allow for it. This is the opposite of the stance in Haskell/SML which says: if it cannot be guaranteed at compile-time that a program is valid, then the program is rejected.

Here's an example of what I'm talking about:

  def foo ->
    bar 10 20

The variable `bar` isn't defined. This can be determined at compile-time. Thus, Nulan throws this error at compile-time:

  NULAN.Error: undefined variable: bar
    bar 10 20  (line 2, column 3)
    ^^^

The error message is precise, and pinpoints the exact source of the error, making it easy to fix. And likewise, this program...

  def foo -> 1
    5
   
  foo 2

...creates a function `foo` that requires that its first argument is the number `1`. It then calls the function with the number `2`. This situation can be determined at compile-time, and so I would like for Nulan to throw this error:

  NULAN.Error: expected 1 but got 2
    foo 2  (line 4, column 5)
        ^

But with this program...

  def foo -> 1
    5
   
  foo a + b

...it might not be possible to determine whether the first argument to `foo` is the number `1` or not. If this were Haskell/SML, it might refuse to run the program. But in Nulan, I would simply defer the check to runtime.

This means that every program that is valid at runtime is also valid according to the type-checker. Thus the type-checker is seen as a useful tool to help catch some errors at compile-time, unlike Haskell/SML which attempt to catch all errors at compile-time.

I think this kind of hybrid system is better than pure dynamic/pure static typing.
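A toy sketch of this "blacklist" approach, written in Python since Nulan itself isn't at hand: the checker flags only errors that are provable at compile time (an undefined variable, a literal that can never match) and silently allows anything uncertain, deferring it to run time. The expression encoding and `env` convention here are invented for illustration.

```python
def check(expr, env):
    """Reject expr only if it is *provably* wrong.

    env maps known names to a required literal argument, or None.
    """
    kind = expr[0]
    if kind == "lit":
        return
    if kind == "var":
        if expr[1] not in env:
            # Provably wrong: an undefined variable can never work.
            raise SyntaxError("undefined variable: " + expr[1])
        return
    if kind == "call":
        _, func, arg = expr
        check(("var", func), env)
        check(arg, env)
        required = env[func]
        if required is not None and arg[0] == "lit" and arg[1] != required:
            # Provably wrong: a literal that can never satisfy foo.
            raise SyntaxError(f"expected {required} but got {arg[1]}")
        # A non-literal argument *might* be right, so it is allowed;
        # a whitelist checker (Haskell/SML-style) would reject it here.

env = {"foo": 1, "a": None, "b": None}
check(("call", "foo", ("lit", 1)), env)    # provably fine
check(("call", "foo", ("var", "a")), env)  # uncertain: deferred to run time
```

Calling `check(("call", "foo", ("lit", 2)), env)` raises the compile-time error, while the `("var", "a")` call passes the checker exactly because it might still be correct.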

-----

1 point by rocketnia 4527 days ago | link

How is this different from preventing stupid people from doing stupid things?

I've said this recently, but I like static typing when it contributes to the essential details of the program, rather than merely being a redundant layer for enhancing confidence in one's own code. Static typing is particularly meaningful at module boundaries, where it lets people establish confidence about each other's programs.

Anyway, enhanced usability is nothing to scoff at either. If you find this kind of static analysis important, I look forward to what you accomplish. :)

-----

1 point by Pauan 4526 days ago | link

"How is this different from preventing stupid people from doing stupid things?"

Because the only difference is whether the error occurs at compile-time or run-time. I'm not adding in additional restrictions to make the type-system happy: if the type system can't understand it, it just defers the checking until run-time.

Thus, the type system takes certain errors that would have happened at run-time, and instead makes them happen at compile-time, which is better because it gives you early error detection. What the type system doesn't do is restrict the programmer in order to make it easier to detect errors at compile-time.

---

"If you find this kind of static analysis important"

Not really, no. Useful? Yeah, a bit. It's nice to have some early detection on errors. But my goals aren't to guarantee things. So whether you have the type-checker on or off just determines when you get the errors. A small bonus, but nothing huge. So I'd be fine with not having any static type checker at all.

-----

1 point by rocketnia 4526 days ago | link

The way I see it, what you're talking about still seems like a way to cater to stupid programming. Truly smart programmers don't generate any errors unless they mean to. ;)

---

"What the type system doesn't do is restrict the programmer in order to make it easier to detect errors at compile-time."

Guarantees don't have to "restrict the programmer." If you take your proposal, but add a type annotation form "(the <type> <term>)" that guarantees it'll reject any program for which the type can't be sufficiently proved at compile time, you've still done nothing but give the programmer more flexibility. (Gradual typing is a good approach to formalizing this sufficiency: http://ecee.colorado.edu/~siek/gradualtyping.html)
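The run-time half of that `(the <type> <term>)` form could look like the Python sketch below. This is a hypothetical helper, not any real gradual-typing library's API: a gradual checker would erase the call when the term's type is proved at compile time, reject the program when a mismatch is proved, and leave everything in between as this run-time check.

```python
def the(tp, term):
    # Run-time fallback for the `(the <type> <term>)` annotation:
    # assert the type, then pass the value through unchanged.
    if not isinstance(term, tp):
        raise TypeError(f"expected {tp.__name__}, got {type(term).__name__}")
    return term

assert the(int, 40) + 2 == 42   # statically obvious: erasable
# the(int, "forty")             # provably wrong: rejectable at compile time
```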

I think restriction comes into play when one programmer decides they'll be better off if they encourage other programmers to follow certain conventions, or if they follow certain conventions on their own without immediate selfish benefit. This happens all the time, and some may call it cargo culting, but I think ultimately it's just called society. :-p

-----

1 point by Pauan 4526 days ago | link

"The way I see it, what you're talking about still seems like a way to cater to stupid programming. Truly smart programmers don't generate any errors unless they mean to. ;)"

Then I'll reclarify and say "any programmer who's just as smart as me", thereby nullifying the argument that a "sufficiently smart programmer would never make the mistake in the first place".

---

"If you take your proposal, but add a type annotation form [...]"

Sure, if it's optional. And not idiomatic to use it all the time. The problem that I see with languages that emphasize static typing is that even if it's technically possible to disable the type checker, it's seen as very bad form, and you'll get lots of bad looks from others.

The idioms and what is seen as "socially acceptable" matter just as much as whether it's "technically possible". If I add in type checking, it'll be in a care-free "sure use it if you want, but you don't have to" kind of way. I've seen very few languages that add in static typing with that kind of flavor to it.

---

"This happens all the time, and some may call it cargo culting, but I think ultimately it's just called society. :-p"

And I am very much against our current society and its ways of doing things, but now we're straying into non-programming areas...

-----

1 point by rocketnia 4526 days ago | link

"And I am very much so against our current society and its ways of doing things, but now we're straying into non-programming areas..."

Yeah, I know, your and my brands of cynicism are very different. :) Unfortunately, I actually consider this one of the most interesting programming topics right now. On the 20th (two days ago) I started thinking of formulating a general-purpose language where the primitives are the claims and communication avenues people share with each other, and the UI tries its best to enable a human to access their space of feedback and freedom in an intuitive way.

I'd like to encourage others to think about how they'd design such a system, but I know this can be a very touchy subject. It's really more philosophy and sociology than programming, and I can claim no expertise. If anyone wants to discuss this, please contact me in private if there's a chance you'll incite hard feelings.

-----

1 point by Pauan 4526 days ago | link

"http://ecee.colorado.edu/~siek/gradualtyping.html"

I like that article, I think that'll be useful to me, thanks.

-----

1 point by nburns 4527 days ago | link

I think that C has a good solution. It will compile any code that's possible to compile, but it will output warnings. I don't think it's necessary to halt compilation just to get the programmer's attention. That's what Java does, and it really annoys me.

If the type-checking is not strictly necessary, maybe you should make it an option, like -Wall.

-----

1 point by Pauan 4526 days ago | link

Yes, absolutely. There are certain errors that absolutely cannot be worked around, like an undefined variable. Those are errors that actually halt the program. But the rest should be optional.

-----

1 point by akkartik 4526 days ago | link

I've learned through bitter experience to treat all C warnings as errors, and more. The presence of a single uninitialized local variable somewhere in your program makes the entire program undefined. Where undefined means "segfaults in an entirely random place."

-----

1 point by nburns 4524 days ago | link

I think that's a good practice in general. But when you are experimenting and debugging, it can be useful to eliminate chunks of code by expedient means, which often generates warnings that you don't care about.

-----

2 points by akkartik 4521 days ago | link

I find programming to fractally involve debugging all the time. So if I allowed warnings when debugging I'd be dead :)

You're right that there are exceptions. I think of warnings as something to indulge in in the short term, the extreme short term; I try very hard not to ever commit a patch that causes warnings. It really isn't that hard in the moment, and the cost rises steeply thereafter.

Incidentally, I'm only this draconian with C/C++. Given their utterly insane notions of undefined behavior I think it behooves us to stay where the light shines brightest. Whether we agree with individual warning types or not, it's easier to just say no.

But with other languages, converting errors to warnings is a good thing in general. Go, for example, goes overboard by not permitting one to define unused variables.

-----

2 points by nburns 4526 days ago | link

"The problem is that a lot of the people who find static type systems useful are also the kind of people who like safety a lot, so they want the type system on all the time. Not just to protect their code, but to prevent other programmers from making mistakes.

I don't like that mindset. Which is why I prefer languages like Ruby and Arc, even with their flaws. I don't think any restriction should be added in to prevent stupid people from doing stupid things. I think the language should only add in restrictions if it helps the smart people to do smart things. And for no other reason."

I could not agree more. I think that the idea of preventing mistakes via restrictive language features is one of the dominant ideas behind object-oriented languages. Consider the keywords "private" and "protected;" they literally have no effect other than to cause compile-time errors. It seems to me, intuitively, that the kinds of mistakes that can be easily caught by the compiler at compile time are in general the kinds of mistakes that are easily caught, period. The kinds of bugs that are hard to find are the ones that happen at runtime and propagate before showing themselves, and they are literally impossible for the compiler to find, because that would require the compiler to solve problems that are provably uncomputable.

At my last job, I was working on fairly complicated web applications in PHP, and even though occasionally I'd run into a bug that could have been prevented by static type-checking, it was always in code that I had just written and wasn't hard to find. By eliminating things like variable declarations, PHP code can be made very succinct, and I think that simplicity and succinctness more than offset the risks that come from a permissive language. But I've never used a language that came with type-checking optional, so I have never made an apples-to-apples comparison.

PHP is actually an interesting example, because in PHP, the rules for variable declarations are basically inverted from normal: you have to declare global variables in every function that you use them (or access them through the $GLOBALS array), but you don't have to declare function-scope variables at all. It makes a lot of sense if you think about it.

-----

1 point by rocketnia 4526 days ago | link

"Consider the keywords "private" and "protected;" they literally have no effect other than to cause compile-time errors."

Would you still consider this semantics restrictive if the default were private scope and a programmer could intentionally expose API functionality using "package," "protected," and "public"?

IMO, anonymous functions make OO-style private scope easy and implicit, without feeling like a limitation on the programmer.

---

"It seems to me, intuitively, that the kinds of mistakes that can be easily caught by the compiler at compile time are in general the kinds of mistakes that are easily caught, period."

I think that's true, yet not as trivial as you suggest. In general, the properties a compiler can verify are those that can be "easily" expressed in mathematics, where "easily" means the proof-theoretical algorithms of finding proofs, verifying proofs, etc. (whatever the compiler needs to do) have reasonable computational complexity. Mathematics as a whole is arbitrarily hard, but I believe human effort has computational complexity limits too, and I see no clear place to draw the line between what computers can verify and what humans can verify. Our type systems and other tech will keep getting better.

---

"The kinds of bugs that are hard to find are the ones that happen at runtime and propagate before showing themselves, and they are literally impossible for the compiler to find, because that would require the compiler to solve problems that are provably uncomputable."

I believe you're assuming a program must run Turing-complete computations at run time. While Turing-completeness is an extremely common feature of programming languages, not all languages encourage it, especially not if their type system is used for theorem proving. From a theorems-as-types point of view, the run time behavior of a mathematical proof is just the comfort in knowing its theorem is provable. :-p If you delay that comfort forever in a nonterminating computation, you're not proving anything.

Functional programming with guaranteed termination is known as total FP. "Epigram [a total FP language] has more static information than we know what to do with." http://strictlypositive.org/publications.html

---

"PHP is actually an interesting example, because in PHP, the rules for variable declarations are basically inverted from normal: you have to declare global variables in every function that you use them (or access them through the $GLOBALS array), but you don't have to declare function-scope variables at all. It makes a lot of sense if you think about it."

I find this annoying. My style of programming isn't absolutely pure functional programming, but it often approximates it. In pure FP, there's no need to have the assignment syntax automatically declare a local variable. That's because there's no assignment syntax! Accordingly, if a variable is used but not defined, it must be captured from a surrounding scope, so it's extraneous to have to declare it as a nonlocal variable.

I understand if PHP's interpreter doesn't have the ability to do static analysis to figure out the free variables of an anonymous function. That's why I would use Pharen, a language that compiles to PHP. (http://arclanguage.org/item?id=16586)

-----

1 point by nburns 4524 days ago | link

>> "Consider the keywords "private" and "protected;" they literally have no effect other than to cause compile-time errors."

>> Would you still consider this semantics restrictive if the default were private scope and a programmer could intentionally expose API functionality using "package," "protected," and "public"?

Actually, in C++ the default for class members is private...

It's simply a true statement that "private" generates no machine code. All it does is cause compilation to fail. Whether or not this is a good thing is a matter of opinion.

>> IMO, anonymous functions make OO-style private scope easy and implicit, without feeling like a limitation on the programmer.

If you're speaking of lexical closures, I think you're right. You don't need to declare variables as private, because you can use the rules of scoping to make them impossible to refer to. You can achieve the same thing with a simpler syntax and more succinct code.
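A minimal Python sketch of that idea: `count` below is unreachable from outside `make_counter`, with no access-modifier keywords involved.

```python
def make_counter():
    count = 0                 # effectively private state
    def bump():
        nonlocal count
        count += 1
        return count
    return bump

counter = make_counter()
assert counter() == 1
assert counter() == 2
# No expression can read `count` directly; the scoping rules enforce the
# encapsulation that `private` would otherwise have to declare.
```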

>> I believe you're assuming a program must run Turing-complete computations at run time. While Turing-completeness is an extremely common feature of programming languages, not all languages encourage it, especially not if their type system is used for theorem proving.

I'm not assuming that programming languages must be Turing complete. It happens to be true of all general-purpose languages that are in common use today.

>> Functional programming with guaranteed termination is known as total FP. "Epigram [a total FP language] has more static information than we know what to do with." http://strictlypositive.org/publications.html

I'll take a look at that language. I think that in 50 years' time, we might all be using non-Turing-complete languages. Making a language Turing complete is the easiest way to ensure that it can solve any problem, but isn't necessarily the best way.

(Technically, a language has to be Turing complete to solve literally any problem, but my hunch is that all problems of practical utility can be solved without resorting to a Turing machine.)

-----

2 points by Pauan 4550 days ago | link

While I agree in general, I've found that S-expressions are easier to read than C syntax when you don't use syntax highlighting because the parens naturally group things together. With syntax highlighting, the situation changes so that syntax-rich languages can be significantly more readable.

I also find Ruby to be extremely readable, with or without syntax highlighting. Of course, I can only imagine how contorted and crazy Ruby's parser must be...

In any case, I tend to agree with Paul Graham on the principle that syntax should just be sugar for S-expressions. For instance, in Nulan:

  -> a b c (a + b * c)
  (&fn (&list a b c) (&add a (&mul b c)))

  foo.bar
  (&get foo "bar")

  [ "foo" 1 "bar" 2 ]
  (&dict "foo" 1 "bar" 2)

  !(foo bar @qux)
  (&not (foo bar (&splice qux)))

All the syntax expands to plain old S-expressions which can be manipulated by macros.

-----

2 points by nburns 4549 days ago | link

It's pretty rare that I don't have syntax highlighting. Another annoying thing about the parens is that there doesn't seem to be a single, canonical way for them to interact with indentation, the way K&R style does for C.

I haven't used ruby, but I think terseness and readability tend to go hand in hand, and ruby looks terse.

Having a sugar -> S-expressions transform is great. It's the best of both worlds.

-----

2 points by Pauan 4549 days ago | link

From what I understand, the "canonical way" is basically "whatever Emacs does", but I don't use Emacs. How I indent code (and how I have seen most other people indent code) is as follows:

2 spaces for indentation. Use indentation after any "block" expression:

  (def foo (x)
    (bar x))

  (let foo 5
    (bar foo))

Non-block expressions should be like so:

  (foo 1 2 3 4)
  
  (foo 1
       2
       3
       4)
  
  (foo 1
    2
    3
    4)

That last one is generally rare and considered somewhat bad form, from what I understand.

"if" indentation is flexible:

  (if 1 2 3)

  (if 1
      2
      3)

  (if 1
        2
      3
        4)

  (if 1  2
      3  4)

  (if 1
    2
    3)

Aside from "if", the Lisp code I've seen is generally very uniform, much more so than the JavaScript code I've seen, which can vary substantially from one person to another. Racket code, for instance, is usually very readable, and the only difference I've noticed is that in some situations they use 1 space rather than 2.

I think the reason Lisp doesn't have a "standard indentation style" is because it doesn't need one. There's no curly braces or distinction between statements and expressions, so people naturally tend to converge on a single style.

And even in the case of the different styles for "if", I at least choose whichever style works the best for me on a case-by-case basis, so that it looks the best. My general heuristic is as follows:

Use this style when there's only 3 arguments:

  (if 1
      2
      3)

Use this style when there's more than 3 arguments:

  (if 1
        2
      3
        4)

Rarely use this style, only when all the arguments are short:

  (if 1  2
      3  4)

For Nulan, I haven't yet worked out the "if" idioms, but thus far I've used this form exclusively:

  (if 1
    2
    3)

-----

1 point by nburns 4549 days ago | link

The nice thing about indenting in C is simply that it falls out naturally, so you don't have to think about it.

I'm not proposing to turn Lisp into C. The formless nature of Lisp is essential to what enables it to do things you can't do in C.

-----

2 points by rocketnia 4549 days ago | link

"The nice thing about indenting in C is simply that it falls out naturally, so you don't have to think about it."

In my experience with Java, Groovy, and JavaScript, the indentation gets just as idiosyncratic when it comes to nested function call expressions, long infix expressions, and method chaining.

I've rarely used C, but does its error code idiom obscure this drawback? I imagine error codes force programmers to put many intermediate results into separate variables, whether they like it that way or not.

-----

1 point by nburns 4549 days ago | link

I'm not sure what you mean by error codes. C doesn't have built-in support for things like exceptions.

Method-chaining doesn't come into play, because, technically, there is no such thing as methods. Just regular functions.

Errors are typically indicated in the return value of some function. Error checking, then, is an if block following the function call.

-----

3 points by Pauan 4548 days ago | link

Yes, and rocketnia's point is that this error code checking forces your code into something like SSA or three-address code:

http://en.wikipedia.org/wiki/Static_single_assignment_form

http://en.wikipedia.org/wiki/Three_address_code

In other words, you usually won't see this in C, right?

  foo(bar(qux(1, 2, 3)))
Instead, you'd write it more like this:

  int x = qux(1, 2, 3);
  if (x == ERR) {
    ...
  }

  x = bar(x);
  if (x == ERR) {
    ...
  }

  x = foo(x);
  if (x == ERR) {
    ...
  }
Because you don't have nested expressions, there's only a single way to indent the code. But in languages like JavaScript, method chaining and nested function calls mean that there's now multiple ways to indent the same code.

Thus, C's syntax isn't actually more uniform than Lisp's; it only seems that way because of C's way of handling errors.
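To make the contrast concrete, here's a minimal sketch of that error-code style in valid C. The functions qux, bar, and foo and the ERR value are made up for illustration; the point is the shape of the checking code, not what the functions do:

```c
#include <assert.h>

#define ERR (-1)

/* Hypothetical functions that signal failure by returning ERR. */
static int qux(int a, int b, int c) { return a + b + c; }
static int bar(int x) { return x * 2; }
static int foo(int x) { return x + 1; }

/* Error codes force each intermediate result into its own
   variable, with a check after every call -- roughly SSA style. */
static int pipeline(int a, int b, int c) {
    int x = qux(a, b, c);
    if (x == ERR) return ERR;

    x = bar(x);
    if (x == ERR) return ERR;

    x = foo(x);
    if (x == ERR) return ERR;

    return x;
}
```

There's essentially one natural way to indent this, whereas the nested foo(bar(qux(...))) form can be broken across lines in several different ways.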

-----

1 point by rocketnia 4548 days ago | link

That's exactly what I mean! Thanks!

Hmm, I'm losing track of my point. I prefer writing code so it models all errors explicitly, so I end up with that kind of verbosity in JavaScript too. I only get code like foo(bar(qux(x))) when I'm using lots of function calls that have no errors (or whose arguments can be errors).

EDIT: My use of the term "error code" came about because of this subthread on LtU: http://lambda-the-ultimate.org/node/4606#comment-73197. I probably didn't give the term enough context to have a clear meaning.

-----

1 point by nburns 4542 days ago | link

You've raised interesting issues. I think this thread has digressed a bit, though.

-----

1 point by nburns 4542 days ago | link

>> C's syntax isn't actually more uniform than Lisp

C's syntax is actually less uniform, isn't it?

C isn't a very sophisticated language, but it tends to be readable -- at least in the sense that you can follow the flow of control. Perhaps things like error handling make it hard to see the forest for the trees.

There may be a fundamental law that the more underpowered the language, the easier it is to read. Sort of like how Dr. Seuss books are more readable than research papers on programming language theory, right?

-----

3 points by rocketnia 4541 days ago | link

"C's syntax is actually less uniform, isn't it?"

I don't think Pauan was referring to C syntax as a whole. In this subthread, I think we've been specifically talking about whether certain languages have a "single, canonical" indentation style that "falls out naturally."

---

"There may be a fundamental law that the more underpowered the language, the easier it is to read. Sort of like how Dr. Seuss books are more readable than research papers on programming languages theory, right?"

In one sense that's true, since it's easy to make naive improvements to one feature while neglecting another. In another sense, a less readable language is always relatively "underpowered" due to its greater difficulty to use (assuming it's a language we use by reading :-p ).

-----

1 point by nburns 4540 days ago | link

I think C is a great language. It maps straightforwardly onto the capabilities of the hardware. What I meant by calling it underpowered is that it doesn't do much to increase your power beyond freeing you from having to write assembly language.

Higher order functions and metaprogramming are the sort of things I associate with a powerful language, like Lisp. But sometimes things get so abstract you can't tell what you're looking at.

As you point out, it's easy to ruin something like a programming language while trying to improve it. (I haven't created a programming language, but I've used bad ones.)

I respect C for not trying to be too smart.

-----

1 point by akkartik 4540 days ago | link

> "There may be a fundamental law that the more underpowered the language, the easier it is to read."

That's a much stronger claim than your original :) Aren't Python, Ruby, and Haskell all high-powered but easy to read?

There's the confounding effect of learnability; lisp gets more readable over time. There's also the confounding effect of density or difficulty. This quote captures both:

"If you're used to reading novels and newspaper articles, your first experience of reading a math paper can be dismaying. It could take half an hour to read a single page. And yet, I am pretty sure that the notation is not the problem, even though it may feel like it is. The math paper is hard to read because the ideas are hard. If you expressed the same ideas in prose (as mathematicians had to do before they evolved succinct notations), they wouldn't be any easier to read, because the paper would grow to the size of a book." (http://www.paulgraham.com/power.html)

-----

2 points by nburns 4535 days ago | link

I seem to have overlooked this post until just now...

Incidentally, I've never written python, ruby, or haskell, except for a tiny amount of python.

Good quote. I've been reading a lot of computer science papers lately, and I tend to skip over the math formulas and focus on the text. This could be because I'm reading them for "fun" and not because I have to for a class, or something. But I have always found it hard to take in dense notation, and preferred a conceptual argument. Maybe it's just that I have a deficiency in that area. But I think prose has the potential to carry powerful insights that are out of the reach of formulas; I suspect the problem is that succinct, brilliant prose is just incredibly hard to write. It's probably easier to just list formulas than to get deep ideas into prose. The reverse is also true, of course. Some ideas can only be expressed properly with notation.

But that probably has nothing to do with programming language syntax per se.

-----

1 point by rocketnia 4535 days ago | link

"I've been reading a lot of computer science papers lately, and I tend to skip over the math formulas and focus on the text."

I do that too. :) Unfortunately, at some point it gets hard to understand the prose without going back to read some of the fine details of the system they're talking about. XD

-----

2 points by nburns 4535 days ago | link

I tend to jump around. The introduction is usually boilerplate for the particular area of research, so it can be skipped. (I wonder how many different papers have told the story of the memory hierarchy and how it's getting more and more important as data gets bigger.) Then I try to figure out if the paper has anything important to say, before working on the math. I figure that sometimes the big idea of the paper is in the math, and other times, the big idea is in the text, and the math is just obligatory. (You can't publish a paper on an algorithm without spelling out the precise bounds on time and space, even if the formula contains 15 terms. Not all 15 terms can be important to the performance, but it certainly is important to put them in the paper.) I guess it depends on the field, but in the data structures papers I like to look at, it usually doesn't take a lot of math notation to express the key innovation.

-----

1 point by akkartik 4535 days ago | link

"But that probably has nothing to do with programming language syntax per se."

Why do you say that? Syntax is notation. Check out this paper by the guy who invented APL, when they gave him his Turing award: http://awards.acm.org/images/awards/140/articles/9147499.pdf

-----

2 points by nburns 4535 days ago | link

I don't disagree with you. I was arguing for the value of well-written prose.

Donald Knuth thinks that programs should be more like prose (http://en.wikipedia.org/wiki/Literate_programming) -- not that I've ever tried, or fully understand, literate programming.

-----

2 points by akkartik 4549 days ago | link

"..a single, canonical way.. like K&R style for C."

I don't follow. Isn't K&R just one possible indentation style for C-like languages? Curly on the same line vs next line, etc.?

-----

1 point by nburns 4549 days ago | link

See above reply to Pauan.

There are other styles, but the differences aren't all that significant. I like K&R.

-----

1 point by akkartik 4550 days ago | link

Thanks for the recommendation; I ordered the book.

I'll plug my whitespace-sensitive toy lisp as well: http://github.com/akkartik/wart#readme

  $ git clone http://github.com/akkartik/wart
  $ cd wart
  $ ./wart   # ./wart test runs tons of unit tests
I'd love to have you try it out and tell us if it requires less retraining.

-----

2 points by nburns 4550 days ago | link

Thanks. Your language looks like one I'd like to try. Getting rid of the outermost set of parens is a good idea; the fewer parens that you have to balance, the better. I haven't tried doing any serious programming in a whitespace-sensitive language. I've always found C very readable -- I think C hit a syntactic sweet spot. A lot of people must agree, judging by how many languages have copied C's style.

PS. The one whitespace-sensitive language I use often is make, if it counts. I like make's syntax.

-----

1 point by akkartik 4540 days ago | link

I've started reading the book, but am having trouble connecting it up to s-expressions. Can you elaborate on examples of traditional lisp code that violate its principles?

-----

3 points by nburns 4527 days ago | link

Sorry, I didn't see this for a while.

I didn't really mean to say that the book was directly applicable to the problem at hand; I was thinking of throwing in a disclaimer, but I guess I didn't. I don't have the book handy, but as I recall, some of the principles were things like grouping similar things together, and making things that are different look as different as possible. The author goes on to describe how these things determine where your eyes land and then move over the page, which in turn determines how easy or hard the page is to make sense of; remarkably, it was true, as you can see from the examples. The insight was striking... I'm not sure how much of my current thinking was in the book and how much is extrapolation; but I think it's possible to bring the mindset of a graphic designer to something like code.

I like how in PHP, variable references always have dollar signs attached. If you bring the mindset of a computer scientist only, you might say, well, the dollar signs are obviously to simplify the tokenization; if you can solve this engineering problem, you should get rid of them. But I think it probably makes a difference in helping your brain's "tokenizer." The connection to the design book might be tenuous, but it seems to me that you could connect it to the idea of making different things look different, or one of the other principles in the book.

A minimalist implementation of lisp has practically no inherent structure, which is both a strength and a weakness. Before I knew anything about how to write lisp programs, I knew that lisp had lots and lots of parentheses. I think that if you ask programmers what they know about lisp, probably 10% have used it, and 90% know that it has lots of parentheses. This is the thing that stands out when you look at it.

I think that book is a good book, just in general, and probably not a waste of time to read. But I admit that a leap of imagination is required to see applications to programming language design. Maybe multiple leaps.

-----

2 points by rocketnia 4526 days ago | link

"I like how in PHP, variable references always have dollar signs attached. If you bring the mindset of a computer scientist only, you might say, well, the dollar signs are obviously to simplify the tokenization; if you can solve this engineering problem, you should get rid of them. But I think it probably makes a difference in helping your brain's "tokenizer.""

PHP took sigils from Perl, where Larry Wall said "Things that are different should look different" and used different sigils for different types. However, sigils in PHP, Perl, and shell syntax all make string interpolation possible, which is more of a tokenization issue. So I think sigils probably exist in PHP for both reasons.

As a side note, Groovy uses $foo or ${foo} to mark string-interpolated expressions, but that $ isn't used in other circumstances. It uses the sigil only where the tokenization really demands it.
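As a small illustration of the tokenization point, using shell (where, as noted above, the same sigil convention applies), the $ is what lets the scanner find variable references inside a double-quoted string:

```shell
name="world"

# The sigil tells the tokenizer where a variable reference
# begins, even in the middle of a string literal.
greeting="hello, $name"

# Braces disambiguate when the name runs into other text.
suffix="${name}s"

echo "$greeting"
echo "$suffix"
```

Without some marker like $, the scanner would have no way to tell which words inside the quotes are variables and which are literal text, so interpolation and sigils tend to come as a package.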

Personally, I consider "different things should look different" to be a good justification for lisp syntax. If macros are really going to give programmers the power to make the language syntax their own, then language-imposed irregularity is at cross purposes with that open-endedness. I think the parentheses help programmers realize that they should look only at the variable names to determine meaningful differences between things.

---

"The author goes on to describe how these things determine where your eyes land and then move over the page, which consequently determines how easy or hard it is to make sense of; remarkably, it was true, as you can see from the examples."

Does the author just describe and demonstrate these things qualitatively, or are there also references to quantitative studies?

This is Arc Forum, where the regulars already know how to read lisp syntax. We may try to look at our culture with fresh eyes, but it isn't easy. And as I hope you can see, I justify lisp syntax for qualitative reasons similar to the ones you use against it, without really disagreeing with you.

Hard numbers are the kind of thing that would clarify which of these arguments should win, so that's what I hope to find in a thread about "concrete, objective reasons." I'm not surprised or offended that this discussion hasn't focused on the numbers, but I am a bit disappointed.

The Wikipedia article on readability (http://en.wikipedia.org/wiki/Readability) does seem to be heavy with empirical justification. ^_^

From the article: "Bonnie Meyer and others tried to use organization as a measure of reading ease. While this did not result in a formula, they showed that people read faster and retain more when the text is organized in topics. She found that a visible plan for presenting content greatly helps readers in assessing a text. A hierarchical plan shows how the parts of the text are related. It also aids the reader in blending new information into existing knowledge structures."

There's one point potentially in favor of object-oriented programming, where some of the inheritance and data-and-behavior bundling may be formally unhelpful, but (I feel) it does establish an informally organized structure over the code. On the other hand, it's almost never exactly the organization I feel is ideal for conveying my own program. Grr, subjectivity again. x_x

-----

2 points by Pauan 4526 days ago | link

"If macros are really going to give programmers the power to make the language syntax their own, then language-imposed irregularity is at cross purposes with that open-endedness. I think the parentheses help programmers realize that they should look only at the variable names to determine meaningful differences between things."

That's the beauty of Nulan's syntax system: it is almost completely customizable, feels very "Lispish", and plays very well with macros. You can now have your short syntax and the benefits of "code is data is code".

Even wart's system works pretty damn well, despite being much less powerful, because it has very simple rules for how to handle things.

I think the key to making syntax play well with Lisp is to make sure the syntax has a certain amount of simplicity and consistency, and is customizable. Basically, the syntax needs to follow the list structure. Beyond that, you can make it as crazy as you want.

-----

1 point by nburns 4524 days ago | link

>> PHP took sigils from Perl, where Larry Wall said "Things that are different should look different"

I didn't know that. Perl seems to me like a monstrosity. But I'm sure there are worthwhile aspects to it.

>> language-imposed irregularity is at cross purposes with that open-endedness

I agree. There seems to be a trade-off between flexibility and readability.

>> Does the author just describe and demonstrate these things qualitatively, or are there also references to quantitative studies?

It's not that kind of a book.

>> I justify lisp syntax for qualitative reasons similar to the ones you use against it

I'm not trying to argue against Lisp. I've proposed that there may be a trade-off between flexibility and readability, and Lisp sits at one end of that spectrum.

>> Bonnie Meyer and others tried to use organization as a measure of reading ease. While this did not result in a formula, they showed that people read faster and retain more when the text is organized in topics.

The kind of readability I'm talking about is at a much smaller scale, like being able to recognize a function call or a loop.

>> There's one point potentially in favor of object-oriented programming, where some of the inheritance and data-and-behavior bundling may be formally unhelpful, but (I feel) it does establish an informally organized structure over the code. On the other hand, it's almost never exactly the organization I feel is ideal for conveying my own program. Grr, subjectivity again. x_x

I agree. I think one of the problems with OOP is that it tries to organize everything into hierarchies. Most things in the real world don't fit into neat hierarchies.

Think of the problem of finding things on the web. Before it morphed into something else, Yahoo was about organizing the web into a hierarchy. Search engines, in contrast, are not hierarchical. It's clear that the search engine approach has won out over the Yahoo approach.

-----

3 points by akkartik 4521 days ago | link

rocketnia: ..Larry Wall said "Things that are different should look different"..

In the beginning, Perl was a simple language. It generalized awk with user-defined functions[1], explicit file handles (STDIN, FILE) and array variables. With the idea that different things should look different, you got sigils to namespace arrays and scalars away from keywords and the standard library[2].

Over time, however, it accumulated more line noise: % for hashes, & and -> for references, constants like $_, positional arguments. When you consider all the primitives and capabilities in the language today, do array/scalar variables really deserve their sigils? It feels a little like Hungarian notation now that the standard library includes so much more than just functions on strings and arrays.

It's all very well to say "different things should look different", but "different" isn't some absolute property. Programming languages are human things, and subject to human limitations. Our visual and frontal cortex gets swamped by too much "difference". So we need to pick carefully what to make salient. And, above all, not paint ourselves into a corner with our early decisions.

rocketnia: "I think the parentheses help programmers realize that they should look only at the variable names to determine meaningful differences between things."

nburns: "The kind of readability I'm talking about is at a much smaller scale, like being able to recognize a function call or a loop."

The advantage of lisp isn't necessarily that it forces you to do without syntax entirely. Perhaps it is that it allows syntax to be tuned by project/codebase, to make decisions in the small based on the characteristics of the whole.

For several months now I've been sporadically mulling this rambling paper: http://davewest.us/pdfs/ducks.pdf. I'm not sure what it's trying to say, but the lesson that sticks in my mind is that of the Tibetan Thangka, arranging a number of stories spatially for maximum memetic power. A program doesn't have to be just a list of functions, macros, symbols and calls. Or indeed a long list of user stories. I'm starting to believe that how things are arranged in the small matters, because it can help me grasp how things are arranged in the large, help the program get into my head, keep me from messing up the architecture with my changes, and thereby preserve the program's coherence over longer periods of time.

[1] nawk -- the first version of awk with user-defined functions -- was released in 1988 (http://en.wikipedia.org/wiki/AWK). Larry Wall had already released Perl in 1987.

[2] The initial release still used $ for hashes, and didn't yet allow & for functions calls (http://groups.google.com/group/comp.sources.unix/tree/browse...).

-----

2 points by nburns 4518 days ago | link

Wow... thanks for the awk history lesson. I didn't learn awk until about 10 years ago, but now it's one of my favorite programming languages. It's a fine example of a domain specific language.

My experience with lisp is basically limited to scheme, which is a major bias, I'm sure. I get the feeling that common lisp is a bit friendlier, and makes things like user-defined macros a bit easier. The scheme documentation on the web is not all that user-friendly, and I'm forced to confess that I've never created a macro.

-----