I don't believe this will catch many of the other common conflicts
Doesn't need to. This is one tool in the toolbag which does nothing but fix one problem. If you have a different problem, pull out a different tool :-)
board or group that manages some form of standards, quality and compliance
Sure, that's what's usually done, but it also has a lot of overhead in bureaucracy, politics, and conformance to poorly designed standards. I'm curious about another approach: what if patches were really easy? What if you could say, "ok, here are the fifteen patches I want in my Arc", and you could just push a button to get them? Now you don't need a board to be filtering patches for you, you can choose which ones you find to be of high quality. Or perhaps, for a particular project, a patch of terrible quality that happens to do something you want for that project :) I'm not sure of all the details yet, but I'm playing around with it.
disincentive for hackers like yourself to contribute valuable code to the community if you end up getting all kinds of requests/baggage from it
Well, my goal is to make it easy for me to share my Arc hacks. I have a whole bunch of hacks to Arc that I've written while doing my own programming, and some of them I've had the time to write up and publish. I'd like to make this easier and faster, so that I can publish more of my hacks without it being any trouble.
And I'd like it to become easy for programmers such as yourself to publish your own hacks, such as the patches you made to the JSON library. That way if someone finds your version to be better, they can use yours. Or if they don't like what you did with symbols or something, they can use mine. Or pull out what changes they like from yours.
Now I'm not so worried about requests and stuff. If you see a way to make one of my hacks better, go ahead and publish your changes. No one is forced to use your version, but if it is better, then people can use it.
Eventually we have a lot of hacks floating around and people will start publishing collections. "Here are the thirty Arc hacks and libraries that I recommend". Each person who publishes a collection will have their own standards for what hacks they include. To get started, you can pick a collection author who has standards you agree with to get your initial collection of Arc hacks and libraries. But the collection still consists of individual hacks that you can pick and choose from, so you're not stuck with the collection compiler's decisions. That's the idea anyway... I still need to write up more about it.
Test cases: (some input cases are just to make sure there are no infinite runs)....
(forstep i 5 10 1 (prn i))
(forstep i 5 10 2 (prn i))
(forstep i 10 5 2 (prn i))
(forstep i 5 10 -2 (prn i))
(forstep i 0 1 0 (prn i))
(forstep i 1 0 0 (prn i))
(forstep i -2 2 1 (prn i))
(forstep i 2 -2 1 (prn i))
(forstep i -4 2 2 (prn i))
(forstep i -4 2 -2 (prn i))
(forstep i 4 -2 -2 (prn i))
(forstep i 4 -2 2 (prn i))
That being said, I don't understand why changing 'for' this way would create problems for you, or why you want
(for i 1 0 (prn i)) to return nil.
also, I still think it would be nice for arc to support keyword arguments (hint, hint pg - though he's probably not reading this), then I could have 'step' as an optional arg with a default. i.e....
(mac for (v start end (o step 1) . body)...
could allow:
(for i 20 5 (prn i)) and also something like: (for i 20 5 step:2 (prn i))
A much better alternate version too. The redundant code in my hacked version was obvious and ugly, but I struggled reducing it to a simpler form... Thanks!
I made a few slight adjustments though, as it failed 2 of my test cases (however unlikely they are to occur, infinite runs scare me).
That being said, I don't understand why changing 'for' this way would create problems for you
Perhaps I'm doing i from 1 to the number of things I have, and sometimes I have zero things.
Actually though it was a bit silly of me to suggest "forstep" when your version of "for" was doing what you wanted it to do. It's not a version that I'd want to use in my code, but that's no reason for you not to have your version.
So the form is being submitted with an HTTP POST method. With POST, the values are passed in the body of the request, instead of being added to the URL using the ?a=1&b=2 syntax. When the POST comes into the Arc server, it pulls the arguments from the request body, ignoring the URL.
What you want is to include a hidden input value in your form:
(gentag input type 'hidden name 'parm value 324342)
That way you'll get the parm value passed along in the form values.
Your use of "url" as a global variable is a bad idea if two users happened to request "test" at the same time. By the time "(do-it url)" is called, "url" might have been set to the other value by the other invocation of "test".
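To make the race concrete, here is a small sketch in Python rather than Arc, so it can be run on its own. Every name in it (shared_url, make_handler_global, make_handler_local) is made up for illustration and is not part of the Arc server. It just contrasts a handler that goes through a global with one that captures a local variable in its closure:

```python
# Hypothetical stand-ins for the "test" handler discussed above; none of
# these names come from the Arc server itself.

shared_url = None  # the risky pattern: one global shared by every request


def make_handler_global(requested_url):
    """Store the URL in a global and read it back when the callback runs."""
    global shared_url
    shared_url = requested_url
    return lambda: shared_url  # looks up the global at call time


def make_handler_local(requested_url):
    """Keep the URL in a local variable captured by the closure."""
    url = requested_url
    return lambda: url  # each handler sees its own value


# Two users request "test" before either callback has run.
first_global = make_handler_global("/a")
second_global = make_handler_global("/b")
first_local = make_handler_local("/a")
second_local = make_handler_local("/b")

print(first_global(), second_global())  # /b /b
print(first_local(), second_local())    # /a /b
```

The first user's callback ends up with the second user's URL in the global version, while the closure version keeps them apart. The same idea should carry over to Arc by binding the value with let inside the handler instead of assigning a global.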
it would be great to supply tokeniser/parser hooks instead
I used to think that, but these days I'm coming around to applying the "Arc philosophy" to libraries: every implementation should contain the minimum amount of code needed to implement just what it does.
So we have one commit in git which is parser.arc; and another commit in git which is parser + table reader.
Now, this is a weird thing to do. What everyone does is write software so that it can be configured and extended without having to change the source. Like everyone else, I've done this myself: just look at my "extend" function for example.
Yet Arc is different. It doesn't try to let you configure it or extend it. It just does what it does in the simplest way. And that simplicity makes it easy to hack Arc to do what you want. The usual way, of having configurations and hooks and extension methods, often fails: whatever I want to do turns out not to be supported by what the original author thought of. So I end up having to hack the source anyway, which is then made difficult, ironically, by the added complexity of all the configuration and extension code!
So to make the parser easily extensible, rather than adding hooks, make sure that you've removed any code duplication.
Are all those functions defined in arc-tokeniser just to keep them out of the top level namespace? Or is something else going on?
The "add hacks not options" idea is intriguing, and I adore the simplicity it both requires and begets. I'm not sure how to control damage from conflicting patches though. And I'd much rather share hacks via libraries - it's so easy to just redefine any core arc function, and your "extend" function makes that even easier and safer. I suppose there are some kinds of hacks that are harder to share this way though - especially if you're hacking ac.scm.
Are all those functions defined in arc-tokeniser just to keep them out of the top level namespace?
Yes. I've noticed that most arc code is not like this, so maybe arc-tokeniser is really bad style. What's the correct way to deal with namespace clashes? I'd like to avoid having a whole bunch of functions called arc-tokeniser-<something>.
Although upon reflection, it's true that popping these kinds of functions up into the toplevel makes them more readily hackable (by simple substitution, or 'extend-ing). I wonder if there's a way to do that without spawning a crowd of verbosely-named little functions?
I don't have answers to any of these questions - my brain is still wired mostly in java, and in the process of re-wiring I can't write well in any language ...
I'm not sure how to control damage from conflicting patches though.
We publish a commit which is a merge of the two patches. You can see an example in my arc2.testify-table0+testify-iso0 commit, which is a merge of my arc2.testify-table0 and my arc2.testify-iso0 patches. Here's the original testify from arc2:
(def testify (x)
(if (isa x 'fn) x [is _ x]))
My arc2.testify-table0 patch makes testify treat tables like it does functions:
(def testify (x)
(if (in (type x) 'fn 'table) x [is _ x]))
My arc2.testify-iso0 patch has testify use "iso" instead of "is":
(def testify (x)
(if (isa x 'fn) x [iso _ x]))
And the merge of the two:
(def testify (x)
- (if (isa x 'fn) x [iso _ x]))
- (if (in (type x) 'fn 'table) x [is _ x]))
++ (if (in (type x) 'fn 'table) x [iso _ x]))
The first "-" line shows the arc2.testify-iso0 version, the second "-" line shows the arc2.testify-table0 version, and the "++" line shows how I merged the two patches. (You can get this output by using the -c option to git-log: "git log -p -c")
and your "extend" function makes that even easier and safer
Right, I think that functions like "extend" would arise from seeing patterns in code and abstracting them out, making the code more succinct. It was more my thought process I was commenting on: I had thought "patches are hard to deal with so I'll make functions like extend". Now I'm thinking, "what if patches were easy?".
What's the correct way to deal with namespace clashes?
I've been wondering about that. One possibility I've wondered about is to have an abbreviation macro:
So the actual name of the function would be arc-tokeniser-make-token, but anyone can refer to it by make-token by using the abbreviation macro if that would be convenient for them.
This is even more speculative, but I've also wondered if maybe it needn't be your job as a library author to worry about namespace clashes. What if the user of the library said "oh, look, make-token is clashing", and could easily load it with particular symbols renamed...
I think the first step is not to worry about namespace clashes. Instead think, "oh, namespace clashes are easy to deal with, so if they happen no problem". Otherwise you end up doing work (and maybe making the code more complicated or harder to extend) to avoid a namespace clash that may never happen.
Where w/locals expands into a big (withs ...) form with key-value pairs taken from the list.
Use an alternative def to put function definitions in such a table instead of the global namespace:
(def-in tokeniser-helpers make-token (kind tok start length) ...)
This way, arc-tokeniser would be less horribly big, easier to hack, and non-namespace-polluting. As a kind of plugin system, it doesn't seem terribly obtrusive, does it? The disadvantage is that you need to search further for the definitions of your helper functions.
Is this something like how your prefix-abbrev macro would work?
I think it's not just a question of worrying about clashes that may never happen - it also feels inelegant, dirty even, to have globally-accessible functions that are relevant only in a very specific context. Otherwise I would completely agree - it would be a kind of premature optimisation to worry about them.
it also feels inelegant, dirty even, to have globally-accessible functions that are relevant only in a very specific context
Yes, but how do you know that your Arc parser functions are only going to be relevant in the code you've written? Perhaps someday I'll be writing my own parser, or something completely different, and I'll find it useful to use one of your functions in a way that you didn't think of!
I suggest trying to write your code in the simplest possible way. For example, in your original:
(def arc-tokeniser (char-stream)
(withs (make-token (fn (kind tok start length)
(list kind tok start (+ start length)))
"make-token" does not use "char-stream", so we can make this simpler:
(def make-token (kind tok start length)
(list kind tok start (+ start length)))
Now I can look at "make-token" in isolation. I can easily understand it. I know that all that other stuff in arc-tokeniser isn't affecting it in some way. And, if I'm writing my own parser and I want to use "make-token", I can do so easily.
And sure, down the road there may be some other library that also defines "make-token". At that point, it will be easy to make a change so that they work together. Perhaps by renaming one or the other, or by doing something more complicated. The advantage of waiting is that then we'll know which functions actually conflict, instead of going to a lot of work now to avoid any possibility of future conflict, the majority of which may never happen.
Now of course I'm not saying to pull every single function out of arc-tokeniser. You have some functions that depend on char-stream and token and states and so on, and it makes perfect sense to leave those inside arc-tokeniser. My suggestion is to write the simplest possible parser.arc library today, explicitly not worrying about future namespace clashes. It is better to deal with them in the future, when they actually happen.
Hmmmmmm... I had thought that r6rs compatibility mode wasn't very useful, if it just made car a synonym for mcar etc. But it does more, for example unlike in regular plt-4 where in (lambda args ...) args is an immutable list, in r6rs mode:
#!r6rs
(import (rnrs) (rnrs mutable-pairs (6)))
(define x ((lambda a a) 'a 'b 'c))
(set-car! (cdr x) 'd)
(write x)
(newline)
$ plt-4.1.5/bin/mzscheme -t a.scm
(a d c)
This looks like it could solve a lot of problems with a port to plt-4, since otherwise we'd need to be rewriting the Arc compiler to change the expansion of (fn args ...) etc.
Played around with it a bit more, r6rs appears problematic as apparently ++ is not a legal symbol in r6rs (!)
I took a look at PLT's implementation of lambda for r6rs/r5rs (it's in collects/r5rs/main.ss), and they just simply convert to a mutable list if the lambda has a rest parameter:
(define-syntax (r5rs:lambda stx)
  ;; Convert rest-arg list to mlist, and use r5rs:body:
  (syntax-case stx ()
    [(_ (id ...) . body)
     (syntax/loc stx (#%plain-lambda (id ...) (r5rs:body . body)))]
    [(_ (id ... . rest) . body)
     (syntax/loc stx
       (#%plain-lambda (id ... . rest)
         (let ([rest (list->mlist rest)])
           (r5rs:body . body))))]))
(the list->mlist is the part I'm looking at)
So having for example (fn args ...) compile to an (arc-lambda args ...) which does the same thing might be simpler than trying to get Arc to compile and run in the whole complicated r6rs environment.
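The same trick can be sketched in Python, where rest arguments arrive as an immutable tuple, loosely like the immutable rest list in plt-4. This is only an analogy, and rest_as_list and replace_second are names invented for the example:

```python
def rest_as_list(fn):
    """Wrap fn so that its rest arguments arrive as one mutable list."""
    def wrapper(*args):
        # args is an immutable tuple here; copy it into a list first,
        # which is the role list->mlist plays in the PLT macro above.
        return fn(list(args))
    return wrapper


@rest_as_list
def replace_second(args):
    args[1] = "d"  # would raise TypeError if args were still a tuple
    return args


print(replace_second("a", "b", "c"))  # ['a', 'd', 'c']
```

The conversion happens once at the function boundary, so the body never has to know that the underlying rest list was immutable.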
I participated a bit in the discussion while r6rs was being created, but it quickly became apparent to me that the goals that the editors were striving for weren't things that I personally cared about.
So, anyway, due to the "compromise", can we get mutable pairs if we run MzScheme in r6rs compatibility mode?
a.scm:
#!r6rs
(import (rnrs) (rnrs mutable-pairs (6)))
(define x (cons 'a 'b))
(set-car! x 'c)
(write x)
(newline)
$ ./mzscheme a.scm
(c . b)
So yes, though it turns out that in r6rs compatibility mode cons is really just mcons, so we don't actually gain anything.
It would be a mistake to take the atomic out of expansions of =.
Operators that modify things have to be atomic in the stretch between reading the value and writing the modified value.
Suppose x is initially 0 and you have two threads both evaluating (++ x). If the ++s
aren't atomic, you could end up with this sequence:
thread 1 reads the current value of x, 0
thread 2 runs in its entirety, leaving x = 1
thread 1 resumes, setting x to 1 + the 0 read earlier, or 1
You could similarly have two threads evaluating pushes onto the same list that
ended up losing one of the pushed values.
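The lost update can be replayed deterministically. The sketch below is in Python, not Arc, and uses a generator to model a (++ x) that can be paused between its read and its write. The names increment_steps and run_to_completion are made up for the example, and no real threads are involved:

```python
def increment_steps(state):
    """One non-atomic (++ x): read, pause, then write back read + 1."""
    seen = state["x"]      # read the current value
    yield                  # another thread may run here
    state["x"] = seen + 1  # write, based on the possibly stale read


def run_to_completion(steps):
    """Run every remaining step without letting anything else in."""
    for _ in steps:
        pass


# Interleaved: thread 1 reads, thread 2 runs entirely, thread 1 resumes.
interleaved = {"x": 0}
thread1 = increment_steps(interleaved)
next(thread1)                                    # thread 1 reads 0
run_to_completion(increment_steps(interleaved))  # thread 2 leaves x = 1
run_to_completion(thread1)                       # thread 1 writes 0 + 1

# Atomic: each increment finishes before the next one starts.
atomic = {"x": 0}
run_to_completion(increment_steps(atomic))
run_to_completion(increment_steps(atomic))

print(interleaved["x"], atomic["x"])  # 1 2
```

Two increments ran in both cases, but the interleaved run ends at 1 because the second write was computed from a stale read.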
I don't think you should expect to be able to throw control out of an atomic expression-- at least not short of some abort-as-disaster operator. That's the definition of atomicity: it all has to complete.
It would be a mistake to take the atomic out of expansions of =
Right, but it's also a mistake to leave them in. Too much locking is as bad as too little; e.g. (obj a (readfile "foo")) will hang my web server if reading foo happens to take a long time.
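Here is the shape of the problem as a Python sketch, under the assumption that the assignment holds a single global lock for the whole expression. The lock, the table, and slow_read are all hypothetical stand-ins and not how Arc implements atomic:

```python
import threading

table_lock = threading.Lock()  # stand-in for the lock behind atomic
shared_table = {}


def slow_read():
    """Hypothetical stand-in for (readfile "foo") on a slow disk."""
    return "contents of foo"


def store_holding_lock(key):
    # The whole expression runs under the lock, so every other thread
    # that needs table_lock waits for the read to finish as well.
    with table_lock:
        shared_table[key] = slow_read()


def store_with_short_lock(key):
    value = slow_read()  # slow work happens with no lock held
    with table_lock:     # the lock covers only the shared update
        shared_table[key] = value


store_holding_lock("a")
store_with_short_lock("b")
print(shared_table)
```

Both versions store the same thing; the difference is only in how long other threads are kept waiting.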
Here's my current chain of reasoning around locking...
Arc's approach to programming is exploratory, building larger programs out of small, composable parts.
Locking is a problem with exploratory programming because buggy locking code usually works most of the time, unlike bugs in functional code which are usually visible. With exploratory programming, you try things and see if they work. Sure, with functional code there are the corner cases that you miss and the occasional incorrect algorithm that happens to return the right value for the input you give it, but most of the time with exploratory programming you try things and you get to see that they don't work. But with locking, you throw together some locking code and try it out, and hey, your program runs and doesn't hang and gives the right answer. The bug, if there is one, only bites once in a blue moon when the different threads happen to hit the code in exactly the wrong way.
There's an interesting social aspect to this as well. I've noticed that if I tell someone about a bug in their code, it's less likely to get fixed if it's a threading bug. If their code returns the wrong value for an input, they say "oh my gosh!" and fix it right away :-). But if it's a threading bug, well, yes, it looks like a bug, but the program appears to run ok anyway, so there's little urgency, and how do we know if we've really fixed it or haven't added another threading problem?
The social aspect goes both ways. One of the things I find so delightful about Arc is that because of your work to write concisely, I can look at a function and say "oh, there's a bug". Or, at least that the function isn't doing what I want it to do. Which I can't do with most code, not as easily, because it is surrounded with so much cruft. But I don't have the same feeling of clarity when I look at Arc's locking code. I can read through the code and perhaps pick up on a locking bug or two (e.g. atomic-invoke), but overall, is everything locked that needs to be? Anything locked that shouldn't be? I can't tell. This part of Arc feels like regular software to me... complex enough so that I imagine there are probably bugs, and I don't expect to be able to get them all.
Composability with locking is a problem too. You have a couple of perfectly good expressions (obj a 1) and (readfile "foo") and you put them together and they break.
My next thought in the chain is, so why use threading anyway? MzScheme only runs on one CPU, so what threading gives us is a) not having to call yield in a long CPU intensive calculation and b) having our program execution randomized for us so that our program doesn't return the same output for the same inputs... unless we very carefully add the right locking in the right places.
So my current inclination is to rewrite the web server to a single-threaded, event-driven model.
Or maybe something like Erlang, where you're not stuck with a single thread, but you're also not trying to deal with sharing modifiable data between threads either.
Hmm, a different way of looking at the issue just occurred to me: pushing atomic inside of expansions of = may make Arc plus News shorter but Arc plus my program longer.
If ++ didn't do locking, then in places where you were using x from multiple threads you'd need to say (atomic (++ x)). Pushing atomic inside ++ makes this shorter because now you can just say (++ x). But pushing atomic into expansions of = also means that locking occurs at other times, and so I can't use it in places where I'm doing things like throw and readfile that break with locking, which makes my program longer.
Which leads to a fascinating idea, if we get a large enough body of open source Arc code that we can start optimizing for code size globally... :)
That would increase the conceptual load of programming in Arc a lot. It would make people have to think about the expansions of operators like ++ to know when to wrap things in atomic and when not to. You need to be able to treat built-in operators like that as black boxes. Once you start thinking about macroexpansions, it's as if you had to write them.
Hmm, well, I can only speak with any knowledge about my own conceptual load... I expect with your background (professor, Lisp book author, tutorial writer, mentor, etc.) you have a much better idea of what other programmers would find easy or difficult.
I know that some things should be atomic, such as accessing shared mutable data structures, and some things that I need to avoid being atomic, such as doing I/O.
I find Arc's making some operations atomic for me doesn't help me all that much, because without knowing the details of the macroexpansion, I don't know if everything that needs to be atomic has been made so. And I find it unhelpful in other cases, when I need an operation to not be atomic, and so I need to look at the macroexpansion of = to find out if that particular expansion is doing something that I need it to not do, or if it's ok.
On the other hand, I have no alternative to offer yet ^_^. I surmise that if I factored Arc + News + my code, perhaps I might come up with a useful suggestion to offer, and if I do, I'll certainly post about it!