This might be a better way of doing types than annotate. It's quite similar to the way Mathematica works, where all objects are expressions of the form <type a b c>. Even lists are like this: <list 1 2 3>. If the 'type' is a function then the form gets evaluated, just like a list in Lisp.
Well, this approach basically is the 'annotate approach, except instead of having a separate basic data type - a "tagged" object - you just use cons cells, where the car is the type and the cdr the rep.
The problem is that unless you make lists also use this representation, as Mathematica does, you can't distinguish between lists and objects of other types - a major problem, especially if you use anarki stuff like 'defm and 'defcall. Moreover, objects of this form do not evaluate to themselves, but to structurally equivalent objects. This is usually okay, but if you're using shared, mutable data structures, it's kind of problematic; moreover, it can introduce a lot of unnecessary consing. This isn't likely to be a major problem (how often are objects subject to excess evaluation?), but if it does crop up, it could make for some nasty bugs.
> Well, this approach basically is the 'annotate approach, except instead of having a separate basic data type - a "tagged" object - you just use cons cells, where the car is the type and the cdr the rep.
Except for the case where (isa R T), where (annotate T R) returns R directly:
This isn't a fundamental difference. Just as 'annotate could as easily create a new doubly-wrapped object, 'annotate-cons can as easily not cons when the type is the same.
(def annotate-cons (typ val)
  (if (isa val typ) val (cons typ val)))
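To make the idea concrete, here is a minimal Python sketch of the cons-as-tag scheme (tuples standing in for cons cells); the names `type_of`, `annotate_cons`, and `rep` are just illustrative, not part of any Arc implementation:

```python
def type_of(x):
    """Return the tag of a tagged pair, or the host type's name otherwise.
    Note: a plain 2-element tuple is indistinguishable from a tagged
    object here -- exactly the list-vs-object ambiguity noted above."""
    if isinstance(x, tuple) and len(x) == 2:
        return x[0]
    return type(x).__name__

def annotate_cons(typ, val):
    """Tag val with typ, unless it already has that type (no re-wrapping)."""
    if type_of(val) == typ:
        return val
    return (typ, val)

def rep(x):
    """Strip one layer of tagging."""
    return x[1] if isinstance(x, tuple) and len(x) == 2 else x
```

So `annotate_cons('queue', q)` on an already-tagged `q` hands back the very same object, matching the behaviour of 'annotate.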
I think I'll keep your interfaces idea. It's much more elegant than a list of symbols (although it isn't necessarily much different underneath).
I don't really understand the problem with 'in-package. Is this a SNAP-specific problem or does it affect Arc generally? Won't things work the same way as in CL?
Some other thoughts I had about this which may be useful: modules are usually kept in a file, so a 'load-in-package function which takes a package argument might be useful
Also, I thought it would be good to have a read-macro to switch packages. I'll reuse #: as it's not needed in my system:
#:foo (some expressions)
This would read the expressions in package 'foo before executing them. That might solve your problem as the package is passed to the reader explicitly.
I like the idea of symbols being read in without a package, but then getting a package at eval time. One way to implement this may be to store all the symbols in a special package when they are read, then eval can move these to a new package when it evaluates them. This makes packages very dynamic.
One other thought I've had: package names should be strings. Otherwise, 'foo::bar actually becomes 'foo::foo::bar, which is really 'foo::foo::foo...::bar etc. That's just a bit crazy, so I think strings should be used to name packages instead. Alternatively, package names could themselves be interned in a special package that's treated differently. Seeing as packages are just mappings from strings to symbols, that doesn't make much difference either way.
> It's much more elegant than a list of symbols (although it isn't necessarily much different underneath).
Which is the point, of course ^^
The other point is disciplining package makers to make package interfaces constant even as newer versions of the package are made. This helps preserve backward compatibility. In fact, if the ac.scm and arc.arc functions are kept in their own package, we can even allow effective backward compatibility of much of Arc by separating them by version, i.e.
(using arc v3)
(using arc v4)
(using arc v5)
> I don't really understand the problem with 'in-package.
; tell the reader that package 'foo has a symbol 'in-package
(= foo::in-package t)
; enter package foo
(in-package foo)
; now: does the reader parse this as (in-package ...) or (foo::in-package ...)
(in-package bar)
> Is this a SNAP-specific problem or does it affect Arc generally?
It's somewhat SNAP-specific, since we cannot have a stateful, shared reader, but I suspect that any Arc implementation that supports concurrency of any form will have similar problems with having readers keep state across invocations. The alternative would be having a monadic reader.
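A rough Python sketch of what that monadic-reader alternative could look like: instead of the reader mutating a shared current-package variable, each call takes the package state and hands back a (result, new-state) pair. The function names here are hypothetical:

```python
def qualify(form, pkg):
    """Prefix every unqualified symbol (string) with the current package."""
    if isinstance(form, list):
        return [qualify(f, pkg) for f in form]
    if isinstance(form, str) and '::' not in form:
        return pkg + '::' + form
    return form

def read_form(form, pkg):
    """Process one parsed form, threading the package state explicitly
    instead of keeping it in shared mutable state."""
    if isinstance(form, list) and form[:1] == ['in-package']:
        return None, form[1]        # no output form; new current package
    return qualify(form, pkg), pkg  # package state unchanged
```

Since no state is shared, any number of threads can read concurrently, each threading its own package through the calls.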
> Won't things work the same way as in CL?
Not sure: I never grokked anything except the basics of CL packages.
> #:foo (some expressions)
How about in a module file? It might get inconvenient to have to keep typing #:foo for each expression I want to invoke in the foo package, which means we really should think deeply about how in-package should be properly implemented.
> One other thought I've had: package names should be strings. Otherwise, 'foo::bar actually becomes 'foo::foo::bar, which is really 'foo::foo::foo...::bar etc.
If we don't allow packages to have sub-packages, then a name that is at all qualified will quite simply directly belong to that package, i.e. foo::bar is always foo::bar, as long as :: exists in the symbol.
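As a tiny illustration of that "no sub-packages" rule, a parser for printed symbols only ever needs one split (the helper name is made up):

```python
def parse_symbol(text):
    """Split a printed symbol into (package, name).
    With no sub-packages, 'foo::bar' simply names 'bar' in package
    'foo' -- it never re-expands into 'foo::foo::bar'."""
    if '::' in text:
        pkg, name = text.split('::', 1)
        return pkg, name
    return None, text  # unpackaged symbol
```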
Since #:foo takes a whole block of expressions, you only have to type it once (although you would at the top level!). With this syntax, #:foo x could expand to something like (read-with-package "foo" x), so you wouldn't need a stateful read. Well, unless you called 'read within the file. So I guess you do. :)
> If we don't allow packages to have sub-packages, then a name that is at all qualified will quite simply directly belong to that package
Are 'foo:quux and 'baz::quux the same package? If so, it's a bit strange that you can refer to the same thing by different symbols. That's why I think strings are better. Not sure what I think about nested packages. I'll have to ponder on that.
(def load (file (o hook))
  " Reads the expressions in `file' and evaluates them. Read expressions
    may be preprocessed by `hook'.
    See also [[require]]. "
  (push current-load-file* load-file-stack*)
  (= current-load-file* file)
  (or= hook idfn)
  (after
    (w/infile f file
      (whilet e (read f)
        (eval (hook e))))
    (do (= current-load-file* (pop load-file-stack*)) nil)))
What magic needs to be inserted here to make 'load use the correct 'read, keeping in mind that even plain Arc supports threads and those threads share global variables?
It still looks like a stateful 'read to me, and I don't want a stateful 'read at all, because a file might want to directly use 'read:
This is one good reason to try to keep 'read stupid: one of Arc's idioms is to simply dump data as s-expressions and read them in later as list structures. If 'read is too smart, this idiom might have some subtle gotchas.
For that matter I'd prefer to keep the package definitions in the file itself, rather than have to remember to put the file in a package:
$ cat mine.arc
(in-package mine)
(def mine ()
  (prn "this is my mine!!"))
- (def load (file (o hook))
+ (def load-in-package (package file (o hook))
- (whilet e (read f)
+ (whilet e (read-in-package package f)
That's the best I can do. I think that if packages are involved then read is inherently stateful, so even threads are a problem. I have no idea how CL implementations deal with threads and *package*, because the spec makes no account for it. :(
'eval-cxt objects are callable, and their call is equivalent to:
(let ob (eval-cxt)
  (ob x))
==>
(let ob (cxt)
  (eval:ob x))
The implementation is free to define 'cxt and/or 'eval-cxt objects in terms of Arc axioms or by adding them as implementation-specific axioms.
The context object accepts a plain read expression (with unpackaged symbols) and emits an s-expression where all symbols are packaged symbols.
It is the context object which keeps track of the current package, so you might have some accessor functions to manipulate the context object (e.g. destructure it into the current package, etc.).
The read function is stateless and simply emits unpackaged symbols, and emits packaged symbols if and only if the given plaintext specifically includes a package specification.
A package object is a stateful, synchronized (as in safely accessible across different threads, and whose basic operations are assuredly atomic) object. A context is a stateful object intended for thread- and function- local usage.
context objects
===============
A context object is callable (and has an entry in the axiom::call* table) and has the following form:
(let ob (cxt)
  (ob expression))
The return value of the context is either of the following:
1. If the expression is one of the following forms (the first symbol in each form is unpackaged, 'symbol here is a variable symbol):
(in-package symbol)
(interface symbol . symbols)
(using symbol)
(import symbol symbol)
...then the return value is axiom::t, and either the context's state is changed, or the state of a package (specifically the current package of the context) is changed.
2. For all other forms, it returns an equivalent expression, but containing only packaged symbols. The state of the context is not changed.
The forms in number 1 above have the following changes in the context or current package of the context:
(in-package symbol)
Changes the current package of the context to the package represented by the unpackaged symbol. The implementation is free to throw an error if the given symbol is packaged.
(interface symbol . symbols)
Defines an interface. All symbols are first applied to the current package to translate them into packaged symbols, if they are unpackaged (this translation by itself may change the package's state, and also a packaged symbol will simply be passed as-is by the package object; see section "package objects" below). It then modifies the package of the first symbol to have an interface whose symbols are the given symbols.
If the interface already exists, the new symbol list is checked against the existing one. If the two lists are not the same, the implementation is free to throw an error.
(using symbol)
The given symbol must be a packaged symbol. It must name an interface of its package; if the interface does not exist on the package, the implementation must throw an error. For each symbol in the interface, this changes the current package's state, creating or modifying the mapping from the unpackaged symbol of the same name to the symbol in the interface.
For conflicting package interfaces: let us suppose that the context is in package 'User, and there exists two package interfaces, A::v1 and B::v1. A::v1 is composed of (A::foo A::bar) while B::v1 is composed of (B::bar B::quux). If the context receives (using A::v1), the User package contains the mapping {foo => A::foo, bar => A::bar}. Then if the context receives (using B::v1), the User package afterwards contains the mapping {foo => A::foo, bar => B::bar, quux => B::quux}.
(import symbol symbol)
Forces the current package to have a specific mapping. The first symbol must be a packaged symbol and the second symbol must be unpackaged. The implementation must throw an error if this invariant is violated.
Continuing the example above, after (import A::bar A-bar), this changes the package to {foo => A::foo, bar => B::bar, A-bar => A::bar, quux => B::quux}
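To check that these rules compose the way the example claims, here is an illustrative Python sketch of the context's mapping behaviour (class and method names are assumptions, not part of the spec):

```python
class Context:
    """Sketch of a context's current-package mapping under
    'using and 'import, per the rules above."""
    def __init__(self, interfaces):
        self.interfaces = interfaces  # e.g. {'A::v1': ['A::foo', 'A::bar']}
        self.mapping = {}             # current package: unpackaged -> packaged

    def using(self, iface):
        # each symbol in the interface maps its unpackaged name to itself;
        # later 'using forms win on conflict, as in the A/B example
        for sym in self.interfaces[iface]:
            pkg, name = sym.split('::', 1)
            self.mapping[name] = sym

    def do_import(self, packaged, unpackaged):
        # (import A::bar A-bar): force a specific mapping
        self.mapping[unpackaged] = packaged
```

Running the A::v1 / B::v1 example through this sketch reproduces the mappings given in the text.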
package objects
===============
A package object is callable and has the following form:
(ob expression)
expression must evaluate to a symbol, and if a non-symbol is applied to a package object, the implementation is free to throw an error. The application otherwise evaluates to either:
1. The same symbol, if the given symbol is a packaged symbol; this does not change the state of the package
2. A packaged symbol, if the given symbol is an unpackaged symbol. If the package does not contain a mapping for the unpackaged symbol, the state of the package is changed so that a mapping for the unpackaged symbol to a packaged symbol exists.
The package object also supports an 'sref operation:
(sref ob v k)
k is an unpackaged symbol while v is a packaged symbol; the implementation is free to throw an error if this invariant is violated.
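The package-object rules above can be sketched in a few lines of Python (symbols modeled as strings, the callable-object convention via `__call__`; all names are illustrative):

```python
class Package:
    """Sketch of a package object: a mapping from unpackaged names to
    packaged symbols, interning on first use."""
    def __init__(self, name):
        self.name = name
        self.table = {}

    def __call__(self, sym):
        if '::' in sym:                  # rule 1: packaged symbols pass through
            return sym
        if sym not in self.table:        # rule 2: intern unpackaged symbols
            self.table[sym] = self.name + '::' + sym
        return self.table[sym]

    def sref(self, packaged, unpackaged):
        # the 'sref operation: force a specific mapping
        assert '::' in packaged and '::' not in unpackaged
        self.table[unpackaged] = packaged
```

(Thread synchronization, which the spec requires, is elided here.)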
Packages are handled by interface.
Further, we also predefine two packages, axiom and arc.
The axiom package is implicitly imported into all packages. It presents no interface.
The arc package contains all "standard" Arc functions and macros. The arc package is not implicitly imported into all packages.
The arc package contains the interface arc::v3. This interface is the set of symbols currently defined on Anarki. Future extensions to the arc standard library must first be placed in the interface arc::v3-exp until they are promoted into a future arc::v4 interface, and so on.
load
====
The load implementation is thus:
(def load (file (o hook))
  " Reads the expressions in `file' and evaluates them. Read expressions
    may be preprocessed by `hook'.
    See also [[require]]. "
  (push current-load-file* load-file-stack*)
  (= current-load-file* file)
  (or= hook idfn)
  (after
    (w/infile f file
      (let evaller (eval-cxt)
        (evaller '(in-package User))
        (whilet e (read f)
          (evaller (hook e)))))
    (do (= current-load-file* (pop load-file-stack*)) nil)))
As for CL packages, I've decided I don't really like the way they work. If a file is compiled, then (in-package foo) is only guaranteed to work if it appears at the top level. So...
(if (eq x 10) (in-package foo) (in-package bar))
works in an interpreted file, but the behaviour is undefined if the file is compiled. CLISP handles both cases fine.
Also in CL, the value of *package* doesn't always correspond to the actual current package. For example:
(setf x "I'm in the default package!")
(setf foo::x "I'm in the FOO package!")
(setf *package* (find-package :foo))
(print *package*)
(print x)
does this when interpreted
#<PACKAGE FOO>
"I'm in the FOO package!"
but this when compiled
#<PACKAGE FOO>
"I'm in the default package!"
Either the package should be determined at eval-time (as was your suggestion) or the user should be forced to use read macros like #: and #.(in-package ...) to switch packages at read time. The CL solution is an ad-hoc compromise between the two.
Forcing the user to keep using read macros doesn't feel quite right. Personally I'm more for using 'eval-cxt objects, which would do the assignment from plain symbols to qualified symbols, and keep track of the current package.
Of course, using 'eval-cxt raises questions about static whole-program compilation, I think. Hmm. I haven't thought deeply about that yet.
It looks like CLSQL needs reader macros to switch the syntax on and off locally. If Arc had reader macros, then you could do this:
#.(with-A (mac macro-A ..blah..blah..in special A syntax))
Assuming 'with-A is a function that set the read table locally, and macro-A uses quasi-quote to generate its result, this will produce a macro that produces standard Arc syntax, even though it's written in A syntax.
With reader macros, 'w/html could be implemented even if de-sugaring were moved to the reader, although you'd have to call it with #. all the time.
It makes sense to me that macros should always expand to vanilla Arc syntax (or maybe even pure s-exps without any ssyntax) so that they are portable across environments.
CL packages solve this problem fine. I was hoping Arc wouldn't have to go down that route (because it always confuses newbies and adds a lot of complexity) but the more problems arise, the more I appreciate how good packages are.
The problem is that the paper uses words like 'model' and 'function-based' rather vaguely. You can model I/O with a TM, but you can't actually do it, which is what they're getting at.
I'm very much in favour of the first one. Simple things like that can have a big impact on how people perceive a language. I even began writing an IDE in Arc as a starting point, but it got reprioritised.
Even better though would be if the Lisp community put down its stupid in-fighting and built a language-neutral easy-install environment; something that could edit and run Scheme, CL and any other dialect you might want to write. Imagine building Arc on an environment that's actually designed to have Lisps implemented on it, rather than trying to fake CL style lists in MzScheme and all that nonsense. Imagine if you could write back-ends to different architectures, making Rainbow and arc2c unnecessary.
That would be cool. As far as I know, s-exps are s-exps no matter which lisp they're written in. The general environment might need a good way of defining syntax so that it could be more helpful than just a text editor with REPL, but that sounds like something lisp should be capable of doing pretty well.
The only problem is that everyone would argue over which language to implement it in. How about we just start in arc, and let them all join us later?
Actually, it sounds like an extended DrScheme, which already supports several different languages of scheme in the editor. If the interface was improved a bit, and it was extended to handle CL and Arc, then it should suit admirably. Unfortunately, I don't know how the language selection feature works, so I don't even know if that's possible.
Arc is probably a good language to implement it in, because it has such a small core and is so flexible. I also think Arc's users are a bit more experimental and ambitious than most Lisp users. Consider the fact that Arc will probably have a shared-nothing garbage collector before any other dialect!
By global variables, do you mean regular globals? I would have thought these should be private to each process. After all, a process shouldn't mess with another process's state. It's rude!
If they're immutable, then great. But using shared memory to transfer state is flawed, and can cause race conditions, etc. That's why we're using the actor model to begin with, right?
I suppose they can't really be immutable, or we couldn't do hot code loading. How does Erlang do it?
Heh, you're right. So, in that mindset, why not give a lot of nice macros for controlling share vs copy, and make the default be copy? Then programmers could control nearly everything. Of course, they could always hack on your vm if they really wanted tons of control.
But still, concurrent writing to a global variable sounds dangerous.
I kind of like the idea of them being "registered processes." I'll have to do some more thinking on that.
>It doesn't, actually...
Yes, that answers some of the question, but I was a bit more interested in how they implemented their hot code loading. The code still exists for a while as the existing processes continue to use it. But they eventually phase out the functions and swap to the new ones.
IMHO, hot code loading is a very nifty feature. Combined with remote REPL makes it especially useful. I don't know how well current lisps support hot swap, but I don't think it can work effectively without a concurrent system.
> Yes, that answers some of the question, but I was a bit more interested in how they implemented their hot code loading
It's in the OTP library actually. For example, they have a standard gen_server module. The gen_server would look approximately like this in snap:
(def gen-server (fun state)
  (<==
    ('request pid tag param)
      (let (state . response) (fun state param)
        (==> pid (list 'response tag response))
        (gen-server fun state))
    ; hot code swapping!!
    ('upgrade new-fun)
      (gen-server new-fun state)
    ('stop)
      t))
So yes: hot swapping just means sending in a message with the new code ^^
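The same receive-loop trick can be sketched in Python, with the mailbox modeled as a plain list of messages and replies collected into an out-queue (both are simplifying assumptions; a real system would use blocking mailboxes):

```python
def gen_server(fun, state, mailbox, out):
    """Toy gen-server loop: handle requests with fun, swap fun on
    an 'upgrade' message, return the final state on 'stop'."""
    for msg in mailbox:
        if msg[0] == 'request':
            _, pid, tag, param = msg
            state, response = fun(state, param)
            out.append((pid, ('response', tag, response)))
        elif msg[0] == 'upgrade':
            fun = msg[1]        # hot code swap: just install the new code
        elif msg[0] == 'stop':
            return state
```

The state survives the swap untouched; only the behaviour changes, which is the essence of the Erlang approach.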
It's actually more complex than that - they generally make messages include a version number so that nodes with new versions can communicate to nodes with older versions. This versioning is, in fact, part and parcel of the gen_server series of functions. Requests to servers are made via functions which send a message (with tags and other metadata such as versions abstracted away) and wait for a receive.
I think what they say is, the programmers good at concurrency write the gen_server and other parts of the OTP, while the average programmers write application code ^^
Much of Erlang isn't implemented in the VM level ^^
It makes sense that they wouldn't do that at the vm level. Your code even makes sense, though I thought "let" only assigned one variable.
I'm still not quite able to read arc fluently, so any explanations of the subtleties I likely missed will always be appreciated. Come to think of it, any explanations of any code would be nice, as the thoughts and reasons behind code don't always come out in the source itself. And I also like learning new things :)
I'm using pattern matching. Although Arc doesn't actually have pattern matching built in, someone wrote a pattern-matching library a long time ago using macros: http://arclanguage.com/item?id=2556 and http://arclanguage.org/item?id=1825 . The modern evolution uses something like p-m:def to define a pattern-matching function, p-m:fn to create an anonymous pattern-matching function, etc.
('request pid tag param)
The pattern above means "match a 4-element list, whose first element is the symbol 'request, and which has 3 more elements that we'll call pid, tag, and param".
(let (state . response) (fun state param)
This is a destructuring. It simply means that (fun state param) should return a cons cell, with the 'car of the cell being placed in state and the 'cdr being placed in response. So we expect fun to return something like (cons state response)
(==> pid (list 'response tag response))
Note the use of 'tag here. We expect that 'tag would be a 'gensym'ed symbol, and is used in pattern matching so that the client can receive the message it's looking for.
(gen-server fun state))
Plain recursion.
; hot code swapping!!
('upgrade new-fun)
(gen-server new-fun state)
Functions (that aren't closures) can be safely cached because they're immutable. If we assume arc.arc is part of the 'spec' (and hence itself immutable) then we can safely link each process to the same functions, but give each one its own global bindings, maybe?
> arc.arc is part of the 'spec' (and hence itself immutable)
But ac.scm itself is not immutable. cref lib/scanner.arc , which redefines 'car and 'cdr (which are in ac.scm). If ac.scm, which is even more basic than arc.arc, is itself not immutable, then why should arc.arc be immutable?
So no.
In arc2c functions are represented by closures. Pointers to closures are effectively handles to the actual function code.
Now the function code is immutable (that's how arc2c does it - after all, all the code has to be written in C). When a function is redefined, we create a new closure, which contains a pointer to the new function code (which was already compiled and thus immutable), then assign that closure to the global variable.
Basically my idea for a cache would pair each cached value with a version counter that's incremented on every update:
class SymbolAtom : public Atom {
private:
  std::string name;
  Generic* value;
  size_t version;  // bumped on every rebind of the global
public:
  friend class Process;
};

class Process : public Heap {
  /* blah blah heap stuff... */
private:
  std::map<Atom*, std::pair<size_t, Generic*> > g_cache;
public:
  Generic* get_global(Atom* a) {
    std::map<Atom*, std::pair<size_t, Generic*> >::iterator i;
    i = g_cache.find(a);
    // not in cache: copy into this process's heap and remember the version
    if (i == g_cache.end()) {
      Generic* mycopy = a->value->clone(*this);
      g_cache[a] = std::pair<size_t, Generic*>(a->version, mycopy);
      return mycopy;
    } else {
      std::pair<size_t, Generic*> ipp = i->second;
      // no change: return the cached copy
      if (a->version == ipp.first) {
        return ipp.second;
      } else {
        // stale: recache
        Generic* mycopy = a->value->clone(*this);
        g_cache[a] = std::pair<size_t, Generic*>(a->version, mycopy);
        return mycopy;
      }
    }
  }
};
Agreed. Depending too much on global state sounds risky, though I suppose it depends on how you abstract it. You could have two types of 'globals'. One being the traditional scopeless variable, local to each process, and the other a process that is registered somewhere that makes it easy to find. So, instead of requiring pre-knowledge of the pid, you can find it easily. That sort of sounds like a global variable, but it wouldn't be limited to just storing a value.
Unfortunately, you'd still need to handle the problems that would occur if one of those global procs died, or if the table got corrupted.