Arc Forum | sacado's comments

Arc Forum

new | comments | leaders | submit | sacado's comments

6 points by sacado 6332 days ago | link | parent | on: Official versions of Arc?

Arc3 is not released yet (it was just a joke, because we are waiting for it). Arc2c is a non-official, still in development version of a stand-alone compiler from arc to C. The official version (arc2, without the c) is available here : http://arclanguage.org/install . It was released at the end of February.

You can also download Anarki, an unofficial improvement of arc2 (mostly bug fixes, a few experimental ideas and a few more libraries), available here http://github.com/nex3/arc/tarball/master .

Personally, I think you really should use Anarki as it is more mature, but is still compatible with the official arc2 (and most people here would tell you so), but it's up to you.

Oh, and in both cases, you will need to download and install mzscheme first (http://www.plt-scheme.org/software/mzscheme)

-----

1 point by almkglor 6332 days ago | link

> You can also download Anarki, an unofficial improvement of arc2 (mostly bug fixes, a few experimental ideas and a few more libraries)

Err, I'd say its more (some bug fixes, a lot of crazy experimental ideas and a bunch of unused libraries)

-----

1 point by lojic 6332 days ago | link

It was more a sign of frustration than a joke :(

-----

2 points by sacado 6332 days ago | link | parent | on: New types in Arc?

Well, this is Lisp, so the solution is probably to use a list :)

First, you would implement theorems as a pair of lists : a list for preconditions, one for consequences. Thus, A would be coded as '(nil . (P)) (no precondition, one consequence, i.e. a fact) and B as '((P) . (Q)) (one condition, one consequence) :

  (= A '(nil . (P)))
  (= B '((P) . (Q)))
  (modus-ponens A B)

After that, a list called facts* would contain the elements P and Q :

  facts*
  ==> (P Q)

That's how you would do while prototyping. Of course, you will eventually want to encapsulate everything, with dedicated functions, for example :

  (= A (fact 'P))
  (= B (theorem '(P) '(Q)))

But now, you want to add a kind of strong typing, since you don't want to mistake a theorem from another list. To achieve this you can define a new type with the 'annotate function :

  (def fact (name)
    (annotate 'theorem (cons nil (list name))))

  (def theorem (pre post)
    (annotate 'theorem (cons pre post)))

  (= A (fact 'P))
  (= B (theorem '(P) '(Q)))
  (type A)
  ==> theorem

  (modus-ponens A (cons 'x 'y))
  ==> Type error

You now have opaque 'theorem objects. To access their implementation, just use the 'rep function :

  (rep A)
  ==> '(nil . (P))

-----

2 points by thefifthman 6332 days ago | link

It seems to me that anyone can now create theorem's, even if they are not theorems! What I had in mind is more like, ok, I have these 5 operations here on theorems, only these can modify and build new theorems. Then I can somehow close up the type of theorems, and if someone later wants to create a theorem, he has to go through these 5 operations. Here it seems to me that later one can just call (annotate 'theorem something_which_is_not_a_theorem) and get away with it.

-----

4 points by almkglor 6332 days ago | link

What you could do is, you could create an "opaque" function which gives access to your theorem data. Then wrap that function in an 'annotate form.

  (def theorem (f)
    (if (check-that-f-is-valid-theorem f)
        (annotate 'theorem
          (fn (cmd)
            (case cmd
              get
                f
              something-else
                (do-something-on f))))
        (error "Not a valid theorem")))

For that matter, since Arc is around "exploratory programming", in principle Arc will not allow anything that limits what the user can do.

In practice, most users are too busy working on stuff to bother doing something as stupid as (annotate 'theorem something_which_is_not_a_theorem) anyway.

In short: don't worry, be happy ^^

-----

1 point by thefifthman 6332 days ago | link

lol Exploratory programming is fine, but I don't need to explore anymore what a theorem is :-)

> "Arc will not allow anything that limits what the user can do"

Obviously this is not true, as Arc limits me such that I cannot create a safe theorem type. What this means is that I cannot let the user directly interact with Arc when building an interactive theorem prover; I have to wrap some kind of protective layer around it, which would probably hinder lots of the exploring!

-----

1 point by almkglor 6332 days ago | link

> lol Exploratory programming is fine, but I don't need to explore anymore what a theorem is :-)

Which is an admitted problem in current Arc: it forces you to explore well-explored areas sometimes.

> I have to wrap some kind of protective layer around it, which would probably hinder lots of the exploring!

Standard answer: macros.

As for safety: Read: http://paulgraham.com/langdes.html point 3.

-----

2 points by sacado 6332 days ago | link

Well, that's not really the spirit of the language. almkglor's solution with checks in the functions with 'check-that-f-is-valid-theorem is what you should eventually do.

If you really want a stronger encapsulation, there is only one solution (I think) : using closures to keep a list of known, certified theorems.

  (def theorems-factory ()
    (let known* nil
      (def mk-theorem (pre post)
        (let result (annotate 'theorem (cons pre post))
          (push result known*)
          result))
      (def is-valid (theorem)
        (mem theorem known*))
      (def modus-ponens (a b)
        (unless (and (is-valid a) (is-valid b))
           (err "Type error"))
        (do-the-work-with a b))))

  (theorems-factory)
  (= A (mk-theorem nil '(P)))
  (= B (mk-theorem '(P) '(Q)))

Now, once you call (theorems-factory), two things happen :

- the list of certified theorems is made (and empty)

- a bunch of functions (the only ones authorized to deal with your type) are defined.

Then, you can call the theorem constructor 'mk-theorem and other functions on these built constructors.

The trick here is that you can still create bogus theorem objects (for example by doing (annotate 'theorem 'bogus)), but as they are not added to the 'known* list, they are refused by all your functions. And the user cannot access the 'known* list, so he cannot cheat this way either.

-----

2 points by thefifthman 6332 days ago | link

Hmmh, I don't really think it is against the spirit of the language. After all, I think you cannot just annotate arbitrary things to be of type integer (or can you?).

The above solution would surely work in principle, but in practice I create millions of intermediate theorems, I do not want to store all of them!

-----

2 points by almkglor 6332 days ago | link

> Hmmh, I don't really think it is against the spirit of the language. After all, I think you cannot just annotate arbitrary things to be of type integer (or can you?).

Yes, you can: (annotate 'int #\newline). It won't be useful, but yes, you can.

> The above solution would surely work in principle, but in practice I create millions of intermediate theorems, I do not want to store all of them!

Standard answer: Garbage collection

In any case: you can always destructure the annotation wrapper and work with the raw data (probably by using some sort of higher-order function, like combinators), only wrapping it in the annotation type. Sort of like explicit monads.

-----

1 point by sacado 6332 days ago | link

Garbage collection does not work here if you keep a track of all the defined theorems in a list as 'known*. We would need weak references to achieve this.

-----

1 point by almkglor 6332 days ago | link

I'm not sure we need a list of defined theorems: apparently the proposed 'modus-ponens function is supposed to accept both source theorems anyway.

-----

1 point by sacado 6332 days ago | link

You're right. It creates lots of never-freed objects, so that's obviously not a viable solution.

What are you trying to prevent ? If you're trying to prevent the use of malicious data into one of your functions (e.g. a theorem object you got from an untrusted network) you could use some kind of cryptographic mark in your objects (something that proves they have been made by your constructor function, e.g. implement theorems as closures whose actual data can only be decyphered when called with a given private key), but Arc is certainly not the language you should uyse in this case (better use Java in such situations).

If, and that's what I guess you want, you just want to protect the user from blowing his own leg off by providing wrong data to functions, almkglor's wrapper solution, with macros to simplify coding, is the way to go. You can end up with a very transparent solution if you do it right. Of course, a determined user will eventually be able to break it down if he really wants to, but that's the philosophy of the language.

Look at the way macros are implemented : they are nothing more than functions annotated with the 'mac symbol. You can access to the underlying function if you want, by doing (rep while) or do (= mymac (annotate 'mac #\space)) and try to call (mymac). Sure, and then bad things can happen. But nobody does it, because this is senseless, and noone ever got trapped because of this. And having access to this is great because it makes the language simple and hackable.

-----

1 point by sacado 6333 days ago | link | parent | on: Hypothetical Web App: Image Gallery

As for database, I think sqlite was almost working last time I looked at it. Maybe it's working now ?

-----

1 point by sacado 6334 days ago | link | parent | on: arc2c update

1- is interesting and important, thus it should eventually be done, but I don't think this is a priority right now.

2- is a good starting point. For exn:fail, that the same as for 1- : not a top priority, particularily as it is not yet implemented in canonical arc.

I'll work on error handling on the next days, as I'll have a little more time than last week(s).

-----

1 point by sacado 6334 days ago | link | parent | on: arc2c update

As almkglor said, macros are the real problem as for now. Once we will have them working, a meta-circuar arc2c should be straightforward.

We will need 'read and 'eval functions (that should be easy, and they will probably have to be fully operational for macros anyway) and error handling (a trivial error handling, sufficient for arc2c's code, could be easily implemented : just write an error message and exit). A poor, but meta-circular compiler. Then, we'll have to optimize / improve everything left.

-----

2 points by almkglor 6333 days ago | link

Re: eval - We need a way to implement 'apply.

It's not easy. We can't make it into a primitive, because CPS conversion assumes that primitives will never need access to continuations, and thus does not transform primitive calls to CPS form.

-----

1 point by sacado 6333 days ago | link

Couldn't it be implemented as :

- PUSH the closure to be called

- PUSH the list of args

- call the APPLY() function, which just POPs the two elements, PUSHes all the elements of the arg list one after one, then call the closure (maybe after changing the continuation argument of that closure by hand, at runtime) ?

That's a runtime behavior, for sure, and probably a few cases should be hard-coded in the generated C file. E.g., (apply + '(1 2)) should be translated to (+ 1 2), then to 3, if nothing bad (redefinition of '+ or 'apply by the user) happened. But in the general case, you can't know.

Anyway, we will eventually have to implement an interpreter in the generated code, to deal with dynamic stuff. Maybe closures should be made available through a hashtable mapping their name(s) to actual code, and not only through a hard-coded array as it is now ?

-----

1 point by almkglor 6332 days ago | link

> Couldn't it be implemented as :

> - PUSH the closure to be called

> - PUSH the list of args

> - call the APPLY() function, which just POPs the two elements, PUSHes all the elements of the arg list one after one, then call the closure (maybe after changing the continuation argument of that closure by hand, at runtime) ?

Which continuation argument do you pass?

Suppose it's like this:

  (%car (%apply foo bar))

Then foo is:

  (set foo
    (fn (a b)
      (ccc
        (fn (k)
          (a k b)))))

Question: How does '%apply get access to the continuation, in order to pass to 'foo, which passes it to 'ccc ?

Remember, calling a function consists of the following steps:

1. Push the closure

2. Push the continuation

3. Push the arguments

The problem is step 2: continuation.

Possibly we need to insert the default 'apply during the same step as inserting the default 'ccc ? Then we could define '%apply as accepting the function, the continuation, and a plain list of arguments.

> Anyway, we will eventually have to implement an interpreter in the generated code, to deal with dynamic stuff. Maybe closures should be made available through a hashtable mapping their name(s) to actual code, and not only through a hard-coded array as it is now ?

s/closure/global variable/, maybe?

I think what we could do is, we add a pointer to an obj in the symbol structure. If the symbol is in the GLOBAL() array, this pointer points to that entry, otherwise to a malloc()'ed location.

-----

1 point by sacado 6333 days ago | link

oh, yes, 'apply & CPS... Well, I'll have a look at that today. I'll see if I can work around it.

-----

2 points by sacado 6341 days ago | link | parent | on: Are strings useful ?

OK, I think I made my mind about this now. Thanks for your comments. Well, after fighting one more time yesterday against things that were strings and that I expected to be symbols, I would really love to see them merged. The string / symbols distinction feels like an onion to me. As long as any string can be translated in one and only one symbol, and vice versa. That means case sensitivity, UTF-8 in symbols too, etc. Everything Arc already has.

I don't know if interning strings would really kill performance. Lua does it if I remember well, and it is quite performant. Anyway, mutable strings are a real performance killer, since you must compare them with scheme's 'equal? instead of 'eq?.

Now, you are right too, they don't represent exactly the same thing, so simply removing one of them would quite feel bad, but sometimes you have to make both worlds communicate. That's a matter of syntax too. After all, "Hello, Arc world" could be a special syntax meaning : the quotation of the symbol |Hello,\ Arc\ world|. A triple-quote syntax could be added, too.

It can be said the other way around : every time you need a string and are provided with a symbol (or the opposite), an automatic coercion could be performed.

Well, anyway, I'm just thinking loud since I am not here to design the language (and since its designers don't seem to appear very often anymore) :)

-----

2 points by sacado 6343 days ago | link | parent | on: Are strings useful ?

I think that's an implementation detail. You could still somewhat keep the character type in the implementation, but write them "x" (or 'x) instead of #\x and making (type c) return 'string (or 'sym).

Or, if you take the problem the other way, you could say "length-1 symbols are quite frequent and shoudn't take too much memory -- let's represent them a special way where they would only take 4 bytes".

-----

1 point by stefano 6342 days ago | link

This would require some kind of automatic type conversions (probably at runtime), but characters-as-symbols seems doable without the overhead I thought it would lead to.

-----

2 points by sacado 6343 days ago | link | parent | on: Are strings useful ?

Symbols, ASCII only ? No way, I'm writing my code in French, and I'm now used to calling things the right way, i.e. with accents. "modifié" means "modified", "modifie" means "modifies", that's not the same thing, I want to be able to distinguish between both. Without accents, you can't.

Furthermore, that would mean coercing symbols into strings would be impossible (or at least the 1:1 mapping would not be guaranteed anymore).

-----

4 points by sacado 6347 days ago | link | parent | on: arc2c update

New update : now supports floating point numbers. They can be added, subtracted, multiplied, compared with is, isnt, <, >, <=, >=.

There are a few things incompatible with canonical arc regarding the status of flonums ending with .0. They are regarded as integers by arc (but are still written, e.g., 1.0). In arc2c, any literal written x.0 is translated into x and any generated flonum ending by .0 is considered a flonum. That will probably be fixed soon, but I think it is an almost bug in arc (I guess this will be changed in future versions of arc), so, we'll see.

-----

2 points by almkglor 6347 days ago | link

Okay, looks good.

As an aside: maybe it's better to make the primitives really, really primitive?

Basically we define separate %n+ for adding two fixnums, %f+ for adding two flonums, %fn+ for adding a flonum and a fixnum (in that order) and, %nf+ for adding a fixnum and a flonum (in that order).

Then we can write + in lib-ac.scm.arc as:

  (set +
    (fn rest
      ($+ rest)))
  (set $+
    (fn (args)
      (if args
          (if (%cdr args)
              ($sub+ (%car args) (%cdr args))
              (%car args))
          0)))
  (set $sub+
    (fn (accum rest)
      (if rest
          ((fn (val rest)
             (if (%is (%type accum) 'int)
                 (if (%is (%type val) 'int)
                     ($sub+ (%n+ accum val) rest)
                     (if (%is (%type val) 'num)
                         ($sub+ (%nf+ accum val) rest)
                         ; raise exception
                         ()))
                 (if (%is (%type accum) 'num)
                     (if (%is (%type val) 'int)
                         ($sub+ (%fn+ accum val) rest)
                         (if (%is (%type val) 'num)
                             ($sub+ (%f+ accum val) rest)
                             ; raise exception
                             ()))
                     ; raise exception
                     ())))
            (%car rest) (%cdr rest))
          accum)))

As for macros: Hmm. You see... Look, up in the sky! Is it a bird? A plane? Superman? Nothing? Oh, what were we talking about again? ^^ LOL

Okay now serious: currently I'm blocked with creating a "protected" eval for macros. Programmer's block. The bit blocking me is the destructuring thing. Erk.

-----

1 point by sacado 6347 days ago | link

I think you're right for primitives. I asked pg the same question a few time ago, he answered he didn't know yet. Anyway, for an optimizing compiler, we will need these (as in : "hmmm, n can only hold a fixnum, let's translate this heavy '+ in a lightweight '%n+ !")

Now, macros. I tried to think about it yesterday, and I don't really understand why you need this protected mode. I mean, macros have to be defined (globally) before they are called and treated as such. Since you can already identify all the globals (and since a macro is a global thing annotated by 'mac), it should be easy to

- make a list of all the symbols holding a macro at a given moment in the code

- explore all the sexprs to see if one of those symbols is the first element of any sublist

- if it is to, make the transformation

- do it again until nothing changes anymore

Am I missing something obvious ?

-----

3 points by almkglor 6346 days ago | link

  (do
    (= xe 0)
    (mac somethingstupid ()
      (= xe (+ xe 1))
      `(prn "something stupid" ,xe)))

... and 'xe is a function in....?

-----

2 points by sacado 6345 days ago | link

OK , so I was missing something obvious :)

If I understand well, we have to interpret the code as we compile it (as a macro's body is executed when a function is defined/compiled, not when executed) ? To me, macros were something translating a piece of code into another one, period. But that's not always the case as your example shows. If someone writes

  (mac foo
    (prn "blah"))

  (def bar (n)
    (foo) (* n 2))

That means we have to display "blah" on compile time, right ? Hmm, that's much harder than I thought... It really means implementing an arc interpreter in arc (the way the official one is written in mzscheme), with for example prepending all the interpreted names with _ to distinguish them from the compiler's names...

Ok, I think I'm starting to realy get it ; or am I still missing an even trickier issue ?

-----

2 points by almkglor 6345 days ago | link

That is about it. ^^ However my current implementation uses tables as environments for the function. Keys in the table are symbols for the variable names in that environment. A "parent" string key accesses the environment's parent.

The global environment is shadowed by a table; when this table is accessed, if a table entry does not exist, we look it up in the real environment using 'eval (!!) and retain a copy. This allows the macro to use stuff like 'map.

However there is still a leak here: if the real global-level environment has a function that modifies a global, it will modify a different global variable from the one that is visible to the macro. This shouldn't happen anyway: we should really avoid using globals in utility functions ^^.

-----

2 points by almkglor 6344 days ago | link

Okay, now that I've actually gotten to see it - I think you forgot to push codegen.arc, since the current version on git don't actually generate code that calls DBL2OBJ.

Edit: Also, it might be useful to organize the primitives as "functions", like what I did with cons_fun().

So:

  #define CONS() {obj d = POP(); TOS() = cons_fun(TOS(), d);}

This should allow us to express some primitive calls directly:

  (%cons foo (%car bar))
  ==>
  PUSH(cons_fun(foo, car_fun(bar)));

Also, it might be useful to merge 'aquote and 'alit structures into a single structure.

-----

1 point by sacado 6343 days ago | link

Oops, thank you :/ I just pushed codegen.

I'll work on what you said about primitive functions in a few days (Monday or Tuesday I guess).

-----

2 points by almkglor 6339 days ago | link

Err, no. I just realized while on vacation that it's not safe to use memory-allocating primitives as functions. The problem is with nested conses:

  QUOTE_CONSTANT[0] = cons_fun(FIX2OBJ(1),cons_fun(FIX2OBJ(2),NILOBJ))

Let's call the cons_fun(FIX2OBJ(1) ...) call cons_fun1, and the other as cons_fun2. cons_fun2 is called first (because it's more inner). The value is then pushed into the C stack. Then cons_fun1 is called. However, what if GC is triggered at exactly cons_fun1? Then the cons cell returned by cons_fun2 is on the C stack, not the Arc stack; the C stack is not scanned by the GC! This means we actually lose memory areas.

Probably means I have to modify the way constants are stored.

-----

1 point by sacado 6338 days ago | link

Hmmm, good point. The stack, GLOBALS and QUOTE_CONSTANTS are all explored when GC is triggered. Maybe the solution would be to add a temporary storage (that should be taken into consideration by GC too) before putting things in the constants, or something like that ?

-----

1 point by almkglor 6338 days ago | link

> Maybe the solution would be to add a temporary storage (that should be taken into consideration by GC too) before putting things in the constants, or something like that ?

Called the "Arc stack", maybe? ^^

Currently this is what is done:

  QUOTE_CONSTANTS[0] = cons_fun(FIX2OBJ(1),cons_fun(FIX2OBJ(2),NILOBJ))

It should probably be:

  PUSH(FIX2OBJ(1));
  PUSH(FIX2OBJ(2));
  PUSH(NILOBJ);
  CONS();
  CONS();
  QUOTE_CONSTANTS[0] = POP();

Of course, the thing about having _fun functions is, they are safe for non-allocating functions. So we could safely directly translate:

  (car (cdr foo))

to:

  PUSH(car_fun(cdr_fun(LOCAL(3)/*foo@42*/)));

Edit: of course, even though it's named "_fun" doesn't mean that it is a function:

  #define car_fun(o) (((pair *)(o))->car)

-----

2 points by almkglor 6338 days ago | link

Okay, converted init_constants() to use the Arc stack instead of the C stack. Fear the pointer arithmetic foo-ness when using a processor with special PUSH and POP instructions (which are arguably dying out, because RISC processors handle stacks using explicit pointer ariths).

-----

5 points by sacado 6348 days ago | link | parent | on: How would you code this in Arc?

I think pattern matching is the thing that shouldn't be missing in Arc. I can hardly do what you describe without them and I don't want to implement a regex engine this evening, so let's assume it exists :)

  (= sum 0 count 0)

  (def strn (length str (o before t))
    (let result str
      (while (< (len result) length)
        (= result (if before (+ " " result) (+ result " "))))
      result))

'strn is just leaving whitespaces where needed : if you want str to be displayed on 4 characters, it will display it with whitespaces on the left( or on the right if you give one more argument :

  (strn 4 "10") ==> "  10"
  (strn 4 "10" t) ==> "10  "

Now, we redefine * so as to be able to do

  (* "=" 5) ==> "====="

  (redef * args
    (if (and (is (len args) 2) (isa args.0 'string) (isa args.1 'int))
      (let result ""
        (for i 1 args.1
          (= result (+ result args.0)))
        result)
      (apply old args)))

Finally, the action callback (to be called on each user) :

  (def action (index user score)
    (prn (strn 2 index) ". " (strn 15 user nil) " = " (strn 4 score))
    (++ sum (coerce score 'int))
    (++ count))

And the "main" code :

  (w/infile f "arcleaders.html"
    (pmatch "(\\d\\d?)\\..*?.+?<u>(.+?)<\\/u>.+?(\\d+)" action))
  (prn (* "=" 26))
  (prn "Total   = " (strn 4 (coerce sum 'string)))
  (prn "Average = " (strn 4 (coerce (/ sum count) 'string)))

The implementation of pmatch is left as an exercise to the reader :)

-----

3 points by lojic 6348 days ago | link

Thanks, but you forgot the important part about only retrieving the first 10 entries. I purposely included that to see how Arc would handle prematurely exiting an iteration.

Also, I think you may have forgotten to read the file. w/infile only opens the file, right?

I realize Arc doesn't have regex's yet, but some folks have been asserting that you can do just fine w/o them, so I was curious about an Arcy way to solve the parsing w/o regex's.

-----

1 point by sacado 6348 days ago | link

You're right. For the file's content, I just forgot to give it to the 'pmatch function (that's her file). For the test, well, that's quite trivial, provided you know there will be at least 10 values in your file (but, being given your code, that's the case anyway). Now let's consider pmatch only returns one value, then returns it (instead of a list of all values). As the file is a parameter, it is easy to know what's left to read.

  (w/infile f "arcleaders.html"
      (while (< count 10) (pmatch f "(\\d\\d?)\\..*?.+?<u>(.+?)<\\/u>.+?(\\d+)" action)))

-----

1 point by lojic 6348 days ago | link

I realize pmatch hasn't been written yet, but this seems odd. You're calling (pmatch f pat action) repeatedly, so where is the state info kept regarding the index of the last match, etc.? Your example reminds me of strtok in this regard, but I doubt that's what you had in mind.

With the Ruby example, the scan function invokes the supplied block repeatedly, but the break statement within the block will exit the scan function which is the desired behavior.

It's somewhat moot until someone writes pmatch though.

-----

2 points by almkglor 6348 days ago | link

> so where is the state info kept regarding the index of the last match, etc.?

In the stream state?

-----

1 point by sacado 6348 days ago | link

Well the f parameter has an index anyway, so the function knows what is left to be parsed anytime you call it, I guess ? Hmmm, I'll try to make a dumb version of pmatch and see what happens...

-----