Arc Forum | > Isn't it the other way around? Why are "low bits" not an implementation de...

Arc Forum

1 point by waterhouse 5284 days ago | link | parent

> Isn't it the other way around? Why are "low bits" not an implementation detail?

The implementation detail is that "you can't tag an object without wrapping it in a compound data structure that old functions will have to be told to reach inside to get the original object".

"Hiding implementation details" isn't something to be done for its own sake. You want to hide implementation details when they're uglier than the semantics you want to convey; e.g. it's impossible to use a MUL instruction to check if the result of a multiplication is a bignum and allocate one and return a tagged pointer to it if necessary, but keeping track of all that is a pain; so instead we have a generic * function that hides all this scrambling around, and we can think of all integers as a single type of object. Exposing that stuff to a user who isn't extremely concerned with performance just sucks. It's bad because the interface sucks, and we'd identify it as "exposing implementation details" because that was the reason it happened.

On the other hand, does anyone want to hide the fact that lists are implemented as conses and that you can take their car and cdr? Not me. That stuff is useful.

> If I define a new type, it makes little sense for it to inherit the qualities of the particular old type I'm using to implement it. That idea does nothing but expose implementation details.

When you first define a new type, no preexisting functions will know how to handle the new type. You have two choices for what should happen in the absence of such instructions. Either they can give errors and fail (unless you use "rep", in which case they do their job as usual), the way they do now, or they can do their job as usual, the way it would be under my proposal. I think the latter choice is clearly--I would say strictly, unless you find the error messages useful--superior.

Now, if you define a new type, the first thing you'll probably do is define some functions that handle the new type, and they will likely need to use old functions that manipulate the objects that your new type is made of. (Perhaps the functions you're defining will be intelligent, wrapped versions of the old ones.) As mentioned, either they have to use "rep", or they don't.

And if you want to define a second new type that's supposed to be a subtype of the first one, then you can still build that on top of the provided "tag" and "type". (I think this answers your last point.) For example:

  ;Single inheritance. The path of inheritances is an improper list.
  ;Builtins will have atomic types; you could give something an
  ;atomic type to pretend it's a builtin.
  (def inherit-tag (typ x)
    (tag (cons typ (type x)) x))
  (def inherit-type (x)
    (if (acons (type x))
        (car (type x))
        (type x)))
  (def inherit-rep (x) ;more like "cast to supertype"
    (if (acons (type x))
        (tag (cdr (type x)) x)
        x))
  (def is-a (typ x)
    (if (atom (type x))
        (is typ (type x))
        (or (is typ (car (type x)))
            (is-a typ (inherit-rep x)))))

If you wanted "has-a" type semantics or something, you could figure out a similar way to define those. (Also, it occurs to me that you could use "tag" to identify your tagged objects, as in (tag 'inheritance-tagged (list (list 'type1 'type2 ...) 'implementation-type x)) or something.) I think you can build things on the type system just as much as before, and meanwhile the initial implementation is cleaner.

[Hmm. Come to think of it, given that the user doesn't have access to the internal "rep", and the built-in functions don't use it, it's meaningless (and useless) for (tag 'a (tag 'b x)) to return nested vectors; I'll make it return the same thing as (tag 'a x). So in the language primitives, everything has a single type--which would likely be just a symbol, but conceivably an Arc standard library might have things that do inheritance with lists of symbols. Anyway, this doesn't change anything I've said above.]

2 points by rocketnia 5284 days ago | link

"The implementation detail is that "you can't tag an object without wrapping it in a compound data structure that old functions will have to be told to reach inside to get the original object"."

When I use 'annotate, what I want is to do this: Take a value that's sufficient for the implementation of a new type, and produce a different value such that I can extract the old one using an operation (in this case 'rep) that I can't confuse for part of the type's stable API.

Any "tag" semantics that overwrites the original type is useless to me for that.

I do think "wrap" is a better name for what I want than "tag" or "annotate," and that's the word I reach for when naming a related utility or recreating 'annotate in a new language. I don't mind if you say that tagging is something different. ^_^

---

"You want to hide implementation details when they're uglier than the semantics you want to convey"

From a minimalistic point of view, any accidental feature is ugly complexity, right? But I grant that having easy access to accidental features is useful for a programmer who's writing experimental code.

---

"On the other hand, does anyone want to hide the fact that lists are implemented as conses and that you can take their car and cdr? Not me. That stuff is useful."

I consider Arc lists to be conses by design, not just by implementation. As you say, conses are a positive feature.

On the other hand, the core list utilities would be easier for custom types to support if they relied on a more generic sequence interface, rather than expecting a type of 'cons. This isn't hiding the implementation so much as deciding to rely on it as little as possible, but it's somewhat causally related: If I don't rely on the implementation, I don't care whether or not it's hidden. If the implementation kinda hides itself (the way the body of a 'fn is hidden in official Arc), then I tend to leave it that way, and it can come in handy as a sanity check.

---

"And if you want to define a second new type that's supposed to be a subtype of the first one, then you can still build that on top of the provided "tag" and "type". (I think this answers your last point.)"

No... I said this:

With the way I code, saying (tag a (tag b x)) means I'm making an "a"-typed value that has a "b"-typed value that has "x". I don't consider them all to be the same value[...]

There's no subtype relationship going on here. For comparison's sake, if I say (list (list 4)), I'm making a cons-typed value that has a cons-typed value that has 4. At no point should my conses inherit from 4, and at no point should my "a" inherit from "x". That's just not what I'm trying to do.

---

In any case, just because I like the status quo (or my own spin on it) doesn't mean I shouldn't give your approach a chance too.

So, as long as (tag 'foo (tag 'bar "hello")) makes something nobody can tell is a 'bar, it's consistent for them not to know it's a 'string either. But then what will core utilities which can handle strings and tables (like 'each) do when they get that value? Do they just pick 'string or 'table and assume that type by default? If they pick 'string, won't that make it seem strangely inconsistent when someone comes along and expects (each (k v) (tag 'baz (table)) ...) to work?

-----