Arc Forum
4 points by Pauan 408 days ago | link | parent

I've thought about this problem for several years now.

The following properties are desirable:

- Convenience (auto-imports, imports not needing to specify the variables that are used, etc.)

- Correctness (ensuring that your code uses the variables you expect it to use, and that it doesn't easily break by changing some random unrelated file)

- Performance (only loading the functions/variables that you actually use)

Unfortunately, it is probably impossible to get all three properties at the same time.

By making code more convenient (e.g. implicit imports), you make it harder to reason about code ("I'm using the variable foo, but where does it come from?"), and you make it more likely for there to be variable conflicts (two different files defining a variable with the same name).

You also make it more likely for your program to break when upgrading a library to a new version, or even just when making a change to a seemingly unrelated file.

And if there is a variable conflict (which will happen), then you need some way of disambiguating, which generally requires you to specify which file should take precedence. So in the situation where there are variable conflicts, you lose all of the convenience benefits for that variable.

(And if you choose to not disambiguate, and instead ignore the variable conflicts, then you lose correctness, so that only works for very small projects)
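To make the conflict case concrete, here is a minimal sketch in Python (not Arc) of how a resolver might behave. The file names, the `EXPORTS` table, and the `precedence` argument are all hypothetical; they only illustrate that a conflicting name cannot be resolved without the programmer saying which file wins.

```python
# Hypothetical table of which file defines which names.
EXPORTS = {
    "strings.arc": {"trim", "split", "join"},
    "paths.arc": {"join", "basename"},
}


class AmbiguousName(Exception):
    """Raised when more than one file defines the requested name."""


def resolve(name, exports, precedence=None):
    """Return the file that provides `name`, or raise if that is unclear."""
    providers = sorted(f for f, names in exports.items() if name in names)
    if not providers:
        raise NameError(f"{name} is not defined in any searched file")
    if len(providers) == 1:
        return providers[0]
    # Conflict: only an explicit precedence entry can settle it.
    if precedence and name in precedence:
        chosen = precedence[name]
        if chosen not in providers:
            raise ValueError(f"{chosen} does not define {name}")
        return chosen
    raise AmbiguousName(f"{name} is defined in {', '.join(providers)}")
```

With this sketch, `resolve("trim", EXPORTS)` succeeds implicitly, while `resolve("join", EXPORTS)` fails until the caller passes something like `precedence={"join": "paths.arc"}`, which is effectively an explicit import for that one name.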

Convenience also conflicts with performance. The language doesn't know which variables are defined in which file, so if you use the "foo" variable, it has to search through all of the files in order to determine which file contains the "foo" variable.

That also means your language needs to define which directories it will automatically import files from (you don't want it searching through your entire hard drive!).

Various languages (including Python[1] and Node.js[2]) have defined their own rules for determining which folders will be searched through for files, so this is a solvable problem.

But unlike Python and Node.js, implicit imports require the language to also search through the contents of all of the files inside those folders... this is obviously an expensive operation.
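A rough Python sketch of that search, to show where the cost comes from. The regular expression, the `.arc` extension filter, and the `build_index` and `demo` helpers are assumptions made for illustration, not how any real implementation works; a real one would need a proper reader rather than a regex.

```python
import os
import re
import tempfile

# Matches top-level Arc-style forms such as "(def foo", "(mac foo" or "(= foo".
DEF_PATTERN = re.compile(r"^\((?:def|mac|=)\s+([^\s()]+)", re.MULTILINE)


def build_index(search_dirs):
    """Map each defined name to the files defining it.

    Every .arc file under every search directory is opened and read,
    whether or not the program ends up using anything from it.
    """
    index = {}
    files_read = 0
    for directory in search_dirs:
        for root, _dirs, files in os.walk(directory):
            for filename in sorted(files):
                if not filename.endswith(".arc"):
                    continue
                path = os.path.join(root, filename)
                with open(path, encoding="utf-8") as handle:
                    source = handle.read()
                files_read += 1
                for name in DEF_PATTERN.findall(source):
                    index.setdefault(name, []).append(path)
    return index, files_read


def demo():
    """Build an index over a throwaway directory containing two files."""
    with tempfile.TemporaryDirectory() as lib:
        with open(os.path.join(lib, "a.arc"), "w", encoding="utf-8") as f:
            f.write("(def foo (x) x)\n(def bar (x) x)\n")
        with open(os.path.join(lib, "b.arc"), "w", encoding="utf-8") as f:
            f.write("(def foo (y) y)\n")
        index, files_read = build_index([lib])
        by_name = {
            name: [os.path.basename(p) for p in paths]
            for name, paths in index.items()
        }
        return by_name, files_read
```

The work is proportional to the total size of every file on the search path, not to the number of names the program uses, and the resulting index also exposes conflicts (here `foo` is defined by both files).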

You could make it work. If your language allows you to pre-compile a binary, then it only needs to look up the files once (at compile-time). And with tree shaking, the final binary would only contain the variables that you actually use.
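Tree shaking here just means keeping what is reachable from the program's entry points. A minimal Python sketch, assuming the compiler has already produced a map from each definition to the names its body references (that map and the `tree_shake` name are illustrative):

```python
def tree_shake(definitions, entry_points):
    """Return the set of definitions reachable from the entry points.

    `definitions` maps each defined name to the set of names that its
    body refers to. Names with no entry (e.g. builtins) are skipped.
    """
    keep = set()
    pending = list(entry_points)
    while pending:
        name = pending.pop()
        if name in keep or name not in definitions:
            continue
        keep.add(name)
        pending.extend(definitions[name])
    return keep


# Hypothetical program: "unused" is defined but never reached from "main".
DEFINITIONS = {
    "main": {"my-even", "prn"},
    "my-even": {"my-odd", "is", "-"},
    "my-odd": {"my-even", "is", "-"},
    "unused": {"my-even"},
}

KEPT = tree_shake(DEFINITIONS, ["main"])
```

In this example `KEPT` contains `main`, `my-even` and `my-odd`, and `unused` is dropped even though it refers to something that is kept.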

You would lose some correctness, and you would lose compile-time performance, but some people would consider the gains in convenience to be worth it.

Arc, however, does not have a separate compile-time, so it would need to do the folder/file lookup every single time you run any Arc program. I think the performance loss would be too great for any medium or large programs.

----

By the way, there's another interesting issue... mutually recursive functions:

  (def my-even (a)
    (if (is a 0)
      t
      (my-odd (- a 1))))

  (def my-odd (a)
    (if (is a 0)
      nil
      (my-even (- a 1))))

When the compiler is compiling the "my-even" function, it encounters the "my-odd" variable, but the "my-odd" variable hasn't been defined yet, so the compiler treats it as a global variable and tries to automatically import it.

If there is a "my-odd" variable defined in another file, then the "my-even" function will call that variable, rather than the "my-odd" variable defined in the current file.

There are various ways to fix this problem, probably the easiest of which is to define the "my-odd" variable before it is used:

  (= my-odd nil)

  (def my-even (a)
    (if (is a 0)
      t
      (my-odd (- a 1))))

  (def my-odd (a)
    (if (is a 0)
      nil
      (my-even (- a 1))))

This means that all variables need to be defined before use, which is contrary to the way Arc works. Also, if you forget to define a variable before use, then your program is buggy. And it is quite easy to forget.
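The difference between resolving names as each definition is compiled and resolving them after the whole file has been read can be sketched in Python. This is only a model: `FORMS`, `OTHER_FILES` and both resolver functions are made up for illustration, and the two-pass variant is just one possible alternative to pre-declaring variables, not something Arc does.

```python
# Each top-level form: (name being defined, free variables in its body).
FORMS = [
    ("my-even", ["my-odd"]),
    ("my-odd", ["my-even"]),
]

# Hypothetical: another file on the search path also defines my-odd.
OTHER_FILES = {"my-odd": "somelib.arc"}


def _lookup(var, local, other_files):
    """Prefer a definition in the current file, else fall back to auto-import."""
    if var in local:
        return "local"
    return other_files.get(var, "unbound")


def resolve_one_pass(forms, other_files):
    """Resolve each body knowing only the definitions seen so far."""
    local = set()
    resolved = {}
    for name, free_vars in forms:
        local.add(name)
        for var in free_vars:
            resolved[(name, var)] = _lookup(var, local, other_files)
    return resolved


def resolve_two_pass(forms, other_files):
    """Collect every top-level name first, then resolve the bodies."""
    local = {name for name, _ in forms}
    resolved = {}
    for name, free_vars in forms:
        for var in free_vars:
            resolved[(name, var)] = _lookup(var, local, other_files)
    return resolved
```

In the one-pass model the reference from `my-even` to `my-odd` resolves to `somelib.arc`, which is the bug described above. The two-pass model resolves both references locally, but it requires seeing the whole file before compiling anything, which does not fit a language that evaluates one top-level form at a time.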

So I think an auto-import system could work, but your language has to be designed around it. You can't just add an auto-import system to an existing language, because of these (and other) issues.

----

* [1]: https://docs.python.org/3/glossary.html#term-import-path

* [2]: https://nodejs.org/api/modules.html#modules_loading_from_nod...



4 points by Pauan 407 days ago | link

Also, another interesting point: if you move away from our typical text-based ways of editing code, it may be possible to achieve convenience and performance:

http://unisonweb.org/2015-05-07/about.html#post-start

In Unison, you do not need to import code. That's because every variable is actually the hash of its definition, and when you use that variable in your code, it automatically "imports" the hash. That also means that your code contains only the variables that you actually use.

You do still lose some correctness, though. As an example, if you have a function and you change the definition of that function, it will create a new hash (for the new definition), but the old hash is still being used in your code.

Which means that you have both the old version of the function and the new version of the function existing at the same time. So you need to manually update all of your code to use the new hash:

http://unisonweb.org/2015-06-12/editing.html
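The old-hash/new-hash situation can be modelled in a few lines of Python. This is a toy, not Unison's actual design: hashing the raw source text with SHA-256, truncating to 12 characters, and the `Codebase` class are all simplifications chosen for illustration.

```python
import hashlib


class Codebase:
    """Toy content-addressed store: definitions are keyed by their hash."""

    def __init__(self):
        self.definitions = {}  # hash -> source text
        self.names = {}        # human-readable name -> current hash

    def define(self, name, source):
        """Store `source` under its hash and point `name` at that hash."""
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]
        self.definitions[digest] = source
        self.names[name] = digest
        return digest


codebase = Codebase()

# The caller refers to "double" by hash, not by name.
old_double = codebase.define("double", "(fn (x) (* x 2))")
caller = codebase.define("quadruple", f"(fn (x) ({old_double} ({old_double} x)))")

# Editing "double" produces a new hash; the old definition is not replaced.
new_double = codebase.define("double", "(fn (x) (+ x x))")
```

After the edit, both `old_double` and `new_double` exist in `codebase.definitions`, and the stored source of `quadruple` still mentions `old_double`. Updating it means writing a new `quadruple` that mentions `new_double`, which in turn gets its own new hash, and so on up through everything that depends on it.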

This problem hasn't been solved yet, but there have been discussions about providing some tools that can automate certain tasks (such as replacing an old hash with a new hash across your entire code base).

Assuming the above problem is solved, it may actually be possible to have convenience, correctness, and performance with an auto-import system.

But Unison's approach doesn't work with existing languages, so my point that "the language needs to be designed specifically for auto-imports" still stands.

-----