Arc Forum | The things you describe - using patch and diff to update hacks for new releases,...

Arc Forum

1 point by rntz 5852 days ago | link | parent

The things you describe - using patch and diff to update hacks for new releases, applying and merging patches, etc, are essentially what a version control system like git does under the hood. So instead of letting a VCS handle this for you, you're doing it manually. And with all due respect, I suspect this works well only with small, few, and simply dependent patches.

Not that making patches small isn't a good thing for other reasons, but I don't like having arbitrary limits on the complexity of my hacks to arc. The hygiene branch on anarki, for example, introduces massive changes to the underlying arc, yet it's a worthwhile hack and I'm thinking of porting it upstream to arc3. The coerce patch, while not nearly as large, involves many deletes and hence changes many line numbers in arc: if I merely represented hacks as diffs, then combining that hack with others would be a pain in the ass due to this changing of line numbers - a simple diff/patch scheme doesn't maintain the history necessary to do this automatically, so I'm stuck fixing the patches on my own. But this is exactly one of the tasks VCSes are built to handle.

Similarly, I'm trying to port many of anarki's changes to arc2 upstream to arc3, and I've already generated 14 distinct and often dependent hacks. I can only see more being generated as the process goes on. Updating all of these hacks for the new feature additions manually would have taken me an hour, maybe several. With git it took me about half an hour (well, more thanks to a stupid mistake I made). If I write the script I have in mind for handling this (which merely drives git) I should be able to do it in ten minutes. That's about the time it should take, IMO.

Keeping changes minimal is good, but it's no panacea. Issues of scale will come up, as they have in other projects; VCSes were created for a reason. If someone were to write, as you suggest, a set of scripts for manipulating hacks - applying, merging, rebasing - with metadata about dependencies, they'd essentially have built a simple VCS!

(Darcs, in particular, is built around the notion of commuting patches - of extracting a given patch or set of changes from those surrounding it in a development history - which allows for precisely the kind of "independent" application of hacks that you desire.)

1 point by CatDancer 5852 days ago | link

If someone were to write, as you suggest, a set of scripts for manipulating hacks - applying, merging, rebasing - with metadata about dependencies, they'd essentially have built a simple VCS!

Maybe. git wasn't up to handling the dependency part.

I suspect this works for you only because you have small, few, and simply dependent patches.

Yup. That's true. I'm imagining, with sufficient effort, large patches can be made small by making Arc more hackable. But unless and until that work is done for a particular patch, if it is possible at all, then a version control system will be necessary for the reasons you describe.

I did write a script to rebase my patches on top of the succession of arc3 releases ^_^ Despite all the various features of git, I needed to write the script to keep the rebase work from being unbearably tedious. What I found interesting was that once I had the script, I didn't need git for anything.

I'm imagining that as I publish my hacks, and if there are some that you'd like to use, that I'll be able to have a script automatically push them to darcs or git (whatever VCS you want to use), showing the succession of releases of a hack as commits in a branch, and then you'll be able to use the normal VCS mechanisms to merge and keep track of changes. If this turns out not to be true, then let me know and I'll see what I can figure out...

-----

3 points by shader 5852 days ago | link

I think that both CatDancer and rntz have good points, the problem is that they are working on different things and thus have different perspectives.

CatDancer is mostly working on libraries - independent pieces of code that, while they may redifine some things, or make minor changes to the base language, mostly leave it alone. These can be written easily in the "shortest distance from arc" method and still work well together because they don't actually change the base of arc. This also means that while VC is useful in development, it is hardly needed in publishing.

rntz has been working on porting changes from arc2 to arc3 and some other hacks, many of which require major changes to the arc code base itself such as his coerce hack. These need to be VC'd so that they can be understood by other users of Anarki, and so that they can work more easily with eachother. Since many of them create major changes to the codebase, it can sometimes be a challenge to get them to work together and as he says plain diffs would be a nightmare.

I may be summarizing a bit much, and I'm sure there are more subtleties to it than that, but it's how I understand the situation.

As far as I can tell, they can be handled separately. The publishing of libraries and miscellaneous code can be done without publicly visible version control, but the Anarki base should probably be versioned.

Git is still a good choice for what we're doing, as far as I can tell, because for one thing all it is is scripts built for managing it's base. That is the essence of the so-called porcelain. I'm sure it won't hurt anyone to add a little bit more in the way of domain-specific tools. They might be useful to others as well.

So it sounds like we have to things going on here:

1) We need a meta-data and library publishing system for people to share code that they've written to use with arc.

2) We need a slightly better method of handling changes to the arc base so that people can coordinate massive changes thereto. I think that individual commits works ok for minor changes, but it rapidly gets complicated to handle if they start stepping on eachother. That's when we should switch to "branches" which can be interpreted to mean alternate conceptions of arc. Case in point being the hygienic version of arc in the hygiene branch.

At some point we just admit that the lisp community is hopelessly built on making their own incompatible versions of lisp, and everyone makes their own fork of arc. Magic merges and rebasing can only handle so much incompatibility in the history of the system.

-----

2 points by CatDancer 5852 days ago | link

Git is still a good choice for what we're doing

I wouldn't dismiss the possibility of using darcs for your project. git was designed to merge tens of thousands of lines of code in a fraction of a second; that doesn't mean it isn't useful for other tasks, but it also doesn't mean that darcs might not be better for the kinds of tasks that you find yourselves needing to do.

-----

3 points by shader 5852 days ago | link

Git was also designed to have a fast, simple back end, with a script based front end. Each command that you're used to using with git is just a shell or perl script that calls a few basic git functions to mess with blobs. Therefore, it is easy to add new tools and scripts. A good example given the darcs context would be: http://raphael.slinckx.net/blog/2007-11-03/git-commit-darcs-...

My point is that the architecture of git allows us to write scripts that work at the same level as the other git scripts; we can keep or leave the others as desired.

-----