Code-aware VCS #5
Replies: 3 comments 1 reply
-
I had a similar idea a few weeks ago, but I thought of simply outputting the code changes as a helper string rather than using that data to build a new VCS. You could do something like
-
I believe a customized rope over the AST covers most of these things (unless semantic information is wanted). As I understand it, such a rope holds: 1. the AST of the program, 2. traversal info for efficient lookup and modification of nodes in the tree, 3. formatting-specific info associated with the AST (if the formatting is not unambiguous). For example, an LSP server combines this info with semantic information (lazily).
This means 1+2 can be solved, as we are essentially only checking AST equality with respect to some property (for example, leaving out some parts of the tree).
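To make "AST equality with respect to some property" concrete, here is a minimal sketch using Python's standard `ast` module. It assumes Python sources and picks docstrings as the part that gets left out; the names `DropDocstrings`, `fingerprint` and `equivalent` are made up for illustration. A real tool would need a parser per language and would choose its own properties.

```python
import ast


class DropDocstrings(ast.NodeTransformer):
    """Strip leading docstrings: the hypothetical "left out" part of the tree."""

    def _strip(self, node):
        self.generic_visit(node)  # handle nested functions and classes first
        body = node.body
        if (body and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            # The tree is only dumped for comparison, never compiled,
            # so an empty body is acceptable here.
            node.body = body[1:]
        return node

    visit_Module = visit_FunctionDef = visit_AsyncFunctionDef = visit_ClassDef = _strip


def fingerprint(source: str, ignore_docstrings: bool = False) -> str:
    """Dump the AST without line/column info, so layout and comments vanish."""
    tree = ast.parse(source)
    if ignore_docstrings:
        tree = DropDocstrings().visit(tree)
    return ast.dump(tree)


def equivalent(a: str, b: str, ignore_docstrings: bool = False) -> bool:
    """AST equality, optionally modulo docstrings."""
    return fingerprint(a, ignore_docstrings) == fingerprint(b, ignore_docstrings)
```

Comments, whitespace and redundant parentheses never reach the AST, so two differently formatted versions of the same function compare equal here without any extra work. Anything else that should be ignored has to be stripped explicitly, as done for docstrings.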
Git can pull out function symbols for you with
Does anybody have experience in AST-based program minification?
UPDATE: added link and clarified assumptions.
-
I just found this paper by Jonathan Edwards, which addresses this very issue. Most of the content is too academic for my smooth brain as the sun comes up, but the basic idea is to track document changes with the edit as the elemental unit of change. There are four kinds of edit: 'Insertion' of a token at the position of another token, pushing the rest of the document forward; 'Conversion' of one token to another; 'Move' (which is basically REmove, keeping the removed token as a tombstone); and the 'Identity' edit, which does nothing. A 'project' function takes a source and a diff and produces the modified source. 'retract' attempts the inverse, taking a source and a diff and producing the prior source, but it cannot do so when there are conflicts.
The paper is really interesting, and I'll read the rest tomorrow with some good rest and hot coffee. The comments on Jonathan Edwards' blog post link to some interesting resources as well. I'd like to work on this. Hopefully we can find a solution worth using ourselves.
Follow up: Here's a video of J. Edwards' prototype in action: HATRA'21 talk
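For anyone who wants to play with the idea, here is a toy sketch of edits, `project` and `retract` in Python. It follows only the summary in this comment, not the paper's actual definitions. Everything in it is an assumption made for illustration: the class names, the choice that a diff is a plain list of edits, the choice that `Convert` records the old token so it can be undone, and `Remove` standing in for the remove-with-tombstone half of what is called 'Move' above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Token:
    text: str
    removed: bool = False  # tombstone flag: the token stays in the document


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str


@dataclass(frozen=True)
class Convert:
    pos: int
    old: str  # assumption: the edit records what it replaced
    new: str


@dataclass(frozen=True)
class Remove:  # stands in for 'Move' as summarized above
    pos: int


@dataclass(frozen=True)
class Identity:
    pass


Edit = Union[Insert, Convert, Remove, Identity]


class Conflict(Exception):
    """The document does not look the way the edit expects."""


def project(doc: List[Token], diff: List[Edit]) -> List[Token]:
    """Apply each edit in order and return the modified document."""
    out = list(doc)
    for e in diff:
        if isinstance(e, Insert):
            out.insert(e.pos, Token(e.text))  # pushes the rest forward
        elif isinstance(e, Convert):
            if out[e.pos].text != e.old:
                raise Conflict(e)
            out[e.pos] = Token(e.new, out[e.pos].removed)
        elif isinstance(e, Remove):
            out[e.pos] = Token(out[e.pos].text, removed=True)
    return out


def retract(doc: List[Token], diff: List[Edit]) -> List[Token]:
    """Undo the edits in reverse order; raise Conflict when that is impossible."""
    out = list(doc)
    for e in reversed(diff):
        if isinstance(e, Insert):
            if e.pos >= len(out) or out[e.pos].text != e.text:
                raise Conflict(e)
            del out[e.pos]
        elif isinstance(e, Convert):
            if out[e.pos].text != e.new:
                raise Conflict(e)
            out[e.pos] = Token(e.old, out[e.pos].removed)
        elif isinstance(e, Remove):
            if not out[e.pos].removed:
                raise Conflict(e)
            out[e.pos] = Token(out[e.pos].text)
    return out


def render(doc: List[Token]) -> str:
    """Visible text: tombstoned tokens are skipped."""
    return " ".join(t.text for t in doc if not t.removed)
```

Keeping tombstones means positions in later edits still line up after a removal, which is presumably part of why the removed token is kept around at all.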
-
Version control software generally doesn't actually understand code. For example, it's a well-known fact that simply indenting a bunch of code in Git will probably result in horrific merge conflicts, because the "lines have changed", despite it being an obviously trivial and mergeable change.
What could version control systems do if they actually understood the programs they were versioning? A few starter examples:
This is essentially an open area of research that someone should dig into!
It is worth noting, by the way, that Git actually allows for smarter "merge drivers", but these do not seem to be in widespread use. It's possible that programmers could get quite a quality-of-life boost even within current version control systems.
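As a rough illustration of what such a merge driver could look like, here is a sketch in Python. Git's documented interface is used as-is: a driver is registered under `[merge "<name>"]` in the Git config, assigned to paths in `.gitattributes`, receives the ancestor, current and other versions via `%O`, `%A` and `%B`, leaves its result in the `%A` file, and signals conflicts with a non-zero exit status. The driver name `pyast`, the script name and the resolution policy are all hypothetical, and the sketch only handles the case where one side changed nothing but layout or comments.

```python
# Hypothetical wiring (names are made up for this sketch):
#
#   .gitattributes:   *.py merge=pyast
#   .git/config:      [merge "pyast"]
#                         name = formatting-aware Python merge (sketch)
#                         driver = python3 pyast_merge.py %O %A %B
import ast
import sys


def same_ast(a: str, b: str) -> bool:
    """True when both sources parse to the same AST (layout and comments ignored)."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


def resolve(base: str, ours: str, theirs: str):
    """Return the merged text, or None when a real three-way merge is needed."""
    try:
        if same_ast(base, theirs):
            return ours    # their side only reformatted: keep our edits
        if same_ast(base, ours):
            return theirs  # our side only reformatted: take their edits
    except SyntaxError:
        pass               # unparsable input: give up and report a conflict
    return None


def main(argv) -> int:
    base_path, ours_path, theirs_path = argv[1:4]
    texts = []
    for path in (base_path, ours_path, theirs_path):
        with open(path, encoding="utf-8") as f:
            texts.append(f.read())
    merged = resolve(*texts)
    if merged is None:
        return 1           # non-zero exit tells Git the merge has conflicts
    with open(ours_path, "w", encoding="utf-8") as f:
        f.write(merged)    # Git picks up the result from the %A file
    return 0


if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

Note the trade-off in this policy: when one side only reformatted, its reformatting is dropped in favour of the other side's real changes. A more complete driver would fall back to a textual three-way merge instead of simply reporting a conflict.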
I'm sure this idea has been explored to some extent (it's a pretty obvious idea) but I'm not familiar with what's out there myself. If people know of existing research or tools in this space, please post them in the discussion below.