Contextual commits – An open standard for capturing the why in Git history

(vidimitrov.substack.com)

30 points | by vidimitrov 12 hours ago

8 comments

teeray 10 hours ago
It continually amazes me how averse people are to just explaining why a commit exists in the body of the commit. Is all this tagging actually easier to read than written prose? You don’t even have to write it anymore if the sight of your editor opening upon `git commit` causes some instinctual revulsion.
- vidimitrov 10 hours ago
  The problem is that usually we don't write the WHY in the commits... We tend to always capture the WHAT in the form of prose. And for agents, this is just more noise, since all they need is just the diff to reconstruct the WHAT.
  I've never seen someone write decisions or the intent they started with in commit messages. Even the solutions today that auto-generate commit messages just summarise the diff.
  This was helpful when humans were the only ones reading the history. But for agents it's useless.
  - 0x457 9 hours ago
    Because commit history is here to explain WHAT and not WHY. "Why" is explained by a decision log such as an ADR, which can be stored in the same repo and can be mutated in the same commit that has the WHAT in its commit body.
    But also, if you look at large projects like Linux or FreeBSD, commits there explain why as well.
    - teeray 2 hours ago
      > Because commit history is here to explain WHAT and not WHY.
      When that commit gets implicated by `git bisect` and all you see in the message is exactly what you’d see by reading the patch anyway, you’ll wish the author answered why they did what they did. This, especially, when the author is no longer at the company.
    - agateau 8 hours ago
      I disagree with this: commit messages should explain the Why. For the What, I can read the diff. Sadly, many commit messages are about the What.
  - skydhash 9 hours ago
    > I've never seen someone write decisions or the intent they started with in commit messages
    You may not have seen enough good repos. The following is an example commit from freebsd
    https://cgit.freebsd.org/src/commit/?id=ac5ff2813027c385f903...
    A proper commit is like an email. You have the first line as the subject, and it may be enough to explain the intent of the diff. But sometimes it’s not enough and you add more details in the body. I strongly believe that people who write the WHAT again don’t know that there’s a diff attached to the commit and think of them as separate objects. GitHub and VSCode do not really help in that regard.
    - codethief 6 hours ago
      > The following is an example commit from freebsd
      The Linux kernel is another great example. Random commit from yesterday:
      https://github.com/torvalds/linux/commit/d56b5d163458c45ab8f...
    - vidimitrov 9 hours ago
      This looks very good. Thanks for sharing. I can only imagine how much discipline it takes to write these kinds of commits manually.
      - skydhash 9 hours ago
        Is it discipline?
        When you think of the patch as a unit of idea and the commit as the means to convey that idea, it takes the same amount of effort as writing an email message.
        BTW you do not have to write those for every single commit. You can always rebase interactively and create a final set of commits for sharing. No one cares about what’s in your local copy of the repo.
    - svstoyanovv 9 hours ago
      I think this requires discipline. The good thing is that we have coding agents, but again, you need a standard to tell the agent what to always look for, how to find it, and to describe your modules properly (even Claude Opus 4.6 makes mistakes when doing hops while tracing code that spans files). Btw, there is also a paper on this issue; Google released it recently.
gorgoiler 8 hours ago
Anyone who wants their commit titles to be less like document headings and more like parseable data structures is going to find it difficult when their peers don’t play along.
To that end you will want to provide a validating parser and then start rejecting commits whose messages don’t validate. If your validator has even one or two bugs you’re going to see all goodwill evaporate, and for what? So that you could read:
```
  bug(fix)[8177] Add missing paren
```
instead of
```
  Add missing paren

  …

  Fixes http://bugs.com/8177
```
Commit messages are the primary source of why you did something. Focus all of your energy on writing clearly, concisely, and compellingly, and helping others to get better at doing so. Working on anything else is wasted energy compared to the importance of honing written communication skills.
- mnahkies 7 hours ago
  I like to follow conventional commit style, and some repos I work on have CI checks for it. It's been fixed now, but for a long time the validator we were using would reject commits that included long urls in the body (for exceeding the width limit).
  It was enraging - I'm trying to provide references to explain the motivation of my changes, all my prose is nicely formatted, but the bulleted list of references I've provided gets my commit rejected.
  I generally think it's in the category of a social problem not a technical problem - communicate the expectations but don't dogmatically enforce them
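  A minimal sketch of a width check that would not reject reference links. The 72-column limit and the `overlong_lines` helper are assumptions for illustration; neither comes from the validator described above.

```python
import re

MAX_WIDTH = 72  # assumed limit; the actual validator's value is not stated
URL_RE = re.compile(r"https?://\S+")


def overlong_lines(message: str, max_width: int = MAX_WIDTH) -> list[int]:
    """Return 1-based numbers of lines longer than max_width.

    Lines that contain a URL are skipped, since a URL cannot be wrapped
    without breaking it.
    """
    offenders = []
    for number, line in enumerate(message.splitlines(), start=1):
        if len(line) <= max_width:
            continue
        if URL_RE.search(line):
            continue  # exempt reference links from the width rule
        offenders.append(number)
    return offenders
```

  A check like this could also be reported as a warning rather than a hard failure, which fits the "communicate, don't dogmatically enforce" point.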
evolve2k 9 hours ago
I like these conventions. Another personal practice: I use the body to record any webpages I’ve relied on (blogs, issue reports, Stack Overflow pages, etc.) to help the commit come together.
The example of using one library over another, especially if research has gone into which to choose, regularly involves say finding a good article that compares the alternatives.
I’ll say though that I usually include links only to the more notable references. I won’t usually commit refs to a library's own docs and more obvious stuff; what I keep and add to the commit body are references to resources I found that went towards getting it done.
Maybe there’s space for useful references to be added to the spec/conventions. Personally I usually show links like this after the body message.
Example of the commit body:
refs(oath-library):
www.something.com/picking-a-thing
stephbook 9 hours ago
> intent(auth): users need social login, starting with Google before GitHub and Apple
Your 'intent' is 'users need social login'? That does not make sense.
Your intent is 'Getting more users by lowering barriers to sign up', a business goal. That business goal might have hierarchical children – for example, Jira epics – such as 'offer social sign-in', or 'declutter landing page.'
Also, the commit mentions 'Google before GitHub', but how can a commit (a snapshot of the repository) know the future? What if your product manager decides Google is fine enough and GitHub/Apple aren't needed?
I wish our profession would stop trying to reinvent issue tracking in git every week.
- sshine 6 hours ago
  > Your intent is 'Getting more users by lowering barriers to sign up', a business goal.
  Important point.
  But it says more about OP’s perspective on product-driven development than this “contextual commits”.
  The idea that OP does not follow up on is whether these codified commit messages with a very shallow summary of the LLM context improve anything compared to the normal commit messages that Claude will generate.
  I suspect they’re equally insufficient since even a compacted context seems less diffusing and it’s still a lot bigger than these.
  > I wish our profession would stop trying to reinvent issue tracking in git every week.
  I would love to have good git-powered issue tracking. I haven’t seen it yet.
- svstoyanovv 9 hours ago
  I understood the example, and it could be a minor hiccup there, but the essence is different:
  By keeping a structured record of a session's key discoveries, decisions, and rejected options (including past commits whose decisions were later rejected), you get a kind of contextual storage of the reasoning. A month later, a team member may pick up a task that you touched and have since forgotten, because with AI-assisted coding PR throughput is sky-high right now. Your colleague will at least know the rationale behind the decision, and their agent will produce more reliable code instead of introducing something just for the sake of solving the task.
SamuelAdams 9 hours ago
Our standard of practice is to document the “why” in Jira. Then reference that card in the commit message.
This gives product owners the ability to embellish as they wish and reduces the need of the dev to repeat themselves.
- vidimitrov 9 hours ago
  That’s cool. But how does it work in an agentic environment? Do you get any benefit from it? Or is it intended only for humans to read?
  - pamcake 8 hours ago
    > how does it work in an agentic environment?
    Give the agent scoped access to the ticket system. Why is this obvious answer not the obvious solution?
agateau 10 hours ago
Would be curious to know if it works better than writing the Why as human-friendly paragraphs in the body of the commit message.
- vidimitrov 10 hours ago
  A few examples are the ability to query historical data and using each action line as a signal for other tooling to build on top but there are many others… you can check what Conventional Commits did in the past and what they unlocked only by introducing structure to commit subjects
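  As a rough illustration of querying action lines, here is a sketch that pulls `type(scope): text` lines out of a commit body. The grammar is inferred from the two examples quoted in this thread (`intent(auth): ...` and `rejected(oauth-library)`), not taken from the spec, and the `Action` type, `parse_actions` name, and sample text are hypothetical.

```python
import re
from typing import NamedTuple


class Action(NamedTuple):
    kind: str   # e.g. "intent" or "rejected"
    scope: str  # the text inside the parentheses
    text: str   # everything after the colon


# Assumed shape of an action line: lowercase kind, parenthesised scope, colon, text.
ACTION_RE = re.compile(r"^(?P<kind>[a-z]+)\((?P<scope>[^)]+)\):\s*(?P<text>.+)$")


def parse_actions(body: str) -> list[Action]:
    """Collect every line of the body that looks like an action line."""
    actions = []
    for line in body.splitlines():
        match = ACTION_RE.match(line.strip())
        if match:
            actions.append(Action(match["kind"], match["scope"], match["text"]))
    return actions
```

  Lines that do not fit the pattern are ignored, so ordinary prose can sit next to the structured lines.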
  - agateau 8 hours ago
    I guess it does not help that I dislike conventional commits :)
jamietanna 7 hours ago
See also: https://news.ycombinator.com/item?id=40949229
keybored 10 hours ago
> an open standard for capturing the WHY in git history
Agentic coding keeps reinventing coding.
That was my first thought.
> And then it hit me - the commit body has always been there. Completely underutilised.
Wait. What? This is the standard?
> Here is an example of how a Contextual Commit looks:
The format is key-value stuff. You can already use trailers for that. The syntax here doesn’t work with that stuff.
If you have already read the “conventional commits” (pronounce with a sneer) specification you have already seen them. They’re called footers because they also didn’t know about trailers.
> No new tools. No infrastructure. Just better commits.
Okay, let’s cut right to the point..
- vidimitrov 9 hours ago
  Trailers were not suitable for the use case.
  The scope in parentheses is doing real work. `rejected(oauth-library)` lets you do `git log --grep="rejected(auth"` to find every rejected auth decision across history.
  If you flatten it to a trailer token you either lose the scope or encode it awkwardly as `Rejected-auth-oauth-library: value`, which doesn't grep cleanly and doesn't parse naturally.
  - 0x457 9 hours ago
    > The scope in parentheses is doing real work. `rejected(oauth-library)` lets you do `git log --grep="rejected(auth"` to find every rejected auth decision across history.
    I'm 99% sure that grep won't find your commit, because you rejected "oauth-library" and are grepping for an "auth" rejection. Given that an LLM will make up category names, it will just get worse unless there is deterministic enforcement.
    All of this really feels like people who never wrote code started doing it via agents and started reinventing already solved issues.
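    The mismatch can be checked with a plain substring test. This sketch assumes the grep pattern is matched as literal text; `grep_matches` is a hypothetical helper and the sample line is made up.

```python
import re


def grep_matches(pattern: str, line: str) -> bool:
    """Literal substring search: the pattern is escaped before matching."""
    return re.search(re.escape(pattern), line) is not None


line = "rejected(oauth-library): example reason"

print(grep_matches("rejected(auth", line))   # False: the scope starts with "oauth"
print(grep_matches("rejected(oauth", line))  # True
print(grep_matches("auth", line))            # True, but no longer tied to the scope
```

    Dropping the `rejected(` prefix finds the line again, but then the search is no longer limited to the scope position.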
    - vidimitrov 9 hours ago
      The “deterministic enforcement” is exactly what this enables, but it's not the responsibility of the spec to say that. It's harnesses or IDEs or your own implementation that will enforce that.
      - 0x457 9 hours ago
        Then why is the last thing the blog post says: "No new tools. No infrastructure. Just better commits."
  - vidimitrov 9 hours ago
    The format is optimised for agent querying and human readability in `git log`, not for `git interpret-trailers` compatibility. Those are different use cases.
    - keybored 7 hours ago
      git int-trailers compatibility is a nonsense phrase. You don’t care about compatibility with a helper tool. You care about the tools that use them... and git log uses them.
      > The format is optimised for agent querying and human readability
      Yours is key value pairs. Trailers are key value pairs. The git log can be read by humans and agents... what’s even the differentiator here?
      Agents read English. But every little minutia of programming now needs something “for agents and humans”? Which is like colon-separated key value pairs... except they also have a scope in parens. Which makes all the difference to agents? tuts
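      To make the "both are key value pairs" point concrete, here is a simplified sketch that reads `Key: value` lines from the last paragraph of a message. It is not the algorithm Git itself uses for trailers, `parse_trailers` is a hypothetical name, and the sample message in the test is made up.

```python
def parse_trailers(message: str) -> list[tuple[str, str]]:
    """Return (key, value) pairs from the last paragraph of a commit message.

    Simplification: the last paragraph counts as a trailer block only if
    every line has the form "Key: value" with no spaces in the key.
    """
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return []  # subject only, so there is no trailer block
    trailers = []
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key or " " in key.strip():
            return []  # the last paragraph is prose, not trailers
        trailers.append((key.strip(), value.strip()))
    return trailers
```

      With this shape, a scope can ride along at the start of the value, as in `Rejected: (auth-library) ...`.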
  - keybored 9 hours ago
```
Rejected: (auth-library) ...
```
    ?
  - skydhash 9 hours ago
    I think those are better suited to an issue tracker. As for changes that affected the source code, you can grep the patch in the git log too.
    - vidimitrov 9 hours ago
      Issue trackers are full of intent and decisions, that's true, but that's not the point here... It's about storage that agents can use natively, without the need to call external APIs or MCPs.
      And there is a slight difference between what you capture in issue trackers and what happens in reality in coding sessions.