It continually amazes me how averse people are to just explaining why a commit exists in the body of the commit. Is all this tagging actually easier to read than written prose? You don’t even have to write it anymore if the sight of your editor opening upon `git commit` causes some instinctual revulsion.
The problem is that usually we don't write the WHY in the commits... We tend to always capture the WHAT in the form of prose. And for agents, this is just more noise, since all they need is just the diff to reconstruct the WHAT.
I've never seen someone write decisions or the intent they started with in commit messages. Even the solutions today that auto-generate commit messages just summarise the diff.
This was helpful when humans were the only ones reading the history. But for agents it's useless.
Because commit history is here to explain WHAT and not WHY. "Why" is explained by a decision log such as an ADR, which can be stored in the same repo and updated in the same commit that has the WHAT in its body.
But also, if you look at large projects like Linux or FreeBSD, commits there explain why as well.
> Because commit history is here to explain WHAT and not WHY.
When that commit gets implicated by `git bisect` and all you see in the message is exactly what you’d see by reading the patch anyway, you’ll wish the author answered why they did what they did. This, especially, when the author is no longer at the company.
A proper commit is like an email. You have the first line as the subject and it may be enough to explain the intent of the diff. But sometimes it’s not enough and you add more details in the body. I strongly believe that people who write the WHAT again don’t know that there’s a diff attached to the commit and think of them as separate objects. GitHub and VSCode do not really help in that regard.
When you think of the patch as a unit of idea and the commit as the means to convey that idea, it takes the same amount of effort as writing an email message.
BTW you do not have to write those for every single commit. You can always rebase interactively and create a final set of commits for sharing. No one cares about what’s in your local copy of the repo.
I think this requires discipline. The good thing is that we have coding agents, but again, you need a standard to tell the agent what to always look for, how to find it, and to describe your modules properly (even Claude Opus 4.6 makes mistakes when hopping between files while tracing code). Btw, there is also a paper on this issue; Google released it recently.
Anyone who wants their commit titles to be less like document headings and more like parseable data structures is going to find it difficult when their peers don’t play along.
To that end you will want to provide a validating parser and then start rejecting commits whose messages don’t validate. If your validator has even one or two bugs you’re going to see all goodwill evaporate, and for what? So that you could read:
bug(fix)[8177] Add missing paren
instead of
Add missing paren
…
Fixes http://bugs.com/8177
Commit messages are the primary source of why you did something. Focus all of your energy on writing clearly, concisely, and compellingly, and helping others to get better at doing so. Working on anything else is wasted energy compared to the importance of honing written communication skills.
I like to follow conventional commit style, and some repos I work on have CI checks for it. It's been fixed now, but for a long time the validator we were using would reject commits that included long URLs in the body (for exceeding the width limit).
It was enraging - I'm trying to provide references to explain the motivation of my changes, all my prose is nicely formatted, but the bulleted list of references I've provided gets my commit rejected.
I generally think it's in the category of a social problem not a technical problem - communicate the expectations but don't dogmatically enforce them
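For anyone who does want a width check, here is a minimal sketch of one that would not have rejected those reference lists. The 72-character limit, the function name, and the rule "a line is fine if it fits once its URLs are removed" are all my assumptions, not what any particular validator does.

```python
import re

MAX_WIDTH = 72  # assumed limit; the thread does not say what the real one was
URL_RE = re.compile(r"https?://\S+")


def overlong_lines(message: str, max_width: int = MAX_WIDTH) -> list[int]:
    """Return the 1-based numbers of lines that are too long.

    A line is exempt when it fits within max_width once its URLs are
    removed, because a URL cannot be wrapped without breaking it.
    """
    offenders = []
    for number, line in enumerate(message.splitlines(), start=1):
        if len(line) <= max_width:
            continue
        if len(URL_RE.sub("", line)) <= max_width:
            continue  # long only because of a link: let it through
        offenders.append(number)
    return offenders
```

A bulleted reference such as `- https://example.com/some/very/long/path` passes regardless of length, while an overlong line of plain prose is still reported.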
I like these conventions. Another personal practice: I use the body to record any webpages I’ve relied on (blogs, issue reports, Stack Overflow pages, etc.) to help the commit come together.
The example of using one library over another, especially if research has gone into which to choose, regularly involves say finding a good article that compares the alternatives.
I’ll say though that I usually include links only to the more notable references. I won’t usually commit refs to a library’s own docs and more obvious stuff; what I keep and add to the commit body are the resources I found that actually went towards getting it done.
Maybe there’s space for useful references to be added to the spec/conventions.
Personally I usually show links like this after the body message.
> intent(auth): users need social login, starting with Google before GitHub and Apple
Your 'intent' is 'users need social login'? That does not make sense.
Your intent is 'Getting more users by lowering barriers to sign up', a business goal. That business goal might have hierarchical children – for example, Jira epics – such as 'offer social sign-in', or 'declutter landing page.'
Also, the commit mentions 'Google before GitHub', but how can a commit (a snapshot of the repository) know the future? What if your product manager decides Google is fine enough and GitHub/Apple aren't needed?
I wish our profession would stop trying to reinvent issue tracking in git every week.
> Your intent is 'Getting more users by lowering barriers to sign up', a business goal.
Important point.
But it says more about OP’s perspective on product-driven development than this “contextual commits”.
The idea that OP does not follow up on is whether these codified commit messages with a very shallow summary of the LLM context improve anything compared to the normal commit messages that Claude will generate.
I suspect they’re equally insufficient since even a compacted context seems less diffusing and it’s still a lot bigger than these.
> I wish our profession would stop trying to reinvent issue tracking in git every week.
I would love to have good git-powered issue tracking. I haven’t seen it yet.
I understood the example, and it could be a minor hiccup there, but the essence is different:
By keeping a structured record of the key session discoveries, decisions, and rejected options (including past commits whose decisions were later rejected), you get a kind of contextual storage for the reasoning. A month later, when a team member picks up a task you touched and have since forgotten (you are doing AI-assisted coding and PR throughput is sky-high right now), your colleague will at least know the rationale behind the decision. Working with their agent, the agent will produce more reliable code instead of introducing something just for the sake of solving the task.
A few examples are the ability to query historical data and to use each action line as a signal for other tooling to build on, but there are many others… you can check what Conventional Commits did in the past and what it unlocked just by introducing structure to commit subjects.
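To make "each action line as a signal for other tooling" concrete, here is a rough sketch of parsing such lines out of a commit body. The grammar (`action(scope): text`, lowercase action) is only my inference from the single `intent(auth): ...` example quoted in this thread, and `ActionLine` and `parse_action_lines` are hypothetical names, not part of any spec.

```python
import re
from typing import NamedTuple


class ActionLine(NamedTuple):
    action: str  # e.g. "intent" or "rejected"
    scope: str   # e.g. "auth"
    text: str    # free-form explanation after the colon


# Assumed grammar: action(scope): text
ACTION_RE = re.compile(
    r"^(?P<action>[a-z]+)\((?P<scope>[^)]+)\):\s*(?P<text>.+)$"
)


def parse_action_lines(body: str) -> list[ActionLine]:
    """Collect every line of the body that matches the assumed grammar."""
    found = []
    for raw in body.splitlines():
        match = ACTION_RE.match(raw.strip())
        if match:
            found.append(ActionLine(**match.groupdict()))
    return found
```

Tooling could then filter on `action` and `scope` as exact fields rather than relying on substring searches over the raw log. Lines that do not match are simply treated as ordinary prose.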
> an open standard for capturing the WHY in git history
Agentic coding keeps reinventing coding.
That was my first thought.
> And then it hit me - the commit body has always been there. Completely underutilised.
Wait. What? This is the standard?
> Here is an example of how a Contextual Commit looks:
The format is key-value stuff. You can already use trailers for that. The syntax here doesn’t work with that stuff.
If you have already read the “conventional commits” (pronounce with a sneer) specification you have already seen them. They’re called footers because they also didn’t know about trailers.
> No new tools. No infrastructure. Just better commits.
The scope in parentheses is doing real work. `rejected(oauth-library)` lets you do `git log --grep="rejected(auth"` to find every rejected auth decision across history.
If you flatten it to a trailer token you either lose the scope or encode it awkwardly as `Rejected-auth-oauth-library: value`, which doesn't grep cleanly and doesn't parse naturally.
> The scope in parentheses is doing real work. `rejected(oauth-library)` lets you do `git log --grep="rejected(auth"` to find every rejected auth decision across history.
I'm 99% sure that grep won't find your commit, because you rejected "oauth-library" and you are grepping for an "auth" rejection. Given that an LLM will make up category names, it will just get worse unless there is deterministic enforcement.
All of this really feels like people who never wrote code started doing it via agents and are now reinventing already-solved problems.
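The grep objection above is easy to check without a repository. The sketch below assumes git's default basic-regex matching for `--grep`, where an unescaped `(` is a literal character, so the search behaves like a plain substring test. The three history lines and the looser pattern are made up for illustration.

```python
import re


def grep_literal(pattern: str, lines: list[str]) -> list[str]:
    """Return the lines that contain pattern as a literal substring."""
    return [line for line in lines if pattern in line]


# Hypothetical action lines as they might appear across a history.
history = [
    "rejected(oauth-library): ...",
    "rejected(auth): ...",
    "rejected(auth-middleware): ...",
]

# The query from the thread: the scope must START with "auth".
strict = grep_literal("rejected(auth", history)

# A looser regex that accepts "auth" anywhere inside the parentheses.
loose_re = re.compile(r"rejected\([^)]*auth")
loose = [line for line in history if loose_re.search(line)]

print(strict)  # the two scopes beginning with "auth"
print(loose)   # all three lines, including oauth-library
```

So the query as written misses `rejected(oauth-library)`. Finding it needs either a looser pattern or scope names that are enforced to share a prefix.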
The “deterministic enforcement” is exactly what this enables, but it's not the responsibility of the spec to say that. It's harnesses, IDEs, or your own implementation that will enforce it.
The format is optimised for agent querying and human readability in `git log`, not for `git interpret-trailers` compatibility. Those are different use cases.
`git interpret-trailers` compatibility is a nonsense phrase. You don’t care about compatibility with a helper tool. You care about the tools that use them... and git log uses them.
> The format is optimised for agent querying and human readability
Yours is key value pairs. Trailers are key value pairs. The git log can be read by humans and agents... what’s even the differentiator here?
Agents read English. But every little minutia of programming now needs something “for agents and humans”? Which is like colon-separated key value pairs... except they also have a scope in parens. Which makes all the difference to agents? tuts
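For comparison, here is a deliberately simplified sketch of reading trailers as key-value pairs from the last paragraph of a message. Git's real trailer detection is more lenient than this, and the `Fixes` and `Rejected` keys in the usage note are hypothetical. The key pattern (letters, digits, hyphens) reflects my understanding of what git accepts as a trailer token, which is why a key with a parenthesised scope does not parse here.

```python
import re

# Assumed key shape: letters, digits and hyphens only.
TRAILER_RE = re.compile(r"^(?P<key>[A-Za-z0-9-]+):\s*(?P<value>.+)$")


def parse_trailers(message: str) -> dict[str, list[str]]:
    """Parse 'Key: value' lines from the final paragraph of a message.

    Simplification: every line of the final paragraph must look like a
    trailer, otherwise the whole block is treated as prose.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    if len(paragraphs) < 2:
        return {}  # subject line only, so there is no trailer block
    trailers: dict[str, list[str]] = {}
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line)
        if not match:
            return {}
        trailers.setdefault(match["key"], []).append(match["value"])
    return trailers
```

A message ending in `Fixes: http://bugs.com/8177` and `Rejected: oauth-library` parses into two keys, while a final line like `rejected(oauth-library): too heavy` yields nothing under this sketch.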
Issue trackers are full of intent and decisions, that's true, but that's not the point here... It's about a storage that agents can use natively without needing to call external APIs or MCPs.
And there is a slight difference between what you capture in issue trackers and what happens in reality in coding sessions.
You may not have seen enough good repos. The following is an example commit from FreeBSD:
https://cgit.freebsd.org/src/commit/?id=ac5ff2813027c385f903...
The Linux kernel is another great example. Random commit from yesterday:
https://github.com/torvalds/linux/commit/d56b5d163458c45ab8f...
Maybe there’s space for useful references to be added to the spec/conventions. Personally I usually show links like this after the body message.
Example of the commit body:
refs(oath-library):
www.something.com/picking-a-thing
This gives product owners the ability to embellish as they wish and reduces the need of the dev to repeat themselves.
Give the agent scoped access to the ticket system. Why is this obvious answer not the obvious solution?
> No new tools. No infrastructure. Just better commits.
Okay, let’s cut right to the point..