Imo Cursor did have the first-mover advantage by making the first well-known AI coding agent IDE. But I can't help but think they have no realistic path forward.
As someone who is a huge IDE fan, I vastly prefer the experience from Codex CLI compared to having that built into my IDE, which I customize for my general purposes. The fact it's a fork of VSCode (or whatever) will make me never use it. I wonder if they bet wrong.
But that's just usability and preference. When the SOTA model makers give out tokens for substantially less than public API cost, how in the world is Cursor going to stay competitive? The moat just isn't there (in fact I would argue it's non-existent).
Yeah, hard disagree on that one. Based on recent surveys, 80-90% of developers globally use IDEs over CLIs for their day-to-day work.
I was pretty worried about Cursor's business until they launched their Composer 1 model, which is fine-tuned to work amazingly well in their IDE. It's significantly faster than using any other model, and it's clearly fine-tuned for the type of work people use Cursor for. They are also clearly charging a premium for it and making a healthy margin on it, but for how fast + good it's totally worth it.
Composer 1 + eventually creating an AI-native version of GitHub with Graphite: that's a serious business, with a much clearer picture (to me) of how Cursor gets to serious profitability vs the AI labs.
As the other commenter stated, I don't use CLIs for development. I use VSCode.
I'm very pro IDE. I've built up an entire collection of VSCode extensions and workflows for programming, and for building, customizing, and debugging embedded systems within VSCode. But I still prefer CLI-based AI (comparing the agent to the IDE version).
> Composer 1
My bet is their model doesn't realistically compare to any of the frontier models. And even if it did, it would become outdated very quickly.
It seems somewhat clear (at least to me) that economies of scale heavily favor AI model development. Spend billions making massive models that are unusable due to cost and speed, and distill their knowledge + fine-tune them for stuff like tools. Generalists are better than specialists. You make one big model and produce 5 models that are SOTA in 5 different domains. Cursor can't do that realistically.
> My bet is their model doesn't realistically compare to any of the frontier models.
I've been using composer-1 in Cursor for a few weeks and also switching back and forth between it, Gemini Flash 3, Claude Opus 4.5, Claude Sonnet 4.5 and GPT 5.2.
And you're right it's not comparable. It's about the same quality of code output of the aforementioned models but about 4x as fast. Which enables a qualitatively different workflow for me where instead of me spending a bunch of time waiting on the model, the model is waiting on me to catch up with its outputs. After using composer-1, it feels painful to switch back to other models.
I work in a larg(ish) enterprise codebase. I spend a lot of time asking it questions about the codebase and then making small incremental changes. So it works very well for my particular workflow.
Other people use CLI and remote agents and that sort of thing and that's not really my workflow so other models might work better for other people.
Does it have some huge context window? Or is it really good at grep?
The Copilot version of this is just fucking terrible at suggesting anything remotely useful about our codebase.
I've had reasonable success just sticking single giant functions into context and asking Sonnet 4.5 targeted questions (is anything in this function modifying X, does this function appear to be doing Y) as a shortcut for reading through the whole thing or scattershot text search.
When I try to give it a whole file I actually hit single-query token limits.
But that's very "opt-in" on my part, and different from how I understand Cursor to work.
It is really good at grep and will make multiple grep calls in parallel.
And when I open it in the parent directory of a bunch of repos in our codebase, it can very quickly trace data flow through a bunch of different services. It will tell me all the files the data goes through.
Its context window is "only" 200k tokens. When it gets near 200k, it compresses the conversation and starts a new one... which mostly works, but sometimes it has a bit of amnesia if you have a really long-running conversation on something.
Composer 1 has been my most used model the past few months, but I only use it to execute plans that I write with the help of larger, more intelligent models like Opus 4.5. Composer 1 is great at following plan instructions, so after some careful time providing the right context and building a plan, it basically never messes up the implementation. It sometimes requires a few small tweaks around the edges, but overall it's a fantastic workflow that's so delightfully fast.
It does not matter what 80-90% of developers do. Code development is heavily tail-skewed: focus on the frontier and on the people who are able to output production-level code at a much higher pace than the rest.
OP isn't saying to do all of your work in the terminal; they're saying they prefer CLI-based LLM interfaces. You can have your IDE running alongside it just fine, and the CLIs can often present the changes as diffs in the IDEs too.
This is how some folks on my team work. I ran into this when I saved a file manually and the editor ran formatting on it. Turns out the dev who wrote it only codes via the CLI, though he reviews the files in an IDE, so he had never manually saved it and triggered the formatter.
I expect the formatter/linter to be run as part of presubmit and/or committing the code so it doesn't matter how it's edited and saved by the developer. It's strange to hear of a specific IDE being mandated to work around that, and making quick edits with tools like vi unsupported.
Part of a healthy codebase is ensuring that anyone can hack on it, regardless of their editor setup. Relying on something in .vscode and just assuming people are using that editor is what leads to this kind of situation.
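An editor-agnostic way to get that is a plain git hook rather than anything in .vscode - a minimal sketch, assuming prettier as the formatter (swap in whatever your project actually uses):

```
#!/bin/sh
# .git/hooks/pre-commit: format staged files so the output is identical
# no matter which editor (or CLI agent) saved them
files=$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(js|ts)$')
[ -z "$files" ] && exit 0
npx prettier --write $files   # rewrite the staged files in place
git add $files                # re-stage the formatted versions
```

The same check can run again in CI as the presubmit gate, so the hook is a convenience rather than the enforcement point.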
> Yeah, hard disagree on that one. Based on recent surveys, 80-90% of developers globally use IDEs over CLIs for their day-to-day work.
I have absolutely no horse in this race, but I turned from a 100% Cursor user at the beginning of the year into one that basically uses agents for 90% of my work, and VS Code for the rest of it. The value proposition that Cursor gave me was not able to compete with what the basic Max subscription from Anthropic gave me, and VS Code is still a superior experience to Claude in the IDE space.
I think, though, that Cursor has all the potential to beat Microsoft at the IDE game if they focus on it. But I would say it's by no means a given that this is the default outcome.
How does company X, dependent on company Y's product, beat company Y when what separates them is essentially just small UI differences? Can Cursor even do anything that VSCode can't right now?
> Can Cursor even do anything that VSCode can't right now?
Right now VSCode can do things that Cursor cannot, but mostly because of the marketplace. If Cursor invests money into the actual IDE part of the product, I can see them eclipsing Microsoft at the game. They definitely have the momentum. But at least some of the folks I follow on Twitter who were die-hard Cursor users have moved back to VSCode for a variety of reasons over the last few months, so not sure.
Microsoft itself, though, is currently kinda mismanaging the entire product range between GitHub, VS Code and Copilot, so I would not be surprised if Cursor manages to capitalize on this.
I use an IDE. It has a command line in it. It also has my keybinds, build flow, editor preferences, and CI integrations. Making something CLI means I can use it from my IDE, and possibly soon with my IDE.
GPT contender. There has been talk on the Cursor forums. I think people have largely slept on coding models and stuck with Anthropic thinking it's the best. Composer fit that niche of extremely fast and smart enough. Sometimes you just want a model that has a near-instant response. The new Gemini preview is overtaking my usage of Composer.
The problem is companies like OpenAI have the upper hand here as they show with the Codex models.
Which is what I was mentioning elsewhere. They build huge models with infinite money and distill them for certain tasks. Cursor doesn't have the funding, nor would it be wise, to try to replicate that.
Why do you think so? Cursor has raised, what, north of $3bn? That's enough money to train or tune a model for coding. With their pricing changes I suspect they are trying to get at least to breakeven as quickly as possible. They have massive incentives, both on the quality of the model for tool-chain use and from a cost perspective, to try and run their own model generation.
I used it extensively for a week and gave it an honest chance. It’s really good for quickly troubleshooting small bugs. It doesn’t come anywhere close to Opus 4.5 though.
As someone who uses Cursor, I don't understand why anyone would use CLI AI coding tools as opposed to tools integrated into the IDE. There's so much more flexibility and integration; I feel like I would be much less productive otherwise. And I say this as someone who is fluent in vim in the shell.
Now, would I prefer to use VS Code with an extension instead? Yes, in a perfect world. But Cursor makes a better, more cohesive overall product through their vertical integration, and I just made the jump (it's easy to migrate) and can't go back.
I agree. I did most of my work in vim/CLI (still often do), but the tight agent integrations in the IDEs are hard to beat. I'm able to see more in Cursor (entire diffs), and it shows me all of the terminal output, whereas Claude Code hides things from you by default, only showing you a few pieces and summaries of what it did. I do prefer to use CC for CLI work though (e.g. the AWS CLI, Kubernetes, etc.). The tab-autocomplete is also excellent.
I also like how Cursor is model-agnostic. I prefer Codex for first drafts (it's more precise and produces less code), Claude when less precision or planning is required, and other, faster models when possible.
Also, one of Cursor's best features is rollback. I know people have some funky ways to do it in CC with git worktrees etc., but it's built into Cursor.
Mobile developer here. I historically am an emacs user so am used to living in a terminal shell. My current setup is a split pane terminal with one half running claude and the other running emacs for light editing and magit. I run one per task, managed by git worktrees, so I have a bunch of these terminals going simultaneously at any given time, with a bunch of fish/tmuxinator automation including custom claude commands. I pop over to Xcode if I need to dig further into something.
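Spelled out, the per-task setup is roughly this (a sketch with made-up paths and session names; I've substituted plain tmux commands for my tmuxinator config):

```
git worktree add ../app-fix-login -b fix-login   # one worktree per task
cd ../app-fix-login
tmux new-session -d -s fix-login                 # one session per task
tmux split-window -h -t fix-login                # split: claude | emacs
tmux send-keys -t fix-login:0.0 'claude' C-m
tmux send-keys -t fix-login:0.1 'emacs -nw' C-m
tmux attach -t fix-login
```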
I’ve tried picking up VSCode several times over the last 6-7 years but it never sticks for me, probably just preference for the tools I’m already used to.
Xcode’s AI integration has not gone well so far. I like being able to choose the best tool for that, rather than a lower common denominator IDE+LLM combination.
Now that I can do a lot with 3-6 AI agents running usefully 2-5 min at a time to crank through my plans, the IDE is mostly just taking up valuable space.
For backend/application code, I find it's instead about focusing on the planning experience, managing multiple agents, and reviewing generated artifacts + PRs. File browsers, source viewers, REPLs, etc. don't matter here (verbose, too zoomed-in, not reflecting agent activity, etc.), or are at best something I'll look at occasionally while the agents do their thing.
It is very easy to open multiple terminals, have them side by side, do different things. It is more natural to invoke agents and let them do their things.
One of the biggest values for Cursor is getting all these different models under a single contract. A contract that very importantly covers the necessary data privacy we want as a business. We can be sure that no matter which model a developer chooses to use, we are covered under the clauses that disallow them from retaining and training on our conversations.
I struggle to understand why engineers enjoy using these CLI coding tools so much. I have tried a few times and I simply cannot get into a good workflow. Cursor, Cline and others feel like the sweet spot for me.
It's really nice that the integrated nature means that, with no extra work on my part, the agent can see exactly what I'm seeing including the active file and linter errors. And all the model interaction is unified. I point them to specific files in the same way, they all have access to the same global rules (including team-global rules), documentation is supplied consistently, and I can seamlessly switch between models in the same conversation.
That has been my experience as well. When I am prompting an agent it is using my open tabs first. When changes are made I get green and red lines and can quickly digest the difference. I don't want it going off building a big feature from start to finish. I want to maybe use an AI to map out a plan but then go through each logical step of the implementation. I can quickly review changes and, at least for me, keep the context of what's happening.
As an older engineer, I prefer CLI experiences to avoid mouse usage. The more I use the mouse, the more I notice repetitive stress injury symptoms
But also, 90% of the time if I'm using an IDE like VSCode, I spend most of my time trying to configure it to behave as much like vim as possible, and so a successful IDE needn't be anything other than vim to me, which already exists on the terminal
What I don't understand is why people would go all in on one IDE/editor and refuse to make plugins for others. Whether you prefer the CLI or the integrated experience, only offering it on VSCode (and a shitty version of it, at that) is just stupid.
Cursor, if I recall, actually started life as a VSCode plugin. But the plugin API didn't allow for the type of integration & experiences they wanted. They hit limits quickly and then decided to make a fork.
Not to mention that VSCode has for years been creating many "experimental" APIs that are never formalized, which become de facto private APIs that only first-party extensions have access to.
Good thing that Copilot is not the dominant tool people use these days, which proves that (in some cases) if your product is good enough, you can still win an unfair competition with Microsoft.
Codeium (now Windsurf) did this, and the plugins all still work with normal Windsurf login. The JetBrains plugin and maybe a few others are even still maintained! They get new models and bugfixes.
(I work at Windsurf, but this isn't really intended to be an ad; I'm just yapping)
Windsurf is at least 10x better than Cursor in my opinion... I'm honestly still puzzled it doesn't seem to get as much buzz on HN! I had to literally cmd+F to find a reference here and this is the only comment ;-;
> As someone who is a huge IDE fan, I vastly prefer the experience from Codex CLI compared to having that built into my IDE, which I customize for my general purposes
Fascinating.
As a person who *loathes VS Code* and prefers terminal text editors, I find Cursor great!
Maybe because I have zero desire to customize/leverage Cursor/VS Code.
Neat. Cursor can do what it wants with it, and I can just lean into that...
Tab complete is still useful and code review/suggesting changes can be better in a GUI than in a terminal. I think there is still a radically better code review experience that is yet to be found, and it's more likely to come from a new player like Cursor/Graphite than one of the giants.
Also Cursor's dataset of actual user actions in coding and review is pure gold.
God, Cursor's tab complete is woeful in basically all of my usage at work. It's so actively wrong that I turned it off. Its agent flows are far, far more useful to me.
I personally use CLI coding agents as well, but many people do prefer tight IDE integration.
I’ve tried every popular agent IDE, but none of them beat Cursor’s UX. Their team thought through many tiny UX details, making the whole experience smooth like butter. I think it’s a huge market differentiator.
Virtually anybody going all in on AI is exposing themselves to being made redundant.
I don't envy startups in the space; there's no moat, be it Cursor or Lovable, or even larger corps adopting AI. What's the point of Adobe when creating illustrations or editing pics will be embedded (kinda is already) in the behemoths' chat UIs?
And please don't tell me that hundreds of founders became millionaires or have great exits or acquihires awaiting them. I'm talking about "build something cool that will last".
Hi all! Graphite cofounder Greg here - happy to help answer questions. To preempt one: I’ve been asked a few times so far why we decided to join.
Personally, I work on Graphite for two reasons. 1) I love working with kind, smart, intense teammates. I want to be surrounded by folks who I look up to and who energize me. 2) I want to build bleeding-edge dev tools that move the whole industry forward. I have so much respect for all y’all across the world, and nothing makes me happier than getting to create better tooling for y’all to engineer with. Graphite is very much the combination of these two passions: human collaboration and dev tools.
Joining Cursor accelerates both these goals. I get to work with the same team I love, a new bunch of wonderful people, and get to keep recruiting as fast as possible. I also get to keep shipping amazing code collaboration tooling to the industry - but now with more resourcing and expertise. We get to be more ambitious with our visions and timelines, and pull the future forward.
I wouldn’t do this if I didn’t think the Cursor team were standup people with high character and kindness. I wouldn’t do this if I thought it meant compromising our vision of building a better generation of code collaboration tooling. I wouldn’t do it if I thought it wouldn’t be insanely fun and exciting. But it seems to be all those things, so we’re plunging forward with excitement and open hearts!
Congrats!! I see this as two great companies joining forces in a crowded space where it is clear the whole is worth more than the sum of their parts. Best of luck on your journey
Makes sense and appreciate the transparency. Have admired what you're building at Graphite and look forward to seeing what you build as part of the Cursor team. Congrats!
If these ai companies had 100x dev output, why would you acquire a company? Why not just show screenshots to your agent and get it to implement everything?
Is it market share? Because I don't know who has a bigger user base than Cursor.
The claims are clearly exaggerated, or, as you say, we'd have AI companies pumping out new AI-focused IDEs left and right with crazy features; yet they are all VS Code forks that roughly do the same shit.
A VSCode fork with AI, like 10 other competitors doing the same, including Microsoft and Copilot; MCPs; VS Code limitations; IDEs catching up. What do these AI VSCode forks have going for them? Why would I use one?
I am validating and testing these for the company and myself. Each has a personality with quirks and deficiencies. Sometimes the magic sauce is the prompting or at times it is the agentic undercurrent that changes the wave of code.
More specific models with faster tools is the better shovel. We are not there yet.
Heyo, disclosure that I work for graphite, and opinions expressed are my own, etc.
Graphite is a really complicated suite of software with many moving pieces and a couple more levels of abstraction than your typical B2B SaaS.
It would be incredibly challenging for any group of people to build a peer-level Graphite replacement any faster than it took Graphite to build Graphite, no matter what AI assistance you have.
It’s always faster and easier to copy than to create (AI or not). There is a lot of thought and effort in doing it first, which the second team (to an extent) can skip.
Much respect for what you have achieved in a short time with Graphite.
A lot of B2B SaaS is about tons of integrations with poorly designed and documented enterprise apps, or security theatre, compliance, fine-grained permissions, a11y, i18n, air-gapped deployments, or useless features to keep the largest customers happy, and so on and on.
Graphite (as yet) does not have any of these problems - GitHub, Slack and Linear are easy as integrations go, and there are limited enterprise features in Graphite.
Enterprise SaaS is hard to do, just for a different type of complexity.
If you've used Graphite as a customer for any reasonable period of time or as part of a bigger enterprise/org and still think our app's particular integration with GH is easy... I think that's more a testament to the work we've done to hide how hard it is :)
Most of the "hard" problems we're solving (which I'm referencing in my original comment) are not visually present in the CLI or web application. It's actually subtle failure-states or unavailability that you would only see if I'm doing my job poorly.
I'm not talking about just our CLI tool or stacking, to clarify. I'm talking about our whole suite, especially the review page and merge queue.
What kind of enterprise SaaS features do you wish you had in Graphite? (We have multiple orgs with 100s-1,000s of engineers using us today!)
I hate the unrealistic AI claims about 100X output as much as anyone, but to be fair Cursor hasn't been pushing these claims. It's mostly me-too players and LinkedIn superstars pushing the crazy claims because they know triggering people is an easy ticket to more engagement.
The claims I've seen out of the Cursor team have been more subtle and backed by actual research, like their analysis of PR count and acceptance rate: https://cursor.com/blog/productivity
So I don't think Cursor would have ever claimed they could duplicate a SaaS company like Graphite with their tools. I can think of a few other companies who would make that claim while their CEO was on their latest podcast tour, though.
My guess is the purchase captures the 'lessons learned' based upon production use and user feedback.
What I do not understand is: if high-level staff with capacity can produce an 80% replacement, why not assign the required staff to complete that last 10% to bring it to production readiness? That last 10% is unnecessary features and excess outside of the requirements.
I'm really used to my Graphite workflow and I can't imagine going without it anymore. An acquisition like this is normally not good news for the product.
Graphite isn’t really about code review IMO, it’s actually incredibly useful even if you just use the GitHub PR UI for the actual review. Graphite, its original product anyway, is about managing stacks of dependent pull requests in a sane way.
Heard on the worry, but I can confirm Graphite isn’t going anywhere. We're doubling down on building the best workflow, now with more resourcing than ever before!
Supermaven said the same thing when they were acquired by Cursor and then EOLed a year later. Honestly, it makes sense to me that Cursor would shut down products it acquires - I just dislike pretending that something else is happening.
We are a 70-person team bringing in significant revenue through our product, with widespread usage at massive companies like Shopify, Robinhood, etc. This is a MUCH MUCH MUCH different story than Supermaven (which I used myself and was sad to see go), which was a tiny team with a super-early product when they got acquired.
Everyone is staying on to keep making the Graphite product great. We're all excited to have these resources behind us!
Obviously what you need to say but the reality is that you’re not in control anymore. That’s what an acquisition is.
If Cursor wants to re-allocate resources, or merge Graphite into the editor, or stagnate development and use it as a marketing/lead-gen channel, it will, for the business.
Anything said at time of acquisition isn’t trustworthy. Not because people are lying at the time (I don’t think you are!) but because these deals give up leverage and control explicitly. If they only wanted tighter integration, they could fund that via equity investment or staffing engineers (+/- paying Graphite to do the same.) Companies acquire for a reason and it isn’t to let the team + product stay independent
There is literally nothing anyone can say to convince me any product or person is safe during an acquisition. Time and time again it's proven to just not be true. Some manager/product owner/VP/c-suite will eventually have the deciding say, and I trust none of them to actually care about the product they're building or the community that uses it.
I mentioned a few months ago that it was a shame where Graphite was headed re: AI (https://news.ycombinator.com/item?id=44955187). This appears to be the final nail in the original product's coffin.
for anyone else looking for a replacement, git spice and jujutsu are both fantastic
I’m working on something in a similar direction and would appreciate feedback from people who’ve built or operated this kind of thing at scale.
The idea is to hook into Bitbucket PR webhooks so that whenever a PR is raised on any repo, Jenkins spins up an isolated job that acts as an automated code reviewer. That job would pull the base branch and the feature branch, compute the diff, and use that as input for an AI-based review step. The prompt would ask the reviewer to behave like a senior engineer or architect, follow common industry review standards, and return structured feedback - explicitly separating must-have issues from nice-to-have improvements.
The output would be generated as markdown and posted back to the PR, either as a comment or some attached artifact, so it’s visible alongside human review. The intent isn’t to replace human reviewers, but to catch obvious issues early and reduce review load.
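For the mechanical part, the Jenkins step can stay quite small. A sketch of what I have in mind (env var names and the LLM step are placeholders; the comment call is Bitbucket Cloud's v2 REST API):

```
#!/bin/sh
# runs inside the Jenkins job, with the repo checked out and PR metadata in env vars
git fetch origin "$BASE_BRANCH" "$FEATURE_BRANCH"
git diff "origin/$BASE_BRANCH...origin/$FEATURE_BRANCH" > pr.diff

# placeholder: turn pr.diff into review.md with whatever model/CLI you choose
run_ai_review pr.diff > review.md

# post the markdown back to the PR as a comment
jq -n --rawfile body review.md '{content: {raw: $body}}' |
  curl -sf -X POST \
    -H "Authorization: Bearer $BITBUCKET_TOKEN" \
    -H "Content-Type: application/json" \
    -d @- \
    "https://api.bitbucket.org/2.0/repositories/$WORKSPACE/$REPO_SLUG/pullrequests/$PR_ID/comments"
```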
What I’m unsure about is whether diff-only context is actually sufficient for meaningful reviews, or if this becomes misleading without deeper repo and architectural awareness. I’m also concerned about failure modes - for example, noisy or overconfident comments, review fatigue, or teams starting to trust automated feedback more than they should.
If you’ve tried something like this with Bitbucket/Jenkins, or think this is fundamentally a bad idea, I’d really like to hear why. I’m especially interested in practical lessons.
> What I’m unsure about is whether diff-only context is actually sufficient for meaningful reviews, or if this becomes misleading without deeper repo and architectural awareness.
The results of a diff-only review won't be very good. The good AI reviewers have ways to index your codebase and use tool searches to add more relevant context to the review prompt. Like some of them have definitely flagged legit bugs in review that were not apparent from the diff alone. And that makes a lot of sense because the best human reviewers tend to have a lot of knowledge about the codebase, like "you should use X helper function in Y file that already solves this".
At $DAYJOB, there's an internal version of this, which I think just uses Claude Code (or similar) under the hood on a checked out copy of the PR.
Then it can run `git diff` to get the diff, like you mentioned, but also query surrounding context, build stuff, run random stuff like `bazel query` to identify dependency chains, etc.
They've put a ton of work into tuning it and it shows, the signal-to-noise ratio is excellent. I can't think of a single time it's left a comment on a PR that wasn't a legitimate issue.
I work at Graphite, our reviewer is embedded into a bigger-scope code review workflow that substitutes for the GH PR Page.
You might want to look at existing products in this space (Cursor's Bugbot, Graphite's Reviewer FKA Diamond, Greptile, Coderabbit etc.). If you sign up for graphite and link a test github repo, you can see what the flow feels like for yourself.
There are many 1000s of engineers who already have an AI reviewer in their workflow. It comments as a bot in the same way dependabot would. I can't share practical lessons, but I can share that I find it to be practically pretty useful in my day-to-day experience.
Cursor has a reviewer product which works quite well indeed, though I've only used it with GitHub. Not sure how they manage context, but it finds issues that the diff causes well outside the diff.
We have coding agents heavily coupled with many aspects of the company's R&D cycle. About 1k devs.
Yes, you definitely need the project's context to have valuable generations. Different teams here have different context and model steering, according to their needs.
For example, specific aspects of the company's architecture are supplied in the context, while much of the rest (architecture, codebases, internal docs, quarterly goals) is available via RAG.
It can become noisy and create more needless review work. Also, only experts in their field find value in the generations. If a junior relies on it blindly, the result is subpar and doesn't work.
I was scared to learn but then a coworker taught me the 4 commands I care about (jj new, jj undo, jj edit, jj log) and now I can't imagine going back to plain git.
Obviously the working tree should be a commit like any other! It just makes sense!
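A typical loop with just those four commands looks something like this (a sketch; the change ID is made up):

```
jj new main        # start a new change on top of main
# ...edit files; the working copy is itself a commit, so there's no staging step
jj log             # inspect the change graph
jj edit qpvuntsm   # jump back to an earlier change to fix it up; descendants rebase automatically
jj undo            # roll back the last jj operation if something goes wrong
```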
Took me a month to learn jujutsu. Was initially a skeptic but pulled through. Git was always easy to me. Its model somehow just clicks in my brain. So when I first switched to jj, it made a lot of easy things hard due to the lack of staging (which is often part of my workflow). But now I see the value & it really does make hard things easy. My commit history is much cleaner for one.
Well, Graphite solves the problem of how to keep your stack of GitHub pull requests in sync while you squash-merge the lowest pull request in the stack, which as far as I know jujutsu does not help with.
jj is actually perfectly fit for this and many other problems. In fact, this is actually the default behavior for jj -- if you squash a bunch of jj commits, the bookmarks on top automatically point to the updated rev tree. Then when syncing the dependent branches to git they all rebase automatically.
The problem however lies in who or what does this rebasing in a multi-tenant environment. You sort of need a system that can do it automatically, or one that gives you control over the process. For example, jj can often get tripped up with branch rules in git since you might accidentally move a bookmark that isn't yours to move, so to speak.
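Concretely, after GitHub squash-merges the bottom PR of a stack, the sync is roughly (a sketch; bookmark names are made up):

```
jj git fetch                  # pull the squash-merged main from GitHub
jj rebase -s feat-b -d main   # replant the rest of the stack onto the new main
jj git push                   # the feat-b/feat-c bookmarks moved with their commits
```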
Correct (Graphite eng here for context) - we've thought about extending our CLI to allow it to sync jj with GH pull requests to do exactly this. Essentially - similar workflow but use `jj` as the frontend instead of `gt`
Love this announcement style. Direct, confident, and not a word longer than it needs to be. Gives major "the work speaks for itself" vibes. OpenAI's comms used to be like this, until it morphed into Apple-like grandiosity that instead comes off as try-hard.
Sure, but if the concern is googling "graphite" and finding results that aren't the Graphite you're looking for, it's the same problem. There will always be more results for graphite, the mineral than graphite, the enterprise-ready monitoring tool.
If that's not the concern, then what's the big deal?
Good news. Been using Cursor heavily for over a year now (on the Ultra plan currently). Hope we get access to this as part of our existing subscriptions.
Does anyone get actually insightful reviews from these code review tools? From most people I've spoken with, it catches things like code complexity, linting, etc., but nothing that actually relates to business logic, because there's no way it could know about the business logic of the product.
I have gotten code reviews from OpenAI's Codex integration that do point out meaningful issues, including across files and using significant context from the rest of the app.
Sometimes they are things I already know but was choosing to ignore for whatever reason. Sometimes it's like "I can see why you think this would be an issue, but actually it's not". But sometimes it's correct and I fix the issue.
I just looked through a couple of PRs to find a concrete example. I found a PR review comment from Codex pointing out a genuine bug where I was not handling a particular code path. I happened to know that no production data would trigger that code path, as we had migrated away from it. It acted as a prompt to remove some dead code.
I built an LLM that has access to documentation before doing code reviews and forces devs to update it with each PR.
Needless to say, most see it as an annoyance not a benefit, me included.
It's not like it's useless but... people tend to hate reviewing LLM output, especially on something like docs that requires proper review (nope, an article and a product are different, an order and a delivery note are as well, and those are the most obvious..).
Code can be subpar or even gross but do the job; docs cannot be subpar, as they compound confusion.
I've even built a glossary to make sure the correct terms are used, and kinda forced it, but an LLM getting 95% right is less useful than getting 0%, as the 5% tends to be more difficult to spot and tends to compound inaccuracies over time.
It's difficult, it really is. There's everything involved, from behaviour to processes to human psychology to LLM instructing and tuning. Those are difficult problems to solve unless your teams have budgets that allow hiring a functional analyst who could double as a technical and business writer, and these figures are both rare and hard to sell to management. And then an LLM is hardly needed.
I wonder about this. Graphite is a fantastic tool that I use every day. Cursor was an interesting IDE a year ago that I don't really see much of a use case for anymore. I know they've tried to add other features to diversify their business, and that's where Graphite fits in for them, but is this the best exit for Graphite? It seems like they could have gotten further on their own, instead of becoming a feature that Cursor bought to try to stay in the game.
IMO this is a smart move. A lot of these next-gen dev tools are genuinely great, but the ecosystem is fragmented and the subscriptions add up quickly. If Cursor acquires a few more, like Warp or Linear, they can become a very compelling all-in-one dev platform.
I've been using git spice (https://abhinav.github.io/git-spice/) for the stacked PRs part of graphite and it's been working pretty well and it's open source and free.
GitHub have proven the ability to execute very well when they _want_ to. Their product people are top notch.
Given that the VP of GitHub recently posted a screenshot of their new stacked diff concept on X, I'd be amazed if the Graphite folks (whose product is adding this function) didn't get wind of it and look for a quick sell.
This was "announced" in October, and last week they were saying they're shipping to trusted partners to kick the tires before a real release, with posted screenshots.
So, we'll see what it ends up like, but they have apparently already executed.
Woahhhhh, I missed this. Got a reference or link? My Googling is failing me. That's my biggest complaint about GitHub, coming from Gerrit for OpenStack.
There are two Graphite companies. The time series DB for metrics (not this) and the stacked diff code review platform (this). Looking at other comments under the post, they seem to have executed a hard AI pivot recently.
Are there thoughts on getting to something more like a "single window dev workflow"? The code editing and reviewing experiences are very disjoint, generally speaking.
My other question is whether stacked PRs are the endpoint of presenting changes or a waypoint to a bigger vision? I can't get past the idea that presenting changes as diffs in filesystem order is suboptimal, rather than as stories of what changed and why. Almost like literate programming.
Stacked PRs are a really natural fit for vibe coding workflows, it helps turn illegible 10k+ line PRs into manageable chunks that you can review independently. (Not affiliated with Cursor or Graphite)
This is annoying, Graphite's core feature of stacked PRs is really good despite all the AI things they've added around their review UI. I doubt we'll want to keep relying on that for very long now.
You can still think of AI as one facet of Graphite's product that you can use or not depending on your work style. Stacked PRs are still a core piece and not going anywhere :)
Never heard of Graphite before today. Were they built specifically for AI code reviews, or is it a pivot / new feature from a company that started with something else?
No, they've been doing "managing stacks of dependent pull requests" for a lot longer than AI code review. I've mostly been a happy user; they simplify a lot of the git pain of continually rebasing, and the UI makes stacks much easier to work with than GitHub's own interface.
They started as a better PR review tool, with the main feature that you can stack PRs that have dependencies on each other. It solves the problem of having PRs merging into other PR branches, or having notes not to merge something until another PR merges. Recently they became an AI code review tool, and just added a bunch of AI tools to the review UI, but you could just ignore it and the core functionality was still great.
How does that work? Multiple agents grepping simultaneously?
Bake that into the workflow some other way.
Composer is extremely dumb compared to Sonnet, let alone Opus. I see no reason to use it. Yes, it's cheaper, but your time is not free.
What are we talking about? Autocomplete or GPT/Claude contender or...? What makes it so great?
I use VS Code, open a terminal with VS Code, run `claude` and keep the git diff UI open on the left sidebar, terminal at the bottom.
A simple text interface, access to endless tools readily available with a (usually) intuitive syntax, man pages, ...
As a dev in front of it, it's super easy to understand what it's trying to do, and it's as simple as it gets.
Never felt the same in Cursor; it's a lot of new abstractions that don't feel remotely as compounding.
I can't randomly throw credits into a pit and say "oh, $2000 spent this month, whatever". For larger businesses I suspect it is even worse.
If they had a $200 subscription with properly unlimited usage (within some limits, obviously), I would have jumped up and down though.
Relatively heavy Cursor usage in my experience is around $100/month. You can set a limit on on-demand billing.
Also, their own Composer model is not bad at all.
Somebody screenshot this please. We are looking at comedy gold in the next 3 years and there’s no shortage of material.
Also, Graphite isn't just "screenshots"; it's a pretty complicated product.
I usually prefer Gemini, but sometimes other tools catch bugs Gemini doesn't.
As someone who has never heard of Graphite, can anyone share their experience comparing it to any of the tools above?
> "Will the plugin remain up? Yes!"
> https://supermaven.com/blog/sunsetting-supermaven
sweet summer child.
> After bringing features of Supermaven to Cursor Tab, we now recommend any existing VS Code users to migrate to Cursor.
Supermaven was acquired by Cursor and sunset after 1 year.
This is something GitHub should be investing time in, it’s so frustrating.
https://www.merriam-webster.com/dictionary/graphite
Looks bad: https://forum.cursor.com/t/font-on-the-website-looks-weird/1...
Huge fans of their work @ GitStart!
Then Cursor takes on GitHub for the control of the repo.
- Hunter @ Ellipsis