Atlassian Enables Default Data Collection to Train AI

(letsdatascience.com)

268 points | by kevcampb 4 hours ago

28 comments

  • martinald 3 hours ago
    Atlassian just goes from misstep to misstep. I still use their products quite often. The amount of P0 bugs I experience is absolutely crazy:

    - Bitbucket workers are hopelessly out of date (self hosted). We've had to put so many random workarounds in especially for Docker, as they don't keep them up to date enough

    - I have had a bug in JIRA for years where I can't reorder a new ticket unless I refresh the page

    - Every new feature they introduce into JIRA/Bitbucket over the past couple of years just doesn't work.

    - I tried their AI stuff on the free trial, didn't work at all, tried to cancel, can't cancel the free trial online and had to write a load of support tickets (of which the support ticket contact form bugged out multiple times).

    Anyone have any insight into why things have got so so dysfunctional? Tech debt? Talent leaving? Both? Even 'bad' enterprise software tends to be able to keep the most basic features running, but Atlassian is a whole new category. If you check their 'community' it is just hundreds/thousands of bugs with workarounds.

    • rurp 1 hour ago
      > I tried their AI stuff on the free trial, didn't work at all, tried to cancel, can't cancel the free trial online and had to write a load of support tickets (of which the support ticket contact form bugged out multiple times).

      Absolutely insane that this is legal. The only reason to do this is to trick and abuse customers. It would be trivially easy to legislate away if our government cared to.

      Atlassian seems like a typical entrenched big company, albeit an extreme example. They make money by selling to the bosses of their users and being the default name brand for many cases. Once a company gets to a certain size and doesn't directly compete much on quality, internal corruption and incompetence can run rampant.

      • HoldOnAMinute 14 minutes ago
        >> internal corruption and incompetence can run rampant

        This affliction happens to almost every company, eventually. Nobody seems to have solved this.

      • colechristensen 1 hour ago
        It's explicitly not legal in California and some other places.
        • pintxo 1 hour ago
          Also for business customers? I would expect such regulations to only apply to b2c contexts.
    • mhitza 2 hours ago
      Featuritis. Just keep pumping out features with no thought. Today, probably also AI-coded.

      Even in mid-sized projects, if you keep pushing only for new features you'll get a similar system. At least that's my experience from 3 or so mid-sized projects I've worked on, where nothing mattered other than checking off features from a huge backlog.

    • wsatb 2 hours ago
      The search function in Jira has always been unusable. It’s perhaps the worst part of the entire platform, but nice to see they’re still focused on adding features I will never use.
      • saganus 2 hours ago
        I've always thought I was the only one experiencing this and felt like I was crazy.

        I guess it's "good" to know that I'm not alone.

        The number of times I've searched for a ticket that I know is there (because I either have it open in a different tab or just created it) but can't find is just way too many.

        • wsatb 1 hour ago
          The results usually seem completely random to me. It's like the feature never made it out of proof of concept territory. The only advantage of all the email noise Jira sends out is that I can usually search my email for what I'm looking for.
          • jsk2600 5 minutes ago
            I used JIRA back in 2009, and that is exactly what we did to work around the shitty search function in JIRA.
      • pydry 24 minutes ago
        ironically it's the one place where an agent might be of some use and they created one and it's terrible.
        • siva7 12 minutes ago
          at least they didn't break their pattern of disappointing users. consistency is key.
    • ravenstine 1 hour ago
      Jira is buggy as hell these days. Lots of desyncing that forces me to refresh the page. I can have a ticket open on a sprint board and the modal spontaneously closes after a while, forcing me to reopen it frequently. The other week there were tickets that simply refused to show up in their respective sprint board no matter what I did; later the epic magically appeared on the board out of nowhere, then finally the individual tickets themselves reappeared.

      Gotta love the value that vibe coding has added to this world.

      • myself248 23 minutes ago
        I'm sure Atlassian's shareholders appreciate your sacrifice.
    • ezoe 2 hours ago
      Umm? Is there a single thing Atlassian has done right? It's a cancer on software development that the suits force us to swallow, while the real development and useful documents live outside their services because they're so stressful to use.
  • kevcampb 4 hours ago
    I really wish I could find a better source to link to for this. By default, all free and paid customers are being opted-in to their data being used for AI training.

    All your Confluence pages, Jira tickets, etc.

    https://support.atlassian.com/security-and-access-policies/d... describes how to disable this, but it also appears that the setting to disable this doesn't exist (it's not visible on any of our instances).

    • pryanbeng 1 hour ago
      They said the opt out features will be rolled out to the Admin portal in May.

      I got this info from an email they sent out

      >To give you control over this change, we're introducing new in‑app settings that allow you to manage in‑app data contribution. Initially, these settings will apply to data in Jira, Confluence, and Jira Service Management, including data in your Atlassian Platform apps (Rovo, Home, Teams, Projects, Assets, Goals, Analytics, and Administration). We'll notify you when settings become available for additional apps you own, so you can review them in Atlassian Administration. Between today and May 19, 2026, we'll gradually roll out these settings in Atlassian Administration. We'll send you another email on May 19th as a reminder, so you have time to review and make any adjustments before August 17, 2026.

      • toyg 11 minutes ago
        How is it acceptable to say "hey, we're stealing all your data right now, but in a few weeks you can tell us not to steal it please" ...?

        I don't pay for Atlassian tools myself, but if I were a CxO I'd be preparing a lawsuit.

    • carld 1 hour ago
      I also do not see the setting to opt out. I'm at Atlassian Administration > Security, and I do not see Data contribution. I've looked at multiple other settings pages and I do not see it there either.

      So, is this an automatic opt-in without the ability to opt-out?

      • somewhatgoated 22 minutes ago
        Opt out features will be introduced at a later time
    • m4rtink 1 hour ago
      What about really sensitive stuff, like private tickets containing customer data, embargoed CVE fixes, or even sensitive health-related data? Are they just going to cobble all of that into a model so it can leak out to random people?
    • kepano 2 hours ago
      This seems to be the official description of the changes:

      https://www.atlassian.com/trust/ai/data-contribution/faqs

    • itomato 37 minutes ago
      Opt-out at the Org level.

      To get value out of Rovo, it needs detail. Your over-subscribed Jira power user/admin can't effectively make it happen. No guarantees Atlassian (Rovo itself) can make it happen either, but the patterns are going to develop and evolve closer and closer to the Agents that make the features.

      They have a peculiar definition of Metadata, however. It's a proprietary data product derived from user content. It's a bit shit the way they sell it as metadata. It's a derivation. It's a product of Content, so it's Content - privacy safeguards cannot begin to cover the variation.

      "Metadata includes two data types referred to as content attributes and common patterns.

      Content attributes are statistical characteristics, numeric fields, and derivatives of your in-app data. Examples of content attributes may include the number of story points assigned to a Jira work item or the complexity of a Confluence page. Common patterns are phrases, keywords, and topics we extract from search queries and results, Rovo Chat (conversations, prompts, and responses), and custom configuration data that are frequently seen across many customers, while omitting rare data that may be unique to your organization. Examples of common patterns may include common words, phrases, or Rovo Chat prompt topics that are frequently used by customers, such as “vacation policy” or “recap team activity.”"

    • bradleyankrom 3 hours ago
      • kevcampb 2 hours ago
        Unfortunately that one has a subheading of "From August 17, the outfit will collect customer metadata by default unless you pay for the top tier"

        It's not just metadata, it's all "in-app data"

    • tgv 1 hour ago
    • MagicMoonlight 30 minutes ago
      That's insane. Every single one of those things is highly sensitive and confidential information. How could you ever trust them after this? That information is priceless for shorting your company on the stock market.

      Not that they'd ever do that of course. Nobody with highly sensitive information about rival companies would ever do that.

    • Nathanba 2 hours ago
      • kevcampb 2 hours ago
        "Your available data contribution settings will be available no later than May 19, 2026."

        So let me guess, they're hoping that we forget about this by then, so that they can scoop up our data? I can't think of any other reason for it.

  • Bnjoroge 1 hour ago
    Plenty of other companies enable this by default too, such as GitHub, Figma, Adobe, Vercel. I think it's fair to assume that if you have data stored within any company, they'll by default use it for training.
    • tombert 56 minutes ago
      Maybe this will become The Year of the Self Hosted.

      For stuff that I don't particularly care about privacy I've kept on the cloud (e.g. my blog, which is public anyway and as such is probably training bots regardless), but for stuff that I don't want to be used to train their models and/or sell to advertisers I have moved to be self hosted on my own network.

      • pydry 23 minutes ago
        self hosting needs to be easier to set up for that to happen.

        we're not far off it being good enough but it's not there yet.

        • jsk2600 18 minutes ago
          Atlassian made self-hosting 'less easier' on purpose. They even discontinued their on-prem products.
  • wingmanjd 15 minutes ago
    I made this a while back to move us off our on-prem Atlassian to Gitlab [1]. Maybe it'll help someone if they want something similar. Fair warning: I haven't tried this recently, so YMMV.

    [1] https://gitlab.com/jeremygonyea/jira-to-gitlab-migration-too...

  • huwsername 2 hours ago
    If the rumours of an Anthropic acquisition are true, this makes a lot of sense. Anthropic are probably looking for a clean, high-signal dataset of metadata around business tasks that they can buy.
    • m4rtink 1 hour ago
      I'm thinking it would be ideal if Broadcom buys Atlassian instead and pulls another VMware. Problem solved - for ever. ;-)
      • siva7 8 minutes ago
        Oh what the.. i can't pay for a 2000$ max sub :/
    • ezoe 2 hours ago
      I doubt the data in Atlassian is anywhere close to clean or organic. It was designed in hell to force shit down the throats of real programmers, who do their real work outside of Atlassian.
      • jerjerjer 2 hours ago
        Programmer adjacent data can already be consumed from git repos. Atlassian has PM data.
  • dreknows 2 hours ago
    The opt-out-by-default pattern has been gradually normalizing in enterprise SaaS, but what makes this particularly egregious is the combination of two things: the data scope (not just metadata, but all in-app content per kevcampb's link) and the broken opt-out (the disabling setting not rendering on any instance).

    One is a policy decision you can argue about. Both together suggest the friction is intentional.

    The data residency point is worth flagging separately - a lot of enterprise buyers treat region-pinning as a privacy guarantee for everything in their contract. It was never that. Residency tells you where data is stored at rest, not who can access it for what purpose.

    • tgv 1 hour ago
      What makes this extra scummy is this:

      “If customers were to right now terminate their contract, the new data contribution settings will not apply to them as these will not be enforced until August 17, 2026,” (from https://www.theregister.com/2026/04/18/atlassians_new_data_c...)

      So you can't even take a bit of time to consider your options.

  • maxloh 19 minutes ago
    The adage was "If you're not paying for the product, you are the product." Now enterprises are paying to become the product. That's ridiculous.
  • firesteelrain 1 hour ago
    No wonder they wanted to stop supporting the Data Center versions for on prem.
  • qsera 1 hour ago
    I am wondering, why not just rsyncrypto the source code before pushing to the repo?

    >rsyncrypto is a utility that encrypts a file (or a directory structure) in a way that ensures that local changes to the plain text file will result in local changes to the cipher text file. This, in turn, ensures that doing rsync to synchronize the encrypted files to another machine will have only a small impact on rsync's wire efficiency.

    https://manpages.ubuntu.com/manpages/focal/man1/rsyncrypto.1...

  • yalok 30 minutes ago
    Does this include repo content in Bitbucket?
  • jerhewet 1 hour ago
    Will Atlassian be harvesting code and content from private Bitbucket repositories? The wording in their policies and FAQs is vague, so I'd like to get a definitive (Yes / No) answer.
  • reeseparker63 2 hours ago
    Worth noting that Atlassian's data residency options don't exempt you from this—your data can still be used for training even if you've pinned it to a specific region.
  • microflash 1 hour ago
    I read this as a "Stop using this product" toggle every time a company does this without consent. It has done me a good amount of good, mentally and financially.
  • willis936 1 hour ago
    Presumably the government and HIPAA carveouts are for legal obligations. Trade secret theft is illegal so I wonder why they're not considering this.
  • kepano 2 hours ago
    The official Atlassian FAQ on this change:

    https://www.atlassian.com/trust/ai/data-contribution/faqs

  • titzer 55 minutes ago
    AI contributing to rising natural stupidity.
  • jason_s 50 minutes ago
    I'm really tired of JIRA, to the point where I have expressed it publicly: https://www.embeddedrelated.com/showarticle/1772.php
  • rvz 37 minutes ago
    No surprise here. It's by design.
  • rsynnott 1 hour ago
    Imagine an AI based on jira tickets. _That's_ the torment nexus.
  • shadowgovt 42 minutes ago
    The only silver lining I can see in this is that if they replace their existing tooling with AI integration, we might actually get search and Confluence that work.

    I've lost count of how many times I search for a keyword and get no relevant results, but the document I'm looking for, which contains the keyword, is in my automatic pop-up of recent documents visited.

  • arjunthazhath 26 minutes ago
    Omg
  • pkilgore 1 hour ago
    Does this apply to Loom?
    • itomato 47 minutes ago
      Loom isn't mentioned in the Partner materials I have read. That's about all I can say.
  • oliver236 2 hours ago
    genius move.
  • an0malous 1 hour ago
    We need to kill SaaS. Apps should be local-first and have peer-to-peer data sync. These companies won't stop until they use your data to replace you and enrich their owners.
    • rogerthis 1 hour ago
      Beautiful on paper. But it does not scale outside a certain type of tech people.
      • an0malous 22 minutes ago
        What’s the scaling bottleneck? If you made a local-first, P2P version of Figma what would break first? For a company of like 50 people, I doubt you’d have more than 100GB of data so it should fit on everyone’s computers. The P2P syncing part seems solvable, even if you need a centralized handshake server somewhere. And from the user perspective I don’t see why the UX couldn’t be identical, so it’s all the same to them.

        It seems like the real bottleneck is something else.

  • tqwhite 2 hours ago
    I don't see it as a misstep at all. The purpose of Stack Overflow is to share expertise.

    I am 100% supportive of it being used for training... AI, you, everyone.