OSS Rebuild: open-source, rebuilt to last

(security.googleblog.com)

178 points | by tasn 2 days ago

9 comments

  • lrvick 2 days ago
    IMO you need an immutable, appliance-like OS that is deterministic and full-source bootstrapped to do reproductions with minimized trusting-trust attack risk.

    We built ReprOS to solve this problem: https://codeberg.org/stagex/repros

    "Git push" to it and it will do a build in a throw-away VM then have the host sign the artifact results and push signatures to the same or a different repo.

    • conradev 2 days ago
      Or Debian or NixOS? Both of which are pretty reproducible: https://reproducible-builds.org/success-stories/

      You just need a read-only system partition, like macOS or NixOS or Silverblue.

      • lrvick 1 day ago
        macOS is closed-source, centralized trust, so it is not even trying to have accountability. If you add in Homebrew you -also- add trust in hundreds of unvetted internet randos pushing and merging unsigned commits without strict review requirements.

        Silverblue and NixOS are not fully reproducible, and they rely on blind maintainer trust to merge anything they want, then shift to a centralized build/signing trust model for all packages (single points of failure).

        Debian has inflated reproducibility stats, as it allows bootstrapping packages from binary blobs, exposing it to trusting-trust attacks, and overall it relies on a distributed trust model where different maintainers maintain and sign different packages (many points of failure).

        Stagex, which is what ReprOS is built with, relies on a decentralized trust model: 100% full-source-bootstrapped, deterministic, multi-party-signed commits and multi-party-signed builds (no single computer or human is trusted).

        The former options are highly complex, desktop-focused operating systems that play fast and loose with supply chain security in order to maximize contributions and package variety.

        Stagex, by contrast, is -not- a desktop distribution and cranks supply chain security all the way up, focusing primarily on the few packages needed for mission-critical build, administrative, and server use cases.

        ReprOS, by extension, can be built from hex0 all the way up through the compilers to a final .iso that is bit-for-bit identical every time, so admins can be confident they are running unmodified upstream source code without any chance of tampering by a compromised maintainer.

        • conradev 23 hours ago
          That is very cool and makes sense! I can see myself using this for sure. Put that in the README! :P
    • chubot 2 days ago
      Hm that looks pretty nice and useful

      But I'd also say another way to do it is to build / cross-compile on two totally different machines, say Linux and OS X, or Linux and FreeBSD, or even a modern Debian and some Linux VM from 2005.

      If the results are exactly the same, then I think it can be trusted

      I guess that's like Diverse Double-Compiling, but extended to the whole machine:

      https://dwheeler.com/trusting-trust/
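
      Concretely, something like this (assuming the project has a deterministic `make dist` target, which is the hard part):

        # on machine A (e.g. a modern Debian)
        make dist && sha256sum dist/myproject.tar.gz

        # on machine B (e.g. FreeBSD, or an old Linux VM)
        make dist && sha256sum dist/myproject.tar.gz

        # if both digests match, a trusting-trust style compromise would have to
        # exist identically in both toolchains to go undetected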

      • lrvick 1 day ago
        That is exactly what we do in StageX on which ReprOS is based.

        Given all builds start from hex0 you can build from an Arch Linux host and I can build from a Debian host and we are still guaranteed identical results every time.

        Every altered package is reproduced bit-for-bit identically by multiple different maintainers and signed every release.

        Our entire distro and release process is built around addressing trusting-trust attacks, which is why we must be 100% deterministic or we cannot ship a release.

        • chubot 1 day ago
          Ah OK that's interesting ... so I guess ReprOS saves other projects the effort of doing that themselves!

          I am honestly not sure how much effort it would be; I was thinking about doing it / verifying it for https://oils.pub/ , which has a large build process

          But it may save time to use ReprOS. I'll look at it more - it sounds interesting!

    • tomcam 2 days ago
      Love this project; thanks for letting us know about it. I have been voted "Least likely to succeed in Web Hosting Security" by HN for 13 years in a row, so apologies if this is irrelevant. But being able to know precisely what software you're running would be a great way to run a web server, no? Or is it not efficient enough running in a container or what?
      • lrvick 2 days ago
        That is why we made StageX, which allows you to generate bootable web server images or containers that are bit-for-bit identical every time, so prod is predictable and accountable.

        https://stagex.tools

      • andrecarini 2 days ago
        > I have been voted "Least likely to succeed in Web Hosting Security" by HN for 13 years in a row,

        Curious, what is this about? Care to share the context?

        • lrvick 1 day ago
          That was clearly just a joke meant to indicate they are not a security professional.
    • 0cf8612b2e1e 2 days ago
      This is a great idea. Less boil the ocean than getting everything into Nix. Are any notable projects using this?
      • lrvick 2 days ago
        Not yet, although Turnkey and Stagex are in the process of integrating it for automated reproductions as part of their CI for everything.

        It will no doubt mature a fair bit more through this process

  • simonw 2 days ago
    I'm very excited about this project, but it could really do with a web UI of some sort! Having to build a Go CLI tool in order to access it is a massive amount of friction.

    I reverse-engineered it a tiny bit, looks like you can get a list of all builds so far like this:

      gsutil ls -r 'gs://google-rebuild-attestations/**'
    
    I ran that and got back 9,507 results - here's that list as a Gist: https://gist.github.com/simonw/9287de5900d5b76969e331d9b4ad9...
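
    You can dump an individual attestation the same way if you want to poke at the contents (swap in any object path from that listing; I haven't dug into the format itself):

      gsutil cat 'gs://google-rebuild-attestations/<path-from-the-listing>'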
    • msuozzo 2 days ago
      Author here!

      > I'm very excited about this project

      Thank you!

      > but it could really do with a web UI of some sort!

      Couldn't agree more! The terminal UI exists (`./tools/ctl tui`) but is oriented towards developers on the project or to those who wish to run their own instance. Making the system and data more accessible to general users is a big priority.

      • simonw 2 days ago
        I got a basic web UI working here: https://storage.googleapis.com/rebuild-ui/index.html

        It's using that fixed list from the Gist though; I haven't set it up to update and reflect newly listed projects.

        • BrentBrewington 2 days ago
          nice! bit of UI feedback: when I type "pypi/requests" into the search bar, I expected to see versions sorted descending, so newer ones show up higher
  • WhatIsDukkha 2 days ago
    So this seems like a bit of a half measure in the sense that it doesn't provide client-side builds?

    With Guix I can reproduce the upstream binaries bit-for-bit on my own machine.

    It seems flawed to assume that Google's servers are uncompromised; it's vastly better to have a distributed ability to reproduce.

    https://guix.gnu.org/manual/en/html_node/Invoking-guix-chall...
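
    For reference, the check is a one-liner (openssl here is just an example package):

      # compare the locally built store item against what the substitute servers serve
      guix challenge openssl --substitute-urls="https://ci.guix.gnu.org"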

    • akdor1154 2 days ago
      It does provide client-side builds; see the shell snippets at the bottom.
  • Flux159 2 days ago
    So this seems to be a build definition and some form of attestation system? Does this require that builds are done via CI systems instead of on ad hoc developer machines?

    I find that for many npm packages I don't know how the published builds actually made it to the registry, and for some projects that I rebuilt myself in Docker I got vastly different sizes of distribution artifacts.

    Also, it seems like this is targeting pypi, npm, and crates at first - what about packages in linux distro repositories (debian, etc.)?

    • msuozzo 2 days ago
      Author here!

      > Does this require builds are done via CI

      Nope! One use for OSS Rebuild would be providing maintainers that have idiosyncratic release processes with an option for providing strong build integrity assurances to their downstream users. This wouldn't force them into any particular workflow, just require their process be reproducible in a container.
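
      As a rough illustration (not our actual build definition format), anything that runs deterministically inside a pinned container image would qualify, e.g. for an npm package:

        # pin the build environment by digest so a rebuild sees identical inputs
        docker run --rm -v "$PWD":/src -w /src \
          node:20-alpine@sha256:<digest> \
          sh -c 'npm ci && npm pack'   # assumes a committed lockfile
        sha256sum *.tgz                # should print the same digest on every rebuild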

      > for some projects that I rebuilt myself in docker, I got vastly different sizes of distribution artifacts.

      Absolutely. OSS Rebuild can serve to identify cases where there may be discrepancies (e.g. accidentally included test or development files) and publicize that information so end-users can confidently understand, reproduce, and customize their dependencies.

      > what about packages in linux distro repositories (debian, etc.)

      OSS Rebuild actually does have experimental support for Debian rebuilds, not to mention work towards JVM and Ruby support, although no attestations have been published yet. There is also no practical impediment to supporting additional ecosystems. The existing support is more reflective of the size of the current team than of the scope of the project.

    • candiddevmike 2 days ago
      The industry has been coalescing around third-party attestation for open source packages since COVID. The repercussions of this will be interesting to watch, but I don't see any benefits (monetary or otherwise) for the poor maintainers dealing with them.

      There are probably a lot of people who see GenAI as the solution to Not Invented Here: just have it rewrite your third-party dependencies! What could go wrong? There will also be some irony in this situation, with third-party dependencies being more audited/reviewed than the internal code they get integrated into.

      • Y_Y 2 days ago
        I don't mind if the "third parties" are other trusted developers of the same project, for example. But please let's not centralise it. We're just going to get Robespierre again.
      • jpalomaki 2 days ago
        Twofold: AI makes it easier to find "issues" in existing software and automate the CVE process. This means more "critical" vulnerabilities that require attention from developers using these packages.

        At the same time, rolling your own implementation with GenAI will be quick and easy. No outsiders are checking that code, so no CVEs. Just sit back and relax.

    • kpcyrd 1 day ago
      For packages in Linux distributions you probably want one of:

      - https://reproducible.archlinux.org/ (since 2020)

      - https://reproduce.debian.net/ (since 2024)

      There's arch-repro-status and debian-repro-status, respectively, to show the status of the packages you have installed, but since it's not yet possible to make an open-source computer out of reproducible-only software, there isn't really any tooling that enforces this through policy.
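
      Both are a one-liner to try (assuming the package names match the tool names):

        # Arch
        sudo pacman -S arch-repro-status && arch-repro-status

        # Debian
        sudo apt install debian-repro-status && debian-repro-status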

  • Weethet 2 days ago
    nixpkgs already has 107,158 packaged libraries/executables. Nix has infrastructure to support arbitrary build systems and can create Docker images. I fail to see any advantage in creating a narrower version of it that has fewer uses and has to start from scratch.
    • msuozzo 2 days ago
      Author here!

      Both nix and guix are exciting projects with a lot of enviable security properties. Many here can attest that using them feels like, and perhaps is, the future. I see OSS Rebuild as serving more immediate needs.

      By rebuilding packages from the registries people already use, we can bring some of those security properties to users without them needing to change the way they get their software.

      • kam 2 days ago
        Nixpkgs pulls source code from places like pypi and crates.io, so verifying the integrity of those packages does help the Nix ecosystem along with everyone else.
      • Y_Y 2 days ago
        Why not help them bring their packages to users, rather than borrowing from and circumventing the existing effort?
    • mbonnet 8 hours ago
      Advantages: potentially useful/extant/readable documentation.
    • hollerith 2 days ago
      The Nix community has a poor record on security and supply-chain integrity in particular [1] whereas Google has a great record on security, and this announcement (of OSS Rebuild) was written by a member of the "Google Open Source Security Team".

      [1]: "it means effectively a decision was made for NixOS to be a hobby distro not suitable for any targeted applications or individuals. It really sucks, because I love everything else about nix design. Instead I am forced to bootstrap high security applications using arch and debian toolchains which are worse than nix in every way but supply chain integrity given that all authors directly sign package sources with their personal well verified keys."

      https://news.ycombinator.com/item?id=36268776

      • lrvick 2 days ago
        Since writing the post you link, I finally threw my hands up and made a new distro with some security engineer peers that prioritizes supply chain security and mandates 100% full source bootstrapping and determinism: https://stagex.tools

        It does not even try to be a workstation distro so we can get away with a small number of packages, focusing on building software with high accountability.

        Thankfully OCI build tooling is mature enough now that we can build using standards and no longer need a custom build framework and custom languages like Nix/Guix do.

      • arianvanp 2 days ago
        They could've contributed SLSA attestation support to nix instead. There are a few people working on bringing SLSA build provenance to nix(pkgs), including me, but time and resources are limited, unfortunately. Would love to see Google contribute to nix in this space :)

        E.g.:

        https://talks.nixcon.org/nixcon-2024/talk/AS373H/

        https://GitHub.com/arianvp/nix-attest

        • msuozzo 2 days ago
          Author here!

          > could've contributed SLSA attestations support to nix

          That sounds like a great idea! However one important consideration is that while an artifact on nixpkgs may aim to replicate the function of the upstream package, it must adhere to and interoperate with the rest of the distribution. Ultimately, its 'ecosystem' is nix. Work that goes into writing and maintaining the nix build does not generally filter back upstream to impact the build integrity of, say, its associated PyPI package. So if users continue to consume from PyPI, improving nix won't serve them.

          This is not to say that the long-term source of truth for packaging will remain the language registries. Just that today's reality demands we meet users where they are.

          > Would love to see Google contribute to nix in this space :)

          Same :)

          • ramses0 2 days ago
            Are all your comments being run through an "AI appropriateness and enthusiasm" filter?
            • hollerith 2 days ago
              I think he's just young and not-yet-disillusioned.
      • Weethet 2 days ago
        This is an issue with nixpkgs, not nix. Google could've just bootstrapped their own nixpkgs from scratch if they wanted to; see Guix (not a perfect example, but still). Creating a whole new tool is still completely unnecessary.
        • carlhjerpe 2 days ago
          One could argue that recreating Nix from scratch would be beneficial at some point. There's a lot of legacy hardcoded weirdness: Nix doesn't set up its build containers with standard, state-of-the-art tools, the language is evaluated in a single thread, and using values from derivations means a build blocks evaluation, so it doesn't properly parallelise (nixpkgs bans "IFD", but it is useful for meta-packaging).

          Nixpkgs is more valuable than Nix at this point, but also quite vulnerable. In practice it has worked out reasonably well so far, I don't know of anyone who got "owned" because of Nix.

          • nothrabannosir 1 day ago
            > using values from derivations means a build blocks evaluation so it doesn't properly parallelise (nixpkgs bans "IFD" but it is useful for meta packaging).

            Not anymore with the introduction of dynamic derivations (experimental)

      • woile 2 days ago
        Well, Google is using nix and nixpkgs...

        https://firebase.google.com/docs/studio#how-does-it-work

        • lrvick 2 days ago
          Encouraging the use of Nix in production is wildly irresponsible. I am really surprised to see Google do this given their generally high security bar. Maybe this team operates in a bubble and gets to prioritize developer experience above all else.
          • georgyo 2 days ago
            Nix in production is more common than you think, even at scale.

            It's hard to know what exactly your security concerns are here, but if you look at the current ecosystem of using containers and package registries, Nix is pretty clearly a solid contender, security-wise.

            • lrvick 1 day ago
              Plenty of wildly unsafe behavior is common in production infrastructure today. This is also why compromised corporate infrastructure is in the news so often. Few orgs hire or even contract security engineers with Linux supply chain and hardening experience, opting to blindly trust the popular options and their maintainers.

              NixOS knowingly discards vital supply chain integrity controls to minimize developer friction and maximize package contributions. It is a highly complex, Wikipedia-style distribution optimizing for maximum package variety, which is absolutely fine and great for hobby use cases, but use in security-critical applications is absolutely irresponsible.

              Guix goes some big steps further in supply chain integrity but still ultimately trusts individual maintainers.

              See this chart to understand how NixOS compares in terms of threat model: https://codeberg.org/stagex/stagex#comparison

      • ChocolateGod 2 days ago
        Nix/NixOS configurations often break due to nixpkgs maintainers not caring about keeping support for existing configuration formats. I experience a breakage roughly every 2 weeks when a variable/package gets renamed or changed.
        • nothrabannosir 1 day ago
          I stopped tracking unstable for this reason. And hey—fair enough I guess. The name should have been a sign.
        • 0x457 2 days ago
          At least it breaks during evaluation most of the time, and rolling back is super easy if it broke after activation.

          Also, it seems like you're using unstable if you're having that many breakages, sooo kinda asking for it?

    • kpcyrd 1 day ago
      oss-rebuild is for independent verification, which cache.nixos.org doesn't have yet. I'm still waiting for https://github.com/nix-community/trustix to become a thing.

      Until then they are still behind Debian and Arch Linux, which do in fact implement this with rebuilderd and debrebuild/archlinux-repro.
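
      E.g. on Arch, rebuilding a single installed package looks roughly like this:

        sudo pacman -S archlinux-repro
        # rebuild the exact package file in its original build environment and compare
        repro /var/cache/pacman/pkg/coreutils-*.pkg.tar.zst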

    • pjmlp 1 day ago
      Not everyone uses Nix, and there are other operating systems used in the world.
    • nicce 2 days ago
      A corporation the size of Google must be in control of this themselves.
  • riffraff 1 day ago
    I'm curious how the system detects "unusual build patterns".

    I.e. how would the xz backdoor be identified? Does the system have logic like "the build should not use binary bits already in the repo"? Or is it even more specific, like "all build files must come from a single directory"? If it's more generic, how does it work?

  • bgwalter 2 days ago
    Thanks for the reminder that the OSS value is $12 trillion, but only packagers, security experts, and SaaS companies get any of that.

    Rebuilt to Last? It is a Google project, so I give it two years.

    • chubot 2 days ago
      At first I thought this might be promising, given

      > without burden on upstream maintainers

      Then I see

      > This is not an officially supported Google product

      on https://github.com/google/oss-rebuild

      And then I also see

      oss-rebuild uses a public Cloud KMS key to validate attestation signatures. Anonymous authentication is not supported so an ADC credential must be present.

          $ gcloud init
          $ gcloud auth application-default login
      
      I would not use this with a dependency on Google Cloud, or the gcloud command line tool.

      Mainly because Google has horrible customer support.

      It would be more interesting if they came up with something hosted on third-party infrastructure. Last I heard, Google Cloud is run by Oracle executives.

      ---

      E.g. in particular, the UniSuper incident led me to believe that a lot of operational stuff is being outsourced, and is of poor quality:

      UniSuper members go a week with no account access after Google Cloud misconfig

      https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

      Google accidentally deleted a $125 billion pension fund's account

      https://qz.com/google-cloud-pension-fund-unisuper-1851472990

      I would not say this is unrelated, because operations in the underlying cloud can be a weak link in security

      Although I'd certainly be interested in an argument otherwise

    • esafak 2 days ago
      1. That is two years better than nothing.

      2. It will likely be well architected.

      3. It is open source, so others can fork it when Google abandons it. https://github.com/google/oss-rebuild

      • gostsamo 2 days ago
        > 1. That is two years better than nothing.

        This means two migrations in two years.

    • ezekg 2 days ago
      There are some initiatives to help change that: https://osspledge.com
      • blitzar 2 days ago
        Once there are as many private jets available for open-source devs as there are for Google employees, then we are making progress.
        • ezekg 2 days ago
          I think that's the wrong way to frame it. OSS is not meant to make you rich, and expecting that is going to bring more pain than joy. However, I do think businesses should use their success to fund their dependencies in a way that makes sense for them.
          • pessimizer 2 days ago
            > businesses should use their success to fund their dependencies in a way that makes sense for them.

            They already do, and always have. It doesn't make any sense to most of them to fund their OSS dependencies at all, because they're available for free. They should do more than what makes sense for them, and they should have to pay professional consequences if they don't.

            Programmers should have enough unity to bring pressure against companies that make a lot of money from software they don't pay for. Or rather, should have had, because LLMs have changed everything.

  • ChrisArchitect 2 days ago
    Obligatory xkcd: Dependency https://xkcd.com/2347/

    Does this fit in somewhere here?

    • msuozzo 2 days ago
      Author here!

      OSS Rebuild should give that Nebraskan the peace of mind to continue their everyday heroism without being pulled away to set up security configs or debug release CI. The rest of the blocks on top can contribute the support to assure themselves and the community that those critical builds are trustworthy.