“I would never have thought you could trust a random unauthenticated person on the Internet to provide a big binary image for the OS platform your company uses to distribute its app. And maybe you can’t. But people do…”
See also Our Software Dependency Problem
For decades, discussion of software reuse was far more common than actual software reuse. Today, the situation is reversed: developers reuse software written by others every day, and the situation goes mostly unexamined.
Software dependencies carry with them serious risks that are too often overlooked. The shift to easy, fine-grained software reuse has happened so quickly that we do not yet understand the best practices for choosing and using dependencies effectively, or even for deciding when they are appropriate and when not.
Interesting. I wonder if there is a corresponding issue for cloud images. I know Amazon AWS has “community” (i.e. user-contributed) Amazon Machine Images that you can use as the starting point for your “Infrastructure as a Service” VMs, but I don’t know if there’s any kind of quality control.
I have no real experience with other IaaS tools, but it wouldn’t surprise me if they also had uninspected user-contributed starting images.
Docker does seem to have some tools for inspecting and labeling images: the “new” (since Dec. 2018) Docker Hub has “Docker Certified” images, and Docker in general has a “content trust” scheme and/or tool. Possibly these are premium tools.
Many years ago, I used to feel comfortable building everything off a Red Hat CD or similar image. I supposed that the organisation had the competence, and valued its reputation highly enough, to do as good a job as it could, and also that any subsequent reason to distrust a released image would make a big enough noise that I’d certainly become aware of it.
But now, taking into account the ‘left pad’ debacle, I’d say we’ve collectively abdicated our role in judging trustworthiness. It’s interesting that Google, for example, must internally pay a huge amount of attention to the trustworthiness of everything in their stack, as exposed as they are. See the second link in my post (which I’d forgotten to paste in…)
I still have a pretty high level of trust in signed, mainstream distro packages, more or less for the reasons you gave: the developers mostly know what they’re doing, and the user base is pretty big, so issues will come up quickly, and even for bad bugs, a workaround will likely become available.
Once you get farther afield, though, it definitely gets twitchier. NPM is conspicuous because they had that famous incident, but your random Ubuntu PPA, Conda Python, pip, Ruby gems, MacPorts, and various others could potentially do something similar.
Indeed they could! At minimum I’d like them to be signed, so they are not modified in transit by three-letter agencies. But also I’d like them to be filtered through someone or something with high reputation. I used to be quite keen on Turnkey Linux for this reason. But it might just be one coder in a shed.
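For what it's worth, here is a minimal Python sketch of the weakest form of that "not modified in transit" check: comparing a downloaded image against a SHA-256 digest the publisher has posted. The function names are made up for illustration, and it assumes you obtained the expected digest over a channel you trust more than the download itself. It is not a substitute for a real signature check, since a bare digest says nothing about who produced the image.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that a multi-gigabyte image does not have to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches(path, expected_hex):
    """Return True if the file's digest equals the published one.

    expected_hex is assumed to be the hex digest copied from the
    publisher's site; surrounding whitespace and letter case are ignored.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

You would call something like `image_matches("some-image.iso", published_digest)` and refuse to boot or install anything for which it returns False. That still leaves the reputation question open, which is the harder part.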