What are the pros/cons of deb vs. rpm?
For whatever reason, I’ve always used RPM-based distributions (Fedora, CentOS and currently openSUSE). I have often heard it stated that deb is better than rpm, but when I ask why, I have never been able to get a coherent answer (I usually get some zealous ranting and copious amounts of spittle instead).
I understand there may be some historical reasons, but for modern distributions using the two different packaging methods, can anybody give the technical (or other) merits of one vs. the other?
One thing I like about RPMs is the (recent?) addition of delta RPMs. This allows for easier updating, reducing bandwidth required.
DEBs are standard ar files (with more standard archives inside), RPMs are “proprietary” binary files. I personally think the former is more convenient.
Just two things I can think of off the top of my head. Both are very comparable. Both have excellent tools for packaging. I don’t think there are many merits for one over the other or vice versa.
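To make the "standard ar file" point concrete, here is a minimal Python sketch that lists the members of an ar archive by walking its 60-byte headers. It builds a tiny in-memory archive using the three member names a typical .deb carries (debian-binary, control.tar.gz, data.tar.gz). The payloads are placeholder bytes rather than real tarballs, and the helper names make_ar and list_ar are made up for illustration.

```python
AR_MAGIC = b"!<arch>\n"  # global header that starts every ar archive


def make_ar(members):
    """Build a minimal ar archive from (name, payload) pairs (illustrative helper)."""
    out = bytearray(AR_MAGIC)
    for name, payload in members:
        header = (
            f"{name:<16}"          # member name, space padded
            f"{0:<12}"             # modification time
            f"{0:<6}{0:<6}"        # owner id and group id
            f"{'100644':<8}"       # file mode in octal
            f"{len(payload):<10}"  # payload size in bytes
        ).encode("ascii") + b"`\n"  # two-byte end-of-header marker
        out += header + payload
        if len(payload) % 2:       # members start on even offsets
            out += b"\n"
    return bytes(out)


def list_ar(data):
    """Return [(member_name, size)] for an ar archive held in memory."""
    if data[:8] != AR_MAGIC:
        raise ValueError("not an ar archive")
    members = []
    offset = 8
    while offset < len(data):
        header = data[offset:offset + 60]
        if len(header) < 60 or header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii"))
        members.append((name, size))
        offset += 60 + size + (size % 2)  # skip payload and padding byte
    return members


# Placeholder payloads; a real .deb holds actual tarballs here.
archive = make_ar([
    ("debian-binary", b"2.0\n"),
    ("control.tar.gz", b"ctl"),
    ("data.tar.gz", b"payload"),
])
for member_name, member_size in list_ar(archive):
    print(member_name, member_size)
```

The same walk works on a real .deb read from disk, since nothing beyond the fixed-width header layout is needed to find the members.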
RPM:
- ‘Standardized’ (not that there isn’t a deb spec)
- Used by many different distributions (but packages from one do not necessarily work on another)
- IIRC allows dependencies on files, not only on packages
DEB:
- Growing popularity
- Allows recommendations and suggestions (possibly newer RPM allows it as well)
Probably the more important question is the package manager (dpkg vs. yum vs. aptitude etc.) rather than the package format (as both are comparable).
Debian packages can include an installed size, but I don’t believe RPMs have an equivalent field. It can be computed based on files included in the package, but also can’t be relied upon because of actions that can be taken in the pre/post install scripts.
Here is a pretty good reference for comparison of some specific features that are available for each specific packaging format: http://debian-br.sourceforge.net/txt/alien.htm (according to the web server, that document is fairly old: Last-Modified: Sun, 15 Oct 2000 so this might not be the best reference.)
The main difference for a package maintainer (I think that would be ‘developer’ in Debian lingo) is the way package meta-data and accompanying scripts come together.
In the RPM world, all your packages (the RPMs you maintain) are located in something like ~/rpmbuild. Underneath, there is the SPECS directory for your spec-files, a SOURCES directory for source tarballs, RPMS and SRPMS directories to put newly created RPMs and SRPMs into, and some other things that are not relevant now.
Everything that has to do with how the RPM will be created is in the spec-file: what patches will be applied, possible pre- and post-scripts, meta-data, changelog, everything. All source tarballs and all patches of all your packages are in SOURCES.
Now, personally, I like the fact that everything goes into the spec-file, and that the spec-file is a separate entity from the source tarball, but I’m not overly enthusiastic about having all sources in SOURCES. IMHO, SOURCES gets cluttered pretty quick and you tend to lose track of what is in there. However, opinions differ.
For RPMs it is important to use the exact same tarball as the one the upstream project releases, up to the timestamp. Generally, there are no exceptions to this rule. Debian packages also require the same tarball as upstream, though Debian policy requires some tarballs to be repackaged (thanks, Umang).
Debian packages take a different approach. (Forgive any mistakes here: I am a lot less experienced with debs than I am with RPMs.) Debian packages’ development files are contained in a directory per package.
What I (think I) like about this approach is the fact that everything is contained in a single directory.
In the Debian world, it is a bit more accepted to carry patches in a package that are not (yet) upstream. In the RPM world (at least among the Red Hat derivatives) this is frowned upon. See “FedoraProject: Staying close to upstream projects”.
Also, Debian has a vast number of scripts that are able to automate a huge portion of creating a package. For example, creating a simple package of a setuptools-based Python program is as simple as creating a couple of meta-data files and running debuild. That said, the spec-file for such a package in RPM format would be pretty short, and in the RPM world, too, there’s a lot of stuff that is automated these days.
The openSUSE Build Service (OBS) and zypper are a couple of the reasons I prefer RPM over deb from a packager and user point of view. Zypper has come a long way and is pretty dang fast. OBS, although it can handle debs, is really nice when it comes to packaging rpms for various platforms such as openSUSE, SLE, RHEL, CentOS, Fedora, Mandriva, etc.
A lot of people compare installing software with apt-get to rpm -i, and therefore say DEB is better. This, however, has nothing to do with the DEB file format. The real comparison is dpkg vs rpm and aptitude/apt-* vs zypper/yum.
From a user’s point of view, there isn’t much of a difference in these tools. The RPM and DEB formats are both just archive files, with some metadata attached to them. They are both equally arcane, have hardcoded install paths (yuck!) and only differ in subtle details. Both dpkg -i and rpm -i have no way of figuring out how to install dependencies, except if they happen to be specified on the command line.
On top of these tools, there is repository management in the form of apt-... or zypper/yum. These tools download repositories, track all metadata and automate the downloading of dependencies. The final installation of each single package is handed over to the low-level tools.
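The division of labour described above can be sketched in a few lines of Python: the high-level tool works out which packages are needed and in what order, then hands each one to the low-level installer. This is only a toy model, not how apt, zypper or yum are actually implemented. The repository contents and package names are hypothetical, and version constraints, conflicts and virtual packages are ignored.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical repository metadata: package -> packages it depends on.
REPO = {
    "editor": {"libgui", "libc"},
    "libgui": {"libc", "libpng"},
    "libpng": {"libc"},
    "libc": set(),
    "unrelated": set(),
}


def install_order(requested, repo):
    """Return the requested packages plus their dependencies, dependencies first."""
    needed = {}
    stack = list(requested)
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed[pkg] = repo[pkg]  # a KeyError means the repo lacks the package
        stack.extend(repo[pkg])
    # TopologicalSorter treats the mapped values as predecessors,
    # so every dependency comes out before the package that needs it.
    return list(TopologicalSorter(needed).static_order())


order = install_order(["editor"], REPO)
for pkg in order:
    # A real tool would now call the low-level installer for each package.
    print("install", pkg)
```

Calling dpkg -i or rpm -i by hand is the equivalent of running only the loop body, which is why a missing dependency stops them cold.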
For a long time, apt-get has been superior in processing the enormous amount of metadata really fast while yum would take ages to do it. RPM also suffered from sites like rpmfind where you would find 10+ incompatible packages for different distributions. Apt completely hid this problem for DEB packages because all packages got installed from the same source.
In my opinion, zypper has really closed the gap to apt and there is no reason to be ashamed of using an RPM-based distribution these days. It’s just as good if not easier to use with the openSUSE build service at hand for a huge compatible package index.
There is also the “philosophical” difference where in Debian packages you can ask questions and by this, block the installation process.
The bad side of this is that some packages will block your upgrades until you reply. The good side, also a philosophical difference, is that on Debian-based systems a package is configured (not always as you’d like) and running as soon as it is installed. That is not the case on Red Hat-based systems, where you need to create a configuration file yourself or copy a default/template one from /usr/share/doc/*.
From a system administrator’s point of view, I’ve found a few minor differences, mainly in the dpkg/rpm tool set rather than the package format.
- dpkg-divert makes it possible to have your own file displace the one coming from a package. It can be a lifesaver when you have a program that looks for a file in /usr or /lib and won’t take /usr/local for an answer. The idea has been proposed, but as far as I can tell not adopted, in rpm.
- When I last administered rpm-based systems (which admittedly was years ago, maybe the situation has improved), rpm would always overwrite modified configuration files and move my customizations into *.rpmsave (IIRC). This has made my system unbootable at least once. Dpkg asks me what to do, with keeping my customizations as the default.
- An rpm binary package can declare dependencies on files rather than packages, which allows for finer control than a deb package.
- You can’t install a version N rpm package on a system with version N-1 of the rpm tools. That might apply to dpkg too, except the format doesn’t change as often.
- The dpkg database consists of text files. The rpm database is binary. This makes the dpkg database easy to investigate and repair. On the other hand, as long as nothing goes wrong, rpm can be a lot faster (installing a deb requires reading thousands of small files).
- A deb package uses standard formats (ar, tar, gzip) so you can inspect (and in a pinch tweak) deb packages easily. Rpm packages aren’t nearly as friendly.
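Because the dpkg database is plain text, a few lines of scripting are enough to inspect it. The sketch below parses text laid out in the same stanza style as the dpkg status file: blocks separated by blank lines, each made of Field: value lines, with continuation lines starting with whitespace. The sample stanzas are invented for illustration, and real entries carry more fields than shown here.

```python
# Invented sample in the stanza style used by the dpkg status file.
SAMPLE_STATUS = """\
Package: hello
Status: install ok installed
Version: 2.10-2
Depends: libc6 (>= 2.14)
Description: example package
 A continuation line belongs to the field above it.

Package: broken-tool
Status: install ok half-configured
Version: 0.1-1
"""


def parse_stanzas(text):
    """Split control-style text into a list of {field: value} dicts."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        last = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and last is not None:
                # Continuation of the previous field.
                fields[last] += "\n" + line.strip()
            elif ":" in line:
                name, _, value = line.partition(":")
                last = name.strip()
                fields[last] = value.strip()
        stanzas.append(fields)
    return stanzas


packages = parse_stanzas(SAMPLE_STATUS)
# Find packages that are not cleanly installed.
not_ok = [p["Package"] for p in packages if p["Status"] != "install ok installed"]
print(not_ok)
```

Doing the equivalent against the rpm database means going through the rpm tools themselves, which is exactly the situation where a damaged database hurts most.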
I think the bias comes not from the package format, but from the inconsistencies that used to exist in Red Hat’s repositories.
Back when Red Hat was a distribution (before the days of RHEL, Fedora, and Fedora Core), people would sometimes find themselves in “RPM Hell” or “dependency Hell”. This occurred when a repository would end up with a package that had dependencies (several layers deep, usually) which were mutually exclusive. Or it would arise when two different packages had two mutually exclusive dependencies. This was a problem with the state of the repository, not with the package format. The “RPM Hell” left a distaste for RPM systems among some population of Linux users who had gotten burned by the problem.
For Debian Packages there is a large set of helper scripts, a consistent policy manual and at least one way of doing almost everything. Dependencies are handled very well and can be defined in very good granularity. Re-building packages is very easy with debian packages and well supported by the tools available.
As several responders said, it is not so much that a certain package format is clearly superior. Technically, they may be more or less comparable. From my perspective a lot of the differences, and why people prefer one over the other, have to do with:
- The philosophy of the original package design and the target audience
- The community size, and by extension, the quality and richness of the repositories
Philosophy:
In the Ubuntu/Debian/Mint/… world, users expect the installed package to “just work” once it is installed. This means that during installation, packages are expected to take care of everything needed to actually make them run well, including but not limited to:
- setting up needed or optional cron jobs
- setting up alternatives/aliases
- setting up startup/shutdown scripts
- including all needed configuration files with defaults that make sense
- keeping old versions of libraries and adding the right versioned symlinks to libraries (.so’s) for backward compatibility
- clean support for multi-arch (32 and 64 bit) binaries on same machine
and so on.
In the rpm world (admittedly this was the situation several years back, and it may have improved since then), I found myself having to run additional steps (e.g. chkconfig, enabling cron jobs) to actually make packages really work. This may be OK for sysadmins or people who are knowledgeable about Unix, but it makes the newbie experience suffer. Note that the RPM package format itself does not prevent this from happening; it is just that many packages are de facto not “fully done” from the perspective of a newbie.
Community size, participation, and richness of repositories:
Since the Ubuntu/Debian/Mint/… community is larger, more people are involved in packaging and testing software. I found the richness and quality of the repositories to be superior. In Ubuntu I rarely, if at all, need to download source and build from it. When I switched from Red Hat to Ubuntu at home, the typical RHEL repo had ~3000 packages in it, while at the same time Ubuntu plus universe and multiverse, all available directly from any Canonical mirror, had ~30,000 packages (roughly 10x). Most of the packages I was looking for in RPM format were not readily accessible via a simple search and click in the package manager. They required switching to alternate repositories, searching the rpmfind web site, etc. In most cases, rather than solving the problem, this broke my installation, because nothing correctly restricted which dependencies could or could not be upgraded. I hit the “dependency hell” phenomenon, as described above by Shawn J. Goff.
In contrast, in Ubuntu/Debian I found that I almost never need to build from source. Also, because of:
- The Ubuntu fast (6 month) release cycle
- The existence of fully compatible PPAs which work out of the box
- The single source repositories (all hosted by Canonical), with no need to search for alternative/complementary repos
- Seamless user experience from click to run
I never had to compromise on older versions of packages I cared about, even when they were not maintained by official (Canonical) developers. I never had to leave my favorite friendly GUI package manager to perform a convenient search by keyword, to find and install any package I wanted. Also, a few times I installed Debian (non-Canonical) packages on Ubuntu and they worked just fine, despite this compatibility not being officially guaranteed.
Note that this is not intended to start a flame war, it is just sharing my experience having used both worlds in parallel for several years (work vs home).
None of the other answers touch on these three fundamental differences with real consequences:
1. deb files are basically just ar archives containing two compressed tarballs
2. deb packages and the dpkg system store your maintainer scripts as separate files
3. dpkg and rpm run the maintainer scripts in a different order during upgrades.
Together, these differences have made it much easier for me to fix problems caused by bad packages, and to make packages behave the way I need them to, on deb-based systems than on rpm-based systems, both as a system administrator and as a packager.
Because of #1, if I need to change a deb file, I can trivially pop it open, make any changes I want, and repackage it, using standard tools which exist on most systems.
This includes changing/adding/removing any dependencies, or any of the package files, or any of the maintainer scripts, or changing the package version or name.
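As a rough sketch of that "pop it open, change it, repackage it" workflow, the Python below edits the Depends field inside a gzip-compressed control tarball using only the standard library. It works on an in-memory tarball that the example builds itself. The package name, the versions and the helper functions are all made up for illustration, multi-line fields are not handled, and a real .deb would additionally need its outer ar wrapper rewritten.

```python
import io
import tarfile


def make_control_tar(control_text):
    """Pack a single ./control file into an in-memory gzip-compressed tarball."""
    data = control_text.encode("utf-8")
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="./control")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_control(tar_bytes):
    """Extract the text of ./control from an in-memory tarball."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        return tar.extractfile("./control").read().decode("utf-8")


def set_field(control_text, field, value):
    """Replace a single-line field, or append it if it is missing."""
    lines = control_text.splitlines()
    prefix = field + ":"
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            lines[i] = f"{field}: {value}"
            break
    else:
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical package metadata with a dependency that is too strict.
original = make_control_tar(
    "Package: hello\nVersion: 1.0\nDepends: libfoo (>= 2.0)\n"
)
patched_text = set_field(read_control(original), "Depends", "libfoo (>= 1.5)")
patched = make_control_tar(patched_text)
print(read_control(patched))
```

The point is less the code than how little of it is format-specific: everything here is generic tar and text handling.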
Because of #2, if there is a problem in the "remove" scripts installed by a package that is already installed, I can trivially fix it, using standard tools which exist on any system.
Because of #3, I can do some of those fixes just by releasing a new version of my package, because during upgrade, dpkg runs the "pre-install" script of the new version of the package before the "post-remove" script of the old version.
This means that the surface area for violating the "recoverability principle" is smaller in deb packages: more mistakes in an earlier version of the package can be recovered from with a new version.
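As a rough illustration of difference #3, the success-path script order for an upgrade can be written down as two lists. This is a simplified sketch of the orders as I understand them from Debian Policy and the RPM scriptlet documentation. It leaves out triggers, rpm's %pretrans/%posttrans, and the fallback calls dpkg makes into the new package's scripts when an old script fails, so check the official documentation before relying on it.

```python
# Each step is (which package version owns the script, what happens).
# Simplified success path only; see the caveats in the text above.
DPKG_UPGRADE = [
    ("old", "prerm upgrade"),
    ("new", "preinst upgrade"),
    ("-", "unpack new files"),
    ("old", "postrm upgrade"),
    ("new", "postinst configure"),
]

RPM_UPGRADE = [
    ("new", "%pre"),
    ("-", "install new files"),
    ("new", "%post"),
    ("old", "%preun"),
    ("-", "remove old files"),
    ("old", "%postun"),
]


def new_scripts_before(steps, old_script):
    """List the new version's scripts that have run before a given old script."""
    seen = []
    for owner, action in steps:
        if owner == "old" and action == old_script:
            return seen
        if owner == "new":
            seen.append(action)
    raise ValueError(f"no old-version script named {old_script!r}")


for label, steps in (("dpkg", DPKG_UPGRADE), ("rpm", RPM_UPGRADE)):
    print(label)
    for owner, action in steps:
        print(f"  {owner:>3}  {action}")
```

For example, new_scripts_before(DPKG_UPGRADE, "postrm upgrade") shows that the new version's preinst has already run by the time the old postrm is called, which is the window the paragraph above relies on.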
And since modifying the package is so easy (the actual package-specific fiddling and knowledge overhead is tiny), it’s accessible to more people and takes less time and effort with deb files.