Thursday, February 22, 2007

PostgreSQL RPM vs Debian Packages


I've been spoiled by Debian.

Today I had to install PostgreSQL on a server at work. It was a Red Hat EL 4 server, with the x86_64 kernel. Luckily -- I thought -- I found RPM packages made by the Command Prompt folks.

Downloading the RPMs was a bit of a pain. The PostgreSQL download service first redirects you to a page where you select a mirror, and when you do, the link to the file you want is URL-encoded, so it ends up looking like http://wwwmaster.postgresql.org/redir?ftp%3A%2F%2Fftp5.us.postgresql.org%2Fpub%2FPostgreSQL%2Fbinary%2Fv8.2.3%2Flinux%2Frpms%2Fredhat%2Frhel-es-4-x86_64%2Fpostgresql-8.2.3-1PGDG.x86_64.rpm

That looks really amateurish. C'mon folks, you can do much better than that for almost no extra effort (and I'll do the work). How many of the servers you work on have a browser up and running? Or even a graphical interface? When I need to get things onto a server, I usually just use wget.

But with the URL-encoded link to the file, I can't just paste that into the command. I had to open up a Python interpreter, import urllib, run urllib.unquote() on the URL, and then paste the result into the command line. That's a lot of unnecessary work because of some (apparently) lazy programming on the PostgreSQL site.
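A minimal sketch of that decoding step, assuming Python 3, where the function lives at urllib.parse.unquote rather than the Python 2 urllib.unquote() mentioned above. The helper name decode_redirect is made up for illustration, and the sketch assumes the download target is everything after the first "?" in the redirect link.

```python
from urllib.parse import unquote

# The redirect link from the mirror page, exactly as quoted above.
REDIRECT_URL = (
    "http://wwwmaster.postgresql.org/redir?"
    "ftp%3A%2F%2Fftp5.us.postgresql.org%2Fpub%2FPostgreSQL%2Fbinary%2Fv8.2.3"
    "%2Flinux%2Frpms%2Fredhat%2Frhel-es-4-x86_64%2F"
    "postgresql-8.2.3-1PGDG.x86_64.rpm"
)


def decode_redirect(url):
    """Return the percent-decoded target after the first '?' (hypothetical helper)."""
    # Everything after the first '?' is the encoded download URL.
    _, _, encoded_target = url.partition("?")
    return unquote(encoded_target)


if __name__ == "__main__":
    # Prints a plain ftp:// URL that can be handed to wget.
    print(decode_redirect(REDIRECT_URL))
```

The printed value is an ordinary ftp:// address, which is the form wget needs on a server with no browser.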

So, I finally grab all the RPMs. I install them. Being used to Debian, I now expect to have a functioning PostgreSQL server that I have to tune a little bit to get away from the conservative defaults.

Man, was I wrong.

The RPMs install PostgreSQL, but it doesn't work at all. The packages don't even perform the necessary initdb. The configuration files are not even in place. I had to run rpm -ql on the postgresql package and grep the output for conf to find out where the config files were. They were in /usr/share/pgsql, and they were just fully commented-out samples.

So I grab the sample files, copy them to /etc/sysconfig/pgsql (which I also found by grepping the package listing), and remove the .sample suffix from each of them. I expected the package to have created /etc/postgresql or something similar, but sysconfig is Red Hat's thing (apparently at least, since Debian doesn't have it), so I let it slide, somewhat grudgingly.

I modify the config files to my liking. I try to start PostgreSQL. Oops, the packages didn't run initdb, so I have to do that myself. OK, it seems to have started. I try to get my application to connect and it fails with an ident error. I look at my pg_hba.conf file (it controls how clients connect to the server; hba stands for host-based authentication).

I look at my pg_hba.conf and it looks fine, what gives? Well, apparently the config files have to be in /var/lib/pgsql, the directory where the PostgreSQL databases are stored. Now that doesn't make any sense. Configuration files are meant to be put in /etc, not /var, much less the PostgreSQL data directory.

I then modify the /etc/init.d/postgresql script and add -c config_file=/etc/sysconfig/pgsql/postgresql.conf to the line that starts the daemon. I then modify postgresql.conf to have the explicit path to the pg_hba.conf file.
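The postgresql.conf edit can be sketched in Python. This assumes the setting in question is hba_file, the PostgreSQL parameter that names the pg_hba.conf location, and that pg_hba.conf sits next to the copied postgresql.conf under /etc/sysconfig/pgsql. The helper set_param and the sample text are hypothetical, for illustration only.

```python
import re


def set_param(conf_text, name, value):
    """Set name = 'value' in postgresql.conf-style text (hypothetical helper).

    Replaces the first line that sets the parameter, commented out or not,
    and appends a new line if the parameter is not present at all.
    Values containing single quotes are not handled in this sketch.
    """
    new_line = f"{name} = '{value}'"
    pattern = re.compile(
        rf"^[ \t]*#?[ \t]*{re.escape(name)}[ \t]*=.*$", re.MULTILINE
    )
    if pattern.search(conf_text):
        # A function replacement avoids any escape processing of the path.
        return pattern.sub(lambda _match: new_line, conf_text, count=1)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


if __name__ == "__main__":
    # Illustrative stand-in for a fully commented-out sample file.
    sample = "#hba_file = 'ConfigDir/pg_hba.conf'\nmax_connections = 100\n"
    # Assumed location: alongside postgresql.conf in /etc/sysconfig/pgsql.
    print(set_param(sample, "hba_file", "/etc/sysconfig/pgsql/pg_hba.conf"))
```

The config_file option itself still has to be passed on the command line in the init script, since the server needs it before it can read postgresql.conf.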

Restart PostgreSQL. It finally works. What a workout! Here's what would have happened in Debian:

aptitude install postgresql-8.1. Apt grabs all the dependencies, installs them, and I have a working PostgreSQL installation that I can connect to (locally). That's it.

The config files, which I may modify if I want to, are where I expect them: /etc. And they are working files with the right names (i.e. without .sample suffixes that I have to remove).

Yes, I know I could have used yum or something like it to simplify the packages part. But it would still have left everything else to do, and that was the most time-consuming part. With Debian, everything is simpler. It just helps me get the job done faster. Because package owners have to adhere to a set of standards, packages behave very consistently, which further reduces guesswork and helps me be more productive.

So instead of saying "Debian spoiled me", maybe I should have just said "Debian got me used to being productive". And it's hard going back to not being so productive.

5 comments:

  1. Linux is *so* easy to use I don't know why more people haven't adopted it for their only operating system

  2. Now, now, we are talking about a *server* here, not your mamma's desktop.

    For setting up a fully-featured database server such as PostgreSQL, in Debian, that's pretty darn easy. Things are so much harder in Red Hat land.

  3. Hmmm. Thinking about it in Windows land...

    Download the package, unzip, and double-click the exe, during which, you are asked a number of questions about what to do when, where and why. No wonder people just "Next","Next","Next" as fast as they can. PostgreSQL installed. Time to reboot the machine.

    Yup. Debian-based distros, taking advantage of apt, make installing software *much* easier than other package managers or operating systems do.

  4. You are comparing RPM versus APT, which is terribly wrong. Please compare RPM with dpkg, or compare apt with yum.

    So, if you need a yum repo, please use http://yum.pgsqlrpms.org for RPM installations of PostgreSQL and PostgreSQL related projects.

    We don't perform initdb -- it is intentional and users are warned against this.

    All distros use "ident sameuser" auth in prepackaged versions -- this is for security and IMHO there is no reason to change this behaviour.

    Also, we don't use /etc for PostgreSQL config files -- it is not PostgreSQL's default. Per documents, it is under $PGDATA. It is Debian that changes this behaviour.

    Sincerely,
    --
    Devrim GÜNDÜZ - PostgreSQL RPM Maintainer

  5. Devrim,

    I *wasn't* comparing RPM to apt. I'm very aware of their differences. I was comparing the idiomatic, normal, *standard* usage of Debian packages versus RPM packages, and their usability at the time, from my point of view.

    As far as I know YUM is an *add-on* tool, not an official part of Red Hat, not sure it's part of CentOS. It's probably part of Fedora though.

    I wasn't given any warnings about the need to run initdb.

    Yes, the default for PG config is $PGDATA, but that's not the LSB default. If I was expecting all the defaults of building from scratch, I would have built from scratch, not looked for a binary package.

    The Debian packages:

    1) Put a *working* set of configuration files (with decent defaults) under /etc, where I expect configuration files to go.

    If you prefer them to be under /var/lib/pgsql to stay close to the PG default, fine, but you could easily include symlinks from /etc/postgresql or something similar, thereby increasing usability.

    2) Give me a working cluster by running initdb for me.

    3) Set up cron jobs to do regular, commonly-needed maintenance (a pg_maintenance script that runs vacuum analyze)

    4) Handle minor and major version upgrades smoothly with scripts that know how to handle dumping, creating, removing, and loading of clusters.

    In short, the Debian packages give me a much smoother and more productive experience by taking care of the defaults and giving me a working installation.

    Peter Eisentraut is part of the team that packages PG for Debian, and I'm sure their packaging scripts are under some free software license.
