Setting up a mirror of the Rufus.W3.Org RPM database
This page explains how to set up a Web database for RPM packages similar to the one running on rpmfind.net . You should first get
acquainted with the mirroring principle described briefly in the mirroring proposal. The setup itself should be
fairly simple:
Prerequisites
- You must of course have a Web server running. I suggest Apache, the obvious choice for a
Linux machine; it's probably installed by default anyway.
- You should run a mirror of the RDF database available on ftp://rpmfind.net/linux/RDF . To help
bootstrap the mirroring process, it may prove more efficient to first fetch
a compressed archive of
the whole RDF tree and expand it. Note that you don't need to mirror
the full tree: you can choose to prune some of the subtrees (but do not
break the overall structure!). I suggest using rsync to do the mirroring. An
alternative is to use the mirror-2.8 perl
script, but it's somewhat more difficult to set up.
- You should get a recent copy of rpm2html and install it; you can grab an rpm, for example :-)
(the version must be >= 0.90, and it's generally a good idea to follow
the releases closely).
- Of course, you need disk space: currently the RDF tree requires
1.3 GBytes, while the full HTML tree built from it consumes nearly 4 GBytes.
- Subscribe to the rpm2html mailing-list: send a mail to [email protected] with the
line
subscribe rpm2html
in the body of the message. The list archives are
on-line.
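The disk space requirement can be checked before starting. The following is a minimal sketch, assuming a POSIX shell with df and awk available; the function names avail_kb and has_room are hypothetical helpers, not part of rpm2html.

```shell
#!/bin/sh
# Sketch only: avail_kb and has_room are hypothetical helper names.

# Print the free space, in kilobytes, on the filesystem holding directory $1.
# "df -P" forces the portable one-line-per-filesystem output format.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 kilobytes available.
has_room() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}
```

For example, "has_room /home/httpd/html 4194304 || echo 'not enough room'" would warn when less than 4 GBytes are free under that directory.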
Setting up the mirror
You need to replicate the RDF database available on ftp://rpmfind.net/linux/RDF .
The simplest way is to use rsync; the command is simply
rsync -az --delete rpmfind.net::RDF /linux/RDF
if you want to keep the metadata mirror under /linux/RDF. Note also that I am
interested in people providing HTTP access to the metadata, so on a standard Linux
setup /home/httpd/html/linux/RDF would be even better!
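Since --delete removes local files that are absent on the server, it may be worth previewing a run first, and the same wrapper is a natural place to prune a subtree. This is a minimal sketch, assuming a POSIX shell; the function name mirror_rdf and the pruned subtree "contrib/" are hypothetical, while the source and destination are the ones from the command above.

```shell
#!/bin/sh
# Sketch only: mirror_rdf and the PRUNE value are illustrative assumptions.
RDF_SRC="rpmfind.net::RDF"   # rsync module named in the text
RDF_DEST="/linux/RDF"        # local destination named in the text
PRUNE="contrib/"             # hypothetical subtree to leave out of the mirror

# Mirror the RDF tree, skipping the pruned subtree.
# Any extra arguments (for example --dry-run) are passed through to rsync.
mirror_rdf() {
    rsync -az --delete --exclude="$PRUNE" "$@" "$RDF_SRC" "$RDF_DEST"
}

# Usage:
#   mirror_rdf --dry-run -v   # list what would change without touching disk
#   mirror_rdf                # perform the real transfer
```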
If you want to use mirror instead, install it (it is a set of
perl scripts dedicated to the job of mirroring FTP sites) and add an entry
for the RDF repository to the default configuration (usually named
mirror.defaults). Just add the following lines at the end of your
mirror.defaults:
package=rdf
        site=rpmfind.net
        remote_dir=/linux/RDF
        local_dir=/home/httpd/html/linux/RDF
        remote_user=anonymous
        remote_password=me@machine RDF mirroring
Try it by launching "mirror -d -p rdf" and check for possible problems.
Setting up the rpm2html config file
I suggest grabbing my
existing config file and modifying it. This is a bit painful, but hopefully
it has to be done only once:
Modify the Global section
- Change the maint and mail values to reflect your name and
preferred E-mail address for feedback.
- Change the dir path to the actual directory where the HTML files
have to be produced (something like /home/httpd/html/RPM if you use the
standard apache setup). This has to be in your server's exported space, and
the tree can grow large (the full HTML tree is nearly 4 GBytes, as noted in
the prerequisites), so check first that you have enough space!
- Change url to the prefix used to access the pages on your HTTP server.
For example, if you are serving them from /home/httpd/html/RPM, the
full URL to access them is http://my.server.org/RPM and the correct
value would be: url=/RPM .
- Remove any rdf=true or rdf_dir=/linux/RDF if present;
those are used on rufus to create the .rdf files from the .rpm ones. You
don't need them on a mirror.
Modify each Directory section
After the global section, the config file is a list of directory-specific
entries, each usually related to one specific distribution. The goal here is
to adapt it to your local filesystem and point to the local FTP mirrors (for
example, you wouldn't point directly to the RedHat site but to one of the mirrors
in your area). You may drop some of the directories if you are too tight on
space or if there is no nearby mirror for this specific distribution. Let's
examine one entry:
- [/linux/RDF/redhat/5.0/i386] : change /linux
to the actual location on your disk for the mirror, e.g.:
[/home/ftp/pub/mirror/redhat/5.0/i386]
- name=RedHat-5.0 for i386 : You probably don't have to change
the name of the distribution, unless you want to translate it.
- subdir=redhat/5.0/i386 : local path, don't change it !
- ftp=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS
: The origin server for the packages, don't change it !
- ftpsrc=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/SRPMS : The
origin server for the sources; you may want to point to a nearby server
providing the source RPMs.
- color=#ffe0ff : Color code for this distribution; you can change
it, but avoid giving nearly the same color to two different
distributions.
- mirror=ftp://rpmfind.net/linux/redhat/redhat-5.0/alpha/RedHat/RPMS
: The nearest mirror; customize it to reduce bandwidth
usage (don't reference the rufus server if you are located in
Australia!).
- mirror=ftp://ftp.redhat.com/pub/redhat/redhat-5.0/alpha/RedHat/RPMS
: additional mirrors may be added; rpm2html currently doesn't use
this feature, but will in the near future ...
Note that if you changed the configuration file for an existing setup,
you need to pass the -force option to rpm2html to ensure that all the pages
are updated.
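Before running rpm2html, it can help to confirm that every directory section points to a path that exists on your disk. The following is a minimal sketch, assuming a POSIX shell and assuming that each directory section header sits alone on its line in the form [/absolute/path], as in the entry examined above; the function names list_config_dirs and check_config_dirs are hypothetical and not part of rpm2html.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Print the path found inside every "[/...]" section header of config file $1.
list_config_dirs() {
    sed -n 's|^\[\(/[^]]*\)][[:space:]]*$|\1|p' "$1"
}

# Report section paths that are not existing directories.
# Returns 0 if all of them exist, 1 otherwise.
check_config_dirs() {
    missing=$(list_config_dirs "$1" | while IFS= read -r dir; do
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done)
    if [ -n "$missing" ]; then
        printf 'missing directories:\n%s\n' "$missing" >&2
        return 1
    fi
    return 0
}

# Usage:
#   check_config_dirs config.rpm2html.mirrors && rpm2html config.rpm2html.mirrors
```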
Run rpm2html
Try it:
rpm2html config.rpm2html.mirrors
Check for error messages, indicating path or directory rights problems,
then point your favorite browser to the Web pages and ensure that the links
generated internally are correct, as well as the outside links to the actual
RPM mirrors.
Automate the process
Add the mirror command that updates the RDF directory and the call to rpm2html to
your crontab. Note that rpm2html never cleans up generated pages that are
no longer accurate, so your cron job needs to remove them before running
rpm2html:
- 0 4 * * * /usr/local/lib/mirror/mirror
- 30 6 * * * find /serveur/WWW/public/linux/RPM -not -type d -mtime +15 -exec rm {} \; ; /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors
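The cleanup step of the second entry can also be kept in a small script, which makes it easier to test on a scratch directory before pointing it at the real HTML tree. This is a minimal sketch, assuming a POSIX shell; the function name clean_stale_pages is hypothetical, and it uses the portable "!" operator where the cron entry uses the GNU find "-not".

```shell
#!/bin/sh
# Sketch only: clean_stale_pages is a hypothetical helper name.

# Delete every non-directory under $1 last modified more than $2 days ago.
# Directories themselves are kept so the tree layout stays intact.
clean_stale_pages() {
    html_dir=$1
    max_age_days=$2
    [ -d "$html_dir" ] || return 1
    find "$html_dir" ! -type d -mtime +"$max_age_days" -exec rm -f {} \;
}

# Usage, mirroring the cron entry above:
#   clean_stale_pages /serveur/WWW/public/linux/RPM 15
#   /usr/bin/rpm2html -q /usr/share/rpm2html.config.mirrors
```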
Announce it and register
Once you have a working setup, it would be cool to announce it to the rpm2html mailing-list and to your
local Linux users group. Don't forget to give location (country, state)
information, as well as the dataset indexed if you don't run the full archive.
This has to be shared! Contact me if you
want to localize the output of rpm2html; it's not that hard!
Daniel Veillard
$Id: mirror.html,v 1.10 2001/07/17 22:50:09 veillard Exp $