Creating a local Ubuntu mirror using rsync

2009-06-09 · Programming and Development

by Donald » Thu, 08 Nov 2007 @ 10:53am

The benefit of having a local mirror is that you can install any package without having to wait for long downloads. It is also helpful if you have to regularly maintain or install a lot of Ubuntu machines. This guide will show you how to create and maintain your own local Ubuntu mirror using rsync. Other options for package mirroring are apt-mirror, apt-proxy and debmirror.

Beware that an Ubuntu mirror holding only the i386 architecture uses over 120GB of disk space (Ubuntu supports the i386, amd64, powerpc and sparc architectures), and the initial sync to download all the packages can take days or weeks on a 512k Streamyx connection. To estimate the time, divide 120GB by your connection speed. If you live around the Kuching area, you are welcome to copy my existing mirror if you bring a hard disk along.
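As a rough worked example of that division, here is a small sketch. It assumes 120GB means 120 x 10^9 bytes, that "512k" means 512 kbit/s, and that the line runs at full speed with no protocol overhead. The function name sync_days is made up for illustration.

```shell
#!/bin/sh
# Estimate how many whole days an initial sync takes.
# $1 = mirror size in GB (10^9 bytes), $2 = link speed in kbit/s.
# Assumes the link is fully used and ignores protocol overhead.
sync_days() {
    size_gb=$1
    speed_kbit=$2
    bits=$((size_gb * 8 * 1000 * 1000 * 1000))   # gigabytes -> bits
    seconds=$((bits / (speed_kbit * 1000)))        # bits / (bits per second)
    echo $((seconds / 86400))                      # seconds -> whole days
}

sync_days 120 512    # prints 21
```

Under those assumptions the first sync takes about three weeks, which matches the "days or weeks" estimate.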

1. Install rsync and download the anonftpsync script

sudo aptitude install rsync
wget -c http://www.debian.org/mirror/anonftpsync
mv anonftpsync anonftpsync-ubuntu
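The header of the script says it needs rsync 2.6.4 or newer. A quick way to check the installed version might look like the sketch below. It assumes a sort that supports the -V (version sort) option, as GNU sort does, and the helper name version_ge is made up for illustration.

```shell
#!/bin/sh
# Return success if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The first line of `rsync --version` looks like
# "rsync  version 3.2.7  protocol version 31"; field 3 is the version number.
installed=$(rsync --version 2>/dev/null | head -n 1 | awk '{print $3}')

if [ -n "$installed" ] && version_ge "$installed" 2.6.4; then
    echo "rsync $installed is new enough"
else
    echo "rsync is missing or older than 2.6.4"
fi
```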

2. Configure the script. Here's my configuration. The script is well documented.

nano anonftpsync-ubuntu

#! /bin/sh
set -e

# This script originates from http://www.debian.org/mirror/anonftpsync

# CVS: cvs.debian.org:/cvs/webwml - webwml/english/mirror/anonftpsync
# Version: $Id: anonftpsync,v 1.33 2007/09/12 15:19:03 joy Exp $

# Note: You MUST have rsync 2.6.4 or newer, which is available in sarge
# and all newer Debian releases, or at http://rsync.samba.org/

# Don't forget:
# chmod u+x anonftpsync

# Set the variables below to fit your site. You can then use cron to have
# this script run daily to automatically update your copy of the archive.

# TO is the destination for the base of the Debian mirror directory
# (the dir that holds dists/ and ls-lR).
# (mandatory)

TO=/mnt/mirrorsite/ubuntu

# RSYNC_HOST is the site you have chosen from the mirrors file.
# (http://www.debian.org/mirror/list-full)
# (mandatory)
# (for Ubuntu mirrors, see https://wiki.ubuntu.com/Mirrors)

# Set this to the rsync host of the mirror you chose.
RSYNC_HOST=

# RSYNC_DIR is the directory given in the "Packages over rsync:" line of
# the mirrors file for the site you have chosen to mirror.
# (mandatory)

# Set this to the rsync module/directory your mirror publishes for Ubuntu.
RSYNC_DIR=

# LOGDIR is the directory where the logs will be written to
# (mandatory)

LOGDIR=/var/log/mirroring

# ARCH_EXCLUDE can be used to exclude a complete architecture from
# mirroring. Please use a space-separated list.
# Possible values are:
# alpha, amd64, arm, hppa, hurd-i386, i386, ia64, m68k, mipsel, mips, powerpc, s390, sh and sparc
# There is one special value: source
# This is not an architecture but will exclude all source code in /pool
# eg.
# ARCH_EXCLUDE="alpha arm hppa hurd-i386 ia64 m68k mipsel mips s390 sparc"
# With a blank ARCH_EXCLUDE you will mirror all available architectures
# (optional)

ARCH_EXCLUDE="amd64 powerpc sparc"

# EXCLUDE is a list of parameters listing patterns that rsync will exclude, in
# addition to the architectures excluded by ARCH_EXCLUDE.
# Use ARCH_EXCLUDE to exclude specific architectures or all sources
# --exclude stable, testing, unstable options DON'T remove the packages of
# the given distribution. If you want to do so, use debmirror instead.
# The following example would exclude mostly everything:
#EXCLUDE="\
#  --exclude stable/ --exclude testing/ --exclude unstable/ \
#  --exclude source/ \
#  --exclude *.orig.tar.gz --exclude *.diff.gz --exclude *.dsc \
#  --exclude /contrib/ --exclude /non-free/ \
# "

# With a blank EXCLUDE you will mirror the entire archive, except the
# architectures excluded by ARCH_EXCLUDE.
# (optional)

EXCLUDE=

#EXCLUDE="\
#  --exclude *.orig.tar.gz --exclude *.diff.gz \
# "

# MAILTO is the address to send logfiles to;
# if it is not defined, no mail will be sent
# (optional)

MAILTO=

# LOCK_TIMEOUT is a timeout in minutes. Defaults to 360 (6 hours).
# This program creates a lock to ensure that only one copy
# of it is mirroring any one archive at any one time.
# Locks held for longer than the timeout are broken, unless
# a running rsync process appears to be connected to $RSYNC_HOST.

LOCK_TIMEOUT=360

# There should be no need to edit anything below this point, unless there
# are problems.


# If you are accessing a rsync server/module which is password-protected,
# uncomment the following lines (and edit the other file).

# . ftpsync.conf


# Check for some environment variables
if [ -z "$TO" ] || [ -z "$RSYNC_HOST" ] || [ -z "$RSYNC_DIR" ] || [ -z "$LOGDIR" ]; then
    echo "One of the following variables seems to be empty:"
    echo "TO, RSYNC_HOST, RSYNC_DIR or LOGDIR"
    exit 2
fi

if ! [ -d "${TO}/project/trace/" ]; then
    # we are running mirror script for the first time
    umask 002
    mkdir -p "${TO}/project/trace"
fi

# Note: on some non-Debian systems, hostname doesn't accept -f option.
# If that's the case on your system, make sure hostname prints the full
# hostname, and remove the -f option. If there's no hostname command,
# explicitly replace `hostname -f` with the hostname.

HOSTNAME=`hostname -f`

# The hostname must match the "Site" field written in the list of mirrors.
# If hostname doesn't return the correct value, fill in and uncomment the line below
# HOSTNAME=mirror.domain.tld

LOCK="${TO}/Archive-Update-in-Progress-${HOSTNAME}"


# The temp directory used by rsync --delay-updates is not
# world-readable remotely. It must be excluded to avoid errors.
TMP_EXCLUDE="--exclude .~tmp~/"

# Exclude architectures defined in $ARCH_EXCLUDE
for ARCH in $ARCH_EXCLUDE; do
    EXCLUDE=$EXCLUDE"\
        --exclude binary-$ARCH/ \
        --exclude disks-$ARCH/ \
        --exclude installer-$ARCH/ \
        --exclude Contents-$ARCH.gz \
        --exclude Contents-$ARCH.diff/ \
        --exclude arch-$ARCH.files \
        --exclude arch-$ARCH.list.gz \
        --exclude *_$ARCH.deb \
        --exclude *_$ARCH.udeb "
    # "=" rather than "==": the script runs under /bin/sh
    if [ "$ARCH" = "source" ]; then
        SOURCE_EXCLUDE="\
            --exclude source/ \
            --exclude *.tar.gz \
            --exclude *.diff.gz \
            --exclude *.dsc "
    fi
done

# Logfile
LOGFILE=$LOGDIR/ubuntu-mirror.log

# Get in the right directory and set the umask to be group writable
cd $HOME
umask 002

# Check to see if another sync is in progress
if [ -f "$LOCK" ]; then
    # Note: this requires the findutils find; for other finds, adjust as necessary
    if [ "`find $LOCK -maxdepth 1 -amin -$LOCK_TIMEOUT`" = "" ]; then
        # Note: this requires the procps ps; for other ps', adjust as necessary
        if ps ax | grep '[r]'sync | grep -q $RSYNC_HOST; then
            echo "stale lock found, but a rsync is still running, aiee!"
            exit 1
        else
            echo "stale lock found (not accessed in the last $LOCK_TIMEOUT minutes), forcing update!"
            rm -f $LOCK
        fi
    else
        echo "current lock file exists, unable to start rsync!"
        exit 1
    fi
fi

touch $LOCK
# Note: on some non-Debian systems, trap doesn't accept "exit" as signal
# specification. If that's the case on your system, try using "0".
trap "rm -f $LOCK" exit

set +e

# First sync /pool
rsync --recursive --links --hard-links --times --verbose \
    $TMP_EXCLUDE $EXCLUDE $SOURCE_EXCLUDE \
    $RSYNC_HOST::$RSYNC_DIR/pool/ $TO/pool/ >> $LOGFILE 2>&1
result=$?

if [ 0 = $result ]; then
    # Now sync the remaining stuff
    rsync --recursive --links --hard-links --times --verbose --delay-updates --delete-after \
        --exclude "Archive-Update-in-Progress-${HOSTNAME}" \
        --exclude "project/trace/${HOSTNAME}" \
        $TMP_EXCLUDE $EXCLUDE $SOURCE_EXCLUDE \
        $RSYNC_HOST::$RSYNC_DIR/ $TO/ >> $LOGFILE 2>&1

    LANG=C date -u > "${TO}/project/trace/${HOSTNAME}"
else
    echo "ERROR: Help, something weird happened" | tee -a $LOGFILE
    echo "mirroring /pool exited with exitcode" $result | tee -a $LOGFILE
fi

if ! [ -z "$MAILTO" ]; then
    mail -s "debian archive synced" $MAILTO < $LOGFILE
fi

savelog $LOGFILE >/dev/null

rm $LOCK

Do not modify the rest of the file. Save and quit.
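To see what ARCH_EXCLUDE does, the sketch below prints the rsync --exclude options that correspond to each listed architecture. It reuses the patterns from the "Exclude architectures" section of the script. The function name arch_excludes is made up for illustration and is not part of anonftpsync.

```shell
#!/bin/sh
# Print one rsync --exclude option per line for every architecture in $1.
# The patterns are the ones listed in the script's "Exclude architectures" section.
arch_excludes() {
    for arch in $1; do
        for pattern in "binary-$arch/" "disks-$arch/" "installer-$arch/" \
                       "Contents-$arch.gz" "Contents-$arch.diff/" \
                       "arch-$arch.files" "arch-$arch.list.gz" \
                       "*_$arch.deb" "*_$arch.udeb"; do
            printf '%s %s\n' --exclude "$pattern"
        done
    done
}

# The setting used in this guide: nine patterns per excluded architecture.
arch_excludes "amd64 powerpc sparc"
```

With the three architectures above excluded, only i386 binaries (plus sources and architecture-independent files) remain in the mirror.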

3. Make the script executable

chmod u+x anonftpsync-ubuntu

4. Create the necessary directories

sudo mkdir /mnt/mirrorsite
sudo mkdir /mnt/mirrorsite/ubuntu
sudo mkdir /var/log/mirroring
sudo chown my-username:my-username /mnt/mirrorsite/ubuntu
sudo chown my-username:my-username /var/log/mirroring
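Before starting the first sync it may be worth confirming that the target filesystem has room for the roughly 120GB mentioned earlier. The following is a minimal sketch. The function name free_gb and the MIRROR_DIR variable are made up for illustration, and GB is taken as 10^9 bytes.

```shell
#!/bin/sh
# Print the free space, in whole GB (10^9 bytes), on the filesystem holding $1.
# `df -Pk` prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 * 1024 / 1000000000 }'
}

# Warn if the mirror directory exists and has less than 120 GB free.
target=${MIRROR_DIR:-/mnt/mirrorsite/ubuntu}
if [ -d "$target" ] && [ "$(free_gb "$target")" -lt 120 ]; then
    echo "warning: less than 120 GB free on $target"
fi
```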

5. Run the script and wait a long long time.

sh anonftpsync-ubuntu &

You can monitor the progress of the downloads by running

tail -f /var/log/mirroring/ubuntu-mirror.log

6. Using the mirror.

My mirror is served over HTTP. Install Apache and create a link to /mnt/mirrorsite/ubuntu so that the mirror can be accessed at http://servername/ubuntu.
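One way to create that link is a symlink inside the web server's document root. This is only a sketch: the function name link_mirror is made up, /var/www is an assumed document root (check your Apache configuration), and Apache must be allowed to follow symlinks there (the FollowSymLinks option).

```shell
#!/bin/sh
# Link the mirror directory into a web server's document root so that it is
# served under /ubuntu. $1 = mirror directory, $2 = document root.
# -n replaces an existing "ubuntu" symlink instead of descending into it.
link_mirror() {
    ln -sfn "$1" "$2/ubuntu"
}

# Hypothetical usage, run as root (/var/www is an assumed document root):
#   aptitude install apache2
#   link_mirror /mnt/mirrorsite/ubuntu /var/www
```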

To avoid editing the usual lines in sources.list, I created a DNS entry on my server that points my.archive.ubuntu.com at the mirror's local IP address. This is particularly useful for laptop users who move around: outside the local network the same name resolves to the public my.archive.ubuntu.com mirror, so updates keep working. The only lines that need changing in sources.list are the ones for the security repository. If you are extremely security conscious, you might skip this change. Here's part of the sources.list:

# deb http://security.ubuntu.com/ubuntu gutsy-security main restricted
# deb-src http://security.ubuntu.com/ubuntu gutsy-security main restricted
# deb http://security.ubuntu.com/ubuntu gutsy-security universe
# deb-src http://security.ubuntu.com/ubuntu gutsy-security universe
# deb http://security.ubuntu.com/ubuntu gutsy-security multiverse
# deb-src http://security.ubuntu.com/ubuntu gutsy-security multiverse

deb http://my.archive.ubuntu.com/ubuntu gutsy-security main restricted
deb-src http://my.archive.ubuntu.com/ubuntu gutsy-security main restricted
deb http://my.archive.ubuntu.com/ubuntu gutsy-security universe
deb-src http://my.archive.ubuntu.com/ubuntu gutsy-security universe
deb http://my.archive.ubuntu.com/ubuntu gutsy-security multiverse
deb-src http://my.archive.ubuntu.com/ubuntu gutsy-security multiverse
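If you have many machines to change, the same substitution can be scripted. The sketch below rewrites active security.ubuntu.com entries and leaves commented lines alone. Unlike the listing above, it replaces the lines rather than keeping commented copies. The function name use_local_security is made up for illustration.

```shell
#!/bin/sh
# Rewrite active security.ubuntu.com entries to use my.archive.ubuntu.com.
# Reads a sources.list on stdin and writes the result to stdout.
# Lines that are already commented out are left untouched.
use_local_security() {
    sed -e '/^[[:space:]]*#/!s|http://security\.ubuntu\.com/ubuntu|http://my.archive.ubuntu.com/ubuntu|'
}

# Hypothetical usage: write a modified copy for review instead of editing in place.
#   use_local_security < /etc/apt/sources.list > sources.list.new
```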

7. Schedule daily updates of the mirror

crontab -e

# m h dom mon dow command

05 04 * * * /full/path/to/anonftpsync-ubuntu

This will run the rsync script every day at 4:05am.
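The entry can also be added without opening an editor. The following sketch appends the job to an existing crontab only if the script is not already listed. The function name with_mirror_job is made up for illustration, and the usage line assumes a cron whose crontab command reads a new table from standard input when given "-".

```shell
#!/bin/sh
# Read an existing crontab on stdin and print it with the mirror job appended,
# unless a line mentioning the script ($1) is already present.
with_mirror_job() {
    script=$1
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -qF -- "$script"; then
        printf '05 04 * * * %s\n' "$script"
    fi
}

# Hypothetical usage (replaces the current user's crontab):
#   crontab -l 2>/dev/null | with_mirror_job /full/path/to/anonftpsync-ubuntu | crontab -
```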

Comments welcome


