Moving from Xen Backports to Debian Etch Xen Packages - Attempt1

Overview

On Debian Sarge I ran Xen from Debian Backports. On the DS3000 hardware I was using, only one particular kernel worked, linux-image-2.6.16-1-xen-k7, as described here: XenDebianBackports#Now_-_Update.2C_Dist-upgrade_to_recieve_backports_and_get_packages.

With this Xen install from Debian Backports I hit a bug: once more than 4GB of data had been transmitted/downloaded from a domU, its network interface would hang. More information here: Ongoing_Experiences_with_Xen#Conclusion_to_eth0_Crash.2FHang_at_4gb_TX

So when Debian Etch became stable last week (8th of April, 2007), I wanted to move to Xen from Etch: remove the Xen packages that came from Backports and apt-get install their Etch equivalents.

Summary

  • Moving to Debian Etch Xen did not work.
  • There was no console access to the server, so I could not debug exactly what the problem was.
  • There were only 2 choices of Xen kernel in Debian Etch, and neither worked.
  • The 2 choices were only: linux-image-xen-686 and linux-image-xen-vserver-686. There was no sign of a K7 Xen Etch Kernel :(
  • The Xen Kernel from Backports did not work with xen-utils from Etch.
  • The way back: reboot using the old Xen kernel, then install the old Python packages from Sarge to get xen-utils from Backports working.

Procedure

Recommended step, which I didn't realise at the time

Back up all of /boot: when xen-hypervisor is removed, the current Xen kernel is removed with it, leaving no easy way back!
Back up all of /etc/xen
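
A minimal sketch of that backup step, assuming a POSIX shell and tar. The helper name backup_dirs and the destination /root/xen-backup are illustrative assumptions, not something from the original setup.

```shell
#!/bin/sh
# backup_dirs DEST DIR...  (hypothetical helper, not part of any Xen package)
# Archives each DIR into DEST as <dir-with-dashes>-<YYYYMMDD>.tar.gz
backup_dirs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    stamp=$(date +%Y%m%d)
    for src in "$@"; do
        # /etc/xen becomes etc-xen, so the archive name stays flat
        name=$(printf '%s\n' "${src#/}" | tr '/' '-')
        # -C / plus a relative path keeps tar from warning about leading slashes
        tar -czf "$dest/$name-$stamp.tar.gz" -C / "${src#/}" || return 1
    done
}

# Example (run as root before removing any Xen package):
#   backup_dirs /root/xen-backup /boot /etc/xen
```

Keeping the archives outside /boot and /etc/xen means a later purge of the Xen packages cannot touch them.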

Remove Xen utils and Upgrade to Etch

dpkg -l | grep xen
//the following are the ones which need to be removed:
ii  xen-hypervisor-3.0-i386        3.0.2+hg9697-0bpo1              The Xen Hypervisor for i386
ii  xen-tools                      2.3-1~bpo.1                     Tools to manage debian XEN virtual servers
ii  xen-utils-3.0                  3.0.2+hg9697-0bpo1              XEN administrative tools
apt-get remove xen-hypervisor-3.0-i386 xen-tools xen-utils-3.0
dpkg -P xen-hypervisor-3.0-i386 xen-tools xen-utils-3.0 //purge all packages above. Note this will remove some configs.
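
Rather than reading the package names off by eye, the installed Xen packages can be pulled out of the dpkg -l listing. This is only a sketch: list_xen_packages is a hypothetical helper, and it assumes every package of interest has a name starting with "xen-", as the three above do.

```shell
#!/bin/sh
# list_xen_packages: reads `dpkg -l` style output on stdin and prints the
# names of fully installed ("ii") packages whose name starts with "xen-".
list_xen_packages() {
    awk '$1 == "ii" && $2 ~ /^xen-/ { print $2 }'
}

# Example: review the list first, then feed it to apt-get remove by hand.
#   dpkg -l | list_xen_packages
```

Printing the list before removing anything gives a chance to spot a package that should stay.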

Edit /etc/apt/sources.list. Comment out the Backports entry: #deb http://www.backports.org/debian/ sarge-backports main
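
The same edit can be scripted. A minimal sketch, assuming the Backports lines are the only ones in the file that contain the word "backports"; disable_backports is a hypothetical helper name.

```shell
#!/bin/sh
# disable_backports FILE: comment out every active deb/deb-src line that
# mentions backports. A copy of the original is kept as FILE.bak.
disable_backports() {
    list=$1
    cp -p "$list" "$list.bak" || return 1
    # Only lines matching /backports/ are touched; "deb" becomes "#deb"
    sed '/backports/ s/^deb/#deb/' "$list.bak" > "$list"
}

# Example (as root):
#   disable_backports /etc/apt/sources.list
```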

Perform apt-get update, apt-get upgrade and apt-get dist-upgrade. This seamlessly upgraded grub, udev and other old packages which were in backports.

Install New Xen from Etch

apt-get install xen-tools xen-hypervisor-3.0.3-1-i386 xen-utils-3.0.3-1 libc6-xen
//libc6-xen avoids having to move /lib/tls to /lib/tls.disabled

I rebooted at this stage without touching the current k7 Xen kernel. It rebooted fine.

Xend would not start, giving the error:

[12:15:00 xend] ERROR (__init__:1072) Exception starting xend ((22, 'Invalid argument'))
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvDaemon.py", line 281, in run
    xinfo = xc.xeninfo()
error: (22, 'Invalid argument')
[12:15:00 xend] INFO (__init__:1072) Xend exited with status 1.

From reading this webpage: http://www.novaglobal.com.sg/?q=xenlinux I was led to believe that there was an incompatibility between xen-tools (etch) and the current xen kernel (backports). I proceeded to upgrade the Xen Kernel to Etch.

Install Xen Kernel from Etch

apt-cache search linux-image | grep xen

At first I didn't realise there was no Etch Xen kernel for my K7, e.g. linux-image-2.6.18-xen-k7 etc. I also found some discrepancies in the apt-cache output, so I changed the mirrors in /etc/apt/sources.list to make sure I was getting an accurate listing.

Sure enough there were only 2 Xen Kernels available:

twister:~# apt-cache search linux-image | grep xen
linux-headers-2.6.18-4-xen-686 - Header files for Linux 2.6.18 on i686
linux-headers-2.6.18-4-xen-vserver-686 - Header files for Linux 2.6.18 on i686
linux-image-2.6-xen-686 - Linux kernel 2.6 image on i686
linux-image-2.6-xen-vserver-686 - Linux kernel 2.6 image on i686
linux-image-2.6.18-4-xen-686 - Linux 2.6.18 image on i686
linux-image-2.6.18-4-xen-vserver-686 - Linux 2.6.18 image on i686
linux-image-xen-686 - Linux kernel image on i686
linux-image-xen-vserver-686 - Linux kernel image on i686

linux-image-2.6-xen-686 == linux-image-2.6.18-4-xen-686 == linux-image-xen-686

apt-get install linux-image-xen-686
//this installed, and put itself in grub, booting off /dev/sda2 instead of /dev/md0

I also edited /etc/initramfs-tools/modules, making sure raid1 was uncommented as described on: RAID_1_and_Xen_(dom0)#Add_support_for_RAID1_into_XEN_Boot_Module and ran the "mkinitramfs -o ...." step again.
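
The "make sure raid1 is uncommented" step can be sketched as a small idempotent helper. ensure_module is a hypothetical name, and the sketch assumes the file lists one module name per line, with "#" marking comments. The initramfs still has to be rebuilt afterwards, as described on the linked page.

```shell
#!/bin/sh
# ensure_module FILE MODULE: make sure MODULE appears as an active line in FILE.
# Uncomments "#MODULE" if present, otherwise appends MODULE at the end.
ensure_module() {
    file=$1
    mod=$2
    # Already active: nothing to do
    if grep -q "^$mod\$" "$file"; then
        return 0
    fi
    if grep -q "^#[[:space:]]*$mod[[:space:]]*\$" "$file"; then
        # Strip the leading comment marker from the matching line
        sed "s/^#[[:space:]]*$mod[[:space:]]*\$/$mod/" "$file" > "$file.tmp" \
            && cat "$file.tmp" > "$file" && rm -f "$file.tmp"
    else
        echo "$mod" >> "$file"
    fi
}

# Example (as root):
#   ensure_module /etc/initramfs-tools/modules raid1
```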

Anyway, long story short, it didn't boot. I had no console access. The log files in /var/log were untouched, so the system never booted fully. I tried it without RAID, and I tried:

apt-get install linux-image-xen-vserver-686

No joy :(

Further problems - corrupt /var/lib/dpkg/status

To add to the problem, I had been booting in and out of the rescue system rather than off /dev/md0, chrooting into /mnt/ there and apt-get installing the Xen kernel from inside the chroot. As a result the RAID array fell out of sync, and there were errors on /dev/sda2. In the rescue system I had to fsck both /dev/sda2 and /dev/sdb1, assemble the RAID array, sync it and make sure there were no problems before booting back into the main system.

I also realised that after exiting the chroot, umount /dev/md0 would not work. An lsof showed files still open on /mnt/. I tracked down the processes responsible, such as atd, cron etc., and killed them. I then safely unmounted /dev/md0 and rebooted into the main system using the old Xen kernel from Backports.
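
One way to "make sure there were no problems" with the array is to look for a missing member in /proc/mdstat, where a healthy two-disk RAID1 shows [UU] and a degraded one shows [U_] or [_U]. The helper below is only a sketch of that check; degraded_arrays is a hypothetical name, and the file argument exists so it can be tried on a saved copy.

```shell
#!/bin/sh
# degraded_arrays [FILE]: print the name of every md array whose status
# field contains an underscore (a missing or failed member).
# FILE defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/        { name = $1 }      # remember the current array
        /\[[U_]*_[U_]*\]/    { print name }     # e.g. [U_] or [_U]
    ' "${1:-/proc/mdstat}"
}

# Example: no output means every array has all its members.
#   degraded_arrays
```

An array that is still resyncing also shows an underscore until the rebuild finishes, so empty output is the signal that it is safe to reboot.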

After fsck had finished, /var/lib/dpkg/status had become corrupt :-/ ;-( Things went downhill. From reading: http://www.debian.org/doc/manuals/reference/examples/debian-package-database-rebuild and http://linuxmafia.com/faq/Debian/package-database-rebuild.html I did the following:

dselect
//I did an [U]pdate and an [I]nstall. It asked whether to download and reinstall the packages, and I instructed it to do so. The process hit an error about debconf and the Install ended. I then did:
apt-get install dpkg
dselect
//choose [I]nstall

The new system was rebuilt :)

The Road Back to Backports

dpkg -l | grep xen
//apt-get remove all xen-utils etc. etc.

apt-get install xen-utils-3.0/sarge-backports xen-hypervisor-3.0-i386/sarge-backports 
//didn't work. Python2.4 was too new :(
apt-get remove python  //and associated packages found via dpkg -l | grep python
dpkg -i libreadline4_4.3-11_i386.deb python2.3_2.3.5-3sarge2_i386.deb python_2.3.5-2_all.deb
//search for packages on www.debian.org and take them from sarge.
apt-get install xen-utils-3.0/sarge-backports
//worked, - finally :)

Problems

After a reboot the server came back up, but no domUs were started :(

vi /etc/xen/xend-config.sxp
//make sure the following are the only ones uncommented:
(network-script network-bridge)
(vif-script vif-bridge)
(dom0-min-mem 128)
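
To confirm that only those three directives are active, the comment and blank lines can be filtered out. A small sketch; active_directives is a hypothetical helper and it assumes comments in the file start with "#".

```shell
#!/bin/sh
# active_directives FILE: print every line of FILE that is neither a
# comment nor blank, i.e. the settings xend will actually read.
active_directives() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Example:
#   active_directives /etc/xen/xend-config.sxp
```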

Still didn't work. Looking in /var/log/xend.log revealed:

brctl addif xenbr0 vif1.0 failed
Could not load /lib/modules/2.6.16-1-xen-k7/modules.dep: No such file or directory

I had run apt-get install linux-image-2.6.16-1-xen-k7 inside the rescue system :( I thought the kernel image was the only result of that install, but no, the kernel modules also had to be placed in /lib/modules/2.6.16-1-xen-k7

apt-get install linux-image-2.6.16-1-xen-k7
//fixed and added necessary files into /lib/modules/2.6.16-1-xen-k7
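
A quick check for this situation is whether modules.dep exists for the kernel in question. The sketch below assumes nothing beyond the path shown in the xend.log error; check_modules and its optional root argument (handy from a rescue system, where the real disk is mounted elsewhere) are illustrative.

```shell
#!/bin/sh
# check_modules [RELEASE] [ROOT]: verify that the module tree for kernel
# RELEASE (default: the running kernel) exists under ROOT (default: /).
check_modules() {
    release=${1:-$(uname -r)}
    root=${2:-}
    if [ -f "$root/lib/modules/$release/modules.dep" ]; then
        echo "ok: modules for $release are installed"
    else
        echo "missing: $root/lib/modules/$release/modules.dep" >&2
        return 1
    fi
}

# Examples:
#   check_modules                          # running kernel, real root
#   check_modules 2.6.16-1-xen-k7 /mnt     # from the rescue system
```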

After another reboot, most of the domUs started. One didn't. It was using a file-based disk and gave the error:

Error: Device 2049 (vbd) could not be connected. Backend device not found.

As suggested on: http://www.debian-administration.org/articles/396#comment_17 I did the following:

modprobe loop

The file-based domU started.

Problems with incrementing eth0; changing mac address, udev, xen and etch

Lastly, I had recently upgraded one of the domUs to Etch. It had been rebooted since then and worked. However, after going back to Xen from Backports, its network didn't work. The following error was seen on booting the domU:

Configuring network interfaces...SIOCSIFADDR: No such device
eth0: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
eth0: ERROR while getting interface flags: No such device
Failed to bring up eth0.

//this also failed once the domU had booted:
ifup eth0
eth0: ERROR while getting interface flags: No such device

After some googling, I found udev was at fault: it had named the interface eth1 instead of eth0. Read more here:
http://lists.alioth.debian.org/pipermail/pkg-xen-devel/2007-March/001106.html
http://www.nabble.com/Ethernet-interface-numbering-in-etch-t3467984.html

vi /etc/network/interfaces
//change eth0 to eth1 and that was it.

However, if the domU is restarted, the interface number will increment again!!! The problem is that when a domU is restarted (i.e. xm shutdown && xm create) it gets a new MAC address, so udev on the domU thinks it's a new network card and increments eth*. Note: a "reboot" from within the domU will not cause Xen to assign a new MAC address, so udev will not increment eth*.

Solution:

Specify a fixed MAC address in the domU's config file on dom0, e.g. /etc/xen/domains/vm01.

vi /etc/xen/domains/name_of_vm
vif = ['mac=aa:00:00:7d:f8:77, bridge=xenbr0']
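
If a fresh address is needed for each domU, one can be generated rather than invented by hand. The sketch below is an assumption on my part, not what was used above: it builds addresses under the 00:16:3e prefix, which is the OUI registered to XenSource for Xen guests, and gen_xen_mac is a hypothetical helper name.

```shell
#!/bin/sh
# gen_xen_mac: print a random MAC address of the form 00:16:3e:xx:xx:xx.
gen_xen_mac() {
    # od prints three random bytes as two-digit hex words, e.g. " 7d f8 77";
    # set -- splits them into $1 $2 $3 (local to this function)
    set -- $(od -An -N3 -tx1 /dev/urandom)
    printf '00:16:3e:%s:%s:%s\n' "$1" "$2" "$3"
}

# Example: paste the result into the vif line of the domU config.
#   gen_xen_mac
```

Whatever address is chosen, it must be unique on the bridge, which the warning further down explains.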

Then to remove all old traces of eth1 and eth2 and eth3 etc.:

vi /etc/network/interfaces
#change back to eth0
rm /etc/udev/rules.d/z25_persistent-net.rules
#udev will recreate the above file from scratch.

To test - reboot, xm shutdown etc. Please let me know if there is a better solution to this. Feel free to email me at: sburke [at] burkesys.com

Solution 2

===============
I've found a different solution to fix the udev problem. You can modify the udev rules to ignore xen network interfaces.

In /etc/udev/persistent-net-generator.rules simply add a GOTO directive after the SUBSYSTEMS=="xen" match:

SUBSYSTEMS=="xen", \
 ENV{COMMENT}="Xen virtual device", GOTO="persistent_net_generator_end"

Best regards
Timo Stripf
===============

I tried this. It works fine. It's needed more so for Lenny, which requires udev to be installed on domUs. See Upgrade_Xen_on_dom0.

Warning about MAC addresses

If you happen to use the same MAC address for dom0 and a domU, you will experience intermittent network connectivity. While implementing the above solution, I read the wrong z25_persistent-net.rules and duplicated the MAC address. Dom0's network connectivity became extremely intermittent at best. I realised my mistake, managed to power off the offending domU and changed the MAC address in the vif line of the domU's Xen config.
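
A partial guard against this mistake is to scan the domU configs for repeated addresses. This is a sketch only: find_duplicate_macs is a hypothetical helper, it assumes the configs use the mac=... form shown earlier, and it does not look at dom0's own NIC, which still has to be compared by hand against the ifconfig output.

```shell
#!/bin/sh
# find_duplicate_macs DIR: print every MAC address that appears in more
# than one mac=... setting under DIR (case-insensitive).
find_duplicate_macs() {
    grep -rhoiE 'mac=([0-9a-f]{2}:){5}[0-9a-f]{2}' "$1" \
        | cut -d= -f2 \
        | tr 'A-F' 'a-f' \
        | sort \
        | uniq -d
}

# Example: no output means no two domU configs share an address.
#   find_duplicate_macs /etc/xen/domains
```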



Finally, the working kernel entry in grub was:

title Xen 3.0 / XenLinux 2.6-k7
root            (hd0,1)
kernel          /boot/xen-3.0-i386.gz dom0_mem=128000
module          /boot/vmlinuz-2.6.16-1-xen-k7 root=/dev/md0 ro panic=10
module          /boot/initrd.img-2.6.16-1-xen-k7
savedefault     fallback
boot

Next Step

I will ask on the server provider's forums whether there are any known issues with the DS3000 hardware, which Debian Etch Xen kernel works, and/or what configs are required. Feel free to email me comments etc. to sburke [at] burkesys.com

Completed Etch Xen Install How-To

Debian Etch and Xen were fully installed in the end. The PAE version was chosen. See Debian_Etch_Xen_Install for full details.
