- Debian installation
- Serial Console
- Kernel Customization
- Cross Compiler
- RAID Configuration
- RAID Benchmarks
- dpkg -i /usr/src/linux-image-2.6.31-pmp_hastur.1.0_amd64.deb
- Log
Debian installation
After mininst CD installation
# vim /etc/apt/sources.list
#comment out cdrom entry
#deb cdrom:[....
#add multimedia repos
deb http://debian-multimedia.fx-services.com/ stable main
deb-src http://debian-multimedia.fx-services.com/ stable main
EOF
#
Configure network
# ifdown eth0
# vim /etc/network/interfaces
#replace dhcp with static
#iface eth0 inet dhcp
iface eth0 inet static
address $IP
netmask $NETMASK
gateway $GATEWAY_IP
EOF
# ifup eth0
Update, install SSH
# apt-get install ssh
# apt-get install iproute
# apt-get install bzip2
# apt-get install hdparm
Install SSH keys
hastur$ mkdir ~/.ssh
hastur$ chmod go-rwx ~/.ssh
other$ scp ~/.ssh/authorized_keys me@hastur:~/.ssh/
Secure SSH Daemon
# vim /etc/ssh/sshd_config
PermitRootLogin no
AllowUsers me
PasswordAuthentication no
EOF
# /etc/init.d/ssh restart
Serial Console
2007-10-02: Initial config
2013-10-31: Boot console, sulogin and fstab fixes
Configure serial console
Most serial console guides don't cover setting the serial console for fsck recovery at boot time.
When fsck fails at boot, sulogin is run (the prompt is "Enter root password or Ctrl-D to continue" or similar) on the default console only (console or tty0).
sysvinit and /etc/inittab
Enable console on /dev/ttyS0
In /etc/inittab set
- single user sulogin tty
- z6 emergency fallthrough (if it exists)
- getty on ttyS0
~~:S:wait:/sbin/sulogin /dev/ttyS0
...
z6:6:respawn:/sbin/sulogin /dev/ttyS0
...
T0:23:respawn:/sbin/getty -L ttyS0 115200 vt100
Set default console in sysvinit settings (/etc/default/rcS)
CONSOLE=/dev/ttyS0
This is used through the init.d files when sulogin is called.
Allow root login
# vim /etc/securetty
ttyS0
Test serial console
# kill -s SIGHUP 1
All further work can now be completed over serial console and SSH.
Grub1
In /boot/grub/menu.lst:
- Set serial config
- Set terminal config
- Append to the kernel kopt line (including the #)
e.g.:
serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
terminal console serial
Append kernel options for serial console to # kopt=root=... line e.g.
# kopt=root=/dev/mapper/hastur-root ro console=ttyS0,115200n8 console=tty0
Grub2
In /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200n8 console=tty0"
GRUB_CMDLINE_LINUX="console=ttyS0,115200n8 console=tty0"
# Uncomment to disable graphical terminal (grub-pc only)
GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1"
Then regenerate the config
grub-mkconfig -o /boot/grub/grub.cfg
fstab
To help avoid unnecessary boot failures, ensure the fs_passno field in fstab is set correctly for all filesystems.
The sixth and final field in each fstab line determines whether and in which order the filesystem is checked at boot.
0 : Do not fsck
1 : Root filesystem
2-n : All other filesystems
Patch fstab to disable fsck of raid array started manually after boot
-/dev/vg-md6/home /mnt/md6-home ext4 defaults,noatime,nosuid,noauto,acl 0 3
+/dev/vg-md6/home /mnt/md6-home ext4 defaults,noatime,nosuid,noauto,acl 0 0
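Other entries like the one patched above can be found by listing every fstab line that is marked noauto but still carries a non-zero fs_passno. This is a minimal sketch, not part of the original notes: the function name noauto_with_fsck is made up for illustration, and the optional argument (defaulting to /etc/fstab) exists so it can be pointed at a copy of the file.

```shell
# List mount point and fs_passno for fstab entries that are marked
# "noauto" but would still be checked at boot (sixth field non-zero).
# Usage: noauto_with_fsck [fstab-file]   (hypothetical helper)
noauto_with_fsck() {
    awk '$1 !~ /^#/ && NF >= 6 && $4 ~ /(^|,)noauto(,|$)/ && $6 != 0 { print $2, $6 }' \
        "${1:-/etc/fstab}"
}
```

Run against the unpatched fstab, this would print the /mnt/md6-home entry with its pass number 3; after the patch it prints nothing for that line.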
Kernel Customization
Install kernel build tools
# apt-get install kernel-package ncurses-dev fakeroot wget bzip2
Get and extract kernel source and Tejun's libata patch
$ wget http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.22.1-20070808.tar.bz2
$ wget http://www.eu.kernel.org/pub/linux/kernel/v2.6/linux-2.6.22.1.tar.bz2
$ tar -xjvf linux-2.6.22.1.tar.bz2
$ tar -xjvf libata-tj-2.6.22.1-20070808.tar.bz2
Kernel Config
CONFIG_MCORE2=y # set in place of generic x86_64
CONFIG_NR_CPUS=4 # set in place of 32 to save memory
Patch and configure kernel
$ cd linux-2.6.22.1
$ cp /boot/config-2.6.18-5-amd64 .config
$ make oldconfig
$ patch -p1 < ../libata-tj-2.6.22.1-20070808/combined.patch
$ make menuconfig # check config
$ export CONCURRENCY_LEVEL=4 # Quad-core, don't use -j
$ make-kpkg clean
$ fakeroot make-kpkg --initrd --revision=libata.1.0 kernel_image
Needs the --initrd to generate an initrd image for booting from LVM
Install kernel
# dpkg -i linux-image-2.6.22.1-pmp_libata.1.0_amd64.deb
Updates GRUB automagically
Module Autoloading
Load DVB module for Hauppauge Nova-T
# echo "cx88_dvb # DVB support for Hauppauge Nova-T" >> /etc/modules
Cross Compiler
(Don't remember why I needed this)
RAID Configuration
2007-10-03
Create test RAID array
# apt-get install mdadm xfsprogs bonnie++
# for dev in {b..g} ; do echo ",125,fd" | sfdisk /dev/sd$dev ; done
# mdadm --create --verbose /dev/md0 --level=0 --raid-devices=6 /dev/sd{b..g}1
Partition the disks
# cat > sfdisk.format <<EOF
,125,fd
,12450,fd
,,fd
EOF
# for dev in {b..g} ; do cat sfdisk.format | sfdisk /dev/sd$dev ; done
Create RAID0 for swap
# mdadm --create --verbose /dev/md0 --level=0 --raid-devices=6 /dev/sd{b..g}1
# mkswap /dev/md0
# swapon /dev/md0 -p0
Create RAID10 for database
# mdadm --create --verbose /dev/md1 --level=10 --raid-devices=6 /dev/sd{b..g}2
# mkfs.xfs -f /dev/md1
Create RAID5 for general data
# mdadm --create --verbose /dev/md5 --level=5 --raid-devices=5 --spare-devices=1 /dev/sd{b..g}3
# mkfs.xfs -f /dev/md5
Install SMART mon
# apt-get install smartmontools
Tested various RAID configurations. Seems /dev/sdc is broken.
RAID Benchmarks
http://linux-ata.org/faq.html - setting and checking NCQ
- md0, raid 0, 6 disks
- md1, raid 10, 6 disks, stripe of 3 mirrored pairs
- md5, raid 5, 5 disks + 1 hot spare
Setup: 4 Seagate, 2 Samsung. XFS with default options. 2.6.22.1. NCQ 31/32. SATA PM through 2 SATA300 channels. 3 disks multiplexed per channel.
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
md0 | 4024M | | | 160706 | 15 | 67128 | 7 | | | 116615 | 6 | 392.9 | |
md1 | 4024M | | | 80089 | 10 | 53176 | 6 | | | 117342 | 7 | 166.1 | |
md5 | 4024M | | | 55785 | 8 | 34142 | 4 | | | 82070 | 5 | 318.3 | |
Then md5 with varying NCQ depths
# for depth in 1 8 31; do
for dev in {b..g} ; do
echo $depth > /sys/block/sd$dev/device/queue_depth;
done;
bonnie++ -f -d /mnt/md5 -s 4024 -n 0 -u root | tee ~/bonnie.raid5.ncq=$depth.out;
done
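To confirm that a depth was actually applied before each bonnie++ run, the same sysfs files can be read back. This is a small sketch under stated assumptions: show_queue_depth is a made-up helper name, and SYSFS_ROOT is a hypothetical override (defaulting to /sys) added only so the function can be exercised without real disks.

```shell
# Print "<disk> <queue_depth>" for every sdX disk found under sysfs.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
show_queue_depth() {
    root="${SYSFS_ROOT:-/sys}"
    for f in "$root"/block/sd*/device/queue_depth; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        dev=${f#"$root"/block/}          # strip the leading path
        dev=${dev%%/*}                   # keep only the disk name
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

A depth of 1 effectively disables NCQ, which is the setting the rc.local tweaks later in these notes settle on.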
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
depth=1 | 4024M | | | 60390 | 9 | 33056 | 3 | | | 73552 | 4 | 311.3 | |
depth=8 | 4024M | | | 53196 | 8 | 33107 | 3 | | | 83029 | 5 | 311.4 | |
depth=31 | 4024M | | | 52550 | 8 | 34127 | 4 | | | 81684 | 4 | 306.5 | |
Without PM
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
md0 | 4024M | | | 352560 | 33 | 154813 | 19 | | | 315842 | 19 | 669.9 | 1 |
md1 | 4024M | | | 222888 | 31 | 72417 | 9 | | | 170133 | 12 | 776.2 | 1 |
md5 | 4024M | | | 171088 | 28 | 68525 | 9 | | | 271605 | 20 | 641.8 | |
2007-10-09
Finish RAID configuration
Optimize
# blockdev --setra 4096 /dev/md0 # default 1536
# blockdev --setra 3072 /dev/md1 # default 768
# blockdev --setra /dev/md5 #
Deprecated. Proper read-ahead testing done later. References suggest optimal config is 0 on all layers except the top-layer (dmcrypt).
Post-optimization benchmarks
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
md0 | 4024M | | | 188582 | 18 | 86119 | 10 | | | 179046 | 10 | 481.6 | |
md1 | 4024M | | | 95734 | 12 | 49719 | 6 | | | 127244 | 7 | 482.1 | |
md5 | 4024M | | | | | | | | | | | | |
Now rearrange md0 to alternate PMs
# mdadm --stop /dev/md{0,1,5}
# mdadm --create --verbose /dev/md0 --level=0 --raid-devices=6 /dev/sd{b,e,c,f,d,g}1
# mdadm --create --verbose /dev/md1 --level=10 --raid-devices=6 /dev/sd{b,e,c,f,d,g}2
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
md0 | 4024M | | | 192926 | 18 | 86918 | 11 | | | 180066 | 11 | 470.8 | |
md1 | 4024M | | | 97683 | 12 | 50525 | 6 | | | 120018 | 8 | 480.4 | |
md5 | 4024M | | | | | | | | | | | | |
Or alternately
# mdadm --create --verbose /dev/md1 --level=10 --raid-devices=6 /dev/sd{b,d,f,c,e,g}2
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
md1 | 4024M | | | 96439 | 12 | 50751 | 6 | | | 118106 | 7 | 463.7 | |
Save to mdadm.conf
- http://dev.riseup.net/grimoire/storage/software-raid/#updating_mdadmconf
mdadm --detail --scan --verbose >> /etc/mdadm/mdadm.conf
Disk Encryption
# apt-get install dmsetup
Filesystem
Using /dev/md0 to test on
# time dd if=/dev/urandom of=/dev/md0 bs=10240k
real 16m23.806s
user 0m0.004s
sys 15m40.363s
# apt-get install cryptsetup hashalot
Create the encrypted partition
# cryptsetup --verbose --verify-passphrase luksFormat /dev/md0
WARNING!
========
This will overwrite data on /dev/md0 irrevocably.
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: not my real passphrase
Verify passphrase: not my real passphrase
Command successful.
Now open it
# cryptsetup luksOpen /dev/md0 crypt-md0
Enter LUKS passphrase: not my real passphrase
key slot 0 unlocked.
Command successful.
Create a filesystem, mount it
# mkfs.xfs /dev/mapper/crypt-md0
# mount /dev/mapper/crypt-md0 /mnt/md0
# bonnie++ -f -d /mnt/md0 -s 4024 -n 0 -u root
Clean up
# umount /mnt/md0
# cryptsetup luksClose crypt-md0
aes-x86_64 - load the module
# rmmod aes
# modprobe aes-x86_64
Setup and Benchmark
# cryptsetup -c aes-cbc-essiv:sha256 luksFormat /dev/md0
# cryptsetup luksOpen /dev/md0 crypt-md0
# mkfs.xfs /dev/mapper/crypt-md0
# mount /dev/mapper/crypt-md0 /mnt/md0
# bonnie++ -f -d /mnt/md0 -s 4024 -n 0 -u root
# umount /mnt/md0
# cryptsetup luksClose crypt-md0
Next, try experimental LRW block mode
# modprobe lrw
# cryptsetup -c aes-lrw-benbi:sha256 -s 256 luksFormat /dev/md0
# cryptsetup luksOpen /dev/md0 crypt-md0
# mkfs.xfs /dev/mapper/crypt-md0
# mount /dev/mapper/crypt-md0 /mnt/md0
# bonnie++ -f -d /mnt/md0 -s 4024 -n 0 -u root
# umount /mnt/md0
# cryptsetup luksClose crypt-md0
For twofish:
# cryptsetup -c twofish-cbc-essiv:sha256 luksFormat /dev/md0
# cryptsetup luksOpen /dev/md0 crypt-md0
# mkfs.xfs /dev/mapper/crypt-md0
# mount /dev/mapper/crypt-md0 /mnt/md0
# bonnie++ -f -d /mnt/md0 -s 4024 -n 0 -u root
# umount /mnt/md0
# cryptsetup luksClose crypt-md0
Twofish-x86_64 - load the module
# rmmod twofish
# modprobe twofish-x86_64
Setup and Benchmark
# cryptsetup -c twofish-cbc-essiv:sha256 luksFormat /dev/md0
# cryptsetup luksOpen /dev/md0 crypt-md0
# mkfs.xfs /dev/mapper/crypt-md0
# mount /dev/mapper/crypt-md0 /mnt/md0
# bonnie++ -f -d /mnt/md0 -s 4024 -n 0 -u root
# umount /mnt/md0
# cryptsetup luksClose crypt-md0
Encryption Benchmarks
md0, XFS
Version 1.03
Test | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
aes [1] | 4024M | | | 169349 | 17 | 46817 | 8 | | | 114262 | 13 | 394.4 | |
aes-64 [2] | 4024M | | | 163673 | 17 | 46350 | 8 | | | 115287 | 13 | 398.4 | |
aes-lrw-64 [3] | 4024M | | | 174988 | 18 | 49748 | 9 | | | 115869 | 13 | 400.7 | |
twofish [4] | 4024M | | | 140027 | 14 | 40582 | 7 | | | 109515 | 12 | 368.4 | |
twofish-64 [5] | 4024M | | | 159518 | 16 | 43404 | 7 | | | 112616 | 13 | 403.9 | |
[1] aes-cbc-essiv:sha256, 128-bit key
[2] aes-cbc-essiv:sha256, 128-bit key, 64-bit
[3] aes-lrw-benbi:sha256, 256-bit key, 64-bit
[4] twofish-cbc-essiv:sha256, 128-bit key
[5] twofish-cbc-essiv:sha256, 128-bit key, 64-bit
RAID Configuration - Take 2
Random data
# for dev in /dev/sd{b..g} ; do dd if=/dev/urandom of=$dev bs=1024k & done
With port multipliers, may be faster this way:
# for dev in /dev/sd{b..d} ; do dd if=/dev/urandom of=$dev bs=1024k ; done &
# for dev in /dev/sd{e..g} ; do dd if=/dev/urandom of=$dev bs=1024k ; done &
Get progress reports:
# # set delay, finished flag, get current tty device
# delay=3 ; finished=0; tty=`tty | cut -d/ -f3-`
# # get progress reports from dd
# while (( ! $finished )) ; do pkill -USR1 -t $tty dd ; finished=$? ; sleep $delay ; done
Much faster to do this in parallel on the raw disks, not through the raid devices.
To kill the dd's:
# pkill -t $tty dd
Partition
2007-10-12
Then
- Repartition
- Recreate raid arrays
Partitioning scheme:
Seagate 500GB = 500106780160 bytes = 476938.9917 MiB = 465.760734 GiB
Samsung 500GB = 500107862016 bytes = 476940.0234 MiB = 465.761742 GiB
md | Start | End | Blocks | Raid Size | Partition | Notes |
---|---|---|---|---|---|---|
md0 | 0M | 749M | 4.5GB | Swap | Separate so crypto can be random | |
md2 | 750M | 50G | 300GB | RAID0 | /var,/tmp | |
md3 | 50G | 465G | 1660GB | RAID5 | Everything else | |
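The byte-to-MiB and byte-to-GiB figures quoted for the two drive models can be reproduced with a one-line awk call. This is only a sketch; the function name bytes_to_mib_gib is made up for illustration, and it assumes the binary units 1 MiB = 2^20 bytes and 1 GiB = 2^30 bytes.

```shell
# Convert a size in bytes to MiB and GiB
# (1 MiB = 1048576 bytes, 1 GiB = 1073741824 bytes).
bytes_to_mib_gib() {
    awk -v b="$1" 'BEGIN { printf "%.4f MiB %.6f GiB\n", b / 1048576, b / 1073741824 }'
}
```

For example, `bytes_to_mib_gib 500106780160` gives the Seagate figures listed above, and the raw byte count itself could be taken from something like `blockdev --getsize64` on the disk.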
Partition the disks, use sfdisk MiB format
# cat > sfdisk.format <<EOF
,750,fd
,51200,fd
,,fd
EOF
# for dev in {b..g} ; do cat sfdisk.format | sfdisk -uM /dev/sd$dev ; done
md0 - RAID0 - swap
Create RAID0 for swap
# mdadm --create --metadata=1.2 --verbose --level=0 --raid-devices=6 /dev/md0 /dev/sd{b..g}1
Edit /etc/crypttab and /etc/fstab
# echo "md0-swap /dev/md0 /dev/random swap" >> /etc/crypttab
# echo "/dev/mapper/md0-swap none swap sw 0 0" >> /etc/fstab
Do first initialization manually
# cryptsetup -s 128 create --key-file /dev/random md0-swap /dev/md0
# mkswap /dev/mapper/md0-swap
# swapon /dev/mapper/md0-swap -p0 # set higher priority
Don't think chunk size matters for RAID0. RAID5 must be carefully tuned, however.
md2 - RAID0
Create RAID0 for general use
# mdadm --create --metadata=1.2 --verbose --level=0 --raid-devices=6 /dev/md2 /dev/sd{b..g}2
# cryptsetup -c aes-lrw-benbi:sha256 -s 256 luksFormat /dev/md2
WARNING!
========
This will overwrite data on /dev/md2 irrevocably.
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: (enter short password, for testing)
The passphrase is only for testing chunk-size performance. Later, we'll remove the passphrase and replace it with random key material stored on a USB token.
Edit /etc/crypttab and /etc/fstab
# echo "crypt-md2 /dev/md2 none luks" >> /etc/crypttab
# echo "/dev/mapper/crypt-md2 /mnt/md2 xfs defaults,noatime,noexec,noauto 0 3" >> /etc/fstab
# cryptsetup luksOpen /dev/md2 crypt-md2
# mkfs.xfs -f -d sunit=16,swidth=96 /dev/mapper/crypt-md2
# mount -t xfs /dev/mapper/crypt-md2 /mnt/md2
# bonnie++ -f -d /mnt/md2 -s 4024 -n 0 -u root
swidth = sunit × num-raid-devices
Chunk size benchmarks
Cleanup
# umount /mnt/md2
# cryptsetup luksClose crypt-md2
# mdadm --stop /dev/md2
Chunk size
# mdadm --create --metadata=1.2 --verbose --chunk 128 --level=0 --raid-devices=6 /dev/md2 /dev/sd{b..g}2
# cryptsetup -c aes-lrw-benbi:sha256 -s 256 luksFormat /dev/md2
# cryptsetup luksOpen /dev/md2 crypt-md2
# mkfs.xfs -f -d sunit=16,swidth=96 /dev/mapper/crypt-md2
# mount -t xfs /dev/mapper/crypt-md2 /mnt/md2
# bonnie++ -f -d /mnt/md2 -s 4024 -n 0 -u root
etc...
RAID0
Version 1.03
Chunk-size | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
64 | 4024M | | | 166964 | 18 | 48156 | 9 | | | 112035 | 13 | 369.9 | |
128 | 4024M | | | 170101 | 18 | 44732 | 9 | | | 93034 | 11 | 388.1 | |
256 | 4024M | | | 168815 | 18 | 43214 | 8 | | | 89604 | 10 | 410.9 | |
Stick to 64k chunk size.
md3 - RAID5
Create RAID5 for general use
- Test various chunk sizes
mdadm --create --metadata=1.2 --verbose --chunk 128 --level=5 --raid-devices=5 --spare-devices=1 /dev/md3 /dev/sd{b..g}3
cryptsetup -c aes-lrw-benbi:sha256 -s 256 luksFormat /dev/md3
cryptsetup luksOpen /dev/md3 crypt-md3
mkfs.xfs -f -d sunit=128,swidth=640 /dev/mapper/crypt-md3
mount -t xfs /dev/mapper/crypt-md3 /mnt/md3
bonnie++ -f -d /mnt/md3 -s 4024 -n 0 -u root
swidth = sunit × num-raid-devices
Edit /etc/crypttab and /etc/fstab
# echo "crypt-md3 /dev/md3 none luks" >> /etc/crypttab
# echo "/dev/mapper/crypt-md3 /mnt/md3 xfs defaults,noatime,noexec,noauto 0 3" >> /etc/fstab
sunit and swidth
mkfs.xfs can't work out sunit and swidth from a dmcrypt device. So run mkfs.xfs on the md device first and use the values it calculates there when running mkfs.xfs on the dmcrypt device.
# mkfs.xfs -f /dev/md3
meta-data=/dev/md3 isize=256 agcount=32, agsize=13599264 blks
= sectsz=4096 attr=0
data = bsize=4096 blocks=435176448, imaxpct=25
= sunit=16 swidth=80 blks, unwritten=1
naming =version 2 bsize=4096
log =internal log bsize=4096 blocks=32768, version=2
= sectsz=4096 sunit=1 blks
realtime =none extsz=327680 blocks=0, rtextents=0
Note that the log size is 128MB. ( bsize × blocks = 4K × 32×2^10 = 128M)
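One detail to watch when copying these values across: the mkfs.xfs output above reports sunit and swidth in filesystem blocks (blks), whereas the mkfs.xfs man page describes the `-d sunit=` and `-d swidth=` options as taking 512-byte units. The sketch below does that conversion; the helper name xfs_blks_to_sectors is made up for illustration, and the block size defaults to the bsize=4096 shown in the output.

```shell
# Convert a stripe value reported by mkfs.xfs in filesystem blocks
# into 512-byte units: blocks * block_size / 512.
# Usage: xfs_blks_to_sectors <blks> [bsize]   (hypothetical helper)
xfs_blks_to_sectors() {
    blks=$1
    bsize=${2:-4096}
    echo $(( blks * bsize / 512 ))
}
```

With the output above, sunit=16 blks and swidth=80 blks convert to 128 and 640, which are the numbers passed to `mkfs.xfs -d sunit=128,swidth=640` in the md3 commands.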
Version 1.03
Chunk-size | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
64 | 4024M | | | 50391 | 7 | 28868 | 6 | | | 88188 | 11 | 302.0 | |
128 | 4024M | | | 45829 | 6 | 28207 | 5 | | | 77503 | 9 | 294.1 | |
256 | 4024M | | | 37982 | 5 | 27898 | 5 | | | 70849 | 9 | 313.4 | |
Readahead and stripe cache size
Chunk=256
Version 1.03
RA [1] | SC [2] | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
256 | xxx | 4024M | | | 36932 | 5 | 28244 | 6 | | | 69821 | 8 | 300.9 | |
4096 | xxx | 4024M | | | 36161 | 5 | 28999 | 4 | | | 111310 | 11 | 306.8 | |
4096 | 4096 | 4024M | | | 76893 | 11 | 40381 | 6 | | | 111537 | 10 | 282.5 | |
Chunk=64
Version 1.03
RA | SC | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4096 | 4096 | | | | | | | | | | | | | |
Here I got fed up with PM and directly connected the drives.
# mkfs.xfs -f -d sunit=16,swidth=80 /dev/mapper/crypt-md3
chunk=64, bsize=4k, sunit=16, swidth=80
Version 1.03
RA | SC | Size | Seq Out Per Chr K/sec | %CP | Seq Out Block K/sec | %CP | Seq Out Rewrite K/sec | %CP | Seq In Per Chr K/sec | %CP | Seq In Block K/sec | %CP | Random Seeks /sec | %CP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4096 | 4096 | 4024M | | | 196838 | 26 | 73407 | 11 | | | 238919 | 22 | 402.0 | |
[1] read-ahead
[2] stripe cache size
echo 4096 > /sys/block/md3/md/stripe_cache_size
/dev/sdc died during benchmarking so:
# mdadm --create --metadata=1.2 --verbose --chunk 128 --level=5 --raid-devices=5 /dev/md3 /dev/sd{b,d,e,f,g}3
# cryptsetup -c aes-lrw-benbi:sha256 -s 256 luksFormat /dev/md3
# cryptsetup luksOpen /dev/md3 crypt-md3
# mkfs.xfs -f -d sunit=128,swidth=640 /dev/mapper/crypt-md3
# mount -t xfs /dev/mapper/crypt-md3 /mnt/md3
Limit rebuild speed (KiB/sec)
# echo 1000 > /proc/sys/dev/raid/speed_limit_min
# echo 20000 > /proc/sys/dev/raid/speed_limit_max
Save md RAID configuration so far
# mdadm --detail --scan --verbose >> /etc/mdadm/mdadm.conf
Shutdown
# umount /mnt/md2 /mnt/md3
# swapoff /dev/mapper/md0-swap
# cryptsetup remove md0-swap
# cryptsetup remove crypt-md2
# cryptsetup remove crypt-md3
# mdadm --stop /dev/md*
PM RAID take 2
2007-11-30
- Added a SiI3132 PCIe controller. Should exhibit 130mbit bandwidth limit.
- Only 5 disks working now so have to force bonnie++ to run with < 2×RAM.
Benchmark RAID0
# mdadm --create --metadata=1.2 --verbose --level=0 --raid-devices=6 /dev/md0 /dev/sd{b..g}1
# cryptsetup -c aes-lrw-benbi:sha256 -s 256 luksFormat /dev/md0
# cryptsetup luksOpen /dev/md0 crypt-md0
# mkfs.xfs -f -d sunit=16,swidth=96 /dev/mapper/crypt-md0
# mount -t xfs /dev/mapper/crypt-md0 /mnt/md0
# for dev in sd{b..f} ; do blockdev --setra 128 /dev/$dev ; done
# blockdev --setra 128 /dev/md0
# blockdev --setra 8192 /dev/mapper/crypt-md0
# bonnie++ -f -d /mnt/md0 -r 1800 -s 3700 -n 0 -u root
Backports
- http://www.backports.org/dokuwiki/doku.php?id=instructions
echo 'deb http://www.backports.org/debian etch-backports main contrib non-free' >> /etc/apt/sources.list
apt-get update
apt-get -t etch-backports install foo
Performance Tweaks
Tweak RAID array parameters in rc.local
# vim rc.local
# Configure RAID
DEVICES="sda sdb sdc sdd sde sdf"
# disable NCQ
for dev in $DEVICES ; do
echo 1 > /sys/block/$dev/device/queue_depth;
done;
# set read-ahead
for dev in $DEVICES ; do
blockdev --setra 128 /dev/$dev
done
blockdev --setra 128 /dev/md{0,2,3}
blockdev --setra 8192 /dev/mapper/md0-swap /dev/mapper/crypt-md{2,3}
# set stripe_cache_size
echo 8192 > /sys/block/md3/md/stripe_cache_size
exit 0
Have to fix detecting the devices. May not always be sd{a..f}
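One possible way to avoid hard-coding DEVICES is to ask sysfs which disks actually back the md arrays. The sketch below is an assumption-laden illustration rather than what was eventually used here: md_member_disks is a made-up name, SYSFS_ROOT is a hypothetical override (defaulting to /sys) so it can be tried without real arrays, and the partition-stripping step assumes sdX-style device names.

```shell
# Print the unique whole-disk names backing all md arrays, one per line.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
md_member_disks() {
    root="${SYSFS_ROOT:-/sys}"
    for slave in "$root"/block/md*/slaves/*; do
        [ -e "$slave" ] || continue      # skip if the glob matched nothing
        name=${slave##*/}                # e.g. sdb3
        # strip the trailing partition number: sdb3 -> sdb (sdX naming assumed)
        printf '%s\n' "$name" | sed 's/[0-9]*$//'
    done | sort -u
}
```

In rc.local this could replace the fixed list with something like `DEVICES=$(md_member_disks)`, provided the arrays are already assembled by the time the script runs.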
Boot probe order
- sata_sil24 gets probed before pata_jmicron so can't boot with RAID array attached.
- Solution: remove sata_sil24 from initramfs
Put just the needed modules in the initramfs
vim /etc/initramfs-tools/initramfs.conf
MODULES=dep
Unload sata_sil24
# cryptsetup luksClose /dev/mapper/crypt-md3
# mdadm --stop /dev/md3
# rmmod sata_sil24
Recreate initramfs
# mkinitramfs -k 2.6.25-pmp -o /boot/initrd.img-2.6.25-pmp
Replacement HDD
2008-08-06 - Finally RMA'd the faulty Samsung HDD.
Initialize new disk
# badblocks -c 10240 -s -w -t random -o sdg.new.badblocks.out -v /dev/sdg
Partition and add to RAID
# cat sfdisk.format | sfdisk -uM /dev/sdg
# mdadm /dev/md3 --add /dev/sdg3
Limit rebuild speed
In KiB/sec:
# echo 1000 > /proc/sys/dev/raid/speed_limit_min
# echo 20000 > /proc/sys/dev/raid/speed_limit_max
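Because speed_limit_max caps the per-device rebuild rate, it also puts a floor under how long a rebuild can take. The sketch below works out that floor; min_resync_minutes is a made-up helper, both arguments are assumptions supplied by the caller (component size in KiB and the limit in KiB/sec), and the real rebuild may run slower than the cap when the array is busy.

```shell
# Lower bound on rebuild time in whole minutes:
# component size (KiB) divided by the speed limit (KiB/sec), rounded up.
# Usage: min_resync_minutes <size_kib> <limit_kib_per_sec>   (hypothetical helper)
min_resync_minutes() {
    size_kib=$1
    limit_kib_s=$2
    echo $(( (size_kib / limit_kib_s + 59) / 60 ))
}
```

For instance, a 72000000 KiB component at the 20000 KiB/sec cap set above cannot finish in under 60 minutes.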
Grow array from 5+1 to 6
# mdadm --grow /dev/md3 --raid-devices=6 --backup-file=/var/tmp/raidresize
Expand LUKS partition
# cryptsetup resize crypt-md3
Expand XFS
# xfs_growfs /mnt/md3
Process Limits
- Set limits to prevent processes like lsdvd killing the system when freaking out on dodgy ISOs
cat >> /etc/profile <<'EOF'
if [ $UID -ge 1000 ]
then
ulimit -m 1000000 # Max resident memory 1GB
ulimit -v 1000000 # Max virtual memory 1GB
ulimit -u 150 # Max processes 150
fi
EOF
ATA Hard Resets
- Started getting ATA hard resets
- At or around the same time:
- the power supply on DGS-1008D switch died
- bad interference on digital TV and cellphone conversations was noticed
- Narrowed it down to one enclosure slot
- Could be a power supply problem?
- Cable problem?
- See HasturAtaFailures
Monitor RAID
Configure cron
Monitors the array every 20 minutes
$ crontab -e
0,20,40 * * * * /sbin/mdadm --monitor --oneshot --mail yourname@yourisp
Upgrade to Lenny
2009-05-13
- Replaced all occurrences of stable and etch with lenny
- apt-get update && apt-get dist-upgrade
- /boot was mounted ro, had to remount and retry upgrade
Recover RAID after failed disks
- Recreated RAID array superblocks with mdadm-2.5.6
- Script to permute ordering.
- HasturRaidRecovery
- HasturRaidConfiguration
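The permutation script itself is not reproduced on this page (see HasturRaidRecovery). As a rough idea of what such a script needs at its core, here is a minimal sketch of a permutation generator; permute is a made-up name, it only prints orderings and never touches a device, and it assumes the items contain no spaces.

```shell
# Print every ordering of the given items, one ordering per line.
# $1 is the prefix built so far (pass "" initially); the remaining
# arguments are the items still to be placed. The body runs in a
# subshell so each recursion level gets its own variables.
permute() (
    prefix=$1; shift
    if [ $# -eq 0 ]; then
        printf '%s\n' "${prefix# }"
        return
    fi
    i=1
    for item in "$@"; do
        # rebuild the argument list without the i-th item
        rest=""
        j=1
        for other in "$@"; do
            [ "$j" -ne "$i" ] && rest="$rest $other"
            j=$((j + 1))
        done
        # word-splitting of $rest is intentional here
        permute "$prefix $item" $rest
        i=$((i + 1))
    done
)
```

Calling `permute "" b c d` prints the six orderings of three device letters. Each line could then be turned into a candidate member list for a trial `mdadm --create --assume-clean`, as in the recovery commands recorded later in these notes; with six disks that is 720 candidates.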
Boot Reconfiguration
2009-09-09
Disable boot-time serial console
- Edit /boot/grub/menu.lst
- Edit inittab
2009-09-13
- Turn serial console back on but give precedence to console
serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
terminal console serial
Don't start cryptdisks on boot
- sysv-rc-conf: disabled cryptdisks, cryptdisks-early
Recover RAID again
2009-09-10
md2
mdadm --create --assume-clean --metadata=1.2 --verbose --level=0 --raid-devices=6 /dev/md2 /dev/sd{e,f,g,b,c,d}2
md3
mdadm-2.5.6 -C --assume-clean -f -e 1.2 -l 5 -c 128 -n6 /dev/md3 /dev/sd{b,g,d,c,f,e}3
Move mounted home
2009-09-10
# mv /mnt/md3/systems/hastur/home /mnt/md3/home
# cd /mnt/md3/systems/hastur && ln -s ../../home
Kernel Upgrade
2009-09-16
Array Upgrade
Extend Logical Volumes
# lvextend -L +100G /dev/vg-md6/home
# lvextend -L +300G /dev/vg-md6/media
# resize2fs /dev/vg-md6/home &
# resize2fs /dev/vg-md6/media &
Prepare Backup Disks
# Create mdadm raid 1 with metadata at the end
mdadm --create /dev/md5 -e 1 --level=1 --raid-devices=2 /dev/sdg /dev/sdh
# resync
mdadm --readwrite /dev/md5
# init LUKS, keysize (-s) is required for aes-xts-plain
cryptsetup luksFormat -c aes-xts-plain -s 512 /dev/md5
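# open the LUKS device (mapping name assumed, following the crypt-mdN convention used above)
cryptsetup luksOpen /dev/md5 crypt-md5
# filesystem, created on the opened mapping rather than on the raw array
mkfs.ext3 -m0 /dev/mapper/crypt-md5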
- copy
- prep for shipping
Array Upgrade 2
Upgrade to Squeeze
2012-11-05
Followed howtoforge.
Backed up /etc
tar -czvf /mnt/md6-media/systems/etc.tgz /etc
- Found Lenny archive at: http://ftp.de.debian.org/debian-archive/debian
apt-get clean # running out of space on /var
apt-get update # update old distro
apt-get upgrade # upgrade old distro
Upgrade incomplete. Still on Lenny.
/boot was mounted ro, had to remount and retry upgrade.
2013-10-30
- Reattempted upgrade
Kernel
Following Debian manual
- apt-get upgrade
apt-get install linux-image-2.6-amd64
fails out of space on /lib!
Out of space in /lib
- Reconfigure sshd to allow root login temporarily
- ssh in as root
- Resize hastur-home LVM
- Reduce fs to 50G, reduce lv to 52G, expand fs again to fill lv
umount /home
HOME_DEV=/dev/mapper/hastur-home
e2fsck -f $HOME_DEV
resize2fs $HOME_DEV 50G
e2fsck -f $HOME_DEV
lvreduce -L 52G $HOME_DEV
resize2fs $HOME_DEV
Extend hastur-root by 512MB, leave the rest unused
ROOT_DEV=/dev/mapper/hastur-root
lvextend -L +512M $ROOT_DEV
resize2fs $ROOT_DEV
Redo kernel upgrade
apt-get -f install
udev
- Problem with libc6-i386 dependencies related to transition to multiarch.
- Solution: Removed ia32-libs and all dependents. Removed libc6-i386
apt-get remove dpt-i2o-raidutils
dpkg --remove lib32asound2 lib32gcc1 lib32ncurses5 lib32stdc++6 lib32z1 lib32z1-dev libc6-dev-i386 libc6-i386
resume udev install (with some deps including gcc4.4-base)
apt-get install udev
dist-upgrade
apt-get upgrade
apt-get dist-upgrade # fails with perl libanyevent problems
check anyevent-perl deps
apt-cache showpkg anyevent-perl # none!
dpkg -r anyevent-perl
apt-get install -f # fix the packages that anyevent-perl broke
mediatomb
backup
/etc/init.d/mediatomb stop
cd /etc
tar -czvf mediatomb.tgz mediatomb/
remove
apt-get purge mediatomb mediatomb-common mediatomb-daemon
rm -rf /etc/mediatomb
apt-get install mediatomb-daemon
extract backup and merge configs
cd /etc/mediatomb
tar -xzvf /etc/mediatomb.tgz
mv mediatomb old
vimdiff config.xml old/config.xml
restart
/etc/init.d/mediatomb restart
Switch postgresql
Stop and backup
/etc/init.d/postgresql stop
tar -czvf 8.3.bak.tgz /etc/postgresql/8.3 /var/lib/postgresql/8.3
tar -czvf 8.4.bak.tgz /etc/postgresql/8.4 /var/lib/postgresql/8.4
Drop default 8.4 cluster
/etc/init.d/postgresql start
pg_dropcluster --stop 8.4 main
Upgrade 8.3 to 8.4
pg_upgradecluster 8.3 main
pg_dropcluster --stop 8.3 main
Upgrade to Wheezy
2013-10-31
Preparation
- install etckeeper and baseline
apt-get install etckeeper
cd /etc
etckeeper init
check the git staging area and remove unwanted stuff
git commit -a -m "etc: squeeze baseline"
git tag -a -m "squeeze"
- updated apt-sources
- Check estimated space and extend /var by 2GB
apt-get -o APT::Get::Trivial-Only=true dist-upgrade
lvextend /dev/mapper/hastur-var -L +2G
resize2fs /dev/mapper/hastur-var
- keep polipo
- apt-get autoremove
Upgrade
Minimal upgrade first
- apt-get upgrade
Lots of fixes: dist-upgrades, apt-get -f installs, etc.
Breakages from:
- vlc and libav
- apt-get removed vlc and continued, later reinstalled
Removed old custom-built kernels.
Removed custom install of rtorrent now that Wheezy has an up-to-date one.
Fixed md127 mdadm device in /etc/mdadm/mdadm.conf
ARRAY /dev/md6 UUID=<array:uuid>
Migrate to rsyslogd
sudo apt-get install rsyslog
sudo apt-get purge inetutils-syslogd
Fix Cacti
Cacti node tree UI wasn't working at all. Missing js library jquery-cookie
apt-get install libjs-jquery-cookie
Remaining Issues
- Mythbackend broken
- Cacti tree ui broken
Log
2011-07-09
- Extended md6-media by 100G
2013-10-30
- Completed upgrade to Squeeze
2013-10-31
- Upgraded to Wheezy