Oct 2017

Kickstart File....
from heck

Ever had to provision a cluster? I know I have, going on the 7th time around in the last 12 years. Back in the day (11 years ago), cluster provisioning systems that were commercial but not latched to a particular hardware vendor were not too expensive. Notice the past tense. These days there are plenty to choose from, none of which I can afford. On top of that, the only viable free one (I don't care what anyone else says) is Rocks Cluster. The biggest issue I have come across with provisioning systems is that I keep having to add hooks: either in the kickstart that gets generated, a ridiculous post script, an actual completely separate post install script after the initial install is done, a parallel management command using pdsh and/or ansible, or - and this is usually the case - some disgusting combination of all of the above. I already know what you are going to say: open source product X, Y, Z or some vendor's offering is just awesome.

I don't care

The environment I work in is very wild west, which is great for me because I am not a rules kinda person. The cost is a large amount of customization (this is code for shoehorning stuff). So this go around I took a simple approach: everything gets done from within the kickstart script. Even if I need a post install script, it still gets done within the kickstart script.

Assumptions

  • Working tftpboot and dhcp (a sketch of the pxelinux side is below)
  • The install iso mounted and available over http
  • The provisioning host has a copy of all the packages named in the kickstart file
  • The provisioning host has copies of the key files needed
  • The hosts file maps each node's ip address to its hostname
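The pxelinux piece boils down to an entry that hands the Ubuntu netboot installer the kickstart url. Something like the following; the kernel and initrd paths are the stock netboot ones and the addresses are placeholders, not necessarily what my setup uses:

# /var/lib/tftpboot/pxelinux.cfg/default -- paths and addresses here are illustrative
default nodeinstall
label nodeinstall
    kernel ubuntu-installer/amd64/linux
    append initrd=ubuntu-installer/amd64/initrd.gz ks=http://10.2.0.165/ks/ks.cfg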

Now it just so happens that once I got this working, I changed it so a script actually stands up tftpd and dhcpd - but that is beyond the scope of this text; suffice to say it makes reinstalling old nodes or installing new ones super easy. On to the one script I needed.

Setup Network Stuff

We go over this first so the focus can then shift to only the kickstart file. The one post script I needed takes the ip address the node got from dhcp, locks it in as the static production address, and uses the hosts file to set the hostname. Remember, dhcpd is shut off in my setup after an install; even if it weren't, the server would detect the address is in use and not dole it out:

#!/bin/bash

# we use ifconfig because netinstall still uses it and not 'ip'
# get the current ip address and make it static
ipaddr=`ifconfig|grep 172|awk '{print $2}'|sed s/addr://`

# We are ASSUMING the interface name right now this can change
cd /etc/network
cp -p interfaces /root/etc.network.interfaces.ORIG
echo "source /etc/network/interfaces.d/*" > interfaces
echo "auto lo" >> interfaces
echo "iface lo inet loopback">>  interfaces
echo "auto ens3" >> interfaces
echo "iface ens3 inet static" >> interfaces
echo "address $ipaddr" >> interfaces
echo "netmask 255.255.255.0" >> interfaces
echo "gateway 10.2.0.1" >> interfaces
echo "dns-nameservers 155.52.45.100 155.52.46.53" >> interfaces

# set the hostname based on the hosts file ipaddr entry
hname=`grep $ipaddr /etc/hosts|awk '{print $2}'`
echo $hname >/etc/hostname

That script is called mknet and kept in /var/www/html/ks (or simply ~ks/ on the webserver).
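For mknet's grep to find anything, the hosts file that later gets copied onto the node (see the %post below) needs plain "address hostname" entries. These are made-up examples of the format, not my real addresses:

# hypothetical /etc/hosts entries; mknet matches the node's dhcp address against column one
10.2.0.21   node01
10.2.0.22   node02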

The Kickstart

Since this is Ubuntu there is no setting a root password; instead I used an admin account with sudo, which gets set up further on. Following is the top part of the file:

install
lang en_US.UTF-8
keyboard us
timezone --utc America/New_York
network --noipv6 --onboot=yes --bootproto dhcp
authconfig --enableshadow --enablemd5
firewall --disabled
selinux --disabled
rootpw --disabled
bootloader --location=mbr --driveorder=sda --append="crashkernel=auto rhgb"
text
user sadmin --fullname "SAdmin" --password $SOMEPASSWD

# Disk Partitioning XXX-JRF not sure IF THESE WILL WORK on stage vms
clearpart --all
zerombr
part /boot --fstype=ext4 --size=300
part pv.1 --grow --size=1
volgroup vg1 --pesize=4096 pv.1
logvol / --fstype=ext4 --name=lv001 --vgname=vg1 --size=10480
logvol swap --name=lv004-swap --vgname=vg1 --size=2098
# END of Disk Partitioning

reboot

# Package Selection XXX
%packages --nobase --excludedocs
@ubuntu-standard

Not exactly too difficult, although the disk part took some figuring out. These are smallish nodes (most of the action is on NFS). Next up is the package list; I truncated the actual list, but essentially here it is:

-b43-openfwwf
linux-firmware
efibootmgr
nfs-common
wget
acpid
openipmi
pciutils
python-libxml2
bc
lsof
ntpdate
snmp
tcl
gfortran

And it just goes on from there. The list is derived by snagging dpkg --list from an already-built machine and cramming the package names into the middle of the kickstart file.
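As a rough sketch of that extraction, run on a donor node that already has the package set you want (the output file name is just a placeholder):

# print installed package names, stripping any :arch suffix (e.g. libfoo:amd64)
dpkg --list | awk '/^ii/ {print $2}' | sed 's/:.*$//' > pkglist.txt

Now the fun part: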


%pre

%post --log=/root/install-post.log
(

echo "SELINUX=disabled" > /etc/selinux/config
PATH=/bin:/sbin:/usr/bin:/usr/sbin
export PATH
echo "%sudo   ALL=NOPASSWD:ALL" > /etc/sudoers.d/admin
/bin/chmod 0400 /etc/sudoers.d/admin
cd /var/tmp
# The provision script is responsible for staging these on the fly
/usr/bin/wget http://10.2.0.165/post/mknet
/usr/bin/wget http://10.2.0.165/files/allfiles.tgz
tar xzvf allfiles.tgz
cd etc/
mv -f passwd /etc
mv -f group /etc
mv -f shadow /etc
mv -f gshadow /etc
mv -f hosts /etc/hosts
mv -f ssh/sshd_config /etc/ssh
chmod +x mknet
./mknet
cd /var/tmp 
tar xzf sadmin.tgz
mv -f sadmin /
cd /
chown -R sadmin:sadmin sadmin/
usermod -d /sadmin sadmin
rm -rfv /home/sadmin
echo "10.2.0.6:/home /home  nfs    defaults    0 0" >>   /etc/fstab`
mount /home
) > /root/install-post.log 2>&1

There is a lot of craziness in there, so let's break it down. First, some standard stuff: shut off selinux and set up sudoers to allow no passwd for members of admin (there is only one in this case):


%pre

%post --log=/root/install-post.log
(

echo "SELINUX=disabled" > /etc/selinux/config
PATH=/bin:/sbin:/usr/bin:/usr/sbin
export PATH   
echo "%sudo   ALL=NOPASSWD:ALL" > /etc/sudoers.d/admin
/bin/chmod 0400 /etc/sudoers.d/admin

None of that is too nuts; the fun starts when kickstart pulls stuff off of the provisioning server to complete the configuration. It goes into /var/tmp and pulls down the aforementioned network setup script and a tarball of key configuration files. Then it:

  • Moves the key files into position first
  • Sets the exec bit on the network config script, then runs it
  • Unpacks the admin account's files (this is for SSH keys)
  • Makes the admin account operate off of /; this is on purpose because /home is NFS mounted

cd /var/tmp
# The provision script is responsible for staging these on the fly
/usr/bin/wget http://10.2.0.165/post/mknet
/usr/bin/wget http://10.2.0.165/files/allfiles.tgz
tar xzvf allfiles.tgz
cd etc/
mv -f passwd /etc
mv -f group /etc
mv -f shadow /etc
mv -f gshadow /etc
mv -f hosts /etc/hosts
mv -f ssh/sshd_config /etc/ssh
chmod +x mknet
./mknet
cd /var/tmp 
tar xzf sadmin.tgz
mv -f sadmin /
cd /
chown -R sadmin:sadmin sadmin/
usermod -d /sadmin sadmin
rm -rfv /home/sadmin

That seems like a lot, but it works rather well. I was able to deploy nodes easily with it as long as I made sure the hosts file was up to date. The last step is NFS:

echo "10.2.0.6:/home /home  nfs    defaults    0 0" >>   /etc/fstab`
mount /home
) > /root/install-post.log 2>&1

Which is why the nfs-common package was included.

Summary

With a lot of time and patience, a really good, single point kickstart network install can be used to simplify life. This setup saved me from having to go out and find a provisioning system. There is, however, one command I had to run afterwards: a host keyscan of the node for the submit hosts. Perhaps at some point I will write up the wrapper script I wrote that stands all of this up temporarily.
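For the record, that keyscan is nothing more exotic than something along these lines, run from each submit host (the node name is a placeholder):

# append the new node's host keys so the submit hosts stop prompting
ssh-keyscan -t ed25519,rsa node01 >> /etc/ssh/ssh_known_hosts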