SysAdmin's Journey

Installing the Pidgin-Encryption Plugin on OpenSolaris

You can install Pidgin from the OpenSolaris.org repository, but there’s no package for the Pidgin-Encryption plugin. Once you point configure at the right NSPR and NSS directories, it’s not hard to install the plugin from source.

pfexec pkg install SUNWgcc SUNWxorg-headers SUNWgnome-common-devel
tar -xzvf pidgin-encryption-3.0.tar.gz
cd pidgin-encryption-3.0
./configure --with-nspr-includes=/usr/include/firefox/nspr/ \
  --with-nspr-libs=/usr/lib/firefox/ \
  --with-nss-includes=/usr/include/firefox/nss/ \
  --with-nss-libs=/usr/lib/firefox/ --prefix=/usr
make
pfexec make install

Restart Pidgin, and go to Tools->Plugins, and enable the Pidgin Encryption plugin.
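
If configure still complains about NSPR or NSS, it can help to confirm that the directories passed on the configure line actually exist on your box. Here's a minimal sketch of such a check. It is not part of the plugin's build, the helper name check_dirs is hypothetical, and the paths in the usage comment are simply the ones from the configure line above.

```shell
#!/bin/sh
# check_dirs: hypothetical helper, not part of pidgin-encryption.
# Prints every argument that is not an existing directory and
# returns non-zero if at least one was missing.
check_dirs() {
    missing=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            missing=1
        fi
    done
    return $missing
}

# Example, using the paths from the configure line above:
#   check_dirs /usr/include/firefox/nspr /usr/include/firefox/nss /usr/lib/firefox
```

If anything is reported missing, the Firefox headers probably live somewhere else on your release, and the --with-nspr-* and --with-nss-* options need to point there instead.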

Installing Synergy2 From Source on OpenSolaris 2009.06

At work, I can’t live without Synergy, the program that lets you share one keyboard and mouse across multiple systems. Here’s a quick post on how to install it on OpenSolaris:

pfexec pkg install SUNWgcc SUNWxwinc
tar -xzvf synergy-1.3.1.tar.gz
cd synergy-1.3.1
./configure
make
pfexec make install
mkdir -p ~/.config/autostart
cat > ~/.config/autostart/synergys.desktop <<EOD

[Desktop Entry]
Type=Application
Name=Synergy2 Server
Exec=/usr/local/bin/synergys
Icon=system-run
Comment=Synergy2 Display Server
X-GNOME-Autostart-enabled=false
EOD

To enable startup at login, create the appropriate /etc/synergy.conf and navigate to System->Preferences->Sessions, and place a check next to “Synergy2 Server”.
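
For reference, here's a rough sketch of what a two-machine synergy.conf can look like. The screen names "desktop" and "laptop" are made up for illustration, and the file is written to the current directory (or wherever the hypothetical SYNERGY_CONF variable points) rather than straight into /etc, so you can look it over first.

```shell
#!/bin/sh
# Write an example two-screen Synergy config.
# "desktop" and "laptop" are hypothetical screen names; by default
# Synergy expects screen names to match each machine's hostname.
conf="${SYNERGY_CONF:-./synergy.conf.example}"

cat > "$conf" <<'EOD'
section: screens
    desktop:
    laptop:
end

section: links
    desktop:
        right = laptop
    laptop:
        left = desktop
end
EOD

echo "wrote $conf"
```

Once it looks right for your layout, copy it into place with something like 'pfexec cp synergy.conf.example /etc/synergy.conf'.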

Fixing Metacity's Window Placement on Dual Head Setups on OpenSolaris

It took me a few minutes to get dual-head working on OpenSolaris, but once I did, I immediately found something that greatly annoyed me. Every new window I opened would launch in the middle of the two heads (half of it on one monitor, the other half on the other monitor). Also, maximizing a window made it stretch across both monitors. It turns out this bug made it just in time for the OpenSolaris 2009.06 release. To fix the issue, you have to download a patched metacity binary referenced in this bug report. Download the binary, then run the following:

pfexec cp /usr/bin/metacity /usr/bin/metacity.orig
pfexec cp metacity /usr/bin/metacity
/usr/bin/metacity --replace

Now, metacity should behave itself. Hopefully the devs will push out an update that resolves this soon.
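
One thing to watch with the commands above: if you run them a second time, the first cp overwrites metacity.orig with the already patched binary. Here's a small sketch of a safer backup step. The helper name backup_once is hypothetical, and it assumes you run it from a shell that can write to /usr/bin.

```shell
#!/bin/sh
# backup_once: hypothetical helper. Copies FILE to FILE.orig only when no
# backup exists yet, so a second run cannot overwrite the pristine copy
# with an already patched binary.
backup_once() {
    src=$1
    if [ -e "$src.orig" ]; then
        echo "backup already exists: $src.orig"
    else
        cp -p "$src" "$src.orig" && echo "backed up $src to $src.orig"
    fi
}

# Intended use, from a root shell:
#   backup_once /usr/bin/metacity
# To roll back later, copy /usr/bin/metacity.orig over /usr/bin/metacity
# and run '/usr/bin/metacity --replace' again.
```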

The Problem With Web-based Everything

So, I’ve been tinkering with the free version of Toodledo, a web-based GTD task manager, and I was thinking about upgrading to a “pro” account. Unfortunately, a storm ran through their datacenter’s city last night, and the datacenter switched over to generators. Something went wrong during the switch, and power was lost. That crashed and corrupted their database, and they are still down now. Here’s what their homepage states right now:

“So, here’s the story. A big storm went through the city where our datacenter is located. The datacenter decided to proactively switch to generators. During the switch, something got screwed up, and the power went off for a few minutes. As (bad) luck would have it, this caused our database to get corrupted. We are currently working to bring it back online and restored from the live backup. The crack team at Rackspace is on the job. Thanks Rackspace! Unfortunately, the database is so large, that it will take some time to transfer and verify all the data. Hopefuly not more than a few hours. We know that this is very bad, and we apologize for any inconvience that this will cause. Please check the forums when we are back online for a full report. Update: Its obviously taking longer than we expected and we are really sorry for that.”

Now, I’m not paying anything for the service, and I’m fine with the downtime. However, I don’t think I’ll be upgrading anytime soon - this outage tells me a few things:

  • They don’t use UPSes.
  • They don’t use more than one data center.
  • They likely don’t manage their own servers.

Again, all of this is fine - it costs money to do all these things, and I understand the decision not to do them. However, when I pay for software as a service, I expect the software and the service to be highly available.

Upgrading From Solaris 9 With a Root SVM Mirror to Solaris 10 With a Root ZFS Mirror With Less Than 10 Minutes of Downtime

As sysadmins, we are often handed a task that has no documentation covering the whole thing. One of the biggest skills an admin can have is the ability to problem solve by breaking a large task down into smaller sub-tasks. Oftentimes, you can find documentation on some of those sub-tasks. A perfect example is upgrading a server from Solaris 9 with root in an SVM mirror to Solaris 10 with a ZFS mirror. Not only is this large task doable, but thanks to LiveUpgrade, it can be done with less than 10 minutes of downtime (3 reboots @ roughly 3 minutes each)!

Part of the beauty of Solaris when compared to Linux is the tools made available to the admin. I didn’t even like working in Solaris until I started learning about zones, ZFS, LiveUpgrade, DTrace, etc. Now, on the server side, I can’t use it enough. I would be hard-pressed to do a similar upgrade with Linux - it’s almost impossible to do in 10 minutes of downtime on RHEL. Debian might be able to do it, but LiveUpgrade gives you the ability to roll back to the previous state, which I don’t believe ‘apt-get dist-upgrade’ allows.

Anyways, enough evangelism, on to the howto! If you’re subscribed to my RSS feed you may not even have noticed, but all of the steps have already been laid out over the past few posts. All that remains is to put them together into one big chain of subtasks.

Step One: Break the SVM Mirror

Step Two: Upgrade Solaris 9 to Solaris 10 using LiveUpgrade

Step Three: Migrate from UFS to ZFS root using LiveUpgrade

Step Four: Add the Second Disk to a ZFS Root Mirror

  • Total Downtime: None
  • Link to Article: Adding a 2nd Disk to a ZFS Root Pool

We’ve taken a behemoth of a task that sounds like it would incur a large amount of downtime, and broken it down into smaller, more manageable substeps. As an added bonus, using Solaris technologies, downtime is kept to a minimum!

New and Improved check_mem.pl Nagios Plugin

UPDATE 9/19/2011: I’ve moved this plugin over to github: https://github.com/justintime/nagios-plugins. It now has a PNP template, and support for AIX as well.

We have always monitored RAM usage on all of our boxes. Sure, there’s the argument that unused RAM is money wasted, but I always like to know not just when the box is swapping, but when it’s about to start swapping. There have been a few plugins over the years that I’ve used for this - check_ram for Solaris, check_mem for Linux, and there’s also check_mem.pl. Well, migrating to Solaris 10 and ZFS started tripping the check_ram thresholds due to the ZFS ARC cache. So, I attempted to pull together a cross-platform Nagios plugin that did its best to give me what I wanted, and what do you know, it works! This graph shows the ZFS ARC cache at its best:

cacti.png

So, I started with the check_mem.pl script that’s included in the contrib folder of the official Nagios Plugins. What emerged when I was done was quite different. Here are some key differences:

  • If run on a Solaris host:
    • If the Sun::Solaris::Kstat module is available, it grabs the total memory, memory in use by the ZFS ARC cache, and free memory using that module. If not, it uses vmstat and prtconf to determine total, used, and free memory. There’s no easy way to track ARC cache usage without the module.
  • If run on a Linux host:
    • It uses /proc/meminfo to gather total memory, used memory, free memory, and cache/buffer memory.
  • If run on another Unix host:
    • It uses vmstat to do what it can. This code is unchanged from the original check_mem.pl.
  • If run on a supported OS (Solaris with Kstat, or Linux), you can use the -C command line option, which counts the cache memory as free memory when comparing it to the warning and critical thresholds.
  • I enabled perfdata output for Nagios to use.
  • Any user can run the plugin.
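
To give a feel for the Linux side, here's a rough sketch of pulling the same four numbers out of /proc/meminfo with awk. This is not the plugin's Perl code. The helper name meminfo_summary is hypothetical, and it assumes that "used" means total minus free and that "caches" means Buffers plus Cached.

```shell
#!/bin/sh
# meminfo_summary: hypothetical helper, not the plugin's Perl code.
# Reads /proc/meminfo style text on stdin and prints, in kB:
#   total used free caches
# Assumptions: used = MemTotal - MemFree, caches = Buffers + Cached.
meminfo_summary() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "MemFree:"  { free = $2 }
        $1 == "Buffers:"  { caches += $2 }
        $1 == "Cached:"   { caches += $2 }
        END { print total, total - free, free, caches + 0 }
    '
}

# On a Linux host:
#   meminfo_summary < /proc/meminfo
```

The exact match on "Cached:" is deliberate, so that the separate SwapCached line is not counted as cache.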

Let’s show an example, run from a Solaris host with ZFS:

$ /usr/local/nagios/libexec/check_mem.pl -w 10 -c 5 -f 
WARNING - 9.9% (406520 kB) free!|TOTAL=4113824KB;;;; USED=3707304KB;;;; FREE=406520KB;;;; CACHES=816947KB;;;;

Uh oh! I have less than 10% free of the 4GB total. Wait, the ZFS ARC is using up 800MB of that! Let’s try again with the -C option:

$ /usr/local/nagios/libexec/check_mem.pl -w 10 -c 5 -f -C
OK - 29.7% (1220611 kB) free.|TOTAL=4113823KB;;;; USED=2893212KB;;;; FREE=1220611KB;;;; CACHES=817075KB;;;;

That’s better! You’ll see the same sort of thing on Linux. Maybe some day I’ll share the nasty hackery that is getting Nagios perfdata into Cacti automagically, but I don’t know if the world’s ready for that yet ;-) Until then, give my plugin a try, and let me know how it works. If you have another OS for me to add, I’d love to code it up!
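
If you want to see the arithmetic behind those two runs, here's a small sketch. The helper name pct_free is hypothetical, and treating -C as "add the cache to free memory before computing the percentage" is my description of the behaviour, not a copy of the plugin's code. It assumes a POSIX awk (nawk on Solaris).

```shell
#!/bin/sh
# pct_free: hypothetical helper.
# Usage: pct_free TOTAL_KB FREE_KB CACHES_KB COUNT_CACHE
# Prints the free percentage to one decimal place. When COUNT_CACHE is 1,
# cache memory is added to free memory first, which is what -C is
# described as doing above.
pct_free() {
    awk -v total="$1" -v free="$2" -v caches="$3" -v count="$4" 'BEGIN {
        if (count == 1) free += caches
        printf "%.1f\n", free * 100 / total
    }'
}

# Numbers taken from the first run's perfdata above.
pct_free 4113824 406520 816947 0   # without -C
pct_free 4113824 406520 816947 1   # with -C
```

With the first run's numbers this prints 9.9 and then 29.7, which lines up with the WARNING and OK results above. The second run's raw numbers differ slightly because memory usage shifted between the two runs.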

Use LiveUpgrade to Migrate From UFS to ZFS With Minimal Downtime

Continuing on from the article on how to use LiveUpgrade to upgrade from Solaris 9 to Solaris 10, we now migrate our Solaris 10 UFS file systems to ZFS. LiveUpgrade handles the migration of the critical filesystems; we’ll manually migrate three other filesystems from UFS to ZFS using ufsdump and ufsrestore to minimize downtime.

Phase One: Delete the Old Solaris 9 Boot Environment

We still have an old Solaris 9 boot environment lying around. It’s time to move on, and blow it away.

ludelete Solaris9 && lustatus

There – that felt good now, didn’t it?

Phase Two: Create the Solaris 10 ZFS Boot Environment

Now that we’ve freed up c1t0d0 by removing the Solaris 9 boot environment, we can use it for our ZFS boot environment. Before we can do anything, it’s a lot easier to reformat the disk to use just one big slice. Go ahead and use ‘format’ to reconfigure your slices so that s0 consists of the whole disk. With our slices in place, we need to create our root pool. Do this by running:

zpool create rpool c1t0d0s0

With that out of the way, we can now create a new boot environment named Solaris10ZFS that is a copy of the current one on our newly created ZFS pool named rpool:

lucreate -n Solaris10ZFS -p rpool

Phase Three: Boot Into the Solaris 10 ZFS Boot Environment

The next step is to activate our ZFS boot environment, and boot into it.

luactivate Solaris10ZFS

Note: if that fails with the message ‘/usr/sbin/luactivate: /etc/lu/DelayUpdate/: cannot create’, then you’ve tripped over a bug described here. To work around it, run the following:

export BOOT_MENU_FILE="menu.lst"
luactivate Solaris10ZFS

You’ll get output similar to the following - be sure to print it out, or copy it someplace you can get to it later:

**********************************************************************

The target boot environment has been activated. It will be used when you 
reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You 
MUST USE either the init or the shutdown command when you reboot. If you 
do not use either init or shutdown, the system will not boot using the 
target BE.

**********************************************************************

In case of a failure while booting to the target BE, the following process 
needs to be followed to fallback to the currently working boot environment:

1. Enter the PROM monitor (ok prompt).

2. Change the boot device back to the original boot environment by typing:

     setenv boot-device /pci@1c,600000/scsi@2/disk@1,0:a

3. Boot to the original boot environment by typing:

     boot

**********************************************************************

Modifying boot archive service
Activation of boot environment  successful.

When ready, run

init 6

to reboot into your ZFS boot environment.

Phase Four: Migrate Non-Critical UFS Filesystems to ZFS

In our example, I had three UFS filesystems that were not marked as “critical” filesystems by Sun:

Mountpoint      Device
/apps           /dev/dsk/c1t1d0s5
/export/home    /dev/dsk/c1t1d0s4
/usr/local      /dev/dsk/c1t1d0s3

We will create new ZFS filesystems for these, and use ufsdump and ufsrestore to quickly sync them over. You could write a script for this, but scripts that muck with filesystems make me nervous. Here’s a list of steps you’ll want to do for each filesystem you want to migrate. For this example, I’ll use the /apps partition above.

  • First, create the ZFS filesystem under the rpool pool:

    zfs create rpool/apps

  • Next, change to the new ZFS directory:

    cd /rpool/apps

  • Next, we do a backup and restore from UFS to ZFS:

    ufsdump 0uf - /apps | ufsrestore rf -

  • Now, create a temporary mountpoint for the UFS filesystem:

    mkdir -p /a/apps

  • Stop all processes that are accessing the UFS filesystem. Use ‘fuser -c /apps’ to make sure it’s no longer in use.

  • Unshare the filesystem from NFS:

    unshare /apps

  • Unmount the UFS filesystem from its old location:

    umount /apps

  • Mount the UFS slice to the new, temporary location:

    mount /dev/dsk/c1t1d0s5 /a/apps

  • Change to the ZFS directory:

    cd /rpool/apps

  • Run a level one backup/restore. This will only copy over files that have changed since we did the level 0 backup above (and should be very quick):

    ufsdump 1uf - /a/apps | ufsrestore rf -

  • Get out of the ZFS directory:

    cd /

  • Set the mountpoint for the ZFS filesystem to be where the old UFS one was:

    zfs set mountpoint=/apps rpool/apps

  • Start up your daemons and whatnot that were needing access to the filesystem.

  • Unmount the temporary mount, and cleanup the directory:

    umount /a/apps && rmdir /a/apps

  • Edit /etc/vfstab, and comment out the line mounting /apps. ZFS handles mounting for us now.

  • Wash, Rinse, Repeat - repeat this for /usr/local and /export/home.
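
Since running a script against live filesystems makes me nervous, a middle ground is a script that only prints the checklist for a given filesystem, so you can read it over and paste the commands one at a time. Here's a sketch of that idea. The helper name migration_plan is hypothetical, it executes nothing, and it assumes the temporary mount lives under /a and that the dataset keeps its default mountpoint until the end.

```shell
#!/bin/sh
# migration_plan: hypothetical helper.
# Usage: migration_plan MOUNTPOINT DEVICE DATASET
# Prints the checklist above as commands, in order. It runs nothing.
migration_plan() {
    mnt=$1 dev=$2 ds=$3
    tmp="/a$mnt"    # temporary mountpoint, e.g. /a/apps
    cat <<EOF
zfs create $ds
cd /$ds && ufsdump 0uf - $mnt | ufsrestore rf -
mkdir -p $tmp
# stop services using $mnt; verify with: fuser -c $mnt
unshare $mnt
umount $mnt
mount $dev $tmp
cd /$ds && ufsdump 1uf - $tmp | ufsrestore rf -
cd /
zfs set mountpoint=$mnt $ds
# restart services that use $mnt
umount $tmp && rmdir $tmp
# comment out the $mnt line in /etc/vfstab
EOF
}

migration_plan /apps /dev/dsk/c1t1d0s5 rpool/apps
```

The downtime for each filesystem is only the stretch between stopping the services and restarting them, which is why the slow level 0 dump happens before anything is stopped.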

Phase Five: Test

First, go ahead and exhale – holding your breath for that long isn’t good for you! You need to take a look around the system, and make sure everything is running properly. Check ‘dmesg’, /var/log/syslog, /var/adm/messages, etc. Run ‘mount’ and make sure there are no UFS mounts in there that you don’t want. I recommend a reboot, but it’s not really needed.
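
If eyeballing the mount output gets tedious, here's a small sketch of a filter for leftover UFS mounts. The helper name ufs_mounts is hypothetical, and it assumes ‘mount -v’ prints lines of the form "DEVICE on MOUNTPOINT type FSTYPE ...", so check that against your own output before trusting it.

```shell
#!/bin/sh
# ufs_mounts: hypothetical helper. Reads 'mount -v' style output on stdin
# and prints the mountpoint of every UFS filesystem it finds.
# Assumed line format: DEVICE on MOUNTPOINT type FSTYPE OPTIONS on DATE
ufs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "ufs" { print $3 }'
}

# On the migrated host:
#   mount -v | ufs_mounts
```

No output means no UFS filesystems are mounted. Anything it prints is a mountpoint that still needs migrating, or one you left on UFS on purpose.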

Summary

Well, you did it! Migrating an entire system from UFS to ZFS isn’t as painful as it could be, thanks to LiveUpgrade. If you have non-critical UFS filesystems you want to migrate, it requires a little elbow grease, but is easily done with minimal downtime. Welcome to your new ZFS root!

Adding a 2nd Disk to a ZFS Root Pool

So, let’s say you’ve just completed migrating to ZFS from UFS using LiveUpgrade, and now you want to use that leftover disk to make a mirror. Easily done, but there’s one caveat – you need to make the second disk bootable in case the first fails. So, starting off from where we left off, you have an old UFS based boot environment sitting on c1t1d0. First, delete the old environment:

ludelete Solaris10 && lustatus

That was easy. Now run ‘format’ on c1t1d0, and make slice 0 encompass the whole disk. Write out the label, and get back to the prompt. Now, we need to make our single-disk ZPool into a two-way mirror. This operation is mind-blowingly simple and is one of the showcases of ZFS and its ease of management:

zpool attach rpool c1t0d0s0 c1t1d0s0

This sets up the mirror, and automatically starts the resilvering (syncing) process. You can monitor its progress by running ‘zpool status’. The final step is to actually make c1t1d0 bootable in case c1t0d0 fails. Here, we use the ‘installboot’ program for SPARC:

installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t1d0s0

or use installgrub if you’re on x86 (substitute the slice of your own second disk):

installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1d0s0

That’s it, you can now boot from either drive!
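
One caution: the second disk is only a usable copy once the resilver has finished. Here's a small sketch for waiting on that. The helper name resilver_in_progress is hypothetical, and the phrase it greps for is what I'd expect ‘zpool status’ to print while a resilver is running, so verify the wording on your release.

```shell
#!/bin/sh
# resilver_in_progress: hypothetical helper. Reads 'zpool status' output
# on stdin and succeeds (exit 0) while a resilver is still running.
# Assumption: the status text contains the phrase "resilver in progress".
resilver_in_progress() {
    grep "resilver in progress" >/dev/null
}

# Intended use on the host, polling once a minute:
#   while zpool status rpool | resilver_in_progress; do sleep 60; done
#   zpool status rpool
```

When the loop exits, the final ‘zpool status’ should show both halves of the mirror as ONLINE before you count on booting from the second disk.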

Catch CommunityOne West Technical Sessions From Anywhere!

Sun is broadcasting their CommunityOne West technical sessions free via Ustream. Of particular importance to the audience of this blog: Channel 3 - Managing OpenSolaris™:

Time Title/Speakers

10:50 - 11:40 am

What’s New in the OpenSolaris™ 2009.06 Operating System
Chris Armes and Pete Dennis, Sun Microsystems, Inc.

11:50 am - 12:40 pm

Becoming an OpenSolaris™ Operating System Power User
David Miner and Nicholas Solter, Sun Microsystems, Inc.

1:40 - 2:30 pm

Built-in Virtualization for the OpenSolaris™ Operating System: Containers, Sun™ Logical Domains (LDOMs) and xen
Jerry Jelinek, Sun Microsystems, Inc.

2:40 - 3:30 pm

Open Networking with Crossbow
Sunay Tripathi, Sun Microsystems, Inc.

4:00 - 4:50 pm

OpenSolaris™ Operating System Secure Deployment: Roles, Privileges and Crypto
Christoph Schuba, Sun Microsystems, Inc.

5:00 - 5:50 pm

Open Storage with the Solaris ZFS™ File System and COMSTAR
Scott Tracy, Sun Microsystems, Inc.

Call for Comments: Help Advise an Aspiring SysAdmin

One of the things I love about blogging, and about the Internet in general, is that it makes the world much smaller. I’ve met so many people via my blog, it makes all the work pay off. A good example of a blog introducing two people just happened the other day. Andrew is an aspiring sysadmin who is moving soon, and wanted to ask me some questions regarding how to “break in” to a career in systems/network administration. In my response, I asked if he would mind if I quote our conversation and post it up on my blog, so that he could ask many of us the same questions at once. He agreed emphatically. So, click through, read Andrew’s questions, and comment away – what else are you doing on a Friday??? Andrew first sent me this email:


I’ve only newly discovered www.planetsysadmin.com and came across your blog. I just read your About section and I loved it, as well as the other posts I’ve read from SAJ. I thought you’d be a good person to talk to about this. The purpose of this email is to ask for a bit of advice from someone having traveled down the same road I hope to. I first discovered Linux about late 2005 and played around with it, but never got around to seriously using it until about mid 2006. I’ve been a Debian (and Ubuntu) Linux user ever since (as well as casually seeing what’s new with OpenSolaris). Around the same time, I discovered a love for networking, despite not knowing much about it. Long story short, I’ve been wanting to start a career as a Linux/UNIX Network Administrator since then. I’ve come a long way since then, but I’ve still got a long way to go I’m sure. My girlfriend and I are moving out of Michigan to either Gainesville Florida, or Santa Rosa California. No matter where we move, I’m looking to break into the IT sector and actually starting my career. Thanks to an unfortunate niche I fall into with the US’s student Financial Aid system, I can’t really start attending college until next fall ‘10, so I’ve no college under my belt, and the bulk of my tech knowledge is self-taught. I’ll be taking my Network+ before we move, so that’s the only real credential I’ll have to put on my resume. My real question is, what advice would you give someone looking to get into this type of position? I’ve done my homework, and I’ve read a few books on both the technical and non-technical side of the job. I feel I know enough about what a SysAdmin actually does, and I haven’t been scared off by any of it. In fact, it sounds absolutely awesome. I figured I could get more by asking someone who does this for a living as opposed to some books and a ton of web-sites. I can’t seem to get my fill of learning and hearing about the job, so if you have anything to share, I encourage you to ramble.


I responded with this:


… Unfortunately, like many careers, “breaking in” to the market is often the hardest step. I might suggest trying to get a helpdesk/support job to start out with. Often times, finding a local ISP is a good place to start - that’s where I started. Since they are usually small, you can often get promoted quickly if you show potential. Most often, they are *nix shops, and have a heavy focus on TCP/IP network troubleshooting. Unfortunately, the smaller mom and pop ISP’s are a dying breed since the cable and phone companies tend to buy them out. Usually, if you can find them, they are operating wireless ISP’s now. If you can’t find a suitable job there, look for anything as a junior sysadmin. In my opinion, the most important skill a sysadmin has is his ability to solve problems. Memorization can only get you so far - there’s too much information out there to memorize it all. Having the ability to troubleshoot a problem and quickly isolate it down to a specific subsystem is of the utmost importance. In my experience, something that often sets apart *nix sysadmins from Windows sysadmins is their ability to build scripts to automate their jobs. If you haven’t already, pick up a book on shell scripting, and start learning how to write shell scripts if you haven’t already. To me, the most important personality traits are: 1) An insatiable thirst for knowledge, and 2) Patience. It sounds like you already have the first covered. Patience is required because computers are stubborn, and if you don’t want to burn out, you have to keep your cool. …


Andrew then wrote back:


… I really liked the tip about looking for smaller ISPs. When I was working for PC Club computers (not sure if you heard of them or not), the tech I replaced took a job for Sonic.net, a local ISP for the northern California area. Checking the Bay Area’s Craigslist, I see two job postings from Sonic looking for what seem to be basic Network tech skills (I should be a shoe-in if I pass my Network+). The Bay Area is also the only place out of Metro Detroit and Central Florida that have job offers looking for OSX server skills. A few are asking for ACSA certs. In either area, Helpdesk jobs are relatively easy to find and hopefully even easier to hire into. Actually now that I think about it, I remember all this buzz on CL about a MySQL developer job paying $11/hr in the Greater Detroit area. As far as skills go, purely for learning, I’ve been trying to setup an overkill home network to hone my skills using services like Apache, BIND, Postfix, MySQL, monitoring, etc. However, right now, my network consists of one machine; mine. A VirtualBox VM is my server, handling LDAP, DHCP, and DNS services. I’m trying to figure out ways to incorporate other popular services like MySQL, Postfix, and Apache, as well as monitoring. I know that shell scripting is a desired skill by many, but is there really a difference between knowing BASH scripts vs Python or Perl? I’ve never had much of an aptitude for programming. I tried reading a beginner’s type guide to C++, but I got lost after the introductory paragraph. I’ve always had an interest in Python and Java, but not for any particular reason. For a few years now, I’ve been checking out job listings and seeing what a lot of these employers are looking for. It seems that a lot of them want very specialized people, while some are looking for generalists. Generally, what I’ve seen is that you pretty much can’t go wrong with Microsoft, Cisco, and VMWare knowledge, as well as Blackberry Enterprise server experience. 
Not to mention Linux knowledge. I was studying for the CCNA exam, but too much material and not enough time before we move to really take it all in. The Network+ is much more general and easier. One of the ways I’ve been trying to hone my skills aside from making my own network to test things out on is writing. I’ve always been fairly good at writing reports and such for school, so this comes pretty easily to me. Over the last few years, I’ve found lots of great sources for Linux info, from what Linux is to how to more advanced topics. I have a blog on Wordpress that needs more attention from me, but I’ve basically gone back to square one and started writing how-to articles teaching people in my shoes (future SysAdmins looking for easy to read learning material) things like what UNIX is, Open Source licensing, what Linux is, and I’m currently going through a series of articles teaching people how to use the shell. After going through shell basics, I’m working on a series of networking tutorials that sit around the Network+ level. I’ve actually had my blog linked to on sites like lxer.com and LinuxToday.com. It’s not near where you guys on PlanetSysAdmin are at, but I hope to make it that grade of material some day. The site is thatLinuxguy.wordpress.com …


There you have it! What was the wisest career choice you made? Conversely, what would you have done differently? Share your experiences by taking the time to write up a comment!