TalkBMC

Adventures in Linux

Steve Carl, R&D Manager at BMC Software, muses about his adventures in Linux.
OpenSUSE 11 General Availability as an ELD, Now with secret sauce

I mentioned a few posts back that I had a test system stack: four identical older systems that I set up to be able to test Linux. The idea was that I could do back to back comparisons and have a good idea how each Distro of Linux stacked up on the same hardware at the same time. No sequential reloading of Distros on the same computer. Just a quick switch of the console via the KVM to look at the same thing (OpenOffice.org, Gnumeric, Evolution, Firefox, whatever is piquing my interest...) on the same type of computer, but two different Distros.

I took it all apart today. I did not reckon with two problems.

  1. Heat and noise while they were up (I left it all down when I was not using 'the stack'). The noise came from the fans in the KVM. Note to self: data center grade gear is lousy for office use.
  2. Even if the hardware looks the same, and specs out the same, when it is old, it does not necessarily act the same. This is probably true even when hardware is new, but as it ages, it becomes more pronounced. In particular, the video cards and how well they worked with the KVM, and the hard drives, and how some systems seemed to be in I/O wait for no apparent reason against /dev/sda.

I had a third reason for doing what I did, which was to learn how our new data center standard KVM switches work from actual setup type experience. I am always looking to stay as current as I can on all sorts of tech, and I had not had a chance to "play" with these yet. That being done, it was time to move on.

OpenSUSE 11

I mentioned in that post about the test stack that I was testing OpenSUSE 11 Alpha. It has since GA'ed, so it was time to go back and have a look. Unlike Fedora 9, OpenSUSE 11 had installed fairly easily even in Alpha state. I expected the GA to be smooth, and it was. All you have to do is look at all the trade reviews of OpenSUSE 11, and read all the praise for the changes that it has brought to the OpenSUSE party, to get the feeling that R11 is a significant upgrade over what came before it.

A great deal of the excitement surrounds the fact that the software installer and updating process are significantly improved. They are not yet quite Ubuntu / Mint easy, but they are light years better than they were, and are closing in on the leaders of the pack. It is now dead easy to enable alternate repositories, including ones that allow you to install binary only drivers like Nvidia and ATI's. This, as it turned out, would be key for me.

I did not want to install R11 on 'the stack'. I wanted to turn that off and take it out of my office. My IBM T41 was nominated instead. It has always worked well with SUSE in the past, so I assumed it would be easy, and it was. Boot the LiveCD, run the installer, answer a set of questions very similar to Ubuntu's, lay out the hard drive manually as always, and then let it spin on down.

Since the T41 had been running Mint 4, the OpenSUSE look and feel was replaced from the get-go with my customized desktop: Space Shuttle landing at night picture, standard Gnome tool bar at the top of the screen. Some things are missing:

  • No gkrellm is available from any standard OpenSUSE repository. My favorite system monitor... well, other than Patrol of course. I am sure it is out there someplace, and when I get a spare moment, I'll find it.
  • Sensors, avahi, etc all have to be installed since they were not on the LiveCD image, but they are available. 
  • HDDtemp is not available! 
In no time at all the desktop looks more or less the way I like. The tool bars are stocked with goodies. The Wifi card works out of the box and with no muss or fuss (something that Fedora would not have done). Evolution finds the Mint created config files and appears to work well.

Phase 1 complete. No casualties.

Crispy Nvidia 7300

Shortly after I finished up the T41, my Dell 745 desktop, running Mint 4.0, starts acting flaky. It moaned and hummed and whined and wheezed. I opened the case, and watched the fan on the video card stop and start. Speed up, then slow down. Whine then run silently. Uh oh.

A few days later, video stops working on the second monitor. "lspci" says that there is no Nvidia card at all. 

I do what any geek faced with such a situation would do. I went to Fry's (I gotta love a store that has parts to build your own Linux computer and also sells Apple stuff). There I picked up an Nvidia 7200CS that had a big heat sink rather than a fan on it.

In the 7200CS went, and no luck. Mint acts like it can not see it. I decided to try OpenSUSE and see what it would do. My thinking was that OpenSUSE, being from Novell and the Open Source members of that project, should have the world's best implementation of Evolution on it: Novell bought Ximian, creators of Evolution and the Evolution connector. In the past the SUSE version of Evolution had always been at least workable. This would give me a chance to see how well OpenSUSE worked on desktop hardware, with dual heads, with the Nvidia repositories, and with Evolution.

Late that night, I booted the OpenSUSE 11 LiveCD that I had used on the T41, and it all worked pretty much the same as it had. For fun I tried to use the Open Source Nvidia drivers first but they would not enable the second monitor. The closed source ones worked fine, and enabled the "twinhead" setup. I was back in business. Even Compiz worked, and that had never happened on the 745 with the Nvidia 7300 and Mint.

Evolution came up, found everything where Mint 4 had left it, and I was off and running. Well. Not so much.

Stable for 24 hours, then a failure. Evolution Connector crashed. 

Evolution 2.22

Evo 2.22 in SUSE has a slightly updated look and feel relative to that same app in Mint 5.0. A few more plugins appeared to ship, although I did not compare them line by line.

My desktop can *not* have an unstable version of Evolution on it. It is my main place to read email, check my calendar, open tasks to myself, update contacts, filter emails from various mailing lists into folders for offline reading, etc.

I installed the debugging symbols for Evolution and Connector, and went into the business of sending crashes into the Gnome project. At first it crashed when I was using it. Then it started to crash even when I was nowhere near the computer. More and more, faster and faster, closer and closer together.

When I say crash, I mean Connector crashed. Evolution stayed up and running. It was just useless.

I created a clean ~/.evolution directory, and slowly brought back over the mail folders from the backup copy now called .evolution.mint. I went through and disabled plugins that were not useful in our MS Exchange based shop, like Hula and Groupwise related things.

Crash. Crash. Crash.

And now, the secret sauce....

I was trying to decide what to do, and had just about opted to move to Mint 5.0 on the desktop, with a fall back plan to Mint 4, which has been stable. Then, I noticed something odd. A pattern emerged. Every single time Evolution Connector had crashed when I was there to observe it, it had been when the inbox was being filtered: When the rules were running that kept my inbox cleared out. A little status message in the taskbar about filters running was there every time, and always at 0% complete. It looked like a new message was arriving, triggering the rule to run and parse it, but that the rule was immediately freezing and Connector was crashing shortly after that. I have about 20 Rules in the ruleset. I would not think that was a large number, but who knows? My quick looks at the crash dumps before I sent them in to Gnome made me think the crash was happening in the same way every time.

I decided to try something.

  • I disabled filters aka 'Rules' in Evo-speak on INBOX for Evolution Connector.
  • I created and enabled an IMAP account to the same MS Exchange 2003 server Inbox.
  • Turned on filtering on IMAP. Same exact rule set, same exact Inbox, just running via IMAP rather than Connector. 
  • I made IMAP my default account. The Connector account was there and active, just not default. This means, among other things, that outbound email is being delivered via SMTP rather than through the Connector's WebDAV protocol.
My idea and experiment: use Connector *only* for Calendaring, Tasks, and Contacts (including GAL lookups). Take the stress off the Connector code. If this was a timing related or load related issue.....

It has not failed even once since I did this, which means about 5 working days of uptime. Other than the first 24 hours of stability, I could not get Connector to stay up for more than a few hours at a time. It appears that Evolution Connector and the built in rules facility are not compatible at this time, at least with OpenSUSE 11 and Evolution 2.22.

In retrospect, it probably should have been a clue that the OpenSUSE 11 installation on my T41 laptop never had an Evolution crash. I do not run filters there.

Enterprise

As usual, I have to ask the question: is OpenSUSE 11 a viable desktop for an enterprise? Not for geeks like me, but for the average computer user that does not want to know anything about the computer itself: they just want a tool to get a job done.

The desktop itself is easy to use, easy to configure, easy to update, and a strong preview of what is to come in the next release of SLED (SUSE Linux Enterprise Desktop). It has all sorts of standard Open Support, from Wikis to mailing lists to online doc.

From what I have seen the system is pretty solid except for my corner case of Evolution against MS Exchange 2003 running a fairly large set of filters on my inbox via Connector. I'd have to say I would probably have no problem supporting it, and would prefer all the new shiny goodness of OpenSUSE R11 versus the getting-long-in-the-tooth SLED 10. For the first time ever, I have left OpenSUSE on my primary desktop to be used as my primary OS at the office.

Mint will stay my primary at-home Linux version. Instead of Mint-everywhere, I'll be jumping back and forth. A new experiment has begun.



_____
Sunday, June 29, 2008
Interesting, but not Enterprise. Not that they ever said it was

The problem with asking a technogeek whether or not something is possible is that you will almost always get back the answer "Yes".

"Can a program be written that monitors all the computers on the network, regardless of who makes it, or what OS it is running?"

"Yes"

"Can it be ready a week from Tuesday?"

"What year?"

There is the rub: a technical question needs a scale framed around it. Is Linux a viable desktop OS: Yes. Can we use Fedora at the office? Yes.

Those last two, while true, ignore scale and ignore training and ignore whether or not other 'flavors' of Linux would be better. Our recent experience with replacing our Tru64 TruCluster with a CentOS based cluster is a lesser example: yes it was possible, but it did require having a guy like Dan Goetzman to read the kernel code, read the traces, find the problem, and write a workaround. Since then, it has worked extremely well. You can not ignore the fact, however, that most companies do *not* have a Dan or even a Dan-like person on their team. Such skills, while not unavailable, are rare enough that most folks just go with a vendor created solution.

That is the eternal tradeoff of IT: Roll your own and get exactly what you want, but then be forever locked in to being the maintenance and update group, or go with a vendor solution where all of this is essentially outsourced.

Our CentOS cluster is an enterprise grade solution, but in point of fact, only because Dan is standing behind it. CentOS has no vendor support. Without Dan, we would have used RedHat Enterprise Linux and bought support instead.

It is in this frame of reference that I went to look at Fedora 9. I know it is not supported, and that it is not meant to be an Enterprise Linux Desktop, any more than my recent foray with Mepis is or was. Fedora is a technology exploration, and I was exploring.

Back to the Stack

I started out a while back to create a test environment where I could compare various Linux environments side by side. At the time, Fedora 9 was pre-GA, and was not behaving well on the test gear. At the time I was trying out the LiveCD version of the install, but Fedora was just not getting the video right, where pre GA or just-recently-GA versions of Ubuntu, OpenSUSE, and Mandriva were working fine on the exact same type of computers. These are standard Dell desktops no less. Nothing too weird about them. Certainly not laptops and their more esoteric hardware.

Even before I did the test installs, I was starting to feel a certain level of frustration with Fedora. I could not quite figure it out. It used to be my *main* distro. I used it ahead of everything else: Where Mint sits today, once there sat Fedora: From Fedora releases 1-5, it was, for me, the *it* distro, replacing Mandrake.

With Fedora 1 through 5 I had to hack the wireless to work on all my laptops. I was getting downright fast at it. Either finding the unsupported-by-Fedora-but-Linux-native-driver-stuff, like MadWifi, or shortcutting it with NDISWrapper. Either way, Fedora was on the air in short order. It was no harder to get going than SUSE back then, and Fedora hacks were better documented on the Internet. It seemed like everyone used it.

When I started using Linux as my full time desktop here at the office, it was Fedora. Not any more, and not for a while. Now-a-days, I only install it to see what is happening in it, and it usually ends up being frustrating because in terms of ease of install and supported hardware it has been passed as if it were standing still. Fedora feels stuck in the past, with the Anaconda installer: In truth it is no different to install now, in terms of difficulty and the need to add in 3rd party repositories, than it was in the beginning, or at least that is the way that it feels. One person at the office (a fellow Linux desktop user) said that they felt that Anaconda itself was getting more fragile with every release.

Ubuntu, Mint, Xandros, OpenSUSE, Mandriva, PCLinuxOS... you name it. All of them are dead easy installs, and usually they just work out of the box.

Fedora stands alone

Fedora is outstanding in its field: That is where we found it. Out standing in a field... Sorry.

I have known intellectually for a long time that Fedora is different from all the other highly used Linux Distros. Knowing that and having a visceral understanding of it are not the same thing though. I used to think of OpenSUSE as being a kissing cousin to Fedora, once SUSE started to use the Fedora-like development model. But there is a big big difference, especially now.

Here is where I get into trouble sometimes when I am looking at things like this. I have to recall that Fedora may look exactly like any other Gnome based Linux: same menus, same packages, same projects underneath it all, but it is assembled out of the bleeding edge stuff. Can it be made to work: yes. Is it interesting to see what some packages are doing? Yes. Should you use it as an ELD: Only if you don't need support.

The OLPC project has been working for a long time to create a production version of Fedora 7... and the Fedora project is two releases down the road from there. Support on OLPC is about what you'd get from Fedora too: online forums, Wiki pages for Doc, etc. No number to call, no throats to choke if you are so inclined.

You can get support, from a commercial company, for years, on Ubuntu (especially the LTS versions like the current 8.04). Mint is community supported but close enough to Ubuntu to be pretty supportable. Many of the things published in the Ubuntu forums work on Mint. Xandros and SUSE stand behind their versions with support options.

You want support on a Linux desktop from RH, you go with Red Hat Enterprise Linux 5 Desktop or one of its variants.

Part of what made relative lack of support for Fedora pop back into focus for me was a note I got from the CodeWeavers folks:

...The bad news is that extensive testing on Fedora Core 9 has revealed
severe problems with FC9 itself.  There's a serious font-drawing
problem, and also a periodic crashing bug.  Both of these are problems
in Fedora and outside of our control, so this latest release is likely
to exhibit those problems as much as the betas did...

That was interesting two ways: The obvious technical issue, but also that CodeWeavers was *trying* to support Fedora as a viable desktop for Linux.

Looking at BMC for a moment, we only support versions of Linux that have vendor support for the version: currently Novell and RedHat GA releases. I personally would like to see Ubuntu added to that list, but I am sure that comes as no surprise to anyone that reads this blog. No I am not announcing or hinting at anything. Just wishing.

ELD and Fedora

Anyone who is a Linux maven could make a go of Fedora as an ELD. I know lots of people here at BMC that do just that. Fragile installers do not scare them, and fixing drivers is no big deal, etc. Fedora, for them, is beauty because it is bleeding edge. Sure the Xorg server in R9 is experimental and causing screen tearing. Now. In a few days or weeks it will get fixed, and then they'll have access to the latest greatest features. The speed. The bleeding edge hardware support. It will have been worth it. To them.

As an ELD for the masses though, all Fedora is going to do is give you a clue as to what you will see at some point in the future: maybe RH ELD 6. And even that is not a dead certainty: RH will err to the side of stability, so some bleeding edge stuff will not make the cut. Maybe RH ELD 7. Maybe never.

For use in a shop like ours, an MS Exchange based shop, I always look at what Evolution is looking like and how it is behaving. I have learned over the years that the point release of the project is not all you need to know. The way that the Distro packages and tests it is important. See what happened when I tried to run Evolution under Mepis for example.

I did test 2.22 on Fedora 9. It works almost the same as 2.12 did on the last Fedora, and it also works about the same as 2.12 or 2.22 does on Ubuntu or Mint. Recall that 2.12 and 2.22 are adjacent releases, despite the jump in the numbering. Evo has all the same features, and all the same problems. Do a mass delete from Evolution on one computer, and the other one will completely lose track of the inbox message count. Exchange back end crashes fairly often still. Finally, nothing has really happened (as I feared it would not) on the MAPI support front.

I did do one experiment I have never done before: I set up both IMAP and Connector at the same time and pointing at the same inbox on the same server. When the connector crashes, IMAP keeps right on running. This tells me that the instability is probably not in the base Evolution code, but in the protocol connector of "Connector" itself.

Install of Fedora 9


I originally planned this post to be about how the Fedora 9 installer has changed between the pre GA and GA code. I changed my mind. I was interested in the 'fragile' comment that had been made. It matched my experience with the pre-GA LiveCD. I decided to go conservative, and download the install CD set (the Dell test computers not having bootable DVD media). It was by and large the same Anaconda install I have seen for a while now except that it would not run in GUI mode. I had to run it in the character based curses mode to see it. No big deal: Done that before.

When the final boot came, the same thing happened that did with the LiveCD: The video mode was whacked (same as the GUI install it appeared), and the boot messages were invisible. The Dell monitor said "This video mode can not be displayed".

I booted to single user, erased /etc/X11/xorg.conf, ran 'system-config-monitor', and got past that problem. But now 'firstboot' had not run, so I manually added userids and config-ed things on the system that the firstboot stuff normally does.

I don't know if this is a global thing or not, but I have to agree now about the fragile comment my co-worker made: the installer is not very solid. We both have Dell gear to work with so it could just be a limited sample type problem. Given the ubiquity of Dell gear, and the fact no other OS is having these issues with the same hardware, that seems odd. Perhaps by being on the bleeding edge some backward compatibility was left behind?

Once up and running, it is a very standard Gnome desktop: None of that MS look-and-feel stuff that SLED or Mint is going in for. It is fairly crisp on the older hardware, but that is a Linux hallmark. I would have been shocked to see it going slowly. 2.0 GHz and 512 MB of RAM is still a *big* Linux system, even if this hardware is over three years old.

Applying maintenance via yum makes the video break again on reboot. I guess a new xorg came in, and replaced the /etc/X11/xorg.conf on arrival, but that is just a guess. I powered it off and put it away. I know what I came here to find out. It will be there should I get curious about something else.

The Computer Pile

Fedora now lives in that same logical place in my pile of computers that MS Vista does. Something I fire up whenever I get curious about something. Not that Fedora is as bad as Vista: I got curious about how Vista works after SP1 is applied, and I came to test that last weekend. Answer: No idea: It needs 8 GB of free space just to unroll the patch bundle! Vista takes up 12GB of the 15GB partition. It would be all kinds of work to get it more space.

I decided to try applying point patches. Over two hours later, I had about 23 patches installed. When I did the patch update on Fedora I installed over 100 new or patched versions of things, on much slower hardware, in about 15 minutes. No: Fedora is not Vista. It just is not Mint-Ubuntu-SUSE-Xandros-PCLinuxOS-etc either. That is neither bad, nor good. It just is marching to the beat of a different drummer, and I need to remind myself of that from time to time.



_____
Monday, June 16, 2008
This is a Beta? Wow....

These days the Distro I am always watching and waiting for a new release of, more than any other, is Mint. I have expressed that preference here quite a bit since I discovered Mint back in its 3.x release days. Mint 5.0 Beta is out now, and I am not really sure why it is considered Beta. All I have had with it so far is about 24 hours, but it is solid.

Good Parents

It helps that Mint starts with Ubuntu 8.04. I have been extremely impressed with that release on every computer I have put it on. The Mint team then layers on its own package selections as defaults, adds in its own themes and toolsets, probably does other things I don't know about, and creates Mint.

I really like the cool, dark themes of Mint. I also liked the earth tones of Ubuntu, and I really like the Heron desktop background artwork, but I think that, in the West, Mint's default colors are probably going to be more universally well liked, for the reasons I discussed in my last post at my personal blog.

Just an assumption though.

The Install

There is not much point going into depth on what a Mint install looks like here. It looks like an Ubuntu install with a Mint themed LiveCD desktop behind it. It is a LiveCD with an install icon on the desk. Seven panels. Same questions. Same annoying new time zone slippy sliddy map. I was glad to see that every full on Ubuntu install review I read found that feature to be useless. It may take an impressive bit of graphical programming to make that happen, but I can not determine what useful purpose it has. I just use the TZ chooser pull-down menu below the graphic these days. In fact, one of the things I was curious about in Mint 5 was whether they would go to the trouble to take the Ubuntu-ism back out. Answer: No. Oh well.

The computer I used for this is my Dell D620 laptop. I had Ubuntu 8.04 on it already, so it was a pretty good bet that it would all work, and it does. With 2 GB RAM, a dual core 2.0 GHz T7200 processor, 1440x900 flat panel (detected and configured automatically!), and the Intel GMA 945 chipset, the D620 is a middle of the road laptop by today's standards. Ubuntu and now Mint make it act like a top of the line unit though. Everything is fast. The Compiz GUI effects are enabled by default and do not visibly slow the computer. The default effect choices are mostly useful rather than eye candy: Things like task bar preview, and window re-sizing feedback.

I have said it before, but it bears repeating: If Vista could feel shame it would be hanging its head. Linux and OS X are living proof that you do not have to have top flight graphics cards to do all the fancy video compositing.

One other preference oft mentioned here: Mint does a SLED looking menu by default, which I just as quickly ignore (mostly) and restore the Gnome standard menus across the top.

First Pass

I very quickly (less than 10 minutes) spun the Mint 5 Beta stuff onto the laptop's SATA hard drive, formatting over "/" (sda2) but preserving "/home" (sda4). The layout is the usual one for me:

Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00019fb7

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        1824    14651248+   7  HPFS/NTFS
/dev/sda2            1825        3040     9767520   83  Linux
/dev/sda3            3041        3283     1951897+  82  Linux swap / Solaris
/dev/sda4            3284        9729    51777495   83  Linux
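
As a quick sanity check on that listing, here is a minimal Python sketch (not something from the original install) that redoes the fdisk arithmetic. It assumes the classic fdisk convention that the Blocks column counts 1 KiB blocks; the helper names `blocks_from_cylinders` and `size_gib` are purely illustrative.

```python
# Sketch: reproduce the arithmetic behind the fdisk listing.
# Assumption: "Blocks" are 1 KiB each, as in classic cylinder-based fdisk output.
HEADS = 255
SECTORS_PER_TRACK = 63
SECTOR_BYTES = 512

# One cylinder in bytes; fdisk reports this as "Units = cylinders of 16065 * 512".
CYLINDER_BYTES = HEADS * SECTORS_PER_TRACK * SECTOR_BYTES


def blocks_from_cylinders(start, end):
    """Size in 1 KiB blocks of a partition spanning cylinders start..end inclusive."""
    return (end - start + 1) * CYLINDER_BYTES / 1024


def size_gib(blocks):
    """Convert a count of 1 KiB blocks to GiB."""
    return blocks * 1024 / 2**30


if __name__ == "__main__":
    # Values copied by hand from the listing.
    print("/      blocks:", blocks_from_cylinders(1825, 3040))
    print("/home  blocks:", blocks_from_cylinders(3284, 9729))
    print("/      GiB   : %.1f" % size_gib(9767520))
    print("/home  GiB   : %.1f" % size_gib(51777495))
```

The cylinder ranges for sda2 and sda4 should reproduce their Blocks values exactly. sda1 comes out slightly smaller than its cylinder span suggests, presumably because it starts after the first track rather than at the very start of the disk.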

I booted back into Mint 5.0, logged back in as me, and it pretty much looked like the same desktop I had under Ubuntu. That in turn looked like the Mint 4.0 before that. All the look and feel elements, backgrounds, font settings, etc. were all there. A quick trip into Synaptic re-added all my default applications that I was missing. Things like HFS for reading Mac disks, Evolution plus debugging packages for email, and so forth: All told about 40 things not in the default LiveCD.

Things all worked, but I was having a bit of trouble with Evolution remembering a MS Exchange hosted IMAP server my email used to be on a while back. Nothing I could do surgically would convince it that the server no longer existed. It kept prompting me for my password to it. Very annoying. I used 'find' to search every file in my home directory for the server string, but nothing doing. It must have been coded or compressed in some way that a plain text search would not find.
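
For reference, `find` on its own matches file names and attributes, so a search of file contents usually pairs it with `grep`. I did not record the exact command I ran, so the following is only a minimal Python sketch of the same kind of search, with the function name `files_containing` made up for illustration. It looks for the raw UTF-8 bytes of a string, which means it would also miss a server name stored in a compressed or otherwise encoded form.

```python
import os


def files_containing(root, needle):
    """Return sorted paths of regular files under root whose raw bytes contain needle."""
    target = needle.encode("utf-8")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip sockets, FIFOs, dangling symlinks
            try:
                with open(path, "rb") as handle:
                    # Reads the whole file at once; fine for a sketch,
                    # wasteful for very large files.
                    if target in handle.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: ignore and move on
    return sorted(hits)
```

A call might look like `files_containing(os.path.expanduser("~"), "oldmail.example.com")`, where the host name is a placeholder rather than the real server.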

I use HTML editing far more than standard OpenOffice file formats. It makes it easy to start a document in one place, load it up to Google Docs, and edit it there some more, then download it for a final polish in a different place. That means the OpenOffice Writer launcher's default command needs to be set to "ooffice -web %U". 'System / Preferences / Main Menu' makes short work of that tweak.

HTML Editor Digression

I did part of this post's editing in Quanta 3.5 under Mint 5.0. It crashed a couple of times, but saved the work so I did not lose anything. I am guessing it would be more stable in a KDE environment than a Gnome one... or it is just that this is a Beta. It is usable and useful though. Quanta is an interesting project, but it did manage to almost torch half this post. I had saved the post and exited Quanta... I thought. I then edited for a while in a few other editors. Then Quanta somehow got restarted, and saved a version from its project store back out to the disk, overwriting what was there with a much older copy. I cussed a bit, but then realized that the last editor I had been using, Kompozer, kept a backup of the document, and was able to get most of it back.

I do not really blame Quanta now for this near erasure of my post. It is a sophisticated tool with tons of features, and it was working the same way many SDKs do: as if my blog post was just part of a much larger project. It was trying to organize it into its internal data structures and not really assuming I was working from the external disk copy. Quanta wants to be used in the way it was designed to be used, within its project paradigm, not as a casual editor.

I did part of this post in Kompozer (yea backup files!), and for extra fun, part in Bluefish. I wish there was one really good HTML editor out there for Linux, but all of them bring something to the party I like, and other things I dislike, so that depending on what I am doing, I fire up one or the other. 

  • OpenOffice Web: Good WYSIWYG, spell checking, but upper cases all the HTML tags, and drops in extra stuff at the drop of a hat. Tries way too hard to make things into paper documents rather than simple web pages like these posts.
  • Kompozer: Based off NVU. It is not being deeply developed anymore (which is still better than NVU, which has not seen any work for years), has crashed a few times on me, and tends to be more for WYSIWYG work. I like the tag cleanup tool, but wish it did more. If Kompozer / NVU were being actively developed and had better spell checking I think that is what I would use more.
  • Google Docs:  Has taken to "severely uglifying" the HTML tags with "ID" stuff.  Its habit of dropping in unwanted <br> tags is unreformed, and Google has never fixed the screen presentation so it is more WYSIWYG. I mostly use it when I am doing a great deal of editing from all over the place or collaborating on something. One thing though: You just can't beat its revision system. It has saved my document-editing-bacon more than once. Makes its unruly behaviors all that much more irritating, because otherwise it would be my HTML editor of choice.
  • Bluefish: Reminds me a lot of Quanta: More project oriented, more raw HTML editor. WYSIWYG features are sort of grafted on, but handy for serious tag slinging.
  • Quanta: All noted above.

Back to Mint..

I ran 'MintUpdate' from 'System / Administration' and downloaded the current stuff that Mint has defined as safe updates. About 40 packages altogether were in their safe classification of '3'.

When I was in Synaptic adding in packages like HDDtemp, GkrellM, etc., I saw that there were updates to several packages available that I was not installing. MintUpdate does not offer them at all, or displays them in a state other than three depending on your display preferences. One of the updates in Synaptic that is not in MintUpdate is ... MintUpdate. Looks like MintUpdate is not ready to replace itself...

This update safety system that MintUpdate brings to the table is probably at least part of the reason why this so-called Beta is so solid.

Pass Two

This D620's Gnome desktop has survived quite a number of OS upgrades, so it was time to clean out the whole thing and start over. I erased .gconf* and .gnome* and then logged back in and re-laid out my default desktop. It now looks the same as it did before I started, and only took about 10 minutes, but it does not have any weird behaviors anymore. Evolution has forgotten the old MS Exchange server. What 'find' could not find, 'rm' took care of. Brute force rather than finesse though. More of that in the Evolution section below.

I added a new panel to the top, and repopulated it Gnome style, but left the SLAB-looking panel thing (this is MintMenu, at version 3.3) at its default location on the bottom for reference. MintMenu has one feature I really like, which is the ability to triage-filter applications as I type their names. If there is an application I do not use very often, such that I do not know which menu it appears on, the triage-filter-feature is pressed into service. Example: Sometimes a thing like the Bluefish HTML editor shows up in one distro's Gnome menus in the 'Internet' section, while others like Mint place it in 'Programming'. With MintMenu I don't have to know where something is. Kind of like Spotlight on OS X, but more focused.

While the screen resolution was configured correctly, the default DPI was not quite right. In System/Preferences/Appearance/Fonts it was set to 92 or so, and the D620's flat panel is really 121. At 121 the fonts were a little bigger than I needed, so I set the DPI to 108 and that seems to work pretty well. Lovely anti-aliasing, smooth round easy to read shapes. Very, very nice now.
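A panel's physical DPI can be estimated from its geometry: the diagonal in pixels divided by the diagonal in inches. Here is a minimal sketch, assuming awk is installed. The function name `panel_dpi` is made up for illustration, and the numbers in the example are placeholders, since the exact D620 panel dimensions are not given here:

```shell
#!/bin/sh
# panel_dpi WIDTH_PX HEIGHT_PX DIAGONAL_INCHES
# Prints the panel's dots per inch, rounded to a whole number.
# (Hypothetical helper, not part of Mint or Gnome.)
panel_dpi() {
    awk -v w="$1" -v h="$2" -v d="$3" \
        'BEGIN { printf "%.0f\n", sqrt(w * w + h * h) / d }'
}

# Placeholder geometry: substitute your own panel's numbers.
panel_dpi 1440 900 14.1
```

Whatever number comes out is only a starting point. As noted above, the "true" value made the fonts bigger than I wanted, so the final setting is a matter of taste.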

Evolution 2.22.1

Since I am looking at this as a desktop for complete replacement of MS Windows at the office, and in an MS Exchange 2003 shop, naturally Evolution has to be considered. It worked fine under Ubuntu 8.04 though, and nothing really changes for Mint 5.0. These are in fact the Ubuntu packages: Mint does not give them its own versions:

dpkg -l | grep -i evolution
ii evolution 2.22.1-0ubuntu3.1 groupware suite with mail client and organizer
ii evolution-common 2.22.1-0ubuntu3.1 architecture independent files for Evolution
ii evolution-data-server 2.22.1-0ubuntu2.1 evolution database backend server
ii evolution-data-server-common 2.22.1-0ubuntu2.1 architecture independent files for Evolution Data Serv
ii evolution-dbg 2.22.1-0ubuntu3.1 debugging symbols for Evolution
ii evolution-exchange 2.22.1-0ubuntu1 Exchange plugin for the Evolution groupware suite
ii evolution-exchange-dbg 2.22.1-0ubuntu1 Exchange plugin for Evolution with debugging symbols
ii evolution-plugins 2.22.1-0ubuntu3.1 standard plugins for Evolution
ii evolution-webcal 2.21.92-0ubuntu1 webcal: URL handler for GNOME and Evolution
ii mail-notification-evolution 4.1.dfsg.1-4.1ubuntu1 evolution support for mail notification
ii nautilus-sendto 0.13.2-0ubuntu1 integrates Evolution and Pidgin into the Nautilus file
ii openoffice.org-evolution 1:2.4.0-3ubuntu6 Evolution Addressbook support for OpenOffice.org

For this list I deleted the libraries and compressed some whitespace for brevity....
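The trimming described above can be folded into the pipeline itself. This is only a sketch: it assumes the library packages are the ones whose names start with "lib", and the function name `evo_packages` is invented for the example:

```shell
#!/bin/sh
# evo_packages: reads `dpkg -l` output on stdin, keeps lines that mention
# Evolution, drops packages whose name (column 2) starts with "lib",
# and squeezes runs of spaces down to one.
# (Hypothetical helper; the "lib" prefix rule is an assumption.)
evo_packages() {
    grep -i evolution | awk '$2 !~ /^lib/' | tr -s ' '
}

# Usage on a Debian-style system:
#   dpkg -l | evo_packages
```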

There we have that big version jump: the last version of Evolution was 2.12 and these are 2.22, but there are no intervening releases. Evolution is now aligned to the Gnome release numbers.

A few settings I always change in Evolution later (like making Sunday the start of the week, setting my default calendar to be the one on the MS Exchange server, limiting GAL responses to 50, turning off all the Groupwise plugins, etc.) and it is ready to go. Stable so far, but...

Inbox State with Multiple Clients and Evolution

Evolution has one pretty ugly behavior, and it has been there for quite a while. I see it all the time but have not said too much about it here. I assume that my use of both MS Exchange and Evolution is not utterly typical. Others may not see this often. Just in case, here it is:

I do not run email from just one Linux system, but most of the time from at least two. My desktop, currently running Mint 4.0 on a Dell 745, has the mission of filtering my email. Evolution's filtering tools are pretty good, so I have it keeping all my various subscriptions to various email lists organized.

Further: MS Exchange is well known for being snarky about a user's inbox getting too big. Unnatural things happen when a server side PST gets too large (although this has gotten better in successive releases of MS Exchange). Add to that inbox size limits: most shops, ours included, have server side limits for how much storage you can use for email.

If I am logged in to MS Exchange from both the desktop and the laptop, and am using Evolution to read my mail in both places, and I then archive a large amount of email from the desktop, the laptop utterly loses track of what is *left* in the inbox. It can only see emails that arrive *after* the archive happens. Log out and back in all you like: it never figures out that its view of the Server side Inbox is out of whack.

There is an easy fix though, and I put it into a shell script. I run this script every single time right before I log in to Evolution. It looks like this, more or less:

rm ~/.evolution/mail/exchange/linuxboy@ms-exchange.bmc.com/personal/subfolders/Inbox/summary
rm ~/.evolution/mail/exchange/linuxboy@ms-exchange.bmc.com/personal/subfolders/Inbox/summary-meta
rm -Rf ~/.evolution/cache

Yeah: It's ugly. Brute force effective though.

Interestingly, it does not help to be logged out on the laptop instance of Evo. Logout, do a large archive from the desktop, and then log in from the laptop, and it has lost track of the MS Exchange hosted Inbox. There do not appear to be any 'clear cache at startup' or 'rebuild Inbox meta-data at startup' configuration settings, at least in the GUI. 

It does not appear to slow down MS Exchange login to do the scorched earth script before starting Evo, so it is cheap insurance.
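For anyone who wants to borrow the scorched earth approach, here is a slightly more defensive sketch of the same idea. The function name `evo_reset` and its two-argument interface are made up for the example. The paths underneath are the ones from the script above, and it assumes Evolution is shut down when it runs:

```shell
#!/bin/sh
# evo_reset EVO_DIR ACCOUNT
# Removes the cached Inbox summary files and the cache directory so that
# Evolution rebuilds its view of the server side Inbox at the next login.
# (Hypothetical wrapper around the three rm commands shown earlier.)
evo_reset() {
    # Refuse to run with empty arguments, so we never build a path like /cache.
    [ -n "$1" ] && [ -n "$2" ] || return 1
    evo_dir=$1
    inbox="$evo_dir/mail/exchange/$2/personal/subfolders/Inbox"
    # -f: do not complain if a file is already gone.
    rm -f "$inbox/summary" "$inbox/summary-meta"
    rm -rf "$evo_dir/cache"
}

# Usage (only while Evolution is not running):
#   evo_reset "$HOME/.evolution" "linuxboy@ms-exchange.bmc.com"
```

Passing the directory in, rather than hard coding it, also makes it easy to try the function against a scratch copy before pointing it at a real mail store.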

This is not a Mint thing, or an Ubuntu thing. The exact same thing happens no matter what Distro or release of Evolution I use. Evo just can't keep track of Inbox state all the time. Oddly, if I delete just a few emails from one computer, the copy of Evo on the other computer figures it out. It seems to be a problem related to the size or number of items deleted.

Other Office Stuff

This is a beta and it has only been about 24 hours of testing, so I cannot say that every single thing works as it should. That will require weeks of testing. I did have some emails with .pdf, .doc, and .xls attachments I needed to look at during the day, and everything seemed to work perfectly with the included OpenOffice 2.4. I will probably try out the Beta 3.0 version at some point as well, since it is supposed to, like every release has so far, increase fidelity of MS format import.

I went to the web interface of Remedy to research an Incident, a Change Request and a task, and Firefox 3.0b5 functioned extremely well against the Remedy 7.0 Mid-Tier server. I pointed Firefox at an internal Oracle application and had no issues there. Finally, TSClient against an MS W2k3 virtual machine running in the VMware farm downstairs had no issues.

As Nero Wolfe often tells Archie: "Satisfactory"

PS: A note in the Mint Wiki about 5.0 says that they are watching Firefox 3.0 closely and want to ship that as soon as they can. I have been running the 3.0 Betas for a while on Linux and OS X and it is looking very good. Solid. Fast. Cool new features. I am glad Mint 5.0 will have it when it GA's.... or at least goes Release Candidate.



_____
Wednesday, May 14, 2008
At work with Ubuntu's latest Long Term Support version of Linux

I have been experimenting with Ubuntu 8.04 codename "Hardy Heron" on two of my personal systems and the Linux test system stack I mentioned last post. I have not written much about 8.04 until now because, being pre-GA, there was not much I wanted to get into as far as a discussion of its Enterprise Desktop Worthiness Quotient (EDWQ?).

I started testing with 8.04 Alpha 6, on my IBM X30 laptop. When that looked pretty good, I tried it on my Acer 5610 laptop, temporarily replacing Mint 4.0 there. I used both of those computers for a while to do various things like write previous blogs and other personal documents, surf the web, and so forth. I kept them updated nearly daily, just to see how 8.04 was trending.

I later installed the Alpha 6 version at the office on one of the Dell GX260's in the test stack. The point of this was to configure Evolution 2.22 to get a feel for how that was shaping up. Well, that and I had this cool new set of computers that were crying out to test something.

Personal GA

When 8.04 GA'ed, I did a clean install on the Acer 5610 and tested it for the evening.

To this point, I was looking at personal usability stuff, not Enterprise. My first off the cuff reactions to that were:

  1. I like the new artwork, especially the abstract Heron on the desktop background. I showed it to my wife, and she does too. I showed it to some folks at the office, and got a "Yuch: too brown" reaction. Looks like the preference stuff I talked about in "Color Theory" is still true...
  2. When I first brought up 8.04, I had no network connections. 8.04 "saw" the wireless card, and configured it. The wireless card "saw" all the local access points. I just had to pick one and all was good. But I knew to click on that icon and map that AP. I am not sure a new Linux person would not have been frustrated there. It would be nice if a pop-up or something mentioned "Pick an access point to get started".
  3. Compiz seems to work really nicely on the Intel GMA 950 chipset on the Acer 5610, and I did *not* have to tell xorg anything about the screen resolution: It was correct from the get-go. Compiz even works on the X30 with its tiny amount of graphics memory: Vista should be hanging its head in shame. I turned Compiz off on the X30, preferring a crisp screen response to a pretty one. No point turning it off on the Acer: It snaps along pretty well there.
  4. I'd have no issues installing this OS for a non-computer person. My brother's Mint 4.0 install is not in danger of being replaced though.
  5. It seems like every Linux lately gets a bit faster: A bit crisper. I assume this is the latest set of tweaks to both Compiz and the Kernel. Ubuntu also tossed AT&T style Init a while back to get boot cracking along more quickly and that seems to be getting better all the time.
  6. The new version does the best job yet identifying and configuring the ENE Technologies chipped MMC card slot on the Acer. Still does not see the card insert event to mount the card automatically, but if it is there at boot it sees it, and data transfers from it are faster than before.

LTS

This particular release of Ubuntu is more interesting than Ubuntu-average as it pertains to the subject of its viability as an ELD (Enterprise Linux Desktop). This is one of the Long Term Support versions of Ubuntu (the last LTS version being 6.06), with the desktop version of 8.04 being supported for the next three years (April 2011) and the server version for the next five years (April 2013). Given the amount of time a large company takes to get a new release of a desktop image ready, tested with all the corporate apps, and then pushed out to all the desktops, three years support is pretty much a requirement.

Ubuntu, knowing 8.04 was going to need to be supported for a while, would tend to focus more on functionality and security related issues than on the latest and greatest eye candy, or at least that is my assumption. This is the kind of assumption that I'll be looking to test.

While I read some things in the trades about 7.10 being unstable because of how much feature and glitz the Ubunites added, I have to say that I never saw that. 7.10, and its Mint 4.0 variant, have been dead reliable for me other than the few Evolution issues I have already documented in this blog.

GA

With the arrival of the GA version, I installed it not only on my personal Acer 5610, but also on my BMC laptop, a Dell D620, and did a fresh install over the Alpha 6 version running on one of the GX260's in the test system stack. For the D620 and GX260 installs I took some notes:

First off, I used the 64 bit installer on the D620 because its Core 2 Duo CPU supports that. My first ever 64 bit personal computer install. I have done 64 bit servers before, but for some reason, even with 64 bit capable hardware like the D620, I had never installed a 64 bit OS on a personal computer. The Acer with its Core Duo processor (not Core 2 Duo; surely one of the most ill-advised processor naming conventions on the planet today. Isn't Core 2 Duo a lot like saying Core Two Dos?) and the GX260 with its P4 received 32 bit versions.

Install

There are seven screens that appear after you click the 'Install' icon.

  1. For the first one, I picked "English". That seemed to make the most sense to me at the time.
  2. This makes one of the most annoying new features appear: the time zone chooser. The picture of the world zooms in when you try to pick a TZ (I was going for Chicago), and the picture kept sliding about trying to escape the mouse. There was an old program I used to have for MS Windows back in the 3.1 days that you could put on someone's computer and then sit back to watch them scream. It made all the desktop icons dance out of the way of the mouse so that you could never click on anything. The new TZ set feature reminds me of that program. It is easier to just use the pick list, which thankfully is still included.
  3. Screen 3 of 7 is what keyboard to use, and I have never once seen this program not know my keyboard type. I assume that there are issues here in other locales to make this screen worth stopping on.
  4. For 4 of 7, I picked the manual disk layout option. I tried the default option with Alpha 6 and everything was laid out in one partition. That was not what I wanted so for GA I went my usual way:
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        1216     9767488+  83  Linux
/dev/sda2            1217        1459     1951897+  82  Linux swap / Solaris
/dev/sda3            1460        2434     7831687+  83  Linux

        sda1 is '/' and sda3 is '/home'.

  5. 5 of 7: I gave the computer my name, the name of the default account I wanted, and the name the computer should have on the network.
  6. 6 of 7 was the recently added screen where it tries to find old userids from which to derive the settings. It found nothing. This was odd. It did not find anything on my Acer 5610 either, yet it has Vista as a dual boot, and this D620 has XP. Both had previously existing Mint installs too. This feature has never really done me any good to date. Someone must find it useful though.
  7. 7 of 7 verified all my settings. I pulled the trigger, and for about 8 minutes things were copied off the CD. It spent about 2 minutes configuring things, and then was ready to boot to Ubuntu 8.04 LTS GA.
I am not complaining (much) because the install is so dead easy, but the thing is that it could be even easier. Four maybe five screens. Combine a few things. Be smarter about the disk layout so I don't have to keep overriding it. Not saying it has to be the same as mine, but putting everything in one partition is not as brilliant as most of the OS is.
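To put the partition table from screen 4 into more familiar units: fdisk of this vintage reports its "Blocks" column in 1 KiB units, and a trailing "+" just flags a partition with an odd number of sectors. A small sketch of the conversion, assuming awk is installed; the function name `blocks_to_gib` is made up for the example:

```shell
#!/bin/sh
# blocks_to_gib BLOCKS
# Converts an fdisk "Blocks" value (1 KiB units, optional trailing "+")
# to GiB with two decimal places.
# (Hypothetical helper for illustration.)
blocks_to_gib() {
    blocks=${1%+}   # strip the trailing "+" if present
    awk -v b="$blocks" 'BEGIN { printf "%.2f\n", b / 1024 / 1024 }'
}

blocks_to_gib 9767488+   # sda1, '/'
blocks_to_gib 1951897+   # sda2, swap
blocks_to_gib 7831687+   # sda3, '/home'
```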

The 2.6.24-16 versioned kernel stopped to check out my hard drives on the way up, which added some boot time the first time.

There were no updates in the package repositories even four days after the GA date, which might be a first. Usually there is a last minute something or the other. Maybe being LTS they are more cautious about releasing things.

I proceeded to add the packages I need: things like Avahi, Sensors, HDDTemp, Macutils, HFS support, the debug packages for Evolution, Pidgin-SIP, WINE, and the 32 bit compatibility libraries (on the 64 bit installed D620 only).

Evolution was then configured, same as I always do, against the MS Exchange 2003 servers.

Why Evo: a quick review

I have stated some of these next things about using MS Exchange from Linux several times, but I cannot assume that everyone who might be reading this right now is familiar with everything I have ever written on this topic. It is too easy to get here via a direct Google transporter. If you are a pure Linux, or at least Open Standards based shop, you might be thinking to yourself "Using MS Exchange is just suboptimal: Why not use something else?". Reality is that 50% of the big shops out there use MS Exchange, so no matter: it is something that just has to be coped with.

To work as an Enterprise desktop here, I need to be able to get at my MS Exchange sourced Calendar and email from Linux. The only ways to do this that are viable *at the moment* are:

  • The Gnome Project's Evolution package, with the MS Exchange Connector (WebDAV)
  • Using a web browser to get email off the MS Exchange server's Webmail. I am on record as not really liking this option because the web client is very heavy without real benefit and could use a strong lesson from Gmail on how to do Webmail right.

I do not currently consider the KDE Office stuff as all that workable against MS Exchange, but I am waiting for KDE 4.1 to see if the rumored updates/improvements in this area are in fact there.

What's in a Name?

Evo is now at release 2.22, jumping from 2.12 in the last release. That is not as wide a jump as it sounds: they are just lining up the Evo release number with the Gnome release number. Nothing really new stands out in the user interface, although I have a feeling it has many subtle upgrades and tweaks and I just have not found them yet. Like those newspaper games in the comics section: "What is the difference between these two pictures?".

EVO Has Needs Too

Lots of them.

Evolution is a problematic beast. It is a large project, and none of the new tweaks appear to be in the code size reduction department. The only thing I noticed different while configuring the email client to use MS Exchange is that it appears to do a better job looking around for available GAL (Global Address List) servers. Every time I have configured it, it has presented a different one. Before, I always got the same wrong one. What algorithm it is using is still not obvious. I still have to override it to put in the one I want it to use: the one nearest to me in the network.

What I have noted in these early experiments with Evolution is that it is far more stable on the Dell D620 laptop than the Dell GX260 desktop. The laptop is far more powerful, with dual core 1.73 GHz processors (7984.3 BogoMIPS) and 2 GB of RAM. The single 2.0 GHz P4 (3989.49 BogoMIPS) and 512 MB of RAM of the GX260 just do not seem to be a good place for Evolution to live. It fails frequently there, and when it fails, I have to run my cleanup scripts before I restart it. Not a quality, Enterprise level experience. Sure, the GX260 is hardly what a current hardware shop would be using, but I expected Linux and its apps to work in 512 MB.

Quick poking with a diagnostic sharp stick, and I have the performance problem down as not enough RAM on the GX260: with Evo up and running on the D620, only 15% of the RAM, or 300 MB, is in use by programs. With Evo running on the GX260, 50% of the RAM, or 256 MB, is in use by programs.
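One way to get "RAM in use by programs" numbers like these from the command line is to take total memory and subtract free memory, buffers, and page cache, all of which the kernel reports in /proc/meminfo. This is a sketch of that arithmetic, not necessarily the tool that produced the figures above. The function name `prog_mem` is invented for the example:

```shell
#!/bin/sh
# prog_mem [MEMINFO_FILE]
# Prints "<MB used by programs> <percent of total RAM>", where
# "used by programs" = MemTotal - MemFree - Buffers - Cached.
# Reads /proc/meminfo by default (values there are in kB).
# (Hypothetical helper; the definition of "used" is an assumption.)
prog_mem() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "MemFree:"  { free = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached = $2 }
        END {
            used = total - free - buffers - cached
            printf "%d %d\n", used / 1024, used * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}

# Usage on a running Linux system:
#   prog_mem
```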

I expected the desktop system to be crisper, given its faster hard drive, but the extra RAM on the laptop appears to more than compensate for the lower RPM of the disk (4200 versus the desktop's 7200), at least as far as Evolution is concerned.

When Evolution fails (and so far this is only on the GX260, not the D620), it is always the MS Exchange connector that is failing. The main Evolution client stays up and running, but without access to the MS Exchange server, that is not very useful. I guess I really should say it is not useful unless you have other email protocols still up and running. If you have Evo set up with IMAP, POP and other email protocols, the failure of the MS Exchange backend would only affect that one mail store, leaving the others to process email to their hearts' content. I have in fact set up Evo from time to time to use both IMAP and WebDAV (i.e., Connector) against the same mail store, so that when Evolution Connector fails I can still read email until such a time as it is convenient to blow out of Evo, run my cleanup scripts, and restart Evo. That only works if your MS Exchange server is set up to run IMAP at the same time as the murky MAPI-plus-RPC protocols though.

Based on the fact that it is working fine on my D620 I infer that my Dell 745 would run Evolution under 8.04 without issue, but will wait for Mint 5.0 before I do anything OS re-configuring there.

IM trying

Another of the suboptimal areas of MS Infrastructure that I have to deal with is our current IM standard: MS Office "Communicator". Quotes because, like "Sharepoint", it only really works *at the moment* if you are running an MS sourced desktop. I.e., you can communicate or share only if you are of the MS Windows population of computer users. It's like starting off a collaboration project by telling a part of your contributing population "we don't want to hear from you, because you think differently".

The Pidgin project is working on getting a SIP client going that will inter-operate, but I have had no luck to date with it, and neither have several of my Linux using compadres.

The good news is that unless I am running MS Windows someplace, like a VM, I do not have to deal with getting IM's. IM is hot hot hot out there, but for me it maps to YAIV: Yet Another Interruption Vector. Unless I need it for real time problem diagnosis, I tend to stay out of it.

Coming Soon

Ubuntu is out and looking good (even if Evolution needs a pretty stout system to run well), but that is only the beginning of the spring season. Still up are Mint 5.0, which of course is Ubuntu 8.04 polished to a fine sheen, Fedora 9, and OpenSUSE 11. More on those as they go GA.



_____
Wednesday, April 30, 2008
Quite literally a stack. Of identical computer systems. Weekend hardware retirements net four of a kind systems for new testing of Enterprise Linux Desktop

There are some advantages to being in my job. Not only do I get to see all the cool new hardware, "play" with the systems of my youth (VAX 7000 anyone?), and see the same diversity of software as hardware, but I get to sort through the hardware discard pile. The discard pile has been pretty tall lately too, what with all the machines we are able to retire because of new technology like X86 virtualization. Sometimes a computer gem or two appears in that pile that gets hauled back to my office and then late one night after everyone has gone home, gets Linux installed on it.

He Started It!!

I moderately recently wrote a post about the possibility of using Mepis as an Enterprise Linux Desktop in an MS Windows infrastructure based shop. I thought I was being fairly clear about the ground rules of that particular evaluation, especially the "Needs to work with MS Exchange to work here" part. I said nothing about its suitability in places where one is lucky enough to *not* have to deal with undocumented and arcane MS Windows protocols. My thesis is that until Web 2.0 is able to abstract the end user away from the MS-created protocols or the special way MS creates non-standard versions of standards like Kerberos or WebDAV, a successful Enterprise Linux desktop will have to be able to deal with them directly.

"Off Label Mepis" was not universally popular with the Mepians of the world, especially over at MepisLovers. In the discussion of my post at MepisLovers one of the factors called into question about my testing method was that I had done it in a virtual machine. Every time I read a comment like that it is like going back thirty years to the early days of VM on the mainframe and frequently having OS/VS2, DOS/VSE, or MVS people tell me that VM was not a good place to test things. What is old is new again. Still, it is within the realm of the possible that things like timing issues of a VM [I.E. the way a virtual machine does not really know what is happening in real time, since all it sees are the times it is being dispatched] can affect a test. I have never had that happen on a test of Evolution against MS Exchange from a VM before, but anything is possible.

I repeated the test on real hardware, and my results did not vary on the key point that Evolution did not work against our MS Exchange server. That pretty much was what I expected, but it didn't take me that long to set it up to verify it so it was worth doing. I try not to ever dismiss a criticism if it might have any validity.

Ever since that comment I have been thinking it would be nice to have some standard hardware to be able to compare one version of Linux to another *at the same time*. Serial OS loads for a comparison are a pain in the stern. What if I want to go back and check a different thing or forgot to test something? Easy on a VM. A pain to reload and repatch on real hardware, even as fast as stuff like Ubuntu or Mint load these days.

I do have a small collection of old laptops: Compaq M300's. These are interesting to test with, especially for Linux on a laptop type things. But they are slow and have different amounts of memory. I talked about these units a while back, when I was first comparing Kubuntu and Ubuntu.

What came into my possession was four identical Dell GX260's. They had all sorts of advantages over the M300s:

  • Small Desktop form factor. The GX series came in three case sizes. These were the smallest. These computers are old enough there is no picture on the Dell website though.
  • Stackable: Little feet line up with dents in the case of the other unit. Four tall, it is almost a perfect cube shape.
  • Moderately fast for my normal level of test gear: 2 GHz Pentium 4, 512 MB RAM, 20 GB hard drives.

Also very nice was that my old production desktop system was a substantially similar DX340, so things I do with the 260's are mostly comparable with what I do on the 340 [same 2.0 GHz Pentium 4 CPU], other than that the 340 has 1.2 GB of RAM and an 80 GB hard drive.

Virtualization Strikes Again

It is worth noting that the four GX260's came into my Linux-loving arms *because* of the success in our R&D labs of virtualization. I talked about this a bit in "Virtually Greener". The particular lab these came out of has gone from just over 250 computers pre-virtualization to today's 120 computers: more than a 50% reduction in the lab. These four plus a couple of others are the only ones that were re-deployed in other missions. The rest have been sent to the great computer recycler in the sky. Or maybe New Jersey. Someplace.

These four desktop systems had been sitting side by side on a shelf in a 19" rack, their small form factor actually working pretty well in that regard. They had been running various levels of MS Windows Server for testing things. Now they are going to show up over at Linux Counter.

The Six

Add in my current production desktop, a Dell 745, and I have six different systems to run six different versions of Linux *at the same time*. The 745 is currently running Mint 4.0, and will go either to Mint 5.0 or Ubuntu 8.04 in the near future. I have been testing 8.04 on my personal laptops (IBM X30, Acer 5610) for a while now, and it is very impressive.

  • The 340 has been running PCLinuxOS 2007 since I last posted about it here. I donated money to that project to get access to the faster servers and more recent / additional packages and updates, so it is fully set up and tweaked out the way I like it.
  • 260 number one: Ubuntu 8.04 beta (LiveCD)
  • 260 two: Fedora 9 Alpha (LiveCD)
  • 260 three: OpenSUSE 11.0 Alpha (LiveCD)
  • 260 four: Mandriva 2008.1

The 260's and 340 are hooked to an Avocent switch, a Dell 1280x1024 17 inch LCD panel (172FP), a Sun USB keyboard, and a Dell USB mouse. I was able to get all but one of them running correctly at 1280x1024, but Mandriva and OpenSUSE had to be told to use that resolution, preferring 1024x768.

All were installed on the entire hard drive. All use GRUB. Poor LILO. Seems its fortunes have passed.

The one 'running correctly' holdout is Fedora. It works fine off the LiveCD, but gave several problems on the install. One is that there can be no swap space defined while installing it. It is a documented problem that Fedora knows about, so I assume the next release or two will fix it. The other is that once installed it will not boot at all. Just won't. When I want to look at Fedora 9, I just run it on the LiveCD for now.

Only Mandriva is an official release, so I will not make any judgments here about the relative anything about these OS's, other than to say Ubuntu as a Beta is farther down the road to GA readiness, and was dead easy to install, but the updates are still coming fast and furious in 'Update Manager', so it clearly is not done quite yet. It is less than a week away from GA as I write this in mid-April. Fedora 9 is set for Mid-May, and OpenSUSE 11 mid June.

I wanted to get this config and what I am planning on doing with them set up here in this post, so that I can refer to this test set up as these releases come to GA'ness over the next few months, and I can look at them on a more level playing field. As always, I will be trying to figure out the big question: Which of these desktops work as Linux Enterprise desktop OS's (whether they were designed to or not).

Finally, you might have noticed Mepis is not among the test stuff. If I had one more 260 computer, it might have been. But probably not. Mepis, according to the folks in the MepisLovers forum, is waiting for KDE 4 to add all the bits and pieces required to support MS Exchange, eschewing Gnome's stuff that is already there. I do not know if that is accurate or just the opinion of the poster, but until I see a hint someplace that Mepis or KDE 4 has made some moves such that they interact better with the MS Infrastructure I have to deal with here at the office, I will probably not spend any more time on it.

And now for something completely different..

I just wanted to insert a quick note here, in case anyone was wondering what has happened to the rate I have been posting recently. The answer is:

  1. BladeLogic
  2. End of quarter, end of fiscal year
  3. Reviews

I have been involved in the activities around bringing the latest member of the BMC family into the fold. The BladeLogic acquisition has been hugely exciting, but it has kept me pretty busy.

Then, we not only closed a quarter but closed a fiscal year, and at a time like that I take off my R&D Support hat, put on my Production IT hat, and help where I can.

Finally, this is review time for my team, and writing reviews takes me a great deal of time and effort...writing time that I don't spend writing here.

That's my story, and I'm sticking to it.



_____
Monday, April 21, 2008
Wrap up of the migration from the Tru64 TruCluster mission critical NAS server to the CentOS 5 Linux NAS server

This post is to do a wrap-up of the topic I have been posting about on and off here for a while about the new mission critical NAS server cluster based off CentOS5. Previous posts in this series, starting August 29th of 2007:

  1. Tru64 NAS Server Replacement Project
  2. NFS, GFS, nodirplus / readdirplus, and Tru64 updates
  3. CentOS 5 NAS Cluster
  4. CentOS 5 HA Cluster Speeds and Feeds
  5. Kernel Hackage
  6. One Week Later 
  7. Bug 431253
  8. GFS or NFSD?

We are not quite done with the migration of all the file systems off of the Tru64 TruCluster. Its original ~4.5 Terabytes have been slowly absorbed by the new Linux cluster. We have been very cautious. We wanted to make sure that we introduced change in a controlled manner, in case we had any more of those HP-UX client type issues lurking in the woodwork. Dan Goetzman, chief NAS abuser, did find another one, and only this week too. More on that below.

Semantics

We also have the fact that we are still running our modified version of the CentOS 5 OS. Neither Red Hat nor CentOS has closed the issue we opened (see post "Bug 431253" above), and I think that is a smoking gun waiting to shoot some folks in the toes. Here is why I think that: the file open / close semantics used to "live" inside the code provided by each file system. Ext3 file open / close code could therefore be slightly (or even very) different from that of GFS or XFS or some other file system, since each file system was written at different times and places by different people for different reasons, and in some cases, like XFS or JFS, for different operating systems than Linux. XFS comes to us from SGI, therefore Irix, and JFS is from IBM / AIX.

Recent kernels have provided the file access semantics internally. An installable file system is not required to use them, but they are available to all. The file system maintainers have started to move from the code inside each of the various file system types to routines in the kernel. It makes sense: Why maintain this common code in all these different places?

GFS went 'there' (to using the kernel file access routines) first, and it is our belief that this is where the HP-UX client issue was introduced. The kernel routines (written by a subset of people who more than likely did not write all the internal routines contained in all the different file systems) don't work 100% the same way as those buried in the file system code. This might be a bit of understatement.

Since Dan's reading on the subject leads him to believe that the other FS types are going to migrate to letting the kernel handle the semantics, that is going to put everyone in the same boat. The broken HP NAS client boat. So the metaphor is not too mixed, the smoking gun is then used to shoot a hole in the bottom of the boat, passing through one's toes and perhaps some aquatic life forms.

We don't have to migrate to a new version of the CentOS OS any time soon though. CentOS is working fine. Dan's file semantics kernel patch is working and has long runtime on it, so we have confidence we can move forward. We do have some motivation to move forward if we can: The TruCluster is off both hardware and software support.

Ouroboros Tru64 TruCluster

The Tru64 TruCluster hardware now has so much excess capacity, since its formerly brimming file systems have been "drained" over to the CentOS cluster, that any hardware failure could easily be dealt with by self-cannibalization. Ehww. Sounds ugly when I type it that way. True though: we have two ES40 server nodes, each with four GB of RAM and four CPUs. There are empty RAID sets of all disk capacities (36GB, 72GB, 144GB). The fiber channel cards, Brocade switches, memory channel, etc. are all twinned out for the TruCluster. If something fails, it fails over to the surviving bits, and in the seven years we have had this gear the only failures we have had have been either of disks or failures of imagination. In failure mode, we can choose to either ignore it for now, or use the redundant capacity, raid other Alpha based gear for parts (I still have VMS servers running on Alpha gear which in a pinch might give up their lives), or worst case do a time and material call to HPQ. More than likely, the TruCluster will just eat itself though, reducing in size and capacity as it goes. That takes care of the hardware.

The software is a different story. It cannot eat itself ... hopefully. It never has anyway. It is stable and we have not patched it in literally years. Before that the patch rate was pretty low, and consisted mostly of point patches for specific problems. Stability of the OS / NAS bits is good news and bad news. Good that it is stable. Bad when things like NFS V4 are starting to creep into the shop, which the TruCluster just will not deal with other than by forcing the client to downshift to V3 or V2.

Easy Does It

This slow migration of critical file systems meant Dan did not have to spend one concentrated, focused stretch of time on data migration. He could go slow, do a good job, and think about each move in depth. Quality still counts, especially when you are moving your most critical bits and bytes!

As I write this, I just looked at the status of the move on the internal Wiki: the vast majority of the file systems that have for literally years lived on the TruCluster are now over on the CentOS 5 cluster. We have been running builds and packaging against them for months. 

The uptime of the cluster as a whole has been satisfactory. We have had no customer facing service outages at all, and even if there have been rolling upgrades or individual node outages, they have been inside the design parameters. The point of doing this as a cluster was to be able to offline a node, work on it, then have it rejoin the cluster, and Dan has taken advantage of that to upgrade the ILO cards and do various other service related things. I looked at one of the three nodes a moment ago, and it has over sixty days of uptime. That does not matter though: the customer facing service uptime has been pretty much since we put it into service last December.

The main thing, and this is the key point, is that our customer never knew we did anything to the cluster. That is exactly what we used to do with the TruCluster, even though the underlying OS, clustering technology, hardware, and therefore technical procedures are completely different.

Sun Client Bug

Since I last posted here, we have discovered one more unruly client. This time it is Solaris, and the fix is a patch to that OS, not something to the server. Dan as usual has been all over the problem. Here is what he found. First, a note in a web forum from Casper Dik at Sun:

Casper H.S. *** <Casper.***@xxxxxxx> writes:

"Ross" <nospam@xxxxxxxx> writes:
Thank you, Casper!
Here is the output:
bash-2.05$ cd testdir
bash-2.05$ ls -f
.. testfile .
Ah, yes.
The chmod code is broken and can't deal with "." and ".." not
being the first two entries of a directory.

Bug id: 4171523 which was filed eons ago and not fixed (being a P4 it
dropped of the radar screen, it seems)

I've upped the priority, pinged the responsible engineer and
added that chown suffers from the same issue.

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth

Casper appears to be a pretty valid authority on such things, according to some research someone on my team did, turning up this:

  • http://blogs.sun.com/casper/
  • http://en.wikipedia.org/wiki/Casper_Dik
  • http://www.sun.com/cgi-bin/sun/bigadmin/xpertApp.cgi?session=16_prm&xpert=cdik&action=bio

Dan used Casper's information to find this:

"There is a Solaris BugID for this exact problem, they seem to know about it.
 It appears to be only fixed for Solaris 9 and 10;

125499-01 - For Solaris 10 on sparc
123394-01 - For Solaris 9 on sparc

I [Dan] applied the patch to [a Sun system we use a lot], and all is well.
Fix is going to be on the Solaris side for this one...

The patch fixed chmod/chown as that is what it patched. It looks like chgrp is still broken, same exact defect.
So far, I cannot find where Sun has fixed chgrp for the same problem"

This is not a show stopper as near as we can tell, at least for us. Your shop, and mileage of course will vary. Peeling back the covers a bit, Dan found the underlying bits to this that were causing the problem:

The GFS filesystem getdents() call returns the directory entries in no particular order. These get returned back, via NFS, to the Solaris client where the user space utils chmod/chown/chgrp EXPECT items #1 and #2 to be "." and "..". Depending on the returned list order, a loop can develop, and does in our example, until the ch* command has exhausted its user space open file limit. I confirmed that our LCFS server is NOT returning the list as the Solaris client expects. Note that as far as I know all other NFS clients have no problem with the list returned. Just SOLARIS!

Great! A Solaris bug that seems to be in most/all clients (I have tested [a solaris client] and [another solaris client]) triggered by an abnormal, but not illegal, return by the NFS server.
... I did test with XFS as the backing store filesystem, no problem. So it must be in the GFS getdents() quirk.
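
To make the failure mode Dan describes concrete, here is a minimal sketch in Python. It is not the Solaris chmod source, and the function names are hypothetical. It only illustrates the class of bug: a tool that skips "." and ".." by position rather than by name. The second listing uses the entry order shown by `ls -f` in the forum post quoted above.

```python
# Hypothetical sketch of the bug class, not the actual Solaris code.
DOT_ENTRIES = (".", "..")


def children_by_position(entries):
    """Fragile: assumes "." and ".." are always the first two entries."""
    return entries[2:]


def children_by_name(entries):
    """Robust: drops "." and ".." by name, wherever they appear."""
    return [name for name in entries if name not in DOT_ENTRIES]


# The order a tool written this way is counting on.
conventional = [".", "..", "testfile"]
# The order reported by `ls -f` in the forum post.
gfs_order = ["..", "testfile", "."]

for label, listing in (("conventional", conventional), ("gfs", gfs_order)):
    print(label, children_by_position(listing), children_by_name(listing))
```

With the second ordering, the positional version hands back "." as if it were a child entry. A recursive tool that then descends into "." would keep reopening the same directory, which is consistent with the loop and the exhausted open file limit described above.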

Relative Costs, Relative Features

I have noted here before why we went to the complexity and expense of the TruCluster, but I will assume you have not read everything in this blog over the years about that subject. It goes all the way back to the beginning in 2005, in posts like "Linux and NAS", where I noted this:

"We take a 2 tiered approach to NAS storage for R&D Support. In our first tier is the 5 9’s type storage. The stuff that just can’t go down. The bits and pieces that are used on our “assembly line” to build and manufacturer our own products. The kind of storage that, if it were down would idle hundreds of people around the world in R&D and endanger our time to market. And we know with a great deal of pain just how critical this storage is, because we used to use a storage appliance there, and it could not survive our network. It crashed all the time, and we paid for it dearly."

We paid pretty dearly for the TruCluster too: round numbers about 140k per Terabyte. Sure, a single SATA disk has a Terabyte now, and for a bit less money per TB. For fun, I divided the cost of a one TB SATA disk by the TruCluster's cost per TB, and the spreadsheet said that the disk basically cost nothing, as a percentage. Tweaking the precision a bit higher in OpenOffice Calc, I get 0.00277. Pretty near free.
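
As a rough check on that arithmetic, here is a small Python sketch. It assumes that 0.00277 is the disk price divided by the TruCluster cost per TB, expressed as a fraction rather than a percentage. That reading is an assumption, and the disk price it produces is only what the two quoted numbers imply, not a figure from the spreadsheet.

```python
# Both inputs are the round numbers quoted in the post.
TRUCLUSTER_COST_PER_TB = 140_000   # "about 140k per Terabyte"
DISK_TO_CLUSTER_RATIO = 0.00277    # assumed to be a fraction, not a percent

# Disk price implied by the two figures above (an inference, not a quote).
implied_disk_price = TRUCLUSTER_COST_PER_TB * DISK_TO_CLUSTER_RATIO
ratio_as_percent = DISK_TO_CLUSTER_RATIO * 100

print(f"implied price of a 1 TB disk: {implied_disk_price:.2f}")
print(f"disk cost as a share of TruCluster cost per TB: {ratio_as_percent:.3f}%")
```

Under that assumption the disk comes to a bit under 400, or a little over a quarter of one percent of what a Terabyte cost on the TruCluster.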

I will not say that the CentOS 5 based system is as good as our TruCluster is/was. It is both better and worse, and depends on how you look at it. How you define "better". That it achieves high customer facing uptime was a requirement. That it is as fast or faster (and it is faster at some things, such as CIFS) was also a requirement. It would not even be worth pursuing without those very minimal goals. It is less expensive. On the down side, our little Linux machine is not as HA, since nothing invented on this planet yet today can match TruCluster on that score. Sigh. <tongue-in-cheek> I guess that is why it had to die. </tongue-in-cheek>

There are things the new server did not have to be. One obvious t