Monday, June 14, 2010

New GroupWise Hiccups

For a while now, I've been warning the powers that be that our eDirectory still has issues that can only be fixed by creating a new tree.  We had the perfect opportunity when we migrated to Linux servers and all-new drive mappings, but I was overruled.  The chant of "but we've had this tree since we went to 4.0" (a move that was badly botched, BTW) was consistent, and no one would listen to me.  Not a big surprise in a place where the motto seems to be "but we've always done it this way".

Because of this, I'm never surprised when a newly created account doesn't get its GroupWise user database created properly.  It took me a while to figure out why I was getting C05D errors sent to my mailbox with message attachments that I couldn't read.  I had set my POA logging to verbose and was seeing C05D messages related to another user ID.  I dug around in the logs for the C05D information and finally realized that it was related to an error I was seeing in my mailbox statistics log, and put the two together.  That was really good, as the longtime user of the other account that was being "blamed" would have had a fit if I had tried a structural rebuild or re-create database on his account.

I found errors in my weekly GWCheck process logs, namely the one that captures mailbox statistics.  The error in the statistics log looks like this (with the letters fid standing in for the particular user's FID):

          Checking user = aaa1 (fid) 1214/1448 - 0 bytes
                   Database not found - /mail/postofc/ofuser/userfid.db
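
Spotting these by eye in a long statistics log gets old quickly.  Here is a minimal sketch of how the search could be scripted, assuming the log always puts the "Database not found" line after the matching "Checking user" line, as in the excerpt above.  The regular expressions and the function name find_missing_databases are my own guesses built from that one excerpt, not a documented GWCheck log format.

```python
import re

# Patterns are inferred from the two sample lines above; they are an
# assumption, not a documented GWCheck log format.
USER_RE = re.compile(r"Checking user = (\S+) \((\w+)\)")
MISSING_RE = re.compile(r"Database not found - (\S+)")


def find_missing_databases(lines):
    """Return (user_id, fid, db_path) for each user whose database was not found."""
    missing = []
    current = None  # (user_id, fid) from the most recent "Checking user" line
    for line in lines:
        user = USER_RE.search(line)
        if user:
            current = (user.group(1), user.group(2))
            continue
        hit = MISSING_RE.search(line)
        if hit and current:
            missing.append((current[0], current[1], hit.group(1)))
    return missing


if __name__ == "__main__":
    sample = [
        "          Checking user = aaa1 (fid) 1214/1448 - 0 bytes",
        "                   Database not found - /mail/postofc/ofuser/userfid.db",
    ]
    for user_id, fid, path in find_missing_databases(sample):
        print(f"{user_id} ({fid}): missing {path}")
```

Feeding it the lines of the weekly statistics log would give a short list of accounts to look at before anyone starts complaining.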


I usually "fix" this issue by moving the user to a different PO and moving it back.  Not this time.  Oh, it worked great moving it to my "all fubar accounts" post office, but moving it back failed miserably.  In fact, the move couldn't finish because the POA couldn't create the new database for whatever reason.  Nope, no errors in the log (which was still set to verbose), we're not out of space, and the domain and agents have full rights to the drive space.
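
The space and rights checks can be roughly scripted too.  This is only a sketch under assumptions: check_db_directory and the 1 GB threshold are made up for illustration, and os.access only reports on the account running the script, so it would have to be run as whatever user the agents run as to mean anything.

```python
import os
import shutil


def check_db_directory(path, min_free_bytes=1024 ** 3):
    """Report whether a directory exists, is writable, and has free space.

    min_free_bytes is an arbitrary illustrative threshold (1 GB by default).
    """
    if not os.path.isdir(path):
        return {"exists": False, "writable": False,
                "free_bytes": 0, "enough_space": False}
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return {
        "exists": True,
        # Creating a file needs write and search permission on the directory.
        "writable": os.access(path, os.W_OK | os.X_OK),
        "free_bytes": usage.free,
        "enough_space": usage.free >= min_free_bytes,
    }


if __name__ == "__main__":
    # Directory taken from the log excerpt earlier in this post.
    print(check_db_directory("/mail/postofc/ofuser"))
```

In my case all of this would have come back clean, which is exactly what made the failure so annoying.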

In a vain attempt to resolve the problem, I copied the user database over to the correct server.  This allowed the move to complete, but produced an error any time a GWCheck process was run.  Not unexpected, since I had copied it over from the second PO to the preferred PO.  The error is an error 44 with the following message:

           Error 44 - Database userfid.db is invalid due to security breach!
             - Verification HOST_ID is "gwpo2", should be "gwpo1"
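
If more than one database has been copied between post offices, a small script can list every mismatch in a GWCheck log.  This is a sketch that assumes the two-line layout shown above repeats for each affected database; the patterns and the function name find_host_mismatches are mine, derived only from that excerpt.

```python
import re

# Patterns inferred from the error excerpt above; the exact log wording
# elsewhere in GWCheck output is an assumption.
ERROR_RE = re.compile(r"Error 44 - Database (\S+) is invalid")
HOST_RE = re.compile(r'Verification HOST_ID is "([^"]+)", should be "([^"]+)"')


def find_host_mismatches(lines):
    """Return (database, actual_host_id, expected_host_id) for each Error 44."""
    results = []
    database = None  # database named by the most recent Error 44 line
    for line in lines:
        err = ERROR_RE.search(line)
        if err:
            database = err.group(1)
            continue
        host = HOST_RE.search(line)
        if host and database:
            results.append((database, host.group(1), host.group(2)))
            database = None  # each Error 44 gets at most one HOST_ID line
    return results


if __name__ == "__main__":
    sample = [
        "           Error 44 - Database userfid.db is invalid due to security breach!",
        '             - Verification HOST_ID is "gwpo2", should be "gwpo1"',
    ]
    for db, actual, expected in find_host_mismatches(sample):
        print(f"{db}: stamped for {actual}, expected {expected}")
```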

Novell's fix for this is to restore the database from backup, but since there never was one on "PO1" to begin with, there was nothing to restore.

I ended up "fixing" it by first checking that the account was inactive (meaning it had never been logged in to), then deleting just the GroupWise account and re-adding the user to the correct PO in the GroupWise tab in ConsoleOne.  I didn't need to adjust the FID as there was nothing to recover, so I left the account with a new FID and ran a GWCheck just to verify I no longer received any errors.  All was fine and dandy.

Now I just have to wait and see what other bizarre things will occur with this account. DSRepairs show no errors on the specific object and our eDirectory Health Check shows our tree to be healthy - for the most part.  There are things I cannot fix without creating a new tree, so I'll just keep patching with bandaids until it dies in some horrible, complicated fashion.
