Thursday, April 8, 2010

How GroupWise issues prove that GWAVA's Reload Rocks!

While I was gone, "something" went wrong that caused all of the GroupWise servers to be manually rebooted.  Nope, I still haven't gotten details on the entire scenario.  What I can tell you is that nobody checked after the reboot that the NCP mount to our domain server had actually come back.  This left our post offices and domains without appropriate connectivity.  Oh, everything was running, but not as well as it should have been, and now I have issues that will require a rebuild of a couple of post offices.  Such are the joys of working with those who don't read the wiki that was set up for times such as these.

Somehow in all this mess, my VP lost critical emails and needed to get a restore done.  I learned about this while sitting in the plane at the gate, waiting for everyone else to board, on my return home from BrainShare.  Thankfully my VP is a patient man (those who haven't worked for banking or hospital VPs should consider themselves lucky) and was willing to wait until I got back to the office.

I arrived back at work sick (courtesy of the man in front of me on the plane) and having trouble thinking straight.  We worked out which dates were needed and started restoring from tape, which under normal circumstances would have taken just a few hours.  Whap!  Out of space on the Reload server.  My co-worker insisted that Reload stores the data "just like" a POA (which I know isn't exactly correct) and told me to restore to another location rather than grant more space.  Guess what?  Wrong location, and we ran out of room again.  I'm still not 100%, so I restore it all to a different spot, create a temporary restore area and - hooray! - we can see the messages we need.  BUT it can't find the BLOB files, because Reload is written much better than that: no, the data isn't stored identically to a post office, because that would take too much room.  (No, this isn't a dig, just trying to explain in a shorthand way.)

So a day of finagling later, and following my co-worker's confident instructions that tape would be faster than copying the already untarred files over (it causes me less grief to just go with the flow than to correct the misconceptions), I finally have the TAR file untarred into the RIGHT location and my VP can get to his data.  YAY!!!  Dancing all around!  Best part is, we have a common path name for all restores, whether it's from tape or current backups and regardless of the post office name.  And it all works!  My VP was a very patient guinea pig, and it turns out we only lost 1 email, which he was able to work around not having.

Had we not had Reload, the restore process would have been more painful and much longer, would not have been completed in the time frame he needed, and, since we were using two different tape backup dates, would have required far more work.  Still not convinced Reload rocks?  That's okay, here's the technical reasoning (sorry you had to sit through my whining here... okay, maybe not so sorry).

Reload stores the BLOB files in a way that they don't have to be copied multiple times like they would be on a tape backup.  It can be configured to create a TAR file of an entire week's data to be copied to tape or another off-line storage system.  The really cool part about the TAR files?  You can extract just a single day's worth of data.  If it's the first thing you're extracting from the TAR file, you'll need to grab the GRE_DATA folder as well, but once that's done, you can extract just a single date at a time if you'd like.  Compare that to having to restore everything for a single date if you're using the default restore method for GroupWise.  Still confused?  How about a step-by-step comparison?

Scenario: UserA loses data at some point during the week of March 1st (they think), but also lost something that they knew they had sometime around mid-January.  It's now April 1st, so neither dataset is on your live systems, as you only store 2 weeks' worth.  And you have to use the standardized path name that your help desk knows; no custom restore areas allowed.  What do you do?

Using GWAVA Reload:
  1. Restore the TAR file from March 7th (assumption is you create TAR files on Sundays) to a location on the Reload server.
  2. Un-tar the contents to the default location on the Reload server (e.g. /reload/poa1).
    • If time is of the essence, you can speed this up by creating a folder in the default location that matches the TAR file name (e.g. /reload/poa1/1-week1-2010), then un-tarring just the gre_data folder and the first date you wish to try.  You can then un-tar the other dates as needed.
  3. In Reload Administration, select Access Backups and the correct POA profile, and you'll see the new date listed.
  4. Select a date to load and you're good to go.
  5. If that date isn't right, simply use Reload Administration to unload the current set and select a different date in the TAR set to be loaded.
  6. Repeat with new dates or TAR files as necessary.
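
For the "time is of the essence" shortcut in step 2, here is a minimal sketch of pulling only the gre_data folder plus a single date out of a weekly TAR file.  I'm assuming the folder names show up as path components inside the archive; the date folder name, the TAR file path and the helper names below are all made up for illustration, so check them against your own archive before trusting any of it.

```python
import tarfile
from pathlib import Path, PurePosixPath


def wanted(member_name, folders):
    """True if any path component of member_name matches one of folders (case-insensitive)."""
    parts = [p.lower() for p in PurePosixPath(member_name).parts]
    return any(f.lower() in parts for f in folders)


def extract_folders(tar_path, dest, folders):
    """Extract only the members that are, or sit under, the named folders.

    Returns the list of archive member names that were extracted.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        members = [m for m in tar.getmembers() if wanted(m.name, folders)]
        tar.extractall(path=dest, members=members)
    return [m.name for m in members]


# Hypothetical usage: the TAR path and the date folder name are assumptions.
# extract_folders("/restore/1-week1-2010.tar",
#                 "/reload/poa1/1-week1-2010",
#                 ["gre_data", "03-01-2010"])
```

Once gre_data is in place, later calls only need the next date folder you want to try, which is the whole point of not un-tarring the full week up front.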


Using the GroupWise "default" method for restores:
  1. Pick a date that "may" hold the necessary data; best bet is to probably start with March 1st
  2. Restore the entire post office from tape to a location either on the live post office or another server (it works best if it's on the same server that hosts the post office, even if it's another volume)
  3. Create a Restore Area using ConsoleOne; if you're stuck using this method, best bet is to have some default ones already set up
  4. Add the user to the Membership list for the correct Restore Area
  5. Grant the POA rights to that location
  6. Provide the user with the correct path to use
  7. Have the user check to see if the data needed is there; if not, go back to step one, pick another date, and repeat until data is found

The number of steps doesn't seem too different, but when you consider that you only have to perform 1 tape restore to provide 7 days' worth of data for the Reload option, versus 7 tape restores for the default GroupWise option, it's a significant time savings.  And for my post office sizes, the TAR file actually restores faster than an entire post office.  Plus it doesn't impact the live GroupWise server.

And that is why Reload Rocks.
