[Bioclusters] High Availability Clustering

Joseph Landman bioclusters@bioinformatics.org
22 Jul 2003 15:22:01 -0400


Hi John:

  You have to look at what services your master node is providing, and
decide on your failover plan.  Specifically, you need to consider how you
want to do heartbeat detection (usually over a serial cable or other
physical connection).  You need to look at file system issues.  You
might need to invest in specific file system gear (dual-ported FC disks,
redundant NAS units, etc.).  You would also need to look carefully at your
scheduler.  PBS cannot handle HA now, and there are good reasons to look
at other schedulers.  SGE may be able to do HA, and LSF can do HA.
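
  To make the heartbeat idea concrete, here is a minimal sketch in Python
of the decision logic a standby master might run.  It is an illustration
under stated assumptions, not how any particular HA package does it: the
class name HeartbeatMonitor, the one second interval, and the three-beat
threshold are all hypothetical, and the transport (serial line, dedicated
NIC, etc.) is left out entirely.

```python
import time


class HeartbeatMonitor:
    """Hypothetical helper: tracks heartbeats from a peer master node."""

    def __init__(self, interval=1.0, max_missed=3, clock=time.monotonic):
        self.interval = interval        # seconds between expected heartbeats (assumed)
        self.max_missed = max_missed    # missed beats tolerated before failover (assumed)
        self.clock = clock              # injectable clock so the logic can be tested
        self.last_seen = clock()

    def beat(self):
        """Record that a heartbeat just arrived from the peer."""
        self.last_seen = self.clock()

    def missed(self):
        """Whole heartbeat intervals elapsed since the last beat."""
        return int((self.clock() - self.last_seen) // self.interval)

    def peer_is_dead(self):
        """True once the peer has missed max_missed consecutive beats."""
        return self.missed() >= self.max_missed
```

  The standby would call beat() each time a heartbeat arrives and poll
peer_is_dead() to decide when to take over the services.  This sketch
does nothing about both nodes believing the other is dead at the same
time, which a real setup has to handle.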

  Have a look at http://www.linux-ha.org/.  Look at Mon (for providing
basic monitoring and triggering).  Look at
http://www.linuxvirtualserver.org/ and see if you could use that for
some of your services.  It depends strongly upon the services you need
the head node to provide.

  You should look at GFS if you want the file system to be Linux based
rather than appliance based.

Joe

On Tue, 2003-07-22 at 14:58, Osborne, John wrote:
> Hello,
> 
> I'm the unofficial admin for a 20 node (40 CPU) linux cluster here at the
> CDC and I'm looking for some advice.  Our setup here relies upon a *single*
> master node which acts as a gateway to the internal cluster network.  If
> something were to happen to the master node, we'd be in serious trouble if
> we are aiming for 100% uptime.  So far we aren't that serious about 100%
> uptime (although we've had it for this master node thus far) but as the
> popularity of the cluster grows it is becoming more important.  I am
> wondering what is the best way to ensure failover for a master node in a
> cluster.  Right now I just write out a master node image to network storage
> every night and if something goes wrong, the cluster is effectively down and
> it could take hours to get it fixed.
> 
> Is it possible to have 2 master nodes with a single virtual IP address?  How
> are other people solving this problem?
> 
>  -John
> 
-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman@scalableinformatics.com
  web: http://scalableinformatics.com
phone: +1 734 612 4615