• high availability (HA): the system is guaranteed to meet a high uptime percentage (e.g. 99.9%)
    • example implementation: keeping standby backup servers
    • disruption is allowed (e.g. it’s okay if users have to log in again) as long as the uptime target is met
  • fault tolerance (FT): HA plus the ability of the system to keep working after encountering issues or failures
    • stricter than high availability: disruption is NOT allowed (e.g. an anesthesia machine)
    • example implementation: the same session data is replicated redundantly across identical systems, so that when one system fails, another can take over immediately without causing disruption
  • disaster recovery (DR): procedures and methodologies in place to handle major disasters (what to do when HA and FT have both failed)
    • pre-planning
      • e.g. separate standby premises, separate archived off-site backups, making sure people can always access login credentials during an emergency
    • periodic disaster recovery testing is needed to verify that the plan still works
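
To make the difference between high availability and fault tolerance concrete, here is a minimal Python sketch. It is a toy in-memory model, not a real clustering implementation: every name in it (allowed_downtime_minutes, Server, HighAvailabilityCluster, FaultTolerantCluster) is hypothetical and invented for illustration. It assumes an active/standby pair for the high-availability case and full session replication for the fault-tolerant case.

```python
# Toy model only: all names below are hypothetical, chosen for illustration.
from typing import Dict, List, Optional


def allowed_downtime_minutes(uptime_percent: float, days: int = 365) -> float:
    """Downtime budget implied by an uptime target over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


class Server:
    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.sessions: Dict[str, str] = {}  # session id -> user name


class HighAvailabilityCluster:
    """Active/standby pair: session data lives only on the active server."""

    def __init__(self) -> None:
        self.active = Server("primary")
        self.standby = Server("standby")

    def login(self, session_id: str, user: str) -> None:
        self.active.sessions[session_id] = user

    def fail_active(self) -> None:
        # The standby is promoted, so the service stays up,
        # but sessions held by the failed server are lost.
        self.active.alive = False
        self.active, self.standby = self.standby, self.active

    def lookup(self, session_id: str) -> Optional[str]:
        return self.active.sessions.get(session_id)


class FaultTolerantCluster:
    """Every session write is copied to all live replicas."""

    def __init__(self, replica_count: int = 2) -> None:
        self.replicas: List[Server] = [
            Server(f"replica-{i}") for i in range(replica_count)
        ]

    def login(self, session_id: str, user: str) -> None:
        for replica in self.replicas:
            if replica.alive:
                replica.sessions[session_id] = user

    def fail(self, index: int) -> None:
        self.replicas[index].alive = False

    def lookup(self, session_id: str) -> Optional[str]:
        # Any surviving replica can answer, so the user sees no disruption.
        for replica in self.replicas:
            if replica.alive:
                return replica.sessions.get(session_id)
        # No replica left: this is where disaster recovery takes over.
        raise RuntimeError("all replicas failed")


if __name__ == "__main__":
    print(allowed_downtime_minutes(99.9))  # downtime budget for one year

    ha = HighAvailabilityCluster()
    ha.login("s1", "alice")
    ha.fail_active()
    print(ha.lookup("s1"))  # None: service is up, but alice must log in again

    ft = FaultTolerantCluster()
    ft.login("s1", "alice")
    ft.fail(0)
    print(ft.lookup("s1"))  # alice: the session survived the failure
```

Under these assumptions, a 99.9% target over 365 days leaves roughly 525.6 minutes (about 8.76 hours) of downtime per year. The two clusters differ only in where session data is stored, which is what decides whether a failure is visible to the user.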