Bright Cluster Manager can sample and monitor metrics from supported GPU cards and GPU Computing Systems, such as the C2050 and the rack-mounted S2050. Examples of supported metrics include GPU temperatures, GPU exclusivity modes, GPU fan speeds, system fan speeds, PSU voltages and currents, system LED states, and GPU ECC memory statistics. Both the frequency of metric sampling and the consolidation of metrics data over time are fully configurable. Metrics data is stored in Bright Cluster Manager's central SQL database and can be visualized in value/time graphs, as well as in Bright Cluster Manager's unique Rackview.
Furthermore, Bright Cluster Manager allows for alerts and actions to be triggered automatically when GPU metric thresholds are exceeded. Such rules are completely configurable to suit your requirements, and any built-in cluster management command, Linux command, or shell script can be used as an action. For example, if you would like to automatically receive an email and shut down a node when its GPU temperature exceeds a set value, this can easily be configured in Bright Cluster Manager.
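The threshold-and-action idea described above can be illustrated with a short, generic Python sketch. It is not Bright Cluster Manager's configuration syntax or API: the names ThresholdRule, notify_admin, shut_down and evaluate, the metric key gpu_temperature, and the 90.0 limit are all hypothetical, and the actions only return a description of what a real rule would do.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ThresholdRule:
    """Hypothetical rule: run every action when a metric exceeds a limit."""
    metric: str
    limit: float
    actions: List[Callable[[str, float], str]]


def notify_admin(node: str, value: float) -> str:
    # Stand-in for sending an email; only describes the action.
    return f"email: {node} reported {value}"


def shut_down(node: str, value: float) -> str:
    # Stand-in for a shutdown command; only describes the action.
    return f"shutdown: {node}"


def evaluate(rule: ThresholdRule, node: str, sample: Dict[str, float]) -> List[str]:
    """Return the results of the rule's actions if the threshold is exceeded."""
    value = sample.get(rule.metric)
    if value is None or value <= rule.limit:
        return []  # metric missing or within limits: nothing is triggered
    return [action(node, value) for action in rule.actions]


# Assumed example values: a 90.0 degree limit and one sampled reading.
rule = ThresholdRule("gpu_temperature", 90.0, [notify_admin, shut_down])
print(evaluate(rule, "node001", {"gpu_temperature": 95.0}))
```

In this sketch the rule is plain data and the actions are ordinary callables, which mirrors the idea that any command or script can serve as an action once a threshold is crossed.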
"There are now more than a 1000 NVIDIA GPU-based clusters around the world," said Andy Keane, General Manager of the Tesla business at NVIDIA. "Bright Computing's cluster management software fills a critical need for datacenter managers to reliably monitor and manage the status of their GPU-enabled clusters."
"Bright Cluster Manager's unique GPU management and monitoring capabilities is rapidly making it the cluster management solution of choice for GPU clusters", says Dr Matthijs van Leeuwen, CEO of Bright Computing. "We will continue to work closely with NVIDIA to incorporate new GPU management and monitoring capabilities into Bright Cluster Manager".
More information about GPU support in Bright Cluster Manager is available at:
About Bright Cluster Manager
Bright Cluster Manager is a Linux-based cluster management software solution specifically designed to make HPC clusters of any size easy to install, use and manage. Its intuitive graphical user interface offers consistent access to all management and monitoring functionality for cluster administrators. Its HPC user environment provides a comprehensive range of HPC software development tools for cluster users.
Pictures and screenshots of Bright Cluster Manager:
# # #
Bright Computing is a specialist in cluster management software and services for high-performance computing (HPC). Its flagship product, Bright Cluster Manager, makes clusters of any size easy to install, use and manage.