The new cluster was officially opened by Professor Tom Rodden of the Interactive Systems Mixed Reality Laboratory (MRL), and Professor Saul Tendler, Pro-Vice-Chancellor for Research, as the highlight of the University’s annual HPC Conference. In addition to presentations from ClusterVision and the other technology partners, the 2-day Conference showcased a wide variety of the University’s current research applications, including presentations on quantum and astrophysics simulation, genome modelling, advanced mathematical solutions, and molecular chemistry.
As the prime contractor for the design, build and management of the Minerva system, ClusterVision managed a complex collaboration of 17 hardware and software partners. Key contributors to the Minerva project included Dell, Intel, Qlogic, NVIDIA, Panasas, Bright Computing, Altair Engineering and Allinea.
The realisation of the Minerva project benefitted from a long-standing relationship between ClusterVision, the University of Nottingham and many of the other collaboration partners, including the successful trial of Bright Cluster Manager (from Bright Computing) and PBS Professional (from Altair Engineering).

In a detailed tender response, ClusterVision proposed a system design based on a combination of hardware, software and service components that would surpass the functional requirements, together with a vision of the long-term benefit that such a system and collaborative partnership would bring to the University and its extended scientific user community. ClusterVision were confirmed as the prime contractor for the project in April 2012, with the system build, configuration and performance testing successfully completed during the second half of the year.
“The ClusterVision solution won out in a very close competitive tender process, as the technology was judged to be the best match to our requirements. In addition, ClusterVision and their selected partners showed a real commitment to collaborate with the University not only to deliver excellent hardware and software, but also a service package which met our specific requirements.”
The server architecture of the cluster is based on Dell PowerEdge and PowerVault series components. ClusterVision and Dell drew upon a long-established partnership, and the confidence gained from a number of collaborations on other major international academic cluster systems, such as the University of Bordeaux in France, CRP Gabriel Lippmann in Luxembourg, and the King Abdullah University of Science and Technology in Saudi Arabia.
The Minerva system comprises two redundant master nodes (Dell PowerEdge R720s), with shared master-node storage provided by a single 2U, 12-disk Dell PowerVault MD3200. The compute capacity is shared between 156 Dell PowerEdge compute nodes housed in Dell C6220 servers, 12 high-memory, fast-I/O nodes also in C6220s, and 6 additional GPU-accelerated nodes. Originally designed around C6100 servers, the compute node specification was subsequently upgraded to the Dell PowerEdge C6220, which was introduced as a vehicle for the latest Intel Xeon E5 Sandy Bridge processors. Each 2.6 GHz compute node contains a 500 GB local disk. The fast-I/O nodes have a 500 GB SATA disk and four 100 GB SSDs, and are designed specifically for the high I/O intensity needs of the applications. The 6 GPU-accelerated nodes are built on a Supermicro base chassis, also incorporating the 8-core Intel Xeon E5 processor, together with two NVIDIA Tesla M2090 GPUs.
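As a rough illustration of what this node inventory implies, the sketch below estimates the theoretical peak performance from the figures quoted above. It assumes dual-socket, 8-core Sandy Bridge Xeon E5 processors in each node sustaining 8 double-precision FLOPs per core per cycle with AVX, and a nominal 665 GFLOPS double-precision peak per Tesla M2090; these are typical values for this hardware generation rather than confirmed details of the Minerva configuration.

# Back-of-the-envelope peak-performance estimate for the Minerva node inventory.
# Assumptions (not confirmed in the article): dual-socket 8-core Xeon E5 per node,
# 8 double-precision FLOPs/cycle/core with AVX, and a nominal 665 GFLOPS
# double-precision peak per NVIDIA Tesla M2090.

COMPUTE_NODES = 156       # standard Dell C6220 compute nodes
FAST_IO_NODES = 12        # high-memory, fast-I/O nodes (same CPU assumption)
GPU_NODES = 6             # Supermicro chassis, 2x Tesla M2090 each

SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
CLOCK_GHZ = 2.6
FLOPS_PER_CYCLE = 8       # double precision, AVX (Sandy Bridge)
M2090_DP_GFLOPS = 665     # per GPU, double precision

def cpu_peak_gflops(nodes: int) -> float:
    """Theoretical double-precision peak of a set of dual-socket nodes, in GFLOPS."""
    return nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET * CLOCK_GHZ * FLOPS_PER_CYCLE

cpu_total = cpu_peak_gflops(COMPUTE_NODES + FAST_IO_NODES + GPU_NODES)
gpu_total = GPU_NODES * 2 * M2090_DP_GFLOPS

print(f"CPU peak : {cpu_total / 1000:.1f} TFLOPS")
print(f"GPU peak : {gpu_total / 1000:.1f} TFLOPS")
print(f"Combined : {(cpu_total + gpu_total) / 1000:.1f} TFLOPS")

On these assumptions the combined theoretical peak comes out at roughly 66 TFLOPS, which sits comfortably above the sustained figure of around 45 TFLOPS anticipated later in this article.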
Scalable parallel file storage is provided by four Panasas ActiveStor 12 shelves, delivered as a complete storage appliance with the required management systems and Ethernet switching. Each ActiveStor 12 shelf provides 60 TB of capacity and 80 GB of cache, giving a total raw capacity of 240 TB and a usable capacity of approximately 180 TB. The system interconnect is a dual-level combination of Gigabit Ethernet for administration and management traffic and an Intel/QLogic QDR InfiniBand fabric and switching system for the main application communications. All of the system components are mounted in nine 42U black server racks.
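For clarity, the storage figures quoted above combine as follows; the roughly 75% usable-to-raw ratio is implied by the numbers in this article rather than stated as a Panasas specification.

SHELVES = 4
RAW_TB_PER_SHELF = 60
CACHE_GB_PER_SHELF = 80

raw_total_tb = SHELVES * RAW_TB_PER_SHELF        # 240 TB raw capacity
cache_total_gb = SHELVES * CACHE_GB_PER_SHELF    # 320 GB aggregate cache
usable_total_tb = raw_total_tb * 0.75            # ~180 TB after RAID/metadata overhead (implied ratio)

print(raw_total_tb, cache_total_gb, usable_total_tb)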
For the software environment, ClusterVision selected three key providers: Bright Computing, Altair Engineering and Allinea.
Provisioning and cluster management are provided by 176 advanced-version licences of Bright Cluster Manager, the cluster management suite from Bright Computing. Bright Cluster Manager was used to manage the Linux environment and the initial configuration process, and provides much of the software infrastructure for the day-to-day monitoring and health checking of the system.
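As an illustration of the kind of everyday health checking this enables, the sketch below wraps Bright's cmsh command-line shell from Python to list the status of the cluster's nodes. It assumes it runs on a head node where cmsh is installed and that the "device; status" command sequence is available, as in stock Bright Cluster Manager installations; it is a sketch of the general approach, not part of the Minerva configuration.

import subprocess

def node_status() -> str:
    """Query Bright Cluster Manager for the status of all devices.

    Assumes this runs on a head node with the cmsh shell installed; the
    '-c' flag passes a semicolon-separated command sequence non-interactively.
    """
    result = subprocess.run(
        ["cmsh", "-c", "device; status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(node_status())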
Bright Cluster Manager is also the enabler for ClusterVision’s remote management and support services.
Although it is beyond the immediate scope of the current Minerva project, the Amazon EC2 cloud-bursting functionality of Bright Cluster Manager’s advanced version also creates a working foundation for an anticipated cloud-based extension at a later date.
A high level of user management and detailed usage analytics were identified as important operational requirements of the system. To address these needs, ClusterVision worked in partnership with Altair Engineering to incorporate licences of Altair PBS Professional, PBS Compute Manager and PBS Analytics. The software stack was completed with licences of the PGI CUDA compiler from The Portland Group, and Allinea’s optimisation and profiling and distributed debugging tools, Allinea OPT and Allinea DDT.
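To give a flavour of how users interact with this stack, the sketch below generates and submits a minimal PBS Professional job script aimed at one of the GPU-accelerated nodes. The qsub command and the select, ncpus and walltime syntax are standard PBS Professional; the queue name and the ngpus resource are illustrative placeholders, since resource naming is site-specific and not described in this article.

import subprocess
import tempfile

# Minimal PBS Professional job script. 'select', 'ncpus' and 'walltime' are
# standard PBS Pro syntax; the queue name 'gpu', the 'ngpus' resource and the
# application name are hypothetical, site-specific choices used for illustration.
JOB_SCRIPT = """#!/bin/bash
#PBS -N cuda_test
#PBS -q gpu
#PBS -l select=1:ncpus=8:ngpus=2
#PBS -l walltime=01:00:00
cd $PBS_O_WORKDIR
./my_cuda_application
"""

def submit(script_text: str) -> str:
    """Write the job script to a temporary file and submit it with qsub."""
    with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as handle:
        handle.write(script_text)
        path = handle.name
    result = subprocess.run(["qsub", path], capture_output=True, text=True, check=True)
    return result.stdout.strip()   # qsub prints the identifier of the new job

if __name__ == "__main__":
    print("Submitted job:", submit(JOB_SCRIPT))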
The turn-key nature of the Minerva system is completed with ClusterVision’s ongoing support and maintenance services.
The Minerva system, which is anticipated to operate at around 45 TFLOPS, will be a valuable local resource for students and research staff at the University of Nottingham. It will also provide a substantial compute capability for an extended network of collaborating UK universities and enterprise businesses.