Benefits of using Lustre for pharmaceutical HPCs (VIDEO)
A majority of pharmaceutical companies are moving or have moved to the cloud, and most are considering open source tools. We’ve partnered with several global sponsors to build custom clinical analytics environments, and we’d like to share what we’ve observed about one open source tool in particular: the Lustre file system. In this post and video, we’ll cover three main benefits of using Lustre in your next build.
Lustre is an open source (GPL 2.0 license) parallel shared file system that is scalable, stable, performant and cost-effective: a system administrator’s ideal for deploying applications and accessing data from multiple clients simultaneously.
Since Lustre is open source, the cost of running the file system is limited to general day-to-day administration and hardware infrastructure. Of course, you also have the option of spinning up Amazon FSx for Lustre within an AWS environment, depending on your business needs, performance requirements and future infrastructure decisions.
There are many file system offerings on the market, with different options and capabilities. Many of the premium ones come at a price, and some will drive your monthly bill through the roof. Over the past few years, d-wise has been involved in deploying several software packages on HPC file systems. Many of these packages require a scalable, performant and balanced distributed file system to access data, along with the ability to configure applications and processes for high availability and failover.
The high-level Lustre architecture topology is shown below:
Our latest deployments using Lustre in the pharma industry have been extremely successful and cost-efficient. Lustre lets us easily deploy I/O-intensive applications such as SAS and implement complex data access security solutions.
Lustre is used by many of the TOP500 supercomputers and large multi-cluster sites, such as K computer at the RIKEN Advanced Institute for Computational Science, NASA, CEA in Europe, Jaguar & Titan at Oak Ridge National Laboratory, and many others.
Open source Lustre is one of the file systems that stands out. If you are looking for a cost-effective file system that scales to meet I/O throughput needs ranging from moderate to extreme, Lustre ticks the boxes.
Lustre is a POSIX-compliant, distributed parallel file system designed for scalability, high performance, and high availability. FSx for Lustre is PCI-DSS, ISO, and SOC compliant, and is HIPAA eligible. You can control file access with POSIX permissions and network access with Amazon Virtual Private Cloud (VPC) Security Group rules.
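Because Lustre is POSIX-compliant, ordinary permission calls work on directories inside a Lustre mount. The following is a minimal Python sketch of locking a study directory down to its owner and group. The function name and the `study_dir` variable are hypothetical, and the demo runs against a temporary local directory rather than a real Lustre mount point.

```python
import os
import stat
import tempfile


def restrict_to_group(path: str) -> int:
    """Give the owner rwx, the group r-x, and everyone else no access.

    Returns the resulting permission bits so the caller can verify them.
    """
    mode = stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP  # 0o750
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Hypothetical stand-in for a directory on a Lustre mount.
    with tempfile.TemporaryDirectory() as study_dir:
        applied = restrict_to_group(study_dir)
        print(f"{study_dir}: {oct(applied)}")
```

In a real environment the directory's group would map to the team allowed to read the study data, and network-level restrictions would be handled separately.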
You are not limited to AWS: open source Lustre can be deployed on any cloud provider, or on-premise if that suits your requirements. FSx for Lustre, however, is an AWS managed service and is only available on AWS.
Since data is stored on disk, the cluster must be designed carefully, in particular when large files need to be striped. By default, an AWS EBS volume is attached to a single EC2 instance (Multi-Attach is available only for Provisioned IOPS io1 and io2 volumes), meaning your data is only written to disks mounted on that one instance. On-premise infrastructure and other clouds such as Azure do have the option to attach a managed disk to multiple servers.
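To make the striping consideration concrete, here is a simplified Python model of round-robin (RAID-0 style) striping, which is how a plain striped Lustre file is spread across storage targets. The function names and the example sizes are illustrative assumptions, not values from a real deployment. On an actual file system the stripe count and stripe size are typically set with `lfs setstripe` (options `-c` and `-S`).

```python
from typing import List, Tuple

MIB = 1024 * 1024


def stripe_layout(offset: int, stripe_size: int, stripe_count: int) -> Tuple[int, int]:
    """Map a byte offset in a file to (target index, offset inside that target's object)."""
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    chunk = offset // stripe_size        # which stripe-sized chunk of the file
    target = chunk % stripe_count        # chunks are dealt out round-robin
    row = chunk // stripe_count          # full rounds completed before this chunk
    return target, row * stripe_size + offset % stripe_size


def bytes_per_target(file_size: int, stripe_size: int, stripe_count: int) -> List[int]:
    """Return how many bytes of a file of `file_size` land on each target."""
    full_chunks, tail = divmod(file_size, stripe_size)
    totals = []
    for i in range(stripe_count):
        chunks_here = full_chunks // stripe_count + (1 if i < full_chunks % stripe_count else 0)
        totals.append(chunks_here * stripe_size)
    if tail:
        # The final partial chunk goes to the next target in the rotation.
        totals[full_chunks % stripe_count] += tail
    return totals


if __name__ == "__main__":
    # Hypothetical example: a 10 MiB file, 1 MiB stripes, 4 targets.
    for i, size in enumerate(bytes_per_target(10 * MIB, MIB, 4)):
        print(f"target {i}: {size // MIB} MiB")
```

A stripe count of 1 keeps the whole file on one target, so a single large dataset is limited by that target's disk throughput. Higher stripe counts spread reads and writes across more disks and servers.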
Lustre is a great file system for building SCE applications where complex security, version control and auditing are required.