Faster time to insights with high-performance data storage for AI research

A world-class research institution, The University of Queensland (UQ) sought to simplify data capture, storage, analysis, and management for its high-performance computing (HPC) environment. To do so, it collaborated with IBM.

Business challenge

To speed up research collaboration, including for complex, AI-driven projects, UQ needed a storage solution that supports hundreds of terabytes of data generated daily. 

Transformation

UQ built a high-performance data fabric powered by and centrally managed with IBM Spectrum Scale, recently adding an IBM Elastic Storage® System (ESS) solution to support its fastest HPC environment. 

Results

2 hours to achieve ROI on software-defined storage that saves researchers hundreds of hours of processing time per week 

~74% faster runtimes for medical imaging analysis to help speed time to discovery for critical research 

Exponentially increasing data volumes supported by a highly scalable, cost-effective storage fabric

Business challenge story

At the forefront of UQ's AI-driven research

How can we use ultrasound technologies so that therapeutic antibodies can overcome the blood-brain barrier and slow Alzheimer’s disease? What can the neural circuits of fruit flies teach us about designing robotic movements? Why does cellular inflammation lead to cancer, and how can we learn more by imaging live cells at nanoscale in real time? Across UQ, creative researchers tackle these and other tough questions, often leading to discoveries that can change the world and people’s lives.

The research teams focused on these questions rely on the University’s fastest GPU-accelerated computer to carry out their cutting-edge work. Designed specifically for imaging-intensive science and AI workloads, this supercomputer, along with other HPC systems at the University, needs extremely fast, scalable, and flexible data storage available anytime, anywhere.

To create a faster path from ingest to insights, the Research Computing Centre (RCC) at UQ sought to deploy a uniform, high-performance storage strategy and architecture to effectively support and manage university-wide data capture and analysis.

The RCC wanted a solution that could not only accommodate exponential growth in data volume, velocity, and variety but also provide rapid data access. Researchers at the University generate structured and unstructured data using a variety of computer systems, from desktops to HPC clusters, and from an enormous range of scientific instruments, such as MRI scanners, optical microscopes, and DNA sequencers, explains Professor David Abramson, Director at the RCC.

“Our paradigm around data is to keep one logical copy of it and then render it in many different ways, making the data available when a researcher needs it, where they need it,” he says. While evaluating potential solutions, the RCC also looked for technologies that could expand in line with the University’s needs well into the future.

“With the ESS solution, we get all the benefits of a high-speed parallel file system inside a supercomputer with the data management transparency that AFM and other IBM Spectrum Scale features provide.”  – Jake Carroll, Chief Technology Officer, Research Computing Centre, The University of Queensland

Transformation story

HPC storage with on-demand access

The RCC built a high-performance data storage fabric known as MeDiCI (Metropolitan Data Caching Infrastructure), powered by and centrally managed with IBM Spectrum Scale. “For researchers to drive innovation, they need to be able to undertake high-quality research in a timely, scalable, and boundary-pushing manner, leveraging cutting-edge research computing infrastructure. Our partnership with IBM helps meet these needs,” explains Jake Carroll, Chief Technology Officer, Research Computing Centre at UQ. “With MeDiCI, researchers and students across the University and at other international institutes can seamlessly work with data stored on any compute cluster at UQ and collaborate.”

“When researchers sit down, they see all of their data. They don’t realize it’s actually moving across optical wires at blinding speed from a remote data center,” says Abramson.

In addition, the MeDiCI ecosystem supports a variety of platforms, instruments, and data. “IBM Spectrum Scale software allows us to unify all of our different silos of storage sources into one integrated, intelligent storage infrastructure and then render the data in whichever protocol is appropriate, resulting in faster analytics and greater resource utility,” says Abramson. MeDiCI also automatically captures project metadata, including users, instruments, and data parameters.

The RCC team continues to evolve the MeDiCI infrastructure, most recently deploying it as a storage solution for UQ HPC Wiener. The goal is to allow researchers to do more in the same timeframe given the increased throughput that the platform provides. “We needed a solution that could not only sustain quite substantial bandwidth from a gigabytes-per-second perspective but also a very high IOPS requirement to support massive amounts of data coming at an unprecedented rate from disk systems and flash storage simultaneously,” explains Carroll.

“We wanted [a hardware platform with] IBM Spectrum Scale because its functionality is pretty close to unique,” explains Carroll. “With the ESS solution, we get all the benefits of a high-speed parallel file system inside a supercomputer with the data management transparency that AFM and other IBM Spectrum Scale features provide. That integration fits into the workflow of our users, and in scientific outputs, workflow is king. That’s why we leverage software-defined storage,” he adds.

With the ESS solution, UQ can support massive data volumes with up to 40 GB/s of throughput and the ability to scale out to exabytes of storage, and its hybrid cloud model provides rapid metadata access. With the IBM Spectrum Scale RAID erasure coding feature, the solution is designed to support high levels of storage reliability, availability, and performance. Combined with AFM, it also enables the RCC to streamline data access within specific project workflows while still maintaining a single, common storage architecture.

The IBM Systems Lab Services and IBM Systems technical sales teams in Australia worked to quickly deploy the ESS GH14S solution on an InfiniBand network and integrate it with the end-to-end MeDiCI IT architecture. The teams worked cohesively and with attention to detail at every stage, implementing the array in five days.

The RCC has recently implemented the IBM Storage Insights offering, a cloud-based storage management and support platform with predictive analytics. It provides the team with more in-depth, cohesive visibility across the entire infrastructure, enabling higher performance through faster issue resolution.

IBM recently placed a new ESS 5000 at UQ for extensive testing and evaluation. Abramson says IBM is partnering with the RCC because the centre has developed a reputation for stretching existing technologies.

“We have already demonstrated significant innovation in applying Spectrum Scale at the University. We have been able to provide feedback on how well it works in our environment and where it can be enhanced,” explains Abramson. “I’m very excited to be able to test IBM’s other leading-edge hardware on our most demanding research needs.” 

Results story

Faster time to discovery

With a uniform data fabric featuring IBM Spectrum Scale technologies such as active file management (AFM) for accessing files across the university, the RCC can optimize researchers’ time and university resources while centralizing data management and controlling IT costs. Across UQ, researchers now have comprehensive compute and storage capabilities to support the creation of massive amounts of data at scale and run complex workloads.

With the expanded bandwidth and IOPS available from the ESS device, research teams that rely on the Wiener HPC system can process data at unprecedented speeds. “Machine learning and AI is front and center with the ESS GH14S empowering how our supercomputer’s GPUs get utilized, enabling researchers to do more in the same timeframe and accelerating time to discovery,” says Carroll.

In fact, the new storage array delivered an ROI in just two hours, based on performance improvements that save medical imaging researchers across UQ hundreds of processing hours each week.

At UQ’s Queensland Brain Institute (QBI), for instance, neuroscientists studying Alzheimer’s disease reduced the time required to run their project workload, known as a finite element analysis, by approximately 74 percent, shrinking the run time down to 18.72 hours. With a deeper understanding of ultrasound wave distribution on the human skull, researchers can develop technology needed to overcome the blood-brain barrier for drug delivery. “It’s a very complex undertaking, and it needs an enormous amount of compute power and storage,” explains Carroll.

In another case, QBI and other researchers looking at neural circuits in fruit flies developed genetic methods to label and manipulate individual neuron types. With Wiener, they can rapidly process terabytes of high-speed videos of the tiny insects in motion, measuring precise movements of the antennae, abdomen, and joints on six legs. With new insight into each neuron’s role, they can better understand principles governing complex motor tasks, such as walking and flying behavior.

At UQ’s Institute for Molecular Bioscience, researchers studying cellular inflammation employ lattice light-sheet microscopy to capture high-resolution 4D images of living cellular processes. Viewed using a mathematical modeling process known as deconvolution microscopy, the images provide an unprecedented, real-time look at how cancer forms. The Wiener storage solution helps make this possible, including reducing deconvolution time by more than 70 percent. The RCC saved researchers additional time by building a user-friendly portal for streamlining deconvolution tasks.

“We have to provide the best infrastructure we can to support an enormous range of research endeavors. Given exponential data growth, we also need to achieve economies of scale,” says Carroll. “IBM helps make that possible.”

The University of Queensland

For more than a century, The University of Queensland (UQ) has maintained a global reputation for delivering knowledge leadership for a better world. The most prestigious and widely recognized rankings of world universities consistently place UQ among the world’s top universities. UQ has also won more national teaching awards than any other Australian university. This commitment to quality teaching empowers our 53,600 current students, who study across UQ’s three campuses, to create positive change for society. Our research has a global impact, delivered by an interdisciplinary research community of more than 1500 researchers at our six faculties, eight research institutes, and more than 100 research centers.

Solution components

Educ: Innovation in Research
Storage: Elastic Storage System 3000
Storage: Elastic Storage System 5000
IBM Systems Lab Services: Cross Platform
Spectrum Control & Storage Insights/Pro
Spectrum Scale
