CLIMB-COVID: Cloud Infrastructure to Support the UK’s Covid-19 Response

Radoslaw Poplawski and Nick Loman, University of Birmingham

In March 2020, a partnership of academic laboratories and public health agencies launched COG-UK, a nationwide distributed genome sequencing project to monitor the evolution of SARS-CoV-2. To integrate, store, process and analyse this genomics data stream, we deployed CLIMB-COVID on the UKRI-funded CLIMB cloud platform, working with the University of Birmingham’s BEAR project. The project has so far processed over 2 million viral genome sequences in the UK and contributed to the discovery and detection of important new variants of SARS-CoV-2. In this talk we describe the development and deployment of the hardware and software elements of the CLIMB-COVID platform, discuss some of the scaling challenges faced along the way, and outline some of the key scientific and public health outputs generated by this unique surveillance dataset.

Developing and using the UK Biobank Research Analysis Platform, a large-scale Trusted Research Environment

Oliver Gray and Przemyslaw Stempor, UK Biobank

UK Biobank is a world-leading biomedical resource with data describing the health, lifestyle, physical characteristics, metabolomics, and genetics of half a million UK participants. The UK Biobank Research Analysis Platform (RAP), powered by DNAnexus and Amazon Web Services (AWS), has been designed to accommodate the UK Biobank resource’s vast and dramatically increasing scale, providing accessibility to the data for researchers around the world. Here, we will provide a brief tour of the RAP and the UK Biobank data, and describe how researchers can use the RAP to achieve their scientific aims. We will also discuss the challenges encountered by ourselves and established users of our data in moving from a download-only system to a cloud-based framework.

Crawlers, Bots, Flows, Lambdas, Glues and Autopilots: Applying AI and ML to Radiological Sensor Networks for Safety and Security

Peter Martin, University of Bristol

Driven by step-change advances in cloud computing, the Internet of Things (IoT), and microcontroller technologies, progress in Machine Learning (ML) over the past decade has pioneered an increasing number of new technologies; from self-driving (or ‘driverless’) cars to enhanced weather forecasting, and from targeted produce advertising to self-cleaning houses. However, this vast and ever-growing computational intelligence has yet to be applied to analyse, interpret, and streamline the potentially vast radiological monitoring dataset that is, or could be, continually collected from multiple survey ‘nodes’ as part of the UK’s national nuclear safety and security provision. Presently, individual detection events are each investigated in isolation, with no wider “situational context” applied to their occurrence. This is inefficient, costly and time-consuming, as well as blind to the small-scale or transient variations (and slow increases in activity) that may otherwise be missed in a large and unwieldy dataset. Work at the University of Bristol, alongside current academic and industrial collaborations, has sought to develop an Artificial Intelligence (AI) and ML system for the enhanced processing and evaluation of “Big Data” derived from such a large (and potentially unlimited) number of mobile and fixed-position radiological monitoring devices, yielding a more informed detection response and thereby enhancing the UK’s current national radiological surveillance provision.
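The “situational context” idea above can be sketched as a rolling per-node baseline against which each new reading is judged, rather than investigating every detection event in isolation. This is an illustrative sketch only, not the Bristol system’s actual algorithm; the window size and threshold are arbitrary assumptions:

```python
from collections import deque
from statistics import mean, stdev

def make_node_monitor(window=50, threshold=3.0):
    """Track a rolling baseline of count rates for one sensor node and
    flag readings that deviate strongly from it."""
    history = deque(maxlen=window)

    def observe(count_rate):
        if len(history) >= 10:
            mu, sigma = mean(history), stdev(history)
            # Flag only readings far outside the node's recent baseline.
            anomalous = sigma > 0 and abs(count_rate - mu) / sigma > threshold
        else:
            anomalous = False  # not enough context yet to judge
        history.append(count_rate)
        return anomalous

    return observe
```

In use, a steady stream of background readings passes silently while a sudden spike is flagged, giving each event the wider context the abstract argues is missing today.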

Global Symmetry is important for the detection of abnormality in mammograms

Cameron Kyle-Davidson, University of York

When radiologists evaluate mammograms, images from the left and right breasts are shown concurrently. Radiologists remain capable of detecting abnormalities even up to three years prior to the onset of cancer, and even when the mammograms are presented rapidly. However, if the normal mammogram contralateral to the abnormal mammogram is replaced by that of another woman, this ability suffers a performance decrease. Evidently, a global signal of abnormality exists that depends on both mammograms. We investigated whether the effect also appears in a pre-trained neural network mammography model. Further, we explored the effect of bilateral differences by developing and training a neural network model that can reliably detect whether a set of mammograms is composed of images taken from the same woman or from two different women. Detection of bilateral asymmetry remains even when mammograms are balanced by size and age, indicating that a “symmetry signal” exists and is relevant for breast cancer detection. We piloted off-site cloud GPU resources for both training and inference of the neural networks, which would have been intractable on our local hardware. In addition, we developed a semi-autonomous mammography dataset cleaning pipeline that takes advantage of high-CPU-count cloud machines through multithreaded image processing.
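The multithreaded cleaning pipeline mentioned above might, in outline, look like the following Python sketch. The `clean_image` step and its rejection rule are hypothetical stand-ins for the actual cleaning logic, and images are represented as plain pixel lists for self-containment:

```python
from concurrent.futures import ThreadPoolExecutor

def clean_image(pixels):
    """Stand-in cleaning step (hypothetical): reject blank scans and
    normalise pixel intensities to [0, 1]."""
    peak = max(pixels)
    if peak == 0:
        return None  # reject an empty scan
    return [p / peak for p in pixels]

def clean_dataset(images, workers=8):
    """Fan per-image cleaning out across threads, the pattern by which a
    high-CPU-count cloud machine can be kept saturated."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cleaned = list(pool.map(clean_image, images))
    return [img for img in cleaned if img is not None]
```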

Reducing time-to-science with self-service HPC and AI platforms in the Azimuth portal

Matt Pryor, John Garbutt, Matt Anson, StackHPC

Recent years have seen increasing divergence from the traditional HPC model, with researchers keen to take advantage of new and rapidly developing tools such as Jupyter Notebooks, Dask, Apache Spark and Kubeflow, while still maintaining the ability to run existing codes in a traditional batch environment, all without sacrificing performance. The explosion of tools and platforms, coupled with the fact that many of these platforms also need to be customised for each use case, places a heavy burden on the operators of traditional HPC systems, where individual platforms are deployed and maintained by the operator on behalf of users. We demonstrate here how the Azimuth portal reduces time-to-science and operational overhead by providing researchers with self-service access to HPC and machine learning platforms via a simple and intuitive user interface. Azimuth builds on work done at JASMIN, with funding from the IRIS collaboration, to present users with a catalogue of customisable platforms that they can deploy into their cloud allocation. Leveraging cloud-native technologies and automation, these platforms can be deployed on virtual machines or in Kubernetes clusters, and can take advantage of hardware acceleration such as GPUs or RDMA networking without explicit configuration from the user. The Azimuth portal is in use at several IRIS sites and provides platforms for projects including the SKA.
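The idea of a catalogue of customisable platforms can be illustrated with a minimal sketch: each catalogue entry declares sensible defaults that a user’s choices override. The `Platform` class, field names and flavour strings here are hypothetical, not Azimuth’s actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Platform:
    """One entry in a hypothetical self-service platform catalogue."""
    name: str
    parameters: dict                          # user-supplied values
    defaults: dict = field(default_factory=dict)

    def render(self):
        """Merge user choices over operator defaults to produce the
        settings an orchestrator would receive."""
        return {**self.defaults, **self.parameters}

# A user asks only for a GPU flavour; everything else falls back.
jupyter = Platform(
    name="jupyter-notebook",
    parameters={"flavour": "gpu.large"},
    defaults={"flavour": "cpu.small", "volume_gb": 20},
)
```

The design point is that the user never sees the underlying deployment tooling: they fill in a small number of parameters and the portal supplies the rest.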

On the creation of a secure ‘serverless’ workflow between a Mapbox frontend and a SalesForce backend for the Tekkatho Foundation

Mike Jones, Independent Researcher

The brief was to design an interactive map with markers and pop-ups to indicate the location of Tekkatho Foundation libraries in Myanmar, and to display site information and photos on the Foundation’s website map. The Tekkatho Foundation is a UK charity that provides offline digital libraries to universities and schools where access to resources and infrastructure is very limited. In this talk I describe a workflow which processes data obtained from the Foundation’s SalesForce CRM through AWS Lambda, serving it via an AWS API Gateway REST endpoint to a Mapbox interactive map on a Wix-hosted website. I describe how I coerced AWS’s KMS into securely providing Lambda with the credentials to access SalesForce via its OAuth JWT bearer token flow. I cover the cross-origin issues encountered when serving the GeoJSON data to the Mapbox client-side JavaScript, and their remedy. I discuss the caching options implemented for the GeoJSON data and the Foundation’s images.
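The Lambda-to-Mapbox leg of the workflow might look roughly like this Python sketch. The record fields, cache lifetime and hard-coded example data are illustrative assumptions; in the real workflow the records come from SalesForce via the JWT bearer token flow, and the handler sits behind the API Gateway route:

```python
import json

def libraries_to_geojson(records):
    """Convert CRM-style library records into a GeoJSON
    FeatureCollection (field names illustrative, not the
    Foundation's schema)."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point",
                         "coordinates": [r["lon"], r["lat"]]},
            "properties": {"name": r["name"]},
        }
        for r in records
    ]
    return {"type": "FeatureCollection", "features": features}

def lambda_handler(event, context):
    """Sketch of a Lambda entry point. The CORS header is what lets
    Mapbox's client-side JavaScript on the Wix-hosted page fetch the
    data from a different origin; Cache-Control lets responses be
    reused rather than hitting the CRM on every map load."""
    records = [{"name": "Example Library", "lat": 21.97, "lon": 96.08}]
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/geo+json",
            "Access-Control-Allow-Origin": "*",
            "Cache-Control": "max-age=3600",
        },
        "body": json.dumps(libraries_to_geojson(records)),
    }
```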

Transitioning research computing workloads to the cloud: A thematic approach at Cardiff University

Tom Green, Cardiff University

We outline the service work underway at Cardiff University to inform the transition of our research computing community to the cloud. With the cloud now recognised as an attractive solution for bursting out on-premise HPC capacity, this project is exploring such options while looking to provide a more thematic approach to cloud sourcing. Following a procurement exercise for cloud services, including a performance evaluation, AWS was selected to provide the initial resource for this pilot project. We discuss the methodology used in mapping the thematic usage profile of our 21,000-core on-premise cluster to assess the available environments. This usage varies from the compute-intensive workloads associated with the Physical Sciences & Engineering community to the more data-intensive workloads from Biology & Life Sciences. The cost and performance attributes of such workloads, based on a variety of use cases, are set to quantitatively inform the future procurement of HPC services. To help with project management and cost control, we have selected the Ronin user interface, which also permits management of the resources available to users. On completion of this project we will be better positioned to direct users, from a cost and performance perspective, in a future featuring a hybrid cloud and on-premise HPC service.
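The quantitative cost comparison across thematic workloads can be sketched as a simple per-run calculation. The runtimes, hourly prices and node counts below are placeholders for illustration, not figures from the Cardiff evaluation or AWS price quotes:

```python
def cost_per_run(runtime_hours, price_per_node_hour, nodes=1):
    """Cost of one workload run at a given hourly node price."""
    return runtime_hours * price_per_node_hour * nodes

# Hypothetical thematic profiles: (runtime h, $/node-hour, nodes)
profiles = {
    "physical-sciences-compute": (12.0, 1.50, 4),
    "life-sciences-data": (6.0, 2.40, 2),
}

# Compare the cost of a representative run per theme.
costs = {name: cost_per_run(*p) for name, p in profiles.items()}
```

Paired with measured runtimes from the performance evaluation, this kind of per-theme costing is what lets a hybrid service direct each community to the most cost-effective environment.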

Twins in the Cloud: Simplifying the Deployment of Digital Twins for Manufacturing-as-a-Service

Jay DesLauriers, University of Westminster

As once cloud-hesitant industries are encouraged to move to the cloud to run computational workloads, shortfalls in technical skills and cloud knowledge become apparent. Manufacturing is one such industry. Over the past five years, the CPC at Westminster has participated in several European projects investigating Manufacturing in the Cloud. The most recent of these, DIGITbrain, aims to build a platform for Manufacturing-as-a-Service and support the industry with running workloads and accessing Digital Twins on top of cloud infrastructure. The platform will feature open-source DevOps tools such as Kubernetes, Terraform and Ansible, integrated inside the MiCADO Execution Engine. End-users of the platform need not be familiar with these tools’ domain-specific languages or with the cloud platforms and middleware that support their workloads. Instead, users will provide values for pre-defined metadata fields that describe the microservices, models, data and infrastructure that make up and support their workloads. This metadata will be automatically compiled down to an intermediary language based on the OASIS TOSCA specification (Topology and Orchestration Specification for Cloud Applications). The intermediary language can be interpreted by MiCADO and transformed into formats understood by orchestration tools such as Kubernetes and Terraform, which will execute the users’ workloads.
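The compilation of user metadata down to a TOSCA-style intermediary can be sketched as a simple transformation from flat fields to a node template. The field names and node type below are simplified assumptions for illustration, not the actual DIGITbrain or MiCADO schema:

```python
def compile_to_tosca(metadata):
    """Sketch: turn flat, user-supplied metadata fields into a
    TOSCA-style node template that an orchestrator-facing layer
    could then translate for Kubernetes or Terraform."""
    return {
        "tosca_definitions_version": "tosca_simple_yaml_1_3",
        "topology_template": {
            "node_templates": {
                metadata["service_name"]: {
                    "type": "tosca.nodes.Container.Application",
                    "properties": {
                        "image": metadata["container_image"],
                        "replicas": metadata.get("replicas", 1),
                    },
                }
            }
        },
    }
```

The point of the intermediary layer is exactly this separation: the user fills in a few named fields, and the platform owns the translation into each orchestrator’s native format.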