UKRI Cloud Workshop 2022 – Call for Participation

Our next workshop is on March 29th 2022! 

We will be back at The Francis Crick Institute in central London. As with previous events, we are looking forward to a varied programme from the UK research community. We expect the workshop to be a mix of technical talks and researchers reporting on their use of cloud technologies.

Abstract Submission

We’re looking for talk submissions covering all aspects of research computing using cloud, both public and private. We wish to have a diverse set of viewpoints represented at this workshop and encourage individuals and institutions of all backgrounds (for example academic, technical, business, or user experience) to apply. 

We’ve provided some suggested topics and themes below, but submissions outside of these areas are also welcome. Talk sessions are typically 20 minutes and should include time for questions. We’d also like to record and share the videos and slides afterwards, so please make it clear in your submission if this is likely to be an issue. To submit an abstract complete this form:

The deadline for submissions is 17th January. The Cloud Working Group will review the submissions and we’ll let successful submitters know by 31st January.

Workshop Presentation Format

Due to the ongoing situation, we may need to limit attendance, so this year we will be providing both an in-person and a live-streamed experience. Unfortunately, we cannot accommodate virtual speakers for this event. We therefore invite abstracts from those who, if successful, are able and comfortable to join us in person.

Please note that all plans being made are subject to change based on the current health and safety guidance set by the UK Government.

Proposed Workshop Themes

High Performance Computing (HPC)

We would like to hear from those who have user stories involving running HPC-class workloads in public cloud. Stories could also include, for example, using cloud-native methods to create software-defined HPC infrastructure, hybrid solutions that extend on-premise compute infrastructure with cloud bursting, or adapting HPC workflows to exploit cloud-native technologies.

Cloud Pilots and User Experiences

We would like to hear from operators and users about their experiences of running scientific workloads in private and public cloud environments. How does this compare with traditional HTC and HPC facilities? Have you found any advantages and/or disadvantages that we should know about? Do you use any abstraction layers to make them more usable?

Hybrid Cloud 

We are looking for examples of deployments which bridge the gap between on-premise infrastructure and public cloud, or between cloud providers. This could include, for example, efforts to make workloads portable between clouds, creating cloud services or cloud access to enhance the current solution offered, or technologies to support migration and bursting. In addition, topics covering data movement/migration and data collaboration/sharing have been of particular interest to our community in the past and would fit within this theme.

UKRI and Trusted Environments

Following the recent UKRI and DARE UK call to inform design of cross-council digital research environments, we are keen to hear from projects that are using cloud to enable research and collaboration with sensitive data.  We are particularly keen to hear from groups that are awarded funding – to provide an early platform for the projects to share previous experience that helped them secure funding and hear what they aim to achieve, as well as how they plan to share their solutions with the wider community.

COVID and Cloud

The global COVID pandemic has presented unprecedented challenges in health and economics, and research has been at the forefront of addressing these, from modelling transmission to simulating viral proteins and treatments. We are keen to showcase stories from the research community where cloud has enabled projects, both to share practices for operating under demanding conditions and time constraints and to celebrate the work that is helping to ease us out of the pandemic restrictions.


The recent COP26 conference has fired the starting gun on reducing greenhouse gas emissions to Net Zero. It is no longer an option to simply write code/applications and workflows without ensuring these have been performance optimised within reason. Cloud resource/technology providers, application providers and cloud users cannot leave it to each other to ensure that workflows have as small a carbon footprint as possible. We are keen to hear from the cloud communities about work that has increased or maintained performance while reducing energy use. In particular, we would like to hear about the role of the ResOps professional in ensuring that workflows/applications interact with cloud technologies in a power-efficient manner. We hope this will share best practice and perhaps lead to further workshops and work in this challenging area.

Proposed talks are not limited to these themes but can also be in other areas of interest. Past themes have included storage, governance, IOT/data analytics and challenges faced and overcome when implementing a cloud solution at both business and technical levels.

The Program Committee will make the final decision on the inclusion of any presentations to the meeting.

UKRI Cloud Workshop 2022: Call for Participation in Organising Committee

The UKRI Cloud Working Group is pleased to announce that we will be hosting the 6th annual UKRI Cloud Workshop at the Francis Crick Institute in London on the 29th March 2022. 

The meeting provides an opportunity for UK researchers and representatives from industry to come together and share best practice and new insights in the application of cloud computing for academic research. Past events have attracted speakers from a range of high-profile organisations including CERN, the UK Met Office, major UK research infrastructure providers and major public cloud providers. The workshop typically attracts between 150 and 180 attendees.

For the event this coming year, we are extending an invitation to members of the research community and commercial companies to take part in an Organising Committee to run the workshop. Participation in this group will provide an excellent opportunity to gain insights into how cloud is being applied – from the innovative application of technologies to address research questions to addressing practical challenges around policy and use. We would like to encourage participation from a diverse set of backgrounds: you may have experience in aspects of cloud or have been involved in running events before or you may simply have an interest and wish to get involved.

As a member of the committee you will help shape the themes, make a call for abstracts and select submissions for presentations. The group will need to self-organise, coordinate meetings and work closely with the UKRI Cloud Working Group towards the successful delivery of the workshop in March ’22.

The meeting typically consists of one day of presentations and workshops in two tracks. Previous examples can be found at:

We hope to host the workshop in person; however, some element of hybrid or virtual conferencing, or of social distancing, may be needed. Planning for multiple eventualities will be needed to ensure the event operates in line with government guidelines and provides opportunities to support remote speakers and attendees. The conference venue is set up to support these different hosting scenarios.

Thoughts from Cloud Workshop 2019

It’s a couple of months since the workshop, which is plenty of time to let the dust settle and reflect on the content. You can find most of the presentations from the workshop if you follow the links from the programme.

As I mentioned in my introduction at the meeting, I’ve noticed a transition over the past year in the adoption and application of cloud, and this is evident in the abstracts submitted for this meeting. There are signs of maturing: in the first couple of annual workshops we held, cloud usage was very much at the experimental stage, with early forays into private cloud deployment and first pilots testing out public capability. This year there were good examples of sophisticated application of cloud technology, whether cloud-native applications like Chris Woods’ – use of serverless to dynamically trigger provision of clusters for batch computing – or in-depth demos of DevOps tooling from StackHPC and others.

Late last year, the Cloud WG ran a smaller technical meeting with no formal agenda – in ‘unconference’ style. This gave us an opportunity to do more of a deep dive with DevOps technologies. The positive feedback we received reflected the value in networking and learning together with peers. Something of this continued at this year’s workshop with the afternoon demo session. It was great to have this in-depth technical input alongside higher level presentations, whether overviews of projects or talks around challenge areas such as policy. João Fernandes talked about the OCRE project in his presentation. This builds on the work of the GÉANT IaaS Framework, important for the establishment of agreement with public cloud providers for access to their resources for the research community.

On the topic of policy, the debate continued around the relative merits of public cloud versus on-premise hosting. Cliff Addison (University of Liverpool) highlighted the tensions between quantifying benefits, budgeting at scale and maintaining portability between cloud vendors. Owen Thomas (Red Oak Consulting) challenged assumptions about traditional HPC provision and made the case for assessing overall value, not just cost, when making comparisons with public cloud. Andrew Jones (NAG) argued against absolutes when considering the complexities in choosing hosting for any given application. Migration to cloud can present enormous challenges, as Tony Wildish’s presentation illustrated. Drawing on EMBL-EBI’s experiences, he provided a walkthrough of different approaches for migrating legacy code developed for on-premise systems so that it operates efficiently on cloud. Elsewhere in the meeting, the HEPCloud and UKAEA presentations showed how hybrid models can be built up to select the required computing resources from on-premise and public cloud resources. HEPCloud in particular illustrated the benefit of using public cloud as overspill from research infrastructure in order to meet peaks in demand.

CRC Canada is an example of a complete public cloud solution architected from the ground up. What is interesting here is the organisational and cultural shifts needed to support that model, in particular the setting up of dedicated effort for auditing and accounting when moving to a consumption-based approach to billing. Pangeo – presented by the Met Office Informatics Lab – demonstrates another cloud-enabled solution, but what is of interest is the formation of a collaboration bringing together open source solutions to make a platform that is cloud-ready. At its core is a virtual research environment built largely on Jupyter and Dask, together with use of Kubernetes and deployment glue to make it cloud-agnostic. This kind of solution fits data analytics, where typically datasets have been imported into a cloud environment and manipulated into a form that is analysis-ready. Use of BinderHub – shown with Pangeo and Sarah Gibson’s demo (Turing) – allows infrastructure to be dynamically provisioned and scientific workflows conveniently shared via Jupyter Notebooks.

In general, though, examples of long-term hosting of large volumes of research data on public cloud are still absent. If there’s a pattern from the sample of submissions for the workshop, it’s one of use of public cloud for compute rather than data storage: continued use of on-premise for long-term hosting of data, with some bursting to public cloud for batch computing. Cloud is utilised as a means to obtain an ephemeral computational resource: set up an environment, stage data, perform the calculation, get results and tear down again. Even so, there appeared to be an increased awareness of the challenges of data hosting with cloud in some of the questions and discussion in the sessions. These included issues around hybrid, public cloud and multi-cloud scenarios. For example, if data is hosted in one cloud, how can it be accessed readily by a client with an existing presence hosted on another cloud? There are definite signs of progress in the community, but clearly there are still big challenges for cloud to be more fully utilised for research workloads.
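
The ephemeral pattern described above (set up, stage data, compute, collect results, tear down) can be sketched as a context manager. This is only an illustration: `EphemeralCluster`, its methods and the in-memory "storage" are hypothetical stand-ins rather than any provider's API, and the sample data is invented.

```python
from contextlib import contextmanager


class EphemeralCluster:
    """Hypothetical stand-in for a short-lived cloud compute environment."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.active = False

    def provision(self):
        # A real deployment would call a cloud provider API here.
        self.active = True

    def stage(self, key, values):
        # Copy input data from long-term (on-premise) storage into the cluster.
        self.data[key] = list(values)

    def run(self, key, func):
        # Perform the calculation on the staged data.
        return func(self.data[key])

    def teardown(self):
        # Release compute and discard staged copies so nothing keeps billing.
        self.data.clear()
        self.active = False


@contextmanager
def ephemeral_cluster(name):
    cluster = EphemeralCluster(name)
    cluster.provision()
    try:
        yield cluster
    finally:
        # Tear down even if the calculation fails.
        cluster.teardown()


# Invented data standing in for the long-term on-premise store.
on_premise_store = {"temperatures": [11.5, 12.0, 13.5]}

with ephemeral_cluster("burst-job") as cluster:
    cluster.stage("temperatures", on_premise_store["temperatures"])
    result = cluster.run("temperatures", lambda xs: sum(xs) / len(xs))

# The result survives; the cluster and its staged data do not.
print(result, cluster.active, cluster.data)
```

The `try`/`finally` is the important design choice here: with consumption-based billing, the teardown step has to run even when the job fails.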

Technical Workshop November 2018

This week we held a technical workshop with a small but dedicated group of people. We’ve heard that our large one-day annual workshop is great, but there are areas where people wanted to have a more in-depth discussion.

The workshop was built around an “un-conference” format, where we had two pre-arranged talks but the rest of the day was open for discussion – though we were guided by the topics people suggested as part of the registration process.

The morning session kicked off with some high level discussion of possible topics followed by one of our invited talks – Stig Telfer from StackHPC gave a talk on some performance work they’ve been doing with Ceph to support CRAY and the Human Genome project. It was interesting to note the difference moving to bluestore made to Ceph performance and also how LVM and partition use of the NVMe devices had a huge impact on performance of the storage system.

Technical Workshop, 20 November

In addition to our main annual workshop in February next year, we’re also running a smaller pre-meeting this coming month in central London. The goal of this event is to provide a space specifically for developers, researchers and DevOps practitioners to take a deep dive into technologies for cloud, share their own experience and learn from each other. We’ve deliberately avoided setting a fixed timetable so that we can source topics from attendees on the day. More details and booking information for the day are here:

Make sure to bring your laptop 🙂

Save the date 12 Feb 2019 – next Cloud Workshop

We will be holding our 4th annual workshop early next year on the 12th February 2019.  We’re pleased to be back at our familiar venue the Francis Crick Institute in central London.   Please save the date!

In past years we’ve had a great set of speakers from public cloud companies and major research institutes to individual researchers reporting on how they are exploiting cloud computing to meet their research goals.  More details to follow soon.

RCUK Cloud Workshop 2018

The workshop is now just a few days away. You can see the programme for the day below. We have a broad range of contributions from across the research community and also good representation from public cloud providers. This year we are focussing on international collaborations for our plenary session. Other sessions focus on a mix of application use, from areas where cloud adoption has reached a mature state to others where we are examining specific technical and policy-related challenges to be addressed.


8th January, Francis Crick Institute London, 1 Midland Road, London, NW1 1AT

09:00 Arrivals, registration, refreshments (Gallery Area)
09:30 Introduction

(Auditorium 2)
Philip Kershaw, RCUK Cloud WG Chair

09:45 Session 1 – International Collaborations

(Auditorium 2)
Chair: Steven Newhouse

Future Science on Future OpenStack: developing next generation infrastructure at CERN and SKA – Stig Telfer, StackHPC
EOSC-hub: overview and cloud federation activities – Enol Fernández, EGI
Public Clouds, OpenStack and Federation – Ildikó Vancsa, OpenStack Foundation
Question time
10:45 Break (Gallery area)
11:15 Session 2a – Technical Challenges – Containers, portability of compute, data movement

(Auditorium 2)
Chair: Adam Huffman

Running a Container service with OpenStack/Magnum – Spiros Trigazis, CERN
Large scale Genomics with Nextflow and AWS Batch – Paolo Di Tommaso, Centre for Genomic Regulation; Brendan Bouffler, AWS
Best practice in porting applications to Cloud – Dario Vianello, EMBL-EBI
Demystifying Hybrid Cloud with Microsoft Azure – Mike Kiernan, Microsoft
Question time

Session 2b – Practical challenges

(Auditorium 1)
Chair: Martin Hamilton

Aerospace and Cloud – Leigh Lapworth, Rolls Royce
Processing patient identifiable data in the cloud – what you need to consider technically and process wise to keep your data safe – Peter Rossi, UKCloud
Jisc ExpressRoute Circuit Service – David Salmon and Gary Blake, Jisc
The Janet End-to-End Performance Initiative – Duncan Rand, Jisc
Question time
12:30 Lunch (Gallery area)
13:30 Session 3a – Innovative applications, usability and training

(Auditorium 2)
Chair: Steve Hindmarsh

Visualizing Urban IoT data using Cloud Supercomputing – Nick Holliman, Newcastle University
Accelerate time-to-insight with a serverless big data platform – Hatem Nawar, Google Cloud
Azure at the Turing – Martin O’Reilly, Turing Institute
HPC – There’s plenty of room at the bottom – Mike Croucher, University of Sheffield
Question time

Session 3b – Virtual Laboratories and Research Environments

(Auditorium 1)
Chair: Philip Kershaw

CLIMB – Thomas Connor, Cardiff University / Nick Loman, Birmingham University
CyVerse UK: a Cloud Cyberinfrastructure for life science – Alice Minotto, Earlham Institute
EBI Cloud Portal – Jose Dianes, EMBL-EBI
Data Labs: A Collaborative Analysis Platform for Environmental Research – Nick Cook / Josh Foster, Tessella
Question time

Breakout session

(Seminar room)

ResOps training – Erik van den Bergh, EMBL-EBI
14:45 Break (Gallery area)
15:15 Session 4a – Technical Challenges – batch compute on cloud

(Auditorium 2)
Chair: David Colling

Matching to cloud technologies to Theoretical Astrophysics and Particle Physics applications – Jeremy Yates, UCL
Hybrid HPC – on-premise and cloud – Wil Mayers, Alces Flight
Running HPC Workloads on AWS using Alces Flight – Igor Kozin, ICR
OpenFOAM batch compute on AWS – James Shaw, Reading University
Question time

Session 4b – Technical Challenges – Storage

(Auditorium 1)
Chair: Simon Thompson

Semantic Storage of Climate Data on Object Store – Neil Massey, NCAS / Centre for Environmental Data Analysis, STFC
Accessing S3 from FUSE – Jacob Tomlinson, Informatics Lab
OpenStack Manila – John Garbutt, StackHPC
Providing Lustre access from OpenStack – Thomas Stewart / Francesco Gianoccaro, Public Health England
Implementing medical image processing platform using OpenStack and Lustre – Wojciek Turek, Cambridge University
Question time
16:35 Final Plenary
(Auditorium 2)
Feedback, next steps, cloud strategy for research community, sum-up
17:00 Reception (Gallery area)
18:00 Close


OpenStack Days UK 2017

OpenStack Days are community events with a mixed audience of operators, vendors and people interested in the cloud generally. They are organised independently in different regions around the world and the most recent UK edition took place in London on September 26th.

There were speakers from the OpenStack Foundation, from prominent suppliers of OpenStack services and from users, including representatives from the UK academic community. The opening keynote was from Thierry Carrez, VP of engineering at the Foundation, who discussed “The Four Opens” and how they apply to OpenStack:

  • Open source
  • Open development
  • Open design
  • Open community

He also addressed the increasing complexity in the OpenStack ecosystem and provided an excellent start to the day. The subsequent keynote talks from AVI Networks, Red Hat, Canonical and Mellanox addressed automation and networking.

The rest of the schedule was split across different rooms, and here are my impressions from the talks I attended. Two speakers from Huawei presented the latest developments of the Kuryr project in OpenStack, which integrates Neutron with Docker and Kubernetes to present network services to containers running in an OpenStack cloud, thereby allowing a more cloud-native approach. Kuryr uses the Dragonflow distributed SDN controller, which allows for good scaling (as demonstrated in tests with Redis) and now supports OpenFlow pipelines in OpenvSwitch. Interestingly, it achieves these scaling improvements by running on the compute nodes themselves. Recent efforts have extended Kuryr to embrace Kubernetes, including a controller and CNI driver. These components re-use as much as possible of the existing OpenStack infrastructure, such as Keystone for authentication/authorization and projects within Nova. When Kubernetes pods are running inside VMs, the speaker advised using trunk ports to avoid the performance penalty of double packet encapsulation. The next developments for Kuryr will include scaling for controllers as well as for ports, multi-network support and performance improvements. This was a very useful overview, including reports of performance testing – a feature often missing in such presentations.

Next was Tim Cutts presenting on the work at Sanger on Secure Lustre with OpenStack. Their approach, given the large legacy of existing scientific pipelines that make conservative assumptions about the infrastructure on which they’re running (e.g. POSIX file access), is to regard traditional HPC clusters and cloud computing as complementary, with the latter providing a flexible compute environment. Those workloads need to be supported as the level of cloud provision increases, giving time for them to be re-written with a cloud-native architecture. While they now have 14PB of Ceph storage (4.5PB usable), they needed to be able to use the very large datasets hosted on local Lustre resources, and to provide good security isolation between tenants when doing so. Tim described their work, which relied on new features only available since Lustre 2.9. Using one bare-metal Lustre router shared amongst all the tenants, they achieved 3GB/second aggregate performance across multiple clients. For greater isolation, they have also implemented a separate virtualised router for each tenant. Some wrinkles they found are that they needed to turn off port security, which is an acceptable tradeoff in a fairly tightly controlled environment, and that there is some inadvertent asymmetric routing by Lustre, which implies it does not check the origin of packets sufficiently. More details of their work are available online, and there is a video of a similar talk at ISC17.

Prometheus has been adopted as a supported project for monitoring by the Cloud Native Computing Foundation, and it is often discussed in connection with Kubernetes. I was interested to hear about monitoring OpenStack with Prometheus, in a talk by Csaba Patyi of Component Soft. OpenStack is composed of many services, each with its own logs, making the logging and monitoring situation quite complex. Csaba demonstrated how to use the well-established Elastic Stack for OpenStack. This works well, is easily configurable and does not require Logstash. However, it can be hard to separate data from metadata, and extra work is needed to handle multi-line logs, which are quite common in OpenStack. He showed how you can manipulate logs based on their information structure, which enables useful views in Kibana and the creation of dashboards based on log information. The code for the demo environment used in the talk is available on GitHub. For monitoring and alarming he discussed Prometheus, which was designed for cloud environments. Advantages he mentioned are that there are lots of Prometheus data exporters for OpenStack, and that it can be configured to search via DNS for new hosts, rather than relying upon static configuration.
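
To illustrate the multi-line log problem mentioned above, here is a minimal sketch of the usual fix: treat any line that does not start with a timestamp as a continuation of the previous entry. The timestamp pattern and the sample log lines are assumptions made for illustration; this is not the configuration shown in the talk.

```python
import re

# Assumption: each new log entry starts with a timestamp such as
# "2017-09-26 10:15:02"; anything else continues the previous entry.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def group_multiline(lines):
    """Merge continuation lines (e.g. tracebacks) into the entry they follow."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries


# Invented sample: one error with a three-line traceback, then one info line.
sample = [
    "2017-09-26 10:15:02.123 ERROR example.service Request failed",
    "Traceback (most recent call last):",
    '  File "example.py", line 1, in <module>',
    "ValueError: bad input",
    "2017-09-26 10:15:03.456 INFO example.service Recovered",
]

entries = group_multiline(sample)
print(len(entries))
```

Grouping before indexing means a traceback arrives in the search index as one event rather than several unrelated lines.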

My first talk in the afternoon session was on High Availability, by Kenneth Tan of Sardina Systems. Kenneth began by highlighting the new expectations that users have, based on their exposure to public cloud services such as AWS, and by pointing out the different perspectives that consumers of services have, as opposed to operators of those services. The first safeguard needed for high availability is to take an infrastructure-as-code approach, allowing for easy redeployment, whereas a simple safeguard for data is to use replication. He discussed the differences between extrinsic and intrinsic ‘death risks’ and the need to detect and distinguish what he termed ‘sick states’ e.g. when a node is affected but not dead – it is ‘unhealthy’. He emphasised the importance of looking for correlations and causality in monitoring data, and in looking for anomalies. Pushed data streams are better for scaling than polling, he suggested. The infrastructure implications of storing metrics and logs were listed, along with the need for systems that can handle unbalanced I/O patterns i.e. large numbers of small writes, along with a small number of very large reads. With good monitoring you should be able to predict imminent faults, and deal with latent threats, before they become patent threats.
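
As a rough illustration of looking for anomalies in monitoring data, the sketch below flags samples that sit far from the mean of the preceding window (a simple rolling z-score). This is a generic technique chosen for illustration, not the method Kenneth described, and the window size, threshold and latency figures are invented.

```python
from collections import deque
from statistics import mean, pstdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples far from the mean of the preceding window."""
    recent = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(samples):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Guard against a perfectly flat window (sigma == 0).
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(index)
        recent.append(value)
    return anomalies


# Invented latency samples (ms): the node still responds, but one reading spikes.
latencies = [20, 21, 19, 20, 22, 21, 95, 20, 21]
print(find_anomalies(latencies))
```

A node producing readings like these is still alive, which is why a plain up/down check would miss it; the spike is the kind of signal that could mark a ‘sick state’.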

OpenStack as a project has not been immune from the remorseless spread of containers, and one of the more interesting aspects for me has been the effort to allow running OpenStack services as containers. Steve Hardy from Red Hat gave a talk entitled “Deploying OpenStack at Scale with TripleO, Ansible and Containers”, which discussed the changes in the TripleO project to accommodate exactly such containerisation of services, and to make more use of Ansible in general for deployments. TripleO creates a small OpenStack installation (“the undercloud”) which is then used to install the main installation (“the overcloud”). Historically it largely used Puppet for all the installation tasks and lacked some flexibility, which made it harder than it should have been to customise deployments to meet local requirements. Node roles in TripleO are now composable and the Mistral project effectively provides an API for TripleO as a whole. One of the benefits of containerisation is dependency isolation, which makes it much easier to roll backwards and forwards between different versions. TripleO has been collaborating with the Kolla community on this effort. He also mentioned the Paunch tool, which manages the containers used by TripleO.

Link to talk PDF.

The final talk I attended in the technical tracks was by Julien Danjou, who is one of the developers in the Gnocchi project. Gnocchi was created when it became clear that Ceilometer lacked the performance required for time-series data, in large part because it was originally designed for billing and included a lot of flexibility that was irrelevant for monitoring. Some interesting features of Gnocchi are that it computes metric aggregations itself, that it can batch measurements together and send them in a single HTTP request, and that it scales horizontally by simply adding more nodes. He referred to recent performance testing in Gnocchi version 4, described on his blog. It’s an interesting alternative to the more generic monitoring approaches listed above.
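
The two ideas mentioned above, aggregating measures into fixed time buckets and batching several metrics into one request body, can be sketched in a few lines. This is an illustration only: the function names, metric names and payload shape are hypothetical and make no claim about Gnocchi's actual schema or implementation.

```python
import json
from collections import defaultdict


def aggregate_mean(measures, granularity):
    """Average (timestamp, value) pairs into buckets of `granularity` seconds."""
    buckets = defaultdict(list)
    for timestamp, value in measures:
        bucket_start = timestamp - (timestamp % granularity)
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


def build_batch(measures_by_metric):
    """Serialise measures for several metrics into one JSON request body.

    The payload shape is illustrative, not a statement of Gnocchi's schema.
    """
    payload = {
        metric: [{"timestamp": ts, "value": val} for ts, val in measures]
        for metric, measures in measures_by_metric.items()
    }
    return json.dumps(payload, sort_keys=True)


# Invented samples: (seconds since start, CPU utilisation %).
cpu = [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0)]
print(aggregate_mean(cpu, granularity=60))

# One body carrying two metrics, so a client needs only one HTTP request.
body = build_batch({"cpu_util": cpu, "mem_used": [(0, 512.0)]})
print(len(json.loads(body)))
```

Computing the aggregates when data is written, rather than when it is queried, is what keeps reads cheap for time-series data.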

Link to talk PDF.

The final talk was by Jonathan Bryce, who is the Executive Director of the OpenStack Foundation. This was quite informal and in part a Q&A, particularly useful for people new to the OpenStack community.

It was a packed day and, as always, I could only see a fraction of the talks I wanted to attend. It was a convenient way of catching up with the latest developments, seeing how people are using OpenStack, and meeting familiar faces in the community.