Jose Dianes, EMBL-EBI
The traditional pipeline model for biological data is being challenged by the growing size and geographical distribution of the data. The massive datasets produced by advances in technology, together with the need to analyse data from different sources, are making it harder to run pipelines in-house at scale.
Cloud computing addresses many of these issues. By leveraging additional resources on demand, we can deal with scalability problems as they occur. Moreover, when data is stored in the cloud, we can deploy a pipeline close to it, effectively moving compute to data and avoiding the problems associated with transferring datasets over long distances and across legal jurisdictions.
The EBI Cloud Portal is being developed to provide scientists and cloud experts with a platform where computational biology becomes scalable, reproducible, and abstracted away from the complexity associated with different cloud providers. It provides an application model in which developers can focus on these challenges and make cloud-ready applications available for scientists to use in the cloud. Any cloud provider can be used given the right credentials and configuration, which are either provided by the scientists themselves or shared within organisational teams.