Session 3b: Lightning Talks and Demos Part 1

DIRAC originated in the LHCb experiment
400K HTC jobs are managed by DIRAC at any time
DIRAC runs on multiple infrastructures, including EGI (in Europe) and GridPP (in the UK)
DIRAC VMs also run on OpenStack clouds at DataCentred and CERN
Vcycle creates VMs using Nova API, EC2, OCCI, Azure
Vac presents a simulated OpenStack environment to the VMs
2.5K “Hello World” jobs were run through a virtual organisation

  • Jobs were routed to four sites: Vac (Manchester), LHC computing Grid (LCG), DataCentred and CERN
  • Federated four different sites using three different technologies, all running the same VM

How do jobs access the data? Is there a virtual private network between the sites as well?

  • DIRAC data is pulled across the public network; the data is publicly accessible, so no VPNs or similar are used.


BBSRC funded the first (and only) CyVerse node outside of the US, at the Earlham Institute in Norwich
Discovery environment, underpinned by Agave API
Docker eases deployment to Virtual HPC cluster running on OpenStack, managed by HTCondor
All application images are published on Docker Hub
Docker caveat: pin image versions rather than always taking latest (pinning provides versioning and accurate reproducibility of older versions)
Great for managing complex workflows
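
The version-pinning caveat above can be checked mechanically. The following is a minimal sketch, not the speakers' tooling: it flags image references that would resolve to latest. The image names are hypothetical, and the parsing is a simplification of Docker's reference format.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference names a fixed version."""
    # A digest (name@sha256:...) identifies exact image contents.
    if "@" in image_ref:
        return True
    # The tag, if any, follows the last ':' in the final path component;
    # a ':' earlier in the reference would be a registry port.
    last_component = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return False  # no tag: Docker falls back to "latest"
    tag = last_component.rsplit(":", 1)[1]
    return tag != "latest"


# Hypothetical image names, for illustration only.
images = ["example/aligner:1.4.2", "example/aligner:latest", "example/aligner"]
for ref in images:
    print(ref, "pinned" if is_pinned(ref) else "NOT pinned")
```

Note that a version tag can still be re-pushed by its publisher, so a digest is the stricter form of pinning.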


Autonomous cars will bring many benefits:

  • safety
  • better congestion management through intelligent traffic management
  • environmental advantages
  • comfort for passengers

Evolution of autonomous cars:

  • ABS
  • Electronic stability program
  • Adaptive cruise control
  • lane keeping assist
  • parking assist
  • autonomous driving in traffic jam and highway
  • full autonomy
  • Use 5G mobile networks to provide an ultra-low-latency, highly reliable platform (1 ms was quoted as the latency for traversing the 5G network)

Assure system-wide security without significantly degrading performance or increasing latency

Google self-driving car generates 1GB/s of data (2PB/car/year)
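
The two figures above can be cross-checked with simple arithmetic. This sketch assumes decimal units (1 PB = 1,000,000 GB) and that the annual figure is simply the data rate multiplied by driving time; neither assumption was stated in the talk.

```python
RATE_GB_PER_S = 1.0      # data rate quoted in the notes
ANNUAL_PB = 2.0          # annual volume per car quoted in the notes
GB_PER_PB = 1_000_000    # assumption: decimal units

seconds_per_year = 365 * 24 * 3600

# Volume if the car generated data nonstop for a whole year.
continuous_pb = RATE_GB_PER_S * seconds_per_year / GB_PER_PB

# Driving time implied by the quoted annual volume.
implied_hours = ANNUAL_PB * GB_PER_PB / RATE_GB_PER_S / 3600
implied_hours_per_day = implied_hours / 365

print(f"Nonstop for a year: {continuous_pb:.1f} PB")
print(f"Implied driving time: {implied_hours:.0f} h/year "
      f"({implied_hours_per_day:.1f} h/day)")
```

Under these assumptions the annual figure corresponds to roughly an hour and a half of driving per day, not continuous operation.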

CARMA is a 5-year project – currently in its early stages

Solution could use OpenStack or could use containers

Containers would need to migrate – to be closer to the car as it moves

  • Seamless migration without interruption? Perform a planned migration before the car reaches the edge of a zone (but would this still disrupt the service?)
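
One way to picture the planned migration is as a timing rule: start moving the container once the car's time to the zone edge drops below the time the migration needs. The sketch below is illustrative only and is not taken from the CARMA project; the function name, the one-dimensional road model and all numbers are hypothetical.

```python
def should_start_migration(distance_to_edge_m: float,
                           speed_m_per_s: float,
                           migration_time_s: float,
                           safety_margin_s: float = 1.0) -> bool:
    """Decide whether to begin migrating the car's container now."""
    if speed_m_per_s <= 0:
        return False  # a stationary car is not approaching the edge
    time_to_edge_s = distance_to_edge_m / speed_m_per_s
    # Start early enough that the migration finishes before the edge.
    return time_to_edge_s <= migration_time_s + safety_margin_s


# Hypothetical values: car at 30 m/s, migration assumed to take 5 s.
print(should_start_migration(500, 30, 5))  # far from the edge
print(should_start_migration(150, 30, 5))  # close to the edge
```

This rule only decides when to begin; it does not answer the open question of whether the service is disrupted while the migration is in progress.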


Defining, identifying and combating criminal activity in the cloud: 4 Ps: Pursue, prevent, protect, prepare
What criminality might occur in the cloud environment?

  • tooling to collect and process data about the cloud
  • identifying what indicators of criminality might exist?
  • how might we detect criminality whilst maintaining privacy?
  • how can we demonstrate the existence of criminality to a sufficiently high standard?

Cybercrime centre is a fusion of criminology, psychology and computing

Hierarchy of low-level sensors for crime, “study entropy of data involved in the crime”

  • Machine learning approaches for analysis of collected data
  • Development of ‘scripts’ for crimes
  • Analysis of prior crimes in order to identify patterns

Attacks on the cloud and from the cloud

Cloud-dependent crimes and cloud-enabled crimes

What kind of sensors?

  • how much data is going through, what sort of data it is, how long the packets are (can’t look at the contents of packets due to privacy…)
  • botnets? Not funded for that, but interested in it
  • How do hyperscalers ensure privacy? Would be interested to know…
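
A sensor that uses only packet metadata could summarise traffic with a statistic such as Shannon entropy. The sketch below computes the entropy of a list of packet lengths; applying entropy to packet lengths in particular is an assumption made for illustration, and the sample values are invented.

```python
import math
from collections import Counter


def shannon_entropy(values) -> float:
    """Shannon entropy, in bits, of the empirical distribution of values."""
    values = list(values)
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Invented packet lengths in bytes; no payload contents are inspected.
uniform_traffic = [64, 64, 64, 64]
mixed_traffic = [64, 1500, 64, 1500]
print(shannon_entropy(uniform_traffic))
print(shannon_entropy(mixed_traffic))
```

Identical packet lengths give zero entropy, while a more varied mix gives a higher value, so a sudden change in this figure is the kind of low-level signal a sensor hierarchy might pass upwards.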


Taking many of the “best practice” software stacks for biomedical applications (particularly ones popular in the USA)
Building database applications on virtual machines
Data warehousing using i2b2
Brisskit CiviCRM: patient cohort management – in use for >22k recruits in 14 studies
Brisskit OpenSpecimen: sample management – well over 100K samples under management
Brisskit REDCap: web-based questionnaire data capture; questionnaires can be completed with the patient on a tablet
Done a lot of work using MS Azure for Research – creating demo dataset sandboxes
Azure moving to UK data centre – may enable NHS clients to store more data in cloud
Uses Docker and Puppet

Code4Health – NHS programme to endorse open source software – endorsed Brisskit in 2015

  • Working towards participation in the 100,000 Genomes Project


  • Integrated support for core research processes


Climate modelling in public cloud

Uses BOINC toolkit – Berkeley Open Infrastructure for Network Computing

  • Enables home users to volunteer CPU time

AWS use case: World Weather Attribution – attempts to attribute individual severe weather events to climate change

Use spot fleets in AWS: use spare compute resources for much less than on-demand rates

Case Study 1: South America – 750 simulations, 50km resolution

  • Ran on 318 instances running for 90-120 hours
  • 12 instances were terminated due to spike in spot price

Actual cost of spot instances was 1/5th of the on-demand cost
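
The scale of the saving can be sketched from the figures above. The instance count, run time and one-fifth ratio come from the notes; the hourly on-demand price is a hypothetical placeholder, and the calculation assumes every instance ran for the full duration, which ignores the 12 terminations.

```python
INSTANCES = 318                  # from the notes
HOURS_MIN, HOURS_MAX = 90, 120   # run-time range from the notes
SPOT_FRACTION = 1 / 5            # spot cost relative to on-demand, from the notes
ON_DEMAND_PRICE = 0.50           # hypothetical USD per instance-hour


def costs(hours: float):
    """Return (instance-hours, on-demand cost, spot cost) for a run length."""
    instance_hours = INSTANCES * hours
    on_demand = instance_hours * ON_DEMAND_PRICE
    spot = on_demand * SPOT_FRACTION
    return instance_hours, on_demand, spot


for hours in (HOURS_MIN, HOURS_MAX):
    instance_hours, on_demand, spot = costs(hours)
    print(f"{hours} h: {instance_hours} instance-hours, "
          f"on-demand ${on_demand:,.0f}, spot ${spot:,.0f}")
```

Only the instance-hour totals are grounded in the talk; the dollar amounts scale directly with whatever on-demand price actually applied.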

Proof-of-concept a success:

  • Good for fast computation of urgent results
  • Cost effective use of spot fleets
  • Spot price volatility can kill off instances

Any interesting attributions?

  • Flooding in Paris earlier in 2016 – identified a small increase attributable to climate change