Research Software Reactor

Supporting researchers, research software engineers, and technical specialists in the application of cloud computing for the research community.

Review: 1st Software Reactor Sprint: CycleCloud, BinderHub and The Littlest JupyterHub

29 May 2019

Over three days, 20-22 May 2019, the Research Software Engineering community and Microsoft worked together on a range of projects focused on making researchers productive on Azure. Of the proof-of-concept (POC) projects originally proposed, three were selected:

  • Deploying HPC on Azure using CycleCloud.
  • Push button deployment of BinderHub.
  • Documentation for deployment of The Littlest JupyterHub on Azure.

These POCs were not only about providing training and creating reusable resources for the RSE community; they also served to identify specific barriers to the adoption of cloud computing in academic research institutions. Most of the issues encountered were readily solved or worked around by the Azure engineers present at the event. The Azure team was also in real-time contact with the international engineering teams who develop the services. In the case of CycleCloud, 20 separate issues were filed over the three days, and these were resolved either during or soon after the event.

Overall the feedback from the participants was great and the most common question was when we were organising the next one :-)


It is this close engagement between the research community and Azure solution architects, supported by their engineering teams, that defines Research Software Reactor.



Azure CycleCloud - cluster-in-the-cloud

During the sprint, we explored deploying HPC services in the cloud using Azure CycleCloud. Azure CycleCloud is a tool for creating, managing, operating, and optimizing HPC and Big Compute clusters in Azure. With Azure CycleCloud, users can dynamically provision HPC clusters in Azure and orchestrate data and jobs for hybrid and cloud workflows. It provides alerting and monitoring, and automatically scales HPC infrastructure so that jobs run efficiently at any scale. It also offers policy and governance features, such as cost reporting and controls, usage reporting, AD/LDAP integration, and audit/event logging, to give users full control over who runs what, where, and at what cost within Azure.

The aim was to get CycleCloud working for people in personal or organisational Azure Subscriptions. Objectives included:

  • Crib sheet for research software engineers to use with central IT around tenant and subscription access.
  • Provision CycleCloud for research software engineers, who are typically NOT subscription or tenant owners.
  • Instructions for provisioning the CycleCloud head node, covering deployment and configuration.
  • Automated deployment of CycleCloud via an ARM template, a PowerShell script, and a one-button "Deploy to Azure" deployment.
  • Launch permanent shared, scalable (parallel) storage.
  • Install cluster-in-the-cloud, with profile downloaded from organisation template.
  • Given sufficient budget, each job should set up an instance with budget controls and authentication through Azure Active Directory (AAD).
  • Spin up research instances as in research-compute-instance, or otherwise.
  • Auto spin down dormant instances.
  • Persist storage but connect to Blob storage so users can move data onto and off the instance.
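
As an illustration of the "auto spin down dormant instances" objective, the following minimal Python sketch picks out instances that have been idle for longer than a threshold. It is not the sprint's implementation and does not call any CycleCloud or Azure API; the `Instance` record, the instance names, and the two-hour idle limit are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Instance:
    """Hypothetical record for one research compute instance."""
    name: str
    last_active: datetime  # last time a job or login was observed


def dormant_instances(instances, now, idle_limit=timedelta(hours=2)):
    """Return the names of instances idle for longer than idle_limit."""
    return [i.name for i in instances if now - i.last_active > idle_limit]


# Example fleet with made-up names and activity times.
now = datetime(2019, 5, 22, 12, 0, tzinfo=timezone.utc)
fleet = [
    Instance("rse-vm-a", now - timedelta(minutes=15)),
    Instance("rse-vm-b", now - timedelta(hours=5)),
]
to_stop = dormant_instances(fleet, now)
print(to_stop)  # ['rse-vm-b']
```

In a real deployment the activity timestamps would come from the scheduler or a monitoring service, and the selected instances would be handed to whatever mechanism stops or deallocates them.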

Thanks to the support of the Azure team at the sprint, and of the engineering teams who develop the services and gave us real-time support, we successfully deployed HPC clusters on Azure using CycleCloud and created practical guidance for RSEs to do this for themselves. As is to be expected when experimenting with new technologies, we did run into some issues: over the three days, 20 separate issues were filed for CycleCloud. However, the collaborative nature of the event meant that these issues were all resolved during or shortly after the sprint.

RSEs wishing to use CycleCloud to deploy an HPC cluster on campus should follow the guidance at https://github.com/research-software-reactor/cyclecloud/tree/master/arm-templates and the initial setup guidance at https://github.com/research-software-reactor/cyclecloud/blob/master/QuickStarts/SettingUpCycleCloud.md. The Quick Start tutorials at https://github.com/research-software-reactor/cyclecloud/tree/master/QuickStarts include Slurm cluster deployments.

Push button deployment of BinderHub on the Cloud

Sarah Gibson led one of the sprint teams to create a push button deployment of BinderHub for Azure. By the end of day 3 we had a functioning version, but with some wrinkles. Thanks to the tenacity of the team, all the wrinkles were ironed out in the weeks that followed. Sarah presented this work at RSEConUK 2019 and also wrote up a great blog on the whole experience.

The Littlest JupyterHub on Azure

Tania Allard led the effort on documenting how to deploy The Littlest JupyterHub on Azure, and also created a push button deployment!