Greg Kurtzer invented software called Singularity to enable the use of containers in high-performance computing (Credit: Marillyn Chung/Berkeley Lab)
Berkeley Lab’s Open-Source Spinoff Serves Science
Singularity allows scientists to use “containers” in high-performance computing
By Julie Chao
Scientists used to come to Gregory Kurtzer of Lawrence Berkeley National Laboratory’s (Berkeley Lab’s) IT department a lot, asking for a better way to use software containers in a high-performance computing (HPC) environment. After a while he got tired of saying, “Sorry, not possible.” So he invented a solution and named it Singularity.
Within a few months of its release last year, Singularity took off. Computing-heavy scientific institutions worldwide, from Stanford University to the Massachusetts Institute of Technology to various sites on the European Grid e-Infrastructure, flocked to the software. Singularity was also recently recognized by HPCwire editors as one of five new technologies to watch.
“Singularity has been making huge strides in the computing community,” Kurtzer said, with some surprise, adding that Open Science Grid, a consortium that provides distributed computing resources for scientific research, has served over 20 million containers with Singularity.
It’s now on its seventh release (version 2.2.1) and has caught on so quickly that Kurtzer has launched SingularityWare LLC to further develop and support the open-source software. The company is being funded by RStor Inc., a startup based in Saratoga, California. Kurtzer, the long-time technical lead and architect for the HPC Services group at Berkeley Lab with a joint appointment at UC Berkeley, will shift to an advisory role at the Lab in order to focus on Singularity.
“Berkeley Lab makes some Lab-developed software available at no cost to maximize its impact and to participate in the open-source software community,” said Elsie Quaite-Randall, Berkeley Lab’s Chief Technology Transfer Officer. “Singularity fosters innovation as open-source software, and now SingularityWare LLC, like other Berkeley Lab startups, will set out to expand the reach and adoption of an important technology.”
Who needs containers, anyway?
A typical use case for Singularity is running an application such as Google’s TensorFlow. “They may need a very specific version of TensorFlow installed,” Kurtzer said. “They can create a container to do that in about five minutes. Then they can take that container, bring it to our environment and run it, even if we don’t have that version of TensorFlow installed.”
Software containers make it possible to take your entire computing environment, including your files and all the applications you want to run, and encapsulate it so it can be easily replicated on another machine without worrying whether the new machine has a compatible operating system, libraries, applications, and so forth.
“Containers share some of the use cases of virtual machines but without the code redundancy and performance hit associated with virtualization,” Kurtzer said. “Singularity containers allow a user to encapsulate an entire OS (operating system) environment and use it on a shared HPC system like any other program, without an admin doing anything.”
Singularity is also useful for letting other scientists reproduce experiments. “Say you just published an article. Wouldn’t it be nice to have a location you can cite where someone can download the Singularity container and replicate all the experiments?” Kurtzer asked. “Someone can enter the container, and now they’re sitting in the exact same environment as you were.”
Containerization was developed for enterprise environments, where it has become very popular, especially with the rise of Docker’s container technology. “Docker’s container solution is for the enterprise. But the scientific use case is quite different,” Kurtzer said. “Our goal isn’t to run as many containers as we can on a single host, with each having the illusion of sole occupancy and isolation, but to run maybe one, and enable it to utilize all the resources on that host. It’s kind of the opposite of isolation!”
So Kurtzer started working on his own solution, and the first version was released four months later, last spring. “When I started working on it, I asked, what do scientists really need from containers? They need reproducibility, mobility, and also freedom: the ability to install their own applications and run in their own environment, and store it just like any other data file,” Kurtzer said. “That’s what Singularity solves for scientific computing.”
Long tail of science
Kurtzer chose the name Singularity for its meaning in astronomy. “As I understand it, it’s the culmination of a whole bunch of matter in the universe forming a single infinitely dense point,” he said. “That’s what I was thinking when I was creating Singularity: taking everything necessary to create a reproducible scientific environment and putting it in one file.”
Singularity also enables users to run legacy workflows easily. Kurtzer cites one example of how his group saved an 18-year-old workflow from failing hardware and was able to convert it to a Singularity container that is still being used in production today.
Kurtzer believes Singularity will benefit scientists who may not even know they need it. “We’re trying to reach out to more scientists and engage with additional groups, especially those who are not traditional HPC users, also known as the computational ‘long tail of science,'” Kurtzer said. “We have a lot of users that are running computationally intensive jobs on their laptops and workstations and not making use of the dedicated computational cycles that are designed specifically for computing and available to them. With Singularity we can easily make these large computing resources tangible.”
source: Lawrence Berkeley National Laboratory