What is a Supercomputer and How Does It Work?
Walking among the rows of supercomputer cabinets in Argonne National Laboratory’s Leadership Computing Facility, located about 25 miles from Chicago, is kind of like wandering through a high-tech version of “The Shining’s” Overlook Maze — minus the axe-wielding madman.
The two primary supercomputers in that steel labyrinth, named Mira and Theta, comprise 101 cabinets the size of standard refrigerators that contain stacks of racks and weigh between 3,450 and 4,390 lbs. each. Their combined total: 160 tons — much of that heft due to water-cooling systems that prevent overheating. Along with several other smaller systems, the machines are housed in a 25,000-square-foot data center with low ceilings and a white tile floor. With all of that equipment whirring away, it is not a quiet place. Nearest the computers, visitors must speak-shout in order to be heard above a constant loud hum.
Six Billion Times Faster: The Aurora Supercomputer's New Home
Sprawling though the facility is, it’s insufficient to accommodate the beast that’s soon to land there. By sometime in 2021, if all goes according to plan, a fantastically powerful new supercomputer dubbed Aurora will take up residency. And so, in preparation for its arrival, a major expansion is under way. Priced at $500 million, Aurora will be the first of three so-called “exascale” supercomputers, each capable of performing a billion billion (aka a quintillion) calculations per second. The U.S. Department of Energy (DOE), which runs Argonne and 17 other national laboratories, is investing $1.8 billion in the trio. (Another, dubbed Frontier, will soon be installed at Oak Ridge National Laboratory in Tennessee.)
Not surprisingly for that kind of dough, Aurora will be capable of performing minor computational miracles. Rated at 10^18 FLOPS (floating point operations per second), the system will be six billion times faster than its long-ago predecessor, the groundbreaking Cray-1 from 1976. Put in more tangible terms courtesy of Design News, “A person adding 1+1+1 into a hand calculator once per second, without time off to eat or sleep, would need 31.7 trillion years to do what Aurora will do in one second.”
That’s between five and 10 times quicker than the current reigning champ of supercomputers, an IBM-Nvidia mega-machine called Summit that resides at Oak Ridge. Mind, blown.
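For the arithmetically inclined, here is a back-of-the-envelope sketch of those comparisons in Python. It takes Aurora's exascale target and Summit's roughly 200 quadrillion calculations per second at face value. The Cray-1 figure (a peak of 160 megaflops) is an assumption brought in for illustration, not a number from this story, and the helper name `speedup` is made up for the example.

```python
# Rough speed comparisons using figures quoted in this article.
# CRAY1_FLOPS is an assumption (a commonly cited peak for the Cray-1),
# not a number taken from the article.

EXA = 10**18                   # one quintillion operations per second
AURORA_FLOPS = EXA             # Aurora's exascale target
SUMMIT_FLOPS = 200 * 10**15    # "200 quadrillion" per second
CRAY1_FLOPS = 160 * 10**6      # assumed peak: 160 megaflops


def speedup(fast: int, slow: int) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


if __name__ == "__main__":
    print(f"Aurora vs. Summit: {speedup(AURORA_FLOPS, SUMMIT_FLOPS):.1f}x")
    print(f"Aurora vs. Cray-1: {speedup(AURORA_FLOPS, CRAY1_FLOPS):.2e}x")
```

Under those assumptions the Summit ratio comes out at the low end of the "five to 10 times" range, and the Cray-1 ratio lands in the neighborhood of six billion.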
Who will Aurora unseat? Here's a look at the 10 fastest supercomputers in the world according to the brain trust at TOP500.
Top 10 Fastest Supercomputers in the World
- Summit | U.S.
- Sierra | U.S.
- Sunway TaihuLight | China
- Tianhe-2A | China
- Piz Daint | Switzerland
- Trinity | U.S.
- ABCI | Japan
- SuperMUC-NG | Germany
- Titan | U.S.
- Sequoia | U.S.
“There are limitations on what we can do today on a supercomputer,” Mike Papka, director of the Leadership Computing Facility, said recently after giving a tour of the space. “With Aurora, we can take those to the next level. Right now, we can do simulations of the evolution of the universe. But with Aurora, we’ll be able to do that in a more realistic manner, with more physics and more chemistry added to them. We’re starting to do things like try to understand how different drugs interact with each other and, say, some form of cancer. We can do that on a small scale now. We’ll be able to do that on an even larger scale with Aurora.”
As one of 52 Energy Department supercomputers, Aurora will probably be the only exascale system in existence when it debuts. (That is, unless China builds one first — which some insiders say is pretty unlikely despite reports that the country is scrambling to make one by 2020.) At a March 2019 press conference announcing Aurora’s installation, Argonne associate laboratory director Rick Stevens explained that the system will handle high performance computing applications as well as analysis of streaming data that's generated by accelerators, detectors, telescopes and other research equipment.
At this point, though, Aurora remains a work in progress while Summit gets the glory. Originally slated to go live several years ago in a far less powerful incarnation and launched in mid-2018, Summit cost $200 million, can perform complex mathematical computations at a rate of 200 quadrillion (or 200,000 trillion) per second and is responsible for snatching back America’s number one perch on the TOP500 list from China. Physically imposing, it is made up of more than 300 units, similar in size to those of Mira and Theta, that weigh a total of 340 tons, occupy 9,250 square feet and are powered by 9,216 central processing chips. Inside are miles of fiber-optic cable, and cooling this behemoth requires 4,000 gallons of water per minute. It also consumes energy voraciously: enough to power thousands of homes.
When the “father of supercomputing,” Seymour Cray, first began building his revolutionary machines in the 1960s, such a rippling display of computational muscle was incomprehensible. More than a half century later, it's slowly becoming the norm — and will someday seem as quaint as an Atari 2600 does now.
What is a Supercomputer?
(Hint: Parallel Computing Is Key)
Supercomputers have for years employed a technique called “massively parallel processing,” whereby problems are split into parts and worked on simultaneously by thousands of processors as opposed to the one-at-a-time “serial” method of, say, your regular old MacBook Air. Here’s another good analogy, this one from Explainthatstuff.com:
It’s like arriving at a checkout with a cart full of items, but then splitting your items up between several different friends. Each friend can go through a separate checkout with a few of the items and pay separately. Once you’ve all paid, you can get together again, load up the cart, and leave. The more items there are and the more friends you have, the faster it gets to do things by parallel processing — at least, in theory.
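The checkout analogy can be sketched in a few lines of Python: split the cart, let each "friend" total up a share at the same time, then combine the subtotals. This is a toy illustration of the split, work, combine pattern, not how a real supercomputer code is written. The function names (`split`, `checkout`, `parallel_total`) are invented for the example. Threads stand in for the checkout lanes for simplicity. In standard Python they won't actually speed up arithmetic, and real supercomputer programs spread the chunks across many separate processes and machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Deal items into `parts` nearly equal chunks (one per 'friend')."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def checkout(chunk):
    """One friend's job: total up the prices in their share of the cart."""
    return sum(chunk)


def parallel_total(prices, friends=4):
    """Split the cart, check out each chunk concurrently, then combine."""
    chunks = split(prices, friends)
    with ThreadPoolExecutor(max_workers=friends) as pool:
        subtotals = list(pool.map(checkout, chunks))
    return sum(subtotals)


if __name__ == "__main__":
    cart = [3, 5, 2, 8, 1, 9, 4, 7, 6, 10]
    # Should match the ordinary one-lane total, sum(cart).
    print(parallel_total(cart, friends=3))
```

The final `sum(subtotals)` is the "get together again and load up the cart" step. It is also a reminder that some part of the job always has to be done in one place.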
“You have to use parallel computing to really take advantage of the power of the supercomputer,” says Rensselaer Polytechnic Institute doctoral candidate Caitlin Joann Ross, who recently did a six-month residency at Argonne. “You have to understand how data needs to be exchanged between processes in order to do it in an efficient way, so there are a lot of different little challenges that make it a lot of fun to work with. Although there are days when it can certainly be frustrating.”
“Debugging” issues, she says, are the chief cause of that frustration. Calculations that might run smoothly using four processors, for instance, could break down if a fifth is added.
“If you've got everything running perfectly,” Ross says, “then whatever it is that you're running is running a lot faster than it might on a computer with fewer processors or a single processor. There are certain computations that might take weeks or months to run on your laptop, but if you can parallelize it efficiently to run on a supercomputer, it might take a day.”
Another area of Ross’s work involves simulating supercomputers themselves — more specifically, the networks used on supercomputers. Data from applications that run on actual supercomputers is fed into a simulator, which allows various functions to be tested without taking the whole system offline. Something called “communications interference” is one of those functions.
“In real life, different users will submit jobs to the supercomputer, which will do some type of scheduling to determine when those jobs run,” Ross says. “There will typically be multiple different jobs running on the supercomputer at the same time. They use different compute nodes, but they share the network resources. So the communication from someone else’s job may slow down your job, based on the way data is routed through the network. With our simulations, we can explore these types of situations and test out things such as other routing protocols that could help improve the performance of the network.”
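To see why a neighbor's traffic can slow your job down, consider a deliberately tiny model, far simpler than the simulators Ross describes. It assumes every job starts sending at the same moment over a single shared link and that the link's bandwidth is split equally among whichever transfers are still active. The function name `finish_times`, the single-link setup and the equal-share rule are all assumptions made for illustration.

```python
def finish_times(data_sizes, bandwidth):
    """Finish time of each transfer when all start together and share one
    link equally (a simple fair-share model, not a real network simulator).

    data_sizes: amount of data each job must send (any consistent unit)
    bandwidth:  link capacity in that unit per second
    """
    # Smaller transfers finish first, so walk through them in size order.
    order = sorted(range(len(data_sizes)), key=lambda i: data_sizes[i])
    times = [0.0] * len(data_sizes)
    now = 0.0    # simulated clock
    sent = 0.0   # data each still-active job has sent so far
    active = len(order)
    for i in order:
        remaining = data_sizes[i] - sent
        # Each active job gets bandwidth / active until this one finishes.
        now += remaining * active / bandwidth
        sent = data_sizes[i]
        times[i] = now
        active -= 1
    return times


if __name__ == "__main__":
    alone = finish_times([10], bandwidth=10)
    shared = finish_times([10, 30], bandwidth=10)
    print("your job alone:", alone[0], "seconds")
    print("your job with a neighbor:", shared[0], "seconds")
```

In this model a job that would finish its transfer in one second alone takes two once a second job shares the link, even though the two jobs never touch the same compute nodes.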
What Are Supercomputers Used For?
Just Simulating Reality, That's All
For the past several decades and into the present day, supercomputing’s chief contribution to science has been its ever-improving ability to simulate reality in order to help humans make better performance predictions and design better products in fields from manufacturing and oil to pharmaceutical and military. Jack Dongarra, one of the world's foremost supercomputing experts, likens that ability to having a crystal ball.
“Say I want to understand what happens when two galaxies collide,” Dongarra says. “I can’t really do that experiment. I can't take two galaxies and collide them. So I have to build a model and run it on a computer. Or in the old days, when they designed a car, they would take that car and crash it into a wall to see how well it stood up to the impact. Well, that's pretty expensive and time consuming. Today, we don’t do that very often; we build a computer model with all the physics [calculations] and crash it into a simulated wall to understand where the weak points are.”
Companies, especially, see the monetary value (ROI, as the corporate types say) in supercomputing simulations, whether they’re manufacturing cars, drilling for oil or discovering new drugs. In 2018, corporate and government purchases contributed to an increasingly robust high-performance computing market.
“Of the top five hundred computers, more than half are in industry,” Dongarra, who spent an early portion of his career at Argonne, says. “Industry gets it. They are investing in high performance computers to be more competitive and to gain an edge on their competition. And they feel that money is well spent. They are investing in these things to help drive their products and innovation, their bottom line, their productivity and their profitability.”
But it’s bigger than just ROI.
“Traditional commercial enterprise can see return on investment calculations of, ‘It saved us this amount of physical testing costs,’ or, ‘We were able to get to market quicker and therefore gain extra income,'" says Andrew Jones, a UK-based high performance computing consultant. "But a basic ROI calculation for HPC is not necessarily where the value comes from. If you ask an oil company, it doesn't come down to being able to find oil 30 percent cheaper. It comes down to being able to find oil or not."
Companies that use supercomputing to make big-picture improvements and increase efficiency have an edge on their competitors.
“And the same is true for a lot of the science," Jones adds. "You're not necessarily looking for a return on investment in a specific sense, you’re looking for general capability — whether our researchers are able to do science that is internationally competitive or not.”
The Need for Speed
Because faster computers allow researchers to more quickly gain greater insight into whatever they’re working on, there’s an ever mounting need — or at least a strong desire — for speed. Dongarra calls it “a never-ending quest,” and Aurora’s (still unproven) sustained exascale capabilities would be the pinnacle of that quest so far. Still, it will be one of many. Scores more supercomputers with sometimes epic-sounding names (Titan, Excalibur) operate in 26 other countries around the world. Manufactured by 36 different vendors, they’re driven by 20 generations of processors and serve a variety of industries as well as government functions ranging from scientific research to national defense.
Those stats are from the website TOP500.org. Co-founded by Dongarra, it has kept tabs on all things supercomputing since 1993, and uses his LINPACK Benchmark (which estimates how fast a computer is likely to run one program or many) to measure performances. According to its latest rundown of the globe’s biggest and baddest, America has five (soon to be six) of the top 10 — including the planet’s fastest supercomputer in Oak Ridge’s Summit and the second fastest, Sierra, at the Lawrence Livermore National Laboratory in California. Runner-up China has only two (but soon to be three). Sure, the country occupies 227 of the top 500 spots and has manufactured 303 of the machines on that list, but the USA can still brandish its giant foam finger. For now. The contest is ongoing and shows no signs of abating.
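For a feel of what a LINPACK-style measurement involves, here is a toy sketch, not the real benchmark code. The actual benchmark times how long a machine takes to solve a large, dense system of linear equations and converts that time into a FLOPS rating. Everything named below (`solve`, `benchmark`, the matrix size, the random test data and the leading-order operation count of 2n^3/3) is an illustrative assumption. Pure Python will post numbers many orders of magnitude below anything on the list.

```python
import random
import time


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Works on copies, so the caller's data is left untouched."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: swap in the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from every row below the pivot row.
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
            b[r] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (b[r] - s) / a[r][r]
    return x


def benchmark(n=120, seed=0):
    """Time one solve of a random n-by-n system; return (solution, FLOPS)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1, 1) for _ in range(n)]
    start = time.perf_counter()
    x = solve(a, b)
    elapsed = time.perf_counter() - start
    flops = (2 * n**3 / 3) / elapsed  # approximate, leading-order count
    return x, flops


if __name__ == "__main__":
    _, rate = benchmark()
    print(f"roughly {rate:.3g} floating point operations per second")
```

The appeal of this kind of test is that the amount of arithmetic is known in advance from the matrix size, so one stopwatch reading is enough to rank very different machines on the same scale.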
“There are no two greater offenders of ‘look at how big my system is’ than the U.S. and China,” says Nicole Hemsoth, co-founder and co-editor of The Next Platform.
While China has historically been less concerned with the Top 500, she explains, over the last several years they’ve made high performance computing “a point of national pride,” placing more emphasis on “chart-topping performance” and spending billions to achieve it. Other exascale competitors include France and Japan. According to one study, $10 billion of a projected $130 billion spent on supercomputers between 2018 and 2021 will go toward exascale systems like the one that’s slated for Argonne.
“The race between countries is partly real and partly artificial,” says Jones. “So, for example, if you are the director of a U.S. national lab and you're trying to secure funding for your next HPC machine, it's a very good argument to say that, ‘Well, China's got one that's ten times bigger, so we need to catch up.’ The European Union and China play the same game against the U.S., so there's a little bit of created tension that isn't necessarily real, but it's helping to drive the [competition].”
The media plays a significant role, too. Journalists love to roll out brain-boggling supercomputer stats and explain them evocatively. There’s an example of that at the start of this story. Here’s another, from the New York Times: “If a stadium built for 100,000 people was full, and everyone in it had a modern laptop, it would take 20 stadiums to match the computing firepower of Summit.” ARE YOU NOT ENTERTAINED?
Government officials also enjoy a bit of supercomputing swagger, talking up their gargantuan processing power as the key to societal improvement — and, of course, evidence of their country’s total awesomeness. John F. Kennedy, who revved up the space race in 1961, would have been all over this.
“It’s basic economic competitiveness,” Jones says. “If you drop so far off that your nation is no longer economically competitive with other comparably sized nations, then that leads to a whole load of other political and security issues to deal with.”
Computing Speed + Power = Military Might
Beyond the security and economic aspects, he adds, those who clearly understand the implications of high-performance computing see its huge benefits to science, business and other sectors. "So it's a no-brainer that we do this stuff.” (Granted, some reports say those benefits are overblown.) On the nuclear armaments front, for example, supercomputers have proven a huge boon to things that go boom. Sophisticated simulations have eliminated the need for real-world testing.
“They don't develop something, go out into the desert, drill a hole and see if it works," Dongarra says of a practice that stopped decades ago. "They simulate that [weapon] design on a supercomputer. They also simulate what happens to those [weapons] if they are on a shelf for so many years, because they have to verify that the stockpile will work.”
In a major recent upgrade, the Air Force Research Lab — one of five U.S. Department of Defense supercomputing centers — installed four sharable supercomputers on which the entire U.S. military can conduct classified research. The project was promoted as a way to help Air Force, Army and Navy researchers "quickly respond to our nation’s most pressing and complex challenges, which is also accelerating new capabilities to the warfighter at lower costs to the taxpayer.”
Interpret that however you want.
Supercomputing and Artificial Intelligence
Artificial intelligence is still pretty rudimentary, but supercomputers are changing that by turbo-charging machine learning processes to produce quicker results from more data — as in this climate science research.
“To be engaged in supercomputing is to believe in the power of the algorithm to distill valuable, meaningful information from the repeated implementation of procedural logic,” Scott Fulton III writes in an insightful story on ZDNet. “At the foundation of supercomputing are two ideals: one that professes that today's machine will eventually reach a new and extraordinarily valuable solution, followed by a second and more subtle notion that today's machine is a prototype for tomorrow’s.”
As Argonne director Paul Kearns told HPCWire, Aurora is intended for "next generation" AI that will accelerate scientific discovery and make possible improvements in such areas as extreme weather forecasting, medical treatments, brain mapping and the development of new materials. It will even help us further understand the universe, he added, "and that is just the beginning.”
While Dongarra thinks supercomputers will shape the future of AI, exactly how that will happen isn't entirely foreseeable.
“To some extent, the computers that are being developed today will be used for applications that need artificial intelligence, deep learning and neuro-networking computations,” says Dongarra. “It’s going to be a tool that aids scientists in understanding and solving some of the most challenging problems we have.”
“Going to be” — future tense. AI work is still only a small percentage of what supercomputers do. For the most part, Jones says, they’re “time machines” that are “bringing next science from five years ahead into today.”
“Ninety percent of traditional HPC installations are still doing traditional HPC workloads — engineering simulations, fluid dynamics, weather and climate modeling,” he explains. “And AI is there at the five or 10 percent level augmenting those and helping to make them work better, but it’s not yet dominating the requirements for buying HPC platforms or even for guiding HPC funding programs.”
Hemsoth thinks it will probably be another five years before existing HPC workflows include a lot of AI and deep learning, both of which will have different compute requirements than they presently do.
“Everyone is jumping the gun a little bit when it comes to AI,” she says. “They're buying systems that are right for AI as it is now. AI will be a practical part of workloads, but it's going to change. And the actual software and application that stuff needs to run on is going to change, which is going to change what hardware you need to have. This stuff is evolving rapidly, but with really long hardware production cycles — especially if you're a national lab and have to procure this stuff three to five years before you ever even get the machine.”
The Future of Supercomputing
Another brain blaster: your current smartphone is as fast as a supercomputer was in 1994, one that had 1,000 processors and did nuclear simulations. (Is there an app for that?) It stands to reason, then, that the smartphone (or whatever it’s called) you have in a quarter-century could theoretically be on the level of Aurora. The point is, this stuff is speedy, and it's only getting speedier. Here's how Dongarra nutshells it:
“We reached teraflops in 1997 on a machine at Sandia National Laboratories. That was 10^12 [operations per second]. Then, in 2008, we reached petaflops (10^15) at Los Alamos. Now we’re on the verge of hitting exascale, with 10^18 operations, around the start of 2020 or 2021. In probably 10 or 11 years, we are going to be at zettascale, 10^21 operations per second. When I started in computing, we were doing megaflops, 10^6 operations. So things change. There are changes in architecture, changes in software and applications that have to move along with that. Going to the next level is a natural progression.”
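Dongarra's timeline can be checked with a little arithmetic. The sketch below takes only the two dated milestones from the quote (teraflops in 1997, petaflops in 2008), works out the average yearly speedup between them, and asks how long the next thousandfold jumps would take at that same pace. The constant-growth assumption and the helper names `annual_growth` and `years_to_reach` are illustrative, not anything Dongarra states.

```python
import math

# Milestones taken from the quote: (year, operations per second).
TERA = (1997, 1e12)
PETA = (2008, 1e15)


def annual_growth(start, end):
    """Average yearly speedup factor between two (year, flops) milestones."""
    (y0, f0), (y1, f1) = start, end
    return (f1 / f0) ** (1 / (y1 - y0))


def years_to_reach(target, current, growth):
    """Years needed to grow from `current` to `target` at a fixed yearly factor."""
    return math.log(target / current) / math.log(growth)


if __name__ == "__main__":
    g = annual_growth(TERA, PETA)
    print(f"average growth: about {g:.2f}x per year")
    print(f"peta to exa:  {years_to_reach(1e18, 1e15, g):.1f} years")
    print(f"exa to zetta: {years_to_reach(1e21, 1e18, g):.1f} years")
```

At the historical pace each thousandfold jump takes about 11 years, which lines up with his "10 or 11 years" guess for zettascale. Whether the hardware keeps to that pace is another matter.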
A recent story on TOP500.org titled “Supercomputing is heading toward an existential crisis” paints a picture of things to come in which simulations take a back seat.
“Machine learning, in particular, could come to dominate most computing domains, including HPC (and even data analytics) over the next decade-and-a-half,” author Michael Feldman writes. “While today it’s mostly used as an auxiliary step in traditional scientific computing – for both pre-processing and post-processing simulations – in some cases, like drug discovery, it could conceivably replace simulations altogether.”
Whatever form supercomputers take, Argonne’s Papka says they'll become increasingly powerful and transformative, affecting everything from the pedestrian to the profound — from the design of more efficient electric car batteries to, just maybe, the eradication of long-battled diseases like cancer. Or so he hopes.
“The betterment of mankind,” Papka says, “is a noble goal to have."