Low-income commuters who rely on public transit face many challenges—multiple transfers, long waits, and off-hour travel—that aren’t measured in the usual ridership surveys. Vanessa Frias-Martinez, a computer scientist at the University of Maryland, College Park, wants to ease their commute by harnessing two hot trends in computer science, cloud computing and artificial intelligence (AI), which Congress now hopes to scale up dramatically for U.S. scientists.
With support from the National Science Foundation, including an NSF-funded effort called CloudBank that subsidizes access to commercial cloud services, Frias-Martinez plans to track the movements of thousands of Baltimore residents while protecting their privacy. And by applying AI algorithms to the large data sets, she hopes to identify ways to eliminate transit bottlenecks and improve service. Frias-Martinez predicts CloudBank “will flatten the steep learning curve” for first-time cloud users like her.
Congress has now embraced a plan to ensure there are many more. The National Artificial Intelligence Initiative Act (NAIIA) of 2020, which became law last week, aims to bolster AI activities at more than a dozen agencies. Its directives include a study of how to create a national research cloud that would build on CloudBank. It also calls for an expansion of a network of research institutes launched last summer, and the creation of a White House AI office and an advisory committee to monitor those efforts.
“It’s the closest thing to a national strategy on AI from the United States to be formally endorsed by Congress,” says Tony Samp, a former congressional staffer turned high-tech lobbyist for DLA Piper. He and others say the new law is meant to keep the country at the forefront of global AI research in the face of growing investments by other countries.
The NAIIA authorizes spending but doesn’t appropriate money. If funded, however, it would significantly ramp up federal AI investments. It authorizes $4.8 billion for NSF over the next 5 years, with another $1.15 billion for the Department of Energy (DOE) and $390 million for the National Institute of Standards and Technology (NIST). NSF, which funds the vast majority of federally supported AI academic research, estimates it spent $510 million on AI in 2020, so the NAIIA would roughly double that effort.
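The "roughly double" comparison can be checked with a few lines of arithmetic. The sketch below uses only the dollar figures quoted above. It assumes, purely for illustration, that the NSF authorization would be spread evenly across the 5 years and that the DOE and NIST amounts cover the same period; the law itself may distribute the money differently.

```python
# Figures quoted in the article, in millions of U.S. dollars.
AUTHORIZED = {"NSF": 4800, "DOE": 1150, "NIST": 390}
NSF_AI_SPENDING_2020 = 510
YEARS = 5  # authorization window for NSF, per the article


def annualized(total_millions, years=YEARS):
    """Average yearly amount, ASSUMING an even split across the years."""
    return total_millions / years


# Average NSF authorization per year under the even-split assumption.
nsf_per_year = annualized(AUTHORIZED["NSF"])

# How that yearly average compares with NSF's estimated 2020 AI spending.
ratio_to_2020 = nsf_per_year / NSF_AI_SPENDING_2020

# Combined authorization for the three agencies named in the article.
total_authorized = sum(AUTHORIZED.values())

print(f"NSF average per year: ${nsf_per_year:.0f} million")
print(f"Ratio to 2020 NSF AI spending: {ratio_to_2020:.2f}x")
print(f"Total for NSF, DOE, and NIST: ${total_authorized / 1000:.2f} billion")
```

Under those assumptions the NSF authorization averages a little under twice the agency's estimated 2020 AI spending, which is consistent with the "roughly double" characterization.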
The military is also upping its AI game. The NAIIA is appended to the National Defense Authorization Act, a 4500-page bill providing annual policy guidance to the Department of Defense that survived a presidential veto. This year’s version of the must-pass bill raises the stature of the Pentagon’s Joint Artificial Intelligence Center formed in 2018 and gives it new authority to use AI to improve combat readiness and fight wars.
The NAIIA both codifies what some federal agencies are already doing and gives them an extensive to-do list. For example, it endorses NSF’s network of seven AI research institutes, launched last summer with help from the U.S. Department of Agriculture and in partnership with industry, and backs similar centers at DOE and the Department of Commerce—which includes NIST and the National Oceanic and Atmospheric Administration. The NSF institutes, each funded at roughly $20 million over 5 years, will support research in applying AI to a variety of topics including weather forecasting, sustainable agriculture, drug discovery, and cosmology.
NSF is already soliciting proposals for a second round of multidisciplinary institutes, and many AI advocates would like to see the network’s growth continue. A white paper for President-elect Joe Biden, for example, calls for an initial investment of $1 billion, and a 2019 community road map envisions each institute supporting 100 faculty members, 200 AI engineers, and 500 students.
Their popularity has revived a recurring debate about how to grow such an initiative without hurting the core NSF research programs that support individual investigators. “We’re very proud of the institutes, which have gotten a lot of attention, and we think they can be wonderfully transformational,” says Margaret Martonosi, head of NSF’s Computer and Information Science and Engineering (CISE) directorate. But Martonosi also notes that CISE spends even more on its core programs, and still rejects more good proposals than it funds.
Cloud computing could also boost AI, because it enables researchers to compile and analyze the huge data sets required to train AI algorithms. It, too, gets a big shoutout in the new law, which directs the NSF director and the president’s science adviser to assemble a 12-member task force to study the feasibility of a National Research Resource (NRR). Such a national cloud would scale up what CloudBank is now doing and give researchers the tools to analyze large public data sets containing, say, anonymized government health records or satellite data.
“At present, only a handful of companies can afford the substantial computational resources required to develop and train the machine learning models underlying today’s AI,” says Stanford University’s John Etchemendy. “What’s more, the large data troves required to train these algorithms are for the most part controlled by either industry or government. Academic researchers struggle to gain access to both.” Etchemendy, a former longtime provost, and computer scientist Fei-Fei Li direct Stanford’s Institute for Human-Centered Artificial Intelligence and co-authored a proposal for an NRR that legislators used as a template in the NAIIA.
Columbia University computer scientist Jeannette Wing, whose resume includes leading NSF’s computing directorate and running Microsoft’s research shop, would like to see “all universities use the cloud routinely for all research and all educational activities.” Scientists who continue to rely on their own institutional computing resources, expertise, and support staff, she believes, will find it increasingly difficult to keep pace with competitors who can address cutting-edge research questions via the cloud.
Creating such a ubiquitous network, which she calls an academic cloud, won’t be easy. “Current commercial cloud providers have interfaces and services that are not nontechie friendly and price points that are out of line for academics,” she explains. But she thinks those problems can be solved.
How a national cloud would be structured or managed poses another challenge. Some have suggested linking it to DOE’s network of national labs, or to the supercomputing centers that DOE and NSF support. Etchemendy hopes the government will decide to contract with commercial cloud services such as Amazon Web Services, Google Cloud, Microsoft Azure, and IBM Cloud rather than starting from scratch.
“The commercial cloud providers are doing the innovation, and they invest massive amounts of money to keep it up-to-date,” he says. “It would be a huge mistake to build a facility like a supercomputer center because it would be obsolete within a few years.”
Even if the spending levels authorized by the new law are aspirational, AI advocates say the act demonstrates the remarkable support that the field now enjoys. “There was a real sense of urgency on this issue,” Samp says. “I also think [the NAIIA] provides a foundation for years to come.”
As its name suggests, CloudBank is a way for researchers to buy cloud computing resources. But Mike Norman says the pilot project, funded by the National Science Foundation (NSF), is more than just a bulk retailer; it also offers researchers boutique advice on how to use the cloud.
“The public cloud is like Home Depot; it has everything you need to build whatever you want,” says Norman, a computer scientist who leads CloudBank at the University of California (UC), San Diego. “But not all customers are the same. Senior scientists may already know what they are doing; they have the blueprint, the knowledge, and the tools to build their doghouse. All they need are the [computing] cycles to run their algorithms.”
“In contrast, a new researcher might be starting off with a question and have no idea how to build the platform that will give them the answer,” Norman adds. “They are looking for guidance. And we’re here to help.”
Norman, who is also director of the NSF-funded San Diego Supercomputer Center, says the goal of the pilot is to find out “who’s out there and what they need.” He and his team spent the first year of their $5 million, 5-year grant awarded in 2019 building a portal that connects NSF-funded scientists with any of four large commercial providers of cloud computing—Amazon Web Services (AWS), Google Cloud, Microsoft Azure, and IBM Cloud.
Norman signed up a company to manage the financial end of those transactions. CloudBank offers discounts for the commercial services, and researchers aren’t required to pay indirect costs, or overhead, to their institutions on the services provided. On the nonfinancial end, a team at the University of Washington is coordinating what Norman calls research facilitation—providing experts to help researchers take advantage of the cloud—and a group at UC Berkeley is helping educators develop curricula for using cloud computing in the classroom.
In September 2020, CloudBank opened for business. One of its first customers was Vanessa Frias-Martinez, a computer scientist at the University of Maryland, College Park. Her $2.35 million grant, from the Smart and Connected Communities program within NSF’s Computer and Information Science and Engineering (CISE) directorate, seeks to improve service for low-income Baltimore residents who rely on public transit by collecting and analyzing data from their often arduous commutes.
Frias-Martinez says CloudBank has allowed her to stretch her research dollars and, as a result, improve the quality and scope of her analyses. “For example, we started to do some experiments with an AWS database and the costs were much higher than we had expected,” she explains. “We submitted a ticket to their helpdesk and they quickly responded” with a full explanation of expenses and some money-saving alternatives.
Going the last mile
CloudBank was created to serve NSF grantees, starting with those funded by select CISE programs who have requested cloud computing. That pool is now tiny by design, but Norman expects demand to increase rapidly once NSF begins to make awards from this year’s program solicitations, the first that include CloudBank as an option. CloudBank could also serve as a template for a far larger, national cloud computing resource, part of a massive scale-up in cloud computing and artificial intelligence outlined in a law passed by Congress last week.
As a baseline, 150 NSF grantees asked for money for cloud computing in 2019, including 67 supported by the CISE directorate. That number has doubled over the past 5 years, although it remains small compared with the more than 2000 awards CISE made that year.
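To put that baseline in proportion, the short sketch below works out the shares implied by the counts quoted above. Because the article says only that CISE made "more than 2000" awards, the 2000 figure is treated as a lower bound, so the second share is an upper limit rather than an exact value. The variable names are illustrative.

```python
# Counts quoted in the article for 2019.
cloud_requests_all_nsf = 150    # NSF grantees who asked for cloud computing money
cloud_requests_cise = 67        # of those, grantees supported by CISE
cise_awards_lower_bound = 2000  # "more than 2000" awards, so a lower bound

# Share of all NSF cloud requests that came from CISE-supported grantees.
cise_share_of_requests = cloud_requests_cise / cloud_requests_all_nsf

# Upper limit on the fraction of CISE awards that included a cloud request
# (the true fraction is smaller because the award count exceeds 2000).
max_share_of_cise_awards = cloud_requests_cise / cise_awards_lower_bound

print(f"CISE share of NSF cloud requests: {cise_share_of_requests:.1%}")
print(f"Cloud requests as a share of CISE awards: under {max_share_of_cise_awards:.2%}")
```

On these numbers, CISE grantees account for a bit under half of the cloud requests, yet those requests amount to only a few percent of the directorate's awards.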
Norman is confident that CloudBank’s system of handling financial transactions can accommodate the projected rapid growth. But keeping it staffed with expert facilitators may be more of a challenge. Newly minted Ph.D.s in the field are flocking to industry, which offers higher salaries and the ability to work on real-world problems.
On the other hand, helping researchers navigate the cloud is not something that the big companies do very well. “It’s high touch,” Norman says, and requires scientists and engineers with both cloud computing expertise and strong people skills. “Companies see us as better able to deal with the last-mile problem.”
But training a cadre of scientists in research facilitation has never been a priority for the federal government, according to Norman, and academic positions are rare. Blue-ribbon panels recommending how to foster greater use of cloud computing typically focus on giving researchers more access to cloud resources, he says.
“But that’s not enough,” Norman says. “You also need the people who can make it work.” He thinks one solution may be to crowdsource the needed expertise via an online community forum.
Margaret Martonosi, who leads CISE, agrees that the country needs a cloud-ready academic workforce and that CloudBank is an important element in achieving that goal. She calls it “one of several building blocks in NSF’s broader strategy” of giving researchers access to cloud computing and other computing platforms, including NSF’s network of supercomputing centers.
She also thinks it has a bright future. As a pilot project, she says, CloudBank “is designed to be scalable. And there’s already a lot of community interest in using it.”