It’s been a while since NASA sent a human into space. 

Sure, American astronauts have been ferrying between Earth and the International Space Station for years — but they’ve been hitching rides on Russian rockets. Now, as the U.S. government’s space exploration agency gears up to send human beings back to the moon within this decade — and as far as Mars in the next — NASA must first assess the skills it has within its ranks. Does it still have the know-how to launch humans out of the atmosphere, land them on the surface of a celestial body and bring them home again? And if it doesn’t, who does it need to hire or train in order to get there? 

In 2019 the agency moved David Meza, a data scientist, to Washington, D.C., from his Houston home to help NASA build a tool that maps the skills and talents hiding within its workforce. Project leaders would use this tool to staff their teams; professionals would use it to find their next project. To create this database, along with the tools project leaders and employees would use to find each other, Meza used Neo4j, a native graph database platform. 

“I chose a graph database because of the types of relationships we’re looking at,” Meza said. “It seemed to work a lot easier within a graph model than it would in a traditional relational database model.”


Graph vs Relational Databases

Storing data in rows and columns, relational databases like MySQL are perfect for capturing repetitive, tabular forms. They use a predefined structure — like the set definitions within a table — to store the relationships between entities. By contrast, graph databases store relationships at the individual record level, allowing users to analyze complex relationships. This makes them slower to run when handling large datasets, while the more rigid relational model can work faster at scale.
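
As a rough illustration of the difference, the sketch below models the same employee-to-skill relationships twice in plain Python: once as key-matched tables, and once as records that carry their own relationships. The names, skills and the `HAS_SKILL` relationship are invented for this example and do not come from NASA's data or from Neo4j itself.

```python
# Relational style: relationships live in a separate join table and are
# resolved by matching keys across tables at query time.
employees = {1: "Ana", 2: "Ben"}                    # hypothetical rows
skills = {10: "Python", 11: "Orbital mechanics"}    # hypothetical rows
employee_skill = [(1, 10), (1, 11), (2, 10)]        # join table: (employee_id, skill_id)


def skills_relational(employee_id):
    """Scan the join table and look up each matching skill by key."""
    return sorted(skills[s] for e, s in employee_skill if e == employee_id)


# Graph style: each record stores its own relationships, so following a
# relationship means reading it directly off the record.
graph = {
    "Ana": {"HAS_SKILL": ["Python", "Orbital mechanics"]},
    "Ben": {"HAS_SKILL": ["Python"]},
}


def skills_graph(name):
    """Follow the HAS_SKILL relationships stored on the record itself."""
    return sorted(graph[name]["HAS_SKILL"])


if __name__ == "__main__":
    print(skills_relational(1))
    print(skills_graph("Ana"))
```

Both functions return the same answer; what differs is where the relationship is stored and how it is reached.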


To begin with, he’s working with a government database that includes all 18,000 NASA employees, but in time he hopes to incorporate data from the agency’s 50,000-odd contractors, who work for companies like Boeing and Lockheed Martin. 

Meza sees broader implications for his project beyond NASA.

“One of the reasons I decided to move up here is that I saw a great opportunity to enhance the concept of data science within people analytics,” he said. “I think it’s a fledgling environment for data analytics, and I think it’s ripe for improvement.”

In an interview with Built In, Meza outlined his vision for NASA’s talent mapping tool, how it will work once it’s officially rolled out by year’s end and how, ultimately, this one small data science step will help set humankind up for more giant leaps into space.


David Meza, Senior Data Scientist


Why did NASA decide to map its talent pool?

There are a lot of changes going on in the government and at NASA. Our missions are changing. We’re trying to go back to the Moon and on to Mars, and we really need to get an idea of whether our workforce is ready for that. We haven’t been to the Moon in decades — do we still have the skill sets required to do that? And if not, what do we need to do to acquire those skills? Can we identify similar skills within our workforce that can translate over to fit the skill sets we’re looking for? 

I was asked to take a look at the problem to see if we could develop a data model that can deliver this information based on the data and workforce we currently have. 




How is your solution designed to work?

We have a UI called talent marketplace where employers can submit an opportunity for somebody at NASA. They fill out the details around the opportunity, the required skills, the duration of the project, the things they want to accomplish. An employee can then log on, browse and apply for different opportunities. 

The next step of what we’re trying to do is build the employee a LinkedIn-type profile where they can put in the knowledge and skills they have. Two things can happen from that. First, an employer can begin searching for people with the knowledge and skills required for their project, and employees who match will pop up. Second, an employee can look for work details or opportunities that match their skills. On the back end, the graph database will do that matching. 
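
The two-way matching described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the employee and opportunity names, the skill sets, the coverage score and the 0.5 threshold are all hypothetical, and the real system performs its matching inside the graph database rather than in application code like this.

```python
# Hypothetical profile and opportunity data, invented for this sketch.
employees = {
    "emp_a": {"python", "statistics", "machine learning"},
    "emp_b": {"thermal analysis", "python"},
}
opportunities = {
    "data_pipeline": {"python", "statistics"},
    "heat_shield": {"thermal analysis", "materials"},
}


def match_score(required, offered):
    """Fraction of the required skills that the employee offers."""
    required = set(required)
    if not required:
        return 0.0
    return len(required & set(offered)) / len(required)


def _ranked(pairs, min_score):
    """Keep matches at or above the threshold, best score first, then by name."""
    kept = [(name, score) for name, score in pairs if score >= min_score]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


def candidates_for(opportunity, min_score=0.5):
    """Employer view: which employees fit this opportunity?"""
    required = opportunities[opportunity]
    pairs = [(name, match_score(required, offered)) for name, offered in employees.items()]
    return _ranked(pairs, min_score)


def opportunities_for(employee, min_score=0.5):
    """Employee view: which opportunities fit this profile?"""
    offered = employees[employee]
    pairs = [(name, match_score(required, offered)) for name, required in opportunities.items()]
    return _ranked(pairs, min_score)


if __name__ == "__main__":
    print(candidates_for("data_pipeline"))
    print(opportunities_for("emp_b"))
```

The same scoring function serves both directions; only the side being iterated over changes.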

Right now the model is based on very general occupation and work role data from O*NET, which is open data from the government, plus elements covering knowledge, skills, abilities, tasks, characteristics and cross-functional skills. We’ve then used that data to infer which of those elements apply to each employee. 

But they’re very general skills and pieces of knowledge, so over the next year we’re trying to fine-tune into the more NASA-specific knowledge and skills that we’re looking for. Our hope is that employees go in and create these profiles as they look for opportunities, as it will also help us understand what we have out there in the organization. We’re looking at ways of speeding up that process, including gamification and badges to make it interesting and fun for them to fill out their profiles.



You’ve already touched on this, but can you give an example of this tool in action?

Our focus during development has been looking at a skill set that we already knew was missing at NASA, which is data science. We’ve been testing to see if we can find those skills. We know people are doing data science-type work, but they’re not identified as data scientists. Can we see whether there are engineers out there doing algorithms and research scientists creating models, who aren’t necessarily called data scientists? And once we can see these skills across the agency, can we use them for some of the other types of work we’re doing right now? 




What were some of the challenges associated with this project?

One of our main challenges was to figure out how we pull this information about the employee without having to burden them too much by asking for their whole life history. Part of this was programmatically running different similarity algorithms to see what we can find in their current information, comparing that to a particular work role and asking how similar their current role is to another. And then we start level-setting and making sure we’re reaching a good similarity. 
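
The interview does not say which similarity algorithms were run. As one possible example, the sketch below uses Jaccard similarity (shared descriptors divided by all descriptors) to compare a current role with candidate work roles. The role names and descriptor sets are made up for illustration and are not taken from O*NET or NASA data.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical descriptors for an employee's current role.
engineer = {"programming", "mathematics", "systems analysis", "design"}

# Hypothetical target work roles to compare against.
work_roles = {
    "data_scientist": {"programming", "mathematics", "statistics", "data modeling"},
    "technician": {"equipment maintenance", "repairing"},
}


def most_similar_role(current, roles):
    """Return the (role name, score) pair with the highest Jaccard similarity."""
    scored = [(name, jaccard(current, descriptors)) for name, descriptors in roles.items()]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    name, score = most_similar_role(engineer, work_roles)
    print(name, round(score, 3))
```

Deciding what score counts as "a good similarity" is the level-setting step; any cutoff applied to these scores would need the human validation described in the next answer.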

And then, of course, you have to do validation and verification from the human side to make sure there’s no bias in the algorithms, that we’re not finding things in error and that we can justify the results accurately. We then feed the information we get from the employees back into the model to see how we can fine-tune it a little bit better.




How do you see this technology evolving beyond the initial rollout?

From an interface standpoint, there are many things we’re looking at, including a speech-to-text capability where somebody can just ask a question and an RPA (robotic process automation) bot or chatbot built into the UI can respond to spoken queries like, “Find me an expert in data analytics.” We’re also looking at text-to-Cypher capabilities, where we type in a natural language query and our model turns it into a Cypher statement. The Cypher statement is the query that pulls that information out of Neo4j. 

Not only would it pull the information it’s asked for, but also things that may be related to it. For example, if I’m looking for a developer, that could be many things. In our database, the word “developer” has many different labels connected to it: it could be a software developer, a web developer or a database developer. So a question may pop back up: “Okay, you said developer, but here are some different options. Which one do you really mean?” We can try to fine-tune that request to help the user along and narrow down their search, which will then build a Cypher statement to come back with what they’re actually looking for.
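
A minimal sketch of that clarify-then-query flow is shown below. It first checks whether a search term is ambiguous, and only builds a parameterized statement in Cypher, Neo4j's query language, once a single role has been chosen. The `Employee` and `WorkRole` labels, the `HAS_ROLE` relationship, the property names and the lookup table are assumptions made for this example; the article does not describe NASA's actual schema, and the sketch only builds the query text without connecting to a database.

```python
# Hypothetical lookup of ambiguous terms to more specific work roles.
ROLE_OPTIONS = {
    "developer": ["Software Developer", "Web Developer", "Database Developer"],
}


def resolve_role(term):
    """Return (role, None) if the term is unambiguous, else (None, options)."""
    cleaned = term.strip()
    options = ROLE_OPTIONS.get(cleaned.lower(), [cleaned])
    if len(options) == 1:
        return options[0], None
    return None, options


def build_cypher(role_title):
    """Build a parameterized Cypher query and its parameters (assumed schema)."""
    query = (
        "MATCH (e:Employee)-[:HAS_ROLE]->(r:WorkRole {title: $title}) "
        "RETURN e.name AS name"
    )
    return query, {"title": role_title}


if __name__ == "__main__":
    role, options = resolve_role("developer")
    if role is None:
        # In a real interface the user would pick one; here we take the first.
        print("Which one do you really mean?", options)
        role = options[0]
    query, params = build_cypher(role)
    print(query)
    print(params)
```

Passing the role as a `$title` parameter, instead of pasting it into the query string, keeps user input separate from the statement text.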

