In my work as CEO at Research for Improving People’s Lives, I’m frequently asked some variation of “can data be a crystal ball for the job market?” It’s a fair question, and an especially pointed one in the era of the Great Resignation, with a potential economic slowdown on the horizon.
Like most attempts at predicting the future, we start with the past. What can the historical data tell us, what trendlines are already visible and what do we expect to happen next? In most industries, lots of data exists to help guide these conversations but the data is messy; much of it is walled off, and the data we do have is typically flawed.
What Are SOCs?
Here’s another thing: The current information on the labor market is geared toward industry, not occupation. The data reveals only which company or organization an individual works for and in what industry. It doesn’t show someone’s actual occupation. For example, the data may show that an individual works for a technology company, but it does not indicate whether that individual is a coder, custodian, manager or data scientist. With no occupational data, there’s no skill set data.
So, how can we use data to generate insights into people’s skills and what they’ll need to be competitive in the job market?
Using Data That Paints a True Picture
First, we turn to the data that paints the truest picture of people’s skills and of what the market demands: the information that job seekers and companies themselves put into the market.
Companies post millions of jobs online every year, and job seekers upload millions of resumes. Together, these postings and resumes contain the labor market information needed to understand what skills and careers are in demand, how demand is changing across regions, and who is looking for a job. The National Association of State Workforce Agencies (NASWA) has recently launched a National Labor Exchange (NLx) Research Hub that brings together job postings from across the country every day.
Gaining Insights from Job Postings
Next, we need a way to transform job postings into data that researchers and policymakers can use for insights. To understand the labor market, policymakers and researchers often turn to Standard Occupational Classification (SOC) codes, a government-driven framework that standardizes jobs for analysis by organizing them into a hierarchy of standard occupations.
Working with SOCs in real-world data is notoriously tricky. Job seekers and employers describe the same job with many different words and phrasings in titles and duties, making it difficult to assign a SOC to each title in a vast dataset. While some companies offer paid services to conduct this assignment, they are often prohibitively expensive for the government and use black-box software that makes it difficult to verify the information and then share it for the public good.
In partnership with NASWA and using the newly launched NLx Research Hub, my organization, RIPL, built an open-source natural-language processing toolkit for modeling structured occupation information and SOCs in unstructured text from job titles, job postings and resumes.
We call our tool Sockit, and it’s already processed more than 43 million unstructured job postings available in the National Labor Exchange, empirically measuring associations between occupation codes, skills keywords, job titles and full-text job descriptions in the United States during the years 2019 and 2021. The tool models the probability that a job title is associated with an occupation code and that a job description is associated with specific skills keywords, clusters of skills, and occupation codes.
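To make the idea of "the probability that a job title is associated with an occupation code" concrete, here is a minimal sketch of one way such a probability could be estimated. It is not Sockit's implementation or API: it swaps in a simple naive Bayes word-count model, and the labeled titles, the function names `tokenize`, `train` and `soc_probabilities`, and the tiny training set are all hypothetical, chosen only for illustration. The SOC codes are real 2018 codes (15-1252 Software Developers, 15-2051 Data Scientists, 37-2011 Janitors and Cleaners, 29-1141 Registered Nurses).

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled examples: (free-text job title, SOC code).
LABELED_TITLES = [
    ("senior software engineer", "15-1252"),
    ("software developer ii", "15-1252"),
    ("backend engineer python", "15-1252"),
    ("data scientist", "15-2051"),
    ("senior data scientist machine learning", "15-2051"),
    ("night shift custodian", "37-2011"),
    ("janitor part time", "37-2011"),
    ("registered nurse icu", "29-1141"),
    ("rn registered nurse", "29-1141"),
]


def tokenize(title):
    """Lowercase a title and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", title.lower())


def train(examples):
    """Count how often each token appears in titles labeled with each SOC."""
    token_counts = defaultdict(Counter)
    soc_counts = Counter()
    vocab = set()
    for title, soc in examples:
        soc_counts[soc] += 1
        for tok in tokenize(title):
            token_counts[soc][tok] += 1
            vocab.add(tok)
    return token_counts, soc_counts, vocab


def soc_probabilities(title, model):
    """Return a naive Bayes estimate of P(SOC | title) for every known SOC."""
    token_counts, soc_counts, vocab = model
    total_titles = sum(soc_counts.values())
    log_scores = {}
    for soc, n_titles in soc_counts.items():
        score = math.log(n_titles / total_titles)  # prior from label frequency
        denom = sum(token_counts[soc].values()) + len(vocab)
        for tok in tokenize(title):
            if tok in vocab:  # tokens never seen in training are skipped
                # Add-one smoothing so a missing token does not zero the score.
                score += math.log((token_counts[soc][tok] + 1) / denom)
        log_scores[soc] = score
    # Normalize the log scores into probabilities that sum to 1.
    max_log = max(log_scores.values())
    exp_scores = {soc: math.exp(s - max_log) for soc, s in log_scores.items()}
    z = sum(exp_scores.values())
    return {soc: v / z for soc, v in exp_scores.items()}


MODEL = train(LABELED_TITLES)
probs = soc_probabilities("Sr. Software Engineer (Remote)", MODEL)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With only nine training titles this is a toy, but it shows the shape of the problem: messy titles go in, and a ranked set of candidate occupation codes with probabilities comes out. A production tool would need far more data and far better handling of abbreviations and phrasing.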
In short, using natural language processing and data science techniques, Sockit determines the most likely SOC code for a free-text job title. Using Sockit and job data available through the NLx Research Hub, researchers can now answer questions such as:
How have companies shifted toward remote work during Covid? RIPL researchers are partnering with NASWA on a study that models which SOCs have supported remote work over time.
Have rural job markets experienced growth because of the pandemic? RIPL researchers are modeling which rural counties have experienced the largest increase in job posting share by SOC codes between 2019 and 2022.
How can we recommend high-impact career transitions for unemployed job seekers based on their previous occupation and skills? Policymakers use tools like Sockit to transform job and state administrative wage data into personalized career recommendations. For example, the Hawaii Career Acceleration Navigator, a digital career navigator developed in partnership by the state of Hawaii, RIPL and the National Governors Association, is using Sockit to power matches between employers and job seekers through DOORS. This first-of-its-kind job discovery system uses state labor data to provide personalized career paths and reskilling.
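Once postings carry SOC codes, a question like the rural one above reduces to fairly simple arithmetic. The sketch below is an assumption-laden illustration, not RIPL's method: it reads "job posting share" as a county's fraction of all postings for a given SOC in a given year, and the counties, counts and the helper names `county_shares` and `share_change` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (county, year, SOC code, number of postings).
POSTINGS = [
    ("County A", 2019, "15-1252", 10),
    ("County B", 2019, "15-1252", 30),
    ("County A", 2022, "15-1252", 45),
    ("County B", 2022, "15-1252", 30),
    ("County A", 2019, "29-1141", 40),
    ("County B", 2019, "29-1141", 30),
    ("County A", 2022, "29-1141", 45),
    ("County B", 2022, "29-1141", 50),
]


def county_shares(rows):
    """Return each county's share of all postings for a SOC in a year."""
    totals = defaultdict(int)
    for _county, year, soc, n in rows:
        totals[(year, soc)] += n
    return {
        (county, year, soc): n / totals[(year, soc)]
        for county, year, soc, n in rows
    }


def share_change(rows, soc, start_year, end_year):
    """Return the change in each county's posting share for one SOC."""
    shares = county_shares(rows)
    counties = {county for county, _year, _soc, _n in rows}
    return {
        county: shares.get((county, end_year, soc), 0.0)
        - shares.get((county, start_year, soc), 0.0)
        for county in counties
    }


change = share_change(POSTINGS, "15-1252", 2019, 2022)
largest = max(change, key=change.get)
print(largest, round(change[largest], 2))
```

Comparing shares rather than raw counts keeps a nationwide surge in postings from looking like local growth, which is one reason to frame the question this way.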
Making Sense of All This
Now, back to those predictions about the future.
First, industries need to modernize the skill sets of their employees. We can’t just keep pumping out STEM grads and think the problem will solve itself. We need to close that ever-widening skills gap. Second, we need to use technologies like Sockit to make sense of all this data.
We’re making good progress on both fronts, so at least for now, when someone asks me that complicated question about using data as a crystal ball for the job market, I give them a simple answer: We’re working on it.