The path to data science doesn’t always start with a computer science degree. Many would-be data scientists start in math-focused fields like physics, chemistry, engineering or statistics. Even where data science degrees currently exist as standalone majors, they are often housed in mathematics or engineering departments. Others are interdisciplinary programs spanning the physical sciences or are add-on options to (usually) mathematics majors.
The Value of Math-Focused Fields in Data Science
We spoke with five data scientists who started in math-focused fields. Despite different starting places, they share some common experiences. Their curiosity and drive to find answers drew them to their original fields, then practicality shifted their direction towards data science. Once there, they found their math-focused experiences complemented their current efforts in unique — and potentially surprising — ways.
From the Mysteries in Cosmology to the People Behind Data
Briana Brownell, founder and CEO of PureStrategy, an AI-driven analytics company, said her path from math to data science was a strange and convoluted one. It started with theoretical physics in undergrad. She wanted to “understand why the universe was made from such strange stuff,” as she wrote in her blog. But it was the math classes required for the major that captured her curiosity. Rather than the drudgery of grinding out answers in high school math, college math courses were engaging.
“All of a sudden, it was all about proofs and understanding the actual structure of the world,” Brownell told Built In.
Then she found her first job in finance. As a trader on the New York Stock Exchange, she was fascinated by the patterns she saw underlying why some companies succeed while others struggle.
Brownell’s experiences in physics and math have translated well into data science, both as a trader and in her current work with PureStrategy. For example, designing algorithms in her previous work trained her to see the inner workings of a problem-solving methodology, which has helped her design AI systems.
But it’s more about mindset than math. Data science is about deriving useful information from data that (usually) describes how people behave, so identifying patterns and understanding people is far more important, Brownell said. The data scientist she had learned the most from, for instance, had a degree in anthropology.
“Anthropology, of course, is all about human behavior,” she said. “That ended up being more valuable in creating good data science models and drawing conclusions from that than any amount of mathematics and computer science would have been.”
As a trader, Brownell had to learn about emotional intelligence and how people’s emotion-driven motivations drive their behavior. That people-focused training has continued to prove useful to her in her work as a data scientist as well.
“Because all the assumptions that I make, and the pre-processing steps that I take — all of those things that I do before I actually build the model — impact the success of the model and how it impacts other people,” she said.
An Appreciation of Math Plus a Desire to Have an Impact
Brownell isn’t alone in her interest in the human side of data. Ian Wong, co-founder and CTO of digital real estate platform Opendoor, was drawn to electrical engineering by a fascination for technology and how it can impact people’s experiences. Also like Brownell, the underlying mathematics of the field captured his imagination early in his degree.
“There was something elegant and timeless about math, and I wanted to combine that with making an impact on the world,” he said. This ultimately led to studying statistics, in which he received his master’s degree. While working on his Ph.D., however, Wong realized that the cutting edge in applied mathematics was in industry rather than academia, and that academia was severely lagging behind.
“Combined with my desire to build technologies that can create a more immediate impact, I decided to drop out and seek out a job in data science,” he said.
That job came “serendipitously,” he said. “As a bored graduate student, I found myself on Quora a lot, answering technical questions. I attended a Quora power-user party and ended up talking to Keith Rabois, the COO of Square at the time. As a fun icebreaker, I casually asked him if he had a job for me, not expecting him to say yes. The next thing I knew, I was interviewing with Jack Dorsey.”
Since then, Wong has built machine learning applications for Square and later Prismatic. In 2014, he co-founded Opendoor. He said that his academic experiences in statistics, especially, gave him a strong theoretical grounding in analysis and prediction. It also helped train him how to learn.
“The best gift that my graduate studies gave me was the skill and discipline of [research] — the journey of continuous learning and discovery, which is a necessity given the rapidly shifting landscape of data science,” he said.
Academia and Industry Combined by Big Data
Not everyone who started their path to data science in a math-focused field has left it. Jed Macosko, who has a doctorate in chemistry and is a professor at Wake Forest University and president of AcademicInfluence.com, has one foot in academia and one in industry. His journey started in childhood, when a love of math animated his academic life.
Though he enjoyed math, he didn’t think he was good enough to become a math professor. But he knew he wanted to do something that would capitalize on his abilities to do everything through calculus and linear algebra. That meant either physics or chemistry.
Pragmatism drew him to chemistry. He judged he could complete a chemistry undergraduate degree faster than a physics degree, thereby saving on tuition expenses.
Chemistry became biophysical chemistry, then biophysics, as he progressed through his academic career. After he became a professor of physics, work on an electronic textbook got Macosko into big data. This hooked him. He began teaching classes on big data within the physics department. Though students were initially slow to sign up for the class, now it is one of the first of his to fill, “because people know that if they learn data science, they can get a good job,” he said. Today, his students in that class hail from business, computer science and math majors.
Though his journey to data science started with a love of math, Macosko noted that data science and big data go beyond the numbers. The skills one learns in the sciences can be as — if not more — important.
“A huge aspect of data science is understanding what it is you’re trying to ask — what questions you’re asking — where the data will help you answer those questions and the joy of discovering things from the data that you didn’t even know were in there,” he said.
Butterflies, Big Data and Experimental Design Methodologies
Malcolm Chisholm, who has a doctorate in experimental field ecology, was one of the people who helped Macosko get started in data science and convinced him that interdisciplinary skills are a huge benefit to the field. Chisholm, president of Data Millennium, a data consultancy, started his academic career as a zoologist. Though not a traditionally math-focused field like physics, zoology involves a lot of statistical methodology.
Chisholm studied insect population dynamics during his doctoral research after a childhood of catching butterflies and moths. He rapidly amassed large volumes of data in this work. Figuring out how to organize all that data made him curious about data science, and eventually he began working in the field.
Pragmatism also shifted Chisholm’s attention from his academic roots to data science. After he completed his Ph.D., he needed a job.
“I asked all the other Ph.D. people what they were doing. They said they were going [to become] programmers and working with information technology,” he said, speaking of the 1980s. “So I realized, ‘Oh, well, I’d better do that, too. If everybody else is doing it, and they can get good money, I’ll do it, too.’”
Since then, Chisholm has worked in data governance, data management and data more generally. Like Macosko, Chisholm finds his scientific roots invaluable to his data science work. Knowing how to apply the right statistical methods to the data is a key example. Understanding experimental design methodologies, which are central to the physical sciences, is even more important.
“It’s not just knowing mathematics,” he said. “It’s being able to put in place rigorous methodologies that give you some degree of trust in the results of your data science.”
Since much of his zoology work involved experimental ecology, experimental design was “a big deal” for Chisholm.
“And that very much helps me in my current work,” he said.
Physics and Love of Language to Natural Language Processing
Wood was always interested in science and good at math as a child. But he also played around with computer programming in high school, including teaching himself BASIC. Most of the programs he built had a very similar theme: getting a computer to generate sentences. This was long before he had heard of natural language processing.
Given his interest in science and the fact he came from a family with a fair number of physicists, he pursued a physics degree in undergrad. The degree involved a lot of math and he learned Fortran — what he called the favored programming language of physicists at the time. But his interest in language persisted.
“During the physics degree, I took a number of language elective courses and evening classes, including Arabic, Russian, German and Croatian,” he said. He began to consider going into languages or linguistics for his master’s degree. An admissions tutor at the University of Cambridge’s linguistics department suggested natural language processing instead, “which turned out to be an excellent recommendation,” Wood said.
While interest in language started him down the path to data science, Wood said his physics training has been invaluable.
“Both [physics and data science] require a good feel for numbers and ability to crunch large datasets,” he said. “Both require good programming skills although programming is not the focus of the job. And both require you to use data and algebra to represent real world processes and phenomena.”