Senior Data Scientist

Posted Yesterday
Hiring Remotely in Maryland, USA
Remote
180K-200K Annually
Senior level
Artificial Intelligence • Cloud • Software • Cybersecurity
The Role
Lead end-to-end data science experiments on messy federal and private data: run EDAs, graph analytics and GNNs, perform probabilistic entity resolution, apply LLM-based NER, train interpretable classifiers and anomaly detectors, document reproducible pipelines, and present rigorous, statistically sound findings to operational stakeholders.
Summary Generated by Built In
We are Skyward.
 
That is, a love for people, for improvement, for human advancement through information technology. We are a people-centered business with a desire to serve others. We are diverse and unified; creative and collaborative; a collection of complementary, not competing talents. And though on the surface we remain relaxed, beneath, a torrent of energy links us to our civic tech mission.
 
We stand by our values, and we won’t compromise on any of them.
 
Integrity: We’re conscientious, intentional, and empathetic. Our words and actions align. That’s our character. Please don’t ask us to play another part; we’re poor actors.
Compassionate: If we may borrow a quote from Theodore Roosevelt: “No one cares how much you know until they know how much you care.” Because our team is thoughtful and supportive, caring deeply for each other, our clients, and our work, this comes naturally. 
Inquisitive: We remain students by failing openly and turning lessons into solutions.
Unconventional: For us, life isn’t what happens outside of work. Work happens inside of life, and our culture erases the line that often divides the two.
Authentic: Made possible only because we embody the values listed above. We’re relaxed and fun yet intensely curious and driven. Team members are placed with thought, care, and precision to ensure that Trust, Truth, and Transparency continue to represent our brand.
 
Because of that, we continue Onward, Upward, and Skyward.

(**CONTINGENT HIRE BASED ON CONTRACT AWARD**)

We need a Senior Data Scientist.
The kind who looks at a tangle of federal and private datasets that don’t share schemas, don’t share IDs, and were never meant to talk to each other, and gets a little excited. The kind who knows that the answer is almost never “throw a bigger model at it” and is almost always “understand the data first, then pick the model.” The kind who can sit across from a federal subject matter expert and explain what a Leiden community is without making them feel dumb, and without dumbing it down either.

If you’ve ever quietly fixed someone else’s “production” notebook on a Friday afternoon - the one with hard-coded paths, no random seed, and a function called final_FINAL_v3() - this might be you.

Come join us if you're motivated to learn from others, to learn from mistakes, to be part of a future-looking and growth-oriented team.

Let's go Skyward together.


What you'll do:

  • Lead end-to-end data science experiments. From a data readiness assessment, through clustering and topological risk modeling, into unstructured-data enrichment and entity resolution.
  • Run exploratory data analyses (EDAs) on government-furnished data inside a government-controlled environment: profile completeness, find the schema mismatches, flag the gaps, and document what the data can and cannot support before a single model gets trained.
  • Apply graph analytics. Leiden community detection, betweenness and eigenvector centrality, motif analysis, temporal cluster detection, link prediction. And be able to explain in plain English what each one means and what it doesn’t.
  • Train interpretable classifiers (logistic regression, gradient boosted trees) and Graph Neural Networks (GraphSAGE, GAT) where the data supports them; reach for unsupervised anomaly detection when labels are thin (and they will be).
  • Run probabilistic entity resolution across biographic, behavioral, and biometric features using tools such as Senzing. Handle name transliteration, DOB variation, and fuzzy address matching like the working scientist you are.
  • Apply LLM-based Named Entity Recognition and relationship extraction to unstructured field text and quantify whether the extracted edges actually change the graph (rather than just adding noise that looks impressive in a deck).
  • Wrangle messy data and recommend supplemental, de-identified data sources that would enrich the analysis, and document the case for each recommendation so the customer’s privacy and legal teams can make an informed call.
  • Document everything so the customer can rerun it after you’re gone. Reproducibility is a hard acceptance criterion here, not a “nice to have.”
  • Hold the line on rigor: confidence intervals on everything, null findings documented with the same care as positive ones, and zero appetite for dashboard theater.

What we'd like you to have:

  • An active DHS Public Trust clearance at time of hire.
  • 7+ years of applied data science experience, with at least 2 years that included graph analytics or network analysis as a primary tool, not a side dish.
  • Strong production-grade Python: pandas, NumPy, scikit-learn, networkx (or graph-tool / igraph), and at least one GNN library such as PyTorch Geometric or DGL.
  • Real-world experience with probabilistic record linkage / entity resolution - Senzing, dedupe, FEBRL, Magellan, or a homegrown Fellegi-Sunter implementation you’d be willing to defend in a code review.
  • Comfort working without labels. Anomaly detection, positive-unlabeled learning, isolation forests, autoencoders. You know how to evaluate a model when there’s no clean ground truth to compare it to.
  • Interpretability discipline: SHAP, feature importance, partial dependence plots, and the wisdom to pick the simpler model when it’s the right one.
  • LLM application experience beyond prompt-and-pray. Entity and relationship extraction on real text, with real evaluation of extraction quality.
  • Statistical maturity: someone who knows what a power analysis is, why a confidence interval matters, and why p-hacking is bad even when nobody is checking.
  • Comfort presenting technical findings to non-technical operational stakeholders without losing the nuance.
  • Reproducibility hygiene to ship code that someone else can actually rerun: version control, parameterized pipelines, deterministic seeds, pinned dependencies, READMEs that work.

What would blow us away:

  • Published research, open-source contributions, or conference talks in entity resolution, graph ML, or applied causal inference.
  • Knowledge of commercial data sources like LexisNexis, TransUnion, Babel Street, etc.
  • A track record of standing in front of federal subject matter experts and walking them through a null result without flinching.
  • A reputation among teammates as the person who finds the bug in the pipeline before it hits the deliverable.

Even if you don’t meet 100% of the qualifications, we encourage you to apply. At Skyward, we’re focused on hiring individuals with the right skills and passion to grow, not just checking off every box.

And now the important part. What we offer you:

  • Medical, dental, vision insurance (fully paid for employees)
  • 15 days of paid leave
  • 7 days of sick leave
  • 2 days bereavement leave
  • 11 paid Federal holidays
  • Up to 40 hours for jury duty
  • 401(k) with 4% employer contribution (and no vesting period)
  • Up to 4 weeks of paid paternity and maternity leave
  • Company provided laptop
  • $5,000 per year for professional development
  • $600 per year for technical supplies and equipment
  • $2,000 referral bonus
  • Life and disability insurance
  • HSA and FSA
  • LegalShield and IDShield Voluntary Benefits
  • Opportunity to work in a collaborative, motivated team focused on modernizing government services with cutting-edge technology and innovative solutions. Who says government work can't be exciting?

At Skyward, we support flexible working hours and remote opportunities to help maintain a healthy work-life balance for all employees.
 
Offers of employment with Skyward are contingent upon acceptable results of a background investigation.
 
Applicants must have the ability to obtain and maintain a Public Trust security clearance due to the nature of our work as a government contractor.

Skills Required

  • Active DHS Public Trust clearance at time of hire
  • 7+ years applied data science experience with ≥2 years focused on graph analytics/network analysis
  • Production-grade Python experience (pandas, NumPy, scikit-learn)
  • Graph libraries experience (networkx or graph-tool or igraph) and at least one GNN library (PyTorch Geometric or DGL)
  • Experience with probabilistic record linkage/entity resolution (Senzing, dedupe, FEBRL, Magellan, or equivalent)
  • Experience working without labels: anomaly detection, positive-unlabeled learning, isolation forests, autoencoders
  • Interpretability tools and discipline (SHAP, feature importance, partial dependence) and preference for simpler models when appropriate
  • LLM application experience for entity and relationship extraction with evaluation of extraction quality
  • Statistical maturity: power analysis, confidence intervals, avoidance of p-hacking
  • Reproducibility hygiene: version control, parameterized pipelines, deterministic seeds, pinned dependencies, runnable READMEs
  • Ability to present technical findings to non-technical operational stakeholders
  • Ability to obtain and maintain a Public Trust security clearance
  • Published research, open-source contributions, or conference talks in entity resolution/graph ML/applied causal inference
  • Knowledge of commercial data sources (LexisNexis, TransUnion, Babel Street)
  • Track record presenting null results to federal subject matter experts or finding pipeline bugs pre-deliverable

The Company
63 Employees

What We Do

Skyward IT Solutions, LLC is a technology services provider specializing in AI-driven government services and digital modernization for federal and public sector operations. Founded in 2013, the company delivers secure, scalable, and user-centered solutions including custom AI agents, cloud engineering, and agile software development, focusing on improving service delivery and operational efficiency for agencies such as CMS and the SBA.
