Position Overview
We are seeking a skilled Data Scientist with a focus on Natural Language Processing (NLP) to join our team. The ideal candidate brings 3+ years of hands-on experience working with text data, building NLP models, and developing production-ready machine learning solutions. This role will focus on analyzing unstructured data, designing and implementing NLP models, and collaborating with engineering teams to deploy scalable solutions.
Please note: United States citizenship is a requirement for this position. This position requires a Public Trust security eligibility determination post-hire.
Responsibilities
· Conduct detailed exploratory data analysis (EDA) on structured and unstructured text datasets to derive insights and inform model development.
· Design, build, and evaluate models for a variety of NLP tasks, including:
o Text classification
o Named entity recognition (NER)
o Information extraction from unstructured documents
· Develop and refine regular expressions, traditional NLP pipelines, and transformer-based models to support business use cases.
· Write high-quality, production-grade Python code, following best practices for scalability, testing, and maintainability.
· Apply Generative AI techniques and prompt engineering to enhance automation and downstream applications.
· Collaborate closely with machine learning engineering teams to deploy, monitor, and optimize NLP solutions in production.
· Utilize cloud technologies such as Azure or AWS to build, train, and manage ML workloads.
Qualifications
· United States citizenship (required).
· 3+ years of experience in data science, with significant focus on NLP.
· Demonstrated expertise in text-based EDA, NLP model development, and working with unstructured data.
· Strong proficiency in Python and NLP/ML libraries (e.g., spaCy, NLTK, Hugging Face Transformers, scikit-learn).
· Hands-on experience with Generative AI models and prompt engineering.
· Strong understanding of machine learning fundamentals, model evaluation, and experiment design.
· Ability to translate business needs into technical solutions and communicate complex concepts effectively.
Preferred Qualifications
· Experience deploying NLP solutions in production environments in partnership with ML engineering teams.
· Hands-on experience with cloud platforms, including Microsoft Azure or Amazon Web Services (AWS).
· Familiarity with CI/CD workflows, containerization (e.g., Docker), or distributed computing frameworks.
Skills Required
- United States citizenship
- Public Trust security eligibility determination (post-hire)
- 3+ years of experience in data science with significant focus on NLP
- Demonstrated expertise in text-based EDA, NLP model development, and working with unstructured data
- Strong proficiency in Python
- Experience with NLP/ML libraries (spaCy, NLTK, Hugging Face Transformers, scikit-learn)
- Hands-on experience with Generative AI models and prompt engineering
- Strong understanding of machine learning fundamentals, model evaluation, and experiment design
- Ability to write production-grade, testable, maintainable Python code for scalable systems
- Experience using cloud technologies (Azure or AWS) to build, train, and manage ML workloads
- Experience deploying NLP solutions in production with ML engineering teams
- Familiarity with CI/CD workflows, containerization (e.g., Docker), or distributed computing frameworks