Responsibilities
- Data Extraction Pipelines: Develop end-to-end solutions integrating OCR, Layout Analysis, and Semantic Parsing to precisely capture chemical formulas, experimental parameters, and performance metrics from complex documents. Solve cross-modal alignment problems between body text and complex scientific visuals (e.g., tables, chemical structures, and charts).
- RAG Development: Partner with Data and Agent teams to design and implement robust RAG (hybrid search, ranking optimization) and search architectures, ensuring highly relevant document retrieval for complex scientific queries.
- Model Post-training: Apply post-training techniques to improve LLMs' reasoning and comprehension for Materials Science data extraction.
Qualifications
- Minimum Qualifications
- Bachelor's degree or higher in Computer Science, Artificial Intelligence, or a related quantitative field.
- Proficiency with AI-enhanced development tools such as Claude Code, and solid Python programming skills with a focus on algorithmic implementation.
- 3+ years of hands-on experience in NLP/LLM/RAG, with a proven track record of deploying complex information extraction systems in production.
- Excellent communication skills, with the ability to understand product requirements and solve data- and RAG-related challenges within a cross-functional team.
- Preferred Qualifications
- RAG & Data: Deep experience in high-quality data processing and building production-grade RAG systems (including vector databases, hybrid search, and re-ranking).
- Model Refinement: Proven ability to improve extraction accuracy through techniques such as NER, prompt engineering, and instruction fine-tuning.
- Domain Expertise: Familiarity with Physics, Chemistry, or Biology (e.g., understanding of molecular structures or material properties) is a significant plus.
What We Do
Founded in 2007, Patsnap is the company behind the world’s leading innovation intelligence platform. Patsnap is used by more than 10,000 customers in over 50 countries around the world to access market, technology, and competitive intelligence as well as patent insights needed to take products from ideation to commercialization. Customers are innovators across multiple industry sectors, including agriculture and chemicals, consumer goods, food and beverage, life sciences, automotive, oil and gas, professional services, aviation and aerospace, and education. To learn more about how Patsnap is improving the way companies innovate, visit www.patsnap.com.







