Data Scientist Lead - LLM (Chatbot)

Posted 2 Days Ago
19 Locations
Remote or Hybrid
Junior
Blockchain • Fintech • Software • Cryptocurrency • Metaverse
The Role
Lead the end-to-end LLM pipeline for customer service scheduling: data prep, prompt design, RAG systems, multi-agent architectures, multi-GPU deployment, evaluation pipelines, and chatbot integration to improve model quality and decision-making.
Summary Generated by Built In
Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 250 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

We are seeking a highly skilled professional to join our team, focusing on advancing customer service scheduling optimization through innovative AI solutions. This role involves researching and implementing cutting-edge algorithms to enhance scheduling systems, leveraging business domain knowledge to elevate the impact of AI products. The successful candidate will develop and refine Large Language Models (LLMs) to extract actionable insights, improve business decision-making, and optimize prompt design for more accurate outputs. Additionally, the role includes creating scalable and robust LLM/RAG frameworks tailored to customer service scheduling, fostering innovation and maintaining a competitive market edge.

Responsibilities:

  • Own the full LLM pipeline, from data preparation to real-world production use.
  • Design, iterate and optimize prompts (zero-/few-shot, chain-of-thought, tool-calling, etc.) to maximize model utility and safety across products and languages.
  • Build and maintain Retrieval-Augmented Generation (RAG) QA/search systems that connect to multi-source knowledge bases.
  • Deploy and operate LLM services on multi‑GPU or cluster environments, drawing on familiarity with vLLM/SGLang inference architectures.
  • Design, implement and operate multi‑agent LLM architectures (e.g. LangGraph, CrewAI, AutoGen) including task decomposition, agent orchestration, memory sharing and tool‑calling workflows.
  • Develop evaluation pipelines (automatic metrics & human feedback) to measure prompt and model quality, bias, and hallucination rates.
  • Collaborate with product and CS teams to integrate AI models into the conversational chatbot across different scenarios.
  • Track cutting-edge research, author tech blogs, and keep improving the current architecture.

Qualifications:

  • Master’s degree or higher in Computer Science, Data Science or related field.
  • 2+ years of deep-learning/NLP experience, including 1+ year practical LLM work (SFT, DPO, RAG, quantization, inference optimization, etc.).
  • Demonstrated prompt engineering & tuning expertise (few-shot design, structured prompting, prefix-/p-tuning, reward re-ranking, safety filtering).
  • Practical experience building and deploying multi‑agent LLM workflows, with understanding of agent‑orchestrator patterns, shared memory, long‑horizon planning and guard‑rail design.
  • Clean coding practices, good English communication skills, and a passion for rapid learning.
  • Strong self-motivation and ownership, with a record of quality deliverables.
  • Eagerness to learn and curiosity about new AI technologies.
  • Good communication and collaboration skills.

Skills Required

  • Master's degree or higher in Computer Science, Data Science or related field
  • 2+ years deep-learning/NLP experience, including 1+ year practical LLM work (SFT, DPO, RAG, quantization, inference optimization)
  • Demonstrated prompt engineering and tuning expertise (few-shot, structured prompting, prefix-/p-tuning, reward re-ranking, safety filtering)
  • Practical experience building and deploying multi-agent LLM workflows (agent orchestration, shared memory, long-horizon planning, guard-rails)
  • Familiarity with vLLM and SGLang inference architectures and deploying/operating LLM services on multi-GPU or cluster environments
  • Experience building and maintaining Retrieval-Augmented Generation (RAG) QA/search systems connecting multi-source knowledge bases
  • Experience designing, implementing and operating multi-agent LLM architectures and tool-calling workflows
  • Experience developing evaluation pipelines using automatic metrics and human feedback to measure model quality, bias, and hallucination
  • Clean coding practices, good English communication skills, strong ownership and collaboration with product/CS teams

The Company
7,696 Employees