Prompt Engineer

Posted 5 Days Ago
Hiring Remotely in México
Remote
Mid level
Artificial Intelligence • Machine Learning • Marketing Tech • Social Media • Software
Firework gives ecommerce businesses, publishers and advertisers a platform to create an engaging user experience.
The Role
Design, test, and refine prompts for LLMs and multi-modal generative models, build golden evaluation datasets, create reusable prompt libraries and templates, integrate prompts into products, and drive continuous improvement via automated and human-in-the-loop evaluation.
Summary Generated by Built In
About Firework
Join Firework – Where Innovation Meets Impact

Firework is revolutionizing connected commerce with the world’s largest and most advanced AI-powered video commerce platform, trusted by global brands and leading retailers. We bring the energy of in-store experiences online, transforming how businesses engage, convert, and build lasting customer relationships.

At Firework, you’ll be part of a high-growth, team-centric environment where innovation thrives and collaboration fuels success. We have raised over $235M to date, led by investors such as the SoftBank Vision Fund 2, and operate at a global scale, offering unparalleled opportunities to work cross-functionally, solve complex challenges, and drive meaningful impact in the future of connected digital commerce.

If you’re curious, ambitious, and energized by big ideas, Firework is the place to grow, lead, and shape the next era of online shopping—together.

Summary
We are seeking a highly creative and technically curious AI Prompt Engineer to help develop and optimize prompts that drive performance from large language models. In this role, you’ll serve as the bridge between human intent and machine output, designing effective inputs to elicit high-quality, reliable, and ethical responses from generative AI systems.

What you’ll be doing

  • Stay on top of prompt engineering techniques, research, and best practices.
  • Develop and curate golden datasets for prompt evaluation and regression testing across modalities, ensuring long-term quality control and reproducibility.
  • Design, test, and refine prompts to support a wide range of generative AI applications, including chat, audio synthesis, avatar animation, lip-sync alignment, and product image generation.
  • Document best practices and create reusable prompt templates to support internal stakeholders, improving prompt consistency, clarity, and alignment across teams.
  • Refactor existing prompts to follow best-practice approaches.
  • Collaborate with cross-functional teams to integrate AI-driven features into real-world product experiences, ensuring prompts are aligned with user needs, system constraints, and business goals.
  • Build and maintain prompt libraries with clear versioning, metadata tagging, and usage patterns to support scalable and reusable development.
  • Drive continuous improvement in prompt performance by using both automated metrics and human-in-the-loop evaluation pipelines.
  • Contribute to and extend our internal evaluation framework—designing new evaluation flows, creating prompt-specific test cases, and defining metrics tailored to multi-modal output.

You will have

  • Bachelor’s or Master’s degree in STEM or related field.
  • Practical experience working with large language models and/or multi-modal generative models (e.g., text-to-audio, text-to-image, video or avatar generation).
  • Familiarity with prompt techniques such as zero-shot, few-shot, chain-of-thought, tool usage, and retrieval-based augmentation.
  • Strong analytical and linguistic intuition, with the ability to translate abstract goals into effective machine-readable instructions.
  • Deep interest in language and communication systems, and how humans and machines can interact effectively through prompt-based interfaces.
  • Ability to create and maintain curated evaluation datasets (“golden sets”) to support ongoing testing and performance benchmarking.
  • Strong writing and communication skills, with the ability to explain prompt behavior, rationale, and trade-offs to technical and non-technical audiences.

We’ll be excited if you have

  • Hands-on experience with Python or another scripting language of choice.
  • Experience with Jupyter Notebooks, or LLMOps tools and libraries such as LangChain, Langfuse, PromptLayer, or vector search systems.
  • Experience designing or working within evaluation pipelines, including human and automated evaluations, metric design, and result interpretation.

Location

The role is remote, based in Mexico.

Don’t hold back

We understand some candidates may see the above and not apply because they don’t meet all the qualifications. We encourage you to apply anyway: we often find talented candidates who fit other opportunities we have, and we look for potential, not just what you did in the past. As an equal employment opportunity employer, we are a diverse team that strives for an inclusive environment for all. We prohibit discrimination and harassment of any kind based on race, color, sex, religion, sexual orientation, national origin, age, disability, genetic information, pregnancy, or any other protected characteristic as outlined by federal, state, or local laws.

Top Skills

Large Language Models, Multi-Modal Generative Models, Python, Jupyter Notebooks, LangChain, Langfuse, PromptLayer, Vector Search Systems
The Company
HQ: San Mateo, CA
270 Employees
Year Founded: 2017

What We Do

Firework is the world's leading immersive digital transformation and engagement platform with shoppable video, live streaming commerce, and monetization capabilities.

Powering over 600 direct-to-consumer brands, retailers, and media publishers worldwide, Firework brings TikTok-like interactive video experiences to your own websites and apps. We enable customers to create and host native, shoppable video content for engaging product discovery, seamless shopping experiences, and a deeper emotional connection with consumers. The company is backed by IDG Capital, Lightspeed Venture Partners, and GSR Ventures, with over $90 million in capital raised to date, and has offices in the US (SF and NYC), Toronto, Poland, Slovakia, Brazil, and China.

Why Work With Us

We are a diverse team where everyone belongs. We are creative, curious, and cool in a nerdy way. We believe in growth, results, and in each other and that perfection is a work-in-progress. We are just the right amount of extra and want to change the digital game.
