We’re seeking a Full-Stack Engineer to design, build, and scale our web platform, which serves as the core interface for deploying multimodal models, observing workloads, and building agent workflows. In this role, you’ll work closely with product, infrastructure, and design teams to create high-performance, developer-friendly, and enterprise-ready tools.
We are looking for a hands-on engineer who is eager to work at the intersection of infrastructure, developer experience, and AI applications. The ideal candidate is a talented full-stack developer and strong collaborator who enjoys working across the stack, cares deeply about developer workflows, and is excited to help define the future of AI adoption.
Key Responsibilities
Design, build, and maintain web applications and tools for AI model deployment, monitoring, and performance optimization
Develop clean, scalable, and robust APIs powering AI agents, workflows, and user-facing systems
Collaborate with infrastructure engineers to integrate backend systems with deployment and orchestration pipelines
Optimize the performance and usability of web interfaces
Drive code quality through automated testing, CI/CD, and code reviews
Contribute to architecture and design decisions that shape our platform’s long-term direction
Identify and resolve technical debt and improve the reliability of production systems
Qualifications
5+ years of industry experience in full-stack or backend engineering
Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent
Fluent in TypeScript and Python; expert with React/Next.js
Strong backend experience with FastAPI or similar Python frameworks
Proven expertise in delivering production-scale full-stack applications
Proficiency in designing data models, writing SQL, and working with PostgreSQL
Deep understanding of modern web frameworks and component-driven architecture
Strong API design experience across gRPC/REST/GraphQL in production systems
Solid foundation in cloud-native development
Familiarity with OpenTelemetry tracing, metrics, and structured logging
Knowledge of web security, authentication, RBAC, and multi-tenant SaaS systems
Familiarity with LLM-based workflows, tool invocation, or agentic systems
Familiarity with Kubernetes for container orchestration, including deploying, scaling, and managing containerized applications in production environments
Experience working in startups or fast-paced environments, with a high degree of ownership and comfort with ambiguity
Experience building developer-facing SDKs or CLIs
Passion for developer experience and enabling AI adoption
Benefits
Flexible working hours
Daily lunch and dinner provided; unlimited snacks and beverages
Supportive and highly collaborative work environment
Health check-up support and top-tier equipment/hardware support
A front-row seat to the generative AI infrastructure revolution
Competitive compensation, startup equity, health insurance, and other benefits
FriendliAI is building the world’s best AI inference platform, one that makes large language and multimodal models fast, efficient, and deployable at scale. We power high-throughput, low-latency AI workloads for organizations worldwide and integrate directly with Hugging Face, giving developers instant access to over 500,000 open-source models.
We are a small, fast-moving team doing work that matters at one of the most exciting moments in the history of technology. With our world-class inference engine, we are building a platform that the AI industry can actually rely on.
What We Do
FriendliAI is The Frontier AI Inference Cloud: an AI infrastructure platform that deploys, scales, and monitors large language and multimodal models. Its inference engine maximizes GPU utilization to deliver faster performance and steep cost savings for open-weight and custom models, while offering enterprise-grade reliability, SLAs, and compliance to help teams run generative AI and agent workloads at production scale.









