Top Tech Jobs & Startup Jobs

7 Days Ago
Hybrid
San Francisco, CA, USA
Mid level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design and optimize high-performance GPU kernels (GEMM, attention, routing) for AI inference across NVIDIA and AMD GPUs. Implement CUDA/C++ and low-level assembly code, build reduced-precision/quantized (FP8/FP4) kernels, benchmark cross-vendor performance, contribute to internal GPU libraries, accelerate multi-modal pipelines, and integrate next-generation GPU features into production.
Top Skills: AMD, C++, CUDA, CUTLASS, FP4, FP8, GEMM, GPU Assembly, HIP, NVIDIA, ROCm, Triton
7 Days Ago
Hybrid
San Francisco, CA, USA
Mid level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Own quality for FriendliAI's full SaaS stack, including backend microservices, frontend, model deployments, and inference. Build pytest automated suites, Locust performance tests, Playwright end-to-end tests, and design strategies for validating LLM inference and model deployment workflows.
Top Skills: Hugging Face, LLM Serving, Locust, Microservices, Multi-Cloud, Playwright, Pytest, Python
7 Days Ago
Hybrid
San Francisco, CA, USA
Senior level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, implement, and optimize GPU kernels, kernel compiler, memory planner, and runtime for low-latency generative AI inference. Analyze performance bottlenecks across hardware and software, collaborate with infrastructure teams, and maintain production profiling, benchmarking, and validation tooling while supporting new model architectures and multi-GPU strategies.
Top Skills: Benchmarking, C++, Compiler Infrastructure, Diffusion Models, Distributed Inference, GPU Kernels, Kernel Compiler, Multi-GPU, Profiling, Python, Runtime Systems, Transformer Models
7 Days Ago
In-Office
Seoul, KOR
Mid level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, implement, and optimize high-performance GPU kernels (GEMM, attention, routing). Develop CUDA/ROCm C++ code, including low-level assembly, implement reduced-precision/quantized kernels (FP8/FP4), benchmark and ensure parity across NVIDIA and AMD, contribute to GPU libraries, accelerate multi-modal pipelines, and integrate next-generation GPU features into the production inference engine.
Top Skills: AMD, C++, CUDA, CUTLASS, FP4, FP8, GEMM, GPU Assembly, HIP, NVIDIA, ROCm, Triton
7 Days Ago
In-Office
Seoul, KOR
Senior level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Own and evolve core backend microservices for an AI inference platform, building production-grade APIs and multi-tenant SaaS capabilities (authentication, RBAC, billing). Design data models and pipelines across PostgreSQL and ClickHouse, collaborate on multi-cloud orchestration, ensure reliability and performance, and drive engineering quality through testing and CI/CD.
Top Skills: CLI, ClickHouse, FastAPI, GraphQL, gRPC, Kubernetes, LLM Serving, Multi-Cloud, Next.js, OLAP, OLTP, OpenTelemetry, Postgres, Python, React, REST, SDK
7 Days Ago
In-Office
Seoul, KOR
Senior level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Lead strategy and roadmap for FriendliAI's inference platform, owning initiatives end-to-end. Mentor junior PMs/designers, drive customer discovery, define product requirements and KPIs, partner with engineering/research, and align with GTM and sales to deliver scalable model APIs, deployment workflows, and developer features.
Top Skills: AI/ML Systems, APIs, Cloud-Native, Developer Platforms, GPU-Based Infrastructure, Hugging Face, Inference Platforms, LLM Deployment, Model APIs, Multi-Tenant SaaS
7 Days Ago
Hybrid
San Francisco, CA, USA
Mid level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, deploy, and operate large-scale LLM and multimodal inference architectures. Work hands-on with customer engineering teams to containerize, scale, monitor, and troubleshoot GPU-based inference workloads across Kubernetes, CI/CD, and hybrid/on-prem environments. Create Helm charts, Terraform modules, and observability tooling while delivering workshops and platform reliability insights.
Top Skills: AWS, CI/CD, DeepSpeed-Inference, Docker, Docker Images, EKS, ELK, GCP, GPU Computing, Grafana, Helm, Hugging Face, Kubernetes, Loki, OCI, OTel, Prometheus, TensorRT, Terraform, Triton, vLLM
7 Days Ago
In-Office
Seoul, KOR
Mid level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Build and maintain the Python SDK and cross-platform CLI for an AI inference platform. Own packaging/distribution, developer tooling, DevOps automation, documentation, examples, and collaborate across frontend, product, and engineering to deliver ergonomic APIs and top-tier developer experience.
Top Skills: asyncio, CLI, Docker, Go, gRPC, Hugging Face, Kubernetes, LLM SDKs, Meta-Programming, Node.js, Packaging, PyPI, Python, Python AST, Python Monorepos, REST, TypeScript, Typing
7 Days Ago
Hybrid
San Francisco, CA, USA
Senior level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Own and evolve core backend microservices for an AI inference platform: build production-grade APIs, multi-tenant SaaS features (auth, RBAC, billing), design OLTP/OLAP data models, collaborate on multi-cloud orchestration, ensure reliability/performance, and drive engineering quality through testing and CI/CD.
Top Skills: CI/CD, ClickHouse, FastAPI, GraphQL, gRPC, Hugging Face, Kubernetes, LLM Serving, Next.js, OpenTelemetry, Postgres, Python, React, REST, SQL
7 Days Ago
In-Office
Seoul, KOR
Mid level
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, build, and maintain agent APIs and production agent applications (document understanding, RAG, automation). Integrate open-source LLMs and multimodal models, collaborate with backend and infra teams for deployment, and ensure APIs are reliable, scalable, and developer-friendly with strong documentation and monitoring.
Top Skills: Hugging Face, Kubernetes, LangChain, LlamaIndex, LLMs, Multimodal Models, OCR, Python, RAG