- Optimize large-scale multimodal models for low-latency inference and efficient memory usage across diverse hardware platforms.
- Apply state-of-the-art model compression techniques, including quantization (e.g., INT8/FP16), pruning, and knowledge distillation.
- Develop and integrate custom inference kernels targeting GPU or custom AI accelerators.
- Build profiling tools and performance models to analyze bottlenecks and guide optimization strategies.
- Contribute to real-world deployment efforts in autonomous driving systems, including on-vehicle testing and iteration.
- Track the latest research in efficient ML inference and integrate relevant techniques into production pipelines.
- Master’s or Ph.D. in Computer Science, Electrical Engineering, or a related field. Open to recent graduates.
- Strong coding skills in C++ and Python with a focus on performance and scalability.
- Proficient in deploying deep learning models using TensorRT, ONNX Runtime, or TVM.
- Familiarity with CUDA programming and parallel computing principles.
- Solid understanding of model inference workflows and system-level performance tuning.
- Experience in quantization-aware training or post-training quantization.
- Effective communicator and collaborative team player.
- Hands-on experience with deploying vision-language or large multimodal models.
- Familiarity with low-precision inference (INT8/FP16), kernel fusion, and operator-level optimization.
- Experience in autonomous driving, robotics, or edge AI applications.
- Track record of open-source contributions or publications in ML/AI conferences (e.g., NeurIPS, ICML, CVPR).
- Background in system profiling, latency modeling, or compiler-level optimization.
- A fun, supportive, and engaging environment.
- Infrastructure and computational resources to support your work.
- Opportunity to work on cutting-edge technologies with top talent in the field.
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.
What We Do
XPENG Motors is a leading Chinese electric vehicle and technology company that designs and manufactures intelligent automobiles that are seamlessly integrated with the Internet and use the latest advances in artificial intelligence. Focusing on China’s young and tech-savvy consumer base, XPENG Motors strives to offer smart mobility solutions through technological innovation and cutting-edge R&D. The company’s initial backers include its CEO & Chairman He Xiaopeng, the founder of UCWeb Inc. and a former Alibaba executive. It was co-founded in 2014 by Henry Xia and He Tao, former senior executives at Guangzhou Auto with expertise in innovative automotive technology and R&D. It has received funding from prominent Chinese and international investors including Alibaba Group, Foxconn Group and IDG Capital. Currently with 3,000 employees, the company is headquartered in Guangzhou and has design, R&D, manufacturing and sales & marketing divisions in Silicon Valley, San Diego, Beijing, Shanghai, Zhaoqing (Guangdong Province) and Zhengzhou (Henan Province).







