Staff Product Engineer / Product Specialist - Spark SME

Posted 3 Days Ago
Bengaluru, Karnataka
Senior level
Software
The Role
The Staff Product Engineer will design, optimize, and scale Apache Spark-based data processing systems, focusing on high-performance distributed applications. Responsibilities include troubleshooting Spark jobs, managing clusters, conducting performance tuning, and collaborating with teams to implement real-time data processing solutions.

Position Summary:

We are seeking an Apache Spark Subject Matter Expert (SME) to design, optimize, and scale Spark-based data processing systems. This role requires hands-on experience with Spark architecture and core functionality, with a focus on building resilient, high-performance distributed data systems. You will collaborate with engineering teams to deliver high-throughput Spark applications and solve complex data challenges in real-time processing, big data analytics, and streaming.


If you’re passionate about working in fast-paced, dynamic environments and want to be part of the cutting edge of data solutions, this role is for you.


We’re looking for someone who can:

  • Application Design: Design and optimize distributed Spark-based applications, ensuring low-latency, high-throughput performance for big data workloads.
  • Troubleshooting: Diagnose and resolve data and performance issues in Spark jobs and clusters at an expert level.
  • Data Processing Expertise: Work extensively with large-scale data pipelines using Spark's core components (Spark SQL, DataFrames, RDDs, Datasets, and Structured Streaming).
  • Performance Tuning: Conduct deep-dive performance analysis, debugging, and optimization of Spark jobs to reduce processing time and resource consumption.
  • Cluster Management: Collaborate with DevOps and infrastructure teams to manage Spark clusters on Hadoop/YARN, Kubernetes, or managed cloud services (AWS EMR, GCP Dataproc, etc.).
  • Real-time Data: Design and implement real-time data processing solutions using Apache Spark Streaming or Structured Streaming.

What makes you the right fit for this position:

  • Expert in Apache Spark: In-depth knowledge of Spark architecture, execution models, and its components (Spark Core, Spark SQL, Spark Streaming, etc.).
  • Data Engineering Practices: Solid understanding of ETL pipelines, data partitioning, shuffling, and serialization techniques to optimize Spark jobs.
  • Big Data Ecosystem: Knowledge of related big data technologies such as Hadoop, Hive, Kafka, HDFS, and YARN.
  • Performance Tuning and Debugging: Demonstrated ability to tune Spark jobs, optimize query execution, and troubleshoot performance bottlenecks.
  • Experience with Cloud Platforms: Hands-on experience running Spark clusters on cloud platforms such as AWS, Azure, or GCP.
  • Containerization & Orchestration: Experience with containerized Spark environments using Docker and Kubernetes is a plus.

Good to have:

  • Certification in Apache Spark or related big data technologies.
  • Experience working with Acceldata's data observability platform or similar tools for monitoring Spark jobs.
  • Demonstrated experience with scripting languages like Bash, PowerShell, and Python.
  • Familiarity with concepts related to application, server, and network security management.
  • Certifications from leading cloud providers (AWS, Azure, GCP) and expertise in Kubernetes would be significant advantages.

Top Skills

Spark
Python
The Company
HQ: Campbell, CA
226 Employees
On-site Workplace
Year Founded: 2018

What We Do

Founded in 2018, Campbell, CA-based Acceldata has developed the world's first enterprise data observability platform to help enterprises build and operate great data products.

Acceldata's solutions have been embraced by global customers, such as Dun & Bradstreet, Verisk, Oracle, PubMatic, PhonePe (Walmart), and many more.

Acceldata investors include Insight Partners, March Capital, Industry Ventures, Lightspeed, Sorenson Ventures, Sanabil, and Emergent Ventures. Contact us to learn about the benefits of data observability.
