PySpark Developer

Hiring Remotely in USA
Remote
Junior
Artificial Intelligence • Big Data • Analytics • Business Intelligence
The Role
Develop and maintain production-grade PySpark data and ML pipelines: ingest from multiple sources, implement distributed algorithms and complex SQL, optimize performance, integrate APIs/streams/files, and collaborate with data science to deploy scalable solutions.
Summary Generated by Built In

This is a remote position.

Please read the entire job post thoroughly before pressing Apply. After pressing Apply, you will reach an assessment page, which you must attempt.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
Busigence is a Decision Intelligence Company. We create decision intelligence products for real people by combining data, technology, business, and behaviour enabling strengthened decisions.


PySpark Developer

Team: Engineering

Location: Remote

Relevant Exp: 0-4 Years

Background: Been there-Done that

Compensation: Above industry standards


Requirements
This is an immediate requirement. We shall have an accelerated interview process for fast closure - you would be required to be proactive and responsive

Remote position (work-from-anywhere)

Immediate joiners must apply

Data engineering experience - courses/competitions/internships/jobs (<4 years)

Competitive compensation



************
| MUST HAVE |
************

1. Code in Python3 - NumPy?
2. Code in Python3 - Pandas?
3. Code in PySpark3 - Core?
4. Code in PySpark3 - SQL?
5. Developed data engineering pipelines on real-world problems (not just toy projects)?
6. Implemented advanced SQL queries?
7. Developed complex logic in PySpark3?
8. Confidence to learn PySpark3 MLlib within two weeks? https://spark.apache.org/docs/latest/api/python/reference/pyspark.ml.html (we shall guide but won't spoon-feed)



===========================================

We are offering some of the most challenging & exciting work on Data Pipelines and Machine Learning Pipelines. You shall be working on sophisticated platforms, products, and applications

===========================================



We are looking for a developer with a real passion for data science pipelines. This is a specialist and individual contributor role. Product development experience, preferably at a startup or on a lean team, is desired

ROLE

We are looking for engineers with a real passion for distributed computing and actual hands-on experience developing data applications on PySpark. You would be required to work with our data science team on the development of several data applications.


Mandatory

1. Must be able to fetch data from data sources (databases, APIs, flat files, etc.)
2. Must know the ins and outs of functional programming in Python, with a strong flair for data structures, linear algebra, & algorithm implementation
3. Must be able to convert, break down, & distribute existing Python code into functional programming syntax
4. Must have worked on at least one real-world project in production on PySpark
5. Must have implemented complex mathematical logic through PySpark at scale on parallel/distributed clusters
6. Must be able to recognize which code is more parallel and less memory-constrained, and show how to apply best practices to avoid runtime issues and performance bottlenecks
7. Must have worked on a high degree of performance tuning, optimization, configuration, & scheduling in PySpark
8. Must have integrated APIs, streams, databases, and files (JSON, XML, CSV, etc.) through PySpark
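
To illustrate item 3 above, here is a minimal sketch of converting a loop-based Python function to functional style. The function names and the sample ORDERS data are hypothetical, made up for illustration; this is plain Python, not PySpark, and only one of several reasonable ways to write it.

```python
from functools import reduce

# Hypothetical sample records: (customer_id, order_amount)
ORDERS = [("a", 120.0), ("b", 15.5), ("a", 30.0), ("c", 200.0), ("b", 4.5)]


def total_large_orders_imperative(orders, threshold):
    """Loop-and-mutate version: sums amounts at or above the threshold."""
    total = 0.0
    for _, amount in orders:
        if amount >= threshold:
            total += amount
    return total


def total_large_orders_functional(orders, threshold):
    """Same computation as a filter -> map -> reduce chain of pure functions."""
    large = filter(lambda rec: rec[1] >= threshold, orders)
    amounts = map(lambda rec: rec[1], large)
    return reduce(lambda acc, x: acc + x, amounts, 0.0)


if __name__ == "__main__":
    print(total_large_orders_imperative(ORDERS, 20.0))
    print(total_large_orders_functional(ORDERS, 20.0))
```

The filter, map, and reduce steps have direct counterparts on PySpark RDDs (rdd.filter, rdd.map, rdd.reduce), which is one reason code written in this shape tends to be easier to distribute.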


Preferred
1. Good to have working knowledge of first-class, higher-order, & pure functions, recursion, lazy evaluation, and immutable data structures
2. A firm understanding of the underlying mathematics will be needed to adapt modelling techniques to fit the problem space with large data (1M+ records)
3. Good to have worked on PySpark MLlib and PySpark ML
4. Configured checkpointing and Directed Acyclic Graphs (DAGs) on a PySpark cluster
5. Worked on the development of a data platform
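
The functional programming concepts named in item 1 above can be shown in a few lines of plain Python. This is a minimal sketch; every name in it (compose, square, increment, naturals, factorial) is hypothetical and chosen only for illustration.

```python
from itertools import islice


def compose(f, g):
    """Higher-order function: takes two functions and returns x -> f(g(x))."""
    return lambda x: f(g(x))


def square(x):
    """Pure function: the result depends only on the argument, no side effects."""
    return x * x


def increment(x):
    """Another pure function."""
    return x + 1


def naturals():
    """Lazy, infinite generator: values are produced only when requested."""
    n = 0
    while True:
        yield n
        n += 1


def factorial(n):
    """Recursion: n! defined in terms of (n - 1)!."""
    return 1 if n <= 1 else n * factorial(n - 1)


# Functions are first-class values, so they can be passed to compose().
square_then_increment = compose(increment, square)

# map() is lazy; islice() pulls only the first five results from the infinite
# stream, and tuple() stores them in an immutable data structure.
first_five = tuple(islice(map(square_then_increment, naturals()), 5))

if __name__ == "__main__":
    print(first_five)
    print(factorial(5))
```

Lazy evaluation is also how PySpark transformations behave: they describe a computation, and nothing runs until an action requests a result.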


Benefits
How to Apply
You should apply online by clicking "Apply Now". 

For queries regarding an open position, please write to [email protected]

For more information, visit http://www.busigence.com
Products: http://busigence.com/offering

Careers: http://careers.busigence.com

Research: http://research.busigence.com

Jobs: http://careers.busigence.com


We are a scaling, established startup innovating in & disrupting various domains through artificial intelligence. We bring on board people who are dedicated to delivering wisdom to humanity by solving the world’s most pressing problems differently, thereby significantly impacting thousands of souls every day.

We are a deep-rooted organization with a success story, having worked with folks from top-tier backgrounds (IIT, NSIT, DCE, BITS, IIITs, NITs, IIMs, ISI etc.) and maintaining an awesome culture with a common vision to build great data products. In the past we have served fifty-five customers, and we are presently developing Enterprise AI products. More details at http://busigence.com/offering

We work extensively & intensely on big data, data science, machine learning, deep learning, reinforcement learning, data analytics, natural language processing, cognitive computing, and business intelligence. 

We look for talent, not skills
1. You must be [super sharp]
2. You must be [extremely energetic]
3. You must be [honourably honest]


In addition to the regular stuff which every good startup offers – Lots of learning, Food, Parties, Open culture, Flexible working hours, and what not….

We offer you: [Greatest work of life]


You shall be working on our revolutionary products, which are pioneers in their respective categories. This is a fact.

We try real hard to hire fun-loving, crazy folks who are driven by more than a paycheque. You shall be working with the creamiest talent on extremely challenging problems at the most happening workplace


Skills Required

  • Proficient coding in Python3 (including NumPy)
  • Proficient coding in Python3 (including Pandas)
  • Proficient in PySpark3 core APIs
  • Proficient in PySpark3 SQL
  • Built real-world data engineering pipelines (production)
  • Implemented advanced SQL queries
  • Developed complex logic in PySpark at scale
  • Willingness/confidence to learn PySpark MLlib quickly
  • Fetch and ingest data from databases, APIs, and flat files
  • Strong functional programming skills in Python and data structures
  • Convert existing Python code to functional/distributed style
  • Implemented complex mathematical logic using PySpark on clusters
  • Ability to identify parallelism vs memory constraints and apply best practices
  • Experience with PySpark performance tuning, optimization, configuration, and scheduling
  • Integrated APIs, streams, databases, and various file formats through PySpark
  • Familiarity with functional programming concepts (higher-order functions, immutability)
  • Understanding of underlying mathematics for large-data modeling (1M+ records)
  • Experience with PySpark MLlib and PySpark ML
  • Configured checkpointing and DAGs on PySpark clusters
  • Experience developing data platforms

The Company
0 Employees
Year Founded: 2012

What We Do

Busigence is a Decision Intelligence company that creates decision intelligence products for real people by combining data, technology, business, and behavior to enable strengthened decisions. Founded by IIT alumni, the company focuses on highly disruptive big data technologies, utilizing artificial intelligence and machine learning to deliver actionable business and relationship intelligence solutions.
