At phData, we build platforms exclusively for data and machine learning. Our services and software are used by the world's largest companies to solve their hardest problems.
Our work is challenging and our standards are high, but you'll be set up for success with tooling, training, and colleagues who are among the brightest minds in the field. You will be working with the latest cloud-native and distributed data platforms on the market. Since the data and machine learning industry changes quickly, you have the opportunity to continuously learn. Our strategy is to remain innovative and cutting edge, while also ensuring our work is practical and unlocks real business value for our customers.
While we're growing extremely fast, we maintain a casual, small business work environment. We hire top performers and allow them the autonomy to deliver results. Our award winning workplace fosters learning, creativity and teamwork.
- Best Places to Work (2017, 2018, 2019, 2020, 2021)
- Inc. 5000 Fastest Growing US Companies (2019, 2020, 2021)
- Cloudera Partner of the Year 2020
- Snowflake Elite Partner
- Snowflake Emerging Partner of the Year 2020
In this hands-on role, you will provide Hadoop Administration and ensure performance, reliability, and optimization for big data clusters as well as recommend resources required to deploy and optimize big data technologies.
- Installation, administration, and configuration of big data clusters
- Hadoop Administration; managing and monitoring distributed systems and middleware application performance; recommend, configure and optimize Hadoop ecosystem
- Configuration, troubleshooting and performance tuning of Java applications
- Hadoop security; LDAP, Active Directory and Kerberos (KDC) administration
- Hadoop encryption; HDFS transparent data encryption, LUKS and PKI methodologies
- 5+ years of Linux OS installation, configuration, administration and performance optimization as a Linux System or Java Middleware Engineer with emphasis on distributed computing.
- Experience integrating Linux OS with user authentication backend (LDAP/Active Directory)
- Hadoop experience including:
- Hadoop distribution (Cloudera preferred) including cluster installation and configuration
- Core Hadoop (HDFS, Hive, YARN) and on one or more ecosystem products/languages such as HBase, Spark, Impala, Search, Kudu, etc.
- Java application configuration and performance tuning
- Support for cloud-based big data clusters; AWS preferred
- Infrastructure automation; experience with Ansible and Git is a plus
- Experience scoping activities on large scale, complex technology infrastructure projects
- Proven experience working with key stakeholders and customers with the ability to translate “big picture” business requirements and use cases into a Hadoop solution, including ingestion of many data sources, ETL processing, data access, and consumption, as well as custom analytics.
Excellent communication skills and customer relationship management including project escalations, and participating in executive steering meetings
Perks and Benefits
- Medical Insurance for Self & Family
- Medical Insurance for Parents
- Term Life & Personal Accident
- Wellness Allowance
- Broadband Reimbursement
- Professional Development Allowance
- Reimbursement of Skill Upgrade Certifications
- Certification Bonus