Senior HPC Engineer

Sorry, this job was removed at 4:49 a.m. (CST) on Thursday, June 8, 2023

Level: Experienced
Job Location: Dallas, TX
Position Type: Full Time
Travel Percentage: Negligible
Job Category: Information Technology

Description

Position Summary:

As a hands-on engineer for the HPC platform, you will design and review a future-proof HPC compute platform in collaboration with the computation infrastructure team. You will maintain the stability of the existing production HPC platform while establishing standards, processes, and a uniform, repeatable deployment strategy for future HPC platforms. You will play a key role in troubleshooting the HPC computing platform, including software, hardware, container virtualization, and network issues, even when these extend beyond the platform. Your main duties include defining robust processes for software and platform development, installation, and upgrades (Ansible, K8s, containers, Docker, shell scripts).

Job Responsibilities:

  • Support software applications on the HPC platform in both research and production environments
  • Identify, design, and implement architecture solutions that efficiently and effectively meet the high-throughput requirements of image processing computing infrastructure
  • Enhance, debug, and maintain the legacy computation software system.
  • Analyze the performance of the computation system to help identify performance bottlenecks.
  • Software issue analysis, debugging, and technical support.
  • Implement unit tests and follow good practice in integration testing, regression testing, and documentation.
  • Collaborate on and evaluate designs and solutions for cloud applications, hardware, and software.
  • Apply parallel computing techniques on multi-core computational systems
  • Collaborate closely with manufacturing and design teams
  • Create and maintain the Linux OS environment playbooks used in software deployment.
  • Support development teams in San Jose and at other sites when they experience potential software platform issues
  • Identify the implications of moving from one software version to the next.
  • Develop automated tests that can be reused across platform changes and upgrades to ensure no regressions are introduced.
  • Work with Linux and Python for test execution and scripting.

Required Qualifications:

  • Associate’s or Bachelor’s Degree or equivalent experience in related field
  • 10+ years of IT experience
  • 10 years of experience designing & architecting Linux environments (specifically Linux, HPC)
  • Experience with Load Sharing Facility (Platform LSF) is highly desirable
  • Experience with IBM HPC (High-Performance Compute) platform is highly desirable
  • Experience administering and supporting Ansible or another configuration management system (CMS)
  • General experience with MMR (management, monitoring, and reporting), specifically with Nagios and/or the ELK stack, is desirable
  • Experience configuring and maintaining SELinux and firewall
  • User maintenance tasks with knowledge of the integration with Active Directory
  • Ability to set up and install a full Linux, Apache, MySQL, MongoDB, and PHP (LAMP) environment from scratch
  • Ability to set up and administer a Subversion/Scmbug/Bugzilla system for version control
  • Knowledge of Linux networking setup
  • Understanding of Yum and RPM for package management
  • Ability to write scripts in one of the major shell scripting languages for use in cron and for system administration
  • Understanding of Postfix configuration
  • Understanding of Samba
  • Understanding of SSH, RSA keys, and their setup
  • Understanding of the Linux init process
  • Understanding of how to monitor CPU, memory, disk space, and overall performance is essential
  • Strong communication skills and the ability to work well with others are essential
  • Understanding of cloud technologies, such as AWS, Azure, etc.
  • Experience with IBM HPC, GPFS, and TSM (Tivoli Storage Manager)
  • Background in Perl, grep, sed, awk
  • Understanding of how enterprise server hardware is set up and how to add devices to the configuration
  • Expertise with high-performance networking, ideally with MPI, NCCL, RDMA, and/or InfiniBand
  • Experience with GPUs in large scale networks strongly preferred
  • Deep understanding of TCP/IP and the Linux networking stack
  • Experience developing high-quality software in a general-purpose programming language (Python, C, C++, Go, etc.)
  • Experience with virtualization and container architecture in cloud environments
  • Configuring, administering, and supporting network storage subsystems (e.g. IBM, NetApp)

Preferred Qualifications:

  • Working knowledge of Microsoft Windows System Administration in order to be able to communicate effectively with other members of the SysAdmin team
  • Red Hat certification is a definite plus
  • Experience working with EMC, IBM, or enterprise storage technologies

Core skills & Competencies:

  • Ability to collaborate with others
  • Excellent communication skills

This job description reflects management’s assignment of essential functions. Nothing in this job description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.

More Information on Vodastra Technologies
Vodastra Technologies operates in the Other industry. The company is located in Dallas, TX, and was founded in 2005. It has 32 total employees and 55 open jobs. It offers perks and benefits such as a remote work program.