Senior Site Reliability Engineer

Posted 8 Days Ago
Hiring Remotely in México
Remote
Senior level
Artificial Intelligence • Cloud • Information Technology • Machine Learning • Natural Language Processing • Consulting
Cutting-edge AI, Machine Learning and Cloud Transformation company at the forefront of technological innovation
The Role
The Senior Site Reliability Engineer at Nexaminds is responsible for designing, implementing, and maintaining reliable infrastructure while automating provisioning and monitoring systems, collaborating across teams, and optimizing performance.
Summary Generated by Built In

Unlock Your Future with Nexaminds!

At Nexaminds, we're on a mission to redefine industries with AI. We're passionate about the limitless potential of artificial intelligence to transform businesses, streamline processes, and drive growth.

Join us on our visionary journey. We're leading the way in AI solutions, and we're committed to innovation, collaboration, and ethical practices. Become a part of our team and shape the future powered by intelligent machines. If you're driven by ambition, success, fun, and learning, Nexaminds is where you belong.

Location: MEXICO

Eligibility Notice: This position is only open to candidates who are Mexican citizens currently residing in Mexico. Applications from candidates who do not meet this legal and operational requirement will not be considered. We appreciate your interest and encourage you to apply to roles that match your location.

Nexaminds is looking for a Senior Site Reliability Engineer to design, implement, and maintain scalable, reliable, and secure infrastructure supporting critical applications and services. The ideal candidate has strong experience with cloud technologies, automation, observability, and performance optimization, and enjoys working in a fast-paced, highly collaborative environment.

Qualifications we are looking for:

  • 5+ years of experience as a Site Reliability Engineer or in a similar role supporting production environments.
  • Strong experience with AWS cloud infrastructure and Kubernetes.
  • Hands-on experience with Infrastructure as Code (IaC) using Terraform.
  • Experience with automation and scripting using Python, Bash/Shell, or Go.
  • Deep understanding of Linux/Unix systems and networking fundamentals.
  • Experience with monitoring and observability tools such as Datadog, Prometheus, and Grafana.
  • Familiarity with CI/CD pipelines and DevOps best practices.
  • Strong troubleshooting, performance optimization, and root cause analysis skills.
  • Excellent communication and collaboration skills.
  • Advanced English communication skills.

Preferred Qualifications:

  • Experience with Terragrunt, Puppet, Rundeck, Java, or Spring Framework.
  • Familiarity with observability tools such as Loki, Vector, or VictoriaMetrics.
  • Experience working with additional cloud platforms such as GCP or Azure.
  • AWS or Kubernetes certifications such as AWS Solutions Architect or Certified Kubernetes Administrator (CKA).

Job duties:

  • Design, build, and maintain scalable infrastructure across AWS and hybrid/on-premise environments.
  • Automate infrastructure provisioning and configuration using Terraform, Terragrunt, Puppet, and scripting tools.
  • Develop and maintain CI/CD pipelines and infrastructure automation workflows.
  • Monitor system reliability, performance, and availability through observability and alerting solutions.
  • Implement and manage SLOs/SLIs to support reliability and operational goals.
  • Troubleshoot complex production issues across infrastructure, Kubernetes, networking, databases, and application layers.
  • Execute production changes with strong operational discipline, including rollback planning and validation.
  • Collaborate with security teams to implement and maintain security and compliance best practices.
  • Participate in incident response, root cause analysis, postmortems, and on-call rotations.
  • Design and validate disaster recovery and business continuity strategies.
  • Support cloud cost optimization initiatives and FinOps practices.
  • Maintain operational documentation, runbooks, and troubleshooting guides.
  • Collaborate closely with engineering, platform, and cross-functional teams to drive successful project delivery.

What you can expect from us

Here at Nexaminds, we're not your typical workplace. We're all about creating a friendly and trusting environment where you can thrive. Why does this matter? Well, trust and openness lead to better quality, innovation, commitment to getting the job done, efficiency, and cost-effectiveness.

  • Stock options 📈
  • Remote work options 🏠
  • Flexible working hours 🕜
  • Benefits above the legal minimum

But it's not just about the work; it's about the people too. You'll be collaborating with some seriously awesome IT pros. You'll have access to mentorship and tons of opportunities to learn and level up.

Ready to embark on this journey with us? 🚀🎉 If you're feeling the excitement, go ahead and apply!


The Company
80 Employees
Year Founded: 2023

What We Do

Nexaminds is a cutting-edge AI and Cloud Transformation company at the forefront of technological innovation. We specialize in developing advanced artificial intelligence solutions that empower businesses to unlock their full potential. With a team of skilled AI and Cloud experts and a passion for revolutionizing industries, we are dedicated to delivering intelligent, scalable, and customized solutions tailored to meet our clients' specific needs. We strive to create innovative AI-driven solutions that enhance efficiency, productivity, and decision-making processes across industries. By leveraging state-of-the-art technology and fostering a culture of creativity, we aim to be the trusted partner that enables organizations to embrace the future with confidence.


