Software Development Engineer – Agentic AI

Posted 9 Hours Ago
Enterprise, TX, USA
In-Office
Senior level
Machine Learning • Cybersecurity
The Role
Design, build, and operate a production generative/agentic AI platform for autonomous SOC workflows. Lead end-to-end feature development, multi-agent pipeline engineering, observability, security, and incident response in federal environments.
Summary Generated by Built In

About Trellix  
Trellix is a global company redefining the future of cybersecurity. The company’s comprehensive, open, and native cybersecurity platform helps organizations confronted by today’s most advanced threats gain confidence in the protection and resilience of their operations. Trellix, along with an extensive partner ecosystem, accelerates technology innovation through artificial intelligence, automation, and analytics to empower over 50,000 business and government customers with responsibly architected security. More at https://trellix.com.

Role Overview:

Join our innovative team at Trellix, where you'll lead the design and development of a cutting-edge generative AI platform powering advanced AI capabilities across the entire Trellix security portfolio. This isn't prototype work. You'll be building and operating production agentic systems deployed in federal environments, enabling autonomous SOC workflows, multi-agent orchestration, and seamless AI integration across our security products. We're looking for a highly skilled Software Development Engineer with a passion for building robust, scalable, and secure AI solutions that operate at real-world scale.

About the Role:

  • Design and Develop: Lead the design and development of our generative AI platform, driving core functionality, agentic workflows, and platform-level features from concept through production.

  • End-to-End Ownership: Take full ownership of features and functions, from initial design and development through rigorous testing, automation, and ongoing operational health.

  • Agent Workflow Development: Build, iterate on, and harden multi-agent pipelines, including tool use, inter-agent coordination, and autonomous decision workflows for security operations.

  • Deliver High-Quality Solutions: Ensure solutions are delivered on time, within budget, and to the highest quality standards, meeting project goals and customer commitments.

  • Ensure Resilience & Security: Proactively implement best practices to ensure applications are highly resilient, secure, and performant, with particular attention to the sensitivity of security operations data.

  • Observability & Telemetry: Design and implement instrumentation using OpenTelemetry, contribute to operational dashboards, and surface platform health and usage insights to engineering leadership and stakeholders.

  • Technical Analysis & Documentation: Analyze feature requirements and produce detailed design documentation, architectural decision records, and async-friendly technical specs.

  • Production Reliability & Incident Response: Own production issues end-to-end, including triage, root cause analysis, post-mortems, and SLA commitments, for a platform operating in high-stakes environments.

About You:

  • Experience: 5+ years of professional experience in Python application software development, with demonstrated experience building and operating production AI or platform systems.

  • Operating Systems:

    • Linux proficiency is a must

    • Shell scripting (preferred)

  • Core Development Skills:

    • Excellent development and debugging skills in Python

    • Strong grasp of data structures and design patterns

    • Proficiency with REST and async Web APIs

    • CI/CD pipeline experience (GitHub Actions or equivalent)

    • Strong written communication for design docs and async collaboration

    • Ability to operate with autonomy in a fast-moving, ambiguous environment

  • AI/ML Knowledge:

    • Hands-on experience with Large Language Models (LLMs) in production

    • LangChain experience required, including building and operating stateful multi-agent workflows

    • Experience with prompt orchestration and chain composition

    • Familiarity with Agentic AI concepts and patterns (ReAct, chain-of-thought, tool use, Deep Agents)

    • Experience deploying and operating vLLM for self-hosted inference

    • Familiarity with MCP (Model Context Protocol) for agentic tool integration

  • Frameworks & Technologies:

    • FastAPI (preferred)

    • Node.js / TypeScript for tooling and API integration layers (preferred)

  • Databases:

    • Postgres (preferred)

    • Knowledge Graphs, including NebulaGraph or equivalent (preferred)

    • Vector Databases, including Qdrant or equivalent (preferred)

    • Embedding pipeline experience including chunking strategies and retrieval tuning (preferred)

  • Services & Tools:

    • Gunicorn or Uvicorn

    • OpenTelemetry (OTEL) instrumentation

    • Redis (preferred)

    • Langfuse or LangSmith for agent observability (preferred)

    • Kubernetes (preferred)

    • AWS: RDS, EKS, ElastiCache, Bedrock (preferred)

  • Domain Knowledge:

    • Working knowledge of threat detection, EDR telemetry, SOC workflows, or SIEM platforms strongly preferred

    • Understanding of Security Information and Event Management (SIEM) and Incident Response a plus

  • Soft Skills:

    • Excellent communication and collaboration skills with the ability to work effectively across engineering, product, and security research teams

    • Ability to communicate technical decisions and tradeoffs clearly to non-engineering stakeholders

    • Strong written communication for design documentation and distributed team collaboration

    • Comfortable operating with high autonomy and minimal oversight in a fast-moving, ambiguous environment

Company Benefits and Perks:

We believe that the best solutions are developed by teams who embrace each other's unique experiences, skills, and abilities. We work hard to create a dynamic workforce where we encourage everyone to bring their authentic selves to work every day. We offer a variety of social programs, flexible work hours and family-friendly benefits to all of our employees.

  • Retirement Plans

  • Medical, Dental and Vision Coverage

  • Paid Time Off

  • Paid Parental Leave

  • Support for Community Involvement

We're serious about our commitment to a workplace where everyone can thrive and contribute to our industry-leading products and customer support, which is why we prohibit discrimination and harassment based on race, color, religion, gender, national origin, age, disability, veteran status, marital status, pregnancy, gender expression or identity, sexual orientation or any other legally protected status.

Our Commitment to You:

At Trellix, we are committed to creating a safe and trustworthy experience for our customers, employees, and candidates. Please be aware that fraudulent recruiting activity can occur through fake job postings or impersonated communications.

Trellix conducts interviews through professional channels only and does not use text messages, instant messaging, or group chats for interviews. We will never request sensitive personal information—such as your date of birth, Social Security number, or national ID number—during the interview process.

Trellix also does not require candidates to pay fees, purchase products or services, or process payments of any kind as part of the recruiting or hiring process. Trellix will never keep any original work authorization documents that we may be required to review during the hiring process.

Skills Required

  • 5+ years of professional Python application software development
  • Linux proficiency
  • Excellent development and debugging skills in Python
  • Strong grasp of data structures and design patterns
  • Proficiency with REST and async Web APIs
  • CI/CD pipeline experience (GitHub Actions or equivalent)
  • Hands-on experience with Large Language Models (LLMs) in production
  • LangChain experience, including building and operating stateful multi-agent workflows
  • Experience with prompt orchestration and chain composition
  • Familiarity with Agentic AI concepts and patterns (ReAct, chain-of-thought, tool use, Deep Agents)
  • Experience deploying and operating vLLM for self-hosted inference
  • OpenTelemetry instrumentation and observability experience
  • Experience owning production reliability, incident response, and post-mortems
  • Familiarity with Model Context Protocol (MCP) for agentic tool integration
  • FastAPI
  • Node.js / TypeScript
  • Postgres
  • Knowledge Graphs (e.g., NebulaGraph)
  • Vector Databases (e.g., Qdrant)
  • Embedding pipeline experience including chunking and retrieval tuning
  • Gunicorn or Uvicorn
  • Redis
  • Langfuse or LangSmith for agent observability
  • Kubernetes
  • AWS services (RDS, EKS, ElastiCache, Bedrock)
  • Working knowledge of threat detection, EDR telemetry, SOC workflows, or SIEM platforms
  • Shell scripting

Trellix Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Trellix and has not been reviewed or approved by Trellix.

  • Leave & Time Off Breadth: Time off options include paid leave, flexible time off, volunteer time, and “unlimited PTO” reported in the U.S. Usage often depends on team norms and coverage but is viewed favorably where supported.
  • Parental & Family Support: Parental support features paid parental leave and family programs such as backup care, fertility, adoption, and neurodiversity assistance. Some accounts describe extended paid leave at full pay in practice.
  • Healthcare Strength: Core health coverage is comprehensive, including medical, dental/vision, mental-health/EAP access, and an integrated wellbeing approach. Coverage quality is characterized as solid and comparable to large tech employers.

The Company
HQ: Plano, Texas
3,118 Employees
Year Founded: 2022

What We Do

Trellix is a global company redefining the future of cybersecurity. The company’s open and native extended detection and response (XDR) platform helps organizations confronted by today’s most advanced threats gain confidence in the protection and resilience of their operations. Trellix’s security experts, along with an extensive partner ecosystem, accelerate technology innovation through machine learning and automation to empower over 40,000 business and government customers.
