AI DevOps Engineer
Tlalnepantla, Ciudad de México, MX, 54070
Ingersoll Rand is committed to achieving workforce diversity reflective of our communities. We are an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.
Role Summary
Enable and scale Ingersoll Rand’s GenAI program by designing, building, and operating the production infrastructure that powers AI-driven applications across the enterprise. This role focuses on DevOps, cloud infrastructure, CI/CD, observability, and platform reliability for GenAI systems built on LLM APIs and Snowflake-native capabilities.
Own the operational lifecycle of LLM-powered systems, including prompt versioning, model configuration, cost controls, and production reliability across Snowflake-native and API-based GenAI platforms.
You will work closely with AI engineers and application developers to turn prototypes into secure, reliable, observable, and scalable AI applications, ensuring smooth integration with enterprise systems and data platforms. This is a DevOps and platform engineering role with a strong focus on production-grade AI systems.
The Core Challenge
GenAI teams can build powerful applications quickly using LLM APIs, but productionizing them at enterprise scale is hard. Challenges include environment consistency, secure data access, observability, cost control, CI/CD automation, and reliable integration with core business systems.
This role bridges that gap by providing standardized infrastructure, deployment pipelines, and operational frameworks so AI teams can move fast without sacrificing reliability, security, or governance.
Key Responsibilities
GenAI Platform & Infrastructure
- Design, build, and maintain cloud infrastructure to host GenAI applications using GCP and Snowflake container services
- Support Snowflake-based AI workflows including data ingestion, Cortex Agents, Cortex Analyst, and Cortex Search
- Define standardized, reusable infrastructure patterns for AI applications across development, staging, and production environments
- Implement cost-aware infrastructure patterns (warehouse sizing, service isolation, token budgeting) for GenAI workloads
- Explore, build, and support proof-of-concept initiatives to evaluate emerging GenAI and MLOps platforms and architectures, focusing on deployment, orchestration, monitoring, and governance of LLM-based systems
CI/CD & Automation
- Build and maintain CI/CD pipelines using GitHub Actions for AI applications and platform services
- Automate infrastructure provisioning and environment configuration using Infrastructure-as-Code
- Enable safe, repeatable deployments with versioning, rollback, and environment promotion strategies
Observability & Reliability
- Implement observability for GenAI systems using Langfuse and Snowflake observability tools to continuously improve AI system reliability and usefulness
- Monitor application health, latency, usage, errors, and cost using dashboards, alerts, and runbooks to support reliable production operations
Cloud & Container Operations
- Manage containerized workloads across GCP and Snowflake container services
- Ensure secure networking, secrets management, access controls, and environment isolation
- Optimize performance, scalability, and cost for AI application workloads
Enterprise Integrations
- Support and operationalize integrations between GenAI applications and enterprise systems such as SAP, Salesforce, SharePoint, and other internal/external platforms
- Ensure reliability, security, and observability of API-based and event-driven integrations
Collaboration & Enablement
- Partner closely with AI engineers, data engineers, and IT teams to remove operational blockers
- Provide documentation, templates, and best practices that enable teams to deploy and operate independently
- Contribute to standards for security, reliability, and governance across the GenAI platform
Required Qualifications
- 3+ years in DevOps, platform engineering, or software infrastructure roles; 1-2+ years specifically with ML/AI infrastructure or MLOps
- Experience operating LLM-based applications in production, including prompt management, cost monitoring, and reliability practices
- Strong experience with CI/CD pipelines (GitHub Actions preferred)
- Hands-on experience with containerized applications (Docker; Kubernetes or managed container platforms)
- Experience operating workloads on GCP or similar cloud platforms
- Proficiency with Infrastructure-as-Code tools (Terraform or equivalent)
- Strong scripting skills (Python and/or Bash)
- Experience implementing monitoring, logging, and observability for production systems
- Experience supporting API-based applications and integrations
- Ability to troubleshoot and operate complex distributed systems
- Strong communication skills and ability to collaborate across technical and business teams
- Fluent in English (written and spoken)
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, IT, or related field (or equivalent experience)
Preferred Qualifications
- Experience with Snowflake, including data ingestion pipelines and Snowflake-native applications
- Familiarity with GenAI application architectures (RAG, agents, prompt orchestration, API-based LLM usage)
- Experience with Langfuse or similar AI observability tools
- Experience integrating enterprise systems (SAP, Salesforce, SharePoint, etc.)
- Experience with data versioning tools (DVC, Pachyderm, LakeFS)
- Knowledge of vector databases and LLM infrastructure (Pinecone, Weaviate, Milvus, Chroma)
- Cloud or MLOps certifications (AWS Machine Learning Specialty, AWS Solutions Architect, Kubernetes CKA/CKAD, Azure AI Engineer, GCP ML Engineer)
- Manufacturing or industrial IoT experience
- Experience with compliance and governance frameworks for AI/ML systems
What This Role IS
- Infrastructure engineer who enables AI teams to move faster through automation and robust tooling
- Systems thinker who balances reliability, scalability, and cost efficiency
- Bridge between AI innovation and production operations who translates complex requirements into practical solutions
- Continuous learner who keeps current with rapidly evolving AI-Ops ecosystem and cloud-native technologies
Ingersoll Rand Inc. (NYSE:IR), driven by an entrepreneurial spirit and ownership mindset, is dedicated to helping make life better for our employees, customers and communities. Customers lean on us for our technology-driven excellence in mission-critical flow creation and industrial solutions across 40+ respected brands where our products and services excel in the most complex and harsh conditions. Our employees develop customers for life through their daily commitment to expertise, productivity and efficiency. For more information, visit www.IRCO.com.