
APPLICATION MANAGER

Atos-North America
United States, Arizona, Phoenix
13430 North Black Canyon Highway
May 06, 2026

About Atos Group

Atos Group is a global leader in digital transformation with c. 63,000 employees and annual revenue of c. €8 billion, operating in 61 countries under two brands - Atos for services and Eviden for products. European number one in cybersecurity, cloud and high performance computing, Atos Group is committed to a secure and decarbonized future and provides tailored AI-powered, end-to-end solutions for all industries. Atos Group is the brand under which Atos SE (Societas Europaea) operates. Atos SE is listed on Euronext Paris.

The purpose of Atos Group is to help design the future of the information space. Its expertise and services support the development of knowledge, education and research in a multicultural approach and contribute to the development of scientific and technological excellence. Across the world, the Group enables its customers and employees, and members of societies at large to live, work and develop sustainably, in a safe and secure information space.

Position - AI Engineer

Location - Phoenix, AZ

Type - Full-time

Job Description

Program Details:
- Design, build, and ship LLM-powered and agentic product features that enhance the team's efforts and outcomes.
- Build agentic AI systems that reason over context, invoke tools, take real actions, and recover gracefully from failure.
- Integrate existing AI tools; knowledge of major AI frameworks and libraries is expected.
- Own service reliability and operational governance by defining SLAs, managing error budgets, and reporting reliability metrics (MTTD, MTTR) to leadership for prioritization, risk decisions, and planning.
- Architect and continuously optimize the observability of the platform using Kibana/Elastic (ELK) along with other observability tools such as Prometheus and Grafana (dashboards, metrics, alert lifecycle), improving detection quality, reducing noise and toil, and enabling faster triage and measurable uptime improvements.
- Engineer advanced alerting and automation capabilities with Kibana alerting and anomaly detection, integrating response workflows (routing, runbooks, remediation scripts) to standardize on-call execution and accelerate restoration of services.
- Lead incident response for customer-impacting issues across teams (coordination, communications, service restoration, and blameless RCA), then drive corrective actions that prevent recurrence and reduce operational risk.
- Design, automate, and validate Disaster Recovery and failover for critical services and journeys (RTO/RPO alignment, DR drills), ensuring resiliency under failure scenarios and improving recovery.
- Consult and partner with application teams by providing production readiness inputs (resiliency patterns, availability, performance and capacity considerations) and driving platform enhancements that improve stability while optimizing infrastructure and observability spend.
Primary Skills:
- LangChain, LangGraph, RAG, MCP
- Experience working with LLMs and integrating them with existing applications
- Python - FastAPI
- Cache - Redis

Secondary Skills:
- Observability - ELK (Elastic/Kibana), Prometheus, Grafana, PromQL
- Software and automation - Java, Vert.x, Python/Shell/Bash, REST/SOAP APIs, Docker containerization, Kubernetes, Kafka
- Reliability and DR engineering - Distributed architecture and distributed system fundamentals, microservices and event-driven architecture
- Cross-team coordination, incident triage and resolution, leadership and stakeholder management

Here at Atos, diversity and inclusion are embedded in our DNA. Read more about our commitment to a fair work environment for all.

Atos is a recognized leader in its industry across Environment, Social and Governance (ESG) criteria. Find out more on our CSR commitment.

Choose your future. Choose Atos.

