Timestamp Group aggregates several leading Portuguese IT solutions and services companies around the concepts of excellence and knowledge sharing. We are committed to technological leadership, based on the quality of our service and technological solutions, supported by continuous training and certification.
Role: SRE Consultant - Azure API Management
Responsibilities:
Maintaining the reliability and performance of the API infrastructure within the Azure GenAI platform, focusing on the Secure GPT service;
Taking ownership of service issues and incidents, driving through to resolution and communicating updates to stakeholders;
Identifying opportunities to improve the processes and procedures related to API Operations and driving the implementation of these improvements, particularly in observability automation and customer service delivery;
Acting as a point of escalation for stakeholders seeking assistance with API Operations and ensuring a seamless user experience;
Resolving any API-related issues to ensure a seamless user experience;
Ensuring compliance with security and regulatory requirements related to the Secure GPT service;
Supporting the broader Secure GPT team in delivering exceptional customer service to internal stakeholders.
Technical Skills Required:
Experience as a Site Reliability Engineer, or in operational or service management roles, with a specific focus on API Operations within the Azure environment and OpenShift (Kubernetes) (at least 5 years);
Analytical expertise, identifying issues and root causes quickly and accurately;
Problem-solving expertise with a proactive and solution-oriented approach;
Ability to learn new technologies and tools quickly;
Expertise in communicating technical information to non-technical stakeholders;
Expertise in project management, managing multiple priorities and stakeholders;
Driving initiatives forward independently.
Technical Expertise - Must Have:
Expertise in Azure API operations, management and monitoring;
Software Engineering expertise (Continuous Integration / Continuous Delivery [CI/CD], Test Driven Development, etc.);
Expertise in Infrastructure as Code (IaC), with expertise in DevOps practices for both frontend and backend services;
Expertise with Terraform and/or Bicep for automatic infra deployment;
Expertise in Site Reliability Engineering concepts (SLA, SLO, SLI, Error Budgets, Toil Reduction, Automation, Incident Management, Monitoring and Observability, Capacity Planning and Demand Forecasting, Risk Management, Collaboration and Communication).
Technical Expertise - Nice to Have:
Expertise with ChatGPT or other generative AI technologies;
Experience working in a highly regulated industry;
Expertise in data security and regulatory requirements.
Place: Lisbon (hybrid)
Start: ASAP