System and Platform Operations Manager
- Employer: Epsilon
- Location: London
- Salary: Competitive
- Closing date: 10 Oct 2024
- Sector: Engineering
- Contract Type: Permanent
- Hours: Full Time
- Job Function: Operations
Job Description
Purpose
The Director, System & Platform Operations is a technical leadership role that is responsible for the support, reliability and stability of CitrusAd production systems, environments and offerings. The team owns the reliability vision for the company, driving continuous improvement through a combination of development and operations initiatives as well as process excellence. This position and its team have solid-line responsibility for operations, including the deployment, management, monitoring, reporting, troubleshooting, and repair of production systems. Core to the success of the role is providing a premium customer support experience built around a "centre of excellence" that allows for a full-service delivery support cycle.
The Platform Operations team is responsible for supporting all retailers once they are live. Critically important is how this team collaborates and liaises with other teams, such as Customer Support, Client Integration Engineering, Engineering and Customer Success. This role ensures production stability and facilitates rapid release of new products and features by balancing the needs of delivery teams and business stakeholders. They ensure flawless ongoing operational functionality to meet increasing customer demands. Collaborating closely with the Engineering team, they maintain system stability and support Customer Integration Engineering from an environments perspective. Additionally, they lead the team responsible for ensuring 24/7 service availability, which is crucial for CitrusAd's success.
Responsibilities
- Operational Practices
- Establish and manage operational practices, and ensure we design, implement and operate a support model that is fit for purpose for our future.
- Implement proactive solutions for incident and problem detection, response, remediation and continuous improvement.
- Be the owner of the operational integrity of all production environments
- Production Monitoring and Operational Reporting
- Adopt a "Measure Everything" approach to ensure that internal service level objectives and customer service level agreements are exceeded, including executive-level reporting on operational health metrics such as SLAs, incident resolution, performance, availability, reliability and capacity.
- Customer Support & Incident Management
- Own incident management processes and on-call response.
- Take ownership of complex issues related to performance, reliability, and scalability, and lead the resolution of serious incidents and events, including communications with customers and wider stakeholders.
- Change Management
- Uphold processes and procedures to manage change across production platforms.
- Provide insight and expertise on how customers will perceive changes and their impact, in order to drive customer organisation change management and communication.
- Empower the Delivery teams to release new products, features, updates and fixes quickly, while ensuring Platforms remain reliable and stable.
- System Reliability
- Work with the wider Engineering, Product, Delivery and Security teams to ensure that appropriate attention is given to production/system reliability.
- Establish Operational Practices in conjunction with the Product and Engineering teams (e.g. understanding how product feature development could affect the system's overall reliability and performance).
- Provide delivery status information on System Reliability initiatives to the IT Leadership Team and additional stakeholders, and ensure proper communication of changes to agreed milestones and of challenges, risks and blockers that may affect the outcome or agreed completion dates (with proactive suggestions to resolve them).
- IT Service Management
- Execute Service Management processes, including Change, Configuration, Service Level, Performance, Incident and Problem Management, to deliver a high level of support and system availability.
- Leverage industry standards and best practices for improving service levels and performance.
- Uphold Customer Support standards in line with Service Level Agreements
- Ensure SLAs and KPIs are met to the best of your ability, with particular focus on first level response times, escalation paths and resolution times.
- Uphold the IT Service and Support workflow, with a particular focus on ensuring a best-in-class customer experience.
- Deliver support and service solutions for the Group in line with industry best practice.
- Work as a team to ensure all SLAs and practices are well defined, documented and consistently applied and adhered to, in order to provide premium customer support services.
- Leadership and Direction
- Set and communicate the strategy for achieving the Group's mission, vision and values within the Technology and Operations space, together with the broad actions needed to implement it; inspire a large or diverse workforce to commit to these and to doing extraordinary things to achieve the organization's business goals.
- Own recruitment, hiring decisions and professional development for everyone in the Platform Operations space.
- In conjunction with the CTO, define methodologies and tools that are succinct to CitrusAd and used by all teams.
- Performance Management
- Manage and report on business performance; hold direct reports accountable for achievement of business plans, and take corrective action where necessary to ensure the achievement of business objectives, balancing the need to deliver short term business objectives with the longer term delivery of stakeholder value.
- Develop an OKR or KPI framework for the Platform Operations space that is linked to the CTO's and the wider company's vision and mission.
- Organisational Capability
- Identify the capabilities needed to meet the current and emerging business needs of a significant function.
- Evaluate current capabilities, identify gaps, and prioritize development activities.
- Embed personal development and the fulfillment of personal potential in the culture of the organization.
- Build capabilities elsewhere in the organization through mentoring and other informal methods.
- Organisational Planning
- Define the detailed organization structure to align with corporate principles, define the relationship between elements of the organization, and define the responsibilities of senior leaders, to enable the organization to achieve its business objectives.
- Technical Developments, Process Improvement and Simplification
- Discuss and recommend more complex or innovative technical developments to improve the quality of software and supporting infrastructure to better meet users' needs.
- As subject matter expert on the team, maintain understanding of current technology, database management, reliability practices, and future trends through ongoing education, conference attendance and industry press.
- Ensure all processes and procedures are documented to support continuous improvement activities.
- In collaboration with leadership and other teams, identify and implement process improvements and simplifications.
- Proactively identify new opportunities to drive improvements and simplification of our overall technology solutions.
- Personal Capability Building
- Develop own capabilities by participating in assessment and development planning activities as well as formal and informal training and coaching; gain or maintain external professional accreditation where relevant to improve performance and fulfill personal potential. Maintain an in-depth understanding of technology, external regulation, and industry best practices through ongoing education, attending conferences, and reading specialist media.
Requirements
- Strong people leadership skills, with 2+ years leading Operations teams within enterprise environments, and knowledge of DevOps, SRE, ITIL, Cloud Services, and IT Infrastructure and Operations, covering supporting and maintaining production and development environments and building cloud services that are secure, reliable, scalable and observable
- Experience implementing and managing Logging, Monitoring and Alerting frameworks
- Knowledge and experience of establishing deployment and automation pipelines
- Excellent verbal and written communication skills, with the ability to talk about technology intelligently and passionately to all levels of an organisation, including Developers, Architects and senior management (technical and non-technical)
- Strong organisational skills, and enjoyment of a dynamic and agile working environment
- Experience in establishing support strategies for SaaS or Cloud-based backends, with a particular focus on APM deployment (such as Dynatrace or other monitoring tools)
- Experience with establishing Service Delivery strategies that align with new ways of working, including Agile
- Experience in successfully managing complex stakeholder relations.
- Experience in leading and driving high-performance technical teams that are geographically distributed (international IT operations experience is a must)
- Experience and understanding of international requirements relating to data/information security.
- Experience in the design, development and management of commercial technology contracts, technical service level agreements, and KPIs
- Experience of managed/outsourced services