Monitoring Engineer Gurgaon

Gurgaon, Haryana, India


Key Responsibilities

24x7x365 on-call support (in rotation) to manage and execute the Incident Management process.
Fast and effective response to service failure alerts and notifications from a range of systems.
Impact and severity assessments of service failures for both internal and external stakeholders.
Management of Bank/PG downtime or other services against SLA targets. Escalation of downtime within the Bank/PG, as well as internally.
Accurately tracking progress and escalations of issues in internal ticketing systems.
Updating merchants/internal stakeholders on the status of any service outage, either directly by phone and email or via the ticketing tool.
Notifying merchants via email of any planned maintenance, whether internal or at the Bank/PG.
Managing the outcomes of Reason for Outage (RFO/RCA) and Major Incident Reports (MIR) both internally and externally.
Hands-on experience with databases (SQL).
Hands-on experience with Python and shell scripting.
Software development to automate repeatable Operations tasks (TOIL).
SRE metrics and monitoring strategy (SLI, SLO, etc.).
Scheduling and leading all continuous improvement activities, including incident reviews, change implementation reviews, TOIL automation candidate areas, etc.
Optimizing the Software Development Life Cycle (SDLC), based on post-incident reviews, to boost service reliability.
Documenting the knowledge gained, where required, to ensure a seamless flow of information between teams.

Must have:
