Lead Systems Operations Engineer

Hyderabad, Telangana, India

Job Description

NTT DATA Services is currently seeking an L3 Support engineer (SRE and Middleware with Scripting) to join our team in Bangalore.

Responsibilities:
  • Drive innovation in digital technology and innovation application portfolios, and increase efficiency through automation, SRE, and Agile practices, with an emphasis on enhancing the end-user experience.
  • Lead the team on all technical issues related to the app and web tiers.
  • Apply expert knowledge of middleware administration (WebLogic, Tomcat, Apache, IIS) and strong working experience in production support of middleware applications.
  • Drive automation of manual, repetitive operational tasks and engineer solutions to automate production game plans.
  • Perform trend analysis of repetitive production issues and engage relevant operation/development teams to address the failure patterns and incidents.
  • Drive adoption of self-healing and resiliency patterns.
  • Improve end-to-end application and system observability by refining the alarm setup or developing new dashboards with monitoring, log analysis, and analytics tools such as Splunk, AppDynamics (AppD), Elasticsearch, Power BI, Tableau, etc.
  • Work closely with the enterprise SRE team to perform SRE maturity assessments for applications in scope; baseline current-state metrics; establish SLIs/SLOs, error budgets, service levels, and monitoring, alerting, and recovery objectives; and perform periodic resiliency testing for all applications in scope.
  • Manage the toil registry created for the group and reduce toil by fine-tuning the existing monitoring/alarming setup or by developing tools to automate routine tasks using Ansible, shell scripting, etc.
  • Develop self-healing solutions for alarms to help reduce production incidents.
  • Enhance or fix bugs in the existing patching and production release install scripts to improve the success ratio, and own or participate in root cause analysis using the 5 Whys approach.
  • Recommend infrastructure-level solutions by proactively analyzing low-level errors in application logs that would otherwise go undetected, in order to enhance the customer experience.
  • Direct large-scale projects and application implementations from proof of concept through testing and installation.
  • Troubleshoot high-severity production incidents in real time, and improve system availability and reliability by facilitating blameless postmortems to prevent problem recurrence.
  • Apply analytics to historical monitoring or incident data to predict issues and take proactive action.
  • Gather and analyze statistics to assist architecture engineering and development teams with the capacity planning required to support projected transaction volumes, response times, and system availability targets.
  • Collaborate with enterprise partners on issues and initiatives that impact the infrastructure.
  • Add value to team delivery, work with the team to complete tasks to a high standard, and actively learn new skills and technologies.

Essential Qualifications:
  • Bachelor's degree or equivalent experience in any software engineering discipline.
  • 10+ years' experience in production support and SRE implementation in large-scale environments (preferably in the banking domain).
  • Hands-on experience with web and middleware platforms (Apache, Tomcat, WebLogic, PCF, etc.) in Linux/Windows environments.
  • Hands-on experience supporting PCF applications and microservice architecture-based applications.
  • Hands-on experience with monitoring, log analysis, and dashboard tools such as AppDynamics, Splunk, Elasticsearch, Netcool, Power BI, Tableau, etc.
  • Proficiency in shell scripting, Ansible, and one programming language such as Python or JavaScript.
  • Good knowledge of DevOps tools (GitHub, Jenkins, UCD) and cloud platforms such as GCP.
  • Knowledge of database and network environments.
  • Good knowledge of Agile and the ITIL framework.
  • Strong analytical and problem-solving abilities, with quick adaptation to new technologies, methodologies, and systems.
  • Strong written and oral communication skills, strong documentation skills, and the ability to work independently.
  • Self-learner who can understand the technology environment and deliver quickly.
  • Willingness to work in shifts (24x7 model).

Desired Qualifications:
  • Experience in the Unix server support domain.
  • Cloud certification.
  • Experience with Tableau, MicroStrategy, or similar BI tools.
  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.

Job Detail

  • Job Id
    JD3340201
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Hyderabad, Telangana, India
  • Education
    Not mentioned
  • Experience
    Year