SRE Principal

MH, IN, India

Job Description

Location: Pune

Employment Type: Contract

Designation: SRE Principal




Job Details




- SRE Principal Engineer




JOB MISSION:


New Balance's Direct-to-Consumer Engineering team is responsible for creating, maintaining, and providing customer service for its branded eCommerce websites. We seek talented individuals who fit into our team-oriented atmosphere, and we are proud to have an environment that offers the comfort of a true work/life balance.




The Principal Site Reliability Engineer will play a lead role in the production environment by monitoring availability and taking a holistic view of system health. They will build software and systems to manage platform infrastructure and applications; improve reliability, quality, and time-to-market of our suite of software solutions; and measure and optimize system performance - all with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.



Responsibilities

- Ensure availability, latency, performance, and efficiency of our global eCommerce sites.
- Drive change management and incident management.
- Promote best practices and innovative observability to guide product delivery teams in achieving operational excellence for new product deliveries.
- Drive operational excellence and evangelize best practices in observability.
- Develop unified observability dashboards and implement E2E observability requirements.
- Design innovative observability solutions for internal and external stakeholders.
- Contribute to observability instrumentation standards and create repeatable patterns for engineering teams.
- Define and implement E2E observability requirements and lead teams to support E2E best practices.
- Collaborate with cross-functional teams to achieve objectives and drive high reliability into systems.
- Build proprietary tools to mitigate weaknesses in incident management or software delivery.
- Implement SRE best practices to increase system reliability and performance.
- Automate processes for improved collaborative response and prepare teams for incidents.
- Maintain error budgets, meet SLOs, and support uptime and availability of critical platform components.
- Automate technology stacks to improve operating costs while responding to traffic spikes.

Location: Pune - NBIT Office. In-person attendance is mandatory on Tuesday, Wednesday, and Thursday each week.

Work timings: First 3 months in EST for onboarding and ramp-up, then a move to IST work timings for 8 hours, with a possible 1-hour overlap in the evening with the US team in EST (10am to 7pm).


Required Skills and Experience:

- Bachelor's Degree in Computer Science, Information Science, Engineering, or a related field.
- 10+ years of experience in code management, deployment processes, procedures, and tools in a DevOps or SRE role.
- Experience with monitoring tools (preferred: Dynatrace, Splunk, Datadog, Grafana, and New Relic).
- Proficiency in state-of-the-art observability trends, tools, products, and technologies.
- Ability to identify organization-wide gaps in the SRE practice and implement solutions that contribute to organizational transformation.
- Experience driving cross-organization adoption of new technologies or initiatives.
- Ability to influence senior management in selecting the right strategy, processes, and structures to transform the organization into a modern SRE team.
- Proactive in identifying performance bottlenecks, anomalous system behavior, and addressing root causes of service issues.
- Passionate about technology with a strong sense of curiosity and a desire to improve processes, automate everything, and continuously learn.
- Successful experience supporting a cloud production environment (strong preference for Azure).
- Competency in one or more programming languages for automation (Python strongly preferred).
- Knowledge of cloud deployment tools and methodologies (ideally Ansible, but Terraform, Azure DevOps, etc. are also considered).
- Deep understanding of Kubernetes and Docker architecture and associated tools.
- Experience with at least one configuration management solution (e.g., Chef, Ansible, AWS CodeDeploy).
- Proficiency with repository and pipeline-related tools (e.g., GitLab, Jenkins, Bamboo, Travis, CircleCI).
- Experience with implementing and using various application and infrastructure monitoring tools.
- Strong troubleshooting skills.
- Ability to take ownership and deliver solutions autonomously.



Job Detail

  • Job Id
    JD3653830
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    MH, IN, India
  • Education
    Not mentioned
  • Experience
    Year