Site Reliability Engineer

Bangalore, Karnataka, India

Job Description


Position: Site Reliability Engineer
Location: Chennai, Bangalore
Experience: 10-12 Years
Skills: SRE, DataDog, Azure DevOps, Jenkins, Octopus, Cloud, Python

  • Keep production workloads on AWS and GCP available, scalable, and reliable using Kubernetes (GKE), AWS ECS, and cloud-native services, with a focus on uptime.
  • Build and automate infrastructure with Terraform, CI/CD tools, and Python scripts to reduce manual tasks, lower error rates, and improve throughput.
  • Provide 24×7 on-call support to ensure system availability and quick issue resolution, minimizing MTTR.
  • Monitor infrastructure with telemetry, tracking latency, error rates, and other SLIs to ensure seamless operations.
  • Improve system performance by analyzing metrics from OS, containers, APIs, and apps to address issues early, focusing on response times and resource usage.
  • Automate deployments using CI/CD, ensuring performance, compliance, and cost efficiency.
  • Plan immutable infrastructure deployments with automated pipelines, ensuring low latency and cost optimization.
  • Collaborate with teams (.NET, Java, APIs, Python) to optimize testing and automate deployments for reliable releases, managing error budgets.
  • Design scalable systems, manage platforms for high demand, and monitor capacity and throughput.
  • Automate processes for efficiency and resource management, reducing saturation. Ensure binaries and configurations work across environments, with a focus on scalability.
  • Balance feature development with system stability by managing SLOs and error budgets.
  • Experience managing cloud infrastructure on AWS, GCP, and Kubernetes with a focus on scalability and SLO-driven performance.
  • Proficiency with tools like DataDog, Azure DevOps, Jenkins, and Octopus for code deployment and monitoring throughput and latency.
  • Strong background in software development, test automation, and infrastructure as code (Terraform) for efficient deployments.
  • Expertise in Python, .NET, or Java for automating tasks and optimizing performance and latency.
  • Familiarity with distributed storage systems, RPA toolsets, and large datasets, with a focus on cost efficiency and data throughput.
  • Experience with Kubernetes and AWS/GCP services for managing and monitoring resource usage.
  • Proactive in identifying bottlenecks, troubleshooting, and improving system performance.
  • Ability to design scalable systems to support business growth, ensuring SLO adherence.

Interested candidates can share their resume at rubi.jena@mnrsolutions.in

#SiteReliabilityEngineer #SRE #CloudComputing #AWS #GCP #Kubernetes #DataDog #AzureDevOps #Terraform #Python #DevOps #CI_CD #Jenkins #Octopus #InfrastructureAsCode #Automation #Monitoring #CloudInfrastructure #SoftwareDevelopment #Chennai #Bangalore #JobOpening #TechJobs #Hiring #ITCareers

Job Type: Full Time
Job Location: Bangalore, Chennai

MNR Solutions

Job Detail

  • Job Id
    JD3534391
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Bangalore, Karnataka, India
  • Education
    Not mentioned
  • Experience
    10-12 Years