Sr. HPC Engineer: LSF/Slurm, RHEL/CentOS/SUSE, Ansible, VDI, NetApp Storage, EDA Tools

Bangalore, Karnataka, India

Job Description


Company Description
At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible.

At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we've been doing just that. Our technology helped people put a man on the moon.

We are a key partner to some of the largest and highest growth organizations in the world. From energizing the most competitive gaming platforms, to enabling systems to make cities safer and cars smarter and more connected, to powering the data centers behind many of the world's biggest companies and public cloud, Western Digital is fueling a brighter, smarter future.

Binge-watch any shows, use social media or shop online lately? You'll find Western Digital supporting the storage infrastructure behind many of these platforms. And that flash memory card that captures and preserves your most precious moments? That's us, too.

We offer an expansive portfolio of technologies, storage devices and platforms for business and consumers alike. Our data-centric solutions are comprised of the Western Digital®, G-Technology™, SanDisk® and WD® brands.

Today's exceptional challenges require your unique skills. It's You & Western Digital. Together, we're the next BIG thing in data.

Western Digital's High-Performance Computing environments are key to bringing new storage solutions to market. As a Senior High-Performance Computing (HPC) engineer in the IT Infrastructure team, you will be at the heart of Western Digital's engineering and product development process, delivering the IT HPC infrastructure and services that empower engineering teams to develop new storage technologies and deliver high-quality products to market quickly.

As a member of the HPC as a Service (HPCaaS) team, you will be responsible for establishing and executing strategic objectives focused on improving the effective utilization of the compute resources while meeting or exceeding customer service level agreements for job prioritization, job concurrency, and job throughput in our EDA compute clusters. This includes leading architectural innovation and pathfinding efforts to create and implement Western Digital's next-generation Grid computing environment. As a member of the team, you will be expected not only to deliver on technical requirements and solutions but also to present your solutions to senior management. Responsibilities include, but are not limited to, working as an individual contributor, a team member, and a technical team lead to explore, define, and pilot new solutions with little supervision, and developing solutions, scripts, and/or processes to automate management of services and tools as required. In this role, you will collaborate closely with EDA and hardware design team stakeholders to define and deliver workload efficiency improvements in Western Digital's EDA HPC infrastructure globally.

What you'll be doing:

  • Support multi-site, high-performance compute infrastructure and services for the global engineering product development organizations
  • Design, create, deliver, and support the deployment of Ansible automation within HPC and Unix environments
  • Identify and propose solutions and new services for the distributed ASIC and GPU computing clusters
  • Perform troubleshooting and root cause analysis of HPC clusters and file system related issues
  • Develop and maintain documentation for all aspects of the HPC infrastructure
  • Improve root cause analysis and corrective action for problems large and small - identify patterns and propose how we can automate repetitive tasks
  • Recommend and implement solutions to improve the performance of workloads
  • Support a diverse Electronic Design Automation (EDA) environment
Tooling
  • GitHub
  • Terraform, Ansible
  • Splunk, Grafana, Prometheus
Infrastructure
  • OS: Red Hat Enterprise Linux and related distributions
  • Monitoring tools such as Nagios, Cacti, or equivalent
  • PXE/Kickstart configuration
  • NFS storage management & automounter
  • Installation and support of EDA tools such as Cadence and Synopsys
  • Open-source tool installation and support
  • Unix/Linux authentication with Active Directory (AD)
  • Infrastructure automation with scripting knowledge
Qualifications
  • Bachelor's degree in computer science or equivalent experience
  • 10+ years of Linux systems administration experience, specifically in managing or supporting Red Hat and/or CentOS Linux in production environments
  • Experience with configuration management tools: Ansible, Puppet, Chef
  • Experience with automation
  • Ability to technically lead a project through the lifecycle
  • Scripting skills: highly skilled in at least two common scripting languages (shell/Bash, Python, Ruby)
  • Excellent problem-solving, multitasking, troubleshooting skills, and attention to detail are required to work in this challenging and dynamic environment
  • Very strong interpersonal, customer service, and team-building skills, with a results-oriented approach
Additional Information
Western Digital thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Western Digital is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.

Western Digital


Job Detail

  • Job Id
    JD3450076
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Bangalore, Karnataka, India
  • Education
    Not mentioned