Senior Principal Member of Technical Staff - IC5 Job in ORACLE

Senior Principal Member of Technical Staff IC5

IN, India

Job Description

Team Overview: The OCI Cluster Networking team is at the forefront of building ultra-high-performance networking solutions to support advanced AI/ML/HPC workloads. This is your chance to join the AI revolution by designing scalable systems that support thousands of GPUs without compromising on performance.

Role Summary: As a Senior Principal Member of Technical Staff, you'll be part of a dynamic team responsible for designing, developing, and optimizing a software and hardware stack capable of running distributed AI/ML/HPC workloads across thousands of GPUs. You will work with cutting-edge libraries like NCCL, leverage high-performance networking, and build innovative, scalable solutions for our customers.

Who You Are: We're looking for adaptable, self-motivated engineers who can learn quickly. You are a solid developer and distributed systems generalist who can work across the stack, from low-level systems to high-level distributed system interactions. You value simplicity and scalability, and you thrive in a collaborative, agile environment.

Career Level: IC5

Key Responsibilities:

• Design and develop scalable, high-performance software and hardware solutions for distributed AI/ML/HPC workloads.
• Tune the performance of networking libraries (e.g., NCCL) and integrate them with our distributed systems.
• Collaborate with cross-functional teams on new initiatives and deliver innovative solutions to complex networking challenges.

Basic Qualifications:

• 10+ years of software development experience in systems or application-level engineering
• 2+ years of experience with collective communication libraries (e.g., NCCL, RCCL, MPI) and GPU frameworks (e.g., CUDA, ROCm)
• 2+ years of experience with ML training frameworks (e.g., PyTorch, TensorFlow)
• Proficiency in at least two of the following programming languages: Go, Java, C/C++, Python
• Strong knowledge of data structures, algorithms, and operating systems
• Excellent communication skills, both verbal and written
• Bachelor's degree in Computer Science, Engineering, or a related field

Preferred Qualifications:

• Master's degree in Computer Science or a related field
• Experience with RDMA programming, including GPUDirect RDMA
• Experience with distributed workload managers (e.g., Kubernetes)
• Proficiency with Linux performance tools
• Familiarity with SDN, NFV, and cloud networking
• Experience with Infrastructure-as-a-Service platforms (e.g., AWS, Azure, GCP)

Job Detail

Job Id: JD3555147
Industry: Not mentioned
Total Positions: 1
Job Type: Contract
Salary: Not mentioned
Employment Status: Permanent
Job Location: IN, India
Education: Not mentioned
Experience: Not mentioned
