
Software Development Engineer - Collectives and Network

Advanced Micro Devices, Inc.
$120,000.00/Yr.-$180,000.00/Yr.
United States, California, San Jose
2100 Logic Drive
Jun 26, 2026


WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

This software engineer role will help drive AMD's strategy, architecture, optimization and tooling to achieve industry-leading AI pre-training and distributed inference performance on AMD GPUs. You will partner across hardware architecture, AI frameworks, compilers, runtime, ROCm, developer tools and models to scale performance analysis and optimization.

As an Engineer of Collectives and Network performance, you will drive end-to-end technical performance attainment across the entire software stack, focusing on getting the best performance on multiple generations of AMD GPUs with a wide range of models, including the latest state-of-the-art AI models. You will help set the strategy and roadmap for general optimization, faster support for new models, and out-of-the-box performance.

If you are passionate about performance optimization, getting the best out of the hardware, and shaping the future of AI acceleration, then this role is for you.

THE PERSON:

The ideal candidate will have deep knowledge of network, NIC and GPU hardware architecture, software optimization, performance modeling, AI frameworks and the latest trends in inference and training optimization. The candidate should have hands-on experience mapping model architectures to low-level software and hardware, and should understand the impact of each layer of the stack on model performance. Strong knowledge of the latest generative model architectures, especially SoTA models, and of distributed inference and deployment at scale is crucial.

KEY RESPONSIBILITIES:

  • Assist with strategy and roadmap for AMD Collectives and Network optimizations.
  • Provide guidelines to customers on efficient network load-balancing, workload scheduling and model sharding strategies.
  • Performance tuning, profiling and analysis of large-scale models for LLM, diffusion, multimodal, RecSys and generative AI, single node and distributed, including exploring various tradeoffs and design decisions.
  • Participate in hardware-software co-design for future hardware optimizations, especially on scale-up networks, NICs and scale-out networks.
  • Develop and improve frameworks, tools and infrastructure for performance estimation, modeling and reporting.
  • Communicate and present the results of the performance analysis and modeling to stakeholders and senior leadership, and provide concrete recommendations.
  • Collaborate across teams and the wider organization to identify opportunities and develop strategies.

PREFERRED EXPERIENCE:

  • Multiple years of technical experience in performance optimization.
  • Strong technical expertise and experience in performance analysis, projection, and network hardware architecture.
  • Deep knowledge of and hands-on experience with AI frameworks such as PyTorch, JAX, vLLM, and SGLang.
  • Strong technical leadership skills and the ability to work collaboratively with cross-functional teams.
  • Mentor, coach, and inspire a diverse and talented team of researchers and engineers.
  • Excellent written, verbal, and presentation skills, ability to coordinate internally and externally.

ACADEMIC CREDENTIALS:

  • A PhD or master's degree in computer science, electrical engineering, or a related field.

LOCATION:

San Jose, CA (Hybrid)

This role is not eligible for visa sponsorship.

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.

This posting is for an existing vacancy.
