Deep Learning Distributed Training Engineer
Job Code: JR0216277
Design, develop, and optimize deep learning training on data-center discrete GPU and CPU clusters. Implement distributed algorithms such as model/data parallel frameworks, parameter servers, and dataflow-based asynchronous data communication in deep learning frameworks. Transform the computational graph representation of neural network models. Develop deep learning primitives in math libraries. Profile distributed DL models to identify performance bottlenecks and propose solutions across individual component teams. Optimize code for various computing hardware backends. Interact with deep learning researchers and bring hands-on experience with deep learning frameworks.
We work in an agile development environment. You should be able to juggle multiple tasks and make forward, demonstrable progress that delivers impact. You will have the opportunity to work with external and internal teams who are passionate about AI/DL training.
The ideal candidate should exhibit the following behavioral skills:
- Strong communication skills. Ability to develop high-quality externally publishable material is a plus
- Work well in a dynamic team environment
Qualifications
You must possess the minimum qualifications below to be initially considered for this position. Preferred qualifications are in addition to the minimum requirements and are considered a plus factor in identifying top candidates. The experience listed below may be obtained through a combination of your schoolwork, classes, research, and/or relevant previous job and/or internship experience.
Minimum Qualifications:
- Master's degree with 4+ years of experience, or PhD with 2+ years of relevant industry experience, in Computer Science, Computer Engineering, Electrical Engineering, AI, Computer Vision, Software Engineering, Physics, Mathematics, or a related technical discipline.
- 2+ years of experience with the following skills:
  - Excellent programming skills in languages such as Python, C/C++, and CUDA.
  - Low-level programming and performance optimization skills for CPU and GPU, including code generation, distributed compute, and resource management.
  - Understanding of deep learning algorithms and experience in deploying and optimizing distributed training on GPU/CPU clusters.
  - Familiarity with DL frameworks (e.g., TensorFlow, PyTorch, MXNet).
Preferred Qualifications:
- 2+ years of knowledge/experience in Artificial Intelligence solutions applied to segments such as HPC, Cloud, Visual Computing, and/or Enterprise.
- Prior experience in deployment strategies, performance optimization, distributed computing algorithms, and multi-node, multi-GPU scaling is a big plus.
- Experience training large-scale language models (GPT-x, Megatron) on compute clusters is a definite plus.
- Experience in Machine Learning infrastructure development and optimization (framework, ML pipeline, deployment)
- Experience or training in one or more parallel programming methodologies (SYCL, C++, OpenMP, MPI, CUDA) is highly desired.
Enable amazing computing experiences with Intel Software. Intel Software continues to shape the way people think about computing across CPU, GPU, and FPGA architectures. Get your hands on new technology and collaborate with some of the smartest people in the business. Our developers and software engineers work in all software layers, across multiple operating systems and platforms, to enable cutting-edge solutions. Ready to solve some of the most complex software challenges? Explore an impactful and innovative career in Software.
Intel strongly encourages employees to be vaccinated against COVID-19. Intel aligns to federal, state, and local laws and, as a contractor to the U.S. Government, is subject to government mandates that may be issued. Intel policies for COVID-19, including guidance about testing and vaccination, are subject to change over time.
Posting Statement
All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.
Work Model for this Role
This role is available as fully home-based and generally would require you to attend Intel sites only occasionally based on business need.