DESCRIPTION
- Hiring locations: Beijing, Shanghai, Guangzhou, Shenzhen, Hong Kong (visa sponsorship provided)
Would you like to join one of the fastest-growing teams within Amazon Web Services (AWS) and help shape the future of GPU optimization and high-performance computing? Join us in helping customers across all industries to maximize the performance and efficiency of their GPU workloads on AWS while pioneering innovative optimization solutions.
As a Senior Technical Account Manager (Sr. TAM) specializing in GPU Optimization in AWS Enterprise Support, you will play a crucial role in two key missions: guiding customers' GPU acceleration initiatives across AWS's comprehensive compute portfolio, and spearheading the development of optimization strategies that revolutionize customer workload performance.
Key Job Responsibilities
- Build and maintain long-term technical relationships with enterprise customers, focusing on GPU performance optimization and resource allocation efficiency on AWS cloud or similar cloud services.
- Analyze customers’ current architecture, models, data pipelines, and deployment patterns; create a GPU bottleneck map and measurable KPIs (e.g., GPU utilization, throughput, P95/P99 latency, cost per unit).
- Design and optimize GPU resource usage on EC2/EKS/SageMaker or equivalent cloud compute, container, and ML services; implement node pool tiering, Karpenter/Cluster Autoscaler tuning, auto scaling, and cost governance (Savings Plans/RI/Spot/ODCR or equivalent).
- Drive GPU partitioning and multi-tenant resource sharing strategies to reduce idle resources and increase overall cluster utilization.
- Guide customers in PyTorch/TensorFlow performance tuning (DataLoader optimization, mixed precision, gradient accumulation, operator fusion, torch.compile) and inference acceleration (ONNX, TensorRT, CUDA Graphs, model compression).
- Build GPU observability and monitoring systems (nvidia-smi, CloudWatch or equivalent monitoring tools, profilers, distributed communication metrics) to align capacity planning with SLOs.
- Ensure compatibility across GPU drivers, CUDA, container runtimes, and frameworks; standardize change management and rollback processes.
- Collaborate with cloud provider internal teams and external partners (NVIDIA, ISVs) to resolve cross-domain complex issues and deliver repeatable optimization solutions.
About the team
AWS Global Services includes experts from across AWS who help our customers design, build, operate, and secure their cloud environments. Customers innovate with AWS Professional Services, upskill with AWS Training and Certification, optimize with AWS Support and Managed Services, and meet objectives with AWS Security Assurance Services. Our expertise and emerging technologies include AWS Partners, AWS Sovereign Cloud, AWS International Product, and the Generative AI Innovation Center. You’ll join a diverse team of technical experts in dozens of countries who help customers achieve more with the AWS cloud.
Why AWS?
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Inclusive Team Culture
AWS values curiosity and connection. Our employee-led and company-sponsored affinity groups promote inclusion and empower our people to take pride in what makes us unique. Our inclusion events foster stronger, more collaborative teams. Our continual innovation is fueled by the bold ideas, fresh perspectives, and passionate voices our teams bring to everything we do.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
BASIC QUALIFICATIONS
- 5+ years in cloud technical support, solutions architecture, or customer success management, with at least 3 years of hands-on experience in GPU/accelerated computing platforms.
- In-depth understanding of GPU instance families (e.g., AWS G/P/H series) or similar offerings from other cloud providers, AMI/driver/CUDA/container compatibility management, and cloud storage/network performance tuning (e.g., S3 I/O, EBS/Instance Store equivalents, preprocessing pipelines).
- Proficient in scheduling GPU workloads with EKS or equivalent Kubernetes-based orchestration services, including node pool tiering, resource quotas, elastic scaling, and auto-recovery strategies.
- Experienced in multi-GPU/multi-node distributed computing (NCCL, topology awareness, tensor parallelism, pipeline parallelism), with expertise in communication optimization for large-scale AI training and inference.
- Skilled in PyTorch/TensorFlow performance analysis and optimization, including DataLoader tuning, mixed precision, operator fusion, and inference acceleration toolchains (ONNX, TensorRT, CUDA Graphs).
- Experienced in cost and capacity governance, familiar with Savings Plans, RI, ODCR, Spot, Capacity Blocks, and right-sizing strategies or their equivalents in other cloud platforms.
- Demonstrated cross-functional communication and influence skills, capable of driving technical solutions grounded in data and aligned with business objectives.
PREFERRED QUALIFICATIONS
- AWS Solutions Architect Professional, Machine Learning Specialty, or DevOps Professional certification or equivalent credentials from other cloud providers.
- Hands-on experience with NVIDIA ecosystem software and toolchains (CUDA/cuDNN/NCCL, TensorRT, CUDA Graphs) and proven ability to maintain performance consistency across versions and platforms.
- Delivered quantifiable performance improvements (GPU throughput, latency reduction, cost savings) with demonstrated benchmarking and regression testing methodology.
- Proven repeatable optimization results in LLM inference, batch AI training, real-time video processing, or high-performance computing (HPC).
- Contributions to open source projects (Run:ai, Ray, vLLM, DeepSpeed, Kubeflow, etc.) or published technical articles, whitepapers, or performance benchmarks.
- Experience with Infrastructure as Code (Terraform, AWS CDK, or equivalent cloud development frameworks), Helm Charts, baseline container image management, and DevOps automation.
- Able to present performance-business tradeoffs and results to senior stakeholders using PR/FAQ documents, architecture diagrams, and capacity/cost reports.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Job details
CHN, Guangzhou
CHN, Shanghai
CHN, Beijing
CHN, Shenzhen
Support
Solutions Architect