Principal Engineer, AI/ML Infrastructure
130k - 160k USD
Remote
Full Time
#Engineering
#Infrastructure
#Linux
#Python
#BASH
#PHP
#Machine Learning
#Automation
#Benchmarking
#Troubleshooting
At Vultr, we are on a mission to make high-performance cloud computing accessible, affordable, and easy to use for developers and businesses across the globe. As the world’s largest privately held cloud computing company, we have grown entirely through our own success without raising equity financing. Today, we support over 1.5 million customers in 185 countries, providing them with reliable cloud compute, storage, and GPU solutions from our network of 32 data centers. We are currently looking for a talented engineer to help us continue this journey and leave a lasting mark on the future of cloud infrastructure.
The opportunity
We are seeking a Principal Engineer for AI/ML Infrastructure to join our central engineering team. This role is a vital part of our growth strategy: you will take ownership of the setup, provisioning, and operational excellence of our GPU-based systems. You will work to ensure that our infrastructure remains fast and stable, directly impacting the experience of our customers who rely on our platform for their most demanding computational workloads.
A day in the life
- You will develop and maintain robust GPU infrastructure across both bare metal and containerized environments, working closely with our networking team to build scalable GPU clusters.
- You will conduct in-depth benchmarking, performance testing, and troubleshooting to identify hardware or software limitations and ensure consistent, reliable provisioning for our users.
- You will act as a primary point of contact for hardware and software vendors, collaborating to resolve bugs, manage drivers, and implement reference architectures that meet the needs of diverse applications.
Who you are
You are a senior-level engineer with a deep technical background and a passion for high-performance computing. You communicate effectively in English and bring the following expertise to our team:
- Hands-on experience with modern, high-performance GPUs and related technologies such as NVLink, InfiniBand, and NVIDIA vGPU.
- Extensive knowledge of bare metal internals, including firmware, BIOS, BMC, Redfish/IPMI, and PCIe automation.
- Strong proficiency in Linux, device drivers, and package management.
- Practical experience with Python, Bash, and PHP.
- Familiarity with machine learning software and large-scale infrastructure automation.
Why you'll love it here
We offer a salary range of $130,000 to $160,000, which may vary based on your specific background and location. Beyond the paycheck, we provide a supportive environment designed to help you thrive:
- A fully remote work environment, plus a company-wide get-together.
- A 401(k) plan with a 100% match up to 4% and immediate vesting.
- An annual professional development reimbursement of $2,500.
- Generous paid time off, including 11 holidays, a birthday day off, and increased leave as you reach work anniversaries.
- Financial support for your home office setup, monthly internet costs, and gym memberships.





