HPE’s Neil MacDonald: Running AI In Public Cloud Will ‘Crush’ IT Budgets

Hewlett Packard Enterprise Executive Vice President Neil MacDonald says that HPE Private Cloud for AI has a big GenAI advantage over the public cloud, which he proclaims will “crush” the IT budgets of enterprise companies.

“The target customers for this [HPE Private Cloud for AI] are very firmly enterprise customers who are seeking to embrace generative AI and who recognize that running AI workloads at scale in the public cloud will crush their IT budgets,” said MacDonald, the general manager of Compute, HPC (high-performance computing) and AI for HPE, in a press conference ahead of HPE Discover.

“AI is the most data- and compute-intensive workload of our generation, and managing that data and the governance around it and the protection of that data and IP leads a lot of enterprises to want to embrace generative AI but to do it in a private cloud model,” said MacDonald.

Manuvir Das, vice president of enterprise computing for Nvidia, who joined the HPE press conference on the new Nvidia AI Computing By HPE portfolio, said he also sees the private cloud cost advantages for customers.

“It’s the age-old debate that if you have the wherewithal to stand up your own infrastructure, you can get a far superior TCO than relying on a cloud service,” he said.

Furthermore, Das said there are benefits for customers looking to keep their private data on-premises. “The AI workload is particularly interesting because it’s so driven by data,” he said. “And if you think about an enterprise company, you are really accessing the private data of your company that really represents the secret sauce, the IP of your company. So the question is, would you prefer sending all of that data into the cloud, or would you rather keep that all under your control?”

Finally, there are latency issues with regard to moving the data from on-premises to the public cloud, said Das. “If you have petabytes of your enterprise data that you’re now extracting insight from, do you want to move the data to the compute? Or do you want to move the compute to where the data is? So I think these are the reasons why a private solution is quite desirable. And of course, you know both options exist and every customer will decide for themselves which option they prefer.”

MacDonald also weighed in on a number of other issues, including why HPE is going all in with Nvidia on the Nvidia AI Computing By HPE portfolio rather than AMD and why HPE has a liquid cooling advantage over competitors.

Below is an excerpt from the press conference question-and-answer session with MacDonald.

Why has HPE decided to go all in with Nvidia rather than AMD?

If you think about enterprise AI success, generative AI relies not just on accelerator silicon, but also on fabrics, on system design, on models, on software tooling, on the optimizations of those models at runtime. And so we are thrilled to be working closely with Nvidia, with a very strong set of capabilities that together enable our enterprise customers to move forward much more quickly on their enterprise [AI] journeys.

It’s key to note that HPE Private Cloud AI is not a reference architecture that would place the burden on the customer to assemble their AI infrastructure by cobbling together piece parts, whether those are GPUs or pieces of software or different connectivity.

HPE and Nvidia have done the hard work for customers, co-developing a turnkey AI private cloud that is up and running in three clicks. And that goes well beyond a question simply of an accelerator.

Who are the target customers for Nvidia AI Computing By HPE?

So the target customers for this are very firmly enterprise customers who are seeking to embrace generative AI and who recognize that running AI workloads at scale in the public cloud will crush their IT budgets.

AI is the most data- and compute-intensive workload of our generation, and managing that data and the governance around it and the protection of that data and IP leads a lot of enterprises to want to embrace generative AI but to do it in a private cloud model.

So our target customers for HPE Private Cloud AI are those enterprises around the world who are all coming to grips with how to gain the productivity benefits of generative AI in their operations and want to do that on-prem with greater efficiency and greater control.

Does HPE Private Cloud For AI integrate with the HPE GreenLake for LLMs public cloud?

At HPE we are currently supporting a few customers with GPU as a service that supports large language model work and other AI workloads. This spans thousands of GPUs that we’re providing to customers via the cloud. That’s currently an offer available to select pilot customers. We’re working to refine the offering and we’ll share further details later this year.

Will all of the Nvidia AI Computing By HPE portfolio rack servers have liquid cooling?

Not yet. The growth in the energy and thermal intensity of accelerator silicon and CPU silicon continues to accelerate. As a result, within our product portfolios and across our servers, we offer a variety of systems today that encompass traditional air-cooled systems, 70 percent DLC [direct liquid cooling] systems and 100 percent DLC systems. And we continue to evolve our technologies around liquid cooling as we move forward.

But as these systems become more accelerator-rich, and therefore more thermally and energy challenged, deploying either partial direct liquid cooling or, in the most [high]-performance systems, 100 percent direct liquid cooling becomes increasingly pervasive. So across the portfolio today, we have all of that, and you will see increasing use of direct liquid cooling as we continue to work with Nvidia on future systems and future silicon.

Is that direct-to-chip liquid cooling rather than a rear door heat exchanger? Also, is it single- or two-phase cooling, and are you looking at immersion [cooling] down the line?

So you referred to rear door heat exchanger technologies that you see on the [slide on the] extreme left. There are also advanced systems for recirculating air around air-cooled systems and leveraging water supply as the heat exchanger without requiring different server technology. That’s also on the left [hand side of the slide].

In the middle [of the slide] is classic direct liquid cooling, with the fluid removing the heat from the system and going to an external CDU [coolant distribution unit] as the point of exchange. And then on the right, you see the same kinds of technologies, but at greater scale, with 100 percent DLC [direct liquid cooling] and no reliance on air cooling at all. So we do all of that today in products that are shipped at scale to customers around the world, and we build on decades of experience in direct liquid cooling across the spectrum.

We continue to innovate in the cooling space. That includes all forms of advanced cooling technologies, which we continue to assess and bring to the portfolio. We’re very familiar with and aware of what can and can’t be done with immersion-based technologies, both single-phase and dual-phase, and the optimum solutions for the design points across this range today remain the solutions that you see in front of you.

Is the direct liquid cooling you provide today single-phase?

We continue to provide the direct liquid cooling that we’ve been providing in the past, which is single phase.

Talk about why direct liquid cooling is so important with GenAI systems, particularly with regard to the Nvidia AI Computing By HPE portfolio.

In essence, when you’re dealing with 100 percent direct liquid cooling you are extracting all of the heat using liquid, and you design the system very, very specifically to enable you to do that by pulling the fluid through the system and all the points of rejection of heat.

So this is something that we’re doing at scale that underpins the two exascale systems in the world that have been delivered. And that same technology is increasingly relevant within our broader portfolio. Today we have systems deployed leveraging our servers for AI workloads. Taking advantage of the full range of cooling that you see in front of you, including 100 percent direct liquid cooling … is a very strong capability of HPE.

Direct liquid cooling is not about designing a server as much as it is about designing an entire system, including the integration with all of the circulation, the CDUs, the control systems, etc.


