A dedicated GPU server is a server with one or more graphics processing units (GPUs) that offers increased power and speed for running computationally intensive tasks, such as video rendering, data analytics, and machine learning. Dedicated GPU servers may also have a specialized CPU and come with large amounts of RAM and storage.
The parallel architecture of a GPU, originally designed to handle graphics and video processing, allows a dedicated GPU server to manage multiple tasks simultaneously at speeds beyond the abilities of a CPU-based server.
What Is a GPU Rack Server?
A GPU rack server is a server equipped with GPUs designed to fit into a server rack. A server rack is a rectangular framework with multiple mounting slots designed to hold rack servers and other networking components. Servers are stacked on top of each other to minimize the use of floor space and are slid in and out of the rack as necessary.
A GPU rack server offers several advantages, including better space utilization, increased scalability, maximized airflow, and easier maintenance.
Reasons to Use a Dedicated GPU Server
GPUs are the throughput-optimized, specialized counterparts to CPUs. Instead of having a handful of heavyweight cores with high clock speeds capable of performing a wide variety of computational tasks, GPUs employ thousands of lightweight cores that apply the same operation to many data elements in parallel (i.e., Single Instruction, Multiple Data [SIMD]).
These cores have instruction sets tuned for multidimensional matrix arithmetic and floating-point calculations, which accelerates linear algebra. The end result is a system built for parallel computing.
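To make that concrete, here’s a minimal sketch of the difference, assuming PyTorch and a CUDA-capable GPU (neither is specified in this article). It runs the same matrix multiplication on the CPU and then on the GPU; the matrix size is purely illustrative.

```python
# Minimal sketch: the same matrix multiplication on CPU and GPU.
# Assumes PyTorch is installed and a CUDA-capable GPU is available;
# sizes and timing approach are illustrative, not a benchmark.
import time
import torch

N = 4096
a = torch.randn(N, N)
b = torch.randn(N, N)

# CPU: a handful of heavyweight cores work through the multiply.
start = time.perf_counter()
c_cpu = a @ b
cpu_s = time.perf_counter() - start

# GPU: thousands of lightweight cores apply the same operation in parallel (SIMD-style).
if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()
    start = time.perf_counter()
    c_gpu = a_gpu @ b_gpu
    torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
    gpu_s = time.perf_counter() - start
    print(f"CPU: {cpu_s:.3f}s  GPU: {gpu_s:.3f}s")
else:
    print(f"CPU: {cpu_s:.3f}s  (no CUDA GPU detected)")
```

On most GPU-equipped systems, the second timing comes in well below the first, precisely because thousands of cores share the multiply.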
Reasons you might want to use a dedicated GPU server include:
- Big data analytics pipelines
- Streaming video
- Image processing
- 3D animations and simulations (e.g., the modeling of protein chain folding)
- Deep learning applications (e.g., speech recognition)
- Hash cracking (e.g., password recovery)
- Mining cryptocurrency
If you have a workload whose core operation can be executed in parallel across thousands of cores, a dedicated GPU server can help.
Types of GPU Rack Servers
GPU rack servers fit into server racks or cabinets. Server racks and the equipment installed in them are measured in rack units, written as “U” or sometimes “RU.” A “U” describes the height of equipment (e.g., the height of a server or the height and number of shelves in a server rack).
One U equals 1.75 inches, so a 1U server is 1.75 inches tall and a 2U server is 3.5 inches tall. A 32U rack, for example, can hold 32 1U servers, four 8U servers, or one 32U server.
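As a quick check of that arithmetic, here’s a small Python sketch (the constant and helper names are made up for this example) that converts rack units to inches and counts how many servers of a given U size fit in a rack.

```python
# Rack-unit arithmetic from the paragraph above: 1U = 1.75 inches of height.
# Helper names are illustrative, not a standard API.
RACK_UNIT_INCHES = 1.75

def height_in_inches(units: int) -> float:
    """Physical height of equipment that occupies `units` rack units."""
    return units * RACK_UNIT_INCHES

def servers_per_rack(rack_units: int, server_units: int) -> int:
    """How many servers of a given U size fit in a rack of `rack_units`."""
    return rack_units // server_units

print(height_in_inches(1))       # 1.75 -> a 1U server
print(height_in_inches(2))       # 3.5  -> a 2U server
print(servers_per_rack(32, 1))   # 32 one-U servers in a 32U rack
print(servers_per_rack(32, 8))   # 4 eight-U servers in a 32U rack
```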
Read “A Definitive Guide to 19-Inch Server Rack Sizes” for more information about rack sizing.
Smaller Form Factors vs. Larger Form Factors
The primary differences between smaller and larger rack server form factors are their density and expandability.
1U & 2U GPU Rack Servers
Smaller form factors, such as 1U and 2U GPU rack servers, are designed with performance density in mind but are less powerful than larger GPU server form factors. They’re commonly used because of their lower costs and ability to save server rack space.
1U and 2U GPU rack servers are easy to maintain, highly portable, and easy to scale out (you can add more of them to increase aggregate performance). A 1U server can typically hold one or two CPUs, several terabytes of memory, and multiple GPUs. A 2U server, at double the height, gives you a little extra room for computing power and storage.
In smaller form factor servers, GPUs are typically mounted horizontally because of space restrictions. There’s also less space for PCIe slots and storage, though you can expand these using a PCIe expansion kit or JBOD enclosure.
8U & 16U GPU Rack Servers
Larger GPU rack servers, such as 8U and 16U, are geared toward workloads that require more extensive performance capabilities. They come with more room for storage and more expansion slots, letting you plug in additional PCIe cards to increase data processing performance. The extra space also promotes better air circulation to prevent overheating.
In larger form factors, GPUs are installed vertically, with extra room for power connections at the top of the card rather than at the rear.
How to Size a Dedicated GPU Server
When sizing a dedicated GPU server, you’ll need to consider the product features you want, as well as your current and future business needs. Dedicated GPU servers can be configured for specific target workloads, such as video rendering, deep learning training, inference, big data analytics, and high-performance computing (HPC). The optimal configuration depends on those workloads, the specific use cases of the server, and how fast you need it to be.
GPUs use a lot of power and generate a lot of heat. They’re physically larger than CPUs and need extra space for power connectors. The server chassis must not only be large enough to fit the number of GPUs you want to use but also provide good airflow to prevent overheating and thermal throttling.
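As a rough illustration of that sizing exercise, the sketch below tallies an estimated power budget for a given GPU count. The TDP values, the allowance for other components, and the 30% headroom factor are assumptions for the example, not vendor specifications.

```python
# Rough power-budget sketch for sizing a GPU server chassis and PSU.
# All figures below are illustrative assumptions, not vendor specs.
GPU_TDP_WATTS = 300   # assumed per-GPU thermal design power
CPU_TDP_WATTS = 270   # assumed per-CPU thermal design power
OTHER_WATTS = 200     # fans, drives, memory, NICs (rough allowance)
HEADROOM = 1.3        # ~30% margin so the PSU isn't run at its limit

def required_psu_watts(num_gpus: int, num_cpus: int = 2) -> int:
    """Estimate the power supply rating needed for a given GPU/CPU count."""
    load = num_gpus * GPU_TDP_WATTS + num_cpus * CPU_TDP_WATTS + OTHER_WATTS
    return int(load * HEADROOM)

for gpus in (2, 4, 8):
    print(f"{gpus} GPUs -> ~{required_psu_watts(gpus)} W PSU recommended")
```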
How Much Does a Dedicated GPU Server Cost?
Understandably, costs will vary depending on whether you choose to build your own dedicated GPU server, rent a server, or use cloud-based services.
If you’ll be building your own server, you’ll need to consider the cost of the GPU as well as the power supply, chassis, specialized CPU, RAM, and storage. You may also need to factor in the costs that come with an on-premises data center, such as power, space, cooling, and maintenance.
GPUs are classified by specialization, and prices vary depending on the use case. For example, NVIDIA offers Tesla V100-based servers suited to deep learning and high-precision calculations. A top-rated GPU such as NVIDIA’s GTX Titan Z can cost around $3,000.
If you choose to go with a cloud platform, several providers offer dedicated GPU-powered server plans, from smaller providers such as V2 Cloud to major platforms like AWS, Google Cloud Platform, and Azure. AWS, for example, offers on-demand pricing starting at around $0.90 per hour for an instance with one GPU and four virtual cores.
GPU Rack Server: Purchase vs. Rent
Choosing whether to purchase or rent comes down to several factors. Your company’s budget and potential use cases are the main ones.
Purchasing a GPU rack server involves upfront costs. Top-rated GPUs for machine learning workloads can come with a hefty price tag. Add this to the maintenance, energy, and bandwidth costs of storing your GPU server on premises, and your initial investment costs could be astronomical.
With the pace of modern technological innovations, buying a GPU server comes with the risk of it becoming obsolete before you can make a return on your investment. Updating the system will also incur additional costs.
If you work with large data sets and plan to deploy your models in a production environment, consider renting GPU infrastructure through a cloud service provider. This subscription model allows you to pay by the hour or monthly depending on the resources you use and scale up or down based on current demands.
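To see how that trade-off might play out, here’s a back-of-the-envelope sketch comparing cumulative rental charges against a one-time purchase. The $0.90-per-hour rate echoes the on-demand example earlier; the purchase price and utilization levels are illustrative assumptions, and owned-server power, cooling, and maintenance costs are deliberately left out.

```python
# Back-of-the-envelope rent-vs-buy comparison.
# The hourly rate mirrors the on-demand example above; the purchase price
# and utilization figures are illustrative assumptions only.
CLOUD_RATE_PER_HOUR = 0.90   # single-GPU on-demand instance
PURCHASE_PRICE = 15000.0     # assumed cost of a comparable owned server
HOURS_PER_MONTH = 730

def months_to_break_even(utilization: float) -> float:
    """Months of cloud rental (at utilization 0-1) that would equal the purchase price."""
    monthly_cloud_cost = CLOUD_RATE_PER_HOUR * HOURS_PER_MONTH * utilization
    return PURCHASE_PRICE / monthly_cloud_cost

for util in (0.25, 0.5, 1.0):
    print(f"{util:.0%} utilization -> break-even after "
          f"~{months_to_break_even(util):.1f} months of renting")
```

The lower your sustained utilization, the longer renting stays cheaper than owning, which is why intermittent or experimental workloads tend to favor the cloud model.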
Get State-of-the-art AI Infrastructure with Pure Storage
A dedicated GPU server offers several advantages over a CPU-based server, including higher performance, increased flexibility, and better utilization of CPU resources. Dedicated GPU servers can be purchased outright or rented from a service provider.
AIRI™ is a simple, highly scalable, flash-based AI infrastructure developed by Pure Storage® and NVIDIA. AIRI is powered by the latest NVIDIA DGX systems with Pure Storage FlashBlade//S® storage, the Pure Storage Purity//FB operating system, and Pure1® cloud management.
Experience new levels of AI success with Pure and AIRI.