GPU Plotting Build Guide
What is this GPU plotting everyone is talking about?
Chia has released the alpha of Bladebit CUDA. The alpha is currently supported through the Chia beta program on Keybase. It requires the following system configuration:
- OS: Windows and Linux
- Memory: 256GB of DRAM
- GPUs: CUDA capability 5.2 or higher (NVIDIA 10 series or newer) with 8GB of VRAM
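Once the NVIDIA driver is installed, one way to compare a card against the GPU requirement above is to query it with nvidia-smi. This is only a sketch and is not part of Bladebit: check_gpu is a hypothetical helper, the 8000 MiB default is an assumed stand-in for "8GB" (drivers report slightly different totals), and the compute_cap query field is only available on recent drivers. The two sample lines are illustrative values, not measurements.

```shell
#!/usr/bin/env sh
# check_gpu [min_mib] [min_compute_cap]
# Hypothetical helper: reads "name, memory_MiB, compute_cap" lines on stdin
# and reports whether each GPU meets the thresholds.
# Defaults are assumptions: 8000 MiB (roughly 8GB) and capability 5.2.
check_gpu() {
  awk -F', *' -v min_mib="${1:-8000}" -v min_cc="${2:-5.2}" '
    {
      verdict = ($2 + 0 >= min_mib + 0 && $3 + 0 >= min_cc + 0) ? "OK" : "below requirements"
      printf "%s: %s MiB, compute capability %s -> %s\n", $1, $2, $3, verdict
    }'
}

# Real use, with the driver installed:
#   nvidia-smi --query-gpu=name,memory.total,compute_cap \
#              --format=csv,noheader,nounits | check_gpu

# Illustrative sample lines in the same CSV shape:
printf '%s\n' 'NVIDIA GeForce RTX 3060 Ti, 8192, 8.6' | check_gpu
printf '%s\n' 'NVIDIA GeForce GTX 1060 6GB, 6144, 6.1' | check_gpu
```

The thresholds are arguments so that a different VRAM cutoff can be passed if your driver reports a little under 8000 MiB for an 8GB card.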
Chia released a new roadmap that puts GPU plotting natively in the GUI and Chia client, along with farming of compressed plots, in Q2’23. So today you can build the system and create regular plots (like the ones we have now) much faster, and test out the new compressed plots ahead of a complete replot.
This is an alpha! Plots are good on Linux, but do not go and replot your entire farm with compressed plots until Chia supports farming them. Do small tests. Go on Keybase, Twitter, or Discord to get support from other Chia farmers!
Other farmers share their plot times in the Chia beta program, and Jon from chialinks has put together a very nice list for all things plotting.
madMAx has released his GPU plotter called Gigahorse. I won’t cover that specifically here, but Digital Spaceport has a nice overview on his channel.
Should I buy a new platform or upgrade my existing setup?
The first questions to ask are:
- Are you replotting to compressed plots? It takes time, some energy, and effort, but your farm will then generate more XCH for the same amount of storage.
- How much storage do you need to replot?
- What is your budget?
- How fast do you want to replot?
- How are you going to move these plots? Over the network, or with DAS (Direct Attached Storage) using an HBA and JBOD?
- What are you going to do with the workstation when you are done plotting? (e.g. NAS, Stable Diffusion server, home server, gaming)
- Example: a farmer has 500TB to replot. Are you OK with plotting 30TB a day for a replot of about 17 days, or do you want to spend an extra $200 to get to 50TB per day and finish in 10?
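The replot-time trade-off in the example is simple division, sketched here as a small shell function. The name replot_days is made up for illustration, and the result is rounded up to whole days (500 / 30 is about 16.7).

```shell
#!/usr/bin/env sh
# replot_days <farm size in TB> <plotting rate in TB per day>
# Hypothetical helper: whole days needed to replot, rounded up.
replot_days() {
  local farm_tb=$1 rate_tb_per_day=$2
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (farm_tb + rate_tb_per_day - 1) / rate_tb_per_day ))
}

replot_days 500 30   # prints 17
replot_days 500 50   # prints 10
```

Plugging in your own farm size and the daily rate of each build shows how many days the extra spend buys back.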
System types
Workstations
Pros
- Lots of PCIe lanes and physical x16 slots, which you will need for GPUs, a buffer SSD, and an HBA or 10Gbps NIC to offload plots
- DDR4 ECC is getting inexpensive thanks to new DDR5 servers
- Generally good power supplies, no need for painful DIY or system building.
Cons
- Can be big
- Can’t order with exact configuration needed unless you pay a premium
Desktops
Pros
- Cheap motherboards
- Easy availability and DIY building
Cons
- 128GB kits for desktop are more expensive than used server DRAM
- SSD form factor is M.2 only (harder to support enterprise form factors like U.2)
- Consumer SSDs do NOT have enough endurance for plotting. TLC consumer drives can now sustain about 2GB/s of writes, but they will wear out fast.
- Only 20 PCIe lanes from the CPU, so you generally get just a single x4 slot once a GPU is running at full bandwidth
Servers
Pros
- Entry CPUs still have a lot of PCIe lanes, I/O bandwidth, and memory bandwidth
- Monitor power, thermals, and remotely manage with BMC (power control, sensors)
- Older servers can be found inexpensively
- Tons of DIMM slots to easily support 256GB
Cons
- Cannot easily fit a consumer GPU (e.g. 3070 Ti) into a server
- 1U servers need data center GPUs, which are more expensive
Recommended builds for workstations
Workstations are the easiest place to start with GPU plotting. They accept standard consumer GPUs, are easy to build, are inexpensive to find used, and have plenty of community support for specific models. Power supplies and everything else you need are built in. They have enough PCIe lanes to hook up your GPU, HBA, 10Gbps NIC, SSD, and other devices. The larger number of memory channels and support for ECC memory (which is cheaper used) make it practical to reach the 256GB that plotting requires, and the plotters work best when the whole plot fits in that 256GB of memory. Memory is at an all-time low price right now (Q1’23), so it is a great time to build a workstation.
“When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network.”
Budget build
($500-600 total) for plot times as low as 180 seconds, 30-52TB per day
- Xeon v3 or v4 (Haswell or Broadwell) that supports PCIe 3.0 ($15-30)
- Dell Precision Workstation T7810, HP Z440, or Lenovo P710 ($100-200)
- 256GB DDR4 ECC 2133, 8x 32GB DIMMs ($250)
- 2060, 2070, or 3060 ($100-300)
- 2x Samsung SM863a high endurance SATA in RAID0 ($80), or a single PCIe 3.0 NVMe AIC or U.2 drive in an adapter
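The link between plot time and daily output quoted for this build can be checked with a few lines of shell. This is a rough sketch: tb_per_day is a made-up helper, and the 108.8GB plot size is an assumption (roughly a standard uncompressed k32 plot of 101.4GiB). Compressed plots are smaller, so the same plot time yields fewer TB per day.

```shell
#!/usr/bin/env sh
# tb_per_day <plot time in seconds> [plot size in GB]
# Hypothetical helper: back-to-back plots per day times plot size.
# The 108.8 GB default is an assumption (uncompressed k32, about 101.4 GiB).
tb_per_day() {
  awk -v secs="$1" -v plot_gb="${2:-108.8}" \
    'BEGIN { printf "%.1f\n", (86400 / secs) * plot_gb / 1000 }'
}

tb_per_day 180   # prints 52.2, the top of the range quoted for this build
tb_per_day 90    # prints 104.4
```

These are ceilings that assume the plotter never waits on the buffer SSD or on moving finished plots off the machine.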
Skylake Workstation
($900-1000 total) for 160-180 second plot time, 30-52TB per day
- Xeon Silver 4110 ($20-40) or Xeon Gold 5120 ($60)
- Dell Precision Workstation 7920 or 7820, HP Z6 G4, or Lenovo P720 ($300)
- Note: The Dell and Lenovo have 2 CPU sockets, so they have enough DIMM slots to reach 256GB with 32GB modules. The HP Z6 G4 only has 6 DIMM slots, so you must use 64GB DDR4 DIMMs, which are a bit more expensive, but the workstations are a few hundred dollars cheaper.
- 256GB DDR4 ECC 2133, 8x 32GB DIMMs ($250), or upgrade to 2666 ($300-350)
- 3060 Ti ($300)
- Samsung PM983, Intel P4510/P4610, Micron 7300 or 9300, Kioxia CD5 or CM5, WD SN840 or SN640 ($100)
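The DIMM note in this list comes down to slot count times module size. As a sketch, the hypothetical min_dimm_gb helper below picks the smallest module that reaches a memory target when every slot is filled. The list of module sizes is an assumption about common DDR4 server DIMM capacities.

```shell
#!/usr/bin/env sh
# min_dimm_gb <DIMM slots> [target GB, default 256]
# Hypothetical helper: smallest module size (GB) that reaches the target
# with all slots populated. The size list is an assumed set of common
# DDR4 capacities.
min_dimm_gb() {
  local slots=$1 target_gb=${2:-256} size
  for size in 8 16 32 64 128; do
    if [ $((slots * size)) -ge "$target_gb" ]; then
      echo "$size"
      return 0
    fi
  done
  echo "no listed DIMM size reaches ${target_gb}GB in ${slots} slots" >&2
  return 1
}

min_dimm_gb 8   # prints 32 (8x 32GB = 256GB)
min_dimm_gb 6   # prints 64 (6x 32GB is only 192GB)
```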
Threadripper Pro
You will need to watch eBay for the right moment to find a used or refurbished P620! I was able to get a P620 with an A4000 ($500 value), a 5945WX, and 64GB of DRAM for $1200! The base model of the P620 is great. Other DIY Threadripper Pro builds and Supermicro systems are still far too expensive.
($1300 total) for 90-150 second plot time, up to 100TB per day
- AMD Ryzen™ Threadripper™ PRO 3945WX or 5945WX
- Dell Precision 7865 Tower Workstation or Lenovo P620 ($900 barebones, $1200-1400 with CPU, DRAM, and basic GPU)
- 256GB DDR4 ECC 2133, 8x 32GB DIMMs ($250), or upgrade to 3200 for 20% more performance ($250-500)
- 3060 Ti ($300) to 3080 ($500)
- PCIe 4.0 NVMe preferred but not required: Samsung PM9A3, Solidigm P5510, Micron 7400 or 9400, Kioxia CD6 or CM6
If you want to give the Bladebit CUDA alpha a quick try, you can grab the binaries from https://downloads.chia.net/bladebit or from the open source GitHub repo at https://github.com/Chia-Network/bladebit
Install NVIDIA drivers for Ubuntu. Warning! Latest alpha builds may require NVIDIA 530 driver.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt install nvidia-driver-530
Then reboot the system. If you want to be able to compile the latest versions yourself (not required), install the NVIDIA CUDA development kit:
https://developer.nvidia.com/cuda-downloads?target_os=Linux
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda
export PATH=/usr/local/cuda/bin:$PATH
sudo apt install cmake libnuma-dev
git clone https://github.com/Chia-Network/bladebit.git -b cuda-compression
cd bladebit
mkdir -p build-release
cd build-release
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . --target clean --config Release
cmake --build . --target bladebit_cuda --config Release -j$(nproc --all)
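Test Bladebit CUDA in benchmark mode (does not write to the final destination). Run this from the build-release directory, or adjust the path to wherever your binary lives:
./bladebit_cuda -f <farmer key> -c <contract address> -n 1 --benchmark cudaplot /mnt/ssd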
Does anyone know if Bladebit has been “fixed” so that it will plot in memory? Tried the latest version inside of Machinaris and got a whole bunch of bad plots. BB 3.0.0, I believe is the version.
Bladebit ramplot originally required 416GiB for complete in-memory plotting. This version only requires 256GB. Eventually the CPU in-memory plotter will also only require 256GB.
CUDA capability 5.0 is the GTX 9xx series cards. A few of the highest-end ones DID have 8 GB of RAM, and they’re pretty cheap since Ethereum dropped GPU mining.
The GTX 10xx series cards are CUDA 5.2, and anything from the 1070 up has at least 8GB.
sorry fixed this, 5.2 is correct. 10 series and up
CUDA capability 5.2, per Howard in GitHub.
NOT 5.0, that would be GTX 9xx series cards.
fixed thanks!!
Hi, thank you!!
I have a question about the Budget build, Dell T7810 version.
You wrote, and I quote, “2x Samsung SM863a high endurance SATA in RAID0 ($80) or single PCIe 3.0 NVMe AIC or U.2 in an adapter”
Which is faster? Do all of them allow the system to reach 30TB per day? And what size per disk for the Samsung SM863a high endurance SATA?
Yeah! 30TB per day is around 350MB/s written to the disks at all times, and you will need 2-3 disks open at all times. Two high endurance SATA drives will handle that no problem. Two of the 480GB drives ($25-30 each right now) are perfectly fine for this.
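The 350MB/s figure can be reproduced by spreading a day's output over 86,400 seconds. This is a rough sketch using decimal units (1TB = 1,000,000MB); write_mb_per_s is a made-up helper name, and the result is rounded down.

```shell
#!/usr/bin/env sh
# write_mb_per_s <TB plotted per day>
# Hypothetical helper: average sustained write rate in MB/s, rounded down.
write_mb_per_s() {
  local tb_per_day=$1
  echo $(( tb_per_day * 1000000 / 86400 ))
}

write_mb_per_s 30   # prints 347
```

This is the average rate; the drives also have to absorb bursts while a plot is being written out, which is why some headroom helps.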
A question about desktop GPUs: is a dummy hdmi plug required to get them to work for plotting?