Chia optimized plotting drives, are they worth the cost?
Summary
The LX3030 is very good for Chia plotting and is almost able to max out a 32-core AMD Threadripper with madMAx. The 1TB model will likely be the sweet spot: it is priced under a high-end data center SSD and, thanks to the M.2 form factor, is easy to buy and broadly compatible. The LX2030 is perfect for temp 1 when madMAx uses a ramdisk for temp 2, as it has enough performance to keep up and great endurance for this use case.
Introduction
I did an interview with Phison talking about the new Chia optimized plotting drives. They were gracious enough to send me a few samples to test out. I have a background in data center SSDs, so I’ve never thought highly of the performance of consumer drives. Data center drives are made for sustained performance, high bandwidth, and much higher endurance requirements. My recommendation to the majority of Chia plotters has been to buy used data center SSDs, but a lot of people don’t want to deal with eBay, reading SMART logs on used drives, confusing warranties, replacements, and so on. It’s much easier for most people to buy a drive off Amazon, pop the M.2 into a desktop, and start plotting. I was very excited to see how these drives actually hold up in plotting.
Using Single Level Cell (SLC) NAND for higher performance and endurance
The first trick that Phison did is the one with the largest impact. They use the dynamic SLC mode of the NAND but stay in this mode permanently (effectively disabling caching and using only SLC). This greatly improves performance and endurance. One interesting thing about modern TLC and QLC is that all the NAND die and packages have the native capability to write in SLC mode to get higher performance and endurance. The NAND cell can actually change mode on the fly. This is useful in consumer SSDs that are mostly idle: when a user writes a large file, they get really good burst performance, and in the background the SSD firmware migrates the data to the slower media. TLC and QLC still have great read performance, so in read-heavy workloads like media consumption, office work, and gaming this is perfectly fine. Because of this, caching algorithms are used in the majority of consumer drives. This is bad for Chia plotting, because the sustained writing quickly exhausts the cache. Phison has some other interesting tricks involving the LDPC engine (error correction) and retention, but I won’t go into those here. The exact same NAND can get better endurance on bigger, more expensive controllers because of better error correction, but we can make it easy and grade NAND into general buckets.
Media | NAND PE Cycles |
Consumer QLC | 1000 – 1500 |
Enterprise QLC (Intel only) | 3000 |
Consumer TLC | 3000 |
Enterprise TLC | 7000-10,000 |
SLC | 40,000 – 50,000 |
The first partner to bring the Phison Chia optimized plotting drives to market is PNY with the LX2030 and LX3030 NVMe SSDs.
The reason why SLC NAND has not been used in consumer SSDs for the total capacity is all about cost. Because SLC stores one bit per cell instead of three or four, it is roughly 3 times more expensive per byte than TLC and roughly 4 times more expensive per byte than QLC. The endurance, however, is about 20-40x, which makes this cost to endurance tradeoff really interesting for Chia plotting. We will investigate here whether or not these drives are worth the extra cost.
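To make the tradeoff concrete, here is a minimal sketch that computes the endurance ratios implied by the PE cycle table above and divides them by the cost multipliers. The 3x and 4x cost figures are treated as rough assumptions, and the function names are made up for illustration.

```python
# PE cycle ranges (low, high) copied from the table above.
PE_CYCLES = {
    "Consumer QLC": (1000, 1500),
    "Consumer TLC": (3000, 3000),
    "SLC": (40000, 50000),
}

# Assumed cost per byte of SLC relative to each media type (rough figures).
SLC_COST_MULTIPLIER = {"Consumer TLC": 3.0, "Consumer QLC": 4.0}


def endurance_ratio(media):
    """Return the (worst case, best case) SLC endurance advantage over media."""
    low, high = PE_CYCLES[media]
    slc_low, slc_high = PE_CYCLES["SLC"]
    return slc_low / high, slc_high / low


def endurance_per_dollar_gain(media):
    """Endurance advantage divided by the assumed cost multiplier."""
    worst, best = endurance_ratio(media)
    cost = SLC_COST_MULTIPLIER[media]
    return worst / cost, best / cost


for media in SLC_COST_MULTIPLIER:
    ratio = endurance_ratio(media)
    gain = endurance_per_dollar_gain(media)
    print(f"{media}: endurance {ratio[0]:.1f}-{ratio[1]:.1f}x, "
          f"per dollar {gain[0]:.1f}-{gain[1]:.1f}x")
```

Even after paying the assumed cost premium, the sketch suggests several times more write endurance per dollar, which is the property that matters for a write-heavy workload like plotting.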
Performance Overview
I’m using the Beast AMD Threadripper system from my previous post.
System Setup
- Gigabyte Technology Co., Ltd. TRX40 AORUS MASTER
- AMD Ryzen Threadripper 3970X 32-Core Processor
- 128GB of DDR4 3600, 4 units of 32GB (F4-3600C18-32GVK)
- Maker Pro 2TB, firmware ECFMB3
- MadMAx version 28ad9af
Performance Summary
I was able to achieve 12.33 TiB a day using just a single SSD. This was using the drive as both temp 1 and temp 2 (no ramdisk). The sweet spot appears to be two madMAx instances of 16 threads (-r) each. When I push it to 4 instances I get higher iowait (the red in the CPU chart), which actually slows the total performance down. As a control, I added a second SSD to see whether CPU utilization could go higher; this squeezes roughly another 1 TiB a day out of the system with madMAx.
Test | Instances, threads | TiB per day | Total plot creation time |
Test 1 – 2TB Maker Pro (LX3030) | 1, 32 | 9.38 | 900.8 sec (15.01 min) |
Test 2 – 2TB Maker Pro (LX3030) | 2, 16 | 12.33 | 1387.8 sec (23.13 min) |
Test 3 – 2TB Maker Pro (LX3030) | 4, 16 | 11.68 | 3250.66 sec (54.18 min) |
Test 4 – 2TB Maker Pro (LX3030) and Optane P5800X | 2, 16 to Maker Pro; 2, 16 to Optane P5800X | 13.26 | 2579.1 sec (43 min) |
Test 5 – 2TB Maker (LX2030) with 110GB ramdisk | 1, 32 | 9 | 946.894 sec (15.7816 min) |
Test 1
Test 2. You can see the peak bandwidth for the Maker Pro 2TB (LX3030) is about 3GB/s. This is about as good as I have seen on any drive out of this system. There is a small amount of iowait which is normal when you are pushing a drive to the limit.
Test 3 – Notice higher CPU utilization but much higher iowait (red). Drive bandwidth stays about the same which means we are maxing this drive’s capabilities out.
Test 4 – Total bandwidth and CPU utilization go up, and total output rises by about 1TiB per day. This shows that a single 2TB Maker Pro is “almost” enough to max out this Threadripper 3970X.
Test 5. You can see the lower CPU utilization corresponds to lower output. This has nothing to do with the drive. There is low iowait meaning a faster SSD won’t help at all. MadMAx just doesn’t scale very well on higher core count systems with only a single instance.
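For anyone who wants to cross-check their own runs, here is a rough sketch of how a plot time converts to TiB per day. It assumes a k=32 plot is about 101.4 GiB and that the time column is the per-plot time of each parallel instance; both are my assumptions, not something madMAx reports. Under that reading the Test 2 and Test 4 rows come out within a few hundredths of the table, while other rows do not follow the formula as closely, so treat it as a sanity check only.

```python
SECONDS_PER_DAY = 86_400
PLOT_SIZE_GIB = 101.4  # assumed size of one finished k=32 plot


def tib_per_day(instances, seconds_per_plot, plot_size_gib=PLOT_SIZE_GIB):
    """Estimate daily output for parallel instances that each finish
    one plot every seconds_per_plot seconds."""
    plots_per_day = instances * SECONDS_PER_DAY / seconds_per_plot
    return plots_per_day * plot_size_gib / 1024  # GiB -> TiB


# Test 2: two instances, 1387.8 seconds per plot each.
print(f"Test 2: {tib_per_day(2, 1387.8):.2f} TiB/day")
# Test 4: four instances in total, 2579.1 seconds per plot each.
print(f"Test 4: {tib_per_day(4, 2579.1):.2f} TiB/day")
```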
Endurance overview – plot calculations
The Maker Pro 2TB is holding up strong. It shows 2.06 PB written at only 2% used, which suggests I’m actually running below the WAF that the 54 PBW rating assumes.
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 41 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 2%
Data Units Read: 2,711,936,352 [1.38 PB]
Data Units Written: 4,027,053,141 [2.06 PB]
Host Read Commands: 3,899,398,379
Host Write Commands: 7,595,314,902
Controller Busy Time: 22,232
Power Cycles: 9
Power On Hours: 1,254
Unsafe Shutdowns: 0
Media and Data Integrity Errors: 0
Error Information Log Entries: 11
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Thermal Temp. 1 Transition Count: 116
Thermal Temp. 1 Total Time: 129650
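As a worked check on the log above: NVMe reports Data Units in blocks of 1000 × 512 bytes, so the bracketed PB figure can be recomputed, and the written total can be extrapolated from Percentage Used. This is only a sketch. Percentage Used is a whole-number estimate from the drive, so I bracket the projection between 2% and just under 3%; the helper names are made up for illustration.

```python
BYTES_PER_DATA_UNIT = 512 * 1000  # NVMe SMART data unit size
RATED_PBW = 54.0                  # rating quoted above for the 2TB drive


def data_units_to_pb(units):
    """Convert NVMe SMART data units to decimal petabytes."""
    return units * BYTES_PER_DATA_UNIT / 1e15


def projected_life_pb(written_pb, percent_used):
    """Extrapolate total writes at 100% used. Returns (low, high)
    because the reported percentage hides the fractional part."""
    high = written_pb / (percent_used / 100)
    low = written_pb / ((percent_used + 1) / 100)
    return low, high


written = data_units_to_pb(4_027_053_141)
low, high = projected_life_pb(written, 2)
print(f"written: {written:.2f} PB")
print(f"projected life: {low:.0f} to {high:.0f} PB (rated {RATED_PBW:.0f} PBW)")
```

Even the conservative end of that range sits above the 54 PBW rating, which is consistent with this workload running at a lower WAF than the spec sheet assumes.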
I have added these drives to the official Chia endurance wiki.
This shows that the 1TB LX3030 can write about 2PiB of plots (2,077 TiB in the table below) at the rated WAF, making it an excellent choice for $600. I’m hoping they can get the price a bit lower, because the $600-1200 range has more competition from high-end data center drives. The LX2030 can do 769 TiB of plots at the spec sheet WAF, which makes it a strong drive for endurance, but the performance is not good enough for a 32-core system. This drive is better suited to an 8-core desktop. If you are going to be doing an insane amount of plotting, the $/TiB of the LX3030 is actually fairly competitive, but I am still hoping they are able to get the cost down so that more Chia plotters can afford it.
Model | $ASP | $/GB | Capacity (GB) | TBW WAF=1 | TBW (spec) | Total Plotted (TiB) | Total plotted madMAx (ramdisk tmp2) | $/TiB plots |
PNY LX3030 | $1,149.00 | $0.57 | 2000 | 87961 | 54000 | 4154 | 13500 | $0.28 |
PNY LX3030 | $600.00 | $0.60 | 1000 | 43980 | 27000 | 2077 | 6750 | $0.29 |
PNY LX2030 | $500.00 | $0.25 | 2000 | 21990 | 10000 | 769 | 2500 | $0.65 |
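The derived columns in the table can be reproduced with a short sketch. The two write-cost constants are inferred by dividing the table's own columns (spec TBW by TiB plotted gives about 13 TB written per TiB of plots, or about 4 with a ramdisk as tmp2); they are not figures stated elsewhere in this post, so swap in your own measurements if you have them.

```python
# (model, price in USD, capacity in GB, spec sheet TBW in TB)
DRIVES = [
    ("PNY LX3030 2TB", 1149.00, 2000, 54000),
    ("PNY LX3030 1TB", 600.00, 1000, 27000),
    ("PNY LX2030 2TB", 500.00, 2000, 10000),
]

# Inferred from the table ratios, not published constants.
TB_WRITTEN_PER_TIB_PLOTTED = 13.0
TB_WRITTEN_PER_TIB_RAMDISK_TMP2 = 4.0


def plot_budget(price, capacity_gb, tbw_spec):
    """Derive the cost and endurance columns for one drive."""
    plotted = tbw_spec / TB_WRITTEN_PER_TIB_PLOTTED
    return {
        "usd_per_gb": price / capacity_gb,
        "plotted_tib": plotted,
        "plotted_tib_ramdisk": tbw_spec / TB_WRITTEN_PER_TIB_RAMDISK_TMP2,
        "usd_per_tib": price / plotted,
    }


for model, price, capacity_gb, tbw_spec in DRIVES:
    row = plot_budget(price, capacity_gb, tbw_spec)
    print(f"{model}: {row['plotted_tib']:.0f} TiB plotted, "
          f"{row['plotted_tib_ramdisk']:.0f} TiB with ramdisk tmp2, "
          f"${row['usd_per_tib']:.2f}/TiB")
```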
Cost
The Beast AMD Threadripper system cost me $3000 with no SSD. With the LX3030 1TB this would be $3600, which is a pretty strong price for a system that can plot over 12TiB a day, and it can be fully reused afterwards as a high-end gaming desktop, workstation, or content creation/video editing machine.
If the LX2030 ends up being slightly less expensive at $300-400, it will also be a valid system config when used alongside a ramdisk for tmp2. It isn’t fast enough to be used as the only SSD in the system (5TiB a day only), but it may be suitable for an AMD 8 to 16 core system.
Next Steps
Phison developed these drives right around the time madMAx was released. Anyone who has been involved with Chia since the beginning knows that the plotting requirements used to be not only high sustained write bandwidth and endurance, but also 256GB of capacity per parallel plot. This is the reason they launched with the 2TB versions first, and it takes months to get a new SSD model out. With madMAx you can use a much smaller capacity SSD. I’m hoping that they end up making a 512GB version for budget desktop plotters. The 2TB drives are still very good, and hopefully they will remain useful to people who have other use cases (besides Chia plotting) for the drive afterwards. The M.2 form factor means people could actually use these in a high-end laptop as well. I’m also looking forward to the Phison E18 version with SLC, as this should have significantly higher performance.
I think a lot of consumer-level plotters were built that can only hold 128GB of RAM, that have at least one 4.0 M.2 slot, and 6-8 cores (all before MadMax could use that RAM, and BladeBit had people looking at used enterprise gear with 512GB). From what I see in this article: when the SSD in those original plotters start showing errors… the best path forward may be to max RAM to 128GB and add the small LX3030?
If that’s true, then for about $1000 you can combine the higher TBW, and most of the writes going to RAM, to end up with a refreshed plotter that should pretty much last forever?
Not even that. A single LX3030 paired with even 32GB of DRAM should be enough performance for almost any desktop system for sustained plotting, and it has enough endurance that you won’t ever have to worry about it. The 128GB of DRAM can be paired with the lower-end LX2030, which still has good endurance, just not as good sustained performance.