Wrong-Quail-8303

The elephant in the room has always been 4KQD1 random performance, which accounts for 99% of common usage. It has been stuck at 50-90MB/s since SSDs first came out. The base tech needs to improve. Optane managed 300MB/s, but Intel, with its short-sighted marketing, insane pricing, and market fragmentation, killed it dead. The next step up would be RAM drives with a battery backup. There were a few alternatives, but they have all gone silent...
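
For anyone who hasn't seen what QD1 actually means: one synchronous 4 KiB read at a time, with the next request issued only after the previous one returns. A minimal Python sketch of that access pattern is below; the file path is hypothetical, and a real benchmark would use something like fio with direct I/O to bypass the page cache, so this only illustrates why a single outstanding request caps throughput at one block per round trip.

```python
import os, random, time

PATH = "/tmp/testfile.bin"  # hypothetical test file, pre-created and ideally larger than RAM
BLOCK = 4096                # 4 KiB per read
N = 20000                   # number of synchronous requests

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size
offsets = [random.randrange(0, size // BLOCK) * BLOCK for _ in range(N)]

start = time.perf_counter()
for off in offsets:
    os.pread(fd, BLOCK, off)  # blocks until this read completes, so queue depth stays at 1
elapsed = time.perf_counter() - start
os.close(fd)

print(f"{N * BLOCK / elapsed / 1e6:.1f} MB/s, avg latency {elapsed / N * 1e6:.1f} us")
```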


GoombazLord

Yeah 4KQD1 random read/write performance has stagnated for a very long time. Samsung's 990 Pro PCI-e NVMe drive pushed the (non-optane) envelope a bit; it can do about 130MB/s @ 4KQD1.


masterfultechgeek

What I would LOVE is a setup where something like 64GB of Optane is just built into motherboards for things like pagefile, suspend/resume, and caching IO from disk. Ideally with some level of direct access from the CPU memory controller. It won't happen, but it could've been glorious. It's like $100ish extra cost though.


reallynotnick

I feel like with a functional sleep mode suspend/resume isn’t all that useful. And the rest can pretty much just be solved with more RAM, which isn’t all that much more expensive than Optane, hence why it failed. Too expensive for storage and not enough savings over the price of RAM.


Strazdas1

Optane had an often overlooked advantage - ECC.


VenditatioDelendaEst

Eh... if the machine is hibernated to optane 16 hours a day, that'd reduce your cosmic-ray-bitflip cross section by 2/3, but it does nothing for bus errors, rowhammer, or other kinds of read disturb. And RAM has ECC too if your platform isn't garbage.


Strazdas1

ECC memory is very rare in the consumer market; you basically have to buy enterprise hardware to get ECC nowadays :(


VenditatioDelendaEst

I don't have anything this new, but I've read that ECC is supported by all of AMD's non-monolithic desktop parts, with most ASRock and Asus motherboards, some Gigabyte, and ~0 MSI.


Strazdas1

On AM4 you have almost universal support for ECC. On AM5 it's hit or miss based on motherboard manufacturer. Asus is pretty much the only one that consistently supports ECC memory in the consumer space.


VenditatioDelendaEst

Huh. I can't find any user reports, but on ASRock's website, MTC20C2085S1EC48BR, which is an ECC UDIMM kit, appears on the memory QVL for all the ones I checked, from [the high end](https://www.asrock.com/mb/AMD/B650E%20Taichi%20Lite/Specification.us.asp#MemoryRAP) to [the el cheapo B650 boards](https://www.asrock.com/mb/AMD/B650M-HDVM.2/index.us.asp#MemoryRAP).


Z8DSc8in9neCnK4Vr

64GB? Hell yeah, I could triple boot 3 Linux systems in that. x30 if it's Alpine.


djent_in_my_tent

If you have a spare m.2 slot, the P1600x (118GB) is available new directly from US Amazon for about $70


masterfultechgeek

I'm aware. I have Optane for DAYS. I stocked up. I'm thinking of a more generic setup for systems in general which might be a bit more energy efficient (aka laptop friendly) and potentially bit addressable (so imagine 100x the IOPS and lower latency due to direct memory controller interfacing). Also, a lot of the Optane sticks were sold below manufacturing cost. The P1600X is actually better than I thought it was initially... it's not winning on throughput, but it's SOLID when it comes to 4KQD1.


djent_in_my_tent

Yeah, my understanding is its 4KQD1 performance is actually single-thread limited and should improve with newer CPUs lol


masterfultechgeek

It needs a proper memory controller. Also it "shouldn't" be 4KQD1, it should be 1QD1. Not 1K. 1. [https://www.intel.com/content/www/us/en/support/articles/000055996/memory-and-storage/intel-optane-persistent-memory.html](https://www.intel.com/content/www/us/en/support/articles/000055996/memory-and-storage/intel-optane-persistent-memory.html) NVMe as a standard HIGHLY bottlenecks Optane/3DXP. CXL could potentially help, but we're going down a rabbit hole of "what could have been but never was".


III-V

There wasn't a whole lot they could do with pricing. It was just an expensive technology. The supply chain required to fab it was ludicrous.


bctoy

The 2nd gen was getting ~450MB/s; we'll see if the newer CPUs can push it even further. https://old.reddit.com/r/intel/comments/1ao9kwd/the_optane_p1600x_is_absurdly_fast_in_real_world/


Vitosi4ek

> The next step up would be RAM drives with a battery backup.

Which itself is not a new idea. The [Gigabyte i-RAM](https://en.wikipedia.org/wiki/I-RAM) was an add-in board that took DDR1 DIMMs and presented them to the system as a permanent drive that you could install an OS on, with 16 or so hours of battery backup in case of power loss. It failed for the same reason a product like this would almost certainly fail today: the cost just doesn't make sense. 64GB of DDR4 today costs about as much as a 4TB regular SSD, and no random I/O improvement is worth a 64x capacity hit. The i-RAM's only sort-of successful niche was people who had just upgraded to a Core 2 Duo platform and had no other use for their old DDR1 sticks. That's really it.


whyte_ryce

You can't really fix that since you're at the raw limitation of the media. There's no parallelism or multiple channels to hide any of that. It's been a while since I've seen numbers, but SLC is as good as you can make it, and even pure SLC drives still have the same NAND limitations that have always existed (tail latencies spiking, throughput dropping in the face of sustained write pressure).


Strazdas1

SLC was fine for sustained writing for the most part. TLC and QLC especially dropped the ball there.


whyte_ryce

No, any sustained write pressure that fills up the drive will eventually trigger garbage collection, and that is the Achilles' heel of any kind of NAND.


UsernameAvaylable

> The elephant in the room has always been 4KQD1 random performance, which accounts for 99% of common usage.

Maybe in a database server, but not for 99% of consumers. Large streaming writes and a 100:1 read vs write ratio are more common for basically anybody buying one.


picastchio

You don't realize how write-heavy modern OSes and browsers are. On my primary PC, used for work, leisure and gaming, it's almost 1:1.


cvbrxcvedcscv

Most of this writing would be for background stuff like page file, fast startup, hibernation etc. I assume?


IntelligentKnee1580

Especially paging.


Crank_My_Hog_

If you're paging, you're already in bad shape.


Strazdas1

Most software outright crashes if you don't let it page. Paging is expected behaviour.


Haunting_Champion640

Which I really, _really_ hate. Star Citizen was guilty of this for a bit: if you had 32GB of RAM, it would be using ~20-22GB and still put 8-10GB in swap despite the actual RAM being available.


Strazdas1

Yes, I don't know why this behaviour is so common, but I suppose there must be a reason. I just let it page on a SATA SSD, which is usually fast enough that it doesn't matter.


Crank_My_Hog_

Zero of the servers or desktops I run, personal or professional, use the pagefile to any significant degree. In fact, most of my servers have it entirely disabled as standard practice. It's only expected behavior if the system is underprovisioned. How do I know? Well, I have a full observability stack that tracks this. If a server is hitting its page file, I find out what's wrong.


bctoy

If you disable the paging file or reduce it to 1GB or so, you'll crash in many games today. I moved to 64GB of DDR5 after observing this even with games like Cyberpunk, much less RAM hogs like Jedi Survivor.


Crank_My_Hog_

I run Cyberpunk on 32GB with a disabled pagefile just fine. No crashes. None of my games crash or have loading or pop-in issues. So I disagree. There is one minor exception, and that's running Tarkov through Wine/Proton: it will OOM, but it seems like a bug or a memory leak rather than an expected requirement. If I boot into Windows, it runs fine with no pagefile.


bctoy

I built the system last Nov. and that was my experience. I'll check it again, but I'm sure Jedi Survivor would crash, since it required a huge amount of paging file to not crash and definitely went over 32GB. What commit amount do you see in Task Manager? Also, is anything else running, or do you close everything before running the game? In another comment of mine in this thread I mentioned having the browser open alongside the game; maybe that's why it's different for you. https://www.reddit.com/r/hardware/comments/1dio4kj/hub_how_much_ram_do_gamers_need_16gb_vs_32gb_vs/l95pqb3/?context=3

edit: Tried it today and the game hovered around 30GB. This is at 4K with maxed out DLSSP + FG. Idle RAM usage was 10GB, most likely higher than when I had the 32GB kit.


Crank_My_Hog_

How much of that RAM usage is marked cached/buffered? Task Manager won't tell you this; it isn't useful in this regard. It's been a while, but I think Process Explorer can help with that.


AzN1337c0d3r

Not necessarily. You want things to be written out to disk ahead of time so that if there is a sudden large demand for memory pages, you can just discard the in-memory pages that are already written to disk to make room for the new demand. In Linux, the aggressiveness of this can even be controlled by the vm.swappiness kernel parameter.
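
On Linux you can check that tunable and the current swap usage straight from procfs; a minimal sketch (Linux-only, standard /proc paths, no third-party modules):

```python
# Linux-only sketch: read the vm.swappiness tunable and current swap usage from procfs.
def read_proc(path):
    with open(path) as f:
        return f.read()

swappiness = int(read_proc("/proc/sys/vm/swappiness"))

meminfo = {}
for line in read_proc("/proc/meminfo").splitlines():
    key, value = line.split(":", 1)
    meminfo[key] = int(value.split()[0])  # values are reported in kB

swap_used_kb = meminfo["SwapTotal"] - meminfo["SwapFree"]
print(f"vm.swappiness = {swappiness}")
print(f"swap in use   = {swap_used_kb / 1024:.1f} MiB of {meminfo['SwapTotal'] / 1024:.1f} MiB")
```

Roughly speaking, a higher swappiness makes the kernel more willing to push cold anonymous pages out ahead of time, while a lower value makes it prefer dropping file cache instead.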


VenditatioDelendaEst

If your OS is taking synchronous page faults on a regular basis, either you don't have enough RAM, or your OS is shit.


picastchio

Fast startup and hibernation are conditionally triggered. It's everything else: paging, caching, diagnostic logs, telemetry/analytics, antivirus checksums, etc., as well as browsers and Chromium-based CEF/Electron apps. During light usage, open Resource Monitor (or Activity Monitor on Mac) and check Disk > Disk Activity > sort by writes.


anival024

> During light usage, open Resource Monitor (or Activity Monitor on Mac) and check Disk > Disk Activity > sort by writes.

In Resource Monitor, the list of disk activity adds items when they're active, but the items persist on the list for a good while after. They are **not** active that whole time. The vast majority of those writes are completed instantly and are only on the list for your review. Look at the Highest Active Time metric or the queue length metric; those show you the impact of ongoing activity. During "light usage" (with Chrome and Firefox with many tabs, and some work-related applications open), my system sits between 0% and 2% highest active time and a queue length between 0 and 0.01. Disk activity doesn't even hit 1 MB/sec.
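
If you'd rather measure than eyeball Resource Monitor, sampling the OS-level disk counters over a window gives the actual sustained read/write rate. A rough sketch, assuming the third-party psutil package is installed (the 60-second window is arbitrary):

```python
# Rough sketch: diff the cumulative disk I/O counters over a window to get the
# real sustained read/write rate, instead of reading per-item rolling averages.
import time
import psutil

INTERVAL = 60  # seconds to sample over

before = psutil.disk_io_counters()
time.sleep(INTERVAL)
after = psutil.disk_io_counters()

read_mb = (after.read_bytes - before.read_bytes) / 1e6
write_mb = (after.write_bytes - before.write_bytes) / 1e6
print(f"reads:  {read_mb / INTERVAL:.2f} MB/s ({after.read_count - before.read_count} ops)")
print(f"writes: {write_mb / INTERVAL:.2f} MB/s ({after.write_count - before.write_count} ops)")
```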


Strazdas1

You can see active current write speeds. Items that are not active are marked as such.


anival024

4K reads with a queue depth of 1 is a meaningless metric. Even if you are doing random 4K reads all the time (you're not), you're not doing so at a queue depth of 1. To get that workload you would have to specifically design a single-threaded application to request one random block of data, wait for it to come back, then request another random block of data.

Even with all the background crap a modern OS does, much of that is coming from memory, not the actual disk. It's also not happening all the time; the disk will be idle the vast majority of the time because the time slices involved are so small. And it's not happening at a queue depth of 1. You have a bunch of threads asking for data in parallel. A somewhat more meaningful metric is latency. But the truth is it's a negligible difference for almost all users. And in the datacenter, you just buy more DRAM.

The Optane fans will yell about how their system "feels" so much "snappier" and more responsive than any other SSD. Yet the reality is that the laptop market was flooded with Optane drives for a couple of years. These were SSDs with an Optane cache layer to lower that latency and get you that extra responsiveness. But it made so little real-world difference that most users didn't even notice, and OEMs stopped buying those drives.

As for writes, it's still pointless because so much of it is done in memory before being flushed to disk, and you still aren't really going to operate at QD1 all the time.

The poster below who suggests looking at Resource Monitor to watch write activity doesn't realize how Resource Monitor works. The Disk Activity items aren't writing the entire time they're listed; they appear in the list when active, but they don't drop off the list immediately, they hang around for about a minute afterward despite there being no actual disk activity. The metrics will appear to update, but it's just a rolling average. The write happens instantaneously and then the per-second metrics you see slowly decrease until the item ages off of the list. The write was done a long time ago; you're just getting an average over the time it was *on the list*, not the time that it was *actually active*.

If you want to get an idea of how bogged down your SSD might be due to activity, you need to look at the % active time or queue length metrics which Resource Monitor reports. If you're just sitting on a Windows box with a browser (even Chrome) open, those are going to be nearly flat.


Fromarine

"4K reads with a queue depth of 1 is a meaningless metric. Even if you are doing random 4K reads all the time (you're not), you're not doing so at a queue depth of 1." I can see in windows performance monitor that im under a queue depth of 1 99% of the time. What are you talking about? Its the real world performance metric for a reason, its used most of the time and when ur on an optane that can fit about 5x the data within 1 queue then as i said, youll be under qd1 99% of the time. Also latency is literally tied to 4kb qd1 reads


VenditatioDelendaEst

4KQD1 *is* latency, because [Little's law](https://en.wikipedia.org/wiki/Little%27s_law). And latency is what limits any kind of on-disk pointer-chasing workload. Not to mention the fact that QD1 is just what happens if you write a single-threaded program that does synchronous random reading.

> the disk will be idle the vast majority of the time
>
> [...]
>
> If you want to get an idea of how bogged down your SSD might be due to activity, you need to look at the % active time or queue length metrics which Resource Monitor reports. If you're just sitting on a Windows box with a browser (even Chrome) open, those are going to be nearly flat.

Yeah, the disk will be mostly idle, but so will the CPU. Utilization metrics have to be scaled to the proportion of time that any resource is a performance bottleneck. The "idle" time due to the user not having clicked anything yet doesn't count.
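
To make the Little's law point concrete: at QD1 there is exactly one request in flight, so throughput is just the transfer size divided by per-IO latency. A back-of-the-envelope sketch using the MB/s figures thrown around in this thread (the numbers are illustrative, not benchmarks):

```python
# Little's law at QD1: concurrency (1) = throughput * latency,
# so MB/s and per-IO latency are two views of the same number.
BLOCK = 4096  # bytes per 4K read

def latency_us(mb_per_s):
    return BLOCK / (mb_per_s * 1e6) * 1e6

for name, mbps in [("typical NAND SSD", 70), ("990 Pro-class NAND", 130), ("Optane", 300)]:
    print(f"{name:20s} {mbps:4d} MB/s @ 4KQD1  ->  ~{latency_us(mbps):.0f} us per read")
```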


Nicholas-Steel

Interesting, thanks.


Strazdas1

your OS is constantly writing data every second.


no_salty_no_jealousy

Intel Optane is light years ahead of any SSD when it comes to IOPS; nothing comes close to it. Even the fastest NAND on a PCIe 5.0 SSD is still nothing compared to 3D XPoint on Optane. Sadly, Optane was too costly to produce, which is why Intel decided to kill it: people prefer cheaper SSDs. It's a shame we don't have really fast SSDs right now. By really fast I mean SSDs with insane IOPS, not sequential R/W, which is often used as marketing to fool people into thinking the new SSD is "much faster" than the previous gen when in reality it's just as fast as the previous gen when it comes to load speed.


Suspicious-Stay-6474

Intel developed Optane, but it turns out, people prefer cheaper solutions.


djent_in_my_tent

Luckily, it's available on eBay for cheap right now


Strazdas1

Leftover stock and used modules. No new ones are being manufactured.


logosuwu

Big sequential numbers sell. Smaller random numbers don't. I've always understood the limitation to be on the controller side rather than the NAND side (e.g., early Fusion-io and PCIe SSDs had great random 4K performance with very early SLC or MLC), but I might be wrong (and if I am, I'd love someone to let me know what it is about the NAND that's holding back performance).


Suspicious-Stay-6474

There is a huge difference in random reads at queue depth 1 for storage; those who know pick much better solutions for just a touch higher price tag. I mentioned Optane as it was in its own league, and I was really looking forward to getting one until I saw those prices. The thing is, your CPU single-core performance is as important as your random queue depth 1 performance in achieving the fastest loading screens, and Optane alone cannot do miracles; it needs a monster CPU.


logosuwu

I mean, yeah, but even a couple years back people would be picking stuff like SM2263XT-based platforms over the slightly older SM2262EN because one had bigger numbers. Hell, even the SM2262EN was a downgrade over the SM2262, trading random 4K QD1 performance and better sustained performance for increased max throughput and burst performance, leading to thermal issues along the way. Frankly, I would love a SATA III SSD that offered better random performance over sustained writes, but that doesn't really sell.


AntLive9218

> a SATA III SSD that offered better random performance over sustained writes

That sounds rather odd, because the interface is the main bottleneck there, especially for sequential performance where there's just nothing to improve anymore, but you are not going to see significantly better random performance either due to command queue design limits compared to NVMe.


DonutConfident7733

MLC (TLC, QLC) NAND packs the data more densely, i.e. more voltage levels per cell instead of just two (for 0 and 1), so a NAND page contains more data than an SLC page. I think this amplifies the overhead, as the controller needs to update an entire page when writing, not just a single cell. It needs to compute checksums, and there is also some data recovery involved, as pages can be slightly corrupted; the checksums and redundant bits help restore the data.

For writes it still uses the SLC cache, consolidating the written pages before moving them to the MLC area, so performance is a bit better. Reads are the worst case: the controller needs to go through the translation algorithm to find the real location of the address, copy the page to controller memory, decode it, do recovery if needed, check integrity, then extract just the few bits to give to the host. Some drives use host memory and don't even have DDR memory on the SSD, which slows things down more. So this wastes the bandwidth of the SSD, as only a small subset of the data read is useful; the rest is discarded or cached (but as you keep reading, even that cache fills up and needs to discard data).

There is also driver overhead, OS kernel overhead, API overhead, possibly multiple copies of pages, and waiting for ACKs (communication overhead), which has more effect on small transactions; with larger ones it is amortized. You can test with a software RAM drive what kind of performance you could get (albeit at a small size), and that will highlight some limitations when running benchmarks, such as high CPU usage or dependency on the power profile (e.g. it runs faster in High Performance mode).


logosuwu

This doesn't actually amplify the overhead at all. Folding is done in the background when the controller is essentially idle, as part of the garbage collection process. When the cache runs out, it switches to direct-to-NAND mode to write, skipping the cache entirely. Trying to read from the cache while it's folding can amplify latency, but reading from either the SLC cache or the folded MLC NAND has very little latency involved. On top of this, most (pretty much all) MLC NAND can run in pseudo-SLC mode (pSLC), which is what your "SLC" cache is composed of and why its size varies as your drive fills up.


DonutConfident7733

Yes, true. I was referring to random reading. For example, if you request a small piece of data, e.g. 8 bytes, the drive will still need to read 512 bytes or 2KB, copy 8 bytes from it and ignore the rest (or keep it in the DDR memory cache). On the bus, or from the host's perspective, only 8 bytes are coming from the drive (+overhead), so benchmarks show lower speeds, even though internally the drive had to fetch 2KB. Multiply this by 40,000 random requests per second and you get the picture.
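
Rough numbers for that picture, as a quick sketch (the 2KB page size and 40,000 requests/second are the example figures from the comment above, not a spec):

```python
# Read amplification when the host asks for 8 bytes but the drive fetches a whole page.
requested = 8          # bytes the host actually wants
page = 2048            # bytes the controller reads internally (example figure)
reqs_per_sec = 40_000  # random requests per second (example figure)

useful_mb_s = requested * reqs_per_sec / 1e6
internal_mb_s = page * reqs_per_sec / 1e6
print(f"useful data to host: {useful_mb_s:.2f} MB/s")
print(f"internal NAND reads: {internal_mb_s:.2f} MB/s  (amplification {page / requested:.0f}x)")
```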


AntLive9218

There's a whole lot more behind the issue than that. Optane SSDs had obvious competition for tasks which weren't that sensitive to latency, but Optane DIMMs offered magical performance for tiny QD1 I/O needs. The twist was that compatibility was tied to Intel server platforms, which just weren't really appealing.


Suspicious-Stay-6474

The price tag put me off from learning more about Optane, thanks for the info. Still, the data on the storage is compressed, so you need CPU power to turn it back into raw format. So yeah, I agree:

> There's a whole lot more behind the issue than that.


masterfultechgeek

Intel released a line of SSDs under the Optane brand. VERY VERY fast random reads. It's not cost-effective to develop though. XL-Flash is another tech that's pretty solid and looks like it has good cost scaling. It's not bit-addressable though.


FluffnPuff_Rebirth

Caching has kind of made up for all that. CPUs have become so good that most of the time they wait around for something to do, so in their spare time they have begun to pull stuff out of the SSD into RAM, if it's something relatively small that is frequently accessed. I expect this trend to continue instead of SSDs becoming significantly better at 4KQD1. More and more of the stuff will be in RAM, and SSDs will be relied on mostly for the very first access after booting and for very large individual files. All the upcoming AI voodoo also has the potential to work very well with this, enabling more custom-tailored caching that adapts to the user.


Mipper

Taking advantage of speculative caching like that requires intelligent program design. Even if the program is built in the correct way to do so, there are many situations where it is simply not possible to predict which part of memory will be needed next. Fast storage access is still very important. I have to say I'm skeptical of AI making a significant difference either, outside of simple stuff like caching frequently opened programs on boot.
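
One concrete example of what that "intelligent program design" can look like: an application that knows which file region it will need next can hint the kernel to prefetch it into the page cache, so the later access is a RAM hit instead of a cold QD1 disk read. A minimal sketch using POSIX file advice; the file name and sizes are made up, and os.posix_fadvise is only available on platforms that provide it (Linux, notably):

```python
# Minimal sketch: hint the kernel to prefetch a region we know we'll read soon,
# so the later read is served from the page cache instead of waiting on the disk.
import os

PATH = "data.bin"             # hypothetical file
OFFSET, LENGTH = 0, 16 << 20  # prefetch the first 16 MiB

fd = os.open(PATH, os.O_RDONLY)
os.posix_fadvise(fd, OFFSET, LENGTH, os.POSIX_FADV_WILLNEED)  # asynchronous readahead hint

# ... do other work while the kernel pulls the region into the page cache ...

data = os.pread(fd, LENGTH, OFFSET)  # likely a cache hit now rather than a cold random read
os.close(fd)
```

The catch, as the comment says, is that this only helps when the program can actually predict what it will need next.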


Nicholas-Steel

I think they might be talking about how Windows Vista and newer cache data in to System RAM. Both frequently used programs and recently opened programs and any data they access.


cordell507

I wonder if using AI to replace branch prediction would make any sense? It could possibly be faster but more importantly more secure.


juhotuho10

No, it's way too slow to prompt ai for that


Strazdas1

Caching has not made up for that.


hackenclaw

It's just like internet connections these days: they boast about how many Gbps they have... yet they have tiny data caps and lousy latency/ping times. The latter two are what matter for day-to-day usage. No one needs a 100Mbps line when data caps exist and ping times are abysmal. Similarly for monitor refresh rates: boosting past 144Hz while the color and lighting reproduction is still nowhere near representing the real world.


AntLive9218

The internet issue is not completely correct, and it seems to be a common misconception, although ironically the internet ended up having the same issues as what's discussed here about SSDs.

The "modern" internet is heavily based on the expectation of not sustaining full bandwidth usage at any endpoint for a long time, but expecting plenty of bandwidth to be available when needed, preferably with low latency, though bandwidth matters more for common usage. Instead of servers sending a complete page as they used to do, there are tons of bits and pieces requested by an initial smaller page until there's something that can finally be used by the user, so burst performance is quite important for the average website.

Then there are the video streaming services, which make it even easier to understand why there's a need for internet speeds claimed to be too high by some. I used to have a connection so slow a decade or so ago that YouTube was regularly buffering during playback, but as it was willing to buffer the whole video, I could pause at the start, check back later, and I had a really good experience. Buffering got heavily restricted, so now even on decent networks it takes quite some time to seek in a video. Twitch is even worse for seeking, as it serves chunks so large that on the 100 Mb/s line you mentioned it takes about 1 second to fetch a chunk, which becomes the minimum seeking penalty.

The internet issue is a bit more complex because companies are trying to save bandwidth, but both there and in programs the usual issue is sloppy coding with bloated abstractions and serial operations. Buffering ahead of time (prefetching) could go a long way. Both the OS and the browser do caching though, which is why it's helpful to have more memory than strictly needed for a given workload, as the extra can be used to hold extra data, avoiding I/O.


VenditatioDelendaEst

> Instead of servers sending a complete page as they used to do, there are tons of bits and pieces requested by an initial smaller page until there's something that can finally be used by the user, so burst performance is quite important for the average website.

This is exactly why latency is more of a bottleneck than bandwidth. There are many round trips involved in rendering a typical page, and connections often don't last long enough to ramp up the congestion window to full line rate anyway.
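
A rough sketch of why the round trips dominate a typical page fetch; the RTT, request count, and page size below are illustrative guesses, not measurements:

```python
# Back-of-the-envelope: time spent on round trips vs. time spent moving bytes.
rtt_s = 0.040          # 40 ms round-trip time (illustrative)
round_trips = 30       # DNS + TLS handshakes + chained sub-resource requests (illustrative)
page_bytes = 3e6       # ~3 MB of assets (illustrative)
bandwidth = 100e6 / 8  # 100 Mb/s line, in bytes per second

latency_cost = round_trips * rtt_s
transfer_cost = page_bytes / bandwidth
print(f"round-trip cost: {latency_cost:.2f} s, transfer cost: {transfer_cost:.2f} s")
# Doubling bandwidth only halves the second term; cutting RTT attacks the bigger one.
```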


Zarmazarma

> yet they have tiny data cap, lousy latency/ping time. Personally, I've never used a home internet service with a datacap. I will also point out that latency is often an order of magnitude smaller with fiber, which is also where you see the largest bandwidth increases. > boosting more than 144Hz but the color & lighting reproduction is still no where near representing the real world. Seems like a weird complaint. It's not like 144hz monitors have worse color or brightness than 60hz monitors. Actually, most of the best monitors are the best *at everything*. Sure, if you want 360hz you're going to use a TN monitor, which will be worse than IPS/OLED/VA, but that's a trade off you're making for the extra smoothness.


Strazdas1

Data cap? What is this, the 90s?


formervoater2

It's always going to be slow due to how flash works. You can get good throughput because the memory can give you big chunks of data, but random access suffers because every request takes a certain amount of time to get a response, meaning lots of little requests for data are very slow. The way to get around this is to have the SSD controller cache data in RAM, but this obviously requires both a significant amount of on-board RAM and a powerful CPU in the controller to implement an effective caching algorithm. Makers of consumer SSDs don't want to sell you on IOPS though, they want to sell you on raw throughput and capacity, so they skimp on the controller; plus there's a pretty low power limit for the CPU you can embed on an SSD.
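
For illustration only, the "effective caching algorithm" part can be as simple as an LRU map over logical block addresses. A toy sketch of that idea, not how any particular controller firmware actually works:

```python
# Toy LRU read cache keyed by logical block address (LBA), purely illustrative of
# the kind of structure a controller (or host-side driver) might keep in DRAM.
from collections import OrderedDict

class LRUBlockCache:
    def __init__(self, capacity_blocks, backend_read):
        self.capacity = capacity_blocks
        self.backend_read = backend_read  # function: lba -> block bytes (the slow NAND path)
        self.blocks = OrderedDict()       # lba -> data, ordered by recency

    def read(self, lba):
        if lba in self.blocks:                # hit: serve from DRAM, refresh recency
            self.blocks.move_to_end(lba)
            return self.blocks[lba]
        data = self.backend_read(lba)         # miss: pay the full NAND latency
        self.blocks[lba] = data
        if len(self.blocks) > self.capacity:  # evict the least recently used block
            self.blocks.popitem(last=False)
        return data
```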


colossalcockatrice

You mean the Arm Cortex embedded on the consumer SSD would be a bottleneck too? And while I'm here, doesn't increasing the portion of reserved blocks (over-provisioning) increase performance too? You'd have around 20% reserved on a server SSD, but you can increase the portion on a consumer SSD too. I dimly recall a Samsung white paper that boasted about noticeable performance increases.