This time I have to go into more detail about the AJA test, because the measurements showed strikingly inconsistent behavior that cannot simply be explained by external factors such as thermal throttling or system bottlenecks. Rather, the observed anomalies point to internal processes within the controller, possibly in connection with firmware that is not yet fully mature. The analysis is correspondingly more extensive, as it is important to me to document and classify the behavior in detail. And, of course, my curiosity was piqued, as there are hardly any valid tests of this controller so far.
AJA Write: Analysis of the drops that occur
The AJA System Test is particularly suitable for analyzing the write and read behavior of an SSD, as it simulates continuous, real data streams such as those that typically occur during video recording or playback. Unlike synthetic benchmarks, AJA generates uniformly large data blocks at a constant rate, which makes weaknesses in internal memory management, cache handling or pSLC/TLC transitions particularly visible. Above all, the display of individual dips and of the transfer rate over time makes AJA ideal for evaluating the stability and consistency of SSD performance in continuous operation. The first graph shows abrupt, repeated drops in write speed against an otherwise high sustained rate, which very likely indicates inconsistent write paths within the SSD’s flash management.
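To make the idea of a block-wise throughput measurement concrete, here is a minimal Python sketch. It is not how AJA is implemented: the function name `timed_block_writes`, the block size and the use of `os.fsync` after each block are my assumptions for illustration only. It writes equally sized blocks back to back and records one MB/s value per block, which is the kind of per-frame series in which dips become visible.

```python
import os
import tempfile
import time


def timed_block_writes(path, block_bytes, blocks):
    """Write `blocks` equally sized blocks and return one MB/s value per block.

    Hypothetical helper, not AJA's method. Each block is flushed with
    os.fsync so that the timing includes the device, not only the page cache.
    """
    payload = os.urandom(block_bytes)  # incompressible test data
    rates = []
    with open(path, "wb", buffering=0) as f:
        for _ in range(blocks):
            start = time.perf_counter()
            f.write(payload)
            os.fsync(f.fileno())
            elapsed = time.perf_counter() - start
            # Guard against a zero interval on very coarse timers.
            rates.append(block_bytes / 1e6 / max(elapsed, 1e-9))
    return rates


# Tiny demonstration run in a temporary directory (64 KiB blocks).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "stream.bin")
    demo_rates = timed_block_writes(demo_path, 64 * 1024, 4)
    demo_size = os.path.getsize(demo_path)
```

For a real measurement the blocks would have to be many megabytes in size and the run long enough to exhaust any cache. The tiny values here only show the mechanics.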
Two explanations are the most plausible here. The first is internal fragmentation or flush processes in the Flash Translation Layer (FTL). The SM2508G uses a DRAM-supported FTL with dynamic allocation of the physical blocks. With very long, constant write streams and large amounts of data, existing mapping structures may have to be reorganized. These processes run synchronously with the active write operation and cannot be moved completely into the background. One indication of this is the regularity of the drops at a consistently high base level. This is not a thermal drop, but a sudden internal management stop, for example due to block-by-block consolidation or garbage collection at level 1 map granularity.
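The regularity argument can be checked on an exported series of per-frame transfer rates. The following Python sketch is my own illustration, with hypothetical function names and an assumed threshold of 80 % of the median. It marks where each drop begins and then measures how evenly the drops are spaced.

```python
from statistics import mean, median, pstdev


def find_drop_starts(rates, ratio=0.8):
    """Return the indices where the rate first falls below ratio * median."""
    floor = ratio * median(rates)
    starts = []
    in_drop = False
    for i, rate in enumerate(rates):
        below = rate < floor
        if below and not in_drop:
            starts.append(i)  # first sample of a new drop
        in_drop = below
    return starts


def interval_regularity(starts):
    """Coefficient of variation of the spacing between drops.

    0.0 means perfectly periodic; larger values mean irregular spacing.
    Returns None when fewer than two gaps are available.
    """
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    if len(gaps) < 2:
        return None
    return pstdev(gaps) / mean(gaps)


# Synthetic trace (assumed values): 10000 MB/s with a dip every 20 frames.
trace = [10000.0] * 100
for i in range(20, 100, 20):
    trace[i] = 3000.0

starts = find_drop_starts(trace)          # [20, 40, 60, 80]
regularity = interval_regularity(starts)  # 0.0
```

A value close to zero on real data would support the assumption of a periodic internal process. Widely scattered spacing would point to something else.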
The second possibility is a limitation in pSLC handling and in-flight management when data is moved from pSLC to TLC. Despite a large pSLC cache, bottlenecks can occur under saturated write loads if the switch between pSLC and TLC is poorly optimized. In practice, this means that the controller first pushes data into a fast cache area (writing only one bit per cell to cells that can hold several) and later consolidates it into the final TLC structure in the background. However, if several large data streams hit this mechanism at the same time while the pSLC budget is already in use, eviction with write amplification inevitably occurs. The drops would then be the moments when part of the cache is folded into TLC and no write resources are available for a short time, a classic flush-stall behavior.
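A deliberately simplified toy model shows how such a flush stall produces periodic drops. Everything in this Python sketch is an assumption for illustration: the function name, the hard stop when the cache is full, the resume watermark and the numbers. None of it is taken from the real firmware.

```python
def simulate_pslc(steps, host_rate, fold_rate, cache_size, resume_level):
    """Toy model of a pSLC cache that is drained ("folded") into TLC.

    Per time step the host tries to write `host_rate` units into the cache
    while the background fold removes `fold_rate` units. When the cache is
    full, host writes stall until the fill level is back at `resume_level`.
    Returns the amount of host data accepted in each step.
    """
    fill = 0.0
    stalled = False
    accepted_per_step = []
    for _ in range(steps):
        if stalled and fill <= resume_level:
            stalled = False  # enough room has been freed again
        if stalled:
            accepted = 0.0
        else:
            # Never accept more than the cache can hold after this step's fold.
            accepted = min(host_rate, cache_size - fill + fold_rate)
        fill = max(fill + accepted - fold_rate, 0.0)
        if fill >= cache_size:
            stalled = True
        accepted_per_step.append(accepted)
    return accepted_per_step


# Assumed numbers in arbitrary units: the host writes faster than the fold drains.
pattern = simulate_pslc(steps=13, host_rate=10.0, fold_rate=4.0,
                        cache_size=30.0, resume_level=18.0)
# pattern: five steps at 10.0, three at 0.0, two at 10.0, three at 0.0
```

Even this crude model yields a high base level interrupted by evenly spaced stalls. A real controller will throttle more gradually, so the sketch only illustrates the principle.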
Incidentally, a bottleneck in the background could also be caused by the way the NAND is attached. The SM2508 uses 8 channels, and on the controller side synchronized writes always span several NAND dies. If one group is currently busy with internal write amplification or block merging, the controller may have to pause briefly and wait for resources to be released. This effect also causes short-term drops. These problems cannot be solved by external cooling or platform tuning, but would need to be addressed by firmware optimizations such as smarter pre-flushing, more aggressive background provisioning of blocks, or improved SLC/TLC eviction logic.
AJA Read: Analysis of the drops that occur
The AJA System Test shown in the next image documents the read behavior of the SSD over a continuous data stream in playback mode. The horizontal axis shows the consecutive frame number as a time axis, the vertical axis shows the transfer rate in MB/s. The green line represents the moving average, while the blue curve shows the values actually measured per request. In contrast to classic synthetic benchmarks, which simulate short peaks or block-oriented accesses, AJA shows very clearly how the SSD behaves under steady, streaming-oriented read access. In this case, the SSD begins with a slightly delayed start-up phase and reaches transfer rates of over 11 GB/s relatively quickly. This value is in the range of the maximum specified sequential read speed for PCIe Gen5 SSDs and indicates a good basic configuration, a sufficiently dimensioned read cache and a fast NAND interface.
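The relationship between the two curves can be reproduced with a simple trailing moving average. The window length AJA uses is not known to me, so the `window` parameter in this Python sketch is an assumption, as is the function name.

```python
from collections import deque


def moving_average(values, window):
    """Trailing moving average over at most `window` samples."""
    buf = deque(maxlen=window)
    total = 0.0
    averaged = []
    for value in values:
        if len(buf) == window:
            total -= buf[0]  # this sample is evicted by the append below
        buf.append(value)
        total += value
        averaged.append(total / len(buf))
    return averaged


# One short dip from 10 to 4 (arbitrary units) with a window of 2 samples.
smoothed = moving_average([10, 10, 4, 10], window=2)  # [10.0, 10.0, 7.0, 7.0]
```

The example shows why the averaged line understates the depth of a short dip and stretches it over several frames. The individual measured values are therefore the more meaningful curve for judging drops.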
However, the picture becomes increasingly unstable as the test progresses. After an initially stable section, there are clearly recognizable irregularities. In the middle and late stages of the test there are a number of short-term drops, with the data rate falling repeatedly at very short intervals. These drops cannot be attributed to a thermal effect, as they occur abruptly and not progressively. The even distribution over the measurement period also speaks against classic throttling due to overheating.
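The distinction between abrupt drops and progressive throttling can be expressed as a simple heuristic. The following Python sketch is only an illustration with assumed thresholds and a hypothetical function name. It compares the largest fall between two consecutive frames with the drift between the first and the last quarter of the run.

```python
from statistics import mean, median


def describe_trace(rates, step_ratio=0.3):
    """Return whether a trace contains abrupt drops and how far it drifts.

    "abrupt": some frame-to-frame fall exceeds step_ratio * median.
    "drift":  (mean of first quarter - mean of last quarter) / median,
              a positive value meaning the rate declines over the run.
    """
    if len(rates) < 2:
        raise ValueError("need at least two samples")
    base = median(rates)
    biggest_fall = max(a - b for a, b in zip(rates, rates[1:]))
    quarter = max(1, len(rates) // 4)
    drift = (mean(rates[:quarter]) - mean(rates[-quarter:])) / base
    return {"abrupt": biggest_fall > step_ratio * base, "drift": drift}


# Synthetic example 1: stable level with two sharp dips (assumed values).
dips = [10000.0] * 40
dips[15] = 4000.0
dips[25] = 4000.0
dip_result = describe_trace(dips)

# Synthetic example 2: slow, steady decline as with thermal throttling.
throttled = [10000.0 - 100.0 * i for i in range(40)]
throttle_result = describe_trace(throttled)
```

In the first case the heuristic reports abrupt drops without drift, in the second a clear drift without abrupt steps. The read curve described here corresponds to the first pattern.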
The observed instability can most likely be attributed to the behavior of the SSD’s internal read management. In Gen5 SSDs with a high number of channels and NAND dies operated in parallel, each access is spread across several dies. If the firmware is not optimally tuned or the read requests are not distributed evenly, temporary bottlenecks can occur during parallelization. These then show up as the read fluctuations seen here.
Another aspect is possible fragmentation of the internal mapping layout. If the data on the drive is not laid out linearly or in contiguous blocks, but consists of many logical areas with scattered physical addresses, this can severely disrupt sequential read performance. Particularly in SSDs with dynamic wear leveling or complex garbage collection strategies, this effect can also occur during reading if internal resources have to be reserved for management processes.
The test curve therefore indicates a fundamentally powerful but not completely stable read subsystem. Although the SSD manages to achieve high maximum values, it cannot maintain these throughout continuous operation. From the user’s point of view, this means that micro-delays or latency peaks can occur in certain real-time applications – for example when streaming large uncompressed video files – even though the raw data rate of the hardware is sufficient in principle.
AJA brings these irregularities to light particularly reliably: because it continuously writes or reads large, uniformly structured volumes of data, it leaves no idle time for buffering and thus makes internal firmware processes directly visible. The combination of high data load and continuous access exposes weaknesses in memory management that often remain undetected in benchmarks with short bursts. If you want to read all the figures again, here is the measurement log as a PDF:




































