As per usual we do need to start with a caveat. Our SkyHAWK AI samples are not retail, and instead are OEM variants. Thus we cannot comment on the shipping container or accessories that ship with said drives. Quite honestly, we doubt that (m)any buyers of a SkyHawk AI (or any Enterprise grade HDD for that matter) will care one whit what the box looks like or what accessories they come with. Instead, it is all about the performance… and relative TCO a model has to offer.
Moving on. Over the years Seagate has refined, optimized and arguably perfected their Guardian series of drives. Refined them for the market niche they are targeted at. Optimized their hardware features, and perfected their firmware. Many moons ago the hardware optimization took the form of actually making different models on different ‘blueprints’. For example, the Exos X line was the foundation for enterprise storage, but the BarraCuda was the foundation for the home user and SMB targeted models (e.g. non-Pro IronWolf series).
This duality did lead to a lot of confusion… and anger. For example the original SkyHAWK was basically a highly tweaked ‘Cuda, but the SkyHAWK AI was built on the EXOS X platform. Same with the IronWolf vs IronWolf Pro options. These days… this duality is not needed. Seagate has streamlined the Guardian series to the point where there really are no longer two basic foundational blueprints. Instead, all are basically built upon the EXOS X platform and just minor to moderate ‘tweaks’ are applied to it to optimize it for a given market niche. Basically a SKYHAWK AI will not have as much factory testing as an EXOS X, but it will have more than an IronWolf PRO. This is why all in a given generation (e.g. 24TB being the flagship capacity option) will have similar performance and similar uncorrectable error specifications, and even similar TDW ratings.
Similar, however, is not a synonym for same. More factory testing will certainly finely sift out the ‘on the bubble’ drives that could potentially pass a short QA/QC testing… but for home users who do not blatantly hammer their drives (even NAS drives) the differences do not justify the added expense of extended testing. The same is true of what is being tested for a given line… and yes it does differ. For point of reference think of a 30 minute stress test vs a 24 hour stress test to ‘prove’ if a CPU overclock is stable or not… as maybe you never stress your rig enough to go past the 30 minute mark. Conversely, using one stress test vs. another can net different results regardless of time spent testing (SuperPi vs Prime95 for example).
With that said, the major difference between the models stems from radically different firmware applied to the similar underlying hardware. Thanks to buying an SSD manufacturer way back in the day, Seagate are masters at firmware optimizations and advanced algorithmic optimization. So much so they are a razor compared to a Bowie knife that is the rest of the industry. So where the ‘other guys’ have to rely upon brute force to get the job done, Seagate can easily modify the firmware to do as good a job, if not a better one.
The downside to elegance and refinement is Seagate models are highly optimized for their targeted scenario. Of all the various Guardian models that have existed the SkyHAWK AI is the very epitome of this product segmentation. In order to explain this, let’s quickly go over how “normal” drives do things, how the Exos X does things, and then how the SkyHAWK AI does it… but first we have to make something crystal clear. Firmware optimization is not a magic bullet. It simply is a refinement on how the r/w heads lay down the data on the platters. So if you are expecting 10K RPM performance from a 7.2K RPM drive you are still SOL. SAS 10Ks are still the performance champs… and are priced like it.
In either case, with typical HDDs (take the IronWolf Pro for example) the data is pushed over the SATA bus, is funneled into the HDD’s onboard controller, the onboard multi-core ARM controller calculates the necessary ECC parity bits, and then this data is broken up into chunks that are then sent to the low-level controller, which in turn starts the process of writing the data blocks and parity bits to the platters in a linear fashion. What this means is, the platters will look like this: a data block, followed by its ECC parity bits, then another data block and its parity bits… in an ad nauseam fashion until the write operation is complete or the drive is filled up.
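For those who think better in code, here is a minimal sketch of that interleaved layout. Please note the block size, the function names, and the use of a CRC32 as a stand-in for ‘ECC’ are all our own assumptions for illustration (a CRC can only detect errors, not correct them) and are in no way how the actual firmware does it.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical data block size in bytes


def toy_ecc(block: bytes) -> bytes:
    """Stand-in for real ECC: a 4-byte CRC32 (detects errors, cannot fix them)."""
    return zlib.crc32(block).to_bytes(4, "big")


def interleaved_layout(data: bytes):
    """Return the on-platter order: data block, its ECC, next block, its ECC..."""
    layout = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        layout.append(("data", block))
        layout.append(("ecc", toy_ecc(block)))
    return layout


# Three blocks worth of zeros, just to show the ordering.
layout = interleaved_layout(bytes(3 * BLOCK_SIZE))
print([kind for kind, _ in layout])
# ['data', 'ecc', 'data', 'ecc', 'data', 'ecc']
```

The takeaway is simply the ordering: every read of a data block drags its parity along for the ride, as the two are physically next to each other.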
In the EXOS X the firmware injects “SuperParity” technology into the mix. Thus the data is pushed over the SATA bus, is funneled into the HDD’s onboard controller… and then things change compared to the normal method. First, the controller does indeed figure out how to lay out the blocks and calculate the ECC on said blocks… but it is not just doing a Data+Parity, Data+Parity linear layout. Instead the data blocks are all written in a linear fashion, then the standard ECC is written at the end of the write operation, and then a secondary SuperParity (secondary ECC) track is written that covers both the data and ECC blocks in its ‘super ECC’. Thus there are two write operations, but read performance can be boosted (as it is not reading any ECC unless it has to). Basically think JBOD vs RAID 5 (and its parity stripe) for point of comparison on how the Exos X does things versus the older fashioned models.
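To make the RAID 5 comparison a bit more concrete, here is a rough sketch of a ‘data first, ECC after, super parity last’ layout. Everything in it is an assumption on our part: the block size, the names, the CRC32 standing in for the standard ECC, and the simple XOR standing in for the secondary parity. Seagate does not publish the real algorithm, so treat this as a toy model of the idea and nothing more.

```python
import zlib
from functools import reduce

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def toy_ecc(block: bytes) -> bytes:
    """Stand-in for the standard per-block ECC (a CRC32, detection only)."""
    return zlib.crc32(block).to_bytes(4, "big")


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def super_parity_layout(data: bytes):
    """Write all data blocks, then all ECC, then one parity over data + ECC."""
    blocks = [data[i:i + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
              for i in range(0, len(data), BLOCK_SIZE)]
    eccs = [toy_ecc(b) for b in blocks]
    records = [b + e for b, e in zip(blocks, eccs)]
    # RAID 5 style XOR across every (data + ECC) record; assumed, not Seagate's math.
    parity = reduce(xor_bytes, records, bytes(BLOCK_SIZE + 4))
    layout = ([("data", b) for b in blocks]
              + [("ecc", e) for e in eccs]
              + [("super_parity", parity)])
    return layout, records, parity


def rebuild(records, missing_index, parity):
    """Recover one lost record from the survivors plus the parity."""
    survivors = [r for i, r in enumerate(records) if i != missing_index]
    return reduce(xor_bytes, survivors, parity)


data = bytes(range(256)) * 48  # 12,288 bytes = exactly three blocks
layout, records, parity = super_parity_layout(data)
print([kind for kind, _ in layout])
print(rebuild(records, 1, parity) == records[1])
```

Notice that a clean read only ever touches the ‘data’ entries at the front of the layout. The ECC and the parity are only consulted when something goes wrong, which is the whole point.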
With the SkyHAWK AI series Seagate has taken the idea of the Super Parity… and tweaked it. Instead of worrying about a secondary track for parity, it is actually writing each and every stream to its own ‘track’… and can do so for up to 32 concurrent streams. Obviously this takes some major tweaking to how things are done and has some major caveats.
The first is that this requires TDMR to not only work but work fast enough so as to not be a (noticeable) detriment to overall write performance. So as a brief overview of TDMR, Two-Dimensional Magnetic Recording technology uses a dual r/w head configuration with a ‘detector’ signal processor in between the main drive controller and the read/write heads. The two heads are slightly offset with one read head over the central track being read and partially over the track ‘above’ it, and the second read head offset so it is mainly on the central track and the one ‘below’ it. This allows for a more sensitive read on a given bit, allows for eliminating the interference from the bits surrounding the actual bit being read, and generally improves the signal to noise ratio (aka SNR, which is the main culprit of false errors). It also gives the drive controller two chances of reading the bit properly for a single pass. These days, 32 write + 6 read streams is pushing things but in time we could see a SkyHAWK AI that can handle even more streams. It all will come down to how dense Seagate can make the platters, and how good the TDMR controller is at processing out the signal vs the noise!
In either case, the largest caveat is this requires a honkin’ huge chunk of RAM cache… and using said RAM as more of a write buffer than a classical read buffer. Which is highly unusual for an HDD to say the least. Yes, in classical drives and even the EXOS the buffer “can” be used as an emergency write buffer… but their cache has been optimized for read caching. Not write. Basically, in non-SkyHAWK AI drives the various firmware algorithms are monitoring what ‘you’ (the host system) are calling for and then guessing at what said host controller / ‘you’ are going to ask for next, and then pushing the first couple chunks worth of data into the buffer… before you ask for it. In what we are sure is a totally and “completely coincidental” manner… the Exos X makes “up to” 32 guesses and stores the guesses in their own chunk of the RAM cache. With so many potential guesses the Exos X can keep the IOPS high, and wasted spins to a minimum. This advanced guessing game could be considered ‘AI’ but since Seagate had been doing it this way since the inception of most of the Guardian models… there was no need to append an AI moniker to a given model.
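A toy version of this guessing game can be sketched as a segmented read-ahead cache. The class name, the eight-block prefetch depth, and the least-recently-used eviction are purely our assumptions for illustration. Only the ‘up to 32 guesses’ figure comes from the text above, and a real drive would also keep extending a guess as the host keeps hitting it (this sketch does not).

```python
from collections import OrderedDict


class ReadAheadCache:
    """Keeps up to `max_segments` sequential guesses, evicting the stalest one."""

    def __init__(self, max_segments=32, prefetch_blocks=8):
        self.max_segments = max_segments
        self.prefetch_blocks = prefetch_blocks
        self.segments = OrderedDict()  # miss LBA -> range of prefetched LBAs

    def read(self, lba: int) -> bool:
        """Return True on a cache hit, False when the platters must be read."""
        hit = next((key for key, blocks in self.segments.items()
                    if lba in blocks), None)
        if hit is not None:
            self.segments.move_to_end(hit)  # mark this guess as recently useful
            return True
        # Miss: guess that the host will keep reading sequentially from here.
        self.segments[lba] = range(lba + 1, lba + 1 + self.prefetch_blocks)
        self.segments.move_to_end(lba)
        if len(self.segments) > self.max_segments:
            self.segments.popitem(last=False)  # drop the stalest guess
        return False


cache = ReadAheadCache()
print([cache.read(lba) for lba in (100, 101, 102, 5000, 103)])
# [False, True, True, False, True]
```

The jump to LBA 5000 costs a trip to the platters, but it does not throw away the first guess. That is why having many independent guesses keeps wasted spins down when several things are being read at once.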
We say ‘most’ and not all as this is where SkyHAWK series enters the chat… and where o.g. SkyHAWK vs. SkyHAWK AI differences start to appear. First, Seagate states the SkyHAWK AI has been optimized for an 80 write / 20 read configuration. This is important because it means it has to handle 32 concurrent write streams and 6 read streams… and be faster than the total MB/s requirement so as to account for latency associated with handling deep (by HDD standards) queue depths. Obviously, that will be tricky when dealing with 32 4K write streams and six 4K read stream requests as that is a looooot of data. Thus… it cheats. The very first thing a SkyHAWK AI does is read how many incoming data streams there are and then allocates ~80 percent of the RAM cache for the writes of said streams and ~20 percent for the reads. It then does a virtual division of the RAM capacity into tiny slices. With each stream getting its own slice of the RAM.
In the classic SkyHawk the algorithms did not spend much time on guessing what you want. It simply allocated 20 percent of the RAM cache for read requests and filled it as needed… as streams are highly linear by nature. In the AI variant some of this read allocated RAM is set aside for more ‘Exos X’ like guessing games. Making it 80 percent write, 10 to 15 for linear read, and 5’ish (please note this is S.W.A.G on specific size and should not be taken as gospel) for random read requests. This addition of random read buffer means that when you quickly ‘scrub’ through a data stream the delay(s) are minimized and the overall performance is noticeably improved compared to both typical drives and the non-AI SkyHawk variant. Remember ‘AI’ is just a fancy word for artificial algorithms that can better mimic the needs of the end user. It’s not ‘thinking’, it is just following very advanced IF/THEN chains (for example).
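To put some rough numbers to that carve-up, here is a back of the envelope sketch. The 512 MiB cache size, the exact 80 / 15 / 5 split, the equal-sized slices, and every name in it are assumptions (the percentages are our S.W.A.G from above, not a Seagate specification).

```python
MIB = 1024 * 1024


def carve_cache(cache_bytes, write_streams, read_streams,
                write_pct=80, linear_read_pct=15):
    """Split the cache into write, linear-read and random-read pools,
    then hand every stream an equal slice of its pool."""
    write_pool = cache_bytes * write_pct // 100
    linear_pool = cache_bytes * linear_read_pct // 100
    random_pool = cache_bytes - write_pool - linear_pool  # the leftover 5'ish
    return {
        "write_slice": write_pool // max(write_streams, 1),
        "linear_read_slice": linear_pool // max(read_streams, 1),
        "random_read_pool": random_pool,
    }


# Assumed worst case: 32 streams in, 6 streams out, 512 MiB of cache.
plan = carve_cache(512 * MIB, write_streams=32, read_streams=6)
for name, size in plan.items():
    print(f"{name}: {size / MIB:.1f} MiB")
```

Under these assumptions each write stream gets somewhere in the neighborhood of 13 MiB to play with. That is not a lot, which is why the buffer has to be flushed in small and very frequent bursts.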
In either case, with ‘only’ 80 percent of the RAM buffer to work with what the SkyHAWK AI does is a couple interesting things. First, while it was designed and sold to be able to handle ‘up to’ 32 concurrent 4K data streams worth of writes… it is actually only writing one stream’s worth of data to the platters in real-time. In fact all write requests (pardon the pun) streaming in are going to the RAM. This “only one at a time” write methodology is done so as to eliminate read performance robbing file fragmentation. Furthermore, it is not a linear stream1 then stream2 then 3… then stream32 layout on the platters. Instead, taking a page from Super Parity, each stream gets its own set of tracks (basically 1/32nd of the total capacity). Which in turn means that, since the SkyHAWK AI has a 512MB RAM cache, it is constantly bursting extremely small chunks worth of data to flush a given stream’s data to its own track. Then it moves to the next track and flushes the next chunk of RAM cache while the first chunk is starting to refill (relatively slowly). Rinse and repeat and you have a delicate ballet dance of constant writes. All of which are occurring much, much faster than the total amount of data streaming in.
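This ballet is easier to picture as a tiny simulation. Every number here is a made-up assumption for illustration (2 MB/s per incoming stream, a 200 MB/s platter write speed, a 10 ms hop between stream regions, a roughly 1TB drive), as are the names. What it shows is the principle: one stream’s buffer is flushed at a time, into that stream’s own region, while all the other buffers slowly refill.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    region_start_mb: float    # start of this stream's private platter region
    written_mb: float = 0.0   # data already flushed into that region
    buffered_mb: float = 0.0  # data waiting in this stream's RAM slice


def simulate(streams=32, ingest_mb_s=2.0, flush_mb_s=200.0,
             seek_s=0.010, capacity_mb=1_000_000.0, rounds=100):
    """Round-robin burst flush; returns the fullest any RAM slice ever got."""
    region_mb = capacity_mb / streams  # each stream owns 1/Nth of the drive
    pool = [Stream(region_start_mb=i * region_mb) for i in range(streams)]
    peak = 0.0
    for _ in range(rounds):
        for s in pool:
            burst = s.buffered_mb
            elapsed = seek_s + burst / flush_mb_s  # hop to region, then write
            s.written_mb += burst                  # lands contiguously in its region
            s.buffered_mb = 0.0
            for other in pool:                     # everyone keeps streaming in
                other.buffered_mb += ingest_mb_s * elapsed
                peak = max(peak, other.buffered_mb)
    return peak, pool


peak, pool = simulate()
print(f"fullest slice seen: {peak:.2f} MB")
print(f"stream 0 has written {pool[0].written_mb:.1f} MB to its own region")
```

With these assumed figures no slice ever holds more than about a megabyte, as the platters drain the buffers far faster than 32 streams can fill them. Slow the flush rate down towards the total ingest rate (64 MB/s here) and the slices balloon, which is exactly the scenario the headroom exists to avoid.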
We give that last caveat as if it was literally a 1:1 dance things would go off the rails when it came time to read back previously recorded data while still writing the new data stream to the platters. So this flush the buffer dance can never go above 80 percent of what it is actually capable of under ‘worst case’ scenario (aka the platters are just about full to 100 percent capacity). The same holds true for the read side of the equation… albeit with some caveats. Basically it will only use RAM as a read buffer if it has to. So if you are not streaming 32 streams in and 6 out… there is a good chance that most of the reads will be direct off the platter (after the correct ‘guess’ data blocks are read that is). It is just that it can keep up with 32 + 6 by using the 20 percent for read ahead buffering. It just does not have to. Once again this is where its advanced algorithms come into play and allow the SkyHAWK AI to be ‘smart’. Smart enough to know when to break its own r/w rules, and when not to. All in order to keep performance high. Thus if you stick this in as a “D drive” in your PC how it reacts will be more in line with how an IronWolf will do things. Stick it in a large storage server array and it will react somewhat like an EXOS X. It will never be as good as either… but it will not be a dumpster fire like the non-AI variant could sometimes be.
The one downside to these advanced algorithms is the fact that they understand that A/V streams all come encoded with their own ECC. As such… if it comes down to it, it will push a known bad bit so as to keep r/w processing flowing. The same is true of writes. It will simply make note of the uncorrectable bit error (typically caused by a then unknown flaw on the platter) and fix it the next time it is read… assuming it has the free cycles to do so. If it does not, it will not. Nor will it cause an 8/30/etc second ‘freeze’ like event that can happen on typical drives that do have Time Limited Error Recovery set to more than “lol nope, don’t care”… which is basically all of the other Guardian models (even if it is only a difference of 8 seconds). Basically the SkyHAWK AI does not really care about single bit errors. It will not drop what it is doing and try and recover that bit to the detriment of everything else. Instead, the algorithms will try their best but only so long as it does not impact real-time I/O performance… as real time I/O is more important than single bit errors. That is a complete 180 compared to all others… and is less than optimal when dealing with data files that do not come with their own ECC. Like say documents. Tax returns. Game files. OS files. App files, and basically anything not A/V related!
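The difference in attitude boils down to how big a retry budget the firmware is willing to spend on a bad read. Below is a minimal sketch of that idea. The function names, the 10 ms cost per retry, and the 20 ms vs 8,000 ms budgets are hypothetical values picked for illustration and are not taken from any Seagate documentation.

```python
def make_flaky_sector(failures_before_success, payload=b"frame"):
    """Build a fake sector read that fails a fixed number of times, then works."""
    state = {"calls": 0}

    def attempt_read():
        state["calls"] += 1
        return payload, state["calls"] > failures_before_success

    return attempt_read


def read_with_budget(attempt_read, budget_ms, retry_cost_ms=10):
    """Retry a bad read only while the time budget allows, then give up
    and hand back whatever was read, flagged as unverified."""
    spent_ms = 0
    data, ok = attempt_read()
    while not ok and spent_ms + retry_cost_ms <= budget_ms:
        spent_ms += retry_cost_ms
        data, ok = attempt_read()
    return data, ok, spent_ms


# Surveillance-style policy: a tiny budget, keep the streams flowing.
print(read_with_budget(make_flaky_sector(3), budget_ms=20))
# (b'frame', False, 20)

# NAS/desktop-style policy: a multi-second budget, get the data right.
print(read_with_budget(make_flaky_sector(3), budget_ms=8000))
# (b'frame', True, 30)
```

Same marginal sector, two very different outcomes. The first policy returns on schedule with a possibly bad chunk (which a video decoder can usually shrug off), while the second stalls a little longer but returns good data, which is what you want for a tax return.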
In either case, if what the SkyHAWK AI does – in real time – sounds like a complicated dance between performance and reliability… that is because it is complicated. Compared to most hard drives this firmware creates an insane amount of overhead the onboard SoC has to deal with. Thankfully, modern ARM multi-core controllers are up to the task… it is just that few HDDs actually stress the controller anywhere close to what a SkyHAWK AI can, will and does. Routinely. Needless to say, the SkyHAWK AI series is rather unique and actually is worthy of the AI moniker. Furthermore it is proof positive that just because the hardware may be similar to (or even the same as) other models, the performance can radically differ from scenario to scenario.