Drive Performance - TMR

One of the ways that disk drive engineers study performance is to measure TMR, or Track MisRegistration. It's really quite a deep subject, and servo engineers know more about it than I do, but I think I can give you an "executive" overview and perhaps add a little bit to your understanding of how disk drives work.

First, I'll quickly review the basics of track layout. Data tracks are laid down in concentric circles in a hard disk drive, not in a spiral as on an LP record or a CD. In a modern "embedded servo" drive, there are anywhere from about 70 to about 120 "servo bursts" per revolution, on each track. These servo bursts contain the special data that the head reads and the servo system firmware interprets to determine which track the head is closest to, and how far off "track center" the head's center is. For the space between servo bursts, the system's inertia and the current supplied to the actuator by the servo system keep the head in position, or "close enough", and that's where TMR comes in.
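To put rough numbers on the gap between servo bursts, here is a small Python sketch. The 7200 RPM spindle speed is an assumption borrowed from the example later in this essay, and the function names are made up for illustration.

```python
def servo_sample_rate_hz(rpm: float, bursts_per_rev: int) -> float:
    """Servo bursts passing under the head per second."""
    revs_per_second = rpm / 60.0
    return revs_per_second * bursts_per_rev


def burst_interval_us(rpm: float, bursts_per_rev: int) -> float:
    """Time between consecutive servo bursts, in microseconds."""
    return 1_000_000.0 / servo_sample_rate_hz(rpm, bursts_per_rev)


if __name__ == "__main__":
    # 70 and 120 bursts per revolution are the two ends of the range quoted above.
    for bursts in (70, 120):
        rate = servo_sample_rate_hz(7200, bursts)
        gap = burst_interval_us(7200, bursts)
        print(f"{bursts} bursts/rev: {rate:.0f} samples/s, {gap:.1f} us between bursts")
```

Under those assumptions the servo system gets a fresh position sample roughly every 70 to 120 microseconds, and the head coasts on inertia and actuator current in between.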

TMR, Track MisRegistration, is just a fancy phrase for "error". It refers to where the head's read gap or write gap is, relative to where you want it to be. For convenience, I'll just say "head" from now on, although I'll actually be talking about the spots on the head where reading and writing happen. And in this essay, I'll be talking about "static" TMR, the errors that occur after the head has been "following" a given track for long enough that errors resulting from seeking from one track to another have faded out. I won't be talking about the causes; I'll just be talking about the errors themselves and how engineers think about them.

There are several flavors of TMR. First, there are "static repeatable runout" and "static non-repeatable runout", usually referred to as RRO and NRRO. Those refer to the deviation of the head from the theoretical perfect circle of the track. In this case, static doesn't mean the disk isn't spinning; it means the head isn't seeking. RRO is "phase locked", that is, the head is off track by the same amount at the same point on the disk, each revolution. So we can talk about "once around" or "5 times per rev" RRO, which for a drive spinning at 7200 RPM (120 revolutions per second) would occur at 120 Hz and 600 Hz, respectively. NRRO has characteristic frequencies, but they aren't locked to a particular location on the disk. We speak of "506 Hz" and "570 Hz" NRRO frequencies. (Those happen to be two frequencies of particular interest to a certain Viking engineer.)
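The frequency arithmetic here is simple enough to sketch in a few lines of Python. The function names and the 0.5 Hz tolerance are assumptions for illustration only. Note the limits of the second function: a frequency that is not a whole multiple of the spindle speed can't be phase locked to the disk, but landing on a multiple doesn't by itself prove that an error is repeatable.

```python
def rro_frequency_hz(rpm: float, harmonic: int) -> float:
    """Frequency of 'harmonic times per rev' runout at a given spindle speed."""
    spindle_hz = rpm / 60.0  # revolutions per second
    return harmonic * spindle_hz


def is_spindle_harmonic(freq_hz: float, rpm: float, tol_hz: float = 0.5) -> bool:
    """True if freq_hz sits within tol_hz of a whole multiple of the spindle speed.

    tol_hz is an arbitrary illustrative tolerance, not a real spec value.
    """
    spindle_hz = rpm / 60.0
    nearest = round(freq_hz / spindle_hz)
    return nearest >= 1 and abs(freq_hz - nearest * spindle_hz) <= tol_hz


if __name__ == "__main__":
    print(rro_frequency_hz(7200, 1))       # once-around RRO
    print(rro_frequency_hz(7200, 5))       # 5-times-per-rev RRO
    for f in (506, 570, 600):
        print(f, is_spindle_harmonic(f, 7200))
```

At 7200 RPM, 506 Hz and 570 Hz fall between the 4th and 5th spindle harmonics (480 Hz and 600 Hz), which fits their description as NRRO frequencies.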

Then there are several factors that as a mechanical engineer I just lump together as "servo TMR". The servo system has white noise in it, several resonant frequencies, and various filters, including several "notch filters" to hide the effect of mechanical problems. The notches add certain problems of their own.

Those TMR factors are important to an engineer, but they don't have anything directly to do with your data. There are three main TMR factors that are measured using data. They are "Write TMR", "Read TMR" and "Write to Read TMR". (Write to Read TMR is often called just TMR, because engineers tend to give long, clumsy but accurate names to things, then use the same handy short name for several different phenomena, just to confuse themselves and others.)

Let's think about what happens when data is written and read back. As a head passes a servo burst, the "tracking error" is measured and a correction factor is calculated and the current to the voice coil is adjusted a little bit, if necessary. Then we write a sector that's between this burst and the next. The data is written where the head is, generally not on the exact center of the track. The distance between the theoretical track center and the center of the actual data is called "Write TMR". Some time later, the head tries to read that data. The head passes the servo burst, the tracking error is calculated, and the head tries to read the sector that was written. But the head is generally not only not on the center of the track, it's generally off by a different amount than it was when the data was written. The distance between the theoretical track center and the center of the read element is called "Read TMR". The total error, the distance between where the data was written and where the head is when the data is read, is called "Write to Read TMR". Write to Read TMR is very important, because it represents the sum of most of the things that can cause data-handling problems inside a disk drive.
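The three definitions can be written down as one-liners. The following is a minimal Python sketch, assuming positions are measured across the track in microinches with a sign for direction. The function names and the sample positions are invented for illustration.

```python
def write_tmr(track_center: float, write_pos: float) -> float:
    """Offset of the written data from the theoretical track center."""
    return write_pos - track_center


def read_tmr(track_center: float, read_pos: float) -> float:
    """Offset of the read element from the theoretical track center."""
    return read_pos - track_center


def write_to_read_tmr(write_pos: float, read_pos: float) -> float:
    """Offset of the read element from where the data actually landed."""
    return read_pos - write_pos


if __name__ == "__main__":
    center = 0.0      # theoretical track center
    written = 3.0     # hypothetical: data was written 3 microinches to one side
    reading = -2.0    # hypothetical: head is 2 microinches to the other side on read
    print(write_tmr(center, written))
    print(read_tmr(center, reading))
    print(write_to_read_tmr(written, reading))
```

With these made-up positions the write and read errors are on opposite sides of the track, so the Write to Read TMR (5 microinches in magnitude) is larger than either error alone. In this signed form it is simply Read TMR minus Write TMR.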

If W/R TMR is so important, how can it be measured? After all, it's defined in terms of data, not servo information. Well, it happens that if you write known data patterns and calibrate the results, you can measure parameters that allow you to calculate W/R TMR. What is written is low-frequency information and high-frequency information. The input data isn't "all 0's" and "all 1's" because the PRML coding won't translate that into constant frequency, but the idea is to get the effect of 0's and 1's. If you have constant-frequency information, it's easy (or so I'm told) to measure its amplitude accurately, and disk drive read channels do measure the amplitude (for setting AGC). It's also possible to command the servo system to move the heads X% off the track center (where X% is a percentage of the track-to-track distance). By reading the data amplitude at various amounts of deliberate off-track, you can calibrate the amplitude against distance. And you can do that even when all the errors that you're trying to measure are happening, if you just take enough measurements under carefully controlled conditions and find the average.
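As a rough illustration of the calibration idea, here is a Python sketch that averages repeated amplitude readings at each deliberate off-track setting, then interpolates to turn a later amplitude reading back into an off-track estimate. Everything specific in it is an assumption: the function names, the sample readings, the use of straight-line interpolation, and the premise that amplitude falls steadily as the head moves off track. A real drive test procedure is surely more involved.

```python
from statistics import mean


def build_calibration(samples):
    """Average repeated readings at each commanded off-track setting.

    samples maps off-track (% of track spacing) to a list of amplitude readings.
    Returns a list of (offtrack_pct, mean_amplitude) sorted by off-track.
    """
    return sorted((off, mean(vals)) for off, vals in samples.items())


def estimate_offtrack(calibration, amplitude):
    """Estimate off-track magnitude (%) from an amplitude by linear interpolation.

    Assumes mean amplitude decreases as off-track increases.
    """
    for (off_a, amp_a), (off_b, amp_b) in zip(calibration, calibration[1:]):
        if amp_b <= amplitude <= amp_a:
            if amp_a == amp_b:
                return float(off_a)
            frac = (amp_a - amplitude) / (amp_a - amp_b)
            return off_a + frac * (off_b - off_a)
    raise ValueError("amplitude outside calibrated range")


if __name__ == "__main__":
    # Hypothetical normalized amplitudes at 0%, 10% and 20% deliberate off-track.
    readings = {
        0: [1.00, 1.02, 0.98],
        10: [0.90, 0.88, 0.92],
        20: [0.70, 0.72, 0.68],
    }
    cal = build_calibration(readings)
    print(estimate_offtrack(cal, 0.80))
```

Averaging is what lets the calibration survive the very errors being measured: any single reading is disturbed by TMR, but the mean at each setting settles down if enough readings are taken. Note that this sketch recovers only how far off track the head is, not in which direction.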

OK, so that's what TMR is. Now, what does it mean to the user? If total W/R TMR is too large, you may not be able to read the data that you wrote. If Write TMR is too large, you may erase the data that was written on the next track. You aren't likely to contemplate either of those eventualities with perfect equanimity.

The servo system, as I said, is measuring the head's deviation from the track centerline at each servo burst and continuously correcting the errors that it finds. If the head drifts too far off center for it to be safe to write, the servo system posts a "write inhibit", telling the system not to write data until further notice. If the head drifts even further off center, it posts a "read inhibit", telling the system that if it reads, it'll likely not be good data. As the head returns toward track center, if its position and velocity are both within spec limits for a spec length of time (a number of servo bursts), "read inhibit" and, using tighter specs, "write inhibit" are removed. Writing is controlled by more stringent specs than reading because if you write over some other existing data, there is no recovery at this low level. The data is just gone and you don't even know it! (Again, at this very low level.)
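The set-and-clear behavior described above can be sketched as a small state machine that is updated once per servo burst. This is a minimal Python sketch, not anyone's actual firmware: the class name, the clearing limits, the velocity limits, and the burst counts are all hypothetical, and velocity is approximated as the change in position from one burst to the next.

```python
from dataclasses import dataclass


@dataclass
class InhibitGate:
    set_limit_pct: float     # off-track magnitude that sets the inhibit
    clear_limit_pct: float   # position must be inside this to count toward clearing
    clear_velocity: float    # hypothetical velocity limit, % of track per burst
    clear_bursts: int        # consecutive good bursts needed to clear
    inhibited: bool = False
    good_count: int = 0

    def update(self, offtrack_pct: float, velocity: float) -> bool:
        """Process one servo burst; return True while the inhibit is posted."""
        if abs(offtrack_pct) > self.set_limit_pct:
            self.inhibited = True
            self.good_count = 0
        elif self.inhibited:
            in_spec = (abs(offtrack_pct) <= self.clear_limit_pct
                       and abs(velocity) <= self.clear_velocity)
            self.good_count = self.good_count + 1 if in_spec else 0
            if self.good_count >= self.clear_bursts:
                self.inhibited = False
                self.good_count = 0
        return self.inhibited


def simulate(positions, write_gate, read_gate):
    """Feed per-burst off-track positions (%) through both gates."""
    results, prev = [], positions[0]
    for pos in positions:
        velocity = pos - prev  # change since the previous burst
        results.append((write_gate.update(pos, velocity),
                        read_gate.update(pos, velocity)))
        prev = pos
    return results


if __name__ == "__main__":
    write_gate = InhibitGate(10, 8, 2.0, 3)   # tighter, slower to clear
    read_gate = InhibitGate(15, 12, 6.0, 2)   # looser, quicker to clear
    for flags in simulate([0, 5, 12, 16, 11, 7, 6, 6, 6, 6], write_gate, read_gate):
        print(flags)
```

In this made-up excursion, write inhibit is posted first and removed last, and read inhibit covers only the worst part of the excursion, which is the ordering the tighter write specs are meant to produce.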

So what are the spec limits? I won't talk about velocity and time limits; those both vary greatly between different companies and different products. Distance off track is usually spoken of as a percentage of the distance between track centers. Write inhibit is usually set if the head gets more than about 10% off center, and read inhibit is set at 15% or so. The new MR heads have separate read and write elements, so the old tape mantra of "write wide, read narrow" has been implemented, and there's a tendency these days to open up the read limits a bit.

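For the newest drives, with track densities in the range of 10,000 TPI (tracks per inch), the distance between track centers is about 100 microinches, so the 10% write limit implies that the servo system keeps the heads on track within plus or minus 10 microinches for writing. But those numbers are for safety. For performance, you have to be better. If Write to Read TMR is less than 5% of the track spacing 99.7% of the time, you'll have a top performing drive. That's 5 microinches between where the data was written and where it's read, so yes, that implies staying within 2.5 microinches of track center on both the write and the read!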
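The arithmetic behind those figures can be checked with a short Python sketch. The function names are invented for illustration, and the final step is one reading of the 2.5 microinch figure: it assumes the 5% Write to Read budget is split evenly between the write error and the read error in the worst case, where they fall on opposite sides of track center.

```python
def track_pitch_microinches(tpi: float) -> float:
    """Distance between track centers, in microinches, for a given tracks-per-inch."""
    return 1_000_000.0 / tpi


def offtrack_limit_microinches(tpi: float, percent: float) -> float:
    """A percentage-of-track limit converted to microinches."""
    return track_pitch_microinches(tpi) * percent / 100.0


if __name__ == "__main__":
    tpi = 10_000
    print(track_pitch_microinches(tpi))           # track spacing
    print(offtrack_limit_microinches(tpi, 10))    # write inhibit limit
    wr_budget = offtrack_limit_microinches(tpi, 5)
    print(wr_budget)                              # total Write to Read budget
    print(wr_budget / 2)                          # assumed even split, write vs. read
```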


Copyright 1999 by Albert Dayes and John Treder