Chapter 7

Allan deviation and clock stability

You built a clock. You logged its fractional frequency $y(t)$ for a week. Now your collaborator asks the only question that matters: how stable is it? The obvious answer — "compute the standard deviation" — is wrong, and not by a little: for the noises real clocks have, the standard deviation is not even a number. This chapter is about the minimal surgical fix, the Allan deviation $\sigma_y(\tau)$, and about the remarkable fact that its shape versus averaging time is a fingerprint of the noise type.

Throughout, $y(t) = \delta\nu(t)/\nu_0$ is the fractional frequency of chapter 6 — dimensionless, e.g. $y = 10^{-15}$ means the clock currently ticks a part in $10^{15}$ fast. Its one-sided PSD $S_y(f)$ (ordinary frequency $f$ in Hz) is the language we already speak; the Allan deviation will turn out to be a second view of the same $S_y(f)$.

Accuracy is not stability

Before doing any statistics, be sure which question you are answering. "How good is this clock?" hides two independent questions. Split the frequency of a real oscillator as

$$ \nu(t) \;=\; \nu_0\,\big[\,1 + \epsilon + y(t)\,\big], $$

where $\nu_0$ is the true (nominal) frequency the clock is supposed to produce, $\epsilon$ is a constant systematic offset, and $y(t)$ is the fluctuating part, defined so that it averages to zero. Accuracy is a statement about $\epsilon$: how far the clock's average frequency sits from the truth. Stability is a statement about $y(t)$: how much the frequency wanders about its own average, wherever that average happens to be. These are independent virtues — a clock can be any combination of stable/unstable and accurate/inaccurate — and conflating them is the classic beginner's error in time and frequency.

A dartboard makes the distinction impossible to forget. Every frequency measurement is a dart; the bull's-eye is the true frequency. Bias $\epsilon$ moves the whole cluster off-center; instability spreads the cluster out. Four throwers, four reputations: a tight cluster away from the bull's-eye (stable but not accurate — a good quartz oscillator years after its last calibration), a loose cluster away from it (neither), a loose cluster centered on it (accurate on average but not stable), and a tight cluster dead center (stable and accurate — the clock everyone wants).
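The two dartboard numbers can be sketched in a few lines. This is only an illustration under loud assumptions: the helper name `dartboardSummary` and both sets of readings are made up, and the ordinary sample spread is used as the "instability" only because these toy readings behave like white noise.

```javascript
// Hypothetical helper: summarize readings the way the dartboard does.
// `readings` are fractional-frequency measurements, so the bull's-eye
// (the true frequency) sits at 0.
function dartboardSummary(readings) {
  const n = readings.length;
  const mean = readings.reduce((s, v) => s + v, 0) / n;
  // Spread about the cluster's own mean, wherever that mean happens to be.
  const variance = readings.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1);
  return { bias: mean, spread: Math.sqrt(variance) };
}

// Two illustrative throwers (made-up numbers, in units of 1e-12).
const tight = dartboardSummary([40.1, 39.9, 40.0, 40.2, 39.8]); // stable, not accurate
const loose = dartboardSummary([-30, 25, 10, -20, 15]);         // accurate on average, not stable

console.log("tight cluster:", tight);
console.log("loose cluster:", loose);
```

The bias answers the accuracy question and the spread answers the stability question; neither number can be inferred from the other.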

The dartboard: bias moves the cluster, instability spreads it

Each dart is one measurement of the clock's frequency (for the wandering noise types, one gate average of the record on the right); the bull's-eye is the true frequency, and bias acts along the horizontal axis. Drag $\epsilon$ and $\sigma$, or jump straight to the four classic quadrants. Then switch the character of $y(t)$ to random walk or drift and watch the cluster smear and creep — the board and the trace tell the same story.

Now look at what the statistic this chapter builds actually sees. The Allan deviation (two sections down) is made of differences of adjacent averages, so a constant $\epsilon$ cancels exactly: $\sigma_y(\tau)$ measures stability and is completely blind to accuracy. That blindness is a feature — you can measure a clock's stability by beating it against itself or an equal, without owning anything more accurate — and a trap: a beautifully flat ADEV at $10^{-16}$ says nothing whatsoever about whether the frequency is right. The dartboard's stable-but-not-accurate thrower has a superb Allan deviation.

The asymmetry between the two virtues sets the design rule of metrology. Accuracy is fixable after the fact: measure $\epsilon$ once against a better reference and correct for it — that is calibration. Instability is not fixable: you cannot subtract a random wander you cannot predict. Hence "stability first", and hence primary standards: the atomic-clock loop of chapter 10 ties $\nu_0$ to an atomic transition, so that $\epsilon$ is bounded by physics rather than by the date of the last calibration.

Why the ordinary variance fails

Chapter 5 ended on a warning: for noise steeper than white, the variance is not a property of the process but of how long you watched it. Clocks are where that warning bites hardest, because every real oscillator eventually shows flicker ($S_y \propto 1/f$) and random-walk ($S_y \propto 1/f^2$) frequency noise. The variance is the integral of $S_y(f)$ over all $f$, and for these spectra the integral diverges at the low-frequency end. A longer record admits lower Fourier frequencies, so the sample variance just keeps growing.

Watch it happen. Below, three synthetic clocks tick once per second: one with white FM noise, one with flicker FM, one with random-walk FM (each generated once and then analyzed cumulatively). For each we plot the sample standard deviation of the first $N$ points, as a function of $N$.

The sample standard deviation vs how long you measured

Left: the three frequency records themselves, offset for clarity and each normalized to unit RMS over the full record. White FM stays a band of fixed width, flicker FM wanders on every timescale, and random-walk FM walks away. Right: the sample standard deviation of the first $N$ points of those same records. White FM converges by a few hundred samples. Flicker FM never settles: its variance grows like $\ln N$, so the curve creeps upward forever. Random-walk FM grows as $\sqrt{N}$ without bound (the normalization is why all three meet near 1 at the far right). Press "new sample ↻" to convince yourself the shapes are generic, not a fluke of one realization.

So "the variance of my clock" is simply not a number for flicker or random-walk noise — it depends on how long you cared to measure. Quote it and the only thing you have characterized is your patience. Any honest stability measure must (a) converge for the noises clocks actually have, and (b) admit that stability depends on time scale: a clock can be superb over one second and terrible over a day. Both requirements point the same way: measure fluctuations at a chosen time scale $\tau$, and make the statistic blind to the slow divergent stuff below $1/\tau$.
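Here is a rough sketch of that experiment for two of the three noises. Everything in it is an assumption made for illustration: `seededUniform` is a small stand-in generator so the run is repeatable, the white samples are uniform rather than Gaussian, and flicker FM is left out because generating it takes more machinery.

```javascript
// Stand-in seeded generator (illustrative, not from allan.js): uniform in [0, 1).
function seededUniform(seed) {
  let state = seed | 0;
  return function () {
    state = (state + 0x6D2B79F5) | 0;
    let t = Math.imul(state ^ (state >>> 15), state | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Sample standard deviation of the first n points of x.
function stdOfFirst(x, n) {
  let mean = 0;
  for (let i = 0; i < n; i++) mean += x[i];
  mean /= n;
  let ss = 0;
  for (let i = 0; i < n; i++) ss += (x[i] - mean) ** 2;
  return Math.sqrt(ss / (n - 1));
}

const rand = seededUniform(12345);
const N = 10000;
const whiteFM = new Float64Array(N);
const randomWalkFM = new Float64Array(N);
let walk = 0;
for (let i = 0; i < N; i++) {
  whiteFM[i] = rand() - 0.5; // white FM: independent samples
  walk += rand() - 0.5;      // random-walk FM: running sum of white steps
  randomWalkFM[i] = walk;
}

for (const n of [100, 1000, 10000]) {
  console.log(n, stdOfFirst(whiteFM, n), stdOfFirst(randomWalkFM, n));
}
```

The white-FM column should settle quickly, while the random-walk column keeps growing with the length of the record you feed it.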

The two-sample fix

Here is what a frequency counter actually gives you: the average fractional frequency over consecutive gate intervals of length $\tau$,

$$ \bar y_k \;=\; \frac{1}{\tau}\int_{k\tau}^{(k+1)\tau} y(t)\,dt . $$

David Allan's 1966 move: instead of the variance of the $\bar y_k$ about their mean, take the mean square of the differences of adjacent pairs,

$$ \sigma_y^2(\tau) \;=\; \tfrac12\, \big\langle (\bar y_{k+1} - \bar y_k)^2 \big\rangle . $$

That is the whole definition. Differencing adjacent averages is the entire trick: a constant offset in $y$ cancels exactly, and slow wandering (the divergent low-frequency content) cancels almost exactly, because two neighboring gates see nearly the same value of it. The two-sample difference is a high-pass filter in disguise (the section "Allan meets the PSD" makes that literal), and it is precisely the divergent DC end that it removes. The factor $\tfrac12$ is a normalization: for white FM the $\bar y_k$ are independent, so $\langle(\bar y_{k+1}-\bar y_k)^2\rangle = 2\,\mathrm{var}(\bar y)$, and the $\tfrac12$ makes the Allan variance agree with the classical variance of the gate averages in the one case where the classical variance is honest. The square root $\sigma_y(\tau)$ is the Allan deviation, the "ADEV" on every clock plot you will ever see.
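The definition translates almost word for word into code. The sketch below is a plain, non-overlapping version written for clarity; the function name `allanDeviationFromDefinition` and the toy record are illustrative assumptions, not taken from allan.js.

```javascript
// Non-overlapping Allan deviation straight from the two-sample definition.
// y: fractional-frequency samples spaced tau0 apart; m: samples per gate (tau = m * tau0).
function allanDeviationFromDefinition(y, m) {
  const gates = Math.floor(y.length / m);
  if (gates < 2) return NaN; // need at least one adjacent pair
  const ybar = [];
  for (let k = 0; k < gates; k++) {
    let s = 0;
    for (let i = 0; i < m; i++) s += y[k * m + i];
    ybar.push(s / m); // gate average
  }
  let sumSq = 0;
  for (let k = 0; k + 1 < gates; k++) sumSq += (ybar[k + 1] - ybar[k]) ** 2;
  return Math.sqrt(sumSq / (2 * (gates - 1))); // the factor 1/2, then the square root
}

// A fixed toy record (arbitrary numbers) and the same record with a constant offset.
const record = [0.3, -0.1, 0.4, 0.0, -0.2, 0.5, 0.1, -0.3];
const shifted = record.map((v) => v + 5);

console.log(allanDeviationFromDefinition(record, 2));
console.log(allanDeviationFromDefinition(shifted, 2)); // same value: the offset cancels
```

Adding any constant to the record leaves the result unchanged, which is the cancellation argument made concrete.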

The picture below is the definition drawn on the data: the time axis is chopped into gates of length $\tau$, each gate is collapsed to its average $\bar y_k$ (the staircase), and $\sigma_y(\tau)$ is the RMS step height between adjacent stairs, divided by $\sqrt2$.

Chop into τ-gates, average each, compare neighbors

White FM samples (faint) at $\tau_0 = 1$ s, and their gate averages (staircase). Drag $\tau$: longer gates average the white noise down, and the stair steps shrink as $1/\sqrt\tau$ — compare the measured $\sigma_y(\tau)$ readout with the white-FM prediction.

One practical refinement, free of new ideas: with $N$ samples you can slide the pair of gates along one sample at a time instead of jumping gate-by-gate, reusing the data in every position — the overlapping Allan deviation, which is what allan.js (and every clock-metrology package) computes, because it squeezes far more statistical confidence out of the same record without changing the expected value.

The $\sigma_y(\tau)$ fingerprint

Now the payoff. Compute $\sigma_y(\tau)$ for every $\tau$ from one gate up to a quarter of your record, and plot it log–log. Write each power-law patch of the frequency-noise spectrum as $S_y(f) = h_\alpha f^{\alpha}$ — chapter 5's $h/f^\alpha$ levels, with the subscript naming the slope — and each noise type produces a straight line with its own characteristic slope. Every PSD in the table is the fractional-frequency one, $S_y(f)$ — the "PM" rows are phase noises seen through chapter 6's identity $S_y = (f/\nu_0)^2 S_\varphi$, which is how a flat phase spectrum becomes $h_2 f^2$ here:

noise type	$S_y(f)$	ADEV slope	typical physical origin
white PM	$h_2 f^{2}$	$\tau^{-1}$	readout/detection noise on the phase
flicker PM	$h_1 f^{1}$	$\tau^{-1}$ (with a log factor)	electronics, amplifiers
white FM	$h_0$	$\tau^{-1/2}$	shot noise, quantum projection noise
flicker FM	$h_{-1} f^{-1}$	$\tau^{0}$ (flat floor)	the oscillator's intrinsic $1/f$
random-walk FM	$h_{-2} f^{-2}$	$\tau^{+1/2}$	temperature, drifting environment
linear drift $\dot y = D$	(deterministic)	$\tau^{+1}$, $\;\sigma_y = D\tau/\sqrt2$	aging, creep

Read a measured ADEV curve from left to right and you are reading the clock's biography: which noise dominates at each time scale. The demo below mixes all five ingredients into one $2^{16}$-sample record ($\tau_0 = 1$ s) and computes the overlapping ADEV live. The dashed lines are the theoretical contribution of each ingredient at its current slider setting; the solid curve is the ADEV actually measured from the synthetic record. Raise each slider and watch its regime lift the measured curve — the curve always hugs whichever dashed line is on top.

The headline plot: σ_y(τ) as a noise fingerprint

Left: the fractional-frequency record (shown averaged into 64 s bins so the slow noises are visible). Right: its overlapping Allan deviation, with dashed theory lines for each contribution: PM = white PM ($\propto 1/\tau$), WFM = white FM ($\propto 1/\sqrt\tau$), FFM = flicker FM (flat), RW = random-walk FM ($\propto \sqrt\tau$), drift ($\propto \tau$). Try turning one slider far up and the others down to isolate each slope; then push the drift slider and watch the $\tau^{+1}$ ramp swallow the long-$\tau$ end.
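Reading a slope off two points of a curve can be sketched as follows. This is a minimal illustration, not part of allan.js: `logLogSlope` and `nearestNoiseType` are hypothetical helper names, and the lookup simply encodes the slope column of the table above.

```javascript
// Local log-log slope of an ADEV curve between two averaging times.
function logLogSlope(tau1, sigma1, tau2, sigma2) {
  return Math.log(sigma2 / sigma1) / Math.log(tau2 / tau1);
}

// Nearest entry in the slope table. White PM and flicker PM share the -1
// slope, so plain ADEV can only report "PM" for that row.
function nearestNoiseType(slope) {
  const table = [
    { slope: -1, label: "white or flicker PM" },
    { slope: -0.5, label: "white FM" },
    { slope: 0, label: "flicker FM" },
    { slope: 0.5, label: "random-walk FM" },
    { slope: 1, label: "linear drift" },
  ];
  let best = table[0];
  for (const row of table) {
    if (Math.abs(row.slope - slope) < Math.abs(best.slope - slope)) best = row;
  }
  return best.label;
}

// Example: two points on a hypothetical curve that falls as 3e-16 / sqrt(tau).
const exampleSlope = logLogSlope(10, 3e-16 / Math.sqrt(10), 1000, 3e-16 / Math.sqrt(1000));
console.log(exampleSlope, nearestNoiseType(exampleSlope));
```

On measured data the slope is never exactly a table value, so a nearest-match lookup like this is only a first guess; where two noises cross, the local slope sits between their entries.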

Two footnotes the fingerprint needs

First: white PM and flicker PM both give $\sigma_y \propto \tau^{-1}$, so plain ADEV cannot tell them apart; the modified Allan deviation (MDEV), which averages the phase inside each gate before differencing, separates them ($\tau^{-3/2}$ vs $\tau^{-1}$). Second: the $\tau^{+1}$ drift line is not noise at all — it is a deterministic ramp, and the right response is to fit and subtract it, not to admire it in the ADEV.

Reading a real clock plot

Here is the shape you will see in essentially every optical-clock paper. At short $\tau$ the curve falls as $\sigma_y(\tau) = a/\sqrt{\tau}$: white FM, typically quantum projection noise from interrogating a finite number of atoms; a good lattice clock might have $a = 3\times10^{-16}$ at 1 s. Averaging longer keeps helping… until it doesn't: the curve flattens onto a flicker floor, say $1\times10^{-18}$, set by whatever $1/f$ process the clock owns (blackbody shifts, lattice light shifts, the cavity). With these numbers the white-FM line meets the floor at $\tau = (a/\sigma_{\mathrm{floor}})^2 \approx 10^5\,$s. Past the floor, curves often turn up again as random walk from slow thermal or environmental drifts takes over. The floor is the clock's headline number, and the white-FM prefactor decides how long you must average to reach it.

A model optical clock: white FM into a flicker floor

The model is $\sigma_y(\tau) = \sqrt{a^2/\tau + \sigma_{\mathrm{floor}}^2}$. Drag the 1-second stability $a$ and the floor; the readouts answer the two questions everyone asks. Note what the floor means: once the curve reaches it, averaging longer stops helping, and no integration time gets you under a flicker floor.
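The model is small enough to evaluate by hand, but a sketch makes the two readouts explicit. The function names `modelAdev` and `crossoverTau` are illustrative, and the numbers are the example values quoted in the text, not data from a real clock.

```javascript
// sigma_y(tau) for the white-FM-plus-floor model.
function modelAdev(tau, a, floorLevel) {
  return Math.sqrt((a * a) / tau + floorLevel * floorLevel);
}

// Averaging time at which the white-FM line a / sqrt(tau) meets the floor.
function crossoverTau(a, floorLevel) {
  return (a / floorLevel) ** 2;
}

const a1s = 3e-16;        // 1-second stability (illustrative)
const floorLevel = 1e-18; // flicker floor (illustrative)
const tauX = crossoverTau(a1s, floorLevel);

console.log("crossover tau (s):", tauX);
console.log("sigma_y at crossover:", modelAdev(tauX, a1s, floorLevel));
console.log("sigma_y at 100x crossover:", modelAdev(100 * tauX, a1s, floorLevel));
```

At the crossover the two terms are equal, so the model sits a factor $\sqrt2$ above the floor; a hundred times longer buys almost nothing more.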

The white-FM prefactor is not a new quantity — it is the sensitivity of chapter 8 wearing fractional-frequency clothing. White FM with one-sided level $h_0$ gives (next section)

$$ \sigma_y(\tau) = \sqrt{\frac{h_0}{2\tau}} \qquad\Longrightarrow\qquad a = \sqrt{h_0/2}, $$

so quoting "$3\times10^{-16}$ at one second" and quoting a white-FM level of $h_0 = 2a^2 = 1.8\times10^{-31}\,\mathrm{Hz}^{-1}$ in $S_y$ are the same statement. Chapter 8 will read the same number as a per-$\sqrt{\mathrm{Hz}}$ sensitivity.

Allan meets the PSD

ADEV and $S_y(f)$ are not rival descriptions; the Allan variance is a weighted integral of the PSD. Averaging over a gate and differencing adjacent gates is a linear, time-invariant operation on $y(t)$, so it has a transfer function, and

$$ \sigma_y^2(\tau) \;=\; \int_0^{\infty} S_y(f)\, \big|H_\tau(f)\big|^2\, df, \qquad \big|H_\tau(f)\big|^2 \;=\; \frac{2\sin^4(\pi\tau f)}{(\pi\tau f)^2}. $$

Plot that weight and the "high-pass in disguise" claim becomes a picture — in fact it is better than a high-pass, it is a bandpass: $|H_\tau|^2$ rises as $f^2$ at low frequency (drift rejection), falls off above $1/\tau$ (the averaging), and peaks near

$$ f_{\mathrm{peak}} \approx \frac{0.37}{\tau} \sim \frac{1}{2\tau}. $$

The Allan variance is a bandpass view of S_y(f)

The whole identity in one picture. Left: the chosen noise's $S_y(f)$ (blue), the bandpass weight $|H_\tau(f)|^2$ (gray, dimensionless, peaking near $0.37/\tau$), and their product — the integrand — in orange, filled: the shaded area is $\sigma_y^2(\tau)$. Right: the resulting Allan deviation over all gate times, with your $\tau$ marked. Slide $\tau$: the passband scans down the spectrum and the marker rides along the fingerprint. Switch the noise to see each power law produce its slope.
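The weight is easy to probe numerically. The sketch below evaluates $|H_\tau(f)|^2$ and locates its peak by brute force; `allanWeight` and `peakFrequency` are illustrative names, and the grid scan is a deliberately crude stand-in for solving the peak condition analytically.

```javascript
// Allan bandpass weight |H_tau(f)|^2 = 2 sin^4(pi tau f) / (pi tau f)^2.
function allanWeight(f, tau) {
  const x = Math.PI * tau * f;
  if (x === 0) return 0; // limit as f -> 0
  return (2 * Math.sin(x) ** 4) / (x * x);
}

// Brute-force scan for the peak on a fine grid of f * tau in (0, 1).
function peakFrequency(tau, steps = 100000) {
  let bestF = 0;
  let bestW = 0;
  for (let i = 1; i < steps; i++) {
    const f = i / steps / tau;
    const w = allanWeight(f, tau);
    if (w > bestW) {
      bestW = w;
      bestF = f;
    }
  }
  return bestF;
}

const gateTime = 10; // seconds, arbitrary choice
console.log("f_peak * tau:", peakFrequency(gateTime) * gateTime);

// Low-frequency side: halving f should cut the weight by about 4 (the f^2 rise).
console.log(allanWeight(1e-4, gateTime) / allanWeight(5e-5, gateTime));
```

The product $f_{\mathrm{peak}}\tau$ does not depend on the gate time chosen, which is why sliding $\tau$ simply slides the passband.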

This is the reason the fingerprint works: $\sigma_y(\tau)$ at gate time $\tau$ mostly reports $S_y$ near Fourier frequency $1/(2\tau)$, so as $\tau$ grows the ADEV scans down the spectrum, and each power law in $S_y(f)$ maps onto a power law in $\tau$. Doing the integral exactly for each one-sided power law $S_y = h_\alpha f^\alpha$ gives the standard conversion table:

noise	$S_y(f)$	Allan variance
white FM	$h_0$	$\dfrac{h_0}{2\tau}$
flicker FM	$h_{-1}f^{-1}$	$2\ln 2\; h_{-1}$
random-walk FM	$h_{-2}f^{-2}$	$\dfrac{2\pi^2}{3}\, h_{-2}\,\tau$
white PM	$h_2 f^{2}$	$\approx \dfrac{3 f_H\, h_2}{4\pi^2\,\tau^2}$ ($f_H$: measurement bandwidth)

Note the white-PM row needs the high-frequency cutoff $f_H$ of your measurement system: $S_y \propto f^2$ keeps growing, and the bandpass only tames it up to the bandwidth you actually recorded. This foreshadows chapter 9, where the same "sensor = filter function acting on the PSD" logic — with fancier $|H(f)|^2$ — is the master formula for qubit dephasing and LIGO alike.
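The conversion table can be written as a lookup, and one row can be cross-checked by brute-force integration. Both function names are illustrative assumptions. The numerical check uses the substitution $x = \pi\tau f$, under which the flicker-FM integrand becomes $2h_{-1}\sin^4 x / x^3$ and $\tau$ drops out, and it truncates the integral at a finite `xMax`.

```javascript
// Allan variance for S_y(f) = h * f^alpha, per the conversion table.
// fH (Hz) is the measurement bandwidth and is only used for white PM.
function allanVarianceFromPsd(alpha, h, tau, fH) {
  switch (alpha) {
    case 0:
      return h / (2 * tau); // white FM
    case -1:
      return 2 * Math.LN2 * h; // flicker FM
    case -2:
      return ((2 * Math.PI ** 2) / 3) * h * tau; // random-walk FM
    case 2:
      return (3 * fH * h) / (4 * Math.PI ** 2 * tau ** 2); // white PM (approximate)
    default:
      throw new Error("power law not in the table");
  }
}

// Midpoint-rule integral of 2 h sin^4(x) / x^3 from 0 to xMax (flicker FM row).
function flickerAllanVarianceNumeric(h, xMax = 200, dx = 1e-3) {
  let sum = 0;
  for (let x = dx / 2; x < xMax; x += dx) {
    sum += ((2 * h * Math.sin(x) ** 4) / x ** 3) * dx;
  }
  return sum;
}

console.log(allanVarianceFromPsd(-1, 1, 1), flickerAllanVarianceNumeric(1));
console.log(Math.sqrt(allanVarianceFromPsd(0, 1.8e-31, 1))); // white FM at 1 s
```

The flicker integrand decays as $1/x^3$, so truncating at a few hundred is harmless; the white-PM row has no such luck, which is the $f_H$ dependence noted above.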

Is the Allan deviation only for frequency?

Nothing in the recipe is about frequency: "bin, difference neighbors, take the RMS" is a statistic you can compute on any time series, and the bandpass picture above explains what it does to any spectrum. The pairing with clocks is convention meeting need: frequency noise is where divergent variance is unavoidable (every oscillator has flicker and random-walk FM), and averaging time is the operationally meaningful variable for a clock. Amplitude noise usually has a milder, near-stationary spectrum and gets asked frequency-domain questions (servo bandwidth, in-band floor), so an RMS over a stated band does the job. But whenever an amplitude-like signal drifts divergently, ADEV is used verbatim: the bias instability of gyroscopes and accelerometers is read off an Allan-deviation plot of the rate output, and laser power ($\delta P/P$), voltage references and magnetometers get the same treatment. Rule of thumb: reach for ADEV whenever the variance refuses to converge and "how long should I average?" is the question, and apply it to a fractional quantity, like $y = \delta\nu/\nu_0$ or $\delta P/P$, if you want dimensionless numbers to compare.

The estimator in code

The overlapping estimator that powers every plot on this page is ten lines. The only cleverness is a cumulative sum, so each gate average is a subtraction instead of an inner loop (this is Allan.adev from allan.js):

// y[0..N-1]: fractional frequency samples every tau0 seconds.
// c[k] = y[0] + ... + y[k-1]  (cumulative sum with a leading zero)
const c = new Float64Array(N + 1);
for (let i = 0; i < N; i++) c[i + 1] = c[i] + y[i];

// overlapping Allan variance at averaging factor m  (tau = m * tau0)
let sum = 0, count = 0;
for (let j = 0; j + 2 * m <= N; j++) {          // slide one sample at a time
  const a = (c[j + m] - c[j]) / m;              // average of gate starting at j
  const b = (c[j + 2 * m] - c[j + m]) / m;      // average of the next gate
  sum += (b - a) * (b - a);
  count++;
}
const adev = Math.sqrt(sum / (2 * count));      // sigma_y(m * tau0)
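To run the recipe end to end it helps to wrap it in a function and sweep the averaging factor. The sketch below does that on synthetic white FM. The wrapper name `overlappingAdev`, the stand-in generator `seededUniform`, and the uniform (rather than Gaussian) samples are all assumptions made for illustration; this is not the allan.js API.

```javascript
// Hypothetical wrapper around the same cumulative-sum recipe.
function overlappingAdev(y, m) {
  const N = y.length;
  const c = new Float64Array(N + 1);
  for (let i = 0; i < N; i++) c[i + 1] = c[i] + y[i];
  let sum = 0;
  let count = 0;
  for (let j = 0; j + 2 * m <= N; j++) {
    const a = (c[j + m] - c[j]) / m;         // average of gate starting at j
    const b = (c[j + 2 * m] - c[j + m]) / m; // average of the next gate
    sum += (b - a) * (b - a);
    count++;
  }
  return count > 0 ? Math.sqrt(sum / (2 * count)) : NaN;
}

// Stand-in seeded generator (illustrative): uniform in [0, 1).
function seededUniform(seed) {
  let state = seed | 0;
  return function () {
    state = (state + 0x6D2B79F5) | 0;
    let t = Math.imul(state ^ (state >>> 15), state | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Synthetic white FM with tau0 = 1 s.
const rand = seededUniform(2024);
const y = Float64Array.from({ length: 1 << 14 }, () => rand() - 0.5);

// Octave-spaced averaging factors up to a quarter of the record.
const curve = [];
for (let m = 1; m <= y.length / 4; m *= 2) {
  curve.push({ tau: m, adev: overlappingAdev(y, m) });
}
console.log(curve);
```

For white FM the printed values should fall by roughly $\sqrt2$ per octave. Recomputing the cumulative sum for every `m` is wasteful but keeps the sketch short; a real implementation would build it once.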

Lab note

At the largest $\tau$ your record allows, only a handful of independent gate pairs exist, so the last few ADEV points scatter wildly — you can see it at the right-hand edge of the headline demo. Never read a floor off the final point of an ADEV curve; the rule of thumb is to distrust everything beyond $\tau \sim T_{\mathrm{record}}/10$, and real papers put error bars ($\chi^2$-based) on every point.
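A quick count shows how fast the statistics run out. This is only a bookkeeping sketch: `adjacentDifferences` counts the non-overlapping adjacent-gate differences, `largestTrustedFactor` encodes the $T_{\mathrm{record}}/10$ rule of thumb, and both names are illustrative. The overlapping estimator forms more terms than this, but they are strongly correlated, so the count here is the better guide to how much independent information there is.

```javascript
// Number of non-overlapping adjacent-gate differences for N samples, m samples per gate.
function adjacentDifferences(N, m) {
  return Math.max(0, Math.floor(N / m) - 1);
}

// Largest averaging factor allowed by the T_record / 10 rule of thumb.
function largestTrustedFactor(N) {
  return Math.floor(N / 10);
}

const recordLength = 1 << 16; // same size as the headline demo's record
for (const m of [1, 1024, largestTrustedFactor(recordLength), recordLength / 4]) {
  console.log("m =", m, "differences =", adjacentDifferences(recordLength, m));
}
```

At a quarter of the record only three differences remain, which is why the last points of an ADEV curve deserve so little trust.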

Exercises

Exercise 7.1 — how long to average

A clock has pure white FM with $\sigma_y(1\,\mathrm{s}) = 1\times10^{-15}$. How long must you average to reach $1\times10^{-17}$?

Solution

White FM: $\sigma_y(\tau) = a/\sqrt\tau$ with $a = 10^{-15}$. Solve $a/\sqrt\tau = 10^{-17}$: $\tau = (a/10^{-17})^2 = (100)^2 = 10^4\,$s $\approx 2.8$ hours. The $1/\sqrt\tau$ law is brutal: every factor of 10 in stability costs a factor of 100 in time.

Exercise 7.2 — from ADEV to PSD and back to hertz

Convert $\sigma_y(1\,\mathrm{s}) = 1\times10^{-15}$ (white FM, $\tau_0 = 1$ s) into the PSD level $h_0$, and then into the absolute frequency-noise PSD $S_{\delta\nu}$ for a strontium clock at $\nu_0 = 429$ THz.

Solution

From $\sigma_y^2(\tau) = h_0/(2\tau)$: $h_0 = 2\sigma_y^2\tau = 2\times(10^{-15})^2\times 1\,\mathrm{s} = 2\times10^{-30}\,\mathrm{Hz}^{-1}$. Then $S_{\delta\nu} = h_0\,\nu_0^2 = 2\times10^{-30}\times (4.29\times10^{14})^2 \approx 0.37\,\mathrm{Hz^2/Hz}$. Noteworthy: a state-of-the-art clock's frequency noise is sub-hertz-level in absolute terms; it is the enormous $\nu_0$ in the denominator of $y$ that makes the fractional numbers so spectacular.
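The same arithmetic as a sketch you can rerun with other numbers; the function names are illustrative, and white FM is assumed throughout.

```javascript
// White-FM level h0 (Hz^-1) from an Allan deviation quoted at averaging time tau.
function h0FromAdev(sigmaY, tau) {
  return 2 * sigmaY * sigmaY * tau;
}

// Absolute frequency-noise PSD (Hz^2/Hz) from the fractional level and the carrier.
function absoluteFrequencyPsd(h0, nu0) {
  return h0 * nu0 * nu0;
}

const h0Exercise = h0FromAdev(1e-15, 1);                      // sigma_y(1 s) = 1e-15
const sDeltaNu = absoluteFrequencyPsd(h0Exercise, 429e12);    // strontium, 429 THz
console.log(h0Exercise, sDeltaNu);
```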

Exercise 7.3 — read the fingerprint

An ADEV plot falls with slope $-1$ from $\tau = 1$ s to 100 s, then with slope $-1/2$ beyond. Which noise types are dominating in each region? Is the ADEV alone enough to pin down the first one?

Solution

Slope $-1$: phase noise — white PM or flicker PM (they share the $\tau^{-1}$ signature, so plain ADEV cannot distinguish them; MDEV can). Slope $-1/2$: white FM. Physically: a noisy readout dominates the short gates, and the oscillator's own white frequency noise takes over beyond 100 s.

Exercise 7.4 — a day of drift

A clock drifts linearly by $1\times10^{-16}$ per day. What is the drift's contribution to $\sigma_y$ at $\tau = 10^5$ s?

Solution

$D = 10^{-16}/86400\,\mathrm{s} = 1.16\times10^{-21}\, \mathrm{s^{-1}}$. The drift contribution is $\sigma_y = D\tau/\sqrt2 = 1.16\times10^{-21}\times10^{5}/1.414 \approx 8.2\times10^{-17}$, well above the best flicker floors (around $10^{-18}$), which is why serious clock comparisons fit and remove drift before quoting stability.

Exercise 7.5 — the inevitability of τ⁺¹

Explain why the Allan deviation of any clock with a nonzero linear drift must eventually rise as $\tau^{+1}$, no matter how good the clock is.

Solution

For $y(t) = Dt$, adjacent gate averages differ by exactly $\bar y_{k+1} - \bar y_k = D\tau$, deterministically — averaging longer makes the difference bigger, not smaller. So the drift contributes $\sigma_y = D\tau/\sqrt2$, growing linearly in $\tau$, while every random noise contribution grows no faster than $\tau^{1/2}$. Whatever the noise floor, the $D\tau/\sqrt2$ line crosses it at some finite $\tau$ and dominates from then on. The only cure is to remove the drift, which is legitimate precisely because it is deterministic — it carries no surprise, hence no noise.

Exercise 7.6 — the GPS-disciplined oscillator

A quartz OCXO has a flicker floor $\sigma_y \approx 1\times10^{-12}$ and ages by $1\times10^{-10}$ per day. A GPS timing receiver's one-pulse-per-second output is steered to UTC — accurate on average — but each pulse jitters by $\sim 10$ ns, giving white-PM-like $\sigma_y(\tau) \approx \sqrt3\,(10\,\mathrm{ns})/\tau \approx 1.7\times10^{-8}\,(\tau/1\,\mathrm{s})^{-1}$. (a) After a year of free running, which dartboard quadrant is the bare quartz in? (b) A GPS-disciplined oscillator locks the quartz to the GPS pulses with a very long loop time constant. Which quadrant does that move it to, and at roughly what $\tau$ should the handover happen — i.e., where do the two stability curves cross?

Solution

(a) Stable but not accurate. Aging accumulates $\epsilon \approx 1\times10^{-10} \times 365 \approx 4\times10^{-8}$ — a tight cluster of width $\sim 10^{-12}$ sitting four-plus orders of magnitude off-center. Its ADEV is superb and says nothing about the offset.

(b) To stable and accurate. Setting $1.7\times10^{-8}/\tau = 1\times10^{-12}$ gives $\tau \approx 1.7\times10^4$ s $\approx 5$ h: below this the quartz is quieter than the GPS jitter, so the loop must leave it alone (time constant of hours or slower); beyond it, averaged GPS is better, so the loop should steer. (If the aging is left in, the quartz curve rises as $D\tau/\sqrt2$ past $\tau \approx \sqrt2\,\sigma_{\rm floor}/D \approx 20$ min, with $D = 1.2\times10^{-15}\,\mathrm{s^{-1}}$, and meets the GPS line earlier, at $\tau = \sqrt{\sqrt2 \times 1.7\times10^{-8}/D} \approx 4.6\times10^3$ s $\approx$ 1.3 h — either way, "hours".) The disciplined output is quartz-stable at short $\tau$ and GPS-accurate at long $\tau$: the same best-of-both-references logic as the locked loops of chapter 10.
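The three crossing times in this solution can be checked with a short sketch. The variable names are illustrative, and the inputs are the exercise's own numbers; the code keeps $\sqrt3 \times 10\,$ns unrounded, so its results differ slightly from the rounded $1.7\times10^{-8}$ used above.

```javascript
const SECONDS_PER_DAY = 86400;

const quartzFloor = 1e-12;                    // flicker floor of the OCXO
const agingRate = 1e-10 / SECONDS_PER_DAY;    // D, in 1/s
const gpsCoefficient = Math.sqrt(3) * 10e-9;  // GPS line: sigma_y = gpsCoefficient / tau

// Flat quartz floor meets the falling GPS line.
const tauFloorMeetsGps = gpsCoefficient / quartzFloor;

// Drift line D * tau / sqrt(2) rises above the quartz floor.
const tauDriftLeavesFloor = (Math.SQRT2 * quartzFloor) / agingRate;

// Drift line meets the GPS line: D * tau / sqrt(2) = gpsCoefficient / tau.
const tauDriftMeetsGps = Math.sqrt((Math.SQRT2 * gpsCoefficient) / agingRate);

console.log(tauFloorMeetsGps, tauDriftLeavesFloor, tauDriftMeetsGps);
```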