Chapter 8
Sensitivity: what Hz/√Hz means
Every sensor datasheet and every metrology paper quotes a sensitivity in the strange unit "something per root hertz". This chapter makes that number concrete: it is the amplitude spectral density of the instrument's noise referred to the input, and it tells you exactly how your measurement uncertainty shrinks as you integrate longer — and when it stops shrinking.
Noise referred to the input
A sensor is a machine that turns the quantity you care about — a magnetic field, a force, a laser's frequency offset — into something you can record, usually a voltage. Write the input as $u$ and the output as $V$; for small signals the instrument is linear with gain $G$, and it adds its own noise on the way out:
$$ V(t) \;=\; G\,u(t) \;+\; V_{\mathrm{noise}}(t). $$
The output noise $V_{\mathrm{noise}}$ has some power spectral density $S_V(f)$ in $\mathrm{V^2/Hz}$, exactly as in chapter 4. But volts are an implementation detail — what you want to know is how big an input would produce the same wiggle. So divide by the gain: the equivalent input noise has PSD
$$ S_u(f) \;=\; \frac{S_V(f)}{G^2}, \qquad \text{ASD} = \sqrt{S_u(f)} \;\;\text{in units of } u/\sqrt{\mathrm{Hz}} . $$
That single curve — the amplitude spectral density of the noise referred to the input — is "the sensitivity". It is a property of the whole instrument, gain and noise together, expressed in the units of the thing being measured. Some real-world values:
| Instrument | Input quantity | Typical sensitivity (input-referred ASD) |
|---|---|---|
| fluxgate magnetometer | magnetic field | 5 pT/√Hz |
| LIGO (best band, ~200 Hz) | strain (dimensionless!) | ~10−23 1/√Hz |
| broadband seismometer | ground acceleration | ~10−9 m s−2/√Hz |
| photodiode at 100 µA (shot-noise limit) | photocurrent | 5.7 pA/√Hz |
| laser-lock error signal | laser frequency offset | 1 Hz/√Hz |
Two of these deserve a second look. LIGO's strain is dimensionless, so its sensitivity has the delightfully bare unit $1/\sqrt{\mathrm{Hz}}$. And the last row is a frequency discriminator: an instrument (a cavity, an atomic line) whose input is itself a frequency, so its sensitivity is hertz of input per root hertz of Fourier frequency — $\mathrm{Hz}/\sqrt{\mathrm{Hz}}$, the unit in this chapter's title. The two hertz are different animals: the numerator is the signal you measure, the denominator is the bandwidth you measure it in.
Noise figure: how RF engineers quote the same thing
RF engineers have been quoting input-referred noise since the 1940s, under a different name and in decibels (the dBm and the dB arithmetic here are the ones built in chapters 4 and 5 — dBm is power relative to 1 mW; adding decibels multiplies powers). Their reference point is the universal thermal floor of chapter 5: a matched source at temperature $T$ delivers an available noise power density of $k_B T$ per hertz, which at the standard $T_0 = 290\ \mathrm{K}$ is $4.0\times10^{-21}\ \mathrm{W/Hz} = -174\ \mathrm{dBm/Hz}$. Now drive any network — amplifier, cable, mixer — from a source at $T_0$. Its output noise density has two parts: the amplified source noise, and whatever the network adds of its own. The noise factor $F$ is the ratio of the total to the source part alone; dividing top and bottom by the gain, it is equally the factor by which the network degrades signal-to-noise:
$$ F \;=\; \frac{\text{total output noise density}} {\text{part due to the source alone}} \;=\; \frac{\mathrm{SNR_{in}}}{\mathrm{SNR_{out}}}, $$
both taken at a single ("spot") frequency, with matched impedances, and with the source pinned by convention at 290 K. In decibels this is the noise figure, $\mathrm{NF} = 10\log_{10} F$. A noiseless amplifier has $F = 1$ ($\mathrm{NF} = 0$ dB): it amplifies signal and source noise alike and the SNR is untouched. A passive attenuator with power loss $L$ has $F = L$: the signal comes out $L$ times weaker, but the attenuator's own resistors are at 290 K too, so a full $k_B T_0$ still comes out the back — the SNR drops by exactly $L$. Three decibels of cable loss is three decibels of noise figure. Radio astronomers carry the same content as a temperature: referred to the input, the network's own noise is worth an extra source temperature $T_N = 290\,(F - 1)$ K, so a 0.5-dB-NF receiver is "a 35 K receiver".
Here is why this section belongs in this chapter. An amplifier chain with noise factor $F$ sitting on the thermal floor has an equivalent input noise density of $F \cdot k_B T_0$ W/Hz — in decibels, $-174 + \mathrm{NF}$ dBm/Hz. That is precisely the referred-to-input noise density of section 1, for the special case where the input quantity is RF power: noise figure is the input-referred (squared) ASD, quoted in dB relative to the thermal floor. A "sensitivity of $5\ \mathrm{pT}/\sqrt{\mathrm{Hz}}$" and a "noise figure of 6 dB" are the same kind of statement in two dialects — one absolute, one relative to what a warm resistor would do. (If you noticed that chapter 6's additive phase-noise floor read $-177 + F_{\mathrm{dB}} - C_{\mathrm{dBm}}$ rather than $-174$: the extra 3 dB is not a disagreement — it is the half of the added noise that reads as amplitude rather than phase. Same $F k_B T$, two bookkeepings.)
The dialect earns its keep when instruments are chained. For stages with noise factors $F_1, F_2, F_3$ and available gains $G_1, G_2$ (linear, not dB), referring every stage's added noise back to the chain input divides it by all the gain in front of it — the Friis cascade formula:
$$ F_{\mathrm{total}} \;=\; F_1 \;+\; \frac{F_2 - 1}{G_1} \;+\; \frac{F_3 - 1}{G_1 G_2} \;+\; \cdots $$
The consequence is the first commandment of receiver design: the first stage dominates. Thirty decibels of low-noise gain up front makes everything behind it a thousand times less important; the same LNA placed after a lossy cable inherits the cable's loss as straight noise figure. Try it:
The chain is drawn below: each stage's added noise, referred back to the chain input as $(F_i-1)/\prod_{j<i} G_j$ in units of $k_B T_0$, appears under its box and as a bar on the log plot — the total noise factor is 1 (the source) plus the three bars. Toggle the presets and watch the total jump from 1.0 dB to 4.0 dB when the 3-dB cable goes in front of the LNA. For a passive attenuator at room temperature, set $\mathrm{NF} = -G$ (a 3 dB pad has a 3 dB noise figure); the sliders are independent, so that bookkeeping is yours. Right plot: an example time trace of the same signal at each point in the chain, referred to the input (divided by the gain in front, so the sine stays put and only the noise grows). With the LNA first, the later stages leave almost no visible mark; put the cable first and the damage is immediate.
Averaging a constant signal
The simplest possible measurement: the input is a constant $u_0$ (a DC magnetic field, say), buried in white input-referred noise with one-sided PSD $S_u$ (flat, in $u^2/\mathrm{Hz}$). You record for a time $T$ and take the average,
$$ \bar{u}_T \;=\; \frac{1}{T}\int_0^T u_0 + n(t)\, \mathrm{d}t . $$
How uncertain is $\bar u_T$? Averaging over $T$ is a low-pass filter, and the filter's equivalent noise bandwidth (the same ENBW idea chapter 4's window callout used for the Hann window) is $\Delta f_{\mathrm{ENBW}} = 1/(2T)$: the boxcar average passes noise power as if it were a brick-wall filter from DC to $1/(2T)$. So the variance of the average is the flat PSD times that bandwidth,
$$ \sigma^2(T) \;=\; S_u \cdot \frac{1}{2T}, \qquad\Longrightarrow\qquad \boxed{\;\sigma(T) \;=\; \frac{\sqrt{S_u}}{\sqrt{2T}}\;} $$
— divide the per-root-hertz number by $\sqrt{2T}$. If the filter argument feels slippery, count samples instead: digitize at rate $f_s$, so each sample has variance $S_u f_s/2$ (the white-noise relation from chapter 4) and $T$ seconds give $N = T f_s$ independent samples. The variance of the mean of $N$ independent samples is $\bigl(S_u f_s/2\bigr)/N = S_u/(2T)$ — the sample rate drops out, as it must. Either way: a 5 pT/√Hz magnetometer averaged for 100 s knows the field to $5/\sqrt{200} \approx 0.35$ pT.
Watch this happen. Below, one noisy record's running average converges into a funnel of predicted width $\pm\sqrt{S_u}/\sqrt{2t}$ — and on the right, one hundred independent $T$-second measurements scatter with exactly the predicted standard deviation.
Left: the running average of one noisy record (white noise of the chosen sensitivity around the true field $B_0$), with the predicted $\pm\sqrt{S_u}/\sqrt{2t}$ envelope as dashed curves. Right: 100 independent $T$-second averages. Drag $T$ and compare the measured scatter with the prediction; press ↻ for a fresh realization.
The running average hugs the funnel: it is a random walk in disguise (chapter 5), but one whose step size shrinks as $1/t$, so it converges — slowly. The $\sqrt{T}$ in the denominator is brutal arithmetic: every extra digit of precision costs a factor of 100 in integration time. That is why sensitivity, not patience, is the figure of merit.
Here is the essential code behind the left panel:
// one noisy record and its running mean, plus the predicted funnel
const sigma = ASD * Math.sqrt(fs / 2); // sample std for one-sided S_u = ASD²
let sum = 0;
for (let i = 0; i < N; i++) {
sum += u0 + sigma * Noise.randn(rand);
const t = (i + 1) / fs;
mean[i] = sum / (i + 1); // running average after time t
funnel[i] = ASD / Math.sqrt(2 * t); // predicted 1-sigma envelope
}
The same law, where your eye can check it: frame averaging
Every pixel of a camera is running the experiment above in parallel. One frame = signal + read noise of RMS $\sigma$ per pixel; average $N$ frames and each pixel's noise falls to $\sigma/\sqrt{N}$, so a feature of contrast $c$ becomes visible once $c \gtrsim$ a few $\times\, \sigma/\sqrt{N}$. Drag $N$ and watch the faint disks and the fine grating surface out of the snow, slowly — $\sqrt{N}$ is brutal arithmetic in two dimensions too: each factor of 10 in visibility costs a factor of 100 in exposure.
Left: a single frame of a test pattern (three disks of decreasing contrast, a grating, faint text) in per-pixel read noise of RMS $\sigma$. Right: the average of $N$ such frames (simulated by its exact statistical equivalent, noise of RMS $\sigma/\sqrt{N}$). The readout gives the faintest disk's contrast-to-noise ratio.
Two lab morals hide in this toy. First, the visibility threshold is a contrast-to-noise statement, not a brightness one — exactly the referred-to-input logic from the top of this chapter, per pixel. Second, averaging only defeats noise that is independent frame to frame: fixed-pattern nonuniformity, dust shadows or interference fringes sit at the same place in every frame and survive any $N$ (chapter 5's 2-D section shows how such structure announces itself in the spatial spectrum) — the imaging cousin of the flicker floor in the last section of this chapter.
Finding a small AC signal
Now let the signal be a tiny sine at frequency $f_0$ instead of a constant. In the time trace it is hopelessly buried. But noise is spread over all frequencies while the signal lives at one, so the fix is the same averaging trick wearing different clothes: a lock-in amplifier multiplies the record by a reference $\cos(2\pi f_0 t)$ — which shifts the signal down to DC, exactly the mixing of chapter 2 — and then averages for $T$. The signal survives the averaging; the noise only contributes what falls inside a detection bandwidth $\Delta f \approx 1/T$ around $f_0$. Noise power in the detection band shrinks like $S_u(f_0)\,\Delta f \propto 1/T$, so the amplitude signal-to-noise ratio grows as $\sqrt{T}$ — the same $\sqrt{T}$ as before, now centered on $f_0$ instead of DC.
You can see the whole story on a spectrum. The FFT of a $T$-second record has bins of width $1/T$: as $T$ grows, the sine's power stays concentrated in one narrowing bin, so its peak PSD climbs like $T$ while the white floor stands still.
A sine at 16 Hz, buried in white noise of 1 µV/√Hz. Left: one second of the record with the bare signal drawn to scale — invisible. Right: the PSD of the full $T$-second record. Lengthen $T$ and watch the peak rise out of the floor as the analysis bandwidth $1/T$ shrinks.
The moral: narrowband detection = putting your signal at a quiet frequency and then shrinking the bandwidth around it. The first half of that sentence matters as much as the second, because real noise floors are not flat.
Input-referred noise with a white floor of 10 µV/√Hz and a $1/f$ knee at $f_c$: colored curve is a measured (Welch) ASD, gray dashed is the model $\sqrt{S_w(1+f_c/f)}$. Slide the modulation frequency and watch what a 1 µV signal costs you: below the corner the required integration explodes.
Once you are above the corner, going higher buys nothing — the floor is flat — and eventually costs you (finite detector bandwidth, smaller modulation depth). The optimum is simply comfortably above $f_c$, and the payoff is read straight off the sensitivity curve: detecting the same 1 µV signal at 10 Hz instead of 3 kHz can cost a factor of 100 in noise ASD, hence $100^2 = 10^4$ in integration time.
Back to clocks: sensitivity is the Allan deviation
Chapters 6 and 7 now snap together. A clock interrogation — or a laser locked to a cavity — is a sensor whose input unit is hertz: it converts the oscillator's frequency offset $\delta\nu(t)$ into a readable error signal. Suppose its input-referred noise floor is white, with ASD $\sqrt{S_{\delta\nu}}$ in $\mathrm{Hz}/\sqrt{\mathrm{Hz}}$. Then by the boxed formula, averaging for $T$ pins the frequency to
$$ \sigma_\nu(T) \;=\; \sqrt{\frac{S_{\delta\nu}}{2T}} . $$
Divide both sides by the carrier $\nu_0$ to make it fractional. With $y = \delta\nu/\nu_0$ and $h_0 = S_y = S_{\delta\nu}/\nu_0^2$ this reads
$$ \sigma_y(\tau) \;=\; \sqrt{\frac{h_0}{2\tau}} , $$
which is exactly the white-frequency-noise Allan deviation of chapter 7 — same coefficient, same $\tau^{-1/2}$. This is not a coincidence to be checked; it is the same statement twice. In its white-FM regime, the Allan deviation is a sensitivity spec: quoting "$\sigma_y(\tau) = 3\times10^{-13}/\sqrt{\tau}$" and quoting "white frequency noise of $\sqrt{2}\times 3\times10^{-13}\,\nu_0$ Hz/√Hz" are the same sentence in two dialects.
| Frequency sensitivity | Integration time | Frequency uncertainty ASD/√(2T) |
|---|---|---|
| 1 Hz/√Hz | 1 s | 0.71 Hz |
| 1 Hz/√Hz | 100 s | 0.071 Hz |
| 1 Hz/√Hz | 104 s | 7.1 mHz |
And the chain runs on into any quantity you can transduce into a frequency. A worked example, end to end: an atomic magnetometer watches a spin-precession frequency with gyromagnetic ratio $\gamma = 7\ \mathrm{Hz/nT}$ (a number of the right flavor for real alkali or nitrogen-vacancy sensors). Its frequency readout has a white noise floor of $3.5\ \mathrm{Hz}/\sqrt{\mathrm{Hz}}$. Then:
$$ \underbrace{3.5\ \tfrac{\mathrm{Hz}}{\sqrt{\mathrm{Hz}}}}_{\text{frequency sensitivity}} \;\xrightarrow{\ \div\,\gamma\ }\; \underbrace{0.5\ \tfrac{\mathrm{nT}}{\sqrt{\mathrm{Hz}}}}_{\text{field sensitivity}} \;\xrightarrow{\ \div\sqrt{2T},\ T=25\,\mathrm{s}\ }\; \underbrace{0.07\ \mathrm{nT}}_{\text{uncertainty}} $$
(If you had used the $1/\sqrt{T}$ convention below, you would have said 0.1 nT for the same instrument — see the warning.) Try your own numbers:
Set a sensitivity (in any unit X per √Hz) and an integration time; the crosshair reads the expected uncertainty in X off the $1/\sqrt{2T}$ line, with the other convention's $1/\sqrt{T}$ line (a constant factor $\sqrt2$ above) for comparison. Averaging ten times longer buys $\sqrt{10} \approx 3.2\times$ — one decade down the line for every decade across.
When averaging stops helping
Everything so far assumed the input-referred noise is white where you detect. Over long times it never is. The two classic spoilers, both old friends from chapter 7:
- Flicker. If $S_u(f)$ turns up as $1/f$ below some corner, the noise power entering the ever-narrower averaging band stops decreasing — $\sigma(T)$ flattens onto a plateau ("flicker floor") no matter how long you wait.
- Drift. A slow ramp $d\cdot t$ in the input isn't noise at all, but the $T$-average of a ramp is off by $d\,T/2$: the error now grows with integration time.
Put the three together and the uncertainty-versus-$T$ curve has the classic V shape: white noise coming down as $T^{-1/2}$, drift going up as $T$, a flicker plateau wedged between — and an optimal integration time where they cross. Averaging past it makes your measurement worse.
64 simulated records of magnetometer noise — white (10 pT/√Hz) + flicker + linear drift — each averaged for time $T$; the curve is the RMS error of those 64 averages. Dashed guides: $\mathrm{ASD}/\sqrt{2T}$ and $d\,T/2$. Raise the drift and watch the optimum move; raise the flicker corner and watch the plateau form.
This plot is the Allan-deviation story of chapter 7 told in sensor units: replace "pT" by "fractional frequency" and the V shape is exactly why a clock's datasheet quotes both a $\tau^{-1/2}$ coefficient and a flicker floor and a drift number. A sensitivity in $u/\sqrt{\mathrm{Hz}}$ is the honest spec only in the white regime; the full honest spec is this whole curve.
Exercises
A fluxgate magnetometer is specified at 5 pT/√Hz (white). What field uncertainty do you expect after averaging for 100 s? Check your answer against the funnel demo.
Solution
$\sigma = \sqrt{S_u/2T} = 5\ \mathrm{pT}/\sqrt{2\cdot100} = 5/\sqrt{200} \approx 0.35$ pT. (With the $1/\sqrt{T}$ convention you would say 0.5 pT — the √2 trap in action.)
A photodiode draws $I = 100\ \mu\mathrm{A}$ of photocurrent. Shot noise gives a white one-sided current PSD $S_I = 2eI$. What is the current sensitivity in pA/√Hz, and what relative power uncertainty do you reach after 1 s of averaging?
Solution
$S_I = 2 \times 1.602\times10^{-19} \times 10^{-4} = 3.2\times10^{-23}\ \mathrm{A^2/Hz}$, so $\sqrt{S_I} \approx 5.7\ \mathrm{pA}/\sqrt{\mathrm{Hz}}$. After 1 s: $\sigma_I = 5.7\,\mathrm{pA}/\sqrt{2} \approx 4.0\ \mathrm{pA}$, i.e. a relative uncertainty $4.0\times10^{-12}/10^{-4} = 4\times10^{-8}$. Forty parts per billion in one second — shot noise is a very good noise floor, which is why intensity-stabilization servos chase it.
A laser locked to a cavity shows an in-loop error signal of 0.01 Hz/√Hz, but an independent comparison against a second laser shows 1 Hz/√Hz. Which number describes the laser, and how can they disagree by a factor of 100?
Solution
The out-of-loop number. A servo's whole job is to drive its own error signal to zero at the point where it senses — so the in-loop signal is small by construction, not by virtue of the laser being quiet. Any noise added after the sensing point (fiber phase noise, path length to the experiment), and any noise in the sensing itself (cavity vibration, residual amplitude modulation, photodetection noise), is faithfully written onto the light while the error signal stays serenely at zero — the loop steers the laser to cancel what the sensor sees, including the sensor's own lies. Only a measurement through an independent path can reveal it.
Your detection chain has an input-referred noise ASD flat at 10 µV/√Hz above 1 kHz, rising as $1/f$ in PSD (so $1/\sqrt{f}$ in ASD) below that corner. You must detect a 1 µV signal. Choose a chopping frequency and estimate the integration time for a signal-to-noise ratio of 5. What would the same measurement cost at 10 Hz?
Solution
Chop above the corner — say 3–10 kHz — where the ASD is the flat 10 µV/√Hz. SNR of 5 means $\sigma = 0.2\ \mu\mathrm{V}$, so $T = S_u/(2\sigma^2) = (10/0.2)^2/2 = 1250$ s, about 21 minutes. At 10 Hz the ASD is $10\sqrt{1000/10} = 100\ \mu\mathrm{V}/\sqrt{\mathrm{Hz}}$: a hundred times the noise, hence $10^4$ times the integration time (≈145 days) — and even that is optimistic, because in the $1/f$ region the $1/\sqrt{2T}$ law itself fails and $\sigma(T)$ plateaus, as in the last demo. Below the corner the measurement is simply not available; chopping is not an optimization but a necessity.
A satellite ground station has a dish, a 20 m coaxial run to the equipment rack (loss 0.2 dB/m, so 4 dB total), an LNA with 30 dB gain and 0.8 dB noise figure, and a rack receiver with $\mathrm{NF} = 10$ dB. Compare the system noise figure with the LNA mounted at the dish (LNA → cable → receiver) versus in the rack (cable → LNA → receiver).
Solution
Convert to linear: LNA $F = 10^{0.08} = 1.202$, $G = 1000$; cable $F = L = 10^{0.4} = 2.512$, $G = 1/L = 0.398$; receiver $F = 10$. LNA at the dish: $F = 1.202 + (2.512-1)/1000 + (10-1)/(1000 \times 0.398) = 1.202 + 0.002 + 0.023 = 1.226$, i.e. $\mathrm{NF} = 0.89$ dB ($T_N \approx 66$ K). LNA in the rack: $F = 2.512 + (1.202-1)/0.398 + (10-1)/(0.398 \times 1000) = 2.512 + 0.508 + 0.023 = 3.043$, i.e. $\mathrm{NF} = 4.83$ dB ($T_N \approx 592$ K) — nine times the system noise temperature, from moving one box. Notice the anatomy of the loss: the cable's own 4 dB lands directly on the total, and it strips 4 dB of protection from everything behind it. Twenty meters of climbing a tower with the LNA is the cheapest 4 dB in radio engineering.