# 1. Mathematical Preliminaries
Here we will refresh some definitions and properties of Fourier Analysis. The math we use will not be rigorous, but it will, hopefully, be consistent.
## Definitions: Fourier Transforms
The Fourier transform and its inverse can be written either in terms of frequency \(f\) or angular frequency \(\omega=2\pi f\). Their definitions are:
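One common convention (the sign and placement of \(2\pi\) vary across references) is, in terms of frequency,

$$\tilde{X}\left(f\right)=\int_{-\infty}^{\infty}X\left(t\right)e^{-2\pi ift}\,dt,\qquad X\left(t\right)=\int_{-\infty}^{\infty}\tilde{X}\left(f\right)e^{2\pi ift}\,df,$$

and, in terms of angular frequency,

$$\tilde{X}\left(\omega\right)=\int_{-\infty}^{\infty}X\left(t\right)e^{-i\omega t}\,dt,\qquad X\left(t\right)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\tilde{X}\left(\omega\right)e^{i\omega t}\,d\omega.$$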
Unless otherwise noted, we will use \(X\) to denote time-domain functions \(X(t)\) and \(\tilde{X}\) to denote frequency-domain functions, \(\tilde{X}\left(f\right)\) or \(\tilde{X}\left(\omega\right)\). We will use the following notation to denote two quantities related through a Fourier/Inverse Fourier transformation:
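For instance,

$$X\left(t\right)\leftrightharpoons\tilde{X}\left(\omega\right).$$

(The \(\leftrightharpoons\) symbol is our choice of notation here; any pairing symbol works as long as it is used consistently.)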
Nyquist frequency: for discrete time series evaluated at discrete time points \(t_{i}\) separated by a time step \(\Delta t=t_{i+1}-t_{i}\), the Fourier transform can only be evaluated at frequencies up to half the sampling frequency \(f_{s}=1/\Delta t\). This upper limit, \(f_{\text{Nyq}}=f_{s}/2=1/\left(2\Delta t\right)\), is called the Nyquist frequency.
It is not possible to estimate energy at frequencies higher than the Nyquist frequency without more finely resolved data.
## Definitions: Spectral Density & Covariance
Wikipedia has some good resources on spectral density.
The cross power spectral density of two variables \(X\) and \(Y\) is:
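A standard form, consistent with the discussion below, is

$$S_{XY}\left(\omega\right)=\lim_{T\rightarrow\infty}\frac{1}{T}\left\langle \tilde{X}\left(\omega\right)\tilde{Y}^{*}\left(\omega\right)\right\rangle,$$

where the bracket \(\left\langle \cdot\right\rangle\) denotes an expectation over realizations of the processes.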
\(T\) is the total length of the signal and \(*\) denotes a complex conjugate. In practice, we won't have to deal with the limit or the expectation, as everything will eventually be written in terms of the power spectral density \(S_{\eta\eta}\left(\omega\right)=\left\langle \tilde{\eta}\left(\omega\right)\tilde{\eta}^{*}\left(\omega\right)\right\rangle \) of some noise forcing \(\eta(t)\). The only two things we will need in order to compute spectral densities are the linearity of the bracket operator \(\left\langle \cdot\right\rangle \) and some care in scaling the power spectral density of the noise.
The spectral density described above is what we will call the process spectral density. You can think of it as the true spectral density of the underlying process that generated the data. When working with real data, we will only have access to finite samples of the processes \(X(t)\) and \(Y(t)\), defined over a finite interval \(T\). Thus, all we can do is get an estimate of the spectrum, which we will call the sample spectrum. For now we won't worry about how to compute these estimates, and will just use a library that estimates these quantities for us (see the sketch below).
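As a minimal sketch of what this looks like in practice (assuming SciPy is available; `scipy.signal.welch` and `scipy.signal.csd` are one common choice of estimator, and the sampling rate and segment length below are arbitrary):

```python
import numpy as np
from scipy import signal

# Two synthetic white-noise samples at a 10 Hz sampling rate (dt = 0.1 s).
rng = np.random.default_rng(0)
dt = 0.1
fs = 1.0 / dt
x = rng.standard_normal(10_000)
y = rng.standard_normal(10_000)

# Sample auto-spectrum S_xx: Welch's method averages periodograms over
# overlapping segments to reduce the variance of the estimate.
f, S_xx = signal.welch(x, fs=fs, nperseg=1024)

# Sample cross-spectrum S_xy (complex-valued in general).
f, S_xy = signal.csd(x, y, fs=fs, nperseg=1024)

# The returned frequencies run from 0 up to the Nyquist frequency fs / 2.
print(f[0], f[-1])  # 0.0, ~5.0
```

Welch's method trades frequency resolution for reduced estimator variance by averaging over segments; other estimators (plain periodogram, multitaper) make different trade-offs.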
The power spectral density (PSD), or the auto-spectrum, of a stochastic process \(X\) is:
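This is the special case of the cross power spectral density above with \(Y=X\):

$$S_{XX}\left(\omega\right)=\lim_{T\rightarrow\infty}\frac{1}{T}\left\langle \tilde{X}\left(\omega\right)\tilde{X}^{*}\left(\omega\right)\right\rangle =\lim_{T\rightarrow\infty}\frac{1}{T}\left\langle \left|\tilde{X}\left(\omega\right)\right|^{2}\right\rangle,$$

which, unlike the cross-spectrum, is real and non-negative.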
The cross-covariance function: at least for now, we will be dealing with anomalies from a mean, so we can assume \(E(X(t))=E(Y(t))=0\). If this is the case, the cross-covariance becomes equal to the cross-correlation function, defined as:
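For stationary processes, a standard form (sign conventions for the lag \(\tau\) vary across references) is

$$C_{XY}\left(\tau\right)=\left\langle X\left(t\right)Y\left(t+\tau\right)\right\rangle.$$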
A white-noise process is an uncorrelated process, i.e. the covariance is only non-zero at zero lag. If \(\delta_{\tau}\) is a delta-function centered on zero, then:
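$$C_{XX}\left(\tau\right)=\sigma^{2}\delta_{\tau},$$

where \(\sigma^{2}\) (a symbol we introduce here) sets the amplitude of the noise.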
## Useful Properties and Theorems
In practice, our derivations will make use of the following properties of the Fourier Transform, spectral density, and covariance functions:
Fourier Transform of a time-derivative:
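With the conventions above, differentiation in time becomes multiplication by \(i\omega\) in frequency:

$$\frac{dX}{dt}\leftrightharpoons i\omega\tilde{X}\left(\omega\right),$$

and each additional time-derivative brings down another factor of \(i\omega\).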
Linearity of cross spectral density:
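One way to state it: for constants \(a,b\) and an arbitrary third process \(Z\) (symbols introduced here for illustration),

$$S_{\left(aX+bY\right)Z}\left(\omega\right)=a\,S_{XZ}\left(\omega\right)+b\,S_{YZ}\left(\omega\right),$$

which follows directly from the linearity of the Fourier transform and of the bracket operator \(\left\langle \cdot\right\rangle\).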
Expected power spectrum of a white-noise process is a constant. In fact, this could serve as an alternative definition of "white noise", with the \(\delta\)-function correlation being a consequence. If \(\eta(t)\) is a white-noise process:
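In the notation used above:

$$S_{\eta\eta}\left(\omega\right)=\left\langle \tilde{\eta}\left(\omega\right)\tilde{\eta}^{*}\left(\omega\right)\right\rangle =\text{const}\quad\text{for all }\omega.$$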
Expected cross-spectrum of two independent processes is zero:
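If \(X\) and \(Y\) are independent (and zero-mean, as assumed above), the expectation factorizes:

$$S_{XY}\left(\omega\right)\propto\left\langle \tilde{X}\left(\omega\right)\tilde{Y}^{*}\left(\omega\right)\right\rangle =\left\langle \tilde{X}\left(\omega\right)\right\rangle \left\langle \tilde{Y}^{*}\left(\omega\right)\right\rangle =0.$$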
Wiener-Khinchin theorem links lagged-covariance with cross-spectrum:
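In the angular-frequency convention adopted above, the covariance and the spectral density form a Fourier pair:

$$C_{XY}\left(\tau\right)\leftrightharpoons S_{XY}\left(\omega\right),\qquad C_{XY}\left(\tau\right)=\frac{1}{2\pi}\int_{-\infty}^{\infty}S_{XY}\left(\omega\right)e^{i\omega\tau}\,d\omega.$$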
Parseval's Theorem is equivalent to the Wiener-Khinchin theorem at \(\tau=0\), where \(C_{XX}\left(0\right)=\text{var}\left(X\right)\). Still, it is important enough to be worth stating on its own:
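Setting \(\tau=0\) in the relation above gives

$$\text{var}\left(X\right)=C_{XX}\left(0\right)=\frac{1}{2\pi}\int_{-\infty}^{\infty}S_{XX}\left(\omega\right)d\omega.$$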