Justification of bilinear transform

Question

I would like to understand the "justification" for the bilinear transform. The basic idea as I understand it is that by integration rule of Laplace transform we have for continuous $y(t)$:

$$\mathcal{L}\{y\}(s)=\mathcal{L}\{\int_0^ty'(x)dx\}(s)=\frac{1}{s}\mathcal L\{y'\}(s)$$ $$\Leftrightarrow \frac{\mathcal L\{y\}(s)}{\mathcal L\{y'\}(s)}=\frac{1}{s}$$

Using the trapezoid rule gives the approximation ($h$ is the sampling period): $$y(t)=y(t-h)+\int_{t-h}^ty'(x)dx\approx y(t-h)+\frac{h}{2}(y'(t)+y'(t-h))$$

And taking the $z$ transform using the approximation gives $$\mathcal Z\{y\}(z)\approx \mathcal Z\{ y[n-1]+\frac{h}{2}(y'[n]+y'[n-1])\}(z)$$ $$\approx z^{-1}\mathcal{Z\{y\}(z)}+\frac{h}{2}(\mathcal Z\{y'\}(z)+z^{-1}\mathcal Z\{y'\}(z))$$ $$\Leftrightarrow\frac{\mathcal Z\{y\}(z)}{\mathcal Z\{y'\}(z)}\approx \frac{h}{2}(\frac{1+z^{-1}}{1-z^{-1}})$$

Putting the above together we finally end up with the estimate $$\frac{\mathcal L\{y\}}{\mathcal L\{y'\}}(\frac{2}{h}(\frac{1-z^{-1}}{1+z^{-1}}))\approx \frac{\mathcal Z\{y\}(z)}{\mathcal Z\{y'\}(z)}$$

Question 1. How do we get from the above estimate to the desired estimate

$$\mathcal L\{y\}(\frac{2}{h}(\frac{1-z^{-1}}{1+z^{-1}}))\approx \mathcal Z\{y\}(z)$$

Question 2. It is possible to use trapezoid rule on any linear ODE directly (iterating the rule for higher order derivatives as needed). However, interestingly this gives a different answer than the bilinear transform. The resulting coefficients $a_i$ are the same, however the $b_i$s are different.

So it appears that the bilinear transform distorts the start of the impulse. For example, for 1st order Butterworth lowpass filter we have from bilinear approximation that $b_0=b_1$. As such the impulse response ramps up before ramping down, while the continuous response and trapezoidal estimate only ramps down (see example below). Why are the $b_i$s given by the bilinear transform the "best" (e.g. most widely used) estimates (note: we only consider $b_i$s up to the order of the filter)?

Example: Consider the 1-pole Butterworth filter given by the transfer function:

$$H(s)=\frac{1}{1+s/\omega}$$

The differential eq. is given by: $$y=-y'/\omega$$

By trapezoid method and the above eq.: $$y(t+h)\approx y(t)+\frac{h}{2}(y'(t+h)+y'(t))=y(t)-\frac{h\omega}{2}(y(t+h)+y(t))$$ Collecting the terms we get the difference equation $$y[n+1]=\frac{(1-\frac{h\omega}{2})}{(1+\frac{h\omega}{2})}y[n]$$

Which is the same as with bilinear transform, but lacking the term $b_1$.

Matt L. · Accepted Answer · 2019-12-21T20:43:57.500

3

Let me rephrase your derivation of the bilinear transform to clarify the result. For any $t_0<t-T$ we have

$$y(t)=\int_{t_0}^tx(\tau)d\tau=y(t-T)+\int_{t-T}^tx(\tau)d\tau\tag{1}$$

Approximating the right-most integral in $(1)$ by the trapezoidal rule we get

$$y(t)\approx y(t-T)+\frac{T}{2}\big(x(t)+x(t-T)\big)\tag{2}$$

Switching over to samples of $y(t)$ and $x(t)$ and using the sloppy but common and handy notation $y[n]=y(nT)$, we define from $(2)$ the discrete-time (DT) system:

$$y[n]=y[n-1]+\frac{T}{2}\big(x[n]+x[n-1]\big)\tag{3}$$

This system is now used as the discrete-time approximation to an integrator. In the $\mathcal{Z}$-transform domain it is described by

$$H(z)=\frac{Y(z)}{X(z)}=\frac{T}{2}\frac{z+1}{z-1}\tag{4}$$

So the bilinear transform replaces the transfer function of a continuous-time (CT) integrator $H_c(s)=1/s$ by the transfer function $(4)$, i.e., the bilinear transform uses the mapping

$$s\leftrightarrow \frac{2}{T}\frac{z-1}{z+1}\tag{5}$$

So far for part $1$ of your question.

I must admit that I'm not sure I understand part $2$ correctly. Could you give a simple concrete example showing how you obtain two different DT systems from a given CT system?

As for you final question, the bilinear transform is of course not the best way to transform a CT system to a DT system, simply because there cannot be one best way to do so. However, in some applications - such as optimal filter design - it is considered the most useful transform because it preserves optimality of the magnitude. This is the case because the frequency response of the resulting DT system is just a compressed version of the frequency response of the original CT system, i.e., all properties concerning the magnitude (e.g., ripple size, number of zeros in the stopband, etc.) remain unchanged. Just the frequency axis gets warped, but we can take this warping into account when designing the CT system such that the resulting DT system implements the desired band edges.

Furthermore, the bilinear transform is the only direct mapping from the $s$-plane to the $z$-plane that preserves the frequency response in the sense that the $j\omega$-axis is mapped to the unit circle of the $z$-plane. This is not the case with the forward and backward Euler methods. Furthermore, the left half-plane of the complex $s$-plane is mapped to the inside of the unit circle of the $z$-plane, i.e., stability is preserved in the most complete way. This is not the case with the forward Euler method, and even though stability is also preserved with the backward Euler method, the backward Euler method maps the left half-plane to a small region (a circle centered at $z=\frac12$ with radius $\frac12$) inside the unit circle. This means among other things that certain frequency characteristics cannot be achieved with the backward Euler method.

EDIT: Concerning your example of a first order Butterworth filter

$$H(s)=\frac{Y(s)}{X(s)}=\frac{1}{1+\frac{s}{\omega_0}}\tag{6}$$

you can't ignore the input of the filter in the differential equation, so you get from $(6)$

$$y'(t)=\omega_0\big(x(t)-y(t)\big)\tag{7}$$

Now if you use $(7)$ in the trapezoidal approximation

$$y(t)\approx y(t-T)+\frac{T}{2}\big(y'(t)+y'(t-T)\big)\tag{8}$$

you obtain the same difference equation as if you simply had replaced $s$ in $(6)$ by $\frac{2}{T}\frac{z-1}{z+1}$. This must be the case because the bilinear transform simply is the trapezoidal rule (also called Tustin's method).

edited Dec 21 '19 at 20:43

answered Dec 21 '19 at 19:27

Matt L.

89,963
9
79
179

Thank you for the answer. Now there is an example of how to obtain a different estimate by trapezoid method. Do you have a reference for bilinear transform preserving the optimality of the magnitude response? – Dole Dec 21 '19 at 20:00
1

@Dole: I've edited my answer to address your example. As for the optimality of the magnitude, since the bilinear transform maps the frequency axis to the unit circle, the frequency response remains unchanged, apart from a frequency warping, so obviously optimal magnitudes remain optimal after the transformation. Ripple size etc. cannot change. – Matt L. Dec 21 '19 at 20:45
@MattL- Nice answer. Would the Matched-Z qualify if we are not concerned with the aliasing? (Regarding the points in your sentence about the bilinear transform being the only direct mapping). It maps to the jw -axis and maps the left half plane to the complete inner circle (albeit periodically). So for a frequency response that is negligible above half the sampling rate, would the matched Z transform have the same qualities you detail such as optimality of magnitude? – Dan Boschen Dec 22 '19 at 01:55
@DanBoschen: My point was to make distinction between what I called "direct mappings", i.e., where $s$ is replaced by a bijective function of $z$ (i.e., mapping in the sense of mapping between complex variables, Moebius transformation), and other transformations, as the impulse invariant or matched z transform. – Matt L. Dec 22 '19 at 11:06
@DanBoschen: Note that the matched z transform cannot be described as a mapping $s\leftarrow F(z)$ with a bijective function $F(z)$. – Matt L. Dec 22 '19 at 11:07
@MattL. Got it- thank you. My question was sincere; that aside, would it have all the properties you detail if not for the distortion due to aliasing? – Dan Boschen Dec 22 '19 at 12:51
1

@DanBoschen: That's a good question, and the answer is yes and no :) The thing is that aliasing is inherent in the matched z transform, because it can only be applied to a CT rational transfer function (we need poles and zeros after all). And a rational transfer function can never be ideally band-limited, so aliasing is always there by definition. But we can get close to reproducing the frequency response exactly if the sampling rate is sufficiently high, and if the CT system has a low pass characteristic. – Matt L. Dec 22 '19 at 13:06
@DanBoschen: But note that any zero of the CT system above Nyquist may end up literally anywhere for the DT system. So that remains a problem of the matched z transform even for high sampling rates and low pass CT systems. – Matt L. Dec 22 '19 at 13:22
Oh that's a very interesting case! Thanks for sharing that. Makes me now think of the possibility of some interesting frequency domain cryptography using that property. – Dan Boschen Dec 22 '19 at 13:41
@MattL. Is there a proof somewhere that the bilinear transform indeed is the trapezoidal rule for linear systems? I believe in the above it is shown only for the case $y'(t)=x(t)$, but perhaps it's easy to generalize somehow for all suitable functions from this? Otherwise an induction proof with respect to the order? – Dole Dec 22 '19 at 23:09
@Dole: I'm not sure what you exactly you mean by generalizing the proof. The trapezoidal rule is an approximation for computing an integral, and that's equivalent to replacing $1/s$ by the expression (4) of my answer. So in that sense it is general. – Matt L. Dec 24 '19 at 09:51
@MattL. I mean a proof that using the trapezoidal rule and the bilinear transform will give the same answer. It was shown that this holds for the case $y=x'$, but not necessarily $y+y'+y''+...+y^{(n)}=x+x'+x''+...x^{(n)}$. Maybe you can jump from one to the other somehow, but it's not clear to me how. In any case, the result can also be shown by explicitly calculating the coefficients. – Dole Dec 26 '19 at 16:39

score 2 · Answer 2 · answered Dec 22 '19 at 06:39

Here's another "justification" for the Bilinear Transform. If you equate the transfer functions, in the $s$ and $z$ domains of the unit sample delay, you have:

$$ z^{-1} = e^{-sT} $$

which says the same thing as

$$ z = e^{sT} $$

where $T$ is the sampling period or the reciprocal of the sample rate:

$$ T \triangleq \frac{1}{f_\mathrm{s}} $$

So the bilinear approximation (which is equivalent to the trapezoidal rule or Tustin's method) is made to that exponential.

$$\begin{align} z &= e^{sT} \\ \\ &= \frac{e^{sT/2}}{e^{-sT/2}} \\ \\ &\approx \frac{1+sT/2}{1-sT/2} \\ \\ \end{align}$$

If you solve for $s$, you get:

$$ s \leftarrow \frac{2}{T} \, \frac{z-1}{z+1} = \frac{2}{T} \, \frac{1-z^{-1}}{1+z^{-1}} $$

Now this approximation is normally good for small $s$ but it turns out that every point on the $j \omega$-axis is mapped to a point on the unit circle of the $z$-plane

Justification of bilinear transform

2 Answers2

Linked