5 Homework 4: Nonlinear Least Squares
⬅ Back to Menu

Homework 4: Nonlinear Least Squares Fitting

1. Van der Waals Equation of State

The relationship between pressure (\(P\)), volume (\(V\)), and temperature (\(T\)) is modeled as:

$$ \left(P + \frac{a}{V^2}\right)(V - b) = RT \implies P(V) = \frac{RT}{V-b} - \frac{a}{V^2} $$

We need to find the parameters \(a\) and \(b\) that minimize the sum of squared residuals between the measured data and the model.

2. Mathematical Ingredients

Model:

$$ P_i(a,b) = \frac{RT}{V_i - b} - \frac{a}{V_i^2} $$

Residual:

$$ r_i(a,b) = P_i(a,b) - P_i^{\text{obs}} $$

Jacobian entries:

$$ \frac{\partial r_i}{\partial a} = -\frac{1}{V_i^2} \quad \text{and} \quad \frac{\partial r_i}{\partial b} = \frac{RT}{(V_i - b)^2} $$

So each row of the Jacobian matrix \( J \) is:

$$ J_i = \left[ -\frac{1}{V_i^2} \quad \frac{RT}{(V_i - b)^2} \right] $$

Then LM forms:

$$ J^T J = \begin{bmatrix} \sum J_{i1}^2 & \sum J_{i1}J_{i2} \\ \sum J_{i1}J_{i2} & \sum J_{i2}^2 \end{bmatrix} $$
$$ J^T r = \begin{bmatrix} \sum J_{i1}r_i \\ \sum J_{i2}r_i \end{bmatrix} $$

3. The Levenberg-Marquardt Algorithm

Your original code likely used first-order gradient descent (\( a \leftarrow a - \eta \frac{\partial J}{\partial a} \) , \( b \leftarrow b - \eta \frac{\partial J}{\partial b} \)). The Levenberg-Marquardt (LM) version instead solves, at each iteration:

$$ (J^T J + \lambda I) \Delta = -J^T r $$

Then updates the parameter vector \( \theta = \begin{bmatrix} a \\ b \end{bmatrix} \):

$$ \theta_{\text{new}} = \theta + \Delta $$

Why LM is better here?

Your model is highly nonlinear in \( b \) because of the term:

$$ \frac{1}{V - b} $$

This can become stiff when \( b \) gets close to the smallest volume. LM is usually much more stable than plain gradient descent because it uses local curvature information through \( J^T J \), while still damping risky steps.

Subtle point: Strictly speaking, classical LM often uses \( J^T J + \lambda \text{diag}(J^T J) \). The code below uses the simpler and common form with \( I \), which is still a valid LM-style implementation.


Algorithm Steps

  1. Computes residuals \( r \)
  2. Computes Jacobian \( J \)
  3. Forms \( J^T J \) and \( J^T r \)
  4. Solves for \( \Delta \) from \( (J^T J + \lambda I)\Delta = -J^T r \)
  5. Tests whether the step improved the fit
  6. Decreases \( \lambda \) if successful, increases it if not

So:
• small \( \lambda \rightarrow \) behaves more like Gauss-Newton
• large \( \lambda \rightarrow \) behaves more like cautious gradient descent
That is the key LM idea.

4. V vs. P Visualization

Fitted Parameter a (L² bar / mol²)
--
Fitted Parameter b (L / mol)
--