We’ve used this technique extensively and there is a previous post and a paper on this. But there wasn’t a simple example of exactly how to use PyTorch with scipy.optimize, so here it is.

Some things to note:

  1. The fitting parameters are converted to a tensor with requires_grad=True – this builds the computational graph and allows reverse-mode differentiation with respect to them (a tiny standalone illustration follows the notes)

  2. The option jac=True specifies that we will supply the Jacobian (computed using automatic differentiation)

  3. The objective function returns the tuple (objective value, Jacobian)

NB This very simple example has lots of local minima!
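
Before the full example, here is a tiny standalone illustration of point 1 (a sketch of my own, not part of the original code): a tensor created with requires_grad=True records the operations applied to it, so that calling .backward() on a scalar result fills in the tensor's .grad attribute.

import torch

t = torch.tensor([1.0, 2.0], requires_grad=True)
out = (t ** 2).sum()   # builds a computational graph from t to out
out.backward()         # reverse-mode differentiation
print(t.grad)          # tensor([2., 4.]), i.e. d(out)/dt = 2*t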

import numpy
import scipy, scipy.optimize
import torch


def minim(obs, f, p0):
    """Fit function f to observations "obs" starting at p0"""
    
    def fitfn(pars):
        # NB the requires_grad argument specifies that we want to
        # differentiate with respect to the parameters
        pars=torch.tensor(pars,
                          requires_grad=True)
        y=f(pars)
        # Simple least-squares fitting
        res=((obs-y)**2).sum()
        res.backward()
        # Note that the gradient is read from the "pars" variable
        return res.detach().cpu().numpy(), pars.grad.cpu().numpy()

    res=scipy.optimize.minimize(fitfn,
                                p0,
                                method="BFGS",
                                jac=True) # NB: we will compute the jacobian
    return res

# Points on which observations are made
X=torch.arange(100)

def exfn(p):
    y=torch.sin(p[0]*X)+torch.cos(p[1]*X)
    return y

# Sample observations
yp=exfn(torch.tensor([0.3, 0]))

# Sample run
minim(yp, exfn, numpy.array([0.34, 0.01]))
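
Since the objective has many local minima, here is a rough sketch (my addition, reusing minim, exfn and yp from above) of how one might inspect the returned OptimizeResult and retry the fit from a handful of random starting points, keeping the best run. The multi_start helper and its defaults are assumptions for illustration, not part of the original code.

# Inspect a single fit: scipy.optimize.minimize returns an OptimizeResult
res = minim(yp, exfn, numpy.array([0.34, 0.01]))
print(res.success, res.x, res.fun)   # convergence flag, fitted parameters, residual

# Hypothetical helper: restart from several random initial guesses and
# keep the run with the smallest sum-of-squares residual
def multi_start(obs, f, n_starts=20, scale=0.5, seed=0):
    rng = numpy.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        p0 = rng.uniform(-scale, scale, size=2)
        r = minim(obs, f, p0)
        if best is None or r.fun < best.fun:
            best = r
    return best

best = multi_start(yp, exfn)
print(best.x, best.fun)              # ideally close to [0.3, 0] with a small residual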