jdart wrote: Some people have also used methods such as https://en.wikipedia.org/wiki/BOBYQA or L-BFGS (https://en.wikipedia.org/wiki/Limited-memory_BFGS), which are available in many optimization libraries. These require approximating the gradient and the Hessian (2nd derivative), which is expensive, but on the other hand they converge fast.

I didn't know about BOBYQA, but that Wikipedia page says it does not need a gradient function, so it will create a quadratic approximation to the function using only function evaluations.

L-BFGS requires a gradient function but, as I demonstrated with RuyTune, that can be done automatically from an existing function, with a bit of work and C++ template magic. The approximation of the Hessian is something internal to the workings of L-BFGS.
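To give a rough idea of how a gradient can fall out of an existing function, here is a minimal sketch using forward-mode automatic differentiation with dual numbers. It is not RuyTune's actual code and may not match the technique RuyTune uses. The names `ad::Dual`, `objective` and `gradient` are made up for illustration, and only the few operations the toy objective needs are overloaded.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

namespace ad {

// A dual number: a value plus its derivative along one seeded direction.
struct Dual {
    double val;  // function value
    double der;  // derivative with respect to the seeded variable
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
inline Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.der - b.der}; }
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};  // product rule
}
inline Dual exp(Dual a) {
    double e = std::exp(a.val);
    return {e, e * a.der};  // chain rule
}

}  // namespace ad

// Hypothetical objective, written once as a template so that it works
// both with plain doubles and with dual numbers.
template <typename T>
T objective(const std::vector<T>& x) {
    using std::exp;  // for T = double; ad::exp is found by argument-dependent lookup
    return x[0] * x[0] * x[1] + exp(x[0] - x[1]);
}

// Gradient by forward mode: one pass per variable, seeding der = 1 for
// variable i and der = 0 for all the others.
std::vector<double> gradient(const std::vector<double>& x) {
    std::vector<double> g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::vector<ad::Dual> xd;
        for (std::size_t j = 0; j < x.size(); ++j)
            xd.push_back({x[j], i == j ? 1.0 : 0.0});
        g[i] = objective(xd).der;
    }
    return g;
}
```

Calling `gradient({1.0, 1.0})` evaluates the same `objective` template once per variable. With many parameters, as in evaluation tuning, reverse mode is normally the better fit because it produces the whole gradient in a single backward pass. Forward mode is used here only because it fits in a few lines.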

There are other related algorithms that are variants of the conjugate gradient method, which can use a function that computes the product of the Hessian and a vector. This can also be done using automatic differentiation, although I don't have a good reference that explains how. If anyone is interested, I can try to explain it.
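As a starting point, here is a minimal sketch of one way to get a Hessian-vector product out of automatic differentiation, assuming dual numbers nested inside dual numbers. The names `ad::Dual`, `objective` and `hessian_vector` are hypothetical. The inner level differentiates along the coordinate direction e_i and the outer level along v, so the mixed second-order term is component i of H*v.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

namespace ad {

// Dual number over an arbitrary scalar type T, so that duals can be nested.
template <typename T>
struct Dual {
    T val;  // value
    T der;  // derivative along the direction seeded at this level
};

template <typename T>
Dual<T> operator+(const Dual<T>& a, const Dual<T>& b) {
    return {a.val + b.val, a.der + b.der};
}
template <typename T>
Dual<T> operator-(const Dual<T>& a, const Dual<T>& b) {
    return {a.val - b.val, a.der - b.der};
}
template <typename T>
Dual<T> operator*(const Dual<T>& a, const Dual<T>& b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};  // product rule
}
template <typename T>
Dual<T> exp(const Dual<T>& a) {
    using std::exp;  // T = double; nested duals resolve to ad::exp via ADL
    T e = exp(a.val);
    return {e, e * a.der};  // chain rule
}

}  // namespace ad

// Hypothetical objective, templated on the scalar type.
template <typename T>
T objective(const std::vector<T>& x) {
    using std::exp;
    return x[0] * x[0] * x[1] + exp(x[0] - x[1]);
}

// Computes H(x) * v one component at a time.
std::vector<double> hessian_vector(const std::vector<double>& x,
                                   const std::vector<double>& v) {
    using Inner = ad::Dual<double>;
    using Outer = ad::Dual<Inner>;
    std::vector<double> hv(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::vector<Outer> xd;
        for (std::size_t j = 0; j < x.size(); ++j) {
            Inner value{x[j], i == j ? 1.0 : 0.0};  // inner seed: e_i
            Inner direction{v[j], 0.0};             // outer seed: v
            xd.push_back(Outer{value, direction});
        }
        // Outer .der is grad(f).v as an inner dual; its .der is e_i^T H v.
        hv[i] = objective(xd).der.der;
    }
    return hv;
}
```

This version costs one function evaluation per component, so it only illustrates the principle. The usual efficient route is to apply a single forward pass along v on top of a reverse-mode gradient, which yields the whole product at a small constant multiple of the cost of evaluating the function.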