Author Topic: Levenberg-Marquardt solver

ymg

  • Guest
Levenberg-Marquardt solver
« on: November 14, 2013, 03:17:04 AM »
Evgeniy,

It's a little off the subject, but have you done anything along the lines of a Levenberg-Marquardt solver like MINPACK in Fortran?

It would be very handy for least-squares minimization of nonlinear functions.

ymg

ElpanovEvgeniy

  • Water Moccasin
  • Posts: 1569
  • Moscow (Russia)
Re: Levenberg-Marquardt solver
« Reply #1 on: November 14, 2013, 05:12:37 AM »
Quote from: ymg
Evgeniy,

It's a little off the subject, but have you done anything along the lines of a Levenberg-Marquardt solver like MINPACK in Fortran?

It would be very handy for least-squares minimization of nonlinear functions.

ymg

I've never used this algorithm.
How would you use it? What tasks does it solve?

P.S. Please give a link (or post it) to the algorithm - pseudo-code and examples of input and output data. I'll see how I can help you...

ymg

  • Guest
Re: Levenberg-Marquardt solver
« Reply #2 on: November 14, 2013, 07:00:29 AM »
Evgeniy,

Here is a paper describing the algorithm:
    Improvements to the Levenberg-Marquardt algorithm for nonlinear least-squares minimization

And a link to some Python code: Python: Marquardt algorithm for nonlinear regression

And a link to the source code of MINPACK in FORTRAN 77: Minpack

Basically you supply the function that you want to minimize along with a starting value. The algorithm then switches between steepest descent and Gauss-Newton minimization. (I am oversimplifying here.)

For example, fitting a circle, for which there is no closed form, has to be done by successive approximation. If your starting guess is not very close, it is very likely that Gauss-Newton will start to oscillate and get lost without converging to a solution.

Levenberg-Marquardt dampens the Gauss-Newton step and will converge to a solution even if your starting guess is quite far off.
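Here is a minimal sketch of one damped step, assuming a model f(x, a) and a function jac(x, a) that returns the row of partial derivatives at x (both names are illustrative). It uses the simple lam*I damping; Marquardt's variant, used in the code later in this thread, scales the diagonal of JtJ instead. numpy is used only for brevity.

Code - Python:
import numpy as np

def lm_step(X, Y, f, jac, a, lam):
    """One damped step: solve (JtJ + lam*I) * delta = Jt * r for delta."""
    Jm = np.array([jac(x, a) for x in X])              # n x m Jacobian
    r = np.array([y - f(x, a) for x, y in zip(X, Y)])  # residual vector
    A = np.dot(Jm.T, Jm) + lam * np.eye(len(a))        # damped normal matrix
    return np.asarray(a) + np.linalg.solve(A, np.dot(Jm.T, r))

# Driver idea: if the step lowers the sum of squares, accept it and shrink lam
# (the step behaves like Gauss-Newton); otherwise reject it and grow lam
# (the step behaves like a short steepest-descent step).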

I guess we should move this to another thread.

ymg

ElpanovEvgeniy

  • Water Moccasin
  • Posts: 1569
  • Moscow (Russia)
Re: Levenberg-Marquardt solver
« Reply #3 on: November 15, 2013, 03:06:53 AM »
Quote from: ymg
Evgeniy,

Here is a paper describing the algorithm:
    Improvements to the Levenberg-Marquardt algorithm for nonlinear least-squares minimization

And a link to some Python code: Python: Marquardt algorithm for nonlinear regression

And a link to the source code of MINPACK in FORTRAN 77: Minpack

Basically you supply the function that you want to minimize along with a starting value. The algorithm then switches between steepest descent and Gauss-Newton minimization. (I am oversimplifying here.)

For example, fitting a circle, for which there is no closed form, has to be done by successive approximation. If your starting guess is not very close, it is very likely that Gauss-Newton will start to oscillate and get lost without converging to a solution.

Levenberg-Marquardt dampens the Gauss-Newton step and will converge to a solution even if your starting guess is quite far off.

I guess we should move this to another thread.

ymg

It looks like a great project.
Right now I do not have enough time to work on it.

If you have enough time, break the project into tasks, post them here, and solve the problems individually. Perhaps many will want to help eliminate errors and speed up the work.

ymg

  • Guest
Re: Levenberg-Marquardt solver
« Reply #4 on: November 15, 2013, 04:37:15 AM »
Quote from: ElpanovEvgeniy
It looks like a great project.

Evgeniy,

I am quite rusty at all these things. Nonetheless I will try to come up with a partition of the task
and post it.

A "No Brainer"  might be to go along the same way as in minpack.

The routine spends most of its time forming the Jacobian matrix.
Here is where your skills would be well employed.

We also need a gradient descent method.

Gauss-Newton we almost have with your routines: the transpose exists, and we
could probably use your Cholesky decomposition, etc.
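Since JtJ is symmetric positive definite, the Gauss-Newton system JtJ * delta = Jt * r can be solved with a Cholesky factorization followed by two triangular solves. A pure-Python sketch of that idea (helper names are illustrative) which transliterates readily to AutoLISP:

Code - Python:
from math import sqrt

def cholesky(A):
    """Lower-triangular L with L * Lt = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve L * Lt * x = b by forward, then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):                      # forward: L * y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # backward: Lt * x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x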

There are nice functions for testing, like the Rosenbrock function, better known as the banana function.
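For reference, a one-line Python definition of the standard two-dimensional Rosenbrock function; its global minimum is 0 at (1, 1), sitting in a long, curved, nearly flat valley that makes it a classic stress test for descent methods:

Code - Python:
def rosenbrock(x, y, a=1.0, b=100.0):
    """Banana function; global minimum of 0 at (a, a**2), i.e. (1, 1)."""
    return (a - x) ** 2 + b * (y - x ** 2) ** 2

print(rosenbrock(1.0, 1.0))  # 0.0 at the minimum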



Thanks for the interest.

ymg

ymg

  • Guest
Re: Levenberg-Marquardt solver
« Reply #5 on: November 15, 2013, 01:08:51 PM »

I've been looking a little closer at the Python implementation by Ernesto P. Adorio, PhD: Marquardt Algorithm For Nonlinear Regression V. 0.0.4

I believe we could keep the same structure as far as the calling routines are concerned. A simple translation to AutoLISP would give us something that we can then improve.

The Python program contains only nine functions, some of which we already have; others are fairly trivial.

So we are left with the marquardt, SumSqr, J, f and f2 functions.

That being said, I can probably start coding some of it. I haven't figured out what the f2 function is yet.

ymg

Code - Python:
"""
file     marquardt.py

author   Ernesto P. Adorio PhD.
         UPDEPP at Clarkfield, Pampanga

desc     basic marquardt method for nonlinear regression.
         based on very old C++ code by the author.
depends  matlib

version  0.0.1 2011.12.04  first version.
         0.0.2 2011.12.05  standalone versioning.
         0.0.3 2011.12.06  single file version (with testcode built-in)
                           the last valid computed inverse is now returned.
         0.0.4 2012.02.05  options for forward and backward difference
                           approximations to derivatives.
                           Dougherty example with uninspired initial
                           parameter values of [0, 0].
"""


def matcopy(A):
    return [a[:] for a in A]


def matinverse(AA, inplace=False):
    """
    Determines the inverse of the square matrix AA by Gauss-Jordan reduction.
    Returns None if AA is singular.
    """
    n = len(AA)
    B = [[0.0] * n for i in range(n)]
    for i in range(n):
        B[i][i] = 1.0

    if not inplace:
        A = [row[:] for row in AA]
    else:
        A = AA
    try:
        for i in range(n):
            # Divide the ith row by A[i][i].
            m = 1.0 / A[i][i]
            for j in range(i, n):
                A[i][j] *= m
            for j in range(n):
                B[i][j] *= m

            # Eliminate the lower triangular elements.
            for k in range(i + 1, n):
                m = A[k][i]
                for j in range(i + 1, n):
                    A[k][j] -= m * A[i][j]
                for j in range(n):
                    B[k][j] -= m * B[i][j]

            # Eliminate the upper triangular elements.
            for k in range(0, i):
                m = A[k][i]
                for j in range(i + 1, n):
                    A[k][j] -= m * A[i][j]
                for j in range(n):
                    B[k][j] -= m * B[i][j]
        return B
    except ZeroDivisionError:
        return None


def matprint(A, format="%8.3f"):
    # Prints the matrix A using the specified format.
    m = len(A)
    try:
        n = len(A[0])
    except:
        n = 0
    if m:
        for i, row in enumerate(A):
            if n:
                for c in row:
                    print format % c,
                print
            else:
                print row
    print


def gausselim(AA, BB, pivoting=1, ztol=1.0e-8):
    """
    Solves for X in AX = B.
    AA is a square matrix and BB is a column vector (simple array).
    pivoting = 0:  no pivoting
             = 1:  partial row pivoting.
    Returns (code, R, A, X, B)
      codes: 0 - success,
             1 - near zero pivot encountered; do not use the
                 returned values!
    """
    A = [a[:] for a in AA]
    B = BB[:]

    size = len(A)
    X = [0.0] * size
    R = range(size)  # row interchange record

    # Triangularization.
    for pivot in range(size):
        m = A[pivot][pivot]
        absm = abs(m)
        if pivoting == 1:
            for row in range(pivot + 1, size):
                testm = A[row][pivot]
                atestm = abs(testm)
                if atestm > absm:
                    m = testm
                    absm = abs(m)

                    # Exchange the pivot row with this row.
                    A[row], A[pivot] = A[pivot], A[row]
                    B[row], B[pivot] = B[pivot], B[row]
                    R[pivot], R[row] = R[row], R[pivot]

        if absm < ztol:
            return (1, R, A, X, B)

        for row in range(pivot + 1, size):
            kmul = float(A[row][pivot]) / m
            # Apply the rectangular rule (reduction).
            for col in range(size - 1, pivot, -1):
                A[row][col] = A[row][col] - kmul * A[pivot][col]
            B[row] = B[row] - kmul * B[pivot]
            A[row][pivot] = 0.0

    # Perform back substitution.
    for row in range(size - 1, -1, -1):
        total = B[row]
        for k in range(row + 1, size):
            total -= (X[k] * A[row][k])
        X[row] = total / A[row][row]

    return (0, R, AA, X, B)


def SumSqr(X, Y, V, f, a):
    """
    Args
       X, Y - data vectors
       V    - data variance
       f    - function to minimize
       a    - fitting coefficients
    Auxiliary routine to compute the error sum of squares.
    """
    return sum([((y - f(x, a)) / v) ** 2 for x, y, v in zip(X, Y, V)])

    # Previous (loop) version, kept for reference:
    # n = len(X)
    # tot = 0.0
    # for i in range(0, n):
    #     t = (Y[i] - f(X[i], a)) / V[i]
    #     tot += t * t
    # return tot


def marquardt(X, Y, V, f, J, a, maxiter=20, minchange=1.0e-3,
              minlambdax=1.0e-6, debug=True, derh=0.04, dcode=1):
    """
    Args
      X, Y        Input:  data points (X_i, Y_i)
      V           Input:  data variance, must not have zero values
      f           Input:  function with m unknown parameters
                          to be evaluated at point xi.
      J           Input:  d(f(x)) / d(a_k), derivative of f() with
                          respect to the parameter a_k evaluated at
                          point x = xi; takes f() as a parameter for
                          numerical evaluation of the derivative.
      a           Input/Output: function parameters, updated by the routine.

      maxiter     Input:  maximum iterations.
      minchange   Input:  minimum change between successive ChiSquares.
      minlambdax  Input:  minimum lambdax value.
      debug       Input:  True if messages are to be printed.
      derh        Input:  numerical differentiation spacing.
      dcode       Input:  type of derivative approximation to apply.

    Return values
      flag
        0         No error
        1         Singular matrix
        2         Maximum iterations exceeded
        3         lambdax fell below minlambdax
        4         Singular covariance matrix
      besta       best fit parameters found by the routine
      bestSS      best sum of squares
      Cov         covariance matrix
    """
    n = len(X)
    m = len(a)
    JtJ = [[0.0] * m for i in range(m)]

    bestSS = SS = SumSqr(X, Y, V, f, a)
    besta = a[:]
    Cov = None

    lambdax = 0.001
    iscomp = True
    ncount = 0
    flag = 0
    for p in range(1, maxiter + 1):
        if debug: print "marquardt(): iteration=", p
        if iscomp:
            # Form the JtJ matrix.
            for k in range(m):
                for j in range(m):
                    tot = 0.0
                    for i in range(n):
                        tot += (J(X[i], f, a, k, derh, dcode)
                                * J(X[i], f, a, j, derh, dcode)) / (V[i] * V[i])
                    JtJ[j][k] = JtJ[k][j] = tot

            if lambdax == 0.0:
                if debug:
                    print "terminated, lambdax == 0!"
                break

            # Form the RHS beta vector.
            beta = [0.0] * m
            for k in range(m):
                beta[k] = sum([(y - f(x, a)) / (v * v) * J(x, f, a, k, derh, dcode)
                               for x, y, v in zip(X, Y, V)])

        # Dampen the diagonal (the Marquardt adjustment).
        for j in range(m):
            JtJ[j][j] *= (1.0 + lambdax)

        # Solve for delta.
        lastCov = matinverse(JtJ)
        if lastCov is not None:
            Cov = matcopy(lastCov)
        code, R, A, delta, b = gausselim(JtJ, beta)
        if code:
            flag = 4
            break
        totabsdelta = sum([abs(d) for d in delta])
        if debug:
            print "JtJ:"
            matprint(JtJ, "%5.3f")
            print "beta = ", beta
            print "delta=", delta
            print "SS =", SS
            print "lambdax=", lambdax
            print "total abs delta=", totabsdelta

        if code == 0:
            # Compute the new parameters ...
            newa = [a[i] + delta[i] for i in range(m)]

            # ... and the new sum of squares.
            newSS = SumSqr(X, Y, V, f, newa)
            if debug: print "newSS = ", newSS

            # Update the current parameter vector?
            if newSS < bestSS:
                if debug: print "improved values found!"
                besta = newa[:]
                bestSS = newSS
                bestJtJ = matcopy(JtJ)
                a = newa[:]

                iscomp = True
                if debug:
                    print "new a:", a

                # Termination criteria.
                if SS - newSS < minchange:
                    ncount += 1
                    if ncount == 2:
                        lambdax = 0.0
                        flag = 0
                        break
                else:
                    ncount = 0
                    lambdax = 0.4 * lambdax  # after Nash
                    if lambdax < minlambdax:
                        flag = 3
                        break
                SS = newSS
            else:
                iscomp = False
                lambdax = 10.0 * lambdax
                ncount = 0
        else:
            flag = 1
            break

    if flag == 0:
        if Cov is None: flag = 4
        if p >= maxiter:
            flag = 2

    return flag, besta, bestSS, Cov


def J(xi, f, a, k, h=1.0e-5, code=2):
    """
    Args
       xi   - evaluation point
       f    - function to minimize
       a    - fit parameters
       k    - index of the active parameter.
       h    - small increment or spacing.
       code - approximation method
              0 - forward
              1 - backward
              2 - central
    Return
       numerical derivative of f(xi, a) at point xi wrt parameter a[k]
    """
    if code == 0:
        a[k] += h
        fxaph = f(xi, a)
        a[k] -= h  # restore a[k]
        fxa = f(xi, a)
        return (fxaph - fxa) / h

    elif code == 1:
        fxa = f(xi, a)
        a[k] -= h
        fxamh = f(xi, a)
        a[k] += h  # restore a[k]
        return (fxa - fxamh) / h

    elif code == 2:
        a[k] += h
        retval = f(xi, a)
        a[k] -= 2 * h
        retval -= f(xi, a)
        a[k] += h  # restore a[k]
        return retval / (2 * h)


from math import exp


def f(xi, a):
    """
    Args
       xi  - evaluation point
       a[] - current fit parameters

    Desc
       Test function
    Return
       value of the 3-parameter logistic function
    """
    return a[0] / (1.0 + a[1] * exp(xi * a[2]))


def f2(xi, a):
    # Simple 2-parameter test model.
    return a[0] + a[1] / xi


if __name__ == "__main__":
    # Logistic test data (fitted with f above).
    X = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.]
    Y = [5.308, 7.24, 9.638, 12.866, 17.069, 23.192, 31.443, 38.558,
         50.156, 62.948, 75.995, 91.972]
    V = [1.] * len(X)
    a = [0.7, 10., -0.4]  # initial estimates.

    # Sample problem (the Dougherty example from the version notes),
    # fitted with f2.
    X = range(1, 11)
    Y = [1.93, 7.13, 8.78, 9.69, 10.09, 10.42, 10.62, 10.71, 10.79, 11.13]
    V = [1.] * len(X)
    a = [0, 0]
    derh = 1.0e-4
    dcode = 0  # use forward difference approximation.

    n = len(X)    # number of data points
    m = len(a)    # number of unknown function parameters
    maxiter = 50  # maximum iterations for marquardt()
    minchange = 0.1
    compcov = True

    print "\nInitial estimates:\n"
    for i in range(m):
        print "i = %d, %8.5f\n" % (i, a[i])

    flag, a, minss, Cov = marquardt(X, Y, V, f2, J, a, maxiter, minchange,
                                    derh=derh, dcode=dcode)

    print "Flag = %d, minSS = %f\n" % (flag, minss)

    for i in range(m):
        print "i = %d, %8.5f\n" % (i, a[i])

    print "Last computed inverse"
    if flag == 0 and Cov is not None:
        matprint(Cov, "%10.7f ")

mailmaverick

  • Bull Frog
  • Posts: 493
Re: Levenberg-Marquardt solver
« Reply #6 on: July 22, 2016, 06:35:49 AM »
Hi

I know I'm replying to an old topic, but rather than starting a new one, I feel this is relevant here.

I have seen that many high-end mathematical calculations are solved very quickly by Python / Fortran routines. There are many high-end libraries for Python and Fortran available on the internet which are really fast and efficient.

Instead of rewriting the code in LISP, couldn't we use the compiled versions of these libraries directly from LISP, as we do in C/C++?


kdub_nz

  • Mesozoic keyThumper
  • SuperMod
  • Water Moccasin
  • Posts: 2139
  • class keyThumper<T>:ILazy<T>
Re: Levenberg-Marquardt solver
« Reply #7 on: July 22, 2016, 07:46:57 AM »
How do you imagine that would happen?
Called Kerry in my other life
Retired; but they dragged me back in!

I live at UTC + 13.00

---
some people complain about loading the dishwasher.
Sometimes the question is more important than the answer.

mailmaverick

  • Bull Frog
  • Posts: 493
Re: Levenberg-Marquardt solver
« Reply #8 on: July 22, 2016, 10:44:24 AM »
Quote from: kdub_nz
How do you imagine that would happen?

I imagined it happening because I am able to use compiled Fortran / Python libraries (freely available on the internet) in C/C++ code, where they provide high-end and very fast mathematical functionality.

For example, for calculating the inverse of a 5100 x 5100 matrix of real numbers (not integers), Microsoft Excel's built-in function (otherwise considered quite fast) took 1693.87 seconds, while the BLAS library took 64.41 seconds on my computer. You can see the time difference. I also compared the results, which were exactly the same.

I used the Fortran-based BLAS library as follows:

1.) First, I downloaded the compiled version of the BLAS libraries, which is freely available on the internet.
2.) Second, I wrote some C/C++ code in which I called these compiled libraries. My C/C++ code takes the main matrix as input and returns its inverse as output.
3.) I compiled the C/C++ code into a DLL file with Microsoft Visual Studio.
4.) I called this compiled C/C++ DLL through VBA in Excel.

So what happens is that my Excel VBA calls the C/C++ DLL and passes the input matrix to it.
The C/C++ DLL passes the matrix on to the BLAS DLL, which does the calculation and returns the inverse to the C/C++ code.
The C/C++ code then returns the inverted matrix to VBA, from where you can take the output in Excel.
Believe me, the time of 64.41 seconds includes all of the above.

Now you will ask why I didn't call the BLAS library directly from VBA. I tried hard and searched the internet, but I could not succeed. Either it is not possible, or it is beyond my current level of knowledge.

Considering the above, I want to find a way to call these BLAS libraries from LISP, either directly or through intermediate C/C++ code as described above.
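For comparison, in a language with a built-in foreign-function interface the bridging layer is only a few lines. Here is a minimal ctypes sketch in Python (the language used earlier in the thread) that loads a LAPACK build directly and calls dgesv to solve a linear system; the library file name and the trailing underscore on the Fortran symbol are assumptions that vary by platform and build. AutoLISP has no comparable FFI, which is why an intermediate C/C++ module is the usual route.

Code - Python:
import ctypes
from ctypes import byref, c_int, c_double

lapack = ctypes.CDLL("liblapack.so")  # assumption: name/path is platform-specific

def solve(A, B):
    """Solve A * x = B with LAPACK dgesv (Fortran expects column-major storage)."""
    n = len(A)
    a = (c_double * (n * n))(*[A[i][j] for j in range(n) for i in range(n)])
    b = (c_double * n)(*B)
    ipiv = (c_int * n)()
    info = c_int(0)
    nn, nrhs = c_int(n), c_int(1)
    # DGESV(N, NRHS, A, LDA, IPIV, B, LDB, INFO) -- all arguments by reference.
    lapack.dgesv_(byref(nn), byref(nrhs), a, byref(nn), ipiv, b, byref(nn), byref(info))
    return list(b) if info.value == 0 else None

print(solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))  # expect [0.8, 1.4]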

I hope now you understand.