Calculus.hessian is not symmetric #91

Open
RossBoylan opened this issue Jun 24, 2016 · 6 comments

Comments


RossBoylan commented Jun 24, 2016

julia> @time h10 = Calculus.hessian(RB.mylike, RB.troubleParamsSim1.raw)
 209.557745 seconds (2.43 G allocations: 83.867 GB, 10.31% gc time)
 10x10 Array{Float64,2}:
    566.715     -206.282      -50.7658   …    3135.03        -89.8427
   -206.282      566.715       17.7032       -1993.75         47.1135
    -50.7658      17.7032     143.437        -1163.21         33.7016
     17.7032     -50.7658     -47.7584         282.533         1.3897
    182.291     -122.875      -12.5876       -2943.26         27.7885
     -2.95777     -5.09084      2.97638  …    -452.285        15.1723
    -45.7494      12.5752      39.2448           0.613878     24.8811
  -1942.38       804.496      450.315       -35319.9        1935.28
   3135.03     -1993.75     -1163.21         56057.6       -2498.87
    -89.8427      47.1135      33.7016       -2498.87       1041.54

 julia> issym(h10)
 false

Closer inspection of h10 shows that the asymmetries are all extremely small; the largest is about 4e-8.
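For reference, the size of the asymmetry can be measured directly. A minimal stand-alone sketch (the matrix `A` below is a stand-in with an injected 4e-8 perturbation, not the actual h10; modern Julia syntax, whereas the thread used 0.4, where the symmetry check was `issym` rather than `issymmetric`):

```julia
# Stand-in matrix with an injected 4e-8 asymmetry, mimicking what was
# observed in h10.
A = [566.715 -206.282; -206.282 566.715]
A[2, 1] += 4e-8                 # inject the tiny asymmetry

asym = maximum(abs.(A - A'))    # largest |A[i,j] - A[j,i]|; here ~4e-8

# If the noise is known to be harmless, symmetrizing is cheap:
symA = (A + A') / 2
```

Symmetrizing this way is a workaround, not a fix; it only makes sense once the asymmetry is confirmed to be pure roundoff.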

My original assumption was that this was the result of numerical noise from calculating the derivatives in different orders. But the code (finite_difference_hessian! in finite_difference.jl) appears to compute only the upper triangle,

   for i = 1:n
       #code omitted
        # and i+1:n parses as (i+1):n, which is right
        for j = i+1:n

and then ends with

    Base.LinAlg.copytri!(H,'U')

So maybe the problem is in copytri!.
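That hypothesis is easy to sanity-check in isolation (`Base.LinAlg.copytri!` on 0.4 is spelled `LinearAlgebra.copytri!` on current Julia):

```julia
using LinearAlgebra

# copytri!(A, 'U') overwrites the lower triangle with the transpose of
# the upper triangle, so its result is symmetric by construction.
A = [1.0 2.0; 3.0 4.0]
LinearAlgebra.copytri!(A, 'U')
A == A'   # true: the lower triangle was overwritten from the upper
```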

RossBoylan (Author)

Same problem after updating to the latest Julia 0.4 code.

   Version 0.4.7-pre+1 (2016-06-19 17:17 UTC), Commit 57d0834 (5 days old release-0.4), x86_64-linux-gnu

RossBoylan (Author)

julia> h11=copy(h10)
 10x10 Array{Float64,2}:
 julia> Base.LinAlg.copytri!(h11, 'U')
 10x10 Array{Float64,2}:

 julia> issym(h11)
 true

 julia> issym(h10)
 false

So the result of copytri! is symmetric. This suggests the problem is elsewhere, or perhaps the code I thought was doing the work isn't.

johnmyleswhite (Collaborator)

Digging into this to find the exact source of asymmetry would lead to a great pull request.


RossBoylan commented Jun 24, 2016

Using Debug and StackTraces (is there a better way?) I think I may see the problem. The call seems to be routed to the evaluation of the Jacobian of the numerical gradient. The apparent fix would be to send it to finite_difference_hessian{T <: Number}(f::Function, x::Vector{T}) instead. Is there any reason that wasn't being done, e.g., that code is broken?

There are a number of function definitions for hessian in derivative.jl that have no gradient argument and create one numerically. If the suggested fix is safe, it probably should be done for all.

Even if the use of the jacobian worked, it appears to involve almost twice as much work as necessary, since it evaluates the derivatives for both (i, j) and (j, i). And even if one is using a real gradient, isn't the current jacobian evaluation likely to be inefficient and probably asymmetric?
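The "half the work, symmetric by construction" idea can be sketched as follows. This is an illustration, not the Calculus.jl code: `fd_hessian` and its fixed step `h` are made-up names, and the syntax is modern Julia rather than 0.4.

```julia
# Illustration only (not the Calculus.jl implementation): a central-difference
# Hessian that evaluates each (i, j) pair once and mirrors it, so the result
# is exactly symmetric by construction and does no duplicate work.
function fd_hessian(f, x; h = cbrt(eps(Float64)))
    n = length(x)
    H = zeros(n, n)
    for i in 1:n, j in i:n
        ei = zeros(n); ei[i] = h
        ej = zeros(n); ej[j] = h
        # four-point stencil for the mixed partial d2f/(dxi dxj)
        H[i, j] = (f(x + ei + ej) - f(x + ei - ej) -
                   f(x - ei + ej) + f(x - ei - ej)) / (4h^2)
        H[j, i] = H[i, j]   # mirror instead of recomputing (j, i)
    end
    return H
end

H = fd_hessian(x -> sin(x[1]^2 + 3x[2]^4), [1.0, 2.0])
H == H'   # true, by construction
```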

It looks as if the call goes to

function hessian{T <: Number}(f::Function, x::Vector{T})
  finite_difference_hessian(f, gradient(f), x, :central)  # derivative.jl:80
end

# leads to
function finite_difference_hessian{T <: Number}(f::Function,
                                                g::Function,
                                                x::Vector{T},
                                                dtype::Symbol = :central)
    finite_difference_jacobian(g, x, dtype)
end
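To see why that routing gives no symmetry guarantee, here is a stand-alone sketch of the same nesting (`num_grad`, `num_jac`, `unitvec`, and the fixed step `h` are simplified stand-ins, not the Calculus.jl routines):

```julia
# Stand-alone sketch of the jacobian-of-a-numerical-gradient route.
unitvec(i, n) = (v = zeros(n); v[i] = 1.0; v)

function num_grad(f, x, h)
    n = length(x)
    [(f(x + h*unitvec(i, n)) - f(x - h*unitvec(i, n))) / (2h) for i in 1:n]
end

function num_jac(g, x, h)
    n = length(x)
    J = zeros(n, n)
    for j in 1:n
        J[:, j] = (g(x + h*unitvec(j, n)) - g(x - h*unitvec(j, n))) / (2h)
    end
    return J
end

f(x) = sin(x[1]^2 + 3x[2]^4)
h = cbrt(eps(Float64))
H = num_jac(x -> num_grad(f, x, h), [1.0, 2.0], h)

# H[1,2] and H[2,1] are built from the same four f evaluations but
# subtracted in a different order, so they agree only to roundoff.
maximum(abs.(H - H'))   # tiny, but not guaranteed to be exactly zero
```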

#trace, see qualifications later
StackTraces.StackFrame(:anonymous,symbol("no file"),0,symbol(""),-,false,139887623193497)
StackTraces.StackFrame(:eval,symbol("/home/ross/PCORI/trouble.jl"),1,symbol(""),-,false,139887623176800)
StackTraces.StackFrame(:debug_eval,symbol("/home/ross/.julia/v0.4/Debug/src/Eval.jl"),23,symbol(""),-,false,139887623098583)
StackTraces.StackFrame(:eval_in_scope,symbol("/home/ross/.julia/v0.4/Debug/src/UI.jl"),106,symbol(""),-1,false,139887623096016)
StackTraces.StackFrame(:trap,symbol("/home/ross/.julia/v0.4/Debug/src/UI.jl"),82,symbol(""),-1,false,139887623085585)
StackTraces.StackFrame(:logitlite,symbol("/home/ross/PCORI/trouble.jl"),324,symbol("no file"),0,false,139887623175194)
StackTraces.StackFrame(:mylike,symbol("/home/ross/PCORI/trouble.jl"),493,symbol(""),-,false,139887623171845)
StackTraces.StackFrame(:finite_difference!,symbol("/home/ross/.julia/v0.4/Calculus/src/finite_difference.jl"),126,symbol(""),-,false,139887624133468)
StackTraces.StackFrame(:finite_difference,symbol("/home/ross/.julia/v0.4/Calculus/src/finite_difference.jl"),145,symbol(""),-,false,139887624132974)
StackTraces.StackFrame(:anonymous,symbol("/home/ross/.julia/v0.4/Calculus/src/derivative.jl"),5,symbol("no file"),0,false,139887624132686)
StackTraces.StackFrame(:hessian,symbol("/home/ross/.julia/v0.4/Calculus/src/derivative.jl"),80,symbol(""),-,false,139887624131729)
StackTraces.StackFrame(:eval_user_input,symbol("REPL.jl"),62,symbol(""),-,false,13988762622508) # etc.

The qualification is that the jacobian function does not appear in the stack trace. I put the breakpoint at the start of my objective function, as a result of which it is called during the first evaluation of the gradient. So I think the stack shown is the one for the closure in which the gradient function is defined, via

function derivative(f::Function, ftype::Symbol, dtype::Symbol)
  if ftype == :scalar
    return x::Number -> finite_difference(f, float(x), dtype)
  elseif ftype == :vector
    return x::Vector -> finite_difference(f, float(x), dtype)
  else
    error("ftype must :scalar or :vector")
  end
end

at the top of derivative.jl (the line for the :vector case does appear in the call stack).

RossBoylan (Author)

# in test/derivative.jl
# PR 91
f6(x) = sin(x[1]^2+3x[2]^4)
@test issym(Calculus.hessian(f6, [1.0, 2.0]))

Moving on to solutions...
Interestingly, the f5 function that is there now passes the test.


pkofod commented Jan 24, 2017

I've been testing the finite difference functionality for Hessians in JuliaNLSolvers/Optim.jl#337 and I agree that there's a problem.

https://gist.github.com/pkofod/b4e0338265fac7ab1f4aa87a7d3d655f

The problem I posted has a diagonal matrix as its Hessian, yet the estimate has off-diagonal entries (some larger in magnitude than the "real" entries), so something is off here. How can it create off-diagonal entries when the cross derivatives should be exactly zero? My guess is that the Hessian is not calculated by the usual method of evaluating all combinations, but rather by some method that changes many values at a time. Then maybe it should be possible to switch to the naïve/slow method.
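The separable case is easy to turn into a test. A sketch with a stand-in function (illustrative; pkofod's actual problem is in the linked gist, and `mixed2` is a made-up helper, not Calculus.jl code): the true Hessian is exactly diag(2, 10), so any off-diagonal entry a finite-difference routine produces is pure numerical error and should be near zero.

```julia
# Stand-in separable function with an exactly diagonal Hessian diag(2, 10).
f(x) = x[1]^2 + 5x[2]^2

# one four-point central-difference stencil for the mixed partial
# d2f/(dxi dxj)
function mixed2(f, x, i, j; h = cbrt(eps(Float64)))
    n = length(x)
    ei = zeros(n); ei[i] = h
    ej = zeros(n); ej[j] = h
    (f(x + ei + ej) - f(x + ei - ej) - f(x - ei + ej) + f(x - ei - ej)) / (4h^2)
end

x0 = [0.3, -0.2]
offdiag = mixed2(f, x0, 1, 2)   # analytically zero; anything else is roundoff
```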

I don't have time to look through the Calculus code right now, sorry. If it doesn't get resolved for a while, I might be able to set some time aside at some point.
