I've found the main problem. I have a function which repeatedly accesses a
6-dimensional array in a loop. This function took no time in 0.4, but is
actually very slow in 0.5. My problem looks very similar to 18774
<https://github.com/JuliaLang/julia/issues/18774>.
Here's an example:
A3 = rand(10, 10, 10);
function test3(A, nx1, nx2, nx3)
for i = 1:10_000_000
A[nx1, nx2, nx3]
end
end
A5 = rand(10, 10, 10, 10, 10);
function test5(A, nx1, nx2, nx3, nx4, nx5)
for i = 1:10_000_000
A[nx1, nx2, nx3, nx4, nx5]
end
end
A6 = rand(10, 10, 10, 10, 10, 10);
function test6(A, nx1, nx2, nx3, nx4, nx5, nx6)
for i = 1:10_000_000
A[nx1, nx2, nx3, nx4, nx5, nx6]
end
end
function test6_fast(A, nx1, nx2, nx3, nx4, nx5, nx6)
Asize = size(A)
for i = 1:10_000_000
A[sub2ind(Asize, nx1, nx2, nx3, nx4, nx5, nx6 )]
end
end
@time test3(A3, 1, 1, 1)
@time test5(A5, 1, 1, 1, 1, 1)
@time test6(A6, 1, 1, 1, 1, 1, 1)
@time test6_fast(A6, 1, 1, 1, 1, 1, 1)
test6 takes 0.01s in 0.4 and takes 15s in 0.5. Using a linear index fixes
the problem.
On Friday, September 30, 2016 at 2:30:02 AM UTC-4, Mauro wrote:
>
> On Fri, 2016-09-30 at 03:45, Andrew <[email protected] <javascript:>>
> wrote:
> > I checked, and my objective function is evaluated exactly as many times
> > under 0.4 as it is under 0.5. The number of iterations must be the same.
> >
> > I also looked at the times more precisely. For one particular function
> call
> > in the code, I have:
> >
> > 0.4 with old code: 6.7s 18.5M allocations
> > 0.4 with 0.5 style code(regular anonymous functions) 11.6s, 141M
> > allocations
> > 0.5: 36.2s, 189M allocations
> >
> > Surprisingly, 0.4 is still much faster even without the fast anonymous
> > functions trick. It doesn't look like 0.5 is generating many more
> > allocations than 0.4 on the same code, the time is just a lot slower.
>
> Sounds like your not far off a minimal, working example. Post it and
> I'm sure it will be dissected in no time. (And an issue can be filed).
>
> > On Thursday, September 29, 2016 at 3:36:46 PM UTC-4, Tim Holy wrote:
> >>
> >> No real clue about what's happening, but my immediate thought was that
> if
> >> your algorithm is iterative and uses some kind of threshold to decide
> >> convergence, then it seems possible that a change in the accuracy of
> some
> >> computation might lead to it getting "stuck" occasionally due to
> roundoff
> >> error. That's probably more likely to happen because of some kind of
> >> worsening rather than some improvement, but either is conceivable.
> >>
> >> If that's even a possible explanation, I'd check for unusually-large
> >> numbers of iterations and then print some kind of convergence info.
> >>
> >> Best,
> >> --Tim
> >>
> >> On Thu, Sep 29, 2016 at 1:21 PM, Andrew <[email protected]
> <javascript:>>
> >> wrote:
> >>
> >>> In the 0.4 version the above times are pretty consistent. I never
> observe
> >>> any several thousand allocation calls. I wonder if compilation is
> occurring
> >>> repeatedly.
> >>>
> >>> This isn't terribly pressing for me since I'm not currently working on
> >>> this project, but if there's an easy fix it would be useful for future
> work.
> >>>
> >>> (sorry I didn't mean to post twice. For some reason hitting spacebar
> was
> >>> interpreted as the post command?)
> >>>
> >>>
> >>> On Thursday, September 29, 2016 at 2:15:35 PM UTC-4, Andrew wrote:
> >>>>
> >>>> I've used @code_warntype everywhere I can think to and I've only
> found
> >>>> one Core.box. The @code_warntype looks like this
> >>>>
> >>>> Variables:
> >>>> #self#::#innerloop#3133{#bellman_obj}
> >>>> state::State{IdioState,AggState}
> >>>> EVspline::Dierckx.Spline1D
> >>>> model::Model{CRRA_Family,AggState}
> >>>> policy::PolicyFunctions{Array{Float64,6},Array{Int64,6}}
> >>>> OO::NW
> >>>>
> >>>>
> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>
>
> >>>>
> >>>> Body:
> >>>> begin
> >>>>
> >>>>
> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>
>
> >>>> = $(Expr(:new,
> >>>>
> ##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj},
>
>
> >>>> :(state), :(EVspline), :(model), :(policy), :(OO),
> >>>> :((Core.getfield)(#self#,:bellman_obj)::#bellman_obj)))
> >>>> SSAValue(0) =
> >>>>
> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>
>
> >>>>
> >>>>
> (Core.setfield!)((Core.getfield)(#self#::#innerloop#3133{#bellman_obj},:obj)::CORE.BOX,:contents,SSAValue(0))::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>
>
> >>>> return SSAValue(0)
> >>>>
> >>>>
> end::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}
>
>
> >>>>
> >>>>
> >>>> I put the CORE.BOX in all caps near the bottom.
> >>>>
> >>>> I have no idea if this is actually a problem. The return type is
> stable.
> >>>> Also, this function appears in an outer loop.
> >>>>
> >>>> What I noticed putting a @time in places is that in 0.5, occasionally
> >>>> calls to my nonlinear equation solver take a really long time, like
> here:
> >>>>
> >>>> 0.069224 seconds (9.62 k allocations: 487.873 KB)
> >>>> 0.000007 seconds (39 allocations: 1.922 KB)
> >>>> 0.000006 seconds (29 allocations: 1.391 KB)
> >>>> 0.000011 seconds (74 allocations: 3.781 KB)
> >>>> 0.000009 seconds (54 allocations: 2.719 KB)
> >>>> 0.000008 seconds (54 allocations: 2.719 KB)
> >>>> 0.000008 seconds (49 allocations: 2.453 KB)
> >>>> 0.000007 seconds (44 allocations: 2.188 KB)
> >>>> 0.000007 seconds (44 allocations: 2.188 KB)
> >>>> 0.000006 seconds (39 allocations: 1.922 KB)
> >>>> 0.000007 seconds (39 allocations: 1.922 KB)
> >>>> 0.000006 seconds (39 allocations: 1.922 KB)
> >>>> 0.000005 seconds (34 allocations: 1.656 KB)
> >>>> 0.000005 seconds (34 allocations: 1.656 KB)
> >>>> 0.000004 seconds (29 allocations: 1.391 KB)
> >>>> 0.000004 seconds (24 allocations: 1.125 KB)
> >>>> 0.007399 seconds (248 allocations: 15.453 KB)
> >>>> 0.000009 seconds (30 allocations: 1.594 KB)
> >>>> 0.000004 seconds (25 allocations: 1.328 KB)
> >>>> 0.000004 seconds (25 allocations: 1.328 KB)
> >>>>
> >>>> 0.000010 seconds (70 allocations: 3.719 KB)
> >>>> 0.072703 seconds (41.74 k allocations: 1.615 MB)
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thursday, September 29, 2016 at 1:37:18 AM UTC-4, Kristoffer
> Carlsson
> >>>> wrote:
> >>>>>
> >>>>> Look for Core.Box in @code_warntype. See
> >>>>> https://github.com/JuliaLang/julia/issues/15276
> >>>>
> >>>>
> >>
>