
Optimize codebase #25

Open · Datseris opened this issue Mar 14, 2023 · 9 comments
Labels: help wanted, performance

@Datseris (Member) commented Mar 14, 2023

We've spent a lot of time devising the new interface, but practically 0 seconds optimizing it. There are memory "leakages" all over the place. Running:

julia> @btime indicators_analysis($input, $ind_conf, $sig_conf);
  4.146 s (20828885 allocations: 3.07 GiB)

This amount of allocations is of course nonsense. There are probably a lot of unnecessary allocations happening in windowmap!, which shouldn't happen, because we use the in-place version with pre-allocated dummies.

Generally speaking, almost all parts of the codebase can be optimized. We can start by detecting the bottlenecks. Using the profiling functionality of VSCode we can get a flame graph showing which parts of the code are the slowest. Then, we can benchmark the slow functions in isolation and see what goes wrong. Do we have type instabilities? Do we allocate unnecessary vectors? A sketch of this workflow is below.
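For instance, a minimal sketch of that workflow (assuming the Julia VSCode extension, which provides @profview; f and x stand for any suspected hotspot function and its input, not actual package names):

using BenchmarkTools

# open an interactive flame graph of the full analysis in VSCode:
@profview indicators_analysis(input, ind_conf, sig_conf)

# benchmark a suspected hotspot in isolation:
@btime f($x)

# inspect it for type instabilities (problematic types are highlighted in red):
@code_warntype f(x)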

Another point to improve is that our functions are not written in a way that allows for easy optimization. Normally you would want a setup function that initializes all containers, and then a function that does the computations without initializing anything, e.g. as sketched below.
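A minimal sketch of that split (all names here are hypothetical, not the package's API):

using Statistics

# container struct holding everything pre-allocated by the setup step
struct AnalysisCache{T}
    dummy::Vector{T}   # pre-allocated window buffer
    out::Vector{T}     # pre-allocated results
end

# setup function: does all the allocating
function init_analysis(x::AbstractVector{T}, width::Int) where {T}
    return AnalysisCache{T}(zeros(T, width), zeros(T, length(x) - width + 1))
end

# computation function: fills cache.out in-place, allocates nothing itself
function run_analysis!(cache::AnalysisCache, f, x, width::Int)
    for i in eachindex(cache.out)
        copyto!(cache.dummy, view(x, i:i+width-1))
        cache.out[i] = f(cache.dummy)
    end
    return cache.out
end

# usage: cache = init_analysis(x, 20); run_analysis!(cache, var, x, 20)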

Datseris added the help wanted label Oct 14, 2023
@Datseris (Member, Author)

As the codebase grows, more things get merged that are well below optimal. E.g., cc @JanJereczek: the new segmented window code is poorly optimized. Even the most basic operation, getting the segmented window, violates many of the rules outlined in Julia's "performance tips":

function segment(t, x, t1, t2)
    i1, i2 = argmin(abs.(t .- t1)), argmin(abs.(t .- t2))
    return t[i1:i2], x[i1:i2]
end

The function does broadcasting that allocates new vectors, via argmin(abs.(t .- t1)). Do we really need to create the new vector t .- t1? No, we can use an iterator or a for loop instead. Also, do we need to copy the timeseries each time, as x[i1:i2] does? No, we can return a view instead.
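For example, one possible allocation-free rewrite (a sketch, not tested against the codebase):

function segment(t, x, t1, t2)
    # argmin with a function argument (Julia ≥ 1.7) avoids materializing abs.(t .- t1)
    i1 = argmin(i -> abs(t[i] - t1), eachindex(t))
    i2 = argmin(i -> abs(t[i] - t2), eachindex(t))
    return view(t, i1:i2), view(x, i1:i2)
end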

@JanJereczek (Contributor) commented Oct 17, 2023

Hi @Datseris,

I tried to optimize the segmented analysis. Improving things like changing argmin(abs.(t .- t1)) and using views, as you suggested, is easy. But really, the bulk of the memory allocation goes into the surrogate generation. Consider the following example:

using TimeseriesSurrogates, Random, BenchmarkTools

function testsgen(sgen)
    for _ in 1:100
        sgen()
    end
end
x = collect(1.0:100)
# signif.surrogate is the surrogate method stored in the SurrogatesSignificance config
sgen = surrogenerator(x, signif.surrogate, Random.Xoshiro(1995))
@btime testsgen($sgen)
# 143.765 μs (200 allocations: 135.94 KiB)

Now imagine we have $n_{seg}$ segments. The current design of the loops is: outer loop over indicators, inner loop over surrogates (referred to below as option 1). This means that each segment requires $n_i \cdot n_{sur}$ surrogates, with $n_i$ the number of indicators and $n_{sur}$ the number of surrogates passed to the config. Each surrogate generation costs $n_{sgen}$ allocations, with $n_{sgen} = 2$ in the example above. Thus we reach $n_{tot} = n_{seg} \cdot n_{sur} \cdot n_{sgen} \cdot n_i$ allocations, which is substantial. Typically we have $10{,}000$ surrogates, $\mathcal{O}(10)$ segments and $2$ indicators, which leads to $n_{tot} \simeq 10 \cdot 10{,}000 \cdot 2 \cdot 2 = 400{,}000$ allocations.

Either we accept this, or we perform, for instance, the following change: swap the loops so that the outer one runs over surrogates and the inner one over indicators (referred to below as option 2). This makes sense to me, and the reduction in memory allocation would also apply to the sliding approach. I remember that you insisted at some point on having option 1 (I had programmed option 2 and you changed it). I don't see what should prevent us from doing option 2. In fact, besides the reduction in memory allocation, I think it is more correct to perform all the indicator computations on the same surrogate (for a fixed surrogate index).

I have implemented this for the segmented analysis - in a somewhat dirty way so far, but it seems to confirm that we reduce the memory allocation on the following code:

using TransitionsInTimeseries, Statistics, StatsBase, BenchmarkTools

n = 1001
t = collect(1.0:n)
x = copy(t)
t1 = [23.6, 300.7, 777.4]   # some segmentation lower bounds
t2 = [245.5, 684.7, 952.0]  # some segmentation upper bounds

indicators = (mean, var, skewness, kurtosis)
change_metrics = RidgeRegressionSlope()

config = SegmentedWindowConfig(indicators, change_metrics, t1, t2;
    width_ind = 20, min_width_cha = 10, whichtime = last)
res = estimate_indicator_changes(config, x, t);

signif = SurrogatesSignificance(n = 10_000, tail = :right)
pvals = significant_transitions(res, signif)
@btime significant_transitions($res, $signif)
# Former version:  494.023 ms (241554 allocations: 391.31 MiB)
# Current version: 311.283 ms (180845 allocations: 101.18 MiB)

Please tell me what you think.

@Datseris (Member, Author)

> But really, the bulk of the memory allocation goes into the surrogate generation. [...]

This sounds like a bug. The surrogate generation shouldn't allocate for random Fourier surrogates. I'll take a look at it.
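A minimal way to isolate this (a sketch; surrogenerator and RandomFourier are from TimeseriesSurrogates.jl):

using TimeseriesSurrogates, Random, BenchmarkTools
x = rand(1000)
sgen = surrogenerator(x, RandomFourier(), Xoshiro(1995))
@btime $sgen()   # after setup, each call should ideally report 0 allocations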

Option 2 seems good, but what is the impact on performance? Reducing allocations by itself doesn't matter; it is the run time we care about.

@Datseris (Member, Author)

I am confused about what you do in the segmented analysis. I thought you generate the surrogates once and then take the same segmented windows as in the original data. If this is not what you do, then what you do is incorrect and we have to change the source code. Only 100 surrogates must be generated, not more, if the user asked for 100 surrogates. The surrogates must be generated for the whole timeseries, not the window.

@JanJereczek (Contributor)

Actually, I don't understand what you are saying. We only generate the surrogates once for the sliding window. Can you explain to me which part of the code you are referring to, the sliding or the segmented window?

I am speaking about our general design, which includes both the sliding and the segmented window analysis. Some time ago I had programmed the following (here in pseudo-code for the sake of illustration):

for n in 1:n_surrogates
    s = sgen()
    for i in 1:n_indicators
        perform_analysis(s)
    end
end

You changed it to:

for i in 1:n_indicators
    for n in 1:n_surrogates
        s = sgen()
        perform_analysis(s)
    end
end

Which, to me, does not make sense, since it implies generating more surrogates than necessary: $n_i \cdot n_{sur}$ calls to sgen() instead of $n_{sur}$. See for instance surrogates_significance.jl, lines 118-136.

@JanJereczek (Contributor)

> I am confused about what you do in the segmented analysis. I thought you generate the surrogates once and then take the same segmented windows as in the original data. If this is not what you do, then what you do is incorrect and we have to change the source code. Only 100 surrogates must be generated, not more, if the user asked for 100 surrogates. The surrogates must be generated for the whole timeseries, not the window.

I don't agree with that. Since you have transitions in the timeseries, the dynamic regime might be quite different depending on the segment. For instance, segment 1 displays much larger variance than segment 2. Making a surrogate for the whole timeseries would give you an intermediate value of the variance: you "average" the statistics of the different segments in a certain way. I think we quite clearly want a surrogate of the segment that is being analysed. The toy example below illustrates this.
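A toy illustration of that "averaging" (a sketch, not from the codebase; it relies on RandomFourier surrogates preserving the power spectrum of the full series):

using Statistics, TimeseriesSurrogates, Random
x = vcat(0.1 .* randn(500), 2.0 .* randn(500))   # variance changes drastically mid-series
s = surrogenerator(x, RandomFourier(), Xoshiro(1))()
var(x[1:500]), var(x[501:end])   # very different
var(s[1:500]), var(s[501:end])   # roughly equal: the segment statistics got "averaged"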

@Datseris (Member, Author)

> Which, to me, does not make sense, since it implies generating more surrogates than necessary. See for instance surrogates_significance.jl, lines 118-136.

Yes, this doesn't make sense to me either. That was obviously a mistake that we need to fix!

@Datseris (Member, Author) commented Oct 19, 2023

> Making a surrogate for the whole timeseries would give you an intermediate value of the variance: you "average" the statistics of the different segments in a certain way. I think we quite clearly want a surrogate of the segment that is being analysed.

It isn't as clear-cut as you say, though. Surrogates aren't magic: they must conserve various statistical properties for their generation to be meaningful and for the statistical hypothesis to be validly stated. If you have small windows without enough statistics to estimate the quantity that the surrogates conserve, the whole hypothesis-testing infrastructure breaks down. So, the larger the window, the more accurate the statement of the hypothesis test.

@JanJereczek (Contributor)

Sure. Our whole approach of estimating statistics needs timeseries (or segments thereof) that are long enough - that's not even specific to the surrogates. That's also why we require minimum segment lengths in the current version of the code. That should also ensure that the surrogates are meaningful.
