Large memory footprint of Cox-PH model #37

Open
davibarreira opened this issue Jun 1, 2022 · 1 comment

davibarreira commented Jun 1, 2022

Hey fellas, thanks a lot for this package. I've been using it recently, and I noticed that, for some reason, it's taking a huge toll on my memory. I have a DataFrame with 20k rows. Once I load it, about 4 GB of memory is in use. After fitting the Cox-PH model, usage grows to 11 GB and stays that way, even if I place the whole thing inside a function.

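using CSV, DataFrames, Survival

sdf = CSV.read("sdf.csv", DataFrame);
function coxphcoef(sdf)
    # Covariates: every column except the duration column
    X = Matrix(sdf[!, Not(:duracao)])
    # Treat every observation as an observed event (no censoring)
    y = EventTime.(sdf[!, :duracao], true)
    @show cphmodel = coxph(X, y)
    return cphmodel.β
end
coxphcoef(sdf);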

What is going on here?

BTW, I'm on Julia 1.7.2, using:
DataFrames v1.3.2
Survival v0.2.2

ararslan (Member) commented Jun 1, 2022

The structures that keep track of the various parts of the model fitting process, including the model object that's returned, keep some potentially very large arrays in memory. I'd be curious to know whether Julia will free memory appropriately with your coxphcoef function if you return copy(cphmodel.β); perhaps it can't otherwise prove that the entire cphmodel object isn't still needed since you're returning one of its fields. (Just a guess)
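One way to try this suggestion is sketched below. It is only a rough experiment, not a confirmed fix: `coxphcoef_copy` is a hypothetical helper name, the data is synthetic stand-in data rather than the original `sdf.csv`, and it assumes the same `coxph`, `EventTime` and `cphmodel.β` used in the snippet from the issue. `Base.summarysize` reports how much memory the fitted model object reaches, and `Base.gc_live_bytes` gives a rough view of what is still live after a forced collection.

```julia
using Survival

# Hypothetical helper (not part of Survival.jl): fit the model, report its
# size, and return a copy of the coefficients that is detached from the model.
function coxphcoef_copy(X, y)
    cphmodel = coxph(X, y)
    # Size in bytes of everything reachable from the fitted model object
    println("model size: ", Base.summarysize(cphmodel) / 2^20, " MiB")
    return copy(cphmodel.β)
end

# Synthetic stand-in data (assumption): 200 rows, 3 covariates, no censoring
X = randn(200, 3)
y = EventTime.(rand(200) .+ 0.1, true)

β = coxphcoef_copy(X, y)

GC.gc()  # force a full collection before measuring
println("live after GC: ", Base.gc_live_bytes() / 2^20, " MiB")
```

Comparing the two printed numbers with and without the `copy` call, on data of the original size, would show whether the model object is what keeps memory in use. Memory reported by the operating system can stay high even after Julia has collected the objects, so the live-bytes figure is the more direct check.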
