Related to #1395
During batch result report generation, the result is stored in a variable before being written to a file (Internet.nl/interface/batch/util.py, lines 270 to 274 in 9e4d250).
For batch requests with 5000 domains this results in a memory usage of 1 GB, for 10k domains almost 2 GB, and so on. The worker performing this task therefore needs this much memory available for the short time it takes to generate the reports. Furthermore, the worker retains this memory until the next report generation runs, at which point the memory is reused but not freed.
Suggest refactoring the generation logic in `gather_batch_results` to write the report file to disk in a streaming fashion, which would eliminate the `dom_results` variable (Internet.nl/interface/batch/util.py, line 291 in 9e4d250).
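Because `dom_results` (the `domains` field in the report, Internet.nl/interface/batch/util.py, line 340 in 9e4d250) is a dictionary/object, existing JSON encoders might not be able to handle this in a streaming manner. But because the report structure is simple enough, it might be best to write a custom encoder, or to write the JSON directly without any encoder/library.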
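To make the suggestion concrete, here is a minimal sketch of writing the JSON directly. It is not the current code in `util.py`. The function name `write_report_streaming`, the `header` argument and the shape of the per-domain results are all hypothetical; the only assumption taken from the report is that it is an object whose large `domains` field maps domain names to results.

```python
import io
import json
from typing import Any, Iterable, Mapping, TextIO, Tuple


def write_report_streaming(
    out: TextIO,
    header: Mapping[str, Any],
    domain_results: Iterable[Tuple[str, Any]],
) -> None:
    """Write {<header fields>, "domains": {...}} without building the full dict.

    `header` holds the small top-level fields (hypothetical; it must not
    contain a "domains" key). `domain_results` yields (domain, result)
    pairs, ideally from a generator so only one result is in memory at a time.
    """
    out.write("{")
    # The small top-level fields are encoded with the standard encoder.
    for key, value in header.items():
        out.write(f"{json.dumps(key)}: {json.dumps(value)}, ")
    # The large "domains" object is emitted one entry at a time.
    out.write('"domains": {')
    for index, (domain, result) in enumerate(domain_results):
        if index:
            out.write(", ")
        out.write(f"{json.dumps(domain)}: {json.dumps(result)}")
    out.write("}}")


if __name__ == "__main__":
    def fake_results():
        # Stand-in for iterating over per-domain results from the database.
        yield "example.nl", {"status": "ok"}
        yield "example.com", {"status": "error"}

    buffer = io.StringIO()
    write_report_streaming(buffer, {"request": {"name": "demo"}}, fake_results())
    print(buffer.getvalue())
```

In a worker, `out` would be the opened report file and `domain_results` a generator over the stored per-domain results, so peak memory would depend on the largest single result rather than on the number of domains in the batch.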