MemoryError #16

Open
cathhriss opened this issue Nov 29, 2017 · 1 comment
Comments

@cathhriss

I'm getting this when trying to process a 209 MB JSON file:

Traceback (most recent call last):
  File "json_to_csv.py", line 85, in
    header += reduced_item.keys()
MemoryError

What can I do?

@vinay20045
Owner

Hey, I was traveling and hence couldn't write to you earlier... Have you resolved the issue yet?

On most modern machines with more than 2 GB of RAM, you should have no issues running the script on your 209 MB JSON file. I'm just curious to know...

  1. What are the details of your system? (RAM, OS, etc.)
  2. How are you running the script? (from the command line, in an IDE, etc.)

The script has room for memory/performance improvement. Here are a couple of approaches you can take if you want to process really large files...

  1. Split the large JSON file into multiple smaller files before processing. This is the easiest option (a rough sketch follows this list).
  2. Modify the header determination logic and use a buffer object to incrementally process the JSON file. This is slightly more difficult than option 1, since you have to handle JSON objects that do not all have the same keys (see the sketch at the end of this comment).
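
To make option 1 concrete, here is a minimal sketch of splitting a top-level JSON array into smaller files. It assumes the input is a single JSON array (which is what json_to_csv.py expects); the chunk size and file names are illustrative, not part of the script.

import json

def split_json_array(input_path, output_prefix, chunk_size=10000):
    # Load the array once and write it back out in pieces. This only helps
    # if it is the later per-item processing (header building, flattening)
    # that runs out of memory, since the initial load is still done in one go.
    with open(input_path, 'r') as f:
        data = json.load(f)

    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        out_path = '%s_%d.json' % (output_prefix, i // chunk_size)
        with open(out_path, 'w') as out:
            json.dump(chunk, out)

# Hypothetical usage: produces data_part_0.json, data_part_1.json, ...
split_json_array('data.json', 'data_part')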

I'm going to leave this issue open and try to do some optimization when I get some time. Let me know what you do in the meanwhile...
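
In case it helps in the meanwhile, here is a rough sketch of option 2. This is not how the script currently works: it uses the third-party ijson library to stream the file in two passes, and a simple flatten() helper as a stand-in for the script's reduced_item logic. The key change for the MemoryError is collecting the header with a set instead of repeatedly concatenating header += reduced_item.keys().

import csv
import ijson  # third-party streaming JSON parser; pip install ijson

def flatten(obj, prefix=''):
    # Stand-in for the script's own flattening: nested dicts become dotted
    # keys, e.g. {'a': {'b': 1}} -> {'a.b': 1}. Lists are left as-is here.
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + '.'))
        else:
            flat[name] = value
    return flat

def json_array_to_csv(json_path, csv_path):
    # Pass 1: build the union of all keys with a set, which stays small
    # even for millions of objects (unlike header += item.keys()).
    header = set()
    with open(json_path, 'rb') as f:
        for item in ijson.items(f, 'item'):  # 'item' = each element of a top-level array
            header.update(flatten(item).keys())
    fieldnames = sorted(header)

    # Pass 2: stream the objects again and write one CSV row at a time;
    # objects missing some keys simply get empty cells (restval='').
    with open(json_path, 'rb') as f, open(csv_path, 'w', newline='') as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames, restval='')
        writer.writeheader()
        for item in ijson.items(f, 'item'):
            writer.writerow(flatten(item))

The two passes trade a second read of the file for never holding all the rows or the full duplicated key list in memory at once.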
