Huge harvester hangs #549

Open
ziorick opened this issue Jun 8, 2024 · 0 comments

Hi all!
I have installed CKAN 2.10.3. I'm trying to harvest (using the ckan-harvester plugin) another, very large CKAN portal (data.gov) with about 296k datasets. I don't need to import "remote_orgs", and my configuration only sets "clear_tags" to true.
The gather process starts successfully and requests the correct API path and row number from the remote portal, so everything works at first. After the first read stage, the gather process starts logging "Creating HarvestObject for ..." for each dataset, but it never writes the line "xxxxxx datasets sent to fetch queue" or similar, as other harvest processes do. This instance runs on 32 GB of DDR4 RAM and a 40c/40t Xeon CPU. The result is a CKAN process that uses about 10% CPU and 35% RAM while Postgres keeps growing (the harvest_object table), but no fetch is ever started.
Can you help me?
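
For anyone trying to reproduce this, the source configuration described above can be sketched as follows. This is a minimal sketch, assuming the setting is entered as the harvest source's JSON configuration; the key is spelled exactly as in this report and has not been checked against the plugin's documented options, and the variable names are only illustrative.

```python
import json

# Harvest source configuration as described in the report.
# Assumption: the key is spelled "clear_tags" exactly as written above;
# "remote_orgs" is left out because importing remote organizations is not needed.
source_config = {"clear_tags": True}

# Serialize to the JSON text that would be pasted into the source's
# configuration field.
config_json = json.dumps(source_config, indent=2)
print(config_json)
```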
