Empty list of OAI-PMH sets to harvest due to exclusion for some centre registry endpoints #30

Open
twagoo opened this issue Jul 17, 2020 · 3 comments
@twagoo
Member

twagoo commented Jul 17, 2020

After the implementation of #26, an endpoint whose only defined set(s) are excluded can end up with no records harvested at all, even if it also has records that could be harvested without specifying a set.

A concrete example is the HZSK endpoint. It does not specify sets but the centre registry defines a set of the 'WebLicht' type. The harvester applies the following filtering:

Sets before exclusion: [WebLicht]

Sets after exclusion: []

The result of this can be seen in the endpoint specific log:

2020-07-16T18:04:16,689 INFO [HZSK Repository] Worker - Processing provider[HZSK Repository (only set(s): ) @ http://corpora.uni-hamburg.de:8080/oai/provider] using scenario[ListRecords], incremental[false], timeout[60] and retry[count=2,delays=[10000]]

The empty "only set(s): " in this log line implies that nothing is harvested. Solution: if only an empty list of sets remains after filtering, set-based harvesting should not take place.
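A rough sketch of the proposed behaviour, in Python rather than the harvester's own code. It assumes that "set harvesting should not take place" means falling back to a request without a set restriction. The names `effective_sets` and `plan_requests` and the `oai_dc` default are made up for illustration and are not taken from the harvester.

```python
from typing import Dict, Iterable, List, Optional


def effective_sets(configured: Iterable[str],
                   excluded: Iterable[str]) -> Optional[List[str]]:
    """Return the sets left after exclusion, or None if none remain.

    None means "do not restrict by set", which is different from an
    empty list (an empty list would select nothing at all).
    """
    excluded_names = set(excluded)
    remaining = [s for s in configured if s not in excluded_names]
    return remaining or None


def plan_requests(configured: Iterable[str],
                  excluded: Iterable[str],
                  metadata_prefix: str = "oai_dc") -> List[Dict[str, str]]:
    """Build the ListRecords request parameters to issue (hypothetical helper)."""
    base = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    sets = effective_sets(configured, excluded)
    if sets is None:
        # Nothing left after filtering: one request without a set parameter.
        return [dict(base)]
    # Otherwise one request per remaining set.
    return [dict(base, set=name) for name in sets]
```

For the HZSK case, `plan_requests(["WebLicht"], ["WebLicht"])` yields a single request without a `set` parameter instead of an empty list of set-restricted requests.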

@twagoo twagoo self-assigned this Jul 17, 2020
twagoo added a commit that referenced this issue Jul 17, 2020
@menzowindhouwer
Contributor

menzowindhouwer commented Jul 17, 2020

Does your commit already fix the problem? Otherwise I can take it along with the work I'm currently doing, i.e. bringing all (HuC/CLARIN) branches together in a new 1.2 release ...

@twagoo
Member Author

twagoo commented Jul 21, 2020

@menzowindhouwer

> Does your commit already fix the problem?

I haven't checked it yet. Will need to test, ideally write some unit tests. Maybe I'll manage today or tomorrow, but if you see a chance to look at this in the meantime - feel free :) Either way I'll keep you posted!

@menzowindhouwer
Contributor

I'm working on the tests right now, so I'll add one. See https://github.com/menzowindhouwer/oai-harvest-manager/tree/develop


2 participants