Uses OpenSearch for search and some reporting #649

Open
wants to merge 92 commits into
base: develop
@sfisher sfisher commented Jun 5, 2024

I've done a pass of cleaning up code and adding documentation to the docs repository relating to OpenSearch.

I believe this is all working on development, but it hasn't been indexed or tested on stage yet. I think all the functionality works, though I have only done a cursory examination of search results and exercised each advanced search field to check that it seems to work. More thorough testing would be great before we release.

We also need to run indexing on each environment before release (stage, production).

Please see the docs/ezid_search_opensearch.md file in the documentation repo for a lot of background about the search.

Updated in this PR:

  • Simple search and results
  • Advanced search and results
  • Manage IDs page, simple and advanced.
  • Manage IDs selection of owner (user or group).
  • Manage IDs "Download all" functionality (part of the "batch download")
  • Batch downloads (the proc-download daemon needs to be running for this to work)
  • Search updating daemon proc-search-indexer
  • OpenSearch schema saved.
  • Example record.
  • Minor changes to get results page to display from OpenSearch instead of DB.
  • OpenSearchDoc class and tests.
  • Small changes to some of the requirements (add OpenSearch library) and some new values in SSM.

PS. I believe this PR also incorporates everything in #604. Since #604 hasn't been merged yet, I may just close it.

…use of 'doc' being required in some cases (upsert) but not others.
…ather than methods) in the open_search builder class.
…d the date/time an opensearch record was updated.
…er objects of owner, ownergroup, profile, resource
… useful even though it can be looked up the ark/doi
…there is an existing python library called that and it's confusing.
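
For reviewers unfamiliar with the 'doc' distinction mentioned in the commit notes above, here is a minimal sketch of the two request body shapes involved. It assumes the general OpenSearch convention that a full index request takes the bare document, while a partial update used as an upsert wraps the fields under `doc`. The helper names and the example field names are hypothetical and are not taken from the OpenSearchDoc class in this PR.

```python
from typing import Any, Dict


def index_body(record: Dict[str, Any]) -> Dict[str, Any]:
    """Body for a full index request: the bare document, no wrapper."""
    return dict(record)


def upsert_body(record: Dict[str, Any]) -> Dict[str, Any]:
    """Body for an update request used as an upsert.

    The fields go under 'doc', and 'doc_as_upsert' asks the server to
    create the document from 'doc' if it does not exist yet.
    """
    return {"doc": dict(record), "doc_as_upsert": True}


# Hypothetical record; the field names are illustrative only.
example = {
    "id": "ark:/99999/fk4example",
    "owner": "apitest",
    "target": "https://example.org/",
}

full = index_body(example)
partial = upsert_body(example)
```

With the opensearch-py client, bodies like these would typically be passed as `body=` to `client.index(...)` and `client.update(...)` respectively; the index name and document id are left out here because they depend on the deployment.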

sfisher commented Jun 20, 2024

Thanks for finding the issue with the target URL search, Jing. I've fixed that. Let me know what other changes you'd like to see. You mentioned reverting the report so that it still uses the SearchIdentifiers table? I think what I had worked when I tested it, but if you want to switch that back to SearchIdentifiers and separate more from that table later, that is OK with me.

jsjiang commented Jun 20, 2024

@sfisher Hi Scott,
How about restoring the proc-download.py script and having it use the "searchIdentifier" table as its data source? One reason is that the crossrefStatus field, which is used to identify CrossRef identifiers, is not indexed in the identifier table. We can refactor the identifier and searchIdentifier tables and related functionality later.

Jing

sfisher commented Jun 25, 2024

I reverted proc-download.py to use searchIdentifier. The changes I made are in the commit history if you want to disentangle it again later.
