Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support high-availability setups for the interactive tools proxy #18481

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

kysrpex
Copy link
Contributor

@kysrpex kysrpex commented Jul 2, 2024

This PR closes #16138. At the moment, information exchanges between the interactive tools proxy and Galaxy are done via a SQLite database. This makes it impossible to have the high-availability (redundant) setup for the proxy that we would like to have at useGalaxy.eu.

To solve the problem, I opted for a relatively simple solution. Since Galaxy already uses SQLAlchemy, I refactored the existing class InteractiveToolSqlite from lib/galaxy/managers/interactivetool.py to use SQLAlchemy instead of Python's built-in sqlite3 library. The changes are backwards compatible. Setting interactivetools_map to a path makes Galaxy use a SQLite database, just like it did before. Setting it to a SQLAlchemy database url allows to use any database supported by SQLAlchemy (e.g. PostgreSQL). Mappings are written to the table "gxitproxy", just as they did before.

Of course, this also needs support on the interactive tools proxy's side, which will be contributed separately.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

This is a refactor of the "propagator" InteractiveToolSqlite in lib/galaxy/managers/interactivetool.py. Tests are already available and can be run using
./run_tests.sh -integration test/integration/test_interactivetools_api.py::TestInteractiveToolsIntegration. I used standard SQL so that the code is compatible with the widest possible variety of databases.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@@ -18,125 +19,129 @@
)
from galaxy.model.base import transaction
from galaxy.security.idencoding import IdAsLowercaseAlphanumEncodingHelper
from galaxy.util.filelock import FileLock
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I may be missing something regarding the FileLock. What role was it actually playing? Isn't SQLite already taking care of locks?

The pager module effectively controls access for separate threads, or separate processes, or both. Throughout this document whenever the word "process" is written you may substitute the word "thread" without changing the truth of the statement.

…ing for interactive tools

Replace the class `InteractiveToolSqlite` in lib/galaxy/managers/interactivetool.py with a new class `InteractiveToolPropagatorSQLAlchemy`. The new class implements a SQLAlchemy "propagator" for `InteractiveToolManager` (on the same file). This propagator writes the mappings to the table named after the value of `DATABASE_TABLE_NAME`, in the database specified by the SQLAlchemy database url passed to its constructor. Change the constructor of `InteractiveToolManager` so that it uses `InteractiveToolPropagatorSQLAlchemy`.

Change the method `_process_config` of `galaxy.config.GalaxyAppConfiguration` so that it converts the value of `interactivetools_map` to a SQLAlchemy database url if it is a path.

Update documentation to reflect these changes.
@kysrpex kysrpex force-pushed the interactivetoolssqlalchemy branch from 355446f to b54bf35 Compare July 2, 2024 09:24
@jdavcs jdavcs self-requested a review July 2, 2024 14:20
@bgruening
Copy link
Member

Great stuff @kysrpex!

@sveinugu that might also be interesting for you.

@sveinugu
Copy link
Contributor

sveinugu commented Jul 4, 2024

Sounds good!

Just make sure that the info field keeps supporting json, as I opted for storing extra attributes to the proxy that are used for path-based interactive tools as json in the "info" field instead of adding new fields to the schema (the info field is not used for SQL queries, so I don't think there is a need to explicitly type this as JSON, plain text should be fine).

Currently the path-based ITs were reverted in the EU instance due to an error. I made a PR to revert this for the 24.1 release, now merged: usegalaxy-eu#242

So just make sure that the path-based ITs still run (first commit of that PR)!

@kysrpex
Copy link
Contributor Author

kysrpex commented Jul 4, 2024

@sveinugu Regarding the JSON, it's stored as plain text. Seems to work fine (see the psql output below from a Galaxy instance).

galaxy=> select * from gxitproxy;
      key      |         key_type          |     token     |      host      | port  |                                  info                                  
---------------+---------------------------+---------------+----------------+-------+------------------------------------------------------------------------
 326unmiv5sanl | interactivetoolentrypoint | 2jlcp7mzyjsbm | 192.168.122.64 | 32780 | {"requires_path_in_url": false, "requires_path_in_header_named": null}
(1 row)

I'll test the path-based ITs and let you know if they're working.

@kysrpex
Copy link
Contributor Author

kysrpex commented Jul 4, 2024

@sveinugu path-based ITs cause a redirection loop (I am using gx-it-proxy from galaxyproject/gx-it-proxy#21).

grafik

galaxy=> select * from gxitproxy;
      key      |         key_type          |     token     |      host      | port  |                                 info                                  
---------------+---------------------------+---------------+----------------+-------+-----------------------------------------------------------------------
 1540u2qksgef3 | interactivetoolentrypoint | 2ypyg6q4fsp20 | 192.168.122.64 | 32782 | {"requires_path_in_url": true, "requires_path_in_header_named": null}
(1 row)

Should they be working? Have you encountered this before? This may be a misconfiguration of my Galaxy instance, as I did not change what the proxy does with the sessions, just where it stores them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IT / IE Proxy has no redundancy setup support
3 participants