Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review of V3.0.0 changes #1842

Open
wants to merge 25 commits into
base: master
Choose a base branch
from
Open

Review of V3.0.0 changes #1842

wants to merge 25 commits into from

Conversation

hjhsalo
Copy link
Member

@hjhsalo hjhsalo commented Mar 15, 2021

Let's review what changes v3.0.0 introduces.


This change is Reviewable

@hjhsalo hjhsalo reopened this Mar 15, 2021
Copy link
Member Author

@hjhsalo hjhsalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review still WIP, but need to close it in order for others to see the comments.
Let's see if review can be closed and reopened.

@@ -31,7 +21,11 @@
from b2share.modules.access.permissions import admin_only, authenticated_only
from celery.schedules import crontab
from flask import request

from invenio_app.config import APP_DEFAULT_SECURE_HEADERS
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference? Link to documentation?

@@ -31,7 +21,11 @@
from b2share.modules.access.permissions import admin_only, authenticated_only
from celery.schedules import crontab
from flask import request

from invenio_app.config import APP_DEFAULT_SECURE_HEADERS
# from invenio_previewer.config import PREVIEWER_PREFERENCE as BASE_PREFERENCE
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this is commented out?
Reference?

Comment on lines +61 to +64
# Rate limiting
# =============
#: Storage for ratelimiter.
RATELIMIT_STORAGE_URL = os.environ.get("INVENIO_RATELIMIT_STORAGE_URL", 'redis://localhost:6379/3')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference? Link to docs?

Comment on lines +57 to +59
def _(x):
"""Identity function used to trigger string extraction."""
return x
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this?

Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +77 to +99
# Base templates
# ==============
#: Global base template.
BASE_TEMPLATE = 'b2share_main/base.html'
#: Cover page base template (used for e.g. login/sign-up).
COVER_TEMPLATE = 'invenio_theme/page_cover.html'
#: Footer base template.
MAIN_TEMPLATE = 'invenio_theme/footer.html'
#: Header base template.
HEADER_TEMPLATE = 'invenio_theme/header.html'
#: Settings base template.
SETTINGS_TEMPLATE = 'invenio_theme/page_settings.html'

# Theme configuration
# ===================
#: Site name
THEME_SITENAME = _('B2SHARE')
#: Use default frontpage.
THEME_FRONTPAGE = True
#: Frontpage title.
THEME_FRONTPAGE_TITLE = _('B2SHARE')
#: Frontpage template.
THEME_FRONTPAGE_TEMPLATE = 'b2share_main/page.html'
Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which Invenio modules require/provides templates now?
At least invenio-admin provides the admin UI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HK: Since invenio-cookiecutter includes invenio-theme, codebase based on invenio-cookiecutter requires theme defining.

Comment on lines +108 to +111
# Assets
# ======
#: Static files collection method (defaults to copying files).
COLLECT_STORAGE = 'flask_collect.storage.file'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference? Link to documentation?

Comment on lines +101 to +106
# Email configuration
# ===================
#: Email address for support.
SUPPORT_EMAIL = "[email protected]"
#: Disable email sending by default.
MAIL_SUPPRESS_SEND = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to documentation? Invenio-Mail perhaps?

Comment on lines +113 to +127
# Accounts
# ========
#: Email address used as sender of account registration emails.
SECURITY_EMAIL_SENDER = SUPPORT_EMAIL
#: Email subject for account registration emails.
SECURITY_EMAIL_SUBJECT_REGISTER = _(
"Welcome to B2SHARE!")
#: Redis session storage URL.
CACHE_REDIS_URL=os.environ.get("INVENIO_CACHE_REDIS_URL", 'redis://localhost:6379/0')
ACCOUNTS_SESSION_REDIS_URL = os.environ.get("INVENIO_ACCOUNTS_SESSION_REDIS_URL", 'redis://localhost:6379/1')
#: Enable session/user id request tracing. This feature will add X-Session-ID
#: and X-User-ID headers to HTTP response. You MUST ensure that NGINX (or other
#: proxies) removes these headers again before sending the response to the
#: client. Set to False, in case of doubt.
ACCOUNTS_USERINFO_HEADERS = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only Invenio-Accounts or are there configuration variables for inveniosoftware/flask-security in here as well?

BTW, There is for of flask-security, originally named flask-security-too (pallets-eco/flask-security#822 (comment)), which seems to be named to Flask-Middleware/flask-security: https://github.com/Flask-Middleware/flask-security

IDK, will inveniosoftware move to use Flask-Middleware/flask-security in the future or have they already done it.

Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INVENIO_ACCOUNTS_SESSION_REDIS_URL : Note that V3 uses invenio-cache: https://github.com/inveniosoftware/invenio-cache

It might be that V2 codebase invenio modules use Redis directly and now they handle their caching needs through invenio-cache.

Comment on lines +129 to +133
# Database
# ========
#: Database URI including user and password
SQLALCHEMY_DATABASE_URI = os.environ.get("INVENIO_SQLALCHEMY_DATABASE_URI", \
'postgresql+psycopg2://b2share:b2share@localhost/b2share')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation can be found from docs of invenio-db, but maybe we should gather Invenio documentation links to our developer documentation for any newcomers & onboarding activities.

#: It should be changed before deploying.
SECRET_KEY = 'CHANGE_ME'
#: Max upload size for form data via application/mulitpart-formdata.
MAX_CONTENT_LENGTH = 100 * 1024 * 1024 # 100 MiB
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be that Invenio has changed the logic behind invenio-files-rest, since this MAX_CONTENT_LENGTH has been defined here.
-> Is FILES_REST_DEFAULT_MAX_FILE_SIZE is not used any more? (

b2share/b2share/config.py

Lines 406 to 409 in d846068

FILES_REST_DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024 * 1024 # 10 GB per file
"""Maximum file size for the files in a record"""
FILES_REST_DEFAULT_QUOTA_SIZE = 20 * 1024 * 1024 * 1024 # 20 GB per record
)

HK: Unit tests might benefit from a test that uploads a large test file, maybe generated on-the-fly during tests.

#: should be set to the correct host and it is strongly recommended to only
#: route correct hosts to the application.
APP_ALLOWED_HOSTS = [ os.environ.get("SERVER_NAME", 'localhost'), os.environ.get("SERVER_INTERNAL_IP", '127.0.0.1') ]
APP_ALLOWED_HOSTS = None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HK: Setting APP_ALLOWED_HOSTS to None enables requests from anywhere. DO NOT USE THIS IN PRODUCTION. This has been set for development.

Essentially this needs to be set to e.g. domain name of the server where B2SHARE is hosted; (i.e. is being requested from by the client).
See line above for production suitable value.

Comment on lines +165 to +168
# Previewers
# ==========
#: Include IIIF preview for images.
# PREVIEWER_PREFERENCE = ['iiif_image'] + BASE_PREFERENCE
Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we don't use invenio-previewer anywhere?

Invenio-previewer is included in requirements.txt.

If I remember correctly HK run into an issue where invenio-previewer needs to be installed even if we don't use it.
-> Maybe some other invenio package functionality breaks (and as such B2SHARE breaks) if invenio-previewer is not installed. Maybe inveniosoftware forgot to include it as dependency to some package.

# https://flask-debugtoolbar.readthedocs.io/en/latest/#configuration

#: Switches off incept of redirects by Flask-DebugToolbar.
DEBUG_TB_INTERCEPT_REDIRECTS = False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HK: Probably from invenio-cookiecutter.

Comment on lines +179 to +188
# Configures Content Security Policy for PDF Previewer
# Remove it if you are not using PDF Previewer
APP_DEFAULT_SECURE_HEADERS['content_security_policy'] = {
'default-src': ["'self'", "'unsafe-inline'"],
'object-src': ["'none'"],
'script-src': ["'self'", "'unsafe-inline'", "'unsafe-eval'"],
'style-src': ["'self'", "'unsafe-inline'", "data:", "https://fonts.googleapis.com/css"],
'img-src': ["'self'", "data: blob:;"],
'font-src': ["'self'", "data:", "https://fonts.gstatic.com", "https://fonts.googleapis.com"],
}
Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HK: Despite the comment above, B2SHARE breaks if this is not set.
It is worthwhile to check how this could be either removed or restricted.

Comment on lines +269 to +279
#: Endpoint for uploading files.
DEPOSIT_FILES_API = u'/api/files'
#: Template for deposit list view.
DEPOSIT_SEARCH_API = '/api/deposit/depositions'
#: Template for deposit records API.
DEPOSIT_RECORDS_API = '/api/deposit/depositions/{pid_value}'
#: Records REST API endpoints.
RECORDS_API = '/api/records/{pid_value}'
#: Default API endpoint for search UI.
SEARCH_UI_SEARCH_API = "/api/records/"

Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HK: These might not be required.

HH: Might be related to first commits concerning flask_app building.

Comment on lines +381 to +382
consumer_key=os.environ.get("B2ACCESS_CONSUMER_KEY", '*** CONSUMER KEY ***'),
consumer_secret=os.environ.get("B2ACCESS_SECRET_KEY", '*** SECRET KEY ***'),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*** CONSUMER KEY *** and *** SECRET KEY *** are just placeholders. If not defined flask will complain that values have not been set.

Comment on lines +121 to +122
CACHE_REDIS_URL=os.environ.get("INVENIO_CACHE_REDIS_URL", 'redis://localhost:6379/0')
ACCOUNTS_SESSION_REDIS_URL = os.environ.get("INVENIO_ACCOUNTS_SESSION_REDIS_URL", 'redis://localhost:6379/1')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we default to one of these? What does invenio-accounts and invenio-cache documentation say?
Does Invenio-Accounts explicitly require this?

@@ -423,6 +603,7 @@

STATS_EVENTS = {
'file-download': dict(
templates='invenio_stats.contrib.file_download',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that invenio-stats seems to have templates now.
Reference, documentation?

@@ -458,3 +639,13 @@
#: There is no password so don't send password change emails
SECURITY_SEND_PASSWORD_CHANGE_EMAIL=False
SECURITY_SEND_PASSWORD_RESET_NOTICE_EMAIL=False

# Extra (Harry Kodden)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are not set you will get browser errors

SESSION_COOKIE_PATH="/"
SESSION_COOKIE_SAMESITE="Lax"

APP_THEME = ['semantic-ui']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this come from cookiecutter? Does it need to be changed at some point?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SESSION_COOKIE_SAMESITE="Lax"

APP_THEME = ['semantic-ui']
SECURITY_CHANGEABLE = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allows changing of password for invenio accounts. Can be ignored since we are using B2ACCESS for user validation

#: Use default frontpage.
THEME_FRONTPAGE = True
#: Frontpage title.
THEME_FRONTPAGE_TITLE = _('B2SHARE')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the _ in front of B2SHARE. This needs to be defined like this for Babel / i18n

#: Frontpage title.
THEME_FRONTPAGE_TITLE = _('B2SHARE')
#: Frontpage template.
THEME_FRONTPAGE_TEMPLATE = 'b2share_main/page.html'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially the B2SHARE V2 webui/app/index.html.
!! Build of webui and packaging with B2SHARE backend code needs to be adopted.
New Dockerfile addresses this.

Copy link
Contributor

@jupste jupste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP

Comment on lines +84 to +86
COPY webui/app b2share/modules/b2share_main/static

RUN mv b2share/modules/b2share_main/static/index.html b2share/modules/b2share_main/templates/b2share_main/page.html
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these handle moving of webpack built app-folder to correct path in B2SHARE backend and renaming to index.html to b2share/modules/b2share_main/templates/b2share_main/page.html (See THEME_FRONTPAGE_TEMPLATE config.py for more info)

APP_ENABLE_SECURE_HEADERS = (os.environ.get('SESSION_COOKIE_SECURE', "True").upper() == "True".upper())

SESSION_COOKIE_PATH="/"
SESSION_COOKIE_SAMESITE="Lax"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +36 to +44
@app_created.connect
def receiver_app_created(sender, app=None, **kwargs):
app.logger.debug("Application created")

@app_loaded.connect
def receiver_app_loaded(sender, app=None, **kwargs):
app.logger.debug("Application Loaded")
app.logger.debug("Instance Path: {}".format(app.instance_path))
check_configuration(app.config, app.logger)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly logging but NOTE check_configuration is handled with signals now and not syncronously as with V2:

check_configuration(api.config, api.logger)

Comment on lines +59 to +63
Defaults to ``<env_prefix>_STATIC_FOLDER``
or if environment variable is not set ``<sys.prefix>/var/instance/static``.
"""
return os.getenv(env_prefix + '_STATIC_FOLDER') or \
os.path.join(instance_path(), 'static')
Copy link
Member Author

@hjhsalo hjhsalo Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V2 codebase uses B2SHARE_UI_PATH.

b2share/b2share/factory.py

Lines 133 to 135 in d846068

static_folder=os.environ.get(
'B2SHARE_UI_PATH',
os.path.join(api.instance_path, 'static')),

Might need to be changed, or not

Comment on lines +75 to 96
def config_loader(app, **kwargs_config):
"""Configuration loader.

Adds support for loading templates from the Flask application's instance
folder (``<instance_folder>/templates``).
"""
# Create the REST API Flask application
api = create_api(**kwargs)
api.config.update(
APPLICATION_ROOT='/api'
# This is the only place customize the Flask application right after
# it has been created, but before all extensions etc are loaded.
local_templates_path = os.path.join(app.instance_path, 'theme/templates')
if os.path.exists(local_templates_path):
# Let's customize the template loader to look into packages
# and application templates folders.
app.jinja_loader = ChoiceLoader([
FileSystemLoader(local_templates_path),
app.jinja_loader,
])

app.jinja_options = dict(
app.jinja_options,
cache_size=1000,
bytecode_cache=BytecodeCache(app)
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HK: Might not be used. Comes from cookiecutter. B2SHARE V2 has a very simple config_loader:

config_loader = create_conf_loader(config=config, env_prefix=env_prefix)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check if this can be removed?

@@ -0,0 +1,161 @@
# -*- coding: utf-8 -*-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is completely compatible with invenio cookie-cutter

@@ -0,0 +1,9 @@
# -*- coding: utf-8 -*-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed or could we just use the image from dockerhub?

@@ -0,0 +1,91 @@
[console_scripts]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously in setup.py

@@ -241,7 +241,7 @@ def test_invalid_get(app, login_user,
res = client.get(url_for('b2share_communities.communities_item',
community_id=community_id),
headers=headers)
assert res.status_code == 406
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't fix the underlying issue and needs to be reverted

@@ -33,29 +33,35 @@
from invenio_indexer.api import RecordIndexer
from invenio_records.api import Record
from b2share.modules.deposit.api import Deposit

from time import sleep
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush method was not sufficient to refresh the indices

use wait_for_completion instead

@@ -27,7 +27,7 @@
import copy

from flask import url_for
from b2share_unit_tests.helpers import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be addressed with python path envar

@@ -58,7 +58,7 @@ def test_records_serializers_dc(app, test_records_data):
with app.app_context():
pid, record = make_record(test_records_data)
rec = {
'_source': RecordIndexer._prepare_record(
'_source': RecordIndexer()._prepare_record(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_prepare_record can not be invoked without having the parenthases with the RecordIndexer class

@@ -156,7 +156,7 @@ def test_init_fail_and_retry(clean_app):
# create a conflicting table.
db.engine.execute('CREATE table b2share_community (wrong int);')
result = upgrade_run(clean_app)
assert result.exit_code == -1
assert result.exit_code != 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should return -1?

Copy link
Contributor

@jupste jupste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all the files

HarryKodden and others added 21 commits March 24, 2021 13:58
CHANGE: B2ShareRecordSearch to RecordSearch in searchy.py files

Signed-off-by: Jussi Steenari <[email protected]>
…into v3-feature-branch

Signed-off-by: Jussi Steenari <[email protected]>

Conflicts:
	b2share/config.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants