Add typing, some refactor, run and fix all doctests #196

Draft
wants to merge 34 commits into
base: master
8e3d7dd
test mypy action
artemisart Nov 1, 2023
f3b1d4b
poetry mypy
artemisart Nov 1, 2023
e7ffa08
strict flag
artemisart Nov 1, 2023
a583dab
nofilter
artemisart Nov 1, 2023
a570ad6
reporter: github-pr-review
artemisart Nov 1, 2023
992b796
run mypy only on latest
artemisart Jan 19, 2024
fd8accd
add github token
artemisart Jan 19, 2024
e7f6908
add permissions
artemisart Jan 19, 2024
64dfef7
strict
artemisart Jan 19, 2024
91c3eb5
Make it compatible with pandas 2
artemisart Feb 3, 2024
24a2818
pyupgrade, typing
artemisart Jan 22, 2024
482b873
update mypy, don't use --strict in ci
artemisart Feb 3, 2024
0243b24
ci: black show diff
artemisart Feb 3, 2024
55381dd
don't import reveal_type
artemisart Feb 3, 2024
ee1fa19
black format
artemisart Feb 3, 2024
5c9f039
Merge branch 'pandas2' into typing-and-maintenance
artemisart Feb 3, 2024
d7e905f
type type type, factorise distinct select where order_by flatten zip_…
artemisart Feb 3, 2024
34ff4af
try fixes for python 3.8
artemisart Feb 3, 2024
2bdbd01
ci: Add python 3.12
artemisart Feb 3, 2024
8a7fbaa
Update numpy for python 3.12
artemisart Feb 3, 2024
bfb764b
remove todo, tweak ci, typing adjustments
artemisart Feb 3, 2024
63501c7
update pylint astroid
artemisart Feb 4, 2024
6749f41
fix py38 py39
artemisart Feb 4, 2024
a660392
magic bytes ClassVar, fix SupportsRichComparisonT
artemisart Feb 5, 2024
b116d2b
we don't need gzip TextIOWrapper shenanigans
artemisart Feb 5, 2024
91f68f0
type type
artemisart Feb 5, 2024
fec54f0
ci
artemisart Feb 5, 2024
d25c013
fix outer join ordering
artemisart Feb 5, 2024
c6ef3e5
DOCTESTS!
artemisart Feb 5, 2024
986cad2
fix sum
artemisart Feb 5, 2024
b4217db
line too long
artemisart Feb 5, 2024
41a6219
doctest exceptions
artemisart Feb 5, 2024
005269b
doctest sqlite test
artemisart Feb 5, 2024
86d15c2
<py311
artemisart Feb 5, 2024
29 changes: 22 additions & 7 deletions .github/workflows/pythonpackage.yml
@@ -12,10 +12,13 @@ on:
jobs:
build:
runs-on: ubuntu-latest
permissions:
# For https://github.com/tsuyoshicho/action-mypy to display a report on the PR
pull-requests: write
strategy:
fail-fast: false
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11", "pypy3.9"]
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "pypy3.9"]
include:
- python-version: "3.11"
use_pandas: 1
@@ -30,14 +33,26 @@ jobs:
- name: Pylint
run: poetry run pylint functional
- name: black
run: poetry run black --check functional
if: always()
run: poetry run black --check --diff --color functional
if: success() || failure()
- name: Test with pytest
run: poetry run pytest --cov=functional --cov-report=xml
if: always()
run: poetry run pytest --doctest-modules --cov=functional --cov-report=xml
if: success() || failure()
- name: mypy
run: poetry run mypy functional
if: always()
run: |
# First run without --check-untyped-defs that can fail CI
poetry run mypy --warn-unused-configs --warn-redundant-casts --warn-unused-ignores functional
# Second run with --check-untyped-defs, ignored for CI status
poetry run mypy --warn-unused-configs --check-untyped-defs --warn-redundant-casts --warn-unused-ignores --extra-checks functional || true
if: success() || failure()
- uses: tsuyoshicho/action-mypy@v4
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
reporter: github-pr-review
level: warning
execute_command: poetry run mypy --warn-unused-configs --check-untyped-defs --warn-redundant-casts --warn-unused-ignores --extra-checks functional
filter_mode: nofilter
if: (success() || failure()) && matrix.python-version == '3.11' # run only on latest to avoid duplicate warnings
- uses: codecov/codecov-action@v1
with:
file: ./coverage.xml
1 change: 0 additions & 1 deletion docs/conf.py
@@ -1,4 +1,3 @@
# -*- coding: utf-8 -*-
#
# ScalaFunctional documentation build configuration file, created by
# sphinx-quickstart on Wed Mar 11 23:00:20 2015.
8 changes: 4 additions & 4 deletions examples/PyFunctional-pandas-tutorial.ipynb
@@ -152,7 +152,7 @@
],
"source": [
"# Load the initial data using pandas, possibly do some work in pandas\n",
"df = pd.read_csv('camping_purchases.csv', header=None)\n",
"df = pd.read_csv(\"camping_purchases.csv\", header=None)\n",
"df"
]
},
@@ -256,7 +256,7 @@
],
"source": [
"# Show representation using PyFunctional's csv parsing\n",
"seq.csv('camping_purchases.csv')"
"seq.csv(\"camping_purchases.csv\")"
]
},
{
@@ -286,7 +286,7 @@
],
"source": [
"# PyFunctional doesn't try to parse the columns, perhaps an area for improvement\n",
"seq.csv('camping_purchases.csv').list()"
"seq.csv(\"camping_purchases.csv\").list()"
]
},
{
@@ -316,7 +316,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
"version": "3.11.4"
}
},
"nbformat": 4,
8 changes: 8 additions & 0 deletions functional/conftest.py
@@ -0,0 +1,8 @@
import pytest

from functional import seq


@pytest.fixture(autouse=True)
def add_seq(doctest_namespace):
doctest_namespace["seq"] = seq
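For context on what this fixture does: pytest's `doctest_namespace` fixture is a dict whose entries become names available inside collected doctests, so docstring examples can call `seq(...)` without importing it first. The sketch below shows the same idea with only the standard-library `doctest` module. `fake_seq`, `EXAMPLE` and `run_example` are hypothetical stand-ins for illustration, not PyFunctional code.

```python
import doctest

# Docstring-style example that uses `seq` without importing it.
EXAMPLE = """
>>> seq(1, 2, 3)
[1, 2, 3]
"""


def fake_seq(*items):
    """Hypothetical stand-in for functional.seq; it simply returns a list."""
    return list(items)


def run_example():
    """Run EXAMPLE with `seq` pre-injected into the doctest globals."""
    parser = doctest.DocTestParser()
    # `globs` plays the role that `doctest_namespace` plays under pytest.
    test = parser.get_doctest(EXAMPLE, {"seq": fake_seq}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults with `failed` and `attempted` counts
```

Calling `run_example()` runs the single example with no failures, which is roughly what `pytest --doctest-modules` does for each docstring once the fixture has added `seq` to the namespace.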
38 changes: 22 additions & 16 deletions functional/execution.py
@@ -1,22 +1,30 @@
from __future__ import annotations

from functools import partial
from typing import TYPE_CHECKING, Callable, Iterable, Iterator, Optional

from functional.util import compose, parallelize

if TYPE_CHECKING:
from functional.transformations import Transformation

class ExecutionStrategies(object):

class ExecutionStrategies:
"""
Enum like object listing the types of execution strategies.
"""

PRE_COMPUTE = 0
PARALLEL = 1


class ExecutionEngine(object):
class ExecutionEngine:
"""
Class to perform serial execution of a Sequence evaluation.
"""

def evaluate(self, sequence, transformations):
def evaluate(
self, sequence: Iterable, transformations: Iterable[Transformation]
) -> Iterator:
"""
Execute the sequence of transformations in serial
:param sequence: Sequence to evaluation
@@ -25,11 +33,7 @@ def evaluate(self, sequence, transformations):
"""
result = sequence
for transform in transformations:
strategies = transform.execution_strategies
if strategies is not None and ExecutionStrategies.PRE_COMPUTE in strategies:
result = transform.function(list(result))
else:
result = transform.function(result)
result = transform.function(result)
return iter(result)
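A minimal sketch of the simplified serial loop, assuming each transformation only needs to expose a `function` attribute that maps one iterable to another. The `Transformation` tuple here is a hypothetical stand-in for `functional.transformations.Transformation`, and the two lambdas are illustrative only.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Transformation(NamedTuple):
    """Hypothetical stand-in: only the fields this sketch needs."""

    name: str
    function: Callable[[Iterable], Iterable]


def evaluate(
    sequence: Iterable, transformations: Iterable[Transformation]
) -> Iterator:
    """Apply each transformation in order, then return an iterator."""
    result = sequence
    for transform in transformations:
        result = transform.function(result)
    return iter(result)


pipeline = [
    Transformation("map(+1)", lambda xs: map(lambda x: x + 1, xs)),
    Transformation("filter(even)", lambda xs: filter(lambda x: x % 2 == 0, xs)),
]
evens = list(evaluate(range(5), pipeline))  # [2, 4]
```

Because `map` and `filter` are lazy, nothing is computed until the returned iterator is consumed. A transformation that needs a materialized list has to build it inside its own function in this sketch.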


@@ -38,16 +42,20 @@ class ParallelExecutionEngine(ExecutionEngine):
Class to perform parallel execution of a Sequence evaluation.
"""

def __init__(self, processes=None, partition_size=None):
def __init__(
self, processes: Optional[int] = None, partition_size: Optional[int] = None
):
"""
Set the number of processes for parallel execution.
:param processes: Number of parallel Processes
"""
super(ParallelExecutionEngine, self).__init__()
super().__init__()
self.processes = processes
self.partition_size = partition_size

def evaluate(self, sequence, transformations):
def evaluate(
self, sequence: Iterable, transformations: Iterable[Transformation]
) -> Iterator:
"""
Execute the sequence of transformations in parallel
:param sequence: Sequence to evaluation
@@ -58,17 +66,15 @@ def evaluate(self, sequence, transformations):
parallel = partial(
parallelize, processes=self.processes, partition_size=self.partition_size
)
staged = []
staged: list[Callable[[Iterable], Iterable]] = []
for transform in transformations:
strategies = transform.execution_strategies or {}
strategies = transform.execution_strategies
if ExecutionStrategies.PARALLEL in strategies:
staged.insert(0, transform.function)
else:
if staged:
result = parallel(compose(*staged), result)
staged = []
if ExecutionStrategies.PRE_COMPUTE in strategies:
result = list(result)
result = transform.function(result)
if staged:
result = parallel(compose(*staged), result)
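The `staged.insert(0, ...)` call only makes sense if `compose` applies its arguments right to left. That is an assumption here, since `functional.util.compose` is not part of this diff. Under that assumption, the staging step can be sketched as follows; `add_one` and `double` are hypothetical transformations.

```python
from functools import reduce
from typing import Callable, Iterable, List


def compose(*functions: Callable) -> Callable:
    """Assumed right-to-left composition: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)


def add_one(xs: Iterable[int]) -> List[int]:
    return [x + 1 for x in xs]


def double(xs: Iterable[int]) -> List[int]:
    return [x * 2 for x in xs]


staged: List[Callable[[Iterable], Iterable]] = []
for function in (add_one, double):  # order in which transformations arrive
    staged.insert(0, function)  # newest goes first, as in the diff

fused = compose(*staged)  # behaves like double(add_one(xs))
result = fused([1, 2, 3])  # [4, 6, 8]
```

Inserting at the front keeps the fused function in arrival order, so a single fused callable can be handed to each worker rather than one call per transformation.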
96 changes: 47 additions & 49 deletions functional/io.py
@@ -1,18 +1,22 @@
from __future__ import annotations

import builtins
import bz2
import gzip
import lzma
import bz2
import io
import builtins
from pathlib import Path
from typing import Any, ClassVar, Optional, Union

from typing import Optional, Generic, TypeVar, Any
from typing_extensions import TypeAlias

# adapted from typeshed
StrOrBytesPath: TypeAlias = Union[str, bytes, Path]
FileDescriptorOrPath: TypeAlias = Union[int, StrOrBytesPath]

WRITE_MODE = "wt"

_FileConv_co = TypeVar("_FileConv_co", covariant=True)


class ReusableFile(Generic[_FileConv_co]):
class ReusableFile:
"""
Class which emulates the builtin file except that calling iter() on it will return separate
iterators on different file handlers (which are automatically closed when iteration stops). This
@@ -23,7 +27,7 @@ class ReusableFile(Generic[_FileConv_co]):
# pylint: disable=too-many-instance-attributes
def __init__(
self,
path: str,
path: StrOrBytesPath,
delimiter: Optional[str] = None,
mode: str = "r",
buffering: int = -1,
@@ -81,12 +85,12 @@ def read(self):


class CompressedFile(ReusableFile):
magic_bytes: Optional[bytes] = None
magic_bytes: ClassVar[bytes]

# pylint: disable=too-many-instance-attributes
def __init__(
self,
path: str,
path: StrOrBytesPath,
delimiter: Optional[str] = None,
mode: str = "rt",
buffering: int = -1,
@@ -95,7 +99,7 @@ def __init__(
errors: Optional[str] = None,
newline: Optional[str] = None,
):
super(CompressedFile, self).__init__(
super().__init__(
path,
delimiter=delimiter,
mode=mode,
@@ -112,12 +116,12 @@ def is_compressed(cls, data):


class GZFile(CompressedFile):
magic_bytes: bytes = b"\x1f\x8b\x08"
magic_bytes = b"\x1f\x8b\x08"

# pylint: disable=too-many-instance-attributes
def __init__(
self,
path: str,
path: StrOrBytesPath,
delimiter: Optional[str] = None,
mode: str = "rt",
buffering: int = -1,
@@ -126,7 +130,7 @@ def __init__(
errors: Optional[str] = None,
newline: Optional[str] = None,
):
super(GZFile, self).__init__(
super().__init__(
path,
delimiter=delimiter,
mode=mode,
@@ -138,41 +142,35 @@ def __init__(
)

def __iter__(self):
if "t" in self.mode:
with gzip.GzipFile(self.path, compresslevel=self.compresslevel) as gz_file:
gz_file.read1 = gz_file.read
with io.TextIOWrapper(
gz_file,
encoding=self.encoding,
errors=self.errors,
newline=self.newline,
) as file_content:
yield from file_content
else:
with gzip.open(
self.path, mode=self.mode, compresslevel=self.compresslevel
) as file_content:
yield from file_content
with gzip.open(
self.path,
mode=self.mode,
compresslevel=self.compresslevel,
encoding=self.encoding,
errors=self.errors,
newline=self.newline,
) as file_content:
yield from file_content

def read(self):
with gzip.GzipFile(self.path, compresslevel=self.compresslevel) as gz_file:
gz_file.read1 = gz_file.read
with io.TextIOWrapper(
gz_file,
encoding=self.encoding,
errors=self.errors,
newline=self.newline,
) as file_content:
return file_content.read()
def read(self) -> str | bytes:
with gzip.open(
self.path,
mode=self.mode,
compresslevel=self.compresslevel,
encoding=self.encoding,
errors=self.errors,
newline=self.newline,
) as file_content:
return file_content.read()
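A small round-trip sketch of why the `TextIOWrapper` wrapping is unnecessary: `gzip.open` accepts `encoding`, `errors` and `newline` itself when the mode contains `"t"`. The file name and the `read_gzip_lines` helper are hypothetical. One caveat worth checking in review: `gzip.open` raises `ValueError` if `encoding`, `errors` or `newline` is not `None` in binary mode, so the new code relies on those attributes staying `None` for binary reads.

```python
import gzip
import os
import tempfile


def read_gzip_lines(path, encoding="utf-8"):
    """Read a gzip file as text lines using gzip.open alone."""
    with gzip.open(path, mode="rt", encoding=encoding) as file_content:
        return list(file_content)


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt.gz")
    # Write a tiny text file through gzip, then read it back as text.
    with gzip.open(sample_path, mode="wt", encoding="utf-8") as handle:
        handle.write("alpha\nbeta\n")
    lines = read_gzip_lines(sample_path)  # ["alpha\n", "beta\n"]
```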


class BZ2File(CompressedFile):
magic_bytes: bytes = b"\x42\x5a\x68"
magic_bytes = b"\x42\x5a\x68"

# pylint: disable=too-many-instance-attributes
def __init__(
self,
path: str,
path: StrOrBytesPath,
delimiter: Optional[str] = None,
mode: str = "rt",
buffering: int = -1,
@@ -181,7 +179,7 @@ def __init__(
errors: Optional[str] = None,
newline: Optional[str] = None,
):
super(BZ2File, self).__init__(
super().__init__(
path,
delimiter=delimiter,
mode=mode,
@@ -216,7 +214,7 @@ def read(self):


class XZFile(CompressedFile):
magic_bytes: bytes = b"\xfd\x37\x7a\x58\x5a\x00"
magic_bytes = b"\xfd\x37\x7a\x58\x5a\x00"

# pylint: disable=too-many-instance-attributes
def __init__(
@@ -234,7 +232,7 @@ def __init__(
filters=None,
format=None,
):
super(XZFile, self).__init__(
super().__init__(
path,
delimiter=delimiter,
mode=mode,
@@ -278,23 +276,23 @@ def read(self):
return file_content.read()


COMPRESSION_CLASSES = [GZFile, BZ2File, XZFile]
N_COMPRESSION_CHECK_BYTES = max(len(cls.magic_bytes) for cls in COMPRESSION_CLASSES) # type: ignore
COMPRESSION_CLASSES: list[type[CompressedFile]] = [GZFile, BZ2File, XZFile]
N_COMPRESSION_CHECK_BYTES = max(len(cls.magic_bytes) for cls in COMPRESSION_CLASSES)


def get_read_function(filename: str, disable_compression: bool):
def get_read_function(filename: FileDescriptorOrPath, disable_compression: bool):
if disable_compression:
return ReusableFile
with open(filename, "rb") as f:
start_bytes = f.read(N_COMPRESSION_CHECK_BYTES)
for cls in COMPRESSION_CLASSES:
if cls.is_compressed(start_bytes): # type: ignore
if cls.is_compressed(start_bytes):
return cls
return ReusableFile
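A self-contained sketch of the magic-bytes dispatch, showing why `ClassVar[bytes]` on the base class lets the `# type: ignore` comments go away: every class in the list is known to carry a `bytes` value. The class names are hypothetical stand-ins, and the prefix check in `is_compressed` is an assumption, since that method's body is not shown in this diff. The magic byte values are the ones from the diff.

```python
import bz2
import gzip
import lzma
from typing import ClassVar, List, Optional, Type


class Compressed:
    """Hypothetical base class; subclasses must define magic_bytes."""

    magic_bytes: ClassVar[bytes]

    @classmethod
    def is_compressed(cls, data: bytes) -> bool:
        # Assumption: detection is a simple prefix check.
        return data.startswith(cls.magic_bytes)


class GZ(Compressed):
    magic_bytes = b"\x1f\x8b\x08"


class BZ2(Compressed):
    magic_bytes = b"\x42\x5a\x68"


class XZ(Compressed):
    magic_bytes = b"\xfd\x37\x7a\x58\x5a\x00"


CLASSES: List[Type[Compressed]] = [GZ, BZ2, XZ]
# Read only as many bytes as the longest signature needs.
N_CHECK_BYTES = max(len(cls.magic_bytes) for cls in CLASSES)


def detect(data: bytes) -> Optional[Type[Compressed]]:
    """Return the matching class, or None for uncompressed data."""
    start_bytes = data[:N_CHECK_BYTES]
    for cls in CLASSES:
        if cls.is_compressed(start_bytes):
            return cls
    return None
```

For example, `detect(gzip.compress(b"hi"))` returns `GZ`, while `detect(b"plain text")` returns `None`.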


def universal_write_open(
path: str,
path: StrOrBytesPath,
mode: str,
buffering: int = -1,
encoding: Optional[str] = None,