Spark ML wrappers for processors of TwoStage scenario #52

Open
wants to merge 26 commits into
base: main
26 commits
d0459c8
Add word2vec optimizations
netang Mar 9, 2023
8d0511f
Add word2vec optimizations
netang Mar 15, 2023
f627b64
Add `num_item_blocks` and `num_user_blocks` params to `ALSWrap`
netang Mar 16, 2023
bd1a4ec
Add ANN interface and implementations
netang Mar 16, 2023
8eeeff8
Add dependencies
netang Mar 16, 2023
4eedd86
Add flag to use cluster spark session
netang Mar 16, 2023
98701b7
new feature: optimized TwoStage
Mar 17, 2023
a55bf66
Fix problems
netang Mar 17, 2023
8383556
Merge branch 'sb-main-word2vec' into sb-main-two-stage
netang Mar 20, 2023
0d91094
Merge branch 'sb-main-als-num-blocks' into sb-main-two-stage
netang Mar 20, 2023
9d6b3d1
Merge branch 'sb-main-ann' into sb-main-two-stage
netang Mar 20, 2023
85845ad
Merge branch 'sb-main-cluster-session-flag' into sb-main-two-stage
netang Mar 21, 2023
14b749b
Add example script
netang Mar 21, 2023
9aa9255
Remove debug lines
netang Mar 21, 2023
953a368
minor bugfixes and adjustments
Jun 2, 2023
43e045a
prepared test for SlamaWrap
Jun 2, 2023
1d6576d
---
Jun 2, 2023
c40058a
add reranker estimator and reranker model
Jun 4, 2023
bfdc3b4
unfinished implementation of slama wrap reranker with SparkML interfaces
Jun 4, 2023
c019472
add writer and reader for slama and lama wraps
Jun 4, 2023
f17e1b0
add save/load using slama/lama wrap Spark ML interfaces
Jun 4, 2023
da83ae1
add lama wrap Spark ML interface
Jun 4, 2023
edff0bc
fixed calling of the reranker model
Jun 4, 2023
6b618d4
add correct types for second stage model
Jun 4, 2023
85dd6cb
---
Jun 4, 2023
6fc0974
refactoring: splitting entities to multiple files
Jun 4, 2023
23 changes: 21 additions & 2 deletions .gitignore
@@ -29,7 +29,7 @@ lib/
lib64
parts/
sdist/
-src/
+src/*
var/
wheels/
pip-wheel-metadata/
@@ -145,7 +145,26 @@ data/
pyvenv.cfg
*~
tmp/
docs/_build/
*.DS_Store
*/catboost_info

# Spark files
metastore_db/

.vscode

# VSCode Scala extension files
.metals
.bloop
.bsp

# meld
*.orig

# supplementary files
rsync-repo.sh
requirements.txt
airflow.yaml

# temporary
experiments/tests