Releases: malteos/scincl
Dataset and pretrained model weights (w/o leakage)
- Pretrained model weights: config.json, pytorch_model.bin (also available on Hugging Face: malteos/scincl-wol)
- Tokenizer: see the w/ leakage release
- Triples (query, positive, negative) and paper metadata: train_triples.csv.gz, train_metadata.jsonl.gz
- Corpus and query papers: s2orc_paper_ids.seed_0.json, query_s2orc_paper_ids.seed_0.json
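
The triples and metadata files are gzip-compressed, so they can be streamed without unpacking them to disk first. The following is a minimal sketch using only the Python standard library. The helper names `iter_csv_rows` and `iter_jsonl` are hypothetical, and the release notes do not document the CSV column layout or the JSON field names, so the code makes no assumption about either: it yields raw CSV rows (including a header row, if the file has one) and whatever JSON objects the lines contain.

```python
import csv
import gzip
import json
from typing import Dict, Iterator, List


def iter_csv_rows(path: str) -> Iterator[List[str]]:
    """Yield each row of a gzip-compressed CSV file as a list of strings.

    Assumption: the file is UTF-8 encoded. A header row, if present,
    is yielded like any other row.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.reader(handle)


def iter_jsonl(path: str) -> Iterator[Dict]:
    """Yield one parsed JSON value per non-empty line of a gzip-compressed JSONL file.

    Assumption: every non-empty line is a complete JSON document.
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

With the files downloaded into the working directory, `next(iter_csv_rows("train_triples.csv.gz"))` returns the first row, which is a quick way to check whether the file carries a header before interpreting the columns as query, positive, and negative. Both helpers are generators, so large files are read lazily rather than loaded into memory at once.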
Dataset and pretrained model weights (w/ leakage)
- Pretrained model weights: config.json, pytorch_model.bin
- Tokenizer: tokenizer_config.json, special_tokens_map.json, vocab.txt
- Triples (query, positive, negative) and paper metadata: train_triples.csv.gz, train_metadata.jsonl.gz
- ID mappings: scidocs__s2id_to_s2orc_paper_id.latest.json.gz, specter__s2id_to_s2orc_paper_id.json.gz
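
The ID mapping files are gzip-compressed JSON. Judging only by the file names, each one plausibly holds a single JSON object that maps an S2 ID to an S2ORC paper ID. That structure is an assumption, not something the release notes state, and the helper names `load_json_gz` and `translate_ids` below are hypothetical. Under that assumption, a minimal sketch for loading a mapping and translating a list of IDs could look like this:

```python
import gzip
import json
from typing import Any, Dict, Iterable, List


def load_json_gz(path: str) -> Any:
    """Parse a gzip-compressed JSON file and return its content.

    Assumption: the file is UTF-8 encoded and holds one JSON document.
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


def translate_ids(ids: Iterable[str], mapping: Dict[str, Any]) -> List[Any]:
    """Return the mapped value for every ID found in `mapping`.

    IDs without an entry are skipped silently, so the result can be
    shorter than the input.
    """
    return [mapping[i] for i in ids if i in mapping]
```

If the loaded content turns out not to be a flat dictionary, `translate_ids` does not apply as written and the structure should be inspected first, for example with `type(load_json_gz(path))`.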