Releases: malteos/scincl

Dataset and pretrained model weights (w/o leakage)

07 Mar 10:40
  • Pretrained model weights: config.json, pytorch_model.bin (also available on Hugging Face as malteos/scincl-wol)
  • Tokenizer: see the w/ leakage release
  • Triples (query, positive, negative) and paper metadata: train_triples.csv.gz, train_metadata.jsonl.gz
  • Corpus and query papers: s2orc_paper_ids.seed_0.json, query_s2orc_paper_ids.seed_0.json
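
The two compressed training files can be streamed without unpacking them to disk first. The sketch below is a minimal example using only the Python standard library. It assumes the triples file is plain comma-separated text and the metadata file holds one JSON object per line; the column order, the presence of a header row, and the metadata field names are not documented in this release, so inspect the first rows before relying on them. The helper names `iter_csv_rows` and `iter_jsonl` are hypothetical.

```python
import csv
import gzip
import json


def iter_csv_rows(path):
    """Yield each row of a gzip-compressed CSV file as a list of strings."""
    # newline="" lets the csv module handle line endings itself.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        yield from csv.reader(f)


def iter_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a gzip-compressed JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

For example, `next(iter_csv_rows("train_triples.csv.gz"))` returns the first row, which shows whether the file starts with a header, and `next(iter_jsonl("train_metadata.jsonl.gz"))` returns the first metadata record.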

Dataset and pretrained model weights (w/ leakage)

22 Feb 11:29
1d81384
  • Pretrained model weights: config.json, pytorch_model.bin
  • Tokenizer: tokenizer_config.json, special_tokens_map.json, vocab.txt
  • Triples (query, positive, negative) and paper metadata: train_triples.csv.gz, train_metadata.jsonl.gz
  • ID mappings: scidocs__s2id_to_s2orc_paper_id.latest.json.gz, specter__s2id_to_s2orc_paper_id.json.gz
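
The ID mapping files are gzip-compressed JSON. As a rough sketch, and assuming each file contains a single JSON object whose keys are S2 IDs and whose values are S2ORC paper IDs (inferred from the file names, not confirmed by this release), a mapping could be loaded as follows. The function name `load_id_mapping` is hypothetical.

```python
import gzip
import json


def load_id_mapping(path):
    """Load a gzip-compressed JSON file and return it as a dict.

    Assumption: the top-level JSON value is an object (key -> value mapping).
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        mapping = json.load(f)
    if not isinstance(mapping, dict):
        # Fail loudly if the file layout differs from the assumption above.
        raise ValueError(f"expected a JSON object in {path}, got {type(mapping).__name__}")
    return mapping
```

A call such as `load_id_mapping("specter__s2id_to_s2orc_paper_id.json.gz")` would then return the whole mapping in memory; the explicit type check makes a wrong assumption about the layout visible immediately.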