module__WebOfKnowledgeReader

Robert Bossy edited this page Jul 27, 2017 · 1 revision

# org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader

Synopsis

Reads Web of Knowledge search result import files.

Description

WARNING: WoK delivers files with an incorrect Byte Order Mark (BOM). Remove it with a text editor before feeding the file to org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader.
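If editing each file by hand is impractical, the BOM can also be removed with a small script. The following is a minimal sketch, not part of AlvisNLP: the function names `strip_bom` and `strip_bom_file` are hypothetical, and it assumes the unwanted mark is one of the standard UTF-8, UTF-16 or UTF-32 BOM byte sequences at the very start of the file.

```python
import codecs

# Standard BOM byte sequences. The UTF-32 marks come first because the
# UTF-32 LE mark starts with the same two bytes as the UTF-16 LE mark.
_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading byte order mark, if one is present."""
    for bom in _BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data


def strip_bom_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, dropping a leading BOM (hypothetical helper)."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_bom(data))
```

The script only removes the leading bytes and does not re-encode the remaining content, so the cleaned file keeps whatever encoding WoK used.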

The PT field (Publication Type) is used as a document marker: org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader creates a new document each time it reads a PT field.

The VR field is read; if its value is not "1.0", org.bibliome.alvisnlp.modules.wok.WebOfKnowledgeReader fails.

The following fields will be read and stored as document features, one feature per line: AU, AF, BA, BF, CA, GP, BE, SO, SE, BS, LA, CT, CY, CL, SP, HO, C1, RP, EM, RI, OI, FU, CR, TC, Z9, PU, PI, PA, SN, BN, J9, JI, PD, PY, VL, IS, PN, SU, MA, BP, EP, AR, DI, D2, PG, P2, GA, UT, SI, NR.

The following fields will be read and stored as document features, with each line split on semicolons into several features: DE, DT, ID, WC, SC.

The following fields will be read and stored as sections, with all lines of the field concatenated to form the section contents: TI, AB, FX.

The following fields will be ignored: ER, EF, FN.

The feature and section names are the two-character field codes. For an interpretation of the field codes, see the WoK format documentation.
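The field handling described above can be illustrated with a short Python sketch. This is not the module's implementation (AlvisNLP itself is written in Java); `parse_wok` and the dictionary layout are hypothetical. It assumes that a field line starts with the two-character code followed by a space, that continuation lines start with whitespace, that section lines are joined with a single space, and that the PT value itself is not stored. None of these details is stated in the description above.

```python
from typing import Dict, List, Optional

# Field codes taken from the description above.
ONE_FEATURE_PER_LINE = set(
    "AU AF BA BF CA GP BE SO SE BS LA CT CY CL SP HO C1 RP EM RI OI FU CR "
    "TC Z9 PU PI PA SN BN J9 JI PD PY VL IS PN SU MA BP EP AR DI D2 PG P2 "
    "GA UT SI NR".split()
)
SPLIT_ON_SEMICOLON = {"DE", "DT", "ID", "WC", "SC"}
SECTIONS = {"TI", "AB", "FX"}
IGNORED = {"ER", "EF", "FN"}


def parse_wok(text: str) -> List[dict]:
    """Parse tagged WoK text into a list of documents (illustrative sketch)."""
    documents: List[dict] = []
    current: Optional[dict] = None
    tag: Optional[str] = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if raw[:2].strip():
            # Assumption: the line starts with a field code, then a space.
            tag, value = raw[:2], raw[3:].strip()
        else:
            # Assumption: an indented line continues the previous field.
            value = raw.strip()
        if tag is None or tag in IGNORED:
            continue
        if tag == "VR":
            if value != "1.0":
                raise ValueError("unsupported VR value: " + repr(value))
            continue
        if tag == "PT":
            # Each PT field starts a new document; its value is not kept here.
            current = {"features": {}, "sections": {}}
            documents.append(current)
            continue
        if current is None:
            continue  # field found before any PT marker
        features: Dict[str, List[str]] = current["features"]
        sections: Dict[str, str] = current["sections"]
        if tag in ONE_FEATURE_PER_LINE:
            features.setdefault(tag, []).append(value)
        elif tag in SPLIT_ON_SEMICOLON:
            parts = [p.strip() for p in value.split(";") if p.strip()]
            features.setdefault(tag, []).extend(parts)
        elif tag in SECTIONS:
            # Assumption: lines are concatenated with a single space.
            sections[tag] = (sections[tag] + " " + value) if tag in sections else value
        # Codes not listed in the description are silently skipped.
    return documents
```

With a made-up record containing two AU lines, a two-line TI field and a DE line holding two keywords separated by a semicolon, `parse_wok` returns one document whose `AU` and `DE` features each hold two values and whose `TI` section holds the joined title.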

Parameters

Optional

Type: SourceStream

Location of the WoK file(s).

Optional

Type: Mapping

Constant features to add to each document created by this module.

Optional

Type: Mapping

Constant features to add to each section created by this module.

Default value: false

Type: Boolean

Read files in tabular export format.