Query Translation

  • Query Languages, Visualization

  • FCS-QL Details

  • Query Mapping

Query Languages

CQL-JS Demo

FCS-QL – Visualization

FCS-QL parse tree
  • Installation

    pip install antlr4-tools
    git clone https://github.com/clarin-eric/fcs-ql.git
    cd fcs-ql/src/main/antlr4/eu/clarin/sru/fcs/qlparser
  • Visualization according to ANTLR4 > Getting Started

    antlr4-parse FCSParser.g4 FCSLexer.g4 query -gui
    [ word = "her.*" ] [ lemma = "Artznei" ] [ pos = "VERB" ]
    ^D

FCS-QL Query Nodes

QueryNode (with child node “children”)

  • Expression (layer identifier, layer identifier qualifier, operator, regular expression + flags)

    • Wildcard

    • Group → 1 QueryNode; “(” … “)”

    • NOT → 1 QueryNode

    • AND, OR → list of QueryNodes

  • QueryDisjunction → list of QueryNodes

  • QuerySequence → list of QueryNodes → “list of QuerySegments”

  • QuerySegment (min, max) → Expression → “a single token”

  • QueryGroup (min, max) → QueryNode

  • Within-Query (SimpleWithin, QueryWithWithin) (Scope: sentence, utterance, paragraph, turn, text, session) (unused)

  • grayed out: currently not supported by the FCS Aggregator for searching (in visual query builder)
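The node types listed above can be modelled as a small class hierarchy. The following is a minimal sketch, not the API of the FCS-QL parser library: all class and field names (`child`, `items`, `min_occurs`, `max_occurs`, `count_nodes`) are assumptions chosen for illustration, and the unused Within-Query nodes are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueryNode:
    """Base class: every node can list its child nodes."""

    def children(self) -> List["QueryNode"]:
        return []


@dataclass
class _SingleChild(QueryNode):
    child: QueryNode

    def children(self) -> List[QueryNode]:
        return [self.child]


@dataclass
class _ChildList(QueryNode):
    items: List[QueryNode]

    def children(self) -> List[QueryNode]:
        return list(self.items)


@dataclass
class Expression(QueryNode):
    layer: str                       # layer identifier, e.g. "word"
    operator: str                    # comparison operator, e.g. "="
    regex: str                       # regular expression (the value)
    qualifier: Optional[str] = None  # layer identifier qualifier
    flags: str = ""                  # regex flags


@dataclass
class ExpressionWildcard(QueryNode):
    """Wildcard node without children."""


@dataclass
class ExpressionGroup(_SingleChild):
    """Parenthesised expression: "(" ... ")"."""


@dataclass
class ExpressionNot(_SingleChild):
    """Negation of one node."""


@dataclass
class ExpressionAnd(_ChildList):
    """Conjunction of a list of nodes."""


@dataclass
class ExpressionOr(_ChildList):
    """Disjunction of a list of nodes."""


@dataclass
class QueryDisjunction(_ChildList):
    """Alternative queries."""


@dataclass
class QuerySequence(_ChildList):
    """List of query segments."""


@dataclass
class QuerySegment(_SingleChild):
    """A single token, optionally repeated."""
    min_occurs: int = 1
    max_occurs: int = 1


@dataclass
class QueryGroup(_SingleChild):
    """A grouped sub-query, optionally repeated."""
    min_occurs: int = 1
    max_occurs: int = 1


def count_nodes(node: QueryNode) -> int:
    """Count a node and all of its descendants."""
    return 1 + sum(count_nodes(c) for c in node.children())


query = QuerySequence([
    QuerySegment(Expression("word", "=", "her.*")),
    QuerySegment(Expression("pos", "=", "VERB"), 1, 3),
])
print(count_nodes(query))
```

Because every node exposes `children()`, generic helpers such as `count_nodes` work without knowing the concrete node type.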

FCS-QL Query Nodes – Aggregator

FCS-QL Query Builder

Parsed Query:

  • Query Sequence: a list of Query Segments

    [ word = ".*her" ] [ lemma = "Artznei" ] [ pos = "VERB" ]

  • Query Segment: a token (can be repeated)

    [ word = "her.*" & ( word = "test" | word = "Apfel" ) ] [ pos = "ADV" ]{1,3}

    • Expression AND

      [ word = "her.*" & word = "test" ]

      • Expression Group

      • Expression

    • Expression Group, Expression OR: list of Expressions

      [ ( word = "her.*" | word = "Test" ) ]

    • Expression: Layer Identifier, Operator, Regex (value)

      [ word = "her.*" ]
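To make the nesting in the examples above concrete, here is a small sketch that builds the second example query as a tree and serializes it back to FCS-QL text. The classes and the `to_fcsql` function are hypothetical stand-ins, not the parser library's types, and values are written out without any escaping.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Expression:
    layer: str
    operator: str
    regex: str


@dataclass
class ExpressionGroup:
    child: object


@dataclass
class ExpressionAnd:
    operands: List[object]


@dataclass
class ExpressionOr:
    operands: List[object]


@dataclass
class QuerySegment:
    expression: object
    min_occurs: int = 1
    max_occurs: int = 1


@dataclass
class QuerySequence:
    segments: List[QuerySegment]


def to_fcsql(node) -> str:
    """Serialize a (hypothetical) query tree back to FCS-QL text."""
    if isinstance(node, Expression):
        # Assumption: the regex contains no characters that need escaping.
        return f'{node.layer} {node.operator} "{node.regex}"'
    if isinstance(node, ExpressionGroup):
        return f"( {to_fcsql(node.child)} )"
    if isinstance(node, ExpressionAnd):
        return " & ".join(to_fcsql(n) for n in node.operands)
    if isinstance(node, ExpressionOr):
        return " | ".join(to_fcsql(n) for n in node.operands)
    if isinstance(node, QuerySegment):
        once = (node.min_occurs, node.max_occurs) == (1, 1)
        quantifier = "" if once else f"{{{node.min_occurs},{node.max_occurs}}}"
        return f"[ {to_fcsql(node.expression)} ]{quantifier}"
    if isinstance(node, QuerySequence):
        return " ".join(to_fcsql(s) for s in node.segments)
    raise TypeError(f"unknown node type: {type(node).__name__}")


query = QuerySequence([
    QuerySegment(ExpressionAnd([
        Expression("word", "=", "her.*"),
        ExpressionGroup(ExpressionOr([
            Expression("word", "=", "test"),
            Expression("word", "=", "Apfel"),
        ])),
    ])),
    QuerySegment(Expression("pos", "=", "ADV"), 1, 3),
])
print(to_fcsql(query))
```

The tree has one Query Sequence with two Query Segments; the first segment holds an Expression AND whose second operand is an Expression Group wrapping an Expression OR.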

FCS-QL – Remarks

  • Currently (Aggregator v3.9.1), only a subset of FCS-QL features is supported

    → partly due to the Visual Query Builder

  • Free text input / improved query builder planned for the future

  • Use appropriate diagnostics if query features are not supported

    • SRU: info:srw/diagnostic/1/48 - Query feature unsupported.

    • FCS: http://clarin.eu/fcs/diagnostic/10 - General query syntax error. (should already be caught by the FCS-QL parser library)

    • FCS: http://clarin.eu/fcs/diagnostic/11 - Query too complex. Cannot perform Query.
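One way an endpoint might pick between these diagnostics is to walk the parsed query before translating it. The sketch below is only an illustration: the `Node` type, the `QueryDiagnostic` exception, the set of supported node kinds and the node limit are all assumptions, and only the diagnostic URIs and messages are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import List

# Diagnostic URIs as listed above.
SRU_QUERY_FEATURE_UNSUPPORTED = "info:srw/diagnostic/1/48"
# Listed for completeness: syntax errors are expected to come from the parser.
FCS_GENERAL_QUERY_SYNTAX_ERROR = "http://clarin.eu/fcs/diagnostic/10"
FCS_QUERY_TOO_COMPLEX = "http://clarin.eu/fcs/diagnostic/11"


class QueryDiagnostic(Exception):
    """Hypothetical exception carrying a diagnostic URI and message."""

    def __init__(self, uri: str, message: str, details: str = ""):
        super().__init__(f"{uri} - {message} {details}".strip())
        self.uri = uri
        self.message = message
        self.details = details


@dataclass
class Node:
    """Hypothetical, simplified query node: a kind plus child nodes."""
    kind: str
    children: List["Node"] = field(default_factory=list)


# Assumptions: which node kinds this endpoint can translate, and how large
# a query it is willing to handle.
SUPPORTED_KINDS = {"sequence", "segment", "expression", "and", "or", "group"}
MAX_NODES = 50


def check_supported(root: Node) -> int:
    """Walk the tree and raise a diagnostic for queries we cannot run.

    Returns the number of nodes visited if the query is acceptable.
    """
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        count += 1
        if node.kind not in SUPPORTED_KINDS:
            raise QueryDiagnostic(
                SRU_QUERY_FEATURE_UNSUPPORTED,
                "Query feature unsupported.",
                node.kind,
            )
        if count > MAX_NODES:
            raise QueryDiagnostic(
                FCS_QUERY_TOO_COMPLEX,
                "Query too complex. Cannot perform Query.",
            )
        stack.extend(node.children)
    return count


try:
    check_supported(Node("segment", [Node("not", [Node("expression")])]))
except QueryDiagnostic as diagnostic:
    print(diagnostic)
```

Running the check before translation keeps the translation code free of error handling for features it never sees.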

Query Mapping

  • Idea:

    • Let libraries parse raw queries (CQL, FCS-QL)

    • Recursively walk through the parsed query tree, “depth first”

    • Incrementally build the transformed query (for the target system),

      e.g. with a StringBuilder in Java

  • Examples:

  • Elasticsearch

  • Solr

    • Only BASIC Search

    • ADVANCED Search with e.g. MTAS (“Multi Tier Annotation Search”)

  • In general: use an actual corpus search engine for ADVANCED Search

    → otherwise at most a single annotation layer (“text”) can be searched
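The idea above (parse, walk depth first, build the target query piece by piece) can be sketched for a BASIC search as follows. This is a minimal illustration under several assumptions: the `Term` and `Boolean` classes stand in for whatever tree a CQL parser returns, the target is a Lucene-style query string as accepted by Solr's standard query parser, and the field name `text` is hypothetical. A list of string parts plays the role of Java's StringBuilder.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Term:
    """Hypothetical leaf node: a search term or phrase."""
    text: str


@dataclass
class Boolean:
    """Hypothetical inner node: two sub-queries joined by an operator."""
    operator: str  # "AND", "OR" or "NOT" (CQL's binary "and not")
    left: "Node"
    right: "Node"


Node = Union[Term, Boolean]


def _escape(text: str) -> str:
    """Escape backslashes and double quotes inside a quoted phrase."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def _translate(node: Node, out: List[str], field: str) -> None:
    """Depth-first walk that appends query fragments to `out`."""
    if isinstance(node, Term):
        out.append(f'{field}:"{_escape(node.text)}"')
    elif isinstance(node, Boolean):
        if node.operator not in ("AND", "OR", "NOT"):
            raise ValueError(f"unsupported operator: {node.operator}")
        out.append("(")
        _translate(node.left, out, field)
        out.append(f" {node.operator} ")
        _translate(node.right, out, field)
        out.append(")")
    else:
        raise TypeError(f"unknown node type: {type(node).__name__}")


def to_query_string(node: Node, field: str = "text") -> str:
    """Translate a parsed query tree into a Lucene-style query string."""
    parts: List[str] = []
    _translate(node, parts, field)
    return "".join(parts)


query = Boolean("AND", Term("Artznei"), Boolean("OR", Term("Apfel"), Term("test")))
print(to_query_string(query))
```

Every term is mapped to the same field, which reflects the limitation noted above: without a corpus search engine, only a single annotation layer can be searched.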