Skip to content

A fast parser combinator library, with extra querying goodness

License

Notifications You must be signed in to change notification settings

manojo/parsequery

Repository files navigation

parsequery

Build Status

What is Parsequery?

Parsequery is a parser combinator library. It uses staging to remove composition overhead at compile time to produce efficient, fast parsers. Its interface is almost similar to Scala's parser combinators.

Example

To benefit from staging, you wrap your parser declarations in the optimise scope. Here is an example JSON parser:

  sealed abstract class JSValue
  case class JSObject(dict: List[(String, JSValue)]) extends JSValue
  case class JSArray(arr: List[JSValue]) extends JSValue
  case class JSDouble(d: Double) extends JSValue
  case class JSString(s: String) extends JSValue
  case class JSBool(b: Boolean) extends JSValue
  case object JSNull extends JSValue

  val jsonParser = optimise {

    def value: Parser[JSValue] = (
      obj |
      arr |
      stringLiteral.map(x => JSString(x)) |
      double.map(x => JSDouble(x)) |
      accept("null").map(_ => JSNull) |
      accept("true").map(_ => JSBool(true)) |
      accept("false").map(_ => JSBool(false))
    )

    def obj: Parser[JSValue] = (skipWs(accept('{')) ~>
      repsep(member, skipWs(accept(',')))
    <~ skipWs(accept('}'))) map { x => JSObject(x) }

    def arr: Parser[JSValue] = (skipWs(accept('[')) ~>
      repsep(value, skipWs(accept(',')))
    <~ skipWs(accept(']'))) map { x => JSArray(x) }

    def member: Parser[(String, JSValue)] =
      stringLiteral ~ (skipWs(accept(':')) ~> value)

    value
  }

You run the parser as follows:

val myReader = CharReader("""{"libname": "parsequery"}""".toArray)

jsonParser(myReader) match {
  case Success(res, rest) => println(res)
  case Failure(err, _)    => println(err)
}

You can find more examples in the test files here

How do I use it?

A snapshot is available on Sonatype. To use it with SBT, add the following lines to your build:

libraryDependencies += "com.github.manojo" %% "parsequery" % "0.1-SNAPSHOT"
resolvers += Resolver.sonatypeRepo("snapshots")

How fast is it?

Check for yourself, by running the following two benchmarks, sources here and here:

> bench:testOnly parsequery.BooleansBenchmark
> bench:testOnly parsequery.JSONBenchmark

On these benchmarks we are currently about 2-3x faster than FastParse.

More Information

The main goal of parsequery is to systematically eliminate all intermediate data structures that are creating when running a traditional parser combinator program. Typically, parser combinators interleave static composition of parsers with the dynamic act of parsing itself, at runtime. The key insight is that we can fully decouple the static parts from the dynamic ones. This is done by leveraging a technique known as partial evaluation. To know more about the ideas check out the these blog posts on optimising list pipelines and parser combinators. A more detailed blog post catered to the current implementation is to appear soon.

You can find a talk about this here and the accompanying slides here.

This implementation draws inspiration from previous implementations in the LMS framework, found [here](https://github.com/ma nojo/functadelic/tree/master/src/main/scala/stagedparsec) and a macro implementation by the man called Eric Béguet, found here.

About

A fast parser combinator library, with extra querying goodness

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages