Merge pull request #98 from bmatsuo/bmatsuo/docs-about-locked-threads
docs about locked threads
bmatsuo authored Feb 6, 2017
2 parents bb9ab6b + acb687f commit 92c9f5a
Showing 5 changed files with 237 additions and 20 deletions.
25 changes: 22 additions & 3 deletions README.md
@@ -55,15 +55,29 @@ provided implementations can be considered stable.
###Idiomatic API

API inspired by [BoltDB](https://github.com/boltdb/bolt) with automatic
commit/rollback of transactions. The goal of lmdb-go is to provide idiomatic,
safe database interactions without compromising the flexibility of the C API.
commit/rollback of transactions. The goal of lmdb-go is to provide idiomatic
database interactions without compromising the flexibility of the C API.

**NOTE:** While the lmdb package tries hard to make LMDB as easy to use as
possible, there are compromises, gotchas, and caveats that application
developers must be aware of when relying on LMDB to store their data. All
users are encouraged to fully read the
[documentation](https://godoc.org/github.com/bmatsuo/lmdb-go/lmdb) so they are
aware of these caveats.

Where the lmdb package and its implementation decisions do not meet the needs
of application developers in terms of safety or operational use, the lmdbsync
package has been designed to wrap lmdb and safely fill in additional
functionality. Consult the
[documentation](https://godoc.org/github.com/bmatsuo/lmdb-go/exp/lmdbsync) for
more information about the lmdbsync package.

###API coverage

The lmdb-go project aims for complete coverage of the LMDB C API (within
reason). Some notable features and optimizations that are supported:

- Idiomatic subtransactions ("sub-updates") that do not disrupt thread locking.
- Idiomatic subtransactions ("sub-updates") that allow the batching of updates.

- Batch IO on databases utilizing the `MDB_DUPSORT` and `MDB_DUPFIXED` flags.

@@ -119,6 +133,11 @@ questions of why to use one database or the other.
- As a pure Go package bolt can be easily cross-compiled using the `go`
toolchain and `GOOS`/`GOARCH` variables.

- Its simpler design and implementation in pure Go mean it is free of many
caveats and gotchas which are present using the lmdb package. For more
information about caveats with the lmdb package, consult its
[documentation](https://godoc.org/github.com/bmatsuo/lmdb-go/lmdb).

###Advantages of LMDB

- Keys can contain multiple values using the DupSort flag.
57 changes: 45 additions & 12 deletions lmdb/env.go
@@ -378,18 +378,22 @@ func (env *Env) SetMaxDBs(size int) error {
// calling either its Abort or Commit methods to ensure that its resources are
// released.
//
// BeginTxn does not call runtime.LockOSThread. Unless the Readonly flag is
// passed goroutines must call runtime.LockOSThread before calling BeginTxn and
// the returned Txn must not have its methods called from another goroutine.
// Failure to meet these restrictions can have undefined results that may
// include deadlocking your application.
//
// Instead of calling BeginTxn users should prefer calling the View and Update
// methods, which assist in management of Txn objects and provide OS thread
// locking required for write transactions.
//
// A finalizer detects unreachable, live transactions and logs them to
// standard error. The transactions are aborted, but their presence should be
// interpreted as an application error which should be patched so transactions
// are terminated explicitly. Unterminated transactions can adversely affect
// database performance and cause the database to grow until the map is full.
//
// BeginTxn does not attempt to serialize write transaction operations to an OS
// thread and without care its use for write transactions can have undefined
// results.
//
// Instead of BeginTxn users should call the View, Update, or RunTxn methods.
//
// See mdb_txn_begin.
func (env *Env) BeginTxn(parent *Txn, flags uint) (*Txn, error) {
txn, err := beginTxn(env, parent, flags)
@@ -404,9 +408,10 @@ func (env *Env) BeginTxn(parent *Txn, flags uint) (*Txn, error) {
// Because RunTxn terminates the transaction goroutines should not retain
// references to it or its data after fn returns.
//
// RunTxn does not lock the thread of the calling goroutine. Unless the
// Readonly flag is passed the calling goroutine should ensure it is locked to
// its thread.
// RunTxn does not call runtime.LockOSThread. Unless the Readonly flag is
// passed the calling goroutine should ensure it is locked to its thread and
// any goroutines started by fn must not call methods on the Txn object it is
// passed.
//
// See mdb_txn_begin.
func (env *Env) RunTxn(flags uint, fn TxnOp) error {
@@ -417,6 +422,10 @@ func (env *Env) RunTxn(flags uint, fn TxnOp) error {
// environment and passes it to fn. View terminates its transaction after fn
// returns. Any error encountered by View is returned.
//
// Unlike with Update transactions, goroutines created by fn are free to call
// methods on the Txn passed to fn provided they are synchronized in their
// accesses (e.g. using a mutex or channel).
//
// Any call to Commit, Abort, Reset or Renew on a Txn created by View will
// panic.
func (env *Env) View(fn TxnOp) error {
@@ -427,9 +436,22 @@ func (env *Env) View(fn TxnOp) error {
// if fn returns a nil error otherwise Update aborts the transaction and
// returns the error.
//
// Update locks the calling goroutine to its thread and unlocks it after fn
// returns. The Txn must not be used from multiple goroutines, even with
// synchronization.
// Update calls runtime.LockOSThread to lock the calling goroutine to its
// thread until fn returns and the transaction has been terminated, at
// which point runtime.UnlockOSThread is called. If the calling goroutine is
// already known to be locked to a thread, use UpdateLocked instead to avoid
// premature unlocking of the goroutine.
//
// Neither Update nor UpdateLocked can be called safely from a goroutine
// where it isn't known if runtime.LockOSThread has been called. In such
// situations writes must either be done in a newly created goroutine which can
// be safely locked, or through a worker goroutine that accepts updates to
// apply and delivers transaction results using channels. See the package
// documentation and examples for more details.
//
// Goroutines created by the operation fn must not use methods on the Txn
// object that fn is passed. Doing so would have undefined and unpredictable
// results for your program (likely including data loss, deadlock, etc).
//
// Any call to Commit, Abort, Reset or Renew on a Txn created by Update will
// panic.
@@ -441,6 +463,17 @@ func (env *Env) Update(fn TxnOp) error {
// its thread. UpdateLocked should be used if the calling goroutine is already
// locked to its thread for another purpose.
//
// Neither Update nor UpdateLocked can be called safely from a goroutine
// where it isn't known if runtime.LockOSThread has been called. In such
// situations writes must either be done in a newly created goroutine which can
// be safely locked, or through a worker goroutine that accepts updates to
// apply and delivers transaction results using channels. See the package
// documentation and examples for more details.
//
// Goroutines created by the operation fn must not use methods on the Txn
// object that fn is passed. Doing so would have undefined and unpredictable
// results for your program (likely including data loss, deadlock, etc).
//
// Any call to Commit, Abort, Reset or Renew on a Txn created by UpdateLocked
// will panic.
func (env *Env) UpdateLocked(fn TxnOp) error {
137 changes: 135 additions & 2 deletions lmdb/example_test.go
@@ -6,6 +6,7 @@ import (
"fmt"
"log"
"os"
"runtime"
"time"

"github.com/bmatsuo/lmdb-go/lmdb"
@@ -25,8 +26,10 @@ var err error
var stop chan struct{}

// These values can be used as no-op placeholders in examples.
func doUpdate(txn *lmdb.Txn) error { return nil }
func doView(txn *lmdb.Txn) error { return nil }
func doUpdate(txn *lmdb.Txn) error { return nil }
func doUpdate1(txn *lmdb.Txn) error { return nil }
func doUpdate2(txn *lmdb.Txn) error { return nil }
func doView(txn *lmdb.Txn) error { return nil }

// This example demonstrates a complete workflow for an lmdb environment. The
// Env is first created. After being configured the Env is mapped to memory.
@@ -80,6 +83,73 @@ func Example() {
}
}

// This example demonstrates the simplest (and most naive) way to issue
// database updates from a goroutine for which it cannot be known ahead of time
// whether runtime.LockOSThread has been called.
func Example_threads() {
// Create a function that wraps env.Update and sends the resulting error
// over a channel. Because env.Update is called our update function will
// call runtime.LockOSThread to safely issue the update operation.
update := func(res chan<- error, op lmdb.TxnOp) {
res <- env.Update(op)
}

// ...

// Now, in a goroutine where we cannot determine if we are locked to a
// thread, we can create a new goroutine to process the update(s) we want.
res := make(chan error)
go update(res, func(txn *lmdb.Txn) (err error) {
return txn.Put(dbi, []byte("thisUpdate"), []byte("isSafe"), 0)
})
err = <-res
if err != nil {
panic(err)
}
}

// This example demonstrates a more sophisticated way to issue database updates
// from a goroutine for which it cannot be known ahead of time whether
// runtime.LockOSThread has been called.
func Example_worker() {
// Wrap operations in a struct that can be passed over a channel to a
// worker goroutine.
type lmdbop struct {
op lmdb.TxnOp
res chan<- error
}
worker := make(chan *lmdbop)
update := func(op lmdb.TxnOp) error {
res := make(chan error)
worker <- &lmdbop{op, res}
return <-res
}

// Start issuing update operations in a goroutine in which we know
// runtime.LockOSThread can be called and we can safely issue transactions.
go func() {
runtime.LockOSThread()
defer runtime.UnlockOSThread()

// Perform each operation as we get to it. Because this goroutine is
// already locked to a thread, env.UpdateLocked is called to avoid
// premature unlocking of the goroutine from its thread.
for op := range worker {
op.res <- env.UpdateLocked(op.op)
}
}()

// ...

// In another goroutine, where we cannot determine if we are already locked
// to a thread, submit the update through the worker.
err = update(func(txn *lmdb.Txn) (err error) {
// This function will execute safely in the worker goroutine, which is
// locked to its thread.
return txn.Put(dbi, []byte("thisUpdate"), []byte("isSafe"), 0)
})
}

// This example demonstrates how an application typically uses Env.SetMapSize.
// The call to Env.SetMapSize() is made before calling env.Open(). Any calls
// after calling Env.Open() must take special care to synchronize with other
@@ -159,6 +229,69 @@ func ExampleEnv_Copy() {
}(time.Tick(time.Hour))
}

// This example shows the basic use of Env.Update, the primary method lmdb-go
// provides for storing data in an Env.
func ExampleEnv_Update() {
// It is not safe to call runtime.LockOSThread here because Env.Update
// would later cause premature unlocking of the goroutine. If an
// application requires that goroutines be locked to threads before
// starting an update on the Env then you must use Env.UpdateLocked
// instead of Env.Update.

err = env.Update(func(txn *lmdb.Txn) (err error) {
// Write several keys to the database within one transaction. If
// either write fails and this function returns an error then readers
// in other transactions will not see either value because Env.Update
// aborts the transaction if an error is returned.

err = txn.Put(dbi, []byte("x"), []byte("hello"), 0)
if err != nil {
return err
}
err = txn.Put(dbi, []byte("y"), []byte("goodbye"), 0)
if err != nil {
return err
}
return nil
})
if err != nil {
panic(err)
}
}

// In this example, another C library requires the application to lock a
// goroutine to its thread. When writing to the Env this goroutine must use
// the method Env.UpdateLocked to prevent premature unlocking of the goroutine.
//
// Note that there is no way for a goroutine to determine if it is locked to a
// thread. Failure to call Env.UpdateLocked in a scenario like this can lead to
// unspecified and hard-to-debug failure modes for your application.
func ExampleEnv_UpdateLocked() {
runtime.LockOSThread()
defer runtime.UnlockOSThread()

// ... Do something that requires the goroutine be locked to its thread.

// Create a transaction that will not interfere with thread locking and
// issue some writes with it.
err = env.UpdateLocked(func(txn *lmdb.Txn) (err error) {
err = txn.Put(dbi, []byte("x"), []byte("hello"), 0)
if err != nil {
return err
}
err = txn.Put(dbi, []byte("y"), []byte("goodbye"), 0)
if err != nil {
return err
}
return nil
})
if err != nil {
panic(err)
}

// ... Do something requiring the goroutine still be locked to its thread.
}

// This example shows the general workflow of LMDB. An environment is created
// and configured before being opened. After the environment is opened its
// databases are created and their handles are saved for use in future
32 changes: 31 additions & 1 deletion lmdb/lmdb.go
@@ -4,7 +4,10 @@ fairly low level and are designed to provide a minimal interface that prevents
misuse to a reasonable extent. When in doubt refer to the C documentation as a
reference.
http://symas.com/mdb/doc/group__mdb.html
http://www.lmdb.tech/doc/
http://www.lmdb.tech/doc/starting.html
http://www.lmdb.tech/doc/modules.html
Environment
@@ -17,6 +20,12 @@ of creation. On filesystems that support sparse files this should not
adversely affect disk usage. Resizing an environment is possible but must be
handled with care when concurrent access is involved.
Note that the package lmdb forces all Env objects to be opened with the NoTLS
(MDB_NOTLS) flag. Without this flag LMDB would not be practically usable in Go
(in the author's opinion). However, even for environments opened with this
flag there are caveats regarding how transactions are used (see Caveats below).
Databases
A database in an LMDB environment is an ordered key-value store that holds
@@ -34,6 +43,7 @@ closed but it is not required. Typically, applications acquire handles for all
their databases immediately after opening an environment and retain them for
the lifetime of the process.
Transactions
View (readonly) transactions in LMDB operate on a snapshot of the database at
@@ -50,6 +60,26 @@ transactions do not require explicit calling of Abort/Commit and are provided
through the Env methods Update, View, and RunTxn. The BeginTxn method on Env
creates an unmanaged transaction but its use is not advised in most
applications.
Caveats
Write transactions (those created without the Readonly flag) must be created in
a goroutine that has been locked to its thread by calling the function
runtime.LockOSThread. Furthermore, all methods on such transactions must be
called from the goroutine which created them. This is a fundamental limitation
of LMDB even when using the NoTLS flag (which the package always uses). The
Env.Update method assists the programmer by calling runtime.LockOSThread
automatically but it cannot sufficiently abstract write transactions to make
them completely safe in Go.
A goroutine must never create a write transaction if the application programmer
cannot determine whether the goroutine is locked to an OS thread. This is a
consequence of goroutine restrictions on write transactions and limitations in
the runtime's thread locking implementation. In such situations updates
desired by the goroutine in question must be proxied by a goroutine with a
known state (i.e. "locked" or "unlocked"). See the included examples for more
details about dealing with such situations.
*/
package lmdb

6 changes: 4 additions & 2 deletions lmdb/txn.go
@@ -244,8 +244,10 @@ func (txn *Txn) Drop(dbi DBI, del bool) error {
// nil error is returned by fn and otherwise aborts it. Sub returns any error
// it encounters.
//
// Sub may only be called on an Update (a Txn created without the Readonly
// flag). Calling Sub on a View transaction will return an error.
// Sub may only be called on an Update Txn (one created without the Readonly
// flag). Calling Sub on a View transaction will return an error. Sub assumes
// the calling goroutine is locked to an OS thread and will not call
// runtime.LockOSThread.
//
// Any call to Abort, Commit, Renew, or Reset on a Txn created by Sub will
// panic.
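
The worker pattern that the Caveats section and Example_worker describe can be tried without an LMDB environment. The following is a minimal sketch using only the standard library. The names `txnOp`, `op`, `startWorker`, and `errRejected` are hypothetical and are not part of lmdb-go, and a plain map stands in for the database. In a real program the marked line would call `env.UpdateLocked` instead.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// errRejected is a hypothetical error used to show failure propagation.
var errRejected = errors.New("rejected")

// txnOp stands in for lmdb.TxnOp. The real type receives a *lmdb.Txn; this
// hypothetical version receives a map so the sketch runs without LMDB.
type txnOp func(store map[string]string) error

// op pairs an operation with the channel its result is delivered on.
type op struct {
	fn  txnOp
	res chan<- error
}

// startWorker launches a goroutine that locks itself to an OS thread and
// applies operations one at a time. It returns a function that submits an
// operation and waits for its result, and a function that stops the worker.
func startWorker(store map[string]string) (update func(txnOp) error, stop func()) {
	ops := make(chan op)
	done := make(chan struct{})
	go func() {
		defer close(done)
		// This goroutine is new, so its locked state is known: lock it
		// here and unlock it when the worker exits.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		for o := range ops {
			// Assumption: a real program would call env.UpdateLocked(o.fn)
			// here, because the goroutine is already locked to its thread.
			o.res <- o.fn(store)
		}
	}()
	update = func(fn txnOp) error {
		res := make(chan error)
		ops <- op{fn, res}
		return <-res
	}
	stop = func() {
		close(ops)
		<-done
	}
	return update, stop
}

func main() {
	store := map[string]string{}
	update, stop := startWorker(store)
	defer stop()

	// Callers never need to know whether they are locked to a thread.
	err := update(func(s map[string]string) error {
		s["thisUpdate"] = "isSafe"
		return nil
	})
	fmt.Println(err, store["thisUpdate"])

	// Errors returned by an operation travel back to the caller.
	fmt.Println(update(func(map[string]string) error { return errRejected }))
}
```

Because every write runs on the single worker goroutine, the caller's own thread-locking state never matters, which is the property the Caveats section asks for. The unbuffered result channel also guarantees that the caller sees the operation's effects only after the operation has finished.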