Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lmdb: methods to support custom comparison Txn.SetCmp(), Txn.SetCmpDup() #27

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

fiatjaf
Copy link

@fiatjaf fiatjaf commented Dec 6, 2023

This is an old unmerged PR from @bmatsuo that I have been using in production for many months now.

It allows one to write custom comparators for DBIs in C, like this:

package main

/*
#include <string.h>
#include <stdint.h>
#include <lmdb.h>

#ifndef _MDB_CUSTOM_COMPARE_H
#define _MDB_CUSTOM_COMPARE_H

static int mdb_cmp_memn(const MDB_val *a, const MDB_val *b) {
    int diff;
    ssize_t len_diff;
    unsigned int len;

    len = a->mv_size;
    len_diff = (ssize_t) a->mv_size - (ssize_t) b->mv_size;
    if (len_diff > 0) {
        len = b->mv_size;
        len_diff = 1;
    }

    diff = memcmp(a->mv_data, b->mv_data, len);
    return diff ? diff : len_diff<0 ? -1 : len_diff;
}

int strfry_uint64_comparator(const MDB_val *a, const MDB_val *b) {
    MDB_val a2 = *a, b2 = *b;
    a2.mv_size -= 8;
    b2.mv_size -= 8;

    int stringCompare = mdb_cmp_memn(&a2, &b2);
    if (stringCompare) return stringCompare;

    uint64_t ai;
    uint64_t bi;
    memcpy(&ai, (char*)a->mv_data + a->mv_size - 8, 8);
    memcpy(&bi, (char*)b->mv_data + b->mv_size - 8, 8);

    if (ai < bi) return -1;
    else if (ai > bi) return 1;
    return 0;
}

#endif
*/
import "C"

Then use these with txn.SetCmp():

		if db, err := txn.OpenDBI("rasgueadb_defaultDb__Event__created_at", 0); err != nil {
			return err
		} else {
			txn.SetCmp(db, (*lmdb.CmpFunc)(unsafe.Pointer(C.strfry_uint64_comparator)))
			eventsByCreatedAtIndexDbi = db
		}

@wojas
Copy link
Member

wojas commented Dec 7, 2023

Thank you for submitting this.

As far as I can see, this was never submitted as a PR by bmatsuo before and only lived in a branch. Are you aware of any issues and missing functionality with this code, or other reasons that this was never included? Aside from the ugliness and pain of having to provide an unsafe C function pointer, of course.

Some thoughts:

  • Would it make sense to rename the functions to txn.UnsafeSetCmp, etc?
  • This includes a demo binary in exp, but no tests that are actually being run as part of our CI flow.

On one hand I am not sure if we can and want to maintain such a fragile unsafe interface in this library, on the other hand I can see how this could help when interfacing with existing LMDBs that you have no control over and use this feature. This feature could break in a future Go release, as CGo is not part of the Go 1.x compatibility promise, and this is kind of stretching the scope of its use, so perhaps a warning about this needs to be added to the doc comments, and perhaps add // Deprecated: experimental feature that may disappear in a future version so that editors mark any calls as such.

@fiatjaf
Copy link
Author

fiatjaf commented Dec 7, 2023

Honestly, I don't really understand much of what is happening here. All I can say is that I tried it because I was interfacing with an external LMDB that had been created using a custom comparator and it worked for me and I haven't had a single issue. Also I've only used the Txn.SetCmp(), didn't try Txn.SetCmpDup().

But I think it's very reasonable to not have this merged. I would say you could perhaps just keep this PR here, either closed or open, or I could make it a draft, and then in the rare occasion that someone wants this they could find this and use the commits here in a local patch like I have been doing.

@fiatjaf fiatjaf marked this pull request as draft December 7, 2023 14:04
@wojas
Copy link
Member

wojas commented Dec 7, 2023

After some thought I think it would be useful to have it, precisely for such circumstances.

I would say that it can be merged after the following changes:

  • Move the implementation to a separate file so that we can easily exclude it with build tags later if needed (e.g. when not supported in a newer Go version or on another platform)
  • Add some tests, also in a separate file
  • Prefix the names with Unsafe
  • Add the Deprecated doc comment mentioned above

If you do not feel like doing this, that's OK. Maybe somebody else will in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants