Proposed SysV structures and Mixin functionality #64

Closed
wants to merge 6 commits into from

kyewei
Contributor

@kyewei kyewei commented Nov 12, 2015

Old massive PR: #54
Small PR 1: #62

Possibly outstanding issues

  • With the SysV versions of the structure needing permissions and a name, Semian::Simple::SlidingWindow.new(max_size: 4) and Semian::SysV::SlidingWindow.new(max_size: 4, permissions: 0660, name: "sample_int") have different interfaces. Ways to solve
    • do def initialize(max_size: , **) current choice
    • do def initialize(options = {}) or something of the sort
    • not pass in permissions and name at all, but they have to come from somewhere else without leaking the implementation, possibly from Semian.register
  • Slightly extraneous: it was previously mentioned that test suite naming is a bit off, for example TestCircuitBreaker is circuit_breaker_test.rb
  • Naming issue of execute_atomically: you know what, I changed it to #synchronize, similar to classes like Mutex and Monitor. It's still aliased to #transaction
  • #shared? It's private, and just returns @using_shared_memory. Hopefully won't ruffle any feathers, or maybe just the ivar is enough?
  • It was mentioned (not sure if still applies) that giving name and permissions to the structures was bad. Personally I disagree, but if so, start a discussion below about it.
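
To make the first bullet concrete, here is a minimal sketch of the `def initialize(max_size:, **)` option. The `Sketch` namespace and the class bodies are hypothetical stand-ins, not the code in this PR; the point is only that one options hash can be handed to either constructor.

```ruby
# Hypothetical stand-ins for the Simple and SysV sliding windows.
module Sketch
  module Simple
    class SlidingWindow
      attr_reader :max_size

      # The bare `**` swallows extra keywords (name:, permissions:)
      # that the in-process version has no use for.
      def initialize(max_size:, **)
        @max_size = max_size
      end
    end
  end

  module SysV
    class SlidingWindow < Simple::SlidingWindow
      attr_reader :name, :permissions

      # The shared version needs the extra keywords, so it names them.
      def initialize(max_size:, name:, permissions:)
        super(max_size: max_size)
        @name = name
        @permissions = permissions
      end
    end
  end
end

# The same options work for both classes.
options = { max_size: 4, name: "sample_int", permissions: 0660 }
simple = Sketch::Simple::SlidingWindow.new(**options)
sysv   = Sketch::SysV::SlidingWindow.new(**options)
```

The trade-off is that the Simple constructor silently ignores misspelled keywords, which may or may not be acceptable.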

Content
Alright, here's what I propose for adding in the SysV structures. I didn't include the tests yet, they will obviously fail because there's no C code to provide actual shared memory in this commit.

The three SysV classes inherit directly from the Simple equivalents. They also mixin the Semian::SysVSharedMemory module, which provides the common functionality.

From the C perspective, to get the memory allocation and things to work, it needs to do a bunch of things. Certain ways of accomplishing them are IMO worse than others, but I'll list the things that the C side needs to do, and the different ways to do them, and why/how I decided on this way.
C needs to accomplish:

  • Hooking the allocation of a Ruby object so a backing C struct is made when the class is instantiated, whether or not acquire_memory_object is called. The C code needs this when unpacking the Ruby VALUE self object into a proper C semian_shm_object *
  • Hook the implementations of the shared methods like #acquire_shared_memory at a shared level
  • Override the specific implementations of a class, like #value=, #<<, etc

This is about to get heavy.

There were 3 ways of doing the above, and there is a diamond problem here. I originally had (1), but I changed to (2).

(1): Have a superclass SharedMemoryObject, and Simple structures inherit from that. The SysV further inherit from the Simple

  • Pros: Only hook access methods, allocation once, can replace functions at any point in time and it will work
  • Cons: Weird to have class Simple::Integer < SharedMemoryObject even though it doesn't do sharing at all

(2): Mix in a SysVSharedMemory module that shares common functionality into the SysV variants only

  • Pros: Makes sense from class organization perspective, mixin a module if you want shared-ness

  • Cons: Although you can hook methods like acquire_memory_object in one spot, you have to replace the three different allocation methods separately. There is also a timing issue involved. You can't simply do this:

    module SysVSharedMemory
    ...
    def self.included(base)
      # this doesn't work! The classes that mix in this module
      # are defined before semian.so is loaded,
      # meaning this isn't defined when self.included is called
      call_c_function_to_override_alloc()  
    end
    end

    Instead, as I found out, you had to do something like this in the C extension:

    void Init_semian_shm_object() {
    ...
    call_c_function_to_override_alloc_for(klass)
    ...
    }

(3): Make the SysV classes inherit from SharedMemoryObject. But what do we do about the common functionality between Simple and SysV versions? Put that in a module, and include it into things that need the functionality, like this:

  • class Simple::Enum
    module EnumImplementation
      ...
    end
    include EnumImplementation
    end
    class SysV::Enum < SharedMemoryObject
    include Simple::Enum::EnumImplementation
    end
  • Pros: Share things. Single point of replacing methods.
  • Cons: Again, this is ugly.

If you made it this far, 👍, it was really long.
I chose (2) because it's the most structurally clean, and I have a working solution to the timing problem, and there are no random abuses of Ruby features like Mixins beyond what is necessary.
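
The timing problem described under (2) can be reproduced in plain Ruby. This is only a sketch: all names are hypothetical, and `override_alloc` stands in for whatever singleton method the C extension would define once it loads.

```ruby
# Hypothetical names throughout; `override_alloc` stands in for a
# singleton method that the C extension would define when it loads.
module SketchSharedMemory
  HOOK_LOG = []

  def self.included(base)
    # Runs at the moment `include` executes, i.e. while the class body
    # is being evaluated, which is before the "extension" below loads.
    HOOK_LOG << [base.name, respond_to?(:override_alloc)]
  end
end

class SketchInteger
  include SketchSharedMemory
end

# Simulate the extension loading afterwards and doing the override
# itself, once per class, instead of relying on the included hook.
module SketchSharedMemory
  OVERRIDDEN = []

  def self.override_alloc(klass)
    OVERRIDDEN << klass
  end
end

SketchSharedMemory.override_alloc(SketchInteger)
```

The hook log records that `override_alloc` was not yet available when the class was defined, which is why the override has to be driven from the extension's init function instead.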

I've isolated the locking and unlocking: the separate lock() and unlock() functions are replaced by a single synchronize(), and with some delegation I wrap each method so that it calls lock() first and runs unlock() in an ensure.

Here's the typical structure of an acquire that encompasses the C features as well just so we're on the same page:

:acquire_memory_object
  # calculate byte_size, verify semaphores enabled and _acquire exists, else exit
  :_acquire (in C as semian_shm_object_acquire())
    # receive name, permissions, byte_size, check validity
    # hash the name, 
    :bind_initialize_memory_callback
      # It binds a void* function pointer to the C struct. 
         the function is a callback that initializes memory of a given size.
         this is specialized per SysV structure.
    semian_shm_object_acquire_semaphore()
      # call semget, set permissions
    semian_shm_object_synchronize()
      semian_shm_object_lock_without_gvl()
      semian_shm_object_synchronize_with_block() (in begin block)
        semian_shm_object_check_and_resize_if_needed()
          # check requested byte_size against a possible existing memory block, 
             resizing if necessary by calling shmctl to check sizes
          semian_shm_object_acquire_memory()
            # call callback bound by :bind_initialize_memory_callback to initialize the acquired memory
            # call shmat to attach
      semian_shm_object_synchronize_restore_lock_status() (in ensure block)
  # set @using_shared_memory to true

Here's the structure of a synchronized access. Any method defined using define_method_with_synchronize is synchronized.

:value
  :synchronize
    semian_shm_object_lock_without_gvl()
    semian_shm_object_synchronize_with_block() (in begin block)
      semian_shm_object_check_and_resize_if_needed()
      :value_inner (all the _inner methods are private)
        # access value from C struct
    semian_shm_object_synchronize_restore_lock_status() (in ensure block)

define_method_with_synchronize basically defines the original method, and then calls do_with_sync :method_name
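
A rough Ruby-only sketch of that wrapping, assuming a `Monitor` in place of the SysV semaphore. The module and class names are hypothetical and the body is not the code from the branch; it only illustrates the "rename to `_inner`, then define a synchronized version" idea.

```ruby
require 'monitor'

# Hypothetical sketch: a Monitor stands in for the SysV semaphore, and
# the names mirror the PR description without claiming to match its code.
module SketchSynchronized
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Rename each method to "<name>_inner", make that private, and
    # define a public replacement that runs it inside #synchronize.
    def do_with_sync(*names)
      names.each do |name|
        inner = :"#{name}_inner"
        alias_method inner, name
        private inner
        define_method(name) do |*args, &block|
          synchronize { send(inner, *args, &block) }
        end
      end
    end
  end

  # Lazy creation of the lock is not itself thread-safe; it keeps the
  # sketch short. Monitor is re-entrant, so wrapped methods may nest.
  def synchronize(&block)
    (@lock ||= Monitor.new).synchronize(&block)
  end
end

class SketchCounter
  include SketchSynchronized

  attr_reader :value

  def initialize
    @value = 0
  end

  def increment(delta = 1)
    @value += delta
  end

  do_with_sync :value, :increment
end
```

After `do_with_sync`, callers still use `#value` and `#increment`, while `value_inner` and `increment_inner` are private.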

Discussion, opinions, etc would be great. I also want suggestions as to how to go forward with this.

A version with everything working is on the branch with the large PR, #54
@sirupsen @byroot

@@ -3,7 +3,7 @@ module Simple
class Integer #:nodoc:
attr_accessor :value

-def initialize
+def initialize(**_)
Contributor

You don't need the _ here if you disregard everything.

Contributor Author

So just def initialize(**) ? I didn't know that.

@byroot
Contributor

byroot commented Nov 16, 2015

do def initialize(max_size: , **_) current choice

Fine by me. As said though ** is enough.

Naming issue of execute_atomically: you know what, I changed it to #synchronize

Synchronize is definitely much better.

#shared? It's private, and just returns @using_shared_memory. Hopefully won't ruffle any feathers, or maybe just the ivar is enough?

Nah, an accessor is always good.

The rest of the PR is outside my expertise.

@kyewei
Contributor Author

kyewei commented Nov 16, 2015

@csfrancis Any suggestions or comments on the C side? A working version can be found here in the old branch which I keep updated, but probably reading the long PR description up there ^ would be enough.

end

def self.included(base)
def base.do_with_sync(*names)
Contributor Author

@byroot probably the most interesting change that you could provide an opinion on is this function here. I call it from C to replace a regular accessor or setter that accesses shared memory and is not an atomic call, and wrap it in a synchronize. I wrap all the functions with this, so it's essentially a do_with_sync :value=, :value, :reset, :increment and do_with_sync :size=, :max_size, :<<, :push, :pop etc

I just want to make sure there are no disagreements here.

Relevant C code is
https://github.com/Shopify/semian/blob/circuit-breaker-per-host/ext/semian/semian_shared_memory_object.c#L319-L324
where define_method_with_synchronize calls do_with_sync

and
https://github.com/Shopify/semian/pull/54/files#diff-99c3116b9c2c62cc89a4ae4cc84fadaaR262
which is where define_method_with_synchronize is used in place of rb_define_method

Contributor

I'd prefer to avoid this kind of monkey patching.

I'd rather have a no-op synchronize method in Simple.

Contributor Author

While looking over the C code base, one of the scarier things was that because I provided the lock() and unlock() as separate functions, it was very difficult to track where I had to re-unlock, since errors could occur in the middle of a function, and then you'd have to return right there, and insert an unlock() there. It just isn't safe to have many exit points sprawled all over the code. So I then opted to write ensures in, but doing so manually would add twice the number of functions and make things very manual.

Here's what a begin-ensure would look like:

return rb_ensure(semian_shm_object_synchronize_with_block, self, semian_shm_object_synchronize_restore_lock_status, (VALUE)&status);

For reference: https://github.com/Shopify/semian/pull/54/files#diff-905e62ddb202d77b001ca1dbac3cbf06R310

Essentially each function needed a wrapper, and probably also some custom struct to overcome the one-argument limit of the function rb_ensure. This naturally led to me doing it in Ruby instead since it doesn't have these limitations, which is why synchronize does what it does. IMO it's not a big deal since it only does it once per class and reduces code and complexity a bit. Maybe using prepend with its use of super would make it cleaner than what I do currently, which is having #{method_name}_inner as the original and ##{method_name} as the synchronized version.
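
For comparison, the prepend idea could look roughly like this. It is a hypothetical sketch, not code from either branch: the names are made up and a `Monitor` stands in for the SysV semaphore. The wrapper module sits in front of the class in the ancestor chain, so no `_inner` aliases are needed.

```ruby
require 'monitor'

# Hypothetical sketch of the prepend idea; none of these names come
# from the PR, and a Monitor stands in for the SysV semaphore.
module SketchPrependSync
  # Build an anonymous module whose methods wrap `super` in #synchronize,
  # then prepend it so it sits in front of the class's own methods.
  def do_with_sync(*names)
    wrapper = Module.new do
      names.each do |name|
        define_method(name) do |*args, &block|
          # super must be called with explicit arguments inside define_method
          synchronize { super(*args, &block) }
        end
      end
    end
    prepend wrapper
  end
end

class SketchSharedInteger
  extend SketchPrependSync

  attr_accessor :value

  def initialize
    @value = 0
    @lock = Monitor.new
  end

  def synchronize(&block)
    @lock.synchronize(&block)
  end

  do_with_sync :value, :value=
end
```

This also gives plain readers such as `#value` an entry point for synchronization without recursion, because the original method is reached through `super`.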

With regards to the noop synchronize, do you mean something more along the lines of (1) , with SharedMemoryObject -> Simple::Integer -> SysV::Integer ? Otherwise I don't understand what you mean by noop synchronize. Unless you mean just having one there to keep the interfaces the same?

Contributor

@kyewei I think what he means with no-op synchronize is that the SysV implementations continue to inherit from Simple::Integer and friends, which then look like this:

class Semian::Simple::Integer
  # .. stuff
  def increment(delta)
    synchronize { self.value += delta }
  end

  private
  def synchronize
    yield
  end
  # more stuff
end

When this mixin is included, it just overrides it. The implementation can then be simplified when all the fallback checking is done at boot-time because you can just override it in C, and not have it in Ruby.

Contributor Author

I see what you're saying, but that also doesn't work for protecting, let's say, Semian::SysV::Integer#value. If #value is not protected in C code, and it directly replaces the SysV::Integer, there is no entry point to override and replace with synchronize { ... } except through what do_with_sync does. If you do protect in C, well, that goes back to the problem of making 2x the functions to call rb_ensure(callback, arg1, ensure_fn, arg2)

For example, in code,

def increment(val=1)
  synchronize { self.value += val }  # OK
end
def value
  # how do you put a synchronize in here?
  # you can't do self.value, that would just do recursion and crash
  @value
end

@@ -159,6 +160,10 @@ def resources
require 'semian/simple_sliding_window'
require 'semian/simple_integer'
require 'semian/simple_state'
require 'semian/sysv_shared_memory'
Contributor

Should these requires be conditional depending on whether the target platform supports SysV?

Contributor

Agree.
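
One possible shape for conditional requires, as a sketch only. The helper names are hypothetical, and treating "Linux" as "supports SysV" is an assumption made purely for illustration; the real check could just as well come from the extension itself.

```ruby
# Hypothetical helper: treating "Linux" as "supports SysV" is an
# assumption made for illustration, not a rule stated in this thread.
def sysv_supported?(platform = RUBY_PLATFORM)
  platform.match?(/linux/i)
end

# Returns the list of files to require for a given platform string.
def structure_files(platform = RUBY_PLATFORM)
  files = %w[
    semian/simple_sliding_window
    semian/simple_integer
    semian/simple_state
  ]
  files << 'semian/sysv_shared_memory' if sysv_supported?(platform)
  files
end

# In the real file this would become something like:
#   structure_files.each { |file| require file }
```

Keeping the decision in one predicate means the SysV classes are never even defined on platforms that cannot use them.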

@kyewei
Contributor Author

kyewei commented Nov 17, 2015

@sirupsen wanted to see the tests, so I'll include them now. Some of those new tests will fail because the C component isn't included in this PR :( They do succeed on the other branch however.

@@ -1,3 +1,5 @@
require 'forwardable'
Contributor

Why do you need this now?

@sirupsen
Contributor

I think that 2 is the right approach as well.

module SysVSharedMemory #:nodoc:
@type_size = {}
def self.sizeof(type)
size = (@type_size[type.to_sym] ||= (respond_to?(:_sizeof) ? _sizeof(type.to_sym) : 0))
Contributor

Why not just use @type_size[type.to_sym] and make this easier to read? I assume the only reason you need the respond_to? is because you don't have the C implementation just yet?

Contributor Author

It's for memoizing the sizes after it's read once from C. I don't want to hardcode sizes, they differ by platform.

Contributor

I agree with that, but I don't see why you need the local variable when you have the class instance variable.

Contributor Author

Oh. For raising the TypeError. Maybe that should be moved to C instead?
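
For reference, the memoization could be written without the local variable along these lines. This is a sketch: `SketchSizes` and `FAKE_C_SIZES` are hypothetical, and the byte counts are placeholders for whatever the C side would report, not real platform values.

```ruby
module SketchSizes
  # Stand-in for the C-provided _sizeof; these byte counts are
  # illustrative assumptions, not values from any real platform.
  FAKE_C_SIZES = { int: 4, long: 8 }.freeze

  @type_size = {}
  @lookups = 0

  class << self
    # Counts how often the (fake) C lookup actually ran.
    attr_reader :lookups

    # Memoize per type; unknown types raise before anything is cached.
    def sizeof(type)
      key = type.to_sym
      @type_size[key] ||= begin
        @lookups += 1
        FAKE_C_SIZES.fetch(key) { raise TypeError, "#{key} is not a valid C type" }
      end
    end
  end
end
```

Calling `SketchSizes.sizeof(:int)` repeatedly performs the lookup once, and an unknown type raises `TypeError` on every call because nothing is stored for it.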

def acquire_memory_object(name, data_layout, permissions)
return @using_shared_memory = false unless Semian.semaphores_enabled? && respond_to?(:_acquire)

byte_size = data_layout.inject(0) { |sum, type| sum + ::Semian::SysVSharedMemory.sizeof(type) }
Contributor

Why does sizeof need to be a class method?

@sirupsen
Contributor

Fundamentally I think what makes this PR more complicated than it needs to be is that the fallback is done within the SysV driver, instead of at boot time as @csfrancis also pointed out. Doing it at boot time will simplify a bunch of things because you can make more assumptions.

require 'test_helper'

class TestSysVSlidingWindow < MiniTest::Unit::TestCase
CLASS = ::Semian::SysV::SlidingWindow
Contributor

This is a little bit of a nitpick, but I'd call this KLASS instead, or make it a method. CLASS is too close to the reserved keyword.

end

def value
NUM_TO_SYM.fetch(@integer.value) { raise ArgumentError }
Contributor

If you did alias_method :to_i, :value and always called value.to_i in Simple::State, you wouldn't need to override these. Duck typing hard!

@kyewei kyewei closed this Nov 23, 2015
@epk epk deleted the shared_memory_and_sysv branch June 26, 2019 01:54