Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fiber aware support for ActiveRecord 4 #190

Closed
wants to merge 13 commits into from

Conversation

marshall-lee
Copy link

Hello!
I took #179 and added support for ActiveRecord 4.
All ActiveRecord specs are green for stable versions of 3.2, 4.0, 4.1 and 4.2.

Also I reimplemented clear_stale_cached_connections! for 3.2 that could help but not so much. ActiveRecord users should explicitly release connections or ActiveRecord::ConnectionAdapters::ConnectionManagement middleware could do this and there is nothing to do with it. By the way, they should do it in threaded environment too. Anyway.

3.1 is not working and I cannot figure out why. So my opinion: we should drop 3.1 support in favor of better implementation for >= 3.2, < 5.

Currently tests may fail because of broken mysql2. See brianmario/mysql2#580 (with specs) or brianmario/mysql2#509 — with any of these patches everything works great.

@marshall-lee
Copy link
Author

@dgutov hi! i merged this branch with master, slightly fixed my additional AR4 specs and now it passes.

@dgutov
Copy link
Collaborator

dgutov commented Mar 18, 2015

@marshall-lee Thanks, the build looks good. I should try running our project specs with this branch sometime.

Speaking of clear_stale_cached_connections!, maybe it could use a spec or two. And how come you needed to add calls to it in existing AR specs? Was that for the Reaper spec to work right?

@marshall-lee
Copy link
Author

I added calls to clear_active_connections!, not clear_stale_cached_connections!. This method releases connection acquired by current thread (fiber).

I did this because it's the correct way to use ActiveRecord - release connection when you don't need it. One could never know about clear_active_connections! because Rails automatically calls it from middleware (please look at the first message in this PR). When you use ActiveRecord outside of Rails request lifecycle you should manually release connections with clear_active_connections!.

@marshall-lee
Copy link
Author

I would look at what i could do with clear_stale_cached_connections!.

@marshall-lee
Copy link
Author

The second reason why i added call to clear_active_connections! to existing specs is that in the case of 3.2 it would stuck when connection pool is banging very tightly (specs with large FiberIterator). When there is no connection available in the pool ActiveRecord is trying to clear_stale_cached_connections! but it's somehow not as efficient as reap in 4.x. I really don't understand why but okay i'll look at it one more time. But not releasing connections is not correct so we should do it anyway.

@marshall-lee
Copy link
Author

I think we should add separate test case for concurrent usage of ActiveRecord with connections becoming stale (without call to clear_active_connections!) and temporarily disable it with skip for 3.2 while we don't understand how to improve it.

@dgutov
Copy link
Collaborator

dgutov commented Mar 18, 2015

I added calls to clear_active_connections!, not clear_stale_cached_connections!.

Right, sorry.

I did this because it's the correct way to use ActiveRecord - release connection when you don't need it. One could never know about clear_active_connections! because Rails automatically calls it from middleware (please look at the first message in this PR).

I'm aware of that, but the specs passed fine before. In general, I think it's better to have fewer moving parts there.

When there is no connection available in the pool ActiveRecord is trying to clear_stale_cached_connections! but it's somehow not as efficient as reap in 4.x. I really don't understand why but okay i'll look at it one more time. But not releasing connections is not correct so we should do it anyway.

Whether we clear the connections in specs or not, has no impact on users. On the other hand, you seem to describe a regression. Are the less careful users (who don't call clear_active_connections! in some places where they maybe should) more likely to have problems with this pull request?

@marshall-lee
Copy link
Author

I don't think we should call it a regression. I bet there is the same behavior with stale connections in multithreaded usage of ActiveRecord. If so we have two choises: keep the behaviour as compatible as possible or try to make it better.

But I should definitely revisit this problem. Maybe it's really regressing.

@marshall-lee
Copy link
Author

This is the 'official' behaviour of ActiveRecord.

ActiveRecord 4.1, multithreaded, with ActiveRecord::Base.connection_pool.size == 5:

    threads = []
    10.times do
      threads << Thread.new do
        ActiveRecord::Base.connection.execute("select 1")
      end
    end
    threads.each(&:join)

— this one stucks

    threads = []
    10.times do
      threads << Thread.new do
        ActiveRecord::Base.connection.execute("select 1")
        ActiveRecord::Base.clear_active_connections!
      end
    end
    threads.each(&:join)

— this one doesn't, because we release connection acquired by current thread.

    threads = []
    10.times do
      threads << Thread.new do
        ActiveRecord::Base.connection.execute("select 1")
        ActiveRecord::Base.connection.close
      end
    end
    threads.each(&:join)

— this one doesn't stuck too, because after closing and terminating the acquiring thread, connection became stale so it can be reacquired by other thread.

Behavior with fibers should be the same as with threads.

@marshall-lee
Copy link
Author

Also, this is the original clear_stale_cached_connections! from AR 3.2:

      def clear_stale_cached_connections!
        keys = @reserved_connections.keys - Thread.list.find_all { |t|
          t.alive?
        }.map { |thread| thread.object_id }
        keys.each do |key|
          conn = @reserved_connections[key]
          ActiveSupport::Deprecation.warn(<<-eowarn) if conn.in_use?
Database connections will not be closed automatically, please close your
database connection at the end of the thread by calling `close` on your
connection.  For example: ActiveRecord::Base.connection.close
          eowarn
          checkin conn
          @reserved_connections.delete(key)
        end
      end

So I think we shouldn't do anything special with stale connections. We should just add separate test case confirming that we behave similarly to multithreaded AR.

@marshall-lee
Copy link
Author

Current implementation worked well because it acquires connection per query (similar to wrapping every query with ActiveRecord::Base.connection_pool.with_connection(&block), or similar to Sequel connection pool implementation).

But it doesn't work well on every version of ActiveRecord (so strange — these versions are 3.2 and 4.2, but on 4.0 and 4.1 it passes). So speaking about existing users is slightly not correct. Also current implementation doesn't fully conform to the official ActiveRecord behavior.

But yes... Existing 3.2 and 4.2 users may really experience problems if they don't care about stale connections. Maybe try to temporarily wrap every query with_connection and output deprecation warning like in non-patched 3.2?

@dgutov
Copy link
Collaborator

dgutov commented Mar 18, 2015

We should just add separate test case confirming that we behave similarly to multithreaded AR.

We should also have a spec that confirms the improvement stated in #179: multiple concurrent transactions.

Current implementation worked well because it acquires connection per query (similar to wrapping every query with ActiveRecord::Base.connection_pool.with_connection(&block), or similar to Sequel connection pool implementation).

It seems to me like this scheme can break transactions (which can't be shared between connections). Here we started a transaction in the current fiber, made a query, then launched a second fiber, it checked out a connection (the one we've just been using, right?), then if we want to commit the transaction now, we'd have to wait until the first connection is free again. Is that what happens? The "transactions should work properly" spec consistently passes in 4.0 and 4.1.

Also current implementation doesn't fully conform to the official ActiveRecord behavior.

Bug-for-bug conformance isn't that valuable. As long as things that work (no errors, no wrong results) with fibers are a superset of things that work with threads, I'd say we're in good place.

But if I'm right, the tradeoff being offered is being able to use multiple connections automatically (one fiber = one connection), and employ several concurrent transactions, for the price of having to clean up after ourselves.

@marshall-lee
Copy link
Author

Bug-for-bug conformance isn't that valuable. As long as things that work (no errors, no wrong results) with fibers are a superset of things that work with threads, I'd say we're in good place.

It's not a bug. In ActiveRecrod, working with connection pool you have an option: acquire connection manually and manually release it, or use with_connection wrapper. Moreover, it is a well documented behavior:

Connections can be obtained and used from a connection pool in several ways:

  1. Simply use ActiveRecord::Base#connection as with Active Record 2.1 and earlier (pre-connection-pooling). Eventually, when you're done with the connection(s) and wish it to be returned to the pool, you call ActiveRecord::Base.clear_active_connections!. This will be the default behavior for Active Record when used in conjunction with Action Pack's request handling cycle.
  2. Manually check out a connection from the pool with ActiveRecord::Base.connection_pool.checkout. You are responsible for returning this connection to the pool when finished by calling ActiveRecord::Base.connection_pool.checkin(connection).
  3. Use ActiveRecord::Base.connection_pool.#with_connection(&block), which obtains a connection, yields it as the sole argument to the block, and returns it to the pool after the block completes.

So on the other hand, current implementation is not a superset — it's a subset. We don't have an option, we just do with_connection every time.

Also it wraps a database connection instance with ``EM::Synchrony::ConnectionPool. It's stupid because Synchrony's connection pool uses convention for methods with name starting with 'a'` — doesn't call release if so (thinks that they're async) — It can be potentially buggy.

It seems to me like this scheme can break transactions

Seems like it doesn't break them. Look at current implementation in lib/em-synchrony/activerecord.rb: it somehow wraps transactions acquiring a connection for them.

But if I'm right, the tradeoff being offered is being able to use multiple connections automatically (one fiber = one connection), and employ several concurrent transactions, for the price of having to clean up after ourselves.

Yep, something like that. Also it's more predictable: if we want so, we will have exactly the same connection during entire request in Rails.

Also i think it's not possible to implement Reaper feature on top of current implementation (because it totally loses its sense) but someone may want exactly this behavior: let the connections become stale but clear it periodically.

@marshall-lee
Copy link
Author

Also it seems that current implementation is not actually concurrent. It checks out always the same connection. It is because current_connection_id is not overriden to use Fiber.current.object_id. It uses Thread.current.object_id which is not acceptable. Try to override it right and you'll get failing specs.

@marshall-lee
Copy link
Author

Just for recap, why current implementation should be replaced with this one:

  • It's not stable. It doesn't work on every ActiveRecord version. Actually, it behaves somehow randomly and there is no adequate way to make it compatible.
  • It doesn't just monkey patch ActiveRecord, it substitutes connection instance with something that is actually not a connection – EM::ConnectionPool instance that has its own method and that is conforming to its own convention for missing methods (do they begin with 'a' or not) which is not acceptable for correctly monkey patching third party stuff.
  • It doesn't conform to official documented ActiveRecord behavior and there is no way to make it compatible.
  • It's too hard to use it with other adapters, em-pg-client for instance. To use it with PostgreSQL I should reimplement em_postgresql_connection like em_mysql2_connection overriden by current implementation. That sucks. With this one i just added em-pg-client to Gemfile, reassigned PGconn = PG::EM::Client and everything worked out of the box.
  • It's not actually concurrent. This is the most important. ActiveRecord::Base.connection_pool.connections is not growing as it should — there is exactly one connection in it but in high-concurrent case its size should equal to ActiveRecord::Base.connection_pool.size (see Real concurrency spec for ActiveRecord #196)

@marshall-lee
Copy link
Author

But what if someone has a working em-synchrony/activerecord and may have a problems after updating? Well, i really doubt that someone was really able to use current implementation without evident bugs. It's either not working at all or creating another problems.

Anyway, i suggest you the following strategy:

  • Add a temporary option — something like :old_connection_pool_behavior and set it to true by default.
  • If :old_connection_pool_behavior is set to true then we: 1) wrap every query and every transaction with_connection. 2) output a deprecation message suggesting to care about connection releasing and if already so, manually set old_connection_pool_behavior: false in config allowing to use all the benefits of new behavior.
  • Cover it with specs. Also add missing specs for transactions, concurrency, etc
  • Release the em-synchrony to rubygems!
  • Remove all old_connection_pool_behavior workarounds for the next version.

@marshall-lee
Copy link
Author

Or there is a simpler way: just merge it 😆
Seriously, current implementation is anyway broken and not well tested.

@marshall-lee
Copy link
Author

I've mistaken about current pool concurrency in #196. Well, when the pool works — it just works, effectively using a :pool parameter.
But it seems that current implementation won't allow us to manually work with pool: checkin/checkout, with_connection just because it corrupts AR's connection pool making it contain only one connection (see #196 (comment)). So it cannot be considered as a full drop-in replacement of multithreaded ActiveRecord for existing code.

@dgutov
Copy link
Collaborator

dgutov commented Mar 19, 2015

Also i think it's not possible to implement Reaper feature on top of current implementation (because it totally loses its sense)

There's less reasons to have it, but still, if the connection pool were fixed to contain more than one connection, and some were checked out manually by temporary fibers and not returned, then Reaper could do its thing.

With this one i just added em-pg-client to Gemfile, reassigned PGconn = PG::EM::Client and everything worked out of the box.

Yes, it is pretty nice.

It's not stable. It doesn't work on every ActiveRecord version. Actually, it behaves somehow randomly and there is no adequate way to make it compatible.

I don't know if it's true. As long as the only problem is that the pool contains only one element, and all fibers somehow manage to use it successfully, the only problem left is the lack of concurrency (that's just speed, right?).

it seems that current implementation won't allow us to manually work with pool: checkin/checkout, with_connection just because it corrupts AR's connection pool making it contain only one connection

So, don't you think it's the main compatibility problem we have?

If we could somehow make each new fiber to inherit the connection from its parent (ideally), or just the first checked out one, but interacting with the connection pool manually allowed to use a different connection, that would be backward-compatible enough, I think.

Or there is a simpler way: just merge it

All right, but just to be clear, I'm not a committer here. :)

@marshall-lee
Copy link
Author

@dgutov Hi! Did you tried this branch in your project?

@dgutov
Copy link
Collaborator

dgutov commented May 7, 2015

@marshall-lee Sorry, I should have some results in a few days.

@dgutov
Copy link
Collaborator

dgutov commented May 23, 2015

Done that now (with AR 4.1 and 4.2), sorry for the delay.

Using 4.1, it seems after the pool is exhausted, most specs start failing in the same way:

     Failure/Error: Unable to find matching line from backtrace
     ActiveRecord::ConnectionTimeoutError:
       could not obtain a database connection within 5.000 seconds (waited 5.001 seconds)
     # ./spec/spec_helper.rb:49:in `block (2 levels) in <module:RollbackExampleGroup>'
     # ./spec/spec_helper.rb:39:in `block (2 levels) in <top (required)>'
     # ./spec/spec_helper.rb:21:in `block (3 levels) in <top (required)>'

Using ActiveRecord::Base.clear_active_connections! helps, but having to call it after every spec is pretty invasive. 4.2 doesn't seem to have that problem; any idea why?

@brodock
Copy link

brodock commented Jul 9, 2015

👍

@exAspArk
Copy link
Contributor

I've tried to use it with ActiveRecord 4.2, it also requires to use .clear_active_connections!

ActiveRecord::ConnectionTimeoutError: could not obtain a database connection within 5.000 seconds (waited 5.005 seconds)

      could not obtain a database connection within 5.000 seconds (waited 5.005 seconds)
./vendor/bundle/ruby/2.1.0/gems/activerecord-4.2.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:189:in `block in wait_poll'
./vendor/bundle/ruby/2.1.0/gems/activerecord-4.2.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:180:in `loop'
./vendor/bundle/ruby/2.1.0/gems/activerecord-4.2.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:180:in `wait_poll'
./vendor/bundle/ruby/2.1.0/gems/activerecord-4.2.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:135:in `block in poll'
./vendor/bundle/ruby/2.1.0/bundler/gems/em-synchrony-7b3415a26279/lib/em-synchrony/monitor_mixin.rb:70:in `mon_synchronize'

With this addition all tests in non Rails app pass 😉

@exAspArk
Copy link
Contributor

Hey, what do you think about adding .clear_active_connections! at the end of FiberIterator#each block after requiring em-synchrony/activerecord? Or maybe wrap everything with with_connection? For example:

def each(foreach=nil, after=nil, &blk)
  wrapped_block = proc do |*args|
    ActiveRecord::Base.connection_pool.with_connection { |_conn| blk.call(*args) }
  end

  old_each(foreach, after, &wrapped_block)
end

@marshall-lee
Copy link
Author

@exAspArk Hello!

What do you mean? Monkey patch FiberIterator#each? But why?

@marshall-lee
Copy link
Author

I think it's a bad idea

@marshall-lee
Copy link
Author

I've tried to use it with ActiveRecord 4.2, it also requires to use .clear_active_connections!
With this addition all tests in non Rails app pass 😉

Yes, because it's just how ActiveRecord works. When you use regular (non em-synchrony patched) ActiveRecord outside of Rails or Sidekiq (both have special middleware that calls clear_active_connections!) in multithreaded environment you'll get the same problem.

@marshall-lee
Copy link
Author

And I also think that in every application middleware-approach will be fine.

@igrigorik
Copy link
Owner

@marshall-lee @dgutov any chance we can rebase this one? What's the status on this -- are we close?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants