deadlock detected #29

Open
jwoertink opened this issue Mar 24, 2021 · 6 comments

Comments

@jwoertink
Member

I went to run specs on my app locally, and I got a ton of deadlock errors:

Unhandled exception in spawn: deadlock detected (PQ::PQError)
  from lib/pg/src/pq/connection.cr:203:7 in 'handle_error'
  from lib/pg/src/pq/connection.cr:186:7 in 'handle_async_frames'
  from lib/pg/src/pq/connection.cr:162:7 in 'read'
  from lib/pg/src/pq/connection.cr:414:18 in 'expect_frame'
  from lib/pg/src/pq/connection.cr:398:9 in 'read_next_row_start'
  from lib/pg/src/pg/result_set.cr:39:8 in 'move_next'
  from lib/db/src/db/result_set.cr:39:13 in 'from_rs'
  from lib/avram/src/avram/save_operation.cr:367:17 in 'insert'
  from lib/avram/src/avram/save_operation.cr:349:7 in 'insert_or_update'
  from lib/avram/src/avram/save_operation.cr:297:9 in 'save'
  from lib/avram/src/avram/save_operation.cr:321:8 in 'save!'
  from lib/breeze/src/breeze/operations/save_breeze_sql_statement.cr:1:1 in 'create!:breeze_request_id:statement:args:model:elapsed_text'
  from lib/breeze/src/breeze.cr:14:3 in '->'
  from /usr/local/Cellar/crystal/0.36.1_2/src/primitives.cr:255:3 in 'run'
  from /usr/local/Cellar/crystal/0.36.1_2/src/fiber.cr:92:34 in '->'

I have it set to only enable while in development, so in theory breeze should be skipped during tests... I'll try to dig in more to see what specifically is causing this.

@jwoertink
Member Author

Ok, my issue may not actually be directly breeze related... However, it does worry me that it was so easy to hit this error. I'll leave it open for now so we can track it. Maybe someone can think of a fix.

@jwoertink
Member Author

Ok, I just ran into something similar, and I was doing something completely different here:

Unhandled exception in spawn:  (DB::ConnectionRefused)
  from Exception::CallStack::unwind:Array(Pointer(Void))
  from Exception::CallStack#initialize:Array(Pointer(Void))
  from Exception::CallStack::new:Exception::CallStack
  from raise<DB::ConnectionRefused>:NoReturn
  from PG::Connection#initialize<DB::Database>:Bool
  from PG::Connection::new<DB::Database>:PG::Connection
  from PG::Driver#build_connection<DB::Database>:PG::Connection
  from ~procProc(DB::Connection)@lib/db/src/db/database.cr:56
  from DB::Pool(DB::Connection+)@DB::Pool(T)#build_resource:DB::Connection+
  from DB::Pool(DB::Connection+)@DB::Pool(T)#checkout:DB::Connection+
  from DB::Database#checkout:DB::Connection+
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save:Bool
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save!:Breeze::BreezeSqlStatement
  from Breeze::SaveBreezeSqlStatement::create!:breeze_request_id:statement:args:model:elapsed_text<(Int64 | Nil), String, (String | Nil), (String | Nil), String>:Breeze::BreezeSqlStatement
  from ~procProc(Nil)@lib/breeze/src/breeze.cr:14
  from Fiber#run:(IO::FileDescriptor | Nil)
  from ~proc2Proc(Fiber, (IO::FileDescriptor | Nil))@/usr/local/Cellar/crystal/0.36.1_2/src/fiber.cr:92
Caused by: no PostgreSQL user name specified in startup packet (PQ::PQError)
  from Exception::CallStack::unwind:Array(Pointer(Void))
  from Exception::CallStack#initialize:Array(Pointer(Void))
  from Exception::CallStack::new:Exception::CallStack
  from raise<PQ::PQError>:NoReturn
  from PQ::Connection#handle_error<PQ::Frame::ErrorResponse>:NoReturn
  from PQ::Connection#handle_async_frames<(PQ::Frame+ | PQ::Frame::Unknown)>:Bool
  from PQ::Connection#read<(Char | Nil)>:(PQ::Frame+ | PQ::Frame::Unknown)
  from PQ::Connection#read:(PQ::Frame+ | PQ::Frame::Unknown)
  from PQ::Connection#expect_frame<PQ::Frame::Authentication.class, Nil>:PQ::Frame::Authentication
  from PQ::Connection#expect_frame<PQ::Frame::Authentication.class>:PQ::Frame::Authentication
  from PQ::Connection#connect:Bool
  from PG::Connection#initialize<DB::Database>:Bool
  from PG::Connection::new<DB::Database>:PG::Connection
  from PG::Driver#build_connection<DB::Database>:PG::Connection
  from ~procProc(DB::Connection)@lib/db/src/db/database.cr:56
  from DB::Pool(DB::Connection+)@DB::Pool(T)#build_reUnhandled exception in spawn:  (DB::ConnectionRefused)
  from Exception::CallStack::unwind:Array(Pointer(Void))
  from Exception::CallStack#initialize:Array(Pointer(Void))
  from Exception::CallStack::new:Exception::CallStack
  from raise<DB::ConnectionRefused>:NoReturn
  from PG::Connection#initialize<DB::Database>:Bool
  from PG::Connection::new<DB::Database>:PG::Connection
  from PG::Driver#build_connection<DB::Database>:PG::Connection
  from ~procProc(DB::Connection)@lib/db/src/db/database.cr:56
  from DB::Pool(DB::Connection+)@DB::Pool(T)#build_resource:DB::Connection+
  from DB::Pool(DB::Connection+)@DB::Pool(T)#checkout:DB::Connection+
  from DB::Database#checkout:DB::Connection+
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save:Bool
  from Breeze::SaveBreezeSqlStatement@Avram::SaveOperation(T)#save!:Breeze::BreezeSqlStatement
  from Breeze::SaveBreezeSqlStatement::create!:breeze_request_id:statement:args:model:elapsed_text<(Int64 | Nil), String, (String | Nil), (String | Nil), String>:Breeze::BreezeSqlStatement
  from ~procProc(Nil)@lib/breeze/src/breeze.cr:14
  from Fiber#run:(IO::FileDescriptor | Nil)
  from ~proc2Proc(Fiber, (IO::FileDescriptor | Nil))@/usr/local/Cellar/crystal/0.36.1_2/src/fiber.cr:92

It seems it's pretty easy to hit DB errors with this. In this case I was calling this code:

    File.read_lines(filename).each do |domain|
      SaveRestrictedDomain.create(text: domain.strip) do |_o, _d|
        # ignore if it fails
      end
    end

This code was in a task that I was running locally, and filename is a file with about 120,000 lines in it. Looks like it's coming from this file. It doesn't really matter where the subscribe is; it's basically global and will run from anywhere once it's been defined. If I'm blasting my database, and it has to run this block on every query, I'm assuming that the fibers are just backing up and dogpiling because I'm pushing queries faster than this block can run. Too many spawn calls...
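One way to avoid the pile-up described above is to replace the per-query spawn with a single long-lived worker fiber fed by a bounded channel. The following is only a sketch under stated assumptions, not Breeze's implementation: QueryLog, QUEUE, SAVED, DONE, and enqueue are hypothetical names, the capacity of 100 is arbitrary, and the array append stands in for the real database save.

```crystal
# Hypothetical stand-in for the arguments passed to
# Breeze::SaveBreezeSqlStatement.create!
record QueryLog, statement : String, elapsed_text : String

# Bounded buffer: at most 100 pending entries (arbitrary capacity).
QUEUE = Channel(QueryLog).new(100)
SAVED = [] of QueryLog
DONE  = Channel(Nil).new

# A single worker fiber performs every save, so logging never needs
# more than one connection no matter how fast queries arrive.
spawn do
  while entry = QUEUE.receive?
    SAVED << entry # a real worker would write to the database here
  end
  DONE.send(nil)
end

# Would be called from the query subscriber instead of `spawn`.
# Returns false when the queue is full, dropping the entry
# rather than letting work accumulate without bound.
def enqueue(entry : QueryLog) : Bool
  select
  when QUEUE.send(entry)
    return true
  else
    return false
  end
end

accepted = 0
1_000.times do |i|
  accepted += 1 if enqueue(QueryLog.new("SELECT #{i}", "1ms"))
  Fiber.yield if i % 50 == 0 # give the worker a chance to drain
end

QUEUE.close  # the worker still drains buffered entries, then exits
DONE.receive # wait for the worker to finish
puts "accepted=#{accepted} saved=#{SAVED.size}"
```

The trade-off in this sketch is that some log entries are dropped under heavy load. For a development-only tool that may be acceptable; a blocking send would instead slow the caller down to the speed of the worker.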

The original post was caused from me running specs where I was essentially doing the same. I was pushing more queries than what could be handled. These are probably edge cases, but the fact that they are preventing me from doing what I need to is an issue.

For now, I was able to get around the first part because when I ran specs earlier, it thought I was in development. For this case I am in development, but we have a Lucky::Env.task? method that I can use to disable breeze in tasks.

@matthewmcgarvey
Member

So are you saying that you think it's because we are saving things to the database using spawn?

      spawn do
        Breeze::SaveBreezeSqlStatement.create!(
          breeze_request_id: req.try(&.id),
          statement: event.query,
          args: event.args,
          model: event.queryable,
          elapsed_text: duration.to_elapsed_text
        )
      end
    end

@jwoertink
Member Author

That's my assumption, yeah. It seems like it's kicking off spawns faster than the saves can complete, and they are piling up. Since it's easy to reproduce, I can take the saves out of the spawn to see if it still happens. I'll give that a shot tomorrow and see if that makes a difference.

@matthewmcgarvey
Member

Here's at least one thing we tie to the current fiber: https://github.com/luckyframework/avram/blob/5e29f75371dca5a2bc16858163dc3790cabfbfcb/src/avram/database.cr#L6
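To illustrate what tying a connection to the current fiber can mean, here is a minimal sketch. It is an assumption about the general idea, not a copy of Avram's code: FakeConnection, CONNECTIONS, and current_connection are hypothetical names. Each fiber looks up its connection by its own object_id, so every spawn that touches the database ends up needing a connection of its own.

```crystal
# Hypothetical stand-in for a real DB::Connection.
class FakeConnection
  getter id : Int32

  def initialize(@id : Int32)
  end
end

# One connection per fiber, keyed by the fiber's object_id.
# Assumption: this mirrors the idea behind the linked line only.
CONNECTIONS = {} of UInt64 => FakeConnection

def current_connection : FakeConnection
  CONNECTIONS[Fiber.current.object_id] ||= FakeConnection.new(CONNECTIONS.size + 1)
end

done = Channel(Nil).new

5.times do
  spawn do
    current_connection # each spawned fiber gets its own entry
    done.send(nil)
  end
end

5.times { done.receive }

current_connection # the main fiber gets an entry too
current_connection # same fiber, so the existing entry is reused

puts CONNECTIONS.size
```

Under this scheme, spawning one fiber per logged query would ask for one connection per query, which is consistent with the DB::Pool checkout and build_resource frames in the traces above.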

@jwoertink
Member Author

I'm not too familiar with doing multi-threading stuff. I wonder if that holds us back in some way by restricting to a single thread 🤔
