SQL Connections, Tasks and Multiple Results Sets

There is a tendency when confronted with a bug to assume that it’s related to the most complex or least understood part of a program. While not unreasonable this can lead to missing simple and obvious problems. I recently wasted several hours convinced that I had a threading issue which predictably turned out to have a much more prosaic explanation.

I was updating an ETL process in C# that pulled data from a cloud API and pushed it to an on premise SQL Server. Multiple downloads/uploads were run concurrently as tasks and each task used its own connection object to connect to SQL Server. This was all working fine but I wanted to wrap all the SQL uploads in a single transaction so I could implement a “fail any fail all” process to roll back all changes to the database if any task failed. The easiest way to do this was to have each task share a single SQL connection and transaction. This was easy enough to implement and I tested it with a few concurrent tasks fine. But as I ramped up the number of tasks the process started to fall over. It threw various errors with the SQL connection and SQL commands including timeouts and this one:

 
There is already an open DataReader associated with this Command which must be closed first
 
Had I just Googled this right away I would have solved this problem in five minutes. But instead I assumed the issue was somehow related to the tasks sharing the SQL connection object. Several things made me make this mistake. First I’m far more comfortable with the database than I am with managing threads so assumed this must be where I’d made the mistake. Secondly the change I’d made was to allow tasks to share a resource. And finally the fact that errors were only thrown when the number of tasks increased and the variety of errors suggested a contention issue of some kind (which indeed it was, just not where I was looking). Spending a couple of hours reviewing the task code without finding an explanation only convinced me that I was missing some subtle issue but in exasperation I did finally Google that error message.

 

What the top Stack Overflow answer (from 10 years ago with 1300 up votes!) reminded me was that if you need a single connection to manage multiple queries concurrently you have to enable multiple active results sets in the connection string. Which of course I hadn’t, which of course explained all the contention issues and which of course I should have spotted hours ago. Lesson learned. Until the next time…