Sunday, February 19, 2012

Best way to retrieve rows matching an array of Ids

My database table has approx 400,000 rows, with an integer Id as primary key
.
I have an array of ids, with N elements where N is typically between 500 and
5,000.
I want to retrieve into a collection all rows whose id matches a value in
this array.
What is the 'best' way to do this and/or what are the tradeoffs to consider?
I can imagine a number of solutions as follows:
1. A parametered query which takes a single Id parameter and returns 1 row.
Implies N round trips to the server to retrieve N rows.
2. A parametered query with an IN clause which takes n parameters "... WHERE
ID IN (@.p1, @.p2, ... , @.pn)". Implies N/n round trips to the server where
each round trip retrieves up to n rows. What is the maximum / optimal valu
e
for n in this case?
3. Build the SQL string dynamically, with an IN clause generated by
concatenating up to n Id values. Implies N/n round trips to the server wher
e
each round trip retrieves up to n rows. What is the maximum / optimal valu
e
for n in this case?
4. Create a temporary table with a single integer ID column and insert the N
Ids from my array (how?). Join the original table with this temporary table
to retrieve all matching Ids.
...
Any suggestions?You can try the solution offered in http://www.aspfaq.com/2248 ... also
check out the link there to Erland's article on array handling.
"Joe" <Joe@.discussions.microsoft.com> wrote in message
news:656DF445-E0EC-495C-B62E-0A535EBC2366@.microsoft.com...
> My database table has approx 400,000 rows, with an integer Id as primary
> key.
> I have an array of ids, with N elements where N is typically between 500
> and
> 5,000.
> I want to retrieve into a collection all rows whose id matches a value in
> this array.
> What is the 'best' way to do this and/or what are the tradeoffs to
> consider?
> I can imagine a number of solutions as follows:
> 1. A parametered query which takes a single Id parameter and returns 1
> row.
> Implies N round trips to the server to retrieve N rows.
> 2. A parametered query with an IN clause which takes n parameters "...
> WHERE
> ID IN (@.p1, @.p2, ... , @.pn)". Implies N/n round trips to the server where
> each round trip retrieves up to n rows. What is the maximum / optimal
> value
> for n in this case?
> 3. Build the SQL string dynamically, with an IN clause generated by
> concatenating up to n Id values. Implies N/n round trips to the server
> where
> each round trip retrieves up to n rows. What is the maximum / optimal
> value
> for n in this case?
> 4. Create a temporary table with a single integer ID column and insert the
> N
> Ids from my array (how?). Join the original table with this temporary
> table
> to retrieve all matching Ids.
> ...
> Any suggestions?|||Very useful resources, thanks, it's a lot clearer. In my case, the number
of values is potentially greater than will fit into an 8K string. So I thin
k
the approach will be:
1. Create the temp table
2. Concatenate as many Ids as possible into a string which does not exceed
8K and call an SP which extracts integers from the string and inserts into
the temp table.
3. Repeat step 2 until all ids in my array are processed.
4. Execute a query with a join to the temp table
5. Drop the temp table.
This will be done from a .NET application. I am concerned about the
lifetime of the temporary table? If I get a SqlException while running the
above, my code will typically close its SqlConnection object - which won't
actually close the underlying connection, just return it to the connection
pool. Presumably this means the temp table could be left lying around for
the next unsuspecting user of that connection, right?
I guess the workaround would be to check for the existence of the temp table
before step 1 above and drop it if it already exists.
"Aaron Bertrand [SQL Server MVP]" wrote:

> You can try the solution offered in http://www.aspfaq.com/2248 ... also
> check out the link there to Erland's article on array handling.
>
> "Joe" <Joe@.discussions.microsoft.com> wrote in message
> news:656DF445-E0EC-495C-B62E-0A535EBC2366@.microsoft.com...
>
>|||Arrays? Collections? Those are not SQL structures. We only have
tables. And we avoid procedural code, cursors, dynamic SQL and string
casting as much as possible -- which is about 99.99% of the time.
Load your list of ids into a one column table and use "WHERE foobar_id
IN (SELECT search_id FROM WorkingList)" in the body of the procedure.
Avoid kludges.|||I think we already established that a good approach is to load the ids into
a
(temporary) table. The question was, what is the 'best' way to load these
ids into the temporary table.
From the resources linked by Aaron Bertrand, it seems that a good approach
is to pass the array of values as a string - this means that all the values
can be passed in a single call.
Do you agree or what alternative would you recommend and why?
"--CELKO--" wrote:

> Arrays? Collections? Those are not SQL structures. We only have
> tables. And we avoid procedural code, cursors, dynamic SQL and string
> casting as much as possible -- which is about 99.99% of the time.
> Load your list of ids into a one column table and use "WHERE foobar_id
> IN (SELECT search_id FROM WorkingList)" in the body of the procedure.
> Avoid kludges.
>

No comments:

Post a Comment