Friday, February 10, 2012

Best approach for performance

I would like to know which way is better for the performance of my
system:
creating one huge table with more than 5 000 000 000 rows, or multiple
tables with the same structure containing 5 000 000 rows each?
Here is the context of our application:
Our application offers the possibility to create up to 3000 projects.
Each project contains a list of tags and a list of value-timestamp for
each tag.
So, a project can contain up to 500 tags with a maximum of 5000 values
for each tag. In a hypothetical scenario where everything is at
maximum, a single table containing every value-timestamp pair would be
huge, with more than 7 500 000 000 rows.
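As a quick sanity check on that worst-case figure, the arithmetic can be run as a throwaway query (illustrative only, nothing here comes from the schema):

```sql
-- Worst case: 3000 projects x 500 tags x 5000 values per tag.
-- CAST to bigint first so the product does not overflow int.
SELECT CAST(3000 AS bigint) * 500 * 5000 AS MaxRows;
-- 7 500 000 000
```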
My first thought was to create a table containing the values for each
project.
Project
===========
Project_ID - PK
ValDataName_ID
.... etc...
ValDataName
===========
ValDataName_ID - PK
TableName (represents the ValDataXXX)
Project_ID
ValDataXXX where XXX is a unique id
===========
ValDataXXX_ID - PK
Tag_ID
TagVal
TagTime
So, I would have a table (ValDataName) containing the names of each
ValData table linked to the project and a table containing my projects.
Then, I would have up to 5000 tables of type ValData.
Since my ValData tables are completely independent, any action
affecting one table does not affect the others.
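For reference, a minimal DDL sketch of one such per-project ValData table as described above (the column types are assumptions; the post does not specify them):

```sql
-- Hypothetical: one table like this would exist per project (XXX = project id).
CREATE TABLE ValData001 (
    ValData001_ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Tag_ID        int      NOT NULL,
    TagVal        float    NOT NULL,  -- value type assumed
    TagTime       datetime NOT NULL   -- timestamp of the value
);
```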
So, which way would be the better approach?
Thanks.
Francis B. wrote:
> I would like to know which way is better for the performance of my
> system:
> Creating a huge table with more than 5 000 000 000 rows or multiple
> tables with the same structure containing 5 000 000 rows?
> [rest of the original question snipped]
You didn't mention one very important piece of information. What
version of SQL Server are you using? SQL Server 2005 has a lot of new
scalability features, especially table partitioning. For a major new
application I assume SQL Server 2005 will be the obvious choice for
you.
Achieving a scalable solution will depend very much on the design of
your indexing, queries, your application and your hardware
implementation. You've told us nothing of those. I can only suggest
that creating new tables in a bid to improve performance should
generally be a long way down your list of options. Usually there are
much more effective ways to optimise performance. More likely you might
choose to implement horizontal partitioning in order to make certain
admin tasks easier. How you do that depends a lot on the version and
edition you are using.
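As an illustration of what horizontal partitioning could look like on SQL Server 2005 Enterprise Edition, a single value table might be partitioned on Project_ID. This is a sketch only: the object names are made up, and real boundary values and filegroup placement would need proper planning.

```sql
-- Sketch only: split rows into ranges of Project_ID.
CREATE PARTITION FUNCTION pfProject (int)
AS RANGE RIGHT FOR VALUES (1000, 2000);

-- Map every partition to the PRIMARY filegroup for simplicity.
CREATE PARTITION SCHEME psProject
AS PARTITION pfProject ALL TO ([PRIMARY]);

-- One table for all projects, stored on the partition scheme.
CREATE TABLE TagValue (
    Project_ID int      NOT NULL,
    Tag_ID     int      NOT NULL,
    TagTime    datetime NOT NULL,
    TagVal     float    NOT NULL,
    CONSTRAINT PK_TagValue
        PRIMARY KEY CLUSTERED (Project_ID, Tag_ID, TagTime)
) ON psProject (Project_ID);
```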
My other suggestion would be that you rethink very, very carefully
about logical design. When I see tables with names like "ValDataXXX"
and columns with names like "Tag_ID" and "TagVal" I just get an awful
sinking feeling that this is some naive data-model-on-the-fly kind of
solution - what is sometimes euphemistically called an "EAV" model. If
I'm right then you can probably say goodbye to scalability, performance
and data integrity. Your database will have none of them.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
On 19 Sep 2006 12:59:25 -0700, "Francis B."
<francisbeliveau@gmail.com> wrote:

>I would like to know which way is better for the performance of my
>system:
>Creating a huge table with more than 5 000 000 000 rows or multiple
>tables with the same structure containing 5 000 000 rows?
>[rest of the original question snipped]
No way should a project management system have 10^9 rows.
Beyond that, I second what David has already said.
J.
