Saturday, February 25, 2012

Better Practices Wanted for Cascading Inserts of Hierarchical Data from Staging Tables

I apologize if this has been asked, but I can't find a complete answer.

We have a situation with parent/child tables which have an identity column as their PK. We need to be able to insert into the live tables from staging tables. The data in the staging tables are related via a surrogate key.

I have found the OUTPUT clause, but that can only refer to columns of the actual table (since there is no FROM clause in an INSERT). Our current best solution to this problem involves adding bogus "staging" columns to the destination tables, and removing them after we've inserted everything from staging. This is an unattractive solution to say the least.

I'll give an example that mirrors our actual solution, and ask if anyone has a better solution?

-

Code Snippet

CREATE TABLE [dbo].[TABLE_A](

[ID] [int] IDENTITY(1,1) NOT NULL,

[DATA] [nchar](10) NOT NULL,

[STAGING_COLUMN] [bigint] NULL,

CONSTRAINT [PK_TABLE_A] PRIMARY KEY ([ID] ASC)

)

GO

CREATE TABLE [dbo].[TABLE_B](

[ID] [int] IDENTITY(1,1) NOT NULL,

[A_ID] [int] NOT NULL,

[DATA] [nchar](10) NOT NULL,

[STAGING_COLUMN] [bigint] NULL,

CONSTRAINT [PK_TABLE_B] PRIMARY KEY ([ID] ASC)

)

GO

ALTER TABLE [dbo].[TABLE_B]

ADD CONSTRAINT [FK_TABLE_A_TABLE_B] FOREIGN KEY([A_ID]) REFERENCES [dbo].[TABLE_A] ([ID])

GO

CREATE TABLE [dbo].[STAGE_TABLE_A](

[A_Key] [bigint] NOT NULL,

[DATA] [nchar](10) NOT NULL

)

GO

CREATE TABLE [dbo].[STAGE_TABLE_B](

[B_Key] [bigint] NOT NULL,

[DATA] [nchar](10) NOT NULL,

[A_Key] [bigint] NOT NULL

)

GO

The STAGING_COLUMN columns are the ones that will be added before, and dropped after.

Code Snippet

DECLARE @.TABLE_A_MAP TABLE (

A_ID INT,

A_Key BIGINT

)

INSERT INTO TABLE_A (DATA, STAGING_COLUMN)

OUTPUT INSERTED.ID, INSERTED.STAGING_COLUMN INTO @.TABLE_A_MAP

SELECT DATA, A_Key FROM STAGE_TABLE_A

INSERT INTO TABLE_B (A_ID, DATA)

SELECT TAM.A_ID, STB.DATA

FROM STAGE_TABLE_B STB INNER JOIN @.TABLE_A_MAP TAM ON TAM.A_Key = STB.A_Key

This seems to work, but I'd really like another alternative. Even though this is happening when nobody else is using the database, I cringe at the thought of adding and removing columns just to make this work.

Here are a few of my constraints:

    The above is a simplification of the actual problem. The actual problem goes about five levels deep (hence the B_Key in STAGE_TABLE_B). At the top level, our larger customer will have 100,000 rows to insert. Each level will average 3 times as many rows as the next higher level, so we're talking about real volumes here.

    This has to finish over the course of a weekend.

    This has to be delivered to QA this Friday

Thanks for any help or insight.

John,

After trigger is perfect for this.

e.g.

Code Snippet

create trigger tr_tb_a on table_a

for insert

as

if @.@.rowcount = 0 return;

insert table_b(a_id,data)

select i.id, s.data

from stage_table_b s join inserted i on s.a_key=i.staging_column

go

|||

Thanks, I just extended this to three tables, and it worked.

This is sort of a "pull" model as opposed the the "push" model we were using.

I've sent this to our DB expert, and will post the result here.

No comments:

Post a Comment