Hi All,
this seems like the best place for this question.
I've had cause to review the MSFT position/tools in the BI area. I'm
surprised!!! I feel like MSFT have made MUCH more progress than generally
talked about in newsgroups and customers I work with. All I ever hear from
MSFT with respect to DW/BI is 'the next release is going to be great.' Not,
'Hey, take a look at BIA/Reporting Services'...?
Q1. In BI accelerator I installed and tested out the shopfloor application
(manufacturing) and I generated the shopfloor database from the XL Sheet.
But I can't see where it got the table definitions for the staging table and
the underlying dimensional database from. For example, where are the
table/column/datatype definitions of the staging area and the dimensional
database in the XL sheets? I'm sure I looked at all the sheets in the
spreadsheet AnalyticsBuilderWB_ShopFloorPerformance.v.0.2.xls. (But maybe
I'm going crazy and didn't). To me it only seems like the analytical model
is in the spreadsheet.
Q2. I see it generates a ton of .dts files. And when I open them up they
seem to me (a zero level skilled DTS person) to be very complex. The manual
says 'don't change the packages' which I gather means don't look, don't
worry, just run it and it will all be ok....again, I gather that the DTS
packages are somehow generated by BIA into their binary format and I believe
the source to target mappings are defined by the 'mappings' spreadsheet. But
in this spreadsheet I don't seem to find enough columns for all the columns
in the dimensional model. For example I can't find mappings for dim_emp_std
in the spreadsheet but I would have thought it should be there. (Or am I
missing something)?
Q3. In BIA I see lots about sharepoint portal and office objects inside web
pages and all that as a presentation layer. But I just took a look at
Reporting Services. Reporting Services looks absolutely fantastic for what
it is trying to do and it as very obvious how easy it will be to extend
reporting services to do MUCH, MUCH more very, very easily. RDL (Report
Definition Language) is an idea long overdue and I do believe MSFT are first
with that one. I have not heard any other vendor talk about an RDL yet. I
was amazed that a report can just be exposed as a web service and you can
call it from anywhere with anything. Now THAT is a useful thing to have.
So the question is, how come BIA seems to completely ignore Reporting
Services? I would have thought they would be very closely 'joined'. For
example, why not do the front ends for BIA apps in reporting services? I am
assuming this is possible. I'm assuming reporting services can get data out
of Analysis Server because it can get data out of any ado.net server.
(I must say I thought MSF had hidden how good Report Services is very
well...unless I'm greatly mistaken, I've only had a few hours to look at it,
it looks like a really, really useful product!!!)
Anyway, thanks in advance if anyone can let me know what I am missing in
BIA.
Best Regards
Peter Nolan
www.peternolan.com
Wow. There is a lot of info here. A couple of comments:
Q1) datatypes using in BIA -- well, we make several guesses. First we know
that all measures are numeric (that is a requirement of Analysis Services).
Second we know that fields that we generate have specific uses -- and from
that we know their datatypes. For example, the surrogate keys are integers
(as you would expect because they are identity). For those fields which are
user supplies, e.g. member names -- we know just treat them as "names", i.e.
varchars. If you have something different, e.g. names which are really
integers, they will have to re-do things by-hand. For those things which
could be any datatype, e.g. member properties, you can choice the
appropriate datatypes.
If there are specific tables and columns you are wondering about just tell
me and I'd be glad to explain why we did something one way or the other.
What you are seeing the whole point of the BI Accelerator. You layout the
logical multidimensional design and we auto-generate a relational staging
area; final data mart, OLAP structures, and DTS packages to move the data
through it. Logical data model to final app in one click :-)
And that final app should have all of the tips and tricks that you would see
in a production system; not just a rough proof-of-concept system.
Q2) DTS packages in BIA -- thank you. I wrote the generator code and I am
quite proud of them.
It generates a ton of packages because of the type of schema we choose to
implement. Since we generate a snowflake schema, there are tables for every
dimension and level in the dimensional structure.
We had several goals with the DTS packages. First, we wanted them to be
data-driven because we don't expect everyone to be an expert on DTS. Thus
rather than having to make changes to them, most of them have variables you
can change value and have the package do something different. All of this is
documented in the PAG (online doc set). One of the challenges of this was
that DTS with SQL Server 2000 doesn't have the control flow tasks needed to
make this declaritive -- with 2005, we put that all in natively and similar
packages with 2005 would be quite a bit less complex.
Second, we wanted the packages to be visible and extensiveable by
knowledgeable users. Nothing is hidden -- it is all up front and in your
face. Yes, a novice will look at them and dispare -- but don't give up!
There is documentation in the PAG provided for all of them! Lastly you might
be interested in this white paper which talks about the DTS packages and
provides various tips and tricks beyond what the PAG has in it.
http://msdn.microsoft.com/library/de..._dts_ssabi.asp
If there is a specific step you have questions about, I'd be glad to help
also.
Q3) I am glad you like Reporting Services. I agree it is a fantastic tool.
However, it is a totally different product. BI Accelerator is a BI
application generator; not a report generator. It has a client component
only for customization. The idea was that suppose you had a report called a
"template" which was exactly the same from client to client, but one
customer called products "items" and another called products "books".
Wouldn't it be nice to ship a template along with the multidimensional
design and have the system automatically rename products to items or
products to books. That is what the client generator does with the BI
Accelerator. It ships with a Proclarity component which knows how to go
inside a Proclarity briefing book and replace one tag for another tag. A
similar facility is available from Panorama and there is an API available to
other vendors if they would like to plug into the BI Accelerator client
generator.
Hope this helps.
Dave Wickert [MSFT]
dwickert@.online.microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
This posting is provided "AS IS" with no warranties, and confers no rights.
"Peter Nolan" <peter@.peternolan.com> wrote in message
news:uCEWHDlBFHA.2876@.TK2MSFTNGP12.phx.gbl...
> Hi All,
> this seems like the best place for this question.
> I've had cause to review the MSFT position/tools in the BI area. I'm
> surprised!!! I feel like MSFT have made MUCH more progress than generally
> talked about in newsgroups and customers I work with. All I ever hear
from
> MSFT with respect to DW/BI is 'the next release is going to be great.'
Not,
> 'Hey, take a look at BIA/Reporting Services'...?
> Q1. In BI accelerator I installed and tested out the shopfloor application
> (manufacturing) and I generated the shopfloor database from the XL Sheet.
> But I can't see where it got the table definitions for the staging table
and
> the underlying dimensional database from. For example, where are the
> table/column/datatype definitions of the staging area and the dimensional
> database in the XL sheets? I'm sure I looked at all the sheets in the
> spreadsheet AnalyticsBuilderWB_ShopFloorPerformance.v.0.2.xls. (But maybe
> I'm going crazy and didn't). To me it only seems like the analytical
model
> is in the spreadsheet.
>
> Q2. I see it generates a ton of .dts files. And when I open them up they
> seem to me (a zero level skilled DTS person) to be very complex. The
manual
> says 'don't change the packages' which I gather means don't look, don't
> worry, just run it and it will all be ok....again, I gather that the DTS
> packages are somehow generated by BIA into their binary format and I
believe
> the source to target mappings are defined by the 'mappings' spreadsheet.
But
> in this spreadsheet I don't seem to find enough columns for all the
columns
> in the dimensional model. For example I can't find mappings for
dim_emp_std
> in the spreadsheet but I would have thought it should be there. (Or am I
> missing something)?
>
> Q3. In BIA I see lots about sharepoint portal and office objects inside
web
> pages and all that as a presentation layer. But I just took a look at
> Reporting Services. Reporting Services looks absolutely fantastic for
what
> it is trying to do and it as very obvious how easy it will be to extend
> reporting services to do MUCH, MUCH more very, very easily. RDL (Report
> Definition Language) is an idea long overdue and I do believe MSFT are
first
> with that one. I have not heard any other vendor talk about an RDL yet.
I
> was amazed that a report can just be exposed as a web service and you can
> call it from anywhere with anything. Now THAT is a useful thing to have.
> So the question is, how come BIA seems to completely ignore Reporting
> Services? I would have thought they would be very closely 'joined'. For
> example, why not do the front ends for BIA apps in reporting services? I
am
> assuming this is possible. I'm assuming reporting services can get data
out
> of Analysis Server because it can get data out of any ado.net server.
> (I must say I thought MSF had hidden how good Report Services is very
> well...unless I'm greatly mistaken, I've only had a few hours to look at
it,
> it looks like a really, really useful product!!!)
> Anyway, thanks in advance if anyone can let me know what I am missing in
> BIA.
> Best Regards
> Peter Nolan
> www.peternolan.com
>
|||Hi Dave,
wow, nice to see the guy who 'wrote the code' is watching the forum.
Thanks for your feedback. Makes sense and I can see where BIA could
head in the next release.
You'll be pleased to hear this...the reason for my interest is some
colleagues and I are bringing to market a product that will be based
solely on MSFT BI technologies and part of that is we are researching
everything we can find about MSFT BI and the way forward with MSFT
BI.....I haven't seen any kind of comprehensive suite of slides that
is public in this matter...if you hapeen to know where one can be
downloaded from that would be great.
I stall have to figure out if BIA can fit into what we are doing. We
shall see.
My other comment...
"Q2) DTS packages in BIA -- thank you. I wrote the generator =ADcode and
I am
quite proud of them."
Let's just say I recognise a good idea and a smart guy when I see
one...;-)
One of my specialities is ETL and reducing the cost of writing ETL for
my clients. I've been doing DW for 14 years now and ETL consumes so
much of the money it's the obvious thing to keep working on to reduce.
About 2.5 years ago I investigated the possibility of writing a
generator for ETL jobs in Informatica and DataStage. I found it was
certainly possible to do so. The 'problem' was that it would be
impossible for an 'independent' to make any money out of it. If the
idea did well it would be trivial for the vendor to write the same
functionality and release it. My generator was planned to produce XML
as both INFA/DS can export/import jobs in XML format.
So I was impressed that you have come up with the same idea and from
looking at the time frames you must have had that idea before me...well
done..;-)...And that you are able to produce the DTS jobs in binary.
(Though I'm sure being MSFT you can see the source code/structures for
DTS.)
Me, I changed course after my evaluation and wrote my own ETL tool. It
is very productive. Rather than map fields at field level it maps at
table levels and moves fields within tables on matching column names.
So it is not actually necessary to ever define source to target
mappings anywhere. It discovers the column names at run time so that
when more columns are added there is no code to regenerate or change
and it is the changing of ETL code that is expensive in DW maintenance.
This is another reason why I was impressed with what you have
done....yours is the first example I have seen where it might be
possible to get away with ETL changes without re-testing in the
non-production environment while still using a 'real' ETL tool.
We have also settled on a spreadsheet as the way to record all
mappings. But we had not automated the generation of jobs/code because
even on very large projects the amount of time taken is very small. (I
recently built the ETL for a staging area+DW with 100 tables and 3,000
fields in 2 weeks so there didn't seem to be a need to speed it up even
further.) So your idea of putting a 'generate' button in the
spreadsheet and a whole set of ticks on what to generate out of the
spreadsheet also intrigued me. It looks like a really 'neat trick'. I
am an 'XL-dummy' so that thought had not crossed my mind before. I must
ask one of my colleagues how that was done......The thought has
crossed my mind that our spreadsheet could be extended to record other
information required for ETL generation and we could cut even the 2
weeks work we do now out by generating what we need to generate...
So, like I said, I just recognised a good idea and a smart guy...;-)
All the best...I'll be investigating BIA some more and look forward to
seeing what happens next...
Best Regards
Peter Nolan
www.peternolan.com
|||Actually the DTS generator uses the normal SQL Server APIs to create its
packages. All I did was to create the packages that I wanted; then saved the
package as VB. Then I reversed engineered the VB back into production
quality code. We never had to look at the source code for DTS at all.
Dave Wickert [MSFT]
dwickert@.online.microsoft.com
Program Manager
BI SystemsTeam
SQL BI Product Unit (Analysis Services)
This posting is provided "AS IS" with no warranties, and confers no rights.
"Peter Nolan" <peter@.peternolan.com> wrote in message
news:1107346392.279399.284200@.f14g2000cwb.googlegr oups.com...
Hi Dave,
wow, nice to see the guy who 'wrote the code' is watching the forum.
Thanks for your feedback. Makes sense and I can see where BIA could
head in the next release.
You'll be pleased to hear this...the reason for my interest is some
colleagues and I are bringing to market a product that will be based
solely on MSFT BI technologies and part of that is we are researching
everything we can find about MSFT BI and the way forward with MSFT
BI.....I haven't seen any kind of comprehensive suite of slides that
is public in this matter...if you hapeen to know where one can be
downloaded from that would be great.
I stall have to figure out if BIA can fit into what we are doing. We
shall see.
My other comment...
"Q2) DTS packages in BIA -- thank you. I wrote the generator Xcode and
I am
quite proud of them."
Let's just say I recognise a good idea and a smart guy when I see
one...;-)
One of my specialities is ETL and reducing the cost of writing ETL for
my clients. I've been doing DW for 14 years now and ETL consumes so
much of the money it's the obvious thing to keep working on to reduce.
About 2.5 years ago I investigated the possibility of writing a
generator for ETL jobs in Informatica and DataStage. I found it was
certainly possible to do so. The 'problem' was that it would be
impossible for an 'independent' to make any money out of it. If the
idea did well it would be trivial for the vendor to write the same
functionality and release it. My generator was planned to produce XML
as both INFA/DS can export/import jobs in XML format.
So I was impressed that you have come up with the same idea and from
looking at the time frames you must have had that idea before me...well
done..;-)...And that you are able to produce the DTS jobs in binary.
(Though I'm sure being MSFT you can see the source code/structures for
DTS.)
Me, I changed course after my evaluation and wrote my own ETL tool. It
is very productive. Rather than map fields at field level it maps at
table levels and moves fields within tables on matching column names.
So it is not actually necessary to ever define source to target
mappings anywhere. It discovers the column names at run time so that
when more columns are added there is no code to regenerate or change
and it is the changing of ETL code that is expensive in DW maintenance.
This is another reason why I was impressed with what you have
done....yours is the first example I have seen where it might be
possible to get away with ETL changes without re-testing in the
non-production environment while still using a 'real' ETL tool.
We have also settled on a spreadsheet as the way to record all
mappings. But we had not automated the generation of jobs/code because
even on very large projects the amount of time taken is very small. (I
recently built the ETL for a staging area+DW with 100 tables and 3,000
fields in 2 weeks so there didn't seem to be a need to speed it up even
further.) So your idea of putting a 'generate' button in the
spreadsheet and a whole set of ticks on what to generate out of the
spreadsheet also intrigued me. It looks like a really 'neat trick'. I
am an 'XL-dummy' so that thought had not crossed my mind before. I must
ask one of my colleagues how that was done......The thought has
crossed my mind that our spreadsheet could be extended to record other
information required for ETL generation and we could cut even the 2
weeks work we do now out by generating what we need to generate...
So, like I said, I just recognised a good idea and a smart guy...;-)
All the best...I'll be investigating BIA some more and look forward to
seeing what happens next...
Best Regards
Peter Nolan
www.peternolan.com
|||Hi Dave,
interesting...I am not up on DTS and I had never heard that the
packages could be saved to VB...I must look into DTS when the next
release comes out.....
I've ask one of my XL knowledgable guys how to put something like a
'generate' button into the spreadsheets we use to write our mappings
and to call some C++ code we have......In fact, we type the
definitions into a spreadsheet and then drop them into the database to
run the code to generate the tables/views!!! (LOL) This is when we
aren't using a data modelling tool. Most places have a standard tool
that must be used for table definitions...
Best Regards
Peter Nolan
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment