Saturday, 3 April 2010

Best ETL for complex files

Further to my last post, if anyone knows of an ETL tool that eats complicated proprietary files (multiple lines per record, arbitrary format), please drop me a line!

Thanks

Wednesday, 31 March 2010

Re-learning Perl

I haven't used Perl much in the last 5 or so years. I prefer 'modern' languages like Ruby, but some ETL tasks have come up on our project that are way beyond what SSIS can handle nicely. Rather than use SQL as a text parsing language, we have decided to use Perl.

An example of the type of problem we are facing is this: we get a file where the first character of each line determines the format of the rest of that line. To get a complete transaction, you need to process 3 lines. For example, we have (this is for a telco system):

1[TAB]123345[TAB]Acme Co
2[TAB]+61290099009
3[TAB]+61380088008[TAB]OUTBOUND[TAB]0.56[TAB]0.09

Where [TAB] means a tab character.

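So we get the account number and customer name on line 1, the calling number on line 2, and the called number, call type, duration and cost on line 3. There is a 1-N relationship from type 1 lines to type 2 lines, and from type 2 lines to type 3 lines: one account can have many calling numbers, and one calling number can have many calls.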

So rather than write the most complicated cursor ever, we are using Perl. The thing is, I'm a bit rusty.
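
A minimal sketch of how this kind of file could be flattened in Perl, offered as one possible approach rather than the actual solution. It assumes lines always arrive in 1, 2, 3 order as in the sample, and it carries the most recent type 1 and type 2 values down onto every type 3 line. The function name flatten_calls and the hash keys (account, customer, calling and so on) are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the three-level file into one flat row per call.
# Key names are assumptions based on the sample lines, not a real schema.
sub flatten_calls {
    my ($fh) = @_;
    my (@rows, $account, $calling);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;              # strip Unix or Windows line ending
        next if $line eq '';               # ignore blank lines
        my ($type, @f) = split /\t/, $line, -1;
        if ($type eq '1') {                # account number, customer name
            $account = { account => $f[0], customer => $f[1] };
            undef $calling;                # new account: forget the old calling number
        }
        elsif ($type eq '2') {             # calling number
            die "line $.: type 2 before any type 1\n" unless $account;
            $calling = $f[0];
        }
        elsif ($type eq '3') {             # called number, call type, duration, cost
            die "line $.: type 3 before any type 2\n" unless defined $calling;
            push @rows, {
                %$account,
                calling   => $calling,
                called    => $f[0],
                call_type => $f[1],
                duration  => $f[2],
                cost      => $f[3],
            };
        }
        else {
            die "line $.: unknown record type '$type'\n";
        }
    }
    return \@rows;
}

# Usage: read the sample from an in-memory filehandle, print one line per call.
my $sample = join "\n",
    "1\t123345\tAcme Co",
    "2\t+61290099009",
    "3\t+61380088008\tOUTBOUND\t0.56\t0.09",
    "";
open my $fh, '<', \$sample or die "cannot open sample: $!";
my $rows = flatten_calls($fh);
close $fh;
for my $r (@$rows) {
    print join("\t", @$r{qw(account customer calling called call_type duration cost)}), "\n";
}
```

The output is one tab-separated row per call, which is the shape a staging table load (SSIS or otherwise) generally wants. Dying on an out-of-order line is a deliberate choice in this sketch: a bad file stops the load instead of silently attaching calls to the wrong account.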

I'm interested to know:
a)  What are some good books for re-learning Perl (as opposed to starting from scratch)?
b)  Do you agree with the approach? How would you tackle this problem?

Leigh.

Sunday, 14 March 2010

Mobile blogging

I just got a new mobile blogging app. It doesn't really suit the more detailed posts I prefer to make, but I'll give it a try!

My boss resigned last week, largely due to the lack of focus management have been giving BI. Looks like a turbulent period ahead. Hopefully it gives management the wake-up call (I believe) they need.

Thursday, 11 March 2010

Can your Business Intelligence team be too good for your organisation?

Firstly, I don't think for a second I am too good, but I suspect my team is 'too good' for our organisation.

What do I mean by this? We have been quite a successful business intelligence team, delivering to various areas of the business for the last couple of years, so when management decided to put in a new multi-million dollar billing system we were brought in from the start to integrate it into our BI world.

And here is where things started to go wrong. Because we have more experience and skill in extracting data from our various systems, we were quickly roped into the data extraction, cleansing and migration tasks. Because the application vendor charged too much to develop interfaces to our other systems and to external organisations, these became deliverables of the data warehouse. We have ended up responsible for 75% of the technical tasks on the implementation of this new billing system.

Now we are receiving a lot of praise and thanks from management, but we no longer have capacity to cover the BI work we had planned for this financial year.  Our people are getting frustrated and we are starting to lose staff.

The billing system still has another 3-6 months of work to complete (not counting any BAU tasks we may be left with at the end) and most of our team are out of patience.

I have tried to think of ways we can get out of this, but short of us screwing up royally (which we would never do intentionally) I can't see how we can convince management to let us return to our core area.

Any ideas? Have you been in this situation, and what did you do about it?