Tuesday, 20 July 2010

Use SSIS Script component to pull data from MS Office Word File

Well some time we face some requirements which really frustrate you..
Same scenario I got end up few days back when I have to pull data from a Word Files.
Yup our Microsoft Office Word Files. I have no choice to make it work so I have started looking into it,
Finally I was manage to write a Script task which can pull data from MS Word.
So here we are.. Following is script to pull data from MS Office Word Files…

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Collections.Generic;
using System.Windows.Forms;
using Microsoft.Office.Interop;
using System.Reflection;
using Office = Microsoft.Office.Core;
using Microsoft.Office.Interop.Word;
using Microsoft.VisualStudio.Tools.Applications.Runtime;
using Excel = Microsoft.Office.Tools.Excel;
using Word = Microsoft.Office.Interop.Word;
public class ScriptMain : UserComponent
public override void PreExecute()
public override void PostExecute()
public override void CreateNewOutputRows()
Microsoft.Office.Interop.Word.ApplicationClass wordApp = new Microsoft.Office.Interop.Word.ApplicationClass();
object file = "D:\\AESMetricsforMarch2010.doc"; // Specify path for word file
object nullobj = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Open(ref file, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj,
ref nullobj, ref nullobj, ref nullobj, ref nullobj);
int count = 0;
int i = 1;
foreach (Microsoft.Office.Interop.Word.Table table in doc.Tables)
Microsoft.Office.Interop.Word.Table firstTable = doc.Tables[i];
foreach (Microsoft.Office.Interop.Word.Row row in firstTable.Rows)
List<string> cellValues = new List<string>();
foreach (Microsoft.Office.Interop.Word.Cell cell in row.Cells)
string cellContents = cell.Range.Text;
// add the cell contents to the array, but remove the strange termination character on the end of the data
cellValues.Add(cellContents.Remove(cellContents.Length - 1));
// the first row is the column header
if (count == 0)
// form.AddHeader(cellValues);
Output0Buffer.Column = cellValues[0].ToString();
Output0Buffer.Column1 = cellValues[1].ToString();
Output0Buffer.Column2 = cellValues[2].ToString();
Output0Buffer.Column3 = cellValues[3].ToString();
Output0Buffer.Column4 = cellValues[4].ToString();
Please note that you have to include some of the office assembly in your SSIS Script Component reference.

Creating MDX for based on the date parameter.

In many reporting requirement we have to generate reports based on the date selected by end user or it can be for current date, current week or current year.
Here I am going how we can do same with the help of MDX –
1. When it comes to current year, current month, current date….
For creating MDX for it you can use VB function in the MDX which will help you to get current date
Example is as follows –
SELECT [Measures].[Order Quantity] on 0,
STRTOSET("[Date].[Calendar Year].&["+FORMAT(VBA!Now(),"yyyy")+"]") on 1
FROM [Adventure Works]
Same can be done for current month or day.
2. If you have to show data based on user parameter then following MDX can help you –
SELECT [Measures].[Order Quantity] on 0,
STRTOSET("[Date].[Date].&["+FORMAT(cdate(@date_param),"YY-MM-DD")+ "T00:00:00]") on 1
FROM [Adventure Works]
3. If you have two start date and End date then –
SELECT [Measures].[Order Quantity] on 0,
{STRTOSET("[Date].[Date].&["+FORMAT(cdate(@Startdate_param),"YY-MM-DD")+ "T00:00:00]")
: STRTOSET("[Date].[Date].&["+FORMAT(cdate(@Enddate_param),"YY-MM-DD")+ "T00:00:00]")}on 1
FROM [Adventure Works]
So.. This how we can create dynamic MDX…………………