Excel and OleDb stupidity

This afternoon I was working on a Windows Forms app for the finance guys, to help them allocate costs to systems, proportional to predefined allocation keys.


The app has to import Excel spreadsheet files and perform all the database actions.


The key columns to uniquely identify a cost are text values. Unfortunately, some of those key values do not contain letters, but only numeric characters. And this is where it sucks to have to work with Excel.


This is because Excel knows no datatypes. You can specify formatting for cells, but Excel ignores that when someone asks for cell data. Instead, it scans the first couple of rows to guess the datatype. By default it checks the first 8 rows. If those contains mixed values, it uses a specified default type for those values that are not numeric.


This means that if you use OleDb to retrieve values, the column type can vary per row.


What can you do about it? Well nothing really. You can tweak some values, and hope that you will never encounter the corner cases that you cannot do anything about. There is plenty of information to be found. One good explanation was this.


However, one of the finance guys gave me an incredibly useful tip, which you can use as a workaround. You have to manually make a couple of easy modifications to the excel file, and Excel will always use the text data type.


  1. Assume that the offending data is in column 1, and that the first row is the header row. 
  2. You insert 2 empty columns to the right of the offending column.
  3. Put an ‘ in the first cell of the first empty column (B2).
  4. Then enter =B2&B1in the cell B3.
  5. Select B2 and B3, and double-click the bottom right corner of the selection box. This will copy the data from there to the last row.
  6. B3 will now hold the data as text, because the ‘ tells Excel that no matter what is in the cell, it is text.
  7. Then you copy column 3, paste it as ‘values only’ over column 1.
  8. Delete columns 2 and 3 again.

Now you can read the column in OleDb and always get text values for that column.

2 thoughts on “Excel and OleDb stupidity”

  1. You can use IMEX=1 extended property when defining connection string for OLE DB driver and get all cells for all columns as text.

  2. I tried that, but it did not work completely.

    The fields that had only numbers were still parsed as doubles, while the fields that contained a mix were passed as text. I needed all fields to be passed as text, and I don’t think IMEX does that.

    IMEX only prevents those mixed fields from being passed as DBNull. At least in the test I did.
    Which version of Excel did you try this with? I used Office 2003 SP2

Leave a Reply

Your email address will not be published. Required fields are marked *


*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>