Clear As Mud




A question on one of the forums recently asked about hiding sheets in a workbook to be posted as an example, but where that workbook contained sensitive information. As I replied, hiding the sheets doesn’t really hide the data so he would be just as exposed.

Jimmy Pena has a a great application for generating random data, but sometimes it is just better to scramble the data that you already have. I suggested to the poster that they should really be scrambling the data, and further suggested that it should not be too difficult to write some code to do so.

That got me thinking as to how to do it, in a generic manner. It should cater for names, ids, amounts, currencies and so on.

A Simple Scrarmbling App

My first thought was to insert a row at the top of the target sheet, and add an entry for all columns to be obfuscated. So, for instance, if it had ‘Name’ at the top, each entry in that column would be changed to Name #1, Name #2 etc. If it had Number<7>, you would generate a random number up to 9,999,999. If it had Decimal<5,2> you would generate numbers up 99,999.99, and so on. If the value in row 1 was blank, that column gets ignored.

So you would insert a row with values such as ‘Name’, ‘Number<7>’, and so on.

The code for this was relatively simple to write, just a loop through each column and change the values. To ensure that all like values get changed to the same value, I did a Replace on the whole column once a value to be changed was found. In other words, if Bob appeared 100 times in the column, each instance of Bob would be changed to the same value. To ensure that I didn’t then go and change the second, changed, instance of Bob to some new value, I pre-pended the new value with ‘Obfuscate’, which I stripped off with a Replace at the end.

The main loop was like this

            mpDataType = .Cells(1, j).Value

            mpNextItem = 0


            If mpDataType <> “” Then

                mpLastRow = .Cells(.Rows.Count, j).End(xlUp).Row


                For i = 2 To mpLastRow

                    If Not .Cells(i, j).Value Like “Obfuscated*” Then


                        mpNextItem = mpNextItem + 1

                        Select Case True


                            Case mpDataType Like “Number<*”

                                ‘number processing


                            Case mpDataType Like “Decimal<*”

                                ‘decimal processing


                            Case mpDataType Like “Currency<*”

                               ‘currency processing


                            Case Else                                

                                .Columns(j).Replace What:=.Cells(i, j).Value, _

                                                    Replacement:=”Obfuscated” & mpDataType & ” #” & mpNextItem, _

                                                    LookAt:=xlWhole, _


                        End Select

                    End If

                Next i


            .UsedRange.Replace What:=”Obfuscated”, _

                           Replacement:=””, _

                           LookAt:=xlPart, _



So far, so good, but I could see one major problem. If there is a formula that refers to data somewhere else on the spreadsheet, that table needs to be obfuscated too, but in a smart way. For instance, say that there is a lookup table of names Bob, Simon and Alex. If there is a formula somewhere of =VLOOKUP(A20,lookup_table,2,FALSE), and A20 is one of those values in the table, then that value in the lookup table should change to the same value that A20 switches too. Unfortunately, it isn’t only VLOOKUP, it is LOOKUP, HLOOKUP, COUNTIF, and so on. Tough!

It is actually worse if the values are used elsewhere a simple cell reference or by a code event update, there is no way in my code to recognise that.


In the end, I decided to avoid this route, far too difficult, and I opted to save all the before values, and all of after values in separate arrays, and after having removed the ‘Obfuscated’ tag I went through each sheet and checked if any of the before values still remained, if so I replaced them.

A bit brute force, but it seems to work okay.

The data is first changed like so


                            Case Else

                                mpIdxChange = mpIdxChange + 1

                                mpBefores(mpIdxChange) = .Cells(i, j).Value

                                mpAfters(mpIdxChange) = mpDataType & ” #” & mpNextItem

                                If mpIdxChange Mod 1000 = 0 Then


                                    mpSizeArray = UBound(mpBefores) + 1000

                                    ReDim Preserve mpBefores(1 To mpSizeArray)

                                    ReDim Preserve mpAfters(1 To mpSizeArray)

                                End If


                                .Columns(j).Replace What:=.Cells(i, j).Value, _

                                                    Replacement:=”Obfuscated” & mpAfters(mpIdxChange), _

                                                    LookAt:=xlWhole, _


 And finally ‘corrected’ like so


        ‘now make the changes for names, ids, etc.

        ReDim Preserve mpBefores(1 To mpIdxChange)

        ReDim Preserve mpAfters(1 To mpIdxChange)


        ‘first remove the tag

        .UsedRange.Replace What:=”Obfuscated”, _

                       Replacement:=””, _

                       LookAt:=xlPart, _



        For Each mpWS In ActiveWorkbook.Worksheets


            ‘then update any associated values

            For i = 1 To mpIdxChange


                mpWS.UsedRange.Replace What:=mpBefores(i), _

                                       Replacement:=mpAfters(i), _

                                       LookAt:=xlWhole, _


            Next i

        Next mpWS



Of course, it can be taken a lot further, adding further sophistication. For example, it doesn’t explicitly cater for an Excel spreadsheet that is constructed as a pseudo-database, separate, linked tables. And it probably needs a decent progress bar  as it could take some time on large data.

An addin with the code and an example file can be downloaded via the RSS feed. The addin adds an item to the Data menu.


Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>