[PowerShell] How to Join Data in faster way?

Last week, one of Hong Kong ex-MVP asked a question to our ex-MVPs private group. He has a PowerShell script and 2 csv data files. Both csv data file has a common column and he could like to join them and combine into a single csv data file. his code  like this,

Get csv1 | ForEach-Object -Parallel Where-Object $_.id -match $csv2.id

He complaints that it will take a very long time to handle large amount of data. And there are some comments,

  • change from -match to -eq
  • try to use hash
  • something about add-member
    …etc

From me, I just say, why not do it with LINQ? So I try to write 2 PowerShell scripts, one will be using traditional PowerShell command and another one will be using LINQ.

First, in my testing code, I will try to create 2 arrays and then generate some data into it. Here is my code,

#Create empty arrays
$datasetA = @()
$datasetB = @()
#Initialize "status" arrays to pull random values from
$genderArray = @('Male','Female','Decline to Answer')
#Loop 10 times to populate our separate datasets
1..10 | Foreach-Object {
    #Set the name with the current iteration attached
    $thisName = "Person_$_" 
    #Create an object with the name property and a random gender
    $rowA = @{
        Name = $thisName
        Gender = $genderArray[(Get-Random -Minimum 0 -Maximum 3)]
    }
    $datasetA += New-Object -Type PSObject -Property $rowA
    #Create a second object with the same name and a random age
    $rowB = @{
        Name = $thisName
        Age = Get-Random -Minimum 1 -Maximum 120
    }
    $datasetB += New-Object -Type PSObject -Property $rowB
}

It will create 2 array, each contains 10 rows. The common column will be the Name column.  One array contains a Gender column, another array contains Age column. Simple data, right?

In traditional PowerShell, we might iterate through one of the arrays while doing a filter on the second array. And then either add property members to the first array or create a new objects with a combination of properties from both arrays. My code then will be something like this:

$joinedDataset =@()
foreach($rowA in $datasetA) {
    $rowB = $datasetB | Where-Object Name -eq $rowA.Name
    $joinedRow = @{
        Name = $rowA.Name
        Gender = $rowA.Gender
        Age = $rowB.Age
    }
    $joinedDataset += New-Object -Type PSObject -Property $joinedRow
}

It works fine and it only takes 32.6 milliseconds to complete. But if I increase the iterate number from 10 to 100, it then takes 174.1 milliseconds. And then 7025.7 milliseconds for 1000 records, 638429.8 milliseconds (10 min 38 sec) for 10000 records.
(P.S., it include the time it take to generate iteration arrays)

Now, let’s do the same with using LINQ:

$linqJoinedDataset = [System.Linq.Enumerable]::Join(
    $datasetA,
    $datasetB,
    [System.Func[Object,string]] {param ($x);$x.Name},
    [System.Func[Object,string]]{param ($y);$y.Name},
    [System.Func[Object,Object,Object]]{
        param ($x,$y);
        New-Object -TypeName PSObject -Property @{
        Name = $x.Name;
        Gender = $x.Gender;
        Age = $y.Age}
    }
)

$OutputArray = [System.Linq.Enumerable]::ToArray($linqJoinedDataset)

The time it takes will be as following,

  • 37.8 milliseconds for 10 records
  • 79.8 milliseconds for 100 records
  • 809.4 milliseconds for 1000 records
  • 16026.2 milliseconds for 10000 records

In this code, I am calling the Join method on System.Linq.Enumerable and then passing it with 5 parameters,

  1. The first dataset that it is going to join
  2. The second dataset that it joins
  3. The delegate which defines the key to compare against on the first dataset
  4. The delegate which defines the key to compare against on the second dataset
  5. Finally, we pass in the delegate which defines what the output should be

Because the Join method is a Deferred Execution. it is not actually joining the data at that time. It is just building an expression tree which defines the relational algebra needed to perform the join. So the above example code, I will need to call “ToArray()”, so that the execution could start. If you have any further codes that you will handle with the result data, you could skip the “ToArray()” code. For example, calling like this,

$linqJoinedDataset.Where({($_.Age -gt 20) -and ($_.Gender -eq "Male")})

Then the join query would execute at that time and then “Where()” would filter down to just the objects that matching the conditions.

Here is the table on the execution time on each way,

# iterate Traditional PS LINQ
10 32.6 37.8
100 174.1 79.8
1000 7025.7 809.4
1000 638429.8 16026.2

So I can say, LINQ saved more than 40 times of timing to complete the join. With this figure, I am sure that you will love to update and use LINQ if you need a join operation on data.

Hope this blog could help you to get a faster execution script in PowerShell.

Enable TLS 1.2 or above on your ASP.NET Web App or WebAPI

The Transport Layer Security (TLS) 1.2 is a stadnard that provides security improvements over previous versions. More and more thrid-party APIs were configured to disable any requests from clients that were using TLS 1.0/1.1. So if your ASP.NET Web App or WebAPI Services Web Site will need to update to TLS 1.2 as well if your ASP.NET Web App or WebAPI Services Web Site has some calls to the third-party APIs, otherwise they will only return empty responses.

You could disable TLS 1.0/1.1 and only enable TLS 1.2 in your Web Server or in Azure, so that your hosting environments will no longer accept requests from earlier version of TLS.

But what happens on your application (ASP.NET Web App or WebAPI Services)? Depend on what version of .NET framework your project usrs will dicate the possible solutions available to you.

  1. If your project compiles against .NET Framework 4.7 or above, then you don’t have to do anything.
  2. If your project has been developed in a earlier version of .NET Framework, then you could either
    1. Recompile your project using .NET Framework 4.7 or above
    2. If recompiling is not an option, then you will have to update your .config file as below,
<configuration>
  <runtime>
    <AppContextSwitchOverrides value="Switch.System.Net.DontEnableSystemDefaultTlsVersions=false"/>
  </runtime>
  <system.web>
    <compilation targetFramework="x.y.z" />
    <httpRuntime targetFramework="x.y.z" /> 
  </system.web>
</configuration>

It is preferred that x.y.z are the same. So if your application is 4.6.2, then replacing x.y.z into 4.6.2.

Microsoft also has post a useful document on describing the best pratices to TLS 1.2. It will be great if you could read them all and understand them in order to fully secure your application(ASP.NET Web App or WebAPI Services).
https://docs.microsoft.com/en-us/dotnet/framework/network-programming/tls

 

Bot Framework Composer helps you build bot with GUI and “No Code”

Do you know that Microsoft has released a Bot Framework Composer. Using the Bot Framework Composer presents some advantages when compared to creating a bot with the SDK and coding.

  • Use of Adaptive Dialogs allow for Language Generation (LG), which can simplify interruption handling and give bots character.
  • Visual design surface in Composer eliminates the need for boilerplate code and makes bot development more accessible.
  • Time saved with fewer steps to set up your environment.
  • The Composer bot projects contain reusable assets in the form of JSON and Markdown files that can be bundled and packaged with a bot’s source code. The files can be checked into source control systems and deployed along with code updates, such as dialogs, language understanding (LU) training data, and message templates.

It is available as a Desktop application as well as a web-based component. You will also need to install the Bot Framework Emulator and ensure that you have .NET Core SDK 3.1 or later installed as well.

And this Bot Framework Composer is also one of the key components in the exam if you are going to take AI-102: Designing and Implementing a Microsoft Azure AI Solution (the replacement exam of AI-100 after 30th June 2021).

BotComposer_home

 

[PowerShell] How to Retrieve Data with ADO.NET

I am primary a C# developer since 2005. Before that I was a VB5/VB6/VB.NET developer. Recent, I changed my role from development side to a consulting side. I found out that I have to do some development by using powershell script. It is because I will pass the program to client and we should try to build with script or any “no compile needed language”. And so I start writing PowerShell script. Because of this, I will write some blog about PowerShell  that I learned from my job. And this is the first one in this series.

My first PowerShell task is, retrieve data from MSSQL.

From my C# knowledge, I would like to retrieve data by using ADO.NET. So I start looking for how I could use C# library in PowerShell Script. And lucky that there is a way.

Below is my first script in PowerShell.

Function GetDataFromMSSQL {
    Write-Host "====================================="
    $sqlConn = New-Object System.Data.SqlClient.SqlConnection
    $sqlConn.ConnectionString = "Server=localhost;Integrated Security=true;Initial Catalog=AdventureWorks2019;"
    
    try {
        $sqlConn.Open()
        Write-Host "Data Source is connected."
        #$sqlcmd = $sqlConn.CreateCommand()
        $sqlcmd = New-Object System.Data.SqlClient.SqlCommand
        $sqlcmd.Connection = $sqlConn
        #$selectQuery = "SELECT TOP (1000) [SpecialOfferID],[Description],[DiscountPct],[MinQty],[MaxQty] FROM [AdventureWorks2019].[Sales].[SpecialOffer]"
        $sqlcmd.CommandText = $selectQuery

        $sqladp = New-Object System.Data.SqlClient.SqlDataAdapter $sqlcmd
        $data = New-Object System.Data.DataSet
        $sqladp.Fill($data) | Out-Null

        $rowCount = $data.Tables[0].Rows.Count
        Write-Host "$rowCount rows returned."
        $data.Tables
        $true
    } catch {
        Write-Host "Data Source cannot be connected."
        Write-Host $Error[0].Exception.Message
        $false
    } finally {
        $sqlConn.Close()
    }
}
GetDataFromMSSQL

 

No internet access on Host after enabled Hyper-V (or any VM software, i.e., VirtualBox)

Recent, I installed Oracle VM VirtualBox and enabled Hyper-V feature on my Win10 Professional (1709). Since then I cannot browse any intranet or internet web pages in IE and Edge. But there is no problem at all if I disable the newly added virtual network adapter.

I tried to google and search how to fix it. But I found no clue at all.

After few days, suddenly a light bulb goes on in my head.

I think it might be the adapter priorities so all the network traffic is sending out through the virtual network adapter. I then find out the REAL solution to fix it. Now all of the newly added virtual network adapters are enabled and I have no problem at all on browsing intranet and internet web pages.

Here are the steps to fix.

    1. Open Windows PowerShell
    2. Run “Get-NetAdapter” to list all the adapters info
    3. Now you can see the IfIndex value for the virtual network adapter (it is 18 in my PC, it will be other value in your PC). Remember this value.
    4. Then we run “Get-NetIPInterface” to list all the IP interface info
    5. The “real” network adapter for me to go intranet and internet is IfIndex=52, and its InterfaceMetric is 35.
    6. The virtual network adapter (IfIndex=18) is having InterfaceMetric=15.
    7. In windows, the lower number means the higher priorities. So that means all the network traffic will route out from the virtual network adapter and so it seems I lost all the internet access. Now what we should do is, lower the priorities of the virtual network adapter. Because my wifi adapter is having InterfaceMetric=45. So I think putting 55 for the virtual network adapter is a good choice. Let’s try to make the change now.
    8. Let’s open “Windows Settings”
    9. Click into “Network & Internet”
    10. Click into “Change Adapter Options”
    11. Right click the virtual network adapter and select “Properties” button
    12. select “TCP/IPv4” and click “Properties” button
    13. Click “Advanced…” button
    14. Clear the “Automatic metric” and enter “55” to the textbox of Interface Metric (entering other proper value in your case)
    15. Click “OK” button and until all property windows are closed.
    16. Repeat the same steps from (12) to (15) for “TCP/IPv6”
    17. You may run “Get-NetIPInterface” in PowerShell Window to make sure the values are applied.

DONE!
These are all the steps to fix this issue. I hope these steps could also help you.