PowerShell - Sum similar entries from multiple CSV files
One of my script is scheduled to download everyday the proxy logs files from multiple proxies (Approx 1>2GB per file) of the previous day. The second step is to parse each of them and get the top 200 domain names within a specific environment. Finally at the end of the month another script create a report on the monthly internet usage.
I thought this was an interesting exercise even if some tools would probably do a better job ($$$). Also we shouldn’t take those results too seriously since some protocols like Ajax or HTML5 talk a lot to the servers, keep refreshing pages even if you are not actively working on them.
In this post, I will talk about the last part of this process and how I combine all those files to get a real monthly top domains.
Example of daily report
Here is a sample of daily CSV file result after the parsing has done its job:
So at the end of the month I have a bunch of CSV files that were generated during the months and I need to know how many time each domains were accessed.
One-Liner Magic!
To calculate the Sum of each domains for each months I can simply use the following One-Liner:
Get-ChildItem -Path .\*.csv | # Get each CSV files
ForEach-Object -Process {
Import-Csv -Path $PSItem.FullName # Import CSV data
} |
Group-Object -Property Name | # Group per Domain Name
Select-Object -Unique -Property Name, @{
Label = "Sum";
Expression = {
# Sum all the counts for each domain
($PSItem.group | Measure-Object -Property Count -sum).Sum
}
} |
Sort-Object -Property Sum -Descending |
Out-GridView -Title "Top Domains"
Step by Step
Get all the CSV File
```
Now we import the data inside each file. I added a Sort-Object so you can see that we have duplicates (since we have the same values coming from different files).
Using Group-Object by Domain Name. This will group also all the different “counts” for each domain
Finally we use Select-Object to show a unique Domain Name and Sum each counts present in the "Group" property from the Group-Object Cmdlet (see above).
Leave a comment