Scripting Games 2013 - Advanced Event 5 - The Logfile Labyrinth

2013/06/13 | 7 minute read |

This is my solution for the Advanced Event 5. I did not have much time to work on this event, but here is the script I submitted.

Instruction

Download [Skydrive]

Dr. Scripto finds himself in possession of a bunch of IIS log files, much like the one at http://morelunches.com/files/powershell3/LogFiles.zip, if you need one to practice with. He's keeping all of the log files in a folder, and he's left the log files with their default filenames, which he's given a .LOG filename extension. All of the files are for a single Web site, on a single Web server. He'd like you to write a tool that accepts a path, and then simply scans through each file in that path somehow, generating a list of each unique client IP address that has been used to access the Web site. No IP address should appear more than once in your output, and you don't need to sort the output in any way. Your tool should optionally accept an IP address mask like "192.0.1.*" and only display IP addresses that match the specified pattern. If run without a pattern, display all IP addresses. Regardless of the addresses found in the sample file linked above, you should assume that any legal IP address may appear in the files Dr. Scripto needs to scan. Your command should scan all of the files in the folder (and the folder doesn't contain any other kind of file) and produce a single set of results. If an IP address appears in multiple log files (and it's likely that will be the case), then your final output should still only list that IP address once.

Solution

Dr. Scripto's requirements

  • Tool to scan IIS logs

  • Parameter Path (to specify the location of the logs)

  • Parameter IP address mask like “192.0.1.*” to filter the output on a specific pattern

  • No pattern specified: show all unique client IP addresses

  • Output should show the unique client IP addresses that have been used to access the website

  • No IP address should appear more than once in the output

  • No sorting

IIS Log File formats overview

In my script I only focused on the W3C format, but here is the list of formats supported by IIS:

* W3C (World Wide Web Consortium) Extended log file format: This is the default log file format used by IIS. It uses ASCII text and times are recorded as UTC. This is the only format where you can customize the properties, so you can limit the size of the log files and obtain detailed information. The properties written in the log files are separated by spaces.
* IIS (Microsoft Internet Information Services) log file format: This format also uses ASCII text and a fixed number of properties. The IIS log file format is used when you don't need detailed information from the logs; it logs more information than the NCSA common format but less than the W3C format. It is a comma-separated file and uses the local time.
* NCSA (National Center for Supercomputing Applications) log file format: This format logs only basic information. Like the IIS log file format, it uses a fixed number of properties. It records the time using the local time, and properties are separated by spaces. Note that the NCSA log file format does not support FTP sites. Since the entries are small, this format requires less storage space than the other formats.
* Centralized Binary Logging: Centralized binary logging is used when multiple web sites running on a server write binary, unformatted log data to a single log file. Each web server running IIS creates one log file for all sites on that server. IIS writes the log files in binary format and uses a single file, which makes it memory efficient. This type of logging is not supported at web site level.
* ODBC log file format: This method is used when you want to log access information directly to a database. Enabling ODBC logging disables the kernel-mode cache, so it may affect server performance. Only supported at site level.

Source: [http://www.surfray.com/blog/2009/08/11/iis-log-file-formats-overview/](http://www.surfray.com/blog/2009/08/11/iis-log-file-formats-overview/)

Header

The header of the W3C log format starts with the "#Fields: " pattern. Let's use that to find our properties.

IIS Log - W3C Format

Using Select-String, I find the line with the pattern '#Fields: '. Then I use the Substring() method to get the rest of the line. Finally I use -split to get all the different property names.
```
((Select-String -Path .\W3SVC1\u_ex120420.log -Pattern "#fields: " | 
 Select-Object -First 1).Line.Substring("#Fields: ".Length) -split ' ')
```
            
            
            
Working with the Data

After playing a bit with different cmdlets to read the logs, I finally decided to use Import-Csv.
That's super lazy, since this cmdlet already has a "Delimiter" parameter :-)! Exactly what I need!

In the following code, I use Import-Csv with the header previously determined and delimit the data on white space ' '. I also ignore any lines starting with "#".
            
            
```
$header = ((Select-String -Path .\W3SVC1\u_ex120420.log -Pattern "#fields: " | 
 Select-Object -First 1).Line.Substring("#Fields: ".Length) -split ' ')

Import-Csv '.\W3SVC1\u_ex120420.log' -Header $header -Delimiter ' ' | 
 Where-Object { -not($_.Date.StartsWith('#'))}
```
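
To try the two steps above without a real IIS log at hand, here is a minimal sketch that writes a tiny W3C-style log to a temporary file first. The file name, the field list, and the log entries are made up for the illustration; a real log has many more fields.

```powershell
# Hypothetical sample data: a tiny W3C-style log written to a temporary file.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'u_ex_sample.log'
@(
    '#Software: Microsoft Internet Information Services 7.5'
    '#Version: 1.0'
    '#Date: 2012-04-20 00:00:00'
    '#Fields: date time cs-method cs-uri-stem c-ip sc-status'
    '2012-04-20 00:00:01 GET /default.htm 10.211.55.25 200'
    '2012-04-20 00:00:02 GET /default.htm 10.211.55.26 200'
    '2012-04-20 00:00:03 GET /logo.png 10.211.55.25 304'
) | Set-Content -Path $sample

# Property names come from the first '#Fields: ' directive in the file.
$header = (Select-String -Path $sample -Pattern '#Fields: ' |
    Select-Object -First 1).Line.Substring('#Fields: '.Length) -split ' '

# Import the file as space-delimited CSV and drop the '#' directive lines.
$rows = Import-Csv -Path $sample -Header $header -Delimiter ' ' |
    Where-Object { -not $_.date.StartsWith('#') }

# Show a few columns; 'c-ip' needs quotes because of the hyphen.
$rows | Select-Object -Property date, time, 'c-ip'
```

Each remaining row is an object whose property names are the field identifiers from the header, so the client address is available as `$_.'c-ip'`.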
            
            
I repeat the same process for each file (using foreach) and store the information in an array.
This technique has its limits: every row of every file is held in memory, so it can't handle large files.
Check out the references below; Emin Atac did awesome work on this event, especially on this part.
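
For comparison, here is a rough sketch of a streaming alternative that never keeps whole files in memory. It is not the script I submitted and not taken from any of the referenced entries; the function name `Get-UniqueClientIP` is hypothetical, and it assumes PowerShell 3.0 or later (for `-File`, `[pscustomobject]` and `[System.IO.File]::ReadLines`) and that every file contains a `#Fields: ` line with a `c-ip` field.

```powershell
function Get-UniqueClientIP {
    # Hypothetical helper for illustration only; not part of the submitted script.
    param(
        [Parameter(Mandatory)][string]$Path,
        [string]$Pattern = '*'
    )

    # The set remembers every address already emitted, across all files.
    $seen = New-Object 'System.Collections.Generic.HashSet[string]'

    foreach ($file in Get-ChildItem -Path $Path -Filter *.log -File) {
        $index = -1   # position of c-ip, unknown until a #Fields line is read

        # ReadLines streams the file one line at a time.
        foreach ($line in [System.IO.File]::ReadLines($file.FullName)) {
            if ($line.StartsWith('#Fields: ')) {
                $fields = $line.Substring('#Fields: '.Length) -split ' '
                $index  = [array]::IndexOf($fields, 'c-ip')
                continue
            }
            # Skip other directives, and data seen before any #Fields line.
            if ($index -lt 0 -or $line.StartsWith('#')) { continue }

            $ip = ($line -split ' ')[$index]

            # HashSet.Add returns $false when the value is already present.
            if ($ip -and $ip -like $Pattern -and $seen.Add($ip)) {
                [pscustomobject]@{ IPAddress = $ip }
            }
        }
    }
}
```

Only the set of distinct addresses grows with the input, so memory use depends on the number of unique clients rather than on the size of the logs.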
            
            
            
            Processing the final Data
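
This step boils down to renaming the `c-ip` column, keeping each address once, and optionally applying the wildcard pattern. Here is a minimal sketch, assuming the imported rows are in `$info`; the sample rows and the pattern are made up for the illustration.

```powershell
# Hypothetical stand-in for the rows collected from the log files.
$info = @(
    [pscustomobject]@{ 'c-ip' = '10.211.55.25' }
    [pscustomobject]@{ 'c-ip' = '10.211.55.26' }
    [pscustomobject]@{ 'c-ip' = '10.211.55.25' }
    [pscustomobject]@{ 'c-ip' = '172.20.96.9' }
)
$Pattern = '10.211.55.*'

# Rename c-ip to IPAddress, keep each address once, then apply the mask.
$unique = $info |
    Select-Object -Property @{ Label = 'IPAddress'; Expression = { $_.'c-ip' } } -Unique |
    Where-Object { $_.IPAddress -like $Pattern }

$unique
```

The `-like` operator does wildcard matching, which is what a mask such as "192.0.1.*" calls for; no regular expression is needed.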
            
Outputting

```
Get-IISLogClientIPAddress -Path .\ -Pattern *55.2*
Get-IISLogClientIPAddress -Path .\
```
            
            
            
What I Missed

From the other entries I saw and the comments I got, my script misses the following points:

* Handling massive log files

* Test-Path with -PathType Container on the Path parameter (in the PARAM() block)

* Try/Catch on the Out-File (in the Process{Catch{}} part)
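
The last two points could look roughly like the sketch below. It is a stripped-down illustration, not the submitted script: the function name `Get-LogFile` is hypothetical and the body only lists the log files.

```powershell
function Get-LogFile {
    # Hypothetical function for illustration only.
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        # -PathType Container only accepts an existing folder, not a file.
        [ValidateScript({ Test-Path -Path $_ -PathType Container })]
        [string]$Path,

        [string]$ErrorLog
    )

    try {
        Get-ChildItem -Path $Path -Filter *.log -ErrorAction Stop
    }
    catch {
        Write-Warning -Message $_.Exception.Message

        if ($ErrorLog) {
            # Writing the error log can fail too (bad path, no permission),
            # so it gets its own try/catch.
            try {
                $_ | Out-File -FilePath $ErrorLog -Append -ErrorAction Stop
            }
            catch {
                Write-Warning -Message "Could not write to '$ErrorLog': $($_.Exception.Message)"
            }
        }
    }
}
```

With this validation, passing a file instead of a folder fails at parameter binding, before any log is opened.
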
References

Small list of the interesting blog articles/documentation I found.

* Emin Atac, who did an awesome job on this event

* Use PowerShell to Collect, Store, and Parse IIS Log Data using SQL

* W3C Logging - W3.org

* W3C Logging - MSDN

* IIS Log format overview

* Import-IIS-Log by Mark Shevchenko

* PowerShell IIS Log Analysis using Import-CSV

* Windows PowerShell: Extracting Strings Using Regular Expressions

* Don Jones - Select-String

* Parse IIS log files with PowerShell using System.IO.File

* Parsing IIS logs with PowerShell using System.Data.DataTable

Script
            
            
            ```
            function Get-IISLogClientIPAddress {
            <#
            .SYNOPSIS
               The function Get-IISLogClientIPAddress generates a list of each unique Client IP address that have been used to access the Website.
               The function is checking the IIS logs from the Path parameter specified by the user.
            
            .DESCRIPTION
               The function Get-IISLogClientIPAddress generates a list of each unique Client IP address that have been used to access the Website.
               The function is checking the IIS logs from the Path parameter specified by the user.
            
            .PARAMETER Path
               Specifies the Path to the files to be searched. Wildcards are permitted.
            
            .PARAMETER Pattern
               Specifies the text to find.
            
            .PARAMETER Delimiter
               Specifies the delimiter that separates the property values. Default value is ' '
            
            .PARAMETER ErrorLog
               Specifies the full path of the Error log file.
            
            .EXAMPLE
               Get-IISLogClientIPAddress -Path .\ -Pattern '10.211.55.*5'
            
                IPAddress
                ----
                10.211.55.25
            
               This example generates a list of IP addresses from the logs located in the current directory ".\" 
               with a pattern '10.211.55.*5'.
            
            .EXAMPLE
               Get-IISLogClientIPAddress -Path .\ -Delimiter ' ' -Pattern '*.55.*'
            
                IPAddress
                ----
                10.211.55.25
                10.211.55.29 
                10.211.55.31 
                10.211.55.28 
                10.211.55.27 
                10.211.55.26 
                10.211.55.30  
            
               This example generates a list of IP addresses from the logs located in the current directory ".\"  
               with a pattern '*.55.*'
            
            .EXAMPLE
                Get-IISLogClientIPAddress -Path c:\sysadmin\IISLog\W3SVC8 -Pattern "172.20.96.*" -ErrorLog Errors.log
            
                IPAddress
                ----
                172.20.96.9
                172.20.96.10
                172.20.96.18
            
               This example generates a list of IP addresses from the logs located in the directory c:\sysadmin\IISLog\W3SVC8
               with a pattern '172.20.96.*'. Errors will be logged in Errors.log.
              
            .INPUTS
               String
            
            .OUTPUTS
               Selected.System.Management.Automation.PSCustomObject
            
            .NOTES
               Scripting Games 2013 - Advanced Event #5
            #>
                
            [CmdletBinding()]
                PARAM(
                    [Parameter(Mandatory,HelpMessage = "FullPath to IIS Log files",Position=0,ValueFromPipeline)]
                    [PSDefaultValue(Help='Specifies the Path to the IIS Log files')]
                    [Alias("Directory")]        
                    [ValidateScript({Test-Path -path $_})]
                    [String]$Path="",
                    
                    [PSDefaultValue(Help='Specifies the IPAddress Pattern to search')]
                    [String]$Pattern,
                    
                    [PSDefaultValue(Help='Specifies the Delimiter. Default is " "')]
                    [String]$Delimiter=" ",
                    
                    [PSDefaultValue(Help='Specifies the FullPath to Error log file')]
                    [ValidateScript({Test-path -Path $_ -IsValid})]
                    [String]$ErrorLog
                )
                BEGIN{
                    $info = @()
                }#BEGIN BLOCK
            
                PROCESS{
            
                    TRY{
                        $Everything_is_OK = $true
            
                        Write-Verbose -Message "Listing and Searching in all *.LOG files in $Path"
                        FOREACH ($LogFile in (Get-ChildItem -Path $Path -include *.log -ErrorAction Stop -Recurse -ErrorVariable GCIErrors)) {
                            Write-Verbose -Message "$($Logfile.Name)"
            
                            # HEADER: the #Fields directive lists a sequence of field identifiers specifying the information recorded in each entry
                            Write-Verbose -Message "$($Logfile.Name) - Identifying Header"
                            # FullName avoids relying on how the FileInfo object converts to a string
                            $header = ((Select-String -Path $LogFile.FullName -Pattern "#fields: " | 
                                Select-Object -First 1).Line.Substring("#Fields: ".Length) -split ' ')
            
                            # PARSING/IMPORTING as a CSV format.
                            Write-Verbose -Message "$($Logfile.Name) - Importing Data as a CSV format"
                            $csv = Import-Csv -Path $LogFile.FullName -Header $Header -Delimiter $Delimiter -ErrorAction Stop -ErrorVariable ImportCSVErrors | 
                                Where-Object { -not($_.Date.StartsWith('#'))}
                            
                            # Outputting information to $info variable
                            Write-Verbose -Message "$($Logfile.Name) - Sending data to final variable"
                            $info += $csv
            
                        }#FOREACH
            
                    }#TRY
            
                    CATCH{
                        
                        # ERROR HANDLING
                        $Everything_is_OK = $false
                        Write-Warning -Message "Wow! Something Went wrong !"
                        Write-Warning -Message "$($_.Exception.Message)"
            
                        IF ($PSBoundParameters['ErrorLog']) {
                            $GCIErrors | Out-file -FilePath $ErrorLog -Append -ErrorAction Continue
                            $ImportCSVErrors | Out-file -FilePath $ErrorLog -Append -ErrorAction Continue
                            Write-Warning -Message "Logged in $ErrorLog"}
            
                    }#CATCH
            
                    IF ($Everything_is_OK){
                        
                        IF ($PSBoundParameters['Pattern']) {
                            Write-Verbose -Message "Applying Pattern: $pattern and Outputting Final Result"
                            $info | Select-Object -Property @{Label="IPAddress";Expression={$_.'c-ip'}} -Unique | 
                                Where-Object {$_.IPAddress -like $Pattern}
                        }ELSE {
                            Write-Verbose -Message "Outputting Final Result (No Pattern Specified)"
                            $info | 
                                Select-Object -Property @{Label="IPAddress";Expression={$_.'c-ip'}} -Unique
                        }
                    }
            
                }#PROCESSBLOCK
            
                END{Write-Verbose -Message "Script completed"}#END BLOCK
            }#function
            ```