First time asker here. Please be kind :)

I'm attempting to recursively get all directories in parallel, in the hope of decreasing the time it takes to traverse a drive. Below is the code I've tried. Essentially, I want to take a folder as input and process its subfolders, their subfolders, and so on in parallel, but the function is not recognized inside the parallel block.

function New-RecursiveDirectoryList {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(Mandatory = $true,
            Position = 0,
            ValueFromPipeline = $true,
            ValueFromPipelineByPropertyName = $true,
            HelpMessage = 'Path to one or more locations.')]
        [Alias('PSPath')]
        [ValidateNotNullOrEmpty()]
        [string[]]
        $Path
    )
    process {
        foreach ($aPath in $Path) {
            Get-Item $aPath

            Get-ChildItem -Path $aPath -Directory |
                # Recursively call itself in Parallel block not working
                # Getting error "The term 'New-RecursiveDirectoryList' is not recognized as a name of a cmdlet"
                # Without -Parallel switch this works as expected
                ForEach-Object -Parallel {
                    $_ | New-RecursiveDirectoryList
                }
        }
    }
}

Error:

New-RecursiveDirectoryList: 
Line |
   2 |                      $_ | New-RecursiveDirectoryList
     |                           ~~~~~~~~~~~~~~~~~~~~~~~~~~
     | The term 'New-RecursiveDirectoryList' is not recognized as a name of a cmdlet, function, script file, or executable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

I've also attempted to use the solution provided by mklement0 here but no luck. Below is my attempt at this:

function CustomFunction {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(Mandatory = $true,
            Position = 0,
            ValueFromPipeline = $true,
            ValueFromPipelineByPropertyName = $true,
            HelpMessage = 'Path to one or more locations.')]
        [Alias('PSPath')]
        [ValidateNotNullOrEmpty()]
        [string[]]
        $Path
    )

    begin {
        # Get the function's definition *as a string*
        $funcDef = $function:CustomFunction.ToString()
    }

    process {
        foreach ($aPath in $Path) {
            Get-Item $aPath

            Get-ChildItem -Path $aPath -Directory |
                # Recursively call itself in Parallel block not working
                # Getting error "The term 'New-RecursiveDirectoryList' is not recognized as a name of a cmdlet"
                # Without -Parallel switch this works as expected
                ForEach-Object -Parallel {
                    $function:CustomFunction = $using:funcDef
                    $_ | CustomFuction
                }
        }
    }
}

Error:

CustomFuction: 
Line |
   3 |                      $_ | CustomFuction
     |                           ~~~~~~~~~~~~~
     | The term 'CustomFuction' is not recognized as a name of a cmdlet, function, script file, or executable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

Does anybody know how this can be accomplished, or a different way of doing it?

Daniel
  • Hey Daniel, hope all is well. I still remember this question and had planned to develop something similar but without multi-threading and classic recursion. In case you're interested, this is a [tree-like cmdlet for PowerShell](https://github.com/santysq/PSTree). – Santiago Squarzon Dec 23 '21 at 23:53

3 Answers

So, this worked for me, though it obviously doesn't look pretty. One thing to note: the foreach ($aPath in $Path) {...} loop in your script is unnecessary. The process {...} block already runs once per pipeline input, and Get-ChildItem -Path accepts an array when several paths are passed as a single argument.

Code:

function Test {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(Mandatory)]
        [Alias('PSPath')]
        [string[]]$Path
    )

    begin {
        $scriptblock = $MyInvocation.MyCommand.ScriptBlock.ToString()
    }

    process {
        # Get-Item $Path <= This will slow down the script
        Get-ChildItem -Path $Path -Directory | ForEach-Object -Parallel {
            $_ # You can do this instead
            $i = $using:scriptblock
            $thisIsNotPretty = [scriptblock]::Create($i)
            & $thisIsNotPretty -Path $_
        }
    }
}
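
For comparison, here is a minimal sketch of the same idea that recreates the function by name inside each runspace instead of invoking an anonymous scriptblock. It assumes PowerShell 7 or later (where `ForEach-Object -Parallel` is available). The function name `Get-DirectoryTree` and the throttle value of 4 are hypothetical, chosen only for illustration.

```powershell
function Get-DirectoryTree {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [string[]]$Path
    )

    # Parallel runspaces do not see functions defined by the caller,
    # so capture this function's body as a string that can be passed along.
    $funcDef = $MyInvocation.MyCommand.ScriptBlock.ToString()

    Get-ChildItem -Path $Path -Directory | ForEach-Object -Parallel {
        # Recreate the function under the same name in this runspace.
        # The braces are required because the name contains a hyphen.
        ${function:Get-DirectoryTree} = $using:funcDef

        $_                                    # emit the directory itself
        Get-DirectoryTree -Path $_.FullName   # then recurse into it
    } -ThrottleLimit 4
}
```

It can be called as `Get-DirectoryTree -Path /home/user/Documents`. As far as I know, each nested `ForEach-Object -Parallel` call applies its own throttle limit, so the total number of runspaces can still grow with the depth of the tree.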

You can also do something like this, which is a pretty cool way to display the folder hierarchy (I use something very similar in a function I made for AD Groups):

function Test {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(Mandatory)]
        [Alias('PSPath')]
        [string[]]$Path,
        [int]$Nesting = 0
    )

    begin {
        $scriptblock = $MyInvocation.MyCommand.ScriptBlock.ToString()
    }

    process {

        Get-ChildItem -Path $Path -Directory | ForEach-Object -Parallel {

            function Indent{
                param(
                    [String]$String,
                    [Int]$Indent
                )
                
                $x='_';$y='|';$z='    '
                
                switch($Indent)
                {
                    {$_ -eq 0}{return $String}
                    {$_ -gt 0}{return "$($z*$_)$y$x $string"}    
                }
            }

            $z = $using:Nesting

            [pscustomobject]@{
                Nesting = $z
                Path = Indent -String $_.FullName -Indent $z
            }

            $z++
            $i = $using:scriptblock
            $thisIsNotPretty = [scriptblock]::Create($i)
            & $thisIsNotPretty -Path $_ -Nesting $z
        }
    }
}

[System.Collections.ArrayList]$result = Test -Path /home/user/Documents

function Draw-Hierarchy{
    param(
        [System.Collections.ArrayList]$Array
    )
    
    $Array.Reverse()
    
    for($i=0;$i -lt $Array.Count;$i++){
    
        if(
            $Array[$i+1] -and 
            $Array[$i].Path.IndexOf('|_') -lt $Array[$i+1].Path.IndexOf('|_')
        ){
            $z=$i+1
            $ind=$Array[$i].Path.IndexOf('|_')
            while($Array[$z].Path[$ind] -ne '|'){
                $string=($Array[$z].Path).ToCharArray()
                $string[$ind]='|'
                $string=$string -join ''
                $Array[$z].Path=$string
                $z++
                if($Array[$z].Path[$ind] -eq '|'){break}
            }
        }
    }
    
    $Array.Reverse()
    return $Array
    
}

Draw-Hierarchy $result

Result looks something like this:

Nesting Path
------- ----
      0 /home/user/Documents/...
      1     |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      2     |   |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      2     |   |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      3     |       |_ /home/user/Documents/...
      1     |_ /home/user/Documents/...
      3             |_ /home/user/Documents/...
      3             |_ /home/user/Documents/...
      4                 |_ /home/user/Documents/...
      5                     |_ /home/user/Documents/...
Santiago Squarzon
  • Ah, this works perfectly! ..and also terribly :) To no fault of yours though. Spawning off all these processes for every folder just isn't very efficient I guess. I do really like the 2nd version with the level of nesting. Very nice touch. I was also playing with scriptblock myself but couldn't figure it out. `$scriptblock = $MyInvocation.MyCommand.ScriptBlock.ToString()` was the missing piece for me. Thank you for sharing this and solving my problem. It's good to know how to bring the calling function into a parallel block for future reference. – Daniel May 05 '21 at 05:07
  • @Daniel Check out my last edit; your question was an awesome exercise, so thank you :) By the way, if you're going to run this recursively against a very big share or drive, you will probably want to add a `-ThrottleLimit` to your `ForEach-Object` call. – Santiago Squarzon May 05 '21 at 05:11
  • @Daniel The hierarchy doesn't look very good on Linux though, but if you want to see how the real function for AD Groups works, check out my [GitHub](https://github.com/santysq/Get-Hierarchy) – Santiago Squarzon May 05 '21 at 05:14

From what I can see, you declared a mandatory parameter, which means you need to supply it when you call your function. For example, in your case, you can load the code into memory manually: open a PowerShell session and copy/paste the code you posted here. Once the code is loaded, you can call the function:

CustomFunction -Path "TheTargetPathYouWant"
raDiaSmO
  • Thanks for the answer raDiaSm0 although I don't think that's the problem. The mandatory path argument is being passed in from the pipeline. The problem is that the function itself cannot be called from within the parallel scriptblock as displayed in the error. I'm wondering if there is a way to pass the function definition into the parallel block so that it can be used. – Daniel May 05 '21 at 02:41

I did something similar a while ago, using a non-recursive function with runspaces from .NET. For this you will need to install the PoshRSJob module and create a list of the subfolders to extract in dir.txt. Then run this:

Install-Module PoshRsJob -Scope CurrentUser
function ParallelDir {
    param (
        $Folders,
        $Throttle = 8
    )
    $batch = 'ParallelDir'
    $jobs = Get-RSJob -Batch $batch
    if ($jobs | Where-Object State -eq 'Running') {
        Write-Warning ("Some jobs are still running. Stop them before running this job.
        > Stop-RSJob -Batch $batch")
        return
    }

    $Folders | Start-RSJob -Throttle $Throttle -Batch $batch -ScriptBlock {
        Param ($fullname)
        $name = Split-Path -Path $fullname -Leaf
        Get-ChildItem $fullname -Recurse | Select-Object * | Export-Clixml ('c:\temp\{0}.xml' -f $name)
    } | Wait-RSJob -ShowProgress | Receive-RSJob

    if (!(Get-RSJob -Batch $batch | Where-Object {$_.HasErrors -and $_.Completed})) {
        Remove-RSJob -Batch $batch
    } else {
        Write-Warning ("The process has finished with errors. You can check:
        > Get-RSJob -Batch $batch
        To consolidate the results from each run:
        > Get-ChildItem 'c:\temp\*.xml' | Import-Clixml")
    }
}
$dir = Get-Content .\dir.txt
ParallelDir -Folders $dir
Get-ChildItem c:\temp\*.xml | Import-Clixml | Select-Object Name, Length
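
On PowerShell 7 or later, a roughly comparable non-recursive split can be sketched with the built-in `ForEach-Object -Parallel` instead of PoshRSJob: fan out once over the top-level folders and let each runspace do an ordinary serial `-Recurse`. The function name `Get-DirectoryParallel` and its default throttle of 8 are assumptions for illustration, not part of the code above.

```powershell
function Get-DirectoryParallel {
    param (
        [Parameter(Mandatory)]
        [string]$Path,
        [int]$Throttle = 8
    )

    # Enumerate only the first level serially; these folders are the units of parallel work.
    # @(...) guarantees an array, so an empty folder simply produces no work.
    $topLevel = @(Get-ChildItem -LiteralPath $Path -Directory)

    # Emit the top-level directories themselves.
    $topLevel

    # Each runspace walks one whole subtree with a plain serial recursion,
    # so no function definition has to be passed into the parallel block.
    $topLevel | ForEach-Object -ThrottleLimit $Throttle -Parallel {
        Get-ChildItem -LiteralPath $_.FullName -Directory -Recurse
    }
}
```

This avoids starting a runspace for every folder, but it only helps when the tree is spread reasonably evenly across the top-level folders; a single very large folder is still walked by one thread.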
PollusB