0

I have two csv files, my goal is to generate a new csv file: for that, I must first

  1. check in the file relation.csv if the values of the column "ID" (file joined.csv) are not present.
  2. check if the "Type" column is "TechnologyInteraction" or "TechnologyEvent" in the joined.csv file.
  3. finally, if ID is not present and the type column is "TechnologyInteraction" or "TechnologyEvent", I change the values of the source and target columns.

My code runs but it takes too much time to generate the expected file. It turns out that the size of the joined.csv (247 kB) and relation.csv (456 kB) files is not as large. How to make the processing fast?

      function updateArchiWithImportData{

    process{
    #Deletion of lines not starting with PAXC in the Extract_AGRe_TWS_ALL_20210607.csv file
    $ElementCsv=Import-csv  $env:USERPROFILE\Desktop\Archi\Extract_AGRe_TWS_ALL_20210607.csv -Delimiter ';' | Where-Object{ $_.Jobstream -match '^PAXC' }

    #Loop through all the record in the CSV file  and Extracting jobsets
    $NewExtract_AGRe_TWS_ALL_20200925= ForEach($Entry in $ElementCsv){

    if ($Entry."Script or expected file(s)" -ilike 'technical') {
    $Entry."Jobstream"=$Entry."Jobstream" -replace '^(?:PAXC)?(.+?)','$1'
    }else {
    $Entry."Jobstream" = $Entry.'Script or expected file(s)' -replace '^(?:P|A|X|C)?(.+?)(\.(?:BAT|KSH)|$)','$1' -replace '^(WEBX|DWHS|COGN|CLOT|CLAI|BTRE|BISI|BDDO|ARXL|AGSO|AGPC|ACTO|FNET|ARX|AGS|INF|CLA|MEM|SWA|REX)-','$1'
    }

    $Entry
    }
    # Export Extract_AGRe_TWS_ALL_20210607.csv  in new Desktop\Archi\NewExtract_AGRe_TWS_ALL_20210607.csv file 
    $NewExtract_AGRe_TWS_ALL_20200925  | Export-Csv $env:USERPROFILE\Desktop\Archi\NewExtract_AGRe_TWS_ALL_20210607.csv -NoTypeInformation -Encoding UTF8
    #End of the generation of the csv $env:USERPROFILE\Desktop\Archi\NewExtract_AGRe_TWS_ALL_20210607.csv file, start of the jobstream grep search in the $env:USERPROFILE\Desktop\Archi\elements.csv  file and the csv X file.
    Write-Host "End of the generation of the csv $env:USERPROFILE\Desktop\Archi\NewExtract_AGRe_TWS_ALL_20210607.csv file" 
    Write-Host "start grep  'jobstream'  in the elements.csv file and the csv $env:USERPROFILE\Desktop\Archi\NewExtract_AGRe_TWS_ALL_20210607.csv file." 

    #Now compare two files  and update  the elements.csv file
    #get Name in old elements.csv
    $oldElementsCsv=Import-Csv $env:USERPROFILE\Desktop\Archi\elements.csv -Delimiter ','
    #get Jobstream in NewExtract_AGRe_TWS_ALL_20210607.csv
    $NewExtract_AGRe_TWS_ALL_20200925Jobstream= Import-Csv $env:USERPROFILE\Desktop\Archi\NewExtract_AGRe_TWS_ALL_20210607.csv -Delimiter ','
    #Output progress:
    $output= @()
    ForEach ( $row in $NewExtract_AGRe_TWS_ALL_20200925Jobstream){

    $objMatch = $oldElementsCsv | Where-Object { $_.Name -ne $row."Jobstream"} 
    $pattern='^(.*)-'
    $pattern2='^(.*)_'
    if(($row."Jobstream"  -notmatch $pattern) ){

    $output += New-Object PsObject -Property @{ID=[guid]::NewGuid().ToString(); Type="TechnologyInteraction"; Name= $row."Jobstream"; Documentation= $row."Job Description"}   
    Write-Warning "Found new Jobstream : $($row."Jobstream")"
    Write-Warning "Found new Jobstream Description : $($row."Job Description")"
    }  elseif($row."Jobstream"  -match $pattern2) {

    $output += New-Object PsObject -Property @{ID=[guid]::NewGuid().ToString(); Type="TechnologyInteraction"; Name= $row.Jobstream; Documentation= $row."Job Description"} 
    Write-Warning "New Jobstream : $($row."Jobstream")"
    Write-Warning "New Jobstream Description : $($row."Job Description")"


    } else{

    $output += New-Object PsObject -Property @{ID=[guid]::NewGuid().ToString(); Type="TechnologyService"; Name= $row.Jobstream; Documentation= $row."Job Description"} 

    Write-Warning "New Jobstream : $($row."Jobstream")"
    Write-Warning "New Jobstream Description : $($row."Job Description")"

    }
    }
    $output |Select-Object -Property ID,"Type","Name","Documentation"| Export-Csv $env:USERPROFILE\Desktop\Archi\newElements.csv -NoTypeInformation -Encoding UTF8 
    #newElements.csv a is new elements.csv generated from NewExtract_AGRe_TWS_ALL_20210607.csv
    #open new File:

    notepad $env:USERPROFILE\Desktop\Archi\newElements.csv

    }


    }
    updateArchiWithImportData  # Call the function

    function MergeTwoCsv {
    $oldElementsCsv=Import-Csv $env:USERPROFILE\Desktop\Archi\elements.csv 
    $newElements= Import-Csv $env:USERPROFILE\Desktop\Archi\newElements.csv 
    $merged= $oldElementsCsv + $newElements
    $merged |Select-Object  -Property * -Unique | Export-Csv $env:USERPROFILE\Desktop\Archi\joined.csv  -NoTypeInformation -Encoding UTF8 
    #Joined.csv is a combination of elements.csv and newElements.csv
    notepad $env:USERPROFILE\Desktop\Archi\joined.csv 
    }
    MergeTwoCsv  # Call MergeTwoCsv the function

    $result= @()
    function linkedRelationCsvToElementsCsv{
        #relation.csv headers are ID,"Type","Name","Documentation","Source","Target"
        #$oldElementsCsv headers are ID,"Type","Name","Documentation"
        $relationCsv=Import-CSV $env:USERPROFILE\Desktop\Archi\relations.csv -Delimiter ','
        $joinedCsv=Import-CSV $env:USERPROFILE\Desktop\Archi\joined.csv -Delimiter ','
        # Check if ID in $oldElementsCsv is presente in $relationCsv
        ForEach ($line in  $joinedCsv){


        $checkID= $relationCsv | Where-Object {
            
            ($_.Source -ne $line.ID) -or  ($_.Target -ne $line.ID)
        }
    
        if(($line.Type -eq 'TechnologyInteraction') -or ($line.Type -eq 'TechnologyEvent') ){

            Write-Warning " the source and target columns in relation file not contains the Id of joined file, case type TechnologyInteraction founded "
        # if the job is of type "TechnologyInteraction" or "TechnologyEvent", we link it to the TWS id's(TechnologyCollaboration,ef2f510b-924b-439d-8720-0183c7294eb3) in archi.
         $newArrayResult= New-Object PsObject -Property @{ID=[guid]::NewGuid().ToString(); Type=$line.Type; Name=$line.Name; Documentation=$line.Documentation; Source="ef2f510b-924b-439d-8720-0183c7294eb3"; Target=$line.ID}
             $result = $result + $newArrayResult
        }else{
            Write-Warning " the source and target columns in relation file contains the Id of joined file. "
         $newArrayResult2 = New-Object PsObject -Property @{ID=$relationCsv.ID; Type=$relationCsv.Type;Name=$relationCsv.Name;Documentation=$relationCsv.Documentation; Source=$relationCsv.Source; Target=$relationCsv.Target}
            $result = $result + $newArrayResult2
        }
       
        }
         $result |Select-Object -Property ID,"Type","Name","Documentation","Source","Target"| Export-Csv $env:USERPROFILE\Desktop\Archi\newRelation.csv -NoTypeInformation -Encoding UTF8 
        notepad $env:USERPROFILE\Desktop\Archi\newRelation.csv 
    }linkedRelationCsvToElementsCsv # Call the function

I show you some of the contents of the two files. joined.csv:

ID,"Type","Name","Documentation"
79570056-ab4f-6969-8c58-a5bd5847a823,"TechnologyEvent","CLOTTRGAAJ","WAIT TRGFIC-CLOT190"
79570056-ab4f-6969-8c58-a5bd5847a824,"TechnologyEvent","CLOTTRGAAK","WAIT TRGFIC-CLOT191"
79570056-ab4f-6969-8c58-a5bd5847a831,"TechnologyEvent","COGNTRGAAA","WAIT TRGFIC-COGN130"
79570056-ab4f-6969-8c58-a5bd5847a842,"TechnologyEvent","FNETTRGAAC","trigger PFNET001 PRD"
79570056-ab4f-6969-8c58-a5bd5847a852,"TechnologyEvent","INFOTRGAAA","Flag envoi flux ACORD"
79570056-ab4f-6969-8c58-a5bd5847a730,"TechnologyInteraction","ACTO650Q","Interface 4.1 Fac (V2)"
79570056-ab4f-6969-8c58-a5bd5847a731,"TechnologyInteraction","ACTO666D","Transfert DUMP ACTO"
79570056-ab4f-6969-8c58-a5bd5847a732,"TechnologyInteraction","ACTO700M","Export FAC pour ACSA"
79570056-ab4f-6969-8c58-a5bd5847a733,"TechnologyInteraction","ACTO705Q","Génération Fichiers IRIS"
79570056-ab4f-6969-8c58-a5bd5847a734,"TechnologyInteraction","ACTO800Q","Logs APP+IIS pour AGPC"

And relation.csv is:

ID,"Type","Name","Documentation","Source","Target"
id-babd7a16753d45c7ab7c9d692149b7ad,"RealizationRelationship","","","ef2f510b-924b-439d-8720-0183c7294eb3","79570056-ab4f-6969-8c58-a5bd5847a823"
id-bdc22e5077524384a7fe6f24e39117a9,"RealizationRelationship","","","79570056-ab4f-6969-8c58-a5bd5847a875","b57a5fff-9dd2-4cm1-9141-04c042f49431"
id-bdca26a3328b401091b671daba6c2692,"RealizationRelationship","","","ef2f510b-924b-439d-8720-0183c7294eb3","id-e78ac811853048f094682a6fd607d7de"
id-be0fbcbe3f2043eca42ed19ac8728b72,"RealizationRelationship","","","79570056-ab4f-6969-8c58-a5bd5847a874","b57a5fff-9dd2-4cm1-9141-04c042f49418"
id-be853ce889ff474cbabd07e481502388,"RealizationRelationship","","","79570056-ab4f-6969-8c58-a5bd5847a874","b57a5fff-9dd2-4cm1-9141-04c042f49411"
id-bef3d4040cc048279871935178349013,"RealizationRelationship","","","ef2f510b-924b-439d-8720-0183c7294eb3","79570056-ab4f-6969-8c58-a5bd5847a807"
id-bf3c421cce83454589c94dab4b7f0a28,"RealizationRelationship","","","79570056-ab4f-6969-8c58-a5bd5847a839","b57a5fff-9dd2-4cm1-9141-04c042f49326"
id-bf5513d5eb6540f0899d22d5d62ae5cb,"RealizationRelationship","","","79570056-ab4f-6969-8c58-a5bd5847a840","b57a5fff-9dd2-4cm1-9141-04c042f49334"
  • 1
    Does your current script _actually_ work as intended? It appears to output 1 duplicate per row in the `joined` csv file (which explains why it would become very slow very fast) – Mathias R. Jessen Sep 09 '21 at 16:27
  • 4
    Do **NOT** use the increase assignment operator (`+=`) to create a collection as it might become very expensive, see: [Why should I avoid using the increase assignment operator (`+=`) to create a collection](https://stackoverflow.com/a/60708579/1701026) – iRon Sep 09 '21 at 16:43
  • I just put the integral script, which allows to make treatment to generate the joined.csv file – Amakouladji Sep 09 '21 at 17:43
  • Parts of this script don't make sense. You've got `$checkIDAndSource= $relationCsv | Where-Object {` on about the sixth line of the `linkedRelationCsvToElementsCsv` function, but the script block for the `Where-Object` doesn't contain comparison or logical operators, it contains assignment and more script. You're also using the new-to-Powershell-v7 pipeline chain operator, `&&`, as though it were the `-and` operator later in the same function in the `elseif`. – Bacon Bits Sep 09 '21 at 19:19
  • I have modified my script, it takes less time to do the processing, however I want to check if the value of the column "ID" of the file joined.csv is present in the file relation.csv, if it is not present I want to make assignments based on the value of the column "Type" of the file joined.csv, so a kind of insertion of new in the file relation.csv. Please, do you have an idea? – Amakouladji Sep 10 '21 at 13:12
  • here is the newRelation.csv file that was generated:"ID","Type","Name","Documentation","Source","Target" "System.Object[]","System.Object[]","System.Object[]","System.Object[]","System.Object[]","System.Object[]" – Amakouladji Sep 10 '21 at 15:22
  • I have partially solved my problem, but in the file newRelation.csv I have lines that appear like this:"System.Object[]","System.Object[]","System.Object[]","System.Object[]","System.Object[]","System.Object[]" "9d19a863-1f0c-4511-965c-18de98d2b301","TechnologyInteraction","WRPT100Q","End Of JobStream","ef2f510b-924b-439d-8720-0183c7294eb3","d1922426-786e-4c69-b4a5-05cf185c292b" – Amakouladji Sep 10 '21 at 16:05
  • Thank you everyone, if i use $elements2Csv=Import-Csv $env:USERPROFILE\Desktop\Archi\elements2.csv |Sort-Object -Stable -Property Name -Unique , the output is correct for me, and the script work fine. – Amakouladji Sep 28 '21 at 07:52

0 Answers0