41

I am trying to remove all the lines from a text file that contain a partial string, using the PowerShell code below:

 Get-Content C:\new\temp_*.txt | Select-String -pattern "H|159" -notmatch | Out-File C:\new\newfile.txt

The actual string is H|159|28-05-2005|508|xxx. It repeats in the file multiple times, and I am trying to match only the first part, as specified above. Is that correct? Currently the output is empty.

Am I missing something?

Peter Mortensen
user3759904

6 Answers

42

If you want to write the result back to the same file, you can do it as follows:

Set-Content -Path "C:\temp\Newtext.txt" -Value (get-content -Path "c:\Temp\Newtext.txt" | Select-String -Pattern 'H\|159' -NotMatch)
Peter Mortensen
Samselvaprabu
29

Escape the | character using a backtick

get-content c:\new\temp_*.txt | select-string -pattern 'H`|159' -notmatch | Out-File c:\new\newfile.txt
Fourkeys
  • Warning - I used this to attempt to update a file in-place and the file was deleted. – alex Jun 14 '19 at 14:53
  • Long lines get sliced with `Out-File`, I resolved by using `Set-Content` instead, same syntax – Dariopnc Aug 20 '21 at 15:23
  • `Out-File` adds empty lines to the output (for non matching lines), but `Set-Content` doesn't, which I guess is indeed the desired behaviour. – Fuujuhi Mar 08 '22 at 12:31
4

The pipe character | has a special meaning in regular expressions. a|b means "match either a or b". If you want to match a literal | character, you need to escape it:

... | Select-String -Pattern 'H\|159' -NotMatch | ...
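
To make the difference concrete, here is a minimal sketch using three made-up sample lines (they are assumptions for illustration, not data from the question). It compares the unescaped pattern, the backslash-escaped pattern, and a single-quoted pattern containing a backtick, and then builds the escaped pattern with `[regex]::Escape`. `-match` is used instead of `Select-String` only to keep the example short; both treat the pattern as a case-insensitive regex by default.

```powershell
# Hypothetical sample lines, made up for illustration.
$lines = @(
    'H|159|28-05-2005|508|xxx'
    'H|160|28-05-2005|508|yyy'
    'D|200|28-05-2005|159|zzz'
)

# Unescaped: the regex means "H" OR "159", so every sample line matches
# and a -NotMatch filter would leave nothing behind.
$unescaped = @($lines | Where-Object { $_ -match 'H|159' })

# Backslash-escaped: only lines containing the literal text "H|159" match.
$escaped = @($lines | Where-Object { $_ -match 'H\|159' })

# Backtick inside single quotes: the backtick stays in the string and is a
# literal character to the regex engine, so this still means "H`" OR "159".
$backtick = @($lines | Where-Object { $_ -match 'H`|159' })

# [regex]::Escape produces the escaped pattern from a literal string.
$pattern = [regex]::Escape('H|159')
$kept = @($lines | Where-Object { $_ -notmatch $pattern })
```

With these sample lines, the backtick variant also matches the third line, because that line contains 159 elsewhere. It gives the right result only when 159 never appears outside the `H|159` prefix.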
Ansgar Wiechers
  • In PowerShell, the escape character is the backtick (`). See [About Escape Characters](http://technet.microsoft.com/en-us/library/hh847755.aspx). – orad Oct 21 '14 at 21:36
  • @orad I am aware of that. In regular expressions, however, the escape character is the backslash. Both work in this case. – Ansgar Wiechers Oct 22 '14 at 07:44
4

You don't need Select-String in this case. Just filter the lines out with Where-Object:

Get-Content C:\new\temp_*.txt |
    Where-Object { -not $_.Contains('H|159') } |
    Set-Content C:\new\newfile.txt

String.Contains does a plain string comparison instead of a regex match, so you don't need to escape the pipe character, and it's also faster.
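
One practical difference is worth keeping in mind when switching between the two approaches: `String.Contains` with a string argument is case-sensitive, while `-match` (like `Select-String`) ignores case by default. A small sketch, using a made-up lower-case line as the assumption:

```powershell
# Hypothetical line with a lower-case "h", made up for illustration.
$line = 'h|159|28-05-2005|508|xxx'

# Literal, case-sensitive substring test: no escaping needed for "|".
$containsResult = $line.Contains('H|159')

# Regex test, case-insensitive by default: "|" must be escaped.
$matchResult = $line -match 'H\|159'

# Case-sensitive regex test, closer to what Contains does.
$cmatchResult = $line -cmatch 'H\|159'
```

Here `$containsResult` and `$cmatchResult` are False and `$matchResult` is True, so the two filters can keep different lines if the file mixes upper and lower case.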

phuclv
  • I like this solution over Fourkeys' because (unless I'm an idiot) Select-String also adds file name and line number to the output, which isn't desired in my use case. – tolache Jul 05 '21 at 08:29
  • @tolache I don't see that behaviour with `Select-String` here (PS 5). – Fuujuhi Mar 08 '22 at 12:30
  • @Fuujuhi you don't get filenames and line numbers if you pass the input strings through a pipe like above, but normally `Select-String pattern file.txt` will output file name and line numbers by default as you can see from the [man page](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/select-string?view=powershell-5.1) – phuclv Mar 08 '22 at 13:12
3

Another option for writing to the same file, building on the existing answers: add brackets around the read so that it completes before the content is sent to the file.

(get-content c:\new\sameFile.txt | select-string -pattern 'H`|159' -notmatch) | Set-Content c:\new\sameFile.txt
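
The read-then-write pattern can be tried safely on a scratch file first. The sketch below is only an illustration: the temporary file and its three lines are assumptions, and `Where-Object` with an escaped regex stands in for `Select-String`.

```powershell
# Create a scratch file with made-up content (hypothetical data).
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value 'H|159|a', 'D|200|b', 'H|159|c'

# The parentheses make Get-Content read the whole file and release it
# before Set-Content opens the same path for writing.
(Get-Content -Path $path) |
    Where-Object { $_ -notmatch 'H\|159' } |
    Set-Content -Path $path

# Read the result back, then clean up the scratch file.
$result = @(Get-Content -Path $path)
Remove-Item -Path $path
```

After this runs, `$result` holds only the line that does not contain `H|159`.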
Robert Brooker
  • In my tests, the brackets do not change anything in the output produced, which makes sense. However using `Out-File` instead of `Set-Content` adds empty lines. – Fuujuhi Mar 08 '22 at 12:29
  • Thanks for the tip on `Out-File`, I have updated it to `Set-Content`. Without the brackets it would be writing to the file at the same time it is reading from it (in this one line example). The brackets force the read operation to complete before it starts writing to it. – Robert Brooker Mar 09 '22 at 09:15
  • Ok! This enforces sequential access, and emulates in-place editing of the file. Now it's clear, good tip! – Fuujuhi Mar 14 '22 at 21:37
1

This is probably a long way around a simple problem, but it does allow me to remove lines containing any of a number of matches. I did not have a single partial match that could be used, and I needed to do this on over 1000 files. This post helped me get to where I needed to be, thank you.

$ParentPath = "C:\temp\test"
$Files = Get-ChildItem -Path $ParentPath -Recurse -Include *.txt
$Match1 = "matchtext1"
$Match2 = "matchtext2"
$Match3 = "matchtext3"
$Match4 = "matchtext4"
$Match5 = "matchtext5"
$Match6 = "matchtext6"
$Match7 = "matchtext7"
$Match8 = "matchtext8"
$Match9 = "matchtext9"
$Match10 = "matchtext10"

foreach ($File in $Files) {
    $FullPath = $File.FullName
    $OldContent = Get-Content $FullPath
    $NewContent = $OldContent `
    | Where-Object {$_ -notmatch $Match1} `
    | Where-Object {$_ -notmatch $Match2} `
    | Where-Object {$_ -notmatch $Match3} `
    | Where-Object {$_ -notmatch $Match4} `
    | Where-Object {$_ -notmatch $Match5} `
    | Where-Object {$_ -notmatch $Match6} `
    | Where-Object {$_ -notmatch $Match7} `
    | Where-Object {$_ -notmatch $Match8} `
    | Where-Object {$_ -notmatch $Match9} `
    | Where-Object {$_ -notmatch $Match10}
    Set-Content -Path $FullPath -Value $NewContent
    Write-Output $File
}
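
If the match texts are plain literals rather than regular expressions (an assumption, since the script above passes them to `-notmatch` as regexes), the chain of filters could be collapsed into one pattern. This is a rough sketch with shortened, made-up data, not a drop-in replacement:

```powershell
# Hypothetical literal strings to filter on.
$matchTexts = 'matchtext1', 'matchtext2', 'matchtext3'

# Escape each literal and join them with "|" so the regex means
# "matchtext1 OR matchtext2 OR matchtext3".
$pattern = ($matchTexts | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Made-up sample content standing in for one file's lines.
$sample = 'keep this line', 'drop matchtext2 here', 'also keep'

# A single filter removes every line containing any of the literals.
$kept = @($sample | Where-Object { $_ -notmatch $pattern })
```

Inside the loop, the same `$pattern` could be used in one `Where-Object` instead of ten, which also avoids the backtick line continuations.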
Nuno Chaves