Removing the BOM from a UTF-8 file using PowerShell (converting UTF-8 with BOM to UTF-8 without BOM)

When creating UTF-8 encoded files using Windows PowerShell, for example:

$orcFile = "testFile.txt"

Add-Content -Encoding UTF8 $orcFile "bla bla bla"

they are created with a BOM (byte order mark), which is a problem when plain UTF-8 without a BOM is required. (PowerShell 6 and later write UTF-8 without a BOM by default.)

The byte order mark (BOM) is a Unicode character, U+FEFF BYTE ORDER MARK (BOM), whose appearance as a magic number at the start of a text stream can signal several things to a program consuming the text:

  • What byte order, or endianness, the text stream is stored in;
  • The fact that the text stream is Unicode, to a high level of confidence;
  • Which of several Unicode encodings that text stream is encoded as.
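In UTF-8, U+FEFF is encoded as the three bytes EF BB BF, so a quick way to see whether a file carries a BOM is to look at its first three bytes. The following is a minimal sketch of that check; the function name Test-Utf8Bom is hypothetical, and it reads the whole file for simplicity, which is fine for small files but wasteful for large ones.

```powershell
# Hypothetical helper: returns $true when the file starts with the UTF-8 BOM (EF BB BF).
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # .NET methods ignore PowerShell's current location, so resolve to an absolute path first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    [byte[]]$bytes = [System.IO.File]::ReadAllBytes($fullPath)

    # A file shorter than three bytes cannot contain the BOM.
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}
```

For example, Test-Utf8Bom -Path "testFile.txt" can be run before and after a conversion to confirm that the BOM is actually gone.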

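One solution is to rewrite the file without the BOM. A plain Get-Content | Set-Content round trip does not do this in Windows PowerShell: Get-Content already drops the BOM while decoding, so a replace on "\xEF\xBB\xBF" never matches, and Set-Content then saves the text in the system ANSI code page instead of UTF-8. Reading and writing through .NET with an explicit BOM-less encoding avoids both problems:

#remove UTF-8 BOM from file

$orcFile = "testFile.txt"

# .NET methods need an absolute path; they ignore PowerShell's current location
$fullPath = (Resolve-Path $orcFile).ProviderPath

# ReadAllText detects the BOM and drops it while decoding
$text = [System.IO.File]::ReadAllText($fullPath)

# UTF8Encoding($false) writes UTF-8 without a BOM
[System.IO.File]::WriteAllText($fullPath, $text, (New-Object System.Text.UTF8Encoding $false))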
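If the text does not need to be decoded at all, the BOM can also be removed at the byte level, which leaves every other byte of the file (including line endings) exactly as it was. This is only a sketch under the assumption that the file fits comfortably in memory; the function name Remove-Utf8Bom is hypothetical.

```powershell
# Hypothetical helper: strips a leading UTF-8 BOM (EF BB BF) without touching any other byte.
# Returns $true if a BOM was removed, $false if the file had none.
function Remove-Utf8Bom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # .NET methods ignore PowerShell's current location, so resolve to an absolute path first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    [byte[]]$bytes = [System.IO.File]::ReadAllBytes($fullPath)

    $hasBom = $bytes.Length -ge 3 -and
              $bytes[0] -eq 0xEF -and
              $bytes[1] -eq 0xBB -and
              $bytes[2] -eq 0xBF
    if (-not $hasBom) { return $false }   # nothing to do; the file is left untouched

    # Recreate the file and write back everything after the three BOM bytes.
    $stream = [System.IO.File]::Create($fullPath)
    try {
        $stream.Write($bytes, 3, $bytes.Length - 3)
    }
    finally {
        $stream.Dispose()
    }
    return $true
}
```

Calling Remove-Utf8Bom -Path "testFile.txt" a second time returns $false and does not rewrite the file, so the helper is safe to run repeatedly.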

This Post Has One Comment

  1. Hi,

    I’m afraid the file is no longer UTF-8 encoded after using the script.


