Removing the BOM from a UTF-8 file using PowerShell (converting UTF-8 with BOM to UTF-8 without BOM)

When creating UTF-8 encoded files using PowerShell, for example:

$orcFile = "testFile.txt"

Add-Content -Encoding UTF8 $orcFile "bla bla bla"

they are created with a BOM (byte order mark) in Windows PowerShell 5.1 and earlier, which is a problem when plain UTF-8 without a BOM is required.

The byte order mark (BOM) is a Unicode character, U+FEFF BYTE ORDER MARK (BOM), whose appearance as a magic number at the start of a text stream can signal several things to a program consuming the text:

  • What byte order, or endianness, the text stream is stored in;
  • The fact that the text stream is Unicode, to a high level of confidence;
  • Which of several Unicode encodings that text stream is encoded as.
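
For UTF-8 specifically, the BOM is the three-byte sequence EF BB BF at the very start of the file. A minimal sketch for checking whether a file starts with it, assuming the file is small enough to read into memory (Test-Utf8Bom is a hypothetical helper name, not a built-in cmdlet):

```powershell
# Hypothetical helper: returns $true if the file starts with the UTF-8 BOM (EF BB BF).
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Resolve to a full path, because .NET methods use the process working
    # directory, which may differ from the current PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).Path
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}

Test-Utf8Bom "testFile.txt"
```

Checking the raw bytes rather than the decoded text matters here, because text-reading cmdlets may consume the BOM before the content is visible to the script.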

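One solution is to rewrite the file without the BOM. Get-Content consumes the BOM while decoding the file, so the text only has to be written back with an encoding that does not emit a BOM:

# Remove the UTF-8 BOM from a file by rewriting it without one

# .NET methods need a full path; they do not follow the PowerShell location

$orcFile = (Resolve-Path "testFile.txt").Path

# $false = do not emit a BOM when writing

$utf8NoBom = New-Object System.Text.UTF8Encoding $false

# @(...) keeps an empty file from producing $null

$lines = @(Get-Content -Encoding UTF8 $orcFile)

[System.IO.File]::WriteAllLines($orcFile, [string[]]$lines, $utf8NoBom)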
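
An alternative that leaves the rest of the file untouched byte for byte (including line endings) is to drop the first three bytes when they are the UTF-8 BOM. This is only a sketch, assuming the file is small enough to read into memory; Remove-Utf8Bom is a hypothetical helper name, not a built-in cmdlet:

```powershell
# Hypothetical helper: strips a leading UTF-8 BOM (EF BB BF) in place, if present.
function Remove-Utf8Bom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # .NET methods need a full path; they do not follow the PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).Path
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    $hasBom = $bytes.Length -ge 3 -and
              $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF

    if ($hasBom) {
        # Copy everything after the first three bytes into a new array.
        $rest = New-Object byte[] ($bytes.Length - 3)
        [System.Array]::Copy($bytes, 3, $rest, 0, $rest.Length)
        [System.IO.File]::WriteAllBytes($fullPath, $rest)
    }
}

Remove-Utf8Bom "testFile.txt"
```

Files without a BOM are left unmodified, so the helper can be run repeatedly. In PowerShell 6 and later, Set-Content and Add-Content write UTF-8 without a BOM by default, and the utf8NoBOM and utf8BOM values of -Encoding select the variant explicitly.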
