When creating UTF-8 encoded files with PowerShell, for example:
$orcFile = "testFile.txt"
Add-Content -Encoding UTF8 $orcFile "bla bla bla"
they are created with a BOM (byte order mark), which is a problem if plain UTF-8 without a BOM is required.
The byte order mark (BOM) is a Unicode character, U+FEFF BYTE ORDER MARK (BOM), whose appearance as a magic number at the start of a text stream can signal several things to a program consuming the text:
- What byte order, or endianness, the text stream is stored in;
- The fact that the text stream is Unicode, to a high level of confidence;
- Which of several Unicode encodings that text stream is encoded as.
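In UTF-8 the BOM is stored as the three bytes EF BB BF at the very start of the file, so its presence can be checked directly. The following is a minimal sketch, not a definitive implementation: the function name `Test-Utf8Bom` is hypothetical, and it reads the whole file into memory, which is fine for small files only.

```powershell
# Hypothetical helper: returns $true if the file starts with the UTF-8 BOM (EF BB BF).
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Resolve to a full path, because .NET methods do not follow PowerShell's current location.
    $fullPath = (Resolve-Path -LiteralPath $Path).Path

    # Read the raw bytes; Get-Content would decode the text and hide the BOM.
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}
```

Calling `Test-Utf8Bom "testFile.txt"` before and after a conversion is a quick way to confirm whether the BOM is still there.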
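One solution is to rewrite the file with an encoder that does not emit a BOM. Stripping the bytes with -replace and writing the result back with Set-Content does not work: Get-Content has already decoded the BOM away, so the pattern never matches, and in Windows PowerShell Set-Content re-encodes the text with the system's default ANSI code page rather than UTF-8.
# Rewrite the file as UTF-8 without a BOM
$orcFile = (Resolve-Path "testFile.txt").Path              # .NET methods need a full path
$lines = @(Get-Content -Encoding UTF8 $orcFile)            # decoding the text drops the BOM
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)   # $false = do not emit a BOM
[System.IO.File]::WriteAllLines($orcFile, $lines, $utf8NoBom)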
Hi,
I’m afraid the file is no longer UTF-8 encoded after using the script.
BR
Ray