VBA: save a file as UTF-8 without BOM

ghz 7 months ago ⋅ 83 views

It's probably something simple. Here is what I tried:

    Set objStream = CreateObject("ADODB.Stream")
    Set objStreamNoBOM = CreateObject("ADODB.Stream")

    With objStream
        .Open
        .Charset = "UTF-8"
        .WriteText "aaaaaa"
        .Position = 0
    End With

    With objStreamNoBOM
        '.Charset = "Windows-1252"   ' WORKS
        .Charset = "UTF-8"           ' DOESN'T WORK!!
        .Open
        .Type = 2
        .WriteText objStream.ReadText
        .SaveToFile "toto.php", 2
        .Close
    End With
    objStream.Close

If the charset is UTF-8, the file starts with ï»¿ (the UTF-8 BOM, bytes EF BB BF, shown as Windows-1252 characters).

Any idea on how to save a file with UTF-8 and without BOM?

Answers

To save a file as UTF-8 without a BOM (Byte Order Mark) using ADODB.Stream in VBA, you need to work around the BOM that the stream adds on its own. When its Charset is "UTF-8", ADODB.Stream writes the three BOM bytes (EF BB BF) at the start of the text, and an editor that reads the file as Windows-1252 displays them as ï»¿. That is what you are seeing at the beginning of the file.
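To confirm whether a saved file really starts with the BOM, you can look at its first three bytes directly. The function below is a minimal sketch; the name HasUtf8Bom is made up for illustration and is not part of ADODB or of the original question.

```vba
' Hypothetical helper (not from the original post):
' returns True if the file starts with the UTF-8 BOM bytes EF BB BF.
Function HasUtf8Bom(ByVal filePath As String) As Boolean
    Dim fileNum As Integer
    Dim firstBytes(0 To 2) As Byte

    ' A file shorter than 3 bytes cannot contain the BOM
    If FileLen(filePath) < 3 Then Exit Function

    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    Get #fileNum, , firstBytes   ' reads exactly 3 bytes into the fixed-size array
    Close #fileNum

    HasUtf8Bom = (firstBytes(0) = &HEF And _
                  firstBytes(1) = &HBB And _
                  firstBytes(2) = &HBF)
End Function
```

Calling HasUtf8Bom("toto.php") after running the script from the question should return True when the file was saved with Charset = "UTF-8".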

The modified version of your script below avoids the BOM in three steps:

  1. Write the text to a UTF-8 text stream (ADODB adds the BOM here).
  2. Switch that stream to binary mode and skip the three BOM bytes.
  3. Write the remaining bytes to a second, binary stream and save it to the file.

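Dim objStream As Object
Dim objStreamNoBOM As Object

Set objStream = CreateObject("ADODB.Stream")
Set objStreamNoBOM = CreateObject("ADODB.Stream")

' Step 1: Write the text to a UTF-8 stream (ADODB prepends the BOM)
With objStream
    .Open
    .Charset = "UTF-8"
    .WriteText "aaaaaa"
    ' Step 2: Switch to binary mode and skip the BOM
    .Position = 0 ' Type can only be changed at position 0
    .Type = 1     ' 1 = adTypeBinary
    .Position = 3 ' Skip the 3-byte BOM (EF BB BF)
End With

' Step 3: Write the remaining bytes to a binary stream and save it
With objStreamNoBOM
    .Open
    .Type = 1 ' 1 = adTypeBinary
    .Write objStream.Read
    .SaveToFile "toto.php", 2 ' 2 = adSaveCreateOverWrite
    .Close
End With

objStream.Close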

Explanation:

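  • objStream is first used to write the text as UTF-8, which puts the BOM at the start of the stream.
  • Position is reset to 0 before the stream is switched to binary mode (Type = 1), because Type can only be changed while the stream is at position 0.
  • Position is then set to 3 to skip the first three bytes, which are the BOM (EF BB BF).
  • objStream.Read returns everything from the current position to the end, so objStreamNoBOM receives the text bytes without the BOM and saves them to the file.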

This approach ensures that the resulting file will be in UTF-8 encoding without the BOM.
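If you need this in more than one place, the same steps can be wrapped in a reusable procedure. The following is a sketch only: the name SaveUtf8NoBom and its parameters are hypothetical, and it uses the stream's CopyTo method in place of Read/Write, which should give the same result.

```vba
' Hypothetical helper (not from the original answer):
' saves content to filePath as UTF-8 without a BOM.
Sub SaveUtf8NoBom(ByVal filePath As String, ByVal content As String)
    Const adTypeBinary As Long = 1
    Const adTypeText As Long = 2
    Const adSaveCreateOverWrite As Long = 2

    Dim textStream As Object
    Dim binStream As Object

    ' Encode the text as UTF-8 (ADODB prepends the BOM)
    Set textStream = CreateObject("ADODB.Stream")
    textStream.Type = adTypeText
    textStream.Charset = "UTF-8"
    textStream.Open
    textStream.WriteText content

    ' Switch to binary mode (only allowed at position 0), then skip the BOM
    textStream.Position = 0
    textStream.Type = adTypeBinary
    If textStream.Size >= 3 Then textStream.Position = 3

    ' Copy everything after the BOM into a binary stream and save it
    Set binStream = CreateObject("ADODB.Stream")
    binStream.Type = adTypeBinary
    binStream.Open
    textStream.CopyTo binStream
    binStream.SaveToFile filePath, adSaveCreateOverWrite

    binStream.Close
    textStream.Close
End Sub
```

For example, SaveUtf8NoBom Environ("TEMP") & "\toto.php", "aaaaaa" would write the file to the user's temp folder; the path is just an illustration.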