Python UTF-16 BE, SyntaxError: unknown parsing error.

created at 02-10-2022

problem

While learning Python, I was working with weather data and downloaded the corresponding CSV file. It contains Chinese text inside quotation marks, for example "从东北方吹来的风" ("wind blowing from the northeast"). The file opens without problems in Notepad, which shows its encoding as UTF-16 BE.

code

import csv
filename = 'sw_2022.csv'
with open(filename, 'r', encoding='utf-8') as f:
    reader = csv.reader(f)
    header_row = next(reader)
    for index, column_header in enumerate(header_row):
        print(index, column_header)

Result and error

When the encoding setting in my development environment is UTF-16 BE, this error is displayed:

SyntaxError: unknown parsing error

For other encodings such as UTF-8, the error is reported as:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 0: invalid start byte

I have tried many suggestions found online, including encoding='utf-8', but none of them resolved the problem.

solution

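UTF-16 and UTF-8 are different encodings, so a UTF-16 file cannot be decoded as UTF-8. The byte 0xfe in the error message is the first byte of the UTF-16 BE byte order mark (FE FF), and 0xfe is never valid in UTF-8. Read the file with encoding='utf-16' instead; that codec uses the byte order mark to detect the byte order and strips it from the decoded text.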

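import csv

filename = 'sw_2022.csv'
# 'utf-16' reads the byte order mark (BOM) to pick the byte order and strips it.
# newline='' lets the csv module handle line endings itself, as its documentation recommends.
with open(filename, 'r', encoding='utf-16', newline='') as f:
    reader = csv.reader(f)
    header_row = next(reader)  # the first row holds the column headers
    for index, column_header in enumerate(header_row):
        print(index, column_header)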
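
If the encoding of a downloaded file is not known in advance, checking its first bytes for a byte order mark can help choose the codec. The following is a minimal sketch, not part of the original answer: the helper name detect_bom_encoding and its default argument are hypothetical, and it only recognises files that actually start with a BOM.

```python
import codecs

# Hypothetical lookup table: leading byte order mark (BOM) -> codec name.
# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]


def detect_bom_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on the BOM at the start of data.

    Falls back to `default` (an assumption, not a detection) when no BOM is found.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


# In-memory sample that mimics the start of a UTF-16 BE file with a BOM.
sample = codecs.BOM_UTF16_BE + '"风向",1'.encode("utf-16-be")
encoding = detect_bom_encoding(sample)
text = sample.decode(encoding)  # the 'utf-16' codec consumes the BOM
```

For a file on disk, the first four bytes can be read with open(filename, 'rb') and passed to the helper, and the returned name used as the encoding argument of open().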

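Alternatively, convert the CSV file to UTF-8: in Notepad choose Save As, select UTF-8 as the encoding, and overwrite the original file. The original code with encoding='utf-8' then works. If Notepad writes a byte order mark (the "UTF-8 with BOM" option), use encoding='utf-8-sig' so that the mark does not end up in the first column header.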
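
The conversion can also be done in Python rather than in Notepad. This is a rough sketch under stated assumptions: the function name convert_to_utf8 and its parameters are hypothetical, the source file is assumed to start with a BOM so that 'utf-16' can detect the byte order, and the output is written to a separate file instead of overwriting the original.

```python
def convert_to_utf8(src_path, dst_path, src_encoding="utf-16"):
    """Re-encode a text file as UTF-8 (without a BOM), line by line.

    newline='' on both files disables newline translation, so the
    original line endings are copied unchanged.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

For example, convert_to_utf8('sw_2022.csv', 'sw_2022_utf8.csv') would produce a copy that can be read with encoding='utf-8'; the output filename here is only an illustration.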

edited at: 02-10-2022