
Python print UTF8 encoded string

Printing the English string “Hello, World!” is simple, but printing the Chinese string “你好, 世界” can run into encoding problems.

SyntaxError

❌ In Python 2, if a source file contains non-ASCII characters and declares no encoding, the interpreter raises an error before any code runs:

#!/usr/bin/python
print "你好,世界"

Running this program produces:

File "test.py", line 2
SyntaxError: Non-ASCII character '\xe4' in file test.py on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
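
The '\xe4' in this message is the first byte of 你 once the file is saved as UTF-8. The following is a minimal sketch in Python 3 syntax (the variable names are illustrative and not part of the original program). It shows where that byte comes from and why an ASCII decoder rejects it:

```python
# Illustrative sketch (Python 3): where the '\xe4' in the error comes from.
text = "你好,世界"

utf8_bytes = text.encode("utf-8")
first_byte = utf8_bytes[0]  # first byte of the character 你
print(hex(first_byte))      # 0xe4

# ASCII only covers bytes 0x00-0x7f, so decoding these bytes as ASCII fails.
try:
    utf8_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)             # False
```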

Solution

Python 2 assumes that source files are ASCII unless told otherwise, so it rejects the non-ASCII bytes of the Chinese characters when no encoding is declared.

✔️ The solution is to add # -*- coding: UTF-8 -*- or # coding=utf-8 on the first or second line of the file.

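Note: Do not put a space between coding and the = in # coding=utf-8, or the declaration will not be recognized. Writing it with no spaces on either side of the = is the safest form.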
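
One hedged way to check which spellings of the declaration are recognized is to ask Python 3's own tokenizer, which follows the same PEP 263 pattern. In the sketch below, the helper name declared_encoding is hypothetical, and latin-1 is used only so that a recognized declaration is distinguishable from the UTF-8 default:

```python
import io
import tokenize


def declared_encoding(first_line: bytes) -> str:
    """Return the encoding Python 3's tokenizer detects for a one-line source."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(first_line).readline)
    return encoding


# Recognized: no space between "coding" and the separator.
no_spaces = declared_encoding(b"# coding=latin-1\n")
emacs_style = declared_encoding(b"# -*- coding: latin-1 -*-\n")

# Not recognized: the space before "=" breaks the match, so the
# tokenizer falls back to its default encoding (utf-8 in Python 3).
space_before = declared_encoding(b"# coding = latin-1\n")

print(no_spaces, emacs_style, space_before)
```

In Python 2 the fallback is ASCII rather than UTF-8, so an unrecognized declaration there leads straight back to the SyntaxError shown earlier.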

e.g. (Python 2)

#!/usr/bin/python
# -*- coding: UTF-8 -*-

print "你好,世界"

Running this program produces:

你好,世界

Python 3.x

Note: Python 3 source files default to UTF-8, so Chinese characters parse correctly without an encoding declaration. 👌
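
As a rough illustration, the same program in Python 3 needs no coding line, and print becomes a function. The reconfigure call is an optional extra that assumes Python 3.7 or later. It is not part of the original example and only matters when the terminal's locale is not UTF-8:

```python
import sys

# Assumption: Python 3.7+, where text streams provide reconfigure().
# Forcing UTF-8 keeps the output correct even if the terminal locale is not UTF-8.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

greeting = "你好,世界"
print(greeting)
```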


Editor Settings

Note: If you write the code in an editor, the editor must also save the .py file as UTF-8; otherwise an error similar to the following appears:

SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xc4 in position 0: invalid continuation byte
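
One way this error can arise is a file saved in a legacy Chinese encoding and then read as UTF-8. The sketch below assumes GBK as the editor's encoding. The original does not say which encoding was involved, so treat that choice as an illustration only:

```python
# Assumption: the editor saved the file as GBK rather than UTF-8.
gbk_bytes = "你好,世界".encode("gbk")

# Reading those bytes as UTF-8 fails on the very first byte.
try:
    gbk_bytes.decode("utf-8")
    message = ""
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# Re-encoding the same text as UTF-8 is what "save as UTF-8" does in an editor.
utf8_bytes = gbk_bytes.decode("gbk").encode("utf-8")
print(utf8_bytes.decode("utf-8") == "你好,世界")  # True
```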

📜 PyCharm setup steps:

  • Go to File > Settings and search for encoding in the input box.
  • Find Editor > File encodings and set IDE Encoding and Project Encoding to utf-8.