I started learning Python a few months ago for a class (my first attempt at any kind of programming) and had a hard time with it until I got inspired by a project of my own. Now I'm close to finishing it and really want to get it working! But I've run into a character-encoding problem, and I'm finally admitting that I'm out of my depth and need help. If anything I say sounds ignorant, naive, or confused, that's because I am, and I would love to be enlightened.
I have a Python script that reads UTF-16-encoded plain text files containing IPA (International Phonetic Alphabet) characters. It picks out some of the lines in each file and writes them to a new plain text file, which I think is also UTF-16-encoded (admittedly I'm not 100% sure of this and don't know how to check). I chose UTF-16 because I read that Excel should be able to open it without me doing anything special. Ultimately I need to open the new file in Excel and have the IPA characters show up. Instead, what I get is escape sequences full of backslashes and x's. Here's an example line from the new file:
'H2,5,[\'\\xc9\\x99n\\xc9\\x91p\\xc9\\x99l\', "\\xc2\\xa0\\xc2\\xa0\\xc2\\xa0\\xc2\\xa0(\'n", \'appel)\\xc2\\xa0\\xc2\\xa0\']\n'
Here's the line it came from in the original file:
H2 5 ənɑpəl ('n appel)
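A side note on the "don't know how to check" part: the escaped bytes in the example output (\xc9\x99, \xc9\x91, \xc2\xa0) are what UTF-8 produces for ə, ɑ, and the non-breaking space, so the text may actually be UTF-8 rather than UTF-16. Below is a minimal Python 3 sketch of two rough checks. The helper name guess_bom is hypothetical, and a missing byte-order mark does not prove anything, so treat the result as a hint only.

```python
import codecs

def guess_bom(raw):
    """Return an encoding name if the bytes start with a known byte-order mark.

    Hypothetical helper for illustration; files without a BOM return None.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None

# The bytes that appear (escaped) in the example output line.
escaped_bytes = b"\xc9\x99n\xc9\x91p\xc9\x99l"

# Decoding them as UTF-8 gives back the IPA word from the original file.
decoded = escaped_bytes.decode("utf-8")

# Bytes produced by Python's own "utf-16" codec start with a BOM.
bom_of_utf16_text = guess_bom("x".encode("utf-16"))

# Plain ASCII bytes carry no BOM at all.
bom_of_plain_text = guess_bom(b"H2 5")
```

To try this on a real file, one could read the first few bytes with open(path, "rb").read(4) and pass them to guess_bom, where path is whatever file is being inspected.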
Here's the part of my Python script that writes each line from the original file to the new file. Here "data" is the file object I'm writing to, and all of this sits inside a few loops that pick out the lines:
data.write(str(line.split()[0]))
data.write(",")
data.write(str(line.split()[1]))
data.write(",")
data.write(str(line.split()[2:]))
data.write("\n")
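One likely source of the brackets, quotes, and backslashes is that calling str() on a list (the [2:] slice) writes the list's printable representation rather than the text itself. The following is a minimal sketch, assuming Python 3 and assuming the remaining fields should be joined with single spaces. The helper name format_row, the sample line, and the temporary output path are all made up for illustration.

```python
import os
import tempfile

def format_row(line):
    """Build 'field0,field1,rest' from one whitespace-separated line."""
    fields = line.split()          # in Python 3 this also splits on non-breaking spaces
    rest = " ".join(fields[2:])    # join the remaining pieces instead of str() on a list
    return ",".join([fields[0], fields[1], rest])

# Sample line modelled on the one in the question, including non-breaking spaces.
sample = "H2 5 ənɑpəl\u00a0\u00a0\u00a0\u00a0('n appel)\u00a0\u00a0\n"
row = format_row(sample)

# Write the row to a throwaway file, stating the encoding explicitly.
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(out_path, "w", encoding="utf-16") as data:
    data.write(row + "\n")

# Read it back with the same encoding to confirm the characters survive.
with open(out_path, "r", encoding="utf-16") as check:
    round_trip = check.read()
```

If a field could itself contain a comma, the standard csv module would handle the quoting; this sketch skips that for brevity.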