What is encoding? Please help?

Question:

Daniel

2012-01-31 23:06:13 UTC

Unicode, encoding, ANSI, UTF-8? When I use a computer I always see these types of things in Word documents, Notepad, you name it! But I have no idea what they mean. Can someone explain this to me? Thanks!

Three answers:

MistWing

2012-02-01 01:31:29 UTC

The best way to find out what these mean (as well as others you may run into) is to enter the terms in Wikipedia.

Unicode: A computer standard for representing text from the world's languages. Just as ASCII is a standard for representing English text in a computer, Unicode extends the idea so that characters from virtually every written language can be represented.

Encoding: Changing data from one representation to another. For example, we can encode the letter A as the number 65 in ASCII, which is a form a computer can store and process.

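ANSI: American National Standards Institute, a private nonprofit organization that oversees the development of voluntary consensus standards. When Notepad and other Windows programs offer "ANSI" as an encoding, they loosely mean the system's default Windows code page (for example Windows-1252 on Western European and US systems), not a standard published by the institute.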

UTF-8: UCS Transformation Format - 8 Bit (UCS stands for Universal Character Set). A way of representing Unicode as a series of 8-bit bytes. The number of bytes used is variable (one to four), depending on the character you want to represent.
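To make the definitions above concrete, here is a small Python 3 sketch. It is only an illustration: the helper name `describe` and the choice of sample characters are mine, not part of any standard. It shows the Unicode number (code point) of a few characters, how many bytes UTF-8 uses for each, and what happens when UTF-8 bytes are read back as Windows-1252 ("ANSI" on many Windows systems).

```python
def describe(ch):
    """Return (code point, UTF-8 bytes) for a single character."""
    return ord(ch), ch.encode("utf-8")

# A (ASCII letter), e-acute, schwa (an IPA symbol), euro sign
samples = ["A", "\u00e9", "\u0259", "\u20ac"]

for ch in samples:
    code_point, utf8 = describe(ch)
    # Print only ASCII so this runs on any console
    print(f"U+{code_point:04X} -> {len(utf8)} byte(s): {utf8!r}")

# The same bytes decoded with the wrong encoding turn into different characters.
schwa_bytes = "\u0259".encode("utf-8")   # two bytes: C9 99
garbled = schwa_bytes.decode("cp1252")   # read them as Windows-1252 instead
print(ascii(garbled))
```

The letter A takes one byte in UTF-8 (the same value, 65, as in ASCII), while the other characters take two or three. The last two lines show why a file looks garbled when a program guesses the wrong encoding: the bytes are unchanged, only their interpretation differs.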

anonymous

2012-02-01 12:29:31 UTC

I just started learning Python a few months ago for a class (my first-ever attempt at any kind of programming) and was really having a hard time with it until I got inspired by a project of my own. Now I'm so close to being successful and really want to achieve this! But I've run into a character-encoding problem and am finally admitting I'm way out of my depth and need help to solve it. If things I say sound ignorant or naive or confused it's because I am, and would love to be enlightened.

So I have this Python script that is reading UTF-16-encoded plain text files that include IPA (International Phonetic Alphabet) characters, and it picks out some of the lines in each file and writes them into a new plain text file which I think is also UTF-16-encoded (admittedly I am not 100% sure of this and don't know how to check). I chose UTF-16 encoding because I did some reading and learned that Excel should be able to read that without me doing anything special. Ultimately I need to take the new file and put it in Excel and have the IPA characters show up. But what I'm getting is instead the stuff with all the slashes and x's. Here's an example line from the new file:

'H2,5,[\'\\xc9\\x99n\\xc9\\x91p\\xc9\\x99l\', "\\xc2\\xa0\\xc2\\xa0\\xc2\\xa0\\xc2\\xa0(\'n", \'appel)\\xc2\\xa0\\xc2\\xa0\']\n'

Here's the line that came from in the original file:

H2 5 ənɑpəl ('n appel)

The part of my Python script that wrote the line from the original file to the new file (where "data" is the thing that opens the file I'm writing into and this is all inside of a few loops that do the part about picking out the lines):

data.write(str(line.split()[0]))

data.write(",")

data.write(str(line.split()[1]))

data.write(",")

data.write(str(line.split()[2:]))

data.write("\n")
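A possible way around this, sketched in Python 3 and resting on some guesses. Calling str() on line.split()[2:] writes the printed form of a list (brackets, quotes and escape codes) rather than the text itself, so joining the pieces back into one string avoids that. Also, the bytes \xc9\x99 in the sample output are the UTF-8 encoding of ə, which suggests the original file is actually UTF-8 rather than UTF-16. The names format_line and copy_lines, the file names, and the default encodings are hypothetical, and the loops that pick out lines are not shown in the post, so this sketch simply copies every line that has at least two fields.

```python
import os
import tempfile

def format_line(line):
    """Turn 'H2 5 some text' into 'H2,5,some text'."""
    fields = line.split()  # splits on any whitespace, including non-breaking spaces
    # Join the remaining words into one string instead of writing str(list)
    return ",".join([fields[0], fields[1], " ".join(fields[2:])])

def copy_lines(src_path, dst_path, src_encoding="utf-8", dst_encoding="utf-16"):
    """Read text in one encoding and write reformatted lines in another."""
    with open(src_path, encoding=src_encoding) as src, \
         open(dst_path, "w", encoding=dst_encoding) as dst:
        for line in src:
            if len(line.split()) >= 2:  # assumption: skip blank or short lines
                dst.write(format_line(line) + "\n")

# Small demonstration using a temporary folder
sample = "H2 5 \u0259n\u0251p\u0259l \u00a0\u00a0('n appel)\u00a0\u00a0\n"
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "in.txt")
    dst_path = os.path.join(tmp, "out.txt")
    with open(src_path, "w", encoding="utf-8") as f:
        f.write(sample)
    copy_lines(src_path, dst_path)
    with open(dst_path, "rb") as f:
        raw = f.read()        # raw bytes, starting with the UTF-16 byte order mark
    with open(dst_path, encoding="utf-16") as f:
        result = f.read()     # decoded text
```

Opening the files with an explicit encoding means Python decodes the bytes into real text on the way in and encodes it again on the way out, so the IPA characters survive as characters. Reading the first two bytes of a file in binary mode is also one rough way to check for UTF-16: a file written this way starts with the byte order mark FF FE or FE FF.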

Rohit

2012-02-01 08:56:21 UTC

Encoding is the process of converting a sequence of characters into a specific format, such as a sequence of bytes, for transmission or storage.


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.
