Question:
How are computers able to take words and translate them?
2010-03-30 13:49:02 UTC
I'm doing a report on how computers are able to take the words entered on a keyboard and translate them into digital information. This is what I have so far.

Computers can translate what you type by using ASCII, the American Standard Code for Information Interchange. Each letter or symbol is assigned a number up to 255. That number then gets translated into binary, a series of 1s and 0s that computer programs read in order to carry out commands. The keyboard you type on is an input device, and below the keys is a matrix, or circuit board. Depending on the operating system, the computer application takes the information entered and decides how to use it by translating it into binary.
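For illustration, here is a small Python sketch of that character-to-code-to-binary step (the sample text "Hi!" is just an arbitrary example):

    # Each typed character maps to an ASCII code, and that code is what
    # the computer actually stores as a pattern of bits.
    for ch in "Hi!":
        code = ord(ch)              # ASCII code, e.g. 'H' -> 72
        bits = format(code, "08b")  # the same number as 8 binary digits
        print(ch, code, bits)       # H 72 01001000, i 105 01101001, ! 33 00100001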

If anyone thinks part of this is wrong or that I'm missing good information, please let me know. This is only a rough draft.
Three answers:
Mantis
2010-03-30 14:30:37 UTC
Your answer would have been right 10 years ago, but things have changed since then. The majority of programs these days use Unicode or UTF-8 instead of ASCII. As you point out above, each character in ASCII has a value from 0 to 255, which is fine for US English but falls apart a little when you get to German, Swedish, and other similar languages, and breaks utterly when you get to Japanese, Korean, Hebrew, Arabic, and other languages that use completely different character sets.
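A quick Python sketch of that limitation (the sample strings are arbitrary): plain 7-bit ASCII only covers codes 0 to 127, so text from other writing systems simply cannot be encoded in it.

    print("abc".encode("ascii"))        # fine: b'abc'
    try:
        "日本語".encode("ascii")          # Japanese text has no ASCII codes
    except UnicodeEncodeError as e:
        print("not representable in ASCII:", e.reason)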



Windows uses Unicode internally. It's very similar to ASCII in many ways, but instead of using 8 bits and having values from 0 to 255, it uses 16 bits and values from 0 to 65535. This is sufficient for most languages. (Worth noting that ASCII is technically only 0 to 127, but most programs that use ASCII use extended ASCII, which goes from 0 to 255 and has more European accented characters, like ö and ƒ and ô and such.)
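To see those ranges in practice, here is a small Python sketch (the characters are arbitrary examples): the accented letters fit in the extended 0-255 range, while other alphabets need the larger 16-bit range.

    for ch in ("A", "ö", "ô", "日"):
        print(ch, ord(ch))   # A 65, ö 246, ô 244, 日 26085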



The internet uses UTF-8, which is much more closely related to ASCII but is much more complex. The first 128 characters are exactly ASCII and take one byte each, but any byte with a value of 128 or higher is part of a multi-byte sequence for an international character. In fact a single character can use up to 4 bytes (ASCII uses 1, Windows' Unicode uses 2). It gets really complicated really fast, but it handles international languages even better than the 16-bit Unicode described above.
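A minimal Python sketch of that variable length (the sample characters are chosen arbitrarily): the first 128 ASCII characters stay one byte, and everything else takes 2 to 4 bytes.

    for ch in ("A", "ö", "語", "😀"):
        data = ch.encode("utf-8")
        print(ch, len(data), "byte(s):", data.hex())
    # A 1, ö 2, 語 3, 😀 4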



There's also UTF-16, which is almost a cross between the 16-bit Unicode above and UTF-8. You can read more details about either UTF-8 or UTF-16 on Wikipedia. Just to confuse things a little further, there's an older international standard called "multibyte" that is based on ASCII and similar in some ways to UTF-8, although more limited. Few programs use that anymore.
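For comparison, a short Python sketch (the sample text is arbitrary) showing how the same string comes out in two of these encodings; "utf-16-le" is used here just to keep the byte-order mark out of the count.

    text = "Hello 語"
    for enc in ("utf-8", "utf-16-le"):
        print(enc, len(text.encode(enc)), "bytes")   # utf-8: 9, utf-16-le: 14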



As for keyboards, although the mechanics you've described are true for most keyboards (some actually do things differently internally), the translation from the matrix of wires and contacts to a keystroke is handled on the device itself, so the computer doesn't need to worry about it. The computer receives a scan code, which is basically a code for which button was hit. Depending on what language setup you're using (international keyboards put buttons in different places, and even within the US there are a few common keyboard layouts, such as QWERTY and Dvorak), the computer then takes that scan code and turns it into a character.
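A hypothetical sketch of that last step in Python (the scan code value and the tiny layout tables are made up for illustration): the keyboard reports which physical key was pressed, and the layout chosen in the OS decides which character that becomes.

    scan_code = 0x10                       # assumed code for the top-left letter key
    layouts = {
        "QWERTY": {0x10: "q"},
        "Dvorak": {0x10: "'"},             # same physical key, different character
    }
    for name, table in layouts.items():
        print(name, "->", table[scan_code])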



Probably more information than you wanted, but I hope it points you in the right direction. In summary, nearly all programs in Windows use Unicode, and internet/e-mail traffic is nearly all UTF-8 these days. They are more complex than ASCII but support international characters better, and since the computer world is far bigger than the United States, better international support is a good thing. (For those of us who prefer US English, know that the core of these international character sets is still based on the American ASCII standard, so we still get a leg up on any other language.)



Good luck.
mcclister
2016-09-10 09:05:03 UTC
It begins with the BIOS. That is where a PC learns how to be a PC, and where it learns how to talk to a keyboard. Then, when the OS loads up, it sorts out what those signals will mean. Each time you press a key it creates a specific signal; think of a more complicated version of Morse code. Each keypress also creates an interrupt, which is a signal telling the computer there's a keystroke waiting in the buffer. The keystrokes sit there until a program removes them or the keyboard buffer fills up, something that doesn't usually happen unless your program freezes. You'll get an unpleasant beep if it does happen and you keep pressing keys. Which program has first dibs on the keystrokes depends on the OS you're running and what has the current focus. The way keystroke-capturing programs work is that they do non-destructive reads on the keyboard buffer, then pop back into the background, allowing the original program to read the buffer and remove the keystrokes intended for it.

What those keystrokes mean depends on the character set you're using. Each one evaluates to a code. ASCII is one example of such a set; EBCDIC is another, though not many machines use EBCDIC anymore. Each key maps to a specific code. For example, when you press the Enter key, that's ASCII 13. Your program then uses those codes to interpret their meaning. If you're in a word-processing package, the Insert key will toggle overwrite on and off; if you're in a game, pressing the Insert key is usually ignored.

Programs that save keystrokes to the hard drive store them in the character set of the OS unless another is specified. There are a few variations. For example, Linux/Unix uses a single line feed to mark the end of a line, while Microsoft uses a hard return, which is a carriage return plus a line feed. Intel machines use little-endian byte order and Sun machines are big-endian, which means that when they write out the text they store it slightly differently. Basically it has to do with the way the codes are written to disk: little-endian writes the low-order bytes first, big-endian the reverse. This used to cause a good bit of havoc when reading files created on Intel machines on a Sun or mainframe, and vice versa, if the transfer protocol didn't automatically convert them.

The final product that gets stored is a binary representation of those numbers, 0000 0001 for example, which translates to a decimal 1. When the machine reads it back in, if it is data, it will translate that back into a character code, for example an ASCII control code like NUL, typically used to tell a program end of field or end of file.
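Two of those details are easy to see from Python (a small sketch; the sample word is arbitrary): the different line endings, and the two byte orders for the same number.

    import struct

    # Line endings: Unix marks end-of-line with LF (ASCII 10);
    # Windows uses CR + LF (ASCII 13 + 10).
    print("unix:   ", "hello\n".encode("ascii"))
    print("windows:", "hello\r\n".encode("ascii"))

    # Byte order: the same 16-bit number written little-endian vs big-endian.
    n = 1
    print("little-endian:", struct.pack("<H", n).hex())   # 0100
    print("big-endian:   ", struct.pack(">H", n).hex())   # 0001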
2010-03-30 13:55:33 UTC
Yep, you got it right.



But you could expand your research by looking up the extended ASCII table (see a full ASCII table), the IBM version of the ASCII table, stuff like that...
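One quick way to poke at that from Python, assuming the IBM table refers to code page 437, the original IBM PC character set (just an illustration):

    # Show a few extended-ASCII codes and the characters IBM code page 437
    # assigns to them (Ç, ü, é, ...).
    for code in range(128, 144):
        print(code, bytes([code]).decode("cp437"))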


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.