what is hashing??

cool_navin1

2007-05-14 04:38:18 UTC

The Process of mapping large amounts of data into a smaller table is known as hashing.

The mapping function is known as a Hash Function. The table where the data is stored is known as a Hash table.

The Hash table can be implemented using a simple array.

Hi dude here is ur hashing program.......

//The program shows a basic hash function implementation

//using datastructure from the STL in C++

//Compilation : c++ hash.cpp

//Run : ./a.out

#include

#include

#include

using namespace std;

//The hash function returning the hash value

int hash(string name)

{

int key;

key = name[0]%26; // mark the hashing expression

return key;

}

int main()

{

int choice,key;

map hashtable; //The data structure present in the map library

string name;

while(1)

{

cout << "Enter the choice" << endl;

cout << "1: Insert a new name in the hash table" << endl;

cout << "2: Search if a name is already in the hash table(retrieve)" << endl;

cout << "3: To exit\n" << endl;

cin >> choice;

switch(choice)

{

case 1: cout << "Enter the name to be put in the hash table" << endl;

cin >> name ;

key = hash(name); //HASHING !!! done for the name to be inserted

hashtable[key] = name; //name entered in the hash table

break;

case 2: cout << "Enter the name to be searched " << endl;

cin >> name;

key = hash(name);

if (name == hashtable[key])

cout << "Yes dude u exist in the hash table!!!" << endl;

else

cout << "Nope sry u are not registered in the hash table :-( " << endl;

break;

case 3: exit(0);

}

}

return 0;

}

2007-05-14 04:40:38 UTC

- Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value. It is also used in many encryption algorithms.

As a simple example of the using of hashing in databases, a group of people could be arranged in a database like this:

Abernathy, Sara

Epperdingle, Roscoe

Moore, Wilfred

Smith, David

(and many more sorted into alphabetical order)

Each of these names would be the key in the database for that person's data. A database search mechanism would first have to start looking character-by-character across the name for matches until it found the match (or ruled the other entries out). But if each of the names were hashed, it might be possible (depending on the number of names in the database) to generate a unique four-digit key for each name. For example:

7864 Abernathy, Sara

9802 Epperdingle, Roscoe

1990 Moore, Wilfred

8822 Smith, David

(and so forth)

A search for any name would first consist of computing the hash value (using the same hash function used to store the item) and then comparing for a match using that value. It would, in general, be much faster to find a match across four digits, each having only 10 possibilities, than across an unpredictable value length where each character had 26 possibilities.

The hashing algorithm is called the hash function (and probably the term is derived from the idea that the resulting hash value can be thought of as a "mixed up" version of the represented value). In addition to faster data retrieval, hashing is also used to encrypt and decrypt digital signatures (used to authenticate message senders and receivers). The digital signature is transformed with the hash function and then both the hashed value (known as a message-digest) and the signature are sent in separate transmissions to the receiver. Using the same hash function as the sender, the receiver derives a message-digest from the signature and compares it with the message-digest it also received. They should be the same.

The hash function is used to index the original value or key and then used later each time the data associated with the value or key is to be retrieved. Thus, hashing is always a one-way operation. There's no need to "reverse engineer" the hash function by analyzing the hashed values. In fact, the ideal hash function can't be derived by such analysis. A good hash function also should not produce the same hash value from two different inputs. If it does, this is known as a collision. A hash function that offers an extremely low risk of collision may be considered acceptable.

Sharath U

2007-05-14 05:57:56 UTC

its basically a method which is used to locate records... in a huge collection of files.. each record will have something called key.. which is called as hash key.. and there are something called hash function the use of these functions are.. we give the key of the record we r searching as input to this function and the output will be the LOCATION of the record..... the best advantage of this method is the search complexity is O(1).....

use have various ways in which u can implement hashing...

one of the methods is...

1. convert the key into integer using any simple technique

2.take square of the key consider the middle few digits

3. take the mod of the number with the total no of record in the file...

4..

use the above value as an index to the in the file to locate the record.. enjoyy.........

santosh_musicman

2007-05-14 04:39:12 UTC

A hash function is a reproducible method of turning some kind of data into a (relatively) small number that may serve as a digital "fingerprint" of the data. The algorithm "chops and mixes" (i.e., substitutes or transposes) the data to create such fingerprints. The fingerprints are called hash sums, hash values, hash codes or simply hashes. (Note that hashes can also mean the hash functions.) Hash sums are commonly used as indices into hash tables or hash files. Cryptographic hash functions are used for various purposes in information security applications.

visit this link given below for the example you wanted. cheers!

Kapil C

2007-05-14 20:44:54 UTC

Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value. It is also used in many encryption algorithms.

As a simple example of the using of hashing in databases, a group of people could be arranged in a database like this:

Abernathy, Sara

Epperdingle, Roscoe

Moore, Wilfred

Smith, David

(and many more sorted into alphabetical order)

Each of these names would be the key in the database for that person's data. A database search mechanism would first have to start looking character-by-character across the name for matches until it found the match (or ruled the other entries out). But if each of the names were hashed, it might be possible (depending on the number of names in the database) to generate a unique four-digit key for each name. For example:

7864 Abernathy, Sara

9802 Epperdingle, Roscoe

1990 Moore, Wilfred

8822 Smith, David

(and so forth)

A search for any name would first consist of computing the hash value (using the same hash function used to store the item) and then comparing for a match using that value. It would, in general, be much faster to find a match across four digits, each having only 10 possibilities, than across an unpredictable value length where each character had 26 possibilities.

The hashing algorithm is called the hash function (and probably the term is derived from the idea that the resulting hash value can be thought of as a "mixed up" version of the represented value). In addition to faster data retrieval, hashing is also used to encrypt and decrypt digital signatures (used to authenticate message senders and receivers). The digital signature is transformed with the hash function and then both the hashed value (known as a message-digest) and the signature are sent in separate transmissions to the receiver. Using the same hash function as the sender, the receiver derives a message-digest from the signature and compares it with the message-digest it also received. They should be the same.

The hash function is used to index the original value or key and then used later each time the data associated with the value or key is to be retrieved. Thus, hashing is always a one-way operation. There's no need to "reverse engineer" the hash function by analyzing the hashed values. In fact, the ideal hash function can't be derived by such analysis. A good hash function also should not produce the same hash value from two different inputs. If it does, this is known as a collision. A hash function that offers an extremely low risk of collision may be considered acceptable.

Here are some relatively simple hash functions that have been used:

* The division-remainder method: The size of the number of items in the table is estimated. That number is then used as a divisor into each original value or key to extract a quotient and a remainder. The remainder is the hashed value. (Since this method is liable to produce a number of collisions, any search mechanism would have to be able to recognize a collision and offer an alternate search mechanism.)

* Folding: This method divides the original value (digits in this case) into several parts, adds the parts together, and then uses the last four digits (or some other arbitrary number of digits that will work ) as the hashed value or key.

* Radix transformation: Where the value or key is digital, the number base (or radix) can be changed resulting in a different sequence of digits. (For example, a decimal numbered key could be transformed into a hexadecimal numbered key.) High-order digits could be discarded to fit a hash value of uniform length.

* Digit rearrangement: This is simply taking part of the original value or key such as digits in positions 3 through 6, reversing their order, and then using that sequence of digits as the hash value or key.

A hash function that works well for database storage and retrieval might not work as for cryptographic or error-checking purposes. There are several well-known hash functions used in cryptography. These include the message-digest hash functions MD2, MD4, and MD5, used for hashing digital signatures into a shorter value called a message-digest, and the Secure Hash Algorithm (SHA), a standard algorithm, that makes a larger (60-bit) message digest and is similar to MD4.

2007-05-14 04:40:27 UTC

The conversion of a column's primary key value to a database page number on which the row will be stored. Retrieval operations that specify the key column value use the same hashing algorithm and can locate the row directly. Hashing provides fast retrieval for data that contains a unique key value.

Hash tables are one of the most efficient means of searching for data. A hash table establishes a mapping between all of the possible keys and the position where the data associated with those keys is stored. I don't know exactly how all of the hashing methods that you are asking about work but the following should go at least partway towards answering your questions.

Direct Addressing is where the hashing algorithm guarantees that no two keys will generate the exact same hash value. This is the ideal situation but is impractical in most situations. Instead most hash functions map multiple keys to the same has value and provide indirect addressing mechanisms to handle the situation where two keys map to the same value. The situation where this occurs is called a collision. Two keys that hash to the same value like this are called synonyms.

A hash function attempts to distribute keys in a random manner so as to minimize the need to resolve collisions. One of the simplest hashing algorithms is division remainder where you divide the key value by some fixed number and take the remainder as the key location (also known as modulo arithmetic). For example let's say we have space for one thousand entries, so we divide the key of each value by 1000 and take the remainder so the entry with key 5078950770 will occupy position 770 in our array (5078950770 modulo 1000 is 770). Of course with this particular example every 1000th key will map to the same location.

A Chained hash table is one where the data is stored in linked lists which contain all of the data entries whose keys map to the same hash value. The linked list (or bucket) can grow to contain as many entries as required but as the list grows it becomes gradually less efficient in the speed at which the entries can be accessed.

An open addressed hash table stores the data in a table instead of in a linked list and can use a variety of probing methods to locate the required entry within the table. Linear probing is the least efficient method as it involves searching sequentially through the table until the desired entry is found. Where only a few entries map to the same hash value this may be good enough but where lots of entries map to the same hash value, a more efficient method is required.

Double hashing is one of the more effective methods of probing an open addressed hash table. It works by adding the hash codings of two auxiliary hash functions. The function for performing double hashing is h(k,i) = (h1(k) + ih2(k)) mod m where 0 <= i

Dynamic or Universal hashing involves using a random number generator to dynamically generate the hashing algorithm at run time. This method reduces the chance that your keys will produce a bad distribution across the table.

As you have noticed from the above examples, hashing algorithms work on numeric keys. digit analysis/extraction is the method that you use to convert the actual keys into numerical values that cal be fed through your hashing algorithm. To get the best results you should aim to base your hashing values on those parts of the actual keys most likely to vary from one record to another.

sathiyendran a

2007-05-14 08:04:55 UTC

Hashing is a form of "fox and hounds" running competition where one individual sets out ahead of the others and marks a path for them. The Hash House Harriers are the major organizers of these events.

You can read the history at their web site (see link below).

Source(s):

http://www.gthhh.com/bible/biblepage.asp...

http://www.gthhh.com/

bharath

2007-05-14 04:37:34 UTC

It is a technique for locating data in a file by applying a transformation, usually arithmetic, to a key.