If you are pursuing a degree in data science, you may have come across the term “hashing”. What exactly is hashing, and what role does it play in data structures? Let’s find out.
When you are going through a large list of items, such as a dictionary, finding a specific word can be a challenging task. If the words are arranged randomly, you may have to go through the entire dictionary to find the exact word you are looking for.
Hashing techniques make such lookup tasks easier. Read on to learn about hashing in data structures: what it is, what it is used for, its benefits, worked examples, and its limitations.
Understanding Hashing in Data Structures
Hashing in a data structure assigns a specific hash code or a fixed numeric value to a piece of data, also known as a key. Using this code as an index, you can find any specific item in a hash table (a collection of items). In simple terms, hashing converts data into a hash code, which decides where the data will be stored in memory, making retrieval easier.
Think of hashing like a set of lockers in a large room. Each item is placed in a specific locker, which is identified by a number. Instead of scanning the entire room, you just need to go to the specific locker to find your stored item. That’s how hashing works in computer science.
Here are the main components of hashing:
- Key: The actual piece of data that needs to be stored in the memory, such as a student’s admission number or name. It can be text, a number, or other characters.
- Hash Function: A hash function is a formula that maps the key to an integer or a fixed-size value. Choosing a good hash function is crucial, as it spreads keys evenly across the hash table and minimises the risk of collisions. For simple cases, the hash function “key % tableSize” is used; for large tables, a more complex mathematical hash formula may be necessary.
- Hash Table: It’s a table or a collection of slots that store different pieces of data with their hash codes. The size of the hash table is determined based on the data volume.
Types of Hashing in Data Structures
An efficient hash function makes it easy to store and retrieve data. Hashing is a straightforward concept that requires a key, a hash function that maps the key to an index (ideally a different one for each key, although this cannot be guaranteed), and a table that holds each key for quick lookup. The question is how exactly the hash function works and which formula assigns each key to a specific slot in the hash table.
Let’s understand different hashing techniques.
Division Method
One of the fastest and easiest ways to assign a key to a slot in a hash table is the division method. Divide the key by the table size and use the remainder as the index. For example:
- Key: 1453
- Table size: 11
To get the index, divide the key (1453) by the table size (11). Since 1453 = 11 × 132 + 1, the remainder is 1, so the key goes in slot 1.
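The division method can be sketched in a few lines of Python. The function name division_hash is an illustrative choice for this article, not part of any library.

```python
def division_hash(key: int, table_size: int) -> int:
    """Return the slot index: the remainder of key divided by table_size."""
    return key % table_size


# Worked example from the text: key 1453, table size 11
print(division_hash(1453, 11))  # 1
```

A prime table size, such as 11 here, is a common choice because it tends to spread keys with regular patterns more evenly across the slots.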
Multiplication Method
In this method, a constant (A) between 0 and 1 is chosen and multiplied by the key. The fractional (decimal) part of the result is then multiplied by the table size. Finally, the floor value (whole number part) is used as the index.
- Key: 25
- Table size: 10
- Constant (A): 0.618
- Formula: h(key) = floor(m × ((key × A) % 1)), where m is the table size
Step 1: Multiply the key by A
25 × 0.618 = 15.45
Step 2: Multiply the decimal part of the value by the table size
0.45 × 10 = 4.5
Step 3: Use the floor (whole number part) for the index
h(25) = 4
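Here is a minimal Python sketch of the same three steps. The function name multiplication_hash and the default value of the constant are assumptions taken from the example above.

```python
import math


def multiplication_hash(key: int, table_size: int, a: float = 0.618) -> int:
    """Multiply the key by a constant, keep the fractional part,
    scale it by the table size, and take the floor."""
    fractional = (key * a) % 1.0               # Steps 1-2: fractional part of key * A
    return math.floor(table_size * fractional)  # Step 3: floor of the scaled value


# Worked example from the text: key 25, table size 10, A = 0.618
print(multiplication_hash(25, 10))  # 4
```

Because the calculation uses floating-point numbers, the intermediate values may differ slightly from the hand-worked ones (for example 0.4499... instead of 0.45), but the floor gives the same index here.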
Folding Method
For large values, such as mobile numbers, the folding method is a good fit. Split the digits of the key into equal-sized segments and add the segments together. Then divide the sum by the table size and use the remainder as the index. Note that the index is not guaranteed to be unique, since different keys can produce the same sum.
Here’s how to do it:
- Key: 980564
- Table size: 20
Step 1: Break the key
980, 564
Step 2: Add the values
980 + 564 = 1544
Step 3: Divide the sum by the table size and take the remainder
1544 % 20 = 4
Step 4: Use the remainder as the index
h(980564) = 4
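The folding steps can be sketched in Python as follows. The function name folding_hash and the segment_len parameter are illustrative; this sketch assumes a non-negative integer key and lets the last segment be shorter when the digits do not divide evenly.

```python
def folding_hash(key: int, table_size: int, segment_len: int = 3) -> int:
    """Split the key's digits into segments, add them, and take the remainder."""
    digits = str(key)
    # Step 1: break the key into segments of segment_len digits
    segments = [
        int(digits[i:i + segment_len])
        for i in range(0, len(digits), segment_len)
    ]
    # Steps 2-4: add the segments and use the remainder as the index
    return sum(segments) % table_size


# Worked example from the text: 980 + 564 = 1544, and 1544 % 20 = 4
print(folding_hash(980564, 20))  # 4
```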
Mid-Square Method
This approach works well when the squares of the keys vary enough to keep collisions low. In the mid-square hashing technique, you square the key, extract the middle digits, and divide them by the table size. The remainder is the index for that key.
- Key: 57
- Table size: 10
Step 1: Find the square of the key
57 × 57 = 3249
Step 2: Take the middle digits
Middle digits of 3249 = “24”
Step 3: Divide the middle digits by the table size and take the remainder
24 % 10 = 4
Step 4: Use the remainder as the index
h(57) = 4
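A short Python sketch of the mid-square steps is shown below. The function name mid_square_hash and the num_digits parameter are assumptions made for illustration; how many middle digits to take, and how to centre them when the square has an odd number of digits, are design choices that vary between implementations.

```python
def mid_square_hash(key: int, table_size: int, num_digits: int = 2) -> int:
    """Square the key, take the middle digits, and reduce modulo the table size."""
    squared = str(key * key)                        # Step 1: square the key
    start = max(0, (len(squared) - num_digits) // 2)
    middle = int(squared[start:start + num_digits])  # Step 2: middle digits
    return middle % table_size                      # Steps 3-4: remainder is the index


# Worked example from the text: 57 * 57 = 3249, middle digits 24, 24 % 10 = 4
print(mid_square_hash(57, 10))  # 4
```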
How to Handle Collisions
A collision in hashing occurs when two or more keys are assigned the same index.
For example:
Key = 123 → 123 % 10 = 3
Key = 433 → 433 % 10 = 3
Both keys map to slot 3.
Even if you implement the most sophisticated hashing technique for large datasets, collisions can occur. There are multiple ways to handle them. The most common technique is chaining.
In chaining, each slot in the hash table has a list of entries. If multiple keys have the same index, they are chained together. For example: Index 3 → [123 → 433].
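Chaining can be sketched with a list of Python lists, one per slot. The class name ChainedHashTable and its methods are hypothetical, and the sketch assumes integer keys hashed with key % size as in the example above.

```python
class ChainedHashTable:
    """A toy hash table where each slot holds a list (chain) of keys."""

    def __init__(self, size: int = 10):
        self.size = size
        self.slots = [[] for _ in range(size)]  # one empty chain per slot

    def insert(self, key: int) -> None:
        chain = self.slots[key % self.size]
        if key not in chain:        # avoid storing duplicates
            chain.append(key)

    def contains(self, key: int) -> bool:
        # Only the chain at the key's index needs to be searched
        return key in self.slots[key % self.size]


table = ChainedHashTable(10)
table.insert(123)
table.insert(433)
print(table.slots[3])  # [123, 433]
```

A lookup only scans the chain at one index, so performance stays good as long as the chains remain short.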
Another option is open addressing, where every key is stored in the table itself: if the computed slot is already taken, you probe for the next empty slot and store the key there.
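One simple form of open addressing is linear probing, which checks the following slots one by one. The sketch below uses that variant as an assumption, since the text does not name a specific probing strategy. The class name LinearProbingTable is hypothetical, and deletion is left out to keep the example short.

```python
class LinearProbingTable:
    """A toy open-addressing table that probes the next slot on a collision."""

    def __init__(self, size: int = 10):
        self.size = size
        self.slots = [None] * size

    def insert(self, key: int) -> int:
        start = key % self.size
        for step in range(self.size):
            index = (start + step) % self.size   # wrap around at the end
            if self.slots[index] is None or self.slots[index] == key:
                self.slots[index] = key
                return index
        raise RuntimeError("hash table is full")

    def find(self, key: int):
        start = key % self.size
        for step in range(self.size):
            index = (start + step) % self.size
            if self.slots[index] is None:        # empty slot: key is absent
                return None
            if self.slots[index] == key:
                return index
        return None


probing = LinearProbingTable(10)
print(probing.insert(123))  # 3
print(probing.insert(433))  # 4 (slot 3 is taken, so the next slot is used)
```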
Why is Hashing Important in Data Structures?
From banking to schools, hashing is everywhere.
Here’s a look at some common hashing benefits.
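- Retrieve data in near-constant time on average, no matter how many items are stored.
- Works for many data types, including strings, objects, or complex keys, as long as a hash function can be defined for them.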
- Designed to handle large datasets efficiently.
- Used in compilers, databases, and caching systems.
Limitations
Here are some key hashing limitations:
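- Collisions happen, and handling them adds extra work to storage and lookup.
- Not suitable for range queries, because hashing does not preserve the order of the keys.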
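The range-query limitation can be illustrated with a small Python comparison. The key values below are made up for the example. With a hash-based set, finding every key between 30 and 60 means checking all keys, whereas a sorted list can locate the range boundaries directly with the standard bisect module.

```python
import bisect

keys = [57, 12, 98, 33, 71, 45]  # made-up sample keys

# Hash-based set: no ordering, so every key must be checked
hashed = set(keys)
in_range_scan = sorted(k for k in hashed if 30 <= k <= 60)

# Sorted list: binary search finds both ends of the range
ordered = sorted(keys)
lo = bisect.bisect_left(ordered, 30)
hi = bisect.bisect_right(ordered, 60)
in_range_sorted = ordered[lo:hi]

print(in_range_scan)    # [33, 45, 57]
print(in_range_sorted)  # [33, 45, 57]
```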
Conclusion
Without hashing, you would have to scan entire datasets or tables to find the exact information you are looking for. The right hashing technique lets you jump straight to the slot that contains your desired piece of data. This makes storage and retrieval efficient, especially for companies that work with large and complex databases.