LeetCode: Hash Table Algorithm Summary

VocabVictor · 2024-09-02 · 2024-09-04

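Introduction

A hash table is an efficient data structure that stores data as key-value pairs and uses a hash function to support lookup, insertion, and deletion in O(1) time on average. This article covers how hash tables work, how to implement one, where they are used, and the problem-solving techniques built on them.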

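1. Hash Table Fundamentals

1.1 Concept

A hash table is an array-based data structure: a hash function maps each key to an array index, so a value can be reached directly without scanning the other entries.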
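
As a minimal sketch of this idea, the snippet below maps string keys to slots of a fixed-size array. The name `bucket_index`, the polynomial hash with multiplier 31, and the table size of 8 are illustrative assumptions rather than anything prescribed here, and collisions are deliberately ignored.

```python
def bucket_index(key: str, size: int) -> int:
    """Map a string key to an index in range(size) with a simple polynomial hash."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % size
    return h


size = 8                      # assumed table size, for illustration only
slots = [None] * size         # the underlying array

# Store each pair at the index computed from its key (no collision handling).
for key, value in [("apple", 1), ("banana", 2)]:
    slots[bucket_index(key, size)] = (key, value)

# Lookup recomputes the index and reads that slot directly.
found = slots[bucket_index("banana", size)]
```

Reading `found` does not scan the array: the index is recomputed from the key, which is what makes access fast.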

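1.2 Hash Function

The hash function is the core of a hash table: it converts a key into an array index. A good hash function should:

Be fast to compute
Distribute keys uniformly across the indices
Keep collisions to a minimum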
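
To make the uniformity criterion concrete, the sketch below compares two toy hash functions on the same keys. Both functions, the key set `user0` to `user99`, and the table size of 17 are assumptions chosen for illustration; neither function is meant as a production hash.

```python
from collections import Counter


def first_char_hash(key: str, size: int) -> int:
    """Poor hash: looks only at the first character of the key."""
    return ord(key[0]) % size


def polynomial_hash(key: str, size: int) -> int:
    """Better hash: every character contributes to the result."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % size
    return h


def largest_bucket(hash_fn, keys, size):
    """Return the number of keys that land in the fullest bucket."""
    counts = Counter(hash_fn(k, size) for k in keys)
    return max(counts.values())


keys = [f"user{i}" for i in range(100)]
size = 17

worst_poor = largest_bucket(first_char_hash, keys, size)    # all keys start with "u"
worst_better = largest_bucket(polynomial_hash, keys, size)  # keys spread over many buckets
```

Because every key starts with the same letter, `first_char_hash` sends all 100 keys to one bucket, while the polynomial version spreads them out, so far fewer keys share a slot.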

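1.3 Handling Collisions

A collision occurs when two different keys hash to the same index. The common ways to resolve it are:

Chaining: each slot holds a list of all entries that hash to it
Open addressing: on a collision, probe other slots in the array until a free one is found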
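
Open addressing can be sketched as follows, using linear probing (one of several possible probing schemes). The class name `LinearProbingTable` is hypothetical, and the sketch assumes a fixed capacity with no deletion and no resizing.

```python
class LinearProbingTable:
    """Open addressing with linear probing; fixed capacity, no deletion or resizing."""

    _EMPTY = object()  # marker for a slot that has never been used

    def __init__(self, size=8):
        self.size = size
        self.keys = [self._EMPTY] * size
        self.values = [None] * size

    def _probe(self, key):
        """Yield slot indices starting at the key's home slot, wrapping around."""
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def insert(self, key, value):
        for i in self._probe(key):
            if self.keys[i] is self._EMPTY or self.keys[i] == key:
                self.keys[i] = key
                self.values[i] = value
                return
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            if self.keys[i] is self._EMPTY:
                break  # an empty slot ends the probe sequence: the key is absent
            if self.keys[i] == key:
                return self.values[i]
        raise KeyError(key)


table = LinearProbingTable(size=8)
table.insert(1, "a")
table.insert(9, "b")  # 9 % 8 == 1, so it collides with key 1 and moves to the next slot
```

Unlike chaining, every entry lives in the array itself, so a lookup follows the same probe sequence that the insertion used.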

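2. Hash Table Implementation in Python

Python's dictionary (dict) is itself a hash table. The following is a simple hash table class that resolves collisions by chaining: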

class HashTable:
    def __init__(self, size=100):
        self.size = size
        # One bucket (a list of [key, value] pairs) per slot: chaining
        self.table = [[] for _ in range(self.size)]

    def _hash(self, key):
        # Map the key's built-in hash to a bucket index in range(size)
        return hash(key) % self.size

    def insert(self, key, value):
        index = self._hash(key)
        for item in self.table[index]:
            if item[0] == key:
                item[1] = value  # key already present: overwrite its value
                return
        self.table[index].append([key, value])

    def get(self, key):
        index = self._hash(key)
        for item in self.table[index]:
            if item[0] == key:
                return item[1]
        raise KeyError(key)

    def remove(self, key):
        index = self._hash(key)
        for i, item in enumerate(self.table[index]):
            if item[0] == key:
                del self.table[index][i]
                return
        raise KeyError(key)

3. Application Scenarios for Hash Tables

Fast lookup and retrieval
Deduplication
Cache implementation
Counters
Database indexes
Password storage (salted cryptographic hashing, which applies hash functions rather than hash tables)
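
Two of the scenarios above, deduplication and counting, can be sketched with Python's built-in hash-based containers. The sample word list is made up for illustration.

```python
from collections import Counter

words = ["tea", "eat", "tea", "ate", "tea"]  # illustrative input

# Deduplication: dict keys are unique, and dict preserves first-seen order.
unique_words = list(dict.fromkeys(words))

# Counter: a dict subclass that maps each element to its number of occurrences.
counts = Counter(words)
most_common_word, most_common_count = counts.most_common(1)[0]
```

Both operations touch each element once and rely on average O(1) hash lookups, so they run in O(n) time overall.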

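4. Problem-Solving Techniques with Hash Tables

4.1 Two Sum

A hash table reduces the time complexity from O(n^2) for the brute-force pair search to O(n): for each number, check whether its complement (target - num) has already been seen.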

def two_sum(nums, target):
    hash_table = {}  # value -> index of the numbers seen so far
    for i, num in enumerate(nums):
        complement = target - num
        if complement in hash_table:
            return [hash_table[complement], i]
        hash_table[num] = i
    return []  # no pair adds up to target
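
For comparison, the O(n^2) baseline that the hash table replaces might look like the sketch below. The function name `two_sum_brute_force` is hypothetical; it checks every pair instead of remembering the values it has seen.

```python
def two_sum_brute_force(nums, target):
    """Check every pair (i, j) with i < j; O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []  # no pair adds up to target


result = two_sum_brute_force([2, 7, 11, 15], 9)
```

The hash table version trades O(n) extra memory for removing the inner loop.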

4.2 Group Anagrams

Use each string's sorted characters as the key, so that all anagrams of one another land in the same group.

from collections import defaultdict

def group_anagrams(strs):
    groups = defaultdict(list)
    for s in strs:
        key = ''.join(sorted(s))
        groups[key].append(s)
    return list(groups.values())
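
A possible variant, not part of the approach above, replaces the sorted-string key with a tuple of letter counts, which avoids sorting each string. The sketch assumes the input contains only lowercase letters a to z, and the function name `group_anagrams_by_count` is hypothetical.

```python
from collections import defaultdict


def group_anagrams_by_count(strs):
    """Group anagrams using a 26-entry letter-count tuple as the dictionary key."""
    groups = defaultdict(list)
    for s in strs:
        counts = [0] * 26
        for ch in s:
            counts[ord(ch) - ord("a")] += 1  # assumes ch is in 'a'..'z'
        groups[tuple(counts)].append(s)      # lists are unhashable, tuples are not
    return list(groups.values())


grouped = group_anagrams_by_count(["eat", "tea", "tan", "ate", "nat", "bat"])
```

Building the key costs O(k) per string of length k instead of O(k log k) for sorting, at the price of a longer key.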

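4.3 Longest Consecutive Sequence

Store the numbers in a hash set so that each membership test takes O(1) time on average, and start counting only from numbers that begin a sequence (those whose predecessor num - 1 is absent). Each number is then visited a constant number of times, so the whole scan runs in O(n).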

def longest_consecutive(nums):
    num_set = set(nums)
    longest = 0

    for num in num_set:
        if num - 1 not in num_set:  # only start counting at the beginning of a sequence
            current_num = num
            current_streak = 1

            while current_num + 1 in num_set:
                current_num += 1
                current_streak += 1

            longest = max(longest, current_streak)

    return longest
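
To see why the nested loops still amount to linear work, the sketch below instruments the same idea with a step counter. The function name `longest_consecutive_with_steps` and the `steps` counter are additions for illustration only.

```python
def longest_consecutive_with_steps(nums):
    """Return (longest run length, number of inner-loop extensions)."""
    num_set = set(nums)
    longest = 0
    steps = 0  # counts how often the inner while loop advances
    for num in num_set:
        if num - 1 not in num_set:  # num starts a run
            length = 1
            while num + length in num_set:
                length += 1
                steps += 1
            longest = max(longest, length)
    return longest, steps


longest, steps = longest_consecutive_with_steps([100, 4, 200, 1, 3, 2])
```

Each inner-loop step consumes a distinct number that is not the start of a run, so `steps` can never exceed the number of distinct values in the input.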

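4.4 LRU Cache

Combine a hash table with a doubly linked list to build an LRU cache whose get and put both run in O(1) time. The version below relies on Python's collections.OrderedDict, which keeps keys in order and provides move_to_end and popitem, rather than a hand-written linked list.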

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
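
For readers who want to see the hash table and doubly linked list spelled out, here is one possible hand-written sketch. The names `_Node` and `LinkedLRUCache` and the use of two sentinel nodes are design assumptions, not the only way to build it.

```python
class _Node:
    """Doubly linked list node holding one cache entry."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LinkedLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.nodes = {}       # key -> node, for O(1) lookup
        self.head = _Node()   # sentinel on the least recently used side
        self.tail = _Node()   # sentinel on the most recently used side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        """Remove node from the list in O(1)."""
        node.prev.next = node.next
        node.next.prev = node.prev

    def _append(self, node):
        """Insert node just before the tail sentinel (most recently used)."""
        last = self.tail.prev
        last.next = node
        node.prev = last
        node.next = self.tail
        self.tail.prev = node

    def get(self, key):
        node = self.nodes.get(key)
        if node is None:
            return -1
        self._unlink(node)
        self._append(node)    # mark as most recently used
        return node.value

    def put(self, key, value):
        node = self.nodes.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._append(node)
            return
        node = _Node(key, value)
        self.nodes[key] = node
        self._append(node)
        if len(self.nodes) > self.capacity:
            oldest = self.head.next   # least recently used entry
            self._unlink(oldest)
            del self.nodes[oldest.key]


cache = LinkedLRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
first = cache.get(1)   # key 1 becomes the most recently used
cache.put(3, 3)        # capacity exceeded: key 2 is evicted
```

The dictionary finds a node in O(1), and the linked list reorders or evicts it in O(1), which is why both operations stay constant time.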