Indexing¶
约 458 个字 -1 行代码 7 张图片 预计阅读时间 2 分钟
Outline:
- Basic Concepts
- Ordered indices
- B+ Tree Index
- B+ Tree File Organization
- B Tree Index Files
- Indices on Multiple Keys
- Indexing on a Flash
- Indexing in Main Memory
- Write Optimized Indices
- Log Structured Merge(LSM) Tree
- Buffer Tree
- Bitmap indices
Basic Concepts¶
Search Key - attribute to set of attributes used to look up records in a file.
An index file consists of records(called index entries) of the form
Search-key | pointer |
---|---|
Two basic kinds of indices:
- Ordered indices: search keys are stored in sorted order
- Hash indices: search keys are distributed uniformly across "buckets" using a "hash function".
Index Evaluation Metrics¶
- Access types supported efficiently.
- Point query: records with a specified value in the attribute
- Range query: or records with an attribute value falling in a specified range of values
支持点查以及范围查询。
- Access time
- Insertion time
- Deletion time
- Space overhead
Ordered Indices¶
- Primary index(主索引): in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
- Also called clustering index(聚集索引)
- The search key of a primary index is usually but not necessarily the primary key.
- Secondary index(辅助索引): an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index
- Index-sequential file(索引顺序文件): ordered sequential file with a primary index.
主索引只能有一个,其他都是辅助索引。
Example
Dense Index Files¶
Dense index(稠密索引) - index record appears for every search-key value in the file.
每一个 search-key 都要出现在文件里。


图一是以 ID 为主索引来搜索,图二是以 dept_name
为主索引来搜索。
Sparse Index Files¶
Sparse index(稀疏索引) - contains index records for only some search-key values.

Good tradeoff: sparse index with an index entry for every block in file, corresponding to least search-key value in the block.

Multilevel Index(多级索引)¶
对索引再次建立索引,类似 B+ 树。
B+ Tree Index Files¶
- All paths from root to the leaf are of the same length
- Inner node(not a root or a leaf) between \(\lceil n/2 \rceil\) and \(n\) children
- Leaf node between \(\lceil (n-1)/2 \rceil\) and \(n-1\) values
- Special cases:
- If the root is not a leaf: at least 2 children
- If the root is a leaf: between \(0\) and \(n-1\) values

数据库中B+ 树一般一个叶子结点就是一个块的大小,即 4KB.