AI Glossary

The complete dictionary of Artificial Intelligence

162 categories · 2,032 subcategories · 23,060 terms

Document Chunking

Process of segmenting large documents into smaller, coherent fragments to optimize their processing by language models and vector search systems.

Fixed-size Chunking

Segmentation strategy that divides documents into fragments of predefined size, based on a constant number of characters, words, or tokens.
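As a rough illustration of the definition above, a minimal sketch that uses characters as the unit. The function name `fixed_size_chunks` is hypothetical, and the same slicing would apply to a list of words or tokens.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive chunks of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the text in strides of `size`; the last chunk may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]


chunks = fixed_size_chunks("abcdefghij", 4)
print(chunks)  # ['abcd', 'efgh', 'ij']
```

Fixed boundaries ignore sentence and word limits, which is why the strategy is often combined with overlap or a structure-aware splitter.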

Semantic Chunking

Segmentation approach based on semantic understanding of content, creating fragments that preserve thematic and contextual coherence.

Recursive Character Splitting

Hierarchical segmentation method that divides documents according to a sequence of separators (paragraphs, sentences, words) until reaching the desired fragment size.
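One possible reading of this method is sketched below. The separator order (blank line, sentence end, space) and the name `recursive_split` are assumptions chosen for illustration. Library implementations differ in details such as whether separators are kept.

```python
def recursive_split(text: str, max_len: int,
                    separators: tuple[str, ...] = ("\n\n", ". ", " ")) -> list[str]:
    """Split on the coarsest separator first, recursing with finer
    separators only for pieces that are still longer than max_len."""
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard character cut.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        # Try to pack the piece into the chunk being built.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_len:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_len:
            current = piece
        else:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, max_len, finer))
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("aaa bbb\n\nccc ddd eee", 7))
# ['aaa bbb', 'ccc ddd', 'eee']
```

In this sketch, separators that fall on a chunk boundary are dropped, so the chunks cannot be concatenated back into the exact original text.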

Token-based Chunking

Segmentation strategy that measures fragment size in tokens, as counted by the target model's tokenizer, so that each fragment fits within the context limit of language models such as GPT or BERT.
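The idea can be sketched as follows. The regex-based `toy_tokenize` is only a stand-in (an assumption for this example): a real pipeline would count tokens with the tokenizer of the target model, whose token boundaries will differ.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    # Stand-in tokenizer (assumption): each word or punctuation mark is one token.
    return re.findall(r"\w+|[^\w\s]", text)


def token_chunks(text: str, max_tokens: int) -> list[list[str]]:
    """Group the token sequence into chunks of at most max_tokens tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = toy_tokenize(text)
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


for chunk in token_chunks("Chunking keeps prompts short, which matters.", 4):
    print(len(chunk), chunk)
```

Counting in tokens rather than characters means the size limit corresponds directly to what the model actually consumes.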

Overlapping Chunks

Technique in which adjacent fragments share a span of text, so that context cut at a fragment boundary is still present in the neighboring fragment, improving coherence during retrieval.

Hierarchical Chunking

Multi-level approach organizing fragments according to a hierarchical structure (chapters, sections, paragraphs) to enable contextual retrieval at different granularities.

Sliding Window Chunking

Method that slides a fixed-size window over the document with a defined step, creating sequential fragments with controlled overlap: when the step is smaller than the window, consecutive fragments share window minus step units.
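A minimal sketch of the window-and-step mechanics, using words as the unit. The name `sliding_window_chunks` and the choice to stop once a window reaches the end of the input are assumptions of this example.

```python
def sliding_window_chunks(words: list[str], window: int, step: int) -> list[list[str]]:
    """Slide a window of `window` words forward by `step` words at a time."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    chunks: list[list[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + window])
        if start + window >= len(words):
            break  # this window already covers the end of the input
    return chunks


words = "a b c d e f g".split()
print(sliding_window_chunks(words, window=4, step=2))
# [['a', 'b', 'c', 'd'], ['c', 'd', 'e', 'f'], ['e', 'f', 'g']]
```

With `window=4` and `step=2`, each fragment shares two words with the previous one. A step larger than the window would leave gaps between fragments.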

Markdown-aware Chunking

Segmentation strategy that respects the Markdown structure of documents, splitting only at logical boundaries such as headings, lists, and code blocks rather than in the middle of them.
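A simplified sketch of one such rule: start a new fragment at each heading, but never split inside a fenced code block. It handles only `#`-style headings and backtick fences, and the name `markdown_chunks` is hypothetical. Lists, tables, and other constructs are left out.

```python
import re

FENCE = "`" * 3  # the three-backtick marker that opens or closes a fenced code block


def markdown_chunks(markdown: str) -> list[str]:
    """Start a new chunk at each heading, except inside fenced code blocks."""
    chunks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # toggle on opening and closing fences
        is_heading = not in_fence and re.match(r"#{1,6}\s", line) is not None
        if is_heading and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


doc = "\n".join([
    "# Intro",
    "Some text.",
    FENCE + "python",
    "# a comment, not a heading",
    FENCE,
    "## Details",
    "More text.",
])
for chunk in markdown_chunks(doc):
    print(repr(chunk))
```

The `#` line inside the code block stays with the first fragment, because the fence state is tracked before the heading check.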

Context-aware Chunking

Advanced approach considering the global semantic context of the document to determine optimal breakpoints that preserve narrative coherence.

Embedding-based Chunking

Method using semantic embeddings to identify natural boundaries between thematically distinct segments in a document.
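The boundary detection can be sketched as below. The bag-of-words `toy_embed` is a deliberately crude stand-in for a real embedding model, and the threshold value is an arbitrary assumption. Only the overall shape (embed adjacent sentences, compare, cut where similarity drops) reflects the definition.

```python
import math
import re
from collections import Counter


def toy_embed(sentence: str) -> Counter:
    # Stand-in for a real embedding model (assumption): bag-of-words counts.
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def embedding_chunks(sentences: list[str], threshold: float) -> list[list[str]]:
    """Start a new chunk wherever adjacent sentences fall below the threshold."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        if chunks and cosine(toy_embed(chunks[-1][-1]), toy_embed(sentence)) >= threshold:
            chunks[-1].append(sentence)  # similar enough: same topic
        else:
            chunks.append([sentence])    # similarity dropped: new boundary
    return chunks


sentences = [
    "Cats sleep most of the day.",
    "Cats also sleep at night.",
    "Stock markets fell sharply.",
    "Markets fell again on Friday.",
]
for chunk in embedding_chunks(sentences, threshold=0.3):
    print(chunk)
```

Here the two sentences about cats end up in one fragment and the two about markets in another, because the similarity between the second and third sentence is zero.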

Hybrid Chunking Strategy

Combination of multiple segmentation techniques, such as semantic chunking with fixed size limits, to optimize both coherence and efficiency.

Dynamic Chunk Sizing

Adaptive approach adjusting fragment size based on information density and semantic complexity of each document section.

Metadata-enriched Chunking

Technique associating contextual metadata (position, parent title, hierarchical level) with each fragment to improve context retrieval and reconstruction.
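A small sketch of what such a record might look like, carrying the three metadata fields named above. The `Chunk` class, its field names, and the rule of one fragment per non-empty line under `#`-style headings are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    position: int      # index of the chunk within the document
    parent_title: str  # nearest heading above the chunk ("" if none)
    level: int         # depth of that heading (0 if none)


def chunk_with_metadata(lines: list[str]) -> list[Chunk]:
    """Emit one chunk per non-empty body line, tagged with its parent heading."""
    chunks: list[Chunk] = []
    title, level = "", 0
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # A heading updates the context for the chunks that follow it.
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped.lstrip("#").strip()
            continue
        chunks.append(Chunk(stripped, len(chunks), title, level))
    return chunks


doc_lines = ["# Guide", "Intro paragraph.", "## Setup", "Install the tool.", "Run it."]
for chunk in chunk_with_metadata(doc_lines):
    print(chunk)
```

At retrieval time, the stored title and position allow a fragment to be shown with its heading or stitched back together with its neighbors.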

Cross-document Chunking

Advanced strategy segmenting sets of related documents into coherent fragments preserving inter-document relationships for better global understanding.

Multi-level Chunking

Approach creating multiple levels of fragments (summaries, detailed sections, paragraphs) to enable flexible retrieval according to granularity needs.

Adaptive Chunking

System that dynamically selects or adjusts the segmentation strategy based on document type, domain, and observed usage patterns.
