6. Dataset transformations
- 6.1. Preprocessing data
- 6.2. Feature extraction
  - 6.2.1. Loading features from dicts
  - 6.2.2. Feature hashing
  - 6.2.3. Text feature extraction
    - 6.2.3.1. The Bag of Words representation
    - 6.2.3.2. Sparsity
    - 6.2.3.3. Common Vectorizer usage
    - 6.2.3.4. Tf–idf term weighting
    - 6.2.3.5. Applications and examples
    - 6.2.3.6. Limitations of the Bag of Words representation
    - 6.2.3.7. Vectorizing a large text corpus with the hashing trick
    - 6.2.3.8. Customizing the vectorizer classes
  - 6.2.4. Image feature extraction
- 6.3. Kernel Approximation
- 6.4. Random Projection
- 6.5. Pairwise metrics, Affinities and Kernels