From 841bc996a6cd22a53429264b1be65b2dead826ec Mon Sep 17 00:00:00 2001
From: George Nagib <77877978+GeorgeNagib@users.noreply.github.com>
Date: Thu, 7 Aug 2025 19:27:26 +0300
Subject: [PATCH] Update models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md (#8978)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Update models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md

Added overview of embedding models on Hugging Face

- Added a concise summary explaining what embeddings are and their common use cases
- Listed popular embedding models like MiniLM, GTE, E5, and BGE
- Included an official Hugging Face video on text embeddings
- Linked to the full list of embedding models available on Hugging Face Hub

* Fix formatting issues in Hugging Face embedding models content

- Add missing blank line after title (follows established pattern)
- Fix grammatical error: remove duplicated 'which' and em dash
- Maintain original content structure and meaning

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude

---------

Co-authored-by: Kamran Ahmed
Co-authored-by: Claude
---
 .../models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/data/roadmaps/ai-engineer/content/models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md b/src/data/roadmaps/ai-engineer/content/models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md
index 941c86818..b7c73e7a0 100644
--- a/src/data/roadmaps/ai-engineer/content/models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md
+++ b/src/data/roadmaps/ai-engineer/content/models-on-hugging-face@dLEg4IA3F5jgc44Bst9if.md
@@ -1 +1,7 @@
-# Models on Hugging Face
\ No newline at end of file
+# Models on Hugging Face
+
+Embedding models are used to convert raw data like text, code, or images into high-dimensional vectors that capture semantic meaning. These vector representations allow AI systems to compare, cluster, and retrieve information based on similarity rather than exact matches. Hugging Face provides a wide range of pretrained embedding models such as `all-MiniLM-L6-v2`, `gte-base`, `Qwen3-Embedding-8B` and `bge-base` which are commonly used for tasks like semantic search, recommendation systems, duplicate detection, and retrieval-augmented generation (RAG). These models can be accessed through libraries like transformers or sentence-transformers, making it easy to generate high-quality embeddings for both general-purpose and task-specific applications.
+
+Learn more from the following resources:
+- [@video@Hugging Face - Text embeddings & semantic search](https://www.youtube.com/watch?v=OATCgQtNX2o)
+- [@official@Hugging Face Embedding Models](https://huggingface.co/models?pipeline_tag=feature-extraction)
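
Note on the added paragraph: the idea of retrieving "based on similarity rather than exact matches" can be sketched with cosine similarity over embedding vectors. The snippet below is a minimal illustration, not part of the patch. The three-dimensional vectors, the example sentences, and the helper names (`cosine_similarity`, `rank_by_similarity`) are all made up for the example; real vectors would come from an embedding model such as `all-MiniLM-L6-v2` and would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy "embeddings", hand-written for illustration only.
# In practice these would be produced by an embedding model.
corpus = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.6, 0.6, 0.2],
    "Best pizza toppings": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query such as "forgot my login credentials".
query_vector = [0.85, 0.2, 0.05]

ranking = rank_by_similarity(query_vector, corpus)
for text, score in ranking:
    print(f"{score:.3f}  {text}")
```

The query shares no keywords with the documents, yet the two account-related sentences rank above the unrelated one because their vectors point in a similar direction. Swapping the hand-written vectors for model output leaves the ranking logic unchanged.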