Go and Wasm bindings for HF Tokenizers and Tiktoken
Run local LLMs like Gemma, Qwen, and LLaMA on Android for offline, private, real-time chat and question answering with LiteRT and ONNX Runtime.
⛔ [DEPRECATED] Adapt Transformer-based language models to new text domains
We use phonetics as a feature to build a joint semantic-phonetic embedding and improve neural machine translation between Chinese and Japanese. 🥳
This project leverages deep learning transformers to classify YouTube comments into six distinct emotions.
Tool to split text into semantic chunks.
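The description above does not say how chunk boundaries are chosen; the sketch below illustrates one common approach, splitting wherever the cosine similarity between adjacent sentence embeddings drops below a threshold. The model name and threshold are assumptions for illustration, not details of this tool.

```python
# Minimal semantic-chunking sketch (illustrative; not the tool's actual code).
# Assumes sentence-transformers is installed; the model name and threshold
# are arbitrary choices, not taken from the repository.
from sentence_transformers import SentenceTransformer
import numpy as np

def semantic_chunks(sentences, threshold=0.6):
    if not sentences:
        return []
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(sentences, normalize_embeddings=True)
    chunks, current = [], [sentences[0]]
    for prev, cur, sent in zip(embeddings, embeddings[1:], sentences[1:]):
        # Start a new chunk when adjacent sentences drift apart semantically.
        if float(np.dot(prev, cur)) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks

print(semantic_chunks([
    "Tokenizers split raw text into subword units.",
    "Byte-pair encoding merges frequent character pairs.",
    "Unrelated: the weather was pleasant yesterday.",
]))
```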
Natural Language Processing Practice — a hands‑on repository spanning the full spectrum of NLP, from classical algorithms to cutting‑edge large language models (LLMs). Built around the Hugging Face LLM Course, it’s enriched with practical notebooks on foundational libraries and advanced fine‑tuning workflows.
Automated glossary generation and QA assistant that extracts technical terms from text corpora using regex and Levenshtein clustering, tokenizes them with a custom BPE tokenizer, and generates definitions and examples with a local Ollama LLM, all accessible through a CLI.
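The project's source is not reproduced here; the sketch below illustrates only the Levenshtein-clustering step it mentions, using a plain dynamic-programming edit distance and a greedy grouping rule. The distance threshold and the greedy strategy are assumptions, not the project's documented behaviour.

```python
# Illustrative sketch of clustering extracted terms by edit distance.
def levenshtein(a: str, b: str) -> int:
    # Classic two-row dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cluster_terms(terms, max_dist=2):
    # Greedily assign each term to the first cluster whose representative
    # is within max_dist edits; otherwise start a new cluster.
    clusters = []
    for term in terms:
        for cluster in clusters:
            if levenshtein(term, cluster[0]) <= max_dist:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters

print(cluster_terms(["tokenizer", "tokeniser", "tokenizers", "quantization"]))
```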
Sixth Big Data Analytics assignment
Hugging Face Transformers offers a powerful framework for state-of-the-art NLP, with the Pipeline API for easy inference, fast tokenizers for efficient preprocessing, and quantization for optimized deployment.
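A minimal sketch of the Pipeline and tokenizer usage that description refers to; the model checkpoints are common defaults chosen for illustration, not part of the original text.

```python
# Requires: pip install transformers torch
from transformers import AutoTokenizer, pipeline

# Pipeline API: one-line inference for a standard task.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Fast tokenizers make preprocessing cheap."))

# Tokenizer: the same fast (Rust-backed) tokenization used inside the pipeline.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("Hugging Face Tokenizers", return_tensors="pt")
print(encoded["input_ids"].shape)
```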