Which Hebrew NLP model is best for general use?

DictaBERT and AlephBERT are the two most popular Hebrew BERT models as of 2026. Both are available on HuggingFace; DictaBERT is the actively maintained, higher-accuracy choice.

How do I handle nikud (diacritics) in tokenization?

Strip diacritics before processing: decompose the text with unicodedata.normalize, then remove the Hebrew combining marks (normalization alone does not remove them). Most models are trained on text without nikud.
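
The stripping step can be sketched as follows. This is a minimal example using only the standard library, not the skill's own implementation; the function name strip_nikud is hypothetical, and the choice to treat every nonspacing mark in U+0591 to U+05C7 (nikud plus cantillation marks) as removable is an assumption that suits typical model input.

```python
import unicodedata


def strip_nikud(text: str) -> str:
    """Remove Hebrew nikud and cantillation marks, keeping letters and punctuation."""
    # NFD splits precomposed presentation forms into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch
        for ch in decomposed
        # Drop only nonspacing marks (category "Mn") inside the Hebrew mark range.
        # Maqaf (U+05BE) and sof pasuq (U+05C3) are punctuation, so they survive.
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    ]
    # Recompose so that non-Hebrew text (e.g. accented Latin letters) is unchanged.
    return unicodedata.normalize("NFC", "".join(kept))


# "shalom" written with nikud: shin + shin dot + qamats, lamed, vav + holam, final mem
vocalized = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
print(strip_nikud(vocalized))  # prints the four bare letters: shin, lamed, vav, final mem
```

Filtering by Unicode category rather than deleting the whole U+0591 to U+05C7 range keeps the maqaf hyphen intact, which matters for compound expressions.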

How do I add nikud back to Hebrew text?

Diacritization is the reverse task: adding nikud back to unvocalized text. Dicta Nakdan is the standard tool, available as a HuggingFace char model (dicta-il/dictabert-large-char-menaked) or as a hosted API.

Do I need a GPU for Hebrew models?

A GPU is not required but is recommended for performance. BERT models run on CPU, but slowly; for large batches, use a GPU.
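
One way to act on this is to pick the device at runtime and feed the model in fixed-size batches. The sketch below is illustrative only: pick_device and batched are hypothetical helper names, PyTorch is assumed as the backend, and the code falls back to CPU when PyTorch is not installed or no CUDA device is visible.

```python
from typing import Iterator, List, Sequence


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


device = pick_device()
sentences = ["a", "b", "c", "d", "e"]  # placeholder inputs
for chunk in batched(sentences, 2):
    # A real pipeline would tokenize `chunk` and run the model on `device` here.
    print(device, len(chunk))
```

A suitable batch size depends on the model size and available memory, so it is best treated as a tunable parameter rather than a fixed value.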

Hebrew NLP Toolkit

Verified · 97/100

Guide developers in using Hebrew NLP models and tools including DictaLM, DictaBERT, AlephBERT, and ivrit.ai. Use when user asks about Hebrew text processing, Hebrew NLP, "ivrit", Hebrew tokenization, Hebrew NER, Hebrew sentiment analysis, Hebrew speech-to-text, or needs to process Hebrew language text programmatically. Covers model selection, preprocessing, and Hebrew-specific NLP challenges. Do NOT use for Arabic NLP (different tools) or general English NLP tasks.

The Problem

Natural language processing for Hebrew remains a significant technical challenge. Hebrew is a morphologically rich language with unvocalized script, making tasks like entity extraction, sentiment analysis, and part-of-speech tagging far more complex than in English. Available models are scattered and not always well documented.

skills-il Localization | 1,934 installs | 3,794 views

Rating: 5.0

Version 1.3.0 | License: MIT

Updated: July 11, 2026 | Tags: nlp, hebrew, dictalm, dictabert, ivrit-ai, machine-learning

How to use this skill

1. Click "Download ZIP" to download the skill files.
2. Open Claude Desktop and go to Customize > Skills.
3. Click "+" and select "Upload a skill", then upload the ZIP file.
4. Start a new conversation. The skill will activate automatically when relevant.

Developers can install via the command line (CLI):

npx skills-il add skills-il/localization@v1.3.0-hebrew-nlp-toolkit --skill hebrew-nlp-toolkit -a claude-code

When to Apply

When tokenizing and morphologically processing Hebrew text
When performing Named Entity Recognition (NER) on Hebrew to extract names, places, organizations
When using Hebrew BERT models like DictaBERT or AlephBERT for embeddings
When restoring nikud (diacritization) to unvocalized Hebrew text with Dicta Nakdan
When handling Hebrew morphology challenges (binyanim, inflections, smichut)

Try These Prompts

Sentiment analysis

How do I use DictaBERT for sentiment analysis of Hebrew texts? Provide a complete Python code example.

Named entity recognition

How do I perform named entity recognition (NER) in Hebrew using DictaBERT NER? I want to identify people, places, and organizations in text.

Nikud restoration

How do I restore nikud (diacritization) to unvocalized Hebrew text? Show me both the Dicta Nakdan model and the hosted API option.

Speech to text

What are the best ivrit.ai tools for Hebrew speech-to-text conversion? How do I integrate them into a Python project?

Frequently Asked Questions

Changelog

v1.3.0

Fixed the DictaBERT load snippet (base masked-LM via AutoModelForMaskedLM, not a randomly-initialised SequenceClassification head; points to dictabert-sentiment/ner for classification) and reframed the ivrit.ai 22K-hour corpus claim.

Jun 16, 2026

v1.2.0

Added nikud restoration, hosted Dicta APIs and alternative frameworks; corrected the DictaLM 3.0 base-model lineage and tech-report link.

May 14, 2026

v1.1.0

Corrected DictaLM 3.0 sizes (24B, 12B Nemotron, 1.7B), verified HuggingFace model IDs, fixed VRAM requirements, added Reference Links section, and included NeoDictaBERT bilingual embeddings.

Apr 14, 2026

Related Skills

Hebrew Document Generator

Verified·91

Author: skills-il

v1.7.1 · Popular · Top Rated

Generate professional Hebrew documents (PDF, DOCX/Word, and PPTX) with correct right-to-left layout, mixed Hebrew-and-English bidi handling, and proper Hebrew typography. Use whenever the output is a Hebrew or mixed Hebrew/English Word document, Hebrew PDF, or Hebrew PowerPoint, including phrasings like "Hebrew Word document", "Word document in Hebrew", "מסמך Word בעברית", "create a .docx in Hebrew", "lehafik heshbonit", and "litstor hozeh", or Israeli templates such as Heshbonit Mas (tax invoice), Hozeh (contract), Hatza'at Mechir (proposal), or Protokol (meeting minutes). ALSO use this for the symptom where a Hebrew document looks correct on screen or in Claude but comes out scrambled, reversed, or broken after export to Word, with English words, numbers, or punctuation landing on the wrong side, phrased as "Hebrew text reversed in Word", "my Hebrew Word file is broken", "fix Hebrew formatting in Word", or "the docx came out messed up"; the fix is regenerating the .docx with paragraph-level RTL/bidi, NOT a web/CSS RTL change. Prefer this over the generic docx or pdf skills ONLY when the document is Hebrew or right-to-left, because those do not set RTL/bidi and produce scrambled Hebrew with English words and punctuation in the wrong place; for English-only documents use the generic skill. Covers reportlab, WeasyPrint, python-docx, and pptxgenjs. Do NOT use for OCR or reading existing documents (use hebrew-ocr-forms instead).

Rating 4.0 | 4,190 installs | 11,671 views

Claude Code, Cursor, GitHub Copilot, +5 more

Hebrew Content Writer

Verified·91

Author: skills-il

v1.2.0 · Popular

Write and edit professional content in Hebrew including marketing copy, UX text, articles, emails, and social media posts. Use when user asks to write in Hebrew, "ktov b'ivrit", create Hebrew marketing content, edit Hebrew text, write Hebrew UX copy, or optimize Hebrew content for SEO. Covers grammar rules, register from formal to dugri, mixed Hebrew/English, gendered language, nikud and numerals, and Hebrew SEO best practices. Do NOT use for Hebrew NLP/ML tasks (use hebrew-nlp-toolkit) or translation (use a translation skill).

Rating 5.0 | 3,583 installs | 8,238 views

Claude Code, Cursor, GitHub Copilot, +5 more

Hebrew RTL Best Practices

Verified·90

Author: skills-il

v1.4.0 · Popular

Implement right-to-left (RTL) layouts for Hebrew web applications. Use when user asks about RTL layout, Hebrew text direction, bidirectional (bidi) text, Hebrew CSS, "right to left", or needs to build a Hebrew web UI. Covers CSS logical properties, the :dir() pseudo-class, Tailwind RTL, React/Next.js RTL setup, icon mirroring, Hebrew typography, and font selection. Do NOT use for Arabic RTL (similar but different typography) unless user explicitly asks for shared RTL patterns, or for native mobile RTL (React Native I18nManager, SwiftUI, Android) which is out of scope.

Rating 0.0 | 3,462 installs | 9,215 views

Claude Code, Cursor, GitHub Copilot, +5 more
