Keyword Density Checker
What it does
The Keyword Density Checker analyses pasted text or HTML and surfaces the most-frequent words and phrases — single tokens, two-word combinations, three-word phrases — with frequency counts and density percentages. Stop words and HTML tags are filtered out by default so what you see is the actual topical signal of the content, not “the the the and and the”.
Common situations
You have written an article targeting a specific keyword and want to verify the keyword and its variants are appearing naturally in the content. The 2-gram and 3-gram view shows whether your target phrase actually appears in the content in the form you’re trying to rank for, or whether you’ve talked around it without using the actual phrase.
A piece of content is performing poorly for the keyword you assumed it was optimised for. Run the density check — often the keyword is barely mentioned because the writer focused on synonyms or related terms instead. The page is technically about the topic but Google doesn’t see the canonical phrase.
You are auditing a competitor’s article that is outranking yours and want to understand their phrase use. Paste their content and look at the top 2-grams and 3-grams. The phrases they use consistently are likely the ones search engines are picking up as topical signals.
You are reviewing AI-generated content for unnatural keyword stuffing. Generative tools sometimes overuse the prompt’s keyword, producing density figures that look spammy. The checker surfaces the over-use; the editor cuts it down.
You are checking whether a page is actually about the topic it claims to be about. Run the density check; the top phrases should align with the page’s stated focus. Misalignment is a signal that either the title is wrong or the content has drifted from the topic.
What you need to know
Keyword density used to be a primary ranking signal in the early days of search. It is not one any more — modern search engines use much richer signals (semantic relevance, entity recognition, topical authority) that don’t reduce to “how often does the keyword appear in the text”. Density still matters, but as a sanity check rather than as a tunable optimisation lever.
The pragmatic uses:
Confirm the keyword is present. A page targeting “free SEO audit” should have “free SEO audit” appear several times — in the title, an H1 or H2, the introduction, naturally in the body. If the 3-gram view shows the phrase doesn’t appear at all, the page isn’t really about that keyword regardless of intent.
Check for over-use. Anything over 5% density for a single keyword reads as stuffing. Even 3-4% feels heavy in body content (less so in titles and headings). The natural rate for a target keyword in well-written content is usually 1-2%.
Spot accidental themes. When the density check shows an unexpected phrase appearing five times, the article has accidentally drifted toward a topic the writer didn’t plan for — usually a sign of unfocused writing or scope creep during drafting.
Compare with competitors. If competitors ranking for your keyword have density 1-2% and yours is 0.3%, the gap is real. If theirs is 0.5% and yours is 1%, you’re already at a higher density than what’s working — the gap is something else.
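The rule-of-thumb figures above can be written down as a small helper. This is only a sketch and not the checker's own logic: the function name `classify_density` and the label strings are hypothetical, and the bands the text does not name (below 1%, and between 2% and 3%) are labelled by assumption.

```python
def classify_density(density_pct: float) -> str:
    """Map a density percentage to a rough editorial label.

    Cut-offs restate the figures in the text (1-2% natural, 3-4% heavy,
    over 5% stuffing). Labels for the remaining bands are assumptions.
    """
    if density_pct <= 0:
        return "absent"
    if density_pct < 1:
        return "light"
    if density_pct <= 2:
        return "natural"
    if density_pct < 3:
        return "elevated"
    if density_pct <= 5:
        return "heavy"
    return "stuffing"


for pct in (0.3, 1.5, 4.0, 6.0):
    print(f"{pct}% -> {classify_density(pct)}")
```

The labels are meant for editorial triage only; as the surrounding text says, none of these bands is a ranking target.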
The density calculation:
The denominator is total tokens after stop-word stripping. So a 1,000-word piece with 700 stop words has 300 substantive tokens; a phrase appearing 6 times among them has 2% density. This is the right denominator — it reflects topical signal density rather than mechanical word frequency.
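A minimal sketch of that calculation follows, under stated assumptions: the stop-word list here is a tiny illustrative one (the checker's real list is not reproduced in this document), the names `content_tokens` and `density` are hypothetical, and phrases are matched after stop words have been removed.

```python
import string

# Tiny illustrative stop-word list (an assumption, not the checker's real list).
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "to", "in", "for", "on", "it"}


def content_tokens(text: str) -> list[str]:
    """Split on whitespace, trim surrounding punctuation, drop stop words."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def density(phrase: str, text: str) -> float:
    """Phrase occurrences divided by content-token count, as a percentage."""
    tokens = content_tokens(text)
    target = content_tokens(phrase)
    n = len(target)
    if not tokens or n == 0:
        return 0.0
    hits = sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)
    return 100.0 * hits / len(tokens)


# The worked example from the text: 1,000 words, 700 of them stop words,
# and a keyword that appears 6 times among the 300 remaining tokens.
sample = "the " * 700 + "audit " * 6 + "filler " * 294
print(len(content_tokens(sample)))  # 300
print(density("audit", sample))     # 2.0
```

Dividing by the raw word count instead would report 0.6% for the same text, which is why the choice of denominator matters when comparing figures from different tools.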
The 2-gram and 3-gram counts use overlapping windows: in “free SEO audit tool”, the 2-grams are “free SEO”, “SEO audit”, “audit tool”. A phrase like “search engine optimisation” appears once as a 3-gram but also contributes two 2-grams (“search engine” and “engine optimisation”). This is standard n-gram analysis. How search engines tokenize content internally is not public, so treat the counts as an approximation of what they see rather than an exact match.
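The overlapping-window idea can be sketched in a few lines. The function name `ngrams` is hypothetical and the input is assumed to be a list of tokens that has already been cleaned.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return every run of n consecutive tokens, using overlapping windows."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "free SEO audit tool".split()
print(ngrams(tokens, 2))  # ['free SEO', 'SEO audit', 'audit tool']
print(ngrams(tokens, 3))  # ['free SEO audit', 'SEO audit tool']

# Counting n-grams is then a matter of feeding the list to a Counter.
counts = Counter(ngrams("search engine optimisation".split(), 2))
print(counts["search engine"], counts["engine optimisation"])  # 1 1
```

A text with k tokens yields k - n + 1 windows of size n, so a text shorter than n tokens produces an empty list.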
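The stop-word list strips the most common English non-content words. For non-English content, the analyser still works as long as the language separates words with spaces (token splitting is whitespace-based), but the stop-word stripping removes only English stop words, so the noise floor in other languages will be higher. Languages written without spaces between words, such as Chinese or Japanese, will not split into meaningful tokens.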
Frequently asked questions
What’s a healthy keyword density?
Around 1-2% for the target keyword. Anything above 5% reads as stuffing. The historical advice “aim for X%” is largely obsolete — modern search engines don’t reward density tuning. Use the checker to verify the keyword is present, not to engineer a specific number.
Should I use exact-match keywords or variations?
Both. Modern search engines understand variations and synonyms — “free SEO audit”, “free SEO audit tool”, “SEO auditor”, “audit my SEO” are all related signals. The keyword you want to rank for should appear, but stuffing it at the expense of natural variations is a trade-off, not a win.
Why is the checker showing 1-grams that aren’t keywords?
Single words tend to be too generic to be useful targets. “SEO”, “audit”, “free” individually are not the keywords you’re trying to rank for; “free SEO audit” is. The 1-gram view is mostly diagnostic — checking whether weird non-content words are appearing too frequently — rather than the primary view for keyword analysis.
What’s the difference between 2-grams and 3-grams?
2-grams are two-word phrases (“SEO audit”), 3-grams are three-word phrases (“free SEO audit”). For most keyword analysis, 2-grams are most useful — that’s where actual keyword phrases live. 3-grams are useful for confirming long-tail target phrases.
Why are stop words filtered?
Stop words (“the”, “and”, “of”, “is”, etc.) appear with very high frequency in any English text and would dominate the density chart, hiding the actual topical signal. Search engines down-weight stop words heavily; filtering them gives a more accurate picture of what the page is actually about.
Does keyword density affect rankings?
Weakly. Modern search engines use richer signals (semantic relevance, BERT-era language understanding, entity recognition) that don’t reduce to keyword frequency. Density is a sanity check, not a tunable lever. Use it to verify keyword presence, not to engineer a specific number.
How accurate is the density calculation?
The arithmetic is exact: count the phrase, divide by the number of tokens left after stop-word stripping, multiply by 100. The interpretation is the soft part: the optimal density is fuzzy and varies by topic. Treat the percentage as approximate guidance.
Can I check density against multiple keywords at once?
The checker shows the top N most-frequent phrases — so if you’re targeting multiple keywords, paste the content and look for them in the output. There’s no per-keyword input field; the checker surfaces what’s there rather than what you’re hoping to find.
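If you keep a list of target phrases, you can cross-check them against the output yourself. The sketch below assumes you have copied the checker's results into a list of (phrase, count) pairs ordered by frequency. That format, the sample data, and the name `target_ranks` are all hypothetical.

```python
def target_ranks(ranked_phrases, targets):
    """Return the 1-based rank of each target phrase, or None if it is missing."""
    positions = {
        phrase.lower(): rank
        for rank, (phrase, _count) in enumerate(ranked_phrases, start=1)
    }
    return {target: positions.get(target.lower()) for target in targets}


# Made-up sample data standing in for the checker's top-N output.
top_phrases = [("seo audit", 9), ("free seo", 6), ("audit tool", 4)]
print(target_ranks(top_phrases, ["SEO audit", "site speed"]))
# {'SEO audit': 1, 'site speed': None}
```

A result of None means the phrase is outside the list you pasted in, not necessarily that it never appears in the content.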
Common problems
Problem: Target keyword doesn’t appear in the top 20 phrases.
Either the keyword genuinely isn’t appearing in the content, or the content is much longer than the keyword usage. If the content is 5,000 words and the keyword appears 3 times, density is 0.06% of the raw word count (somewhat higher once stop words are stripped from the denominator, but still well under 1%), far too low for the phrase to register as what the page is about. Add the keyword in natural context (title, headings, intro, conclusion).
Problem: Density is 4% but the page isn’t ranking.
Density alone doesn’t get pages ranked. Even at 4% — which is heavy — if the content is thin, the title is wrong, the page has no inbound links, or the SERP for that keyword is dominated by larger sites, density doesn’t compensate. Check the broader picture.
Problem: Top phrases are obviously irrelevant brand or navigational terms.
The HTML stripping removes tags but keeps their text, so navigation, footer, and sidebar text is being counted as body content. Either copy just the article body before pasting, or accept that the noise floor includes site-wide elements.
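When you have the page's HTML rather than clean text, one option is to drop site-wide elements before pasting. The sketch below uses Python's standard `html.parser` module. The class name `BodyTextExtractor`, the helper `article_text`, and the choice of tags to skip are assumptions, so adjust the tag list to your own template.

```python
from html.parser import HTMLParser

# Elements assumed to hold site-wide or non-visible content.
SKIP_TAGS = {"nav", "footer", "aside", "script", "style"}


class BodyTextExtractor(HTMLParser):
    """Collect text while ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def article_text(html: str) -> str:
    """Return the visible text of `html` with skipped elements removed."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    "<nav>Home Pricing</nav>"
    "<article><h1>Free SEO audit</h1><p>Run it today.</p></article>"
    "<footer>Acme Ltd</footer>"
)
print(article_text(page))  # Free SEO audit Run it today.
```

This relies on the template using semantic tags. Sites that build navigation out of plain div elements need a different filter, such as matching on class names.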
Problem: Stop word stripping is removing words that should count.
The stop word list is English-only and conservative. If the content is in another language, or if certain English words (“might”, “should”) are part of your topic, the stripping is removing useful signal. Toggle off stop word stripping if it’s causing more noise than benefit.
Problem: A 2-word brand name is being split into two 1-grams instead of staying together.
Tokenizers split on whitespace and treat words independently. Multi-word brand names show up in the 2-gram analysis as their full name; in the 1-gram view they appear as separate tokens. Use the 2-gram view for brand and entity tracking.
Tips
- Use the 2-gram and 3-gram views as the primary check. 1-grams are too generic to be useful for keyword analysis.
- Verify the target keyword is in the top 10 phrases. If it’s outside the top 20, the page isn’t really about that keyword.
- Density above 5% reads as stuffing to humans long before it triggers any ranking penalty. Write naturally; check density only to verify presence.
- Compare against competitor pages for context. Density is meaningful relative to what’s currently ranking, not as an absolute target.
- Run the check after editing as well as before. Editing for clarity and concision often shifts density unexpectedly.
Related tools in this suite
The Word Count tool gives the length context that density requires (4% of 100 words is only four mentions and tells you little; 4% of 5,000 words is 200 mentions and is a real pattern). The AEO Content Scorer goes deeper into content quality, scoring for AI-extractability, which gives a complementary lens on the same content.
What this looks like at scale
For a single page, the density check is fine. For a content set, density should be checked at editorial-review time, not as a one-off audit — the cost of fixing density issues at draft stage is minutes, the cost of fixing them post-publish is the redirect cycle. The WP Beacon Plugin does not run density analysis (it’s pre-publish work, not ongoing monitoring).
Take it further
If you are inheriting a content set where keyword targeting has drifted from intent across many pages, the right scope is a content audit rather than per-page density tuning. Start a conversation about how to audit and remediate at scale.