Auditing advanced indexing and serving rules

Sitebulb's Indexability report puts the most important insights at the top of the page, so you can easily explore pages that use noindex or nofollow:

Indexability Report

However, Sitebulb also detects a set of other more specific indexing and serving rules, which you will find tabulated further down the Indexability Overview, and also picked out as 'Insights' in the Indexability Hints.

The additional directives Sitebulb extracts are as follows:

data-nosnippet

This allows you to designate textual parts of an HTML page not to be used as a snippet. This can be done on an HTML-element level with the data-nosnippet HTML attribute on span, div, and section elements. The data-nosnippet is considered a boolean attribute. As with all boolean attributes, any value specified is ignored. To ensure machine-readability, the HTML section must be valid HTML and all appropriate tags must be closed accordingly. Read more on Google's documentation here.
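As a rough illustration of the rule above, the following Python sketch scans an HTML fragment for elements carrying data-nosnippet and flags whether each one is a span, div, or section. The names NoSnippetFinder and find_nosnippet and the sample markup are hypothetical, and this is not how Sitebulb or Google implement detection; it only mirrors the behavior described here, including the point that any attribute value is ignored.

```python
from html.parser import HTMLParser

# Elements on which data-nosnippet is honored, per the description above.
SUPPORTED_TAGS = {"span", "div", "section"}


class NoSnippetFinder(HTMLParser):
    """Collects (tag, supported) pairs for every element carrying data-nosnippet."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a bare boolean attribute has value None.
        # Only the presence of the attribute matters, so the value is never inspected.
        names = {name for name, _ in attrs}
        if "data-nosnippet" in names:
            self.found.append((tag, tag in SUPPORTED_TAGS))


def find_nosnippet(html):
    """Hypothetical helper: return every data-nosnippet element found in the markup."""
    finder = NoSnippetFinder()
    finder.feed(html)
    return finder.found


sample = (
    "<p>Visible in snippets. "
    "<span data-nosnippet>Hidden from snippets.</span></p>"
    '<div data-nosnippet="false">Also hidden: the value is ignored.</div>'
    "<p data-nosnippet>Attribute on an unsupported element.</p>"
)
print(find_nosnippet(sample))
```

The third element in the sample is reported as unsupported because the attribute sits on a p element rather than a span, div, or section.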

nocache

This gives you some control over which content is used to train Microsoft's generative AI models. For content in the Bing Index that is labeled nocache, only URLs, titles, and snippets may be used in training Microsoft’s generative AI foundation models (i.e. not the main body content on the page). Content with the nocache tag may be included in Bing Chat answers, but the answers will only display the URL, snippet, and title. Read more on Bing's blog post and the Search Engine Land announcement.

noarchive

Similar to nocache above, noarchive gives you control over which content is used to train Microsoft's generative AI models. Content tagged noarchive will not be included in Bing Chat answers, and will not be linked to in the answers. Content in the Bing Index that is labeled noarchive will not be used for training Microsoft's generative AI foundation models. If content has both nocache and noarchive tags, Bing will treat it as nocache. Read more on Bing's blog post and the Search Engine Land announcement.

On Google, noarchive acts as an instruction to not show a cached link in search results. If you don't specify this rule, Google may generate a cached page and users may access it through the search results. Read more on Google's documentation here.
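The Bing rules for nocache and noarchive described above, including the precedence when both are present, can be sketched in Python as follows. The function name bing_ai_usage and the three return labels are hypothetical and chosen only for illustration; the logic simply restates the behavior described in this section and is not Bing's own implementation.

```python
def bing_ai_usage(directives):
    """Hypothetical helper: classify how Bing may use a page for generative AI.

    `directives` is an iterable of robots directive strings found on the page.
    """
    tokens = {d.strip().lower() for d in directives}
    if "nocache" in tokens:
        # nocache wins when both tags are present: only URL, title and snippet may be used.
        return "url-title-snippet-only"
    if "noarchive" in tokens:
        # noarchive alone: not used for training, not included or linked in chat answers.
        return "excluded"
    # Neither directive is present, so no restriction has been declared.
    return "no-restriction-declared"


print(bing_ai_usage(["noarchive"]))
print(bing_ai_usage(["nocache", "noarchive"]))
```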

nositelinkssearchbox

This instructs Google to not show a sitelinks search box in the search results for this page. If you don't specify this rule, Google may generate a search box specific to your site in search results, along with other direct links to your site. Read more on Google's documentation here.

nosnippet

This instructs Google to not show a text snippet or video preview in the search results for this page. A static image thumbnail (if available) may still be visible, when it results in a better user experience. This applies to all forms of search results (at Google: web search, Google Images, Discover).

If you don't specify this rule, Google may generate a text snippet and video preview based on information found on the page. Read more on Google's documentation here.

indexifembedded

This tells Google that it is allowed to index the content of a page if it's embedded in another page through iframes or similar HTML tags, in spite of a noindex rule.

indexifembedded only has an effect if it's accompanied by noindex. Read more on Google's documentation here.
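A minimal Python sketch of that dependency is shown below: it reports whether indexifembedded would have any effect given the content value of a robots meta tag. The helper names parse_robots_content and indexifembedded_is_effective are hypothetical, and the sketch assumes directives are supplied as a single comma-separated string.

```python
def parse_robots_content(content):
    """Split a comma-separated robots content value into lowercase directive tokens."""
    return [token.strip().lower() for token in content.split(",") if token.strip()]


def indexifembedded_is_effective(content):
    """Return True only when indexifembedded appears together with noindex."""
    tokens = parse_robots_content(content)
    return "indexifembedded" in tokens and "noindex" in tokens


print(indexifembedded_is_effective("noindex, indexifembedded"))
print(indexifembedded_is_effective("indexifembedded"))
```

An audit could use a check like this to flag pages where indexifembedded appears on its own and therefore does nothing.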

max-snippet

This instructs Google to use a maximum of [number] characters as a textual snippet for this search result. (Note that a URL may appear as multiple search results within a search results page.) This does not affect image or video previews. This rule is ignored if no parseable [number] is specified. Read more on Google's documentation here.
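To illustrate the "ignored if no parseable [number]" behavior, here is a small Python sketch that extracts the max-snippet value from a robots content string and returns None when the number is missing or malformed. The function name parse_max_snippet is hypothetical, and the parsing is deliberately simple; Google's own parser may be more lenient or stricter. Negative integers are accepted because Google's documentation describes -1 as a special value meaning no length limit.

```python
import re


def parse_max_snippet(content):
    """Hypothetical helper: return the max-snippet value as an int, or None if unusable."""
    for token in content.split(","):
        name, sep, value = token.strip().partition(":")
        if name.strip().lower() == "max-snippet" and sep:
            value = value.strip()
            # Accept an optional leading minus sign followed by digits only.
            if re.fullmatch(r"-?\d+", value):
                return int(value)
    # No max-snippet rule, or its number could not be parsed: treat the rule as ignored.
    return None


print(parse_max_snippet("max-snippet:50, max-image-preview:large"))
print(parse_max_snippet("max-snippet:abc"))
```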

max-image-preview

This allows you to set the maximum size of an image preview for this page in search results.

If you don't specify the max-image-preview rule, Google may show an image preview of the default size. Read more on Google's documentation here.

max-video-preview

This allows you to set a maximum of [number] seconds as a video snippet for videos on this page in search results.

If you don't specify the max-video-preview rule, Google may show a video snippet in search results, and you leave it up to Google to decide how long the preview may be. Read more on Google's documentation here.

notranslate

This instructs Google not to offer translation of this page in search results. If you don't specify this rule, Google may provide a translation of the title link and snippet of a search result for results that aren't in the language of the search query. If the user clicks the translated title link, all further user interaction with the page is through Google Translate, which will automatically translate any links followed. Read more on Google's documentation here.

noimageindex

This instructs Google not to index images on this page. If you don't specify this value, images on the page may be indexed and shown in search results. Read more on Google's documentation here.

unavailable_after

This instructs Google not to show this page in search results after the specified date/time. The rule is ignored if no valid date/time is specified. By default there is no expiration date for content. If you don't specify this rule, this page may be shown in search results indefinitely. Googlebot will decrease the crawl rate of the URL considerably after the specified date and time. Read more on Google's documentation here.
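As a hedged sketch of how an audit might validate this rule, the Python below tries to parse an unavailable_after value and reports whether the date has passed. The function names parse_unavailable_after and is_expired are hypothetical. The sketch assumes the value is written either in an ISO 8601 form that Python's datetime.fromisoformat accepts or in an RFC 822 style date, and it assumes UTC when no timezone is given; Google may accept other formats and may resolve timezones differently.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_unavailable_after(value):
    """Hypothetical helper: return an aware datetime, or None if the value is not parseable."""
    value = value.strip()
    try:
        parsed = datetime.fromisoformat(value)  # e.g. 2025-12-31 or 2025-12-31T23:59:59+00:00
    except ValueError:
        try:
            parsed = parsedate_to_datetime(value)  # e.g. Wed, 31 Dec 2025 23:59:59 +0000
        except (TypeError, ValueError):
            return None  # no valid date/time: the rule would be ignored
    if parsed.tzinfo is None:
        # Assumption made for this sketch only: treat values without a timezone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_expired(value, now):
    """Return True when `now` is later than a valid unavailable_after date/time."""
    parsed = parse_unavailable_after(value)
    return parsed is not None and now > parsed


check_time = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(is_expired("2025-12-31", check_time))
print(is_expired("not a date", check_time))
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check easy to test.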