In Silktide, scores are meant to represent how good or bad an area is, in a way that feels naturally fair and useful. Approximately speaking, 0% is terrible, 100% is perfect, and 50% is usually average.
Most scoring is based on the number of issues and the number of pages affected, but the specifics vary per check.
Each check is calibrated against real-world data for what is "good" or "bad". For example, how many spelling issues would be needed for a website to score 0% for spelling? The answer to questions like these is based on lots of testing and research.
How checks are scored
A Silktide website report is made up of many separate tests or checks, for example the Spelling or Broken links checks.
These individual check scores feed into the scores of broader categories, for example Content or Accessibility.
In general, single checks are scored based on a mix of the following:
The percentage of pages in a report that have an issue present
The average number of issues per page
Some checks don’t care about the average number of issues, because some issues can only occur once per page.
For example, the check that looks for a missing Skip-to-content link can only pass or fail once for a page, so we only count the number of pages in this case, and not the number of issues.
Some checks don’t apply to pages but do apply to other entities, for example the number of files or PDFs, or another criterion. In these cases, the same principle applies, but the check is graded against that other measure of volume.
To turn these counts into a score, Silktide has a target "worst" value for each. For example:
The worst "average issues per page" for the "Avoid Flash animation" check is 0.1, meaning that if 1 in 10 pages had Flash animation on it, Silktide would score that website 0% for that check.
The worst number of broken links per page is 4. This means, if you have an average of 4 broken links per page, you’d score 0% for Broken Links.
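As a rough illustration of how a "worst" value could turn a count into a score, here is a minimal sketch. The article only states that reaching the worst value scores 0%; the straight-line fall-off between 100% and 0% is an assumption, and `check_score` is a hypothetical name, not part of Silktide.

```python
def check_score(average_issues: float, worst_average: float) -> float:
    """Map an average issue count to a 0-100 score.

    Assumption: the score falls linearly from 100 (no issues) to 0
    (at or beyond the "worst" value). The real curve may differ.
    """
    if worst_average <= 0:
        raise ValueError("worst_average must be positive")
    # Clamp to [0, 1] so anything at or beyond the worst value scores 0.
    fraction_of_worst = min(max(average_issues / worst_average, 0.0), 1.0)
    return round(100.0 * (1.0 - fraction_of_worst), 1)


print(check_score(4, 4))      # Broken links at the worst value: 0.0
print(check_score(1, 4))      # One broken link per page on average: 75.0
print(check_score(0.1, 0.1))  # Flash on 1 in 10 pages: 0.0
```

Under this assumption, a site averaging one broken link per page would land a quarter of the way towards the worst value.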
Some checks view specific issues with different weights, so a particular issue could be more or less impactful. For example:
Spelling considers an "incorrect case" issue (for example, writing "Linkedin" instead of "LinkedIn") to be 20% as impactful to the overall score as a definitely incorrect spelling such as "suberb".
Broken links considers a "potentially broken link" to be 40% as impactful on the score as a "definitively broken link".
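The weights above can be sketched as a weighted issue count. The 20% and 40% figures come from the examples above; the dictionary keys and the `weighted_issue_count` helper are hypothetical names chosen for illustration.

```python
from typing import Mapping

# Weights relative to a full-impact issue (1.0), taken from the examples above.
SPELLING_WEIGHTS = {"misspelling": 1.0, "incorrect_case": 0.2}
BROKEN_LINK_WEIGHTS = {"definitely_broken": 1.0, "potentially_broken": 0.4}


def weighted_issue_count(
    counts: Mapping[str, int], weights: Mapping[str, float]
) -> float:
    """Sum issue counts, scaling each issue type by its weight."""
    return sum(weights[issue_type] * count for issue_type, count in counts.items())


# Hypothetical page: 3 misspellings and 5 incorrect-case issues.
spelling_total = weighted_issue_count(
    {"misspelling": 3, "incorrect_case": 5}, SPELLING_WEIGHTS
)
print(round(spelling_total, 2))
```

In this sketch, five incorrect-case issues count the same as one definite misspelling.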
Each check is calibrated based on how common and/or impactful an issue is. For example, a single page not working on mobile is more harmful to a score than a single spelling issue. We conduct extensive research on many websites to calibrate our results.
Note: the Web Vitals checks are a special case. Web Vitals scores are copied from Google's Lighthouse scoring system. Silktide does not modify these scores.
How categories are scored
For category scoring (groups of checks), such as Content or Accessibility:
Each category starts by scoring a perfect 100.
Each check can subtract from that score based on its "maximum impact", meaning the maximum percentage it is allowed to reduce a category score by.
A category score cannot go below zero.
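The three rules above can be sketched as follows. How much of its maximum impact a check actually subtracts is not specified here; this sketch assumes the deduction is proportional to how far the check score is below 100%. The `category_score` name and the example figures are hypothetical.

```python
from typing import Iterable, Tuple


def category_score(checks: Iterable[Tuple[float, float]]) -> float:
    """Compute a category score from (check_score, max_impact) pairs.

    Assumption: a check scoring 0% subtracts its full max_impact, a check
    scoring 100% subtracts nothing, and values in between scale linearly.
    """
    score = 100.0  # Each category starts at a perfect 100.
    for check_score, max_impact in checks:
        score -= max_impact * (1.0 - check_score / 100.0)
    return max(score, 0.0)  # A category score cannot go below zero.


# Hypothetical category: one perfect check, one at 50%, one at 0%.
print(category_score([(100, 30), (50, 20), (0, 40)]))  # 50.0
# Deductions exceeding 100 are floored at zero.
print(category_score([(0, 80), (0, 60)]))  # 0.0
```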
Categories can be nested inside each other. In this case, a category can be a weighted average of the categories inside it.
For example, Accessibility is weighted as follows:
45% = WCAG 2.1 Level A
40% = WCAG 2.1 Level AA
15% = WCAG 2.1 Level AAA
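As a worked example of the weighting above, the sketch below combines three sub-category scores into one Accessibility score. The weights come from the list above; the sub-category scores of 80, 60 and 40 and the `nested_category_score` helper are made up for illustration.

```python
from typing import Mapping

ACCESSIBILITY_WEIGHTS = {
    "WCAG 2.1 Level A": 0.45,
    "WCAG 2.1 Level AA": 0.40,
    "WCAG 2.1 Level AAA": 0.15,
}


def nested_category_score(
    sub_scores: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Weighted average of sub-category scores."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * sub_scores[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical sub-category scores.
example_scores = {
    "WCAG 2.1 Level A": 80,
    "WCAG 2.1 Level AA": 60,
    "WCAG 2.1 Level AAA": 40,
}
# 0.45 * 80 + 0.40 * 60 + 0.15 * 40 = 36 + 24 + 6 = 66
print(round(nested_category_score(example_scores, ACCESSIBILITY_WEIGHTS), 1))
```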
The weight of a PDF or HTML check is based on the number of PDFs or pages inside a report. If a website has very few PDFs, then PDF checks will have a very small impact on scoring, and vice versa.
Category scoring bands
0 – 29 = Very poor
30 – 49 = Poor
50 – 74 = Fair
75 – 84 = Good
85 – 94 = Great
95 – 100 = Excellent
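The bands above can be looked up with a small helper. This is a minimal sketch; `score_band` is a hypothetical name, and treating a fractional score such as 29.5 as belonging to the lower band is an assumption, since the bands are listed for whole numbers only.

```python
from bisect import bisect_right

# Lower bound of every band except the first, in ascending order.
BAND_LOWER_BOUNDS = [30, 50, 75, 85, 95]
BAND_NAMES = ["Very poor", "Poor", "Fair", "Good", "Great", "Excellent"]


def score_band(score: float) -> str:
    """Return the band name for a category score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right counts how many lower bounds the score has reached.
    return BAND_NAMES[bisect_right(BAND_LOWER_BOUNDS, score)]


print(score_band(29))   # Very poor
print(score_band(75))   # Good
print(score_band(100))  # Excellent
```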
Why not use weighted average scoring?
With a weighted average, adding new checks has a counterintuitive effect: category scores go up.
For example, upgrading from WCAG 2.0 to 2.1 would mean more accessibility checks, which sounds like it should mean lower scores. However, as most webpages tend to pass most checks, pages would then tend to pass a higher percentage of these checks and therefore score higher under a more demanding standard.
This counterintuitive result is avoided with Silktide’s approach.
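To make the contrast concrete, here is a small sketch with made-up numbers. It assumes equally weighted pass/fail checks for the average, and a fixed maximum impact of 10 points per failed check for the subtractive approach. Neither figure comes from Silktide, and both function names are hypothetical.

```python
def average_score(passed: int, total: int) -> float:
    """Equal-weight average of pass/fail checks, as a percentage."""
    return 100.0 * passed / total


def subtractive_score(failed: int, impact_per_check: float = 10.0) -> float:
    """Start at 100 and subtract a fixed impact per failed check."""
    return max(100.0 - failed * impact_per_check, 0.0)


# Before: 10 checks, 2 failed. After: 10 more checks are added and
# the site fails 1 of them, so 20 checks with 3 failed in total.
print(average_score(8, 10), average_score(17, 20))  # 80.0 85.0
print(subtractive_score(2), subtractive_score(3))   # 80.0 70.0
```

In this example the average rises from 80 to 85 even though the site now fails more checks, while the subtractive score falls from 80 to 70.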