How Scoring Works
Every PromptScore is calculated from five weighted dimensions that together measure how effectively someone uses AI to accomplish a task. No black boxes: here is exactly what we measure and why.
Our methodology is based on research into AI-assisted productivity from Harvard Business School, Wharton, and enterprise prompting benchmarks.
Score Scale
Dimension Weights
Weights are calibrated so that efficient, high-quality first prompts score highest. This reflects real-world productivity — the best AI users get great results fast.
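Mechanically, a weighted blend like this can be sketched as a weighted average of per-dimension scores. The weight values below are illustrative placeholders, not PromptScore's actual calibration, which is not stated in numbers here.

```python
# Hypothetical weights for illustration only -- NOT the product's published calibration.
HYPOTHETICAL_WEIGHTS = {
    "prompt_quality": 0.30,
    "efficiency": 0.20,
    "speed": 0.15,
    "response_quality": 0.25,
    "iteration_intelligence": 0.10,
}

def prompt_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of the five dimension scores, each on a 0-100 scale."""
    assert abs(sum(HYPOTHETICAL_WEIGHTS.values()) - 1.0) < 1e-9
    return sum(w * dimension_scores[d] for d, w in HYPOTHETICAL_WEIGHTS.items())

# Example: strong prompts and responses outweigh a middling speed score.
score = prompt_score({
    "prompt_quality": 90, "efficiency": 80, "speed": 70,
    "response_quality": 85, "iteration_intelligence": 60,
})
```

Because the weights sum to 1, the blended score stays on the same 0-100 scale as the individual dimensions.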
The Five Dimensions
Prompt Quality
How well-constructed are your prompts? We analyze clarity, specificity, structure, formatting instructions, constraints, and context-setting.
High Score Looks Like
- Clear, structured instructions with numbered steps
- Explicit constraints (word count, tone, what to avoid)
- Role/persona setting for the AI
- Audience awareness baked into the prompt
Low Score Looks Like
- Vague, one-line prompts with no structure
- No constraints or formatting guidance
- Missing context about who the output is for
- Copy-pasting the same prompt repeatedly
Anti-Gaming
We analyze linguistic patterns, not just length. A 500-word prompt full of filler scores lower than a precise 100-word prompt with clear structure.
Efficiency
How economically do you use your resources? Measured by attempts used vs. allowed and tokens consumed vs. budget.
High Score Looks Like
- Achieving the goal in 1-2 attempts
- Using less than 50% of the token budget
- Getting it right the first time
Low Score Looks Like
- Using all available attempts
- Burning through the entire token budget
- Repeating similar prompts without meaningful changes
Anti-Gaming
Using fewer attempts only helps if the output quality is good. A single bad prompt scores lower than two well-crafted iterations.
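The two ratios this dimension describes, attempts used versus allowed and tokens consumed versus budget, can be sketched as follows. The linear curve and the equal split between the two sub-scores are assumptions for illustration.

```python
def efficiency_score(attempts_used: int, attempts_allowed: int,
                     tokens_used: int, token_budget: int) -> int:
    """Illustrative sketch: lower consumption of attempts and tokens
    yields a higher score. The linear falloff and 50/50 split between
    the two ratios are assumptions, not the product's actual curve."""
    attempt_ratio = attempts_used / attempts_allowed
    token_ratio = tokens_used / token_budget
    # Average the two consumption ratios, then invert: using less scores more.
    return round(100 * (1 - (attempt_ratio + token_ratio) / 2))

# Example: one of three attempts, 400 of a 2000-token budget.
score = efficiency_score(1, 3, 400, 2000)
```

Note that this sketch does not yet apply the anti-gaming rule above; in practice a low-attempt run with poor output quality would be discounted.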
Speed
How quickly do you complete the task? Faster completion (with quality maintained) indicates confidence and fluency with AI tools.
High Score Looks Like
- Completing in 20-50% of the allotted time
- Quick, decisive prompting without long pauses
- Finishing with significant time remaining
Low Score Looks Like
- Using 90-100% of available time
- Long pauses suggesting uncertainty
- Running out the clock
Anti-Gaming
Completing in under 15% of the time triggers a review flag. Suspiciously fast completions are capped to prevent gaming.
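The 15% review flag described above can be sketched as a cap on the speed score. The linear time curve and the capped value are assumptions for illustration; only the 15% threshold comes from the text.

```python
FAST_FLAG_FRACTION = 0.15  # from the text: under 15% of allotted time triggers review

def speed_score(elapsed_seconds: float, allotted_seconds: float) -> tuple[int, bool]:
    """Illustrative sketch: faster completion scores higher, but suspiciously
    fast completions are capped and flagged for review. The cap value (75)
    and the linear curve are assumptions."""
    fraction = elapsed_seconds / allotted_seconds
    if fraction < FAST_FLAG_FRACTION:
        return 75, True  # capped, flagged for human review
    # Linear falloff from 100 (instant) down to 0 (full time used).
    return round(100 * (1 - fraction)), False

# Example: finishing at 30% of the allotted time, comfortably above the flag line.
score, flagged = speed_score(30, 100)
```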
Response Quality
How good is the AI output you elicited? We evaluate the final response against the task requirements, expected keywords, structure, and constraints.
High Score Looks Like
- Response covers all required elements
- Proper structure (headings, lists, sections as needed)
- Matches the expected tone and audience
- Contains relevant domain-specific content
Low Score Looks Like
- Response misses key requirements
- No structure or formatting
- Wrong tone for the audience
- Generic output that could apply to any task
Anti-Gaming
We evaluate the best (final) response, not just the first. This rewards smart iteration — improving your output across attempts.
Iteration Intelligence
When you iterate, do you improve? We track whether subsequent prompts build on AI feedback, introduce new requirements, and produce better results.
High Score Looks Like
- Each prompt meaningfully different from the last
- Referencing AI output ('change X to Y', 'instead of...')
- Introducing new vocabulary and requirements
- Responses improving in quality across attempts
Low Score Looks Like
- Repeating the same prompt verbatim
- Random changes without clear direction
- No reference to what the AI previously produced
- Response quality staying flat or declining
Anti-Gaming
Single-attempt completions receive a neutral score (60) for this dimension — you're not penalized for getting it right the first time.
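The neutral-60 rule and the reward for meaningfully different prompts can be sketched as follows. Only the single-attempt score of 60 comes from the text; the word-overlap heuristic is an illustrative stand-in for the real linguistic analysis.

```python
NEUTRAL_SINGLE_ATTEMPT = 60  # from the text: single attempts are not penalized

def iteration_score(prompts: list[str]) -> int:
    """Illustrative sketch: neutral score for a single attempt; otherwise
    reward prompts that introduce new vocabulary between attempts. The
    word-overlap heuristic is an assumption, not the product's analysis."""
    if len(prompts) < 2:
        return NEUTRAL_SINGLE_ATTEMPT
    novelty = []
    for prev, cur in zip(prompts, prompts[1:]):
        prev_words = set(prev.lower().split())
        cur_words = set(cur.lower().split())
        # Fraction of the new prompt's words not seen in the previous one.
        novelty.append(len(cur_words - prev_words) / max(len(cur_words), 1))
    return round(100 * sum(novelty) / len(novelty))

# Example: repeating a prompt verbatim earns nothing for this dimension.
repeat_score = iteration_score(["write a poem", "write a poem"])
```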
Custom Scoring Criteria
Employers can add custom criteria on top of the five standard dimensions. When custom criteria are used, the final score blends the standard dimensions (50%) with the custom criteria (50%).
- Keyword: must-include and must-not-include terms in the output
- Tone: professional, casual, technical, or creative tone matching
- Length: word count within a specified min/max range
- Rubric: free-form criteria matched against response content
See it in action
Try a free demo assessment and get your PromptScore with a full breakdown.