A statistical test is a formal, quantitative procedure used to make decisions about a process or population based on sample data. Its purpose is to evaluate whether the observed data provide sufficient evidence to contradict a specific assumption about the process.
That assumption is called the null hypothesis.
In short, a statistical test helps answer questions like:
- Is this process behaving as expected?
- Is the observed difference real, or could it be due to random variation?
- Is there enough evidence to justify changing how we act or decide?
What Is Meant by a Statistical Test?
A statistical test provides a decision rule for determining whether to reject, or fail to reject, a stated hypothesis.
- The hypothesis being tested is called the null hypothesis, usually denoted by $H_0$.
- The test does not attempt to prove $H_0$ is true.
- Instead, it assesses whether the data are inconsistent enough with $H_0$ to reject it.
Failing to reject $H_0$ does not mean it is true. It simply means that, given the data and the test used, there is not enough evidence to conclude it is false.
This outcome can be:
- Desirable, if continuing to assume $H_0$ is acceptable, or
- Disappointing, if the goal was to demonstrate a difference or effect.
The Concept of the Null Hypothesis
The null hypothesis ($H_0$) represents a default position or benchmark assumption about a population parameter or process.
Example: Two-sided hypothesis
Suppose a manufacturing process is designed to produce items with a mean linewidth of 500 micrometers.
- Null hypothesis: $H_0$: the mean linewidth is 500 micrometers
- Alternative hypothesis: $H_1$: the mean linewidth is not 500 micrometers
This is a two-sided test, because deviations in either direction (too large or too small) are considered problematic.
How the test works
- Measurements are taken at random locations.
- A test statistic is computed from the sample.
- This statistic is compared to upper and lower critical values.
- If the statistic falls outside these limits, $H_0$ is rejected.
Rejection means the data are unlikely to have occurred if the mean were truly 500 micrometers.
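The steps above can be sketched in Python with `scipy.stats.ttest_1samp`. The data here are simulated and purely illustrative (the true mean, spread, and sample size are assumptions, not values from the text); the point is the mechanics of comparing a test statistic to critical values.

```python
import numpy as np
from scipy import stats

# Hypothetical linewidth measurements (micrometers); simulated for illustration.
rng = np.random.default_rng(0)
sample = rng.normal(loc=503.0, scale=10.0, size=30)

alpha = 0.05
t_stat, p_value = stats.ttest_1samp(sample, popmean=500.0)  # two-sided by default

# Equivalent decision rule stated with upper and lower critical values:
t_crit = stats.t.ppf(1 - alpha / 2, df=len(sample) - 1)
reject = abs(t_stat) > t_crit  # same decision as p_value < alpha
```

Comparing `abs(t_stat)` to `t_crit` and comparing `p_value` to `alpha` are two phrasings of the same two-sided decision rule.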
One-Sided Tests of Hypothesis
In some situations, only one direction of deviation is of concern.
Example: Minimum performance requirement
Suppose a manufacturer requires light bulbs to last at least 500 hours on average.
- Null hypothesis: $H_0$: mean lifetime $\ge 500$ hours
- Alternative hypothesis: $H_1$: mean lifetime $< 500$ hours
This is a one-sided test, because only lifetimes that are too short matter.
Decision rule
- A test statistic is compared to a lower critical value.
- If it falls below that value, $H_0$ is rejected.
- Large values do not lead to rejection, because they do not violate the requirement.
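This one-sided decision rule can be sketched the same way, passing `alternative="less"` so that only short lifetimes count as evidence against $H_0$. The lifetimes are simulated and the parameters are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical bulb lifetimes in hours; simulated for illustration.
rng = np.random.default_rng(1)
lifetimes = rng.normal(loc=495.0, scale=40.0, size=25)

# H0: mean >= 500 vs H1: mean < 500 (lower-tailed test)
alpha = 0.05
t_stat, p_value = stats.ttest_1samp(lifetimes, popmean=500.0, alternative="less")

t_crit = stats.t.ppf(alpha, df=len(lifetimes) - 1)  # lower critical value (negative)
reject = t_stat < t_crit  # large positive t values never lead to rejection
```

Note that the critical value sits only on the lower tail: a sample mean far above 500 hours produces a large positive statistic and no rejection, exactly as the text describes.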
Required Components of a Statistical Test
Every statistical hypothesis test involves two competing statements:
- Null hypothesis ($H_0$): a baseline assumption or status quo.
- Alternative hypothesis ($H_1$ or $H_a$): the competing claim that represents a deviation from the null.
The test is constructed to control how often incorrect decisions are made.
Significance Level
The significance level, denoted by $\alpha$, is the probability of making a Type I error:
- Rejecting $H_0$ when $H_0$ is actually true.
Common values are $\alpha = 0.05$ or $\alpha = 0.01$.
A small $\alpha$ means:
- Stronger evidence is required to reject $H_0$
- Fewer false alarms
- More confidence when rejection occurs
This is why rejection of $H_0$ is often described as “statistically significant.”
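The claim that $\alpha$ is the false-alarm probability can be checked by simulation: repeatedly draw samples from a process where $H_0$ is genuinely true and count how often the test rejects. All numbers here (mean, spread, sample size, trial count) are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Monte Carlo check: when H0 is true, the rejection rate should hover near alpha.
rng = np.random.default_rng(42)
alpha, n, n_trials = 0.05, 20, 2000

false_alarms = 0
for _ in range(n_trials):
    sample = rng.normal(loc=500.0, scale=10.0, size=n)  # H0 holds: true mean is 500
    result = stats.ttest_1samp(sample, popmean=500.0)
    false_alarms += result.pvalue < alpha

type_i_rate = false_alarms / n_trials  # empirical Type I error rate, close to 0.05
```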
Errors of the Second Kind
A Type II error occurs when:
- The null hypothesis is not rejected, even though it is false.
The probability of this error is denoted by $\beta$.
Key properties:
- $\beta$ depends on how far reality deviates from $H_0$
- Large true differences are easier to detect → small $\beta$
- Small true differences are harder to detect → large $\beta$
There is a trade-off between $\alpha$ and $\beta$:
- Reducing $\alpha$ usually increases $\beta$
- Making a test very conservative makes it harder to detect real effects
The power of a test is defined as $1 - \beta$, the probability of correctly rejecting a false null hypothesis.
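For the simple case of a two-sided z-test with known standard deviation, power has a closed form, so the $\alpha$/$\beta$ trade-off can be computed directly. The deviation `delta`, `sigma`, and `n` below are assumed values for illustration, not quantities from the text.

```python
import numpy as np
from scipy import stats

# Power of a two-sided z-test with known sigma (normal theory); numbers are illustrative.
alpha, n = 0.05, 20
sigma = 10.0   # assumed process standard deviation
delta = 5.0    # assumed true deviation of the mean from the null value

z = stats.norm.ppf(1 - alpha / 2)    # two-sided critical value
ncp = delta * np.sqrt(n) / sigma     # standardized shift of the test statistic
power = stats.norm.sf(z - ncp) + stats.norm.cdf(-z - ncp)
beta = 1.0 - power                   # Type II error probability at this deviation
```

Rerunning with a smaller `alpha` pushes `z` outward and shrinks `power`, which is the trade-off described above: fewer false alarms, more missed detections.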
Operating Characteristic (OC) Curves
An operating characteristic curve summarizes how $β$ changes as the true parameter value moves away from the null hypothesis.
OC curves show:
- How sensitive a test is
- How likely it is to detect meaningful deviations
- The relationship between sample size, effect size, and error rates
They are especially useful in quality control and industrial testing.
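A minimal OC-curve sketch, again for a two-sided z-test with known sigma: evaluate $\beta$ over a grid of hypothetical true shifts away from the null value. The shifts and process parameters are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

# OC curve sketch: beta as a function of the true deviation from the null value.
alpha, n, sigma = 0.05, 20, 10.0
z = stats.norm.ppf(1 - alpha / 2)

def beta_at(shift):
    """P(fail to reject H0) when the true mean equals the null value plus `shift`."""
    ncp = shift * np.sqrt(n) / sigma
    return stats.norm.cdf(z - ncp) - stats.norm.cdf(-z - ncp)

oc = {s: beta_at(s) for s in (0.0, 2.0, 5.0, 10.0)}  # beta shrinks as the shift grows
```

At a shift of zero, $\beta$ equals $1 - \alpha$ (the test correctly fails to reject most of the time), and it falls toward zero as the true deviation grows, which is exactly the sensitivity picture an OC curve summarizes.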
Practical Purpose of Statistical Tests
Statistical tests are designed to support decision-making under uncertainty by:
- Quantifying evidence
- Controlling the risk of incorrect conclusions
- Balancing false alarms against missed detections
- Guiding sample size requirements
They do not prove hypotheses true or false in an absolute sense.
They provide structured, evidence-based rules for action.
Summary
- A statistical test evaluates whether data contradict a null hypothesis.
- The null hypothesis represents a baseline assumption.
- Tests can be one-sided or two-sided.
- The significance level $α$ controls false rejections.
- The error rate $β$ reflects missed detections.
- Statistical testing is about evidence, not certainty.
- Conclusions should always be interpreted in context, not mechanically.
