
Hypothesis: Concept & Design

A hypothesis, in general, is a statement that can be tested by scientific methods.
Ideally, a hypothesis should be
  • clear and precise,
  • capable of being tested, and
  • grounded in some empirical reference (some observed basis).
 
In statistics, a hypothesis is a statement about a whole population (typically about one of its parameters, such as a mean) that can be tested, under some assumptions, to determine its validity.

Our goal in hypothesis testing is to test a belief or claim in an unbiased way, based solely on the evidence in the sample data.
How we define the null and alternative hypotheses depends on how we design the test so that the claim can be validated mathematically.
 
Traditionally, the null hypothesis is an assertion that we hold as true unless we have sufficient statistical evidence (from sample data) to conclude otherwise. It could be a belief that you hold yourself, or one that is generally accepted or normally observed. The null hypothesis is denoted by H0 and the alternative hypothesis (the negation of the null) by HA.
 
The following 2 examples should shed some light:
 
Example 1:
We know that pharma companies spend billions of dollars on research to find a drug that cures a disease like cancer. It is generally accepted that there is no perfect drug which cures cancer. It could be the case that when researchers come up with 1000 drugs, only 1 or 2 have a significant impact. So they normally take "the new drug has no impact" as the null hypothesis, since that is the outcome their experience leads them to expect. However, what is really of interest is the case where the new drug does have an impact, which is taken as the alternative hypothesis.
 
In such a case we can design the test as  
H0: Impact of new drug = 0
HA: Impact of new drug ≠ 0
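
As a rough illustration of how the test in Example 1 could be carried out, the sketch below runs a two-sided test of "mean impact = 0" against "mean impact is different from 0". Everything in it is an assumption made for illustration: the `impact` numbers are made up, the function name `two_sided_z_test` is hypothetical, and the normal (z) approximation is used only to keep the example within Python's standard library. With a sample this small, a t-test would normally be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def two_sided_z_test(sample, mu0=0.0):
    """Return (z, p_value) for H0: mean == mu0 versus HA: mean != mu0.

    Uses the normal approximation with the sample standard deviation.
    """
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = (mean(sample) - mu0) / standard_error
    # Two-sided p-value: probability of a |z| at least this large under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical measured improvements for 10 patients (made-up numbers).
impact = [1.2, -0.4, 0.8, 1.5, 0.3, 0.9, -0.1, 1.1, 0.6, 0.7]

alpha = 0.05  # significance level chosen before looking at the data
z, p = two_sided_z_test(impact)
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0: {p < alpha}")
```

If the p-value falls below the chosen significance level, we reject H0 in favour of HA; otherwise we fail to reject H0, which is not the same as proving that the drug has no impact.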
 
 
Example 2:
Consider another case where a Women's magazine is trying to find whether the average income of all females in Investment banking is less than the average income of their male counterparts. Suppose they generally believe this case to be true and want to test this statistically, based on some sample data. Now, how do we go about designing a test around this belief in a way which is mathematically convenient? 
 
Even though they believe this to be the case, taking it as the null would complicate the testing procedure, so it is taken as the alternative hypothesis instead.
 
In such a case we can design the test as  
H0: Average income of all female bankers = Average income of all male bankers
HA: Average income of all female bankers < Average income of all male bankers
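
The one-sided design in Example 2 can be sketched in a similar way. The code below is a minimal illustration, not the magazine's actual method: the income figures are invented, the function name `one_sided_two_sample_z` is hypothetical, and a normal approximation with unequal variances is assumed so that only the standard library is needed. Because HA says "less than", only the left tail of the distribution counts as evidence against H0.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def one_sided_two_sample_z(x, y):
    """Return (z, p_value) for H0: mean(x) == mean(y) versus HA: mean(x) < mean(y)."""
    # Standard error of the difference in means, allowing unequal variances.
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / standard_error
    # Left-tailed p-value: small only when mean(x) is well below mean(y).
    p_value = NormalDist().cdf(z)
    return z, p_value

# Hypothetical annual incomes in thousands (made-up numbers).
female_income = [92, 88, 95, 101, 85, 90, 97, 89]
male_income = [105, 99, 110, 96, 108, 102, 115, 100]

alpha = 0.05
z, p = one_sided_two_sample_z(female_income, male_income)
print(f"z = {z:.2f}, p-value = {p:.6f}, reject H0: {p < alpha}")
```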
 
 
Generally speaking, it is mathematically convenient if the Null has "=" i.e. equality in it.
 
The significance level is the probability of rejecting a null hypothesis when it is in fact true, i.e., it represents the probability of a "false positive". By convention, it is preset at 1%, 5% or 10%. For example, in Example 1 it is the probability of accepting the new drug as effective even when it has no real impact (a serious and costly mistake!), so it is better to limit this to, say, 5%.
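
To see what a 5% significance level means in practice, the following sketch simulates many experiments in which the null hypothesis is true by construction and counts how often a two-sided test rejects it anyway. The setup is an assumption chosen for simplicity: data are drawn from a standard normal distribution with a known standard deviation of 1, and the function name `false_positive_rate` and its default parameters are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def false_positive_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Estimate how often a true H0 (mean == 0) is rejected at level alpha."""
    rng = random.Random(seed)
    # Two-sided critical value, e.g. about 1.96 for alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: every observation has mean 0 and known sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * sqrt(n)  # (mean - 0) / (sigma / sqrt(n)) with sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"Observed false-positive rate: {rate:.3f}")
```

The observed rate should land close to the chosen alpha, which is the sense in which the significance level caps the risk of a false positive.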
 
 