Now, imagine that we got a correlation of 0.4 after testing a sample of 30 people. How do we know that such a correlation exists in the population as well? And how do we know whether our sample was large enough to be representative?
In this post, I will explain what is meant by significance testing. In practice, it rarely involves any calculation by hand; there are ready-made tables of significance which simply tell you whether your result is significant or not, depending on your sample size and correlation coefficient. However, I think it is important to understand the logic behind these tables - and this is exactly what I will attempt to explain in this post. I come back to the tables at the end as well, so if you are not really interested in theoretical considerations at the moment you can safely go straight there.
Logic behind significance testing
Almost anything is possible in a sample - although only certain things are likely. Thus, we try to establish which correlations are LIKELY in samples of a given size and which are UNLIKELY, assuming that the population correlation equals zero. If the population correlation does equal zero, then the correlations in the middle 95% of the sampling distribution are the likely ones. Consequently, correlations which fall in the extreme 5% of the sampling distribution (2.5% in each tail) are unlikely.
So, when our sample correlation coefficient falls into this extreme 5% of the sampling distribution for a population where the correlation is zero, we deem it to be statistically significant. In other words, we say that we would be unlikely to obtain such a result if the population correlation were zero - therefore, the population correlation is probably NOT zero, which gives us the right to reject the null hypothesis and accept the alternative hypothesis (see next section).
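To make the "middle 95%" idea concrete, here is a minimal simulation sketch. It is only an illustration under stated assumptions: both variables are drawn as independent normal scores (so the population correlation really is zero), and the names pearson_r and null_cutoff are made up for this example. It draws many samples of 30, computes the correlation in each, and finds the value that only the most extreme 5% of sample correlations exceed (ignoring sign).

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def null_cutoff(n, trials=10000, seed=0):
    """Estimate the |r| value that cuts off the extreme 5% of sample
    correlations when the population correlation is zero.

    Assumption for illustration: both variables are independent
    standard normal scores.
    """
    rng = random.Random(seed)
    abs_rs = []
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        abs_rs.append(abs(pearson_r(xs, ys)))
    abs_rs.sort()
    # 95% of simulated |r| values lie below this point
    return abs_rs[int(0.95 * trials)]


cutoff = null_cutoff(30)
print("cutoff for n = 30:", round(cutoff, 2))
print("is r = 0.4 in the extreme 5%?", 0.4 > cutoff)
```

With samples of 30 the cutoff should land somewhere around 0.36, so a sample correlation of 0.4 would count as an unlikely result under a zero population correlation. The exact figure shifts a little with the seed and the number of trials.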
The Null Hypothesis
The null hypothesis states that there is NO relationship between the variables in the population (i.e., the population correlation is zero). The alternative hypothesis suggests that there IS a relationship between the variables; to be able to accept it we need to reject the null hypothesis first.
This is where significance testing comes into play. We assume that the null hypothesis is true; then we take our correlation coefficient and check whether its value was likely to occur in a sample from a population where the null hypothesis is true (i.e., where the correlation is zero). If it falls in the extreme 5% of the sampling distribution then it is unlikely - and so we can reject the null hypothesis.
So how do we check for it?
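We simply consult the significance tables, looking up the entry for the size of our sample. If our coefficient (ignoring its sign) is at least as large as the tabled value, the result is significant at the 5% level. Note that there are separate tables for the Pearson correlation coefficient and Spearman's rho. These tables are given in most statistics and psychology textbooks - have fun!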
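If you are curious where the tabled values come from, the sketch below reproduces one entry for the Pearson coefficient. It is not how any particular textbook builds its table. It relies on a standard result: when the population correlation is zero and the scores are normally distributed, the sampling distribution of r has a known density. The function names are made up for this example, it needs at least 4 people in the sample, and it does not apply to Spearman's rho.

```python
import math


def null_density(r, n):
    """Density of Pearson's r for samples of size n (n >= 4) when the
    population correlation is zero and the data are bivariate normal."""
    a = (n - 2) / 2
    beta = math.gamma(0.5) * math.gamma(a) / math.gamma(a + 0.5)
    return (1 - r * r) ** ((n - 4) / 2) / beta


def two_tailed_p(r, n, steps=2000):
    """Probability of a sample correlation at least as extreme as r
    (in either direction) under the null hypothesis.
    Uses Simpson's rule on the interval [|r|, 1]; steps must be even."""
    lo, hi = abs(r), 1.0
    h = (hi - lo) / steps
    total = null_density(lo, n) + null_density(hi, n)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * null_density(lo + i * h, n)
    one_tail = total * h / 3
    return 2 * one_tail


def critical_r(n, alpha=0.05):
    """Smallest |r| that is significant at level alpha, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, n) > alpha:
            lo = mid  # not yet extreme enough
        else:
            hi = mid
    return (lo + hi) / 2


print("critical r for n = 30:", round(critical_r(30), 3))
print("p-value for r = 0.4, n = 30:", round(two_tailed_p(0.4, 30), 3))
```

For the example from the start of the post, the p-value for r = 0.4 with 30 people comes out below 0.05, so the correlation would be declared significant at the 5% level.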