2021 EDIT: Originally there were images in this post, but they were lost to the ravages of time.
Currently I’m getting into Data Science, and as a way to practice I’m exploring data on previous Portuguese Local Elections. One interesting thing I found is that municipalities with a higher number of enrolled voters also have a higher percentage of people not voting. There seems to be some kind of logarithmic (inverse exponential) relationship. Each data point is a municipality.
Pearson Correlation Coefficient: 0.453047145972 p-value: 5.4002095121e-17
This is clearer if I use the natural logarithm of the number of voters:
Pearson Correlation Coefficient: 0.69994788675 p-value: 1.20211071448e-46
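To illustrate why taking the logarithm can raise the coefficient, here is a minimal sketch on made-up numbers. It does not use the election data: the `voters` and `abstention` lists are hypothetical, and `pearson_r` is a small helper written for this example. In practice something like `scipy.stats.pearsonr` would do the same job and also return the p-value.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data, NOT the election results: 60 hypothetical municipalities
# whose enrolled-voter counts are spaced evenly on a log scale.
voters = [1000 * 10 ** (i / 20) for i in range(60)]

# Hypothetical abstention rate (%) that grows with log(voters), plus a
# small deterministic wobble standing in for noise.
abstention = [5 * math.log(v) + 3 * math.sin(i) for i, v in enumerate(voters)]

# Correlation against the raw counts vs. against their natural logarithm.
r_raw = pearson_r(voters, abstention)
r_log = pearson_r([math.log(v) for v in voters], abstention)

print("raw voters:  r = {:.3f}".format(r_raw))
print("log(voters): r = {:.3f}".format(r_log))
```

Pearson's coefficient only measures linear association, so when the underlying relationship is logarithmic, the handful of very large municipalities stretches the x-axis and weakens the fit. Taking the log of the voter counts straightens the relationship out.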
The results are similar when using data from the 2009 and 2005 elections.
Pearson Correlation Coefficient: 0.579329081296 p-value: 5.27916030293e-29
Pearson Correlation Coefficient: 0.579329081296 p-value: 5.27916030293e-29
Funny. Maybe in smaller places people feel their votes make a bigger difference? Does the relationship hold in other elections beyond local elections?
Correlation Coefficient: 0.209724326747 p-value: 0.000209736784349
Correlation Coefficient: 0.113888507278 p-value: 0.0458131679479
Well, the correlation is weaker and less significant, but it still exists. Has anyone noticed this before? Or am I doing something wrong? You can find data and code here.