
Association is not the same as Causation


 Correlation is not causation. The profound implications of confusing… | by  Anthony Figueroa | Towards Data Science

Image source: https://www.megapixl.com/causation-and-correlation-illustration-37881989

This post is my reaction to coming across the two terms "association" and "causation" while reading a paper on the high cost of not educating girls.

"In virtually all cases, estimates of the potential impacts of low educational attainment for girls – or equivalently of gains associated with higher educational attainment as captured by secondary education, are large. As documented in more detailed in the study, most gains are associated with secondary as opposed to primary education. It should again be emphasized that what is measured is associations, not necessarily causal impacts."

(High Cost of Not Educating Girls, World Bank, as seen on Sep 12, 2020)

Since I don't have a research background and didn't have to do any kind of research in my undergrad studies, I wasn't familiar with these terms, and I didn't understand what that statement was trying to say. So I ended up doing a quick Google search to understand the difference. I wasn't disappointed. But confused? Definitely.

I am not sure I fully understand the difference. I make no claim whatsoever that what I am trying to explain is correct or even close to it. PLEASE CORRECT ME IF I AM WRONG. :)


Let's take the picture at the top of this post as a reference. (Also, correlation and association aren't the same thing: correlation measures a specific form of association. To keep things simple for now, I'm treating the correlation in that picture as association.)

Let's have some statements:

1. In summer weather, it is dry, hot and sunny. That 'causes' the ice-cream to melt.  Is this true?

2. In summer weather, it is dry, hot and sunny. That 'causes' people to get sunburns. Is this true?

In the first statement, I can confidently say it's true, because heat will certainly cause ice-cream to melt. Right?

The second statement is definitely true too. Sunburn is a reaction of the body to direct DNA damage from UVB (Ultraviolet B) light, either from the sun or from artificial sources. (https://en.wikipedia.org/wiki/Sunburn)

In both the statements, we can say with reasonable certainty that there is indeed a 'causative link' between the two factors i.e. between sun and ice-cream / sunburn. This is causation.

Causation means that one variable produces an effect in the other variable.

Now in the same example, let's say we find a statistical relationship between ice-cream and sunburn. For example: "The number of ice-creams that melted during the summer correlated with the number of people who got sunburns during the summer."

Now, can we say that "eating ice-cream 'causes' sunburns" or "getting sunburns 'causes' the ice-cream to melt"?

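Doesn't that sound funny and absurd? It does, because there is no 'causal relationship/link' between ice-cream and sunburns. Both simply go up on sunny days, so the sun is the common cause behind each of them. Between ice-cream and sunburns themselves there is mere correlation and association.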

Association is a statistical relationship between two variables.
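
To convince myself, I tried a tiny simulation of the ice-cream and sunburn example. This is only a sketch with made-up numbers: the function names (`pearson`, `simulate_days`), the "hours of sun" scale and the noise levels are all my own assumptions, not real data. In this toy world the sun drives both melted ice-creams and sunburns, and neither one affects the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_days(n_days=5000, seed=0):
    """Made-up world: sun causes both outcomes; they never touch each other."""
    rng = random.Random(seed)
    sun, melted, sunburns = [], [], []
    for _ in range(n_days):
        s = rng.uniform(0, 10)                      # hours of strong sun (hypothetical)
        sun.append(s)
        melted.append(5 * s + rng.gauss(0, 5))      # depends on sun only
        sunburns.append(3 * s + rng.gauss(0, 5))    # depends on sun only
    return sun, melted, sunburns


sun, melted, sunburns = simulate_days()

# Association across all days: strong, even though there is no causal link.
overall = pearson(melted, sunburns)

# Now compare only days with almost the same amount of sun.
band = [(m, b) for s, m, b in zip(sun, melted, sunburns) if 4.5 <= s <= 5.5]
within = pearson([m for m, _ in band], [b for _, b in band])

print(f"correlation over all days:           {overall:.2f}")
print(f"correlation on similarly sunny days: {within:.2f}")
```

Once the amount of sun is held roughly constant, the association between melted ice-creams and sunburns mostly disappears, which is what I would expect if the sun is the real cause of both.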

Let's see some more examples.

1. It has been convincingly demonstrated that people of lower socioeconomic status (SES) have a higher risk of lung cancer, i.e., there is a clear association, but does that mean that low SES is a cause of lung cancer? A more plausible explanation is that people of lower SES are more likely to smoke and to be chronically exposed to air pollution and that exposure of the respiratory tract to these contaminants causes mutations in bronchial cells that can eventually produce a cancer.

 

2. Jewish women have a higher risk of breast cancer, while Mormons have a lower risk. However, one's religion is not a cause of breast cancer. There are other explanations. So we can't say, "Being a Jewish woman causes breast cancer."

3. In 1997, a very large population study looking at alcohol consumption and death rates (amongst other variables) was published in the New England Journal of Medicine [4]. It showed very clearly that moderate drinking (1-2 drinks per day) was associated with a decrease in death rates from all causes, particularly from cardiovascular disease, even compared to people who don’t drink at all. There is undeniably an association in their results, but we cannot say with certainty that the alcohol itself caused the increase in life expectancy, because other factors may well explain the difference. For instance, what if people who have a drink a day are more relaxed? There is an association between stress and increased risk of cardiovascular disease, so lower stress could have produced the result. Another possible explanation is increased social interaction in people who drink moderately, as loneliness may also be associated with shorter life expectancy [5].

From this, what we can say is that an association between two things doesn't necessarily mean that one causes the other.

Okay, got it. But what do I do now with this information? Let's look at another example. I love this one.

Suppose a study finds an association between paternal silk tie ownership and infant mortality. Based on this study, the government implements a program in which 5 silk ties are given to all men aged 18–45 with a view to reducing infant mortality. We would all agree that this is madness. This is because we understand the difference between association and causation. 

Got it? Understanding the difference between association and causation helps us make decisions.

Let me try to connect to the current COVID pandemic situation. 

Let's say a study finds that a 'turmeric related treatment' is associated with better outcomes for COVID patients.

Now can you say that treatment caused COVID patients to be healthier?  

The people who sought and received that treatment may have been healthier and had better living conditions than those who did not. Therefore, people receiving the treatment might appear to benefit, but the difference in outcomes could be because they were healthier and had better living conditions to begin with. There are dozens of ways in which external factors can influence results, even in a clinical trial.
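
Here is a small sketch of that situation, again with invented numbers. The `health` score, the 80%/20% chances of seeking treatment and the function name `observational_study` are my own assumptions for illustration. The important part is that the treatment in this toy world does nothing at all, yet the treated group still looks better.

```python
import random


def mean(values):
    return sum(values) / len(values)


def observational_study(n_patients=10000, seed=1):
    """Patients choose for themselves whether to take a useless treatment."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n_patients):
        # Baseline health and living conditions (hypothetical score).
        health = rng.gauss(0, 1)
        # Healthier patients are more likely to seek the treatment.
        seeks_treatment = rng.random() < (0.8 if health > 0 else 0.2)
        # The outcome depends on baseline health only; the treatment adds nothing.
        outcome = health + rng.gauss(0, 1)
        if seeks_treatment:
            treated.append(outcome)
        else:
            untreated.append(outcome)
    return mean(treated), mean(untreated)


treated_mean, untreated_mean = observational_study()
gap = treated_mean - untreated_mean

print(f"average outcome, treated:   {treated_mean:.2f}")
print(f"average outcome, untreated: {untreated_mean:.2f}")
print(f"apparent 'benefit':         {gap:.2f}")
```

The gap here comes entirely from who chose the treatment, not from the treatment itself. So there is an association, but no causation.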

Disentangling cause from association is a tricky business and it takes a brave person to claim that they can definitively prove one factor causes another. 

What you should take away from this is a healthy dose of skepticism. If you come across someone professing that one thing causes the other, assume that they’re wrong until you’re convinced otherwise. 

Ask: 
Is what you have an association or a cause? How was this investigated? 
Was the study an RCT (randomised controlled trial)? How were all other variables kept the same?
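
As far as I understand it, the point of an RCT is that a coin flip, rather than the patient, decides who gets the treatment, so the two groups should start out similar on average. Below is a minimal sketch under that assumption. The names (`randomised_trial`, `true_effect`) and all the numbers are hypothetical and only meant to show the idea.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomised_trial(n_patients=10000, true_effect=0.0, seed=2):
    """Assign treatment by coin flip and compare the two groups."""
    rng = random.Random(seed)
    health_t, health_c, outcome_t, outcome_c = [], [], [], []
    for _ in range(n_patients):
        health = rng.gauss(0, 1)                # baseline health (hypothetical score)
        gets_treatment = rng.random() < 0.5     # coin flip, independent of health
        effect = true_effect if gets_treatment else 0.0
        outcome = health + effect + rng.gauss(0, 1)
        if gets_treatment:
            health_t.append(health)
            outcome_t.append(outcome)
        else:
            health_c.append(health)
            outcome_c.append(outcome)
    return {
        "baseline_gap": mean(health_t) - mean(health_c),
        "outcome_gap": mean(outcome_t) - mean(outcome_c),
    }


no_effect = randomised_trial(true_effect=0.0)
real_effect = randomised_trial(true_effect=0.5)

print("useless treatment:", no_effect)
print("working treatment:", real_effect)
```

With random assignment, the baseline health of the two groups comes out nearly equal, so a useless treatment shows almost no gap in outcomes, while a treatment with a real effect shows a gap close to that effect.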

When it comes to a treatment, remember that whilst the outcome of a trial may show an association between a treatment and an outcome, the treatment may not necessarily be the cause.

I think this information regarding association and causation is going to be helpful when I read research that shows some kind of relationship between factors.

Sources: 

https://core.ac.uk/download/pdf/230014813.pdf

https://www.students4bestevidence.net/blog/2017/06/23/association-is-not-the-same-as-causation/

https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module1A-Populations/PH717-Module1A-Populations6.html

http://samples.jbpub.com/9781449604752/04752_ch16_final.pdf



