Professor Zewotir asked us in a lecture, ‘If you learn how to walk before you can crawl, is there a problem?’ I only came to
appreciate what this esteemed professor of Statistics, from South Africa via Ethiopia, actually meant later, when he
taught us about the odds ratio. I had seen this statistic in books and published studies before, and I knew how
to calculate it, but did I know what the meaning of odds was?
‘The odds are against us winning,’ people would say, but this seemingly basic concept of the probability of something
happening versus the probability of it not happening only solidified later, when I went through the lecture material
sitting on a bench in a beautiful garden, ‘Huis Ten Bosch’ (House Ten Bosch), not far from the Math department building where we have our lectures.
Only now did I get it: the odds ratio is the odds of something happening given another thing, over the odds of that
something happening given that the other thing does not happen. Confusing? Read about it and then sit on a bench
for a while and work it out on the back of till slips (or on the back of an envelope, very Einstein, and if he ever came to
Stellenbosch I think he would agree it would rival Princeton).
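For anyone who prefers a keyboard to a till slip, here is a minimal sketch of the odds ratio described above. It is written in Python rather than the R we used in class, which is my own choice for illustration, and the function names and counts are invented, not course data.

```python
def odds(p):
    """Turn a probability p (0 <= p < 1) into odds: p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Odds of the event given exposure, over the odds given no exposure."""
    p_exposed = events_exposed / n_exposed
    p_unexposed = events_unexposed / n_unexposed
    return odds(p_exposed) / odds(p_unexposed)


# Invented example: 30 of 100 exposed people and 10 of 100 unexposed
# people have the outcome.
example_or = odds_ratio(30, 100, 10, 100)
print(round(example_or, 2))  # (0.3 / 0.7) / (0.1 / 0.9) = 27/7, about 3.86
```

An odds ratio of 1 would mean the odds are the same in both groups; here the invented exposed group has nearly four times the odds.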
I hope, dear reader, you will also take a liking to statistics, unless you are some sort of statistician already.
In any event, here are some of my favorite quotes that I overheard during the course of our lectures, Level 1: Introduction.
“Refer to drug test data: Determine the effect of Drug – two antibiotics (A and D) and control F and Pretreatment –
a pretreatment score of leprosy bacilli on Post Treatment – a post treatment score of leprosy bacilli”
This was a question we had to answer using our computer package R (yay for open source software!).
“This question is not worded well... what are the response and explanatory variables?” came the question
from one of my classmates, Indren, himself a lecturer at Stellenbosch University. I chuckled, because I too failed to
grasp what was what, but Professor Zewotir cleared that up. Now what was the point of this question? It was also
to illustrate how finding the variables and knowing which is which can be quite a problem of its own.
“If you do not know what you are looking for, you can spend your whole lifetime playing with data” was the wise
saying from our professor. He went on to say “For this dataset we have just one objective, our interest should
just be one clear objective. If we have another objective, we need to look for a method and see if it is appropriate”.
Indeed, I wish I had known that before I spent half an hour regressing variables in different combinations on the
alligator dataset last night. Goodness, statistics can be intensely hypnotizing,
especially with R.
“So if I have data, how do I know which is my explanatory variable and which is my response?” was the very question
I asked him when he first introduced the topic of linear regression. A silly question it was, and he answered with a
touch of humour: “So how can we say the house size and house cost are related?”
“Don’t be foolish now, it is you who determines which variable is the explanatory variable and which is the response variable.”
“It should make sense to you, such as income and saving: saving depends on the income.
And to make a relationship between these two, that is our role, to determine alpha and beta from the data.” OK, done being foolish.
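To make "determine alpha and beta from the data" concrete, here is a minimal sketch of ordinary least squares for a single explanatory variable. It is in Python rather than R (my choice for illustration), the income and saving figures are invented, and the function name fit_line is hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # beta is the sum of cross-deviations over the sum of squared x-deviations
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx
    # alpha makes the fitted line pass through the point (x_bar, y_bar)
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Invented data: income is the explanatory variable, saving the response.
income = [10, 20, 30, 40, 50]
saving = [1.2, 2.9, 5.1, 6.8, 9.0]

alpha, beta = fit_line(income, saving)
print(f"saving = {alpha:.2f} + {beta:.3f} * income")
```

Swapping the two lists would fit a different line, which is exactly the professor's point: the data do not tell you which variable is the response; you decide.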
“Reject the null hypothesis, how can I just write reject the null hypothesis in my thesis? It sounds very bad…”
came the question from our Libyan classmate Emhemed, who is working on a new cancer cell line. Faikah, our most wonderful teaching
assistant from SACEMA, was as cool as a cucumber in replying, “No, you must explain what reject your null hypothesis
means.” And so I thought, does anyone ever mention they have a null hypothesis they are rejecting when citing a significant
p-value? Do the readers know that the data seem to say the null hypothesis is probably not true (the test statistic
in question does not have this or that distribution), but that if the null hypothesis is actually true and we are wrong to
reject it, then we will be making this mistake with 1 out of every 20 datasets we get from the population described by the null
hypothesis? And what about the whole thing about power (the probability of getting it right, that is, correctly stating that the null hypothesis is actually not a correct
description of the ‘state of nature’)? I guess there is so much implied in and hidden by that p < 0.05.
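The "1 out of every 20 datasets" idea and the idea of power can both be checked by simulation. Below is a minimal sketch, in Python rather than R (my choice for illustration), that assumes normally distributed data with a known standard deviation of 1 and uses a simple two-sided z-test; the sample size, the number of simulated datasets, the seed and the alternative mean of 0.5 are all arbitrary assumptions, not anything from the course.

```python
import math
import random
from statistics import NormalDist


def two_sided_p_value(sample, null_mean=0.0, sigma=1.0):
    """p-value of a z-test of H0: mean == null_mean, with sigma known."""
    n = len(sample)
    z = (sum(sample) / n - null_mean) / (sigma / math.sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, n=30, n_datasets=10000, alpha=0.05, seed=1):
    """Fraction of simulated datasets in which H0: mean == 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_datasets):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if two_sided_p_value(sample) < alpha:
            rejections += 1
    return rejections / n_datasets


# Null hypothesis true: every rejection is a mistake (type I error).
type_one_error = rejection_rate(true_mean=0.0)
# Null hypothesis false (true mean 0.5): every rejection is correct (power).
power = rejection_rate(true_mean=0.5)

print(f"rejections when the null is true:  {type_one_error:.3f}")
print(f"rejections when the null is false: {power:.3f}")
```

The first rate should land close to 0.05, the 1 in 20 above; the second depends entirely on how far the truth is from the null and on the sample size, which is why a bare p < 0.05 hides so much.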
I hope you have a sense that this course is not only about teaching how to do statistics, but about actually understanding conceptually
what these methods are about. There are two approaches one can take in teaching this material. One would be to start
from mathematical first principles and show how the methods are built up. The other is to work with heuristic first principles and explain
the logic using illustrations (graphs with lines and scatter plots) and datasets. The lecturers opted for the latter and I am
so glad they did, because frankly, the former would be impractical in just two weeks. I had to come to terms with the fact that this would
mean they would wave their hands from time to time, but we have the rest of our lives to deepen our knowledge of these concepts.
Wait, why am I using the past tense? There are still two lectures left and they are on logistic regression, very relevant for the data I am working
with in Namibia on the chance of heart disease given blood pressure, smoking, cholesterol and age. Yay!
So why am I writing this blog? I should be preparing for tomorrow’s lecture! But then, when I read ahead, how do
I stay in the moment during the lecture and not jump ahead with my questions? Well, we are in South Africa and it’s Nelson Mandela day and we have to understand
how and why investigators often choose ‘white race’ as the baseline against which other races, such as
blacks, are compared when studying health inequalities. So why am I writing this blog? Because on Nelson Mandela day 67 minutes of service are a way of appreciating
your wider community (as well as of thanking the University of Gent and the Royal Belgian government for subsidising this course!).
There is always one thing that, underneath all the others, motivates me to write. Beyond thanking, there is the purification of the self.
Old regrets of not having achieved relatively good marks in math courses are cleansed. Now I have come to realise,
everything is indeed truly possible and this is just the beginning. I am grateful for all the stimulating conversations
outside of the classroom, with SACEMA fellows and staff and my own classmates (I am hoping for more!). One of them is even a biological anthropologist, social anthropologist and epidemiologist. I feel such fields beckon me
and the more I learn, the more I remember how motivated I am by questions arising from humanity. The funniest dinner table topic so far was Lagos versus Johannesburg
in terms of safety: the null hypothesis has a p-value of 0.999998, which did not amuse my Nigerian classmate who insisted Lagos was by far safer, but she laughed nonetheless.