A rain drop
On the tree top
A rain drop
On the tree top
I float in and out
You conjure beauty
I gaze in a trance
As the wheels of the world
Spin silently by
When James Watson, the Nobel laureate known for discovering the double-helix structure of DNA with Francis Crick, made his genome public, he decided to keep one small bit private: the APOE gene, whose variants are known to correlate with a higher probability of Alzheimer’s disease.
The logic was: why worry about the chances of getting Alzheimer’s when the information is only probabilistic anyway, and there is not much one can do towards prevention or cure?
Thus, in the online data, the region of the DNA sequence for the gene, as well as the adjoining region, was erased. All well and good: now no one, including Watson himself, would know his risk factors for Alzheimer’s, right? Of course not; that is what this story is about.
Along came a guy named Mike Cariaso, with an interest in gene variations and expertise in coding, who coolly informed the public that, erasure notwithstanding, he could still figure out Watson’s APOE status from the data.
His strategy is simple to understand: the APOE gene variants tend to travel with particular accompanying sequences, not only in nearby regions but also in distant ones. By looking at data on which sequences are the highly probable co-travellers of a given variant, it is possible to tell which APOE variant Watson has.
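To make the idea concrete, here is a toy sketch in Python. It is not Cariaso's actual method or data: the marker names, the tiny reference panel and the naive Bayes model (which treats the markers as independent given the variant) are all assumptions made up for illustration. Geneticists call this kind of statistical association between variants linkage disequilibrium.

```python
from collections import Counter, defaultdict

# Hypothetical reference panel: (hidden variant, alleles at linked markers).
# Marker names and counts are invented for illustration only.
REFERENCE_PANEL = (
    [("e3", {"marker_A": "C", "marker_B": "T"})] * 4
    + [("e3", {"marker_A": "C", "marker_B": "G"})] * 1
    + [("e4", {"marker_A": "T", "marker_B": "G"})] * 3
    + [("e4", {"marker_A": "C", "marker_B": "G"})] * 1
)


def infer_hidden_variant(panel, observed, pseudocount=1.0):
    """Return P(variant | observed markers) under a naive Bayes model."""
    variant_counts = Counter(variant for variant, _ in panel)
    # allele_counts[variant][marker][allele] = how often they co-occur
    allele_counts = defaultdict(lambda: defaultdict(Counter))
    alleles_seen = defaultdict(set)
    for variant, markers in panel:
        for marker, allele in markers.items():
            allele_counts[variant][marker][allele] += 1
            alleles_seen[marker].add(allele)

    scores = {}
    for variant, n in variant_counts.items():
        score = n / len(panel)  # prior: how common the variant is in the panel
        for marker, allele in observed.items():
            # Number of distinct alleles, used for add-one style smoothing
            k = len(alleles_seen[marker] | {allele})
            hits = allele_counts[variant][marker][allele]
            score *= (hits + pseudocount) / (n + pseudocount * k)
        scores[variant] = score

    total = sum(scores.values())
    return {variant: score / total for variant, score in scores.items()}


# The variant itself is "erased"; only the co-travelling markers are visible.
posterior = infer_hidden_variant(
    REFERENCE_PANEL, {"marker_A": "T", "marker_B": "G"}
)
best_guess = max(posterior, key=posterior.get)
print(best_guess, posterior)
```

With this made-up panel, the markers that usually accompany e4 make e4 the most probable hidden variant, even though the variant itself was never observed.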
That is the crux of the story. The whole story is naturally more complex. A large number of very capable geneticists, ethicists and legal experts were involved in the publication (and withholding) of Watson’s data, and yet this important point was missed.
Amazing feats can be achieved with interest, insights and hard work.
Based on an anecdote in “Here is a human being” by Misha Angrist.
Learning by doing is both rewarding and challenging. Many institutes worldwide have embraced this path to engage with students. I regularly include a component of hands-on projects in my courses; here is a partial list of the regression projects carried out by my class last semester.
E-waste and economy: Abhijit Manek, Deven Parikh and Shivam Sony followed the research paper by Sigrid Kusch-Brandt and Colin Hills to verify that as per capita GDP (PPP) increases, e-waste generation increases linearly.
pH and TDS relation in waste water from dye industries: Saumya Parekh, Deep Wadher and Preet Sangani studied waste water from the dye industry and followed the work by Farhana Maqbool to show that the pH values of the waste water are a good indicator of its TDS values.
Space-launch success rates: Shubham Ghodasara, Dhruv Shah and Ruchit Trived used data from Wikipedia and from Space Launch Report to show that the success rate of space launches has nearly saturated over time.
Career selection in Gujarat: Haard Monpara, Tirth Panchamia and Jeet Dand collected data from various newspaper sites on the number of students appearing for Science or General stream exams over the last 10 years to study the trends and fluctuations.
Cricket score prediction: Manush Gami, Heet Patel and Ashutosh Mehta were interested in looking at T20 match scores as a function of overs bowled. Their data came from a random sample.
Birth rate in India: Aashman Dave, Sagar Shah and Nikhil Tantia studied the birth rate per 1,000 people in India from 2005 to 2016, using data from knoema.com, and found a very good fit showing a downward trend.
Smokers in India: Praj Patel, Parva Raval and Aman Desai looked at the percentage of the population that smokes, using data from ourworldindata.org. They found a downward trend in the data from 1980 to 2012.
Accidents in Gujarat: Darshil Kothari, Pavitra Patel and Kenil Shah fitted a polynomial to the accident data for Gujarat from 1999 to 2017, taken from rtogujarat.gov.in, to study the trend.
Engine horsepower and mileage: Gimil Shah, Chirant Patel and Ishan Patel collected data from the websites of different car companies to study the relationship between engine horsepower and fuel efficiency.
Home-loan interest rates: Samay Patel, Parth Varshani and Aman Shah analysed trends in home-loan amounts and interest rates using data from the SBI website.
Alcohol consumption and GDP: Yashh Raj, Tathya Shah and Samyak Thaker analysed the trend in alcohol consumption against GDP, with data from Wikipedia, ourworldindata.org and alcohol.org.
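Most of the projects above boil down to fitting a line (or a low-degree polynomial) to data and checking how good the fit is. Here is a minimal sketch of that step in plain Python, assuming ordinary least squares. The function names fit_line and r_squared are my own, and the numbers are synthetic, made up only to show a downward trend; this is not any group's actual code or data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Synthetic example data (NOT real measurements): x is years since the
# first observation, y is some quantity that drifts downward over time.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [10.1, 9.6, 9.3, 8.7, 8.4, 7.9, 7.6, 7.0]

slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={fit_quality:.3f}")
```

A negative slope with R^2 close to 1 is what a "very good fit showing a downward trend" looks like in numbers. For a polynomial fit, the same idea applies with more coefficients, and a library routine is usually the more convenient choice.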
(To be continued)