Learning by doing: Regression

Learning by doing is both rewarding and challenging. Many institutes worldwide have embraced this path to engage with students. I regularly include hands-on projects in my courses; here is a partial list of regression projects carried out by my class last semester.

E-waste and economy: Abhijit Manek, Deven Parikh and Shivam Sony followed the research paper by Sigrid Kusch-Brandt and Colin Hills to verify that as per capita GDP (PPP) increases, e-waste generation increases linearly.

pH and TDS (total dissolved solids) relation in wastewater from dye industries: Saumya Parekh, Deep Wadher and Preet Sangani studied wastewater from the dye industry and followed the work by Farhana Maqbool to show that the pH values of the wastewater are a good indicator of its TDS values.

Space-launch success rates: Shubham Ghodasara, Dhruv Shah and Ruchit Trived used data from Wikipedia and from the Space Launch Report to show that the success rate of space launches has nearly saturated over time.

Career selection in Gujarat: Haard Monpara, Tirth Panchamia and Jeet Dand collected data from various newspaper sites on the number of students appearing for Science or General stream exams over the last 10 years to study the trends and fluctuations.

Cricket score prediction: Manush Gami, Heet Patel and Ashutosh Mehta were interested in looking at T20 match scores as a function of overs bowled. Their data came from a random sample.

Birth rate in India: Aashman Dave, Sagar Shah and Nikhil Tantia studied India's birth rate per 1000 people from 2005 to 2016, using data from knoema.com, and found a very good fit showing a downward trend.

Smokers in India: Praj Patel, Parva Raval and Aman Desai looked at the percentage of the population that smokes, using data from ourworldindata.org. They found a downward trend in the data from 1980 to 2012.

Accidents in Gujarat: Darshil Kothari, Pavitra Patel and Kenil Shah fitted a polynomial to accident data for Gujarat from 1999 to 2017, taken from rtogujarat.gov.in, to study the trend.

Population growth in Gujarat: Nisarg Gandhi, Niraj Maniyar and Kaustubh Patel studied the population growth in Gujarat and found a very good linear fit.

Engine horsepower and mileage: Gimil Shah, Chirant Patel and Ishan Patel collected data from the websites of different car companies to study the relationship between engine horsepower and fuel efficiency.

Home-loan interest rates: Samay Patel, Parth Varshani and Aman Shah analysed trends in home-loan amounts and interest rates using data from the SBI website.

Alcohol consumption and GDP: Yashh Raj, Tathya Shah and Samyak Thaker analysed the trend in alcohol consumption against GDP, with data from Wikipedia, ourworldindata.org and alcohol.org.

(To be continued)


I recently bought a new Trinket M0 from Adafruit. It has several fun features that project builders would find attractive.

Small size: about the size of a one-rupee coin.

Power freedom: power it with your phone cable or a battery.

No driver: it works with CircuitPython, so no driver is needed.


Easy to fix on a project

Three on-board LEDs, one of them RGB

Now I am looking into possibilities of doing fun things with it.

May the power be with you

One reason I give my students for learning mathematics is that knowledge of mathematics gives a sense of power. It often gives a neat understanding of problems and "aha" insights into solutions. Here is a list of popular problems that students can handle with confidence, provided they have a clear idea of how regression works.

(1) Was the coffee spiked with barbiturates?

Forensic scientists regularly need to handle situations where they have to verify whether some chemical is present in a given sample. Absorption spectrometry gives a set of linear equations connecting chemical concentrations to absorbance. The best-fit solution yields the chemical concentrations in the sample.
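Here is a minimal sketch of that best-fit step in Python with NumPy. Everything numeric in it is made up for illustration: the calibration matrix K (absorbance per unit concentration, one row per wavelength and one column per chemical), the two chemicals, and the noise level are all assumptions, not data from a real instrument.

```python
import numpy as np

# Hypothetical calibration matrix: rows are wavelengths, columns are chemicals.
# K[i, j] = absorbance at wavelength i per unit concentration of chemical j.
K = np.array([
    [0.90, 0.10],
    [0.40, 0.50],
    [0.10, 0.80],
    [0.05, 0.30],
])

# Made-up "true" concentrations, used only to generate synthetic readings.
# The second chemical is absent from this sample.
c_true = np.array([0.2, 0.0])

# Synthetic absorbance readings: the linear model plus a little noise.
rng = np.random.default_rng(0)
A = K @ c_true + rng.normal(scale=0.002, size=K.shape[0])

# More equations (4 wavelengths) than unknowns (2 chemicals),
# so we ask for the least-squares solution of K c = A.
c_est, residuals, rank, _ = np.linalg.lstsq(K, A, rcond=None)

print("estimated concentrations:", c_est)
```

An estimate close to zero for a chemical suggests it is not in the sample. Deciding how close is "close enough" needs an idea of the measurement noise, which this sketch does not attempt.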

(2) Does higher rainfall correlate with higher temperatures?

Folklore says that the harsher the summer, the better the monsoon. With data available online, one may compute a least-squares fit for a model and figure out whether the folklore makes sense.
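To make the idea concrete, here is a small sketch that fits a straight line by least squares and also reports the correlation coefficient. The straight-line model is my assumption for the illustration, the function name fit_line is hypothetical, and the temperature and rainfall numbers are invented, not real weather records.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares line y = slope * x + intercept, plus Pearson's r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    slope = (dx @ dy) / (dx @ dx)            # covariance / variance of x
    intercept = y.mean() - slope * x.mean()  # line passes through the means
    r = (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))
    return slope, intercept, r

# Invented numbers, purely for illustration.
summer_temp_c = [38.1, 39.4, 40.2, 41.0, 39.0, 42.3]
monsoon_rain_mm = [780, 810, 845, 900, 790, 930]

slope, intercept, r = fit_line(summer_temp_c, monsoon_rain_mm)
print(f"rain = {slope:.1f} * temp + {intercept:.1f}, r = {r:.2f}")
```

With real records, a value of r near zero would say the folklore has little support in the data, while a value near 1 would support it. Either way, a correlation alone does not tell us why the two move together.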

(3) Best way to mix.

In an industrial set-up you want to mix several fluids and do it in an energy-efficient manner. A standard way is to compare the performance of different types of mixers and, for each of them, fit a curve to motor speed and energy used. The minimum of the curve gives the operating speed at which that mixer's energy usage is lowest.
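As a sketch of this recipe for a single mixer, one can fit a parabola to the measurements and read off its vertex. The choice of a quadratic curve is an assumption on my part (the right curve depends on the mixer), best_speed is a hypothetical helper name, and the readings are made up.

```python
import numpy as np

def best_speed(speeds, energies):
    """Fit energy = a*s**2 + b*s + c and return the speed at the minimum."""
    a, b, c = np.polyfit(speeds, energies, deg=2)
    if a <= 0:
        # The parabola opens downward (or is a line): no minimum to report.
        raise ValueError("fitted curve has no minimum")
    return -b / (2 * a)  # vertex of the parabola

# Invented readings for one mixer: motor speed (rpm) and energy used (kJ).
speeds_rpm = [200, 400, 600, 800, 1000, 1200]
energy_kj = [9.1, 6.8, 5.6, 5.3, 6.1, 7.9]

print(f"most efficient speed is about {best_speed(speeds_rpm, energy_kj):.0f} rpm")
```

The answer is only trustworthy if it falls inside the range of speeds that were actually measured. Repeating the fit for each mixer type then lets you compare them at their own best operating points.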

(4) Do people living in big cities use less energy compared to people in towns?

This and other similar questions have been analyzed by Prof. Geoffrey West and his colleagues using best fits on data of (say) population density and electricity usage. You may check for yourself whether city dwellers pay lower electricity bills.
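If you want to try such a check yourself, one common approach in scaling studies is to assume a power law, y = c * x**beta, and fit a straight line to the logarithms. The sketch below does that. The power-law form is an assumption of the sketch, fit_power_law is a hypothetical name, and the numbers are synthetic, generated from a known exponent only to show that the fit recovers it.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = c * x**beta by a least-squares line on log-log axes."""
    beta, log_c = np.polyfit(np.log(x), np.log(y), deg=1)
    return np.exp(log_c), beta

# Synthetic data: usage grows more slowly than population (beta = 0.8).
population = np.array([1e4, 1e5, 1e6, 1e7])
usage = 3.0 * population ** 0.8

c, beta = fit_power_law(population, usage)
print(f"c = {c:.2f}, beta = {beta:.2f}")
```

An exponent below 1 means total usage grows more slowly than the x variable, so usage per head falls as x grows; an exponent above 1 means the opposite. With real data you would also want to look at the scatter around the line before believing the exponent.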

Often one is able not only to answer interesting questions using regression on data but also to gain insights into efficient designs and the behavior of complex systems. If you like to think about data and modeling, do watch this TED talk. Mathematical skills are powerful tools. May the power be with you.