
Analyzing the Effects of Seasonal Affective Disorder on Mental Health of People in London

July 14, 2022


Report by Omdena Coventry Chapter, UK

Author: Anshul Dixit

Contributors: Muhammad Hassan, Susan, Jezza, Amrita Mande, Evans, Phani Parsa, Tymon Lewandowski, Utpal Mishra, Gideon Oba, Tanisha Banik

Source Adobe



Many factors influence mental health. Weather is one that is often overlooked and underestimated, yet changing seasons affect mood in complex ways. This project by the Omdena Coventry (UK) Local Chapter examines how strongly weather is associated with mental health and why it deserves consideration among the factors that affect it.

The annual cost of addressing mental illness in the London population is close to £7.5 billion. This figure covers criminal justice and education services as well as health and social care spending on treatment. However, these expenses make up only a small portion of the £26 billion that London loses annually through problems such as decreased productivity and reduced quality of life. We first explain what Seasonal Affective Disorder (SAD) is and then examine the effect of weather on mental health using quantitative analysis (1).

The impact of Seasonal Affective Disorder (SAD)

Seasonal Affective Disorder (SAD) is the most likely route by which weather affects mental health. People with a history of mental illness are predominantly affected. SAD is not synonymous with the winter blues; it is much more severe. It most often affects people at northern latitudes, where winter days are shorter and the nights are longer and colder. Scientists believe SAD may be caused by a disrupted circadian rhythm: in winter it is common to spend most of the day in darkness, yet the body needs light on waking to trigger the release of certain hormones. Light therapy is one of the common ways of treating it. Below, we analyze SAD quantitatively and present our main findings.

Data collection     

Data collection was the most critical stage of our project. We collected data by web scraping with Python, BeautifulSoup, and Selenium, by scraping Twitter, and from NHS Digital websites. This gave us historical weather data for London, including daylight hours, and mental health reports for different regions of London. The final database took a while to assemble, but it contained everything required for our analysis. We also looked at alcohol abuse during SAD periods and obtained the relevant data using the same techniques.
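As a rough illustration of the table-scraping step, here is a minimal sketch. The team used BeautifulSoup and Selenium; this version swaps in Python's built-in html.parser so that it runs without extra dependencies. The HTML fragment, the column names, and the TableParser class are all hypothetical, and real pages will be structured differently.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being built
        self._cell = None  # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment with illustrative values, not the project's data.
SAMPLE_HTML = """
<table>
  <tr><th>date</th><th>daylight_hours</th></tr>
  <tr><td>2021-12-21</td><td>7.8</td></tr>
  <tr><td>2021-06-21</td><td>16.6</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *records = parser.rows
weather = [dict(zip(header, rec)) for rec in records]
print(weather)
```

Every value comes back as a string, so type conversion is left to the cleaning stage.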

Our methodology for data analysis

The difficulties we met during data collection led us to refine our objectives along the way, and team members came forward with suggestions on what to add to make the analysis work. Our methodology was simple yet insightful: we wanted to confirm the effect of seasons on the mental health of people in London through quantitative and qualitative analysis.

Data cleaning and processing

  • Data cleaning and data munging 
  • Handling missing information
  • Removing duplicate information
  • Modifying invalid data types and values
  • Introducing new relationships in the data
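The cleaning steps listed above could look roughly like the sketch below. It is not the team's pipeline: the field names (region, month, cases), the daylight lookup, and the sample rows are assumptions made up for illustration.

```python
def clean_records(raw_rows, daylight_by_month):
    """Clean raw report rows (dicts of strings) and attach daylight hours."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        # Data munging: normalise whitespace and capitalisation.
        region = (row.get("region") or "").strip().title()
        month = (row.get("month") or "").strip()
        value = (row.get("cases") or "").strip()

        # Handle missing information: drop rows lacking a required field.
        if not (region and month and value):
            continue

        # Modify invalid data types and values: the count must be a
        # non-negative integer (thousands separators are tolerated).
        try:
            cases = int(value.replace(",", ""))
        except ValueError:
            continue
        if cases < 0:
            continue

        # Remove duplicate information: keep the first row per (region, month).
        key = (region, month)
        if key in seen:
            continue
        seen.add(key)

        # Introduce a new relationship: link each report to the weather data.
        cleaned.append({
            "region": region,
            "month": month,
            "cases": cases,
            "daylight_hours": daylight_by_month.get(month),
        })
    return cleaned


# Hypothetical rows; the numbers are placeholders, not NHS figures.
raw = [
    {"region": " central london ", "month": "2021-12", "cases": "1,250"},
    {"region": "Central London", "month": "2021-12", "cases": "1250"},  # duplicate
    {"region": "North London", "month": "2021-12", "cases": ""},        # missing
    {"region": "South London", "month": "2021-06", "cases": "n/a"},     # invalid
    {"region": "South London", "month": "2021-06", "cases": "980"},
]
daylight = {"2021-12": 7.9, "2021-06": 16.5}

clean = clean_records(raw, daylight)
for record in clean:
    print(record)
```

Dropping incomplete rows is only one way of handling missing information; imputing values may suit some fields better.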

Analysis of dataset

The initial analysis of the source data involved the following steps:

  • Fetching the data for the City of London and its various regions from the NHS website.
  • Visualizing trends between weather and mental health reports using Python libraries such as Seaborn and Matplotlib.
  • Identifying any relationship gaps.
  • Considering alcohol abuse during SAD periods.
  • Performing statistical analysis and visualization to understand the data better and to check whether it satisfies the business use case.
  • Topic modelling of Twitter data related to SAD and its associated terms:
    • Using Latent Dirichlet Allocation (LDA) and computing perplexity and coherence scores.
    • Identifying how the coherence value relates to the number of topics.
Below are the graphs our team produced in Python to compare daylight hours with the mental health measure value in all the regions of London. We were satisfied with the analysis and with how the visualizations turned out.

In all the graphs, the measured value signifies the number of reported cases related to mental illness.

Central and West London


In Central and West London, the number of mental health reports is highest during peak winter, when daylight hours are lowest.
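One way to put a number on the inverse pattern described above is a Pearson correlation between monthly daylight hours and reported cases. The sketch below uses placeholder values chosen only to illustrate the calculation; they are not the project's data, and the pearson_r helper is introduced here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder monthly values (January to December), for illustration only.
daylight_hours = [8.1, 9.9, 11.9, 13.9, 15.6, 16.6, 16.1, 14.5, 12.5, 10.5, 8.8, 7.8]
reported_cases = [130, 120, 105, 95, 85, 80, 82, 90, 100, 112, 125, 135]

r = pearson_r(daylight_hours, reported_cases)
print(f"Pearson r = {r:.3f}")
```

A value of r close to -1 would indicate that cases rise as daylight falls. A correlation of this kind describes an association and does not by itself show that daylight causes the change.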

Southern London


Northern London


Meanwhile, in Northern and Southern London, we can see the effect of Summer SAD alongside Winter SAD. About 10% of people with SAD may have reversed symptoms: they feel better during the winter but worse during the summer. One possible reason is too much sunlight, which suppresses the body's production of melatonin, the hormone that drives the sleep-wake cycle. Summer SAD may produce symptoms different from those of Winter SAD, such as sleeplessness, weight loss, and feelings of misery.

SAD and Alcohol Abuse

Next, we analyzed the relationship between SAD and alcohol abuse in London. Alcohol and mental health are closely linked. Alcohol abuse was highest during Quarter 4, when days shorten as winter sets in. The possible coexistence of SAD and alcoholism highlights the value of a thorough clinical study. Professionals in the fields of mental health and drugs and alcohol should receive training to help them recognize, treat, and refer patients who have co-occurring alcoholism and SAD (2).

Alcohol Abuse Data visualization


Quarter   Months
Q1        January – March
Q2        April – June
Q3        July – September
Q4        October – December

*Estimates of alcohol abuse signify the number of reported issues in a particular quarter.
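The quarter definitions above translate directly into a small aggregation step. This is a sketch under assumptions: the helper names and the monthly counts are placeholders introduced for illustration, not the reported figures.

```python
from collections import defaultdict


def month_to_quarter(month):
    """Map a calendar month number (1-12) to its quarter label, Q1 to Q4."""
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month number: {month}")
    return f"Q{(month - 1) // 3 + 1}"


def totals_by_quarter(monthly_counts):
    """Sum monthly counts (keyed by month number) into quarterly totals."""
    totals = defaultdict(int)
    for month, count in monthly_counts.items():
        totals[month_to_quarter(month)] += count
    return dict(totals)


# Placeholder counts keyed by month number, for illustration only.
monthly_reports = {
    1: 40, 2: 38, 3: 35,
    4: 30, 5: 28, 6: 27,
    7: 26, 8: 27, 9: 31,
    10: 36, 11: 42, 12: 48,
}

quarterly = totals_by_quarter(monthly_reports)
peak_quarter = max(quarterly, key=quarterly.get)
print(quarterly, peak_quarter)
```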

We also used Twitter data to analyze depression-related terms in London tweets from 2020 to 2022 and produced the word cloud below. Most people reported feeling sad, along with anxiety, despair, and the blues; prominent words included night, day, hopeless, and lockdown. These terms were prevalent almost the whole year but were tweeted most during the winter season.

Word-cloud Twitter Data on SAD


On Latent Dirichlet Allocation (LDA)

Latent Dirichlet Allocation (LDA) is a type of topic model used to assign the text in a document to a topic. It builds a topics-per-document model and a words-per-topic model, both modeled as Dirichlet distributions. Our team used the Gensim and spaCy libraries to run LDA on the corpus fetched through the Twitter API.
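The generative story that LDA assumes can be written compactly. The notation below is the standard textbook one and is introduced here for illustration, not taken from the project: K is the number of topics, D the number of documents, α and β are the Dirichlet hyperparameters, θ_d is the topic mixture of document d, φ_k is the word distribution of topic k, z_{d,n} is the topic assigned to the n-th word of document d, and w_{d,n} is that word.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1, \dots, K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1, \dots, D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d) \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
\end{align*}
```

Fitting the model means inferring θ and φ from the observed words, which yields the topics-per-document and words-per-topic distributions described above.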

After creating the topic model, the team visualized the topics. The left panel shows the topics as circles in a two-dimensional plane; their centers are computed from the Jensen-Shannon divergence between topics and then projected into two dimensions using multidimensional scaling. The area of each circle encodes the overall prevalence of that topic. In the right panel, the team found the second set of topics to be relevant to the word “sad” (note: this is the general term “sad”, meaning unhappy, not SAD). Although the term “sad” was used prominently throughout the year, it was used most during the winter season, according to our analysis. Alongside “sad”, people most often talked about leaves and family, which suggests a link between depression, leaves, and spending time with family.
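For readers unfamiliar with the distance measure mentioned above, the sketch below computes the Jensen-Shannon divergence between two topics, each represented as a probability distribution over the same vocabulary. The two toy distributions are made up for illustration; logarithms are taken in base 2, so the result lies between 0 and 1.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 add 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two distributions over the same support."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy word distributions for two topics over a four-word vocabulary.
topic_a = [0.5, 0.3, 0.1, 0.1]
topic_b = [0.1, 0.1, 0.3, 0.5]

jsd = jensen_shannon_divergence(topic_a, topic_b)
print(f"JSD(topic_a, topic_b) = {jsd:.3f}")
```

Topics with similar word distributions have a divergence near 0 and end up close together once the pairwise divergences are projected into two dimensions.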

Visualisation of Topics 


To evaluate our topic model, we plotted the coherence score against the number of topics and found that it increased with the number of topics.

The coherence score in topic modeling measures how interpretable the topics are to humans. Each topic is represented by the top N words with the highest probability of belonging to it. Briefly, the coherence score measures how similar these words are to each other.
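To make the idea concrete, here is a simplified UMass-style coherence, one of several coherence measures and not necessarily the one the team computed with Gensim. It scores a topic by how often its top words appear together in the same documents. The function name, the toy documents, and the word lists are made up for illustration.

```python
import math


def umass_coherence(top_words, documents, epsilon=1.0):
    """Average log co-occurrence ratio over ordered pairs of a topic's top words.

    top_words must be ordered from most to least probable. Scores are zero or
    negative, and values closer to zero indicate a more coherent topic.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    scores = []
    for i in range(1, len(top_words)):
        for j in range(i):
            w_i, w_j = top_words[i], top_words[j]  # w_j is the higher-ranked word
            d_j = doc_freq(w_j)
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            scores.append(math.log((doc_freq(w_i, w_j) + epsilon) / d_j))
    if not scores:
        raise ValueError("no usable word pairs for this topic")
    return sum(scores) / len(scores)


# Toy tokenised documents, for illustration only.
docs = [
    ["winter", "dark", "tired", "sad"],
    ["winter", "dark", "cold"],
    ["summer", "sun", "beach"],
    ["winter", "tired", "sleep"],
    ["sun", "beach", "holiday"],
]

coherent_topic = ["winter", "dark", "tired"]
mixed_topic = ["winter", "beach", "holiday"]

coherent_score = umass_coherence(coherent_topic, docs)
mixed_score = umass_coherence(mixed_topic, docs)
print(coherent_score, mixed_score)
```

The topic whose words co-occur in the same documents scores higher than the topic that mixes unrelated words, which matches the intuition that a coherent topic is easier for a person to interpret.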

Coherence score; number of topics


Our analysis is therefore in line with our hypothesis and supports its validity. Lastly, we thank our team members for putting in days of hard work to collect data from various sources, perform the analysis, and present the results.


Unfortunately, mental illness due to SAD remains one of the least understood of all health issues, and this lack of understanding prevents people from seeking help. It is time we addressed mental illness linked to weather changes and its impact on our society. This report sheds light on the scope and severity of weather-related mental illness in London. It presents concrete insights on the effect of weather on people's mental health and on the need to understand that effect in order to prepare for and deal with it effectively.

