The Longevity Project: Identifying Counties With the Highest Life Expectancies in the U.S.

Start Date: July 30, 2024


Challenge Background

Everyone has the desire to live a long and healthy life. Imagine if there were specific "cheat codes" that could help us unlock the secret to living to be a hundred or more. Before 2000, these "cheat codes," also known as "blue zones," were not widely recognized. These zones were discovered by Dan Buettner of National Geographic. After extensive exploration, Dan and his team pinpointed five regions around the world where people live longer and experience lower rates of chronic illness. These areas are Sardinia (Italy), Okinawa (Japan), Loma Linda (California), Ikaria (Greece), and Nicoya (Costa Rica). The concept of the blue zone was born from their discoveries.

The Problem

As of last year, U.S. healthcare spending totaled $4.5 trillion, an average of $13,493 per person. Despite spending more than any other developed nation, the U.S. ranks below its peers on several health metrics. Healthcare expenses had been rising gradually before the pandemic, and the pandemic has exacerbated the issue. At present, life expectancy in the U.S. is lower than in China for the first time in decades. Because of the flaws in our healthcare system, Americans today are living shorter and unhealthier lives. Americans also have the highest rates of several chronic conditions, yet they visit the doctor less frequently than people in other developed countries.

Cancer care paints a different picture of the American healthcare system. According to recent CNN data, the U.S. has the highest number of breast cancer screenings among women ages 50 to 69, and it excels in its average cancer screening rate. Additionally, according to the American Cancer Society, the cancer death rate has decreased by 33% since 1991.

The inconsistency in the health data highlights the complexity of the U.S. healthcare system. Comparing health outcomes nationwide and against other countries is challenging because of the size of the country and its 50 states, each with distinct policies, topography, populations, racial makeup, and health ecosystems. To understand the American healthcare system fully, we must analyze the subtleties of local healthcare systems and identify the areas that need improvement.

This project aims to identify the American counties with the highest number of centenarians, that is, people who live to 100 years of age or more, and to investigate the factors contributing to such longevity. The data covers a wide range of healthcare variables, environmental conditions, lifestyle decisions, socioeconomic status, and demography. Finding these “blue zones” will help us understand the American healthcare system and overall health at a local level in the U.S. By doing this, we can have a broader impact on public policy and encourage healthy living.

This project also intends to foster collaboration with local governments so they can implement appropriate policies, rules, and cost analyses. A local-level analysis allows governments to compare underperforming counties with outperforming ones while also providing insight into rural health.

Goal of the Project

  • Examine the impact of healthcare variables, environmental conditions, lifestyle choices, socioeconomic status, and demography on life expectancy.
  • Explore the disparities in these factors between rural and urban areas. Investigate how environmental conditions, lifestyle choices, socioeconomic status, and demography correlate with life expectancy.
  • Evaluate the reasons behind variations in life expectancy among U.S. counties.

Project Timeline

Week 1: Project Discussion

  • Research the factors relevant to identifying Blue Zone counties.
  • Locate and compile information from government agencies, such as the U.S. Census Bureau, EPA, etc.
  • Understand what Blue Zones are
  • Get a proper objective and outline of the project
  • Identify the data sources we will collect data from
  • Finalize the variables we need to collect data on
  • Finalize task leads for each phase of the project
  • Design task assignment for each phase of the project

Week 2: Data Collection

  • Identify data sources
  • Understand and define the data requirements
  • Gather data from online databases
  • Reach out to the relevant local department if some data is not available
  • Consolidate data in a central repository
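
The consolidation step above could be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes every source identifies counties by a county FIPS code (a join key the project description does not specify), and the function names, field names, and sample values are hypothetical placeholders rather than real statistics.

```python
from collections import defaultdict


def normalize_fips(code):
    """Return a 5-character, zero-padded county FIPS string.

    Sources often store the code as an integer, which drops leading zeros.
    """
    return str(code).strip().zfill(5)


def consolidate(*sources):
    """Merge records from several sources into one dict per county.

    Each source is a list of dicts that carries a "fips" field (assumed).
    Later sources overwrite earlier ones if they share a field name.
    """
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            key = normalize_fips(record["fips"])
            for field, value in record.items():
                if field != "fips":
                    merged[key][field] = value
    return dict(merged)


# Hypothetical placeholder rows, not real data.
census_rows = [
    {"fips": 6071, "population": 1000},      # integer code, leading zero lost
    {"fips": "01001", "population": 500},
]
air_quality_rows = [
    {"fips": "06071", "pm25": 9.5},
]

counties = consolidate(census_rows, air_quality_rows)
```

Normalizing the key before merging matters because the same county can otherwise appear twice, once as `6071` and once as `"06071"`.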

Week 3: Data Cleaning

  • Load data into a central system for cleaning
  • Use a relational database like MySQL or a data lake to store our data
  • Design an ETL pipeline to transform the data with tools like Apache
  • Ensure data integrity during data migration
  • Data profiling: identify key columns and data types, assess data quality, etc.
  • Clean and preprocess the data so it is ready for analysis: handle duplicates, missing values, and outliers, and normalize or standardize the data
  • Address any inconsistencies with the dataset.
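
The cleaning steps above could look roughly like this in plain Python. It is a sketch under stated assumptions, not the project's method: the field names, the choice of median imputation, the 1.5 × IQR outlier rule, and z-score standardization are illustrative choices, and the sample rows are made-up placeholders.

```python
import statistics


def drop_duplicates(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def impute_median(rows, field):
    """Replace None in `field` with the median of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = statistics.median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def iqr_outliers(values, k=1.5):
    """Flag values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [(v - mean) / spread for v in values]


# Hypothetical placeholder rows, not real data.
raw = [
    {"fips": "00001", "life_expectancy": 78.0},
    {"fips": "00002", "life_expectancy": None},   # missing value
    {"fips": "00002", "life_expectancy": None},   # duplicate county
    {"fips": "00003", "life_expectancy": 80.0},
    {"fips": "00004", "life_expectancy": 82.0},
]

cleaned = impute_median(drop_duplicates(raw, "fips"), "life_expectancy")
z_scores = standardize([r["life_expectancy"] for r in cleaned])
flags = iqr_outliers([78.0, 79.0, 80.0, 81.0, 82.0, 120.0])
```

Whether to drop, cap, or keep a flagged outlier is a judgment call; a county with an unusually high value may be exactly the kind of signal the project is looking for.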

Week 4: Data Analysis, Data Visualization and Reporting

  • Allocate tasks to each member properly depending on the task sheet
  • Conduct EDA to identify patterns and trends
  • Perform statistical analysis to determine the variables contributing to longevity.
  • Conduct geospatial analysis to map centenarian distribution and relevant variables across counties.
  • Create visualizations and dashboards to present key insights.
  • Develop a predictive model
  • Develop a comprehensive report detailing the analysis, findings, and recommendations.
  • Present our findings to stakeholders
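
As one possible starting point for the statistical analysis and predictive modeling steps above, the sketch below computes a Pearson correlation and fits a one-variable least-squares line. The choice of technique is an assumption, since the project description does not name a model. The predictor (`smoking_rate`) and all numbers are hypothetical placeholders, not findings.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for a single x under the fitted line."""
    return slope * x + intercept


# Hypothetical placeholder values per county, not real data.
smoking_rate = [10.0, 15.0, 20.0, 25.0]        # percent of adults (assumed)
life_expectancy = [82.0, 80.5, 77.5, 76.0]     # years (assumed)

r = pearson(smoking_rate, life_expectancy)
slope, intercept = fit_line(smoking_rate, life_expectancy)
estimate = predict(30.0, slope, intercept)
```

A strong correlation here would only suggest an association across counties; it would not show that the variable causes longer or shorter lives, so any policy recommendation should treat such results as leads to investigate.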

What you'll learn

1. Data Management and Preprocessing:

  • Skills in data collection, cleaning, and preprocessing, ensuring data integrity and readiness for analysis.

2. Advanced Analytical Techniques:

  • Experience in conducting exploratory, statistical, and geospatial analyses to uncover patterns and key factors in complex data.

3. Predictive Modeling:

  • Proficiency in developing and applying predictive models using machine learning techniques to forecast longevity trends.

4. Data Visualization and Communication:

  • Expertise in creating interactive visualizations and comprehensive reports to effectively communicate data insights to diverse audiences.

5. Project Management and Public Health Insights:

  • Competence in managing project timelines, collaborating with team members, and formulating actionable policy recommendations based on data-driven findings.

First Omdena Local Chapter Project?

Beginner-friendly, but also welcomes experts

Education-focused

Duration: 4 to 8 weeks

Open-source



Your Benefits

Address a significant real-world problem with your skills

Build your project portfolio

Access paid projects (as an Omdena Top Talent)

Get hired at top organizations



Requirements

Good English

Suitable for AI/Data Science beginners but also more senior collaborators

Learning mindset


