This project implements a machine learning-based web application for predicting obesity levels based on various lifestyle and health factors. The application uses XGBoost for classification and is deployed using Streamlit, providing an interactive and user-friendly interface for predictions.
The project uses the "Obesity Dataset", which describes individuals' lifestyle and health factors and combines raw and synthetic records. The target variable (NObeyesdad) is multi-class, with each class representing a different obesity level.
- Demographic Information:
  - Gender
  - Age
  - Height
  - Weight
  - Family history of obesity
- Eating Habits:
  - FAVC (Frequent high caloric food consumption)
  - FCVC (Vegetable consumption frequency)
  - NCP (Number of main meals)
  - CAEC (Food between meals)
  - CH2O (Daily water consumption)
  - CALC (Alcohol consumption frequency)
- Physical Activity:
  - FAF (Physical activity frequency)
  - TUE (Time spent using technology devices)
  - SCC (Calorie consumption monitoring)
- Transportation:
  - MTRANS (Mode of transportation)
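To make the feature list concrete, the sketch below turns one record into a numeric row that a classifier could consume. It is illustrative only: the category values, their ordering, and the column order are assumptions rather than the project's actual encoding (the repository ships its own `label_encoder.joblib`), and family history and MTRANS are left out for brevity. `encode_record` is a hypothetical helper name.

```python
from typing import Dict, List, Union

Value = Union[str, float]

# Assumed category levels and orderings (illustrative, not taken from the project).
CATEGORY_LEVELS: Dict[str, List[str]] = {
    "Gender": ["Female", "Male"],
    "FAVC": ["no", "yes"],
    "SCC": ["no", "yes"],
    "CAEC": ["no", "Sometimes", "Frequently", "Always"],
    "CALC": ["no", "Sometimes", "Frequently", "Always"],
}

# Assumed column order of the encoded row.
FEATURE_ORDER: List[str] = [
    "Gender", "Age", "Height", "Weight", "FAVC", "FCVC", "NCP",
    "CAEC", "CH2O", "SCC", "FAF", "TUE", "CALC",
]


def encode_record(record: Dict[str, Value]) -> List[float]:
    """Map one record to a list of floats, ordinal-encoding categorical fields."""
    row: List[float] = []
    for name in FEATURE_ORDER:
        value = record[name]
        if name in CATEGORY_LEVELS:
            levels = CATEGORY_LEVELS[name]
            if value not in levels:
                raise ValueError(f"Unexpected value {value!r} for {name}")
            # Ordinal code: position of the value in the assumed level list.
            row.append(float(levels.index(value)))
        else:
            row.append(float(value))
    return row
```

Calling `encode_record` with a dictionary that holds all thirteen keys returns a row of thirteen floats; an unknown category raises `ValueError` instead of being silently mapped.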
Distribution of different obesity levels in the dataset
Correlation matrix showing relationships between different features
XGBoost model feature importance analysis
Distribution of BMI across different obesity levels
Relationship between age and weight across different obesity levels
- Interactive web interface for inputting personal health and lifestyle data
- Real-time obesity level prediction using XGBoost model
- BMI calculation and visualization
- Feature importance analysis
- Personalized health recommendations
- Responsive and modern UI design
- Frontend: Streamlit
- Machine Learning: XGBoost
- Data Processing: Pandas, NumPy
- Visualization: Plotly, Matplotlib, Seaborn
- Model Serialization: Joblib, Pickle
- Clone the repository:
    git clone [repository-url]
    cd obesity-prediction
- Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install required packages:
    pip install -r requirements.txt
- Start the Streamlit application:
    streamlit run app.py
- Open your web browser and navigate to the provided local URL (typically http://localhost:8501)
- Input your information in the form:
  - Personal Information (Age, Gender, Height, Weight)
  - Eating Habits
  - Physical Activity
  - Transportation Details
- Click "Predict Obesity Level" to get your results
The application uses an XGBoost classifier trained on various features including:
- Demographic information
- Eating habits
- Physical activity levels
- Transportation patterns
- Lifestyle choices
XGBoost was chosen because it works well on tabular data that mixes numerical features with encoded categorical features. The model provides:
- High accuracy in obesity level prediction
- Feature importance analysis
- Robust performance across different obesity categories
The application provides:
- Predicted obesity level
- BMI calculation and category
- Visual BMI gauge
- Feature importance analysis
- Personalized health recommendations based on BMI category
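As an illustration of the BMI output, the sketch below computes BMI as weight divided by height squared and maps it to the standard WHO adult ranges. It assumes weight in kilograms and height in metres; the function names and the category labels are illustrative, and the application's own thresholds or wording may differ.

```python
def compute_bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """Map a BMI value to the standard WHO adult ranges."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Normal weight"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"
```

For example, `compute_bmi(70, 1.75)` is about 22.9, which `bmi_category` places in "Normal weight".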
obesity-prediction/
├── app.py # Main Streamlit application
├── requirements.txt # Project dependencies
├── xgb_model.json # Trained XGBoost model
├── label_encoder.joblib # Label encoder for categorical variables
├── Obesity_Prediction.ipynb # Jupyter notebook with analysis and model training
├── images/ # Directory containing visualization images
│ ├── obesity_distribution.png
│ ├── correlation_matrix.png
│ ├── feature_importance.png
│ ├── bmi_distribution.png
│ └── age_weight.png
└── README.md # Project documentation
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Dataset source: Obesity Dataset (Raw and Synthetic)
- XGBoost documentation
- Streamlit documentation