London-Real-Estate-Investment-Estimator

An End-to-End Machine Learning tool that identifies undervalued properties in London by analyzing Geospatial, Textual, and Physical data.

🚀 The Business Problem

Real Estate investors in London face a massive data problem: Price does not equal Value.

A property listed for $100/night might be a steal, while one at $150 might be overpriced junk.

Traditional analysis looks at simple averages, missing the nuance of “Vibe” (guest reviews) and precise location dynamics.

The Solution: I built an End-to-End Machine Learning Engine that:

  1. Estimates Fair Value for any property based on 150+ features.
  2. Identifies Arbitrage Opportunities: Properties where Actual Price < Predicted Price (Undervalued Gems).
  3. Quantifies “Vibe”: Uses NLP to turn guest sentiment into hard data.
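
The arbitrage rule in step 2 can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the notebook: the `Listing` class, the `find_undervalued` helper, the 10% minimum gap, and the sample prices are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical record; the real project works on a much wider feature table.
    listing_id: str
    actual_price: float     # nightly price shown on the listing
    predicted_price: float  # fair value estimated by the model


def find_undervalued(listings, min_gap_pct=10.0):
    """Return (listing, gap_pct) pairs where the actual price sits at least
    min_gap_pct percent below the predicted price, largest gap first."""
    flagged = []
    for item in listings:
        if item.predicted_price <= 0:
            continue  # skip rows with an unusable prediction
        gap_pct = (item.predicted_price - item.actual_price) / item.predicted_price * 100
        if gap_pct >= min_gap_pct:
            flagged.append((item, gap_pct))
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged


# Made-up sample data, for illustration only.
sample = [
    Listing("A", actual_price=100.0, predicted_price=130.0),
    Listing("B", actual_price=150.0, predicted_price=120.0),
    Listing("C", actual_price=95.0, predicted_price=100.0),
]

for listing, gap in find_undervalued(sample):
    print(f"{listing.listing_id}: listed {gap:.1f}% below predicted value")
```

A minimum gap is used here so that listings priced within the model's likely error margin are not flagged; the right threshold depends on how accurate the price model is.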

📊 Key Results & Insights

1. What actually drives price?

TOP 15 Factors

2. The Market Map

London Heatmap

3. Streamlit Deployment

Streamlit Deployment


πŸ› οΈ Tech Stack & Methodology

This project follows a 5-layer Data Science architecture:

| Layer | Technology | Description |
|---|---|---|
| 1. Engineering | Pandas, NumPy | Cleaned 96,000 raw rows, parsed complex text (Amenities), and imputed missing values. |
| 2. Geospatial | Folium, Haversine | Engineered Distance_to_Center feature and built interactive cluster maps. |
| 3. NLP | NLTK (VADER) | Analyzed 50,000+ guest reviews to generate a Sentiment_Score (-1 to +1) for every listing. |
| 4. Machine Learning | XGBoost, Scikit-Learn | Trained a Gradient Boosting Regressor to predict price with 82% accuracy. |
| 5. Deployment | Streamlit | Built a user-facing app for investors to test scenarios (e.g., “What if I add a pool?”). |
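
For readers unfamiliar with the geospatial layer, the sketch below shows how a Distance_to_Center value could be derived with the haversine formula. It implements the formula directly with the standard library rather than calling a haversine package, and the reference point (central London near Charing Cross) is an assumption; the notebook may use a different center.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Assumed reference point for "center" (approximate, near Charing Cross).
CENTER_LAT, CENTER_LON = 51.5074, -0.1278


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_center(lat, lon):
    """Distance from a listing to the assumed city center, in kilometres."""
    return haversine_km(lat, lon, CENTER_LAT, CENTER_LON)


# Approximate coordinates of Heathrow Airport, used only as a sample point.
print(f"{distance_to_center(51.4700, -0.4543):.1f} km from the center")
```

Applied row by row to the latitude and longitude columns, a function like this yields one distance value per listing.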

πŸ“‚ Repository Structure

├── app.py                   # The Streamlit Application (Deployment)
├── notebook_analysis.ipynb  # The Core Analysis (Cleaning, EDA, ML Training)
├── xgb_model.json           # Trained XGBoost Model
├── model_columns.joblib     # Saved feature names for consistency
├── app_data.csv             # Lightweight data for the map visualization
├── requirements.txt         # List of libraries
├── README.md                # Project Documentation
└── Visuals/                 # Screenshots for README
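
Saving the feature names matters because the model expects its inputs in the same column order it saw during training. The sketch below illustrates the idea in plain Python; it is an assumption about how the saved names might be used, not the code in app.py. The `align_features` helper and the `has_pool` column are hypothetical, and with pandas the same effect is usually achieved with `DataFrame.reindex(columns=..., fill_value=0)`.

```python
def align_features(row, model_columns, fill_value=0.0):
    """Return the row's values in the saved column order.

    Columns missing from the row are filled with fill_value;
    keys the model was never trained on are dropped.
    """
    return [float(row.get(col, fill_value)) for col in model_columns]


# Hypothetical stand-in for the list stored in model_columns.joblib.
model_columns = ["Distance_to_Center", "Sentiment_Score", "has_pool"]

# A scenario entered by a user, with one field missing and one extra.
user_input = {"has_pool": 1, "Distance_to_Center": 3.2, "unused_field": 99}

vector = align_features(user_input, model_columns)
print(vector)
```

Without this step, a scenario form that omits a field or lists fields in a different order could silently feed values into the wrong model inputs.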

Author

Ahmed A. Elatwy

Built as a Capstone Project demonstrating Full-Stack Data Science capabilities.