Seattle Airbnb Data Analysis

A blog post for this project is available on Medium: "A deep look into Seattle Airbnb business".

Description

This project analyses the "Seattle Airbnb Open Data" dataset from Kaggle. The analysis focuses on three questions and uses both exploratory data analysis (EDA) and modeling to answer them. The project has three parts.

1. Data Exploration

In this part, I briefly inspect the three datasets: their rows and columns, and whether any rows or columns contain nulls or duplicates.
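A minimal sketch of these checks, assuming pandas. The helper name `summarize` and the toy DataFrame are hypothetical stand-ins, not code from the notebook; in practice the same helper would be applied to each CSV loaded with `pd.read_csv`.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect row/column counts, per-column null counts, and duplicate rows."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "null_counts": df.isnull().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Made-up stand-in for one of the datasets (column names are assumptions).
toy = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "price": ["$85.00", "$95.00", "$95.00", None],
})

summary = summarize(toy)
print(summary)
```

Running the same helper over calendar.csv, listings.csv, and reviews.csv gives a quick side-by-side view of where nulls and duplicates sit before any cleaning.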

2. Data wrangling

In this part, I clean up the 'listings' dataset and prepare three sub-datasets, one for each of my questions.
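The exact cleaning steps live in the notebook. As a rough illustration only, the sketch below assumes two common steps for this kind of data: dropping columns that are entirely null and converting a currency-formatted `price` string to a number. The function name `clean_listings` and the column names are assumptions.

```python
import pandas as pd


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Drop all-null columns and turn price strings like '$1,200.00' into floats."""
    out = df.dropna(axis=1, how="all").copy()
    out["price"] = (
        out["price"]
        .str.replace(r"[$,]", "", regex=True)  # strip currency symbol and thousands separator
        .astype(float)
    )
    return out


# Made-up rows; 'license' is an all-null column to show the drop.
toy = pd.DataFrame({
    "id": [1, 2],
    "price": ["$85.00", "$1,200.00"],
    "license": [None, None],
})

cleaned = clean_listings(toy)

# A per-question sub-dataset is then just a column selection on the cleaned frame.
price_subset = cleaned[["id", "price"]]
print(price_subset)
```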

3. Data analysis

In this part, I answer my questions. For the first two questions, I use EDA to find the answers; for question 3, I train a model and assess its accuracy.

Questions and Findings

Question 1: Which kinds of property are the favorites on Airbnb?

On the host side, 'entire home/apartment' is the room type hosts list most often, and houses and apartments are their favorite property types to list.

On the customer side, guests are more willing to write reviews for the property type 'Cabin' and the room type 'Private room'.
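One way to compare the host and customer views is sketched below, assuming pandas. It counts listings per property type for the host side and averages reviews per listing for the customer side. The column names and all values are made up for illustration; the metric actually used in the notebook may differ.

```python
import pandas as pd

# Made-up listings; these numbers do not come from the real data.
toy = pd.DataFrame({
    "property_type": ["House", "House", "Apartment", "Apartment", "Apartment", "Cabin"],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Entire home/apt", "Private room", "Entire home/apt"],
    "number_of_reviews": [10, 30, 5, 8, 20, 60],
})

# Host side: how many listings exist for each property type.
listing_counts = toy["property_type"].value_counts()

# Customer side: average number of reviews per listing, highest first.
mean_reviews = (
    toy.groupby("property_type")["number_of_reviews"]
    .mean()
    .sort_values(ascending=False)
)

print(listing_counts)
print(mean_reviews)
```

Averaging per listing, rather than summing, keeps a rare property type from being hidden by the sheer number of houses and apartments.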

Question 2: Which neighborhoods are most popular?

Based on zip code, the 98122 area is the most popular neighborhood for listings.
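A ranking like this can be produced by counting listings per zip code. The sketch below assumes a `zipcode` column stored as strings; the rows are made up and only show the mechanics.

```python
import pandas as pd

# Made-up zip codes, including one missing value.
toy = pd.DataFrame({
    "zipcode": ["98122", "98122", "98103", "98122", None, "98103", "98101"],
})

# Count listings per zip code, ignoring rows with no zip code.
zip_counts = toy["zipcode"].dropna().value_counts()
top_zip = zip_counts.idxmax()

print(zip_counts)
print(top_zip)
```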

Question 3: How well can I predict the Airbnb price of a property based on the data?

Based on the model results, the best R-squared on the test data is 0.527. The best model uses 19 of the 38 features as predictors.
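The idea of trying several feature subsets and keeping the one with the best test R-squared can be sketched as follows, assuming scikit-learn. This is a simplified illustration, not the Udacity `find_optimal_lm_mod` function credited under References: the helper `best_cutoff_model`, the rule of keeping columns with more than `cutoff` non-zero values, and the synthetic data are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def best_cutoff_model(X, y, cutoffs, test_size=0.30, random_state=42):
    """Fit one linear model per cutoff and return the best (cutoff, test R^2, features)."""
    results = {}
    for cutoff in cutoffs:
        # Keep only columns with more than `cutoff` non-zero entries.
        mask = (X != 0).sum() > cutoff
        keep = mask[mask].index.tolist()
        if not keep:
            continue
        X_train, X_test, y_train, y_test = train_test_split(
            X[keep], y, test_size=test_size, random_state=random_state
        )
        model = LinearRegression().fit(X_train, y_train)
        results[cutoff] = (r2_score(y_test, model.predict(X_test)), keep)
    best = max(results, key=lambda c: results[c][0])
    return best, results[best][0], results[best][1]


# Synthetic data: price driven by two dense features plus a sparse, irrelevant flag.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "bedrooms": rng.integers(1, 5, size=n),
    "bathrooms": rng.integers(1, 4, size=n),
    "rare_flag": (rng.random(n) < 0.05).astype(int),
})
y = 50 * X["bedrooms"] + 30 * X["bathrooms"] + rng.normal(0, 5, size=n)

best_cutoff, best_r2, best_features = best_cutoff_model(X, y, cutoffs=[150, 20, 0])
print(best_cutoff, round(best_r2, 3), best_features)
```

Scoring on the held-out test split, as the reported figure does, guards against picking a feature set that only fits the training data well.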

Acknowledgement:

Files in this repo:

calendar.csv: Seattle Airbnb Open Data
listings.csv: Seattle Airbnb Open Data
reviews.csv: Seattle Airbnb Open Data
seattle_airbnb_data_analysis.ipynb: Notebook containing the data analysis

Libraries:

numpy == 1.25.2
pandas == 2.0.3
matplotlib == 3.7.1
seaborn == 0.13.1
scikit-learn == 1.2.2

References:

Source code:

Function name: find_optimal_lm_mod(X, y, cutoffs, test_size = .30, random_state=42, plot=True)

Lesson Title: Putting It All Together - Solution.ipynb

Source Code Name: AllTogether.py

Author: Udacity "Introduction to Data Science"

Date: 2022

Code version: N/A

Availability: Udacity-Intro-to-data-secience
