Welcome to my Machine Learning repository — a complete end-to-end journey from learning core concepts to building industry-grade ML and DL projects. This repo is structured for beginners to intermediate learners and includes:
- 📚 Daily topic-wise learning
- 🛠️ Real-world machine learning projects
- 📊 Data preprocessing and EDA notebooks
- 🤖 Algorithms from scratch
- 🔬 Deep learning experiments and research
- 🎯 Interview Q&A and ML readiness
```
Machine-learning/
│
├── Datasets/                     # Sample datasets used across projects
├── Day1-Python/                  # Python basics
├── Day2-EDA/                     # Exploratory Data Analysis
├── Day3-DataAnalysis/            # Data cleaning & feature exploration
├── Day4-oops/                    # OOP concepts in Python
├── Day5-Numpy/                   # NumPy operations and tricks
├── Day6-Pandas/                  # DataFrame operations
├── Day7-matplotlib/              # Data visualization - Matplotlib
├── Day8-seaborn/                 # Data visualization - Seaborn
│
├── FeatureEncoding/              # Label Encoding, One-Hot Encoding
├── FeatureScaling/               # Standardization, Normalization
├── missingValues/                # Handling nulls, imputation
│
├── Bagging/                      # Ensemble technique
├── Decision Tree/                # Tree-based classifier
├── GradientDescent/              # Manual GD implementation
├── K NearestNeighbors/           # KNN classifier
├── LR-Algorithm/                 # Linear Regression from scratch
├── LogisticRegression/           # Binary classifier implementation
├── Maths/                        # Stats, probability, algebra essentials
├── Naive Bayes/                  # Bayesian classifier
├── RandomForest/                 # Ensemble technique
│
├── InterviewQ&A/                 # ML and DL interview prep
├── Projects/                     # Full-scale projects (see below)
├── UnitOne/Two/Three/            # Course/module-based structured practice
├── SeeDataset/                   # Dataset viewing/inspection notebooks
│
├── batch_vs_stochastic.ipynb     # Compare batch sizes in optimization
├── dropout_classification.ipynb  # Dropout in deep learning
├── early_stopping.ipynb          # Overfitting prevention
├── feature_scaling.ipynb         # Feature scaling visualization
├── vanishing_gradient.ipynb      # Vanishing gradient exploration
│
├── Test.ipynb
├── main.py
└── .gitignore
```
This section includes hands-on practice by day:
| Day | Topic | Highlights |
|---|---|---|
| Day 1 | Python | Syntax, loops, functions, modules |
| Day 2 | EDA | Visual + statistical exploration |
| Day 3 | Data Analysis | Missing data, duplicates, types |
| Day 4 | OOPs | Classes, inheritance, encapsulation |
| Day 5 | NumPy | Arrays, slicing, reshaping |
| Day 6 | Pandas | Merging, grouping, filtering |
| Day 7 | Matplotlib | Visualization fundamentals |
| Day 8 | Seaborn | Heatmaps, Pairplots, Distribution plots |
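For a taste of what these notebooks cover (for example, the Day 6 Pandas material on filtering and grouping), here is a minimal sketch with made-up data; the actual notebooks may use the repo's sample datasets instead:

```python
import pandas as pd

# Hypothetical data, just to show filtering, grouping, and aggregation
df = pd.DataFrame({
    "city":  ["Bangalore", "Delhi", "Bangalore", "Mumbai"],
    "price": [120, 95, 150, 110],
})

over_100 = df[df["price"] > 100]                       # filtering
avg_price = over_100.groupby("city")["price"].mean()   # grouping + aggregation
print(avg_price)
```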
Covers preprocessing and core ML algorithm implementation from scratch:
Preprocessing:
- `missingValues/`: dropping, filling, interpolation
- `FeatureEncoding/`: LabelEncoder, OneHotEncoder
- `FeatureScaling/`: StandardScaler, MinMaxScaler, RobustScaler
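A minimal scikit-learn sketch of these three steps (imputation, encoding, scaling); the column names and values are made up for illustration, not taken from the repo's datasets:

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with a missing numeric value and a categorical column (hypothetical)
df = pd.DataFrame({
    "area": [1200.0, None, 1500.0],
    "city": ["Bangalore", "Delhi", "Bangalore"],
})

# missingValues/: fill the null with mean imputation
df["area"] = SimpleImputer(strategy="mean").fit_transform(df[["area"]]).ravel()

# FeatureEncoding/: one-hot encode the categorical column
city_ohe = OneHotEncoder().fit_transform(df[["city"]]).toarray()

# FeatureScaling/: standardize the numeric column (zero mean, unit variance)
area_scaled = StandardScaler().fit_transform(df[["area"]])

print(city_ohe)
print(area_scaled)
```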
Algorithms:
- Linear Regression & Logistic Regression
- KNN, Naive Bayes, Decision Tree, Random Forest
- Gradient Descent (manual + library-based)
- Bagging (ensemble methods)
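For the from-scratch side (e.g., `LR-Algorithm/` and `GradientDescent/`), here is a short illustration of the core idea of manual gradient descent on a toy linear-regression problem; it is a sketch of the technique, not the repo's exact implementation:

```python
import numpy as np

# Toy 1-D regression data: y ≈ 3x + 2 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 3 * x + 2 + rng.normal(0, 1, 100)

w, b, lr = 0.0, 0.0, 0.01
for _ in range(1000):
    y_pred = w * x + b
    error = y_pred - y
    # Gradients of mean squared error with respect to w and b
    dw = 2 * np.mean(error * x)
    db = 2 * np.mean(error)
    w -= lr * dw
    b -= lr * db

print(f"learned w={w:.2f}, b={b:.2f}")  # expected near w=3, b=2
```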
Each project is complete with data loading, preprocessing, model training, evaluation, and saving.
| Project | Description |
|---|---|
| 📊 BangaloreHousePricePrediction | Regression model to estimate house prices in Bangalore |
| 📚 BookRecommenderSystem | Recommender using cosine similarity and collaborative filtering |
| 🚗 CarPricePredictor | ML regression model for second-hand car pricing |
| 🛍️ E-CommerceProductsRecommendationSystem | Personalized product recommendations |
| ✉️ EmailSpamFilter | NLP-based spam classifier |
| 🪙 GoldPricePredictor | Time-series forecasting of gold prices |
| 💻 LaptopPricePredictor | Price prediction based on specs |
| 🎥 MovieRecommenderSystem | Hybrid model (Content + Collaborative Filtering) |
| 📱 sms-spam-detection | SMS spam classifier using Naive Bayes |
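Several of the recommenders (e.g., BookRecommenderSystem) rely on cosine similarity between item vectors. A minimal sketch of that idea with made-up vectors, not the projects' actual features:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical item feature vectors (e.g., rating or TF-IDF columns)
items = np.array([
    [5, 0, 3],   # item 0
    [4, 0, 4],   # item 1
    [0, 5, 1],   # item 2
])

sim = cosine_similarity(items)

# Most similar item to item 0 (excluding itself)
best = np.argsort(sim[0])[::-1][1]
print(f"item most similar to item 0: {best}")  # item 1 here
```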
| Notebook | What it Shows |
|---|---|
| batch_vs_stochastic.ipynb | Mini-batch vs full-batch training |
| dropout_classification.ipynb | Dropout effect on neural networks |
| early_stopping.ipynb | Preventing overfitting during training |
| feature_scaling.ipynb | Importance of normalization |
| vanishing_gradient.ipynb | Effect on deep neural networks |
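The notebooks above are deep-learning experiments. Purely as an illustration of the early-stopping idea using scikit-learn from the Tech Stack below (the notebooks themselves may use a different framework and setup), a minimal sketch:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic classification data just for demonstration
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hold out 10% of the training data and stop once the validation
# score has not improved for 10 consecutive epochs
clf = MLPClassifier(hidden_layer_sizes=(64,),
                    early_stopping=True,
                    validation_fraction=0.1,
                    n_iter_no_change=10,
                    max_iter=500,
                    random_state=0)
clf.fit(X, y)
print(f"stopped after {clf.n_iter_} epochs")
```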
- Contains a curated list of the most frequently asked interview questions and answers in ML, DL, and Python.
- Covers:
  - Supervised vs Unsupervised Learning
  - Bias-Variance Tradeoff
  - Feature Selection Techniques
  - Regularization, Overfitting
  - Evaluation Metrics (Accuracy, Precision, Recall, F1), with a quick sketch after this list
  - Cost Function, Gradient Descent
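As a quick refresher on the evaluation metrics listed above, a minimal scikit-learn sketch with hypothetical labels and predictions:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical true labels and model predictions
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print("accuracy :", accuracy_score(y_true, y_pred))   # 0.625
print("precision:", precision_score(y_true, y_pred))  # 0.60
print("recall   :", recall_score(y_true, y_pred))     # 0.75
print("f1       :", f1_score(y_true, y_pred))         # ~0.67
```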
- Clone the repository:

```bash
git clone https://github.com/Priya-Rathor/Machine-learning.git
cd Machine-learning
```

- Create a virtual environment (optional but recommended):

```bash
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Launch Jupyter Notebook:

```bash
jupyter notebook
```

Or open the notebooks directly in Google Colab where available.
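The install step above assumes a requirements.txt at the repository root. If you need to recreate the environment by hand, a minimal file based on the Tech Stack listed below might look like this (contents and versions are an assumption, not the repo's actual file):

```text
numpy
pandas
matplotlib
seaborn
scikit-learn
jupyter
```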
- Languages: Python
- Libraries: NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn
- Notebook: Jupyter, Google Colab
- ML: Linear Models, Trees, Ensemble, Naive Bayes, KNN
- DL Concepts: Dropout, Early Stopping, Batch Training
- ✅ All models and projects are reproducible
- ✅ Clear file organization for learning and projects
- ✅ Preprocessing and math from scratch
- ✅ Visualizations and insights on real data
- ✅ Ideal for ML interview preparation
I’m Priya Rathor, passionate about building intelligent systems using AI/ML. This repository is my personal journey as I explored Python, ML algorithms, data preprocessing, and real-world use cases through consistent practice.
🔗 GitHub: Priya-Rathor
If this project helped you, consider:
- 🌟 Starring the repo
- 🍴 Forking to learn your way
- 🤝 Connecting on GitHub