This repository contains my data science and ML contest projects, along with my personal fun projects. I have dealt with a diverse set of problems, data, and metrics. Below is a summary of each project: the type of dataset, the type of problem, and my approach to handling it (all very brief). More details can be found in each subdirectory.
If you want to view the following summary in table format, click here
- DataSet:
- Image
- Objective:
- Bounding Box prediction
- My Approach:
- Designed a visual feature pipeline with attention on the object in image
- Data Augmentation Technique along with its bounding box
- Used `Single Stage Detector` approach
- Focal Loss with `YOLO` and `SSD`
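As a rough illustration of the focal loss mentioned above, here is a minimal sketch for a single binary prediction. The function name and the defaults `gamma=2.0` and `alpha=0.25` are assumptions for illustration (they are commonly used values), not necessarily what the project used.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one prediction.

    p: predicted probability of the positive class, y: true label (0 or 1).
    gamma and alpha are assumed defaults, not values taken from the project.
    """
    p = min(max(p, eps), 1.0 - eps)            # clip to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = binary_focal_loss(0.95, 1)  # confident and correct: small loss
hard = binary_focal_loss(0.10, 1)  # confident and wrong: large loss
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a quick way to sanity-check an implementation.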
- DataSet:
- Text
- Objective:
- Classification
- My Approach:
- Data cleaning/feature engineering
- Linear/Non-Linear Model
- Deep Learning Attention Model
- Pretrained BERT Model
- Ensemble
- DataSet:
- `2500` unknown predictors
- Objective:
- Classification
- My Approach:
- Feature understanding (`EDA`)
- Feature engineering
- Designed feature interaction tools
- Ensemble model using `xgboost`/`lightgbm`/`catboost` and simple `linear/non-linear` models
- Statistical model to understand feature importance using `p-values`
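The feature interaction tools are not described in detail here, so the following is only a hedged sketch of one common form: pairwise product features. The function name, the column names, and the `a_x_b` naming scheme are hypothetical.

```python
from itertools import combinations

def add_pairwise_interactions(rows, columns):
    """Return copies of rows with a product feature for every pair of columns.

    rows: list of dicts holding numeric features (hypothetical layout).
    """
    out = []
    for row in rows:
        new_row = dict(row)  # keep the original features
        for a, b in combinations(columns, 2):
            new_row[f"{a}_x_{b}"] = row[a] * row[b]
        out.append(new_row)
    return out

rows = [{"f1": 2.0, "f2": 3.0, "f3": 0.5}]
result = add_pairwise_interactions(rows, ["f1", "f2", "f3"])
```

With thousands of anonymous predictors the number of pairs grows quadratically, so in practice such a tool would likely be applied to a pre-selected subset of columns.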
- DataSet:
- Very big dataset (45M observations, graph edge representation)
- Relational Feature
- Category + Numerical
- Objective:
- Link Prediction
- My Approach:
- Graph-based features such as (`adamic-adar`, `common-resource-allocation`, ...)
- `SVD` feature for each user
- `Community-clustering`
- `Subsemble` (I did this after the competition was over, to understand more about sampling and model building)
- `neighbour-based` feature (removed high-cardinality features)
- Also tried a `Deep learning approach` (Graph Embedding), but couldn't handle it properly at that time
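To make the graph-based features concrete, here is a minimal sketch of the Adamic-Adar and resource-allocation scores for a candidate link. It assumes an undirected graph given as an edge list; the helper names are hypothetical, and at 45M edges a real pipeline would need something more memory-efficient.

```python
import math
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over the common neighbours of u and v."""
    score = 0.0
    for w in adj[u] & adj[v]:
        degree = len(adj[w])
        if degree > 1:  # guard against log(1) = 0
            score += 1.0 / math.log(degree)
    return score

def resource_allocation(adj, u, v):
    """Sum of 1 / degree over the common neighbours of u and v."""
    return sum(1.0 / len(adj[w]) for w in adj[u] & adj[v])

edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
adj = build_adjacency(edges)
aa_score = adamic_adar(adj, "a", "b")
ra_score = resource_allocation(adj, "a", "b")
```

Both scores give more weight to common neighbours with few connections, which is why they tend to work better than a raw common-neighbour count.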
- DataSet:
- Category + Numerical
- Relational Dataset
- Objective:
- Regression
- My Approach:
- Feature engineering
- `date-time` based features
- `Aggregation` based features
- `Relational` features
- Ensemble using different sets of `transformed` target spaces
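One way to read the ensemble over transformed target spaces: fit the same model on several transformations of the target, invert each prediction, and average. The sketch below assumes a non-negative target, a single feature, and a plain least-squares line as a stand-in model; the transforms chosen are illustrative, not the ones used in the project.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return lambda x: a + b * x

# (forward, inverse) pairs; an assumed, illustrative choice of target spaces
TRANSFORMS = [
    (lambda y: y, lambda z: z),
    (math.log1p, math.expm1),
    (math.sqrt, lambda z: z * z),
]

def fit_target_space_ensemble(xs, ys):
    """Fit one model per target transform and average the inverted predictions."""
    models = [(fit_line(xs, [fwd(y) for y in ys]), inv) for fwd, inv in TRANSFORMS]

    def predict(x):
        preds = [inv(model(x)) for model, inv in models]
        return sum(preds) / len(preds)

    return predict

predict = fit_target_space_ensemble([1, 2, 3, 4], [3.0, 5.0, 9.0, 17.0])
```

Each transform makes the model penalize errors differently (for example, `log1p` emphasizes relative error), so the averaged prediction blends those views.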
- DataSet:
- Image
- Objective:
- Comparison between ResNet and my modified feature pipeline
- Classification
- My Approach:
- Developed a `weighted feature pipeline` using global and local features. The global features put constraints on the local features so that they focus specifically on the features of the `object` in the image
- Better attention map around the object, which reflects its learned features.
- Improved score by `1.37%` over `ResNet`
- DataSet:
- Image
- Objective:
- Face Verification
- My Approach:
- `Matching Network` approach
- Built a `Student-Attendance hardware` using Arduino
- `Hard Mining` approach (generate all permutations between classes to handle the small dataset)
- `network-in-network` approach to handle overfitting, as I had a very small dataset.
- Achieved `93%` accuracy
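The pair-generation step described above can be sketched as follows. This covers only the enumeration of same-person and different-person pairs, not any selection of hard examples; the function name and the dict-of-lists input layout are assumptions.

```python
from itertools import combinations

def make_verification_pairs(images_by_person):
    """Return (image_a, image_b, label) triples; label 1 = same person, 0 = different.

    images_by_person: dict mapping a person id to a list of image ids
    (a hypothetical layout chosen for this sketch).
    """
    pairs = []
    # Positive pairs: every pair of images of the same person
    for images in images_by_person.values():
        pairs += [(a, b, 1) for a, b in combinations(images, 2)]
    # Negative pairs: every cross pair between two different people
    for p, q in combinations(sorted(images_by_person), 2):
        pairs += [(a, b, 0) for a in images_by_person[p] for b in images_by_person[q]]
    return pairs

data = {"alice": ["a1", "a2"], "bob": ["b1", "b2", "b3"]}
pairs = make_verification_pairs(data)
```

Enumerating pairs turns a handful of images per person into a much larger training set, which is the main appeal on a small dataset. Negatives usually far outnumber positives, so some rebalancing is typically needed.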
- DataSet:
- Image
- Objective:
- Classification (training on very small dataset)
- My Approach:
- `Prototype Algorithm` implementation
- There is more to this (will update in future)
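The exact prototype algorithm is not specified here. Assuming it works in the spirit of prototype-based few-shot classification, a minimal sketch is: average the embeddings of each class into a prototype, then assign a query to the nearest one. The embeddings are taken as given, and all names are hypothetical.

```python
import math

def class_prototypes(embeddings, labels):
    """Average the embedding vectors of each class into one prototype."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(prototypes, query):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], query))

support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["cat", "cat", "dog", "dog"]
prototypes = class_prototypes(support, labels)
label = predict(prototypes, [1.0, 1.0])
```

In a full system the embeddings would come from a trained network rather than raw vectors.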
- DataSet:
- Category + Numerical
- Objective:
- Regression
- My Approach:
- Date-based features and dummy features
- `Interaction based` features
- `Bayesian optimization`
- `Out-of-fold predictions` to generate `meta features` for the `ensemble`
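The out-of-fold idea can be sketched without any ML library: each sample's meta feature comes from a model that never saw that sample during training. The fold assignment (index modulo `n_folds`) and the mean-predicting stand-in model are assumptions made to keep the example small.

```python
def out_of_fold_predictions(xs, ys, fit, n_folds=3):
    """Return one prediction per sample, each made by a model trained on the other folds.

    fit(train_xs, train_ys) must return a callable model(x) -> prediction.
    """
    n = len(xs)
    oof = [None] * n
    for fold in range(n_folds):
        val_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        for i in val_idx:
            oof[i] = model(xs[i])  # prediction for a sample the model did not see
    return oof

def fit_mean(train_xs, train_ys):
    """Stand-in base model that always predicts the training mean."""
    mean = sum(train_ys) / len(train_ys)
    return lambda x: mean

meta_feature = out_of_fold_predictions(list(range(6)), [1, 2, 3, 4, 5, 6], fit_mean)
```

The resulting column can be fed to a second-level model as a meta feature without leaking the target, because no prediction was produced by a model trained on its own row.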
- DataSet:
- Text
- Objective:
- User-Problem Rating Prediction
- My Approach:
- My main concern was to handle the following questions carefully:
- What are the strongest and weakest areas of the user?
- What is the level of the problem?
- What problem has the user just solved?
- If the user gets stuck on the current problem, which problem would help them (to gain confidence and to improve skill in that area)?
- Exploration and exploitation strategy in recommending problems
- And many more
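The exploration/exploitation trade-off mentioned above is often illustrated with an epsilon-greedy rule. The sketch below is one such rule, not necessarily the strategy used in this project; `avg_reward` (an estimated benefit per problem for a given user) and the `epsilon` default are hypothetical.

```python
import random

def recommend_problem(avg_reward, epsilon=0.1, rng=random):
    """Pick a problem id with an epsilon-greedy rule.

    avg_reward: dict mapping problem id -> estimated benefit for this user
    (a hypothetical signal, e.g. past success on similar problems).
    """
    if rng.random() < epsilon:
        # Explore: try a random problem to learn more about the user
        return rng.choice(sorted(avg_reward))
    # Exploit: recommend the problem currently believed to help most
    return max(avg_reward, key=avg_reward.get)

scores = {"arrays-101": 0.4, "graphs-201": 0.7, "dp-301": 0.2}
choice = recommend_problem(scores, epsilon=0.1)
```

A small `epsilon` keeps most recommendations on problems that are expected to help, while still occasionally probing areas where the user's skill is uncertain.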
- DataSet:
- Category + Numerical
- Objective:
- Classification
- My Approach:
- DataSet:
- Image
- Objective:
- Segmentation
- My Approach:
- Implemented a U-Net architecture on a blood cell dataset.
- Implemented a fully convolutional network on a traffic-street dataset.
- Finally, experimented with a generative adversarial network for better generalization given the limited dataset.
- DataSet:
- Relational feature
- Time-Series Feature
- Categorical + Numerical
- Objective:
- Future sales prediction for different stores in different cities
- My Approach:
- DataSet:
- Image
- Objective:
- Classification
- My Approach:
- EDA
- Feature Engineering
- DataSet:
- Time-Series stock prices
- Objective:
- Future price prediction
- Regression
- My Approach:
- Deep learning approach using RNN and LSTM
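Before an RNN or LSTM can be trained on a price series, the series is usually cut into fixed-length input windows, each paired with the next value as the regression target. The sketch below shows that preprocessing step only; the window length and function name are assumptions.

```python
def make_windows(prices, window=3):
    """Split a price series into (past_window, next_price) training samples."""
    samples = []
    for i in range(len(prices) - window):
        samples.append((prices[i:i + window], prices[i + window]))
    return samples

prices = [10, 11, 12, 13, 14]
samples = make_windows(prices, window=3)
# samples == [([10, 11, 12], 13), ([11, 12, 13], 14)]
```

Each window becomes one input sequence for the recurrent model. To avoid look-ahead leakage, the train/validation split should be made by time rather than by random shuffling.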