Building innovative business solutions

I am a creative problem solver, leveraging the power of AI and coding.

Arunabho Kanti Som

"You can’t connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future."

Steve Jobs

It can be hard to trust in the process when you can’t see the bigger picture. But you never know what might be around the corner, so you have to keep moving forward. And one day, you may recognize that some of the hardest things you had to go through were also the best things that ever happened to you.


My Experience

5 years of active work

2019 - 2024
Data Science Consultant, Beam Data
July 2023 - Present
Data Science · Machine Learning
  • Implemented machine learning algorithms using Python, scikit-learn, and TensorFlow to create a Vehicle Valuation Model.
  • Conducted extensive data scraping for multiple 50-year historical datasets and managed the data in PostgreSQL.
  • Developed a predictive model for luxury watch price fluctuations, improving client investment decisions, and created Tableau dashboards to visualize trends.
  • Built a machine learning-driven dashboard in PowerBI to analyze Amazon marketplace trends, identifying key customer preferences to optimize inventory.
  • Managed complex data manipulation tasks in PostgreSQL and big data processing using PySpark, leveraging Azure Cloud Services and AWS for scalable data solutions.
  • Focused on data-driven insights and decision-making by integrating cloud services for efficient data management and analysis.
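To give a flavour of the Vehicle Valuation Model mentioned in the first bullet above, here is a minimal scikit-learn sketch of that kind of regression pipeline. The file name, column names, and choice of regressor are illustrative assumptions, not the actual client data or model.

```python
# Minimal sketch of a vehicle valuation regression pipeline.
# All file and column names below are illustrative placeholders.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical dataset: one row per vehicle listing.
df = pd.read_csv("vehicle_listings.csv")
X = df[["make", "model", "year", "mileage_km", "condition"]]
y = df["sale_price"]

# Encode categorical attributes and scale numeric ones before fitting the regressor.
preprocess = ColumnTransformer([
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["make", "model", "condition"]),
    ("numeric", StandardScaler(), ["year", "mileage_km"]),
])
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("regressor", GradientBoostingRegressor(random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
pipeline.fit(X_train, y_train)
print(f"R^2 on held-out listings: {pipeline.score(X_test, y_test):.3f}")
```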
Data Analyst, Epiq Global
2020 - 2023 | Toronto, Canada
Data Analysis · Data Visualization · Machine Learning
  • Utilized Python and SQL to conduct detailed financial and non-financial data analysis, uncovering key patterns in claims filings, transaction completions, and seasonal trends, leading to enhanced understanding of operational and fraud risks.
  • Developed customer segmentation analysis to identify critical customer categories driving successful claims processing, employing statistical techniques and pattern recognition to optimize customer engagement and retention strategies.
  • Created multiple dynamic interactive dashboards using Tableau, visualizing crucial financial indicators and customer behavior metrics, providing real-time insights to internal stakeholders and enhancing strategic decision-making.
  • Spearheaded the analysis of customer churn, claims completion rates, and sentiment, using data analytics to evaluate platform engagement and satisfaction and contributing to improvements in customer service and operational efficiency.
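As an illustration of the segmentation work described above, here is a minimal K-means sketch. The clustering method, features, and file name are assumptions made for illustration only; the actual analysis may have used different techniques.

```python
# Minimal customer segmentation sketch using K-means.
# The clustering method, features, and file name are illustrative assumptions.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-customer summary of claims activity.
customers = pd.read_csv("customer_claims_summary.csv")
features = ["claims_filed", "claims_completed", "avg_claim_value", "days_to_complete"]

# Standardise the features so no single metric dominates the distance calculation.
scaled = StandardScaler().fit_transform(customers[features])
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(scaled)

# Profile each segment by its average behaviour.
customers["segment"] = kmeans.labels_
print(customers.groupby("segment")[features].mean())
```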
Health And Dental Insurance Analyst, GreenShield
2019 - 2020 | St. John's, Canada
Data Analysis · Machine Learning
  • Managed a comprehensive database of over 18,000 profiles using SQL and Excel, focusing on financial trend analysis and market insights in the insurance sector.
  • Applied financial modeling techniques in Excel, analyzing claim premium structures, claim patterns, and risk profiles, enhancing underwriting decisions and policy optimization.
  • Conducted profitability analysis and reserve estimations, evaluating claim frequencies and severities for strategic financial planning and risk assessment.
  • Leveraged financial analytics to assess customer lifetime value and used data tools to report key insurance metrics, including expense analysis, supporting informed decision-making and retention strategies.

My Work

Welcome to my portfolio website! As a versatile data scientist and data specialist, I bring a blend of expertise in using data to drive business decisions and in building innovative technology solutions.

GitHub Repo

PROJECT OVERVIEW

Leveraged Twitter APIs to capture tweets during Black Friday 2022 and processed the data using PySpark on Databricks.

KEY FEATURES

  • NLP and Model Training: Implemented NLP and trained models with Logistic Regression and Random Forest for sentiment classification (86% accuracy).
  • Data Storage and Querying: Stored predictions in AWS S3 and utilized Athena for sophisticated SQL queries.
  • Real-time Sentiment Insights: Enhanced an Amazon QuickSight dashboard with real-time sentiment insights derived from the analyzed tweets.

TECHNICAL IMPLEMENTATION

Tools & Technologies

Twitter APIs · PySpark · Databricks · Logistic Regression · Random Forest · AWS S3 · Athena · QuickSight

Backend

Django · Django REST Framework · LangChain
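For a sense of how the sentiment-classification step of this project fits together, below is a minimal PySpark sketch. The S3 paths, column names, and labelled-data source are hypothetical, and the production pipeline on Databricks also evaluated a Random Forest model alongside Logistic Regression.

```python
# Minimal PySpark sketch of the sentiment-classification step.
# S3 paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.feature import HashingTF, IDF, Tokenizer

spark = SparkSession.builder.appName("black-friday-sentiment").getOrCreate()

# Hypothetical labelled tweets: columns `text` and `label` (0 = negative, 1 = positive).
tweets = spark.read.parquet("s3://example-bucket/black_friday_tweets_labelled/")

# Tokenize the tweet text, build TF-IDF features, and fit a Logistic Regression classifier.
pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="tokens"),
    HashingTF(inputCol="tokens", outputCol="tf", numFeatures=1 << 16),
    IDF(inputCol="tf", outputCol="features"),
    LogisticRegression(featuresCol="features", labelCol="label"),
])

train, test = tweets.randomSplit([0.8, 0.2], seed=42)
model = pipeline.fit(train)
predictions = model.transform(test)

accuracy = MulticlassClassificationEvaluator(
    labelCol="label", predictionCol="prediction", metricName="accuracy"
).evaluate(predictions)
print(f"Held-out accuracy: {accuracy:.2%}")

# Predictions can then be written back to S3 for querying with Athena.
predictions.select("text", "prediction").write.mode("overwrite").parquet(
    "s3://example-bucket/black_friday_sentiment_predictions/"
)
```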
GitHub Repo

PROJECT OVERVIEW

Led a capstone initiative centered on predicting employee turnover using HR datasets, with a focus on indicators like salary and satisfaction metrics.

KEY FEATURES

  • In-depth Analytical Reviews: Identified potential correlations between employee churn and factors such as the number of projects undertaken and average monthly work hours.
  • Classical and Advanced Techniques: Utilized an array of classical machine learning methods, including Logistic Regression, Random Forest, and a Gradient Boosting Classifier.
  • Deep Learning Integration: Enhanced prediction accuracy by integrating Deep Learning techniques and Neural Network models.

TECHNICAL IMPLEMENTATION

Technologies

Python · Logistic Regression · Random Forest · Gradient Boosting Classifier · Deep Learning · Neural Networks
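Below is a minimal sketch of how the classical models named above can be compared on an HR table with cross-validation. The file and column names are illustrative assumptions, not the actual capstone dataset.

```python
# Minimal sketch comparing classical turnover-prediction models.
# The file and column names are illustrative placeholders.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

hr = pd.read_csv("hr_employee_data.csv")
features = ["satisfaction_level", "monthly_salary", "number_project", "average_monthly_hours"]
X, y = hr[features], hr["left_company"]  # target: 1 if the employee left, else 0

models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# 5-fold cross-validation gives a like-for-like accuracy comparison across models.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```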
GitHub Repo

PROJECT OVERVIEW

Analyzed 300,000 records using SQL to determine metrics such as the credit-to-income ratio, average earnings, and the share of applicants with a poor credit history.

KEY FEATURES

  • Data Processing & Storage: Utilized Microsoft Azure for data processing and storage, ensuring data integrity for attributes closely linked with default accounts.
  • Algorithm Evaluation: Evaluated and contrasted the effectiveness of Logistic Regression and Random Forest algorithms, with Random Forest showing a 17% enhancement in performance.

TECHNICAL IMPLEMENTATION

Tools & Technologies

SQL · Microsoft Azure · Logistic Regression · Random Forest
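The metrics above were computed in SQL; as an illustration, the pandas sketch below shows the same style of aggregation on a hypothetical applicants table. All file and column names are assumptions made for the example.

```python
# Pandas equivalent of the SQL-style credit metrics described above.
# File and column names are illustrative placeholders.
import pandas as pd

applicants = pd.read_csv("credit_applicants.csv")  # ~300,000 records in the original analysis

metrics = pd.DataFrame({
    "avg_income": [applicants["annual_income"].mean()],
    "avg_credit_to_income": [(applicants["credit_amount"] / applicants["annual_income"]).mean()],
    "poor_credit_share": [(applicants["credit_history"] == "poor").mean()],
    "default_rate": [applicants["defaulted"].mean()],
})
print(metrics.round(3))
```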
GitHub Repo

PROJECT OVERVIEW

Used DataGrip to build the databases, importing advertisement and user data from CSV files for processing.

KEY FEATURES

  • Data Cleaning & EDA: Conducted in-depth Data Cleaning and EDA using Python, SQLAlchemy, and Pandas.
  • Model Development: Developed machine learning models, employing GridSearchCV and K-fold validation, achieving a prediction accuracy of 83% on user click behaviors post-transaction.
  • Hyperparameter Tuning: Executed Hyperparameter Tuning and evaluated model performance, offering key insights into ad click-through behaviors.

TECHNICAL IMPLEMENTATION

Tools & Technologies

DataGrip · Python · SQLAlchemy · Pandas · GridSearchCV · K-fold validation
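Below is a minimal sketch of the GridSearchCV and K-fold tuning step described in this project. The features, estimator, and parameter grid are illustrative assumptions rather than the project's actual configuration.

```python
# Minimal sketch of hyperparameter tuning with GridSearchCV and K-fold validation.
# Features, estimator, and grid values are illustrative placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

ads = pd.read_csv("ad_click_data.csv")
X = ads[["daily_time_on_site", "age", "area_income", "daily_internet_usage"]]
y = ads["clicked_on_ad"]  # target: 1 if the user clicked the ad, else 0

param_grid = {"n_estimators": [100, 300], "max_depth": [None, 10, 20]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, f"best CV accuracy {search.best_score_:.3f}")
```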

Contact me