Available for new projects
Arunabho Kanti Som

I help businesses turn data into clear decisions.

I'm Arunabho Kanti Som, a data and AI professional based in Toronto with 5+ years across analytics, machine learning, and AI engineering. I build predictive models, dashboards, AI agents, and GenAI/RAG applications using Python, SQL, LLMs, LangChain, and cloud‑native stacks on AWS and Azure.

Featured Projects

Explore a curated selection of my recent data science projects

Twitter Sentiment Analysis (2022)

NLP, PySpark, AWS

Captured Black Friday 2022 tweets via the Twitter API, trained Logistic Regression and Random Forest classifiers (86% accuracy), and surfaced real‑time insights through a QuickSight dashboard.

View on GitHub

Predictive Employee Churn (2023)

Machine Learning, Python, Deep Learning

Capstone project predicting employee turnover from HR datasets. Combined classical models (Logistic Regression, Random Forest, Gradient Boosting) with neural networks to improve prediction accuracy.

View on GitHub

Agoda Booking Analysis (2023)

EDA, Pandas, Seaborn

Analyzed booking patterns and price trends to optimize urgency messaging, examining average daily rate (ADR) variations, customer booking behaviors, and the best timing for conversion‑driving prompts.

View on GitHub

Click‑Through Rate Prediction (2023)

ML, SQL, SQLAlchemy

Built a CTR model on SQL data pipelines (DataGrip, SQLAlchemy), tuned with GridSearchCV and validated with K‑fold cross‑validation. Reached 83% prediction accuracy on post‑transaction click behavior.

View on GitHub
Benefit

Work with me, and you'll see these benefits

Data‑Driven Insights

I turn raw data into clear, actionable decisions — using statistical analysis, modeling, and visualization tailored to your specific goals.

Collaborative Approach

I partner closely with stakeholders at every stage — from scoping the problem to delivering production‑ready models that fit your workflow.

Proven Results

Measurable outcomes: 86% sentiment classification accuracy, 83% CTR prediction accuracy, and analytics work covering 18,000+ member profiles.

Service

What I deliver

Machine Learning Models

Production‑ready models in scikit‑learn, TensorFlow, and PyTorch — from feature engineering to deployment.

Data Analysis

Exploratory analysis, customer segmentation, and statistical insights using Python, SQL, and Pandas.

Dashboards

Interactive Tableau, Power BI, and QuickSight dashboards turning metrics into stakeholder‑ready stories.

Predictive Modeling

Forecasting and classification models — vehicle valuation, churn, CTR, and pricing — built for business outcomes.

NLP & Sentiment Analysis

Text classification, topic modeling, and sentiment systems — from data collection through LLM integrations with LangChain.

Big Data Pipelines

PySpark and Databricks pipelines on AWS and Azure, built to process millions of rows in production workloads.

How It Works

A process built for measurable results

01

Discovery

Understanding your goals, data landscape, and what success looks like for the business.

02

Analysis

Exploratory data analysis, cleaning, and statistical review to surface the signal in your data.

03

Modeling

Building, tuning, and validating predictive models — classical ML through deep learning where it pays off.

04

Delivery

Shipping dashboards, models, and documentation your team can use, extend, and trust over time.

Experience

My journey in the world of data science.

2025–Present (Toronto, Canada)

Data Science Consultant — Beam Data

  • Built an automated competitor‑monitoring pipeline using n8n workflows, Python scraping tools (Playwright, Selenium, Apify, BrightData), and PostgreSQL. Scraped content is embedded into a RAG architecture (Chroma, FAISS) with scheduled LLM summarization to surface pricing changes, product launches, and shifts in competitive positioning.
  • Delivered Power BI dashboards for Awegoo, an Amazon aggregator covering 200+ brands — tracking inventory health, listing performance, MAP compliance, sales velocity, and SKU‑level profitability.
  • Engineered a Vehicle Valuation Model in Python, scikit‑learn, and TensorFlow for a vintage car auction house, trained on 50 years of scraped auction data to drive inventory pricing and sale forecasting.
  • Built predictive price models and Tableau dashboards for a luxury watch broker, scraping reference, marketplace, and auction data and processing it on Azure to guide retail‑portfolio acquisition.
  • Developed AI agents using OpenAI/Anthropic APIs and LangChain to automate research, drafting, scheduling, and engagement workflows for social media operations.

2022–2025 (Toronto, Canada)

Data Analyst, Taxpayer Services — Canada Revenue Agency

  • Delivered SQL‑based reports, dashboards, KPI trackers, and data‑validation workflows for a federal contact‑centre network handling 25–32M calls and 16–17M unique callers annually, supporting a program administering $379B in tax revenue and $46B in benefits.
  • Applied EDA, trend analysis, predictive analytics, and pattern recognition to taxpayer, benefits, and service‑delivery data — informing workload planning, coaching priorities, and continuous improvement of program‑level accuracy (92%) and professionalism (96%) scores.
  • Handled sensitive taxpayer data under Income Tax Act and Excise Tax Act confidentiality provisions, applying strict need‑to‑know controls and audit traceability across every reporting output.

2020–2022 (Toronto, Canada)

Data Analyst — Epiq Global

  • Conducted financial and operational analysis in Python and SQL on claims, transactions, and seasonal trends; built customer segmentation models using clustering to drive engagement and retention strategy.
  • Built interactive Tableau dashboards for financial KPIs, customer behaviour, churn, and sentiment, delivering real‑time insights that improved service quality and decision cycles.

2019–2020 (St. John's, Canada)

Data Analyst, Insurance — GreenShield

  • Managed 18,000+ member profiles in SQL and Excel; applied financial modelling to claim premium structures, claim patterns, and risk profiles to support underwriting and policy decisions.
  • Conducted profitability analysis and reserve estimation in SQL, evaluating claim frequency and severity for strategic financial planning and risk assessment.

Inspiration

You can't connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future.

— Steve Jobs

It can be hard to trust in the process when you can't see the bigger picture. But you never know what might be around the corner, so you have to keep moving forward. And one day, you may recognize that some of the hardest things you had to go through were also the best things that ever happened to you.

Have a data challenge? Let's solve it together.

Send email