Hi, I am Ashlesh

I completed my MS in Data Science at the Illinois Institute of Technology. Previously, I gained valuable experience in Cloud Data Engineering & Analytics at LTI (now LTIMindtree) while working on the Premium Data Pipeline Project. I have also served as a Data Engineer at Navy Federal Credit Union, a Data Scientist at Calamos Investments, and a Data Analyst Intern with the Illinois Institute of Technology & GlobalShala initiative.

As a data enthusiast, I thrive on embracing new challenges and utilizing my passion to design innovative technical solutions. I constantly strive to improve my competencies and skills, focusing strategically on contributing to the growth and expansion of companies. With a strong emphasis on leveraging data, I aim to make a meaningful impact and contribute to the advancement of data-driven initiatives.

I’m best reached via email. I’m always open to interesting and engaging conversations and I appreciate feedback from others.

My personal website: https://ashleshk.github.io

Data Engineering
Data Analyst
Data Science

Experiences

Data Engineer Intern
Navy Federal Credit Union

Aug 2023 - May 2024, Vienna, VA

Responsibilities:
  • Tech Stack: Azure, Databricks, ADF, AAD, Azure Key Vault, ADLS, Informatica, Scala, SparkSQL, Python, Parquet, SSIS, Teradata, Azure DevOps, Power BI
  • Engineered, deployed, and orchestrated data pipelines in Databricks for extracting, transforming & loading data from various structured and unstructured sources into delta lakes for processing and analysis.
  • Automated workflows to clean, transform, and prepare data for analytics, resulting in a 25% decrease in data preparation time.
  • Conducted advanced statistical analysis to enhance the accuracy and consistency of large-volume data with the help of Kafka.
  • Developed and released multiple feature updates in Azure DevOps, collaborating with teams to ensure smooth data integration.
  • Managed documentation for data models, processes, and data flows to ensure compliance with organizational standards.
  • Collaborated with data stewards and developers to optimize performance of match/merge rule transformation in Azure Data Factory.

Practicum - Data Scientist
Calamos Investments

May 2023 - Aug 2023, Naperville, IL

Responsibilities:
  • Tech Stack: Machine Learning, Data Modelling, Python, Parquet, GitHub, Power BI
  • Calamos Investments is a diversified global investment firm offering innovative investment strategies including alternatives, multi-asset, convertible, fixed income, equity, and sustainable equity.
  • Applied feature selection techniques to a dataset comprising 150 variables to develop predictive models, including XGBoost and Random Forest, for forecasting investor firm categories.
  • Collected & integrated financial data coming from multiple sources using SQL to translate it into interactive insights to support decision-making.
  • Employed statistical methods and visualization tools to identify potential business growth opportunities and gauge market sentiment.
  • Conducted EDA across various asset classes and implemented a clustering model (K-Means) to identify distinct customer segments.

Data Analyst Intern
GlobalShala

September 2022 - November 2022, Chicago, IL

Data Analyst Internship Program

Responsibilities:
  • Tech Stack: JupyterLab, Python, Microsoft Excel, Tableau
  • Orchestrated end-to-end data analytics projects, from collecting and integrating to analyzing, visualizing, and reporting on financial data.
  • Collaborated with 8 data analysts across countries to boost social media engagement by 60%, & reduced advertising costs by 30% by developing a Database System in MS SQL Server.
  • Fostered collaboration with stakeholders across divisions, ensuring effective communication.
  • Designed a social media dashboard using Tableau to track key performance indicators and optimize content/ads.

Cloud Data Engineering & Analytics
LTIMindtree

Oct 2020 - July 2022, Pune, Maharashtra, India 411005

Responsibilities:
  • Tech Stack: Azure, Apache Spark, Hadoop, Java, Oracle SQL, ETL, Kubernetes, REST APIs, Dynatrace, Angular 8, Power BI
  • Platform & Framework: Spring Boot, AutoSys, Remedy, Google Cloud Platform (GCP), Teams, Git, Rally
  • Automated 150+ Batch Scripts for 20+ policy sources on Autosys, reducing manual effort and improving data processing efficiency.
  • Spearheaded end-to-end ETL data pipeline development with Airflow, Apache Spark, and Hadoop, achieving a 50% increase in processing speed and enhancing insights generation capabilities.
  • Integrated Azure Cosmos DB for NoSQL data modelling, harnessing models like document, key-value, and graph for flexible, high-performance data storage.
  • Collaborated with stakeholders to refine master data quality standards, ensuring data integrity and reliability in Azure DQS.
  • Managed Azure Data Factory and SQL Database for scalable, reliable, and cost-effective data storage and integration.
  • Cataloged daily ad-hoc data analysis using Python while maintaining documentation for data processes, flows, and system configurations to support cross-functional teams.
  • Crafted Power BI dashboards to deliver data-driven insights to clients, contributing to a 30% increase in profitability in May 2022.
  • Built a complete prototype of a net-banking app for a bank using Angular 8, Spring Boot, and Oracle 11g as the primary database, with a team of 4 juniors.

Mentor, DeepLearning.AI
Coursera

January 2023 - Present, Chicago, IL 60616 (Remote)

Responsibilities:
  • Tech Stack: Python, Machine Learning, Deep Learning
  • Mentored & guided learners from all over the world with programming methods, problem-solving, pair programming, code review, and debugging.
  • Checked requirement satisfaction and helped Engineers with solutions and new ideas by actively participating in Discussion Forums.

Research Assistant
PICT (Pune Institute of Computer Technology), Pune

January 2017 - June 2018, Pune, Maharashtra 411046

Responsibilities:
  • Tech Stack: Arduino, Eagle, Proteus-8, SQL
  • Project: Building an Easy Study Learning Module for Blind Students Using Arduino & Pneumatic Sensors
  • This project presents a new method of using pneumatic sensors and Arduino together to build a system that displays Braille characters so that blind students can read efficiently, replacing traditional on-paper text printed by needle.
  • The device performed very well for students of different age categories, helping them learn language, maths, and science better.

Projects

End-to-End Machine Learning Project with MLflow
Data Scientist June 2024

  • Spearheaded end-to-end ML project integration with MLflow and AWS GitHub Actions for CI/CD.
  • Tailored user interactive app deployment on AWS ECR, employing Docker for seamless integration.
  • Amplified project management efficiency through MLflow's robust experiment tracking and model logging capabilities.
  • Revolutionized deployment process, ensuring production-grade implementation and comprehensive documentation for replicability.

Data Symphony: Orchestrating Real-time Insights
Data Engineer May 2024

  • Task 1: Design and implement a data pipeline using Apache Airflow to orchestrate ETL workflows, ensuring seamless integration and automated data processing across various data sources.
  • Task 2: Set up real-time data streaming with Apache Kafka and manage distributed synchronization using Apache Zookeeper, facilitating robust, low-latency data ingestion and processing.
  • Task 3: Configure and deploy a containerized data engineering environment with Docker, integrating Apache Spark for data processing and utilizing Cassandra and PostgreSQL for scalable and efficient data storage solutions.

Stride Logistics Analytics Application
Data Analyst May 2024

  • Exploratory Data Analysis: Understand the dataset structure, check for missing values, perform univariate and bivariate analysis, and summarize findings with visualizations.
  • Feature Engineering & Machine Learning: Create new features, evaluate correlations, and select the best machine learning model based on evaluation metrics, providing insights into factors affecting timely deliveries.
  • Application Development: Develop a Python Flask web application for users to upload data, perform analysis, and deliver insights, model results, and business recommendations.

Tokyo Olympics: Azure Data Engineering Project
Data Engineer May 2024

  • Analyzed athlete participation and medal counts across Olympic editions, identifying trends in sports and disciplines and highlighting athlete dominance in specific events.
  • Provided a comprehensive overview of Olympic history and evolution, offering insights into athlete performance and sport-level trends.
  • Utilized Azure Storage Containers, Databricks, Synapse Analytics, and Power BI for efficient data transformation and visualization, with findings aimed at informing future Olympic planning and inspiring athletes and fans.

YouTube Data Alchemist
Data Engineer May 2023 - June 2023

  • Task 1: Data Ingestion & Storage: Utilize AWS Glue to create a mechanism for ingesting data from multiple sources into Amazon S3, ensuring efficient and secure data transfer. Store the ingested data in a centralized data lake within S3 for unified storage and easy access.
  • Task 2: ETL & Scalability: Use AWS Glue to transform raw data into the required format, and ensure scalability by leveraging AWS Lambda for scalable compute resources. As data volume grows, AWS Glue and Lambda will seamlessly handle the increased load without manual intervention.
  • Task 3: Cloud Integration & Reporting: Manage access and security using AWS IAM. Use Amazon Athena to perform interactive queries on data stored in S3. Build a reporting dashboard with Amazon QuickSight to visualize data insights and answer key business questions, leveraging the cloud's scalability and performance.

Cryptocurrency Market Evaluation
Data Analyst January 2023 - April 2023

  • Prepared and cleaned data on top cryptocurrencies (e.g. Bitcoin, Ethereum, Binance Coin, Cardano) using R language.
  • Analyzed cryptocurrencies for price trends, market share, and volatility using technical analysis (e.g. trend analysis and charting) and sentiment analysis (news articles).
  • Conducted fundamental analysis to identify key factors (e.g. adoption rates, regulations, market sentiment) affecting cryptocurrency prices and used analytical techniques to develop a portfolio and propose risk management strategies for investing in volatile markets.

Emotion Classification of Product Reviews on Twitter
Data Scientist, Data Engineering August 2022 - December 2022

  • Implemented two pipelines in AWS to categorize articles/conversations into 6 classes (happy, angry, surprise, sad, disgust, not-relevant).
  • Performed a comparative analysis of deep learning models including BERT, RoBERTa, and RNN to classify Twitter data into 6 emotions.
  • Achieved 97.83% accuracy and a 0.83 F1-score with the RNN model and deployed the final model on Amazon SageMaker and Spark MLlib.

Zillow Real Estate Market Analysis
Data Scientist May 2023

  • Performed Data Wrangling, Data preprocessing, Visualization and Analysis to predict price using Regression pipeline.

Spotify Music Data Analysis
Data Scientist May 2023 - June 2023

  • Conducted data mining on 200,000 tracks extracted via the Spotify API to analyze trends in the music industry's development and produce a predictive model for track popularity.

Quantium Data Analytics
Data Scientist, Data Analyst May 2023 - June 2023

  • Task 1: Data preparation and Customer Analytics: Collect, clean, and analyze customer data to gain insights about behavior and preferences.
  • Task 2: Experimentation and Uplift Testing: Conducted controlled experiments and A/B testing to measure the impact of strategies and optimize business outcomes.
  • Task 3: Analytics and Commercial Application: Applied data-driven insights to make informed decisions, formulate strategies, and drive commercial success.

Predicting Customer buying behaviour for British Airways
Data Scientist May 2023

  • A data science project to analyze customer feedback on service quality and address the shortcomings it reveals.
  • Utilize web scraping techniques to gain valuable insights about companies and predict customer buying behavior.

Tableau PowerBI Visualisation Projects
Data Visualization October 2022 - Present

  • Uncover actionable insights and visually communicate complex data patterns through interactive dashboards, empowering businesses to make data-driven decisions with confidence.

Uber Analytics Using Google Cloud Platform (GCP)
Data Engineering April 2023

  • Performed data analytics on Uber data using various tools and technologies, including GCP Storage, Python, Compute Instance, Mage Data Pipeline Tool, BigQuery, and Looker Studio.

Classification Model for American Sign Language(ASL)
Data Scientist Jan 2023 - April 2023

  • To design and develop a highly accurate real-time classification model capable of recognizing and interpreting American Sign Language (ASL) gestures from video data.

Covid-19 Recognition in CT Scans using Artificial Intelligence (AI) guided tools
Machine Learning Engineer Aug 2022 - Oct 2022

  • Utilization of AI-guided algorithms for Covid-19 analysis of CT scans.
  • Collection of 1,810 CT scan datasets, including 1,267 Covid-19 patients' and 543 healthy patients' scans.
  • Training and evaluation of pre-trained models (InceptionNet V3 and U-net) using k-fold cross-validation to identify the most effective model for detecting, localizing, and segmenting Covid-19 cases in CT scan images.

Vehicle Detection in Aerial Imagery Using Deep Learning Algorithms
Machine Learning Engineer September 2019 - May 2020

  • In this project, we collected aerial imagery datasets from OVERHEAD and MUNICH and analyzed them using Python to solve vehicle detection or sector-related problems.
  • Images were segmented using a region-based method; HOG and SURF feature extraction algorithms were compared, followed by SVM and logistic regression.
  • Statistical results for the combinations of methods were reported. Later, we implemented deep learning models (ACP-Fast CNN & Faster R-CNN) and computed ground truth, confusion matrix, recall, precision, and F1-score.

Neural Network
Machine Learning Engineer July 2020 - September 2020

  • Trained Neural Network with linear, sigmoid, and tanh activation functions.
  • Utilized k-fold cross-validation (k=5) to evaluate performance.
  • Computed confusion matrix, precision, recall, and F1-score to compare activation functions and determine the best-performing one.

Wireless Scrolling Display Unit
Electronic Circuit Designing October 2018 - March 2019

This project was built for the academic evaluation of term work in the third-year mini project course of E&TC. We used 144 LEDs, six 4017 decade counters, three shift registers, resistors, and one long PCB to build the system. Custom text was sent from an Android app over a Bluetooth module and displayed on the board.

Easy-Study for Blind System
Programmer November 2017 - December 2019

In this project, I developed an embedded system using miniature piston solenoids to help blind students read and write. To read, a student attaches a USB drive to the system and reads the content directly; to write, the student presses the solenoid pistons in Braille patterns, and the system stores the text in a file on the USB drive. The system was built with Arduino, solenoid pistons (5 mm), and MySQL, and provides the flexibility to add, modify, or recreate text material for students.

Bottle Filling Plant System
Developer August 2018 - December 2018

Built an automated bottle filling plant system completely from scratch using Arduino, IR sensors, a dual-directional pump, a Bluetooth module, an LCD display, a transformer, and a potentiometer.

Home Automation using Raspberry-Pi
Electronic System Design, IoT November 2019 - December 2019

Designed a home automation tool that can control many smart devices such as lights, locks, and even media devices. It is built on a Raspberry Pi and controlled via an Android mobile app that displays temperature and humidity. Temperature and humidity were also displayed graphically using ThingSpeak.

Library Management System
Software Developer December 2017 - February 2018

This is a Tkinter-based application in which a list of books is saved in a database. Users can log in to read books, and can also hire or purchase them. The application is built with Python & MySQL.

Skills

Education

Aug 2022 - May 2024
Master of Applied Science, Data Science
Courses Taken
  • Data Preparation and Analysis
  • Machine Learning
  • Applied Statistics
  • Statistical Learning
  • Big Data Technologies
  • Data Science Practicum
  • Public Engagement for Scientists
  • Probability & Statistics
  • Natural Language Processing
  • Machine Learning in Finance
August 2014 - June 2018
Bachelor in Engineering, Electronics and Telecommunication Engineering
Courses Taken
  • Data Structures & Algorithms
  • Object Oriented Programming
  • Machine Learning
  • Wireless Sensor Networks
  • Mobile Communication
  • Internet of Things
  • Computer Networks & Security
  • VLSI Design & Technology
  • Broadband Communication Systems
  • Radiation & Microwave Theory
  • Electronic Product Design
  • Project Stage I
  • Project Stage II
  • Information Theory Coding & Communication Networks
  • Advanced Processors
  • Microcontrollers
  • Power Electronics
  • Electronic System Design
  • Business Management
  • Employability Skills & Mini Project
  • Digital Electronics
  • Integrated Circuits
  • Digital Signal Processing
  • Signal & System
  • Electronic Devices & Circuits
  • Electric Circuits & Machine
  • Analog Communication
  • Digital Communication
  • Electromagnetics
  • Electronic Measurement Instruments & Tools
  • Control Systems
  • Mechatronics
  • Employability Skill Development
  • Engineering Mathematics III
  • Engineering Mathematics II
  • Engineering Mathematics I
  • Fundamental of Programming Language I
  • Fundamental of Programming Language II

Certifications

Data Scientist in Python
Datacamp November 2022 - February 2023

Starting with the Python essentials for data science, you’ll work through interactive exercises that test your abilities. You’ll get hands-on with some of the most popular Python libraries for data science, including pandas, NumPy, Seaborn, Matplotlib, and many more. Courses:

  1. Introduction to Python - Certification
  2. Intermediate Python - Certification
  3. Data Manipulation with Pandas - Certification
  4. Joining Data with pandas - Certification
  5. Introduction to Statistics in Python - Certification
  6. Introduction to Data Visualization with Matplotlib - Certification
  7. Introduction to Data Visualization with Seaborn - Certification
  8. Introduction to NumPy - Certification
  9. Python Data Science Toolbox 1 & 2 - Certification
  10. Intermediate Data Visualization with Seaborn - Certification
  11. Intermediate Importing Data in Python - Certification
  12. Working with Dates and Times in Python - Certification
  13. Exploratory Data Analysis in Python - Certification
  14. Sampling & Hypothesis Testing in Python - Certification
  15. Supervised Learning, Unsupervised Learning and ML with Tree Based Models in Python - Certification

Data Analyst in SQL
Datacamp November 2022 - December 2022

Database design is critical for a high-performance application. Just like you wouldn't build a house without a blueprint, you need to plan how you’ll store your data beforehand. Working with real-world datasets, gain the SQL skills you need to query a database, analyze results, and effectively communicate your insights to stakeholders. Courses:

  1. Understanding Data Visualization - Certification
  2. Introduction to Statistics - Certification
  3. Intermediate SQL - Certification
  4. Data Manipulation in SQL - Certification
  5. PostgreSQL Summary Stats and Window Functions - Certification
  6. Functions for Manipulating Data in PostgreSQL - Certification
  7. Exploratory Data Analysis in SQL - Certification
  8. Data-Driven Decision Making in SQL - Certification
  9. Data Communication Concepts - Certification
  10. Project: When Was the Golden Age of Video Games? - Certification

Google Data Analytics Professional Certificate

This is a path to a career in data analytics. Courses:

  1. Foundations: Data, Data, Everywhere - Certification
  2. Ask Questions to Make Data-Driven Decisions - Certification
  3. Prepare Data for Exploration - Certification
  4. Process Data from Dirty to Clean - Certification
  5. Analyze Data to Answer Questions - Certification
  6. Share Data Through the Art of Visualization - Certification
  7. Data Analysis with R Programming - Certification
  8. Google Data Analytics Capstone: Complete a Case Study - Certification

Practical Data Science Specialization

This Specialization is designed for data-focused developers, scientists, and analysts who are familiar with the Python and SQL programming languages and want to learn how to build, train, and deploy scalable, end-to-end ML pipelines - both automated and human-in-the-loop. - GitHub

Courses:

  1. Analyze Datasets and Train ML Models using AutoML - GitHub - Certification
  2. Build, Train, and Deploy ML Pipelines using BERT - GitHub - Certification
  3. Optimize ML Models and Deploy Human-in-the-Loop Pipelines - GitHub - Certification

Applied Data Science with Python Specialization

This skills-based specialization is intended for learners who want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques through popular python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into their data. - GitHub

Courses:

  1. Introduction to Data Science in Python - GitHub - Certification
  2. Applied Plotting, Charting & Data Representation in Python - GitHub - Certification
  3. Applied Machine Learning in Python - GitHub - Certification
  4. Applied Text Mining in Python - GitHub - Certification
  5. Applied Social Network Analysis in Python - GitHub - Certification

Deep Learning Specialization
DeepLearning.AI (offered on Coursera) January 2020 - May 2020

The Deep Learning Specialization is a foundational program that will help you understand the capabilities, challenges, and consequences of deep learning and prepare you to participate in the development of leading-edge AI technology. - GitHub

Courses:

  1. Neural Networks and Deep Learning - GitHub - Certification
  2. Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization - GitHub - Certification
  3. Structuring Machine Learning Projects - GitHub - Certification
  4. Convolutional Neural Networks - GitHub - Certification
  5. Sequence Models - GitHub - Certification

Python for Everybody Specialization

This Specialization builds on the success of the Python for Everybody course and introduces fundamental programming concepts including data structures, networked application program interfaces, and databases, using the Python programming language. - GitHub

Courses:

  1. Programming for Everybody (Getting Started with Python) - GitHub - Certification
  2. Python Data Structures - GitHub - Certification
  3. Using Python to Access Web Data - GitHub - Certification
  4. Using Databases with Python - GitHub - Certification
  5. Capstone: Retrieving, Processing, and Visualizing Data with Python - Certification

IBM Data Science Professional Certificate
IBM (offered on Coursera) April 2020 - July 2020

This course is a complete guide to master the SDIs. It is created by hiring managers who’ve been working at Google, Facebook, Microsoft, and Amazon. We’ve carefully chosen a set of questions that have not only been repeatedly asked at top companies, but also provide a thorough experience to handle any system design problem.

Courses:

  1. What is Data Science? - GitHub - Certification
  2. Tools for Data Science - GitHub - Certification
  3. Data Science Methodology - GitHub - Certification
  4. Python for Data Science, AI & Development - GitHub - Certification
  5. Python Project for Data Science - GitHub - Certification
  6. Databases and SQL for Data Science with Python - GitHub - Certification
  7. Data Analysis with Python - GitHub - Certification
  8. Data Visualization with Python - GitHub - Certification
  9. Machine Learning with Python - GitHub - Certification
  10. Applied Data Science Capstone - GitHub - Certification

The Fundamentals of Digital Marketing

This course covers the basics of digital marketing across 26 modules, all created by Google trainers and packed with practical exercises and real-world examples to help turn knowledge into action.

Awards

Inspire Rising Star Award
2021
In recognition of my zealous attitude, proven agility for learning, and focus on delivering quality outcomes, I was awarded the Rising Star award.
IETE's TECHNO 2k19 Project Competition
2019
Won 1st Prize & Cash Prize (Rs 10k) in State level Project Exhibition Competition.
Prime Minister’s Scholarship Scheme (PMSS)
2018
Merit Scholarship in Undergraduate Level, Savitribai Phule Pune University, Maharashtra, India.
Prime Minister’s Scholarship Scheme (PMSS)
2017
Merit Scholarship in Undergraduate Level, Savitribai Phule Pune University, Maharashtra, India.
Prime Minister’s Scholarship Scheme (PMSS)
2016
Merit Scholarship in Undergraduate Level, Savitribai Phule Pune University, Maharashtra, India.

Activities

March'23
Teens in AI, San Francisco, CA.
Oct'22 - Present
GGO (Gateway to the Great Outdoors), Chicago, IL 60616.
Presented research in the Impetus And Concepts 2020 (InC), an Intercollegiate International Level Technical Event of PICT.
PICT IET Student Chapter (PISC)'s Scientia '18 Technical Event, Pune, Maharashtra, India.
Credenz '17, PICT's IEEE Student Branch (PISB) Annual Technical Symposium, Pune, Maharashtra, India.
Impetus And Concepts 2017 (InC), an Intercollegiate International Level Technical Event of PICT.
Credenz '16, PICT's IEEE Student Branch (PISB) Annual Technical Symposium 2016.

Organizations

Executive member of GGO at IIT
Lead Organizer & Event Developer
Student - Software Engineer
Student Member