Shristi Shrestha

Passionate about the entire data lifecycle, from engineering robust software solutions to deploying machine learning models that tackle real-world challenges.

About

I am a results-driven Data Scientist and Software Engineer with a passion for transforming complex data into actionable insights. I thrive at the intersection of machine learning, cloud computing, and robust software development. My professional experience at industry leaders like IBM has equipped me with a strong background in developing and testing scalable, data-intensive applications.

Currently, I'm pursuing my Master's in Data Science at the University of Memphis, where I'm diving deep into advanced statistical learning and deep learning. I'm on a mission to leverage technology to solve real-world challenges and build intelligent, data-driven applications.

Languages: Python, R, SQL, Bash, SAS, Java, HTML
Databases: MSSQL, MySQL, Oracle, Snowflake
Methodologies/Concepts: Agile, Statistical Analysis, Hypothesis Testing, Regression Analysis, Machine and Deep Learning
Tools & Technologies: RStudio, VS Code, PyCharm, Jupyter, Toad, Kibana, Grafana, Tableau, Excel, Matplotlib, JIRA, Git, Bitbucket, Maven, Gradle, Jenkins

Looking for a challenging position that combines my skills in Data Science and Software Engineering and offers professional development, interesting experiences, and personal growth.

Experience

IBM

Software Quality Assurance Engineer

Worked on the IBM Cloud Object Storage (COS) System Integration team, developing Python automation test scripts for cloud storage solutions.
Collaborated with cross-functional teams to investigate software defects across the IBM COS application.
Proactively adapted to shifting priorities and effectively managed multiple competing projects, consistently delivering high-quality results on time.
Participated in peer code reviews of GitHub PRs and contributed to Confluence wiki documentation of application features.
Actively engaged in Agile practices such as stand-ups, sprint planning, backlog grooming, and sprint retrospectives to ensure high-quality, timely sprint deliverables.

Oct 2021 - Apr 2023 | Chicago, Illinois

Yes Energy

Software Quality Assurance Engineer

Worked in the evolving power market domain, which involves highly intricate business logic, to deliver robust, high-quality energy data to customers in real time.
As a member of the Big Data team, designed and executed Oracle PL/SQL and Snowflake scripts to verify complex backend business logic and ensure data accuracy in the Data Cloud and AWS Data Lake, respectively.
Designed and developed Python automation scripts using a multiprocessing library to simulate multiple concurrent requests for testing system performance and subscription-tier throttling limits.
Collaborated across the full SDLC, from collecting business requirements and analyzing user stories to designing and documenting test cases for use-case scenarios and recording test results.

Nov 2020 - Oct 2021 | Boulder, Colorado

Projects

                
Wildfire Severity Classification

Investigating Wildfire Severity using Data Mining and Machine Learning Classification Models

Accomplishments
Technologies and tools used: Python, Pandas, Google Colab, RapidMiner, sklearn, imblearn, NumPy, Leaflet, Jupyter Notebook.
Followed the KDD (Knowledge Discovery in Databases) process and performed correlation analysis to identify features relevant to wildfire spread and to classify wildfire sizes.
Addressed a multi-class classification problem with imbalanced data.
Designed various machine learning classification models and compared them using performance metrics such as precision, recall, and confusion matrices.

Smart IoT Data Analytics and Visualization

A Smart IoT Data Analytics pipeline and Visualization framework for IoT sensor data.

Accomplishments
Software-based technologies and tools used: AWS (IoT Core, IAM, DynamoDB, EC2, Lambda, EMR, CloudWatch, S3, SNS), MySQL, HTML, CSS, JavaScript, AJAX, Leaflet, Linux, MATLAB.
Hardware-based technologies and tools used: Waspmote IDE, Waspmotes, Meshlium, C++, MQTT.
Established AWS cloud infrastructure to analyze the IoT sensor data and visualized the data in a web dashboard.

Web Dashboard for Smart IoT Data

A web dashboard to visualize the IoT data mapped from the AWS NoSQL database.

Accomplishments
Tools: HTML, CSS, JavaScript, Node.js, ChartJS, AJAX, Leaflet.

Nursing Home Database Management

A CRUD Spring application with a UI for management of a nursing home database.

Accomplishments
Tools: Spring Boot, JSP, Java, JDBC, Maven, MySQL.

Exploring ML/DL for Named Entity Recognition (NER)

A comparison of models for NER, classifying named entities in unstructured text.

Accomplishments
Evaluated and compared models such as CRF, LSTM, BiLSTM, and a hybrid BiLSTM-CRF for identifying and classifying named entities.
Technologies and tools used: Python, Pandas, sklearn, TensorFlow, Keras, K-fold cross validation, Jupyter Notebook.

Predictive Policing in Memphis

Using data science to derive insights from MPD Public Safety data for strategic resource deployment.

Accomplishments
Technologies and tools used: Python, Pandas, sklearn, NumPy, Matplotlib, Jupyter Notebook.
Executed the full data science lifecycle, from data cleaning and EDA to model development.
Cleaned and preprocessed a large-scale dataset of over 640,000 crime incident records.
Developed Decision Tree and Random Forest models to predict crime severity.

House Price Prediction using Linear Regression

Predictive models using linear regression and regularization for predicting house prices.

Accomplishments
Technologies and tools used: R, RStudio.
Performed data preprocessing and exploratory data analysis (EDA) to understand variable relationships.
Implemented various linear regression models, including subset selection, polynomial regression, and regularization techniques (Ridge, Lasso, Elastic Net).