Sharing some python notebooks created along my Data Science Learning Journey

nVidiaPriyadarshini/DataScienceLearning

DataScienceLearning

Sharing some Python notebooks created along my Data Science Learning Journey

Project 1: GoodReads Web Scraping

Web scraping is the process of extracting, copying, and filtering data from web pages. It gives developers a way to collect and analyze data from the internet.

It is also a useful way to automate many of the repetitive steps a person performs while browsing.

In this project, we explore how to extract information from the popular Goodreads platform to analyze book trends and generate interesting insights.

Goodreads is one of the world’s largest communities for reviewing and recommending books. It's a favorite platform for many voracious readers!

This project is partly inspired by the following linked project. Reference link: https://medium.com/@soodakriti175/goodreads-web-scraping-92345b620f9c

I have structured this first Python notebook around the following tasks:

  1. How to scrape certain sections of a page using Beautiful Soup, in particular all books listed under a Goodreads user-defined list on a given page.
  2. How to iteratively scrape all pages of a particular list to obtain specific attributes for every book on it.
  3. How to load the scraped contents into a Pandas data frame.
  4. How to expand the scope and iteratively scrape all lists matching a set of user-defined tags (for example "fiction" or "science-fiction"), appending the extracted information to an existing .csv file stored in Google Drive.
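The tasks above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the HTML snippets, the CSS class names (`book`, `title`, `author`, `rating`), the book entries, and the helper names `parse_books`, `scrape_list`, and `append_to_csv` are all made up for the example, and real Goodreads markup will differ. In practice each page's HTML would be downloaded first; here two inline strings stand in for two pages of one list.

```python
from pathlib import Path

import pandas as pd
from bs4 import BeautifulSoup

# Stand-ins for two downloaded pages of one list. The markup, class names,
# and book entries are hypothetical, not real Goodreads HTML or data.
PAGE_1 = """
<table>
  <tr class="book"><td><a class="title">Example Book A</a></td>
      <td><span class="author">Author A</span></td>
      <td><span class="rating">4.1</span></td></tr>
  <tr class="book"><td><a class="title">Example Book B</a></td>
      <td><span class="author">Author B</span></td>
      <td><span class="rating">3.8</span></td></tr>
</table>
"""
PAGE_2 = """
<table>
  <tr class="book"><td><a class="title">Example Book C</a></td>
      <td><span class="author">Author C</span></td>
      <td><span class="rating">4.5</span></td></tr>
</table>
"""


def parse_books(html):
    """Extract one dict per book row from a single page of HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for row in soup.select("tr.book"):
        rows.append({
            "title": row.select_one("a.title").get_text(strip=True),
            "author": row.select_one("span.author").get_text(strip=True),
            "rating": float(row.select_one("span.rating").get_text(strip=True)),
        })
    return rows


def scrape_list(pages):
    """Parse every page of a list and collect the rows into one DataFrame."""
    records = []
    for html in pages:
        records.extend(parse_books(html))
    return pd.DataFrame(records, columns=["title", "author", "rating"])


def append_to_csv(df, path):
    """Append rows to a CSV file, writing the header only if the file is new."""
    path = Path(path)
    df.to_csv(path, mode="a", header=not path.exists(), index=False)


books = scrape_list([PAGE_1, PAGE_2])
print(books)
```

Calling `append_to_csv(books, "some_file.csv")` once per tag would grow a single file across several scraping runs; the file name here is only a placeholder.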

Relevant files: goodreads_fiction_types (output), goodreads_web_scraping.py, GoodReads_Web_Scraping.ipynb

Project 2: Text2SQLApplication

Converting natural language to SQL and querying a database using the Gemini Pro LLM
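The general flow of such an application can be sketched as below. This is only an illustration under stated assumptions, not the project's code: the `books` table, the prompt wording, and the helpers `build_prompt`, `fake_llm`, and `run_query` are hypothetical, and `fake_llm` returns a fixed string in place of a real Gemini Pro call so the example runs offline.

```python
import sqlite3

# Hypothetical schema used only for this illustration.
SCHEMA = "CREATE TABLE books (title TEXT, author TEXT, rating REAL)"


def build_prompt(question, schema):
    """Combine the schema and the user's question into one instruction."""
    return (
        "You translate questions into SQLite SQL.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "Answer with a single SELECT statement and nothing else."
    )


def fake_llm(prompt):
    """Placeholder for the model call; a real app would send `prompt` to the LLM."""
    return "SELECT title FROM books WHERE rating > 4.0 ORDER BY title"


def run_query(conn, sql):
    """Run model-generated SQL, refusing anything that is not a SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are executed")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        ("Example Book A", "Author A", 4.1),  # made-up rows
        ("Example Book B", "Author B", 3.8),
        ("Example Book C", "Author C", 4.5),
    ],
)

prompt = build_prompt("Which books are rated above 4?", SCHEMA)
rows = run_query(conn, fake_llm(prompt))
print(rows)
```

Checking that the generated statement starts with `SELECT` is a deliberately simple guard; generated SQL is untrusted input, so a real application would likely also use a read-only connection.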

CodeBase: Text2SQLAppGeminiPro

Project 3: Ensemble_Project

A telecom company wants to use its historical customer data and machine learning to predict customer behavior, with the end goal of developing focused customer retention programs.

The objective, as a data scientist hired by the telecom company, is to build a model that identifies customers with a high probability of churning. This helps the company understand the pain points and patterns behind customer churn and focus its retention strategy on the customers most at risk.
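As a rough sketch of what such a model could look like, the example below trains one ensemble classifier and ranks customers by predicted churn probability. Several things here are assumptions: the description does not say which ensemble methods the notebook uses, so a random forest is chosen purely for illustration, and the data is synthetic (generated with `make_classification`), not the telecom data set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for customer data: 1 = churned, 0 = stayed.
# Roughly 20% of the rows are churners, to mimic class imbalance.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=42,
)

# Hold out 25% of the customers, keeping the churn ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# A random forest is one common ensemble method (an assumption here).
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Probability of the positive (churn) class for each held-out customer.
churn_probability = model.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, model.predict(X_test))

# Indices of the ten held-out customers with the highest predicted risk.
highest_risk = churn_probability.argsort()[::-1][:10]

print(f"held-out accuracy: {accuracy:.3f}")
print("highest-risk customers (test-set indices):", highest_risk)
```

Ranking by probability rather than using only the hard 0/1 prediction fits the stated goal: a retention program can then target the customers at the top of the list. On imbalanced churn data, accuracy alone can be misleading, so metrics such as recall or ROC AUC are worth checking as well.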

CodeBase: Ensemble_Project.ipynb

Project 4: Cat Vs. Dog Image Classification

Building a Convnet from Scratch [Estimated completion time: 20 minutes]

In this exercise, we will build a classifier model from scratch that can distinguish dogs from cats. We will follow these steps:

  1. Explore the example data
  2. Build a small convnet from scratch to solve our classification problem
  3. Evaluate training and validation accuracy
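Step 2 might look something like the sketch below, assuming TensorFlow's Keras API. The layer sizes, the 150x150 RGB input shape, and the `build_convnet` helper are illustrative assumptions, not necessarily what the notebook uses.

```python
import tensorflow as tf


def build_convnet(input_shape=(150, 150, 3)):
    """Small convnet for binary classification (cat vs. dog)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Scale pixel values from [0, 255] to [0, 1].
        tf.keras.layers.Rescaling(1.0 / 255),
        # Three convolution + pooling stages with growing filter counts.
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        # Classifier head: one sigmoid unit gives the probability of one class.
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_convnet()
model.summary()
```

For step 3, `model.fit(...)` called with training and validation data returns a history object whose `history["accuracy"]` and `history["val_accuracy"]` lists hold the per-epoch values to compare; a growing gap between the two is a typical sign of overfitting.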

CodeBase: Cat vs. Dog Image Classification

About Me

Let’s connect at https://www.linkedin.com/in/vpnarayanan/ and exchange ideas about the latest tech trends and advancements! 🌟
