
DealsHunter

This repository consists of various web scraping projects built during my Python stack internship @ Infosys Springboard.

Dependencies:

  1. Python (version 3.13.0 or above)
  2. A Python IDE (Visual Studio Code / PyCharm / IDLE / Eclipse)
  3. Libraries: Streamlit, Beautiful Soup, Requests, Selenium

Milestone 1

A website, "DealsHunter", is built with Streamlit in Python; it scrapes deals from [DealsHeaven](https://dealsheaven.in/) using the Beautiful Soup and Requests libraries.

In the folder Milestone 1, run app.py:

    streamlit run app.py
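A minimal sketch of that scraping step, assuming hypothetical CSS selectors (inspect dealsheaven.in for the actual class names):

```python
# Sketch of the Milestone 1 scraper. The selectors below are
# illustrative assumptions, not DealsHeaven's confirmed markup.
import requests
from bs4 import BeautifulSoup

def fetch_deals(page: int = 1) -> list[dict]:
    url = f"https://dealsheaven.in/?page={page}"
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    deals = []
    for card in soup.select(".product-item-detail"):  # assumed container class
        title = card.select_one("h3")
        price = card.select_one(".price")
        deals.append({
            "title": title.get_text(strip=True) if title else "",
            "price": price.get_text(strip=True) if price else "",
        })
    return deals

if __name__ == "__main__":
    print(fetch_deals(page=1)[:5])
```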


Milestone 1 Enhancement

A status bar is added and the UI is modified for visual appeal. A help section is also provided.

In the folder Milestone 1, run milestone_1.py:

    streamlit run milestone_1.py
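A rough sketch of how the status bar and help section can be wired up; the page count, messages, and help text here are illustrative, not the app's exact code:

```python
# Sketch of the status bar and help section using Streamlit.
import time
import streamlit as st

with st.expander("Help"):  # the help section
    st.write("Pick a page range and click Scrape to fetch deals.")

pages = 5  # illustrative page count
progress = st.progress(0.0, text="Scraping...")
for i in range(pages):
    time.sleep(0.1)  # placeholder for the per-page scraping call
    progress.progress((i + 1) / pages, text=f"Scraped page {i + 1} of {pages}")
```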


Milestone 2

DealsHunter

The DealsHunter website is enhanced further for a better user experience, and filtering by category is integrated. Products are displayed with their respective images and other details.

In the folder Milestone 2, run milestone_2_t1.py:

    streamlit run milestone_2_t1.py
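A sketch of the category filtering idea, with made-up sample records standing in for the scraper's output:

```python
# Sketch of category filtering in Streamlit. The records and
# categories are illustrative; the app fills them from the scraper.
import streamlit as st

deals = [
    {"title": "Headphones", "price": "Rs. 999", "image": "https://example.com/a.jpg", "category": "Electronics"},
    {"title": "T-shirt", "price": "Rs. 299", "image": "https://example.com/b.jpg", "category": "Fashion"},
]

choice = st.sidebar.selectbox("Category", ["All", "Electronics", "Fashion"])
for deal in deals:
    if choice != "All" and deal["category"] != choice:
        continue
    st.image(deal["image"], width=150)
    st.write(deal["title"], deal["price"])
```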


Public Library

Using Selenium, we scrape the states and their respective library information from the Public Libraries website. Using sqlite3, we store the scraped data in two tables, related to each other by a common state ID. Using Streamlit, the scraped library information for a specific chosen state is displayed.

In the folder Milestone 2, run milestone_2_t2.py:

    streamlit run milestone_2_t2.py
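A minimal sketch of the Selenium step that collects state names; the URL and selector are assumptions, not the site's confirmed markup:

```python
# Sketch of the state-scraping step. Adjust the URL and selector
# to match the actual Public Libraries website.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://publiclibraries.com/state/")  # assumed listing page

states = [link.text for link in driver.find_elements(By.CSS_SELECTOR, "ul li a")]
print(states[:5])
driver.quit()
```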

States Table (2 fields: state_id, state_name)


Libraries Table (7 fields: id, state_id, city, library, address, zip, phone)


Schema of the two tables in the database libraries_data.db


Relation between the two tables: States (strong entity) -> Libraries (weak entity)

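A sketch of how the two tables can be created with sqlite3, following the field lists above (the column types are assumptions):

```python
# Sketch of the libraries_data.db schema: states is the strong
# entity; libraries references it through state_id.
import sqlite3

conn = sqlite3.connect("libraries_data.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS states (
    state_id   INTEGER PRIMARY KEY,
    state_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS libraries (
    id       INTEGER PRIMARY KEY,
    state_id INTEGER NOT NULL REFERENCES states(state_id),
    city     TEXT,
    library  TEXT,
    address  TEXT,
    zip      TEXT,
    phone    TEXT
);
""")
conn.commit()
conn.close()
```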

GUI using Streamlit


Milestone 3

Using Selenium, we scrape job cards from the Behance job listings, up to the number of pages scrolled (10 by default). Then a GUI is built with Streamlit, where a dynamic search bar (which makes searching easier by offering pre-existing options in a drop-down) is implemented, and the matching job listings are displayed as cards. The scraper file must therefore be executed before the UI file, since it scrapes and stores the data.

In the folder Milestone 3, run the scraper file first:

    python scraper.py
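A sketch of the scroll-then-scrape loop; the job-card selector is an assumption, so check Behance's actual markup:

```python
# Sketch of scraper.py's scrolling loop: scroll a fixed number of
# times (10 by default) so Behance loads more job cards, then collect them.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.behance.net/joblist")

for _ in range(10):  # default number of scrolls
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)  # give the page time to load more cards

cards = driver.find_elements(By.CSS_SELECTOR, "div[class*='JobCard']")  # assumed selector
print(f"Collected {len(cards)} job cards")
driver.quit()
```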

Then run the UI file:

    streamlit run ui.py
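A sketch of the dynamic search bar idea, with illustrative records in place of the scraper's stored data (st.selectbox narrows its drop-down options as you type):

```python
# Sketch of ui.py's search assistance. The job records are
# illustrative; the app loads what scraper.py stored.
import streamlit as st

jobs = [
    {"title": "UI Designer", "company": "Acme"},
    {"title": "Illustrator", "company": "Globex"},
]

titles = sorted({job["title"] for job in jobs})
query = st.selectbox("Search jobs", ["All"] + titles)

for job in jobs:
    if query != "All" and job["title"] != query:
        continue
    st.subheader(job["title"])
    st.caption(job["company"])
```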

GUI: Displaying all the scraped data initially


Dynamic Search Assistance


Fetched Results

