YouTube Channel Scraper (Python)

This script scrapes YouTube channel names and subscriber counts from the search results for a keyword, so that you can use them for lead generation.

Script:


from selenium import webdriver
from selenium.webdriver.common.by import By
from urllib.parse import quote_plus
import time
import pandas as pd

def convert_subscriber_count(subscriber_count):
    """Convert text such as '1.2K subscribers' into an integer."""
    # Remove the trailing word and any surrounding whitespace
    subscriber_count = subscriber_count.replace(' subscribers', '').replace(' subscriber', '').strip()
    if 'K' in subscriber_count:
        # round() avoids off-by-one results from floating-point truncation
        return round(float(subscriber_count.replace('K', '')) * 1000)
    elif 'M' in subscriber_count:
        return round(float(subscriber_count.replace('M', '')) * 1000000)
    else:
        return int(subscriber_count.replace(',', ''))

search_keyword = input("Enter the search keyword: ")
try:
    scroll_limit = int(input("Enter the number of times to scroll the page: "))
except ValueError:
    print("Please enter a valid number.")
    raise SystemExit(1)

# Set up the WebDriver
driver = webdriver.Chrome()

# Navigate to the YouTube search results URL, URL-encoding the keyword
# so that spaces and characters such as '&' are passed through correctly
url = f"https://www.youtube.com/results?search_query={quote_plus(search_keyword)}&sp=EgIQAg%253D%253D"
driver.get(url)

# Wait for the page to load
time.sleep(5)

# Scroll the page to load more results
scroll_count = 0
last_height = driver.execute_script("return document.documentElement.scrollHeight")


while scroll_count < scroll_limit:
    driver.execute_script("window.scrollTo(0, document.documentElement.scrollHeight);")
    time.sleep(5)
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
    scroll_count += 1

# Locate the elements containing the channel name and subscriber count
channels = driver.find_elements(By.XPATH, '//yt-formatted-string[@id="text" and not(@has-link-only_) and not(@ellipsis-truncate)]')
subscribers = driver.find_elements(By.XPATH, '//span[@id="video-count"]')

# Write the channel names and subscriber counts to an Excel file
rows = [
    {"Channel Name": channel.text, "Subscribers": convert_subscriber_count(subscriber.text)}
    for channel, subscriber in zip(channels, subscribers)
]
df = pd.DataFrame(rows, columns=['Channel Name', 'Subscribers'])

df.to_excel(f'{search_keyword}_youtube_channels.xlsx', index=False)


# Close the WebDriver
driver.quit()
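
The subscriber-count conversion is the one part of the script that can be checked without opening a browser. The sketch below repeats a version of convert_subscriber_count so that it runs on its own. The sample strings are assumed examples of the text YouTube displays, not captured output, and rounding instead of truncating is a choice made here to sidestep floating-point surprises.

```python
def convert_subscriber_count(subscriber_count):
    """Convert text such as '1.2K subscribers' into an integer."""
    # Remove the trailing word and any surrounding whitespace
    text = subscriber_count.replace(' subscribers', '').replace(' subscriber', '').strip()
    if 'K' in text:
        return round(float(text.replace('K', '')) * 1000)
    elif 'M' in text:
        return round(float(text.replace('M', '')) * 1000000)
    else:
        return int(text.replace(',', ''))


# Assumed sample inputs, for illustration only
samples = ['1.2K subscribers', '3.45M subscribers', '987 subscribers', '1 subscriber']
for sample in samples:
    print(sample, '->', convert_subscriber_count(sample))
```

If YouTube changes the wording of the count text, only this function should need adjusting.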

Prerequisites

Python 3.x, the Selenium library, the pandas library (with openpyxl, which pandas uses to write .xlsx files), and Chrome WebDriver.

Installation


pip install selenium pandas openpyxl

Usage

Run the script using the following command:


python script.py

When prompted, enter the search keyword and then the number of times to scroll the page. Scrolling stops early if no further results load. The script scrapes the YouTube search results and saves the channel names and subscriber counts to an Excel file named <search_keyword>_youtube_channels.xlsx.

Note: Make sure you have a stable internet connection while running the script.
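
The keyword typed at the prompt becomes part of the search URL, so spaces and characters such as & need URL-encoding to reach YouTube intact. Here is a minimal sketch, assuming the standard library's urllib.parse.quote_plus is an acceptable way to do that. build_search_url is a hypothetical helper name used only for illustration, and the sp value is copied unchanged from the script.

```python
from urllib.parse import quote_plus


def build_search_url(search_keyword):
    """Hypothetical helper: build the search results URL for a keyword."""
    # quote_plus turns spaces into '+' and escapes characters such as '&'
    encoded = quote_plus(search_keyword)
    # The sp value is copied unchanged from the script
    return f"https://www.youtube.com/results?search_query={encoded}&sp=EgIQAg%253D%253D"


print(build_search_url("cooking & baking tips"))
```

Without the encoding step, the & in a keyword like this one would be read as the start of a new query parameter and the rest of the keyword would be lost.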

License

This project is licensed under the MIT License.

Get in touch

Interested in working with me? Drop me a mail at osafalisayed@gmail.com or message me on WhatsApp.