Scrape User Accounts from Instagram & TikTok

Posted: Sep 26, 2024

Last Updated: Feb 19, 2025

In today’s data-driven landscape, social media platforms like Instagram and TikTok are rich sources of information. Whether you're analyzing trends, gathering insights, or building a dataset for research, scraping user accounts can be highly beneficial. This article will guide you through scraping user accounts from Instagram and TikTok using AWS infrastructure, along with Python tools and libraries.

Why Use AWS for Scraping?

AWS (Amazon Web Services) provides scalable and reliable cloud computing resources. By leveraging AWS, you can:

Scale your scraping operations efficiently.
Access powerful EC2 instances for processing.
Store large datasets securely in S3.

Prerequisites

Before diving into the scraping process, ensure you have:

An AWS account.
Basic knowledge of Python and web scraping concepts.
Familiarity with command-line operations.

Step 1: Setting Up Your AWS Environment

A. Create an AWS Account

B. Launch an EC2 Instance

Navigate to the EC2 dashboard and launch a new instance. Choose an Amazon Machine Image (AMI), preferably Ubuntu for ease of setup. Select an instance type; t2.micro is often sufficient for low-volume scraping. Configure the security group to allow SSH access (port 22), ideally only from your own IP address.
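The same launch can also be scripted rather than clicked through in the console. The sketch below is one possible way to do it with boto3, which this article does not otherwise use. The helper names, the placeholder AMI and security group IDs, and the default region are all assumptions for illustration.

```python
def build_launch_params(ami_id, key_name, security_group_id, instance_type="t2.micro"):
    """Return keyword arguments for the EC2 RunInstances call."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": [security_group_id],
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch_instance(params, region="us-east-1"):
    """Launch one instance and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_launch_params works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]


# Placeholder IDs: substitute an Ubuntu AMI and a security group from your account.
params = build_launch_params("ami-PLACEHOLDER", "your-key", "sg-PLACEHOLDER")
print(params["InstanceType"])
```

Calling launch_instance(params) performs the actual launch, so it is left out of the demonstration lines to avoid creating billable resources by accident.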

C. Connect to Your Instance

Use SSH to connect to your EC2 instance:

chmod 400 your-key.pem  # SSH rejects private keys that are readable by others
ssh -i your-key.pem ubuntu@your-ec2-public-ip

Step 2: Install Necessary Dependencies

A. Install Python and Libraries

Once connected, install Python and the required libraries.

1. Install Python:

sudo apt update
sudo apt install -y python3 python3-pip

2. Install Libraries:

pip3 install requests beautifulsoup4 selenium instaloader TikTokApi
python3 -m playwright install  # TikTokApi drives a Playwright-managed browser

B. Set Up Web Driver for Selenium

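Install Chrome and a matching ChromeDriver (if you plan to use Selenium):

Download Chrome from the official Google Chrome site.
Download the ChromeDriver build that matches your Chrome version. Selenium 4.6 and later includes Selenium Manager, which can fetch a matching driver automatically.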

Step 3: Scraping Instagram Accounts

A. Using Instaloader

Instaloader is a powerful tool specifically designed for Instagram scraping.

Basic Usage

Log in and Scrape User Data:

import instaloader

L = instaloader.Instaloader()
L.login('your_username', 'your_password')  # Replace with your credentials

# Get profile information
profile = instaloader.Profile.from_username(L.context, 'target_username')  # Replace with target username

print(f'Username: {profile.username}')
print(f'Bio: {profile.biography}')
print(f'Followers: {profile.followers}')
print(f'Following: {profile.followees}')

# Scraping posts (post.url is the media file, so build the post page URL from the shortcode)
for post in profile.get_posts():
    print(f'Post URL: https://www.instagram.com/p/{post.shortcode}/')
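Printing the fields is fine for a quick check, but a dataset needs them stored. As a minimal sketch, the profile fields used above could be copied into a plain dict and appended to a JSON Lines file. The function names and the stand-in object are hypothetical; in practice the instaloader Profile would be passed in.

```python
import json
from types import SimpleNamespace


def profile_to_record(profile):
    """Copy the profile attributes used above into a plain dict."""
    return {
        "username": profile.username,
        "bio": profile.biography,
        "followers": profile.followers,
        "following": profile.followees,
    }


def append_jsonl(record, path):
    """Append one record to a file as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Stand-in object for demonstration only; it mimics the attribute names
# of an instaloader Profile without contacting Instagram.
demo = SimpleNamespace(username="example_user", biography="Hello",
                       followers=120, followees=45)
record = profile_to_record(demo)
print(record)
```

One record per line keeps the file appendable, so a long scraping run can be interrupted and resumed without rewriting earlier results.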

B. Using Selenium

If you need to scrape data from a public profile or handle specific interactions:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Set up Selenium; an EC2 instance has no display, so run Chrome headless
options = webdriver.ChromeOptions()
options.add_argument('--headless=new')
driver = webdriver.Chrome(options=options)  # Ensure chromedriver is in your PATH

try:
    driver.get('https://www.instagram.com/accounts/login/')

    # Wait for the login form to load
    wait = WebDriverWait(driver, 15)
    username_input = wait.until(EC.presence_of_element_located((By.NAME, 'username')))
    password_input = driver.find_element(By.NAME, 'password')

    # Log in
    username_input.send_keys('your_username')
    password_input.send_keys('your_password')
    password_input.submit()

    # Wait for the login to complete
    time.sleep(5)

    # Navigate to target profile
    driver.get('https://www.instagram.com/target_username/')  # Replace with target username
    time.sleep(5)

    # Scrape user data. Instagram's class names are auto-generated and change often,
    # so confirm this selector in the browser's developer tools before relying on it.
    bio = driver.find_element(By.CSS_SELECTOR, 'div.-vDIg > span').text
    print(f'Bio: {bio}')
finally:
    # Close the driver even if a step above fails
    driver.quit()

Step 4: Scraping TikTok Accounts

A. Using TikTokApi

The TikTokApi library allows easy access to TikTok's public data.

Basic Usage

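import asyncio
import os

from TikTokApi import TikTokApi

# Recent TikTokApi releases (6.x) are async and run through a Playwright browser session.
# ms_token is the msToken cookie value taken from a tiktok.com browser session.
ms_token = os.environ.get('ms_token')

async def main():
    async with TikTokApi() as api:
        await api.create_sessions(ms_tokens=[ms_token], num_sessions=1, sleep_after=3)
        user = api.user('username')  # Replace with target username
        data = (await user.info()).get('userInfo', {})
        # Key names mirror TikTok's response and may change without notice
        profile, stats = data.get('user', {}), data.get('stats', {})
        print(f"Username: {profile.get('uniqueId')}")
        print(f"Display Name: {profile.get('nickname')}")
        print(f"Followers: {stats.get('followerCount')}")
        print(f"Following: {stats.get('followingCount')}")

asyncio.run(main())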
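Records collected from either platform can then be pushed to S3, which the AWS section lists as the storage layer. The following is a sketch under assumptions rather than a prescribed layout: the date-partitioned key scheme, the function names, and the bucket name in the usage note are all hypothetical, and the upload relies on boto3, which this article does not install.

```python
from datetime import date


def build_s3_key(platform, username, day=None):
    """Build a date-partitioned object key such as tiktok/2025-02-19/name.json."""
    day = day or date.today()
    return f"{platform}/{day.isoformat()}/{username}.json"


def upload_record(record_json, bucket, key):
    """Upload a JSON string to S3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_s3_key works without boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=record_json.encode("utf-8"),
        ContentType="application/json",
    )


key = build_s3_key("tiktok", "target_username", date(2025, 2, 19))
print(key)  # tiktok/2025-02-19/target_username.json
```

A call such as upload_record('{"followers": 10}', 'your-bucket-name', key) would store the record. On an EC2 instance, attaching an IAM role with write access to the bucket avoids keeping access keys on disk.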

B. Using Selenium

If you want to interact with TikTok's web interface:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By

# Set up Selenium (headless, since an EC2 instance has no display)
options = webdriver.ChromeOptions()
options.add_argument('--headless=new')
driver = webdriver.Chrome(options=options)

try:
    driver.get('https://www.tiktok.com/@target_username')  # Replace with target username

    # Wait for the page to load
    time.sleep(5)

    # Scrape user data. The data-e2e attribute below is an assumption about
    # TikTok's current markup; verify it in the browser's developer tools.
    username = driver.find_element(By.TAG_NAME, 'h1').text
    followers = driver.find_element(By.CSS_SELECTOR, '[data-e2e="followers-count"]').text

    print(f'Username: {username}')
    print(f'Followers: {followers}')
finally:
    # Close the driver even if a step above fails
    driver.quit()
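Follower counts read from a web page arrive as display text, often abbreviated (for example "1.2M"), so they usually need converting before analysis. A small helper along these lines may be enough; the function name and the set of suffixes it handles are assumptions, and other locales may format numbers differently.

```python
import re

_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(text):
    """Convert display strings such as '1.2M' or '15,300' to an integer."""
    # A number, optionally followed by a K/M/B suffix that ends at a word boundary
    match = re.search(r"(\d[\d.,]*)\s*([KMB])?\b", text.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"no number found in {text!r}")
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    return int(round(number * _SUFFIXES.get(suffix, 1)))


print(parse_count("1.2M"))
print(parse_count("15,300"))
```

Abbreviated values are rounded by the site before display, so the parsed number is an approximation of the true count.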

C. Use Octoparse


Step 5: Important Considerations

Rate Limiting: Both Instagram and TikTok have rate limits. Be mindful of how frequently you make requests to avoid being banned.
Respect Privacy: Scrape only public data and adhere to each platform's terms of service.
CAPTCHA Handling: Be prepared to handle CAPTCHA challenges, especially with automated scripts.
Proxy Management: Rotate proxies regularly to reduce the risk of being blocked.
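The rate-limiting and proxy points above can be combined in one small wrapper. The sketch below is illustrative only: the class name, the default two-second interval, the retry count, and the proxy addresses are all assumptions, and the actual request is passed in as a function so the pattern stays independent of any HTTP library.

```python
import itertools
import random
import time


class PoliteFetcher:
    """Space out requests, retry with exponential backoff, and rotate proxies."""

    def __init__(self, fetch, proxies, min_interval=2.0, max_retries=4,
                 sleep=time.sleep, clock=time.monotonic):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self.fetch = fetch                        # callable(url, proxy) -> result
        self.proxies = itertools.cycle(proxies)   # round-robin rotation
        self.min_interval = min_interval          # minimum seconds between requests
        self.max_retries = max_retries
        self.sleep = sleep
        self.clock = clock
        self._last_request = None

    def _throttle(self):
        """Sleep until min_interval has passed since the previous request."""
        if self._last_request is not None:
            wait = self.min_interval - (self.clock() - self._last_request)
            if wait > 0:
                self.sleep(wait)
        self._last_request = self.clock()

    def get(self, url):
        for attempt in range(self.max_retries):
            self._throttle()
            proxy = next(self.proxies)
            try:
                return self.fetch(url, proxy)
            except Exception:
                if attempt == self.max_retries - 1:
                    raise
                # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
                self.sleep(2 ** attempt + random.uniform(0, 0.5))


# Demonstration with a fake fetch function that fails twice, then succeeds.
calls = []


def fake_fetch(url, proxy):
    calls.append(proxy)
    if len(calls) < 3:
        raise ConnectionError("simulated block")
    return f"ok via {proxy}"


fetcher = PoliteFetcher(fake_fetch, ["http://proxy-a:8080", "http://proxy-b:8080"],
                        sleep=lambda seconds: None)  # skip real sleeping in the demo
result = fetcher.get("https://example.com/profile")
print(result)
```

With the requests library, the fetch function could be something like lambda url, proxy: requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10). Each retry moves to the next proxy, so a blocked address is not reused immediately.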

Conclusion

Scraping user accounts from Instagram and TikTok using AWS can provide valuable insights while allowing for scalable operations. By following this guide, you can set up a robust scraping environment and gather the data you need ethically and responsibly.
