this post was submitted on 18 Jul 2023

When I'm writing web scrapers, I mostly just switch between Selenium (when the website is too "fancy" and definitely needs a browser) and plain requests calls, both in conjunction with bs4.

But when reading about scrapers, Scrapy is often the first Python package mentioned. What am I missing out on by not using it?

qwertyasdef@programming.dev (1 point, 1 year ago)

Oh shit, that sounds useful. I just did a project where I implemented a custom stream class to chain together calls to requests and BeautifulSoup.
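For readers wondering what such a "stream class" might look like, here is a minimal sketch. It is not the commenter's actual code: `Stream`, `LinkParser`, `extract_links`, `fetch`, and `FAKE_PAGES` are all hypothetical names, and the standard library's `html.parser` stands in for BeautifulSoup (and a canned dict for requests) so the example runs offline.

```python
from html.parser import HTMLParser
from typing import Callable, Iterable, Iterator


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (stand-in for a BeautifulSoup query)."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Parse one HTML document and return every link target found in it."""
    parser = LinkParser()
    parser.feed(html)
    return parser.links


class Stream:
    """Hypothetical lazy stream: each step wraps the previous iterable."""

    def __init__(self, items: Iterable) -> None:
        self._items = items

    def map(self, fn: Callable) -> "Stream":
        return Stream(fn(x) for x in self._items)

    def flat_map(self, fn: Callable) -> "Stream":
        return Stream(y for x in self._items for y in fn(x))

    def filter(self, pred: Callable) -> "Stream":
        return Stream(x for x in self._items if pred(x))

    def __iter__(self) -> Iterator:
        return iter(self._items)


# Canned page so the sketch runs without a network connection.
# In a real scraper, fetch() would be something like requests.get(url).text.
FAKE_PAGES = {
    "https://example.com/": '<a href="/a">A</a><a href="/b.pdf">B</a>',
}


def fetch(url: str) -> str:
    return FAKE_PAGES[url]


links = list(
    Stream(["https://example.com/"])
    .map(fetch)                 # URL -> HTML text
    .flat_map(extract_links)    # HTML text -> individual hrefs
    .filter(lambda href: not href.endswith(".pdf"))
)
```

Nothing is fetched or parsed until the final `list(...)` consumes the chain. Retries, throttling, and concurrency would all have to be added by hand in a design like this.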

Wats0ns@programming.dev (2 points, 1 year ago)

Yep, try Scrapy. It also handles the concurrency of your pipeline items for you, the configuration for every part, ...
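As a rough illustration of the pipeline and configuration side, here is a minimal sketch. The class name `NormalizeTitlePipeline`, the `title` field, and the module path `myproject.pipelines` are made up for this example. `process_item(self, item, spider)` is the classic item pipeline hook and `ITEM_PIPELINES`, `CONCURRENT_REQUESTS`, and `DOWNLOAD_DELAY` are standard Scrapy settings, but check the docs for your Scrapy version before relying on the exact signature.

```python
class NormalizeTitlePipeline:
    """Item pipeline sketch: Scrapy calls process_item for each scraped item.

    A pipeline is a plain class, so no Scrapy import is needed to define it.
    """

    def process_item(self, item, spider):
        # Assumes the spider yields dict-like items with a "title" field
        # (a hypothetical field chosen for this example).
        item["title"] = item["title"].strip()
        return item


# These would normally live in the project's settings.py.
ITEM_PIPELINES = {
    # Hypothetical module path; the number sets the order (lower runs first).
    "myproject.pipelines.NormalizeTitlePipeline": 300,
}
CONCURRENT_REQUESTS = 8   # how many requests Scrapy keeps in flight at once
DOWNLOAD_DELAY = 0.5      # seconds to wait between requests to the same site
```

With this in place, the spider only yields items. Scheduling the requests and passing each item through the pipeline is left to the framework.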