Channel: How do I scrape data from URLs in a python-scraped list of URLs? - Stack Overflow

How do I scrape data from URLs in a python-scraped list of URLs?

I'm trying to use BeautifulSoup4 in Orange to scrape data from a list of URLs scraped from that same website.

I have managed to scrape the data from a single page when I set the URL manually:

    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    import requests
    import csv
    import re

    url = "https://data.ushja.org/awards-standings/zone-points.aspx?year=2021&zone=1&section=1901"
    req = requests.get(url)
    soup = BeautifulSoup(req.text, "html.parser")
    rank = soup.find("table", class_="table-standings-body")
    for child in rank.children:
        print(url,child)

I have also been able to scrape the list of URLs I need:

    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    import requests
    import csv
    import re

    url = "https://data.ushja.org/awards-standings/zones.aspx?year=2021&zone=1"
    req = requests.get(url)
    soup = BeautifulSoup(req.text, "html.parser")
    rank = soup.find("table", class_="table-standings-body")
    link = soup.find('div',class_='contentSection')
    url_list = link.find('a').get('href')
    for url_list in link.find_all('a'):
        print (url_list.get('href'))

But so far I haven't been able to combine the two to scrape the data from every URL in that list. Can I do that only by nesting for loops, and if so, how? If not, how else can I do it?

I am sorry if this is a stupid question, but I only started with Python and web scraping yesterday, and I have not been able to figure this out by consulting similar topics.
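
For illustration, here is a minimal sketch of one way the two snippets could be combined without deeply nested loops: one helper collects the links, another parses a standings page, and a single loop ties them together. The function names (`extract_links`, `extract_rows`, `fetch_html`, `scrape_all`) are hypothetical and not part of any library. The sketch also assumes that the `href` values may be relative (hence `urljoin`) and that the standings table contains ordinary `tr`/`td` rows; neither has been checked against the live site.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(index_html, base_url):
    """Return absolute URLs for every <a href> inside div.contentSection."""
    soup = BeautifulSoup(index_html, "html.parser")
    section = soup.find("div", class_="contentSection")
    if section is None:  # page layout differs from the one in the question
        return []
    # urljoin resolves relative hrefs against the index page URL
    # (assumption: the hrefs may be relative; absolute ones pass through).
    return [urljoin(base_url, a["href"]) for a in section.find_all("a", href=True)]


def extract_rows(page_html):
    """Return the cell texts, row by row, from table.table-standings-body."""
    soup = BeautifulSoup(page_html, "html.parser")
    table = soup.find("table", class_="table-standings-body")
    if table is None:  # no standings table on this page
        return []
    # Assumption: the table is made of regular <tr> rows with <td>/<th> cells.
    return [
        [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        for tr in table.find_all("tr")
    ]


def fetch_html(url):
    """Download one page and return its HTML text."""
    # Imported here so the parsing helpers above can be used without requests.
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
    return response.text


def scrape_all(index_url, fetch=fetch_html):
    """Fetch the index page, then fetch and parse every page it links to."""
    results = {}
    for url in extract_links(fetch(index_url), index_url):
        results[url] = extract_rows(fetch(url))
    return results
```

Calling `scrape_all("https://data.ushja.org/awards-standings/zones.aspx?year=2021&zone=1")` would then return a dictionary that maps each linked URL to its list of table rows. The `fetch` parameter exists only so the parsing can be tried on saved HTML without hitting the network.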

