python - Using the `.find_next_siblings` function in Beautiful Soup


I am attempting to write the output of web scraping to a CSV file. Here is my code:

    import bs4
    import requests
    import csv

    # Get the webpage for the Apple Inc. September income statement
    page = requests.get("https://au.finance.yahoo.com/q/is?s=aapl")

    # Put it into Beautiful Soup
    soup = bs4.BeautifulSoup(page.content)

    # Select the table that holds the data of interest
    table = soup.find("table", class_="yfnc_tabledata1")

    # Creates the headers for the table
    headers = table.find('tr', class_="yfnc_modtitle1")

    # Creates a generator that holds the 4 values of yearly revenues for the company
    total_revenue = headers.next_sibling
    cost_of_revenue = total_revenue.next_sibling
    gross_profit = cost_of_revenue.next_sibling.next_sibling
    wang = headers.find_next_siblings("tr")

    # Iterates through the generators above and writes the output to a CSV file
    with open('/home/kwal0203/desktop/apple.csv', 'a') as csvfile:
        writer = csv.writer(csvfile, delimiter="|")
        writer.writerow([value.get_text(strip=True).encode("utf-8") for value in headers])
        writer.writerow([value.get_text(strip=True).encode("utf-8") for value in total_revenue])
        writer.writerow([value.get_text(strip=True).encode("utf-8") for value in cost_of_revenue])
        writer.writerow([value.get_text(strip=True).encode("utf-8") for value in gross_profit])
        for dude in wang:
            writer.writerow([dude.get_text(strip=True).encode("utf-8")])

The problem is that I am repeating a lot of code when creating and writing each row to the CSV. As you can see, I keep repeating next_sibling to get the next row of values. I found the .find_next_siblings() function in Beautiful Soup, but each row that the function reads gets output to one cell of the CSV file, and I want each value in its own cell.

Any ideas? Let me know if the question is not clear.

Thanks.
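To see why each row ends up in a single cell: calling get_text() on a whole `tr` joins the text of all its cells into one string, so the list passed to writerow() has only one element. A minimal sketch of the difference follows, assuming Python 3 and a small made-up table (SAMPLE_HTML and both helper function names are hypothetical, not the real Yahoo Finance markup):

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the class name and the
# first two amounts of each row are borrowed from the post.
SAMPLE_HTML = (
    '<table>'
    '<tr class="yfnc_modtitle1"><th>Item</th><th>A</th><th>B</th></tr>'
    '<tr><td>Total Revenue</td><td>42,123,000</td><td>37,432,000</td></tr>'
    '<tr><td>Cost of Revenue</td><td>26,114,000</td><td>22,697,000</td></tr>'
    '</table>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
headers = soup.find("tr", class_="yfnc_modtitle1")


def rows_as_single_cells(header_row):
    """One string per row: the text of every cell is glued together."""
    return [[row.get_text(strip=True)]
            for row in header_row.find_next_siblings("tr")]


def rows_as_cell_lists(header_row):
    """One string per <td>, so csv.writer gets one field per cell."""
    return [[td.get_text(strip=True) for td in row.find_all("td")]
            for row in header_row.find_next_siblings("tr")]


print(rows_as_single_cells(headers))
print(rows_as_cell_lists(headers))
```

The second helper is the shape csv.writer expects: a list of rows, each of which is a list of separate field strings.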

Okay, this is not a perfect solution, I suppose, but the idea is to check the next siblings for amounts and skip the rows without them:

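    import re  # needed for the amount pattern below

    # One list of cell texts per row that follows the header row
    next_rows = [[td.get_text(strip=True).encode("utf-8") for td in row('td')]
                 for row in headers.find_next_siblings("tr")]

    # Keep only cells that look like amounts (digits and commas), then drop empty rows
    pattern = re.compile(r'^[\d,]+$')
    data = [[item for item in l if pattern.match(item)] for l in next_rows]
    data = [l for l in data if l]

    with open('/home/kwal0203/desktop/apple.csv', 'a') as csvfile:
        writer = csv.writer(csvfile, delimiter="|")
        writer.writerows(data)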

This produces:

    42,123,000|37,432,000|45,646,000|57,594,000
    26,114,000|22,697,000|27,699,000|35,748,000
    16,009,000|14,735,000|17,947,000|21,846,000
    1,686,000|1,603,000|1,422,000|1,330,000
    3,158,000|2,850,000|2,932,000|3,053,000
    11,165,000|10,282,000|13,593,000|17,463,000
    307,000|202,000|225,000|246,000
    11,472,000|10,484,000|13,818,000|17,709,000
    11,472,000|10,484,000|13,818,000|17,709,000
    3,005,000|2,736,000|3,595,000|4,637,000
    8,467,000|7,748,000|10,223,000|13,072,000
    8,467,000|7,748,000|10,223,000|13,072,000
    8,467,000|7,748,000|10,223,000|13,072,000

These are the amounts from the table.
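The filtering step can be tried on its own, without fetching the page. The following is a rough sketch, assuming Python 3 (so the text is not encoded to bytes) and a few made-up sample rows; keep_amount_rows, to_pipe_delimited and sample_rows are hypothetical names used only for illustration:

```python
import csv
import io
import re

# Same pattern as in the answer: a cell made up only of digits and commas.
AMOUNT = re.compile(r'^[\d,]+$')


def keep_amount_rows(rows):
    """Drop non-amount cells, then drop rows that end up empty."""
    filtered = [[item for item in row if AMOUNT.match(item)] for row in rows]
    return [row for row in filtered if row]


def to_pipe_delimited(rows):
    """Write rows to an in-memory buffer instead of a file on disk."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows: labels mixed with amounts, a heading-only row, an empty row.
sample_rows = [
    ["Total Revenue", "42,123,000", "37,432,000"],
    ["Operating Expenses"],  # hypothetical section heading with no amounts
    ["Cost of Revenue", "26,114,000", "22,697,000"],
    [],
]

print(to_pipe_delimited(keep_amount_rows(sample_rows)))
```

Note that this pattern also discards the row labels, and it would not match amounts written in another form, such as "(1,234)" or a "-" placeholder, so it may need adjusting for other tables.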

