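A demo of scraping online stock information with a Python crawler
Updated: 2018-01-05 14:12:00 Author: OliverkingLi
This article shares a demo that uses a Python crawler to scrape stock information from the web. It first collects stock codes from a list page, then fetches each stock's detail page and saves the fields it finds.
The example is as follows: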
import re
import traceback

import requests
from bs4 import BeautifulSoup


def getHTMLText(url):
    # Fetch a page and return its text, or "" if the request fails.
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def getStockList(lst, stockURL):
    # Collect stock codes such as sh600000 or sz000001 from the links on the list page.
    html = getHTMLText(stockURL)
    soup = BeautifulSoup(html, 'html.parser')
    for a in soup.find_all('a'):
        href = a.attrs.get('href', '')
        codes = re.findall(r"s[hz]\d{6}", href)
        if codes:
            lst.append(codes[0])


def getStockInfo(lst, stockURL, fpath):
    # Fetch each stock's page and append its fields to fpath, one dict per line.
    for stock in lst:
        url = stockURL + stock + ".html"
        html = getHTMLText(url)
        if html == "":
            continue
        try:
            infoDict = {}
            soup = BeautifulSoup(html, 'html.parser')
            stockInfo = soup.find('div', attrs={'class': 'stock-bets'})
            name = stockInfo.find_all(attrs={'class': 'bets-name'})[0]
            infoDict.update({'股票名稱': name.text.split()[0]})  # key means "stock name"

            keyList = stockInfo.find_all('dt')
            valueList = stockInfo.find_all('dd')
            # zip stops at the shorter list, so unequal lengths cannot raise IndexError.
            for key, val in zip(keyList, valueList):
                infoDict[key.text] = val.text

            with open(fpath, 'a', encoding='utf-8') as f:
                f.write(str(infoDict) + '\n')
        except Exception:
            # Pages without the expected layout end up here; log and move on.
            traceback.print_exc()
            continue


def main():
    stock_list_url = 'http://quote.eastmoney.com/stocklist.html'
    stock_info_url = 'https://gupiao.baidu.com/stock/'
    output_file = 'D:/BaiduStockInfo.txt'
    slist = []
    getStockList(slist, stock_list_url)
    getStockInfo(slist, stock_info_url, output_file)


if __name__ == '__main__':
    main()
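The link-filtering step in getStockList can be tried offline, without touching the network. The following is a minimal sketch, assuming hrefs shaped like those on the list page; the helper name extract_stock_codes and the sample hrefs are hypothetical and not part of the original script.

```python
import re

# Same pattern as in getStockList: "sh" or "sz" followed by six digits.
STOCK_CODE = re.compile(r"s[hz]\d{6}")


def extract_stock_codes(hrefs):
    """Return the first stock code found in each href, skipping hrefs without one."""
    codes = []
    for href in hrefs:
        match = STOCK_CODE.search(href)
        if match:
            codes.append(match.group(0))
    return codes


# Hypothetical hrefs, for illustration only.
sample_hrefs = [
    "http://quote.eastmoney.com/sh600000.html",
    "http://quote.eastmoney.com/sz000001.html",
    "http://quote.eastmoney.com/center/",
]
print(extract_stock_codes(sample_hrefs))
```

Checking for a match explicitly, rather than indexing the result of re.findall inside a try block, keeps unrelated errors from being silently swallowed.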
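An optimized version with a progress display. The encoding is now passed to getHTMLText explicitly instead of being detected from the page content through r.apparent_encoding, and the percentage of processed stocks is printed in place: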
import re

import requests
from bs4 import BeautifulSoup


def getHTMLText(url, code="utf-8"):
    # Fetch a page and decode it with the given encoding; return "" if the request fails.
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        r.encoding = code
        return r.text
    except requests.RequestException:
        return ""


def getStockList(lst, stockURL):
    # The list page is decoded as GB2312; collect codes such as sh600000 from its links.
    html = getHTMLText(stockURL, "GB2312")
    soup = BeautifulSoup(html, 'html.parser')
    for a in soup.find_all('a'):
        href = a.attrs.get('href', '')
        codes = re.findall(r"s[hz]\d{6}", href)
        if codes:
            lst.append(codes[0])


def getStockInfo(lst, stockURL, fpath):
    # Fetch each stock's page, append its fields to fpath, and print the progress.
    count = 0
    for stock in lst:
        url = stockURL + stock + ".html"
        html = getHTMLText(url)
        try:
            if html == "":
                continue
            infoDict = {}
            soup = BeautifulSoup(html, 'html.parser')
            stockInfo = soup.find('div', attrs={'class': 'stock-bets'})
            name = stockInfo.find_all(attrs={'class': 'bets-name'})[0]
            infoDict.update({'股票名稱': name.text.split()[0]})  # key means "stock name"

            keyList = stockInfo.find_all('dt')
            valueList = stockInfo.find_all('dd')
            for key, val in zip(keyList, valueList):
                infoDict[key.text] = val.text

            with open(fpath, 'a', encoding='utf-8') as f:
                f.write(str(infoDict) + '\n')
        except Exception:
            # Skip pages that do not have the expected layout.
            pass
        finally:
            # Runs for successes, failures, and empty pages alike, so the
            # percentage reaches 100 even when some stocks are skipped.
            count = count + 1
            # "\r" returns to the start of the line; the label means "current progress".
            print("\r當前進度: {:.2f}%".format(count * 100 / len(lst)), end="")


def main():
    stock_list_url = 'http://quote.eastmoney.com/stocklist.html'
    stock_info_url = 'https://gupiao.baidu.com/stock/'
    output_file = 'BaiduStockInfo.txt'
    slist = []
    getStockList(slist, stock_list_url)
    getStockInfo(slist, stock_info_url, output_file)


if __name__ == '__main__':
    main()
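The in-place progress display depends on the carriage return "\r", which moves the cursor back to the start of the line so that each update overwrites the previous one. The following is a minimal sketch of that technique on its own, with the page fetching replaced by a placeholder; the names progress_line and show_progress are hypothetical, and the label is written in English here for illustration.

```python
import sys
import time


def progress_line(done, total):
    """Build the text for one progress update; total must be positive."""
    return "\rProgress: {:.2f}%".format(done * 100 / total)


def show_progress(total, delay=0.0):
    """Print an updating percentage for `total` simulated work items."""
    for done in range(1, total + 1):
        time.sleep(delay)  # stand-in for fetching and parsing one page
        sys.stdout.write(progress_line(done, total))
        sys.stdout.flush()  # make the update visible immediately
    sys.stdout.write("\n")  # finish the line once all items are done


show_progress(4)
```

Keeping the formatting in a separate function makes it possible to check the percentage text without capturing terminal output.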
That is all for this demo of scraping online stock information with a Python crawler. I hope it gives you a useful reference, and thank you for supporting 腳本之家.