How to Use Python's BeautifulSoup Module to Parse Tags in a Web Page (CSS)

by 龍冥 2021-05-14


If you have not installed the lxml parser yet, install it first with the following command:

pip install lxml

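The first two lines import BeautifulSoup and the urlopen function. Because the page contains Chinese characters, the bytes returned by read() must be decoded with decode('utf-8').
The decoded HTML is then printed. The page used here is a test site provided by 莫凡Python (Mofan Python).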

from bs4 import BeautifulSoup
from urllib.request import urlopen

# the page contains Chinese text, so decode the bytes as UTF-8
html = urlopen("https://mofanpy.com/static/scraping/list.html").read().decode('utf-8')

print(html)
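To see why the decode step matters, here is a minimal sketch that does not touch the network. It assumes that read() hands back raw bytes, which is how urlopen responses behave; the sample markup below is made up for illustration and is not the real content of the test page.

```python
# Hypothetical sample standing in for the bytes returned by urlopen(...).read()
raw = '<li class="month">一月</li>'.encode('utf-8')

# Before decoding, the data is a bytes object, not readable text
print(type(raw).__name__)

# decode('utf-8') turns the bytes into a str with the Chinese characters intact
text = raw.decode('utf-8')
print(type(text).__name__)
print(text)
```

Without decode('utf-8'), printing the data would show escaped byte values instead of the Chinese characters.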

This code extracts the elements whose class is month: find_all collects every li tag that uses the month class, and print(m.get_text()) displays the text of each one.

from bs4 import BeautifulSoup
from urllib.request import urlopen

# the page contains Chinese text, so decode the bytes as UTF-8
html = urlopen("https://mofanpy.com/static/scraping/list.html").read().decode('utf-8')
# use the class to narrow the search:
# find only the <li> tags with the given class
soup = BeautifulSoup(html, features='lxml')
month = soup.find_all('li',{"class":"month"})
for m in month:
    print(m.get_text())
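The same class filter can be tried offline on a small inline snippet. The sketch below is only an illustration: the sample_html string is a hypothetical stand-in and not the actual markup of the test page, and it uses Python's built-in html.parser instead of lxml so that it runs even without lxml installed. It also shows two other ways BeautifulSoup offers to express the same search, the class_ keyword and a CSS selector.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (not the real content of the test page)
sample_html = """
<ul>
  <li class="month">January</li>
  <li class="day">1st</li>
  <li class="month">February</li>
</ul>
"""

# html.parser is used here as an assumption-free default; features='lxml' works the same way
soup = BeautifulSoup(sample_html, features='html.parser')

# 1) Attribute dictionary, as in the code above
by_dict = [li.get_text() for li in soup.find_all('li', {"class": "month"})]

# 2) class_ keyword (the trailing underscore avoids Python's reserved word "class")
by_keyword = [li.get_text() for li in soup.find_all('li', class_='month')]

# 3) CSS selector syntax
by_selector = [li.get_text() for li in soup.select('li.month')]

print(by_dict)      # ['January', 'February']
print(by_keyword)   # ['January', 'February']
print(by_selector)  # ['January', 'February']
```

All three forms return the same two li elements and skip the one whose class is day, so which one to use is mostly a matter of taste.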
