I am trying to use BeautifulSoup to extract links and text from a website (I have permission to do so).

I run the following code to get the links and text:

    import requests
    from bs4 import BeautifulSoup

    url = "http://implementconsultinggroup.com/career/#/6257"
    r = requests.get(url)

    # Name the parser explicitly so the result does not depend on what is installed.
    soup = BeautifulSoup(r.content, "html.parser")

    links = soup.find_all("a")

    for link in links:
        # Some <a> tags have no href, in which case get() returns None.
        href = link.get("href")
        if href and "career" in href:
            print("<a href='%s'>%s</a>" % (href, link.text))

This gives me the following output:

    View Position

    </a>
    <a href='/career/business-analyst-within-human-capital-management/'>
    Business analyst within human capital management
    COPENHAGEN • We are looking for an ambitious student with an interest in HR
    who is passionate about working in the cross-field of people management,
    business and technology




    View Position

    </a>
    <a href='/career/management-consultants-within-strategic-workforce-planning/'>
    Management consultants within strategic workforce planning
    COPENHAGEN • We are …
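
For comparison, here is a minimal sketch of one way to get a more compact result, assuming the aim is a single line per link with the whitespace inside each `<a>` collapsed. The inline `SAMPLE_HTML` and the helper name `career_links` are hypothetical stand-ins for the live page, not part of the code I ran, and `html.parser` is used only so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for the live page, shaped like the output above.
SAMPLE_HTML = """
<a href="/career/business-analyst-within-human-capital-management/">
Business analyst within human capital management
COPENHAGEN • We are looking for an ambitious student

View Position

</a>
<a href="/about/">About us</a>
<a>Anchor without href</a>
"""


def career_links(html):
    """Return (href, text) pairs for anchors whose href contains "career"."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    # href=True skips <a> tags that have no href attribute at all.
    for link in soup.find_all("a", href=True):
        href = link["href"]
        if "career" in href:
            # split() then join collapses newlines and runs of spaces into single spaces.
            text = " ".join(link.get_text().split())
            pairs.append((href, text))
    return pairs


for href, text in career_links(SAMPLE_HTML):
    print("%s -> %s" % (href, text))
```

With the real page, `r.content` from the request would be passed in place of `SAMPLE_HTML`. Whether the trailing "View Position" label should be kept or dropped depends on the output that is actually wanted.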