Posts by Bri*_*ley

Python Beautiful Soup: 'NavigableString' object has no attribute 'get_text'

I am trying to extract text from the following HTML structure:

<div class="account-places">
    <div>
        <ul class="location-history">
            <li></li>
            <li>Text to extract</li>
        </ul>
    </div>
</div>

I have the following BeautifulSoup code to do that:

from bs4 import BeautifulSoup as bs

soup = bs(html, "lxml")
div = soup.find("div", {"class": "account-places"})
text = div.div.ul.li.next_sibling.get_text()

But Beautiful Soup raises the error: 'NavigableString' object has no attribute 'get_text'. What am I doing wrong?
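
A minimal sketch of what is probably happening, using the HTML from the question: the whitespace between the two <li> tags is itself a node, so .next_sibling returns that text node (a NavigableString) rather than the second <li>. The sketch uses the built-in "html.parser" instead of "lxml" only so that it runs without extra dependencies; that swap is an assumption, not part of the original code.

```python
from bs4 import BeautifulSoup, NavigableString

html = """<div class="account-places">
    <div>
        <ul class="location-history">
            <li></li>
            <li>Text to extract</li>
        </ul>
    </div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
first_li = soup.find("div", {"class": "account-places"}).div.ul.li

# .next_sibling is the whitespace text node between the two <li> tags,
# not the second <li> itself.
sibling = first_li.next_sibling
is_text_node = isinstance(sibling, NavigableString)

# find_next_sibling("li") skips text nodes and returns the next <li> tag.
text = first_li.find_next_sibling("li").get_text()
print(is_text_node, text)
```

Searching for the next tag by name avoids depending on how the markup happens to be indented.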

python beautifulsoup

Bri*_*ley

lucky-day

Score: 8
Answers: 1
Views: 7424

Fastest way to read an .xlsx file in Python

I am trying to read data from an .xlsx file into a MySQL database using Python.

Here is my code:

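import MySQLdb
import openpyxl

wb = openpyxl.load_workbook(filename="file", read_only=True)
ws = wb['My Worksheet']

conn = MySQLdb.connect()
cursor = conn.cursor()

# disable autocommit so all inserts are committed together at the end
cursor.execute("SET autocommit = 0")

for row in ws.iter_rows(row_offset=1):
    sql_row = ...  # data I need
    cursor.execute("INSERT sql_row")  # placeholder for the real INSERT statement

conn.commit()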

Unfortunately, openpyxl's ws.iter_rows() is painfully slow. I have tried similar approaches with the xlrd and pandas modules. Still slow. Any ideas?
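
One thing that may be worth checking, as a hedged sketch rather than a diagnosis: the loop above issues one INSERT per row, so the database round trips may cost as much as the parsing. The example below reads plain values with values_only=True (available in recent openpyxl releases) and inserts them in a single executemany() call. It uses an in-memory workbook and sqlite3 purely so that it runs on its own; the table name, column names, and sample rows are hypothetical, and with MySQLdb the placeholders would be %s rather than ?.

```python
import io
import sqlite3

import openpyxl


def build_sample_workbook() -> io.BytesIO:
    """Create a tiny hypothetical workbook in memory for the demonstration."""
    wb = openpyxl.Workbook()
    ws = wb.active
    ws.title = "My Worksheet"
    ws.append(["name", "qty"])  # header row
    ws.append(["apple", 3])
    ws.append(["pear", 5])
    buf = io.BytesIO()
    wb.save(buf)
    buf.seek(0)
    return buf


def load_rows(source, sheet_name):
    """Read all data rows (skipping the header) as plain tuples of values."""
    wb = openpyxl.load_workbook(source, read_only=True)
    ws = wb[sheet_name]
    rows = list(ws.iter_rows(min_row=2, values_only=True))
    wb.close()
    return rows


rows = load_rows(build_sample_workbook(), "My Worksheet")

# sqlite3 stands in for MySQL here; the batching idea is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")
conn.executemany("INSERT INTO items (name, qty) VALUES (?, ?)", rows)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

For very large sheets, the rows could be inserted in chunks rather than collected into one list.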

python mysql xlrd pandas openpyxl

Bri*_*ley

2018-07-10

Score: 6
Answers: 1
Views: 2364

Python selenium: selenium.common.exceptions.NoSuchWindowException: Message: Browsing context has been discarded

I have the following code...

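import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# instantiate web driver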
profile = webdriver.FirefoxProfile("C:\\Users\\me\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\me.default")
driver = webdriver.Firefox(firefox_profile=profile)
driver.wait = WebDriverWait(driver, 5)

# browse to bot detection page
driver.get("https://botometer.iuni.iu.edu")

# click dropdown button on navbar
button = driver.wait.until(EC.presence_of_element_located((By.CLASS_NAME, "dropdown-toggle")))
button.click()

# click login link
login_link = driver.wait.until(EC.presence_of_element_located((By.LINK_TEXT, "Log In")))
login_link.click()

# switch to authorize window
new_window = driver.window_handles[1]
driver.switch_to.window(new_window)

# click authorize button 
authorize_button = driver.wait.until(EC.presence_of_element_located((By.ID, "allow")))
authorize_button.click()
time.sleep(5)

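...which does the following:

Instantiate the web driver
Navigate to a page
Click a button on the page to open a new window
Switch to the new window
Click another button in the new window

Unfortunately, after the first button is clicked, the new window never opens, and the program terminates with the following error: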

selenium.common.exceptions.NoSuchWindowException: Message: Browsing context has been discarded

Everything worked fine until today, and I am not sure what happened. Any ideas?
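
The code above indexes driver.window_handles[1] immediately after the click, which assumes the second window already exists. The sketch below polls for a new handle before switching. It may not address the underlying cause of the error, and the helper name wait_for_new_window and the FakeDriver class are hypothetical, introduced only so the example runs without a browser. The only WebDriver attribute it relies on is window_handles.

```python
import time


def wait_for_new_window(driver, known_handles, timeout=5.0, poll=0.25):
    """Return the first window handle not in known_handles.

    Raises TimeoutError if no new handle shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        for handle in driver.window_handles:
            if handle not in known_handles:
                return handle
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared within %.1f s" % timeout)
        time.sleep(poll)


class FakeDriver:
    """Stand-in for a WebDriver, used only to exercise the helper."""

    def __init__(self):
        self._calls = 0

    @property
    def window_handles(self):
        # The second window "opens" on the third time the handles are read.
        self._calls += 1
        return ["main"] if self._calls < 3 else ["main", "popup"]


driver = FakeDriver()
new_handle = wait_for_new_window(driver, {"main"}, timeout=2.0, poll=0.01)
print(new_handle)
```

With a real driver, the handles would be recorded before the click, for example known = set(driver.window_handles), and the returned handle passed to driver.switch_to.window().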

python selenium

Bri*_*ley

lucky-day

Score: 6
Answers: 1
Views: 2487

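Can an OTP supervisor monitor processes on a remote node?

I want to use Erlang's OTP supervisor in a distributed application I am building. But I am having trouble figuring out how such a supervisor can monitor processes running on remote nodes. Unlike Erlang's start_link function, start_child has no parameter for specifying the Node on which the child will be spawned.

Can an OTP supervisor monitor remote children? If not, how can I implement this in Erlang?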

erlang distributed erlang-otp erlang-supervisor

Bri*_*ley

lucky-day

Score: 5
Answers: 1
Views: 663

Splitting a line with multiple string delimiters using awk

I have a file named pet_owners.txt that looks like this:

petOwner:Jane,petName:Fluffy,petType:cat
petOwner:John,petName:Oreo,petType:dog
...
petOwner:Jake,petName:Lucky,petType:dog

I want to use awk to split each line on the delimiters 'petOwner', 'petName', and 'petType' so that I can extract the pet owner and the pet type. My desired output is:

Jane,cat
John,dog
...
Jake,dog

So far I have tried:

awk < pet_owners.txt -F'['petOwner''petName''petType']' '{print $1 $3}'

But the result is a bunch of newlines.

Any ideas on how to achieve this?
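
A likely reason for the empty output: the shell joins the adjacent quoted pieces, so awk receives the bracket expression [petOwnerpetNamepetType], which matches any single one of those characters rather than the three words. Splitting on the ':' and ',' characters instead puts the owner in field 2 and the type in field 6, for example awk -F'[:,]' '{print $2 "," $6}' pet_owners.txt. The same field logic is sketched below in Python, using the sample lines from the question; the function name owner_and_type is hypothetical.

```python
import re


def owner_and_type(line):
    """Split on ':' or ',' and return 'owner,type' for one input line."""
    fields = re.split(r"[:,]", line.strip())
    # fields: ['petOwner', owner, 'petName', name, 'petType', type]
    return f"{fields[1]},{fields[5]}"


sample = [
    "petOwner:Jane,petName:Fluffy,petType:cat",
    "petOwner:John,petName:Oreo,petType:dog",
    "petOwner:Jake,petName:Lucky,petType:dog",
]

result = [owner_and_type(line) for line in sample]
print("\n".join(result))
```

This assumes every line has exactly the three key:value pairs in the order shown, and that no value contains ':' or ','.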

bash awk

Bri*_*ley

lucky-day

Score: 5
Answers: 1
Views: 6663

Tag statistics

python ×3

awk ×1

bash ×1

beautifulsoup ×1

distributed ×1

erlang ×1

erlang-otp ×1

erlang-supervisor ×1

mysql ×1

openpyxl ×1

pandas ×1

selenium ×1

xlrd ×1
