upy*_*yop 3 python python-requests
I want to know if a word is in the dictionary.
Here is what I am trying.
import requests

def word_in_dictionary(word):
    response = requests.get('https://en.wiktionary.org/wiki/'+word)
    return response.status_code==200

print(word_in_dictionary('potato')) # True
print(word_in_dictionary('nobblebog')) # False
But unfortunately the dictionary contains a lot of words that are not English and I don't want to match those.
print(word_in_dictionary('bardzo')) # WANT THIS TO BE FALSE
So I tried to look in the content.
def word_in_dictionary(word):
    response = requests.get('https://en.wiktionary.org/wiki/'+word)
    return response.status_code==200 and 'English' in response.content.decode()
But I still get True. The string "English" appears somewhere in the page source even though the rendered page doesn't show it (searching with Ctrl-F in the browser finds nothing).
How can I make it only return True if it is actually listed as having a meaning in English?
Looking at the HTML, if the word has an English entry, the page contains a tag with id="English". You can try this code:
import requests
from bs4 import BeautifulSoup

def word_in_dictionary(word):
    response = requests.get('https://en.wiktionary.org/wiki/'+word)
    return response.status_code==200 and bool(BeautifulSoup(response.content, 'html.parser').select_one('#English'))

print(word_in_dictionary('potato')) # True
print(word_in_dictionary('nobblebog')) # False
print(word_in_dictionary('bardzo')) # False
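To see why the id check behaves differently from the plain substring check, here is a small offline sketch that uses only the standard library. The names `_IdFinder` and `has_language_section` are made up for illustration, and the two HTML snippets are hand-written stand-ins, not real Wiktionary markup. It assumes, as described above, that an English entry is marked by an element with id="English".

```python
from html.parser import HTMLParser


class _IdFinder(HTMLParser):
    """Hypothetical helper: notes whether any start tag has the target id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if dict(attrs).get("id") == self.target_id:
            self.found = True


def has_language_section(html, language="English"):
    """Return True if the HTML contains an element whose id equals `language`."""
    finder = _IdFinder(language)
    finder.feed(html)
    return finder.found


# Hand-written sample pages (assumptions, not actual Wiktionary output).
english_page = '<h2 id="English">English</h2><p>A tuber.</p>'
polish_page = '<a title="English Wikipedia">en</a><h2 id="Polish">Polish</h2>'

print(has_language_section(english_page))
print(has_language_section(polish_page))
print("English" in polish_page)
```

The second sample contains the text "English" only inside a link attribute, so a substring test matches it while the id test does not. The same function could be passed `response.text` from the request in place of the BeautifulSoup call if avoiding the extra dependency matters.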