1
votes

How to extract the text in a strong tag and the normal text separately

I have a list of li tags, and each li contains text in a strong tag followed by normal text. The XPath for the strong tag is //*[@id="main"]/li[1]/strong. How can I get the normal text? If I use the XPath of the li tag, it scrapes all the text. Is there a way to get the two pieces of text separately?

<li>
<strong>Heading</strong>
: Sample paragraph to get the text from here.
</li>

python selenium web-scraping webdriverwait xpath

2 comments

Are you open to using BeautifulSoup? Or does it have to be XPath?

try your_browser.find_element_by_xpath('//*[@id="main"]/li[1]').text

3 Answers:

0
votes

If you are using Selenium, use the JavaScript executor and get the lastChild of the node.

print(driver.execute_script('return arguments[0].lastChild.textContent;', driver.find_element_by_xpath('//*[@id="main"]/li[1]')))
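
To see why lastChild works here, it can help to look at the child nodes of the li. The following is a minimal offline sketch that uses Python's standard xml.dom.minidom instead of a browser; the markup is the snippet from the question, and the name last_child_text is made up for illustration. A browser DOM should expose the same structure, with the trailing text as a text node after the strong element.

```python
from xml.dom.minidom import parseString

# The snippet from the question, parsed offline instead of in a browser.
HTML = """<li>
<strong>Heading</strong>
: Sample paragraph to get the text from here.
</li>"""


def last_child_text(markup: str) -> str:
    """Return the stripped text of the root element's last child node.

    Hypothetical helper: returns an empty string when the last child
    is not a text node (for example, when the li ends with an element).
    """
    root = parseString(markup).documentElement
    last = root.lastChild
    if last is None or last.nodeType != last.TEXT_NODE:
        return ""
    return last.data.strip()


print(last_child_text(HTML))
```

The returned text still begins with the ": " separator, so it may need a further cleanup step depending on what you want to keep.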

0 comments

0
votes

You can get the text of the li and remove the text of the strong from it:

element = driver.find_element_by_xpath('//*[@id="main"]/li[1]')
all_text = element.text
element = element.find_element_by_xpath('./strong')
text = all_text.replace(element.text, '')
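
One caveat with a plain replace is that it removes every occurrence of the heading text and leaves the separator behind. A possible refinement, sketched here as a pure string helper so it runs without a browser: text_without_heading is a hypothetical name, and the sample string is only an assumed rendering of what element.text might return for the li in the question.

```python
def text_without_heading(all_text: str, heading: str) -> str:
    """Remove the first occurrence of heading and tidy what is left."""
    # count=1 keeps the heading word if it reappears later in the paragraph.
    remainder = all_text.replace(heading, "", 1)
    # Drop the leftover ":" separator and surrounding whitespace.
    return remainder.strip().lstrip(":").strip()


# Assumed rendering of the li text; the real value depends on the page.
sample = "Heading : Sample paragraph to get the text from here."
print(text_without_heading(sample, "Heading"))
```

With Selenium, the two arguments would be the all_text and element.text values from the snippet above.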

0 comments

-1
votes

To scrape the normal text, you need to induce WebDriverWait for visibility_of_element_located(), and since the desired node is a text node, you can use the execute_script() method with either of the following Locator Strategies:

XPath 1:

print(driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//li/strong[text()='Heading']/..")))))

XPath 2:

print(driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//li[./strong[text()='Heading']]")))))
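
Both locators are meant to resolve to the same li: the first finds the strong and steps up to its parent, the second selects the li that has a matching strong child. The sketch below checks this offline with the standard xml.etree.ElementTree module. It rests on two assumptions: the ul markup is a made-up list modelled on the question, and because ElementTree's limited XPath subset has no text() test, the predicates .='Heading' and strong='Heading' stand in for text()='Heading'.

```python
import xml.etree.ElementTree as ET

# Hypothetical list modelled on the question: two items under id="main".
DOC = ET.fromstring(
    '<ul id="main">'
    "<li><strong>Heading</strong>: Sample paragraph to get the text from here.</li>"
    "<li><strong>Other</strong>: Another paragraph.</li>"
    "</ul>"
)

# Variant 1: find the <strong>, then step up to its parent <li>.
via_parent = DOC.findall(".//li/strong[.='Heading']/..")

# Variant 2: select the <li> that has a matching <strong> child.
via_predicate = DOC.findall(".//li[strong='Heading']")

# Both lists should hold the same single <li> element.
print(via_parent == via_predicate, len(via_parent))

# In ElementTree, the text that follows <strong> is stored as its tail.
print(via_parent[0].find("strong").tail)
```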

0 comments