python - Combine Duplicate Sibling HTML Tags


I am trying to parse, search, and modify HTML in Python, and I am running into issues when duplicate sibling tags divide a string.

For example, take the string: "“$7,500,000” reference “$10,000,000”."

The HTML with the duplicate tags:

<font style="font-size:12pt;"></font>  <font style="font-style:italic;font-size:12pt;">$</font>  <font style="font-style:italic;font-size:12pt;">7</font>  <font style="font-style:italic;font-size:12pt;">,</font>  <font style="font-style:italic;font-size:12pt;">500</font>  <font style="font-style:italic;font-size:12pt;">,000</font>  <font style="font-size:12pt;">” reference “</font>  <font style="font-style:italic;font-size:12pt;">$</font>  <font style="font-style:italic;font-size:12pt;">10,0</font>  <font style="font-style:italic;font-size:12pt;">00,000</font>  <font style="font-size:12pt;">”.</font>

The desired output, with the tags combined:

<font style="font-size:12pt;"></font>  <font style="font-style:italic;font-size:12pt;">$7,500,000</font>  <font style="font-size:12pt;">” reference “</font>  <font style="font-style:italic;font-size:12pt;">$10,000,000</font>  <font style="font-size:12pt;">”.</font>

I have tried using HTML Tidy, but the only option I saw there removes the tags entirely (the "drop-font-tags" option). I do not want that, because I still want the styling that the tags provide.
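
One possible approach, as a minimal sketch using only the standard library: repeatedly merge any two adjacent <font> tags whose attributes are character-for-character identical and that are separated only by whitespace. The function name merge_adjacent_font_tags is made up for illustration. The sketch assumes the <font> tags are flat (no nested markup inside them) and that whitespace between two merged tags can be discarded, as in the desired output above.

```python
import re

# Two adjacent <font> tags with identical attribute text (\1), separated only
# by whitespace. Assumes the tag contents hold no nested markup ([^<]*).
ADJACENT_PAIR = re.compile(r'<font([^>]*)>([^<]*)</font>\s*<font\1>([^<]*)</font>')


def merge_adjacent_font_tags(html):
    """Merge runs of sibling <font> tags that share the same attributes."""
    previous = None
    # One pass merges non-overlapping pairs only, so repeat until stable.
    while previous != html:
        previous = html
        html = ADJACENT_PAIR.sub(r'<font\1>\2\3</font>', html)
    return html


if __name__ == "__main__":
    sample = (
        '<font style="font-style:italic;">$</font>  '
        '<font style="font-style:italic;">7</font>  '
        '<font style="font-style:italic;">,500</font>  '
        '<font style="font-size:12pt;">reference</font>'
    )
    print(merge_adjacent_font_tags(sample))
```

Because this relies on a regular expression, it only holds up for simple, flat markup like the example. If the tags can be nested, or if the same attributes may appear in a different order or with different quoting, walking the tree with a real HTML parser and comparing each tag's name and attributes to its previous sibling would likely be more robust.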

