
Manual XPaths

Last Updated: Aug 31, 2016 09:51AM PDT
The Manual XPath feature allows you to override the XPath that the Extractor generates automatically. The button is located in the top right-hand corner of the column you are training. Clicking "Manual XPath" from this button makes a bar appear underneath the column headings, where you can enter your own XPath.

Prefer a video example? Click here.
 



Why would I want to do this?


If the data you want to extract is hidden behind a drop-down menu, not fully visible on the webpage, or otherwise not selectable in the Extractor, you can still extract it as long as it appears in the page's HTML.
 

Show me an Example


https://www.etsy.com/uk/listing/178381309/engagement-ring-box-proposal-box-ring?ref=hp_so_crsl_l



This item on etsy.com has 2 prices that can be selected by clicking on a drop-down menu. Imagine you want to capture them both.

In the Extractor, it is not possible to click on the drop-down and then select each piece of data individually, but you can see from the HTML that the data is there (right click > inspect).

 


In this case, the XPath that locates the data we want is: 

//*[contains(@name,'listing')]/option[position()>1]

In short, this XPath tells the Extractor to: find any element whose "name" attribute contains the text "listing"; then select the "option" elements directly inside it; but ignore the first option.

When you enter this XPath in the Manual XPath bar, the corresponding data appears in the column.
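To make the selection logic concrete, here is a minimal Python sketch that reproduces the XPath's three steps by hand using only the standard library. It does not show how the Extractor itself evaluates XPaths. The markup, the name "listing-variation" and the option labels are hypothetical stand-ins, not the real Etsy page. The steps are written as plain Python because the standard library's XPath subset does not support contains() or position()>1.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the drop-down on the page.
SAMPLE = """
<div>
  <select name="listing-variation">
    <option value="">Select an option</option>
    <option value="1">Small box</option>
    <option value="2">Large box</option>
  </select>
  <select name="quantity">
    <option value="1">1</option>
  </select>
</div>
"""

def select_options(root):
    """Mimic //*[contains(@name,'listing')]/option[position()>1]."""
    results = []
    for element in root.iter():                      # //*  : every element
        if "listing" in element.get("name", ""):     # [contains(@name,'listing')]
            options = element.findall("option")      # /option : direct children
            results.extend(opt.text for opt in options[1:])  # [position()>1]
    return results

root = ET.fromstring(SAMPLE)
values = select_options(root)
print(values)
```

With this sample markup, the placeholder entry "Select an option" is skipped and the "quantity" drop-down is ignored because its name does not contain "listing".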


 

Important


Element names containing "io-", e.g.

<io-text

or attribute values containing "io-", e.g.

class="io-cursor-not-allowed-CHFG"

cannot be used as XPath markers.

The Extractor adds these HTML elements and attributes to the page; they are not part of the original webpage.

This does not limit what you can extract with manual XPaths, because all of the original HTML is still present.
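As a rough illustration, the sketch below checks an XPath string for Extractor-added markers before you paste it in. It assumes, based only on the two examples above, that such markers always take the form "io-" followed by a name. The function name extractor_markers is hypothetical and is not part of the Extractor.

```python
import re

# Assumption: Extractor-added names look like "io-something", as in the
# examples above. Ordinary words that merely contain "io", such as
# "option" or "position", are deliberately not matched.
IO_MARKER = re.compile(r"\bio-[\w-]+")

def extractor_markers(xpath):
    """Return any Extractor-style 'io-' names referenced in an XPath string."""
    return IO_MARKER.findall(xpath)

good = extractor_markers("//*[contains(@name,'listing')]/option[position()>1]")
bad = extractor_markers("//*[contains(@class,'io-cursor-not-allowed-CHFG')]")
print(good)
print(bad)
```

An empty result means the XPath relies only on names from the original page. A non-empty result lists the markers to replace with original elements or attributes.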



If you want to learn more about XPaths, try one of these two great sources:

W3 Schools
Mozilla Developer Network

Have fun!