FILTERXML function
The FILTERXML function in Excel is used to extract data from an XML string based on a specified XPath query. It is a powerful function for working with XML data, allowing you to query XML documents or strings to retrieve specific pieces of data.
Syntax:
=FILTERXML(xml, xpath)
Parameters:
- xml (required): The XML string or reference to an XML data source that you want to extract information from.
- xpath (required): The XPath query used to define the data you want to extract from the XML.
What the Function Does:
- The
FILTERXMLfunction takes an XML string and uses an XPath query to extract values from that XML. - XPath is a language used to query XML documents, and it allows you to specify elements or attributes you want to extract.
Example of Usage:
1. Basic Example:
Suppose you have the following XML data stored in cell A1:
<bookstore>
<book>
<title lang="en">Excel 101</title>
<author>John Doe</author>
<price>29.99</price>
</book>
<book>
<title lang="en">Excel Advanced</title>
<author>Jane Smith</author>
<price>39.99</price>
</book>
</bookstore>
You can extract the title of the first book using the FILTERXML function. The XPath query for extracting the title is //title.
In cell B1, use the formula:
=FILTERXML(A1, "//title")
This will return:
Excel 101
2. Extracting Multiple Values:
If you want to extract the titles of all books in the XML, you can still use FILTERXML. The XPath query //title selects all title elements in the XML.
For example, the following formula:
=FILTERXML(A1, "//title")
Will return an array of values (in Excel 365 or Excel 2021, this would spill the results into adjacent cells):
Excel 101
Excel Advanced
3. Using XPath to Extract Specific Attributes:
You can also use XPath to extract specific attributes from XML elements. For example, if you want to extract the lang attribute from the title element:
=FILTERXML(A1, "//title/@lang")
This will return:
en
4. Extracting a Value from Nested Elements:
If you have deeply nested elements in your XML data, you can use more specific XPath queries. For example, if you want to extract the price of the first book, you could use an XPath query like //book[1]/price:
=FILTERXML(A1, "//book[1]/price")
This will return:
29.99
Notes:
- Error Handling: If the XPath query does not find a match,
FILTERXMLwill return#VALUE!. - XPath Syntax: XPath is a query language, and you should be familiar with basic XPath syntax to use
FILTERXMLeffectively. It allows you to select elements, attributes, and text nodes, and it can navigate complex XML structures. - XML Input: The XML data can be directly input as a string or as a reference to a cell that contains XML data.
Limitations:
- The
FILTERXMLfunction is not available in Excel for Mac or older versions of Excel. It’s generally available in Excel 2013 and later versions (Windows only). - Excel imposes a limit on the size of the XML string. Very large XML documents might cause performance issues.
Related Functions:
WEBSERVICE: This function can retrieve data from a web service as a string, which can then be parsed usingFILTERXML.TEXTandMID: These are other text functions you might use if you need to manipulate strings that aren’t necessarily XML but have similar structures.
The FILTERXML function is ideal for anyone working with XML data directly in Excel, especially when you need to automate the extraction of information from structured data sources, such as APIs, web services, or XML files.