XmlSelectNodes Task for MSBuild

Since MSBuild ships with only a core set of intrinsic tasks, we recently downloaded the MSBuild Community Tasks and went to town experimenting with converting some of our existing NAnt scripts to MSBuild. We use NAnt’s xmlpeek/xmlpoke quite a bit to tweak our .NET config files for different deployment targets. I noticed the community tasks project had a task called XmlRead, so I figured that would pretty much provide us with everything we needed to get the job done. Unfortunately, it turned out to be a pretty limited implementation and lacked a couple of things I would expect:

  1. It only supports a single namespace/prefix mapping – If you had a mixed XML file and wanted to work with multiple namespaces in a single query you’d be SOL.
  2. The result of the XPath is nothing more than a string – If the supplied XPath results in multiple nodes being found, the values of those nodes are concatenated into a semicolon-separated string.

For #1, the answer seemed pretty straightforward to me after a little research on properties. It turns out properties can contain XML, and this XML can itself be “parameterized” by other properties and then passed to a task as a simple string value. The task can then parse this and interpret it however it pleases. So, instead of exposing two separate properties, Prefix and Namespace, like XmlRead did, I would take a string of XML with the following structure, which can easily be placed in a property like so:

<PropertyGroup>
  <XmlNamespaces>
    <Namespaces xmlns="">
      <Add Prefix="med" NamespaceUri="http://schemas.mimeo.com/entityDirectory" />
    </Namespaces>
  </XmlNamespaces>
</PropertyGroup>

Notice that you need to put the Namespaces element into an empty namespace. Otherwise it would end up in the MSBuild namespace. The property is then simply passed to the task like so:

<XmlSelectNodes NamespacesXml="$(XmlNamespaces)" … />

The task then parses that XML and builds an XmlNamespaceManager, which it passes to SelectNodes when performing the query so that namespace prefixes in the XPath resolve accordingly. Please note this property is optional: if you don’t need namespaces, don’t even worry about it.
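
As a rough sketch of the multiple-mapping case that motivated this, a property could register two prefixes at once, with one URI supplied through another property. The EntityDirectoryNamespace property name is made up for illustration; the Namespaces/Add structure is the one described above.

```xml
<PropertyGroup>
  <!-- Hypothetical helper property, used only to show parameterization -->
  <EntityDirectoryNamespace>http://schemas.mimeo.com/entityDirectory</EntityDirectoryNamespace>
  <XmlNamespaces>
    <Namespaces xmlns="">
      <!-- $(EntityDirectoryNamespace) is expanded before the task sees the string -->
      <Add Prefix="med" NamespaceUri="$(EntityDirectoryNamespace)" />
      <Add Prefix="msbuild" NamespaceUri="http://schemas.microsoft.com/developer/msbuild/2003" />
    </Namespaces>
  </XmlNamespaces>
</PropertyGroup>
```

With both prefixes registered, a single XPath can mix med: and msbuild: steps in the same query.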

#1 was easy and, in all honesty, wasn’t a showstopper for us. I just knew it could be done a little better. #2, on the other hand, turned out to be a big problem for us. We needed to execute a task for every occurrence of a node found by an XPath query. Since all the community project’s XmlRead task does is return a string, it’s pretty much useless to us since, as you’ll learn quickly with MSBuild, there are no functions that can be used to crack a string apart and then act on it in a programmatic way. Upon first trying to figure out how to solve it, I realized why XmlRead probably took the route it did… MSBuild is tough to wrap your head around at first! Yet I knew there had to be a better way to implement this thing, otherwise MSBuild was doomed, and I just knew MS couldn’t have messed up this badly. So I dug in and read everything I could find for the next hour or so.

What I initially imagined was returning the actual XmlNode instances I found from my custom task to MSBuild and then just pulling properties off them, like I would if I were working in some environment that used reflection against CLR objects. Well, forget that! MSBuild doesn’t use reflection that way, so you can’t actually return random native CLR types. It turns out the answer to working with structured data lies in MSBuild’s concept of Items and Metadata. I already went into this in the previous post, so I’ll skip repeating it all here. Suffice it to say that until you really grasp these concepts you won’t get very far with the technology.

Tasks can be passed, as well as return, MSBuild Items through their properties. Items are worked with via an interface called ITaskItem. ITaskItem provides one important identifying property, called ItemSpec, and then a bunch of methods for getting/setting metadata. As soon as I discovered this I knew exactly how I would solve my problem of returning structured data for the nodes that were found. I created an Output property for my Task called SelectedNodeData which was an array of ITaskItem. By making it an array I’ve basically told MSBuild that I may return more than one item and that it should treat these items as a set.

Now that I had a means for returning a structured set of data, all I had to do was perform my XPath query and translate the resulting node list into a set of ITaskItem implementations. Luckily MSBuild comes with a Utilities namespace that provides a simple ITaskItem implementation, fittingly called TaskItem. This guy is basically a wrapper around an IDictionary of name/value pairs. Initially I simply filled the metadata with the basics of XmlNode:

Name              Value
NodeType          XmlNode.NodeType.ToString() (since it’s an enum)
NodeLocalName     XmlNode.LocalName
NodeNamespaceURI  XmlNode.NamespaceURI
NodeValue         XmlNode.Value

Later I extended the task to also output the attributes of any node that has them (i.e. elements); more on that further down.
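
To picture what comes back, the item produced for a single attribute node such as Name="Main" would carry metadata along these lines. It’s written here as a static item declaration purely for illustration: the task creates its items at run time, the SelectedNodes item name is arbitrary, and the Include value is a placeholder, since the ItemSpec the task actually assigns isn’t covered here.

```xml
<ItemGroup>
  <!-- Include is a hypothetical placeholder for whatever ItemSpec the task assigns -->
  <SelectedNodes Include="Name">
    <NodeType>Attribute</NodeType>
    <NodeLocalName>Name</NodeLocalName>
    <!-- an unprefixed attribute belongs to no namespace, so this is empty -->
    <NodeNamespaceURI></NodeNamespaceURI>
    <NodeValue>Main</NodeValue>
  </SelectedNodes>
</ItemGroup>
```

Each piece of metadata can then be read with the usual %(ItemName.MetadataName) syntax.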

Now, to get a better feeling for where we are, let’s look at a sample that might actually use this task:

<Project DefaultTargets="Main" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <UsingTask AssemblyFile="Mimeo.MSBuildUtilities.dll" TaskName="Mimeo.Utilities.MSBuild.XmlSelectNodes" />
  <PropertyGroup>
    <!-- this is the XML property that defines the namespace mappings to use -->
    <XmlNamespaces>
      <Namespaces xmlns="">
        <Add Prefix="msbuild" NamespaceUri="http://schemas.microsoft.com/developer/msbuild/2003" />
      </Namespaces>
    </XmlNamespaces>
  </PropertyGroup>
  <ItemGroup>
    <!--
      this creates an item out of the build file itself, which is just another
      XML file I can use for demo purposes
    -->
    <ProjectFile Include="test.build" />
  </ItemGroup>
  <Target Name="Main">
    <!--
      Here we go:
        1) pass in the XmlNamespaces property
        2) pass the XPath that says to grab all the Name attributes
           of Target elements in the Project
        3) pass the project file item as the XML file to evaluate via the XmlFile property
    -->
    <XmlSelectNodes NamespacesXml="$(XmlNamespaces)" XPath="/msbuild:Project/msbuild:Target/@Name" XmlFile="@(ProjectFile)">
      <!--
        use the Output element to tell MSBuild to turn the items returned from the task
        into an item set named SelectedNodes
      -->
      <Output TaskParameter="SelectedNodeData" ItemName="SelectedNodes" />
    </XmlSelectNodes>
    <!--
      Prove that it worked by outputting the value of each selected node
    -->
    <Message Text="Target found: %(SelectedNodes.NodeValue)" />
  </Target>
  <Target Name="Foo">
    <!--
      NOTE: this is just a sample Target for illustration purposes
      so the XPath resolves more than one node
    -->
  </Target>
</Project>

This sample script will basically end up outputting the following:

Target Main:
    Target found: Main
    Target found: Foo

As this script shows, I can act on each node that is found, and on its data, with a task. In the sample’s case I simply use a Message task, but this could just as easily be any other task.
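
For instance, the Message task could be swapped for, or joined by, other tasks that batch over the same metadata. The fragment below is only a sketch meant to sit inside the Main target after the XmlSelectNodes call; the echo command and the condition are made up for illustration.

```xml
<!-- Runs once per distinct NodeValue thanks to MSBuild task batching -->
<Exec Command="echo Processing target %(SelectedNodes.NodeValue)" />

<!-- Metadata can drive conditions too: this skips the batch whose NodeValue is Main -->
<Message Text="Non-default target: %(SelectedNodes.NodeValue)"
         Condition="'%(SelectedNodes.NodeValue)' != 'Main'" />
```

Because the task attributes reference %(SelectedNodes.NodeValue), MSBuild splits the item set into batches and invokes the task once per batch, which is what gives the per-node behavior.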

To wrap up, I want to follow up on what I said before the example about returning a node’s attributes. Once I had the basic functionality working, I realized how much richer the implementation could get. Since the metadata is just a dictionary, I can transplant the values of an XmlNode’s attributes into the metadata collection for the Item. You can accomplish this by doing two things:

  1. You need to select nodes that have attributes. Basically this means your XPath will result in element type nodes being selected.
  2. You set ExpandAttributes="true" on the XmlSelectNodes task. This tells the task that you want the attributes exported as metadata. The default is false, since it’s obviously cheaper not to do this.

Now let’s look at what would change in our sample if we used this feature instead. First we’d change the XPath to select the Target elements themselves instead of their Name attributes:

<XmlSelectNodes XPath="/msbuild:Project/msbuild:Target" … />

Next, instead of outputting NodeValue in our Message Task we can actually output the named attribute itself:

<Message Text="Target found: %(SelectedNodes.Name)" />

I thought this feature was a nice addition since it allows you to use the actual attribute names right in your code, not to mention that if you want to output multiple attributes from a single element this will be faster, since the element can be resolved and all its attributes read in one shot. Do note that I don’t support namespace-qualified attributes with this syntax, though, since there’s no straightforward way that I can think of to do that. For those you’ll just have to use a straight attribute-selecting XPath.
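
Put together, the attribute-expanding variant of the sample might look something like the sketch below. It reuses the property and item names from the sample build file; the DependsOnTargets attribute is an assumption added here to show two attributes being read from one element (the sample targets only define Name, so it would come out empty for them).

```xml
<XmlSelectNodes NamespacesXml="$(XmlNamespaces)"
                XPath="/msbuild:Project/msbuild:Target"
                XmlFile="@(ProjectFile)"
                ExpandAttributes="true">
  <Output TaskParameter="SelectedNodeData" ItemName="SelectedNodes" />
</XmlSelectNodes>

<!-- Both values come from the same item, so the element is only resolved once -->
<Message Text="Target %(SelectedNodes.Name) depends on: %(SelectedNodes.DependsOnTargets)" />
```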

So, here’s the source for the task and sample build file. Hope you find it as useful as I do.
