How to perform UI automation of browser apps with MSH (WATIR without the Ruby)

For anyone who’s familiar with WATIR, check out this simple MSH script I whipped up:

# Create an instance of IE
$ie = new-object -ComObject "InternetExplorer.Application"

# Navigate to MSN search
$ie.Navigate("http://search.msn.com")

# Make IE visible
$ie.Visible = $true

# Grab the DOM document instance
$document = $ie.Document

# !! See note at end of post !!

# Get the query text box and set the search term
$document.getElementById("q").value = "Drew Marsh"

# Click the search button
$document.getElementById("srch_btn").click()

Granted, this isn’t doing exactly what WATIR does yet (i.e. it’s not really emulating keystrokes in the text box). But all one needs to do is create a custom “snap-in” for MSH with a library of cmdlets analogous to the utility functions in WATIR’s set of libraries, and off you go.

Just to show off a little more of how MSH could be leveraged for browser app automation, imagine that, as part of a test, you wanted to go to the main page of your blog, find the most recent post and click on its header to make sure it takes you to the post’s standalone page. Well, the structure of my weblog is such that every post header is inside a <div> with the class “posthead”. Knowing this, we can write the following script to find the first post’s header link and click it:

# Create an instance of IE
$ie = new-object -ComObject "InternetExplorer.Application"

# Navigate to my weblog homepage
$ie.Navigate("http://blog.hackedbrain.com")

# Make the browser visible
$ie.Visible = $true

# Grab the DOM document instance
$document = $ie.Document

# !! See note at end of post !!

# Get all the DIVs, then reduce the set to only those
# that have a className of "posthead", then walk the DOM
# into the anchor element and click it
($document.getElementsByTagName("div") | where-object { $_.className -eq "posthead" })[0].firstChild.firstChild.click()

Now I will gladly concede that this example sucks compared to the WATIR equivalent you could whip up right now. However, as I mentioned earlier, we could quickly and easily extend MSH with a snap-in that introduces a set of cmdlets which return extended types, using adaptation to provide simple, rich document navigation and interaction. Just imagine writing this instead:

# Get an instance of IE the nice way
$ie = get-internetexplorer

# Hide all the threading junk behind a nice method
$ie.Open("http://blog.hackedbrain.com")

# Select elements out of the live DOM using XPath syntax
$postHeaderLink = $ie.Select("(//div[@class='posthead'])[1]//a")

# Interact with the elements, can even add custom methods to DOM
# elements using adaptation
$postHeaderLink.Click()
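
Until something like that exists, plain script functions can get part of the way there. Here’s a rough sketch that only uses the COM calls from the earlier samples. The function names Open-InternetExplorer and Select-ByClassName are made up for illustration; they are not part of MSH or any existing snap-in, and there is no XPath support here, just a filter on tag name and className:

```powershell
# Hypothetical helpers; the names are illustrative and not part of MSH.

# Start IE, navigate to $Url, and block until the document has loaded.
# Note: this loops forever if the page never finishes loading.
function Open-InternetExplorer([string] $Url)
{
  $ie = new-object -ComObject "InternetExplorer.Application"
  $ie.Visible = $true
  $ie.Navigate($Url)

  while (($ie.Document -eq $null) -or ($ie.Document.readyState -ne "complete"))
  {
    [System.Threading.Thread]::Sleep(100)
  }

  return $ie
}

# Return every element with the given tag name and className
function Select-ByClassName($Document, [string] $TagName, [string] $ClassName)
{
  $Document.getElementsByTagName($TagName) |
    where-object { $_.className -eq $ClassName }
}

# Usage, assuming the same page structure as the weblog example:
#   $ie = Open-InternetExplorer "http://blog.hackedbrain.com"
#   @(Select-ByClassName $ie.Document "div" "posthead")[0].firstChild.firstChild.click()
```

It’s still nowhere near as nice as the imagined version, but it does hide the waiting and the filtering behind two calls.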

So, the only remaining question is: who’s going to step up to the plate and write this thing? 🙂 I’m notoriously fickle when it comes to focusing on any one technology, so I’m not gonna volunteer. 😉 You could probably make some nice cash off of donations or something if you took it far enough, yet kept it open source. However, I think I’m going to have someone on my team investigate this a little further and continue playing the architect role. I’ll be sure to keep everyone up to date with any progress made.

Note: I left the following lines out of the two code samples above because they clouded the meaning. IE loads documents asynchronously, and you’re supposed to wait for the NavigateComplete event to fire before attempting to touch the Document property, but you can also just sleep the main thread like so:

# Wait for IE to load the document
while($document -eq $null)
{
  [System.Threading.Thread]::Sleep(100)
  $document = $ie.Document
}

# Wait for the document to load completely
while($document.readyState -ne "complete")
{
  [System.Threading.Thread]::Sleep(100)
}
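
One caveat with the loops above is that they spin forever if the page never finishes loading. A minimal sketch of a bounded version follows. Wait-Condition is a hypothetical helper name, not something MSH provides, and the 30 second default timeout is an arbitrary assumption:

```powershell
# Hypothetical helper (not part of MSH): polls $Condition until it returns
# $true or until $TimeoutMs milliseconds have elapsed.
# Returns $true on success and $false on timeout.
function Wait-Condition([scriptblock] $Condition, [int] $TimeoutMs = 30000, [int] $PollMs = 100)
{
  $deadline = [DateTime]::UtcNow.AddMilliseconds($TimeoutMs)

  while (-not (& $Condition))
  {
    if ([DateTime]::UtcNow -ge $deadline)
    {
      return $false
    }

    [System.Threading.Thread]::Sleep($PollMs)
  }

  return $true
}

# Usage with the $ie object from the samples above:
#   $loaded = Wait-Condition { ($ie.Document -ne $null) -and ($ie.Document.readyState -eq "complete") }
#   if (-not $loaded) { throw "Timed out waiting for the page to load" }
#   $document = $ie.Document
```

Polling is cruder than hooking the navigation events, but it keeps the script linear, and a test fails with a clear message instead of hanging.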

3 thoughts on “How to perform UI automation of browser apps with MSH (WATIR without the Ruby)”

  1. There is also SWExplorerAutomation (www.webunittesting.com)
    The program creates an automation API for any Web application which
    uses HTML and DHTML and works with Microsoft Internet Explorer. The Web
    application becomes programmatically accessible from any .NET language.

    SWEA API provides access to Web application controls and content. The
    API is generated using SWEA Visual Designer. SWEA Visual Designer helps
    create programmable objects from Web page content.

  2. Scott,

    That’s a good point. Although I would *think* a quick call to [System.Runtime.InteropServices.Marshal]::ReleaseComObject($ie) would do the trick without any need to bring the GC into the picture. Will play around with it and see what I figure out.

    Cheers,
    Drew
