Thursday, March 1, 2012

COUNTER SUSHI harvester, pt 1

COUNTER is a standard developed for journal and database vendors to deliver usage statistics to libraries. SUSHI came a few years later as a standard way to gather COUNTER reports with automation tools. Theoretically, any library can make a simple program to gather this data and download it instantly.

But apparently no one has.

At niso.org, there is a listing of several tools and projects for helping you create a SUSHI 'client' harvester. Most of these projects have not been updated since mid-2009, which is around the time the SUSHI standard was first released. Most of the tools are aimed at helping vendors develop a SUSHI-compliant server, not at helping libraries harvest from one.

From what I can tell, no libraries have taken advantage of SUSHI on their own. The only libraries using SUSHI are the ones that subscribe to an ERM product--Expens... errr... Electronic Resource Management. This is a sad situation. SUSHI was intentionally designed as a free, open, standards-compliant avenue for gathering data, but hardly anyone is using it, and those who are pay huge amounts of money to do so.

Since all the ERM products on the market are too expensive and rather unimpressive, I have taken it upon myself to build one, and the SUSHI harvester is a critical part of that. It's one of the first parts I plan to tackle because gathering usage stats by hand from hundreds of locations takes hours, and, frankly, I am too lazy to do so.

Luckily, laziness is the catalyst for all great change. Who do you think invented the remote control? Rocky Balboa?

Building a SUSHI harvester

The first step in building a SUSHI harvester is... well... figuring out how in the world to make a SUSHI harvester. Since I am apparently going it alone on this, I started with the SUSHI standard definition.

SUSHI uses the SOAP protocol to communicate between a client and a server. That's a start. SOAP is a well-known web technology. So here's the game plan:

  1. Create a list of SUSHI services I want to harvest from each month.
  2. Build a script that will run as a scheduled task each month that reads items from my SUSHI list and turns that into SOAP requests.
  3. Parse the SOAP responses back into a list.

Okay, 3 steps is manageable. The first step is easy. Build a list. Luckily, NISO provides a list of SUSHI-compliant vendors with the URL for their SUSHI service. So I hunt for the ones I want and stick the pertinent information into a MySQL database table.
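
To make steps 1 and 2 a little more concrete, here is a rough sketch in Python (the language this post eventually lands on). It is not the harvester itself, just the bookkeeping: take a list of services and work out what to ask each one for last month. Everything in it is an assumption for illustration: the field names, the example vendor and URL, and the JR1/Release 3 report choice are all made up, and a plain Python list stands in for the MySQL table.

```python
from datetime import date, timedelta

# Hypothetical stand-in for the MySQL table of SUSHI services.
# Every value here is invented for illustration.
SERVICES = [
    {
        "vendor": "Example Vendor",
        "url": "https://sushi.example.com/service",
        "requestor_id": "my-library",
        "customer_id": "12345",
        "report": "JR1",
        "release": "3",
    },
]


def previous_month_range(today):
    """Return (first_day, last_day) of the month before `today`."""
    first_of_this_month = today.replace(day=1)
    last_day = first_of_this_month - timedelta(days=1)
    return last_day.replace(day=1), last_day


def build_request_params(service, today):
    """Turn one service record into the values a SOAP request would need."""
    begin, end = previous_month_range(today)
    return {
        "url": service["url"],
        "requestor_id": service["requestor_id"],
        "customer_id": service["customer_id"],
        "report": service["report"],
        "release": service["release"],
        "begin": begin.isoformat(),
        "end": end.isoformat(),
    }


def harvest_plan(services, today):
    """One set of request parameters per service, for last month's usage."""
    return [build_request_params(service, today) for service in services]
```

A scheduled task could call `harvest_plan(SERVICES, date.today())` once a month and hand each entry to whatever SOAP library ends up doing the real work.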

(several hours later)

Now I need to build a script. In the last few months, I have been building a large data management tool in PHP that will eventually evolve into an ERM tool. So, PHP it is!

Unfortunately, the whole two existing (incomplete) examples of a SUSHI harvester anywhere on the open web are written in Java and C#. I am still fairly new to code and am just finally feeling comfortable with PHP. Throwing more languages into the mix does not sound like fun. Plus, I know Java and C# both require hefty amounts of software underneath them to be able to do anything. I will try the DIY way with PHP.

There are two major PHP packages for dealing with SOAP requests. One of them is actually built into the PHP core. How convenient, right? Wrong. SOAP requests read a WSDL file that tells them how to format the XML request that will be sent to the server. The PHP SoapClient looked once at the SUSHI WSDL file and decided to have a

FATAL ERROR

You know that's never a good thing. It's especially never a good thing when there is absolutely no documentation on the web describing why that error is happening. After tinkering with it for half a day or so, I decided to call SoapClient quits and I found another package called NuSoap.

NuSoap is slightly better in that it doesn't immediately have a heart attack when it reads the WSDL file. Instead, after working with me for several hours and giving me the illusion that it wants to be my friend, NuSoap betrays me by sending botched XML headers and tags that make the server have a heart attack. Et tu, NuSoap?

At this point, I completely give up on PHP and decide that a lightweight scripting language is the way to go. I have a brief fling with Ruby, but we both decide that we just can't make our relationship work in a Windows development environment.

At the end of my rope, after two and a half days of utter failure, I stumble upon Python and a package called SUDS. The strangest thing of all is that SUDS actually works! I make my first successful call to a SUSHI server.
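
For the curious, the very first thing worth trying with SUDS looks roughly like the sketch below: point a client at a WSDL and print what it finds. This is a minimal sketch, not the sushi_py code. The function names and the `client_factory` argument are my own inventions (the latter only exists so the function can be tried without a network connection), and the WSDL URL in the comment is a placeholder, not a real service.

```python
def open_soap_client(wsdl_url, client_factory=None):
    """Build a SOAP client from a WSDL URL.

    By default this uses suds.client.Client (third-party; it must be
    installed separately). `client_factory` is a hypothetical hook that
    lets you swap in something else, e.g. a fake for offline testing.
    """
    if client_factory is None:
        from suds.client import Client  # imported lazily so the module loads without suds
        client_factory = Client
    return client_factory(wsdl_url)


def describe_service(wsdl_url, client_factory=None):
    """Return the client's text description of the service.

    With a real suds client, str(client) summarizes the ports, methods,
    and types that were read from the WSDL.
    """
    client = open_soap_client(wsdl_url, client_factory)
    return str(client)


# Example use (placeholder URL, needs suds and a reachable server):
#   print(describe_service("https://sushi.example.com/service?wsdl"))
```

If the WSDL loads, the description should list the operations the server offers (for SUSHI, that includes GetReport) along with the types each one expects, which is a handy map before attempting a real call.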

In the spirit of Python, I am now developing a small tool called sushi_py to make automated calls to a SUSHI service. I am stumbling along as I learn Yet Another Language, but when I am finished, I plan to share the full results and the code with all the Internets so that no one has to swim blindly through the dark waters of SUSHI ever again.

Stay tuned for part 2.