Monday, April 2, 2012

COUNTER SUSHI harvester, pt 2: A summary of how I made it

After a month of on-and-off coding and head-banging, my Python SUSHI harvester is in working condition. Hello, Sushi Py!

How It Works

Here are the steps for making a SUSHI client:

  1. Input the parameters for a SUSHI request (e.g., Requestor ID, Customer ID, date range). Rather than making you run the script by hand for each SUSHI request, Sushi Py reads a list of requests from either a CSV file or a MySQL table and runs them all.
  2. Create a SOAP envelope. You can do this manually (not recommended) by creating an XML file from scratch, or you can utilize a SOAP service's WSDL file. Sushi Py uses the latter route with a Python library called SUDS. SUDS takes the URL of a WSDL file as input and allows you to easily create a SOAP envelope from it.
  3. Fill the SOAP envelope with data from step 1. Fairly straightforward.
  4. Make the SOAP request. If you used a WSDL file, a smart SOAP library like SUDS will automatically know the URL of the SOAP service. If all goes well, you will get a response back from the SUSHI service at the other end.
  5. Parse the response. You will want to have an example of what the response looks like. Options for this include the free tool SoapUI or using lots of print/debug commands in whatever SOAP library you are using. I used both. Each response is parsed into a Python dictionary (associative array), and each dictionary is added to a list (array) of all the responses. The shape and values of the dictionary vary based on which COUNTER report is being requested.
  6. Record the response. Sushi Py loops through each dictionary in the list of responses and writes it to either a CSV file or a MySQL table. The output location also varies based on which COUNTER report is being used. For example, JR1 reports go into one table/file, DB1 reports into another, and so on.
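
To make the overall flow concrete, here is a minimal sketch of steps 1 and 6 with the SOAP work (steps 2 through 5) stubbed out. It is not Sushi Py's actual code: the CSV column names, the `fake_fetch` stand-in, and the row fields are all assumptions made up for illustration, and it is written for Python 3 using only the standard library.

```python
import csv
import io
from collections import defaultdict


def read_requests(csv_file):
    """Step 1: read one SUSHI request per CSV row into a list of dicts.

    The column names (wsdl_url, requestor_id, ...) are hypothetical.
    """
    return list(csv.DictReader(csv_file))


def fake_fetch(request):
    """Stand-in for steps 2-5.

    A real client would build the SOAP envelope, call the SUSHI service,
    and parse the response here. This stub just returns one made-up row.
    """
    return [{
        "report": request["report"],
        "customer_id": request["customer_id"],
        "period": request["begin"],
        "title": "Example Title",
        "count": 0,
    }]


def harvest(requests, fetch):
    """Run every request and collect the parsed rows, grouped by report type."""
    rows_by_report = defaultdict(list)
    for request in requests:
        for row in fetch(request):
            rows_by_report[row["report"]].append(row)
    return rows_by_report


def write_report(rows, out_file):
    """Step 6: write the rows for one report type to its own CSV output."""
    writer = csv.DictWriter(out_file, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    sample = io.StringIO(
        "wsdl_url,requestor_id,customer_id,report,begin,end\n"
        "http://example.com/sushi?wsdl,req-1,cust-1,JR1,2012-01-01,2012-01-31\n"
        "http://example.com/sushi?wsdl,req-1,cust-1,DB1,2012-01-01,2012-01-31\n"
    )
    grouped = harvest(read_requests(sample), fake_fetch)
    for report, rows in grouped.items():
        out = io.StringIO()
        write_report(rows, out)
        print(report)
        print(out.getvalue())
```

Passing the fetch function in as an argument keeps the read/group/write logic testable without a live SUSHI service; a real harvester would swap `fake_fetch` for a function that makes the SOAP call.
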
Sushi Py currently works for JR1, DB1, DB2, and DB3 reports. Each report type is hard-coded, so Sushi Py will not work "out of the box" with new report types. I hope to remove that hard-coding in the future. In the meantime, adding support for a new report type is pretty easy: if you look at how the existing report types are defined in the source code, you can probably figure it out fairly quickly.
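
One possible way to avoid hard-coding report types is to describe each report in a lookup table and let generic code do the rest. The sketch below is only an illustration of that idea, not how Sushi Py is written: `REPORT_DEFS`, `register_report`, `to_row`, the output file names, and the column names are all hypothetical.

```python
# Hypothetical report definitions. The output names and columns are
# illustrative only and do not reflect Sushi Py's actual schema.
REPORT_DEFS = {
    "JR1": {
        "output": "jr1.csv",
        "columns": ["customer_id", "period", "journal", "fulltext_requests"],
    },
    "DB1": {
        "output": "db1.csv",
        "columns": ["customer_id", "period", "database", "searches", "sessions"],
    },
}


def register_report(name, output, columns):
    """Add (or replace) a report type without touching the harvesting code."""
    REPORT_DEFS[name] = {"output": output, "columns": list(columns)}


def to_row(report, parsed):
    """Order one parsed response dict into the column layout of its report type.

    Missing keys become empty strings so every row has the same width.
    """
    if report not in REPORT_DEFS:
        raise ValueError("unsupported report type: %s" % report)
    columns = REPORT_DEFS[report]["columns"]
    return [parsed.get(column, "") for column in columns]


if __name__ == "__main__":
    register_report("DB2", "db2.csv",
                    ["customer_id", "period", "database", "turnaways"])
    parsed = {"customer_id": "cust-1", "period": "2012-01",
              "database": "Example DB", "turnaways": 3}
    print(REPORT_DEFS["DB2"]["output"], to_row("DB2", parsed))
```

With a table like this, supporting a new report would mean adding one entry rather than writing a new code path, assuming the parsing step can already produce a dictionary with the right keys.
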


