
Command Line Programs

Building a Command Line Program

In this chapter we are going to build a program that, based on the parameters passed, downloads a page from the Internet and stores it in an output file. If it gets an error while retrieving the page, it will output an error message. If run without any parameters, it will report the correct method of running the program. The program is called urldump.smp, and there is a project already located in the Projects\console directory.

Before you run off and start devouring that program though, it would be a good idea to continue reading here. The reason is that although that program will show you how it's done, it won't be able to explain how it came to be in that form. That said, it is probably time to do just that.

First Steps

Since every SIMPOL program begins with the function main(), that is where we will start. The image below shows the beginning of the project. At this very early stage, there is not much there. The httpresponse type is also not in blue, but instead it is in black. That is a sign that the library is not yet part of the project.


Initial stage of the urldump.smp project.

To resolve this, we can add the required library to the project. From the Project menu, select the Settings item to display the Project Settings dialog. Select the second tab, Includes and libraries, and then click on the Add button next to the (*.sml) Libraries to link; label. From there, enter the SIMPOL lib directory and pick the httpclientlib.sml file. The result should look like the image below:


The Project Settings dialog after adding the httpclientlib.sml library.

At this point, clicking on the OK button will result in a warning dialog being shown. This one warns us that the httpclientlib.sml library requires the SIMPOL component sock and therefore this will also be added to the project. This is quite handy, since otherwise the library wouldn't even work. The warning dialog looks like this:


The warning dialog shown when a library has been added that requires components that are not currently part of the project.

Depending on the size of the screen area on our computer, it may be useful to turn off a couple of windows while writing the program. This can be done from the View menu, by selecting the Call Stack and Variables items, for example. After a bit more code has been written, and with our new adjusted windows, the result might look like the following image.


The project in its more advanced state after also adjusting some of the windows for greater code visibility.

At this point, let's actually have a look at our first version of this program.

Example 4.1. Initial version of urldump.sma

constant sERRTXT_PAGE              "Page '"
constant sERRTXT_NOTFOUND          "' not found"
constant sERRTXT_SUCCESS           "' successfully retrieved"
constant sERRTXT_FILEOPENFAILED    "Error opening output file"
constant sCRLF                     "{d}{a}"

function main(string sUrl, string sOutfile)
  string errtext
  integer e
  fsfileoutputstream fpo
  httpresponse response

  if sUrl <= ""
    errtext = usage()
  else
    response =@ httpget(sUrl)
    if response !@= .nul
      if response.errorstatus > ""
        errtext = response.errorstatus
      else if sOutfile > ""
        e = 0
        fpo =@ fsfileoutputstream.new(sOutfile, error=e)
        if fpo =@= .nul or e != 0
          errtext = sERRTXT_FILEOPENFAILED + sCRLF
        else
          if response.statuscode < 200 or \
             response.statuscode >= 300
            errtext = sERRTXT_PAGE + sUrl + \
                      sERRTXT_NOTFOUND + sCRLF
          else
            errtext = sERRTXT_PAGE + sUrl + sERRTXT_SUCCESS + sCRLF
          end if
          fpo.putstring(.if(response.entitybody != .nul, \
                        response.entitybody.getstring(1, .inf, 1), ""), 1)
        end if
      else
        if response.statuscode < 200 or response.statuscode >= 300
          errtext = sERRTXT_PAGE + sUrl + sERRTXT_NOTFOUND + sCRLF
        else
          errtext = ""
        end if
        errtext = errtext + .if(response.entitybody != .nul, \
                  response.entitybody.getstring(1, .inf, 1), "")
      end if
    end if
  end if
end function errtext


function usage()
  string s

  s = "smprun[32.exe] urldump.smp <url> <outputfile>{d}{a}"
end function s

Understanding the Code

Although there is not much to this program, it covers a number of concepts that are worth exploring. To begin with, the command line parameters are always string variables and they do not allow for default values, so to set those you will need to write some code for it. At the beginning of the program, there is a test for the sUrl variable. If it finds that no value has been passed, then it calls the usage() function. This approach makes it quite easy to both document how the program works and also inform the user when the parameters are not correct.
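As a minimal sketch of setting a default in code, the hypothetical helper below returns a fallback file name when no output file was passed. The function name defaultoutfile and the file name urldump.out are assumptions made purely for illustration; neither is part of the sample project.

```simpol
// Hypothetical helper, not part of urldump.sma: supplies a default
// output file name, since the parameters of main() cannot declare
// default values themselves.
function defaultoutfile(string sOutfile)
  string result

  if sOutfile <= ""
    // Nothing was passed on the command line, so fall back to an
    // assumed default name.
    result = "urldump.out"
  else
    result = sOutfile
  end if
end function result
```

The test uses the same `<= ""` comparison that main() applies to sUrl, so a missing parameter and an empty one are treated alike.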

The next thing to note is the call to the httpget() function. That returns an httpresponse object (and should do so under all circumstances, so the following test for .nul may be unnecessary). The httpresponse object contains all the information that is a result of the attempt to retrieve the resource represented by the sUrl variable. Should there have been any unexpected problem with the retrieval of the resource, then the errorstatus property would have some value greater than the empty string ("").
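The checks described here can be sketched as a small function that reduces the response to a single string. The name fetchtext is hypothetical; the calls and properties are only those already used in Example 4.1, and the sketch assumes httpclientlib.sml has been added to the project.

```simpol
// Hypothetical helper: returns the error text if the retrieval
// failed unexpectedly, otherwise the body of the resource.
function fetchtext(string sUrl)
  string result
  httpresponse response

  result = ""
  response =@ httpget(sUrl)
  if response !@= .nul
    if response.errorstatus > ""
      // Something unexpected went wrong during the retrieval
      result = response.errorstatus
    else
      // The body may be missing, so guard against .nul
      result = .if(response.entitybody != .nul, \
               response.entitybody.getstring(1, .inf, 1), "")
    end if
  end if
end function result
```

Note that this sketch does not look at statuscode, so a response such as a 404 page would still be returned as body text.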

The remaining code checks whether the output is going to a file or will be returned as part of the return value. In both cases it outputs the content of the entitybody property. When writing to a file, a success string is returned if the retrieval was successful (a value between 200 and 299 in the statuscode property), otherwise an error string is returned. When no output file is given, the error string (if any) is placed in front of the page content in the return value.
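Both branches of Example 4.1 repeat the same range test on statuscode. One possible way to express that test once is a small helper such as the following. The name issuccess is an assumption, the function does not appear in the sample project, and the sketch assumes that statuscode is an integer.

```simpol
// Hypothetical helper: .true for a status code from 200 to 299,
// .false for anything else.
function issuccess(integer statuscode)
  boolean ok

  ok = .true
  if statuscode < 200 or statuscode >= 300
    ok = .false
  end if
end function ok
```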

Running Our Project

At this point we should build our project by pressing Ctrl+B (Build). Now we can run the program, but if we want to try it in the IDE we will also need to define the argument that is being passed to the main() function. We can do that in the Project menu by selecting the Settings item again. In the first tab, in the Command line box, enter the URL:

http://www.google.co.uk/search?hl=en&q=SIMPOL&btnG=Google+Search&meta=&aq=f&oq=

and then click OK to close the dialog.

Note

It is always a good idea to click on the Save All icon on the tool bar after making changes to the project's settings. This ensures that those changes are saved to the project file. If you don't, and the IDE crashes for some reason, you may lose the changes that you have made.

Now to run the program, press Ctrl+E (Execute). The result should be the page containing the Google search results for the key word "SIMPOL". The page will be messy and hard to read, since it normally returns as an unformatted stream of characters without any new line characters. To see a page that may look a little more familiar, try the URL "www.simpol.com". That should look like a fairly readable page of XHTML.

Improving Our Program

Although this program isn't bad, it might be useful if it were a little more flexible. One thing we might want to do is allow it to take parameters, so that it can do not only a GET operation, but also a POST. We could also decide to allow the program to output the header information that it received from the web server, which can be very handy when trying to debug routines that retrieve data from a web server. A method of handling and validating command line parameters might also be useful. Let's add support for not only the output file, but also a flag to decide if the header is output, and also support for passing variables through using the POST method.

As a first step, we can update our usage() function with the new information. The new version looks like this:

Example 4.2. Updated usage() Function
function usage()
  string s
  s = "smprun[32.exe] urldump.smp <url> [--outfile=<filename>] \
        [--showheader]{d}{a}"
  s = s + "    [--vars=<varlist>]{d}{a}{d}{a}"
  s = s + "    Where the vars need to already be URL-encoded and \
        if they violate the{d}{a}"
  s = s + "    shell rules they will also need to be escaped to \
        be hidden from the{d}{a}"
  s = s + "    shell. It is recommended to place quotes around \
        the --vars= entry.{d}{a}"
  s = s + "    The equals sign and what follows CANNOT allow \
        spaces! If necessary,{d}{a}"
  s = s + "    surround any entry with quotes.{d}{a}"
end function s

Now that we have decided what the parameters are going to be (and incidentally also the format), we can add the code to handle the parameters. This is probably best done using a specifically designed data type. This will allow us to offload most of the work to the type itself, without cluttering our existing main() function with all the associated code. It will also make it easier to lift it and use it again in another program, or even in the future to create a more versatile type that is more universal. Let's see what that code looks like:

Example 4.3. The parameters Type
type parameters
  embed
  string outfilename
  boolean showheader
  string variables
  string operation
  reference
  function getparam
end type

function parameters.new(parameters me)
  me.outfilename = ""
  me.showheader = .false
  me.variables = ""
  me.operation = "GET"
end function me

function parameters.getparam(parameters me, string parameter)
  if parameter <= ""
    // do nothing
  else if .like1(parameter, "--outfile=*")
    me.outfilename = .substr(parameter, .instr(parameter, "=") + 1, .inf)
  else if .like1(parameter, "--showheader")
    me.showheader = .true
  else if .like1(parameter, "--vars=*")
    me.variables = .substr(parameter, .instr(parameter, "=") + 1, .inf)
  end if
end function

The parameters type has properties to store all the information that we will use for this call to the program. It defaults to running in GET mode, and will not return the header from the web server. The way the getparam() method has been coded requires that each parameter that has a value component must be separated from the value by an equals (=) sign and no white space to either side. Part of the reason for this is that allowing white space would require a more complex algorithm, since each of the white space separated items would arrive as separate parameter values from the shell to the main() function.
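A minimal sketch of the rule in action follows. The function name demoparams and the file name page.html are invented for illustration; only the parameters type and its getparam() method from Example 4.3 are used.

```simpol
// Hypothetical test function, not part of the sample project.
function demoparams()
  parameters params

  params =@ parameters.new()

  // Accepted: name, equals sign and value arrive as one parameter
  params.getparam("--outfile=page.html")
  params.getparam("--showheader")

  // Typed on the command line as:  --vars = name=Joe
  // The shell would deliver three separate parameters. None of
  // them matches a pattern in getparam(), so all are ignored.
  params.getparam("--vars")
  params.getparam("=")
  params.getparam("name=Joe")
end function params
```

Going by the code in Example 4.3, the returned object should have outfilename set to the text after the equals sign, showheader set to .true, and variables still empty.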

We now have a method of handling the various parameters to the program, and one of the nice features of this approach is that the order of the parameters does not matter. The only parameter that has a fixed position is the URL itself, since it must be first. Using this approach requires some changes to the main() function as well. Let's have a look at those now.

Example 4.4. The Final Version of the main() Function
function main(string sUrl, string param1, string param2, \
              string param3)
  string errtext
  integer e
  fsfileoutputstream fpo
  httpresponse response
  parameters params

  e = 0
  if sUrl <= ""
    errtext = usage()
  else
    // Let the parameters type sort out the optional arguments
    params =@ parameters.new()
    params.getparam(param1)
    params.getparam(param2)
    params.getparam(param3)

    errtext = ""
    // Use POST if variables were supplied, otherwise GET
    if params.variables > ""
      response =@ httppost(sUrl, params.variables)
    else
      response =@ httpget(sUrl)
    end if
    if response !@= .nul
      if response.errorstatus > ""
        errtext = response.errorstatus
      else if params.outfilename > ""
        // Write the (optional) header and the body to the output file
        fpo =@ fsfileoutputstream.new(params.outfilename, error=e)
        if fpo =@= .nul or e != 0
          errtext = sERRTXT_FILEOPENFAILED + sCRLF
        else
          if response.statuscode < 200 or response.statuscode >= 300
            errtext = sERRTXT_PAGE + sUrl + sERRTXT_NOTFOUND + sCRLF
          else
            errtext = sERRTXT_PAGE + sUrl + sERRTXT_SUCCESS + sCRLF
          end if
          fpo.putstring(.if(params.showheader, makenotnull(response.fullheader) + \
                        sCRLF + sCRLF, "") + \
                        .if(response.entitybody != .nul, \
                        response.entitybody.getstring(1, .inf, 1),""), 1)
        end if
      else
        // No output file: return everything as part of the result
        if response.statuscode < 200 or response.statuscode >= 300
          errtext = sERRTXT_PAGE + sUrl + sERRTXT_NOTFOUND + sCRLF
        else
          errtext = ""
        end if
        errtext = errtext + makenotnull(.if(params.showheader, \
                  response.fullheader + \
                  sCRLF + sCRLF, "")) + \
                  .if(response.entitybody != .nul, \
                  response.entitybody.getstring(1, \
                  .inf, 1),"")
      end if
    end if
  end if
end function errtext

As we can see from the previous code, not a lot has changed from the original version. We now support the POST operation if we were given variables (which must be URL-encoded when they are passed in). We can also optionally return the entire header from the web server if requested to do so. All of the actual handling of the parameters is done by the parameters type and its getparam() method.

Running the Final Version

Now that all our coding is done, we can try out the new features. (The final coded version of this example can be found in the supplied program samples as a console project called urldump.) This is the command line we will use:

Example 4.5. Sample Command
urldump.smp "wwwx.cs.unc.edu/~jbs/aw-wwwp/docs/resources/perl/
  perl-cgi/programs/cgi_stdin.cgi" --showheader "--vars=name=Joe&
  textarea=Cool&radiobutton=middleun&checkedbox=pizza&
  selectitem=hamburgers"

Note
The URL in the previous command was found while searching on the Internet. It may or may not be there forever, but it is greatly appreciated for providing an opportunity to test the POST operation in this program. Eventually we may provide a sample program running on our own web site, but we anticipate that the web load from a few people trying this out will not greatly inconvenience the university site.

The result of running this new version of the program with the command line parameters shown above can be seen below:

----------------- 20:41:34 13/08/2009 -----------------
Executing "X:\simpol\projects\console\urldump\bin\urldump.smp" ...
--------------------- program result --------------------
HTTP/1.1 200 OK
Date: Thu, 13 Aug 2009 19:41:35 GMT
Server: Apache/2.2.3 (Red Hat)
Connection: close
Content-Type: text/html

<HEAD>
<TITLE>stdin vars.</TITLE> <H1>Print CGI STDIN Variables</H1> </HEAD>
<BODY>
<HR>
<H3>STDIN Variables</H3>
<UL>
<LI>radiobutton = middleun <LI>checkedbox = pizza
<LI>name = Joe Bloggs
<LI>selectitem = hamburgers <LI>textarea = Cool form dude
</UL>
</BODY>

---------------------------------------------------------
Successfully executed

Summary

In this chapter we have developed a command line program to retrieve a page across the Internet using both GET and POST. We extended the initial version of the program to also take named parameters in any order. The techniques learned here carry over to other programs: the parameter handling can be reused in other command line programs, and the httpclientlib.sml library could be added to a web server or desktop program to retrieve information from another location on the Internet, such as currency exchange rates, stock market values, etc.