Wapp

1.0 Introduction

Wapp is a library for building web applications in Tcl.

Wapp strives for:

Other web frameworks are designed to construct high-volume, scalable web-sites, tended by expert developers. Wapp, in contrast, is designed for small to medium scale web utility programs that can be brought online quickly and then left to run for years without maintenance, managed by programmers whose primary focus is something other than web applications.

The Tcl tradition is to provide a lot of capability in just a few lines of easily read and written code. Tcl programs are typically ten times shorter than the equivalent code in a compiled language like C, C++, Rust, or Java. Shorter programs are easier to read, easier to write, and have fewer bugs. Tcl empowers programmers to write useful utility scripts that are safe and efficient, without burdening the programmer with excessive API minutiae. Wapp tries to bring these same benefits to web application development.

1.1 Features

Wapp is implemented as a single file containing fewer than 1000 lines of uncomplicated Tcl code that can be loaded into a larger Tcl program using "source" or "package require". Alternatively, the "wapp.tcl" source file can be copy/pasted into a larger script to construct a complete application in a single file of Tcl code.

Wapp is cross platform. Since it is pure Tcl, it of course runs wherever Tcl runs. But more than that, Wapp can serve web pages by a variety of techniques, including CGI, SCGI, or direct HTTP. Wapp contains its own built-in web-server that is useful for development. Once the application is written and debugged, it can then be deployed using CGI or SCGI or as a stand-alone web server with no changes to the underlying code.

A key goal of Wapp is simplicity. To this end, the API is deliberately kept small enough that a one page cheat-sheet is sufficient documentation. The idea is that Wapp should be usable by developers who are not full-time web application coders, and who work with Wapp only rarely. There is not a lot to learn or remember with Wapp, so programmers can be productive using Wapp in just a few minutes. When maintenance is required on a Wapp program, perhaps years after deployment, it is more easily accomplished since programmers do not have to relearn a complex interface.

1.2 License

Wapp is released under a liberal 2-clause BSD license, so it can be used anywhere and for any purpose.

1.3 Website

Complete source code and documentation for Wapp are available on-line at https://wapp.tcl.tk/.

2.0 Hello World

Wapp applications are easy to develop. A hello-world program is as follows:

#!/usr/bin/wapptclsh
package require wapp
proc wapp-default {} {
   wapp-subst {<h1>Hello, World!</h1>\n}
}
wapp-start $::argv

Every Wapp application defines one or more procedures that accept HTTP requests and generate appropriate replies. For an HTTP request where the initial portion of the URI path is "abcde", the procedure named "wapp-page-abcde" will be invoked to construct the reply. If no such procedure exists, "wapp-default" is invoked instead. The latter technique is used for the hello-world example above.
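
As a hedged illustration of that dispatch rule, the sketch below adds one named page next to the default handler. The page name "hello" and the reply text are hypothetical. The two procs are meant to sit between "package require wapp" and "wapp-start" in a script like the one above.

```tcl
# Handles any request whose first URI path element is "hello",
# for example http://localhost:8080/hello (the page name is made up).
proc wapp-page-hello {} {
  wapp-subst {<h1>Hello from /hello</h1>\n}
}

# Handles every other request, because no other wapp-page-* proc exists.
proc wapp-default {} {
  wapp-subst {<h1>Default page</h1>\n}
}
```

With these definitions a request for /hello should reach "wapp-page-hello", while a request for / or /anything-else should fall through to "wapp-default".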

The hello-world example generates the reply using a single call to the "wapp-subst" command. Each "wapp-subst" command appends new text to the reply, applying various substitutions as it goes. The only substitution in this example is the \n at the end of the line.

The "wapp-start" command starts up the application.

2.1 Running A Wapp Application

To run this application, copy the code above into a file named "main.tcl" and then enter the following command:

wapptclsh main.tcl

That command will start up a web-server bound to the loopback IP address, then launch a web-browser pointing at that web-server. The result is that the "Hello, World!" page will automatically appear in your web browser.

To run this same program as a traditional web-server on TCP port 8080, enter:

wapptclsh main.tcl --server 8080

Here the built-in web-server listens on all IP addresses and so the web page is available on other machines. But the web-browser is not automatically started in this case, so you will have to manually enter "http://localhost:8080/" into your web-browser in order to see the page.

To run this program as CGI, put the main.tcl script in your web-server's file hierarchy, in the appropriate place for CGI scripts, and make any other web-server specific configuration changes so that the web-server understands that the main.tcl file is a CGI script. Then point your web-browser at that script.

Run the hello-world program as SCGI like this:

wapptclsh main.tcl --scgi 9000

Then configure your web-server to send SCGI requests to TCP port 9000 for some specific URI, and point your web-browser at that URI. By default, the web-server must be on the same machine as the Wapp script, because the --scgi option only accepts SCGI requests from IP address 127.0.0.1. If your web-server is running on a different machine, use the --remote-scgi option instead, probably with a --fromip option to specify the IP address of the machine that is running the web-server.

2.2 Using Plain Old Tclsh

Wapp applications are pure TCL code. You can run them using an ordinary "tclsh" command if desired, instead of the "wapptclsh" shown above. We normally use "wapptclsh" for the following reasons:

We prefer to use wapptclsh and wapptclsh is shown in all of the examples. But ordinary "tclsh" will work in the examples too.

3.0 Longer Examples

Wapp keeps track of various parameters (see section 5.0) that describe each HTTP request. Those parameters are accessible using routines like "wapp-param NAME". The following sample program gives some examples:

package require wapp
proc wapp-default {} {
  set B [wapp-param BASE_URL]
  wapp-trim {
    <h1>Hello, World!</h1>
    <p>See the <a href='%html($B)/env'>Wapp
    Environment</a></p>
  }
}
proc wapp-page-env {} {
  wapp-allow-xorigin-params
  wapp-subst {<h1>Wapp Environment</h1>\n<pre>\n}
  foreach var [lsort [wapp-param-list]] {
    if {[string index $var 0]=="."} continue
    wapp-subst {%html($var) = %html([list [wapp-param $var]])\n}
  }
  wapp-subst {</pre>\n}
}
wapp-start $argv

In this application, the default "Hello, World!" page has been extended with a hyperlink to the /env page. The "wapp-subst" command has been replaced by "wapp-trim", which works the same way with the addition that it removes surplus whitespace from the left margin, so that the generated HTML text does not come out indented. The "wapp-trim" and "wapp-subst" commands in this example use "%html(...)" substitutions. The "..." argument is expanded using the usual TCL rules, but then the result is escaped so that it is safe to include in an HTML document. Other supported substitutions are "%url(...)" for URLs on the href= and src= attributes of HTML elements, "%qp(...)" for query parameters, "%string(...)" for string literals within JavaScript, and "%unsafe(...)" for direct literal substitution. As its name implies, the %unsafe() substitution should be avoided whenever possible.

The /env page is implemented by the "wapp-page-env" proc. This proc generates HTML that describes all of the query parameters. Parameter names that begin with "." are for internal use by Wapp and are skipped for this display. Notice the use of "wapp-subst" to safely escape text for inclusion in an HTML document.
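
To make the %-substitutions more concrete, here is a minimal sketch of one more page that combines "%html(...)" with "%qp(...)". The page name "search" and the query parameter "q" are hypothetical, and the proc is meant to be added to the program above, before "wapp-start".

```tcl
# Hypothetical /search page. The "q" parameter is only visible when the
# request comes from the same origin (see section 5.3).
proc wapp-page-search {} {
  set B [wapp-param BASE_URL]
  set q [wapp-param q]          ;# empty string if there is no "q"
  wapp-trim {
    <h1>Search</h1>
    <p>You asked for: %html($q)</p>
    <p><a href='%html($B)/search?q=%qp($q)'>Repeat this search</a></p>
  }
}
```

The same value is escaped twice in two different ways: once as HTML text and once as a query parameter. Choosing the substitution that matches the context is what keeps the output safe.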

The printing of all the parameters as is done by the /env page turns out to be so useful that there is a special "wapp-debug-env" command to render the text for us. Using "wapp-debug-env", the program above can be simplified to the following:

package require wapp
proc wapp-default {} {
  set B [wapp-param BASE_URL]
  wapp-trim {
    <h1>Hello, World!</h1>
    <p>See the <a href='%html($B)/env'>Wapp
    Environment</a></p>
  }
}
proc wapp-page-env {} {
  wapp-allow-xorigin-params
  wapp-trim {
    <h1>Wapp Environment</h1>
    <pre>%html([wapp-debug-env])</pre>
  }
}
wapp-start $argv

Many Wapp applications contain an /env page for debugging and trouble-shooting purposes. Examples:

3.1 Binary Resources

Here is another variation on the same "hello, world" program that adds an image to the main page:

package require wapp
proc wapp-default {} {
  set B [wapp-param BASE_URL]
  wapp-trim {
    <h1>Hello, World!</h1>
    <p>See the <a href='%html($B)/env'>Wapp
    Environment</a></p>
    <p>Broccoli: <img src='broccoli.gif'></p>
  }
}
proc wapp-page-env {} {
  wapp-allow-xorigin-params
  wapp-trim {
    <h1>Wapp Environment</h1>
    <pre>%html([wapp-debug-env])</pre>
  }
}
proc wapp-page-broccoli.gif {} {
  wapp-mimetype image/gif
  wapp-cache-control max-age=3600
  wapp-unsafe [binary decode base64 {
    R0lGODlhIAAgAPMAAAAAAAAiAAAzMwBEAABVAABmMwCZMzPMM2bMM5nMM5nMmZn/
    mczMmcz/mQAAAAAAACH5BAEAAA4ALAAAAAAgACAAAAT+0MlJXbmF1M35VUcojNJI
    dh5YKEbRmqthAABaFaFsKG4hxJhCzSbBxXSGgYD1wQw7mENLd1FOMa3nZhUauFoY
    K/YioEEP4WB1pB4NtJMMgTCoe3NWg2lfh68SCSEHP2hkYD4yPgJ9FFwGUkiHij87
    ZF5vjQmPO4kuOZCIPYsFmEUgkIlJOVcXAS8DSVoxB0xgA6hqAZaksiCpPThghwO6
    i0kBvb9BU8KkASPHfrXAF4VqSgAGAbpwDgRSaqQXrLwDCF5CG9/hpJKkb17n6RwA
    18To7whJX0k2NHYjtgXoAwCWPgMM+hEBIFDguDrjZCBIOICIg4J27Lg4aGCBPn0/
    FS1itJdNX4OPChditGOmpIGTMkJavEjDzASXMFPO7IAT5M6FBvQtiPnTX9CjdYqi
    cFlgoNKlLbbJfLqh5pAIADs=
  }]
}
wapp-start $argv

This application is the same as the previous one except that it adds the "broccoli.gif" image on the main "Hello, World" page. The image file is a separate resource, which is provided by the new "wapp-page-broccoli.gif" proc. The image is a GIF which has been encoded using base64 so that it can be put into a text TCL script. The "[binary decode base64 ...]" command is used to convert the image back into binary before returning it.

Other resources might be added using procs like "wapp-page-style.css" or "wapp-page-script.js".
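
A stylesheet resource could look like the following sketch. The CSS rules are placeholders, and the proc is assumed to be added to the program above, before "wapp-start". It reuses "wapp-mimetype" and "wapp-cache-control" exactly as the broccoli.gif proc does.

```tcl
# Serves /style.css. The CSS content is illustrative only.
proc wapp-page-style.css {} {
  wapp-mimetype text/css
  wapp-cache-control max-age=3600
  wapp-trim {
    body { font-family: sans-serif; margin: 2em; }
    h1 { color: darkgreen; }
  }
}
```

A page would then refer to the stylesheet with an ordinary link element whose href is built from BASE_URL, in the same way the image is referenced above. Because the default Content Security Policy does not allow in-line CSS, a separate resource like this is the natural place for styling.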

4.0 General Structure Of A Wapp Application

Wapp applications all follow the same basic template:

package require wapp;
proc wapp-page-XXXXX {} {
  # code to generate page XXXXX
}
proc wapp-page-YYYYY {} {
  # code to generate page YYYYY
}
proc wapp-default {} {
  # code to generate any page not otherwise
  # covered by wapp-page-* procs
}
wapp-start $argv

The application script first loads the Wapp code itself using the "package require" at the top. (Some applications may choose to substitute "source wapp.tcl" to accomplish the same thing.) Next the application defines various procs that will generate the replies to HTTP requests. Different procs are invoked based on the first element of the URI past the Wapp script name. Finally, the "wapp-start" routine is called to start Wapp running. The "wapp-start" routine never returns (or in the case of CGI, it only returns after the HTTP request has been completely processed), so it should be the very last command in the application script.

4.1 Wapp Applications As Model-View-Controller

If you are accustomed to thinking of web applications using the Model-View-Controller (MVC) design pattern, Wapp supports that point of view. A basic template for an MVC Wapp application is like this:

package require wapp;
# procs to implement the model go here
proc wapp-page-XXXXX {} {
  # code to implement controller for XXXXX
  # code to implement view for XXXXX
}
proc wapp-page-YYYYY {} {
  # code to implement controller for YYYYY
  # code to implement view for YYYYY
}
proc wapp-default {} {
  # code to implement controller for all other pages
  # code to implement view for all other pages
}
wapp-start $argv

The controller and view portions of each page need not be coded together into the same proc. They can each be sub-procs that are invoked from the main proc, if separating the functions makes the code clearer.
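
One possible way to split a page into sub-procs is sketched below. All names here (model-notes, controller-notes, view-notes, the "notes" page, and the "limit" parameter) are hypothetical, and the in-memory list stands in for whatever storage a real application would use. The procs are meant to replace the placeholder comments in the template above.

```tcl
# Model: plain Tcl with no Wapp calls.
set ::notes {"Buy milk" "Call Bob"}
proc model-notes {} {
  return $::notes
}

# Controller: reads the request and decides which data to show.
proc controller-notes {} {
  set limit [wapp-param limit 10]
  if {![string is integer -strict $limit] || $limit < 0} {set limit 10}
  return [lrange [model-notes] 0 [expr {$limit-1}]]
}

# View: turns the selected data into HTML.
proc view-notes {notes} {
  wapp-subst {<h1>Notes</h1>\n<ul>\n}
  foreach n $notes {
    wapp-subst {<li>%html($n)</li>\n}
  }
  wapp-subst {</ul>\n}
}

# The page proc only wires the controller and the view together.
proc wapp-page-notes {} {
  view-notes [controller-notes]
}
```

Keeping the model free of Wapp calls means it can be exercised from an ordinary tclsh session without a web request.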

So Wapp does support MVC, but without a lot of complex machinery and syntax.

5.0 Parameters

The purpose of a Wapp invocation is to answer an HTTP request. That HTTP request is described by various "parameters".

Each parameter has a key and a value.

The Wapp application retrieves the value for the parameter with key NAME using a call to [wapp-param NAME]. If there is no parameter with the key NAME, then the wapp-param function returns an empty string. Or, if wapp-param is given a second argument, the value of the second argument is returned if there exists no parameter with a key of NAME.
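
Both calling forms are shown in the short sketch below. The page name "greet" and the parameter names "name" and "lang" are made up for illustration. The proc belongs between "package require wapp" and "wapp-start".

```tcl
proc wapp-page-greet {} {
  # Two arguments: "stranger" is returned when there is no "name" parameter.
  set name [wapp-param name stranger]
  # One argument: a missing "lang" parameter yields an empty string.
  set lang [wapp-param lang]
  if {$lang eq ""} {set lang en}
  wapp-subst {<p lang='%html($lang)'>Hello, %html($name)!</p>\n}
}
```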

5.1 Parameter Types

There are four sources of parameter data:

  1. CGI Parameters
    Parameters with upper-case names contain information about the HTTP request as it was received by the web server. Examples of CGI parameters are CONTENT_LENGTH which is the number of bytes of content in the HTTP request, REMOTE_ADDR which holds the IP address from which the HTTP request originated, REQUEST_URI which is the path component of the URL that caused the HTTP request, and many others. Many of the CGI Parameters have names that are the same as the traditional environment variables used to pass information into CGI programs - hence the name "CGI Parameters". However, with Wapp these values are not necessarily environment variables and they all exist regardless of whether the application is run using CGI, via SCGI, or using the built-in web server.

  2. Cookies
    If the HTTP request contained cookies, Wapp automatically decodes the cookies into new Wapp parameters. Only cookies that have lower-case names are decoded. This prevents a cookie name from colliding with a CGI parameter. Cookies that have uppercase letters in their name are silently ignored.

  3. Query Parameters
    Query parameters are the key/value arguments that follow the "?" in the URL of the HTTP request. Wapp automatically decodes the key/value pairs and makes a new Wapp parameter for each one.

    Only query parameters that have lower-case names are decoded. This prevents a query parameter from overriding or impersonating a CGI parameter. Query parameters with upper-case letters in their name are silently ignored. Furthermore, query parameters are only decoded if the HTTP request uses the same origin as the application, or if the "wapp-allow-xorigin-params" command has been run to signal Wapp that cross-origin query parameters are allowed.

  4. POST Parameters
    POST parameters are the application/x-www-form-urlencoded key/value pairs in the content of a POST request that typically originate from forms. POST parameters are treated exactly like query parameters in that they are decoded to form new Wapp parameters as long as they have all lower-case keys and as long as either the HTTP request comes from the same origin or the "wapp-allow-xorigin-params" command has been run.

All Wapp parameters are held in a single namespace. There is no way to distinguish a cookie from a query parameter from a POST parameter. CGI parameters can be distinguished from the others by having all upper-case names.
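
The naming rule for cookies, query parameters, and POST parameters can be sketched in plain Tcl. The helper name and the regular expression below are assumptions for illustration, derived from the rules stated in this document (a lower-case first letter, followed by lower-case letters, digits, underscores, and minus signs). They are not copied from the Wapp source.

```tcl
# Hypothetical helper: would a cookie, query, or POST key be accepted?
proc is-acceptable-param-name {name} {
  return [regexp {^[a-z][a-z0-9_-]*$} $name]
}

# Print 1 for keys that pass the rule and 0 for keys that do not.
foreach key {q1 user_id env-cookie REMOTE_ADDR Name 9lives a.b} {
  puts [format "%-12s %s" $key [is-acceptable-param-name $key]]
}
```

Under this sketch, the first three keys pass and the remaining four are rejected, which is why a request cannot smuggle in a value named like a CGI parameter.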

5.2 Parameter Examples

To better understand how parameters work in Wapp, run the "env.tcl" sample application in the Wapp source tree (https://wapp.tcl.tk/home/file/examples/env.tcl), like this:

 wapptclsh examples/env.tcl

The command above should cause a web page to pop up in your web browser. That page will look something like this:

Wapp Environment

BASE_URL = http://127.0.0.1:33999
DOCUMENT_ROOT = /home/drh/wapp/examples
HTTP_ACCEPT_ENCODING = {gzip, deflate}
HTTP_COOKIE = {env-cookie=simple}
HTTP_HOST = 127.0.0.1:33999
HTTP_USER_AGENT = {Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0}
PATH_HEAD = {}
PATH_INFO = {}
PATH_TAIL = {}
QUERY_STRING = {}
REMOTE_ADDR = 127.0.0.1
REMOTE_PORT = 53060
REQUEST_METHOD = GET
REQUEST_URI = /
SAME_ORIGIN = 0
SCRIPT_FILENAME = /home/drh/wapp/examples/env.tcl
SCRIPT_NAME = {}
SELF_URL = http://127.0.0.1:33999/
WAPP_MODE = local
env-cookie = simple
[pwd] = /home/drh/wapp

Try this. Then modify the URL by adding new path elements and query parameters to see how this affects the Wapp parameters. Notice in particular how query parameters are decoded and added to the set of Wapp parameters.

5.3 Parameter Security

Parameter values in the original HTTP request may be encoded in various ways. Wapp decodes parameter values before returning them to the application. Application developers never see the encoded values. There is never an opportunity to miss a decoding step.

For security reasons, Query and POST parameters are only added to the Wapp parameter set if the inbound request is from the "same origin" or if the special "wapp-allow-xorigin-params" interface is called. An inbound request is from the same origin if it is in response to clicking on a hyperlink or form on a page that was generated by the same website. Manually typing in a URL does not constitute the "same origin". Hence, in the "env.tcl" example above the "wapp-allow-xorigin-params" interface is used so that you can manually extend the URL to add new query parameters.

If query parameters can have side effects, then you should omit the wapp-allow-xorigin-params call. The wapp-allow-xorigin-params command is safe for read-only web pages. Do not invoke wapp-allow-xorigin-params on pages where the parameters can be used to change server state.
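
The difference between the two kinds of page is sketched below. The page names "lookup" and "delete", the "id" parameter, and the elided delete step are all hypothetical. The procs are meant to be placed before "wapp-start" in an application.

```tcl
# Read-only page: cross-origin parameters are harmless here, so allow them.
proc wapp-page-lookup {} {
  wapp-allow-xorigin-params
  set id [wapp-param id]
  wapp-subst {<p>Showing record %html($id)</p>\n}
}

# State-changing page: there is no wapp-allow-xorigin-params call, so "id"
# is only visible when the request came from this same site.
proc wapp-page-delete {} {
  set id [wapp-param id]
  if {$id eq ""} {
    wapp-subst {<p>Nothing deleted.</p>\n}
    return
  }
  # ... application-specific code that removes record $id goes here ...
  wapp-subst {<p>Deleted record %html($id)</p>\n}
}
```

A hand-typed or cross-site URL such as /delete?id=7 should therefore reach the "Nothing deleted." branch, because the "id" parameter is never decoded for that request.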

5.4 CGI Parameter Details

The CGI parameters in Wapp describe the HTTP request that is to be answered and the execution environment. These parameters look like CGI environment variables. To prevent environment information from overlapping and overwriting query parameters, all the environment information uses upper-case names and all query parameters are required to be lower case. If an input URL contains an upper-case query parameter (or POST parameter or cookie), that parameter is silently omitted.

The following CGI parameters are available:

All of the above are standard CGI environment values. The following supplemental environment parameters are added by Wapp:

5.4.1 URL Parsing Example

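For the input URL "http://example.com/cgi-bin/script/method/extra/path?q1=5" and for a CGI script named "script" in the /cgi-bin/ directory, the following CGI environment values are generated:

  HTTP_HOST = example.com
  SCRIPT_NAME = /cgi-bin/script
  PATH_INFO = /method/extra/path
  REQUEST_URI = /cgi-bin/script/method/extra/path
  QUERY_STRING = q1=5
  BASE_URL = http://example.com/cgi-bin/script
  SELF_URL = http://example.com/cgi-bin/script/method
  PATH_HEAD = method
  PATH_TAIL = extra/path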

The first five elements of the example above, HTTP_HOST through QUERY_STRING, are standard CGI. The final four elements are Wapp extensions. The following is the same information shown in a diagram:

http://example.com/cgi-bin/script/method/extra/path?q1=5
       \_________/\_____________/\________________/ \__/
            |            |               |           |
        HTTP_HOST   SCRIPT_NAME      PATH_INFO       `-- QUERY_STRING


http://example.com/cgi-bin/script/method/extra/path?q1=5
       \_________/\_______________________________/ \__/
            |                    |                   |
        HTTP_HOST         REQUEST_URI                `-- QUERY_STRING


http://example.com/cgi-bin/script/method/extra/path?q1=5
\_______________________________/ \____/ \________/
                |                    |        | 
            BASE_URL           PATH_HEAD   PATH_TAIL



http://example.com/cgi-bin/script/method/extra/path?q1=5
\______________________________________/ \________/
                   |                          |
                SELF_URL                   PATH_TAIL
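
The decomposition in the diagrams can also be written out as a small plain-Tcl sketch. The helper "split-url" is hypothetical and is not part of Wapp. It simply recomputes the values shown above from a full URL and the script's path, and it assumes the URL has the simple scheme://host/path?query shape of the example.

```tcl
# Hypothetical helper, not part of Wapp: derive the values shown in the
# diagrams from a full URL and the script's path on the server.
proc split-url {url scriptName} {
  set re {^(https?://)([^/?]+)([^?]*)(?:\?(.*))?$}
  if {![regexp $re $url -> scheme host path query]} {
    error "unrecognized URL: $url"
  }
  if {[string first $scriptName $path] != 0} {
    error "URL path does not start with $scriptName"
  }
  # Everything after the script name is PATH_INFO.
  set pathInfo [string range $path [string length $scriptName] end]
  # The first element of PATH_INFO is PATH_HEAD, the rest is PATH_TAIL.
  set parts [split [string trimleft $pathInfo /] /]
  set head  [lindex $parts 0]
  set tail  [join [lrange $parts 1 end] /]
  set base  $scheme$host$scriptName
  return [dict create \
    HTTP_HOST $host  SCRIPT_NAME $scriptName  PATH_INFO $pathInfo \
    REQUEST_URI $path  QUERY_STRING $query \
    BASE_URL $base  SELF_URL $base/$head \
    PATH_HEAD $head  PATH_TAIL $tail]
}

set url http://example.com/cgi-bin/script/method/extra/path?q1=5
dict for {k v} [split-url $url /cgi-bin/script] {
  puts [format "%-13s %s" $k $v]
}
```

Running the sketch on the example URL should print the same nine name and value pairs that the diagrams label.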

5.4.2 Undefined Parameters When Using SCGI on Nginx

Some of the CGI parameters are undefined by default when using SCGI mode with Nginx. If these CGI parameters are needed by the application, then values must be assigned in the Nginx configuration file. For example:

location /scgi/ {
   include scgi_params;
   scgi_pass localhost:9000;
   scgi_param SCRIPT_NAME "/scgi";
   scgi_param SCRIPT_FILENAME "/home/www/scgi/script1.tcl";
}

6.0 Wapp Commands

Wapp is really just a collection of TCL procs. All procs are in a single file named "wapp.tcl".

The procs that form the public interface for Wapp begin with "wapp-". The implementation uses various private procedures that have names beginning with "wappInt-". Applications should use the public interface only.

The most important Wapp interfaces are:

Understand the four interfaces above, and you will have a good understanding of Wapp. The other interfaces are merely details.

The following is a complete list of the public interface procs in Wapp:

Caution #1: When using Tcl 8.6 or earlier, command substitution (but not variable substitution) also occurs in the text outside of the %-quoted regions. This problem is fixed using the new "-command" option to the regsub command in Tcl 8.7. Nevertheless, it is suggested that you avoid using the "[" character outside of the %-quotes. Use "&#91;" instead.

Caution #2: The %html() and similar %-substitutions are parsed using a regexp, which means that they cannot balance nested parentheses. The %-substitution is terminated by the first close parenthesis, not by the matching close parenthesis.
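
The effect described in Caution #2 can be reproduced in plain Tcl. The pattern below is a simplified stand-in chosen for illustration and is not Wapp's actual regular expression. It only shows why a regexp-based scanner stops at the first close parenthesis.

```tcl
# Simplified stand-in pattern: "%html(" followed by everything up to
# the first ")".
set pattern {%html\(([^)]*)\)}

# The argument contains a nested "(abc)", so the match ends too early.
set text {<p>%html([string toupper (abc)])</p>}
regexp $pattern $text -> inner
puts "captured: $inner"

# One way to stay clear of the problem: compute the value first, then
# substitute a plain variable that contains no parentheses in its name.
set v [string toupper (abc)]
regexp $pattern {<p>%html($v)</p>} -> inner2
puts "captured: $inner2"
```

In the first case the captured text is cut off after "(abc", which is not a complete Tcl command. In the second case the captured text is just the variable reference, so nothing is lost.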

7.0 Security

Wapp strives for security by default. Applications can disable security features on an as-needed basis, but the default setting for security features is always "on".

Security features in Wapp include:

  1. The default Content Security Policy [2] or "CSP" for all Wapp applications is default-src 'self'. In that mode, resources must all be loaded from the same origin, the use of eval() and similar commands in JavaScript is prohibited, and no in-line JavaScript or CSS is allowed. These limitations help keep applications safe from Cross-site Scripting or XSS attacks [3], even in the face of application coding errors. If these restrictions are too severe for an application, the CSP can be relaxed or disabled using the "wapp-content-security-policy" command.

  2. Access to GET query parameters and POST parameters is prohibited unless the origin of the request is the application itself, as determined by the "Referer" field in the HTTP header. This feature helps to prevent Cross-site Request Forgery [4] attacks. The "wapp-allow-xorigin-params" command can be used to disable this protection on a case-by-case basis.

  3. Cookies, query parameters, and POST parameters are automatically decoded before they reach application code. There is no risk that the application program will forget a decoding step or accidentally miscode a decoding operation.

  4. Cookies, query parameters, and POST parameters are silently discarded unless their names begin with a lower-case letter and contain only alphanumerics, underscores, and minus-signs. Hence, there is no risk that unusual parameter names can cause quoting problems or other vulnerabilities.

  5. Reply text generated using the "wapp-subst" and "wapp-trim" commands automatically escapes generated text so that it is safe for inclusion within HTML, within a javascript or JSON string literal, as a URL, or as the value of a query parameter. As long as the application programmer is careful to always use "wapp-subst" and/or "wapp-trim" to generate replies, there is little risk of injection attacks.

  6. If the application is launched on a command-line with the --lint option, then instead of running the application, Wapp scans the application code looking for constructs that are unsafe. Unsafe constructs include things such as using "wapp-subst" with an argument that is not contained within {...}.

  7. The new (non-standard) SAME_ORIGIN variable is provided. This variable has a value of "1" or "0" depending on whether or not the current HTTP request comes from the same origin. Applications can use this information to enhance their own security precautions by refusing to provide sensitive information or perform sensitive actions if SAME_ORIGIN is not "1".

  8. The --scgi mode only accepts SCGI requests from localhost. This prevents an attacker from sending an SCGI request directly to the script and bypassing the webserver in the event that the site firewall is misconfigured or omitted.

  9. Though cookies, query parameters and POST parameters are accessed using the same mechanism as CGI variables, the CGI variable names use a disjoint namespace. (CGI variables are all upper-case and all others are lower-case.) Hence, it is not possible for a remote attacker to create a fake CGI variable or override the value of a CGI variable.
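
As an illustration of item 7, a page might consult SAME_ORIGIN before revealing anything sensitive. The sketch below is only one possible arrangement. The page name "account", the "user" parameter, and the reply text are hypothetical, and the proc is meant to be placed before "wapp-start" in an application.

```tcl
# Hypothetical page that shows account details only to requests that
# originate from the application's own pages.
proc wapp-page-account {} {
  # Treat a missing SAME_ORIGIN value the same as "0".
  if {[wapp-param SAME_ORIGIN 0] ne "1"} {
    wapp-subst {<p>Please open this page from the site menu.</p>\n}
    return
  }
  set user [wapp-param user]
  wapp-subst {<p>Account details for %html($user)</p>\n}
}
```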

Part of what makes Wapp easy to use is that it helps free application developers from the worry of accidentally introducing security vulnerabilities via programming errors. Of course, no framework is fool-proof. Developers still must be aware of security. Wapp does not prevent every error, but it does help make writing a secure application easier and less stressful.

8.0 End Notes

  1. https://sqlite.org/
  2. https://en.wikipedia.org/wiki/Content_Security_Policy
  3. https://en.wikipedia.org/wiki/Cross-site_scripting
  4. https://en.wikipedia.org/wiki/Cross-site_request_forgery