UNB's Luci - The eLUCIdator
The following Perl modules are required by Luci. All are available from http://www.cpan.org/.
NOTE 1: The versions listed here are those that were used during development. It is possible that Luci will work with an older version of the same module, but performance with an older version should be considered uncertain.
NOTE 2: It is possible that some of the modules listed here have module requirements of their own.
NOTE: These settings are required for Luci to work properly. Any parameter within the config file may be modified to tweak Luci to best suit your needs, but you only need to change the required list to get Luci up and running.
- app_root
- absolute path under which you've installed the distribution.
ex: /web/text_only/parser
- app_root_url
- absolute URL of the application root directory.
( NOTE: the 's' in 'https' is recommended, but not required. Please see
the FEATURE LIST for information on how Luci works with SSL and why Luci
should reside behind a Secure Socket Layer. )
ex: https://www.yourdomain.com/text_only/parser
Under <system>
- default_target
- absolute URL of the default site where Luci should
be directed. This site will be used if a user references the luci.cgi
directly.
ex: http://www.yourdomain.com/
- apache_2
- set true if running under apache 2, false otherwise
- allow_url
- urls that Luci will treat as internal ( ie. allowed
for parse ). This directive should be repeated for every url under which
Luci should allow for parsing. If you leave this field empty, Luci will
allow parsing of any domain. See luci.conf.cgi for specific info on how
to set this directive.
ex: allow_url = http://yourdomain.com/*
    allow_url = http://yourotherdomain.com/*
- deny_url
- URLs that Luci will not allow for parse. This directive
follows the same conventions as the allow_url directive. Note: deny_url
overrides settings in the allow_url directive. If you leave this field
empty, Luci will allow parsing of any URL permitted by allow_url.
See luci.conf.cgi for specific info on how to set this directive.
ex: deny_url = http://yourdomain.com/personal/*
    deny_url = http://yourdomain.com/personal/mypage.html
- cryptkey
- the encryption key is used with Crypt::CBC for generating
parameter names, and also for en/decrypting passwords used in
conjunction with 401 authorization ( ie. .htaccess ). If you are not familiar
with this, set it to some random string. As far as we know, cryptkey can be
any string you like; it is probably best not to be too creative, but don't
use the example given here.
( see the FEATURE LIST for information on how 401 authorization works with
Luci and why you need to set cryptkey. )
ex: cryptkey = aZ4eg3P
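The precedence between allow_url and deny_url described above can be sketched in Perl as follows. This is only an illustration of the documented behaviour, not Luci's actual code: the helper names pattern_to_regex and url_allowed are hypothetical, and the sketch assumes that '*' is the only wildcard and that it matches any run of characters.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a pattern such as "http://yourdomain.com/*"
# into an anchored regex. Assumption: '*' is the only wildcard.
sub pattern_to_regex {
    my ($pattern) = @_;
    my $body = join '.*', map { quotemeta($_) } split /\*/, $pattern, -1;
    return qr/\A$body\z/;
}

# Hypothetical helper: decide whether a URL may be parsed.
sub url_allowed {
    my ($url, $allow, $deny) = @_;
    for my $p (@$deny) {                 # deny_url overrides allow_url
        return 0 if $url =~ pattern_to_regex($p);
    }
    return 1 unless @$allow;             # empty allow list: any URL may be parsed
    for my $p (@$allow) {
        return 1 if $url =~ pattern_to_regex($p);
    }
    return 0;
}

my @allow = ('http://yourdomain.com/*', 'http://yourotherdomain.com/*');
my @deny  = ('http://yourdomain.com/personal/*');

for my $url ('http://yourdomain.com/index.html',
             'http://yourdomain.com/personal/mypage.html',
             'http://elsewhere.example/') {
    printf "%-45s %s\n", $url,
        url_allowed($url, \@allow, \@deny) ? 'allowed' : 'denied';
}
```

With the example lists above, only the first URL is reported as allowed: the second is caught by deny_url, and the third matches no allow_url pattern.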
Under <cookies>
- domain
- the domain under which you've installed Luci.
ex: yourdomain.com
- secure
- Set to 1 if Luci is running under SSL. Used with cookies
set by Luci. According to CGI::Cookie: ``If the 'secure' attribute is set,
the cookie will only be sent to your script if the CGI request is occurring
on a secure channel, such as SSL.''
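Pulling the required settings together, a luci.conf.cgi fragment might look roughly like the sketch below. It reuses the example values from this section. The closing </system> and </cookies> tags, the placement of app_root and app_root_url outside any block, and the exact spelling of the apache_2 value are assumptions, so check them against the luci.conf.cgi shipped with the distribution.

```
# Sketch only: example values, to be replaced with your own
app_root     = /web/text_only/parser
app_root_url = https://www.yourdomain.com/text_only/parser

<system>
    default_target = http://www.yourdomain.com/
    apache_2       = true
    allow_url      = http://yourdomain.com/*
    allow_url      = http://yourotherdomain.com/*
    deny_url       = http://yourdomain.com/personal/*
    cryptkey       = replace-with-your-own-random-string
</system>

<cookies>
    domain = yourdomain.com
    secure = 1
</cookies>
```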
You should now be ready to test the install. See USAGE.
- hide luci.conf.cgi from your web server document root - By moving the config file to a directory outside the web tree, you prevent anonymous users from accessing your configuration details. To do this, first move luci.conf.cgi to a directory outside of your web server document root, then edit both luci.cgi and index.cgi, updating the following line:
use constant LUCI_CONF => $path.$sep.'luci.conf.cgi';
to read:
use constant LUCI_CONF => "/path/to/hidden/luci.conf.cgi";
- run setuid - By running Luci setuid, you can probably leave the config under the web server document root, as long as the permissions are set properly.
NOTE: When running setuid, Luci runs as the owner of the Luci scripts, and therefore has all the permissions that user has on the system. Please see the setuid man page for more information on how it works. When running setuid you should probably run the application under a user account that has minimal permissions on the system.
NOTE: As of version 1.3, we have renamed the configuration file with the appended '.cgi'. This should allow you to leave the file in place if you serve perl generated pages with the cgi extension. In this case it is recommended that you set it with read only permissions so any user attempting to read the file with their browser will get a forbidden error.
The solution has two parts:
- (a) As per the googlebot documentation, we've added the meta content
necessary to hide your Luci pages from a crawler that may attempt indexing
via your site. This is available as an option in the config, and is turned
on by default. If you have a small website, and do make use of the allow_url
directive in your luci.conf.cgi, you shouldn't be too concerned about robots
because they will only index those pages allowed by your Luci installation.
- (b) For larger sites, or those allowing open access by not using
the allow_url directive, it would be good practice to at minimum leave
the option to include the nofollow meta turned on, as described in
(a) above. Additionally, you can include a robots.txt file at the
root of your site, which is the standard method used by a website admin to
communicate with a crawler engine.
As an example: if you install Luci under https://www.yourdomain.com/text_only/parser/, then you would create a robots.txt file at https://www.yourdomain.com/robots.txt. The entries necessary to hide this installation are as follows:
User-agent: *
Disallow: /text_only/parser
NOTE: The robots.txt file *MUST* exist at the root of your site, else it will have no effect.
For information on googlebot, see: http://www.google.com/webmasters/bot.html
For information on robot exclusion via robots.txt and the meta nofollow, see: http://www.robotstxt.org/wc/exclusion.html
For detailed information on Web Robots, see: http://www.robotstxt.org/wc/faq.html
Below is a list of environments under which we know Luci will run. With each is a description of what software versions were used, and what was required to get Luci running. Some systems have special requirements:
- UNIX/Linux/OS X - Luci was developed in a UNIX environment, and should work without issue. See BUGS for reporting issues found when running/installing Luci in a UNIX environment.
- Windows 2003 Server Version 5.2.3790 Build 3790 : Service Pack 1 :: Processor - x86 Family 6 Model 6 Stepping 5 GenuineIntel ~500 MHz
  1. Install ActivePerl (we used 5.8.8.817-MSWin32-x86-257965)
     - top level dir: c:\Perl
     - install all options
     - Luci requires the 'Add Perl to the PATH environment variable' option
     - Using the perl package manager:
       - install config-general
       - install html-template
       - install crypt-cbc
       - install http://theoryx5.uwinnipeg.ca/ppms/Crypt-Twofish_PP.ppd
       - install http://theoryx5.uwinnipeg.ca/ppms/Crypt-SSLeay.ppd
         - Fetch ssleay32.dll? yes
         - Where should ssleay32.dll be placed? C:\Perl\bin
         - Fetch libeay32.dll? yes
         - Where should libeay32.dll be placed? C:\Perl\bin
         - see item 4 below for information on permissions required by these two files
  2. IIS version 6.0 comes with this version of Windows 2003
     - configure IIS 6 to support ActivePerl as per the following document:
       ActivePerl 5.8 - Online Docs : Web Server Configuration
     - note the following when working through the configuration document above
       - New > Virtual Directory settings:
         - Virtual Directory Alias: 'cgi-bin'
         - Web Site Content Directory: c:\path\to\cgi-bin (ex: C:\Inetpub\cgi-bin)
         - Virtual Directory Access Permissions: check Read and Execute
       - Virtual Directory > Properties
         - Virtual Directory Tab > Configuration > Mappings:
           - choose Add
           - Executable: c:\Perl\bin\perl.exe -T "%s" %s
           - Extension: .cgi
         - Documents Tab:
           - check Enable default content page
           - choose Add
           - Default content page: index.cgi
     - Luci uses taint checking. You can either remove the -T from the perl invocation at the top of both index.cgi and luci.cgi, or, when you add the mapping, specify -T before "%s" %s
       ex: C:\Perl\bin\perl.exe -T "%s" %s
  3. Extract (install) Luci in your cgi-bin
     - rename the folder 'luci'
     - set up your luci.conf.cgi as described in SETUP & CONFIGURATION
     - test your install as described in USAGE
  4. If you see the following messages:
     - LWP will support https URLs if the Crypt::SSLeay module is installed.
     - 501 Protocol scheme 'https' is not supported (Crypt::SSLeay not installed).
     the IIS internet guest account (iusr) must have 'Read and Execute' permissions set on the Crypt-SSLeay DLLs, namely ssleay32.dll and libeay32.dll. To set permissions, locate these files, choose Properties from the context menu, and under the Security tab add the iusr account to the user names list with 'Read and Execute' permissions checked.
  5. The following document was provided by John Newman from http://www.newluna.com/, and may be of assistance for those installing under Windows 2003: http://luci.sourceforge.net/other/win2003_05_30_2006.pdf
     If you intend to perform a non-network install, you will need to retrieve the modules and copy them to your local machine manually. We cannot provide them here, and would also prefer that the latest version of the modules be used.
- Windows 2000 Server Version 5.0.2195 Build 2195 :: Processor - x86 Family 6 Model 6 Stepping 5 GenuineIntel ~500 MHz
  1. Install ActivePerl (we used 5.8.6.811-MSWin32-x86-122208)
     - top level dir: c:\Perl
     - y to all options (especially env var)
     - Using the perl package manager:
       - install config-general
       - install html-template
       - install crypt-cbc
       - install http://theoryx5.uwinnipeg.ca/ppms/Crypt-Twofish_PP.ppd
       - install http://theoryx5.uwinnipeg.ca/ppms/Crypt-SSLeay.ppd
         - Fetch ssleay32.dll? yes
         - Where should ssleay32.dll be placed? C:\Perl\bin
         - Fetch libeay32.dll? yes
         - Where should libeay32.dll be placed? C:\Perl\bin
         - see item 4 below for information on permissions required by these two files
  2. IIS version 5.0 comes with this version of Windows 2000
     - configure IIS 5 to support ActivePerl as per the following document:
       ActivePerl 5.8 - Online Docs : Web Server Configuration
     - note the following when working through the configuration document above
       - New > Virtual Directory settings:
         - Virtual Directory Alias: 'cgi-bin'
         - Web Site Content Directory: c:\path\to\cgi-bin (ex: C:\Inetpub\cgi-bin)
         - Virtual Directory Access Permissions: check Read and Execute
       - Virtual Directory > Properties
         - Virtual Directory Tab > Configuration > App Mappings:
           - choose Add
           - Executable: c:\Perl\bin\perl.exe -T "%s" %s
           - Extension: .cgi
         - Documents Tab:
           - check Enable default content page
           - choose Add
           - Default content page: index.cgi
     - Luci uses taint checking. You can either remove the -T from the perl invocation at the top of both index.cgi and luci.cgi, or, when you add the mapping, specify -T before "%s" %s
       ex: C:\Perl\bin\perl.exe -T "%s" %s
  3. Extract (install) Luci in the web tree
     - rename the folder 'luci'
     - set up your luci.conf.cgi as described in SETUP & CONFIGURATION
     - test your install as described in USAGE
  4. We've yet to encounter any installation errors while following the directions provided here. If any are found, please let us know so we can update the documentation.
- test configuration - Once you have Luci installed and configured to work on your server, using Luci is quite simple. To test your configuration, simply point your web browser at the Luci install directory. You should see a 'text only' version of the default_target which was set in the configuration file.
- adding accessibility links to your pages - Using the provided index.cgi you can quite easily add accessibility links to all your pages. Use the following in your HTML to provide quick and easy access to Luci. The index.cgi will take care of forcing Luci to parse the appropriate page.
<a href="https://www.yourdomain.com/path/to/luci/">Accessible
Version</a>
The following icons are also available for linking to Luci: ( download them from here or simply reference them via the images directory that came with the distribution )
Luci is the bright, young colonial cousin of the venerable dowager Betsie, the BBC's Education Text to Speech Internet Enhancer. While still bearing a family resemblance to Betsie, Luci has been completely rewritten, mainly to accommodate SSL. ( see SSL in the FEATURE LIST for more information on why we wrote Luci )
Luci is clear, plain, simple and easy to use.
Luci allows you to change the way your browser views web pages by simplifying their content into a well-structured, text-only format, mainly for accessibility purposes. Luci works equally well whether you wish to change the font size and colour scheme of your display to make it easier to read, or want to send it to a text reader.
Once in Luci's unified text rendering view, the user has several options for adjusting the application's display settings ( by choosing font, colour scheme, font size, and line height ). These settings are maintained as you continue to browse within the allotted domain(s), or until you switch back to a Graphical Version of the site.
Unlike Betsie, Luci is not an acronym. Luci gets her name from the word 'elucidate'.
Elucidator ( noun )
1. One who explains or elucidates;
Elucidate ( verb )
2. To make clear or plain; clarify.
Clear, plain, simple and easy to use - Elucidator is a no-brainer.
Luci has many features; those we've deemed most important are listed here. For more detailed information on what Luci can do, see the source and the config; you may also find some detail in the changelog.
At the time of this writing, Betsie was not capable of parsing secure content. An attempt was made to modify Betsie, but after much ado we decided that we would benefit from a full rewrite: it would add the ability to parse encrypted content, and would let us take advantage of features that may not have been available or feasible when Betsie was originally written. ( ex: OO concepts, certain Perl modules, etc. )
- Luci should reside behind SSL - In Figures 1 and 2, you can see how Luci works. Basically, Luci acts as an intermediary between the user and some web site. In both diagrams, you can easily see that Luci may communicate with either secure or non-secure web servers.
In Figure 1, under http ( non-SSL ), the communication between the user and Luci is not encrypted, and therefore, not secure. This can be dangerous because those users accessing secure content using Luci in this scenario would be transferring data intended for encryption across an un-encrypted connection.
In Figure 2, Luci is hosted in a secure environment, and therefore, communication between a user and the web server is now encrypted. This is how Luci should be hosted.
Because of these security concerns, we've added a check in Luci that will fail parsing of secure pages when Luci is hosted from http. In this case, users will be presented with an error message when attempting to parse a secure document. See DISCLAIMER.
- size and excessive use -
The cookie specification as per RFC2109 ( see: http://rfc.net/rfc2109.html ) states that a user agent must accommodate a cookie of at LEAST 4 KB in size ( quashing the myth that browsers only support cookies up to 4 KB in size ). ( see: http://search.cpan.org/~gaas/libwww-perl-5.800/lib/HTTP/Cookies.pm#METHODS for details on cookie creation ) As a result, the application should work very well where cookie sizes are moderate. It is up to the browser how it behaves once the cookie size surpasses the required 4 KB minimum ( truncation is a definite possibility ). Therefore, application performance with respect to cookie size and excessive use is uncertain, and most likely dependent on the user's browser.
( see: changelog for more information on cookies )
Using this model, you cannot be logged into more than one 401 site at a time. Luci passes the credential information to the server upon each and every 401 request. If the user navigates to a separate domain, they will be prompted for their credentials with respect to that domain, and any existing authorization data will be overwritten.
The cryptkey parameter in the luci.conf.cgi is the key that is used to en/decrypt the credential information that is stored in the authorization cookie. As stated, you should change the cryptkey and hide the luci.conf.cgi from being viewed via the web. If your cryptkey is publicly available ( ie. if you use the one provided with the download ), and some intruder manages to steal the authorization cookie from one of your users, they could then easily decrypt the data and obtain their credential information.
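One way to come up with a cryptkey that is not the shipped example is to generate a random string. The Perl sketch below is an illustration only and is not part of Luci: the helper name random_key is hypothetical, the length of 32 characters is an arbitrary choice, and it assumes a UNIX-like system where /dev/urandom is available (on Windows another random source would be needed).

```perl
use strict;
use warnings;

# Hypothetical helper: build a random alphanumeric key of the given length
# from /dev/urandom (UNIX-like systems only).
sub random_key {
    my ($length) = @_;
    my @chars = ('A'..'Z', 'a'..'z', '0'..'9');   # 62 candidate characters
    open my $fh, '<:raw', '/dev/urandom'
        or die "Cannot open /dev/urandom: $!";
    my $key = '';
    while (length($key) < $length) {
        read($fh, my $byte, 1) == 1 or die "Short read from /dev/urandom";
        my $n = ord $byte;
        next if $n >= 248;            # 248 = 4 * 62; skip values that would bias the choice
        $key .= $chars[$n % @chars];
    }
    close $fh;
    return $key;
}

# Print a line that can be pasted into the <system> section of luci.conf.cgi.
print 'cryptkey = ', random_key(32), "\n";
```

Restricting the key to letters and digits is a cautious choice here, since the characters cryptkey accepts are not spelled out above.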
Using the Luci configuration file, you can easily tweak Luci to best suit your site's needs.
- templates -
All the HTML associated with Luci is templated, and available in the templates directory provided with the distribution. With these templates, you can quite easily port your site branding to Luci, etc...
<span class="luciignore"> Luci will ignore any content nested within this tag block </span>
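The effect of the luciignore class can be illustrated with a small Perl sketch. This is not how Luci's parser is implemented. It is a simplified, regex-based stand-in with a hypothetical helper name (strip_ignored), and it assumes that an ignored span contains no other span elements.

```perl
use strict;
use warnings;

# Hypothetical helper, not Luci's parser: drop every luciignore span.
# Assumption: no other <span> elements are nested inside an ignored span.
sub strip_ignored {
    my ($html) = @_;
    $html =~ s{<span\s+class="luciignore"\s*>.*?</span>}{}gis;
    return $html;
}

my $page = '<p>Kept</p><span class="luciignore">Skipped</span><p>Also kept</p>';
print strip_ignored($page), "\n";   # prints <p>Kept</p><p>Also kept</p>
```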
Many thanks to the authors of Betsie ( see: http://betsie.sourceforge.net/ ) for it was that application that inspired the creation of Luci.
We would also like to thank the following for their contributions on this project:
Project Site ( download ): http://sourceforge.net/projects/luci/
CVS: http://luci.cvs.sourceforge.net/luci/luci/
The University of New Brunswick: http://www.unb.ca/
UNB's Luci Install: https://www.unb.ca/sweb/parser/
The authors make no representations or warranties of any kind concerning the quality, safety or suitability of this software. See LICENSE for more information.