A wiki server backed by manually edited text files, with a selectable filter that converts them to HTML and a caching mechanism.
I've tried using purely web-based wikis like MediaWiki, but the database and server maintenance they involve has always ended up with me neglecting them. This time, I'm going to try something a bit different.
- Python 3 (3.3 or newer)
- pytz
- python-dateutil
- lxml
- chardet
- python-magic
- tornado
- An OS that supports the built-in fcntl module
- asciidoc in PATH (optional)
- markdown for Python (optional)
An example configuration:

```xml
<?xml version="1.0" ?>
<configuration>
<log-level>DEBUG</log-level><!-- Passed to logging module -->
<bind-address>127.0.0.1</bind-address><!-- OPTIONAL: address to bind to -->
<bind-port>8080</bind-port><!-- Port to bind to -->
<document-root>testdata/test_root</document-root><!-- Root of the directory tree containing files which will be processed and served -->
<preview-lines>5</preview-lines><!-- OPTIONAL: When performing a search, show this many lines from the source document -->
<worker-threads>4</worker-threads><!-- OPTIONAL: Number of all-purpose worker threads to spawn. DEFAULT: 1 -->
<runtime-vars>4</runtime-vars><!-- Storage for runtime variables separate from the cache -->
<cache dir="testdata/test_cache"><!-- dir=Root of cache directory -->
<checksum-function>sha1</checksum-function><!-- Checksum algorithm used on the files to be processed to determine cache state -->
<max-age>86400</max-age><!-- OPTIONAL: Whenever a scrub is performed, delete files that are older than this age (seconds) -->
<max-entries>2048</max-entries><!-- OPTIONAL: Use an LRU algorithm to limit the approximate maximum number of entries in the cache -->
<auto-scrub /><!-- OPTIONAL: When the LRU algorithm hits the maximum number of entries, automatically scrub the cache to clear up free slots -->
<dispatcher-thread /><!-- OPTIONAL: Use the DispatcherCache class instead, which will perform automatic scrubbing in a separate thread -->
<send-etags /><!-- OPTIONAL: Send Etags based on checksum algorithm -->
</cache>
<search-cache><!-- OPTIONAL: Simply having this element present enables cached searches -->
<max-age>3600</max-age><!-- OPTIONAL: Whenever a scrub is performed, delete files that are older than this age (seconds) -->
<max-entries>32</max-entries><!-- OPTIONAL: Use an LRU algorithm to limit the approximate maximum number of entries in the cache -->
<auto-scrub /><!-- OPTIONAL: When the LRU algorithm hits the maximum number of entries, automatically scrub the cache to clear up free slots -->
</search-cache>
<processors>
<encoding>utf8</encoding><!-- Output encoding passed to all the processors -->
<processor>asciidoc-xhtml11</processor><!-- OPTIONAL: Sets the default processor used to convert files to HTML -->
<!-- If no default processor is specified, the 'autoraw-nocache' processor is used -->
<processor extensions="txt foo">asciidoc-xhtml11</processor><!-- For the extensions txt and foo, use this processor to convert -->
<processor extensions="bar">asciidoc-html5</processor><!-- For the extensions bar, used asciidoc-html5 instead -->
</processors>
</configuration>
```
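As a rough sketch of how such a file might be consumed with the standard library's xml.etree.ElementTree (the helper and the returned keys are illustrative, not the server's actual API):

```python
import xml.etree.ElementTree as ET

def load_config(path):
    """Illustrative helper: read the XML configuration into a plain dict."""
    root = ET.parse(path).getroot()
    cache = root.find("cache")
    return {
        "log_level": root.findtext("log-level", "INFO"),
        "bind_address": root.findtext("bind-address", "127.0.0.1"),
        "bind_port": int(root.findtext("bind-port", "8080")),
        "document_root": root.findtext("document-root"),
        "cache_dir": None if cache is None else cache.get("dir"),
        "checksum": "sha1" if cache is None else cache.findtext("checksum-function", "sha1"),
        # Presence-only flags such as <send-etags /> just need an existence test.
        "send_etags": cache is not None and cache.find("send-etags") is not None,
    }
```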
Since Python's argparse module is used for parsing arguments, --help will print a usage summary:

```
usage: server.py [ options ] -c config.xml

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG.XML, -c CONFIG.XML
                        XML configuration file
  --scrub               Instead of running the server, just do a cache scrub
  --bind-address ADDRESS
                        Bind to ADDRESS instead of the address specified in
                        configuration
  --bind-port PORT      Bind to PORT instead of the port specified in
                        configuration
```
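An argparse setup along these lines would produce that help text; this is a sketch only, and the real server.py may differ in details:

```python
import argparse

parser = argparse.ArgumentParser(usage="server.py [ options ] -c config.xml")
parser.add_argument("--config", "-c", metavar="CONFIG.XML", required=True,
                    help="XML configuration file")
parser.add_argument("--scrub", action="store_true",
                    help="Instead of running the server, just do a cache scrub")
parser.add_argument("--bind-address", metavar="ADDRESS",
                    help="Bind to ADDRESS instead of the address specified in configuration")
parser.add_argument("--bind-port", metavar="PORT", type=int,
                    help="Bind to PORT instead of the port specified in configuration")
args = parser.parse_args()
```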
Design notes:

- File locking everywhere to keep things consistent within the server processes and threads.
- Raw source files will be used as the input, which can be modified whenever.
- A caching system with a directory tree that corresponds to the source (asciidoc, etc.) structure.
  - Web server processes will have a shared lock on a toplevel .lock file.
  - Other threads and processes will have an exclusive lock on a toplevel .lock file (see the locking sketch after this list).
- The caching system will have two methods of cleanup (see the scrub sketch below).
  - A time-to-live system with a configurable maximum age. The cleanup process will have to be scheduled in cron or something similar.
  - An optional LRU-based system that will delete the oldest entries when there are too many sitting around. A thread running in the background will have this task dispatched to it.
- The caching system will use file size, file modification time, and a configurable checksum to check for changes in source files (see the fingerprint sketch below).
- The actual filter will be configurable and replaceable, with asciidoc as both the initial and reference implementation (see the filter sketch below).
- Any source file revision control is left to the person managing the source directory tree.
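As a rough illustration of the locking scheme, here is a minimal sketch using the built-in fcntl module; the .lock path and the surrounding structure are assumptions for illustration, not the server's actual code:

```python
import fcntl

# Readers (web server processes) take a shared lock on the toplevel .lock
# file; maintenance threads and processes take an exclusive one instead
# (fcntl.LOCK_EX), which waits until all shared holders are done.
with open("testdata/test_cache/.lock", "a+") as lockfile:
    fcntl.flock(lockfile, fcntl.LOCK_SH)
    try:
        pass  # read from the cache; other readers may hold LOCK_SH concurrently
    finally:
        fcntl.flock(lockfile, fcntl.LOCK_UN)
```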
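A sketch of the two cleanup methods combined into one scrub pass, under the assumption that cache entries are plain files whose mtime doubles as a last-use timestamp; the function name and defaults mirror the example configuration but are otherwise hypothetical:

```python
import os
import time

def scrub(cache_dir, max_age=86400, max_entries=2048):
    """Illustrative scrub: a TTL pass followed by an LRU trim."""
    # Collect (mtime, path) for every cache entry, oldest first.
    entries = sorted(
        (os.path.getmtime(os.path.join(d, f)), os.path.join(d, f))
        for d, _subdirs, files in os.walk(cache_dir)
        for f in files
        if f != ".lock")
    now = time.time()
    # Time-to-live pass: delete anything older than max_age seconds.
    while entries and now - entries[0][0] > max_age:
        os.remove(entries.pop(0)[1])
    # LRU pass: trim the oldest entries until the count fits max_entries.
    while len(entries) > max_entries:
        os.remove(entries.pop(0)[1])
```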
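A sketch of that change check: an entry is considered stale when any of the three values differs from what was recorded at cache time. hashlib.new() accepts the algorithm name from <checksum-function>; the helper itself is illustrative:

```python
import hashlib
import os

def source_fingerprint(path, algorithm="sha1"):
    """Illustrative fingerprint: size, mtime, and a configurable checksum."""
    # Size and mtime are cheap early-out checks; the checksum catches
    # edits that leave both unchanged.
    st = os.stat(path)
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return (st.st_size, st.st_mtime, digest.hexdigest())
```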
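Since asciidoc runs as an external command found in PATH, the reference filter can be little more than a subprocess pipe. This wrapper is hypothetical, though -b (backend), -o - (write to stdout), and - (read stdin) are standard asciidoc options:

```python
import subprocess

def asciidoc_filter(source_bytes, backend="xhtml11"):
    """Illustrative filter: pipe the raw source through asciidoc."""
    proc = subprocess.Popen(
        ["asciidoc", "-b", backend, "-o", "-", "-"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    html, _ = proc.communicate(source_bytes)
    if proc.returncode != 0:
        raise RuntimeError("asciidoc exited with status %d" % proc.returncode)
    return html
```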