Note: This is a lightly modified version of the file at litpd.md . The minor modifications are solely for presentation.
Revisions
Version | Date | Comments |
---|---|---|
0.1a | 21/03/2024 | Initial version |
Introduction
This document describes a simple literate programming tool. It is
developed for use in my own programming projects. I use markdown for most
documentation, and I use lua
for most of my programming needs these days.
This tool uses pandoc to process a markdown file to one of the supported
printable/publishable outputs supported by pandoc. The tool also includes
a lua filter to process the code blocks in the literate program document to
generate output programs in their own files.
This tool is itself written as a literate program and can be found at litpd
What is Literate Programming?
For those unfamiliar to the term "literate programming", I would refer you to the excellent writings on the topic by Donald Knuth. In short literate programming is about program definition in a document which is written as a work of literature. Therefore one of the primary objectives of a literate program is to be read and enjoyed by another programmer. The literate program document can also be used to create an executable program or a library in the target programming language.
The litpd Program
The litpd program is written in the Lua programming language. The goal of the program is two-fold: 1. Readable Program: Generate a publishable/printable Program Description in HTML or PDF formats. 2. Runnable Program: Separate out and/or merge code blocks into individual program files so that they can be used as a normal program in the target language.
The program uses pandoc to perform the generation of the final readable document with minor adjustments. Therefore this part of the program simply delegates to pandoc.
To generate the source code in proper files and structure, we inject a lua filter program into the pandoc processing flow. This program extracts the code from the document and writes it to the target program file(s).
The approach is also described in the High-level design diagram below.
Therefore, the litpd application is composed of two programs.
- litpd.lua: This program is the main cli tool used to generate the publishable document and the runnable program from the input literate program written in the pandoc markdown format.
- mdtangle.lua: This program is a pandoc lua filter. The goal of this program is to run during the filter phase of document generation and extract the source code of the literate program into proper output program files.
CLI Program - litpd.lua
The litpd.lua
program provides a command line interface to the literate
programming tool. It allows us to run the pandoc conversion of the literate
program document into the publishable document and the runnable program using
the mdtangle.lua
filter as well as compile the output into proper files at
the proper locations.
The program has the following parts:
- Program documentation
- Extract program arguments & check required arguments
- Construct the pandoc command
- Run the pandoc command and check for errors
We will discuss each part one by one.
Program Header
This part is self-explanatory and provides the header of the program in standard lua documentation format.
--- litpd.lua - Main CLI program for litpd tool.
--
-- license: MIT see LICENSE file
-- date: 21/03/2024
-- author: Abhishek Mishra
Program Arguments
In this section of the program, we first construct the args
table from the
program arguments.
- The
script_path
stores the first argument which is usually the name of the lua program being run, if this is run from a standard lua executable. - In case the
script_path
contains any windows\
backslash path separators, we replace them with/
slashes. - The we match anything before the last
\
of thescript_path
as the path of the directory containing the program, and store the result inlitmd_home
. - Next we define a function
show_usage
which prints the usage of the command, this is used when the user does not pass the proper expected arguments to the command. - Finally we look at the arguments:
- If the length of the args is 0, then we print usage and exit.
- If the first argument (i.e. the input literate programming document) is
not provided, then we print the usage and exit. Else we store the input file
name in
input_file
. - The rest of the arguments are assumed to be pandoc arguments and are stored
in a separate table called
options
.
-- get the arguments from the command line
local args = {...}
-- get the path of script, and its directory
local script_path = arg[0]
-- replace all backslashes with forward slashes
script_path = script_path:gsub("\\", "/")
local litmd_home = script_path:match(".*/")
-- print("litmd_home: " .. litmd_home)
--- Show usage
local function show_usage()
print("Usage: litmd <inputfile.md> [options]")
end
-- if no arguments are provided, print the usage
if #args == 0 then
show_usage()
return
end
-- get the input file name
local input_file = args[1]
if input_file == nil then
print("No input file provided")
show_usage()
return
end
-- get the rest of the arguments
local options = {}
for i = 2, #args do
table.insert(options, args[i])
end
Construct and Display Pandoc Command
In the next section of the program we now construct the pandoc command to run such that both the output document, and output code are generated correctly.
- The
TANGLE_FILTER
variable is created to store the path to the lua pandoc filter which will extract the code from the input document and write it to individual source code files. - The
PANDOC_CMD
variable stores a string which passes the lua-filter arg and the markdown source type setting to the pandoc command. - Then we construct the command to be run in a variable called
cmd
from its constituent parts. First thePANDOC_CMD
then theinput_file
and finally the rest of the args in the tableoptions
are added to the stringcmd
. - Once constructed the
cmd
string is displayed on the terminal.
local TANGLE_FILTER = litmd_home .. "mdtangle.lua"
local PANDOC_CMD = "pandoc --lua-filter=" .. TANGLE_FILTER .. " --from=markdown "
-- create the final command, start with the pandoc command
local cmd = PANDOC_CMD
-- add the input file
cmd = cmd .. input_file
-- add the rest of the options
for i = 1, #options do
cmd = cmd .. " " .. options[i]
end
-- display the command to be executed
print("Executing: " .. cmd)
Run the Pandoc Command
The last few lines of the program run the constructed cmd
string using
the io.popen
library call. The call returns a handle to the output, which is
stored in handle
. We read the output from this output stream into a variable
called result
and then close the handle
.
The result
is printed to the terminal. And then the program is done.
-- execute the command
local handle = io.popen(cmd)
if handle == nil then
print("Error executing command")
return
end
local result = handle:read("*a")
handle:close()
-- handle the result
print(result)
Filter Program - mdtangle.lua
The mdtangle.lua
program is a pandoc lua filter. A pandoc filter is a
program which is executed by the pandoc program during its filtration phase.
The filter has access to the abstract syntax tree (AST) of the input document.
The access to the AST of the input document provides the filter program the
ability to implement transformations of the input document, or add functionality
to the document generation process that is not part of the standard pandoc
processing.
The mdtangle.lua
filter is only interested in the CodeBlock
section of the
AST which represents the code sections of the input markdown document. The
program registers itself to read all the CodeBlock
sections. When a new code
block occurs, the filter program notes down its attribute named code_file
.
If such an attribute exists then the code inside the CodeBlock
is written
to the file at code_file
in append mode.
Thus the effect of the filter is to take the code blocks from the literate program and write them in their own target files.
Lets now look at the various parts of the program.
Program Header
--- md-tangle.lua - Lua filter for pandoc to tangle code blocks into one or more
-- files.
--
-- license: MIT see LICENSE file
-- date: 21/03/2024
-- author: Abhishek Mishra
Module Declaration
The pandoc filter API expects a lua table to be returned from the program. The table should contain entries for each AST node type that the filter intends to process.
We define a table named tangle
which will have just one entry CodeBlock
by
the end of the program. tangle
will be returned to pandoc as the definition
of the filter module.
local tangle = {}
Read code_file
Attribute
As discussed earlier we have made one addition to the pandoc markdown format,
to support literate programming. Each code block which will be generated into
its own file must specify the output program file name in the fenced code block.
This output program file is specified as the value of a special attribute
code_file
of the fenced code block.
The function get_file_name
accepts a code_block
value as argument. This
code_block
is received by the CodeBlock
handler in our program. Therefore it
is a pandoc object which has an attributes
table.
The function retrieves the code_file
value and stores it in file_name
. If
there is no code_file
defined for the fenced block, then it is provided a
default file name.
The file_name
is returned to the caller.
local function get_file_name (code_block)
local file_name = code_block.attributes["code_file"]
if file_name == nil then
file_name = "Untitled.lua"
end
return file_name
end
File I/O
The program defines three functions to perform I/O to the output program file(s).
get_file
: Takes thecode_block
as argument, and gets thefull_path
of the file mentioned in the attributes of the fenced code blcok. The it opens a file in append node for the givenfull_path
. Bothfull_path
andfile
are returned to the caller.write_code_block
: This function takes acode_block
and afile
already opened to write it. It writes the content of thecode_block
followed by a newline in thefile
.close_file
: closes the givenfile
.
local function get_file (code_block)
local full_path = get_file_name(code_block)
local file = io.open(full_path, "a")
return full_path, file
end
local function write_code_block (code_block, file)
local code = code_block.text
file:write(code)
file:write("\n")
end
local function close_file (file)
file:close()
end
CodeBlock
AST Hook
The CodeBlock
function in the filter module will be called by pandoc when it
encounters a code block in the input markdown document. The only argument of
the function is code_block
which gets the text of the code written in the
fenced code block.
- We retrieve the
full_path
to thecode_block
, and the corresponding writablefile
object using theget_file
function defined above. - Then the program writes the
code_block
to thefile
using the functionwrite_code_block
. - Finally we close the
file
using theclose_file
function.
function tangle.CodeBlock (code_block)
local full_path, file = get_file(code_block)
print("Tangling code block at " .. full_path)
write_code_block(code_block, file)
close_file(file)
end
Module Export
Lastly, we export the module for use in pandoc.
return {
tangle
}
Future Plans
This is a fairly new program. As I use it in my daily programming workflow, I will make changes.
- Version History: All changes will be noted in the version history section at the top of the document.
- Bug Fixes: I've only uesd this to write a few programs, and therefore I'm sure there are several bugs lurking in the corners. They will be fixed, and the document updated accordingly.
- New Features: I see a few things which might be useful in the future.
- Out-of-order Code Blocks: Currently the program files are generated with content in the order it appears in fenced code blocks in the input document. This is restrictive, and sometimes one might want to introduce program sections in a different sequence. This is a possible future enhancement.
- Ignore Code Blocks: Some code blocks might just be examples or asides, and need not end up in the final program files. There should be a mechanism to ignore such code blocks.