#!/bin/sh
# **shocco** is a quick-and-dirty, literate-programming-style documentation
# generator written for and in __POSIX shell__. It borrows liberally from
# [Docco][do], the original Q&D literate-programming-style doc generator.
#
# `shocco(1)` reads shell scripts and produces annotated source documentation
# in HTML format. Comments are formatted with Markdown and presented
# alongside syntax-highlighted code so as to give an annotation effect. This
# page is the result of running `shocco` against [its own source file][sh].
#
# shocco is built with `make(1)` and installs under `/usr/local` by default:
#
#     git clone git://github.com/rtomayko/shocco.git
#     cd shocco
#     make
#     sudo make install
#     # or just copy 'shocco' wherever you need it
#
# Once installed, the `shocco` program can be used to generate documentation
# for a shell script:
#
#     shocco shocco.sh
#
# The generated HTML is written to `stdout`.
#
# [do]: http://jashkenas.github.com/docco/
# [sh]: https://github.com/rtomayko/shocco/blob/master/shocco.sh#commit

# Usage and Prerequisites
# -----------------------

# The most important line in any shell program.
set -e

# There are a lot of different ways to do usage messages in shell scripts.
# This is my favorite: you write the usage message in a comment --
# typically right after the shebang line -- *BUT*, use a special comment prefix
# like `#/` so that it's easy to pull these lines out.
#
# This also illustrates one of shocco's corner features. Only comment lines
# padded with a space are considered documentation. A `#` followed by any
# other character is considered code.
#
#/ Usage: shocco [-t <title>] [<source>]
#/ Create literate-programming-style documentation for shell scripts.
#/
#/ The shocco program reads a shell script from <source> and writes
#/ generated documentation in HTML format to stdout.
#/ When <source> is '-' or not specified, shocco reads from stdin.

# This is the second part of the usage message technique: `grep` yourself
# for the usage message comment prefix and then cut off the first few
# characters so that everything lines up.
expr -- "$*" : ".*--help" >/dev/null && {
    grep '^#/' <"$0" | cut -c4-
    exit 0
}

# A custom title may be specified with the `-t` option. We use the filename
# as the title if none is given.
test "$1" = '-t' && {
    title="$2"
    shift;shift
}

# Next argument should be the `<source>` file. Grab it, and use its basename
# as the title if none was given with the `-t` option.
file="$1"
: ${title:=$(basename "$file")}

# These are replaced with the full paths to real utilities by the
# configure/make system.
MARKDOWN='/usr/bin/markdown_py'
PYGMENTIZE='/usr/bin/pygmentize'

# On GNU systems, csplit doesn't elide empty files by default:
CSPLITARGS=$( (csplit --version 2>/dev/null | grep -i gnu >/dev/null) && echo "--elide-empty-files" || true )

# We're going to need a `markdown` command to run comments through. This can
# be [Gruber's `Markdown.pl`][md] (included in the shocco distribution) or
# [Discount][ds]'s super fast `markdown(1)` in C. Try to figure out if either
# is available and then bail if we can't find anything.
#
# The fallback is stored in `$MARKDOWN` because that variable, not a
# `markdown` alias, is what the formatting pipeline runs later on.
#
# [md]: http://daringfireball.net/projects/markdown/
# [ds]: http://www.pell.portland.or.us/~orc/Code/discount/
command -v "$MARKDOWN" >/dev/null || {
    if command -v Markdown.pl >/dev/null
    then MARKDOWN='Markdown.pl'
    elif test -f "$(dirname $0)/Markdown.pl"
    then MARKDOWN="perl $(dirname $0)/Markdown.pl"
    else echo "$(basename $0): markdown command not found." 1>&2
         exit 1
    fi
}

# Check that [Pygments][py] is installed for syntax highlighting.
#
# This is a fairly hefty prerequisite.
# Eventually, I'd like to fall back on a simple non-highlighting preformatter
# when Pygments isn't available. For now, just bail out if we can't find the
# `pygmentize` program.
#
# [py]: http://pygments.org/
command -v "$PYGMENTIZE" >/dev/null || {
    echo "$(basename $0): pygmentize command not found." 1>&2
    exit 1
}

# Work and Cleanup
# ----------------

# Make sure we have a `TMPDIR` set. The `:=` parameter expansion assigns
# the value if `TMPDIR` is unset or null.
: ${TMPDIR:=/tmp}

# Create a temporary directory for doing work. Use `mktemp(1)` if
# available; but, since `mktemp(1)` is not POSIX specified, fall back on naive
# (and insecure) temp dir generation using the program's basename and pid.
: ${WORK:=$(
      if command -v mktemp 1>/dev/null 2>&1
      then
          mktemp -d "$TMPDIR/$(basename $0).XXXXXXXXXX"
      else
          dir="$TMPDIR/$(basename $0).$$"
          mkdir "$dir"
          echo "$dir"
      fi
  )}

# We want to be absolutely sure we're not going to do something stupid like
# use `.` or `/` as a work dir. Better safe than sorry.
test -z "$WORK" -o "$WORK" = '/' && {
    echo "$(basename $0): could not create a temp work dir." 1>&2
    exit 1
}

# We're about to create a ton of shit under our `$WORK` directory. Register
# an `EXIT` trap that cleans everything up. This guarantees we don't leave
# anything hanging around unless we're killed with a `SIGKILL`.
trap 'rm -rf "$WORK"' 0

# Preformatting
# -------------
#
# Start out by applying some light preformatting to the `<source>` file to
# make the code and doc formatting phases a bit easier. The result of this
# pipeline is written to a temp file under the `$WORK` directory so we can
# take a few passes over it.

# Get a pipeline going with the `<source>` data.
# We write a single blank line at the end of the file to make sure we have an
# equal number of code/comment pairs.

# Folding.el support: turn {{{ folds }}} into titles -jrml
(cat "$file" \
  | sed -e 's/^# {{{/# #/' -e 's/^# }}}.*/# --------------/' \
  | awk '
  /function.*\(\) {$/ { print "# ### " $2; print $0; next }
  /\(\) {$/ { print "# ### " $1; print $0; next }
  {print $0}' \
  && printf "\n\n# \n\n") |

# We want the shebang line and any code preceding the first comment to
# appear as the first code block. This inverts the normal flow of things.
# Usually, we have comment text followed by code; in this case, we have
# code followed by comment text.
#
# Read the first code and docs headers and flip them so the first docs block
# comes before the first code block.
(
    lineno=0
    codebuf=;codehead=
    docsbuf=;docshead=
    while read -r line
    do
        # Issue a warning if the first line of the script is not a shebang
        # line. This can screw things up and wreck our attempt at
        # flip-flopping the two headings.
        lineno=$(( $lineno + 1 ))
        test $lineno = 1 && ! expr "$line" : "#!.*" >/dev/null &&
            echo "$(basename $0): ${file}:1 [warn] shebang! line missing." 1>&2

        # Accumulate comment lines into `$docsbuf` and code lines into
        # `$codebuf`. Only lines matching `/#(?: |$)/` are considered doc
        # lines.
        if expr "$line" : '# ' >/dev/null || test "$line" = "#"
        then docsbuf="$docsbuf$line
"
        else codebuf="$codebuf$line
"
        fi

        # If we have stuff in both `$docsbuf` and `$codebuf`, it means
        # we're at some kind of boundary. If `$codehead` isn't set, we're at
        # the first comment/doc line, so store the buffer to `$codehead` and
        # keep going. If `$codehead` *is* set, we've crossed into another code
        # block and are ready to output both blocks and then straight pipe
        # everything by `exec`'ing `cat`.
        if test -n "$docsbuf" -a -n "$codebuf"
        then
            if test -n "$codehead"
            then docshead="$docsbuf"
                 docsbuf=""
                 printf "%s" "$docshead"
                 printf "%s" "$codehead"
                 echo "$line"
                 exec cat
            else codehead="$codebuf"
                 codebuf=
            fi
        fi
    done

    # We made it to the end of the file without a single comment line, or
    # there was only a single comment block ending the file. Output our
    # docsbuf or a fake comment and then the codebuf or codehead.
    echo "${docsbuf:-#}"
    echo "${codebuf:-"$codehead"}"
) |

# Remove comment leader text from all comment lines. Then prefix all
# comment lines with "DOCS" and interpreted / code lines with "CODE".
# The stream text might look like this after moving through the `sed`
# filters:
#
#     CODE #!/bin/sh
#     CODE #/ Usage: shocco <file>
#     DOCS Docco for and in POSIX shell.
#     CODE
#     CODE PATH="/bin:/usr/bin"
#     CODE
#     DOCS Start by numbering all lines in the input file...
#     ...
#
# Once we pass through `sed`, save this off in our work directory so
# we can take a few passes over it.
sed -n '
    s/^/:/
    s/^:[ ]\{0,\}# /DOCS /p
    s/^:[ ]\{0,\}#$/DOCS /p
    s/^:/CODE /p
' > "$WORK/raw"

# Now that we've read and formatted our input file for further parsing,
# change into the work directory. The program will finish up in there.
cd "$WORK"

# First Pass: Comment Formatting
# ------------------------------

# Start a pipeline going on our preformatted input.
# Replace all CODE lines with entirely blank lines. We're not interested
# in code right now, other than knowing where comments end and code begins
# and where code ends and comments begin.
sed 's/^CODE.*//' < raw |

# Now squeeze multiple blank lines into a single blank line.
#
# __TODO:__ `cat -s` is not POSIX and doesn't squeeze lines on BSD.
# Use the sed line squeezing code mentioned in the POSIX `cat(1)` manual page
# instead.
cat -s |

# At this point in the pipeline, our stream text looks something like this:
#
#     DOCS Now that we've read and formatted ...
#     DOCS change into the work directory. The rest ...
#     DOCS in there.
#
#     DOCS First Pass: Comment Formatting
#     DOCS ------------------------------
#
# Blank lines represent code segments. We want to replace all blank lines
# with a dividing marker and remove the "DOCS" prefix from docs lines.
sed '
    s/^$/##### DIVIDER/
    s/^DOCS //' |

# The current stream text is suitable for input to `markdown(1)`. It takes
# our doc text with embedded `DIVIDER`s and outputs HTML.
$MARKDOWN |

# Now this is where shit starts to get a little crazy. We use `csplit(1)` to
# split the HTML into a bunch of individual files. The files are named
# as `docs0000`, `docs0001`, `docs0002`, ... Each file includes a single
# doc *section*. These files will sit here while we take a similar pass over
# the source code.
(
    csplit -sk \
           $CSPLITARGS \
           -f docs \
           -n 4 \
           - '/<h5>DIVIDER<\/h5>/' '{9999}' \
           2>/dev/null ||
    true
)


# Second Pass: Code Formatting
# ----------------------------
#
# This is exactly like the first pass but we're focusing on code instead of
# comments. We use the same basic technique to separate the two and isolate
# the code blocks.

# Get another pipeline going on our preformatted input file.
# Replace DOCS lines with blank lines.
sed 's/^DOCS.*//' < raw |

# Squeeze multiple blank lines into a single blank line.
cat -s |

# Replace blank lines with a `DIVIDER` marker and remove prefix
# from `CODE` lines.
sed '
    s/^$/# DIVIDER/
    s/^CODE //' |

# Now pass the code through `pygmentize` for syntax highlighting.
# We tell it that the input is `sh` and that we want HTML output.
$PYGMENTIZE -l sh -f html -O encoding=utf8 |

# Post filter the pygments output to remove partial `<pre>` blocks. We add
# these back in at each section when we build the output document.
sed '
    s/<div class="highlight"><pre>//
    s/^<\/pre><\/div>//' |

# Again with the `csplit(1)`. Each code section is written to a separate
# file, this time named `codeXXXX`. There should be the same number
# of `codeXXXX` files as there are `docsXXXX` files.
(
    DIVIDER='/<span class="c"># DIVIDER</span>/'
    csplit -sk \
           $CSPLITARGS \
           -f code \
           -n 4 - \
           "$DIVIDER" '{9999}' \
           2>/dev/null ||
    true
)

# At this point, we have separate files for each docs section and separate
# files for each code section.

# HTML Template
# -------------

# Create a function for applying the standard [Docco][do] HTML layout, using
# [jashkenas][ja]'s gorgeous CSS for styles. Wrapping the layout in a function
# lets us apply it elsewhere simply by piping in a body.
#
# [ja]: http://github.com/jashkenas/
# [do]: http://jashkenas.github.com/docco/
layout () {
    cat <<HTML
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv='content-type' content='text/html;charset=utf-8'>
    <title>$1</title>
    <link rel=stylesheet href="docco.css">
    <link rel=stylesheet href="style.css">
    <link rel=stylesheet href="public/stylesheets/normalize.css">
</head>
<body>
<div id=container>
    <div id=background></div>
    <table cellspacing=10 cellpadding=10>
    <thead>
        <tr>
            <th class=docs><h1>$1</h1></th>
            <th class=code></th>
        </tr>
    </thead>
    <tbody>
        <tr><td class='docs'>$(cat)</td><td class='code'></td></tr>
    </tbody>
    </table>
</div>
</body>
</html>
HTML
}

# Recombining
# -----------

# Alright, we have separate files for each docs section and separate
# files for each code section. We've defined a function to wrap the
# results in the standard layout. All that's left to do now is put
# everything back together.

# Before starting the pipeline, decide the order in which to present the
# files. If `code0000` is empty, it should appear first so the remaining
# files are presented `docs0000`, `code0001`, `docs0001`, and so on. If
# `code0000` is not empty, `docs0000` should appear first so the files
# are presented `docs0000`, `code0000`, `docs0001`, `code0001` and so on.
#
# Ultimately, this means that if `code0000` is empty, the `-r` option
# should not be provided with the final `-k` option group to `sort`(1) in
# the pipeline below.
if stat -c"%s" /dev/null >/dev/null 2>/dev/null ; then
    # GNU stat
    [ "$(stat -c"%s" "code0000")" = 0 ] && sortopt="" || sortopt="r"
else
    # BSD stat
    [ "$(stat -f"%z" "code0000")" = 0 ] && sortopt="" || sortopt="r"
fi

# Start the pipeline with a simple list of split out temp filenames. One file
# per line.
ls -1 docs[0-9]* code[0-9]* 2>/dev/null |

# Now sort the list of files by the *number* first and then by the type. The
# list will look something like this when `sort(1)` is done with it:
#
#     docs0000
#     code0000
#     docs0001
#     code0001
#     docs0002
#     code0002
#     ...
#
sort -n -k"1.5" -k"1.1$sortopt" |

# And if we pass those files to `cat(1)` in that order, it concatenates them
# in exactly the way we need. `xargs(1)` reads from `stdin` and passes each
# line of input as a separate argument to the program given.
#
# We could also have written this as:
#
#     cat $(ls -1 docs* code* | sort -n -k1.5 -k1.1r)
#
# I like to keep things to a simple flat pipeline when possible, hence the
# `xargs` approach.
xargs cat |


# Run a quick substitution on the embedded dividers to turn them into table
# rows and cells. This also wraps each code block in a `<div class=highlight>`
# so that the CSS kicks in properly.
{
    DOCSDIVIDER='<h5>DIVIDER</h5>'
    DOCSREPLACE='</pre></div></td></tr><tr><td class=docs>'
    CODEDIVIDER='<span class="c"># DIVIDER</span>'
    CODEREPLACE='</td><td class=code><div class=highlight><pre>'
    sed "
        s@${DOCSDIVIDER}@${DOCSREPLACE}@
        s@${CODEDIVIDER}@${CODEREPLACE}@
    "
} |

# Pipe our recombined HTML into the layout and let it write the result to
# `stdout`.
layout "$title"

# More
# ----
#
# **shocco** is the third tool in a growing family of quick-and-dirty,
# literate-programming-style documentation generators:
#
#   * [Docco][do] - The original. Written in CoffeeScript and generates
#     documentation for CoffeeScript, JavaScript, and Ruby.
#   * [Rocco][ro] - A port of Docco to Ruby.
#
# If you like this sort of thing, you may also find Knuth's massive body of
# work on literate programming interesting:
#
#   * [Knuth: Literate Programming][kn]
#   * [Literate Programming on Wikipedia][wi]
#
# [ro]: http://rtomayko.github.com/rocco/
# [do]: http://jashkenas.github.com/docco/
# [kn]: http://www-cs-faculty.stanford.edu/~knuth/lp.html
# [wi]: http://en.wikipedia.org/wiki/Literate_programming

# Copyright (C) [Ryan Tomayko <tomayko.com/about>](http://tomayko.com/about)<br>
# This is Free Software distributed under the MIT license.
: