UNIX shells provide easy access to UNIX functionality such as pipes, signals, file descriptor manipulation, and the file system. Caml-Shcaml hopes to excel at these same tasks.
Shcaml has many modules; these are the ones we think you're most likely
to need. Nearly all modules in the system are submodules of the
Shcaml module.
UsrBin: High-level user utilities.
Adaptor: Record readers and splitters for a variety of file formats.
Fitting: Fittings represent processes, internal or external, that produce, consume, or transform data.
Flags: Quick and dirty argument processing.
Line: Structured records for line-oriented data.
Reader: Readers are responsible for breaking input data into records.
Channel: Generalized channels and file descriptor manipulation.
Proc: An Ocaml abstraction for UNIX processes.
Caml-Shcaml requires findlib and the pcre package (as well as the camlp4 and unix packages, which are provided by Ocaml and findlib).
To build and install:
% gunzip shcaml-VERSION.tar.gz
% tar xf shcaml-VERSION.tar
% cd shcaml-VERSION
% ./configure
% make
% make install
If your findlib is installed as root, you may need to "sudo make install".
Shcaml should now be installed. Try the following:
% ocaml
# #use "topfind";;
...
# #camlp4o;;
...
# #require "shcaml";;
/home/alec/.godi/lib/ocaml/std-lib/camlp4: added to search path
/home/alec/.godi/lib/ocaml/std-lib/unix.cma: loaded
/home/alec/.godi/lib/ocaml/pkg-lib/pcre: added to search path
/home/alec/.godi/lib/ocaml/pkg-lib/pcre/pcre.cma: loaded
/home/alec/.godi/lib/ocaml/site-lib/shcaml: added to search path
/home/alec/.godi/lib/ocaml/site-lib/shcaml/shcaml.cmo: loaded
/home/alec/.godi/lib/ocaml/site-lib/shcaml/shtop.cmo: loaded
/home/alec/.godi/lib/ocaml/site-lib/shcaml/shtopInit.cmo: loaded
Caml-Shcaml version 0.1.1 (Shmooz)
# let processes = LineShtream.string_list_of ^$
run_source (ps () -| cut Line.Ps.command);;
val processes : string list ...
If all has gone well, you should have a list of all the process invocations (whatever's in the "COMMAND" field when you call ps auxww) currently running on your system.
This manual is more a tutorial than a straight-ahead reference
manual. The API is (hopefully!) completely documented, so for
specific information on any particular bit of the library, check
there. This document is here to demonstrate some of the concepts and
features of Shcaml.
Shcaml is composed of several major components that are the building blocks of the library. Let's start out by examining a few of them.
Follow the instructions above in the "Getting Started" section to get
Shcaml installed and running. We'll work in the toploop, with Shcaml
loaded. So, run
# #use "topfind";;
# #require "shcaml";;
Line.t represents structured data that might be found in a
file or in the output of a command. A line might represent a record
from the passwd file, or the output of ps. Let's make one:
# let hello = Line.line "hello world, I'm a line!";;
val hello : Shcaml.Line.empty Shcaml.Line.t = <line:"hello world, I'm a line!">
I know it looks like hello has our greeting in it, but at the moment we have an empty line. What gives? Well, all lines are constructed from a raw string, in this case "hello world, I'm a line!". But that doesn't actually tell us any useful information about what kind of data is in that string. Let's suppose that hello were a line that came from a comma-delimited file. Then we would want to think of it as delimited input, rather than simply a string. Lines represent delimited input simply as an array of strings. Let's turn our empty line into a more structured piece of data. We'll use Pcre.asplit to turn the string into an array.
# let hello_delim = Line.Delim.create
(Pcre.asplit ~pat:", " (Line.show hello)) hello;;
val hello_delim : <| delim : <| > > Shcaml.Line.t = <line:"hello world, I'm a line!">
Okay, that's not the type it really prints; what it really prints is something like this:
< delim : < names : Shcaml.Line.absent; options : Shcaml.Line.absent >; fstab : Shcaml.Line.absent; group : Shcaml.Line.absent; key_value : Shcaml.Line.absent; mailcap : Shcaml.Line.absent; passwd : Shcaml.Line.absent; ps : Shcaml.Line.absent; seq : Shcaml.Line.absent; source : Shcaml.Line.absent; stat : Shcaml.Line.absent > Shcaml.Line.t
That's pretty messy, so in this manual, we use an abbreviated syntax that we'll explain below. But before explaining it, let's just check and make sure you got what I promised you. Try this:
# Line.Delim.fields hello_delim;;
- : string array = [|"hello world"; "I'm a line!"|]
Now that you know my word is good, let's figure out what that big ol' type we got back for hello_delim means. If you're a Real Functional Programmer, you might be disappointed to see that it appears that we suddenly have an object type. Don't worry, the only object you might actually use in Shcaml is in Flags, and you might even like that one. (As it turns out, there's no actual object constructed in the implementation of Line, but that's a technical detail.) If you look more closely, you'll notice that the type of hello_delim tells us that it has the delim field present, and all other fields absent. This is an extremely powerful thing. Consider: hello does not have delim : Shcaml.Line.present in its type. What would happen if we tried to get the delimited fields of hello?
# Line.Delim.fields hello;;
Characters 18-23:
  Line.Delim.fields hello;;
                    ^^^^^
This expression has type Shcaml.Line.empty Shcaml.Line.t but is here used with type
  (< delim : < .. > as 'b; .. > as 'a) Shcaml.Line.t
Type Shcaml.Line.empty = < delim : Shcaml.Line.absent; fstab : Shcaml.Line.absent; group : Shcaml.Line.absent; key_value : Shcaml.Line.absent; mailcap : Shcaml.Line.absent; passwd : Shcaml.Line.absent; ps : Shcaml.Line.absent; seq : Shcaml.Line.absent; source : Shcaml.Line.absent; stat : Shcaml.Line.absent >
is not compatible with type 'a
Type Shcaml.Line.absent = [> `Phantom ] is not compatible with type 'b
Types for method delim are incompatible
So we get a type error, because hello does not contain a delim field. (Never mind those `Phantoms; they're just there to scare you.) The type of a line tells you what data it has. This is one of the ways in which Shcaml helps alleviate many problems in shell scripting. A Shcaml pipeline that expects to receive delimited lines cannot be run on lines that don't have delimited data. Code that passes bad data along simply won't compile.
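The same idea can be sketched in plain Ocaml without Shcaml. What follows is only a minimal illustration, not Shcaml's actual definitions: the names line, with_delim, fields, present, and absent are hypothetical stand-ins, and only a single delim field is modelled. The type parameter is a phantom object type that records what the line carries, so asking for the fields of a line that has no delimited data is rejected by the type checker.

```ocaml
(* Hypothetical miniature of the phantom-type idea; NOT Shcaml's real code. *)
type absent = [ `Absent ]
type present = [ `Present ]

(* The parameter 'a never appears in the record body: it is a phantom
   that only tracks, at the type level, which data the line carries. *)
type 'a line = { raw : string; delim : string array option }

type empty = < delim : absent >

(* A fresh line carries no delimited data. *)
let line (s : string) : empty line = { raw = s; delim = None }

(* Attaching fields changes the phantom from absent to present. *)
let with_delim (sep : char) (ln : < delim : absent > line)
    : < delim : present > line =
  { raw = ln.raw;
    delim = Some (Array.of_list (String.split_on_char sep ln.raw)) }

(* Only lines whose type says "delim is present" are accepted. *)
let fields (ln : < delim : present; .. > line) : string array =
  match ln.delim with
  | Some a -> a
  | None -> [||] (* not reachable for lines built with the functions above *)

let hello = line "hello world,I'm a line!"
let hello_delim = with_delim ',' hello
let () = assert (fields hello_delim = [| "hello world"; "I'm a line!" |])
(* fields hello  would be a compile-time type error: absent vs. present *)
```

Because the check happens during type checking, the mistake never reaches a running pipeline.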
The type parameter to
Line.t specifies which fields are present in a
given line. The type as printed by Ocaml is rather ghastly, because
it explicitly mentions all the fields that are absent. We'd rather
only think about what's present in the line, so we use the abbreviated
syntax from above (and throughout the rest of the manual) that does
this. Shcaml includes a camlp4 extension that parses this syntax.
Findlib will load this extension when you compile a file, or in the
toploop when you
#require "shcaml", if camlp4 is already loaded.
Now, suppose we wanted to uppercase the strings in the delim field:
# let hello_DELIM =
(Array.map String.uppercase (Line.Delim.fields hello_delim))
val hello_DELIM : <| delim : <| > > Shcaml.Line.t = <line:"hello world, I'm a line!">
# Line.Delim.fields hello_DELIM;;
- : string array = [|"HELLO WORLD"; "I'M A LINE!"|]
Hm, that was fun! I think I want to do it again and again. So let's define a function that will do it for us:
# let uppercase_delims ln =
(Array.map String.uppercase (Line.Delim.fields ln))
val uppercase_delims : (< delim : < .. >; .. > as 'a) Shcaml.Line.t -> 'a Shcaml.Line.t = <fun>
Whoa! Another funny type. But a moment's reflection shows that it's exactly the type we might have wanted. It says that uppercase_delims takes a line with a delim field (and maybe other stuff) and produces a line with the same type. But since uppercase_delims only cares about delimited data, it passes any other information stored in the line through unchanged. We don't know what other fields might be in the line, but we do know that when uppercase_delims does its thing, the line that came in has the same group data when it comes out (note the 'a in the result type).
We've seen how lines can have generic delimited data attached. Lines
can also have passwd data, data from ps, data representing
key-value pairs, a record of their provenance (source), and several
others. Functions for manipulating this data will often appear in
a submodule of Line, for instance, Line.Passwd. Let's try
another example, creating a line with data from the password file in
it. (Don't worry, this is all built in, but we want to walk you through it.
It builds character.) We'll start by making a delimited list of the fields:
# let root = Line.line "root:x:0:0:Enoch Root:/root:/bin/shcaml";;
val root : Shcaml.Line.empty Shcaml.Line.t = <line:"root:x:0:0:Enoch Root:/root:/bin/shcaml">
# let root_delim = Line.Delim.create
(Pcre.asplit ~pat:":" (Line.show root)) root;;
val root_delim : <| delim : <| > > Shcaml.Line.t = <line:"root:x:0:0:Enoch Root:/root:/bin/shcaml">
Then, we'll make a function that takes lines with delimited data to lines with passwd data as well.
# let passwd_of_delim ln =
match Line.Delim.fields ln with
| [|name;passwd;uid;gid;gecos;home;shell|] ->
~name ~passwd ~gecos ~home ~shell
~uid:(int_of_string uid) ~gid:(int_of_string gid)
| _ -> Shtream.warn "Line didn't have 7 fields";;
val passwd_of_delim : <| delim : < .. > as 'a; .. as 'b > Shcaml.Line.t -> <| delim : 'a; passwd : Shcaml.Line.present; .. as 'b > Shcaml.Line.t = <fun>
Inspecting the types yet again, we're pretty happy. Our function takes a line with a delim field, and returns one with not just a delim field, but also a passwd field. (Shtream.warn will be discussed below.) Let's try it out:
# let root_pw = passwd_of_delim root_delim;;
val root_pw : < delim : <| >; passwd : Shcaml.Line.present > Shcaml.Line.t = <line:"root:x:0:0:Enoch Root:/root:/bin/shcaml">
# Line.Passwd.uid root_pw;;
- : int = 0
You may have noticed that when we get the string a line was made out of, we use Line.show. You can call show on any line, and it will return a string representation of that line. That does not necessarily mean it will print out the exact value with which the line was created. In fact, you can change what show returns with Line.select. Suppose that we wanted people to only see a username when they tried to show the line:
# let root_un = Line.select Line.Passwd.name root_pw;;
val root_un : < delim : <| >; passwd : Shcaml.Line.present > Shcaml.Line.t = <line:"root">
# Line.show root_un;;
- : string = "root"
# Line.show root_pw;;
- : string = "root:x:0:0:Enoch Root:/root:/bin/shcaml"
Using Line.select becomes extremely important when we start working with external processes (that is, running UNIX programs from Ocaml). When a line is to be piped into some external process, Shcaml calls show on it and sends the resulting string along. Thus, when it's important, you can change how your data is rendered when it goes to UNIX.
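One way to picture the relationship between show and select is the toy model below. It is an assumption-laden sketch in plain Ocaml, not how Shcaml implements lines: the record type and the functions line, show, select, and name are hypothetical stand-ins (name plays the role of Line.Passwd.name by taking the first colon-separated field).

```ocaml
(* Hypothetical model: a line keeps its raw text plus the text show returns. *)
type line = { raw : string; shown : string }

(* A new line shows its raw text. *)
let line (s : string) : line = { raw = s; shown = s }

let show (ln : line) : string = ln.shown

(* select replaces the shown text with whatever the projection computes;
   the raw text is left untouched. *)
let select (project : line -> string) (ln : line) : line =
  { ln with shown = project ln }

(* Stand-in for Line.Passwd.name: first colon-separated field of the raw text. *)
let name (ln : line) : string =
  match String.split_on_char ':' ln.raw with
  | first :: _ -> first
  | [] -> ""

let root_pw = line "root:x:0:0:Enoch Root:/root:/bin/shcaml"
let root_un = select name root_pw

let () = assert (show root_un = "root")
let () = assert (show root_pw = "root:x:0:0:Enoch Root:/root:/bin/shcaml")
```

In this model select returns a new line and leaves the original alone, which matches the transcript above where root_pw still shows the full record.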
Shtreams are similar in intent and operation to Ocaml Streams,
but unlike a Stream, Shtreams have an 'h'. Additionally, shtreams
know about Ocaml channels; any shtream may be turned into an Ocaml
in_channel, and vice-versa. Shtreams have a richer interface than
streams, which may be explored in the API. Let's try to make a
shtream:
# let stdin_shtream = Shtream.of_channel input_line stdin;;
val stdin_shtream : string Shcaml.Shtream.t = <abstr>
# Shtream.next stdin_shtream;;
hello, there. (you type this)
- : string = " hello, there. (you type this)"
Here, we create a shtream from the stdin channel with Shtream.of_channel. The first argument is a reader function, that is, a function that tells the shtream how to produce a value from the channel. In this example, stdin_shtream reads data a line at a time. When we call Shtream.next on stdin_shtream, it tries to produce another value, causing input_line to be called on the in_channel with which the shtream was created.
We can turn our shtream into an in_channel again with
Shtream.channel_of:
# let newstdin = Shtream.channel_of print_endline stdin_shtream;;
val newstdin : in_channel = <in_channel:4>
# input_line newstdin;;
- : string = " Hi again!"
To turn the shtream back into an in_channel, we needed to give it a writer function, here print_endline. The writer function should take values in the shtream and print them on stdout. (Bear in mind, shtreams need not contain strings, so a writer function for an 'a Shtream.t has type 'a -> unit.)
Shtreams can be generated programmatically from a function. For
instance, we could write a shtream that acted like the UNIX program
yes(1), which prints a string to stdout until it's killed. Our
version will be a function that takes a string and creates a shtream
that generates that string over and over again.
As with Stream.from for standard library streams, the shtream constructor
takes a function of type int -> 'a option.
That function is called
with successive integers starting from 0, and is expected to return
Some value, meaning the next value in the shtream, or None,
indicating that there is no more data to read from the shtream.
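The calling convention just described can be sketched with nothing but the standard library. This is a hypothetical stand-in, not Shcaml's implementation: the function from below and the pull-style unit -> 'a option source it returns are assumptions made for illustration.

```ocaml
(* Hypothetical sketch: turn a counter-indexed generator into a pull-style
   source. Each call to the returned function asks for the next element. *)
let from (f : int -> 'a option) : unit -> 'a option =
  let count = ref 0 in
  let finished = ref false in
  fun () ->
    if !finished then None
    else
      match f !count with
      | Some _ as v ->
          (* The generator produced a value; advance the counter. *)
          incr count;
          v
      | None ->
          (* The generator is exhausted; stay exhausted from now on. *)
          finished := true;
          None

(* A finite source: squares of 0, 1, 2, then end of data. *)
let next = from (fun n -> if n < 3 then Some (n * n) else None)

let () = assert (next () = Some 0)
let () = assert (next () = Some 1)
let () = assert (next () = Some 4)
let () = assert (next () = None)
```

The generator sees 0, 1, 2, ... in order, so it can use its argument as a position, which is exactly what the yes example does next.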
To demonstrate that the generating function is called for each element,
we'll include the argument to the function in each element.
# let yes s =
let builder n = Some (Printf.sprintf "%d: %s" n s) in
val yes : string -> string Shcaml.Shtream.t = <fun>
# let yes_shtr = yes "yes";;
val yes_shtr : string Shcaml.Shtream.t = <abstr>
# Shtream.next yes_shtr;;
- : string = "0: yes"
# Shtream.next yes_shtr;;
- : string = "1: yes"
# Shtream.next yes_shtr;;
- : string = "2: yes"
# Shtream.next yes_shtr;;
- : string = "3: yes"
# Shtream.next yes_shtr;;
- : string = "4: yes"
We can, of course, create a channel from this shtream, as well.
# let yes_chan = Shtream.channel_of print_endline yes_shtr;;
val yes_chan : in_channel = <in_channel:3>
# input_line yes_chan;;
- : string = "5: yes"
# input_line yes_chan;;
- : string = "6: yes"
# Channel.close_in yes_chan;;
- : unit = ()
What we've demonstrated here is a small portion of the functionality of shtreams, but it's enough to give you an idea of how they work. Many more facilities for creating, observing, and manipulating shtreams are described in the Shtream API documentation. However, from the perspective of Shcaml, shtreams are relatively low-level constructs. In addition to extending Streams, Shcaml provides extensions to standard Ocaml channels in a module called Channel, and an abstraction of processes (UNIX programs you run from Shcaml) in Proc. Lines and shtreams combine their powers in Fittings, which we discuss next.
Fittings provide an embedded process control notation. That's a fancy way of saying that we did our best to create some functions that make it look (kinda, sorta) like you're writing snippets of shell scripts in your Ocaml. Let's try a simple one:
# run (command "echo a fitting!");;
a fitting!
- : Shcaml.Proc.status = Unix.WEXITED 0
We've run the command "echo a fitting!". We can see "a fitting!" printed, and that it finished successfully (Unix.WEXITED 0). When a command doesn't exit successfully, we see that too:
# run (command "false");;
- : Shcaml.Proc.status = Unix.WEXITED 1
Let's look a little more closely at that. There are two things happening. First, we construct a fitting with command "false". There are several different ways to create fittings: Fitting.command takes a string that will be run in the shell (e.g., command "foo bar baz" is like sh -c "foo bar baz"). However, the fitting is not actually executed until we call Fitting.run on it. For example,
# let goodbye = command "echo goodbye from unix" in
print_endline "hello from caml";
run goodbye;;
hello from caml
goodbye from unix
- : Shcaml.Proc.status = Unix.WEXITED 0
Notice that the "hello from caml" appeared before the "goodbye from unix". There are several kinds of "runners". The one we've seen, run, executes a fitting with stdin as its input and stdout as its output. The type of run is (Shcaml.Fitting.text -> 'a Shcaml.Fitting.elem) Shcaml.Fitting.t -> Shcaml.Proc.status. In general, an ('a -> 'b) Shcaml.Fitting.t is a thing that consumes a sequence of 'as and produces a sequence of 'bs. The type Fitting.text indicates data coming in over a channel; the type 'a Shcaml.Fitting.elem indicates generic data that can be sent over a channel. There are several kinds of fitting constructors provided in the Fitting module. Let's look at a few of them. All of the following print the /etc/passwd file to the standard output (we'll elide the output here to save space):
# run (command "cat /etc/passwd");;
# run (from_file "/etc/passwd");;
# run (from_gen (`Filename "/etc/passwd"));;
...
Rather than send the output from a fitting to stdout, we can get it as a shtream:
# let passwd = run_source (from_file "/etc/passwd");;
val passwd : Shcaml.Fitting.text Shcaml.Fitting.shtream = <abstr>
# Shtream.next passwd;;
- : Shcaml.Fitting.text = <line:"root:x:0:0:root:/root:/bin/bash">
# Shtream.next passwd;;
- : Shcaml.Fitting.text = <line:"daemon:x:1:1:daemon:/usr/sbin:/bin/sh">
What good is that, you may ask? Well, now that we have a shtream of lines, we can start applying some of our line functions to them. Here's one that we provide for parsing passwd files (these sorts of functions are provided by the Adaptor module):
# let pw_shtream = run_source
(from_file "/etc/passwd" -| Adaptor.Passwd.fitting ());;
val pw_shtream : <| passwd : Shcaml.Line.present; seq : Shcaml.Line.present; source : Shcaml.Line.present > Shcaml.Line.t Shcaml.Fitting.shtream = <abstr>
# Shtream.next pw_shtream;;
- : <| passwd : Shcaml.Line.present; seq : Shcaml.Line.present; source : Shcaml.Line.present > Shcaml.Line.t = <line:"root:x:0:0:root:/root:/bin/bash">
Now we have a shtream that has (take a careful look at those types) lines with passwd data in them. (They also have source, which tells you where data came from, and seq, which tells you its line number in the source.)
Can you guess what the
(-|) operator does? That's
right, it's a pipe! (The
| character is pretty meaningful in Ocaml
programs, as are most other shell operators, so we have decorated them
a little bit to give them the right precedence and to keep them from
clashing with other Ocaml syntax.)
The type of (-|) will help us understand fittings a whole lot better:
# (-|);;
- : ('a -> 'b) Shcaml.Fitting.t -> ('b -> 'c) Shcaml.Fitting.t -> ('a -> 'c) Shcaml.Fitting.t = <fun>
Typically, in the shell, when we want to pipe two processes together (foo | bar), we think of bar as a program that takes whatever kind of output foo produces and then generates its own output. In Shcaml, we think the same way. The type of a fitting tells us what kind of data it accepts as input and generates as output. An ('a -> 'b) Shcaml.Fitting.t takes values of type 'a as input and outputs values of type 'b. So of course, you can only pipe together two fittings if the first one produces data the second one consumes. So if the first fitting given to (-|) consumes 'as and outputs 'bs, then the second must consume 'bs, and output 'cs. When you put them together, then, you'll get a new fitting that reads 'as, runs them through the first fitting and then into the second, and then produces the output of the second, 'cs. That is, we get an ('a -> 'c) Shcaml.Fitting.t.
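To see why the types line up this way, here is a small model in plain Ocaml. It assumes, purely for illustration, that a fitting is nothing more than a function between lazy sequences (Seq.t); Shcaml's real fittings also manage processes and channels, which this sketch ignores. The type fitting and the values lengths, evens, and pipeline are hypothetical.

```ocaml
(* Assumption for illustration only: a fitting is modelled as a function
   from a lazy sequence of inputs to a lazy sequence of outputs. *)
type ('a, 'b) fitting = 'a Seq.t -> 'b Seq.t

(* Piping feeds the output of the first fitting into the second, so the
   middle type 'b must agree and disappears from the result type. *)
let ( -| ) (f : ('a, 'b) fitting) (g : ('b, 'c) fitting) : ('a, 'c) fitting =
  fun input -> g (f input)

(* string -> int: replace each string by its length. *)
let lengths : (string, int) fitting = Seq.map String.length

(* int -> int: keep only even numbers. *)
let evens : (int, int) fitting = Seq.filter (fun n -> n mod 2 = 0)

(* Composes because lengths produces ints and evens consumes ints. *)
let pipeline : (string, int) fitting = lengths -| evens

let result = List.of_seq (pipeline (List.to_seq [ "ab"; "abc"; "abcd" ]))
let () = assert (result = [ 2; 4 ])
```

Writing evens -| lengths instead would be rejected by the type checker, since evens produces ints while lengths consumes strings.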
Fittings provide a general mechanism to pipe together data like this.
But they also know a whole lot about UNIX, and make it very easy to
intermix calls to the shell with Ocaml code. Let's use the system's
sort command and our built-in uniq function (provided as a fitting in
UsrBin) to get a list of the different shells
that are in use on the system.
# let shells = LineShtream.string_list_of
(run_source (from_file "/etc/passwd"
-| Adaptor.Passwd.fitting ()
-| cut Line.Passwd.shell
-| command "sort"
-| uniq ()));;
val shells : string list = ["/bin/bash"; "/bin/false"; "/bin/sh"; "/bin/sync"; "/bin/zsh"; "/usr/lib/nx/nxserver"; "/usr/sbin/nologin"]
Your results may differ, of course; on the box this manual is currently being written on, it appears that nobody uses C Shell. That pipeline is longer than the ones we've seen, but the only new material is UsrBin.cut, which takes a function of type ('a Shcaml.Line.t -> string) and produces an ('a Shcaml.Line.t -> 'a Shcaml.Line.t) Shcaml.Fitting.t. It's like Line.select for fittings. We start the pipeline off with from_file "/etc/passwd", which will generate a shtream of the lines out of the passwd file. Then we adapt the shtream into a shtream with passwd data attached (Adaptor.Passwd.fitting ()). Next, we want to make our lines appear to the outside world not as the full string read out of the passwd file, but rather just the shell field. So we call UsrBin.cut to select the Line.Passwd.shell field as the show text for each line. That way, when the lines get passed to the external sort command, it just sees the shell field, and not the whole passwd record. Then we use our internal UsrBin.uniq to remove duplicates. Because we pass our fitting to run_source, it generates a shtream, upon which we may finally call LineShtream.string_list_of. But the code is much easier to understand than the prose, isn't it?
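For comparison, the same computation can be written over an in-memory list using only the Ocaml standard library. This is not a substitute for the fitting pipeline (it runs no external sort and reads no file); it is a hedged sketch of the data transformation alone, and the helper names shell_of_passwd and distinct_shells, as well as the sample entries, are made up for the example.

```ocaml
(* Extract the shell (7th colon-separated field) from one passwd entry.
   Entries that do not have exactly 7 fields are skipped. *)
let shell_of_passwd (entry : string) : string option =
  match String.split_on_char ':' entry with
  | [ _name; _passwd; _uid; _gid; _gecos; _home; shell ] -> Some shell
  | _ -> None

(* Select the shell of every well-formed entry, then sort and drop
   duplicates, which is what "sort" followed by "uniq" achieves. *)
let distinct_shells (entries : string list) : string list =
  entries
  |> List.filter_map shell_of_passwd
  |> List.sort_uniq String.compare

(* Made-up sample data standing in for the contents of a passwd file. *)
let sample =
  [ "root:x:0:0:root:/root:/bin/bash";
    "daemon:x:1:1:daemon:/usr/sbin:/bin/sh";
    "alec:x:1000:1000:Alec:/home/alec:/bin/bash";
    "not a passwd entry" ]

let () = assert (distinct_shells sample = [ "/bin/bash"; "/bin/sh" ])
```

The fitting version earns its keep when one of the stages is an external program or the input is too large to hold in a list.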
In addition to pipes, Shcaml provides analogues to the shell's
sequencing operators, such as ;. Take a bit of structured playtime
and poke around with them. They're described in the API documentation.
A difference between fittings and UNIX pipelines is that fittings only
have one input and one output, while UNIX processes may read or write
on many different file descriptors (for instance,
stderr). Shcaml provides facilities for sophisticated I/O
redirection. Let's start by taking a look at how redirection is
specified. A dup_spec is a list of instructions for how I/O redirection should
be done for a given fitting. There are a great many operators in
Channel.Dup for specifying different sorts of
interconnections. Here are several examples, each of which
redirects the standard output to /dev/null:
# run (command "echo hello" />/ [ stdout />* `Null ]);;
- : Shcaml.Proc.status = Unix.WEXITED 0
# run (command "echo hello" />/ [ 1 %>* `Filename "/dev/null" ]);;
- : Shcaml.Proc.status = Unix.WEXITED 0
# run (command "echo hello" />/ [ `OutFd 1 *>& `Null ]);;
- : Shcaml.Proc.status = Unix.WEXITED 0
# run (command "echo hello" />/ [ `OutChannel stdout *>& `Null ]);;
- : Shcaml.Proc.status = Unix.WEXITED 0
Why so many ways to say nothing at all? Well, there are a few different kinds of places you can send data (not all of them /dev/null), and several different names for the same places. For instance, standard output can be named as stdout, as file descriptor 1, or as `OutChannel stdout. Shcaml provides operators for dealing with each of these cases. (Channel.gen_channels are Shcaml's lower-level generalized channels.) In order to make it easier to remember which operator is which, they're named systematically. See Channel.Dup for an explanation of the myriad redirection operators.
The operators (/>/) and (/</) take a fitting on the left and a
list of redirections on the right, and apply the redirections in the
latter to the former. For example,
# run (command "echo hello; echo world 1>&2"
/>/ [ 1 %> "file1"; 2 %> "file2" ]);;
- : Shcaml.Proc.status = Unix.WEXITED 0
Let's check that it worked:
# run (from_file "file1");;
hello
- : Shcaml.Proc.status = Unix.WEXITED 0
# run (from_file "file2");;
world
- : Shcaml.Proc.status = Unix.WEXITED 0
The Adaptor module provides record readers and splitters for a
variety of file formats. The readers and splitters for each format
are contained in a submodule named for the format (for instance, the
functions for /etc/mailcap are in Adaptor.Mailcap). Record readers
read "raw data off the wire". That is, a reader is a function from an
in_channel to a Reader.raw_line, which is a record of string data,
possibly including some delimiter junk.
Splitters do field-splitting. Given a line, they will use the
data in the line to produce a line with the relevant fields. In
addition to readers and splitters, each module exports a
function that transforms shtreams of lines using the module's
splitter functions (these functions have the same names in every
module by convention); a function named fitting is
provided as well, which (as one might expect) provides a version
of the adaptor as a fitting, so it can be used directly in a
pipeline.
There are adaptor submodules for delimited text, simple flat files,
comma-separated text, key-value and sectioned key-value (i.e., ssh
config files or .ini-style files), various /etc/
files, and more.
UsrBin contains a collection of miscellaneous useful functions.
Among these are fittings like cut and
uniq. In addition, it provides some lower-level but still quite
useful functions, such as
mkpath (mkdir -p), as well
as a submodule
UsrBin.Test that contains functions
analogous to test(1).
It is an unfortunate necessity of the scope and intent of Shcaml that many of the names of things in the library sound generic (for instance: runner, reader, stash, line etc.). In fact, in the API documentation and the manual, we have striven to use such terms in a more formalized sense. This glossary documents Shcaml (and related) "terms of art", hopefully eliminating ambiguity and confusion.
Channel.clobber controls whether to clobber files by default.
"ls > /dev/null". This differs from a program, which is one executable to launch.
2 %>& 1).
Line.t, containing both raw text and, usually, metadata.
string list of already-parsed arguments.
Line.t was created. Accessed with
Line.show to get the current (possibly processed) text of a line.
Reader for how readers are defined. Most
Adaptors also come with a reader for their format.
Fitting for executing a fitting, which may be previously constructed. See
Line.show. This is used, for example, to send the contents of a line to external processes. If you want the
string of a line, this is usually it.
Adaptors come with a splitter for their format.
Proc.t in an optional parameter, so that the caller of a function that may fork can find out the identity of the resulting child process. Functions such as
Channel.open_command_in take an optional
Proc.t option ref parameter in which they stash
Proc.t of the process that they start.
unit, used to specify an action to be performed later, or in another context. Often in Shcaml, that other context is a child process. For example,
Fitting.thunk takes a thunk, which it runs in a subprocess and splices into the pipeline.