So you're learning OCaml!


EDITED: This has since become quite popular so I’ve made it even meatier for you!

Today was the first day of the Introduction to Functional Programming in OCaml course, located here. Apparently over 2,000 people signed up and while doubtlessly many will drop out, there will still be 2,000 more programmers that are now aware of this amazing language called OCaml.

Crash course on the OCaml ecosystem.

These are some key notes that you should know.

  1. opam is the package manager for OCaml. It is very advanced and supports many features, the most basic of which is installing a package:
$ opam install <some_package>

For people on OS X, you can get it on brew and all the Linux distros should have opam. Attention Ubuntu people: you should do this instead because apt-get’s version of opam on Ubuntu is madly outdated.

$ add-apt-repository ppa:avsm/ppa
$ apt-get update
$ apt-get install ocaml opam

For Windows people, this seems to be a decent option or you could get a VM. Do note that opam doesn’t work on this platform, and as a beginner you might end up wasting a lot of time with environment issues or libraries that assume Unix.

  2. Once you have opam installed, you probably want to do:
$ opam switch 4.02.3

This will install the latest version of the compiler. When you run opam switch you can see all the other available compilers as well. You can even switch to a beta compiler like so:

$ opam switch 4.03.0+beta1

  3. ocamlfind is a program that predates opam and wraps the standard OCaml compilers: ocamlc and ocamlopt. The former is a bytecode compiler and the latter creates native code.

  4. ocamlbuild is a tool that helps build OCaml programs; many people have strong opinions on it. You can find the manual for it here. It seems that people are starting to invest effort into it again.

  5. oasis is a tool that helps abstract the usage of ocamlfind and ocamlbuild (items 3 and 4). I resisted it for a while and wrote Makefiles instead; don’t do that, just use oasis. Be aware that oasis is really finicky and its error messages are useless. For the basic oasis flow, see the oasis mini-tutorial at the end of this post.

  6. merlin is an OCaml program that is simply amazing: it drives code completion for plugins available in emacs and vim. Once you have merlin installed with

$ opam install merlin

then you can add a .merlin file to your project so that merlin knows what packages to code complete for. A sample .merlin file looks like this:

B _build/src
S src
PKG cmdliner lwt
FLG -w +a-4-40..42-44-45-48

Notice how I put B _build/src; that assumes you’re using _oasis and that you made a src directory. I also provided you with some nice compiler flags for extra warnings.

You’ll need to add some code to vim or emacs to truly get the most out of merlin; you can even get jedi-style docstring popups.

The elisp that I use for my init.el is listed after the oasis tutorial at the end of this post.

  7. There are no full-blown IDEs for OCaml, so learn emacs or vim. Sublime Text also has a merlin plugin; if you’re already familiar with Sublime Text then just stick with it. merlin is really what matters here.

  8. utop is an enhanced REPL; it’s better than the plain ocaml REPL. Install it with opam install utop

Library situation

OCaml does have a standard library but it sucks. It was only created to serve the needs of the compiler programmers, i.e. it’s not like Python’s standard library, which has everything under the sun + the moon. There are a few standard library replacements. One is called Core and it’s provided by Jane Street; it’s the library used in the Real World OCaml book/website. Another standard library replacement is called Batteries, which is more “community” supported. There is a more recent contender called Containers. For a categorized list of contemporary and well-liked/must-have libraries, check out the awesome-ocaml repository.

Speaking of Libraries…

This is “functional programming,” so many of the real-world libraries you’ll encounter will have Monadic interfaces. lwt and Core’s async, for example, are both asynchronous threading libraries that use Monads and that wacky >>= function. But you really shouldn’t fret about what a Monad is or represents; just follow the type signature and you’ll be fine. For a more detailed treatment of Monads in OCaml and a code example that talks to the Stripe API, see this.
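To make "just follow the type signature" a bit more concrete, here is a minimal sketch of >>= written by hand for the plain option type. This is not how lwt or async implement it (they work over promises), and safe_div is a made-up helper for illustration, but the shape of the operator is the same: take a wrapped value and a function that returns a wrapped value, and chain them.

```ocaml
(* A hand-rolled bind for the option type. lwt and async ship their own
   >>= with the same shape, just over promises instead of options. *)
let ( >>= ) (m : 'a option) (f : 'a -> 'b option) : 'b option =
  match m with
  | None -> None
  | Some x -> f x

(* Hypothetical helper: integer division that fails on a zero divisor. *)
let safe_div x y = if y = 0 then None else Some (x / y)

(* Each step only runs if the previous one produced a value. *)
let result = Some 100 >>= (fun a -> safe_div a 5) >>= (fun b -> safe_div b 2)
let failed = Some 100 >>= (fun a -> safe_div a 0) >>= (fun b -> safe_div b 2)

let () =
  let show = function
    | None -> "None"
    | Some n -> "Some " ^ string_of_int n
  in
  print_endline (show result);
  print_endline (show failed)
```

The first chain prints Some 10 and the second prints None, because the division by zero short-circuits everything after it. With lwt the story is similar, except that "no value yet" replaces "no value".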

Doing simple tasks (shameless plug)

I try using OCaml for literally everything and that includes going to hackathons. To make this less painful I wrote a library called Podge which helps with simple stuff. I don’t claim it’s a standard library replacement, just a library for getting stuff done. These two code samples assume the file is named and can be run with utop

First install with opam:

$ opam install podge

  1. Reading output of a process
#require "podge"
let () = 
  Podge.Unix.read_process_output "ls -halt" |> List.iter print_endline

The |> just means piping: it’s piping the output of read_process_output into the input of the partially applied function List.iter print_endline.

  2. Reading a file
#require "podge"
let () = 
  Podge.Unix.read_lines "" |> List.iter print_endline

Similar to 1, this reads all lines of a file and passes them to the partially applied function List.iter print_endline.
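If the pipe and the partial application still look odd, here is a small sketch that uses only the standard library, so it runs without Podge. The list contents are made up; the point is that x |> f is the same thing as f x, only written in the order the data flows.

```ocaml
let lines = [ "alpha"; "beta"; "gamma" ]

(* Without the pipe you read the expression inside out. *)
let total_nested = List.fold_left ( + ) 0 (List.map String.length lines)

(* With the pipe the data flows left to right. List.map String.length is
   List.map partially applied: it is still waiting for its list argument. *)
let total_piped =
  lines |> List.map String.length |> List.fold_left ( + ) 0

let () =
  (* Same shape as the Podge samples: a list piped into List.iter f. *)
  lines |> List.iter print_endline;
  Printf.printf "%d %d\n" total_nested total_piped
```

Both totals come out the same, which is the whole point: the pipe changes how the code reads, not what it computes.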

These are two simple code samples from Podge, check out the repo for other useful modules like: (The README has code examples)

What can you do with it?

Loads. Warning: shameless plugs ahead.

  1. I wrote an opam package that makes it easy to get an iOS OCaml cross-compiler, see here.
  2. Compilers! Lots of compilers/compiler tools are written in OCaml: Facebook uses OCaml for pfff and flow, and the first cut of Rust was written in OCaml.
  3. Financial world: Jane Street uses OCaml for basically everything (AFAIK).
  4. Systems programming: my former employer Ahrefs uses OCaml for heavy systems programming.
  5. Kernels: Unikernels are hot right now; the most prominent one is the Mirage-OS project and it’s all OCaml.
  6. Shameless plug: I use OCaml with js_of_ocaml as well; in fact I’m using it to write an Electron app with a node backend (all code is OCaml compiled into JS, then run on node/Electron), see here.
  7. Genomics/Bioinformatics: Hammer Lab in NYC uses OCaml for their genomics/sequencing work.
  8. My employer MixRank let me write OCaml for an ssh tunnel multiplexer for jailbroken iDevices called gandalf.
  9. You can even replace your shell scripts or Python with it, aka run it using the interpreter; see the self-contained example listed after the elisp code at the end.

Stick with it!

This style of coding might be new to you, or maybe it’s your first programming language; stick with it and continue. OCaml offers many awesome features and has many strengths, including a very professional and pragmatic community. Also, if you’re in the Bay Area then please come to the weekly office hours hosted at MixRank in San Francisco. It’s open to all levels of experience and I still have some Enter the Monad tshirts to give away courtesy of Jane Street.

Oasis mini-tutorial

  1. Create a directory.

  2. Go to the directory and create a file named _oasis and a directory named src.

  3. Here is a template of the contents of the _oasis file

OASISFormat:  0.4
OCamlVersion: >= 4.02.3
Name:         opam_package_name
Version:      0.1
Maintainers:  New OCaml programmer
Synopsis:     Some short description
License:      BSD-3-clause
Plugins:      META (0.4), DevFiles (0.4)
AlphaFeatures: ocamlbuild_more_args

Description:  Some cool description

# This is a comment; the section below creates a binary program
Executable <some_program_name>
  Path: src
# MainIs names the entry-point file; foo.ml is an assumed name, use your own
  MainIs: foo.ml
  install: true
  CompiledObject: native
  BuildDepends: package_one, package_two

# Another comment, this builds a library called pg
Library pg
  Path:         src
# oasis will figure out the dependencies,
# just list the modules you want public.
# Note that there's no .ml, just give the name
  Modules:      Pg
  CompiledObject: byte
  BuildDepends: some_package

  4. Generate the Makefile, configure and other build crap.
$ oasis setup -setup-update dynamic

  5. Actually build your code; yes, it’s just a call to make.
$ make

Assuming that you were building an executable, then you should see either a foo.native or a foo.byte in the root directory of the project.

  6. You can stop here, but you can go even further with oasis2opam. Install it with:
$ opam install oasis2opam

then in your project’s root directory, aka the directory with the _oasis file, do:

$ oasis2opam --local

This creates the opam directory and some meta data for the opam packaging system. Your local package can now be a first class citizen with opam just by doing this in the same project root directory:

$ opam pin add <your_package_name> . -y
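If you want something to feed through the oasis flow above, here is a minimal sketch of a source file. It assumes your Executable section builds a file at src/foo.ml (a hypothetical name, chosen only to line up with the foo.native mentioned earlier) and it has no package dependencies, so BuildDepends can stay empty while you experiment.

```ocaml
(* src/foo.ml: hypothetical entry point for the Executable section *)

(* Kept as a separate function so a Library section could expose it later. *)
let greeting name = Printf.sprintf "Hello, %s!" name

let () =
  (* Sys.argv.(0) is the program name; the first real argument is optional. *)
  let name = if Array.length Sys.argv > 1 then Sys.argv.(1) else "OCaml" in
  print_endline (greeting name)
```

After make, running the resulting binary with no arguments should greet OCaml, and with one argument should greet whatever you passed.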

Elisp for OCaml coding

;; OCaml code
;; Assumes tuareg-mode is the major mode you use for OCaml files
(add-hook
 'tuareg-mode-hook
 (lambda ()
   ;; Add opam emacs directory to the load-path
   (setq opam-share
         (substring
          (shell-command-to-string "opam config var share 2> /dev/null")
          0 -1))
   (add-to-list 'load-path (concat opam-share "/emacs/site-lisp"))
   ;; Load merlin-mode
   (require 'merlin)
   ;; Start merlin on ocaml files
   (add-hook 'tuareg-mode-hook 'merlin-mode t)
   (add-hook 'caml-mode-hook 'merlin-mode t)
   ;; Enable auto-complete
   (setq merlin-use-auto-complete-mode 'easy)
   ;; Use opam switch to lookup ocamlmerlin binary
   (setq merlin-command 'opam)
   (require 'ocp-indent)
   (autoload 'utop-minor-mode "utop" "Minor mode for utop" t)
   (autoload 'utop-setup-ocaml-buffer "utop" "Toplevel for OCaml" t)
   (autoload 'merlin-mode "merlin" "Merlin mode" t)
   ;; Important to note that setq-local is a macro and it needs to be
   ;; separate calls, not like setq
   (setq-local merlin-completion-with-doc t)
   (setq-local indent-tabs-mode nil)
   (setq-local show-trailing-whitespace t)
   (setq-local indent-line-function 'ocp-indent-line)
   (setq-local indent-region-function 'ocp-indent-region)
   ;; These paths are specific to my machines, change them to yours
   (if (equal system-type 'darwin)
       (load-file "/Users/Edgar/.opam/working/share/emacs/site-lisp/ocp-indent.el")
     (load-file "/home/gar/.opam/working/share/emacs/site-lisp/ocp-indent.el"))))

;; Don't ask about the running utop process when killing its buffer
(add-hook 'utop-mode-hook
          (lambda ()
            (set-process-query-on-exit-flag
             (get-process "utop") nil)))

OCaml as shell scripting, example assumes gandalf

This is the deploy script I use for a project that I’m working on, a way to get a node like thing on iOS using JavaScriptCore, Objective-C++ and Grand Central Dispatch.

You can do chmod +x on the script and then just invoke it as a regular program, no compiling necessary.

#!/usr/bin/env ocaml
(* Need topfind to make require work, need require to use podge package *)
#use "topfind"
#require "podge"

module U = Yojson.Basic.Util
module A = Podge.ANSITerminal

type cmd = Scp of int * string
         | Ssh of int * string

type error_condition = Copy_error | Exec_error

let connected_devices () =
  Podge.Unix.read_process_output "gandalf -s"
  |> Podge.List.drop ~n:5
  |> String.concat ""
  |> Yojson.Basic.from_string
  |> U.to_list
  |> List.fold_left
    (fun accum item -> U.(member "Local Port" item |> to_int) :: accum) []

let command = function
  | Scp (port, target) ->
    Printf.sprintf "scp -P %d %s root@localhost:~/" port target
  | Ssh (port, cmd) ->
    Printf.sprintf "ssh root@localhost -p %d \"%s\"" port cmd

let usage () =
  "This deploy script assumes that it was started\n\
   from the makefile with `make deploy` and that gandalf is running"
  |> print_endline;
  exit 1

let with_gandalf udid ~f =
  let f_name = Filename.temp_file "gandalf" "deployment" in
  let f_chan = open_out f_name in
  Printf.sprintf "%s:2000:22" udid |> output_string f_chan;
  close_out f_chan;
  let gandalf_pid =
    Unix.(create_process "gandalf" [|"gandalf"; "-m"; f_name|] stdin stdout stderr)
  in
  Unix.sleep 1;
  f ();
  Unix.kill gandalf_pid 5

let cmd_result ~error_msg = function
  | outcome when outcome <> 0 ->
    A.colored_message ~m_color:Podge.T.Red error_msg |> prerr_endline;
    exit 1
  | _ -> ()

let () =
  if Array.length Sys.argv <> 3 then usage ()
  else
    with_gandalf Sys.argv.(2) ~f:begin fun () ->
      let devices = connected_devices () in
      devices |> List.iter begin fun i_device ->
        let scp_cmd = Sys.(Scp (i_device, argv.(1))) |> command in
        Printf.sprintf "Deploying binary to remote device: %s" scp_cmd
        |> A.colored_message |> print_endline;
        Sys.command scp_cmd
        |> cmd_result ~error_msg:"Was unable to copy over the binary";
        let ssh_cmd_sign =
          Sys.(Ssh (i_device, "ldid -S /var/root/" ^ argv.(1))) |> command in
        Printf.sprintf "Signing binary on remote device: %s" ssh_cmd_sign
        |> A.colored_message |> print_endline;
        Sys.command ssh_cmd_sign
        |> cmd_result
          ~error_msg:(Printf.sprintf "Signing binary on remote device: %s" ssh_cmd_sign);
        let ssh_cmd = Sys.(Ssh (i_device, "/var/root/" ^ argv.(1))) |> command in
        Printf.sprintf "Executing binary on remote device: %s" ssh_cmd
        |> print_endline;
        Sys.command ssh_cmd
        |> cmd_result ~error_msg:"Was unable to execute the binary";
        A.colored_message "Deployed and tested successfully"
        |> print_endline
      end
    end
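If you don’t have podge or gandalf around and just want to try the scripting approach, here is a much smaller sketch that sticks to the standard library, so it needs neither topfind nor #require. The list_cmd helper and the choice of ls -l are made up for illustration; the shebang line, Sys.argv and Sys.command work the same way as in the deploy script.

```ocaml
#!/usr/bin/env ocaml
(* Stdlib only, so no topfind or #require is needed. *)

(* Build the shell command; Filename.quote protects paths with spaces. *)
let list_cmd dir = Printf.sprintf "ls -l %s" (Filename.quote dir)

let () =
  (* When run as a script, Sys.argv.(0) is the script name. Fall back to
     the current directory when no argument is given. *)
  let dir = if Array.length Sys.argv > 1 then Sys.argv.(1) else "." in
  let cmd = list_cmd dir in
  print_endline ("Running: " ^ cmd);
  (* Sys.command returns the exit status of the command. *)
  match Sys.command cmd with
  | 0 -> ()
  | code ->
    prerr_endline (Printf.sprintf "command failed with status %d" code);
    exit 1
```

Save it under any name you like, chmod +x it, and run it with an optional directory argument.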