Getting started with Sleipnir

First, don't forget that you should always refer to the main Sleipnir documentation.

This is just a quick tutorial for getting up and running; the documentation proper will usually contain much more detail.

Obtaining and building Sleipnir and its dependencies

The Sleipnir library consists of the library proper, a suite of tools to perform specific tasks, and a collection of external dependencies. The library and tools are built as a single unit, using autotools (configure/make) on Linux and MacOS or the Sleipnir solution in Visual Studio on Windows. The external dependencies are built either using their respective configure/makefiles on Linux/MacOS or a second solution (extlib) on Windows. Note that if you're using the Sleipnir Mercurial tree, all external dependencies can be installed in place, within their own directories, rather than system-wide.

Obtaining a copy of Sleipnir from revision control

Sleipnir is version controlled using Mercurial. On Linux/MacOS, it can be installed using a variety of standard methods (deb/rpm/etc.); on Windows, it can be installed either at a Cygwin command line or graphically using TortoiseHG. Once installed, use the following command line (or the equivalent GUI command) to check out a copy of Sleipnir:

hg clone https://<yourusername>@bitbucket.org/biobakery/arepa

Note that if you're using Visual Studio, a convenient directory in which to execute this command is the Projects folder under Visual Studio's directory in your Documents folder. On Linux/MacOS, you can execute the command directly in your home directory (or anywhere else you'd like). This should prompt you for your password, think for a while, print out a handful of status messages, and create a folder named sleipnir. Congratulations! Now you can start building things.

Building Sleipnir's external dependencies

This is where the paths of Linux and Windows diverge. Choose your own adventure:

Building Sleipnir's external dependencies on Windows

  1. First, since Microsoft doesn't feel the need to include the full C99 standard in its Visual C++ distribution, download the stdint.h file and place it in your C:\Program Files\Microsoft Visual Studio 9.0\VC\include directory.
  2. Open the extlib.sln solution in the extlib/proj/vs2008/ directory.
  3. Make sure the build type dropdown menu at the top of Visual Studio reads "Debug".
  4. From the Build menu, select "Build Solution." Go get some coffee or read the paper while several things build themselves.
  5. Change the build type to "Release", rinse, and repeat the build (and coffee).

Building Sleipnir's external dependencies on Linux/MacOS

The good news is that Linux/MacOS generally come with more of Sleipnir's external dependencies pre-installed. The bad news is that the build procedure for the rest of them is slightly more involved (but only slightly!).

  1. Change to the extlib/svm_perf directory, type make, and hit enter.
  2. Change to the extlib/log4cpp-1.0 directory, type ./configure --prefix=`pwd`, and hit enter. Wait for the text to stop scrolling by, make sure everything looks ok, type make install, and hit enter. If all goes well, you'll get some "same file" errors at the end, but you can ignore them.
  3. Change to the extlib/gengetopt-2.22 directory, type ./configure --prefix=`pwd`, wait, type make install, and wait again. Everything should succeed without issues.
  4. If you're planning on using BNServer, change to the extlib/boost_1_42_0 directory. Type ./bootstrap.sh --prefix=`pwd` --with-libraries=graph, wait, and then type ./bjam install.
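
The steps above can be collected into a single script. What follows is only a minimal sketch, assuming the directory names listed in the steps and a checkout whose extlib directory is passed as the first argument. The names build_extlib, build_in_place, run, DRY_RUN and WITH_BOOST are invented for this example and are not part of Sleipnir.

```shell
#!/bin/sh
# Sketch: build Sleipnir's external dependencies in place (Linux/MacOS).
# build_extlib, build_in_place, run, DRY_RUN and WITH_BOOST are hypothetical
# names made up for this example.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Configure an autotools package and install it into its own directory.
build_in_place() {
    (
        cd "$1" || exit 1
        run ./configure --prefix="$(pwd)" || exit 1
        run make install
    )
}

# Usage: build_extlib [EXTLIB_DIR]   (EXTLIB_DIR defaults to ./extlib)
build_extlib() {
    root=${1:-extlib}
    ( cd "$root/svm_perf" && run make ) || return 1
    # log4cpp's install step may end with "same file" errors, which the
    # steps above say to ignore, so a failure here only produces a warning.
    build_in_place "$root/log4cpp-1.0" ||
        echo "warning: log4cpp build reported errors; check the output" >&2
    build_in_place "$root/gengetopt-2.22" || return 1
    # Boost is only needed for BNServer, so it is opt-in.
    if [ "${WITH_BOOST:-0}" = "1" ]; then
        (
            cd "$root/boost_1_42_0" || exit 1
            run ./bootstrap.sh --prefix="$(pwd)" --with-libraries=graph &&
                run ./bjam install
        ) || return 1
    fi
}
```

After sourcing the file, DRY_RUN=1 build_extlib extlib prints the commands without running them, which is a cheap way to confirm the directory names match your checkout before starting a long build.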

Building Sleipnir and its tools

Windows and Linux/MacOS differ again here...

Building Sleipnir on Windows

  1. Double check that you've added stdint.h to your Visual C++ include directory as described above.
  2. Open the Sleipnir.sln solution in the trunk/proj/vs2008/ directory.
  3. Choose the "Debug" or "Release" build type(s) as appropriate in Visual Studio.
  4. From the Build menu, select "Build Solution." Wait a while; C++ makes Visual Studio slow.
  5. Congratulations - you now have a full build of Sleipnir and its tools in the trunk/proj/vs2008/Release or Debug directory, depending on your choice above. You can run the tools from the command line (I highly recommend Cygwin over the vanilla Windows command prompt).
  6. Extra bonus tip: for convenience, you can add the tools directly to your path. In Cygwin, type export PATH=$PATH:/cygdrive/c/Docume~1/<yourusername>/MyDocu~1/Visual~1/Projects/Sleipnir/trunk/proj/vs2008/Release, replacing the appropriate parts of the path with the actual path to your Visual Studio projects directory (e.g. on Vista/Windows 7, it'll start with /cygdrive/c/Users). In the Windows Command Prompt, type set PATH=%PATH%;C:\Docume~1\<yourusername>\MyDocu~1\Visual~1\Projects\Sleipnir\trunk\proj\vs2008\Release.

Building Sleipnir on Linux/MacOS

  1. First, change to the trunk directory where the Sleipnir code proper resides.
  2. Modify the following command appropriately if you've either A) not built all of the optional dependencies (e.g. Boost) or B) installed dependencies system-wide rather than locally. However, assuming you've built everything and installed it within the extlib tree as described above, configure Sleipnir using the following command:
    ./configure --with-gengetopt=../extlib/gengetopt-2.22/bin/gengetopt --with-log4cpp=../extlib/log4cpp-1.0/ --with-smile=../extlib/smile_1_1_linux64_gcc_4_1_2/ --with-svm-perf=../extlib/svm_perf/ --with-boost-includes=../extlib/boost_1_34_1/include/boost-1_34_1/ --with-boost-graph-lib=../extlib/boost_1_34_1/lib/libboost_graph-gcc41-mt.a LDFLAGS=-static
    
  3. Whew! Now just type make and wait a while; that part's easy. This will build both the Sleipnir library itself and all of the tools.
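
Because the configure command above names several locally installed paths, a mistyped or mismatched one (for example, a Boost directory whose version differs from the one you actually built) is easy to miss. The following is a small sketch of a pre-flight check; check_paths is a hypothetical helper written for this example, not something shipped with Sleipnir.

```shell
#!/bin/sh
# Sketch: verify that locally installed dependencies exist before configuring.
# check_paths is a hypothetical helper, not part of Sleipnir.

# Report every argument that does not exist on stderr;
# return 1 if any is missing, 0 otherwise.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

From the trunk directory, you might run check_paths ../extlib/gengetopt-2.22/bin/gengetopt ../extlib/log4cpp-1.0 ../extlib/svm_perf and only proceed to ./configure once it reports nothing missing.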

Performing some common tasks with Sleipnir

Building a naive Bayes classifier and predicting biological networks from data

  1. This process is described in the main Sleipnir documentation and in the documentation for the most relevant tool, BNCreator. I'll summarize here...
  2. First, let's assume that you have two important pieces of data as inputs. They can be named anything; I'll just provide some examples here:
    a. A gold standard answer file, answers.dab, accompanied by an answers.quant file in the same directory.
    b. A collection of biological datasets, each described by a dataset.dab and accompanying dataset.quant file. I'll assume those all live in a directory named data.
  3. To learn a Bayes net for this data, run:
    BNCreator -m -w answers.dab -o classifier.xdsl data/*.dab
  4. To predict a biological network based on the learned Bayes classifier, run:
    BNCreator -m -i classifier.xdsl -o network.dab -d data
  5. To evaluate the resulting predictions, run:
    DChecker -p -w answers.dab -i network.dab > performance.txt
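
Since every .dab file is expected to have a .quant file alongside it, it can save time to check for missing ones before starting a long run. The sketch below does that and then chains the three commands from the steps above unchanged. The function names missing_quants and run_pipeline are invented for this example, and the output file names are simply the ones used in the steps.

```shell
#!/bin/sh
# Sketch: check inputs, then learn, predict and evaluate as in the steps above.
# missing_quants and run_pipeline are hypothetical helper names.

# Print each .dab file in directory $1 that has no matching .quant file.
missing_quants() {
    for dab in "$1"/*.dab; do
        [ -e "$dab" ] || continue      # the directory contains no .dab files
        [ -e "${dab%.dab}.quant" ] || echo "$dab"
    done
    return 0
}

# Usage: run_pipeline ANSWERS.dab DATA_DIR
run_pipeline() {
    answers=$1
    data=$2
    bad=$(missing_quants "$data")
    # The gold standard needs its own .quant file as well.
    [ -e "${answers%.dab}.quant" ] || bad="$answers $bad"
    if [ -n "$bad" ]; then
        echo "no .quant file for: $bad" >&2
        return 1
    fi
    # The three commands from the steps above, stopping at the first failure.
    BNCreator -m -w "$answers" -o classifier.xdsl "$data"/*.dab &&
        BNCreator -m -i classifier.xdsl -o network.dab -d "$data" &&
        DChecker -p -w "$answers" -i network.dab > performance.txt
}
```

With the example names used above, this would be invoked as run_pipeline answers.dab data.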

Creating a new Sleipnir tool

  1. In the tools directory, pick the pre-existing tool most like the one you want to create. If possible, choose a tool with identical dependencies, e.g. if you're making a tool that depends on SVMs, choose SVMer. Let's use that as an example.
  2. Copy the SVMer directory, naming the copy after your new tool; let's call it Exemplar. This should typically result in a new directory containing eight files: Makefile.am and .in, SVMer.cpp and .ggo, cmdline.c and .h, and stdafx.cpp and .h.
  3. Change the two misnamed files (SVMer.cpp and .ggo) to be named appropriately (Exemplar.cpp and .ggo).
  4. Change every instance of SVMer in the relevant files (Makefile.am and .in) to Exemplar.
  5. To make the Linux/MacOS builds work for your new tool, edit four files: tools/Makefile.am and .in, configure.ac, and configure. In each location where SVMer appears, copy the line and replace SVMer with Exemplar in the new copy. If you've chosen an appropriate pre-existing tool to copy, the dependencies should be the same, e.g. the new copy of the build instructions for Exemplar will appear alongside SVMer in the portion of the configure and makefiles dependent on SVMLight.
  6. To make the Windows build work, create a new directory Exemplar in the proj/vs2008 directory. Copy SVMer.vcproj and rename the new copy Exemplar.vcproj in the new directory. Replace every instance of SVMer in this file with Exemplar, and randomly change the three GUIDs to new values (I usually just change the last four numerals). Finally, in Visual Studio, add the new project to the Sleipnir.sln solution and, in "Project Dependencies", make sure it's marked to be dependent on Sleipnir itself. If necessary, modify the project properties to point to any new external dependencies (paths under C/C++ -> General and libraries under Linker -> Input).
  7. That's the easy part! Now you just have to do the hard part, that is, change the contents of Exemplar.cpp and .ggo, and stdafx.cpp and .h to reflect the functionality of your new tool. As in most parts of computer science, imitation is the sincerest form of flattery: the best way to write new code is to copy and paste old code, so if another Sleipnir tool does something like what you want, use it as a template or copy outright as necessary.
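
The copy, rename and search-and-replace in steps 2 through 4 are mechanical, so they can be scripted. The following is a minimal sketch under the assumption that the tool directory contains the files listed in step 2; clone_tool is a hypothetical helper written for this example. It does not touch tools/Makefile.am, configure.ac, or the Visual Studio project, which still need the manual edits described in steps 5 and 6.

```shell
#!/bin/sh
# Sketch: copy an existing tool directory as the starting point for a new
# tool (steps 2-4 above). clone_tool is a hypothetical helper, not part of
# Sleipnir.

# Usage: clone_tool TOOLS_DIR OLD_NAME NEW_NAME
clone_tool() {
    tools=$1
    old=$2
    new=$3
    if [ -e "$tools/$new" ]; then
        echo "$tools/$new already exists" >&2
        return 1
    fi
    # Step 2: copy the whole directory under the new name.
    cp -R "$tools/$old" "$tools/$new" || return 1

    # Step 3: rename the two tool-specific files.
    for ext in cpp ggo; do
        mv "$tools/$new/$old.$ext" "$tools/$new/$new.$ext" || return 1
    done

    # Step 4: replace the old name in the makefile templates. A temporary
    # file is used because "sed -i" differs between GNU and BSD/MacOS sed.
    for f in Makefile.am Makefile.in; do
        sed "s/$old/$new/g" "$tools/$new/$f" > "$tools/$new/$f.tmp" &&
            mv "$tools/$new/$f.tmp" "$tools/$new/$f" || return 1
    done
}
```

From the trunk directory, the example in the steps above would be clone_tool tools SVMer Exemplar. The original SVMer directory is left unchanged, and the function refuses to overwrite an existing Exemplar directory.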

Adding a new Sleipnir dependency

This is probably the trickiest item on this instruction list, but fortunately it's also the last and least likely to be necessary. Very few modifications to Sleipnir will require changes to its dependency structure. However, in case you're adding new machine learning calls or want to tie into different pre-existing libraries, this should give you an idea how to get started.

  1. Extract a plain, vanilla tarball (or other package) of the new external dependency inside the extlib directory.
  2. Immediately run hg add on that directory (and commit if desired), since you only want to add the source for the external tool to the repository, not the fully built version.
  3. On Windows, create a new .vcproj file for the dependency in an appropriately-named subdirectory of proj/vs2008. Do this in the same way described above for new tool project files, add it to the extlib.sln solution, and change the contained source files as appropriate (this'll depend on the exact contents of the new dependency; svm_perf is probably a good template for a simple library, log4cpp for a more complex one).
  4. On Linux, configure and build the dependency as described above (probably configure --prefix=`pwd`, then make install).
  5. Now, back in the Sleipnir trunk, edit configure.ac. Find the current external dependency most similar to yours (again, I'd suggest svm_perf for simple ones or log4cpp for more complicated ones), copy it, and modify the names appropriately. There should only be a single block of information that needs to be copied and edited in configure.ac.
  6. If you're also adding new tools that depend on your new addition, add them (or edit their inclusion appropriately) in tools/Makefile.am. For example, WITH_SVM_TOOLS controls tools dependent on SVMLight, WITH_SMILE_TOOLS controls those dependent on Bayes nets, and so forth.
  7. Run autoconf from the main directory to regenerate configure. Running configure itself (with appropriate local flags as described above) followed by a make clean and make should regenerate the appropriate makefiles automatically (by running automake).
  8. In theory, that's it! In practice, the process may need a bit of tweaking; remember, it can't hurt to experiment, since Mercurial (or your RCS of choice) will always prevent you from breaking anything permanently.