Mendes Research Group

DOME

A database system for functional genomics and systems biology

Summary

DOME is a system for managing and analysing genome-wide profiling data, the kind of work now known as systems biology. It is dedicated to experiments in which samples are processed with transcriptomics, proteomics, and metabolomics technologies, and it supports integrated analyses spanning those three data types.

DOME is a laboratory database intended to manage the data generated in one or several projects. The system was not designed to be a public repository, though the code could be used for that purpose if desired. DOME should be installed in each laboratory that wishes to use it. The DOME databases hosted on our website are only for those projects in which we are active participants (Medicago truncatula, grape functional genomics, and yeast systems biology).

DOME is a client-server application and is fully accessible from a web browser (i.e. DOME users do not have to install new software). The web front-end guides the user through building complex queries that select data by diverse criteria. Query results can then be displayed in table format, downloaded as data files, visualized within pathway diagrams, or processed with multivariate analysis algorithms such as k-means clustering, non-negative matrix factorization, and principal component analysis (PCA).

DOME is open source, and is freely available to download. Unfortunately we have no funded resources to maintain this software and there is no support beyond these web pages.

The DOME team: Bharat Mehrotra, Xing Jing Li, Aejaaz Kamal, Saroj Mohapatra, and Pedro Mendes. Stefan Hoops and Olga Brazhnik also participated in earlier development of the project.

Schema

DOME is a complex database which contains several data types, multiple versions of intermediate data processing stages, a reference database, and metadata.

DOME is meant to be installed in a laboratory that may be dealing with several projects. One portion of DOME, the reference database named B-Net, is common to all projects. B-Net represents the background knowledge of biochemistry and molecular biology of the biological systems analysed. The other portion of the DOME schema can be instantiated several times, once per project, as illustrated in the figure below.

A high-level overview of the DOME schema

The DOME schema is presented in two parts:

Details of the B-Net Schema (containing the reference biochemical information)
Details of the Project Schema (containing data and metadata)

Download

The system is composed of two files:

Database structure and setup scripts. (This is a bzip2-compressed tar archive. Decompress it and then untar it. It contains two dumps: bnet.dump and project.dump. Use these files to create the DOME database on PostgreSQL.) Download domeDatabase.tar.bz2 (14.80 MB)
Web front-end. (This is also a bzip2-compressed tar archive. Decompress it and untar it. It contains the front-end scripts for the DOME system, which should run on an Apache server with PHP and Perl.) Download domeScripts.tar.bz2 (273.96 KB)
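
The decompress-and-untar step can be done in one pass. The following is a minimal sketch, not part of the DOME distribution: unpack_archive is a hypothetical helper name, and it assumes that bzip2 and tar are installed.

```shell
# Hypothetical helper (not shipped with DOME): decompress a .tar.bz2
# archive and extract it into a destination directory in one pass.
# Usage: unpack_archive ARCHIVE [DESTDIR]
unpack_archive() {
    archive=$1
    dest=${2:-.}
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # bzip2 -dc writes the decompressed tar stream to stdout;
    # tar reads it from stdin (-f -) and extracts under DESTDIR.
    mkdir -p "$dest" && bzip2 -dc "$archive" | tar -xf - -C "$dest"
}
```

For example, unpack_archive domeDatabase.tar.bz2 dome-db would leave the contents of the database archive under the directory dome-db. How the two dumps are then loaded depends on their format, which is not stated here: in PostgreSQL, plain SQL dumps are fed to psql, while archive-format dumps are restored with pg_restore.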

Installation

The installation of DOME has two parts. The first consists of installing and configuring all of the required software, and should be done by the systems administrator of the machine that will host DOME (since root access is needed). The second is the configuration of DOME itself, and should be carried out by the DOME manager (i.e. the person in charge of the DOME system).

The machine where DOME will be installed should have as much RAM as possible. 1 GB is the absolute minimum, but more is recommended. Ideally this machine should be reserved for DOME alone. For installations with large data volumes, it may be better to use a multiple-machine configuration, where the database server resides on one computer and the web server on another (detailed instructions for this will appear later).
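
On Linux, the installed RAM can be read from /proc/meminfo to compare against the 1 GB minimum. This is a minimal sketch that assumes a Linux host (Solaris and the BSDs expose this information differently); mem_total_mb is a hypothetical helper name.

```shell
# Hypothetical helper: print total RAM in megabytes.
# Reads the MemTotal line (reported in kB) from /proc/meminfo,
# or from another file in the same format if one is given.
mem_total_mb() {
    awk '$1 == "MemTotal:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

A result below roughly 1024 means the machine does not meet the minimum stated above.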

Requirements for installation:

Linux, Solaris, FreeBSD, or NetBSD operating system
Apache web server
PostgreSQL
PHP
Perl version 5
R statistical package
gnuplot
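
As a quick sanity check, the requirements above can be probed from a shell before starting. The following is a minimal sketch, not part of DOME: missing_commands is a hypothetical helper, and the executable names suggested in the usage note (httpd, psql, php, perl, R, gnuplot) are assumptions that vary between distributions (Apache, for example, is often installed as apache2).

```shell
# Hypothetical helper: print each named program that is NOT on the PATH.
# Prints nothing when every program is found.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}
```

For example, running missing_commands httpd psql php perl R gnuplot prints one line per program that could not be found. A program being present says nothing about its configuration, which is covered in Section A.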

Section A - Configuration of supporting services

DOME has been built on top of several packages that provide basic functionality, such as a web server, a database management system, etc. These packages need to be installed and properly configured before DOME can work. This is the job of a systems administrator (i.e. a person with access to the root account). Note that many of these packages may already have been installed with the operating system, but their configuration may need to be tuned. Please check all of the sections below and make sure the configurations match (or exceed) what is described.

1) Web server

A web server must be installed, and it must support PHP and Perl. We strongly recommend Apache, which comes with most Linux distributions. It should be configured to load the following modules: mod_php and mod_perl. Please make a note of the username under which the Apache server runs, as this will be important later on; it is usually nobody or apache, but it could be something else. Make sure that the Apache server is always running by including it in your startup scripts.

To make sure that mod_php and mod_perl are loaded in Apache, inspect (and if necessary edit) the httpd.conf configuration file. In most Linux distributions this is located at /etc/httpd.conf or perhaps /etc/httpd/conf/httpd.conf.
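
The inspection can be done with grep. This is a minimal sketch under stated assumptions: loaded_script_modules is a hypothetical helper, and it only looks at the one file it is given. Many distributions load modules from separate files pulled in with Include directives, so an empty result does not prove that a module is missing.

```shell
# Hypothetical helper: list the active (uncommented) LoadModule lines
# for PHP and Perl in an Apache configuration file.
# Matches module identifiers such as php5_module, php_module and perl_module.
loaded_script_modules() {
    grep -E '^[[:space:]]*LoadModule[[:space:]]+(php[0-9]*|perl)_module' "$1"
}
```

For example, loaded_script_modules /etc/httpd/conf/httpd.conf should print one line for PHP and one for Perl when both modules are enabled in that file.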

2) Firewall

Make sure that your firewall does not prevent the httpd user from connecting to the network. (This is the user that runs the web server, which you noted down in step 1 above.)

In Fedora Core 4, the default SELinux policy does not allow the user apache to access the network. Unfortunately this also prevents it from accessing the PostgreSQL server, and DOME will not work. This can be changed in Desktop -> System Settings -> Security Level -> SELinux: under httpd service, check Allow httpd scripts to connect to the network.

3) PHP

You already ensured that PHP is available to the web server in step 1. Now you need to tune its configuration. This is done in the file php.ini, which is usually located in /etc, /etc/apache, /usr/lib/, /usr/local/lib or somewhere else. To find out where this file is, run the following command: echo "<?php phpinfo(); ?>" | php | grep php.ini. This should indicate where the active php.ini is located. Change that file in the following way:

Increase the memory limit to a large value (at least 256 MB); if it is not large enough, some queries may not display properly in the user's front-end. Assuming your machine has 2 GB of RAM, a setting of 512 MB is reasonable. The line should read: memory_limit = 512M.
Increase the maximum execution time of a script to 300 seconds. The line should read: max_execution_time = 300.
Increase the maximum size allowed for uploads, since many microarray data files will be larger than the default 2 MB. Set it to 20 MB, or larger if you know you will have larger data files. The line should read: upload_max_filesize = 20M. (PHP also caps the whole request with post_max_size, which must be at least as large as this value for such uploads to succeed.)
Make sure that cookies are kept for a whole browser session. The line should read: session.cookie_lifetime = 0.
Increase the time after which the session garbage collector may remove session data to 24 hours. The line should read: session.gc_maxlifetime = 86400 (the number is in seconds).
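
After editing php.ini, the settings listed above can be read back to confirm that they took effect. The following is a minimal sketch, not part of DOME: ini_value and check_php_setting are hypothetical helpers, and the check is an exact string comparison, so a memory_limit larger than the recommended value would be flagged even though it is acceptable.

```shell
# Hypothetical helper: print the value assigned to KEY in an ini FILE.
# Comment lines (starting with ;) are skipped, a trailing "; comment"
# is dropped, and the last assignment wins.
# Usage: ini_value FILE KEY
ini_value() {
    awk -F= -v key="$2" '
        /^[ \t]*;/ { next }
        index($0, "=") > 0 {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                sub(/;.*$/, "", v)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                val = v
            }
        }
        END { if (val != "") print val }
    ' "$1"
}

# Hypothetical helper: compare one setting against an expected value.
# Usage: check_php_setting FILE KEY EXPECTED
check_php_setting() {
    actual=$(ini_value "$1" "$2")
    if [ "$actual" = "$3" ]; then
        echo "ok: $2 = $actual"
    else
        echo "CHECK: $2 is '${actual:-unset}', expected '$3'"
        return 1
    fi
}
```

For example, check_php_setting /etc/php.ini max_execution_time 300 reports whether that line reads as recommended; the path /etc/php.ini is only an illustration and should be replaced by the location found with the phpinfo command above. Remember to restart the web server after changing php.ini.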

4) PostgreSQL

DOME uses the PostgreSQL database management system. It may be possible to use another DBMS, such as Oracle or MySQL, but currently we only support PostgreSQL. The systems administrator has to make sure that the PostgreSQL server is up and running. If the machine is not behind a firewall, it is also important to secure access to it. Make sure that PostgreSQL is always running by including it in your startup scripts. The server configuration must include the following points:

The working memory must be set high. On our own server, which has 2 GB of RAM, we reserve 1 GB for PostgreSQL. If you have only 1 GB of RAM, set this to at least 512 MB. Do not even try running DOME on a machine with less than 1 GB of RAM! This setting is defined in the file postgresql.conf: work_mem = 512000 (the value is in kilobytes).
Create a user for DOME to access the database, using psql on the command line: psql -U postgres -c "CREATE USER curator PASSWORD 'php-user-password'" -d template1. The password is an SQL string literal, so it goes in single quotes. Replace php-user-password with a password of your choice and remember it, as it will be needed in Section B to configure DOME.
Create a database called DOME: psql -U postgres -c "CREATE DATABASE \"DOME\"" -d template1.
Grant the user curator access rights on the DOME database: psql -U postgres -c "GRANT ALL ON DATABASE \"DOME\" TO curator" -d template1.
Please note: For PostgreSQL version 8.1 and higher, you will need to replace "template1" with "postgres".
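
The three database steps above, together with the note about version 8.1, can be collected into one small script. This is a minimal sketch under stated assumptions, not the DOME team's own procedure: dome_setup_sql and maintenance_db are hypothetical helpers, the password is written as a single-quoted SQL string literal, and the statements are piped to psql on standard input rather than passed with -c.

```shell
# Hypothetical helper: print the SQL for the user, database and grant steps.
# Single quotes in the password are doubled, as SQL string literals require.
# Usage: dome_setup_sql PASSWORD
dome_setup_sql() {
    escaped=$(printf '%s' "$1" | sed "s/'/''/g")
    cat <<EOF
CREATE USER curator PASSWORD '$escaped';
CREATE DATABASE "DOME";
GRANT ALL ON DATABASE "DOME" TO curator;
EOF
}

# Hypothetical helper: choose the database to connect to for these steps,
# following the note above: template1 before PostgreSQL 8.1, postgres from 8.1 on.
# Usage: maintenance_db VERSION   (for example 8.0.26 or 8.1)
maintenance_db() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -ge 1 ]; }; then
        echo postgres
    else
        echo template1
    fi
}
```

With these defined, the setup could be run as dome_setup_sql 'php-user-password' | psql -U postgres -d "$(maintenance_db 8.1)", substituting the real password and the installed server version. Printing the SQL first, without the pipe, is a cheap way to review it before anything is changed.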

A quick tutorial on PostgreSQL installation is at http://sqlrelay.sourceforge.net/sqlrelay/gettingstarted/postgresql.html. It may help a knowledgeable sysadmin overcome some basic issues with PostgreSQL, but it is definitely not enough for a complete novice to get by (if you are in that class, please read the PostgreSQL documentation or even a book!).

5) R Statistical package

The R package is needed for preprocessing microarray data and other analyses. First check if your Linux distribution has this package (it may be an option). If it does then install it, otherwise you will have to download it from its web site. You may need to build it from source.

6) PERL

Perl is almost certainly already installed on your system. To check, run the following command: perl --version. If your system does not have it then you will need to install it, but you should seriously consider switching to another Linux distribution... Once you have Perl, you need to ensure that the modules listed below are installed. You may want to use CPAN to download and install them; one advantage of CPAN is that it automatically installs any dependencies.

If you have CPAN, you can invoke it with the command perl -MCPAN -e shell. This puts you in the CPAN shell and from there you install packages with the command install packagename.

The Perl modules needed by DOME are:

DBD::Pg (a PostgreSQL connection module)
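
Whether a module is already installed can be checked by asking Perl to load it. This is a minimal sketch; perl_has_module is a hypothetical helper name, and it reports only whether the module loads, not which version is present.

```shell
# Hypothetical helper: succeed if Perl can load the named module.
# perl -MModule::Name -e 1 loads the module and runs a trivial program;
# it exits with a non-zero status when the module cannot be found.
# Usage: perl_has_module MODULE
perl_has_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}
```

For example, perl_has_module DBD::Pg || echo "DBD::Pg is missing" prints the message only when the module needs to be installed, in which case install DBD::Pg from the CPAN shell is the next step.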

7) GD Library

Install the GD library on your system. Also make sure that the GD modules for Perl and PHP are installed.

8) OMETER

Ometer is software for multivariate analysis of functional genomics and systems biology data. Installation instructions are available at this link.

This material is based upon work supported by the National Science Foundation under Grant No. 0109732. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.