To compile the cluster explorer (on Linux), type:
        gcc ClusterExplorer.c -lgsl -lblas -o ClusterExplorer
You need to have the GNU Scientific Library (GSL) installed. It should be straightforward to compile this program on other operating systems, but we have not tested this.

To use the cluster explorer, type:
        ./ClusterExplorer [INPUT FILE] [OUTPUT FILE]
Here, [INPUT FILE] is the name of the input file and [OUTPUT FILE] is the name of the output file.

Input file format:
The input file must contain exactly two lines. No line breaks, tabs or spaces are allowed in the middle of a sequence. The first line contains the structure information and the second line contains the mutation information. In the first line, each residue is marked "b" (buried) or "e" (exposed). In the second line, each residue is marked "c" (conserved) or "m" (mutated). Each character in either line denotes one residue in the protein. The residues are ordered from the N-terminus to the C-terminus.

Example of input file:
eebbbbbbeebbbbebbbebbeeeebebebebbbbbeeebebbebbeeeebbebebeeeebeeeebbbeebbebbeebebebbbbbebebebbeebbeebebebbbbbbbbbeebeeeebbeebeeeebeebbbbbbbbbeeee
cccccccccccccccccccccmcccccccmcccccccccccccccccccccmcmmcmcmmmccccccccccccmccccccccccccccccccccmccccccccccccccccccccccccccccccccccccccccccccccccc
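
The format rules above can be checked before running the program. The following is a minimal sketch in C, not code taken from ClusterExplorer.c: the function names check_input_lines and count_mutations are hypothetical, and the requirement that both lines have the same length is an assumption drawn from the statement that each character denotes one residue. Both strings are expected to have their trailing newline already removed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ClusterExplorer.c.
 * Checks the two lines of an input file (trailing newline already removed).
 * Returns  0 if both lines look valid,
 *         -1 if a line is empty or the lengths differ (assumed requirement),
 *         -2 if the structure line contains anything other than 'b' or 'e',
 *         -3 if the mutation line contains anything other than 'c' or 'm'. */
int check_input_lines(const char *structure, const char *mutation)
{
    size_t n = strlen(structure);

    if (n == 0 || n != strlen(mutation))
        return -1;
    /* strspn returns the length of the leading run made only of the given
     * characters; it equals n only when the whole line is valid. */
    if (strspn(structure, "be") != n)
        return -2;
    if (strspn(mutation, "cm") != n)
        return -3;
    return 0;
}

/* Hypothetical helper: counts the 'm' characters in the mutation line. */
size_t count_mutations(const char *mutation)
{
    size_t count = 0;

    for (; *mutation != '\0'; ++mutation)
        if (*mutation == 'm')
            ++count;
    return count;
}
```

A caller could read the two lines with fgets, strip the newline, and pass them to check_input_lines before invoking ClusterExplorer. Spaces and tabs inside a sequence are rejected because they are neither "b"/"e" nor "c"/"m".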

Output file format:
The output file contains nine tab-separated columns. Each line in the output file denotes one potential mutation cluster. Column 1 contains the length of the input sequence. Column 2 contains the number of mutations in the sequence. Column 3 contains the start mutation index of the cluster. Column 4 contains the end mutation index of the cluster. Column 5 contains the start residue position of the cluster. Column 6 contains the end residue position of the cluster. Column 7 contains the fraction of buried sites of the cluster. Column 8 contains the Qs value for the cluster. Column 9 contains the Pu value for the cluster. For columns 3 to 6, counting starts from zero.

Example of output file (the output for the above input example):
144	11	2	7	51	60	0.4	1.93662e-06	1.69689e-05
|	|	|	|	|	|	|	|		|
|	|	|	|	|	|	|	|		Pu value
|	|	|	|	|	|	|	Qs value
|	|	|	|	|	|	Fraction of buried sites
|	|	|	|	|	End residue position of the cluster
|	|	|	|	Start residue position of the cluster
|	|	|	End mutation index of the cluster
|	|	Start mutation index of the cluster
|	Number of mutations in the input sequence
Length of the input sequence
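
For downstream processing, one output line can be read into a record with sscanf. This is a minimal sketch under stated assumptions, not part of ClusterExplorer.c: the struct cluster_record, its field names and the function parse_cluster_line are hypothetical, and columns 1 to 6 are assumed to fit in an int.

```c
#include <stdio.h>

/* Hypothetical record for one output line; field names are illustrative. */
struct cluster_record {
    int seq_length;         /* column 1: length of the input sequence */
    int n_mutations;        /* column 2: number of mutations */
    int start_mut_index;    /* column 3: start mutation index (zero-based) */
    int end_mut_index;      /* column 4: end mutation index (zero-based) */
    int start_residue;      /* column 5: start residue position (zero-based) */
    int end_residue;        /* column 6: end residue position (zero-based) */
    double buried_fraction; /* column 7: fraction of buried sites */
    double qs;              /* column 8: Qs value */
    double pu;              /* column 9: Pu value */
};

/* Hypothetical helper: parses one line of the output file.
 * Returns 0 when all nine columns were read, -1 otherwise.
 * A space in the sscanf format matches any whitespace, including tabs. */
int parse_cluster_line(const char *line, struct cluster_record *rec)
{
    int fields = sscanf(line, "%d %d %d %d %d %d %lf %lf %lf",
                        &rec->seq_length, &rec->n_mutations,
                        &rec->start_mut_index, &rec->end_mut_index,
                        &rec->start_residue, &rec->end_residue,
                        &rec->buried_fraction, &rec->qs, &rec->pu);

    return fields == 9 ? 0 : -1;
}
```

Each line read from the output file with fgets can be passed to parse_cluster_line directly; a return value of -1 signals a line with fewer than nine readable columns.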
