IBA LABORATORY, THE UNIVERSITY OF TOKYO |
EGPC (v1.0): a powerful tool for data classification and identification of important features |
Terms and conditions for use of the software: The
owners (authors) of the software grant you a non-exclusive and
non-transferable license to use the software and to modify it for your
needs. However, whenever you use the software, you are requested to
cite the following journal paper. This software is distributed
in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.
Limitation of Liability: |
Citation of the Paper:
@ARTICLE{Paul:2007:EGPC,
  AUTHOR = {Topon Kumar Paul and Hitoshi Iba},
  TITLE = {Prediction of cancer class with majority voting genetic programming classifier using gene expression data},
  JOURNAL = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  YEAR = {2007},
  VOLUME = {},
  NUMBER = {},
  PAGES = {}
}
OR
Topon Kumar Paul and Hitoshi Iba (2007). Prediction of cancer class with majority voting genetic programming classifier using gene expression data. To appear in IEEE/ACM Transactions on Computational Biology and Bioinformatics. |
Download: All Files (in WinZip format) | EGPCpre.jar | EGPCcom.jar | EGPCgui.jar | Example Data Files (WinZip file)
Download Manual: Readme.pdf | Readme.html | Readme.txt |
Note:
|
EGPC is a multi-class classifier based on genetic
programming and majority voting. The main features of EGPC are that:
|
Execution of EGPCpre.jar (for preprocessing of data) in CLI mode:
To run the program from the command prompt, type:
  java [-Xmx<heap size>] -jar EGPCpre.jar [arguments...]
Command line arguments and formats:
  -Xmx<heap size>: maximum heap size; some data sets may require a larger heap. Example: -Xmx512m (m or M for megabytes).
  -f <input file>: input data file name (with path if not in the current working directory); <input file> must be provided.
  -o <output file>: output file name (with path if not in the current working directory); default: DataOut.txt.
  -p <l:h:d:f>: preprocessing parameters; l=lower threshold, h=higher threshold, d=difference, f=fold change.
  -n <normalization info>: normalization info; for log normalization, type G followed by the base, like G10 or Ge; for linear normalization, type La:b, where a:b is the range.
  -h <header info>: header info; G: first column contains gene IDs; S: first row contains sample IDs; GS or SG for both.
Example:
  java -jar EGPCpre.jar -f "DataFile/BrainPre.txt" -o BrainPro.txt -p 20:16000:100:3 -n Ge -h GS |
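The page does not spell out how the four -p values are applied. The Python sketch below illustrates one common microarray convention, which is an assumption here and not a description of EGPCpre.jar's internals: values are clipped to [l, h], a gene is dropped when max-min <= d or max/min <= f, and the remaining values are log-transformed (as -n Ge would suggest). The function names and the sample rows are hypothetical.

```python
import math

def parse_preprocessing(spec):
    """Split an 'l:h:d:f' string such as '20:16000:100:3' into four floats."""
    lower, upper, diff, fold = (float(x) for x in spec.split(":"))
    return lower, upper, diff, fold

def preprocess(rows, spec, log_base=math.e):
    """Assumed convention (not confirmed by the page): clip each value to
    [lower, upper], drop genes whose max-min <= diff or max/min <= fold,
    then take the logarithm of the surviving values."""
    lower, upper, diff, fold = parse_preprocessing(spec)
    if lower <= 0:
        raise ValueError("lower threshold must be positive for the fold-change test")
    kept = []
    for row in rows:
        clipped = [min(max(v, lower), upper) for v in row]
        hi, lo = max(clipped), min(clipped)
        if hi - lo <= diff or hi / lo <= fold:
            continue  # too little variation across samples
        kept.append([math.log(v, log_base) for v in clipped])
    return kept

# Hypothetical expression values: one gene per row, one sample per column.
rows = [
    [10.0, 500.0, 20000.0],  # varies widely across samples
    [100.0, 120.0, 150.0],   # nearly flat
]
result = preprocess(rows, "20:16000:100:3")
print(len(result), "gene(s) kept")
```

Whether EGPCpre.jar uses strict or non-strict comparisons, and whether it filters before or after normalization, should be checked against Readme.pdf.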
Execution of EGPCcom.jar in CLI mode:
To run the program from the command prompt, type:
  java [-Xmx<heap size>] -jar EGPCcom.jar [arguments...]
Command line arguments and formats:
  -u: UCIML format; the default (if it is omitted) is Microarray format.
  -d <data file>: data file name (with path if not in the current working directory); <data file> must be provided.
  -v <validation file>: validation file name (with path if not in the current working directory); if it is not provided, the training information must be provided under -t below.
  -s <sample size>: number of samples; must be provided.
  -a <attribute size>: number of attributes; must be provided.
  -A <attribute info>: attribute information; the default is that all attributes are numeric. Refer to the Readme.pdf file for details.
  -t <training info>: training subset information; either the name of a file (with path if not in the current working directory) containing the indexes of the training samples, or the training size of each type of sample delimited by colons, like 179:106.
  -c <classes>: number of classes; default is 2.
  -m <ensemble size>: ensemble size; default is 3.
  -F <functions>: functions to be used; functions are delimited by colons (:) and the default functions are "+:-:/:*:sqr:sqrt". Note that the function string must be enclosed in double quotation marks (" ").
  -p <population size>: population size; default is 1000.
  -g <max gen>: maximum number of generations; default is 50.
  -r <max run>: number of trials or repetitions; default is 20. |
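No example invocation is given for EGPCcom.jar, so the following Python sketch assembles one from the options listed above. The helper name, the data file name MyData.txt, and the sample, attribute, and training sizes are hypothetical; only the flags and their defaults come from this page.

```python
def build_egpccom_command(data_file, samples, attributes, training,
                          classes=2, ensemble=3, population=1000,
                          max_gen=50, max_run=20, heap=None, uciml=False):
    """Assemble the argument list for EGPCcom.jar from the documented options."""
    cmd = ["java"]
    if heap:                       # e.g. "512m" becomes -Xmx512m
        cmd.append("-Xmx" + heap)
    cmd += ["-jar", "EGPCcom.jar"]
    if uciml:                      # omit -u for the default Microarray format
        cmd.append("-u")
    cmd += [
        "-d", data_file,
        "-s", str(samples),
        "-a", str(attributes),
        "-t", training,            # file of indexes, or sizes per class like "20:20"
        "-c", str(classes),
        "-m", str(ensemble),
        "-p", str(population),
        "-g", str(max_gen),
        "-r", str(max_run),
    ]
    return cmd

# Hypothetical two-class data set: 60 samples, 2000 attributes, 20 training samples per class.
cmd = build_egpccom_command("MyData.txt", samples=60, attributes=2000,
                            training="20:20", heap="512m")
print(" ".join(cmd))
```

Once EGPCcom.jar and the data file are in the working directory, the list can be handed to subprocess.run(cmd, check=True) from the Python standard library.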
Execution of EGPCgui.jar in GUI mode:
Go to the command prompt and type:
  java [-Xmx<heap size>] -jar EGPCgui.jar
Some screenshots of the EGPC GUI are given below. |
[Four screenshots of the EGPC GUI] |
Copyright © 2006, IBA Laboratory. All rights reserved. Last update: July 30, 2007 03:09:54 PM (Tokyo time). |