README file for running metagenomic data analysis


Software Requirements
-	Linux operating system (preferably Ubuntu)
-	Python 2.7
-	MetaPhlAn2 
		sudo apt install metaphlan2 
-	Samtools
		sudo apt-get install samtools
-	FastQC
-	MOSAIK
-	MATLAB
-	R


(**Python Code**)
Before starting
-	Copy the FASTQ sample files into the /Read_files folder.
-	List the sample file names in /sample_file_names.txt, one per line. For example:
		- sample_file_names.txt:
			p_file1_1.fq
			p_file1_2.fq
			p_file2_1.fq
			p_file2_2.fq
			p_file3_1.fq
			p_file3_2.fq
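
Instead of typing the list by hand, the file names can be collected from the read folder. The following is only a minimal sketch, not part of the provided scripts: the function names (list_fastq_names, write_sample_list) and the accepted extensions (.fq, .fastq) are assumptions made for illustration.

```python
import os


def list_fastq_names(read_dir, extensions=(".fq", ".fastq")):
    """Return the sorted FASTQ file names found directly inside read_dir."""
    names = [
        name
        for name in os.listdir(read_dir)
        if name.endswith(extensions)
        and os.path.isfile(os.path.join(read_dir, name))
    ]
    return sorted(names)


def write_sample_list(read_dir, list_path):
    """Write one FASTQ file name per line to list_path and return the names."""
    names = list_fastq_names(read_dir)
    with open(list_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names


# Hypothetical usage, assuming the folder layout described above:
# write_sample_list("Read_files", "sample_file_names.txt")
```

Sorting keeps the two mates of a paired-end sample (for example p_file1_1.fq and p_file1_2.fq) on adjacent lines, which matches the layout of the example list.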
			

To execute FastQC quality control, run the following command:
	- python quality_control.py

FastQC results are saved in the /fastqc_output folder.


To execute MetaPhlAn2, run the following command:
	- python metaphlan_profiling.py

MetaPhlAn2 results are saved in the /metaphlan_output folder.


To execute MOSAIK, run the following command:
	- python mapping_mosaik.py

MOSAIK results are saved in the /mosaik_outputs folder.


You can also download the related EC numbers and reactions from the KEGG database using the following commands:
	- python kegg_ko_2_ec.py
	- python kegg_ec_2_re.py
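
The internals of kegg_ko_2_ec.py and kegg_ec_2_re.py are not described here. As a rough illustration of what a KO-to-EC lookup involves, the sketch below parses two-column, tab-separated text of the kind returned by the "link" operation of the KEGG REST API. The function name parse_kegg_links and the sample input are assumptions for illustration, and the scripts themselves may work differently.

```python
from collections import defaultdict


def parse_kegg_links(text):
    """Parse tab-separated 'source<TAB>target' lines into {source: [targets]}."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, target = line.split("\t")
        # Drop database prefixes such as "ko:" or "ec:".
        source = source.split(":", 1)[-1]
        target = target.split(":", 1)[-1]
        if target not in mapping[source]:
            mapping[source].append(target)
    return dict(mapping)


# Illustrative input in the assumed format (not output captured from KEGG).
example = "ko:K00001\tec:1.1.1.1\nko:K00016\tec:1.1.1.27\n"
ko_to_ec = parse_kegg_links(example)
```

The same parser would apply to an EC-to-reaction table, since that link output has the same two-column shape.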
	
	
(**MATLAB Code**)
To run N repeats of K-fold cross-validation with SVM classification (n = number of repeats, k = number of folds), run the following command in MATLAB from the cross_validation_svm folder:
	- cross_validation_svm(data,label,n,k)
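
To make the resampling scheme concrete, here is a small Python sketch of how the train/test splits for N repeats of K-fold cross-validation can be generated. It is not a port of the MATLAB function: the name repeated_kfold_indices and the seed argument are hypothetical, and the SVM training step is left out. A classifier would be fitted on each train split and evaluated on the matching test split.

```python
import random


def repeated_kfold_indices(n_samples, n_repeats, k, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for n_repeats rounds of k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
        for fold, test_idx in enumerate(folds):
            # The training set is every sample outside the current test fold.
            train_idx = [i for j, other in enumerate(folds) if j != fold for i in other]
            yield repeat, fold, sorted(train_idx), sorted(test_idx)


# Example: 10 samples, 3 repeats, 5 folds gives 15 train/test splits.
splits = list(repeated_kfold_indices(10, 3, 5))
```

Within one repeat every sample appears in exactly one test fold, and repeating the procedure with new shuffles reduces the dependence of the accuracy estimate on a single partition.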

(**R Code**)
The compositional bias correction and related R code are located in the /R_code folder.

