Quick Start
- Specify the paths of the references (.fasta) and samples (.fastq) in a configuration file (.yaml).
  For example, save the following block to a text file named `data.yaml`:

      reference:
        contamination:
          fa: ./ref/contamination.fa
        genes:
          fa: ./ref/genes.fa
        genome:
          fa: /data/reference/genome/Mus_musculus/GRCm39.fa
          star: /data/reference/genome/Mus_musculus/star/GRCm39.release108
      samples:
        mESCWT-rep1-input:
          data:
            - R1: ./test/IP16.fastq.gz
          group: mESCWT
          treated: false
        mESCWT-rep1-treated:
          data:
            - R1: ./test/IP4.fastq.gz
          group: mESCWT
          treated: true
        mESCWT-rep2-treated:
          data:
            - R1: ./test/IP5.fastq.gz
          group: mESCWT
          treated: true

  You can also copy and edit this template.
  Read more details on how to customize it.
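Before starting a run, it can help to confirm that the files named in the configuration actually exist. The sketch below is not part of the pipeline; the function names `list_config_paths` and `check_config_paths` are hypothetical. It only scans for `fa:` and `R1:` lines with a simple pattern, so it is not a real YAML parser, and relative paths are resolved against the current directory.

```shell
#!/bin/sh
# Sketch: check that the files referenced in a config such as data.yaml exist.
# Assumption: every path sits on its own line as the value of `fa:` or `R1:`,
# without quotes or trailing comments.

# Print every value of an `fa:` or `R1:` key, one path per line.
list_config_paths() {
    sed -n -E 's/^[[:space:]]*(- )?(fa|R1):[[:space:]]*//p' "$1"
}

# Report each missing path on stderr; return 1 if any path is missing.
check_config_paths() {
    missing=0
    paths=$(list_config_paths "$1")
    while IFS= read -r path; do
        [ -z "$path" ] && continue
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done <<EOF
$paths
EOF
    return "$missing"
}
```

After sourcing these functions, `check_config_paths data.yaml` prints nothing and returns 0 when every listed file is present.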
- Run the whole analysis with one command:

      apptainer run docker://y9ch/bidseq

  By default, the pipeline loads the configuration file named `data.yaml` in the current directory.

  How to run apptainer on computation nodes without internet access?
  - (On the login node, with internet connection) Run `module load apptainer` to load the apptainer utilities, if apptainer is not available by default.
  - (On the login node, with internet connection) Build the `bidseq_latest.sif` file with the command `apptainer pull docker://y9ch/bidseq`.
  - (On the computation node) Run `apptainer run bidseq_latest.sif -c data.yaml` to start the pipeline. Note that most HPC systems mount directories in a complex manner, so the path you see may not be the real one. Find the actual path by running `realpath ./` and pass the output to apptainer as a bind path: `apptainer run -B /the/real/path ...`
  If your configuration file is not named `data.yaml`, add the argument `-c your_file_name.yaml` after the command.
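The offline steps above can be combined into one command line once the image has been pulled. The following is a minimal sketch that only prints the command for review rather than running it; the helper names `resolve_dir` and `build_run_cmd` are hypothetical, and it assumes the paths contain no spaces.

```shell
#!/bin/sh
# Sketch: assemble the offline run command from the steps above.
# Nothing is executed here; the command is only printed.

# Resolve the physical (symlink-free) path of a directory.
# `realpath` is the tool named in the steps; `pwd -P` is a POSIX fallback.
resolve_dir() {
    if command -v realpath >/dev/null 2>&1; then
        realpath "$1"
    else
        (cd "$1" && pwd -P)
    fi
}

# Print the apptainer command for an image file, a config file, and a
# directory to bind (defaults to the current directory).
build_run_cmd() {
    sif=$1
    config=$2
    workdir=$(resolve_dir "${3:-.}")
    printf 'apptainer run -B %s %s -c %s\n' "$workdir" "$sif" "$config"
}

build_run_cmd bidseq_latest.sif data.yaml .
```

Printing the command first makes it easy to check the bind path before submitting the job to a computation node.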
- View the analytics reports and filtered sites.
  3 folders will be created in the working directory (default: `workspace`):
  - Trimming, mapping, and deduplication reports are in the `report_reads` folder, with key numbers from all the steps reported in one webpage (example).
  - Filtered sites for Ψ site detection are in the `filter_sites` folder. These sites have only passed the simplest filtering; you can apply customized thresholds to them based on your data type and quality.
  - Processed mapping results (.bam) are in the `align_bam` folder. You can zoom into the locations you are interested in using IGV.
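A quick way to confirm that a run produced the three output folders is to count the files in each one. This is a sketch only: the function name `summarize_outputs` is hypothetical, and it assumes the folders sit directly inside the working directory.

```shell
#!/bin/sh
# Sketch: report how many files each expected output folder contains.
# Assumption: report_reads, filter_sites and align_bam are direct
# subfolders of the working directory (default: workspace).
summarize_outputs() {
    workdir=${1:-workspace}
    for sub in report_reads filter_sites align_bam; do
        dir="$workdir/$sub"
        if [ -d "$dir" ]; then
            count=$(find "$dir" -type f | wc -l | tr -d ' ')
            echo "$sub: $count file(s)"
        else
            echo "$sub: not found"
        fi
    done
}
```

Calling `summarize_outputs` with no argument checks the default `workspace` directory; pass another path if you changed the working directory.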