This pipeline processes BL-Hi-C data. It is based on Juicer and HiC-Pro and combines the advantages of both. HiCpipe is much faster than Juicer and HiC-Pro and can output multiple features of Hi-C maps. The main.sh script trims the BL-Hi-C linker and maps the reads to the chosen genome. It then uses the subjob.sh script to run the remaining steps in parallel as background shell jobs. You can use top or htop to monitor the running processes.
The outputs are listed below:
name | software | output content |
---|---|---|
mapping | bwa | merged mapped reads(.bam) |
filter | HiC-pro | contact pairs (.txt) |
pair2hic | juicer (pre) | compressed Hi-C maps(.hic) |
hic2map | juicer(dump) | sparse and dense matrix (.mat) |
compartment | R eigen | PC1 values(.txt, .bw) |
TAD | Insulation score | TAD boundaries(.bed); insulation score(.bw) |
CDB | HiCDB | CDBs(.bed); relative insulation score(.bw) |
loop | HiCloop | loops(.bedpe) |
qc | shell | Hi-C quality report |
Here are the general features of the HiCpipe software (under development).
Other utilities:
Easy clustering based on compartment and insulation.
Statistics of Hi-C features.
All software mentioned above must be installed first. To install the pipeline, simply clone the repository and run the shell scripts directly:
git clone https://github.com/ChenFengling/HiCpipe.git
Organize your data as PROJECT_PATH/sample/sample_R1.fq.gz and PROJECT_PATH/sample/sample_R2.fq.gz, for example:
BLHiC-project1
├── sample1
│   ├── sample1_R1.fq.gz
│   └── sample1_R2.fq.gz
└── sample2
    ├── sample2_R1.fq.gz
    └── sample2_R2.fq.gz
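The layout above can be checked before starting a run. The following is a minimal sketch, not part of HiCpipe: `check_layout` is a hypothetical helper name, and it assumes every sample directory holds exactly the `<sample>_R1.fq.gz` and `<sample>_R2.fq.gz` pair shown in the example tree.

```shell
# check_layout PROJECT_PATH   (hypothetical helper, not shipped with HiCpipe)
# Prints "OK <sample>" when both <sample>_R1.fq.gz and <sample>_R2.fq.gz
# exist in the sample directory, and "MISSING <sample>" otherwise.
# Returns non-zero if at least one sample is incomplete.
check_layout() {
    project=$1
    status=0
    for dir in "$project"/*/; do
        # Skip the unexpanded glob when the project has no subdirectories.
        [ -d "$dir" ] || continue
        sample=$(basename "$dir")
        # all_results is the pipeline's summary directory, not a sample.
        if [ "$sample" = "all_results" ]; then
            continue
        fi
        if [ -f "$dir/${sample}_R1.fq.gz" ] && [ -f "$dir/${sample}_R2.fq.gz" ]; then
            echo "OK $sample"
        else
            echo "MISSING $sample"
            status=1
        fi
    done
    return $status
}
```

For the example tree, `check_layout BLHiC-project1` would report one line per sample directory.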
The summarized results are written to PROJECT_PATH/all_results/.
Use the following command to analyse your BL-Hi-C data:
sh main.sh $PROJECT_PATH $Resolution $genome $core $HiCpipe_PATH
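As an illustration of the five positional arguments, here is a small dry-run sketch that validates them and prints the command line without executing it. `build_main_cmd` is a hypothetical helper, and it assumes that the resolution is given in base pairs and the core count as a plain integer, which the command above does not state explicitly.

```shell
# build_main_cmd PROJECT_PATH RESOLUTION GENOME CORES HICPIPE_PATH
# Hypothetical dry-run helper: checks the arguments and prints the
# main.sh command line instead of running it.
build_main_cmd() {
    if [ "$#" -ne 5 ]; then
        echo "usage: build_main_cmd PROJECT_PATH RESOLUTION GENOME CORES HICPIPE_PATH" >&2
        return 2
    fi
    # Assumption: resolution is an integer number of base pairs.
    case $2 in
        ''|*[!0-9]*) echo "resolution must be an integer" >&2; return 2 ;;
    esac
    # Assumption: the core count is an integer.
    case $4 in
        ''|*[!0-9]*) echo "core count must be an integer" >&2; return 2 ;;
    esac
    echo "sh main.sh $1 $2 $3 $4 $5"
}
```

For example, `build_main_cmd /data/BLHiC-project1 40000 mm9 8 /home/software/HiCpipe` prints the command for an mm9 run with 8 cores; the paths, resolution and core count here are placeholder values.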
Before running, set BOWTIE2_IDX_PATH, GENOME_SIZE and GENOME_FRAGMENT in config-hicpro_*.txt.
1. Change the TSS annotation in compartment.r:
tss=read.table("YOUR_TSS_FOLDER/tss.bed")
2. Change BOWTIE2_IDX_PATH, GENOME_SIZE and GENOME_FRAGMENT in config-hicpro.txt. Follow the instructions at https://github.com/nservant/HiC-Pro/tree/master/annotation to generate the restriction enzyme sites:
/home/software/HiC-Pro/bin/utils/digest_genome.py -r GG^CC -o mm9_ggcc.bed /home/reference/mouse/mm9/Sequence/BWAIndex/genome.fa
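After editing the configuration, it can help to confirm that the three keys are actually set. The sketch below is a hypothetical helper (`check_hicpro_config` is not part of HiCpipe or HiC-Pro), and it assumes the file uses `KEY = value` lines, as HiC-Pro configuration files do.

```shell
# check_hicpro_config CONFIG_FILE   (hypothetical helper)
# Prints "KEY=value" for each required key, or "KEY is not set" when the
# key is absent or empty. Returns non-zero if any key is unset.
check_hicpro_config() {
    config=$1
    status=0
    for key in BOWTIE2_IDX_PATH GENOME_SIZE GENOME_FRAGMENT; do
        # Keep the last "KEY = value" line and trim surrounding whitespace.
        value=$(sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$config" \
            | tail -n 1 | sed 's/[[:space:]]*$//')
        if [ -n "$value" ]; then
            echo "$key=$value"
        else
            echo "$key is not set"
            status=1
        fi
    done
    return $status
}
```

Calling `check_hicpro_config config-hicpro.txt` lists the current values, which makes a leftover default path easy to spot.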
Use HiCqc.sh to generate the Hi-C QC report:
sh HiCqc.sh $PROJECT_PATH $REPORT_NAME $HiCpipe_PATH
You will find the QC report REPORT_NAME_report.txt under PROJECT_PATH.
Valid_interaction_pairs/Total_PETs (>50%)
valid_interaction_rmdup/Valid_interaction_pairs (>85%)
cis_interaction/trans_interaction (>1.5)
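The three ratios above can be computed from raw counts with a few lines of awk. This is a minimal sketch under stated assumptions: `qc_check` is a hypothetical helper, the counts are passed as arguments rather than parsed from REPORT_NAME_report.txt (whose exact layout is not described here), and the listed values are treated as minimum thresholds.

```shell
# qc_check TOTAL_PETS VALID_PAIRS VALID_RMDUP CIS TRANS   (hypothetical helper)
# Prints one tab-separated line per metric: name, ratio, PASS or FAIL.
# Exit status: 0 if all pass, 1 if any fails, 2 on invalid counts.
qc_check() {
    awk -v total="$1" -v valid="$2" -v rmdup="$3" -v cis="$4" -v trans="$5" '
    function report(name, value, threshold) {
        printf "%s\t%.3f\t%s\n", name, value, (value > threshold ? "PASS" : "FAIL")
        if (value <= threshold) failed = 1
    }
    BEGIN {
        # Denominators must be positive to avoid division by zero.
        if (total <= 0 || valid <= 0 || trans <= 0) {
            print "counts must be positive" > "/dev/stderr"
            exit 2
        }
        report("valid/total", valid / total, 0.50)   # Valid_interaction_pairs/Total_PETs
        report("rmdup/valid", rmdup / valid, 0.85)   # valid_interaction_rmdup/Valid_interaction_pairs
        report("cis/trans", cis / trans, 1.5)        # cis_interaction/trans_interaction
        exit (failed ? 1 : 0)
    }'
}
```

A call such as `qc_check 1000000 600000 540000 400000 140000` (illustrative numbers, not real data) evaluates all three criteria at once.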
HiCDB paper
Chen, F., Li, G., Zhang, M. Q., & Chen, Y. (2018). HiCDB: a sensitive and robust method for detecting contact domain boundaries. Nucleic acids research, 46(21), 11239-11250.
T-ALL paper using HiCpipe
Yang, L., Chen, F., Zhu, H., Chen, Y., Dong, B., Shi, M., ... & Zhang, M. Q. (2020). 3D Genome Analysis Identifies Enhancer Hijacking Mechanism for High-Risk Factors in Human T-Lineage Acute Lymphoblastic Leukemia. bioRxiv.
ChIA-PET2 https://github.com/GuipengLi/ChIA-PET2
HiC-Pro sample
HiC-Pro
Juicer tools pre https://github.com/theaidenlab/juicer/wiki/Pre#4dn-dcic-format
Juicebox https://github.com/theaidenlab/Juicebox
video for Juicebox usage
CNV and translocation tools: HiCtrans HiCnv HiCapp