Processing for Data_Tam_DNAseq_2025


Tags: pipeline

  1. Targets
    1. Could you please help me process these data (Project: X101SC25015922-Z01-J001)?
    2. For your information:
    3. 1. Please compare the data with the AYE strain (CU459141) across the following conditions:
    4. a) AYE-S
    5. b) AYE-Q
    6. c) AYE-WT on Tig4
    7. d) AYE-craA on Tig4
    8. e) AYE-craA-1 on Cm200
    9. f) AYE-craA-2 on Cm200
    10. 2. The "clinical" sample refers to a clinical isolate of Acinetobacter baumannii. I'm unsure which reference genome would be most appropriate for comparison in this case. Can we use lab strains (CP059040, CU459141, and CP079931) as reference genomes for comparison?
    11. Could you also process the genome sequences for project X101SC24115801-Z01-J001?
    12. 1. Kindly compare the data with the ATCC 19606 strain (CP059040) under the following conditions:
    13. a) adeABadeIJ (knockout of adeA, adeB, adeI, and adeJ, please confirm whether these genes are successfully knocked out.)
    14. b) adeIJK (knockout of adeI, adeJ, and adeK, please confirm whether these genes are successfully knocked out.)
    15. c) CM1
    16. d) CM2
    17. Note: the "HF" sample may instead be a clinical isolate of Enterobacter hormaechei.
    18. 2. The "HF" sample refers to a clinical isolate of Acinetobacter baumannii. I'm unsure which reference genome would be most appropriate for comparison in this case. Can we use lab strains (CP059040, CU459141, and CP079931) as reference genomes for comparison?

Project Data_Tam_DNAseq_2025_AYE

  1. Download the raw data and check the MD5 checksums.
    1. 86e4016c902a1cd23a2190415425e641 01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz
    2. 554eb44ae261312039929f0991582111 01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz
    3. ce004b0d7135bce80f34bd6bac3e89e7 01.RawData/AYE-Q/AYE-Q_1.fq.gz
    4. bddc7ced051a2167a5a8341332d7423a 01.RawData/AYE-Q/AYE-Q_2.fq.gz
    5. 227d93b8a762185d5dcd1e4975041491 01.RawData/AYE-S/AYE-S_1.fq.gz
    6. f098c9a8579bf5729427dc871225a290 01.RawData/AYE-S/AYE-S_2.fq.gz
    7. 78e08dd090d89330b1021ce42fb09baa 01.RawData/clinical/clinical_1.fq.gz
    8. 2346fef1d896ef0924d2ec88db51cade 01.RawData/clinical/clinical_2.fq.gz
    9. 4c07494505caf22f70edb54692bcaca2 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz
    10. 52944e395004dc11758d422690bda168 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz
    11. 92b498ed7465645ca00bbc945c514fe2 01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz
    12. fd9d670942973e6760d6dd78f4ee852a 01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz
    13. 375f1e3efb60571ffd457b3cb1e64a84 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz
    14. 041c08f4c45f1fabd129fc10500c6582 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz
    15. c129aa9a208ca47db10bb04e54c096d7 02.Report_X101SC25015922-Z01-J001.zip
    16. md5sum 01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz > MD5.txt_
    17. md5sum 01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz >> MD5.txt_
    18. md5sum 01.RawData/AYE-Q/AYE-Q_1.fq.gz >> MD5.txt_
    19. md5sum 01.RawData/AYE-Q/AYE-Q_2.fq.gz >> MD5.txt_
    20. md5sum 01.RawData/AYE-S/AYE-S_1.fq.gz >> MD5.txt_
    21. md5sum 01.RawData/AYE-S/AYE-S_2.fq.gz >> MD5.txt_
    22. md5sum 01.RawData/clinical/clinical_1.fq.gz >> MD5.txt_
    23. md5sum 01.RawData/clinical/clinical_2.fq.gz >> MD5.txt_
    24. md5sum 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz >> MD5.txt_
    25. md5sum 01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz >> MD5.txt_
    26. md5sum 01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz >> MD5.txt_
    27. md5sum 01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz >> MD5.txt_
    28. md5sum 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz >> MD5.txt_
    29. md5sum 01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz >> MD5.txt_
    30. md5sum 02.Report_X101SC25015922-Z01-J001.zip >> MD5.txt_
    31. ce004b0d7135bce80f34bd6bac3e89e7 AYE-Q_1.fq.gz
    32. bddc7ced051a2167a5a8341332d7423a AYE-Q_2.fq.gz
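
The checksum comparison above can also be scripted. The following is a minimal sketch, assuming the provider's manifest uses the two-column `<md5> <path>` layout shown in the listing. The function names `md5_of` and `verify_manifest` and the manifest name `MD5.txt` are illustrative assumptions, not part of the original workflow.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest, base_dir="."):
    """Compare each '<md5> <relative path>' line of a manifest with the file on disk.

    Returns a dict mapping relative path -> 'OK', 'MISMATCH' or 'MISSING'.
    """
    results = {}
    for line in Path(manifest).read_text().splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected = parts[0].lower()
        # md5sum may prefix the path with '*' (binary mode); drop it if present.
        rel_path = parts[1].strip().lstrip("*")
        target = Path(base_dir) / rel_path
        if not target.is_file():
            results[rel_path] = "MISSING"
        elif md5_of(target) == expected:
            results[rel_path] = "OK"
        else:
            results[rel_path] = "MISMATCH"
    return results


if __name__ == "__main__":
    # Hypothetical usage: MD5.txt is assumed to be the manifest shipped with the data.
    if Path("MD5.txt").is_file():
        for rel_path, status in verify_manifest("MD5.txt").items():
            print(status, rel_path)
```

Reading in chunks keeps memory use flat even for multi-gigabyte `fq.gz` files. Any path reported as MISMATCH or MISSING should be downloaded again before the pipeline is started.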

Data processing follows http://xgenes.com/article/article-content/325/analysis-of-snps-indels-transposons-and-is-elements-in-5-a-baumannii-strains/

  1. Call variants using snippy

    1. ln -s ~/Tools/bacto/db/ .;
    2. ln -s ~/Tools/bacto/envs/ .;
    3. ln -s ~/Tools/bacto/local/ .;
    4. cp ~/Tools/bacto/Snakefile .;
    5. cp ~/Tools/bacto/bacto-0.1.json .;
    6. cp ~/Tools/bacto/cluster.json .;
    7. mkdir raw_data; cd raw_data;
    8. # Note: the file names must end with fastq.gz
    9. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_1.fq.gz AYE-S_R1.fastq.gz
    10. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-S/AYE-S_2.fq.gz AYE-S_R2.fastq.gz
    11. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_1.fq.gz AYE-Q_R1.fastq.gz
    12. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-Q/AYE-Q_2.fq.gz AYE-Q_R2.fastq.gz
    13. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_1.fq.gz AYE-WT_on_Tig4_R1.fastq.gz
    14. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-WTonTig4/AYE-WTonTig4_2.fq.gz AYE-WT_on_Tig4_R2.fastq.gz
    15. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_1.fq.gz AYE-craA_on_Tig4_R1.fastq.gz
    16. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craAonTig4/AYE-craAonTig4_2.fq.gz AYE-craA_on_Tig4_R2.fastq.gz
    17. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_1.fq.gz AYE-craA-1_on_Cm200_R1.fastq.gz
    18. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-1onCm200/AYE-craA-1onCm200_2.fq.gz AYE-craA-1_on_Cm200_R2.fastq.gz
    19. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_1.fq.gz AYE-craA-2_on_Cm200_R1.fastq.gz
    20. ln -s ../X101SC25015922-Z01-J001/01.RawData/AYE-craA-2onCm200/AYE-craA-2onCm200_2.fq.gz AYE-craA-2_on_Cm200_R2.fastq.gz
    21. #ln -s ../X101SC25015922-Z01-J001/01.RawData/clinical/clinical_1.fq.gz clinical_R1.fastq.gz
    22. #ln -s ../X101SC25015922-Z01-J001/01.RawData/clinical/clinical_2.fq.gz clinical_R2.fastq.gz
    23. #download CU459141.gb from GenBank
    24. mv ~/Downloads/sequence\(1\).gb db/CU459141.gb
    25. #set the following in bacto-0.1.json
    26. "fastqc": false,
    27. "taxonomic_classifier": false,
    28. "assembly": true,
    29. "typing_ariba": false,
    30. "typing_mlst": true,
    31. "pangenome": true,
    32. "variants_calling": true,
    33. "phylogeny_fasttree": true,
    34. "phylogeny_raxml": true,
    35. "recombination": false, (set to false because of a gubbins error)
    36. "genus": "Acinetobacter",
    37. "kingdom": "Bacteria",
    38. "species": "baumannii", (in both prokka and mykrobe)
    39. "reference": "db/CU459141.gb"
    40. conda activate bengal3_ac3
    41. (bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
    42. #Check the MLST results to decide whether the large computation that includes the clinical sample is needed. TODO: send the MLST results to Tam. Next step: use vrap to check which complete isolate is closest.
  2. Run a second pass without the clinical sample

    1. mkdir results_with_clinical
    2. mv variants results_with_clinical
    3. mv roary results_with_clinical
    4. mv fasttree results_with_clinical
    5. mv raxml-ng results_with_clinical
    6. mv snippy/clinical/ snippy_clinical
    7. mv trimmed/clinical_trimmed_*.fastq .
    8. rm raw_data/clinical_*.fastq.gz
    9. rm fastq/clinical_*.fastq
    10. (bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
  3. Call variants using spandx (the results are almost the same as those from viral-ngs!)

    1. mkdir ~/miniconda3/envs/spandx/share/snpeff-5.1-2/data/CP059040
    2. cp CP059040.gb ~/miniconda3/envs/spandx/share/snpeff-5.1-2/data/CP059040/genes.gbk
    3. vim ~/miniconda3/envs/spandx/share/snpeff-5.1-2/snpEff.config
    4. /home/jhuang/miniconda3/envs/spandx/bin/snpEff build CP059040 #-d
    5. ~/Scripts/genbank2fasta.py CP059040.gb
    6. mv CP059040.gb_converted.fna CP059040.fasta #rename "CP059040.1 xxxxx" to "CP059040" in the fasta-file
    7. ln -s /home/jhuang/Tools/spandx/ spandx
    8. (spandx) nextflow run spandx/main.nf --fastq "snippy_CP059040/trimmed/*_P_{1,2}.fastq" --ref CP059040.fasta --annotation --database CP059040 -resume
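
The raw-data linking in the snippy step above follows one pattern: `<sample>_1.fq.gz` and `<sample>_2.fq.gz` become `<name>_R1.fastq.gz` and `<name>_R2.fastq.gz`. Below is a minimal sketch of that mapping, assuming the `01.RawData/<sample>/` layout from the download listing. The helper name `link_raw_data` and its `rename` and `skip` parameters are illustrative assumptions, not part of the bacto pipeline.

```python
import os
from pathlib import Path


def link_raw_data(raw_root, out_dir, rename=None, skip=()):
    """Symlink <raw_root>/<sample>/<sample>_{1,2}.fq.gz into out_dir as
    <name>_R{1,2}.fastq.gz.

    `rename` optionally maps sample -> name; samples listed in `skip` are
    left out. Returns the list of created link paths.
    """
    rename = rename or {}
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    created = []
    for sample_dir in sorted(Path(raw_root).iterdir()):
        sample = sample_dir.name
        if not sample_dir.is_dir() or sample in skip:
            continue
        for mate in ("1", "2"):
            source = sample_dir / f"{sample}_{mate}.fq.gz"
            if not source.is_file():
                continue
            link = out / f"{rename.get(sample, sample)}_R{mate}.fastq.gz"
            if link.is_symlink():
                link.unlink()  # replace a stale link from an earlier run
            # Relative target, like the `ln -s ../...` commands above.
            link.symlink_to(os.path.relpath(source, out))
            created.append(link)
    return created


if __name__ == "__main__":
    raw = "X101SC25015922-Z01-J001/01.RawData"
    if Path(raw).is_dir():
        names = {
            "AYE-WTonTig4": "AYE-WT_on_Tig4",
            "AYE-craAonTig4": "AYE-craA_on_Tig4",
            "AYE-craA-1onCm200": "AYE-craA-1_on_Cm200",
            "AYE-craA-2onCm200": "AYE-craA-2_on_Cm200",
        }
        # Leaving out "clinical" mirrors the second run without that sample.
        link_raw_data(raw, "raw_data", rename=names, skip=("clinical",))
```

Passing `skip=("clinical",)` gives the same effect as the commented-out `ln -s` lines for the clinical sample.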

Run vrap to identify the most closely related species in the database for the clinical sample

  1. ln -s ../X101SC24115801-Z01-J001/01.RawData/HF/HF_1.fq.gz HF_R1.fastq.gz
  2. ln -s ../X101SC24115801-Z01-J001/01.RawData/HF/HF_2.fq.gz HF_R2.fastq.gz
  1. Download all Acinetobacter baumannii genomes and keep the complete genomes

    1. #Acinetobacter baumannii Taxonomy ID: 470
    2. #esearch -db nucleotide -query "txid470[Organism:exp]" | efetch -format fasta -email j.huang@uke.de > genome_470_ncbi.fasta
    3. #python ~/Scripts/filter_fasta.py genome_470_ncbi.fasta complete_genome_470_ncbi.fasta #
    4. # ---- Download related genomes from ENA ----
    5. https://www.ebi.ac.uk/ena/browser/view/470
    6. #Click "Sequence" and download "Counts" (13059) and "Taxon descendants count" (16091) if there is enough time. Downloaded on 28.02.2025.
    7. python ~/Scripts/filter_fasta.py ena_470_sequence.fasta complete_genome_470_ena_taxon_descendants_count.fasta #16091-->920
    8. #python ~/Scripts/filter_fasta.py ena_470_sequence_Counts.fasta complete_genome_470_ena_Counts.fasta #xxx, 5.8G
  2. Run vrap

    1. #Replace the --virus database with one for the specific taxon (e.g. Acinetobacter baumannii), i.e. change virus_user_db --> specific_bacteria_user_db
    2. ln -s ~/Tools/vrap/ .
    3. mamba activate /home/jhuang/miniconda3/envs/vrap
    4. vrap/vrap.py -1 trimmed/clinical/clinical_1.fq.gz -2 trimmed/clinical/clinical_2.fq.gz -o vrap_clinical --bt2idx=/home/jhuang/REFs/genome --host=/home/jhuang/REFs/genome.fa --virus=/home/jhuang/DATA/Data_Tam_DNAseq_2025_AYE/complete_genome_470_ena_taxon_descendants_count.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr -t 100 -l 200 -g
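
The contents of `~/Scripts/filter_fasta.py` are not shown in this post. One plausible reading of the `16091-->920` comment is that the script keeps only records whose header marks them as a complete genome. The sketch below works under that assumption: the `complete genome` keyword test and the names `read_fasta` and `filter_complete_genomes` are assumptions, not the original script.

```python
import sys


def read_fasta(handle):
    """Yield (header, sequence_lines) tuples from an open FASTA file."""
    header, lines = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, lines
            header, lines = line, []
        elif header is not None:
            lines.append(line)
    if header is not None:
        yield header, lines


def filter_complete_genomes(in_path, out_path, keyword="complete genome"):
    """Copy records whose header contains `keyword` (case-insensitive).

    Returns (records_read, records_kept).
    """
    total = kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, lines in read_fasta(src):
            total += 1
            if keyword.lower() in header.lower():
                kept += 1
                dst.write(header + "\n")
                for seq_line in lines:
                    dst.write(seq_line + "\n")
    return total, kept


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical command line: filter_sketch.py input.fasta output.fasta
    n_total, n_kept = filter_complete_genomes(sys.argv[1], sys.argv[2])
    print(f"{n_total}-->{n_kept}")
```

The records are streamed one at a time, so a multi-gigabyte ENA download does not need to fit in memory. A header-keyword filter of this kind also retains plasmid records labelled "complete genome".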

Project Data_Tam_DNAseq_2025_adeABadeIJ_adeIJK_CM1_CM2

  1. (bengal3_ac3) /home/jhuang/miniconda3/envs/snakemake_4_3_1/bin/snakemake --printshellcmds
  2. #HF is Enterobacter cloacae (550) or Enterobacter hormaechei (158836)
  3. # ---- Download related genomes from ENA ----
  4. https://www.ebi.ac.uk/ena/browser/view/550
  5. #Click "Sequence" and download "Counts" (7263) and "Taxon descendants count" (8004) if there is enough time. Downloaded on 28.02.2025.
  6. python ~/Scripts/filter_fasta.py ena_550_sequence.fasta complete_genome_550_ena_taxon_descendants_count.fasta #8004-->100
  7. https://www.ebi.ac.uk/ena/browser/view/158836
  8. #Click "Sequence" and download "Counts" (3763) and "Taxon descendants count" (4846) if there is enough time. Downloaded on 28.02.2025.
  9. python ~/Scripts/filter_fasta.py ena_158836_sequence.fasta complete_genome_158836_ena_taxon_descendants_count.fasta #4846-->540
  10. cat complete_genome_158836_ena_taxon_descendants_count.fasta complete_genome_550_ena_taxon_descendants_count.fasta > complete_genome_158836_550.fasta
  11. grep "ENA|AP022130|AP022130.1" complete_genome_158836_550.fasta
  12. #>ENA|AP022130|AP022130.1 Enterobacter cloacae plasmid pWP5-S18-CRE-02_4 DNA, complete genome, strain: WP5-S18-CRE-02.
  13. ln -s ~/Tools/vrap/ .
  14. mamba activate /home/jhuang/miniconda3/envs/vrap
  15. vrap/vrap.py -1 trimmed/clinical/clinical_1.fq.gz -2 trimmed/clinical/clinical_2.fq.gz -o vrap_clinical --bt2idx=/home/jhuang/REFs/genome --host=/home/jhuang/REFs/genome.fa --virus=/home/jhuang/DATA/Data_Tam_DNAseq_2025_AYE/complete_genome_470_ena_taxon_descendants_count.fasta --nt=/mnt/nvme0n1p1/blast/nt --nr=/mnt/nvme0n1p1/blast/nr -t 100 -l 200 -g
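
The `cat` and `grep` steps above can be checked programmatically, for example to confirm how many records the merged database holds and which header belongs to an accession. This is a minimal sketch, assuming ENA-style headers of the form `ENA|<accession>|<accession.version> <description>` as in the `grep` output above. The function names `count_records` and `find_headers` are illustrative assumptions.

```python
def count_records(fasta_path):
    """Return the number of FASTA records (lines starting with '>')."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def find_headers(fasta_path, accession):
    """Return header lines (without '>') that carry `accession` as a whole
    '|'- or whitespace-delimited token, so partial IDs do not match."""
    hits = []
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue
            header = line[1:].rstrip("\n")
            tokens = header.replace("|", " ").split()
            if accession in tokens:
                hits.append(header)
    return hits


if __name__ == "__main__":
    import os

    merged = "complete_genome_158836_550.fasta"
    if os.path.isfile(merged):
        print(count_records(merged), "records")
        for header in find_headers(merged, "AP022130.1"):
            print(header)
```

Matching on whole tokens avoids the false hits that a substring search would give for accessions sharing a prefix.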

Supplementary: Enterobacter cloacae (taxid550) vs. Enterobacter hormaechei (taxid158836)

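  1. 🔬 Introduction
  2. Enterobacter cloacae and Enterobacter hormaechei both belong to the family Enterobacteriaceae and are Gram-negative, facultatively anaerobic, rod-shaped bacteria. They are widespread in the environment (e.g. water, soil, plants) and in the intestines of humans and animals.
  3. 🦠 Enterobacter cloacae
  4. Characteristics:
  5. Gram-negative, facultatively anaerobic, motile rod
  6. Survives in many environments and is highly adaptable
  7. Carries β-lactamases and can resist multiple antibiotics
  8. Pathogenicity:
  9. An opportunistic pathogen that can cause healthcare-associated infections (HAI), such as:
  10. Urinary tract infection (UTI)
  11. Pneumonia
  12. Sepsis
  13. Wound infection
  14. Resistance:
  15. Produces extended-spectrum β-lactamases (ESBLs) and carbapenemases (as found in carbapenem-resistant Enterobacterales, CRE), giving high resistance to penicillins, cephalosporins, and carbapenems
  16. E. cloacae strains from hospital settings show relatively high resistance rates and are difficult to treat
  17. 🦠 Enterobacter hormaechei
  18. Characteristics:
  19. Very similar to E. cloacae and also a member of the Enterobacter cloacae complex
  20. Differs slightly from E. cloacae at the molecular level; telling them apart usually requires gene sequencing (e.g. 16S rRNA or MLST)
  21. Pathogenicity:
  22. Also an opportunistic pathogen that can cause:
  23. Hospital-acquired infections (e.g. in ICU patients)
  24. Sepsis in immunocompromised patients
  25. Neonatal sepsis (seen in NICUs)
  26. Resistance:
  27. More prone to developing resistance than E. cloacae, especially carbapenem-resistant strains (CRE)
  28. In recent years, E. hormaechei has been regarded as a high-risk organism for hospital outbreaks
  29. 🔬 Key differences (E. cloacae vs. E. hormaechei)
  30. Feature | Enterobacter cloacae | Enterobacter hormaechei
  31. Chinese name | 阴沟肠杆菌 | 霍尔马氏肠杆菌
  32. Complex | Enterobacter cloacae complex | Enterobacter cloacae complex
  33. Pathogenicity | Opportunistic infections | Opportunistic infections, common in ICUs
  34. Resistance | May produce ESBLs and carbapenemases | More prone to carbapenem resistance (CRE), higher resistance rates
  35. Molecular identification | 16S rRNA, MALDI-TOF | Requires gene sequencing to distinguish
  36. Hospital outbreaks | Rare | Common
  37. 🩺 Prevention & treatment
  38. Strengthen hospital infection control (e.g. hand hygiene, environmental disinfection)
  39. Antimicrobial susceptibility testing (AST): choose appropriate antibiotics against resistant strains, such as tigecycline or colistin
  40. Restrict the use of broad-spectrum antibiotics to limit the spread of resistant strains
  41. Summary:
  42. 🔹 E. cloacae and E. hormaechei are both members of the Enterobacter cloacae complex and readily cause hospital infections
  43. 🔹 E. hormaechei is usually more resistant than E. cloacae, especially CRE strains
  44. 🔹 Clinically, molecular identification is needed to distinguish them and to choose an appropriate treatment
  45. For hospital-acquired isolates, perform antimicrobial susceptibility testing (AST) first and then select appropriate antibiotics 🚑💊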
