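velocyto: building a loom file from a BAM file

Symptoms and background of the problem

I am using velocyto run10x to build a loom file from a BAM file.

Relevant code (please do not paste screenshots)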
/mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam
(pyvelo) seven 16:06:35 /mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam
$ rmsk_gtf=/mnt/e/Zhou_XM/RNA_loom/mm10_rmsk.gtf
(pyvelo) seven 16:07:27 /mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam
$ cellranger_gtf=/mnt/e/Zhou_XM/RNA_loom/refdata-gex-mm10-2020-A/genes/genes.gtf
(pyvelo) seven 16:09:20 /mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam
$ cellranger_outDir=/mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam/KI
(pyvelo) seven 16:11:33 /mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam
$ ls -lh $rmsk_gtf  $cellranger_outDir $cellranger_gtf
-rwxrwxrwx 1 seven seven  87M Aug  8 10:54 /mnt/e/Zhou_XM/RNA_loom/mm10_rmsk.gtf
-rwxrwxrwx 1 seven seven 902M Jun 18  2020 /mnt/e/Zhou_XM/RNA_loom/refdata-gex-mm10-2020-A/genes/genes.gtf

/mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam/KI:
total 0
(pyvelo) seven 16:12:04 /mnt/e/Zhou_XM/RNA_velocyto/bam/KIbam
$ velocyto run10x -m $rmsk_gtf  $cellranger_outDir $cellranger_gtf

Output and error messages

2022-08-09 16:14:06,197 - ERROR - This is an older version of cellranger, cannot check if the output are ready, make sure of this yourself
2022-08-09 16:14:06,199 - ERROR - Can not locate the barcodes.tsv file!
Traceback (most recent call last):
  File "/home/seven/miniconda3/envs/pyvelo/bin/velocyto", line 8, in <module>
    sys.exit(cli())
  File "/home/seven/miniconda3/envs/pyvelo/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/seven/miniconda3/envs/pyvelo/lib/python3.9/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/seven/miniconda3/envs/pyvelo/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/seven/miniconda3/envs/pyvelo/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/seven/miniconda3/envs/pyvelo/lib/python3.9/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/seven/miniconda3/envs/pyvelo/lib/python3.9/site-packages/velocyto/commands/run10x.py", line 91, in run10x
    bcfile = bcmatches[0]
IndexError: list index out of range
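
My reasoning and what I have tried

I checked the cellranger version: the one installed on Linux is 6, while the sequencing data was processed with 5. But this step should not need that software at all, should it?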
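
Judging from the listing above, the folder passed as $cellranger_outDir is empty (`total 0`), which would explain "Can not locate the barcodes.tsv file!" regardless of the cellranger version. As far as I know, velocyto run10x expects the Cell Ranger sample folder, meaning the one that contains outs/, not a folder that only holds a BAM. Below is a minimal sketch of a pre-flight check; the function name `check_run10x_folder` is made up for illustration, and the paths it tests are an assumption based on the usual Cell Ranger output layout.

```shell
# Hypothetical helper: report whether a directory looks like a Cell Ranger
# sample folder that `velocyto run10x` could use. The expected paths are an
# assumption based on the usual Cell Ranger output layout.
check_run10x_folder() {
    sample_dir=$1
    rc=0
    if [ ! -f "$sample_dir/outs/possorted_genome_bam.bam" ]; then
        echo "missing: $sample_dir/outs/possorted_genome_bam.bam"
        rc=1
    fi
    # Cell Ranger v3 and later write barcodes.tsv.gz; v2 writes barcodes.tsv
    # one level deeper, under a genome-named subfolder.
    found_barcodes=no
    for f in "$sample_dir"/outs/filtered_feature_bc_matrix/barcodes.tsv.gz \
             "$sample_dir"/outs/filtered_gene_bc_matrices/*/barcodes.tsv; do
        if [ -f "$f" ]; then
            found_barcodes=yes
        fi
    done
    if [ "$found_barcodes" = no ]; then
        echo "missing: barcodes.tsv(.gz) under $sample_dir/outs"
        rc=1
    fi
    return $rc
}

# Usage with the variables from the session above (not run here):
#   check_run10x_folder "$cellranger_outDir" && \
#       velocyto run10x -m "$rmsk_gtf" "$cellranger_outDir" "$cellranger_gtf"
```

If the check fails because only a BAM file is available, the plain `velocyto run` form with a barcodes file may be the better fit.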

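Downloading the mask file:

Genome annotation file (GTF): for 10X data you can use the GTF that ships with the matching reference directly; it is in the genes folder of the reference directory.

Mask annotation file: obtain it from the UCSC Genome Browser.

• Select the genome and the matching assembly version.

• Set Track to RepeatMasker, Table to rmsk, and output format to GTF.

Mouse mm10: Table Browser

Human GRCh38: Table Browser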
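
One thing that may be worth checking after the download: UCSC files name sequences `chr1`, `chr2`, and so on, while Ensembl GTFs use `1`, `2`. If the mask GTF and the gene GTF share no sequence names, the mask presumably cannot match anything. The sketch below lists the names that appear in column 1 of both files; `shared_seqnames` is a made-up helper name, not part of velocyto.

```shell
# Hypothetical helper: print the sequence names (GTF column 1) that occur
# in both files, skipping comment lines that start with '#'.
shared_seqnames() {
    awk -F'\t' '
        /^#/ { next }                           # ignore header/comment lines
        NR == FNR { seen[$1] = 1; next }        # first file: remember names
        ($1 in seen) && !printed[$1]++ { print $1 }  # second file: report overlap once
    ' "$1" "$2"
}

# Usage with the variables from the question (not run here):
#   shared_seqnames "$rmsk_gtf" "$cellranger_gtf" | head
```

An empty result would suggest that the two files follow different naming conventions.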

velocyto run -m /data/users/minmingw/Alignment/hg38/GRCh38_repeat_masker.gtf SRR11050949Aligned.sortedByCoord.out.bam /data/users/minmingw/Alignment/hg38/Homo_sapiens.GRCh38.103.gtf

If you already have a pre-filtered list of barcodes:

velocyto run -b /data/users/minmingw/sra/filtered_91497/barcodes.tsv -m /data/users/minmingw/Alignment/hg38/GRCh38_repeat_masker.gtf Chip91497Aligned.sortedByCoord.out.bam /data/users/minmingw/Alignment/hg38/Homo_sapiens.GRCh38.103.gtf

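If no cellsorted_ BAM exists yet, velocyto sorts the BAM automatically, so there is no need to run samtools sort beforehand.

If the file is too large, the error message advises running samtools sort yourself. Remember that the BAM must be sorted by the CB tag: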

samtools sort -t CB Chip94576Aligned.sortedByCoord.out.bam -o cellsorted_Chip94576Aligned.sortedByCoord.out.bam
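
For large files it can help to set threads and memory explicitly. The following is a rough sketch rather than a fixed recipe: the helper names, the 4-thread default, and the 2G-per-thread default are illustrative assumptions, and it assumes velocyto picks up a file named cellsorted_ plus the original BAM name in the same directory as the BAM.

```shell
# Hypothetical helper: derive the cell-sorted file name, i.e. "cellsorted_"
# prepended to the BAM's base name, in the same directory as the BAM.
cellsorted_path() {
    bam=$1
    printf '%s/cellsorted_%s\n' "$(dirname "$bam")" "$(basename "$bam")"
}

# Hypothetical helper: sort a BAM by the CB tag with explicit resources.
# $2 = threads (illustrative default 4), $3 = memory per thread (default 2G).
sort_by_cb() {
    bam=$1
    threads=${2:-4}
    mem_per_thread=${3:-2G}
    samtools sort -t CB -O BAM -@ "$threads" -m "$mem_per_thread" \
        -o "$(cellsorted_path "$bam")" "$bam"
}

# Usage (not run here):
#   sort_by_cb Chip94576Aligned.sortedByCoord.out.bam 8 4G
```

Total memory use is roughly threads times memory per thread, so both values should be chosen to fit the machine.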