Because SPAdes cannot process multiple samples in a single run, I wrote a bash function and call it once per sample.
My code:
#!/bin/bash
# Create the output directory
mkdir -p /data/caozhr/YN/spades/output
# Define a function that runs SPAdes on one sample
runSpades () {
    FILE1=${1}_R1.fastq
    FILE2=${1}_R2.fastq
    echo "$FILE1"
    echo "$FILE2"
    spades.py --pe1-1 "/data/caozhr/YN/spades/R1_for_spades/${FILE1}" --pe1-2 "/data/caozhr/YN/spades/R2_for_spades/${FILE2}" -o "/data/caozhr/YN/spades/output/${1}" --metaviral --isolate -t 16 --phred-offset 33 -k 21,33,55,77
}
export -f runSpades
# Run several samples in parallel
ls /data/caozhr/YN/spades/R1_for_spades/ | sed 's/_R1.fastq//g' | xargs -I {} --max-procs 10 sh -c 'runSpades {}'
This produces the following error:
sh: 1: runSpades: not found
sh: 1: runSpades: not found
sh: 1: runSpades: not found
sh: 1: runSpades: not found
sh: 1: runSpades: not found
sh: 1: runSpades: not found
sh: 1: runSpades: not found
sh: 1: runSpades: not found
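A likely cause, assuming sh on this machine is not bash (the "sh: 1:" prefix looks like dash, the default sh on Debian/Ubuntu): a function exported with export -f is only imported by bash child processes, so sh -c never sees runSpades. The sketch below shows the mechanism with a stand-in function; greet and the sample names A and B are hypothetical and are not part of the original script.

```shell
#!/bin/bash
# greet is a hypothetical stand-in for runSpades, used only for this demonstration.
greet () {
    echo "sample: $1"
}
export -f greet

# Start bash (not sh) so the exported function is imported from the environment.
# The sample name is handed over as a positional argument ($1) rather than being
# pasted into the command string; "_" fills $0.
printf '%s\n' A B | xargs -I {} bash -c 'greet "$1"' _ {}
```

Applied to the script above, this would mean replacing sh -c 'runSpades {}' with bash -c 'runSpades "$1"' _ {}.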
My approach and what I have tried: writing the body of runSpades directly into the xargs command:
ls /data/caozhr/YN/spades/R1_for_spades/ | sed 's/_R1.fastq//g' | xargs -I {} --max-procs=10 sh -c '
FILE1=${1}_R1.fastq
FILE2=${1}_R2.fastq
echo $FILE1
echo $FILE2
spades.py --pe1-1 /data/caozhr/YN/spades/R1_for_spades/${FILE1} --pe1-2 /data/caozhr/YN/spades/R2_for_spades/${FILE2} -o /data/caozhr/YN/spades/output/${1} --metaviral --isolate -t 16 --phred-offset 33 -k 21,33,55,77'
This fails again. The cause seems to be that the output of ls /data/caozhr/YN/spades/R1_for_spades/ | sed 's/_R1.fastq//g' cannot be passed through the | xargs -I {} pipeline to the downstream command.
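One probable explanation for this second failure: xargs -I {} only substitutes the literal string {} where it appears in the command's arguments, and ${1} inside sh -c '...' is only set when arguments follow the script string. The inlined version contains neither, so ${1} expands to nothing and the file names become _R1.fastq and _R2.fastq. A minimal sketch of passing the name through, where show_pair and the sample names S1 and S2 are hypothetical:

```shell
# show_pair is a hypothetical helper; it reads sample names from stdin.
show_pair () {
    # "_" becomes $0 and the substituted {} becomes $1 inside the sh -c script.
    xargs -I {} sh -c '
FILE1=${1}_R1.fastq
FILE2=${1}_R2.fastq
echo "$FILE1 $FILE2"
' _ {}
}

# S1 and S2 are made-up names standing in for the output of the ls | sed step.
printf '%s\n' S1 S2 | show_pair
```

The same trailing _ {} could be appended after the closing quote of the inlined spades.py command.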
What I want to achieve:
1. Modify the existing code so that it runs
2. Or get code that can run spades.py in a loop on Linux
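For the second option, a plain for loop may be enough. This is only a sketch under stated assumptions: BASE mirrors the directory layout shown above, DRY_RUN is a hypothetical switch added here so the commands can be inspected before anything is launched, and the spades.py options are copied unchanged from the script above. It runs one sample at a time, which may be preferable anyway, since --max-procs 10 combined with -t 16 asks for up to 160 threads.

```shell
#!/bin/bash
# Assumptions: BASE mirrors the layout in the question; DRY_RUN is a
# hypothetical switch for this sketch (1 = only print the commands).
BASE=${BASE:-/data/caozhr/YN/spades}
DRY_RUN=${DRY_RUN:-1}

run_all () {
    local r1 r2 sample
    for r1 in "$BASE"/R1_for_spades/*_R1.fastq; do
        [ -e "$r1" ] || continue                 # the glob matched nothing
        sample=$(basename "$r1" _R1.fastq)       # strip directory and suffix
        r2="$BASE/R2_for_spades/${sample}_R2.fastq"
        # Options are copied unchanged from the question.
        local cmd=(spades.py --pe1-1 "$r1" --pe1-2 "$r2"
                   -o "$BASE/output/$sample"
                   --metaviral --isolate -t 16 --phred-offset 33 -k 21,33,55,77)
        if [ "$DRY_RUN" = 1 ]; then
            echo "${cmd[@]}"                     # show what would be run
        else
            "${cmd[@]}"                          # run SPAdes for this sample
        fi
    done
}

run_all
```

Running the script with DRY_RUN=0 in the environment would launch the commands instead of printing them.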