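Extract keywords from a txt file with a shell script and count their occurrences

I have a file, test.txt, and want to extract specific keywords from it and count how many times each one occurs, using a shell script.

The keyword pattern is **SRA:keyword.**, i.e. the prefix "SRA:", then the keyword, then a terminating ".".

In SRA:SRR10168379.12205864.1 the keyword is SRR10168379
In SRA:SRR10168392.8392060.2 the keyword is SRR10168392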
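
The pattern just described can be checked on the two example IDs with a small sketch. It assumes the keyword is whatever sits between "SRA:" and the first "."; the helper name `extract_keyword` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the accession found between "SRA:" and the first ".".
extract_keyword() {
    # Split on ':' and '.'; piece 1 is "SRA", piece 2 is the keyword.
    printf '%s\n' "$1" | awk '{ split($0, parts, "[:.]"); print parts[2] }'
}

extract_keyword "SRA:SRR10168379.12205864.1"   # prints SRR10168379
extract_keyword "SRA:SRR10168392.8392060.2"    # prints SRR10168392
```

Splitting on the delimiters rather than cutting at fixed character positions keeps this working if an accession is longer or shorter than 11 characters.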

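The desired result file, res.txt, should list each keyword, then a dash separator, then its count, sorted by count (the counts below only illustrate the format):
SRR10168379--7 times
SRR10168392--2 times

The contents of test.txt are: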

# tblastn
# Iteration: 0
# Query: 
# RID: ZRD35BAF013
# Database: SRR10168375 SRR10168376 SRR10168377 SRR10168378 SRR10168379 SRR10168381 SRR10168392 SRR10168393 SRR13285085 SRR13285570
# Fields: query acc.ver, subject acc.ver, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score, % positives, query/sbjct frames
# 100 hits found
Query_45991    SRA:SRR10168379.12205864.1    34.694    49    32    0    527    575    148    2    4.6    36.6    53.06    0/-3
Query_45991    SRA:SRR10168379.10841544.1    41.667    48    20    1    187    226    144    1    5.7    36.2    58.33    0/-1
Query_45991    SRA:SRR10168392.8392060.2    47.059    34    13    1    194    222    27    128    11    35.4    61.76    0/3
Query_45991    SRA:SRR10168393.9810230.1    41.304    46    19    1    187    224    1    138    15    35.0    58.70    0/1
Query_45991    SRA:SRR10168379.2460949.2    41.304    46    19    1    187    224    1    138    18    34.7    58.70    0/1
Query_45991    SRA:SRR10168393.20965295.2    42.222    45    18    1    188    224    1    135    20    34.7    57.78    0/1
Query_45991    SRA:SRR10168376.8708660.2    43.902    41    15    1    192    224    1    123    28    34.3    58.54    0/1
Query_45991    SRA:SRR10168379.12533534.1    40.000    50    22    1    187    228    150    1    31    34.3    56.00    0/-1
Query_45991    SRA:SRR10168379.6639135.2    41.304    46    19    1    187    224    141    4    34    33.9    58.70    0/-1
Query_45991    SRA:SRR10168379.13010027.2    39.583    48    21    1    187    226    1    144    41    33.9    58.33    0/1
Query_45991    SRA:SRR10168381.1806861.1    40.816    49    21    1    188    228    150    4    41    33.9    55.10    0/-1
Query_45991    SRA:SRR10168379.3721520.2    40.816    49    21    1    188    228    150    4    41    33.9    55.10    0/-1
Query_45991    SRA:SRR10168378.17299083.1    41.304    46    19    1    187    224    139    2    42    33.9    56.52    0/-3
Query_45991    SRA:SRR10168393.17810994.2    39.583    48    21    1    187    226    3    146    46    33.5    58.33    0/3
Query_45991    SRA:SRR10168379.2656880.1    41.304    46    19    1    187    224    144    7    53    33.5    58.70    0/-1
Query_45991    SRA:SRR10168379.1997738.2    41.304    46    19    1    187    224    146    9    53    33.5    58.70    0/-2
Query_45991    SRA:SRR10168379.11604415.1    41.304    46    19    1    187    224    149    12    55    33.5    56.52    0/-2
Query_45991    SRA:SRR10168379.11899618.1    39.583    48    21    1    187    226    147    4    57    33.5    56.25    0/-1
Query_45991    SRA:SRR10168379.4610022.2    39.583    48    21    1    187    226    147    4    57    33.5    56.25    0/-1


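$ cat test.awk
# Skip comment lines (starting with '#') and lines without a second field.
!/^[ \t]*#/ && NF >= 2 {
    # $2 looks like "SRA:SRR10168379.12205864.1". Split on ':' and '.'
    # so the keyword is taken by delimiter rather than by fixed width.
    split($2, parts, "[:.]");
    keyword = parts[2];
    count[keyword]++;
} END {
    # Note: "for (k in array)" visits keys in an unspecified order.
    for (keyword in count)
        print keyword "---" count[keyword];
}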
$ cat test.txt | awk -f test.awk > res.txt
$ cat res.txt
SRR10168392---1
SRR10168376---1
SRR10168381---1
SRR10168393---3
SRR10168378---1
SRR10168379---12
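
The output above comes out in whatever order awk's `for (keyword in count)` happens to iterate, which is unspecified, so it does not yet meet the "sorted by count" requirement. One way to get there is to pipe the counts through `sort`. The following is a minimal sketch, assuming whitespace-separated columns with the `SRA:` ID in column 2. The names `sample.txt`, `res_sorted.txt` and `count_keywords` are placeholders, and `sample.txt` holds just a few rows from test.txt, truncated to three columns.

```shell
#!/bin/sh
# A few rows taken from test.txt (truncated to the first three columns).
cat > sample.txt <<'EOF'
# tblastn
Query_45991    SRA:SRR10168379.12205864.1    34.694
Query_45991    SRA:SRR10168379.10841544.1    41.667
Query_45991    SRA:SRR10168392.8392060.2    47.059
Query_45991    SRA:SRR10168393.9810230.1    41.304
Query_45991    SRA:SRR10168379.2460949.2    41.304
Query_45991    SRA:SRR10168393.20965295.2    42.222
EOF

count_keywords() {
    # 1) awk: skip comments, take the keyword by delimiter, count it,
    #    and print "count<TAB>keyword" so sort can use a plain numeric key.
    # 2) sort: count descending, then keyword ascending to break ties.
    # 3) awk: reshape each line to "keyword---count".
    awk '!/^[ \t]*#/ && $2 ~ /^SRA:/ {
             split($2, parts, "[:.]")
             count[parts[2]]++
         }
         END { for (k in count) print count[k] "\t" k }' "$1" |
    sort -k1,1nr -k2,2 |
    awk '{ print $2 "---" $1 }'
}

count_keywords sample.txt > res_sorted.txt
cat res_sorted.txt
# SRR10168379---3
# SRR10168393---2
# SRR10168392---1
```

To run it on the real file, use `count_keywords test.txt > res.txt`. On the test.txt shown in the question, this should put SRR10168379 (12 hits) first and SRR10168393 (3 hits) second. To use a different separator, change the `print` in the last awk.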