Java + Elasticsearch: how to turn pinyin input into a list of Chinese candidates during search?

Here's the situation: for the search box on our site's home page, I want to mimic Baidu, so that typing
"chelizi" → 车厘子
"cherry" → 车厘子
then matches the keywords stored in ES ("车厘子" or "樱桃").
Is there an open-source jar or API that converts pinyin input into a list of Chinese candidates, and leaves the input as English when it is actually an English word?


Search Baidu's developer portal for a related API; there should be one.

The same pinyin string can map to several different words; different tones and different contexts yield different words too. Try an open cloud lexicon, for example the kind of dictionary an input method uses.
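The dictionary approach above can be sketched in a few lines of Java. The entries here are illustrative; a real service would load a full input-method lexicon instead of hard-coding mappings:

```java
import java.util.*;

// Minimal sketch of a pinyin -> candidate-words dictionary, the same idea
// an input method uses. Entries are illustrative placeholders; a real
// system would load a large lexicon (e.g. from an open cloud dictionary).
public class PinyinDictionary {
    private final Map<String, List<String>> lexicon = new HashMap<>();

    public void add(String pinyin, String word) {
        lexicon.computeIfAbsent(pinyin, k -> new ArrayList<>()).add(word);
    }

    // Return all words for this pinyin; if nothing matches, treat the
    // input as plain English and return it unchanged, as the question asks.
    public List<String> candidates(String input) {
        List<String> hits = lexicon.get(input.toLowerCase(Locale.ROOT));
        return hits != null ? hits : Collections.singletonList(input);
    }

    public static void main(String[] args) {
        PinyinDictionary dict = new PinyinDictionary();
        dict.add("chelizi", "车厘子");
        dict.add("cherry", "车厘子");
        dict.add("yingtao", "樱桃");
        System.out.println(dict.candidates("chelizi")); // [车厘子]
        System.out.println(dict.candidates("hello"));   // [hello]
    }
}
```

Note that one pinyin key can hold several candidate words, which is exactly the ambiguity mentioned above; ranking the candidates (by frequency, context, etc.) is where the hard work lies.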

ES itself can meet your needs; what you're looking for is an analyzer, specifically a Chinese one. The author of the IK analyzer (medcl) also built https://github.com/medcl/elasticsearch-analysis-pinyin, which lets pinyin queries find Chinese results.

You need to add index settings along these lines:

{
  "settings": {
    "refresh_interval": "2s",
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "analysis": {
      "filter": {
        "edge_ngram_filter": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 20
        },
        "pinyin_jianpin": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_full_pinyin": false,
          "keep_original": false,
          "lowercase": true
        },
        "pinyin_simple_filter": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_separate_first_letter": true,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 20,
          "lowercase": true
        },
        "pinyin_full_filter": {
          "type": "pinyin",
          "keep_first_letter": false,
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "none_chinese_pinyin_tokenize": true,
          "keep_original": false,
          "limit_first_letter_length": 20,
          "lowercase": true
        }
      },
      "analyzer": {
        "ngramIndexAnalyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": [
            "edge_ngram_filter",
            "lowercase"
          ]
        },
        "ngramSearchAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        },
        "ikIndexAnalyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word"
        },
        "ikSearchAnalyzer": {
          "type": "custom",
          "tokenizer": "ik_smart"
        },
        "pinyinSimpleIndexAnalyzer": {
          "tokenizer": "ik_max_word",
          "filter": [
            "pinyin_simple_filter",
            "edge_ngram_filter",
            "lowercase"
          ]
        },
        "pinyinSimpleSearchAnalyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "pinyin_simple_filter",
            "lowercase"
          ]
        },
        "jianpinIndexAnalyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [
            "pinyin_jianpin",
            "edge_ngram_filter",
            "lowercase"
          ]
        },
        "jianpinSearchAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "pinyin_jianpin",
            "lowercase"
          ]
        },
        "pinyinFullIndexAnalyzer": {
          "tokenizer": "keyword",
          "filter": [
            "pinyin_full_filter",
            "lowercase"
          ]
        },
        "pinyinFullSearchAnalyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "pinyin_full_filter",
            "lowercase"
          ]
        }
      }
    }
  }
}
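The settings alone don't do anything until the analyzers are wired to a field. A sketch of a matching mapping using multi-fields (the field name `name` is illustrative, not from the original post):

```json
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ikIndexAnalyzer",
        "search_analyzer": "ikSearchAnalyzer",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "ngramIndexAnalyzer",
            "search_analyzer": "ngramSearchAnalyzer"
          },
          "pinyin": {
            "type": "text",
            "analyzer": "pinyinFullIndexAnalyzer",
            "search_analyzer": "pinyinFullSearchAnalyzer"
          },
          "jianpin": {
            "type": "text",
            "analyzer": "jianpinIndexAnalyzer",
            "search_analyzer": "jianpinSearchAnalyzer"
          }
        }
      }
    }
  }
}
```

A `multi_match` query over `name`, `name.pinyin`, and `name.jianpin` should then let "chelizi" (full pinyin) or "clz" (first letters) match a document whose `name` is 车厘子, without any pinyin-to-hanzi conversion step in the application.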


My guess is this is backed by a dictionary storing these mappings, or else by deep learning.

What you're describing is essentially what an input method does: showing Chinese characters for a pinyin input.


You could also search Gitee or GitHub for related projects.