How to manipulate the leaves of a JSON tree

Wat*_*att 1 java tree json data-structures

I want to replace rare words in a JSON tree with _RARE_ using Java.

My list of rare words contains:

late  
populate
convicts

So for JSON like the following:

["S", ["PP", ["ADP", "In"], ["NP", ["DET", "the"], ["NP", ["ADJ", "late"], ["NOUN", "1700<s"]]]], ["S", ["NP", ["ADJ", "British"], ["NOUN", "convicts"]], ["S", ["VP", ["VERB", "were"], ["VP", ["VERB", "used"], ["S+VP", ["PRT", "to"], ["VP", ["VERB", "populate"], ["WHNP", ["DET", "which"], ["NOUN", "colony"]]]]]], [".", "?"]]]]

I should get:

["S", ["PP", ["ADP", "In"], ["NP", ["DET", "the"], ["NP", ["ADJ", "_RARE_"], ["NOUN", "1700<s"]]]], ["S", ["NP", ["ADJ", "British"], ["NOUN", "_RARE_"]], ["S", ["VP", ["VERB", "were"], ["VP", ["VERB", "used"], ["S+VP", ["PRT", "to"], ["VP", ["VERB", "_RARE_"], ["WHNP", ["DET", "which"], ["NOUN", "colony"]]]]]], [".", "?"]]]]

Note how

["ADJ","late"]

is replaced by

["ADJ","_RARE_"]

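My code so far is as follows.

I traverse the tree recursively, and as soon as I find a rare word I create a new JSON array and try to replace the existing tree node with it. See `// this Doesn't work` below; that is where I am stuck. The tree remains unchanged outside this function.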

public static void traverseTreeAndReplaceWithRare(JsonArray tree) {

    //System.out.println(tree.getAsJsonArray());

    for (int x = 0; x < tree.size(); x++) {
        if (!tree.get(x).isJsonArray()) {
            if (tree.size() == 2) {
                // Beware: this branch is reached twice for the same word
                // (once for the tag at index 0, once for the word at index 1).
                String word = tree.get(1).toString();
                word = word.replaceAll("\"", ""); // remove the double quotes

                if (rareWords.contains(word)) {
                    JsonParser parser = new JsonParser();

                    // This works perfectly
                    System.out.println("Orig:" + tree);
                    JsonElement jsonElement = parser.parse("[" + tree.get(0) + "," + "_RARE_" + "]");

                    JsonArray newRareArray = jsonElement.getAsJsonArray();

                    // This works perfectly
                    System.out.println("New:" + newRareArray);

                    tree = newRareArray; // this Doesn't work
                }
            }
            continue;
        }
        traverseTreeAndReplaceWithRare(tree.get(x).getAsJsonArray());
    }
}

This is the code that calls the function above; I am using Google's Gson:

JsonParser parser = new JsonParser();
JsonElement jsonElement = parser.parse(strJSON);
JsonArray tree = jsonElement.getAsJsonArray();  

seh*_*ehe 6

Here is a straightforward approach in C++:

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "JSON.hpp"
#include <boost/algorithm/string/regex.hpp>
#include <boost/range/adaptors.hpp>
#include <boost/phoenix.hpp>

static std::vector<std::wstring> readRareWordList()
{
    std::vector<std::wstring> result;

    std::wifstream ifs("testcases/rarewords.txt");
    std::wstring line;
    while (std::getline(ifs, line))
        result.push_back(std::move(line));

    return result;
}

struct RareWords : boost::static_visitor<> {

    /////////////////////////////////////
    // do nothing by default
    template <typename T> void operator()(T&&) const { /* leave all other things unchanged */ }

    /////////////////////////////////////
    // recurse arrays and objects
    void operator()(JSON::Object& obj) const { 
        for(auto& v : obj.values) {
            //RareWords::operator()(v.first); /* to replace in field names (?!) */
            boost::apply_visitor(*this, v.second);
        }
    }

    void operator()(JSON::Array& arr) const {
        int i = 0;
        for(auto& v : arr.values) {
            if (i++) // skip the first element in all arrays
                boost::apply_visitor(*this, v);
        }
    }

    /////////////////////////////////////
    // do replacements on strings
    void operator()(JSON::String& s) const {
        using namespace boost;

        const static std::vector<std::wstring> rareWords = readRareWordList();
        const static std::wstring replacement = L"_RARE_";

        for (auto&& word : rareWords)
            if (word == s.value)
                s.value = replacement;
    }
};

int main()
{
    auto document = JSON::readFrom(std::ifstream("testcases/test3.json"));

    boost::apply_visitor(RareWords(), document);

    std::cout << document;
}

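This assumes that you want to perform the replacement on all string values and to match only whole strings. By replacing the equality comparison with a regex match (or adjusting the regex flags), you could easily make this case-insensitive, match words inside strings, and so on. Tweaked slightly in response to the comments.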
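As a rough illustration of those matching variations in Java (the question's language), the sketch below builds a single case-insensitive, whole-word pattern from the rare-word list using `java.util.regex`. The names `RareWordMatcher`, `compile`, and `replaceRare` are made up for this example; they are not part of the C++ code above or of any library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RareWordMatcher {
    /** Builds a case-insensitive pattern that matches any rare word as a whole word. */
    static Pattern compile(List<String> rareWords) {
        if (rareWords.isEmpty()) {
            // An empty alternation would match at every word boundary.
            throw new IllegalArgumentException("rareWords must not be empty");
        }
        // Pattern.quote keeps words containing regex metacharacters literal.
        String alternation = rareWords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    /** Replaces every whole-word occurrence of a rare word in one leaf string. */
    static String replaceRare(String leaf, Pattern rare) {
        return rare.matcher(leaf).replaceAll("_RARE_");
    }

    public static void main(String[] args) {
        Pattern rare = compile(Arrays.asList("late", "populate", "convicts"));
        System.out.println(replaceRare("Late", rare));           // differs only in case
        System.out.println(replaceRare("the late 1700s", rare)); // word inside a longer string
        System.out.println(replaceRare("populated", rare));      // not a whole word, left alone
    }
}
```

Dropping the `\\b` anchors would turn this into a plain substring match, and dropping `Pattern.CASE_INSENSITIVE` restores exact-case matching.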

The full code, including JSON.hpp/cpp, is here: https://github.com/sehe/spirit-v2-json/tree/16093940
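For readers who want to stay in Java, here is a minimal sketch of the same traversal. It assumes the tree has already been converted to nested `java.util.List` objects, a simplification so that the example needs no external library; `RareLeafReplacer`, `replaceRare`, and `node` are hypothetical names. The point it illustrates is that a leaf has to be replaced by mutating its parent list in place, because assigning a new value to a method parameter only rebinds the local variable and the caller never sees the change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RareLeafReplacer {
    static final String RARE = "_RARE_";

    /**
     * Walks a tree shaped like ["TAG", child, ...] or ["TAG", "word"]
     * and replaces rare words in place.
     */
    static void replaceRare(List<Object> node, Set<String> rareWords) {
        // Index 0 is the tag, so start at 1 (the C++ visitor also skips the first element).
        for (int i = 1; i < node.size(); i++) {
            Object child = node.get(i);
            if (child instanceof List) {
                @SuppressWarnings("unchecked")
                List<Object> subtree = (List<Object>) child;
                replaceRare(subtree, rareWords);
            } else if (rareWords.contains(child)) {
                // Mutate the parent list itself; writing `node = ...` here
                // would only rebind the local parameter.
                node.set(i, RARE);
            }
        }
    }

    /** Hypothetical helper that builds a mutable node from its parts. */
    static List<Object> node(Object... parts) {
        return new ArrayList<>(Arrays.asList(parts));
    }

    public static void main(String[] args) {
        Set<String> rare = new HashSet<>(Arrays.asList("late", "populate", "convicts"));
        List<Object> tree = node("NP", node("ADJ", "late"), node("NOUN", "colony"));
        replaceRare(tree, rare);
        System.out.println(tree); // prints [NP, [ADJ, _RARE_], [NOUN, colony]]
    }
}
```

With Gson, the analogous in-place update would presumably be `tree.set(1, new JsonPrimitive("_RARE_"))` instead of `tree = newRareArray`, assuming a Gson version whose `JsonArray` provides `set`.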