Re: Geode - store and query JSON documents

Anilkumar Gingade Mon, 23 Nov 2020 13:12:21 -0800

Ankit,

Here is how you can query your JSON object.


String queryStr = "SELECT d.col1 FROM /JsonRegion v, v.data d "
    + "WHERE d.col1.k11 = 'aaa'";

As replied earlier, the data is stored in the cache as a PdxInstance. Within 
the PdxInstance, the data is held as top-level or nested collections of 
objects/values, following the structure of the input JSON object. 
The query engine runs against the PdxInstance and returns the matching values.

To see what the PdxInstance data looks like in the cache, you can print the 
value returned by querying the region values:
E.g.:
     // "cache" is the ClientCache created in your client code
     String queryStr = "SELECT v FROM /JsonRegion v";
     SelectResults results = (SelectResults)
         cache.getQueryService().newQuery(queryStr).execute();
     Object[] value = results.asList().toArray();
     System.out.println("#### Projected value: " + value[0]);

You can find sample queries on different types of objects (collections, etc.) at:
https://geode.apache.org/docs/guide/18/getting_started/querying_quick_reference.html
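
As a rough sketch, the two sample queries from the original question could be written with the same FROM-clause pattern: each nested collection (data, then col2) gets its own iterator, and the candidate values go into an IN SET(...) clause. The class and method names below are made up for illustration, the region name /JsonRegion is carried over from the example above, and col20 is left out because it is not in the truncated sample. The queries have not been run against this data, so treat them as a starting point.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JsonOqlSketch {
    // Region name assumed from the reply above; change it to your own region.
    static final String REGION = "/JsonRegion";

    // Renders string values as an OQL set literal, e.g. SET('aaa', 'xxx').
    // Assumes the values contain no single quotes.
    static String stringSet(List<String> values) {
        return values.stream()
                .map(v -> "'" + v + "'")
                .collect(Collectors.joining(", ", "SET(", ")"));
    }

    // Query 1: iterate data, then the col2 array inside each data element.
    static String byK21(List<String> k21Values) {
        return "SELECT c.k21, d.col1 FROM " + REGION
                + " v, v.data d, d.col2 c WHERE c.k21 IN " + stringSet(k21Values);
    }

    // Query 2: col1 is a single object, so no extra iterator is needed.
    static String byK11(List<String> k11Values) {
        return "SELECT d.col1.k11, d.col1 FROM " + REGION
                + " v, v.data d WHERE d.col1.k11 IN " + stringSet(k11Values);
    }

    public static void main(String[] args) {
        System.out.println(byK21(Arrays.asList("222222", "333333")));
        System.out.println(byK11(Arrays.asList("aaa", "xxx", "yyy")));
    }
}
```

Each resulting string can be passed to newQuery(...) in the same way as the queries shown earlier.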

Also, to determine where the time is being spent, can you separate object 
creation through JSONFormatter from the put operation?
E.g.:
PdxInstance pdxInstance = JSONFormatter.fromJSON(jsonDoc_2);
// Time taken to format: measure up to this point
region.put("1", pdxInstance);
// Time taken to add to cache: measure from the previous point to here

And measure the two times separately. That will show whether the time is spent 
in creating the PdxInstance or in doing the puts. Also, can you measure the 
time as an average? 
E.g., measure the time for puts 1000 through 2000 and report the average time for those puts. 
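
A minimal sketch of that kind of measurement, using only the JDK so it can be tried without a cluster: the first batch of iterations is skipped as warm-up, and the format and put phases are timed separately and averaged over the rest. The class PutTimingSketch, its measure method, and the HashMap standing in for the region are all hypothetical; in the real test you would pass JSONFormatter::fromJSON and region::put instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

public class PutTimingSketch {
    // Average nanoseconds per call for each phase.
    static final class Averages {
        final double formatNanos;
        final double putNanos;

        Averages(double formatNanos, double putNanos) {
            this.formatNanos = formatNanos;
            this.putNanos = putNanos;
        }
    }

    // Runs warmup + measured iterations; only the last `measured` ones are counted.
    static <D> Averages measure(int warmup, int measured, Function<String, D> format,
                                BiConsumer<String, D> put, String json) {
        long formatTotal = 0;
        long putTotal = 0;
        for (int i = 0; i < warmup + measured; i++) {
            long t0 = System.nanoTime();
            D doc = format.apply(json);           // phase 1: build the document
            long t1 = System.nanoTime();
            put.accept(Integer.toString(i), doc); // phase 2: store it
            long t2 = System.nanoTime();
            if (i >= warmup) {
                formatTotal += t1 - t0;
                putTotal += t2 - t1;
            }
        }
        return new Averages((double) formatTotal / measured, (double) putTotal / measured);
    }

    public static void main(String[] args) {
        // Stand-ins: String::trim plays the formatter, a HashMap plays the region.
        Map<String, String> fakeRegion = new HashMap<>();
        Averages avg = measure(1000, 1000, String::trim, fakeRegion::put, " {\"data\":[]} ");
        System.out.printf("avg format: %.0f ns, avg put: %.0f ns%n",
                avg.formatNanos, avg.putNanos);
    }
}
```

Here warmup = 1000 and measured = 1000 correspond to averaging over puts 1000 through 2000.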

-Anil.


On 11/23/20, 11:27 AM, "ankit Soni" <[email protected]> wrote:

     Hello geode-dev,

    I am *evaluating usage of Geode (1.12) with storing JSON documents and
    querying the same*. I am able to store the json records successfully in
    geode but seeking guidance on how to query them.
    More details on code and sample json is,


    *Sample client-code*

    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.client.ClientCache;
    import org.apache.geode.cache.client.ClientCacheFactory;
    import org.apache.geode.cache.client.ClientRegionShortcut;
    import org.apache.geode.pdx.JSONFormatter;
    import org.apache.geode.pdx.PdxInstance;

    public class MyTest {

        // NOTE: Below is truncated JSON; a single JSON document can contain
        // at most an array of col1...col30 (30 different attributes) within data.
        public final static  String jsonDoc_2 = "{" +
                "\"data\":[{" +
                            "\"col1\": {" +
                                    "\"k11\": \"aaa\"," +
                                    "\"k12\":true," +
                                    "\"k13\": 1111," +
                                    "\"k14\": \"2020-12-31:00:00:00\"" +
                                    "}," +
                            "\"col2\":[{" +
                                    "\"k21\": \"222222\"," +
                                    "\"k22\": true" +
                                    "}]" +
                        "}]" +
                "}";

        // NOTE: col1...col30 are a mix of JSONObject ({}) and JSONArray ([]),
        // as shown above in jsonDoc_2.

        public static void main(String[] args){

            //create client-cache
            ClientCache cache = new ClientCacheFactory()
                    .addPoolLocator(LOCATOR_HOST, PORT).create();
            Region<String, PdxInstance> region = cache
                    .<String, PdxInstance>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                    .create(REGION_NAME);

            //store json document
            region.put("key", JSONFormatter.fromJSON(jsonDoc_2));

            // How to query the JSON document with queries like:

            // 1. select col2.k21, col1, col20 from /REGION_NAME where
            //    data.col2.k21 = '222222' OR data.col2.k21 = '333333'

            // 2. select col2.k21, col1.k11, col1 from /REGION_NAME where
            //    data.col1.k11 in ('aaa', 'xxx', 'yyy')
        }
    }

    *Server: Region-creation*

    gfsh> create region --name=REGION_NAME --type=PARTITION
    --redundant-copies=1 --total-num-buckets=61


    *Setup: Distributed cluster of 3 nodes
    *

    *My Observations/Problems*
    -  Put operation takes excessive time: region.put("key",
    JSONFormatter.fromJSON(jsonDoc_2));  - Fetching a single record from a
    file and storing it in Geode approx. takes . 3 secs
       Are there any suggestions/configuration related to the JSONFormatter API or
    other to optimize this...?

    *Looking forward to guidance on querying this JSON for the above sample
    queries.*

    *Thanks*
    *Ankit.*
