Elasticsearch - Trouble with facet counts
I'm attempting to use Elasticsearch for analytics, specifically to track "top content" for a hand-rolled Rails CMS. The requirement is quite a bit more complicated than just keeping a counter for each piece of content. I won't go into the depth of the problem right now, as I can't seem to get even the basics working.
My problem is this: I'm using facets, and the counts aren't what I expect them to be. For example:
query:
{"facets":{"el_ids":{"terms":{"field":"el_id","size":1,"all_terms":false,"order":"count"}}}}
result:
{"el_ids":{"_type":"terms","missing":0,"total":16672,"other":16657,"terms":[{"term":"quis","count":15}]}}
OK, great, the piece of content with id "quis" had 15 hits, and since the order is "count", it should be my top piece of content. Now let's get the top 5 pieces of content.
query:
{"facets":{"el_ids":{"terms":{"field":"el_id","size":5,"all_terms":false,"order":"count"}}}}
result (just facet):
[ {"term":"qgz9","count":26}, {"term":"quis","count":15}, {"term":"hnqn","count":15}, {"term":"higp","count":15}, {"term":"csns","count":15} ]
Huh? So the piece of content with id "qgz9" had more hits, with 26? Why wasn't it the top result in the first query?
OK, let's get the top 100 now.
query:
{"facets":{"el_ids":{"terms":{"field":"el_id","size":100,"all_terms":false,"order":"count"}}}}
results (just facet):
[ {"term":"qgz9","count":43}, {"term":"difc","count":37}, {"term":"zryp","count":31}, {"term":"u65r","count":31}, {"term":"sxsi","count":31}, ... ]
So now "qgz9" has 43 hits instead of 26? How can that be? I can assure you that there's nothing happening in the background modifying the index. If I repeat these queries, I get the same results.
As I repeat this process of increasing the result size, the counts continue to change and new content ids emerge at the top. Can someone explain to me what I'm doing wrong, or where my understanding of how this works is flawed?
It turns out that this is a known issue:
...the way top N facets work is by getting the top N from each shard, and merging the results. This can give inaccurate results.
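To see why merging per-shard top-N lists produces the shifting counts from the question, here is a minimal Python sketch. It is a simplified model of the behaviour described in the quote, not Elasticsearch's actual code. The shard contents, the term names, and the helper functions `shard_top_n` and `merged_top_n` are all made up for illustration.

```python
from collections import Counter


def shard_top_n(shard_counts, n):
    """Return the n most frequent terms on one shard as (term, count) pairs."""
    return Counter(shard_counts).most_common(n)


def merged_top_n(shards, n):
    """Sum the per-shard top-n lists, then return the overall top n.

    Counts for a term that did not make a shard's local top n are lost,
    which is the source of the inaccuracy.
    """
    merged = Counter()
    for shard in shards:
        for term, count in shard_top_n(shard, n):
            merged[term] += count
    return merged.most_common(n)


# Hypothetical per-shard term counts. The true total for "b" is 9 + 6 + 5 = 20,
# but "b" is never the single most frequent term on any one shard.
shards = [
    {"a": 10, "b": 9, "c": 1},
    {"c": 8, "d": 7, "b": 6},
    {"e": 6, "b": 5},
]

for size in (1, 2, 3):
    print(size, merged_top_n(shards, size))
```

In this toy data, "b" is missing entirely at size 1, shows up with a partial count at size 2, and only reaches its true total at size 3. That is the same pattern "qgz9" shows in the question as the size grows.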
By default, my index was being created with 5 shards. After changing the index so that it has only a single shard, the counts behave in line with my expectations. Another workaround is to set the size value to something greater than the number of expected facets and peel off the top N results.
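The oversize-and-trim workaround could be sketched like this in Python. The request body mirrors the facet query from the question. The helper names `build_terms_facet` and `top_n_terms` and the `fetch_size` value are hypothetical, and sending the request to the cluster is left out on purpose.

```python
import json


def build_terms_facet(field, fetch_size):
    """Build the terms-facet request body from the question with a chosen size."""
    return {
        "facets": {
            "el_ids": {
                "terms": {
                    "field": field,
                    "size": fetch_size,
                    "all_terms": False,
                    "order": "count",
                }
            }
        }
    }


def top_n_terms(facet_result, n):
    """Keep only the first n entries of an already count-ordered terms facet."""
    return facet_result["terms"][:n]


# fetch_size is an assumed value; it should exceed the number of distinct
# el_id terms expected in the index.
body = json.dumps(build_terms_facet("el_id", fetch_size=20000))

# Sample facet result, reusing the first entries of the size-100 output above.
sample_facet = {
    "_type": "terms",
    "terms": [
        {"term": "qgz9", "count": 43},
        {"term": "difc", "count": 37},
        {"term": "zryp", "count": 31},
        {"term": "u65r", "count": 31},
        {"term": "sxsi", "count": 31},
    ],
}
top_three = top_n_terms(sample_facet, 3)
```

The trade-off is that every shard now returns a much longer list, so the request costs more memory and network traffic than a true top-N query would.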