Has Child Query

edit

The has_child query works the same as the has_child filter, by automatically wrapping the filter with a constant_score (when using the default score type). It has the same syntax as the has_child filter:

{
    "has_child" : {
        "type" : "blog_tag",
        "query" : {
            "term" : {
                "tag" : "something"
            }
        }
    }
}

An important difference with the top_children query is that this query is always executed in two iterations whereas the top_children query can be executed in one or more iteration. When using the has_child query the total_hits is always correct.

Scoring capabilities

edit

The has_child also has scoring support. The supported score types are min, max, sum, avg or none. The default is none and yields the same behaviour as in previous versions. If the score type is set to another value than none, the scores of all the matching child documents are aggregated into the associated parent documents. The score type can be specified with the score_mode field inside the has_child query:

{
    "has_child" : {
        "type" : "blog_tag",
        "score_mode" : "sum",
        "query" : {
            "term" : {
                "tag" : "something"
            }
        }
    }
}

Min/Max Children

edit

The has_child query allows you to specify that a minimum and/or maximum number of children are required to match for the parent doc to be considered a match:

{
    "has_child" : {
        "type" : "blog_tag",
        "score_mode" : "sum",
        "min_children": 2, 
        "max_children": 10, 
        "query" : {
            "term" : {
                "tag" : "something"
            }
        }
    }
}

Both min_children and max_children are optional.

The min_children and max_children parameters can be combined with the score_mode parameter.

Memory Considerations

edit

In order to support parent-child joins, all of the (string) parent IDs must be resident in memory (in the field data cache. Additionally, every child document is mapped to its parent using a long value (approximately). It is advisable to keep the string parent ID short in order to reduce memory usage.

You can check how much memory is being used by the ID cache using the indices stats or nodes stats APIS, eg:

curl -XGET "http://localhost:9200/_stats/id_cache?pretty&human"