New

The executive guide to generative AI

Read more
IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Bulk Indexing

edit

Elasticsearch also supports bulk indexing of documents. The bulk API expects JSON action/metadata pairs, separated by newlines. When constructing your documents in PHP, the process is similar. You first create an action array object (e.g. index object), then you create a document body object. This process repeats for all your documents.

A simple example might look like this:

Bulk indexing with PHP arrays.

for($i = 0; $i < 100; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type' => 'my_type',
	]
    ];

    $params['body'][] = [
        'my_field' => 'my_value',
        'second_field' => 'some more values'
    ];
}

$responses = $client->bulk($params);

In practice, you’ll likely have more documents than you want to send in a single bulk request. In that case, you need to batch up the requests and periodically send them:

Bulk indexing with batches.

$params = ['body' => []];

for ($i = 1; $i <= 1234567; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type' => 'my_type',
            '_id' => $i
        ]
    ];

    $params['body'][] = [
        'my_field' => 'my_value',
        'second_field' => 'some more values'
    ];

    // Every 1000 documents stop and send the bulk request
    if ($i % 1000 == 0) {
        $responses = $client->bulk($params);

        // erase the old bulk request
        $params = ['body' => []];

        // unset the bulk response when you are done to save memory
        unset($responses);
    }
}

// Send the last batch if it exists
if (!empty($params['body'])) {
    $responses = $client->bulk($params);
}
Was this helpful?
Feedback