This article gives a brief overview of the Solarium library and showcases some of the concepts and configuration options on the Solr end for implementing certain features.
Calling Solr using PHP code
A ping query is used in Solr to check the status of the Solr server. The Solr URL for executing the ping query is http://localhost:8080/solr/collection1/admin/ping/?wt=json.
[Figure: Response of Solr ping query in browser]
We can use cURL to get the ping response from Solr in PHP. Sample code for executing the previous ping query follows:
// Request the ping handler and ask for a JSON response
$curl = curl_init("http://localhost:8080/solr/collection1/admin/ping/?wt=json");
// Return the response body as a string instead of printing it
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($curl);
$data = json_decode($output, true);
echo "Ping Status : ".$data["status"].PHP_EOL;
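The cURL call can fail quietly: curl_exec() returns false when the server cannot be reached, and json_decode() returns null for a body that is not valid JSON. The following is a minimal sketch of a more defensive way to read the status. The helper name parsePingStatus is hypothetical, and the sample body only illustrates the shape of a ping response.

```php
<?php
// Hypothetical helper: extract the "status" field from a raw ping response.
// Returns null when the body is missing, is not valid JSON, or has no status.
function parsePingStatus($rawBody)
{
    if ($rawBody === false || $rawBody === '') {
        return null; // curl_exec() returns false on transport errors
    }
    $data = json_decode($rawBody, true);
    if (!is_array($data) || !isset($data['status'])) {
        return null;
    }
    return $data['status'];
}

// Illustrative response body; a live server may include more fields.
$sample = '{"responseHeader":{"status":0,"QTime":1},"status":"OK"}';
echo "Ping Status : " . parsePingStatus($sample) . PHP_EOL;
```

In a real script, the value returned by curl_exec() would be passed to the helper in place of the sample string.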
Although cURL can be used to execute almost any query on Solr, it is preferable to use a library that does the work for us. In our case, we will use Solarium. To execute the same query on Solr using the Solarium library, the code is as follows:
include_once("vendor/autoload.php");
$config = array(
    "endpoint" => array(
        "localhost" => array(
            "host" => "127.0.0.1",
            "port" => "8080",
            "path" => "/solr",
            "core" => "collection1",
        ),
    ),
);
We have included the Solarium library in our code and defined the connection parameters for our Solr server.
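The connection parameters correspond to the pieces of the ping URL used earlier. As a rough sketch of that mapping, the hypothetical helper endpointBaseUrl below joins host, port, path, and core into a base URL. It assumes plain HTTP and is not Solarium's own implementation.

```php
<?php
// Hypothetical helper: assemble a core's base URL from Solarium-style
// endpoint settings. Assumes plain HTTP; this is not Solarium's own code.
function endpointBaseUrl(array $endpoint)
{
    $path = '/' . trim($endpoint['path'], '/');
    return 'http://' . $endpoint['host'] . ':' . $endpoint['port']
        . $path . '/' . $endpoint['core'] . '/';
}

$config = array(
    'endpoint' => array(
        'localhost' => array(
            'host' => '127.0.0.1',
            'port' => '8080',
            'path' => '/solr',
            'core' => 'collection1',
        ),
    ),
);

echo endpointBaseUrl($config['endpoint']['localhost']) . 'admin/ping?wt=json' . PHP_EOL;
```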
Next, we need to create a Solarium client with the previous Solr configuration and call the createPing() function to create the ping query.
$client = new Solarium\Client($config);
$ping = $client->createPing();
Finally, execute the ping query and get the result.
$result = $client->ping($ping);
$result->getStatus();
The output should be similar to the one shown below.
[Figure: Output of ping query using PHP]
Adding documents to Solr index
To create a Solr index, we add documents to it using the command line, the Solr web interface, or our PHP program. Before we do so, however, we need to define the structure, or schema, of the index. The schema consists of fields and field types, and it defines how each field is treated during indexing and during search. Let us look at a small piece of code for adding documents to the Solr index using PHP and the Solarium library.
Create a Solarium client, create an instance of the update query, create the document in PHP, and finally add fields to the document.
$client = new Solarium\Client($config);
$updateQuery = $client->createUpdate();
$doc1 = $updateQuery->createDocument();
$doc1->id = 112233445;
$doc1->cat = 'book';
$doc1->name = 'A Feast For Crows';
$doc1->price = 8.99;
$doc1->inStock = 'true';
$doc1->author = 'George R.R. Martin';
$doc1->series_t = '"A Song of Ice and Fire"';
The id field has been marked as unique in our schema, so each document that we add to Solr must have a different id value.
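The reason distinct ids matter is that, by default, adding a document whose unique key already exists replaces the earlier document. The toy sketch below models that behavior with a plain PHP array. It does not talk to Solr, and the second title is only an illustrative value.

```php
<?php
// Toy in-memory "index" keyed by id, to illustrate overwrite-by-unique-key.
// This only models the behavior; it does not talk to Solr.
$index = array();

function addToIndex(array &$index, array $doc)
{
    $index[$doc['id']] = $doc; // same id => previous document is replaced
}

addToIndex($index, array('id' => 112233445, 'name' => 'A Feast For Crows'));
addToIndex($index, array('id' => 112233445, 'name' => 'A Dance with Dragons'));

echo count($index) . ' document(s): ' . $index[112233445]['name'] . PHP_EOL;
```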
Add the documents to the update query, followed by the commit command. Finally, execute the query.
$updateQuery->addDocuments(array($doc1));
$updateQuery->addCommit();
$result = $client->update($updateQuery);
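On the wire, an update of this kind is a message listing the documents to add, and a commit is a separate commit instruction. As an illustration of Solr's XML update format, the sketch below builds an add message with PHP's DOM extension. The helper name buildAddXml is hypothetical, and this is not how Solarium builds its requests internally.

```php
<?php
// Build a Solr XML <add> message for one document using PHP's DOM extension.
function buildAddXml(array $fields)
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $add = $dom->appendChild($dom->createElement('add'));
    $doc = $add->appendChild($dom->createElement('doc'));
    foreach ($fields as $name => $value) {
        $field = $dom->createElement('field');
        $field->setAttribute('name', $name);
        // Text nodes are escaped on output, so & and < in values are safe
        $field->appendChild($dom->createTextNode((string) $value));
        $doc->appendChild($field);
    }
    return $dom->saveXML($add);
}

echo buildAddXml(array(
    'id' => 112233445,
    'cat' => 'book',
    'name' => 'A Feast For Crows',
)) . PHP_EOL;
```

Letting the library generate this message avoids hand-written escaping and keeps the PHP code independent of the wire format.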
Let us execute the code.
After executing the code, a search for martin will give these documents in the result.
[Figure: Document added to Solr index]
Executing search on Solr Index
Documents added to the Solr index can be searched using the following piece of PHP code.
$selectConfig = array(
    'query' => 'cat:book AND author:Martin',
    'start' => 3,
    'rows' => 3,
    'fields' => array('id','name','price','author'),
    'sort' => array('price' => 'asc')
);
$query = $client->createSelect($selectConfig);
$resultSet = $client->select($query);
The above code creates a simple Solr query that searches for book in the cat field and Martin in the author field. The results are sorted in ascending order of price, and the fields returned are the id, name, price, and author of each book. Pagination is set to 3 results per page. Because start is a zero-based offset, a start value of 3 skips the first three results, so this query returns the second page, beginning with the 4th result.
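The relationship between a page number and the start offset can be sketched as follows. The helper pageToStart is hypothetical, and the parameter array shows the request parameters (q, start, rows, fl, sort) that correspond to the select configuration above.

```php
<?php
// Convert a 1-based page number into Solr's zero-based "start" offset.
function pageToStart($page, $rows)
{
    return ($page - 1) * $rows;
}

// Request parameters equivalent to the select configuration above.
$params = array(
    'q'     => 'cat:book AND author:Martin',
    'start' => pageToStart(2, 3), // page 2 with 3 rows per page => offset 3
    'rows'  => 3,
    'fl'    => 'id,name,price,author',
    'sort'  => 'price asc',
    'wt'    => 'json',
);

echo 'select?' . http_build_query($params) . PHP_EOL;
```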
In addition to this simple select query, Solr supports more advanced query parsers known as dismax and edismax. With these, we can boost certain fields to give them more weight in our query. We can also use function queries to apply dynamic boosting based on the values in fields.
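As a rough illustration, an edismax request is selected with the defType parameter, field boosts are given in qf, and a function query can be supplied in bf. The boost values and the choice of function below are assumptions made for the example, not recommended settings.

```php
<?php
// Request parameters for an edismax query. The boost values and the
// choice of function are illustrative assumptions, not tuned settings.
$params = array(
    'defType' => 'edismax',
    'q'       => 'martin',
    'qf'      => 'author^2 name series_t', // matches in author count double
    'bf'      => 'recip(price,1,10,10)',   // additive boost favoring cheaper items
    'wt'      => 'json',
);

echo 'select?' . http_build_query($params) . PHP_EOL;
```

Here recip(price,1,10,10) evaluates to 10 / (price + 10), so the boost shrinks as the price grows.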
If no sort is specified, Solr results are sorted by document score, which is calculated from the terms in the query and the matching terms in the indexed documents. The score for each document in the result set depends on two main factors: term frequency (tf), which is how often a term occurs in the document, and inverse document frequency (idf), which gives more weight to terms that occur in fewer documents.
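To make the two factors concrete, here is a textbook tf-idf calculation. It is a simplification: Solr's actual scoring applies further normalization and other factors, and the counts used here are hypothetical.

```php
<?php
// Textbook tf-idf for one term; a simplification of Solr's real scoring.
function tfIdf($termCountInDoc, $docsContainingTerm, $totalDocs)
{
    $tf  = $termCountInDoc;
    $idf = log($totalDocs / $docsContainingTerm); // natural logarithm
    return $tf * $idf;
}

// Hypothetical counts: one term is rare in the index, the other is common.
$rare   = tfIdf(1, 2, 100);  // term appears in 2 of 100 documents
$common = tfIdf(1, 80, 100); // term appears in 80 of 100 documents

printf("rare: %.3f, common: %.3f%s", $rare, $common, PHP_EOL);
```

With equal term frequency, the rarer term contributes the larger value, which is why a match on an uncommon word tends to rank a document higher.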
In addition, Solr provides a way of narrowing down the results using filter queries. Facets can also be created from fields in the index, and end users can use them to narrow down the results.
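At the request level, filter queries are passed in the fq parameter, which may be repeated, and field facets are enabled with facet and facet.field. The sketch below uses a small hypothetical helper, buildSolrQueryString, because http_build_query() would render repeated values as fq[0], fq[1], and so on.

```php
<?php
// Build a query string that allows repeated parameters such as fq,
// which http_build_query() would otherwise render as fq[0], fq[1], ...
function buildSolrQueryString(array $params)
{
    $parts = array();
    foreach ($params as $name => $values) {
        foreach ((array) $values as $value) {
            $parts[] = urlencode($name) . '=' . urlencode($value);
        }
    }
    return implode('&', $parts);
}

echo 'select?' . buildSolrQueryString(array(
    'q'           => 'author:Martin',
    'fq'          => array('cat:book', 'inStock:true'), // narrow without affecting score
    'facet'       => 'true',
    'facet.field' => 'cat',                             // counts per category
    'wt'          => 'json',
)) . PHP_EOL;
```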
Highlighting search results using PHP and Solr
Solr can be used to highlight the fields returned in a search result based on the query. Here is sample code for highlighting the results for the search keyword harry.
Get the highlighting component from the query, set the fields to be highlighted, and set the HTML tags to be used for highlighting.
$hl = $query->getHighlighting();
$hl->setFields('name,series_t');
// The prefix and postfix are the opening and closing HTML tags placed around each match
$hl->setSimplePrefix('')->setSimplePostfix('');
Once the query is run and the result set is received, we need to retrieve the highlighted results from the result set. Here is the output of the highlighting code.
[Figure: Highlighted search results]
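In a raw JSON response, the highlighted snippets live in a highlighting section keyed by document id and then by field name. The sketch below reads that section from a sample body. The document id, the title, the b tags, and the helper name snippetsForField are all made-up values for illustration.

```php
<?php
// Illustrative response fragment; the document id, title, and <b> tags are
// made-up sample values in the shape Solr uses for its highlighting section.
$json = '{"highlighting":{"101":{"name":["<b>Harry</b> Potter and the Sample Title"]}}}';

// Return highlighted snippets per document id for one field.
function snippetsForField(array $response, $field)
{
    $out = array();
    $highlighting = isset($response['highlighting']) ? $response['highlighting'] : array();
    foreach ($highlighting as $docId => $fields) {
        if (!empty($fields[$field])) {
            $out[$docId] = $fields[$field];
        }
    }
    return $out;
}

$response = json_decode($json, true);
foreach (snippetsForField($response, 'name') as $docId => $snippets) {
    echo $docId . ': ' . implode(' ... ', $snippets) . PHP_EOL;
}
```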
In addition to highlighting, Solr can be used to create a spelling suggester and a spell checker. A spelling suggester prompts the end user with candidate queries as the user types. A spell checker offers spelling corrections, similar to ‘did you mean’, to the user. Solr can also find documents that are similar to a given document based on the words in certain fields. This functionality is known as more like this and is exposed in Solarium by the MoreLikeThis component. Solr also supports grouping of results by a particular query or a certain field.
Solr can be scaled to handle a large number of search requests by using a master-slave architecture. If the index is huge, it can be sharded across multiple Solr instances, and we can run a distributed search to get results for our query from all of the shards. Solarium provides a load-balancing plug-in that can be used to balance queries across a master-slave architecture.
Solr provides an extensive list of features for implementing search. These features can be easily accessed in PHP using the Solarium library to build a full-featured search application that can power search on any website.
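For a manually sharded index, a distributed search is requested by listing the shard locations in the shards parameter. The sketch below builds such a request. The host names are hypothetical placeholders.

```php
<?php
// Hypothetical shard locations; replace with real host:port/path values.
$shards = array(
    'solr1.example.com:8080/solr/collection1',
    'solr2.example.com:8080/solr/collection1',
);

$params = array(
    'q'      => 'author:Martin',
    'shards' => implode(',', $shards), // query every listed shard and merge results
    'wt'     => 'json',
);

echo 'select?' . http_build_query($params) . PHP_EOL;
```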