Search: Installing SphinxSearch Debian/Ubuntu

We'll cover how to install SphinxSearch on Debian and Ubuntu servers. Sphinx is an optional component for HelpSpot 4 and not needed for most installations. By default database full text indexes are used.

HelpSpot 4 was developed and tested against SphinxSearch 2.1.9, and we recommend using that version. A 2.2 series is also available, and some newer server distributions may require it because of library or dependency availability. Either series will work with HelpSpot. However, the 2.2 series may display some WARNING messages during indexing, because the configuration uses some deprecated settings. These are normal and should not affect the search engine's capabilities.

Note that deb, rpm, Windows and source files are all available from the SphinxSearch website, both for the latest version (2.2.*) and for previous versions (2.1.* and older). If you are installing from a deb file, you can download the .deb file and run "sudo gdebi deb-file-you-downloaded.deb". If the gdebi command is not available, install it first: "sudo apt-get install gdebi".

Ubuntu:

Installation on Ubuntu can be done most easily using SphinxSearch's available repositories (PPA). 

Note that SphinxSearch will require at least MySQL client libraries to be installed. If you are using MySQL Server (or a MySQL variant), that will also install the client libraries. Users of PostgreSQL may need to install the package "mysql-client" via "sudo apt-get install mysql-client".

The following was tested on Ubuntu Trusty (14.04 LTS), but should work on most versions of Ubuntu. These commands will install Sphinx Search:

# Ensure command "add-apt-repository" is available:
# If this command fails, install package "python-software-properties" instead
sudo apt-get install software-properties-common

# Add Sphinx repository for 2.1.* series:
sudo add-apt-repository ppa:builds/sphinxsearch-rel21

# Alternatively, add Sphinx repository for 2.2.* series:
sudo add-apt-repository ppa:builds/sphinxsearch-rel22

# Update local repositories and install Sphinx
sudo apt-get update
sudo apt-get install sphinxsearch

# Stop Sphinx from running until we get our configuration in place
sudo service sphinxsearch stop
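
Since HelpSpot 4 was tested against the 2.1 series, it can be useful to confirm which series was actually installed. The following is a minimal sketch, assuming the package is named "sphinxsearch" as in the install command above. The helper name sphinx_series is hypothetical, and the exact format of the package version string may differ on your system.

```shell
# Hypothetical helper: reduce a version string such as
# "2.2.7-release-1~wheezy" to its series ("2.2").
sphinx_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Usage (requires the sphinxsearch package to be installed):
#   installed="$(dpkg-query -W -f='${Version}' sphinxsearch)"
#   echo "Installed series: $(sphinx_series "$installed")"
```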

Assuming HelpSpot version 4+ is already installed and running, and the new "data" directory can be written to by HelpSpot, we can use the new HelpSpot commands to create our Sphinx configuration file.

# Change directory into the HelpSpot site location
cd /path/to/helpspot

# Create Sphinx configuration file using available "hs" command-line tool:
php hs search:config

This will create a file in the HelpSpot "data" directory named "sphinx.conf". This will have the needed configuration to index the HelpSpot database, including host, user and password connection information for your HelpSpot database. You can symlink this file to the SphinxSearch location at /etc/sphinxsearch/sphinx.conf or you can move this configuration file to that location. We recommend the latter, as systems usually cannot follow symlinks at system boot time, and so SphinxSearch may not successfully start back up when the system restarts.

# Move Sphinx example conf file
sudo mv /etc/sphinxsearch/sphinx.conf /etc/sphinxsearch/sphinx.conf.bak

# Move HelpSpot sphinx.conf file in place
sudo mv data/sphinx.conf /etc/sphinxsearch/sphinx.conf
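
The two "mv" commands above assume that "php hs search:config" succeeded. As a rough sketch, the same steps can be wrapped in a function that refuses to displace the existing configuration when the generated file is missing or empty. The function name install_sphinx_conf and its parameters are hypothetical; run it from a root shell when the destination is under /etc/sphinxsearch.

```shell
# Hypothetical helper: back up the existing configuration, then move the
# HelpSpot-generated one into place. Nothing is touched if the generated
# file is missing or empty.
install_sphinx_conf() {
    src="$1"    # e.g. data/sphinx.conf
    dest="$2"   # e.g. /etc/sphinxsearch/sphinx.conf

    if [ ! -s "$src" ]; then
        echo "Generated configuration not found: $src" >&2
        return 1
    fi

    # Keep a copy of whatever configuration is currently in place
    if [ -e "$dest" ]; then
        mv "$dest" "$dest.bak" || return 1
    fi

    mv "$src" "$dest"
}

# Usage, from the HelpSpot site location:
#   install_sphinx_conf data/sphinx.conf /etc/sphinxsearch/sphinx.conf
```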

Once that's done, we can index our database and then start the search engine.

# Index the database into the search engine
sudo indexer --all --rotate

# Start Sphinx
sudo service sphinxsearch start
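
To confirm that the search daemon actually started, one option is to check the process recorded in its pid file. This is only a sketch: the helper name pid_is_running is hypothetical, and the pid file path shown in the usage comment is an assumption. The real location is set by the "pid_file" setting in the searchd section of sphinx.conf.

```shell
# Hypothetical helper: succeed if the process recorded in a pid file is alive.
pid_is_running() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    pid="$(cat "$pidfile")"
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only checks that the process can be signalled
    kill -0 "$pid" 2>/dev/null
}

# Usage, as root (the path is an assumption, check pid_file in sphinx.conf):
#   pid_is_running /var/run/sphinxsearch/searchd.pid && echo "searchd is running"
```

Because Sphinx runs as root, the check should also be run as root. An unprivileged user is not allowed to signal the process, so the check would report it as not running.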

Debian

The following was tested on Debian 7 (Wheezy).

Note that, just like on Ubuntu, SphinxSearch will require at least MySQL client libraries to be installed. If you are using MySQL Server (or a MySQL variant), that will also install the client libraries. Users of PostgreSQL will still need to install the package "mysql-client" via "sudo apt-get install mysql-client".

On Debian, you can install SphinxSearch from the .deb files provided by SphinxSearch, either for the latest version (2.2.*) or for previous versions (2.1.* and older). On Debian 7 (Wheezy), we used SphinxSearch version 2.2.7, the latest version at the time of writing. While testing, we downloaded the .deb file directly to the server using the following command (install the "wget" package first if you do not have it):

# Download SphinxSearch installer for Debian Wheezy, 64bit architecture
wget http://sphinxsearch.com/files/sphinxsearch_2.2.7-release-1~wheezy_amd64.deb

Once you have the .deb file appropriate for your server's Debian release and architecture (32bit vs 64bit), you can continue installing SphinxSearch.
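
As a rough aid for picking the right file, the sketch below builds a file name from a version, release codename and architecture. The naming pattern is inferred only from the single file name used above, so treat it as an assumption and confirm the actual name on the download page. The helper name sphinx_deb_name is hypothetical.

```shell
# Hypothetical helper: build a package file name following the pattern of
# sphinxsearch_2.2.7-release-1~wheezy_amd64.deb
sphinx_deb_name() {
    version="$1"    # e.g. 2.2.7
    codename="$2"   # e.g. wheezy
    arch="$3"       # e.g. amd64 or i386
    printf 'sphinxsearch_%s-release-1~%s_%s.deb\n' "$version" "$codename" "$arch"
}

# Usage on the target server (lsb_release may need to be installed first):
#   sphinx_deb_name 2.2.7 "$(lsb_release -sc)" "$(dpkg --print-architecture)"
```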

# Install gdebi, which installs a package and its dependencies
# from a .deb file
sudo apt-get install gdebi-core

# Install the SphinxSearch package and any unmet dependencies
sudo gdebi sphinxsearch_2.2.7-release-1~wheezy_amd64.deb

# Stop SphinxSearch until we configure it:
sudo service sphinxsearch stop

In the current .deb files from SphinxSearch, there is a bug which causes SphinxSearch not to start when a server reboots. Debian 7 (Wheezy) uses the tmpfs file system for the /var/run directory. This file system is held only in memory, so anything under /var/run, including the /var/run/sphinxsearch directory, must be re-created when the system reboots. The SphinxSearch init script does not currently re-create that directory.

To rectify this, we need to make one adjustment to the init script responsible for starting SphinxSearch on system boot.

Edit the file /etc/init.d/sphinxsearch and add the following to the do_start() function:

test -e /var/run/sphinxsearch || install -m 755 -o root -g root -d /var/run/sphinxsearch

The entire do_start() function should look something like this:

do_start() {
        # Check if we have the configuration file
        if [ ! -f /etc/sphinxsearch/sphinx.conf ]; then
            echo "\n"
            echo "Please create an /etc/sphinxsearch/sphinx.conf configuration file."
            echo "A template is provided as /etc/sphinxsearch/sphinx.conf.sample."
            exit 1
        fi

        # This is the new line
        test -e /var/run/sphinxsearch || install -m 755 -o root -g root -d /var/run/sphinxsearch

        start-stop-daemon --start --pidfile $PIDFILE  --exec ${DAEMON}
}

After saving that file and exiting your editor, it should be all set.
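
The added line is safe to run on every start because it only creates the directory when it is missing. The sketch below shows the same idiom against an arbitrary path so it can be tried without root privileges. The function name ensure_run_dir is hypothetical, and the ownership options used in the init script ("-o root -g root") are left out here because they require root.

```shell
# Create the directory with mode 755 only if nothing exists at that path yet.
ensure_run_dir() {
    dir="$1"
    test -e "$dir" || install -m 755 -d "$dir"
}

# Usage: calling it twice is harmless; existing contents are left alone.
#   ensure_run_dir /tmp/sphinxsearch-test
#   ensure_run_dir /tmp/sphinxsearch-test
```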

Next, assuming HelpSpot version 4+ is already installed and running, and the new "data" directory can be written to by HelpSpot, we can use the new HelpSpot commands to create our Sphinx configuration file.

# Change directory into the HelpSpot site location
cd /path/to/helpspot

# Create Sphinx configuration file using available "hs" command-line tool:
php hs search:config

This will create a file in the HelpSpot "data" directory named "sphinx.conf". This will have the needed configuration to index the HelpSpot database, including host, user and password connection information for your HelpSpot database. You can symlink this file to the SphinxSearch location at /etc/sphinxsearch/sphinx.conf or you can move this configuration file to that location. We recommend the latter, as systems usually cannot follow symlinks at system boot time, and so SphinxSearch may not successfully start back up when the system restarts.

# Move Sphinx example conf file
sudo mv /etc/sphinxsearch/sphinx.conf /etc/sphinxsearch/sphinx.conf.bak

# Move HelpSpot sphinx.conf file in place
sudo mv data/sphinx.conf /etc/sphinxsearch/sphinx.conf

Once that's done, we can index our database and then start the search engine.

# Index the database into the search engine
sudo indexer --all --rotate

# Start Sphinx
sudo service sphinxsearch start

CRON Tasks

SphinxSearch must be told to periodically re-index the HelpSpot database. This is done in two ways:

  • Less often (perhaps once per day), re-index the entire HelpSpot database. This fixes any indexing issues and resolves potential edge cases, such as merged requests causing inaccurate search results.
  • More often, index the defined "Delta Indexes", which cover only data that has accumulated since the last indexing.

There are four indexes used with HelpSpot:

  • Requests (Customer information, custom fields, other related data)
  • Request History (Any public, private or external note within requests)
  • Knowledge Books
  • Forums

The following is our recommended setup for scheduling indexing of the HelpSpot database. Adjust how often indexing occurs based on your needs (e.g., if you have a higher or lower volume of requests, or if you make little use of forums or Knowledge Books).

Note that since Sphinx is started/run by user "root" in Debian/Ubuntu, the CRON tasks should also be run by user root.

Entire Database:

Index the entire database once per day, preferably at a less busy time of day.

A CRON task for that would look like this:

0 0 * * * indexer --all --rotate

Forums and Knowledge Books

Index the forums and knowledge books 2 to 4 times a day. The following will index them every 6 hours, which is 4 times per day:

0 */6 * * * indexer forums_ndx knowledgebooks_ndx --rotate
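
The index names passed to "indexer" must match the names defined in your sphinx.conf. As a quick check, the sketch below lists the index names found in a configuration file. It assumes each index is declared on its own line starting with the word "index", which is the usual Sphinx layout. The function name list_indexes is hypothetical.

```shell
# Hypothetical helper: print the name of every index declared in a Sphinx
# configuration file, one per line.
list_indexes() {
    awk '$1 == "index" {
        name = $2
        sub(/[:{].*$/, "", name)   # drop a trailing ":parent" or "{" if attached
        print name
    }' "$1"
}

# Usage:
#   list_indexes /etc/sphinxsearch/sphinx.conf
```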

Requests: Delta Indexes

Request delta indexes must be indexed and then combined into the main index. This involves multiple commands and can best be run in a shell script, such as the following:

#! /usr/bin/env bash

# Assumed to be run as root in Debian/Ubuntu
indexer requests_history_ndx_delta --rotate
indexer --merge requests_history_ndx requests_history_ndx_delta --rotate
indexer requests_ndx_delta --rotate
indexer --merge requests_ndx requests_ndx_delta --rotate

The following CRON task will run the above shell script every 10 minutes:

*/10 * * * * /path/to/delta_index_shell_script.sh
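
With a 10 minute schedule, a slow indexing run could still be in progress when the next one starts. The following is a minimal sketch of one way to guard against that, and to skip the merge when the delta re-index fails. It assumes that "indexer" exits with a non-zero status on failure. The function names and the lock path are hypothetical.

```shell
# Hypothetical helper: re-index one delta index, then merge it into its main
# index only if the re-index succeeded.
reindex_delta() {
    main="$1"
    delta="$2"
    indexer "$delta" --rotate && indexer --merge "$main" "$delta" --rotate
}

# Hypothetical helper: run a command only if no other run holds the lock.
# mkdir either creates the directory or fails, so only one caller proceeds.
run_exclusive() {
    lockdir="$1"
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "Previous run still in progress, skipping" >&2
        return 1
    fi
    rc=0
    "$@" || rc=$?
    rmdir "$lockdir"
    return "$rc"
}

# Usage inside the delta script (the lock path is an assumption):
#   reindex_all() {
#       reindex_delta requests_history_ndx requests_history_ndx_delta
#       reindex_delta requests_ndx requests_ndx_delta
#   }
#   run_exclusive /var/run/helpspot-delta.lock reindex_all
```

If the script is killed in the middle of a run, the lock directory is left behind and must be removed by hand before indexing resumes.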

 

