Building a Website Search Seems Simple. It Isn’t.
Google’s search rankings are notoriously important for websites, directly affecting factors like discoverability and the amount of user traffic. It’s so vital to businesses that an entire search engine optimization industry revolves around figuring out how to improve websites’ rankings on search results pages.
But search can also be important to websites as an internal tool, helping users already on the website find what they are looking for.
Matt Riley, vice president of product at Elastic, the company that manages the Elasticsearch tool, said internal search within websites can serve different purposes. Private search engines allow employees within a company to find documentation on the company intranet, while user-facing internal search engines direct users to products or other areas of the website.
THINGS TO CONSIDER BEFORE IMPLEMENTING INTERNAL SEARCH
- Don't build search from scratch. Search is a complicated service. Chances are you’re better off building off an existing tool.
- Free tools may need some configuring. Free and open-source tools are available but need some work to configure on the part of developers.
- E-commerce sites provide built-in tools. If you’re a small company without a development team, these search tools built for general use may be a good choice.
- Don’t get boxed in. If search is important to your website and changes need to happen quickly, proprietary tools may not be the best choice.
“One way to think about the distinction between the two is exploration searches versus navigational searches,” Riley said.
Navigational searches can help users find their way on websites that are too large and sprawling to only make use of menus and navigation bars. Exploration searches, on the other hand, are more like the search bars on e-commerce websites.
“You’re typing in a search query and hoping to get an answer back that you’re not necessarily sure exists out there, but you’re relying on the relevance model of the search engine to give you a good answer,” Riley said.
Good internal search can make websites more user-friendly, while poor navigational tools and irrelevant search results can drive users away. That’s why choosing how to implement internal search can be a fraught decision. Developers can pay for specialized search services, configure open-source tools or even build internal search from scratch themselves.
Consider these factors before deciding how to tackle this important feature on your website.
There Are Many Different Ways to Implement Search
When Charlie Hull, managing consultant at search consultancy OpenSource Connections, first started working in the search space 20 years ago, most options for internal search, aside from building your own from scratch, were commercial, proprietary products. Specialized search consultancy companies built their own tools and helped customers integrate them into their websites for a fee.
Around 2010, open-source solutions like Apache Solr and Elasticsearch started appearing that were built off Lucene, an existing open-source search library. These tools come with lots of built-in features to choose from and allow developers to write less code.
“This kind of shook up the market quite a lot,” Hull said. “But the interesting thing is there are still people creating new commercial search engines and new open-source search engines. They keep appearing.”
These days, there are around a hundred different search offerings, Hull said, ranging from companies that create completely new ways of doing search to companies that just offer support to existing open-source solutions.
“They could be someone who’s just taken one of the open-source products and just put it up on a website somewhere, which is the approach Amazon uses,” he said.
Creating Internal Search From Scratch Is Rarely a Good Idea
From the user’s perspective, search boxes look deceptively clean and simple. But there are many aspects of developing an effective search tool that are hidden to users. When users type in a search phrase, they should be given more results than only those with an exact match on the phrase.
Words in the search phrase need to be identified, cleaned and run through natural language processing to ensure search results reflect the intent of the user. And behind the scenes, the website’s content and data should have already been indexed and the user’s search phrase translated into a query language and run against the data. Websites sometimes also need sophisticated user interfaces with filters and sorting to help users narrow search parameters.
That’s why creating internal search from scratch is a massive endeavor, especially for smaller companies that don’t have plenty of resources at their disposal. Developers need to consider all these pieces before choosing what kind of implementation suits them.
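To get a feel for the pieces described above, here is a deliberately tiny sketch in Python: it cleans text into tokens, builds an inverted index ahead of time, and then ranks documents by how many query words they contain. The product ids, the stopword list and the scoring rule are all assumptions made for illustration. Real engines add stemming, typo tolerance, relevance models and much more, which is exactly where the years of effort go.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; real engines ship much larger, language-aware ones.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to"}

def tokenize(text):
    """Lowercase the text, split on non-alphanumerics and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Rank documents by the number of distinct query tokens they contain."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Made-up catalog used only for this example.
docs = {
    "p1": "Red running shoes for the trail",
    "p2": "Blue running jacket",
    "p3": "Leather office shoes",
}
index = build_index(docs)
results = search(index, "the red shoes")
print(results)
```

Note that "p3" still comes back for "the red shoes" even though it is not an exact match on the phrase; it simply ranks below "p1", which matches both remaining words.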
Apache Software Foundation member Atri Sharma, who works on the team managing the Apache Solr search tool, said the process of creating a good search tool was beyond the amount of work an average development team could handle.
“If I wanted to build a search engine to match Solr, I would require a team of 50 people and a few million dollars and a few years,” Sharma said. “It is a product of multiple years of very smart people spending a lot of time on it. … So it makes no sense to be building a comparative Solr — unless [you’re pursuing] a very different technological idea.”
Taking the Open-Source Route
Open-source solutions may be well-suited for developers who want to use search tools but don’t have a lot of money to spend. Tools like Solr and Elasticsearch are great at searching through both structured data, like CSV files, and unstructured data, like large amounts of text, and they are just two of the many free solutions available.
Although open-source tools cut down on a lot of work, they still require developers to do some configuration. For example, Sharma said Solr provides a ready-to-go search interface, but developers have to write custom code to index the data so Solr knows how to search on it.
“You will have to spend some effort in configuring it, deploying it and integrating with the application, but that’s mostly a one-time effort,” he said.
Searching across structured and unstructured data works well for exploration searches, allowing websites to match on search words found in product descriptions, product categories or names. Because HTML is also highly structured, structured search works well for navigational searches too.
“When you take the text out of the HTML, in the body and in the title of the HTML page, it’s pretty similar to searching over a product catalog on an e-commerce store, where you have a product name and a product description,” Riley said.
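As a rough illustration of Riley’s point, the sketch below uses Python’s standard-library HTML parser to pull a page’s title and body text into two fields, much like a product name and a product description. The class name, function name and sample page are hypothetical, and a production crawler would need to handle far messier markup than this.

```python
from html.parser import HTMLParser

class PageFields(HTMLParser):
    """Collects visible text from <title> and <body>, skipping scripts and styles."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.body_parts = []
        self._section = None  # "title", "body" or None
        self._skip = 0        # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "body"):
            self._section = tag
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("title", "body"):
            self._section = None
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        if self._section == "title":
            self.title_parts.append(text)
        elif self._section == "body":
            self.body_parts.append(text)

def html_to_document(html):
    """Turn an HTML page into a two-field record ready to be indexed."""
    parser = PageFields()
    parser.feed(html)
    parser.close()
    return {
        "title": " ".join(parser.title_parts),
        "body": " ".join(parser.body_parts),
    }

# Made-up page used only for this example.
page = (
    "<html><head><title>Returns Policy</title></head>"
    "<body><h1>Returns</h1><p>Send items back within 30 days.</p>"
    "<script>track()</script></body></html>"
)
document = html_to_document(page)
print(document)
```

Once each page is reduced to a record like this, it can be indexed the same way a product catalog entry would be.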
But not every search tool supports every type of search. For instance, Solr and Elasticsearch aren’t optimized for searching across graph-based data. Storing data in graphs makes sense for information that is relationship-centric, such as social media data, but searching on it requires a different method than searching structured and unstructured data.
Some e-commerce sites may also want to incorporate some graph searching capability into their internal search tools. Sharma gave an example of a large e-commerce site that wants to recommend related products along with the product users are searching for.
“If I’m searching for cheese, I should also be shown bread,” he said. “That’s discovery — recommending products based on what you’re searching for, so that’s a knowledge graph.”
Companies could make more money by recommending related products, so it may be beneficial to use tools that prioritize graph search.
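A minimal sketch of the idea behind Sharma’s cheese-and-bread example: products are nodes, and weighted edges record how strongly two products are related. The edge weights and product names here are invented for illustration, and this is not how any particular graph database or search product implements recommendations.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected graph: product -> {related product: strength}."""
    graph = defaultdict(dict)
    for a, b, weight in edges:
        graph[a][b] = weight
        graph[b][a] = weight
    return graph

def related(graph, product, limit=3):
    """Return the most strongly related products, strongest first."""
    neighbors = graph.get(product, {})
    # Ties are broken alphabetically so the output is stable.
    return sorted(neighbors, key=lambda p: (-neighbors[p], p))[:limit]

# Hypothetical relationship strengths, e.g. derived from co-purchase data.
edges = [
    ("cheese", "bread", 0.9),
    ("cheese", "wine", 0.7),
    ("cheese", "crackers", 0.7),
    ("bread", "butter", 0.8),
]
graph = build_graph(edges)
suggestions = related(graph, "cheese")
print(suggestions)
```

A search for "cheese" could then show its own results alongside these suggestions, which is the discovery behavior Sharma describes.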
Companies Offer Ready-To-Use Search Tools
As more companies move their software applications and websites to the cloud, cloud-native support for internal search tools becomes an important consideration.
Sharma said Solr didn’t initially have support for cloud-based applications. As cloud hosting became more popular, the Solr team worked to build the infrastructure necessary to support it. Solr added cross-datacenter replication to keep applications available and circuit breakers to handle situations when search is overloaded.
“That is an overall project transformation that is actively happening,” Sharma said. “It’s an entire overhaul, so that has been a cumulative community decision that we have taken to go in that direction.”
Due to the popularity of cloud computing and the need for internal search on cloud-hosted websites, big cloud computing companies like Amazon and Microsoft have also created internal search tools for applications hosted on their cloud computing platforms. These tools provide indexing and querying functionality, which developers access by writing code to call those APIs.
Hull said that e-commerce companies like Shopify also provide good ready-to-use search tools for shop owners built to handle a wide variety of products and industries, which can be a useful option for small companies that don’t have development teams.
How Much Control Do You Need Over Search?
An important consideration when choosing how to implement search is whether having more control over the way search works is worth the extra effort, Hull said. Some companies with large budgets and niche search needs may want to use tools that offer more configurability.
“If search is absolutely core to value, then you need to own it,” Hull said. “You need to have some control, you need to make sure that you can roll out a new feature without the vendor saying, ‘I’m sorry, that’s not on the roadmap for two years.’”
If internal search affects the viability of your business, it’s a good idea to have more control so you can have a flexible and nimble solution. In that case, it’s best to avoid vendors with proprietary tools that can box you in with a dependency on their search technology, Hull said. You’d be at risk of the vendor suddenly raising prices, changing its subscription model or hindering progress on your product because of limitations on its end.
Having control over internal search also gives developers a chance to improve the service over time — although that can be tricky. Improving search requires developers to first measure how it’s performing.
“It’s all relevance, it’s purely, ‘When I type in a query are the results coming back relevant to what I was looking for?’” Hull said. “And, of course, if you’re just typing in a couple of words, we don’t really know what you’re looking for … It’s a hard thing to measure that relevance.”
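One common way to put a number on that question is precision at k: of the top k results returned for a query, what fraction did a human judge to be relevant? The sketch below is only an illustration of that metric. The result ids and relevance judgments are made up, and the article does not say which metrics Hull’s team uses.

```python
def precision_at_k(results, relevant, k):
    """Fraction of the top-k results that appear in the set of relevant ids."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top = results[:k]
    return sum(1 for doc_id in top if doc_id in relevant) / k

# Hypothetical ranked results for one query, plus human relevance judgments.
ranked_results = ["p1", "p7", "p3", "p9"]
judged_relevant = {"p1", "p3", "p4"}
score = precision_at_k(ranked_results, judged_relevant, k=4)
print(score)
```

The hard part is not the arithmetic but collecting the judgments, since a two-word query rarely reveals what the user actually wanted.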
Companies can evaluate their search by looking at how users actually use the tool. Rather than working through a list of customer complaints one by one, Hull recommended that companies take time to understand how customers are using search, by holding focus groups and asking people directly, so they can make more effective improvements.
For instance, users may be using industry lingo to search for products rather than the official name of the product, affecting search relevance.
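One way a team might respond to that finding is a synonym list that maps the lingo users type to the official product terms before the query runs. The sketch below assumes a hand-maintained dictionary. The entries and the function name are hypothetical, and tools like Solr and Elasticsearch have their own synonym configuration that works differently.

```python
# Hypothetical mapping from user lingo to official catalog terms.
SYNONYMS = {
    "hoodie": ["hooded sweatshirt"],
    "trainers": ["sneakers", "athletic shoes"],
}

def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms plus any official terms mapped from user lingo."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for alternative in synonyms.get(term, []):
            if alternative not in expanded:
                expanded.append(alternative)
    return expanded

expanded_terms = expand_query("Red trainers")
print(expanded_terms)
```

The search can then match on any of the expanded terms, so a shopper typing "trainers" still finds products listed as "sneakers".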
“But the main thing I think, is to think about, ‘What are you trying to do for your user?’” Hull said. “It’s all got to flow from what your user needs — what they’re trying to search for, what you’re trying to give them back, what their information needs are and how you can best serve that. And that’s where these things should start.”