Empower ASP.NET Core With Elasticsearch
Prologue
Well, in previous posts, we’ve Setup Elasticsearch with Kibana, and enabled our project to Ingest Data from MySQL to Elasticsearch. In this article, of course, we are going to answer the question of how to use Elasticsearch in our application, which is an ASP.NET Core Web API project. To be specific, we’re using .NET 8. 🤩
If you want to run your application on Linux, and have no idea how to do it, see Cross-platform Development with .NET Core.
1. Setup the Project
We’ll use the project structure in Thoughts on Basic Structure of ASP.NET Core Web API. Although Minimal APIs were introduced in ASP.NET Core, I still prefer using Startup.cs. Maybe I’ll make some changes later. 🫠
1.1 Install NuGet Package
To use Elasticsearch in ASP.NET Core, we need the NEST NuGet package.
1.2 Add Dependency Injection
First, we need to add an option entry for Elasticsearch. Of course you can hard-code it into your project, but a configuration entry gives you more flexibility.
"ElasticOptions": {
Then, create a class for it.
public class ElasticOptions
And finally, in Startup.cs, we can write this.
public class Startup
Now, you are ready to use Elasticsearch.
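As a rough illustration of the three steps above, here is a minimal sketch. The property names Url and DefaultIndex, their default values, and the AddElasticsearch helper are assumptions made for this example, not necessarily what the original project uses.

```csharp
using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Nest;

// Hypothetical option names; match them to the keys under "ElasticOptions".
public class ElasticOptions
{
    public string Url { get; set; } = "http://localhost:9200";
    public string DefaultIndex { get; set; } = "models";
}

public static class ElasticServiceExtensions
{
    // Hypothetical helper; call it from Startup.ConfigureServices.
    public static IServiceCollection AddElasticsearch(
        this IServiceCollection services, IConfiguration configuration)
    {
        // Bind the "ElasticOptions" section, falling back to the defaults above.
        var options = configuration.GetSection("ElasticOptions").Get<ElasticOptions>()
                      ?? new ElasticOptions();

        var settings = new ConnectionSettings(new Uri(options.Url))
            .DefaultIndex(options.DefaultIndex);

        // The client is designed to be shared, so register one instance.
        services.AddSingleton<IElasticClient>(new ElasticClient(settings));
        return services;
    }
}
```

With this in place, any service can take IElasticClient as a constructor parameter.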
1.3 Connection Settings
1.3.1 Basic Authentication
Now that we’ve learnt the basic configuration of the Elasticsearch connection, let’s take it a step further.
If you configured basic authentication for Elasticsearch, you need a username and password to communicate with the Elasticsearch API. So let’s add some more fields to the configuration file.
"ElasticOptions": {
Correspondingly, we have to change our option class.
public class ElasticOptions
At last, in Startup.cs, if we enable basic authentication, add username and password to connection settings.
// ...
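One way to wire this up is sketched below, assuming the username and password arrive as plain strings from the options class. The ElasticSettingsFactory name and its parameters are hypothetical.

```csharp
using System;
using Nest;

public static class ElasticSettingsFactory
{
    // username/password are hypothetical option fields;
    // null or empty values mean "no authentication".
    public static ConnectionSettings Create(string url, string? username, string? password)
    {
        var settings = new ConnectionSettings(new Uri(url));

        if (!string.IsNullOrEmpty(username) && !string.IsNullOrEmpty(password))
        {
            // Sends an HTTP Basic Authorization header with every request.
            settings = settings.BasicAuthentication(username, password);
        }

        return settings;
    }
}
```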
1.3.2 Some More Options
Later, if you run into mysterious errors with the Elasticsearch connection, calm down. It may not be your fault, but the result of improperly configured options.
First of all, if you’re using an older version of the Elasticsearch client (e.g. 7.x), but deployed a newer version (e.g. 8.x) on your server, you had better enable the API versioning header.
settings.EnableApiVersioningHeader();
The other one may be tricky, as you will not encounter it in most cases. However, if it ever occurs, there are few answers on the Internet.
It happens when you concurrently send multiple requests with one Elasticsearch client instance: an error is thrown. The solution is to enable pipelining, and disable direct streaming.
settings.EnableHttpPipelining()
Also, use IsValid to check if the response is successful or not. Don’t use other flags.
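A small sketch of the IsValid check, using a ping as the example request. The ElasticHealth class is made up for illustration; the same pattern applies to index and search responses.

```csharp
using System;
using System.Threading.Tasks;
using Nest;

public static class ElasticHealth
{
    // Hypothetical helper: returns true when the cluster answered the ping.
    public static async Task<bool> CanReachAsync(IElasticClient client)
    {
        var response = await client.PingAsync();

        if (!response.IsValid)
        {
            // DebugInformation describes the request and why it failed.
            Console.Error.WriteLine(response.DebugInformation);
        }

        return response.IsValid;
    }
}
```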
2. Index Document
Although we’ve learnt how to ingest data from MySQL to Elasticsearch, in which there’s no need to manually index data into Elasticsearch, sometimes we may not need MySQL and have to put data directly into Elasticsearch.
2.1 Creating Models for Elasticsearch
Well, emm… let’s put it simply. (Not because I don’t know how.) Our document model is just a C# object view of a JSON object. NEST serializes and deserializes between this model and the JSON used by the RESTful API. So it is just a simple object. And I think nested objects are not a good idea. (Or I’m missing something.)
Just keep this model simple. If you use expression-bodied members, NEST will also include them, even if you don’t want them to appear in your index.
So, a proper model should look like this. Well, fields like int and DateTime are acceptable.
class ElasticModel
If you really want to add some other fields to your model, use Elasticsearch attributes, e.g. [Ignore], to tell NEST which fields are not for indexing.
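To make this concrete, here is a sketch of such a model. All property names are invented for illustration; only the [Ignore] attribute comes from NEST.

```csharp
using System;
using Nest;

public class ElasticModel
{
    // Plain auto-properties are serialized into the document as-is.
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
    public int Count { get; set; }
    public DateTime CreatedAt { get; set; }

    // An expression-bodied member would normally be serialized too;
    // [Ignore] keeps it out of the index.
    [Ignore]
    public string DisplayName => $"{Name} ({Count})";
}
```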
2.1 Index A Single Model
Assume that we inject IElasticClient into our services as _client. Then we can simply index a model with Index or its async brother. It is preferred to set the index name manually if you have multiple indices. The ID is not required, as Elasticsearch can create it for you automatically.
await _client.IndexAsync<ElasticModel>(model, op => op
You don’t need to create the index first, as Elasticsearch will automatically create it if it does not exist yet.
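A minimal sketch of indexing one document, assuming an index called "models" and a model with a string Id (both hypothetical):

```csharp
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class IndexExample
{
    public static async Task<bool> IndexOneAsync(IElasticClient client, ElasticModel model)
    {
        var response = await client.IndexAsync(model, op => op
            .Index("models")   // hypothetical index name
            .Id(model.Id));    // optional: omit to let Elasticsearch generate an ID

        return response.IsValid;
    }
}
```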
2.2 Bulk Index Models
For data that comes in large volumes, we may need to bulk index it. Luckily, NEST provides us with BulkDescriptor.
var bulkDescriptor = new BulkDescriptor();
Note that the bulk request as a whole can come back successfully even though individual operations in it failed, so check the per-item results.
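The sketch below shows one way to build a bulk request and inspect per-item failures. The "models" index name and the model shape are assumptions for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class BulkExample
{
    public static async Task<bool> BulkIndexAsync(
        IElasticClient client, IEnumerable<ElasticModel> models)
    {
        var bulkDescriptor = new BulkDescriptor();
        foreach (var model in models)
        {
            // One index operation per document.
            bulkDescriptor.Index<ElasticModel>(op => op
                .Index("models")   // hypothetical index name
                .Document(model));
        }

        var response = await client.BulkAsync(bulkDescriptor);

        // Errors is true when at least one operation failed.
        if (response.Errors)
        {
            foreach (var item in response.ItemsWithErrors)
            {
                Console.Error.WriteLine($"{item.Id}: {item.Error?.Reason}");
            }
        }

        return !response.Errors;
    }
}
```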
To improve the performance of your application in such scenario with heavy indexing or updating tasks, see Bulk Task Optimization in C#.
3. Let the Search Begin!
NEST provides strongly typed requests and responses for Elasticsearch APIs, so searching is quite easy. However, it is mostly a matter of translating the RESTful API into the NEST API, so you can refer to the official Search API documentation for the RESTful side.
Basically, it is easy to switch between these two types of API, but there are some cases where the translation is not that obvious, and those are what I’m going to talk about.
3.1 Basic Search
3.1.1 Search with ID
First of all, you should inject IElasticClient into your service. Suppose it is called _client.
Let’s start with searching by ID. In any search, we should specify which index we’re going to search. As the ID is fixed, there’s no need for partial or fuzzy matching; we just need a Term query, which does not analyze the search text.
public async Task<List<ElasticModel>> GetModelsById(IEnumerable<string> ids)
Here, we use Documents in response. Astute readers will note that in the RESTful API, the results are in hits, and NEST does provide a response.Hits field for all hit results. So what’s the difference?
To put it simply, the elements in Hits represent the hits list in the RESTful response, which includes extra fields like Score. Documents, however, only contains the _source field.
hits: [
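The difference can be seen in a short sketch like the following, which runs a term query on a single ID and reads both collections. The index name and model fields are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class SearchByIdExample
{
    public static async Task<IReadOnlyCollection<ElasticModel>> GetByIdAsync(
        IElasticClient client, string id)
    {
        var response = await client.SearchAsync<ElasticModel>(s => s
            .Index("models")   // hypothetical index name
            .Query(q => q.Term(t => t.Field(f => f.Id).Value(id))));

        // Hits wrap each document with metadata such as the document ID and score.
        foreach (var hit in response.Hits)
        {
            Console.WriteLine($"{hit.Id} score={hit.Score} name={hit.Source.Name}");
        }

        // Documents is just the deserialized _source of every hit.
        return response.Documents;
    }
}
```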
3.1.2 Search with many conditions
Here, we get all models whose ID is in a given list. In this case, a better solution is to build a query container, and add all id matches as single queries. So here is how it works. It is actually a logic query, see Logic Operator.
public async Task<List<ElasticModel>> GetModelsById(IEnumerable<string> words)
It’s a little tricky that QueryContainer uses operator overloading to achieve “OR” or “AND” operations, instead of a method.
In order to expose high level API, NEST seems to use many “descriptors” when constructing requests.
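A sketch of the operator-based approach: each ID becomes its own term query, and the queries are OR-ed together with the overloaded | operator. Class and index names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class OrQueryExample
{
    // Builds "id == a OR id == b OR ..." as a single QueryContainer.
    public static QueryContainer AnyIdMatches(IEnumerable<string> ids)
    {
        var container = new QueryContainer();   // starts out empty
        foreach (var id in ids)
        {
            // |= uses QueryContainer's overloaded | operator (logical OR).
            container |= Query<ElasticModel>.Term(t => t.Field(f => f.Id).Value(id));
        }
        return container;
    }

    public static async Task<IReadOnlyCollection<ElasticModel>> SearchAsync(
        IElasticClient client, IEnumerable<string> ids)
    {
        var query = AnyIdMatches(ids);
        var response = await client.SearchAsync<ElasticModel>(s => s
            .Index("models")   // hypothetical index name
            .Query(_ => query));
        return response.Documents;
    }
}
```

Using &= instead would combine the queries with AND.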
3.2 Fuzzy Match
One of Elasticsearch’s advantages is that it supports fuzzy search. To use fuzzy search, you should use a Match query instead of Term, and set the fuzziness manually. Here the fuzziness is defined as the Levenshtein edit distance (or number of edits). Supported distances are 0, 1, 2.
await _client.SearchAsync<ElasticModel>(s => s
Get used to ))))...) with NEST. 🤭
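For illustration, a fuzzy match on a hypothetical Name field with an edit distance of 1 might look like this:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class FuzzyExample
{
    public static async Task<IReadOnlyCollection<ElasticModel>> FuzzySearchAsync(
        IElasticClient client, string text)
    {
        var response = await client.SearchAsync<ElasticModel>(s => s
            .Index("models")   // hypothetical index name
            .Query(q => q
                .Match(m => m
                    .Field(f => f.Name)
                    .Query(text)
                    // Allow one insertion, deletion or substitution per term.
                    .Fuzziness(Fuzziness.EditDistance(1)))));

        return response.Documents;
    }
}
```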
3.3 Pagination
Of course, Elasticsearch supports pagination, but with from and size. It is easy to convert page and page size to from and size. Here, we assume that page starts from 0.
await _client.SearchAsync<ElasticModel>(s => s
You shouldn’t use pagination to iterate through all data! By default, Elasticsearch allows a maximum of 10000 records to be reached this way, which means from + size must not exceed 10000. To iterate through everything, use the Scroll API instead.
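The conversion itself is plain arithmetic. The helper below is a hypothetical sketch that also rejects windows beyond the default limit of 10000; the resulting values would be passed to From(...) and Size(...) on the search descriptor.

```csharp
using System;

public static class Paging
{
    // Default upper bound on from + size, as described above.
    private const int MaxResultWindow = 10000;

    // Converts a zero-based page number and a page size into from/size.
    public static (int From, int Size) ToWindow(int page, int pageSize)
    {
        if (page < 0) throw new ArgumentOutOfRangeException(nameof(page));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        int from = page * pageSize;
        if (from + pageSize > MaxResultWindow)
        {
            throw new ArgumentOutOfRangeException(
                nameof(page), "from + size exceeds the result window.");
        }

        return (from, pageSize);
    }
}
```

For example, page 2 with a page size of 20 gives from = 40 and size = 20.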
3.4 Sorting
Well, sorting is relatively simple. Just specify the field and order in the search.
await _client.SearchAsync<ElasticModel>(s => s
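As a sketch, sorting newest-first on a hypothetical CreatedAt field could be written like this:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public DateTime CreatedAt { get; set; }
}

public static class SortExample
{
    public static async Task<IReadOnlyCollection<ElasticModel>> NewestFirstAsync(
        IElasticClient client)
    {
        var response = await client.SearchAsync<ElasticModel>(s => s
            .Index("models")   // hypothetical index name
            .Query(q => q.MatchAll())
            // Use Ascending(...) for the opposite order.
            .Sort(so => so.Descending(f => f.CreatedAt)));

        return response.Documents;
    }
}
```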
3.5 Logic Operator
Elasticsearch supports logical AND, OR and NOT in queries. To do this, we only need a bool query.
await _client.SearchAsync<ElasticModel>(s => s
Must is AND, Should is OR, and MustNot is NOT.
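A sketch that uses all three clauses at once. The field names and the literal values are invented for the example.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
    public int Count { get; set; }
}

public static class BoolExample
{
    public static async Task<IReadOnlyCollection<ElasticModel>> SearchAsync(
        IElasticClient client)
    {
        var response = await client.SearchAsync<ElasticModel>(s => s
            .Index("models")   // hypothetical index name
            .Query(q => q
                .Bool(b => b
                    // AND: every Must clause has to match.
                    .Must(m => m.Match(x => x.Field(f => f.Name).Query("phone")))
                    // OR: Should clauses are alternatives that raise the score.
                    .Should(
                        sh => sh.Term(t => t.Field(f => f.Count).Value(1)),
                        sh => sh.Term(t => t.Field(f => f.Count).Value(2)))
                    // NOT: documents matching MustNot are excluded.
                    .MustNot(mn => mn.Term(t => t.Field(f => f.Id).Value("0"))))));

        return response.Documents;
    }
}
```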
Appendix: Some Tricks
A. Dynamic Field
For common queries, flexibility may be the biggest concern. For example, we may need to dynamically change the field we want to search based on the request parameter. So here is how we change field based on query.
private static Expression<Func<ElasticModel, string>>? GetField(string field)
We have to return Expression instead of Func, as a Func cannot be converted into an Expression directly. We can only create a LINQ expression from a lambda.
With this, we can achieve dynamic fields in a query. You may need to check for null before passing the result to Field.
q.Match(m => m.Field(GetField("name")).Query(cond.Value))
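A possible shape for such a method is sketched below. The supported field names ("name", "description") and the model properties are assumptions; unknown names return null so that the caller can skip the condition.

```csharp
using System;
using System.Linq.Expressions;

public class ElasticModel
{
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
}

public static class FieldSelector
{
    // Maps a request parameter to a strongly typed field expression.
    public static Expression<Func<ElasticModel, string>>? GetField(string field)
    {
        switch (field)
        {
            case "name":
                // The lambda is converted to an expression tree by the compiler.
                return m => m.Name;
            case "description":
                return m => m.Description;
            default:
                return null;   // unknown field: let the caller decide what to do
        }
    }
}
```

Because each lambda is written directly in a return statement whose type is Expression, the compiler builds an expression tree rather than a delegate.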
B. Complex Condition
When we do logic operations, the conditions may not be fixed. So there’s the problem of constructing the bool query. Here is an example of building a flexible bool query.
private static BoolQueryDescriptor<ElasticModel> ConstructQueryDescriptor(
With this, you can simply call this function in bool query.
await _client.SearchAsync<ElasticModel>(s => s
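One way such a function could be written is sketched here. The Condition record and its Required flag are hypothetical stand-ins for whatever the request carries; required conditions go into Must and the rest into Should.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

public class ElasticModel
{
    public string Name { get; set; } = "";
}

// Hypothetical request condition: match Value against Field.
public record Condition(string Field, string Value, bool Required);

public static class FlexibleBoolQuery
{
    public static BoolQueryDescriptor<ElasticModel> ConstructQueryDescriptor(
        BoolQueryDescriptor<ElasticModel> descriptor, IEnumerable<Condition> conditions)
    {
        var must = new List<QueryContainer>();
        var should = new List<QueryContainer>();

        foreach (var cond in conditions)
        {
            // The field is given by name here, so it can vary per condition.
            QueryContainer query = Query<ElasticModel>
                .Match(m => m.Field(cond.Field).Query(cond.Value));
            (cond.Required ? must : should).Add(query);
        }

        return descriptor.Must(must.ToArray()).Should(should.ToArray());
    }

    public static async Task<IReadOnlyCollection<ElasticModel>> SearchAsync(
        IElasticClient client, IEnumerable<Condition> conditions)
    {
        var response = await client.SearchAsync<ElasticModel>(s => s
            .Index("models")   // hypothetical index name
            .Query(q => q.Bool(b => ConstructQueryDescriptor(b, conditions))));
        return response.Documents;
    }
}
```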
The same goes with other types of queries.
Epilogue
It is really easy to communicate with Elasticsearch through NEST. The APIs are, overall, natural to use and understand. It is just so much better than doing the same in Python, which has no static typing and is therefore less productive.