Prologue

Well, in previous posts, we’ve Setup Elasticsearch with Kibana, and enabled our project to Ingest Data from MySQL to Elasticsearch. In this article, we are going to answer the question of how to use Elasticsearch in our application, which is an ASP.NET Core Web API project. To be specific, we’re using .NET 8. 🤩

If you want to run your application on Linux, and have no idea how to do it, see Cross-platform Development with .NET Core.


1. Setup the Project

We’ll use the project structure in Thoughts on Basic Structure of ASP.NET Core Web API. Although Minimal APIs were introduced in ASP.NET Core 6, I still prefer using Startup.cs. Maybe I’ll make some changes later. 🫠

1.1 Install NuGet Package

To use Elasticsearch in ASP.NET Core, we need the NEST NuGet package.

1.2 Add Dependency Injection

First, we need to add an options entry for Elasticsearch to the configuration file. Of course you can hard-code it into your project, but this gives you more flexibility.

"ElasticOptions": {
    "DefaultConnection": "http://localhost:9200"
}

Then, create a class for it.

public class ElasticOptions
{
    public const string ElasticSection = "ElasticOptions";

    public string DefaultConnection { get; set; }
}

And finally, in Startup.cs, we can write this.

using Elasticsearch.Net;
using Nest;

public class Startup
{
    private readonly IConfiguration _configuration;

    public Startup(IConfiguration configuration)
    {
        _configuration = configuration;
    }

    public void ConfigureServices(IServiceCollection services)
    {
        // ...
        // Bind the "ElasticOptions" section to the options class.
        var elasticOptions = new ElasticOptions();
        _configuration.GetRequiredSection(ElasticOptions.ElasticSection).Bind(elasticOptions);

        // One client instance is shared by the whole application.
        var pool = new SingleNodeConnectionPool(new Uri(elasticOptions.DefaultConnection));
        var client = new ElasticClient(new ConnectionSettings(pool));
        services.AddSingleton<IElasticClient>(client);
        // ...
    }
}

Now, you are ready to use Elasticsearch.
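As a minimal sketch of how the registered client might be consumed, the class below receives IElasticClient through constructor injection. The class name ModelService and the ping-based health check are assumptions for illustration, not part of the original project.

```csharp
using System.Threading.Tasks;
using Nest;

// Hypothetical service; the class and method names are assumptions for illustration.
public class ModelService
{
    private readonly IElasticClient _client;

    // ASP.NET Core resolves IElasticClient from the singleton registered in Startup.
    public ModelService(IElasticClient client)
    {
        _client = client;
    }

    // Returns true when the cluster answers a ping request.
    public async Task<bool> IsClusterReachableAsync()
    {
        PingResponse response = await _client.PingAsync();
        return response.IsValid;
    }
}
```

The service itself would be registered next to the client, for example with services.AddScoped<ModelService>().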

1.3 Connection Settings

1.3.1 Basic Authentication

Now that we’ve learnt the basic configuration of an Elasticsearch connection, let’s take it a step further.

If you configured basic authentication for Elasticsearch, you need a username and password to communicate with the Elasticsearch API. So let’s add some more fields to the configuration file.

"ElasticOptions": {
    "DefaultConnection": "http://localhost:9200",
    "EnableBasicAuth": true,
    "Username": "your username",
    "Password": "your password"
}

Correspondingly, we have to change our option class.

public class ElasticOptions
{
    public const string ElasticSection = "ElasticOptions";

    public string DefaultConnection { get; set; }
    public bool EnableBasicAuth { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
}

Finally, in Startup.cs, if basic authentication is enabled, add the username and password to the connection settings.

// ...
ConnectionSettings settings = new ConnectionSettings(pool);
if (elasticOptions.EnableBasicAuth)
{
    // Send the credentials with every request.
    settings.BasicAuthentication(elasticOptions.Username, elasticOptions.Password);
}
var client = new ElasticClient(settings);
// ...

1.3.2 Some More Options

Later, if you run into mysterious errors with the Elasticsearch connection, calm down. It may not be a bug in your code, but an improperly configured option.

First of all, if you’re using an older version of the Elasticsearch client (e.g. 7.x) but deployed a newer version of Elasticsearch (e.g. 8.x) on your server, you had better enable the API versioning header.

settings.EnableApiVersioningHeader();

The other one may be tricky, as you will not encounter it in most cases. However, if it ever occurs, there are few answers on the Internet.

It happens when you send multiple requests concurrently with one Elasticsearch client instance: an error is thrown. The solution is to enable pipelining and disable direct streaming.

settings.EnableHttpPipelining()
    .DisableDirectStreaming();

Also, use IsValid to check whether a response is successful. Don’t rely on other flags.
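Here is a small sketch of what such a check could look like as a reusable helper. The class name ResponseGuard and the choice of InvalidOperationException are assumptions; IsValid, DebugInformation and OriginalException are the properties NEST exposes on its responses.

```csharp
using System;
using Nest;

// Hypothetical helper; the class and method names are assumptions for illustration.
public static class ResponseGuard
{
    // Returns the response unchanged when NEST reports success, otherwise throws.
    public static T EnsureValid<T>(T response) where T : IResponse
    {
        if (response.IsValid)
        {
            return response;
        }

        // DebugInformation is a human-readable summary of the request and response;
        // OriginalException is set when the failure happened on the client side.
        throw new InvalidOperationException(response.DebugInformation, response.OriginalException);
    }
}
```

A call site could then read var response = ResponseGuard.EnsureValid(await _client.SearchAsync<ElasticModel>(...)). With DisableDirectStreaming enabled, DebugInformation should also contain the request and response bodies, which helps a lot when debugging.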


2. Index Document

Although we’ve learnt how to ingest data from MySQL to Elasticsearch, in which case there’s no need to index data manually, sometimes we may not need MySQL at all and have to put data directly into Elasticsearch.

2.1 Creating Models for Elasticsearch

Well, emm… let’s put it simply. (Not because I don’t know how.) Our document model is just a C# object view of a JSON object. NEST serializes and deserializes between this model and the JSON used by the RESTful API. So a simple object is enough. And I don’t think nested objects are a good idea here. (Or I’m missing something.)

Just keep this model simple. If you use expression-bodied members, NEST will also include them, even if you don’t want them to appear in your index.

So, a proper model should look like this. Fields of types like int and DateTime are acceptable too.

class ElasticModel
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public DateTime Updated { get; set; }
}

If you really want to add some other members to your model, use NEST attributes, e.g. [Ignore], to tell NEST which ones are not for indexing.
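As a sketch of that idea, the model below carries one extra expression-bodied member and marks it with [Ignore]. The class name ElasticModelWithExtras and the DisplayName property are made up for this example.

```csharp
using System;
using Nest;

// Hypothetical variant of the model; DisplayName is an assumption for illustration.
public class ElasticModelWithExtras
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
    public DateTime Updated { get; set; }

    // Computed for display only; [Ignore] asks NEST to leave it out of the index.
    [Ignore]
    public string DisplayName => $"{Name} ({Id})";
}
```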

2.2 Index A Single Model

Assume that we have injected IElasticClient into our service as _client. Then we can simply index a model with Index or its async sibling, IndexAsync. It is preferable to set the index name explicitly if you have multiple indices. The ID is not required, as Elasticsearch can generate one for you automatically.

await _client.IndexAsync<ElasticModel>(model, op => op
    .Index(type) // type: a variable holding the name of the target index
    .Id(model.Id)
);

You don’t need to create the index first, as Elasticsearch will create it automatically if it does not exist yet.

2.3 Bulk Index Models

For data that comes in large volumes, we may need to bulk index it. Luckily, NEST provides us with BulkDescriptor.

var bulkDescriptor = new BulkDescriptor();
foreach (var model in models)
{
    // Queue one index operation per model; nothing is sent until BulkAsync.
    bulkDescriptor.Index<ElasticModel>(op => op
        .Document(model)
        .Id(model.Id)
        .Index("demo")
    );
}
await _client.BulkAsync(bulkDescriptor);

Note that the bulk request as a whole comes back as a success (HTTP 200) even when individual operations fail, so you have to inspect the response (its Errors flag and ItemsWithErrors) to find the failures.

To improve the performance of your application in such scenarios with heavy indexing or updating tasks, see Bulk Task Optimization in C#.
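A minimal sketch of such a check is shown below. It reuses the BulkDescriptor approach from above; the class name BulkIndexer, the method name and the format of the returned messages are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Nest;

// Hypothetical helper; names are assumptions for illustration.
public static class BulkIndexer
{
    // Bulk indexes the documents and returns one message per failed item.
    public static async Task<List<string>> IndexAllAsync<T>(
        IElasticClient client, string index, IEnumerable<T> documents) where T : class
    {
        var descriptor = new BulkDescriptor();
        foreach (T document in documents)
        {
            // No explicit ID: NEST infers it from the document, or Elasticsearch generates one.
            descriptor.Index<T>(op => op.Document(document).Index(index));
        }

        BulkResponse response = await client.BulkAsync(descriptor);

        // No per-item errors but an invalid response means the whole request failed.
        if (!response.Errors && !response.IsValid)
        {
            throw new InvalidOperationException(response.DebugInformation);
        }

        // ItemsWithErrors contains only the operations that Elasticsearch rejected.
        return response.ItemsWithErrors
            .Select(item => $"{item.Id}: {item.Error?.Reason}")
            .ToList();
    }
}
```

An empty list means every document was accepted; anything else can be logged or retried.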


3. Let the Search Begin!

NEST provides strongly typed requests and responses for the Elasticsearch APIs, so it is quite easy for us to do searches. However, it is more like translating the RESTful API into the NEST API, so you can refer to the official Search API documentation for the RESTful side.

Basically, it is easy to switch between these two styles of API, but there are some cases where the translation is not that obvious, and those are what I’m going to talk about.

3.1.1 Search with ID

First of all, you should inject IElasticClient to your service. Suppose it is called _client.

Let’s start with searching by ID. In any search, we should specify which index we’re going to search. As an ID is fixed, there’s no need for a partial or fuzzy match; we just need a Term query, which does not analyze the search value.

public async Task<List<ElasticModel>> GetModelsById(IEnumerable<string> ids)
{
    // Combine one term query per ID with OR.
    var container = new QueryContainer();
    foreach (string id in ids)
    {
        container |= new QueryContainerDescriptor<ElasticModel>()
            .Term(m => m.Field(f => f.Id).Value(id));
    }

    ISearchResponse<ElasticModel> response = await _client.SearchAsync<ElasticModel>(s => s
        .Index("demo").Query(q => q.Bool(b => b.Should(container))));

    if (!response.IsValid)
    {
        throw new SearchException(response.DebugInformation);
    }

    return response.Documents.ToList();
}

Here, we use Documents in the response. Astute readers will note that in the RESTful API, the results are in hits, and NEST does provide a response.Hits property for all hit results. So what’s the difference?

To put it simply, the elements in Hits represent the hits list in the RESTful response, which includes extra fields like Score. Documents, however, only contains the _source field of each hit.

"hits": [
    // ...
    {
        "_index": "works",
        "_id": "12346",
        "_score": 1,
        "_source": {
            "id": 12346,
            "name": "Design Pattern",
            "description": "GoF patterns",
            "updated": "2023-11-18T14:14:06"
        }
    },
    // ...
]
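If you need the metadata as well, you can read Hits instead of Documents. The sketch below pairs each document with its _id and _score; the class name HitReader and the tuple shape are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Nest;

// Hypothetical helper; names are assumptions for illustration.
public static class HitReader
{
    // Maps every hit to (_id, _score, _source).
    public static List<(string Id, double? Score, T Source)> WithScores<T>(ISearchResponse<T> response)
        where T : class
    {
        return response.Hits
            .Select(hit => (hit.Id, hit.Score, hit.Source))
            .ToList();
    }
}
```

Score is nullable because Elasticsearch does not always compute a score, for example when the results are sorted by a field.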

3.1.2 Search with many conditions

Here, we get all models that match any value in a given list. In this case, a good solution is to build a query container and add each match to it as a single query. So here is how it works. It is actually a logic query, see Logic Operator.

public async Task<List<ElasticModel>> GetModelsByName(IEnumerable<string> words)
{
    // Combine one term query per word with OR.
    var container = new QueryContainer();
    foreach (string word in words)
    {
        container |= new QueryContainerDescriptor<ElasticModel>()
            .Term(m => m.Field(f => f.Name).Value(word));
    }

    ISearchResponse<ElasticModel> response = await _client.SearchAsync<ElasticModel>(s => s
        .Index("demo").Query(q => q.Bool(b => b.Should(container))));

    if (!response.IsValid)
    {
        throw new SearchException(response.DebugInformation);
    }

    return response.Documents.ToList();
}
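As an alternative to OR-ing many Term queries, Elasticsearch also has a terms query that takes the whole list at once. This is a different technique from the one above, shown only as a sketch; the class and method names are assumptions, and the field is passed in as an expression so that the helper works for any model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Threading.Tasks;
using Nest;

// Hypothetical helper; names are assumptions for illustration.
public static class TermsSearch
{
    // Returns the documents whose field exactly equals any of the given values.
    public static async Task<List<T>> GetByAnyValueAsync<T>(
        IElasticClient client,
        string index,
        Expression<Func<T, string>> field,
        IEnumerable<string> values) where T : class
    {
        ISearchResponse<T> response = await client.SearchAsync<T>(s => s
            .Index(index)
            .Query(q => q.Terms(t => t.Field(field).Terms(values))));

        if (!response.IsValid)
        {
            throw new InvalidOperationException(response.DebugInformation);
        }

        return response.Documents.ToList();
    }
}
```

It could be called as TermsSearch.GetByAnyValueAsync<ElasticModel>(_client, "demo", f => f.Id, ids). Like any search, it returns only 10 hits by default, so set Size when the list is longer.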

It’s a little tricky that QueryContainer uses operator overloading, rather than methods, to express “OR” and “AND”.

To expose a high-level API, NEST seems to use many “descriptors” when constructing requests.

3.2 Fuzzy Match

One of Elasticsearch’s advantages is that it supports fuzzy search. To use fuzzy search, you should use a Match query instead of Term, and set the fuzziness manually. Here the fuzziness is defined as the Levenshtein edit distance (the number of single-character edits). Supported distances are 0, 1 and 2.

// "algorihm" is misspelled on purpose; one edit turns it into "algorithm".
await _client.SearchAsync<ElasticModel>(s => s
    .Index("demo")
    .Query(q => q.Match(m => m.Field(f => f.Description)
        .Query("algorihm").Fuzziness(Fuzziness.EditDistance(1)))));

Get used to ))))...) with NEST. 🤭

3.3 Pagination

Of course, Elasticsearch supports pagination, but with from and size. It is easy to convert page and page size to from and size. Here, we assume that page starts from 0.

await _client.SearchAsync<ElasticModel>(s => s
    .Index("index name")
    .From(page * pageSize)
    .Size(pageSize)
    // ...

You shouldn’t use pagination to iterate through all data! By default, Elasticsearch returns at most 10,000 records this way, which means from + size must not exceed 10,000 (the index.max_result_window setting). To iterate through everything, use the Scroll API instead.
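For reference, a scroll loop with NEST might look roughly like the sketch below. The class and method names, the batch size of 1000 and the one-minute keep-alive are all assumptions; the sketch also collects everything in memory, which is only reasonable for moderately sized indices.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Nest;

// Hypothetical helper; names and numbers are assumptions for illustration.
public static class ScrollReader
{
    // Reads every document of an index in batches via the Scroll API.
    public static async Task<List<T>> ReadAllAsync<T>(IElasticClient client, string index)
        where T : class
    {
        var all = new List<T>();

        // The first request opens the scroll context and returns the first batch.
        ISearchResponse<T> response = await client.SearchAsync<T>(s => s
            .Index(index)
            .Size(1000)
            .Scroll("1m")
            .Query(q => q.MatchAll()));

        while (response.IsValid && response.Documents.Count > 0)
        {
            all.AddRange(response.Documents);
            // Every following request continues from the previous scroll ID.
            response = await client.ScrollAsync<T>("1m", response.ScrollId);
        }

        if (!response.IsValid)
        {
            throw new InvalidOperationException(response.DebugInformation);
        }

        // Release the scroll context on the server once we are done.
        string scrollId = response.ScrollId;
        if (scrollId != null)
        {
            await client.ClearScrollAsync(c => c.ScrollId(scrollId));
        }

        return all;
    }
}
```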

3.4 Sorting

Well, sorting is relatively simple. Just specify the field and the order in the search.

await _client.SearchAsync<ElasticModel>(s => s
    .Index("index name")
    .From(page * pageSize)
    .Size(pageSize)
    .Sort(m => m.Field(f => f.Updated, SortOrder.Descending))
    .Query(/* ... */));

3.5 Logic Operator

Elasticsearch supports logical AND, OR and NOT in queries. To use them, we only need a bool query.

await _client.SearchAsync<ElasticModel>(s => s
    .Index("demo")
    .From(dto.Page * dto.PageSize)
    .Size(dto.PageSize)
    .Query(q => q.Bool(b => b
        .Must(q => q.Match(m => m.Field(f => f.Name).Query("hello")))
        .Should(q => q.Match(m => m.Field(f => f.Name).Query("there")))
        .MustNot(q => q.Match(m => m.Field(f => f.Name).Query("General"))))));

Must is AND, Should is OR, and MustNot is NOT.
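One caveat about Should: when a bool query also contains a Must clause, the Should clauses become optional by default and only affect the score. If you want “at least one of these” semantics together with Must, you can set minimum_should_match. The sketch below shows this; the class name, method name and parameters are assumptions for illustration.

```csharp
using System;
using System.Linq.Expressions;
using System.Threading.Tasks;
using Nest;

// Hypothetical helper; names are assumptions for illustration.
public static class BoolSearch
{
    // Matches documents that contain `required` AND at least one of the two options.
    public static Task<ISearchResponse<T>> RequiredAndAnyOfAsync<T>(
        IElasticClient client,
        string index,
        Expression<Func<T, string>> field,
        string required,
        string optionA,
        string optionB) where T : class
    {
        return client.SearchAsync<T>(s => s
            .Index(index)
            .Query(q => q.Bool(b => b
                .Must(mq => mq.Match(m => m.Field(field).Query(required)))
                .Should(
                    sq => sq.Match(m => m.Field(field).Query(optionA)),
                    sq => sq.Match(m => m.Field(field).Query(optionB)))
                // Without this line, the Should clauses would only influence scoring.
                .MinimumShouldMatch(1))));
    }
}
```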


Appendix: Some Tricks

A. Dynamic Field

For common queries, flexibility may be the biggest concern. For example, we may need to dynamically change the field we want to search based on the request parameter. So here is how we change field based on query.

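// Requires: using System.Linq.Expressions;
// Maps a request parameter to the model property to search on; null means "unknown field".
private static Expression<Func<ElasticModel, string>>? GetField(string field)
{
    return field switch
    {
        "name" => w => w.Name,
        "description" => w => w.Description,
        _ => null
    };
}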

We have to return an Expression instead of a Func, as a Func cannot be converted into an Expression directly. We can only create a LINQ expression from a lambda.

With this, we can achieve dynamic fields in queries. You may need to check for null before passing the result to Field.

q.Match(m => m.Field(GetField("name")).Query(cond.Value))
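One way to deal with the null case is to fail early instead of returning null. The sketch below is a variation on GetField that throws for unknown field names; the class name FieldSelector, the method name and the trimmed-down model are assumptions for illustration. It only needs System.Linq.Expressions, so it can be tried without Elasticsearch.

```csharp
using System;
using System.Linq.Expressions;

// Trimmed-down copy of the model, only to keep the example self-contained.
public class ElasticModel
{
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";
}

// Hypothetical helper; names are assumptions for illustration.
public static class FieldSelector
{
    // Maps a request parameter to a property expression, or throws for unknown names.
    public static Expression<Func<ElasticModel, string>> GetFieldOrThrow(string field) =>
        field switch
        {
            "name" => w => w.Name,
            "description" => w => w.Description,
            _ => throw new ArgumentException($"Unsupported search field: {field}", nameof(field))
        };
}
```

The result can be passed to Field directly, e.g. m.Field(FieldSelector.GetFieldOrThrow("name")), and an invalid request parameter surfaces as an ArgumentException that the API layer can turn into a 400 response.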

B. Complex Condition

When we do logic operations, the conditions may not be fixed, so we face the problem of constructing the bool query dynamically. Here is an example of building a flexible bool query.

private static BoolQueryDescriptor<ElasticModel> ConstructQueryDescriptor(
    BoolQueryDescriptor<ElasticModel> descriptor)
{
    descriptor = descriptor.Must(q => q.Match(m => m.Field(f => f.Name).Query("hello")));
    descriptor = descriptor.Should(q => q.Match(m => m.Field(f => f.Name).Query("there")));
    descriptor = descriptor.MustNot(q => q.Match(m => m.Field(f => f.Name).Query("General")));

    return descriptor;
}

With this, you can simply call this function in bool query.

await _client.SearchAsync<ElasticModel>(s => s
    .Index("demo")
    .From(dto.Page * dto.PageSize)
    .Size(dto.PageSize)
    .Query(q => q.Bool(b => ConstructQueryDescriptor(b))));

The same goes with other types of queries.
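To make this truly dynamic, the clauses can be built from a list of conditions that comes with the request. The sketch below groups the conditions by clause and assigns each clause once, because, as far as I can tell, calling Must (or Should, MustNot) a second time replaces the earlier clauses instead of appending to them. The Occur enum, the Condition record and the builder class are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using Nest;

// Hypothetical condition types; names are assumptions for illustration.
public enum Occur { Must, Should, MustNot }

public record Condition(Occur Occur, string Value);

public static class BoolQueryBuilder
{
    // Fills a bool query with one match query per condition, grouped by clause.
    public static BoolQueryDescriptor<T> Build<T>(
        BoolQueryDescriptor<T> descriptor,
        Expression<Func<T, string>> field,
        IEnumerable<Condition> conditions) where T : class
    {
        // Collects the match queries that belong to one clause.
        Func<QueryContainerDescriptor<T>, QueryContainer>[] For(Occur occur) => conditions
            .Where(c => c.Occur == occur)
            .Select(c => (Func<QueryContainerDescriptor<T>, QueryContainer>)(
                q => q.Match(m => m.Field(field).Query(c.Value))))
            .ToArray();

        return descriptor
            .Must(For(Occur.Must))
            .Should(For(Occur.Should))
            .MustNot(For(Occur.MustNot));
    }
}
```

Inside a search it would be used as .Query(q => q.Bool(b => BoolQueryBuilder.Build(b, f => f.Name, conditions))).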


Epilogue

It is really easy to communicate with Elasticsearch using NEST. The APIs are, overall, natural to use and understand. Just so much better than doing it in Python, which lacks static typing and is thus less productive for this kind of work.