Elasticsearch Basics

At the core of elasticsearch’s intelligent search engine is Lucene. It’s highly scalable, real-time search, distributed and analytics and is natively JSON + REST API. In this tutorial we’ll look at some components and features of Elasticsearch.

 

Cluster

A cluster is a collection of one or more nodes that together holds
your entire data and provides
federated indexing and search
capabilities across all nodes.

Node

A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities.

Index

An index is a collection of documents that have somewhat similar characteristics.

Document

A document is a basic unit of information that can be indexed.

Type

A type is a logical category/partition of your index whose semantics is completely up to you.

Elasticsearch Features
Real Time Data
Real Time Analytics
High Availability
Distributed
Full Text Search
RESTful API

 

Elasticsearch can subdivide index into multiple pieces called shards.

Sharding is important for two primary reasons:

 

Real Time Data

Data flows into your system but getting in back real fast is what we are looking for.Need examples, OK Searching 50 millions venues in real time? Foursquare does it using Elasticsearch. Fog Creek Software’s search is control by Elasticsearch for 30 million requests per month across 40 billion lines of code and many more.

Real Time Analytics

Exploring data is as important as free text search today.Understanding data and gaining insights allows business to take better decision and improve product.

High Availability

Elasticsearch clusters are resilient — they will detect and remove failed nodes, and reorganize themselves to ensure that your data is safe and accessible.

Elasticsearch allows you to make one or more copies of your index’s shards into what are called replica shards, or replicas for short.

Replication is important for two primary reasons:

Distributed

Elasticsearch allows you to start small, but will grow with your business. It is built to scale horizontally out of the box. As you need more capacity, just add more nodes, and let the cluster reorganize itself to take advantage of the extra hardware.

Full Text Search

Using Lucene under the hood to provide powerful full text search capabilities available in any open source product. Search comes with multi-language support, a powerful query language, support for geolocation, context aware did-you-mean suggestions, autocomplete, search snippets and many more.

Comparison of DB and Elasticsearch

DB Elasticsearch
Database Index or Indices
Table Type
Column Field

 

Configurable and Extensible

ES configurations can be changed while ES is running, but some will require a restart (and in some cases reindexing). Most configurations can be changed using the REST API as well.

ES has several extension points – namely site plugins (let you serve static content from ES), rivers (for feeding data into ES), and plugins that let you add modules.

Elasticsearch Configuration Example [elasticsearch.yml]

Basic this which can be configured are

You can exploit these settings to design advanced cluster topologies.

  1. 1. You want this node to never become a master node, only to hold data. This will be the “workhorse” of your cluster. master: false, node.data: true
  2. 2. You want this node to only serve as a master: to not store any data and to have free resources. This will be the “coordinator” of your cluster. master: true, node.data: false
  3. 3. You want this node to be neither master nor data node, but to act as a “search load balancer” (fetching data from nodes, aggregating   results, etc.) node.master: false, node.data: false

There are more advance settings for having control over you data & config.

1 2 3 20
Let's Share
Exit mobile version