
Simplify IT & Security Operation Monitoring for Any Environment with Pure and Elastic

From searchable snapshots to simplified cluster provisioning, learn how Elasticsearch and Pure FlashBlade make distributed search simple, speedy, and secure.
00:08
Stacie Brown: Hello, and welcome to this session at Accelerate 2021, Scaling and Protecting Elasticsearch. My name is Stacie Brown. I'm a Field Solutions Architect here at Pure Storage, and today I'll be joined by Richard Devore, Chief Architect at NTT Managed Services. The FlashBlade is a great option for
00:32
making the infrastructure underneath Elastic more agile, and therefore Elasticsearch more agile. It's interesting to note that the FlashBlade at its core shares a lot of the same design principles as Elastic. We're going to discuss in more detail how Pure FlashBlade helps Elasticsearch
00:51
scale while maintaining speed, all without sacrificing simplicity. We'll also be talking about how you can protect your Elastic data, and how Pure and Elasticsearch are enabling customers to search against historic data without increasing operational complexity or breaking the
01:09
bank. And Richard is going to share with all of you some of the key improvements and decisions made at NTT Managed Services to provide their customers the best experience when accessing their data, all backed by Elasticsearch and FlashBlade. This disaggregation of compute and storage means
01:31
storage performance and capacity can now be scaled independently. Scaling is no longer tied to the compute layer, while still delivering consistently low latency with predictable linear increases in both performance and capacity. Infrastructure for Elasticsearch needs to be agile. This
01:53
means it needs to be able to scale on demand and in different ways so we can meet flexible user requirements, and it needs to enable iterative changes in the way users are using Elasticsearch, supporting the ability to quickly and easily spin up new clusters to support testing of new versions or new
02:13
configurations. This also needs to be easy so that engineers and end users alike can take full advantage of Elasticsearch. This means running more clusters, running larger clusters, and dynamically changing and shaping these clusters. Doing this successfully requires agile
02:32
infrastructure. The Elasticsearch environment must be architected to work with thousands of events, supporting the ingest of more data that is retained for longer periods of time, as companies will need to support internal team demands to iterate. This will mean supporting storage requirements
02:50
for data retained from weeks to years, all while responding to real-time and ad hoc queries. These teams come with expectations already set by the cloud providers: spinning up new environments at a moment's notice. Elasticsearch is often part of a larger data analytics pipeline, not just a standalone
03:11
application. There's the data collection, shipping, and raw data storage as well. So as you look at the larger analytics pipeline, you often see other components that have a need for storage, all of which can be stored on FlashBlade. FlashBlade delivers consistently low latency with predictable linear increases
03:30
in performance and capacity, from seven blades in one chassis all the way up to 150 blades in 10 chassis, with multidimensional performance. And by multidimensional performance, we mean not tuned for a specific I/O pattern: fully capable of supporting small or large, random or sequential, data or
03:51
metadata operations, making FlashBlade a perfect fit for serving storage throughout the entire analytics pipeline. Traditionally, Elasticsearch is deployed on physical servers with direct-attached storage, but this creates several challenges. Uneven disk utilization or a single disk
04:10
failure can cause Elasticsearch to shut itself down due to unplanned compute demands. In the traditional approach to implementing Elasticsearch, there are so many areas for failure, not to mention operational complexity. Let's talk a little bit about scaling an Elasticsearch cluster. With
04:30
legacy direct-attached storage, scaling requires adding both compute and storage. For many customers, this also means acquiring customized compute to meet the demands of Elastic. Regardless of what may be needed to scale, be it compute or storage, both compute and storage will need to be added to
04:51
the cluster. This also implies changes at other layers such as networking, rack space, or power. Once the server is provisioned and integrated into the Elasticsearch cluster, a rebalance of the index shards is required before the application starts to use this node. As the cluster
05:12
grows, it's likely to become more and more imbalanced, since there are now more resources than necessary just to keep up with the scaling requirements. And then there is the data movement overhead involved during a rebalance, which will frequently cause unplanned downtime as the system is subjected to a higher
05:30
volume of transfer requests. With a disaggregated approach, Elasticsearch can scale compute or storage independently, expanding capacity by adding a single blade at a time, which also increases the available performance provided by the FlashBlade. And increasing compute can now take advantage
05:52
of commodity servers that can be added as a VM, a physical server, or even a container. And of course, containers allow for more of an Elasticsearch-as-a-service experience. This approach significantly simplifies the operational aspects of running an Elasticsearch cluster. By being
06:10
able to independently scale either storage or compute as needed, we gain more efficient resource utilization, which means that we can now support bursting Elasticsearch compute nodes to cope with specific queries or batch jobs required by the business, perhaps end-of-quarter or end-of-year
06:27
operations, or to meet a sudden requirement to change the retention period from one year to two years by simply increasing the available storage footprint. Scaling Elasticsearch is now agile.
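On the Elasticsearch side, that kind of retention change is typically just an ILM policy edit. Here is a minimal sketch using the Python Elasticsearch client; the endpoint, policy name, and phase timings are hypothetical placeholders, not a configuration shown in the session.

```python
from elasticsearch import Elasticsearch

# Hypothetical cluster endpoint and policy name; adjust for your environment.
es = Elasticsearch("http://elastic.example.internal:9200")

# Extend retention from one year to two by updating the ILM delete phase.
# The storage footprint grows, but no compute needs to be added.
es.ilm.put_lifecycle(
    policy="logs-retention",
    body={
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}},
                "delete": {
                    "min_age": "730d",  # was "365d": one year -> two years
                    "actions": {"delete": {}},
                },
            }
        }
    },
)
```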
06:46
Let's also take a look at how node failures are handled. When Elasticsearch is deployed on direct-attached storage, upon a node failure Elasticsearch will automatically start rebuilding the missing shards across the remaining nodes in order to maintain the required replica counts. Of course, in this legacy direct-attached storage architecture, Elasticsearch data is stored in one primary shard and, at a
07:04
minimum, two replica shards. This rebuild operation is resource intensive, so when there is a node failure, we suddenly have a spike in workload that frequently disrupts our Elastic cluster's normal operation and has a high potential for impacting both ingestion and query performance. This failure
07:24
adds additional stress to all hardware layers of the stack: network, compute, and the local storage. Nobody wants a call from an unhappy customer, so ideally the deployment of Elasticsearch is built to minimize any impact of a failure. Even in a perfect world, things break. Now, let's take a
07:43
look at this same failure, but taking advantage of FlashBlade in a disaggregated architecture. For starters, because FlashBlade is providing additional data durability, we need to configure only a single replica shard: we'll have one primary shard and one replica shard. When a node fails, there will be less data
08:04
to rebuild. But more importantly, the data on the failed node is still present on the FlashBlade, which means the recovery node can be remapped to the original NFS volume. Now the Elastic cluster only needs to replay a subset of events to be back in consistency. And when we're using Kubernetes and Portworx,
08:23
the replacement of the failed node happens automatically and transparently to any end user or the Elastic administrator. By leveraging the FlashBlade in a disaggregated architecture, the impact of node failures becomes invisible, and end users experience predictable performance even during failure events.
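As a rough illustration of the single-replica layout described above, here is a minimal sketch using the Python Elasticsearch client; the endpoint, template name, index pattern, and shard counts are hypothetical placeholders, not values from the session.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://elastic.example.internal:9200")  # hypothetical endpoint

# With FlashBlade providing durability underneath, one replica per primary
# is enough; on legacy DAS the same template would typically carry two or more.
es.indices.put_template(
    name="logs-template",  # hypothetical template name
    body={
        "index_patterns": ["logs-*"],
        "settings": {
            "number_of_shards": 3,
            "number_of_replicas": 1,
        },
    },
)
```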
08:43
We know that all infrastructure needs to be maintained. Operating systems need to be upgraded and patched, networks need to be patched and improved, and software upgrades are unavoidable. When Elasticsearch is architected based on DAS, to protect the data and
09:02
availability of the cluster, nodes are taken out of the cluster before any maintenance work is performed. During this process, the data will be evacuated off of a node and redistributed across the remaining nodes in the cluster. This step is repeated for each node in the cluster, a time-
09:21
consuming and cumbersome operational task for large-scale Elasticsearch deployments. Similar to a node failure, the additional I/O workload can add stress to the cluster, which could lead to unplanned downtime. When leveraging FlashBlade in a disaggregated architecture, we have confidence
09:40
in the reliability of the data. Since the FlashBlade has six nines of availability, I do not need to evacuate each node to perform the upgrade. Since the data is persisted on the FlashBlade, I can simply initiate a rolling upgrade across all the nodes in the cluster: no wasting additional time monitoring as each node gets upgraded, and no performance impact to end-user queries.
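One common way to orchestrate such a rolling upgrade is to temporarily restrict shard allocation while each node restarts instead of draining its data first. This is a hedged sketch of that generic Elasticsearch rolling-restart pattern with a hypothetical endpoint; it is not a Pure- or NTT-specific procedure.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://elastic.example.internal:9200")  # hypothetical endpoint

# Before restarting a node, stop replica reallocation so the cluster does not
# start rebuilding shards that will reappear when the node returns.
es.cluster.put_settings(
    body={"persistent": {"cluster.routing.allocation.enable": "primaries"}}
)

# ... patch / upgrade and restart the node here (outside Elasticsearch) ...

# Once the node rejoins, re-enable allocation and wait for the cluster to go green.
es.cluster.put_settings(
    body={"persistent": {"cluster.routing.allocation.enable": "all"}}
)
es.cluster.health(wait_for_status="green", request_timeout=300)
```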
09:58
As we talked about earlier, Elasticsearch is often part of a data analytics pipeline. The demand to allow users to quickly experiment and iterate means providing an agile, highly performant Elasticsearch-as-a-
10:20
service architecture. Users may need test and dev copies of the production cluster to validate a new software version or to try a new configuration. Deploying distributed applications like Elasticsearch in a legacy DAS architecture makes experimentation and quick iteration extremely difficult.
10:40
It means sizing compute, storage, networking, rack space, and power for every single iteration. Leveraging the FlashBlade in a disaggregated architecture makes it easy to be much more agile in delivering distributed applications as a service. I can create new virtual or containerized compute
11:01
layers to leverage the existing storage, with the ability to replay data from production to test/dev, all on the same shared storage platform. And because the FlashBlade is designed to handle multidimensional application workloads, there is no risk to performance when spinning up these test/dev
11:19
copies. The FlashBlade becomes the backbone for the end-to-end analytics pipeline, simplifying operations and enabling agile service delivery to end users, allowing us to deliver Elastic on-prem with a cloud-like experience. Let's talk a little bit more about Elastic as a service. Kubernetes and
11:41
containers are the platform of choice when it comes to agile as-a-service delivery models. Let's go into a bit more depth on how we deliver Elasticsearch as a service in such an environment. We leverage the Elastic Cloud on Kubernetes (ECK) operator. This provides a solid framework to deploy Elasticsearch clusters
12:00
within our Kubernetes environment. Portworx can also be leveraged to deliver persistent volume claims to Kubernetes applications backed by FlashBlade, for easy scaling, configuration management, and resiliency. We can now easily transition away from customized physical servers
12:18
to a containerized Elasticsearch deployment with zero data migration. When deploying Elasticsearch on Kubernetes with Portworx, we're able to deliver simple scaling, automated failure recovery, and bursting for high ingest and query periods, all as part of offering Elasticsearch as a service.
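To make that concrete, here is a minimal sketch of an ECK-managed Elasticsearch cluster definition applied with the Kubernetes Python client. The cluster name, namespace, node count, version, storage size, and the Portworx storage class name are all hypothetical placeholders, not the NTT configuration.

```python
from kubernetes import client, config

config.load_kube_config()

# A minimal ECK Elasticsearch resource: three data nodes, each backed by a
# persistent volume claim served by a (hypothetical) Portworx storage class.
elasticsearch_cr = {
    "apiVersion": "elasticsearch.k8s.elastic.co/v1",
    "kind": "Elasticsearch",
    "metadata": {"name": "logs", "namespace": "elastic"},
    "spec": {
        "version": "7.11.2",
        "nodeSets": [
            {
                "name": "data",
                "count": 3,
                "volumeClaimTemplates": [
                    {
                        "metadata": {"name": "elasticsearch-data"},
                        "spec": {
                            "accessModes": ["ReadWriteOnce"],
                            "resources": {"requests": {"storage": "1Ti"}},
                            "storageClassName": "portworx-flashblade",  # hypothetical
                        },
                    }
                ],
            }
        ],
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="elasticsearch.k8s.elastic.co",
    version="v1",
    namespace="elastic",
    plural="elasticsearches",
    body=elasticsearch_cr,
)
```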
12:39
Elastic is a stateful service, and this does make the Kubernetes configuration more complex than it is for stateless microservices. The biggest challenge will happen when configuring storage and network, and you'll want to make sure both subsystems deliver consistently low latency.
12:55
FlashBlade is able to provide this consistently low latency across varied application workloads without adding additional operational complexity. And together, Elastic ECK and Portworx provide simple provisioning, scaling, and recovery of Elasticsearch clusters while eliminating the
13:15
administrative burden, not only allowing customers to quickly and easily spin up new Elastic clusters, but also allowing seamless scaling: Elasticsearch as a service delivered with speed and simplicity at scale. Now let's go into how we can protect that Elasticsearch data.
13:37
Another great way that Elastic and FlashBlade work together is leveraging the FlashBlade as a snapshot repository. Elasticsearch is configured to point to the FlashBlade as an S3 target; when an Elastic snapshot is taken, it is stored on the FlashBlade. This is an important data protection strategy.
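A minimal sketch of that configuration with the Python Elasticsearch client, assuming the S3 client in elasticsearch.yml already points at the FlashBlade data VIP and the credentials are in the keystore; the endpoint, repository, bucket, and snapshot names are hypothetical.

```python
from elasticsearch import Elasticsearch

# Hypothetical endpoint; it is assumed elasticsearch.yml already points the
# "default" S3 client at the FlashBlade data VIP (s3.client.default.endpoint)
# and that the access/secret keys are stored in the Elasticsearch keystore.
es = Elasticsearch("http://elastic.example.internal:9200")

# Register a FlashBlade S3 bucket as a snapshot repository.
es.snapshot.create_repository(
    repository="flashblade-snapshots",    # hypothetical repository name
    body={
        "type": "s3",
        "settings": {
            "bucket": "elastic-snapshots",  # hypothetical bucket on the FlashBlade
            "client": "default",
        },
    },
)

# Take a snapshot of the cluster; the data lands on the FlashBlade.
es.snapshot.create(repository="flashblade-snapshots", snapshot="daily-2021-03-01")
```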
13:56
As Elasticsearch becomes more valuable in log analytics environments, SLAs and recovery times become critical. If you are backing up the container, the VM, or the physical host, you are not performing a consistent backup of your indices. If there were to be a problem, either an infrastructure outage, accidental
14:17
deletion, or some sort of corruption detected, customers need to be able to restore that Elasticsearch index. And you can do that quickly when the snapshots are hosted on a high-performance object store.
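For example, restoring a damaged index from the repository sketched above might look roughly like this; the endpoint, index, and snapshot names are hypothetical.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://elastic.example.internal:9200")  # hypothetical endpoint

# Restore one index from a snapshot held on the FlashBlade. The original index
# must be closed or deleted first, or restored under a new name as shown here.
es.snapshot.restore(
    repository="flashblade-snapshots",
    snapshot="daily-2021-03-01",
    body={
        "indices": "logs-2021.02.28",
        "rename_pattern": "logs-(.+)",
        "rename_replacement": "restored-logs-$1",
    },
    wait_for_completion=True,
)
```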
14:36
This high-performance object store is also the foundation of supporting experimentation and quick iteration, by leveraging the same snapshots to quickly and easily spin up test/dev environments. Snapshots are also useful for hot archiving: you can restore data from the past when it needs to be used. FlashBlade also supports replicating these snapshots to a DR location that is either
14:56
another FlashBlade or even AWS S3. This gives customers another option for spinning up transient Elasticsearch clusters in the cloud. In March of this year, 2021, Elastic announced that searchable snapshots are now GA in Elastic 7.11. Searchable snapshots will allow customers to retain and search
15:20
data on object storage. Let's go into how FlashBlade and searchable snapshots work together and the value this will bring to Elastic environments. Searchable snapshots allow customers to store data that is older. Customers can now extend the value of snapshots taken as part of a data protection
15:40
strategy, allowing users to search against data that is infrequently accessed. This also means there is less data for the Elastic cluster to manage, reducing not only the storage requirements but also the operational overhead. By default, searchable snapshot indices have no replicas. The
16:00
underlying snapshot provides resilience, and the query volume is expected to be low enough that a single shard copy is sufficient. But this is where the FlashBlade performance really shines, providing low-latency query times against those snapshots. Searchable snapshots are managed through
16:18
ILM: the regular index is converted into a searchable snapshot when it reaches the cold or frozen phase. By leveraging the FlashBlade as the searchable snapshot repository, customers are able not only to retain data longer, but to search against it quickly and easily.
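A minimal sketch of such an ILM policy, again with hypothetical names and timings, where indices are converted into searchable snapshots in the FlashBlade-backed repository once they reach the cold phase.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://elastic.example.internal:9200")  # hypothetical endpoint

# Indices roll over while hot, then are converted into searchable snapshots
# stored in the FlashBlade-backed repository once they go cold.
es.ilm.put_lifecycle(
    policy="logs-searchable",  # hypothetical policy name
    body={
        "policy": {
            "phases": {
                "hot": {
                    "actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}
                },
                "cold": {
                    "min_age": "30d",
                    "actions": {
                        "searchable_snapshot": {
                            "snapshot_repository": "flashblade-snapshots"
                        }
                    },
                },
            }
        }
    },
)
```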
16:40
So, in summary from our discussion today: FlashBlade consolidates modern data applications onto a single scalable platform, eliminating complex and inefficient infrastructure silos and delivering new levels of investment protection. FlashBlade provides a truly differentiated solution for all
16:58
of your workloads, whether that includes protecting Elastic data, keeping data searchable for longer periods of time, or offering Elastic as a service. And like we mentioned earlier, Elastic is often part of a much bigger data analytics pipeline. Leveraging FlashBlade as the persistent storage for the
17:18
end-to-end data pipeline also means a single source of truth for the data across many different analytics use cases, accessed across many different teams, all working together to provide better business insights. I hope that you enjoyed our session today and learned more about scaling and
17:37
protecting your Elasticsearch environment. Thank you for your time. Enjoy the rest of Accelerate.
  • Data Analytics
  • FlashBlade
  • Elastic

For IT and security operations teams, time to insights is critical. The only way to identify and proactively resolve issues is with high speed access to data – and lots of it. During this session, Stacie Brown, Sr. Field Solutions Architect at Pure Storage and Richard Devore, Chief Architect at NTT Ltd. Managed Services, walk through how Pure helps IT teams store more data and access it in less time with a focus on Elastic platforms. They will cover common challenges with scaling, data protection and how you can future-proof investments to take advantage of cloud agility with Elastic Searchable Snapshots.
