Checking what is going on at a glance

Hey All,

Logwriter here! How are you?

I should have published the 4th part of the series “MongoDB Deployment Tips on NetApp All-Flash FAS Systems” today, but I wasn’t able to finish my test plan (which contains 3 test scenarios). Comparing the MongoDB WiredTiger compression feature with ONTAP storage efficiency is a time-consuming task.

The 4th part of this series will be published September 13th, 2016.

I have committed to myself that I’m going to publish an article per week. So, I’ve decided to write about something that has been helping me to understand my MongoDB database during my testing: mongostat.

If for any reason you’re not using MongoDB Cloud Manager or MongoDB Ops Manager to manage and monitor your MongoDB database, I bet you would like to understand mongostat’s output.

Of course mongostat doesn’t bring the same value as Cloud Manager or Ops Manager, but it can help you understand your workload characteristics during a real-time troubleshooting task.

Understanding mongostat output

Let’s pretend this is the first time I’m accessing this database and I want to understand what mongod is doing here.

The command: mongostat --host localhost --port 27100 1

  • --host: your MongoDB database host
  • --port: the port where mongod is listening
  • 1: the refresh interval in seconds
mongostat

Figure 1. mongostat output.
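In case the screenshot is hard to read here, this is roughly what the first line of that output looks like, reconstructed from the values I discuss below (columns such as vsize, res, conn and time are omitted for brevity):

    insert  query  update  delete  getmore  command  % dirty  % used  flushes  qr|qw  ar|aw  netIn  netOut
         0 107812    5823       0        0        1      0.3     1.9        1    2|0    4|1  10.4m    222m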

At a glance I can tell you a few things about the workload; let me use the first line of values to comment on it:

  • insert column: there weren’t any insert operations happening at the moment – 0 operations
  • query column: it’s a read-intensive workload – 107,812 query operations
  • update column: there are only a few updates compared to the queries – 5,823 update operations
  • delete column: there weren’t any delete operations happening at the moment – 0 operations
  • getmore column: database cursors weren’t busy – 0 operations
  • command column: there is one command per second – 1 command
  • % dirty column: 0.3% of the WiredTiger cache is dirty. That looks fine, since updates aren’t the majority of the operations – 0.3% dirty bytes
  • % used column: WiredTiger cache utilization is at 1.9% – 1.9% used
  • flushes column: MongoDB is flushing data to disk every second. In this case that is expected, since the parameter syncPeriodSecs is set to 1 in /etc/mongod.conf (see the config snippet after this list) – 1 flush per second
  • qr | qw column: qr (queue read) and qw (queue write); there are 2 clients in the queue waiting to read data and 0 clients waiting to write data – 2|0
  • ar | aw column: ar (active read) and aw (active write); there are 4 clients reading data and 1 client writing data – 4|1
  • netIn column: mongod received 10.4 megabytes of network traffic during the interval – 10.4m
  • netOut column: mongod sent 222 megabytes of network traffic during the interval – 222m
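For reference, this is roughly what that setting looks like in /etc/mongod.conf (MongoDB 3.x YAML format; the dbPath below is only an example, not necessarily mine):

    storage:
      dbPath: /data/db        # example path, adjust to your environment
      journal:
        enabled: true
      syncPeriodSecs: 1       # interval between flushes/checkpoints to disk; the default is 60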

If you’re a storage admin, let me remind you that mongostat reports MongoDB’s counters and MongoDB’s point of view of the environment.

By that I mean: if you see 100,000 in the query column, it doesn’t mean you will get 100,000 read operations at the storage. The same goes for update, insert, and delete. It isn’t a 1-to-1 relationship.

I hope this helps a little bit if you need to look at a MongoDB instance for the first time to understand what is going on, or if you have to troubleshoot an issue.

Please let me know if you have any questions and see you next post!

Thanks for reading it,

Logwriter

 


MongoDB Deployment Tips on NetApp All-Flash FAS Systems – Part 3

featured_image_mongodb_series

Hey All,

Logwriter here! How are you?

Here we go with Part 3 of MongoDB Deployment Tips on NetApp AFF Systems. In this part I’m going to show you how to restore your entire MongoDB replica set using the NetApp Snap Creator Framework.

It’s a beautiful summer day; you’re at your desk playing with MongoDB Compass, querying your database, when all of a sudden your connection to the database hangs and you’re trying to understand what is happening.

part3_server_exploding

Your entire ReplicaSet is gone… and your manager is probably going like this:

part3_make_it_stop

Luckily your database is running on a NetApp AFF system that takes a snapshot every 15 minutes with a retention policy of 48 snapshots, which gives you 12 hours of snapshots to look back on if you need it. Of course I just picked some numbers here; the number of snapshots available for restore will depend entirely on your RPO needs. In this example, your RPO would be 15 minutes.

After taking a deep breath and asking your manager to calm down, you log in to your Snap Creator Framework and start a restore operation, following steps 1 through 3 as indicated in Figure 1.

part3_sc_restore_step00

Figure 1. Starting a Snap Creator Restore Operation.

After you click “Restore” you will see a screen asking you 5 questions, all listed in Figure 2. Basically, Snap Creator wants to know which volumes you want to restore, which kind of restore you want to execute, and which snapshot you want to use for the restore.

The restore type can be “Volume Restore”, where all of the volume’s content is reverted to the state it was in when the snapshot was taken, or “Single File Restore”, where you choose which files/LUNs get reverted to that state.

I’m going with “Volume Restore” because my entire ReplicaSet is gone, so I want to revert all my servers to the same point in time.

part3_sc_restore_step01

Figure 2. Providing Snap Creator the information it needs to proceed with a restore.

Then you click “next” and you will see a summary screen as shown by Figure 3.

part3_sc_restore_step02

Figure 3. Snap Creator restore summary screen.

After clicking “Finish”, only volume mdb_ntap1_01 is part of the restore job. Snap Creator will ask you if you want to add more volumes to this restore job, as shown in Figure 4.

part3_sc_restore_step03

Figure 4. Adding more volumes to the restore job.

Click “Yes” and add all the volumes that are part of your MongoDB replica set. In my case, I have volumes mdb_ntap1_01 through mdb_ntap1_08, plus the arbiter’s volume mdb_ntap1_arbiter. After you’ve added all your volumes, click “no” and Snap Creator will show you the list of volumes that will be restored by this restore job, as shown in Figure 5.

part3_sc_restore_step04

Figure 5. List of volumes that will be restored by the restore job.

Then you click “Ok” and after a few seconds your entire ReplicaSet is restored.

part3_sc_restore_step05

Figure 6. Restore Finished.

If you will be using the same set of servers to bring your replica set up, you are ready to rescan your LUNs. If the servers were blown up like the GIF at the beginning of this post, you can map your LUNs to another set of servers and then bring your replica set up and running.
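If you are wondering what the rescan looks like on a Linux host, here is a minimal sketch (it assumes the sg3_utils package and LVM; the volume group, logical volume and mount point names are just examples):

    # rescan the SCSI bus so the host sees the restored LUNs (sg3_utils package)
    rescan-scsi-bus.sh
    # or, for iSCSI sessions:
    iscsiadm -m session --rescan
    # reactivate the LVM volume group and mount the database file system
    vgchange -ay mdb_vg
    mount /dev/mdb_vg/mdb_lv /data/db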

I would like to highlight that the restore process took 25 seconds, as you can see in Figures 7 and 8.

part3_sc_restore_log00

Figure 7. Restore has been started at 3:59:26 PM.

part3_sc_restore_log01

Figure 8. Restore was finished at 3:59:51 PM.

 

Figure 9 shows the database size.

part3_sc_restore_show_dbs

Figure 9. MongoDB database size.

So, my replica set is composed of 1 primary, 1 secondary, and 1 arbiter (which holds no data). That means that with a 1TB database, the total data restored by Snap Creator was 2TB in 25 seconds. Keep in mind that a volume SnapRestore reverts the volume’s pointers to the snapshot rather than copying data back, which is why the restore time barely depends on the data size. Pretty fast, BOOYAH!!

Please let me know if you have any questions and see you next post.

Thanks for reading it.

Logwriter

MongoDB Deployment Tips on NetApp All-Flash FAS Systems – Part 2

featured_image_mongodb_series

Hey All,

Logwriter here! How are you?

This post became longer than I was expecting, so I’ve decided to change the agenda of this series a little bit. In part two, you will see:

  • WiredTiger and Its Write Path
  • NetApp Snap Creator Framework
  • Creating Your MongoDB Profile and Configuration on Your Snap Creator Server

That said, let’s get started… I hope you enjoy it!

NetApp Snap Creator is the tool you want to use for backup, restore and clone operations with your MongoDB database.

part2_mdb_snapcreator

Figure 1. Snap Creator Login screen.

But before jumping into how to use Snap Creator, let’s see how WiredTiger handles write operations.

WiredTiger and Its Write Path

WiredTiger uses MultiVersion Concurrency Control (MVCC) to handle write transactions. At the beginning of a write operation, WiredTiger provides a point-in-time version (aka snapshot) of the data that resides on disk; the snapshot is a consistent representation of that data in memory.

When it is time to write to disk, WiredTiger writes all of the snapshot’s dirty pages to the data files in a consistent way. This process is called a checkpoint.

By default, checkpoints are taken every 60 seconds. This can be modified by changing the parameter syncPeriodSecs in /etc/mongod.conf.

But checkpoints aren’t enough to guarantee the durability of the data. Between checkpoints MongoDB protects your data using journaling.

A journal record is a representation of the operations that are happening with your data. For example, a document update might result in changes to an index, so WiredTiger creates a single journal record that contains the update operation and the necessary changes to the index.

By default, MongoDB syncs the WiredTiger journal buffers to disk on the following events:

  • Every 50 milliseconds;
  • If a write operation occurs with write concern of j: true, WiredTiger syncs the journal files immediately;

The journal sync interval can be modified using the option commitIntervalMs; the minimum value is 1 and the maximum is 500 (milliseconds).
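Just as a sketch, both intervals live under the storage section of /etc/mongod.conf; the values below are for illustration only, so check the MongoDB documentation for the defaults of your version:

    storage:
      syncPeriodSecs: 60          # checkpoint interval in seconds
      journal:
        enabled: true
        commitIntervalMs: 100     # journal sync interval in milliseconds (allowed range: 1-500)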

You’re probably asking yourself: “Why do I need to know about WiredTiger snapshots, checkpoints, and journaling?”

You need to know about these things to understand what will actually be in your backup when you create one.

Through the write concern (j: true or j: false), MongoDB lets the application determine the data’s level of importance, or better said, the durability of the data.

So, if your application is sending write operations with j: true, when you take a snapshot all of the operations acknowledged by mongod will certainly be in your backup. But if your application is sending write operations with j: false, there is no guarantee that all operations acknowledged by mongod will be in your snapshot, because you don’t know whether you’re taking the snapshot before or after the WiredTiger buffers are flushed to the journal files.
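For illustration, this is what a journaled write could look like from the mongo shell (the database, collection, and document below are made up, and the port is only an example):

    mongo --port 27100 --eval '
      db.getSiblingDB("appdb").orders.insertOne(
        { item: "abc", qty: 1 },
        { writeConcern: { w: 1, j: true } }  // acknowledged only after the write reaches the journal
      )
    '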

Note: You might be thinking about db.fsyncLock(), but according to MongoDB it is disruptive to your database. It makes mongod block all write operations and it might affect reads. The connection that started the lock must stay open to issue the unlock; otherwise a hard shutdown (process kill) of mongod is required to unlock your instance.

NetApp Snap Creator Framework

Snap Creator is a unified data protection platform for multiple applications, databases, and operating systems. It helps NetApp’s customers address the following challenges:

  • Achieve application-consistent data protection
  • Standardize and simplify backup, restore and disaster recovery tasks in any environment
  • Become cloud ready by reducing the complexity of unique business processes and the development and management of scripts, helping you to make your backup, restore, and disaster recovery tasks more agile.

Let’s take a look at how Snap Creator works. It’s made of two main components: the Snap Creator Server (scServer) and the Snap Creator Agent (scAgent). Figure 2 gives you a better view of how it looks:

part2_mdb_snapcreator_dp_arch

Figure 2. NetApp Snap Creator Framework Data Protection Architecture

You need a VM or bare-metal server to run your scServer. This VM or server needs to be able to reach your ONTAP cluster, because it is responsible for sending the API calls that back up, restore, or clone your application.

The scAgent goes on your application’s servers. So, in our case, the scAgent is installed on my MongoDB primary and secondary servers.

Creating your MongoDB Profile and Configuration on Your Snap Creator Server

To connect to your Snap Creator Server you just need a browser pointed at your scServer’s IP on port 8443. The first time you log in you will see the screen shown in Figure 3.

part2_mdb_sc_welcome_screen

Figure 3. Snap Creator Welcome screen.

After clicking “OK”, you will be redirected to the profile creation screen shown in Figure 4.

part2_mdb_sc_new_profile

Figure 4. Creating a new profile.

At this point I’m going to create a profile called “production”. Here I’m using the profile as the representation of my MongoDB production environment. After clicking “OK” you will be redirected to the configuration wizard, as shown in Figure 5.

part2_mdb_cfg_wiz_01

Figure 5. Snap Creator Configuration Wizard.

You need to give a name to your config. The name of my config is PRD_Real_Time_Analytics_DB.

part2_mdb_cfg_wiz_02

Figure 6. Configuration Name.

Snap Creator doesn’t have a specific plug-in for MongoDB, so at this screen you will select “None” as your plug-in type.

part2_mdb_cfg_wiz_03

Figure 7. Plug-in type selection.

After that, you need to provide the name or IP address of the host where your scAgent is installed. In our case, I’ve installed the scAgent on my MongoDB replica set members and I’m pointing to the name of my primary server in the configuration.

part2_mdb_cfg_wiz_04

Figure 8. Agent Configuration.

Now it is time to let your scServer know how to talk to your ONTAP cluster. At this point you need to specify which protocol you want to use to connect to your ONTAP cluster/SVM. Here I’m using HTTP on port 80.

part2_mdb_cfg_wiz_05

Figure 9. Storage Connection Settings.

Now your scServer knows which protocol will be used to talk to ONTAP, but it still needs the connectivity and authentication information to send commands. Here you provide your SVM/cluster name or IP address, plus the username and password used to connect to it.

part2_mdb_cfg_wiz_06

Figure 10. Controller and SVM Credentials.

The scServer now connects to your SVM/cluster and shows a list of volumes (left panel) for that particular object. You need to select the volumes that contain your database and click the “right arrow” button to move them to the right side, as shown in Figure 11. Then click “save”.

part2_mdb_cfg_wiz_07

Figure 11. Selecting Your Database Volumes.

After that, Snap Creator shows you the list of volumes that you’ve selected and your credentials for your configuration. Then, click “next”.

part2_mdb_cfg_wiz_08

Figure 12. List of the SVM/cluster and volumes that will be part of your configuration.

Now it is time to set up your backup and retention policies. For this example, we are setting up a daily backup policy with a retention of 2 days.

part2_mdb_cfg_wiz_09

Figure 13. Backup and retention policy.

Here is the most critical step. As your database is spread across multiple volumes (in my case I have 8 volumes), to back it up properly you need to take a consistency group snapshot. So, at this screen, don’t forget to check the “Consistency Group” check box before clicking “next”.

part2_mdb_cfg_wiz_10

Figure 14. Snapshot Details.

The next screen is about replication and remote backup (SnapMirror and SnapVault, respectively). I’m not using either of these options in my environment, so just click “next”.

part2_mdb_cfg_wiz_11

Figure 15. SnapMirror and SnapVault.

If you have OnCommand Unified Manager in your environment and want your scServer to send notifications to it, here is where you provide the connection information to make that happen.

part2_mdb_cfg_wiz_12

Figure 16. NetApp OnCommand Unified Manager settings.

Then comes a summary screen and the opportunity to review all your settings before clicking “Finish”.

part2_mdb_cfg_wiz_13

Figure 17. Configuration Summary.

After clicking “Finish” your configuration is created and you are ready to create your first MongoDB backup.

part2_mdb_cfg_wiz_14

Figure 18. Configuration created successfully.

To create your first backup, select your configuration in the left panel, click the “Actions” button, and then click “Backup”.

part2_mdb_action_backup

Figure 19. Backup your database.

On the next part of this series I will cover restore and cloning operations.

Thanks for reading, and let me know your opinion about this post by leaving a comment.

A special thanks to Kevin Adistambha (Technical Service Engineer @ MongoDB), who answered my questions about fsyncLock() and fsyncUnlock() in the MongoDB users forum.

See you soon,

Logwriter

References:

WiredTiger Internals

WiredTiger Snapshots and Checkpoints

NetApp Snap Creator Data Sheet

ONTAP 9 and Oracle Database Storage Efficiency and Performance

Hey All,

Logwriter here! How are you?

ONTAP 9.0 introduced a bunch of new features. So, as always, the Workload Engineering Team put it through its labs to run some testing and provide the results to the NetApp community and the IT industry.

In TR-4514 we show that even with more features added to ONTAP, the performance of your Oracle database isn’t affected at all. You get better storage efficiency without impacting the performance of your application.

NetApp ONTAP 9.0RC1 is available for download at the NetApp support website, and if you want to get an idea of how an Oracle database performs on it, check it out here.

See you soon!

 

 

MongoDB Deployment Tips on NetApp All-Flash FAS Systems – Part 1

featured_image_mongodb_series

Hey All,

Logwriter here! How are you?

More than ever, the parts of a data center are seamlessly integrated within each other. This integration helps IT professionals get their jobs done faster, which saves time and money.

Some parts have a stronger dependency between each other. For example, a database cannot store data without a place to write it down.

Since we are talking about integration, it would be helpful for a database administrator (DBA) to understand what kind of features a storage system can offer for their databases. I’m not saying that a DBA should be an expert on storage systems, but knowing the features a storage system can offer would make their work faster and easier.

NetApp All-Flash FAS (AFF) systems offer features like triple-parity RAID protection, Snapshot® copies, FlexClone® volumes, inline deduplication, inline compression, compaction, encryption, and replication. They offer all of these features without impacting the performance of your applications.

MongoDB is an enterprise NoSQL database. It has been used by companies like MetLife, Citigroup, eBay, McAfee, and Adobe to support their third-generation applications.

Here is what I’ll cover during this series of posts:

  • Part 1 will cover “Choosing a protocol” and “Volume/LUNs layout”.
  • Part 2 will cover “WiredTiger and Its Write Path” and “Backup using Snap Creator”
  • Part 3 will cover “Restoring your MongoDB Replica Set using Snap Creator”
  • Part 4 will cover “Storage Efficiency and MongoDB”
  • Finally, Part 5 will cover “MongoDB Performance on ONTAP 9.0”

If you are a MongoDB DBA or a storage administrator who needs to deploy a MongoDB database environment, let me share some tips about how to deploy a MongoDB database on a NetApp AFF system and what it can do for you.

Choosing a Protocol 

NetApp AFF system is a multiprotocol storage solution. It can talk FCP, iSCSI, NFS and CIFS simultaneously if you need it.

According to the MongoDB 3.2 manual (Production Notes), NFS is not recommended for MMAPv1; it can be used with WiredTiger, but you should expect lower performance.

Before the age of SSD drives, the bottleneck of a storage system was disk I/O: HDDs used to deliver higher and less predictable latencies. With the advent of SSDs, the I/O layer largely vanished as the bottleneck.

The NetApp ONTAP operating system has been optimized to work with SSD drives, and as a multiprotocol storage system it allows you to design your environment without worrying about which protocol will support your application.

ONTAP shows a small difference in performance between FCP and NFS, with FCP delivering slightly better throughput (IOPS) at the same latency as NFS, but that would only matter for applications like stock market trading, where every microsecond counts.

Volume & LUN Layout

Your volume/LUN layout decisions can affect two important factors of your MongoDB environment: Storage Efficiency and Performance.

MongoDB uses replica sets to provide protection and high availability for its data. That means, assuming the most traditional MongoDB cluster topology, you’ll end up with 1 primary server and 2 secondary servers.

Considering the topology mentioned above, if your database size is 1TB you’ll need 3 TB of space to deploy your database. The primary data (1TB), plus 2 copies (1TB + 1TB).

NOTE: WiredTiger provides compression and you could save some space using it, but I’m not taking it into account to make the example easier to follow. 

Knowing how NetApp storage space efficiency works, you might be inclined to create one volume with 3 LUNs on it and map one LUN to each server. Check out Figure 1.

mdb_vollun_singlevol

Figure 1. Volume/LUN layout – single volume

Well, it’s the layout that achieves the best storage space efficiency, since the 3 copies of the same data reside on the same volume, but it isn’t the best FlexVol design for performance.

Figure 1 shows that your ONTAP cluster is an HA pair, so it’s made of 2 nodes. Each node has a data aggregate (node1 has n1_aggr1 and node2 has n2_aggr1), so if you create a single volume on n1_aggr1 you’re using only half of the compute resources (CPU, cache, network) available in your ONTAP cluster.

Let me show you a similar volume/LUN layout that fits like a charm into both buckets: storage efficiency and performance.

mdb_vollun_multivol

Figure 2. Volume/LUN Layout – Multiple volumes/LUNs

Figure 2 shows a volume/LUN layout where you’re allowing your MongoDB database to have access to all the compute resources (CPU, memory, network) available in your ONTAP cluster.

Spreading your database across multiple volumes

Instead of creating a single 3TB FlexVol as shown in Figure 1, let’s try the multiple volumes/LUNs approach (Figure 2):

a) Divide 3TB by 4 FlexVols to get the per-volume capacity. That gives you 768GB per FlexVol.

b) On each FlexVol, create 3 LUNs of 250GB each.

c) Map one LUN from each FlexVol to each server. That gives a total of 4x 250GB per server (1TB).

d) Using LVM on your servers, create a volume group composed of the 4x 250GB LUNs and then create a striped logical volume where the number of stripes equals the number of LUNs (--stripes 4) and the stripe size is 4KB (--stripesize 4K).
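Here is a minimal sketch of what step d) could look like on one of the servers (the multipath device names, volume group/logical volume names, and mount point are examples; adjust them to your environment):

    # create physical volumes on the 4 LUNs presented to this server
    pvcreate /dev/mapper/mdb_lun1 /dev/mapper/mdb_lun2 /dev/mapper/mdb_lun3 /dev/mapper/mdb_lun4
    # volume group composed of the 4x 250GB LUNs
    vgcreate mdb_vg /dev/mapper/mdb_lun1 /dev/mapper/mdb_lun2 /dev/mapper/mdb_lun3 /dev/mapper/mdb_lun4
    # striped logical volume: number of stripes = number of LUNs, 4KB stripe size
    lvcreate --name mdb_lv --extents 100%FREE --stripes 4 --stripesize 4K mdb_vg
    # file system and mount point for the MongoDB dbPath
    mkfs.xfs /dev/mdb_vg/mdb_lv
    mkdir -p /data/db
    mount /dev/mdb_vg/mdb_lv /data/db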

By applying a layout of multiple volumes/LUNs, you maximize both the storage space efficiency savings and the performance of your MongoDB database.

In the next part of this series you will learn how to back up, restore, and clone a MongoDB database using the powerful features available through your ONTAP cluster.

Stay tuned !

Does the protocol matter?

communication

Hey All,

Logwriter here! How are you?

Let’s say you have a storage subsystem with hard drives (yes, the kind that spins a set of platters, remember?! LOL) and you want to evaluate a new storage solution, because the latency caused by the hard drives isn’t acceptable for your application anymore and you’d like to buy something faster.

If you’re evaluating a NetApp All-Flash FAS (AFF) solution, it’s a multiprotocol solution: your database can talk to the storage over FCP, NFS, or iSCSI. Then a question might pop up in your head:

“Moving from a traditional storage solution (hard drives) to an All-Flash Array, does the protocol matter?”

To answer that question, NetApp has done a protocol performance comparison for Oracle databases.

We ran a set of tests with an Oracle RAC database accessing an AFF8080, and the only change between tests was the protocol. We tested FCP, iSCSI, Oracle dNFS, and Linux kernel NFS.

Moving from a traditional storage solution (hard drives) to an All-Flash Array, does the protocol matter? Check it out here.

 

NVA – FlexPod Select for High-Performance Oracle RAC Environments

flexpod_minime

I’m very excited to announce that our new NVA design guide, FlexPod Select for High-Performance Oracle RAC Environments, is available for download through the NetApp Media Library; check it out here.

NVA stands for NetApp Verified Architecture and it means that NetApp has done sufficient testing to verify that the architecture works smoothly and efficiently to deliver the best performance to your business.

This document is an update of the previous NVA design guide, where an EF550 was used in the architecture and had reached 1,000,000 IOPS at 0.9 ms.

We’ve added some computing power to the environment and also upgraded our EF550 to an EF560 with proven results to show how scalable our technologies are.

Are you curious to see what this is all about? Download the document and check out page number 13 (I love this number, it’s my lucky number).

I know that maybe a lot of people are going to comment here “Ok! More than 1,000,000 IOPS at less than 1ms. What is the point?”

If I were a DBA, it would be a relief to know that if my database requests a block and gets a miss from the db_cache, the storage subsystem is fast enough to deliver it at sub-millisecond latency without interfering with my application’s performance.

This is the second time I’ve been asked to work on a project that demands the creation of an official, public NetApp document.

I don’t know if you have an idea how much work has to be done to transform this:

flexpod_rack

into this:

NVA0012-front-cover

Bhavin Shah and I are listed in the document as the authors, but I don’t think it’s fair not to mention the names of the people who worked with Bhavin and me on this document, so I really want to say THANK YOU to:

  • Scott Lane, Lee Dorrier, Chris Lemmons and Robert Yoder for reviewing the document and making sure that everything on it was presented in a consistent manner;
  • John George for helping us with all the diagrams regarding Cisco UCS;
  • Kevin Blake for working on the brand review and technical editing.

I hope this document can be useful to you. Please let me know if you have any questions.