My Adventures in Coding

April 28, 2010

MongoDB – Setup a Replica Pair running in auth mode

Filed under: MongoDB,NOSQL — Brian @ 10:11 pm

If you are using MongoDB and need a Master/Slave configuration in which the Slave is automatically promoted to Master when the Master fails, MongoDB Replica Pairs will do the job. In a Replica Pair the Slave checks the Master for updates every few seconds. If the Master fails to respond, the Slave automatically takes over as the Master, so as far as your application is concerned everything is still functioning correctly. When the failed member of the pair comes back online, it sees that the other member is currently the Master, starts as a Slave, and syncs up with the Master.
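
To make the promotion rules concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not how MongoDB implements it; the Node class and the settle_roles function are names we made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for one member of a replica pair."""
    name: str
    alive: bool = True
    role: str = "slave"


def settle_roles(a, b):
    """Toy rule set: one live node is master; a returning node rejoins as slave."""
    for node in (a, b):
        if not node.alive:
            node.role = "down"
    live = [node for node in (a, b) if node.alive]
    if not live:
        return
    if any(node.role == "master" for node in live):
        # The current master keeps its role; every other live node is a slave.
        for node in live:
            if node.role != "master":
                node.role = "slave"
    else:
        # No live master: promote the first live node.
        live[0].role = "master"
        for node in live[1:]:
            node.role = "slave"


server1, server2 = Node("Server1"), Node("Server2")
settle_roles(server1, server2)   # Server1 starts as master
server1.alive = False
settle_roles(server1, server2)   # Server2 is promoted
server1.alive = True
settle_roles(server1, server2)   # Server1 rejoins as slave
print(server1.role, server2.role)
```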

In our testing the Replica Pair setup worked very well. However, we wanted to run Replica Pairs with MongoDB auth mode turned on so we could password-protect our databases. It took us a couple of attempts to figure out this setup, so here are our instructions. Hopefully this will help!

Replica Pair using auth mode

Server1 = Your server that has all of the data you want to use
Server2 = Your server that currently has no data (this is our failover server)

Assuming both servers have a data file location of /data/db

  • Server1: delete all files with local.*
    • rm -f /data/db/local.*
  • Server2: ensure your /data/db folder is empty
  • Start Server1
    • mongod --pairwith Server2 --dbpath /data/db
    • Server1 will become the current Master
  • Start Server2
    • mongod --pairwith Server1 --dbpath /data/db
    • Server2 will become the current Slave
  • With the mongo shell connect to Server1
    • mongo --host Server1
    • Add credentials to the admin database
      • use admin
      • db.addUser("admin","adminpassword")
      • db.auth("admin","adminpassword")
    • Add replication credentials to the local database
      • use local
      • db.addUser("repl","replpassword")
      • exit
  • Stop Server1 (ctrl+c)
    • Server2 should now switch to being the new Master
  • With the mongo shell connect to Server2
    • mongo --host Server2
    • Authenticate with the admin credentials; they were copied from Server1 (Master) to Server2 (Slave)
      • use admin
      • db.auth("admin","adminpassword")
    • Add the replication credentials to the “local” database; these were not copied from Server1 to Server2 automatically while running as a replica pair
      • use local
      • db.addUser("repl","replpassword")
      • exit
  • Stop Server2 (ctrl+c)
  • Start Server1 in auth mode (It will be the Master)
    • mongod --pairwith Server2 --auth --dbpath /data/db
  • Start Server2 in auth mode (It will be the Slave)
    • mongod --pairwith Server1 --auth --dbpath /data/db
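
If you script this procedure, the only difference between the bootstrap phase and the final phase is the --auth flag. Here is a minimal sketch in Python that builds the mongod command lines used above. pair_command is a hypothetical helper of our own; it only assembles arguments and does not start anything.

```python
def pair_command(peer, auth=False, dbpath="/data/db"):
    """Return the mongod argument list for one member of a replica pair."""
    args = ["mongod", "--pairwith", peer]
    if auth:
        # Added only once the admin and repl users exist on both servers.
        args.append("--auth")
    args += ["--dbpath", dbpath]
    return args


# Phase 1: start both members without auth so the users can be created.
bootstrap = [pair_command("Server2"), pair_command("Server1")]
# Phase 2: restart both members with auth enabled.
secured = [pair_command("Server2", auth=True), pair_command("Server1", auth=True)]

for args in bootstrap + secured:
    print(" ".join(args))
```

Keeping the commands as argument lists means they could be handed to something like subprocess without any shell quoting, should you want to automate the restarts.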

Python Replica Pair Connection

Example connection string for Python connecting to a replica pair (Server1, Server2)

import pymongo
connection=pymongo.Connection.from_uri("mongodb://username:password@Server1,Server2/database")
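
One thing to watch for when building this connection string by hand: characters such as “@”, “:” and “/” in the username or password clash with the URI syntax. The sketch below percent-encodes the credentials with Python's standard library before assembling the URI. The replica_pair_uri helper is our own illustrative name, and we are assuming the driver version you use decodes percent-encoded credentials, so check its documentation.

```python
from urllib.parse import quote_plus


def replica_pair_uri(username, password, hosts, database):
    """Build a mongodb:// URI listing every host, with escaped credentials."""
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{','.join(hosts)}/{database}"


# The password here is a made-up example containing reserved characters.
uri = replica_pair_uri("admin", "p@ss:word/1", ["Server1", "Server2"], "database")
print(uri)
```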

April 26, 2010

MongoDB – Connecting a Slave to a Master running in auth mode

Filed under: MongoDB,NOSQL — Brian @ 11:06 pm

We have been running MongoDB in a test environment with a Master and a Slave, and it has been working fine. However, for production we want to run our MongoDB Master and Slave in auth mode. To do this, according to the documentation on the MongoDB site, you need to create an account in the “local” database on both the Master and the Slave with the username “repl” (i.e., the replication user). This common user on both instances of MongoDB is what is used to authenticate a Slave. The documentation on the MongoDB site on how to set up a Master-Slave configuration in auth mode is kind of vague, so to help out anyone attempting this setup for the first time, here are our step-by-step instructions.

To set up a Master and Slave running in auth mode:

Setup Master

  • Create a directory to store your MongoDB database files
    • mkdir /data/db
  • Go to the bin folder of where your MongoDB code was extracted
    • e.g., /Users/me/mongodb/mongodb-osx-x86_64-1.4.0/bin
  • Start the Master DB
    • mongod --dbpath /data/db
  • Open another command prompt in the same folder and run the MongoDB shell
    • mongo
    • NOTE: You can also connect to a remote mongo server using the shell: mongo [remotehostname]
  • Create an admin user on the admin database
    • use admin
    • db.addUser("admin","adminpassword")
    • exit
  • Stop the MongoDB server (use ctrl+c in the command prompt where it was started)
  • Start MongoDB in “auth” mode
    • mongod --master --auth --dbpath /data/db
  • Now let’s log in to the admin database using the admin credentials and add the “repl” (replication) user
    • mongo admin -u admin -p adminpassword
    • use local
    • db.addUser("repl","replpassword")
    • exit
  • Just to make sure it works, let’s log in to our MongoDB server with the “repl” credentials
    • mongo local -u repl -p replpassword
    • If you get a command prompt, then the setup was all successful!
    • exit

Setup Slave

Follow the same setup steps on your Slave server as on the Master. Every MongoDB instance is configured the same way; which instance is the Master and which is the Slave is determined by the options used when the database is started.

NOTE: The above instructions assume you are setting up the Master and the Slave on two physical machines

  • If you want to set up both on the same machine, you will need to use a different “dbpath” folder for each
  • For example:
    • mongod --master --dbpath /data/masterdb
    • mongod --slave --dbpath /data/slavedb

Start the Master and the Slave

  • On Master server: mongod --master --auth --dbpath /data/masterdb
  • On Slave Server: mongod --slave --auth --source [masterhostname] --dbpath /data/slavedb/
  • The Slave should start now and successfully connect to the Master running in auth mode

Test your configuration

  • Open a mongo shell to the Master database
    • mongo --host [masterhostname] admin -u admin -p adminpassword
  • Now let’s add a new database and a user account
    • use foo
    • db.addUser("foouser","foopassword")
  • Now let’s check to make sure the foo database and user have been replicated to the slave
    • Open a mongo shell to the Slave database
    • mongo --host [slavehostname] foo -u foouser -p foopassword
    • show collections
    • If you get a command prompt and can type “show collections”, everything is working fine!

NOTE: If your configuration was set up incorrectly you will probably see the following error:

        replauthenticate: no user in local.system.users to use for authentication

March 7, 2010

MongoDB queries in Java using Conditional Operators

Filed under: Java,NOSQL — Brian @ 4:29 pm

When I first started using MongoDB, I used the interactive shell to learn the query syntax. The syntax is simple and straightforward. However, when I started using the Java driver I was not sure how to translate some of my command line queries into Java code, in particular conditional operators like “$in”. The Java documentation on the MongoDB website only showed how to use basic conditionals like greater than and less than, not conditionals that take a list of values, such as “$in”.

A simple query for all cars with the make “Ford” that match any of several models listed:

SQL:

SELECT * FROM dbo.Cars
WHERE make = 'Ford'
AND model IN ('Galaxy', 'Mustang', 'Meteor')

MongoDB interactive shell:

db.cars.find( { "make":"Ford", "model":{ $in: ["Galaxy","Mustang","Meteor"] } } )

MongoDB Java driver:

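import com.mongodb.BasicDBObject;
import com.mongodb.DBCursor;
// carsCollection is assumed to be an existing com.mongodb.DBCollection
BasicDBObject query = new BasicDBObject();
query.put("make", "Ford");
String[] models = {"Galaxy", "Mustang", "Meteor"};
// "$in" matches when the field equals any value in the array
query.put("model", new BasicDBObject("$in", models));
DBCursor resultsCursor = carsCollection.find(query);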
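
For comparison, here is the same query expressed as a plain Python dictionary (the shape the Python driver accepts), together with a tiny in-memory matcher that shows what “$in” means. The matches function is purely illustrative, handles only equality and “$in”, and is not part of any MongoDB driver. The sample cars are made up.

```python
def matches(document, query):
    """Return True if the document satisfies an equality/$in-only query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict) and "$in" in condition:
            # $in: the field must equal one of the listed values.
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


query = {"make": "Ford", "model": {"$in": ["Galaxy", "Mustang", "Meteor"]}}

cars = [
    {"make": "Ford", "model": "Galaxy"},
    {"make": "Ford", "model": "Fairlane"},
    {"make": "Dodge", "model": "Mustang"},
]
found = [car for car in cars if matches(car, query)]
print(found)
```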

February 15, 2010

Getting Started with Mongo DB

Filed under: MongoDB,NOSQL — Brian @ 4:28 pm

Why are we considering Mongo DB?

In our current system we receive data from our customers, store it in a relational database, and then change very little when we make that data available via a REST service. The data we are storing is all related; each chunk is really a document. So lately we have been questioning why we go through so much work to break the data down into a schema we have created, only to put the data back into the same format when we use it. This extra work has made us wonder if a Document Store is a better fit for our needs. There is nothing wrong with a relational database; it is just that as the needs of data storage change, a one-size-fits-all solution is not necessarily the best solution in every case.

To give just one example of the problems we have been running into: the data we have to store changes constantly. Each time a customer requires that we store a new field, we must modify our database schema to include this new field and update our application to map the data received to it. We really don’t want to define a schema; we just want to store the data given to us by our customers and surface it as we received it. Trying to force the data into a database schema we define based on what we “think” our customers “might” send us is time consuming and, in this case, just does not feel right.

Our team has been considering document style database solutions such as Couch DB for the last few months, but we have not yet been ready to make the jump. Switching from a relational to a document style database is a big shift, not just in technology, but also in how we approach the design of our application. Moving to Couch DB is a more drastic shift than switching to Mongo DB (You can read about the differences here). Recently we have become more interested in Mongo DB and decided to spend some time exploring it.

A Blend of Document and Relational Databases

We do not want a predefined schema; we want the database to just take the data we give it and store it. However, we are not quite ready to give up all of the flexibility in querying that a relational database provides. This is what piqued our interest in Mongo DB. What makes Mongo DB unique amongst document style databases is that it allows you to store full documents in collections (think of a collection as a schema-free table) in JSON format, BUT it still gives you the ability to query over any fields in those documents and create indexes for fields that are often queried.

Alright, to the fun stuff! Let’s try out Mongo DB

  • Download the software: http://www.mongodb.org/display/DOCS/Downloads
  • Unzip the Mongo DB software to a folder such as: C:\mongodb-win32-x86_64-1.2.2
  • Create a folder to store database files:
    • The default folder is C:\data\db on Windows and “/data/db” on Unix systems.
    • Make sure to create this folder before starting Mongo DB.
  • Start Mongo DB
    • cd C:\mongodb-win32-x86_64-1.2.2\bin
    • mongod.exe
    • NOTE: If you get an error saying “Assertion: dbpath (/data/db/) does not exist” then you have either not created the directory, or permissions have not been set appropriately.

Interactive Shell

Mongo DB provides a shell interface for querying the database directly. This is very useful when you want to try the database for the first time.

To start the Interactive shell:
cd C:\mongodb-win32-x86_64-1.2.2\bin
mongo.exe

Just to get started let’s try out a few basic statements

Insert a document into a new collection
Note that when we insert the document “car” we are also creating a brand new collection called “cars”. You can think of a collection as being like a table in a relational database, but without any defined schema. Also note that collections are created on demand: if you try to insert a document into a collection that does not exist, a new collection will be created and the document will be inserted.

> car = { make: "Ford" , model: "Galaxy"};
     { "make" : "Ford", "model" : "Galaxy" }
> db.cars.save(car);
> db.cars.find();
     { "_id" : ObjectId("4b7789f2fb5c000000006faa"), "make" : "Ford", "model" : "Galaxy" }

Query for the document
In the above statements we inserted a new document and then did a “find”, which returns ALL documents in the collection. But what if we want to query for a specific document? Mongo DB provides a function called “findOne”, which returns a single document based on the search criteria provided.

> db.cars.findOne({ make: "Ford" });
     {
             "_id" : ObjectId("4b7789f2fb5c000000006faa"),
             "make" : "Ford",
             "model" : "Galaxy"
     }

Update the existing document
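In this update statement we are saying: find a document in the “cars” collection with a “model” of “Galaxy” and replace its contents with the document given as the second argument, which includes the new “status” field set to “InStock”. Because we pass a full document rather than an update operator, the stored document (apart from its “_id”) is replaced, so any field we want to keep, such as “make” and “model”, must be included. If the “status” field already exists it is overwritten; if it does not, it is added. By default only the first matching document is updated.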

> db.cars.update({ model: "Galaxy"}, {make: "Ford", model: "Galaxy", status: "InStock"});
> db.cars.findOne({ make: "Ford" });
     {
             "_id" : ObjectId("4b7789f2fb5c000000006faa"),
             "make" : "Ford",
             "model" : "Galaxy",
             "status" : "InStock"
     }
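
A detail worth spelling out about updates like this one: when the second argument is a plain document rather than an update operator such as “$set”, the stored document is replaced (its “_id” is kept). The toy Python model below contrasts the two styles. It is an illustration only, not driver code: apply_update is a hypothetical helper that understands just “$set”, and the “year” field is made up to show a field being lost.

```python
def apply_update(document, update):
    """Toy model of an update: operator style ($set only) or full replacement."""
    if any(key.startswith("$") for key in update):
        # Operator style: change only the named fields, keep everything else.
        result = dict(document)
        result.update(update.get("$set", {}))
        return result
    # Replacement style: everything except _id is replaced by the new document.
    replaced = {"_id": document["_id"]}
    replaced.update(update)
    return replaced


stored = {"_id": 1, "make": "Ford", "model": "Galaxy", "year": 1965}

replaced = apply_update(stored, {"make": "Ford", "model": "Galaxy", "status": "InStock"})
patched = apply_update(stored, {"$set": {"status": "InStock"}})

print(replaced)  # no "year" field any more
print(patched)   # "year" is preserved
```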

Query for a list of documents
When you use the “find” command you can specify criteria to search by (which feels very much like a select statement in a relational database). The find command returns a cursor, which allows you to call functions like “next()” and “hasNext()” to retrieve documents. The interesting thing is that the cursor is not a snapshot (a list of all documents that met the search criteria at the moment it was created); instead, documents are fetched from the server as you iterate. So if you create a cursor for all “cars” with a “make” of “Ford” and, while your loop is running, new documents matching these criteria are added to the “cars” collection, they can be included as well. This feature is especially interesting to us because in our current system we provide data feeds. Being able to start a data transfer and have documents added while the job is running picked up too is a big plus.

> var cursor = db.cars.find({ make: "Ford" });
> cursor.length()
     1
> car = { make: "Ford" , model: "Fairlane"};
     { "make" : "Ford", "model" : "Fairlane" }
> db.cars.save(car);
> cursor.length()
     2
> db.cars.find({ make: "Ford" });
     { "_id" : ObjectId("4b7789f2fb5c000000006faa"), "make" : "Ford", "model" : "Galaxy", "status" : "InStock" }
     { "_id" : ObjectId("4b79c916fb5c000000006fab"), "make" : "Ford", "model" : "Fairlane" }
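
The difference between a snapshot and a cursor that keeps reading can be shown with a toy model in plain Python. This is only an analogy under our own assumptions: live_cursor is a hypothetical generator over an in-memory list, whereas a real MongoDB cursor fetches documents from the server in batches, so its exact behaviour with concurrent inserts may differ.

```python
def matches_criteria(document, criteria):
    """Equality-only match, enough for this illustration."""
    return all(document.get(key) == value for key, value in criteria.items())


def live_cursor(collection, criteria):
    """Yield matching documents lazily, re-checking the list length each step."""
    index = 0
    while index < len(collection):
        document = collection[index]
        index += 1
        if matches_criteria(document, criteria):
            yield document


cars = [{"make": "Ford", "model": "Galaxy"}]
criteria = {"make": "Ford"}

snapshot = [car for car in cars if matches_criteria(car, criteria)]  # fixed now
cursor = live_cursor(cars, criteria)

first = next(cursor)
cars.append({"make": "Ford", "model": "Fairlane"})  # added mid-iteration
remaining = list(cursor)

print(len(snapshot), len([first] + remaining))
```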

Import JSON data from a file

Also, if you have data in valid JSON format in a file and would like to import this data, you can use the import from file utility:

mongoimport.exe --file [PathToMyFile] --collection [NameOfNewCollection]

For our initial test of Mongo DB we created a test file with 200,000 JSON records of real production data and loaded it into Mongo DB with this file import utility. The performance was very good: the import ran at about 1,000 records per second.
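
mongoimport reads one JSON document per line by default, so a test file like ours can be generated with a few lines of Python. The sketch below writes such a file and works out how long an import of our size would take at the rate we saw. write_import_file and estimated_seconds are hypothetical helper names, and the sample documents are made up.

```python
import json
import os
import tempfile


def write_import_file(documents, path):
    """Write one JSON document per line and return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for document in documents:
            handle.write(json.dumps(document) + "\n")
            count += 1
    return count


def estimated_seconds(record_count, records_per_second):
    """Rough import duration at a steady rate."""
    return record_count / records_per_second


sample = [
    {"make": "Ford", "model": "Galaxy"},
    {"make": "Ford", "model": "Fairlane"},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "cars.json")
    written = write_import_file(sample, path)
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()

print(written, len(lines))
print(estimated_seconds(200_000, 1000))  # seconds for 200,000 records
```

At roughly 1,000 records per second, 200,000 records works out to about 200 seconds, a little over three minutes.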

Conclusion

As with any technology change, it is important to think about the problem you are trying to solve and pick the right tool for the job. Avoid looking at cool technologies and then trying to apply them to every problem as some sort of silver bullet, which is all too common in our industry. MongoDB is not a one-size-fits-all solution, just as a relational database is not a one-size-fits-all solution. For most of our applications a relational database is still the best choice, but in this case we found a document store more appropriate. That led us to explore a number of document store solutions, and out of that evaluation MongoDB was the best fit for our needs. So evaluate MongoDB and see if it is the right tool to solve your problem, or just use it because MongoDB is Web Scale (haha).

If you are interested in learning more about Mongo DB and especially in how the query syntax works, check out the tutorial provided on the Mongo DB website.
