My Adventures in Coding

February 22, 2019

Cassandra – FSReadError on Startup

Filed under: Cassandra — Brian @ 5:43 pm

Recently we encountered an error with one node in a Cassandra cluster where the Cassandra service said it was running but we would get a failure when we tried to connect:

# cqlsh
Connection error: ('Unable to connect to any servers', 
{'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. 
Last error: Connection refused")})

So we decided to tail the journal to see if we could find a useful error message:

journalctl -f -u cassandra

While monitoring the journal output we saw the following exception recurring roughly every minute:

Nov 19 04:17:35 pcu077al188 cassandra[17259]: Exception (org.apache.cassandra.io.FSReadError) encountered during startup: java.io.EOFException
Nov 19 04:17:35 pcu077al188 cassandra[17259]: FSReadError in /var/lib/cassandra/hints/b22dfb1b-6a6e-44ce-9c7c-fda1e75293af-1542627895660-1.hints
Nov 19 04:17:35 pcu077al188 cassandra[17259]: at org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:235)

The file that Cassandra was having problems with was a 0 byte hint file.
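
Before removing anything, you may want to check whether any other hint files are in the same zero-byte state. Here is a small Python sketch (the hints directory shown is the default path from our setup; adjust it for your install):

```python
from pathlib import Path

def find_empty_hint_files(hints_dir):
    """Return paths of .hints files that are zero bytes in size."""
    return sorted(
        p for p in Path(hints_dir).glob("*.hints")
        if p.stat().st_size == 0
    )

# Example: scan the default Cassandra hints directory
# (glob simply yields nothing if the directory does not exist)
for hint_file in find_empty_hint_files("/var/lib/cassandra/hints"):
    print(hint_file)
```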

A Stack Overflow post suggested that to resolve the problem you just needed to remove this file. We tried this solution and it worked.

Steps to Fix FSReadError Startup Problem

Stop Cassandra

systemctl stop cassandra

Move the suspect hint file into a temporary folder (just to be safe)

mv /var/lib/cassandra/hints/b22dfb1b-6a6e-44ce-9c7c-fda1e75293af-1542627895660-1.hints /tmp

Start Cassandra

systemctl start cassandra

Verify the error has stopped

journalctl -f -u cassandra

Now verify you can connect using CQLSH

# cqlsh
Connected to PointServiceClusterV3 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>

Note: In our case this happened on a Cassandra instance in a test environment that had not been shut down cleanly, so there was no worry about data integrity. However, if this had happened on a node in a production environment, I would recommend running nodetool repair on the node.

nodetool repair

I hope that helps!

January 29, 2019

Cassandra – Switching from SimpleStrategy to NetworkTopologyStrategy

When we started using Cassandra I set up clusters for our test, staging, and production environments. Then we created the initial keyspace for our application, added tables, started using them, and everything worked fine.

Later we decided that for our up-time requirements we wanted a second cluster in another data center to act as a hot fail-over for production. No problem, Cassandra has us covered. However, when we had originally created our application’s keyspace, it was created with the default replication strategy, SimpleStrategy. For a fail-over cluster we need Cassandra to be configured with NetworkTopologyStrategy. No big deal, right? It should be an easy fix.

After reading through the documentation on Changing keyspace replication strategy, I was left with one question:

“What do I use for the data center name?”.

With SimpleStrategy you specify the number of nodes you want each item replicated to with the parameter “replication_factor”, for example {'replication_factor': 3}. With NetworkTopologyStrategy, however, you use the data center name as the key to specify how many nodes in that data center should have copies of the data, for example {'mydatacentername': 3}. I was worried that if I altered the strategy with the wrong data center name, the cluster would no longer consider the existing nodes part of that data center, which would cause some serious problems.

Fortunately, it turns out that Cassandra has a default data center name which you can use when making this switch; kudos to the person who answered a StackOverflow post on exactly this question.
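
To make the difference concrete, here is what the two replication settings look like side by side (the keyspace name is illustrative, and “datacenter1” is the default data center name):

```sql
-- SimpleStrategy: one cluster-wide replica count
CREATE KEYSPACE examplekeyspace
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- NetworkTopologyStrategy: a replica count per data center
CREATE KEYSPACE examplekeyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
```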

Of course, I was not going to try this switch on any of our clusters until I was confident it would work. I set up a three node cluster using SimpleStrategy with a replication factor of 2, added data to the cluster, and ran a nodetool repair. Then I altered the strategy for the keyspace, verified nothing had changed, ran nodetool repair again, and once more verified all my data was intact. So it worked as promised.

Switch Replication Strategy

Note: In this example, the keyspace we are switching the replication strategy on is called “examplekeyspace”.

Open a cqlsh prompt on any node in the cluster

Check the current replication strategy

SELECT * FROM system_schema.keyspaces;

Verify the default data center name

SELECT data_center FROM system.local;

Alter the existing Keyspace

Alter the keyspace using the data center name (make sure you copy it exactly!) as the replication key, keeping the number of replicas the same as before.

ALTER KEYSPACE ExampleKeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'};

Now if you check the keyspace on each node in the cluster you will see that the replication strategy is now NetworkTopologyStrategy.

Nodetool Repair

Switching the replication strategy does not cause any data to be moved between nodes; you would need to run nodetool repair to do that. However, if all you are doing is switching an existing cluster with a single rack and data center from SimpleStrategy to NetworkTopologyStrategy, no data should need to be moved. But if you would like to be thorough, it does not hurt to run a nodetool repair.

Run nodetool repair

nodetool repair -pr examplekeyspace

Using the option “-pr” (primary range only) means repair will only repair the keys the current node owns as primary, along with the replicas of those keys on other nodes. Make sure to run repair on each node, but only do ONE node at a time.

Conclusion

When I started using Cassandra I did not realize how much of a limitation SimpleStrategy imposes on data replication. If all you want is a single rack in a single data center, SimpleStrategy works. However, if there is even the slightest possibility you might one day add a fail-over cluster in another data center, or nodes in one data center spread across multiple racks, use NetworkTopologyStrategy. Personally, for anything other than a local test cluster, I would always go with NetworkTopologyStrategy.

That is all!

November 9, 2018

Jetty – Profiling with JMX and VisualVM

Filed under: Jetty,JMX — Brian @ 3:36 pm

JMX (Java Management Extensions) combined with VisualVM is a very useful pairing when profiling a Java web app running on a Jetty web server, especially when you want to pinpoint which part of your application is using the most CPU or memory.

This is a quick guide to help you enable JMX on your Jetty web server and start profiling your application using VisualVM.

Setup JMX on Jetty

Configure JMX

First, we will need to add the following JMX configuration section to your Jetty web server’s start.ini file.

IP Address: Replace the text “YOUR_IP_ADDRESS_HERE” in the configuration below with either “localhost”, if you will be running VisualVM on the same machine, or the IP address of your Jetty server, if you will be connecting to JMX remotely.

Port: The default port is 1099, which is why those lines are commented out. You can uncomment them and change the value if you would rather use a different port.

# ---------------------------------------
# Module: jmx-remote
# Enables remote RMI access to JMX
# ---------------------------------------
--module=jmx-remote

## The host/address to bind the RMI server to.
jetty.jmxremote.rmiserverhost=YOUR_IP_ADDRESS_HERE

## The port the RMI server listens to (0 means a random port is chosen).
# jetty.jmxremote.rmiserverport=1099

## The host/address to bind the RMI registry to.
jetty.jmxremote.rmiregistryhost=YOUR_IP_ADDRESS_HERE

## The port the RMI registry listens to.
# jetty.jmxremote.rmiregistryport=1099

Restart Jetty

Now that the config has been changed, you will need to cycle Jetty for the JMX changes to take effect.

You can verify JMX is running on port 1099 using netstat:

netstat -tunlep | grep LISTEN | awk '{print $4}'
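
If you plan to run VisualVM from another machine, you can also confirm the JMX port is reachable from that machine. Here is a minimal Python sketch (the host and port are whatever you configured above):

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Try a plain TCP connect to check that something is listening."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check the default JMX/RMI port on the Jetty server
print(is_port_open("localhost", 1099))
```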

That is it! You now have JMX enabled on your Jetty server and are ready to start profiling!

Profile with VisualVM

Open/Install VisualVM

If you have JDK 8 installed, you should have VisualVM in your bin folder with the name “jvisualvm.exe”, for example:
C:\Program Files\Java\jdk1.8.0_191\bin\jvisualvm.exe

If you have a newer version of the JDK, VisualVM no longer comes bundled so you must download the standalone version.
https://visualvm.github.io/download.html

Connect to a Remote Jetty Server

Open VisualVM

Right click on the “Remote” option and select “Add Remote Host”.

Enter the IP Address of the Jetty server and click “OK”.

Now find the server address in the “Remote” section, right click, and select “Add JMX Connection”.

In the dialog box that opens, enter the JMX port (1099 in this case) and click “OK”.

The JMX connection will show up in the “Remote” section; right click on it and select “Open”.

You will now see the overview page.

Monitoring in VisualVM

Monitoring

Go to the “Monitoring” tab to see an overview of the application’s CPU, memory, and threads.

Threads

Go to the “Threads” tab to see stats on all threads.

Sampling CPU & Memory in VisualVM

One of the many helpful features of VisualVM is the ability to sample CPU or Memory for a short period of time. This can be very useful when a performance problem is happening on a server and you want to find out what is using the most CPU or Memory at that moment in time.

Sample CPU

To sample CPU, go to the “Sampler” tab and click the “CPU” button.

The sampler will run until you click the “Stop” button.

After stopping, you can expand an item to get more detail.

Hot Spots

The Sampler also has a useful feature called “Hot spots”, which shows you the areas of the code that are the most active. I have used this feature before to help pinpoint methods in our code that were using the majority of our server’s CPU.

Sampling by Memory

Exactly the same as sampling by CPU, just use the “Memory” option instead.

Sampling by Package

The Sampler also allows you to filter by package name. So for example if you suspect a problem is within a certain package, you can restrict the Sampler to just return results for that package.

In the upper right hand corner of the “Sampler” tab you will see a checkbox called “Settings”. When you check that box, the profiled packages option will become visible.

Note that by default the option “Do not profile packages” is selected, so if you want to profile a specific package, be sure to switch to “Profile only packages”.

Here I am setting it to “Profile only packages” and adding the name of the package I want to filter on.

Note: In the profile packages window:
com.google.gson.* (All classes in package)
com.google.gson.** (All classes in package and subpackages)

So now I am only sampling classes in the package I specified.

And of course you can still open the “Hot spots” window as well.

Frequently Asked Questions

My stats stopped updating

This usually means that the app was restarted, so the PID changed. When VisualVM gets disconnected, it does not give you any feedback; instead, your charts will just stop updating.

First, close the existing tab.

Second, close the existing JMX connection as well, since it is tied to the PID.

Third, you can now open a new connection following the steps explained in the previous section.

I hope that helps!

April 27, 2018

Redis – Setup a Test Cluster on Ubuntu

Filed under: Linux,Redis,Ubuntu — Brian @ 4:49 pm

I have been working with Redis lately and in my local development environment I was just testing against a single node setup of Redis. However, this did not give me the functionality I wanted to test against since our production environment uses a Redis cluster of six nodes.

Connecting to a Redis cluster requires a different configuration from a single Redis instance and is also functionally different. For example, a standard six node Redis cluster is actually three Master-Replica pairs, where each pair stores a third of the data. So for those reasons I really wanted to test my application against a Redis cluster in my local development environment, which was Ubuntu.
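
The way the data is split across the three master pairs is deterministic: Redis Cluster hashes each key with CRC16 (the XMODEM variant) and takes the result modulo 16384 to pick a hash slot, and the slots are divided among the masters. Here is a small Python sketch of that mapping (it ignores hash tags, which the real implementation also honors):

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM, the checksum Redis Cluster uses for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of Redis Cluster's 16384 hash slots."""
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("foo"))  # → 12182, the same answer as CLUSTER KEYSLOT foo
```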

At the time of this post I was using Ubuntu 16.04 and Redis 4.0.9.

So here is how to spin up a six node test cluster of Redis on a single Ubuntu instance.

Download and Build Redis

wget http://download.redis.io/releases/redis-4.0.9.tar.gz
tar -zxvf redis-4.0.9.tar.gz
cd redis-4.0.9
make

Create Six Redis Nodes

First, edit the create-cluster startup file:

cd utils/create-cluster
vi create-cluster

In my setup, I changed the following:

  • Port – set to 7000 (this will be the starting port)
  • Timeout – set to 10000 (I found this handy in my dev environment)
  • Nodes – set to 6 (we use a 6 node cluster in production, so I wanted the same)
  • Replicas – set to 1 (each master will have one replica, so 6 nodes = 3 masters and 3 replicas)

Now we can start all six nodes:

./create-cluster start

You will see output showing that the nodes have been started.

If you do an “ls” in the folder you will notice that some files have been created:

  • Each node has a log file (For example: 7001.log)
  • Each node has a cluster configuration file (For example: nodes-7001.conf)

You can also confirm the nodes are running by connecting to a node with the “redis-cli” command line tool, specifying the port of the node:

cd redis-4.0.9/src
./redis-cli -p 7001

Join the Nodes as a Cluster

Even though at this stage we have six Redis nodes, they know nothing about each other. So the next step is to join these nodes as a cluster.

Prerequisites: Ruby and Redis Gem

The script that connects the nodes as a cluster is written in Ruby and requires the Redis Gem, which requires Ruby 2.2.2 or higher.

Update Ruby

sudo apt-get install ruby-full

Install Redis Gem

sudo gem install redis

To create a cluster that you only need to connect to from the local machine, you can use:

cd redis-4.0.9/src
./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006

NOTE: If you plan to connect to your Redis cluster remotely, you will need to use the public IP of the machine instead of 127.0.0.1.

When you run the join cluster script, it will show you the generated configuration and then ask you to confirm the setup.

After you answer “yes” and press enter, the nodes will be joined in a cluster, and the script will report that the cluster creation was successful.

That is all! You now have a functional Redis Cluster that you can connect to from your application!

For more information on Redis Clusters I recommend reading the Redis Cluster Tutorial.

April 1, 2018

Java – Using SQLServerBulkCopy in Java with an InputStream

Filed under: Java,SQL Server — Brian @ 9:59 pm

Up until recently, the only constructors available for feeding CSV data to SQLServerBulkCopy (via SQLServerBulkCSVFileRecord) took in a file path. This was kind of awkward, since if you wanted to bulk copy data you had in memory, you either had to write it to a file only to read it back, or extend the class and write a constructor that could take in an InputStream. Luckily, in version 6.3 (I believe) Microsoft finally added a new constructor with support for InputStream.

The following is a simple example in Java showing how to use SQLServerBulkCopy to write to a table in SQL Server.

Create a table in your SQL Server database:

CREATE TABLE [dbo].[SqlBulkInsertExample](
	[Id] [int] NOT NULL,
	[Name] [varchar](100) NOT NULL 
) ON [PRIMARY]
GO

Create a Java project. If using Maven, you can add the following dependency to your pom.xml file:

<dependency>
    <groupId>com.microsoft.sqlserver</groupId>
    <artifactId>mssql-jdbc</artifactId>
    <version>6.4.0.jre8</version>
</dependency>

Here is a simple Java example application showing how to take some data, convert it to a CSV, and save it to the database using bulk insert:

import com.microsoft.sqlserver.jdbc.SQLServerBulkCSVFileRecord;
import com.microsoft.sqlserver.jdbc.SQLServerBulkCopy;
import com.microsoft.sqlserver.jdbc.SQLServerBulkCopyOptions;

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

public class SqlBulkCopyExample {

  private static String JDBC_URL = "jdbc:sqlserver://localhost;DatabaseName=Playground;InstanceName=SQL2016";
  private static String USERNAME = "client";
  private static String PASSWORD = "client";
  private static String TABLE_NAME = "dbo.SqlBulkInsertExample";

  public static void main(String[] args) {
    try {
      // Create some data to insert into our database table
      Map<Integer, String> data = new HashMap<>();
      data.put(1, "John Smith");
      data.put(2, "Steve Smith");
      data.put(3, "Molly Smith");

      // We are going to build a CSV document to use for the bulk insert
      StringBuilder stringBuilder = new StringBuilder();

      // Add table column names to CSV
      stringBuilder.append("id, name\n");

      // Copy data from map and append to CSV
      for (Map.Entry<Integer, String> entry : data.entrySet()) {
        stringBuilder.append(
                String.format("%s,%s\n", entry.getKey(), entry.getValue()));
      }

      byte[] bytes = stringBuilder.toString().getBytes(StandardCharsets.UTF_8);
      try (InputStream inputStream = new ByteArrayInputStream(bytes)) {

        // Pass in input stream and set column information
        SQLServerBulkCSVFileRecord fileRecord = new SQLServerBulkCSVFileRecord(
                inputStream, StandardCharsets.UTF_8.name(), ",", true);

        fileRecord.addColumnMetadata(1, "id", Types.INTEGER, 0, 0);
        fileRecord.addColumnMetadata(2, "name", Types.VARCHAR, 0, 0);

        try (Connection connection = DriverManager.getConnection(
                JDBC_URL, USERNAME, PASSWORD)) {

          // Set bulk insert options, for example here I am setting a batch size
          SQLServerBulkCopyOptions copyOptions = new SQLServerBulkCopyOptions();
          copyOptions.setBatchSize(10000);

          // Write the CSV document to the database table
          try (SQLServerBulkCopy bulkCopy = new SQLServerBulkCopy(connection)) {
            bulkCopy.setBulkCopyOptions(copyOptions);
            bulkCopy.setDestinationTableName(TABLE_NAME);
            bulkCopy.writeToServer(fileRecord);
          }
        }
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Now if you log in to your database and run the following select:

SELECT * FROM dbo.SqlBulkInsertExample

You should see that you have data!

I hope that helps!

January 27, 2018

Cassandra – Getting Started with Java

Filed under: Cassandra — Brian @ 9:15 pm

Cassandra is a great tool for storing time series data and I happen to be using it on my current project for that exact purpose.

There are several ways to use Cassandra from Java and many ways to improve performance, but here I just want to provide a simple “Getting Started” example. So here it is!

First, download the current version of Cassandra 3 from the Apache Cassandra download page.

Extract the tar.gz file:

 tar -zxvf apache-cassandra-3.11.1-bin.tar.gz
 

Change directory into the bin folder:

 cd apache-cassandra-3.11.1/bin
 

Start Cassandra:

 ./cassandra -f
 

Create a Java project. If you are using Maven, you can add the following dependency to your pom.xml file:

<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.3.0</version>
</dependency>

Here is a simple Java example showing how to connect to Cassandra, create a keyspace, create a table, insert a row, and select a row:

import com.datastax.driver.core.*;

import java.time.Instant;
import java.time.ZoneId;
import java.util.Date;
import java.util.UUID;

public class CassandraV3Tutorial {

    private final static String KEYSPACE_NAME = "example_keyspace";
    private final static String REPLICATION_STRATEGY = "SimpleStrategy";
    private final static int REPLICATION_FACTOR = 1;
    private final static String TABLE_NAME = "example_table";

    public static void main(String[] args) {

        // Setup a cluster to your local instance of Cassandra
        Cluster cluster = Cluster.builder()
                .addContactPoint("localhost")
                .withPort(9042)
                .build();

        // Create a session to communicate with Cassandra
        Session session = cluster.connect();

        // Create a new Keyspace (database) in Cassandra
        String createKeyspace = String.format(
                "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = " +
                        "{'class':'%s','replication_factor':%s};",
                KEYSPACE_NAME,
                REPLICATION_STRATEGY,
                REPLICATION_FACTOR
        );
        session.execute(createKeyspace);

        // Create a new table in our Keyspace
        String createTable = String.format(
                "CREATE TABLE IF NOT EXISTS %s.%s " +
                        "(id uuid, timestamp timestamp, value double, " +
                        "PRIMARY KEY (id, timestamp)) " +
                        "WITH CLUSTERING ORDER BY (timestamp ASC);",
                KEYSPACE_NAME,
                TABLE_NAME
        );
        session.execute(createTable);

        // Create an insert statement to add a new item to our table
        PreparedStatement insertPrepared = session.prepare(String.format(
                "INSERT INTO %s.%s (id, timestamp, value) values (?, ?, ?)",
                KEYSPACE_NAME,
                TABLE_NAME
        ));

        // Some example data to insert
        UUID id = UUID.fromString("1e4d26ed-922a-4bd2-85cb-6357b202eda8");
        Date timestamp = Date.from(Instant.parse("2018-01-01T01:01:01.000Z"));
        double value = 123.45;

        // Bind the data to the insert statement and execute it
        BoundStatement insertBound = insertPrepared.bind(id, timestamp, value);
        session.execute(insertBound);

        // Create a select statement to retrieve the item we just inserted
        PreparedStatement selectPrepared = session.prepare(String.format(
                "SELECT id, timestamp, value FROM %s.%s WHERE id = ?",
                KEYSPACE_NAME,
                TABLE_NAME));

        // Bind the id to the select statement and execute it
        BoundStatement selectBound = selectPrepared.bind(id);
        ResultSet resultSet = session.execute(selectBound);

        // Print the retrieved data
        resultSet.forEach(row -> System.out.println(
                String.format("Id: %s, Timestamp: %s, Value: %s",
                row.getUUID("id"),
                row.getTimestamp("timestamp").toInstant().atZone(ZoneId.of("UTC")),
                row.getDouble("value"))));

        // Close session and disconnect from cluster
        session.close();
        cluster.close();
    }
}

If you would like to look at the data in your local Cassandra database, you can use the CQLSH command line tool.

So from the bin folder type:

 ./cqlsh

This will take you to a “cqlsh>” prompt.

To view all available Keyspaces:

 DESCRIBE KEYSPACES;

You will now see our “example_keyspace” database:

 system_schema system system_traces
 system_auth system_distributed example_keyspace

To switch to that Keyspace:

 USE example_keyspace;

To show all tables in the keyspace:

 DESCRIBE TABLES;

You will be shown all tables which includes “example_table”.

Now from the command line you can view the data in the table by using a select statement:

 select * from example_table;

Which will show the following information:

 id                                   | timestamp                       | value
--------------------------------------+---------------------------------+--------
 1e4d26ed-922a-4bd2-85cb-6357b202eda8 | 2018-01-01 01:01:01.000000+0000 | 123.45

I hope that helps!

Note: The documentation on the DataStax website is very good.

November 21, 2016

VirtualBox – Resize a CentOS Virtual Drive

Filed under: CentOS,Linux,VirtualBox — Brian @ 9:51 pm

Recently I created a CentOS VM using VirtualBox for some development work I was doing, but foolishly I set the virtual disk to the default size of 8 gigs, figuring it would be easy to resize it later.

Well, after piecing together bits of information from several places I was finally able to get my virtual disk resized as well as my CentOS partition that was on it. Here are the steps that worked for me!

Expand VirtualBox Drive

The first thing you need to do is expand your virtual drive, which I did using the VBoxManage command line tool provided by VirtualBox (reference).

Note: Ensure the VirtualBox install folder is on your PATH (e.g., C:\Program Files\Oracle\VirtualBox).

Resize the virtual disk with the VBoxManage command, specifying the desired size in megabytes (e.g., here I am resizing the drive to 20 gigs):

VBoxManage modifyhd MyCentosDisk.vdi --resize 20000

Resize Partition with GParted

Download the gparted live cd.

Attach the gparted live cd as a drive to your virtual machine and make it the “Primary Master”.

Note: Ensure the checkbox “Live CD” is selected when you add the CD drive.

Now start your VM.

The gparted UI will load. Now resize the partition by expanding it to use the available free space on the virtual disk.

Shutdown your VM, remove the gparted live cd drive and set your virtual disk to once again be the “Primary Master” drive.

However, after you restart your VM you will notice that CentOS is still not aware of the additional space:

df -h

Expand CentOS Partition

Now you will need to expand your Linux partition to use the free space on the drive.

First check the sizes of the current partitions and get the name:

vgdisplay -v

On my system the name of the partition to expand was “/dev/centos/root”.

Extend the volume with the available free space:

lvextend -l +100%FREE /dev/centos/root

Now you are ready to perform the online resize:

xfs_growfs /dev/centos/root

NOTE: resize2fs works for the ext filesystems used on other distributions of Linux, but the default filesystem on CentOS 7 is XFS, so you must use xfs_growfs (reference).

I hope that helps!

April 13, 2016

SQL Server – Get Sizes of All Tables and Indexes in a Database

Filed under: SQL,SQL Server — Brian @ 9:51 pm

Even though there are great tools in SQL Server Management Studio that give you a wealth of information about all of your tables and indexes, I still find it handy to have a script I can use to quickly check the current state of all tables and indexes in a database.

Size of each Table (Including Indexes)

This query gives you the total size in KB of each table, including all of the indexes on that table. It shows the table name, row count, total allocated space, and used space (pages are 8 KB each, hence the * 8).

SELECT
    t.[Name] AS TableName,
    p.[rows] AS [RowCount],
    SUM(a.total_pages) * 8 AS TotalSpaceKB,
    SUM(a.used_pages) * 8 AS UsedSpaceKB
FROM sys.tables t
INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
WHERE t.is_ms_shipped = 0 AND i.OBJECT_ID > 255
GROUP BY t.[Name], p.[Rows]
ORDER BY t.[Name]

Size of each Index

This query shows the total size in KB of each index in the database. The query shows the name of each index, which table the index is on, and the total size of the index.

SELECT 
    i.[name] AS IndexName,
    t.[name] AS TableName,
    SUM(s.[used_page_count]) * 8 AS IndexSizeKB
FROM sys.dm_db_partition_stats AS s
INNER JOIN sys.indexes AS i ON s.[object_id] = i.[object_id] 
	AND s.[index_id] = i.[index_id]
INNER JOIN sys.tables t ON t.OBJECT_ID = i.object_id
GROUP BY i.[name], t.[name]
ORDER BY i.[name], t.[name]

I hope you find these queries useful!

April 3, 2016

Jetty – Setup Jetty on CentOS 7

Filed under: Jetty,Linux — Brian @ 9:50 pm

I recently set up Jetty on a CentOS 7 Linux server. Previously we had been running Jetty in a Windows environment, but now we are moving our servers over to Linux. Here are my setup notes; I hope they help!

Install JDK

Note: We are using the Oracle JDK (not OpenJDK). Check for the latest version of JDK 1.8.

Download and install the JDK:

wget --no-cookies --no-check-certificate --header "Cookie: oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u111-b14/jdk-8u111-linux-x64.rpm" -O /opt/jdk-8-linux-x64.rpm

yum install /opt/jdk-8-linux-x64.rpm

Set JAVA_HOME and add it to your PATH:

Create a file in /etc/profile.d to store java settings:

sudo touch /etc/profile.d/java.sh

vi /etc/profile.d/java.sh
Add the lines:

export JAVA_HOME=/usr/java/latest
export PATH=$JAVA_HOME/bin:$PATH

Install Jetty Web Server

Note: Check for the latest version of Jetty.

Download and install Jetty:

wget http://central.maven.org/maven2/org/eclipse/jetty/jetty-distribution/9.3.14.v20161028/jetty-distribution-9.3.14.v20161028.tar.gz

tar -zxvf jetty-distribution-9.3.14.v20161028.tar.gz

mv jetty-distribution-9.3.14.v20161028 /opt/jetty-9.3.14

ln -s /opt/jetty-9.3.14 /opt/jetty

Note: Creating a soft link for your /opt/jetty install will be very useful later when you want to upgrade your version of Jetty. To upgrade, all you need to do is download the tar.gz of the newest version, unpack it, and point the soft link at the new folder.

Create a temp folder for Jetty to unpack war files:

mkdir /opt/temp

Create a webappbase folder where your apps will run:

mkdir /opt/webappbase

Move start.ini and webapps into the webappbase folder:

mv /opt/jetty/webapps /opt/webappbase/webapps

mv /opt/jetty/start.ini /opt/webappbase

Create a “jetty” user that Jetty will run under:

useradd -m jetty 

Update files and folders to be owned by “jetty” user:

chown -RH jetty:jetty /opt/jetty
  
chown -R jetty:jetty /opt/webappbase
  
chown -R jetty:jetty /opt/temp

Configure Jetty Web Server

Setup Jetty as a service:

ln -s /opt/jetty/bin/jetty.sh /etc/init.d/jetty

Add the new Jetty service to be managed by chkconfig:

chkconfig --add jetty

Set Jetty service to run on startup for the following run levels:

chkconfig --level 345 jetty on

Create a jetty settings file:

sudo touch /etc/default/jetty

You will need to set TMPDIR, JETTY_BASE, and JETTY_HOME; any JAVA_OPTIONS your Jetty server needs can also go in this settings file.

Add jetty settings by editing the jetty default settings file:
vi /etc/default/jetty

TMPDIR=/opt/temp
 
JETTY_BASE=/opt/webappbase
 
JETTY_HOME=/opt/jetty

JETTY_USER=jetty
 
export JAVA_OPTIONS="-Xms128m -Xmx2048m -server -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled"

If you want to change the default port number, you can do so by editing the start.ini file:
vi /opt/webappbase/start.ini

# Replace 
# jetty.http.port=8080
# With 
jetty.http.port=8081

Start Jetty Server

Switch to the “root” user, then start the Jetty service:

sudo su
service jetty start

Verify the service has started successfully:

service jetty status

Test that the server is running:

curl localhost:8081

Deploy a War file

To deploy an application to Jetty, all you need to do is copy a WAR file to the folder:

/opt/webappbase/webapps

That’s all!

January 5, 2016

SQL Server – How to write an Upsert using MERGE

Filed under: SQL,SQL Server — Brian @ 8:35 pm

Normally, when you want an application to INSERT a row into a table if it does not exist or UPDATE it if it does, your application must first do a SELECT to check if the row exists: the standard SELECT-UPDATE-INSERT pattern. The downside to this pattern is that it requires two database round trips instead of one.
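
For comparison, here is a sketch of what that two-round-trip pattern looks like from the application side (using the same ClientData table as the examples later in this post):

```sql
-- Round trip 1: check whether the row exists
SELECT 1 FROM dbo.ClientData WHERE clientId = 12345;

-- Round trip 2: the application then issues one of these based on the result
UPDATE dbo.ClientData
SET data = 'Some', updatedDateUtc = GETUTCDATE()
WHERE clientId = 12345;

INSERT INTO dbo.ClientData (clientId, data) VALUES (12345, 'Some');
```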

Now, over the years I have worked with document stores such as MongoDB, and I really enjoyed the freedom of being able to make a single call to the database to add or replace a document, without having to check if it exists.

Fortunately, in SQL Server 2008, the MERGE statement was introduced.

MERGE allows you to make only a single database round trip when you want to INSERT a row if it does not exist, or UPDATE it if it does. The following is a simple example showing how to use the MERGE statement.

Quick Start

So if you just want a quick example to get you started then here you go. In this example the table “ClientData” is the one being updated.

  • MERGE – specifies the table we will be inserting a row into or updating
  • USING – defines the condition we will be using to check if the row exists or not
  • WHEN MATCHED THEN – SQL statement to run when the row exists
  • WHEN NOT MATCHED – SQL statement to run when the row does not exist

MERGE dbo.ClientData AS [Target]
USING (SELECT 12345 AS clientId, 'Some' AS data) AS [Source] ON [Target].clientId = [Source].clientId 
WHEN MATCHED THEN UPDATE SET [Target].data = [Source].data, [Target].updatedDateUtc = GetUtcDate() 
WHEN NOT MATCHED THEN INSERT (clientId, data) VALUES ([Source].clientId, [Source].data);

How it Works

First, let’s create a table to use for our test of the MERGE statement:

CREATE TABLE dbo.ClientData(
	ClientId [bigint] NOT NULL,
	Data [varchar](20) NOT NULL,
	UpdatedDateUtc [datetime] NOT NULL DEFAULT (getutcdate()),
 CONSTRAINT [PK_ClientData_ClientId] PRIMARY KEY CLUSTERED (
	ClientId ASC
)) ON [PRIMARY]
GO

You can verify the table has been created and see that it is empty:

SELECT * FROM dbo.ClientData

Now, run the following MERGE statement for the first time, when there is no matching row in the table:

MERGE dbo.ClientData AS [Target]
USING (SELECT 12345 AS clientId) AS [Source] 
ON [Target].clientId = [Source].clientId
WHEN MATCHED THEN  UPDATE SET [Target].data = 'Update', [Target].updatedDateUtc = GetUtcDate()
WHEN NOT MATCHED THEN  INSERT (clientId, data) VALUES ([Source].clientId, 'Insert');

As you can see, the INSERT statement was executed:

SELECT * FROM dbo.ClientData

Now let’s run the exact same merge statement a second time and see what happens:

MERGE dbo.ClientData AS [Target]
USING (SELECT 12345 AS clientId) AS [Source] 
ON [Target].clientId = [Source].clientId
WHEN MATCHED THEN  UPDATE SET [Target].data = 'Update', [Target].updatedDateUtc = GetUtcDate()
WHEN NOT MATCHED THEN  INSERT (clientId, data) VALUES ([Source].clientId, 'Insert');

Now you can see that the UPDATE statement was executed, since the “data” field has been updated to the text “Update”:

SELECT * FROM dbo.ClientData

If you are curious about the performance difference between MERGE and SELECT-INSERT-UPDATE here is a performance comparison.
