All posts by Eyal Gutkind

Using Graph Database with High Performance Networks

Companies today are finding the size and growth of their stored data overwhelming. As databases grow, the challenge is to create value by discovering insights and connections in them as close to real time as possible. In the recently published whitepaper, Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks, we describe how combining high performance networking with graph database and analytics technologies offers a solution to this need.

 


 

Each example in the paper is based on an element of a typical analysis solution. The first example, Vertex Ingest Rate, shows the value of using high performance equipment to enhance real-time data availability. Vertex objects represent nodes in a graph, such as Customers, so this test is representative of the most basic operation: loading new customer data into the graph. The second example, Vertex Query Rate, highlights the improvement in the time needed to receive results, such as finding a particular customer record or a group of customers.

 

The third example, Distributed Graph Navigation Processing, starts at a Vertex and explores its connections to other Vertices. This is representative of traversing social networks, finding optimal transportation or communications routes, and similar problems. The final example, Task Ingest Rate, shows the performance improvement when loading the data connecting each of the vertices. This is similar to entering orders for products, transit times over a communications path, and so on.
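To make the operations in these examples concrete, here is a minimal single-machine sketch in Python. It is not the InfiniteGraph API and is not taken from the whitepaper; the `Graph` class, its method names and the sample customer names are all hypothetical, chosen only to illustrate loading vertices, loading the connections between them, and navigating outward from a starting vertex.

```python
from collections import deque


class Graph:
    """Toy in-memory graph (hypothetical; not the InfiniteGraph API)."""

    def __init__(self):
        self.adjacency = {}  # vertex id -> set of neighbouring vertex ids

    def add_vertex(self, vertex_id):
        # Vertex ingest: register a node such as a customer.
        self.adjacency.setdefault(vertex_id, set())

    def add_edge(self, a, b):
        # Connection ingest: record an undirected link between two vertices.
        self.add_vertex(a)
        self.add_vertex(b)
        self.adjacency[a].add(b)
        self.adjacency[b].add(a)

    def navigate(self, start, max_hops):
        """Breadth-first navigation: map each vertex within max_hops of start to its hop count."""
        hops = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if hops[current] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbour in self.adjacency[current]:
                if neighbour not in hops:
                    hops[neighbour] = hops[current] + 1
                    queue.append(neighbour)
        return hops


graph = Graph()
for a, b in [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]:
    graph.add_edge(a, b)

reachable = graph.navigate("alice", max_hops=2)
print(sorted(reachable.items()))
```

In a distributed deployment the adjacency data would be spread across servers, so each hop of a navigation like this can involve network round trips, which is where interconnect performance becomes relevant.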

 

Each of these elements is an important part of a Big Data analysis solution. Taken together, they show that InfiniteGraph can be made significantly more effective when combined with Mellanox interconnect technology.

 

Resources: Mellanox Web 2.0 Solutions

Deploying Hadoop on Top of Ceph, Using FDR InfiniBand Network

We recently posted a whitepaper on “Deploying Ceph with High Performance Networks” using Ceph as a block storage device. In this post, we review the advantages of using CephFS as an alternative to HDFS.

Hadoop has become a leading programming framework in the big data space. Organizations are replacing several traditional architectures with Hadoop and using it as a storage, database, business intelligence and data warehouse solution. Enabling a single file system for Hadoop and other programming frameworks benefits users who need dynamic scalability of compute and/or storage capabilities.


Deploying Ceph with High Performance Networks

As data continues to grow exponentially, storing today’s data volumes efficiently is a challenge. Many traditional storage solutions neither scale out nor make it feasible, from a Capex and Opex perspective, to deploy petabyte or exabyte data stores.


In this newly published whitepaper, we summarize the installation and performance benchmarks of a Ceph storage solution. Ceph is a massively scalable, open source, software-defined storage solution, which uniquely provides object, block and file system services with a single, unified Ceph storage cluster. The testing emphasizes the careful network architecture design necessary to handle users’ data throughput and transaction requirements.

 

Ceph Architecture


Building an Enterprise Class Big Data Solution with IBM BigInsights, IBM GPFS, FPO and Mellanox RDMA

Big Data solutions such as Hadoop and NoSQL applications are no longer the exclusive domain of Internet giants. Today’s retail, transportation and entertainment corporations use Big Data practices such as Hadoop for data storage and data analytics.

IBM BigInsights makes Big Data deployments an easier task for the system architect. BigInsights with IBM’s GPFS-FPO file system support provides an enterprise-level Big Data solution, eliminating single points of failure and increasing ingress and analytics performance.

The inherent RDMA support in IBM’s GPFS takes performance a notch higher. Testing conducted at the Mellanox Big Data Lab with IBM BigInsights 2.1, GPFS-FPO and FDR 56Gb/s InfiniBand showed write and read performance increases of 35% and 50%, respectively, compared to a vanilla HDFS deployment. On the analytics benchmarks, enabling the RDMA feature provided a 35% throughput gain.


See the Elephant’s Room in Vegas!

Las Vegas, Nevada is not only the home of games, art, shows and fun; it is also home to one of the largest Hadoop clusters in the world!

 

Racks in the Switch SuperNAP - Photo Courtesy of Switch

During the upcoming 2014 EMC World show, we invite you to join us for an informative tour of SuperNAP, the world’s leader in data center ecosystem development and home of the 1000-node Hadoop cluster. On this tour, we will show how a Hadoop cluster is deployed and maintained in a co-location data center, and how it provides analytics tools for a large community of businesses and academic institutes. It will be a great opportunity to learn about actual working cluster workloads, design considerations and available tools for next-generation business opportunities in Big Data.
