
Non-Technical Implementation Guide

Purpose of the Document

The purpose of this document is to provide DataStax Enterprise software implementation approaches, and to support the non-technical team members who lead solution delivery. The intended audience of this document is: organizational leaders, program/project managers, architects, team leaders (management), and other team members who lead and manage people and their projects. That is, this document is geared toward the people who must answer the question: who does what? Even though this is not a technical document, engineering and technology-focused individuals may find it beneficial for project execution.

The document is broken into the following sections:

  1. Introduction

  2. High-level Project Approach

  3. Risk Management

  4. Suggested Skills

  5. Execution Preparation Matrix

  6. Conclusion

Introduction

Today’s large, established, and traditional enterprises are experiencing consumer-driven pressure to transform the way in which they interact with their customers. These businesses are turning to DataStax Enterprise as the technology that enables the business process transformation. Most of these companies have moved from more traditional business practices to Internet Enterprise business practices. The industry standard to date includes business transactions that are dictated by well-defined, strictly-controlled customer interactions. The trend is shifting away from business practices which control how a customer buys, communicates, and receives services, and toward business practices that promote immediate responses to customer demands.

Like any business transformation initiative, the method of execution used to implement the transformation is nearly as vital to its success as the technology that enables it. The DataStax Enterprise platform is proven technology which can offer the right responses to the modern customer’s demands. This guide seeks to provide a proven approach to implementing this technology, resulting in the desired business transformation for large enterprises.

This document provides a non-technical, comprehensive approach for the implementation of DataStax Enterprise to efficiently and effectively achieve the enterprise business transformation. The sections of this document have been deliberately selected to help enterprise leaders understand the key project success items and, importantly, the key risk items that must be carefully managed to avoid undesired setbacks during implementation.

High-Level Approach

The high-level approach for implementing a system on the DataStax Enterprise platform is similar to the approach for developing any distributed, customer-facing, revenue-generating (i.e. mission-critical) application. For example, planning, communication, and execution are all required for the success of mission-critical applications, and are likewise required for implementing systems on DataStax Enterprise.

The following graphic highlights the unique/required steps in a high-level approach for implementing a solution on DataStax Enterprise.

Figure: Approach Overview

Note: A couple of key project lifecycle phases, Discovery, Planning, and Production Deployment, are explicitly omitted from this diagram, as they contain no DataStax-specific items. Further items not depicted in this graphic, such as application and functional/security requirements, are assumed to be included in the development approach.

This diagram depicts a methodology-agnostic approach to project implementation. These phases could be included as major milestones within a Waterfall, Agile, Kanban, or other project management methodology.

The following list provides detail and context for the high-level approach diagram.

  • Requirements Phase

    • DataStax Milestones:

      • Data Model Requirements

      • Security and Encryption Requirements

      • Service Level Agreements

      • Operational Requirements (Monitor and Manage)

      • Search Requirements (Optional – Only for DataStax Search)

      • Analytics Requirements (Optional – Only for DataStax Analytics)

    • The pervasive sentiment in the Apache Cassandra community, as well as in the DataStax Enterprise community, is that one of the keys to success is "getting the data model right". Enabling a scalable data model requires specific data model requirements.

      • For next-generation, transformation, upgrade, and similar projects, a great starting point for data model requirements is to enable query-level logging within the existing database. Then sort the query log by frequency of occurrence, most-accessed queries first. These queries will provide most, if not all, of the requirements needed to produce the data model for DataStax.

      • For new application/functionality requirements, treat the requirements phase of the project as you would any API requirements effort. That is, define specific Create, Read, Update, and Delete (CRUD) requirements, with a special focus on the Read requirements. Specific requirements for the WHERE or BY clause of read operations are required for successful data model design (a sketch of this query-first mapping appears at the end of this phase's list).

    • DataStax Security and Encryption requirements encapsulate the following areas:

      • Authentication Requirements (i.e. Kerberos, Password, SSL, LDAP, etc.)

      • Authorization Requirements (i.e. access to Schema, Table, or other database components)

      • As DataStax Enterprise is a distributed system, encryption requirements should be defined at two distinct levels (note that compression design choices will occur at this level as well):

        • Client Application to DataStax (the Cluster)

        • Node-to-Node (Inter-Cluster)

    • Defining Service Level Agreements (SLAs) for each CRUD operation (in terms of latency measured in milliseconds), as well as for system uptime, is highly recommended to guide the design and delivery of the solution. An absence of SLAs is a project management failure that carries a high probability of increased project duration and decreased product quality.

    • Chances are that you are building a mission-critical application that will function at very large scale, serving millions of customer requests per day or more. Defining the requirements for the operational monitoring and management of the system is highly recommended during this phase of the project. There is a large risk that post-production system issues will either go undetected or require significantly more time and effort to resolve if clear operational requirements are absent from the outset of system implementation.

    • If the project’s scope includes DataStax Search components, then, similarly to data model requirements, search requirements are required at this stage to provide enough clarity to develop the DataStax Search views (SOLR cores) that will enable search functionality. The requirements should be clear enough to determine the fields that will be searched on and returned in the results, and to delineate how search will be conducted, i.e. multiple search fields or a single search field, the use of faceted results vs. ranked list results, etc.

    • If the project’s scope includes DataStax Analytics components, then Analytics requirements should be captured at this time. Analytics requirements should incorporate the statistical algorithms, required data sources, data movement/modifications, security/access, and other analytical requirements at a clear enough level to enable a thorough design.
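To make the query-first approach concrete, the following is a minimal sketch of how a single read requirement, "fetch a user's most recent orders", maps directly to a table design. It uses CQL via the DataStax Java driver; the keyspace, table, and column names are hypothetical, and the contact point and SimpleStrategy replication are illustrative only (a production keyspace would use NetworkTopologyStrategy).

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.ResultSet;
  import com.datastax.driver.core.Row;
  import com.datastax.driver.core.Session;
  import java.util.UUID;

  public class QueryFirstModelSketch {
      public static void main(String[] args) {
          try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
               Session session = cluster.connect()) {
              // Illustrative keyspace; production systems would use NetworkTopologyStrategy.
              session.execute("CREATE KEYSPACE IF NOT EXISTS shop WITH replication ="
                      + " {'class': 'SimpleStrategy', 'replication_factor': 1}");
              // The read requirement "orders for a user, newest first" dictates the
              // partition key (user_id) and the clustering column (order_time DESC).
              session.execute("CREATE TABLE IF NOT EXISTS shop.orders_by_user ("
                      + " user_id uuid, order_time timeuuid, total decimal,"
                      + " PRIMARY KEY (user_id, order_time))"
                      + " WITH CLUSTERING ORDER BY (order_time DESC)");
              // The WHERE clause of the read requirement maps directly onto the partition key.
              ResultSet rs = session.execute(
                      "SELECT order_time, total FROM shop.orders_by_user WHERE user_id = ? LIMIT 10",
                      UUID.randomUUID());
              for (Row row : rs) {
                  System.out.println(row.getUUID("order_time") + " " + row.getDecimal("total"));
              }
          }
      }
  }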

  • Design Phase

    • DataStax Milestones:

      • Data Model Design

      • Data Access Object Design

      • Data Movement Design

      • Operational Design (Management and Monitoring)

      • Search Design (Optional)

      • Analytics Design (Optional)

    • The Data Model design should include the following components in a clear format that all team members can understand. The following link provides in-depth reference material for data modeling in DataStax: http://www.datastax.com/resources/data-modeling.

      • Keyspace Design (Replication Strategy, Name)

      • Table Design (table names, partition keys, clustering columns (if applicable), and physical table properties as necessary, e.g. encryption, bloom filter settings, etc.)

      • Any relationships between tables. Note that database joins within DataStax Enterprise are not technically feasible. However, relationships between tables are still important, especially for the application developers.

    • Applications built on DataStax Enterprise are more successful when they leverage simple Data Access Objects to encapsulate and abstract data manipulation logic. This is opposed to the current trend in application development, where projects leverage frameworks to encapsulate, abstract, and represent database components as application objects, e.g. Hibernate, LINQ, JPA, and other ORM tooling. Designing the Data Access Objects up front, as much as possible, will help the application development team as they build out higher-level functionality (a sketch appears at the end of this phase's list).

    • Data Movement design includes items such as batch and real-time data integration between systems, ETL, Change Data Capture, data pipelines, etc. Capturing data transformation logic clearly is essential to the success of data integration initiatives. Items such as data types, transformation logic, error handling, look-ups, and data normalization should be clearly documented as part of Data Movement design.

    • Operational Design includes topics such as the tooling and techniques used to deploy new nodes, configure and upgrade nodes in the cluster, backup and restore operations, cluster monitoring, OpsCenter use, repairs, alerting, disaster management processes, etc. Several organizations leverage a "playbook" approach to Operational Design.

    • It is recommended to incorporate items such as searchable terms, returned terms, tokenizers, filters, multi-document search terms, etc. in the Search Design for each searchable view (SOLR core) that will be included in the application. Please see the following link for more information on the items available for design with DataStax Search: http://www.datastax.com/documentation/datastax_enterprise/4.5/datastax_enterprise/srch/srchTOC.html.

    • When working with DataStax Analytics, it is important to first determine which Analytics components will be leveraged in the solution. Once that decision has been made, then specific, functionally aligned design items should be produced, such as Hive table structures, Map Reduce workflows, etc.
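As a minimal sketch of the Data Access Object approach described above (the class name is hypothetical, reusing the orders_by_user table sketched in the Requirements Phase; the DataStax Java driver is assumed):

  import com.datastax.driver.core.PreparedStatement;
  import com.datastax.driver.core.Row;
  import com.datastax.driver.core.Session;
  import java.math.BigDecimal;
  import java.util.UUID;

  // A simple DAO that encapsulates all data manipulation logic for one table,
  // rather than routing access through a heavyweight ORM framework.
  public class OrderDao {
      private final Session session;
      private final PreparedStatement insertStmt;
      private final PreparedStatement selectStmt;

      public OrderDao(Session session) {
          this.session = session;
          // Statements are prepared once and reused; each maps 1:1 to the data model.
          this.insertStmt = session.prepare(
                  "INSERT INTO shop.orders_by_user (user_id, order_time, total) VALUES (?, now(), ?)");
          this.selectStmt = session.prepare(
                  "SELECT order_time, total FROM shop.orders_by_user WHERE user_id = ? LIMIT ?");
      }

      public void saveOrder(UUID userId, BigDecimal total) {
          session.execute(insertStmt.bind(userId, total));
      }

      public void printRecentOrders(UUID userId, int limit) {
          for (Row row : session.execute(selectStmt.bind(userId, limit))) {
              System.out.println(row.getUUID("order_time") + " " + row.getDecimal("total"));
          }
      }
  }

Higher-level application code then depends only on methods like saveOrder and printRecentOrders, keeping the data model details in one place.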

  • Implementation Phase

    • DataStax Milestones:

      • Infrastructure

      • Deployment and Configuration Management Mechanism

      • Software Components (Data Model and Application)

      • Unit Testing of Components

    • This phase of the approach is typical of any software project. This is where "things" are actually built and implemented. Building out infrastructure and software components does not require any special DataStax-centric highlights.

    • Deployment and Configuration Management Mechanisms are key to managing a distributed system. It is recommended that all operational items be automated, as much as feasible, to optimize the process of deploying and/or configuring nodes in the cluster. Tools like OpsCenter, Docker, Vagrant, Chef, Puppet, etc. can be leveraged to quickly deliver the operational components necessary to manage the full software solution.

    • Unit Testing of functionality becomes a bit more complex with distributed systems compared to single-node systems. Specific defects, such as race conditions, are only observed "at scale". Because of this, it is recommended that unit testing be executed over a small cluster that contains more than a single node. Tools such as ccm can be used by developers to automate the process of quickly launching test clusters as part of a unit test, as sketched below.
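As a minimal sketch of this idea: assuming a three-node local test cluster has been started with ccm (for example, "ccm create test -v 3.0.15 -n 3 -s", where the cluster name and Cassandra version are illustrative), a test can verify that it is actually exercising multiple nodes before running its assertions.

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.Session;

  public class MultiNodeSmokeTest {
      public static void main(String[] args) {
          // ccm binds local nodes to 127.0.0.1, 127.0.0.2, 127.0.0.3 by default.
          try (Cluster cluster = Cluster.builder()
                  .addContactPoints("127.0.0.1", "127.0.0.2", "127.0.0.3")
                  .build();
               Session session = cluster.connect()) {
              // Confirm the test really sees more than one node, so that
              // distribution-related defects (e.g. race conditions) can surface.
              int nodes = cluster.getMetadata().getAllHosts().size();
              if (nodes < 2) {
                  throw new IllegalStateException("Unit tests should run against a multi-node cluster");
              }
              System.out.println("Connected to " + nodes + " nodes");
          }
      }
  }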

  • Pre-Production Testing Phase

    • DataStax Milestones:

      • Defect tracking items (JIRA, Log of Issues, etc.)

      • Operational readiness checklist completed

    • This is perhaps the most critical phase of this approach.  This phase enables the project team to identify actual issues prior to going to production. As stated in the unit testing section, specific defects will not be observed until the software solution is functioning "at scale" under normal and extraordinary conditions for a period of time. These steps are deliberately provided in the approach to enable the identification of "at scale" problems preemptively.

    • This phase should encompass a two-week period in which, at minimum, one of the weeks is dedicated to running the application at production scale. Only observations should be made during this period of the project. Note that it may take several iterations of configuration, code change, and refactoring to enable the application to execute for a full week. The one-week recommendation ensures there are enough data points to conclude that the application and infrastructure are adequate to handle a production workload. Apache Cassandra needs to be stressed for this amount of time to determine whether read performance degrades, due to compaction design items, or remains acceptable.

    • Here is a list of items that should be included in an Operational Readiness Checklist for DataStax Enterprise:

      • Replace a downed node and a dead seed node

      • Configure and execute repair (ensure repair completes within gc_grace_seconds)

      • Add a node to a cluster

      • Replace a downed Data Center

      • Add a Data Center to the cluster

      • Decommission a node

      • Restore a backup

      • At a Cluster Level and Per Node Level, report on errors, throughput, latency, resource saturation, bottlenecks, compactions, flushes, and health

  • Scale and Enhancements Phase

    • This phase highlights the normal, operational mode of an application built on DataStax Enterprise. Growth in demand is a predictable eventuality which can be addressed by adding nodes to expand the capacity of the system. Scaling with DataStax Enterprise is as simple as that.

As mentioned above, this approach is methodology-agnostic. The stages in the approach can be executed as single, individual phases in a Waterfall approach or by iterating over each phase in small, horizontal slices of functionality that include a facet of each phase. Please note that Pre-Production testing should be executed as a single phase including all planned functionality for Production deployment.

There is an additional approach that shows how small, agile teams can go from Prototype (PoC) to Production without much refactoring. Here is a link to the referenced approach: http://www.slideshare.net/planetcassandra/jake-luiciani-poc-to-production-c

The linked presentation is intended for technical audiences. It provides good details on data modeling as well as Pre-Production testing. The main takeaway is that, if the PoC is well constructed, you can move directly into the Pre-Production testing phase of this approach, skipping the requirements through implementation phases. This highlights the scaling advantage of Apache Cassandra and DataStax Enterprise.

Risk Management

What would a technology project be without risk? That’s a trick question, because without risk, there are no rewards. This is especially true for the types of transformational applications being built on top of the DataStax Enterprise platform. The huge scale that DataStax Enterprise enables, millions of transactions per second over tens of petabytes of live data, transforms small risks into large issues if the initial risk is not identified and managed.

Some key areas involving project risk management for DataStax Enterprise and Apache Cassandra are addressed below. This section does not provide an exhaustive list of risk management items for large, distributed applications; only DataStax-centric items are covered.

Each risk item below is presented with the following fields: Description, Impact Severity, Mitigation Effort, Potential Impact, and Mitigation Techniques.

Risk Item: Shared Storage

Description: Using a shared storage disk system to store data within Apache Cassandra/DataStax Enterprise. Shared Storage could be NAS, SAN, Amazon EBS, etc. See here for information on the risk of Shared Storage.

Impact Severity: Critical

Mitigation Effort: Large

Potential Impact: Revenue impacting due to:

  1. Severe system downtime if/when Shared Storage fails.

  2. Severe performance impact on the system, rendering it non-responsive because Shared Storage cannot handle the disk I/O load from Cassandra.

Mitigation Techniques:

  1. Test storage system capabilities as part of the Pre-Production testing phase of the project.

  2. Escalate to the project sponsor/executive that the project will fail if Shared Storage is used as the disk system for DataStax Enterprise.

  3. Put the project "on hold" until Shared Storage is removed from the infrastructure design.

  4. Start the process of purchasing local storage for DataStax nodes, and extend the testing phase to allow enough time to switch out the disk system once performance testing proves that Shared Storage is not capable of supporting workloads from DataStax Enterprise.

Risk Item: Relational Model Port

Description: The team wants to "port" or move an existing relational data model into DataStax without redesigning the model for Apache Cassandra. This may appear to save time by skipping "steps", but will cost more time/resources over the full duration of the project.

Impact Severity: Critical

Mitigation Effort: Large

Potential Impact: Revenue impacting due to:

  1. Severe performance impact when reading data from a bad data model.

  2. Risk of a large application re-write effort to fix the data model before the application will be production ready.

Mitigation Techniques:

  1. Test data model capabilities as part of the Pre-Production testing phase of the project.

  2. Similar to the Shared Storage item, escalate high enough in the organization until someone of influence can help correct the technical direction of the solution, while planning for a large redevelopment effort after Pre-Production testing and prior to Production deployment.

Lack of "At Scale" Testing

Placing items into Production without Production-like load testing over many days.  Testing for too short of a period, less than 5 days, is the equivalent of not testing due to the manner of which DataStax manages data files.

Critical

Large

Revenue impacting due to

  1. Risk of large "re-write" effort to resolve issues found in Production.

  1. Prior to production launch, spin up a parallel testing team to create a production-like testing environment and prepare for large "re-work" efforts to resolve issues found in Production.

  2. Create a small team of capable team members that can quickly diagnose and make recommendations for upcoming Production issues.

Risk Item: Slow Network Connections

Description: The network used to connect nodes to other nodes, or to client applications, is not fast enough or large enough to handle the amount of network traffic that will be placed onto it by the full application. This involves DataStax Enterprise and the client application stack.

Impact Severity: Critical

Mitigation Effort: Large

Potential Impact: Revenue impacting due to:

  1. Non-functioning cluster (a cluster that can’t communicate internally is not reliable enough to support a production system).

  2. Severe performance issues.

Mitigation Techniques:

  1. Estimate anticipated network requirements early in the project and compare them with network availability.

  2. Test the performance of the network during the Pre-Production testing phase of the project.

  3. If a slow-performing network is anticipated/suspected/probable, then start escalating this item until someone of influence can help the team acquire more network bandwidth.

  4. Leverage compression to minimize the amount of network bandwidth used in the system. Outside of compression, there are no other options for minimizing the network impact from a DataStax system; the only other way to resolve network issues is to expand the network capabilities.

Risk Item: Lack of Operational Readiness

Description: DataStax Enterprise is built on the premise of operational simplicity, but it is always a good idea to ensure the operations team is prepared to manage the system in production prior to deploying the system.

Impact Severity: Critical

Mitigation Effort: Medium

Potential Impact: Revenue impacting due to:

  1. Inability to recognize and resolve system operational issues.

Mitigation Techniques:

  1. During the Pre-Production testing phase, ensure the team completes the recommended items included in the Operational Readiness Checklist.

Risk Item: Lack of Security

Description: No security is included in the system, i.e. no authentication, authorization, nor encryption is included.

Impact Severity: Critical

Mitigation Effort: Medium

Potential Impact: Data breach, revenue, profit, etc. impacting due to:

  1. Risk of unauthorized access to sensitive data.

Mitigation Techniques:

  1. Ensure deliberate design decisions are made for the system authentication, authorization, and encryption components of DataStax Enterprise (a client-side sketch follows this item).

  2. Test for security weaknesses during the Pre-Production testing phase of the project.

  3. Escalate if pushback on security decisions or resolution is encountered during the project.
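As a minimal sketch of the client-to-cluster portion of this mitigation (the account name is hypothetical; it assumes password authentication and client-to-node SSL have already been enabled on the cluster, while node-to-node encryption is configured separately in each node's cassandra.yaml):

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.Session;

  public class SecureConnectionSketch {
      public static void main(String[] args) {
          try (Cluster cluster = Cluster.builder()
                  .addContactPoint("127.0.0.1")
                  .withCredentials("app_user", "app_password") // authentication (hypothetical account)
                  .withSSL()                                   // client-to-node encryption, using the JVM's default SSL settings
                  .build();
               Session session = cluster.connect()) {
              System.out.println("Connected securely as app_user");
          }
      }
  }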

Risk Item: Lack of Training

Description: DataStax Enterprise leverages new technologies to perform operations that engineers have been executing for years, but at a scale that engineers have previously not been able to accomplish. Like any new, transformational technology, proper knowledge is required from the full project team to be successful.

Impact Severity: Critical

Mitigation Effort: Medium

Potential Impact: Revenue impacting due to:

  1. Poor execution from lack of knowledge.

  2. System not operational from bad design decisions and lack of adequate testing.

  3. Inability to resolve issues from lack of knowledge and experience.

Mitigation Techniques:

  1. Send team members to DataStax training.

Risk Item: Incorrectly Sized Machines

Description: DataStax node machines are sized either too small (specifically CPU and memory) or too large (specifically disk) compared to specifications found here or here.

Impact Severity: Critical

Mitigation Effort: Medium

Potential Impact: End User experience impacting due to:

  1. Risk of performance degradation for end users.

Mitigation Techniques:

  1. Test machine size as part of the Pre-Production testing effort.

  2. If nodes are undersized, then add more nodes to the cluster to provide enough processing power for the cluster.

  3. If nodes are oversized, then leverage RAID 10 on disks to reduce the amount of disk capacity for nodes.

  4. If nodes are oversized, then minimize disk partition sizes so that the right level of disk space is exposed/usable to DataStax.

Risk Item: Incorrectly Sized Cluster

Description: The total amount of processing power (CPU, RAM) or disk space is not adequate to handle both the anticipated "normal" load and the exceptional "spike" load.

Impact Severity: High

Mitigation Effort: Medium

Potential Impact: Revenue impacting due to:

  1. Risk of missed transactions if the Cluster is overwhelmed and unresponsive.

Mitigation Techniques:

  1. Test cluster capabilities, as well as saturation points, as part of the Pre-Production testing phase of the project.

  2. Work with the operations team to ensure they can quickly add DataStax nodes to the cluster to increase Cluster capacity.

Risk Item: Too Many Tables

Description: A data model is created that will contain more than 500 tables within a single DataStax cluster.

Impact Severity: High

Mitigation Effort: Medium

Potential Impact: Revenue impacting due to:

  1. Potential Cluster stability issues caused by the pressure of having so many tables in Cassandra.

  2. If DataStax Search capabilities are built on top of this many tables, the system will become non-responsive.

Operational capabilities impacted due to:

  1. Limitations of monitoring tools, such as OpsCenter, in collecting and publishing statistics with this many tables.

Mitigation Techniques:

  1. Test the number of tables during Pre-Production testing to understand the impact on system stability, performance, and operational capabilities.

  2. Start a redesign of the data model/system architecture. Two suggested alternative approaches are: (1) leverage partition keys to replace the need for separate tables (see the sketch below); (2) leverage multiple Clusters with fewer than 500 tables in each cluster.
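As a minimal sketch of the first alternative (all names are hypothetical): rather than one table per sensor type, a single table can absorb the former table name into its composite partition key, so each (sensor_type, sensor_id) pair still gets its own partition.

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.Session;

  public class TableConsolidationSketch {
      public static void main(String[] args) {
          try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
               Session session = cluster.connect()) {
              // Illustrative keyspace; production systems would use NetworkTopologyStrategy.
              session.execute("CREATE KEYSPACE IF NOT EXISTS telemetry WITH replication ="
                      + " {'class': 'SimpleStrategy', 'replication_factor': 1}");
              // Before: hundreds of per-sensor-type tables (readings_temp, readings_humidity, ...).
              // After: one table whose composite partition key carries the former table name,
              // keeping partitions separate without multiplying the table count.
              session.execute("CREATE TABLE IF NOT EXISTS telemetry.readings ("
                      + " sensor_type text, sensor_id uuid, reading_time timeuuid, value double,"
                      + " PRIMARY KEY ((sensor_type, sensor_id), reading_time))"
                      + " WITH CLUSTERING ORDER BY (reading_time DESC)");
          }
      }
  }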

Risk Item: Large Data Values

Description: Storing data values that are larger than 10 MB per column, or 100 MB per row (called a Cassandra partition), is not a good design for DataStax.

Impact Severity: High

Mitigation Effort: Medium

Potential Impact: End User experience impacting due to:

  1. Severe performance degradation from sending and processing large data values.

Mitigation Techniques:

  1. During the design phase, recommend the team "chunk" large values into multiple partitions (see the sketch below).

  2. Test the performance of reading and writing large data values during the Pre-Production testing phase.

  3. Start planning for a code rewrite to adjust the design of the data model and the processing of large data values.

  4. Ask the architecture team to review possible augmenting solutions that are targeted at storing large data values, such as Gluster.
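As a minimal sketch of the chunking mitigation (the table name and the 1 MB chunk size are illustrative): a large value is split across multiple partitions, keyed by an object id plus a chunk index.

  import com.datastax.driver.core.PreparedStatement;
  import com.datastax.driver.core.Session;
  import java.nio.ByteBuffer;
  import java.util.Arrays;
  import java.util.UUID;

  public class BlobChunkingSketch {
      private static final int CHUNK_SIZE = 1_000_000; // ~1 MB per column, well under the 10 MB guidance

      // Assumes a table in which each chunk is its own partition, e.g.:
      //   CREATE TABLE media.blobs (object_id uuid, chunk_index int, data blob,
      //                             PRIMARY KEY ((object_id, chunk_index)));
      public static void writeChunked(Session session, UUID objectId, byte[] payload) {
          PreparedStatement insert = session.prepare(
                  "INSERT INTO media.blobs (object_id, chunk_index, data) VALUES (?, ?, ?)");
          for (int i = 0, chunk = 0; i < payload.length; i += CHUNK_SIZE, chunk++) {
              byte[] slice = Arrays.copyOfRange(payload, i, Math.min(i + CHUNK_SIZE, payload.length));
              session.execute(insert.bind(objectId, chunk, ByteBuffer.wrap(slice)));
          }
      }
  }

A read path would fetch the chunks by (object_id, chunk_index) and reassemble them in order.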

Risk Item: Cross Cluster Operations

Description: Any client operation that performs "Cross Cluster" operations, such as reads or writes leveraging QUORUM Consistency Levels, includes all Data Centers in the operation’s read/write path.

Impact Severity: High

Mitigation Effort: Small

Potential Impact: End User experience impacting due to:

  1. Severe performance degradation from incurring network latency and remote Data Center response times in every operation.

Mitigation Techniques:

  1. Ensure only LOCAL- or ONE-based Consistency Levels are used in the client application (see the sketch below).

  2. Watch for performance issues during the Pre-Production testing effort that highlight this non-optimal application configuration.

  3. Change any QUORUM operations to LOCAL_QUORUM operations in the client code.
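As a minimal sketch of the consistency-level fix (DataStax Java driver; the contact point is illustrative), setting a Data-Center-local default for all statements:

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.ConsistencyLevel;
  import com.datastax.driver.core.QueryOptions;
  import com.datastax.driver.core.Session;

  public class LocalQuorumSketch {
      public static void main(String[] args) {
          try (Cluster cluster = Cluster.builder()
                  .addContactPoint("127.0.0.1")
                  // LOCAL_QUORUM confines each read/write to the local Data Center,
                  // avoiding cross-Data-Center round trips on every operation.
                  .withQueryOptions(new QueryOptions()
                          .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
                  .build();
               Session session = cluster.connect()) {
              System.out.println("Default consistency: LOCAL_QUORUM");
          }
      }
  }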

Risk Item: Heavy Use of Secondary Indexes

Description: Data models that rely on more than two secondary indexes (an arbitrarily chosen, heuristic threshold) to satisfy query requirements are overusing secondary indexes. This risk item indicates an issue with the data model design.

Impact Severity: Medium

Mitigation Effort: Medium

Potential Impact: Revenue impacting due to:

  1. Severe performance issues from secondary-index-based queries.

  2. Potential correctness issues from secondary index ingestion challenges.

Mitigation Techniques:

  1. Test secondary index capabilities as part of the Pre-Production testing phase of the project. If the secondary indexes enable an accurate and fast result set for queries, while not overly burdening the cluster, then there are no issues.

  2. If secondary indexes prove to be (1) inaccurate, (2) slow, or (3) too resource intensive for the cluster, then a data model (and application) rewrite will be required. Plan for this rewrite effort.

  3. Ensure the testing effort explicitly focuses on load testing the secondary indexes.

Risk Item: Lack of Requirements

Description: Lack of clear requirements, particularly requirements to help guide the data model design; or, constantly changing data model requirements.

Impact Severity: Medium

Mitigation Effort: Small

Potential Impact: Extended implementation duration due to:

  1. Constant re-work of the data model and of the application that reads/writes to the data model.

Mitigation Techniques:

  1. Leverage requirements to guide the Pre-Production test phase. If the requirements aren't clear enough to guide the Pre-Production test phase, i.e. not enough detail on read/write patterns/use cases, then the test phase cannot begin and the project should be immediately placed back into the requirements phase.

Risk Item: Active-Passive Architecture

Description: A stand-by Data Center is included in the Cluster topology design. A stand-by Data Center is one that is included in the system infrastructure but won't be actively used by any client applications.

Impact Severity: Low

Mitigation Effort: Small

Potential Impact: Increased project cost due to:

  1. Unused hardware/resources.

Mitigation Techniques:

  1. Have the application team leverage all Data Centers as actively participating database systems. Cassandra is designed to provide high availability without the need for any stand-by components.

Suggested Skills

The required skillset for the development effort depends on the type of application being built. This section discusses specific skills that will enable successful deployments and application builds with DataStax Enterprise. These skills could be supplied by one or several team members/roles. Individual team members who possess all of the listed skills are very valuable assets for an implementation, because they are able to work across all technologies included in DataStax Enterprise.

Apache Cassandra

For DataStax Enterprise implementations that will leverage Apache Cassandra only, i.e. no Analytics or Search components, the following skills and roles are recommended.

Each skill below is described, followed by its impact to the project.

Linux Experience

Team members who possess a deep understanding of the Linux operating system, and have several years of administration experience with it, are required for DataStax Enterprise implementations. Specifically, the Linux skillset requires "know-how" in system diagnosis/monitoring, network troubleshooting, software installation, disk/partition configuration, and OS administration. Deep expertise is very beneficial for troubleshooting purposes. Apache Cassandra is tightly integrated with Linux and relies on the Linux OS for items such as disk management, cache management, etc.

The project will benefit from this skill during the following tasks:

  1. Implementation tasks

  2. Troubleshooting tasks

  3. Performance enhancement tasks

  4. Operational tasks

Java Experience

Even for teams that choose a different technology for the application, having someone on the team who is knowledgeable about Java in general, and the JVM specifically, will benefit the team. Apache Cassandra, which powers DataStax Enterprise, is a Java application. Though standard and recommended Java configurations are included with Cassandra, having a team member who can tune Java and the JVM will benefit the performance of Apache Cassandra and DataStax.

The project will benefit from this skill during the following tasks:

  1. Implementation tasks

  2. Troubleshooting tasks

  3. Performance enhancement tasks

Distributed Systems Development Experience

DataStax Enterprise is a distributed system. Therefore, it has unique nuances which pose design/development challenges for application developers. Team members who have worked with distributed systems will benefit the team.

The project will benefit from this skill during the following tasks:

  1. Design tasks

  2. Development tasks

  3. Troubleshooting tasks

  4. Performance enhancement tasks

  5. Testing tasks

Automated Configuration and Deployment Experience

It is common for deployments of DataStax Enterprise to leverage tens to hundreds, if not thousands, of nodes.  Having a team member who can automate the deployment and configuration of these nodes is very beneficial to the project.

The project will benefit from this skill during the following tasks:

  1. Implementation tasks

  2. Deployment tasks

  3. Troubleshooting tasks

  4. Performance enhancement tasks

  5. Testing tasks

 

Physical Data Modeling Experience

DataStax Enterprise relies on the data modeling guidelines of Apache Cassandra. These guidelines align more closely with dimensional modeling than with the third normal form (3NF) data modeling found in most online applications. Having a team member who has experience in both dimensional and 3NF physical data modeling will be beneficial for the team.

The project will benefit from this skill during the following tasks:

  1. Design tasks

  2. Development tasks

  3. Troubleshooting tasks

  4. Performance enhancement tasks

  5. Testing tasks

DataStax Analytics

For DataStax Enterprise implementations that will leverage DataStax Analytics components, the following skillsets are recommended.

Each skill below is described, followed by its impact to the project.

Data Analysis (Analytics) Experience

Regardless of the tooling decisions for DataStax Enterprise, having a team member who is competent in analytics will be an asset to the team. This skill enables a consultative/guidance role for the project team, and can help the team choose the appropriate algorithms, pipeline techniques, and visualization techniques for Analytics.

The project will benefit from this skill during the following tasks:

  1. Design tasks

  2. Development tasks

  3. Testing tasks

 

Hadoop Experience

If Hadoop, or one of its components, will be leveraged as the DataStax Analytics tool, then someone with experience in the appropriate Hadoop toolset will help; they should augment the team and provide experience with these tools.

The project will benefit from this skill during all phases of the project.

Spark Experience

If Spark will be leveraged for the DataStax Analytics tool, then someone with Spark experience will be beneficial to the team.  This skillset implies some experience with Scala as well.

The project will benefit from this skill during all phases of the project.

DataStax Search

For DataStax implementations that will leverage DataStax Search components, the following skillsets are recommended, in addition to what has been presented for Apache Cassandra.

Each skill below is described, followed by its impact to the project.

SOLR Experience

Apache SOLR is the underlying technology used by DataStax to provide search functionality. This tool is very powerful, but has a lot of options.  Having a skilled SOLR team member will be beneficial to the project.

The project will benefit from this skill during all phases of the project.

Execution Readiness Matrix

This matrix provides project leaders with a way to quantify their team's execution capabilities and preparedness.

The matrix summarizes and quantifies the items highlighted in this guide. This quantified method will help Project Managers determine whether the application and team are ready for Production. A total score of less than 60 means that the application and team are not ready for Production. Note that the weighting scales differ per topic; specifically, the Pre-Production Testing phase is weighted very heavily to emphasize its importance.

To use this matrix, simply place a check, or another mark, in the box that applies for each topic item. Then, once all items have been checked, total the score and compare it to the Production threshold of 60. Please contact DataStax for assistance if any of the topic items are deficient.

This matrix can also be used to pinpoint issues within the team or application.
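As an illustration (with hypothetical scores): a team that rates Mostly Complete (4) on all four Requirements items (16 points), the four required Design items (16), and all five Implementation items (20), rates Mostly Complete (5) on all four Pre-Production Testing items (20), has eliminated all fifteen risk items (15), and rates Competent (3) in the five core skills (15) totals 102, comfortably above the threshold of 60. If that same team instead left the Shared Storage risk in place (-5 rather than +1) and skipped Pre-Production testing entirely (-10 per item rather than +5), the total would fall to 36, well below the threshold.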

The matrix is organized into three sections: Approach Total, Risk Total, and Skillset Total. Score each item on the scale given for its topic.

Approach Total

Requirements

Scale: Incomplete (0), Mostly Incomplete (2), Some Parts Complete (3), Mostly Complete (4), Complete (5)

Data Model Requirements

Security and Encryption Requirements

Service Level Agreements

Operational Monitoring and Management

Design

Scale: Incomplete (0), Mostly Incomplete (2), Some Parts Complete (3), Mostly Complete (4), Complete (5)

Data Model Design

Data Access Object Design

Data Movement Design

Operational Design (Management and Monitoring)

Search Design (Optional)

Analytics Design (Optional)

Implementation

Scale: Incomplete (0), Mostly Incomplete (2), Some Parts Complete (3), Mostly Complete (4), Complete (5)

Infrastructure

Database Components

Application Components

Deploy and Configuration Mechanisms

Unit Testing Components

Pre-Production Testing

Scale: Incomplete (-10), Mostly Incomplete (-5), Some Parts Complete (1), Mostly Complete (5), Complete (10)

Executed for 2 Weeks

Issue Tracking and Resolution

Operational Checklist

Deploy and Configuration Mechanisms

Risk Total

Critical Risk Severity

Scale: Non Existent (1), Exists (-5)

Shared Storage

Relational Model Port

Lack of "At Scale" Testing

Slow Network Connections

Lack of Operational Readiness

Lack of Security

Lack of Training

High Risk Severity

Scale: Non Existent (1), Exists (-4)

Incorrectly Sized Machines

Incorrectly Sized Cluster

Too Many Tables

Large Data Values

Cross Cluster Operations

Medium Risk Severity

Scale: Non Existent (1), Exists (-3)

Heavy Use of Secondary Indexes

Lack of Requirements

Low Risk Severity

Scale: Non Existent (1), Exists (-2)

Active-Passive Architecture

Skillset Total

Level of Expertise

Scale: Beginner (1), Novice (2), Competent (3), Advanced (4), Expert (5)

Linux Experience

Java Experience

Distributed Systems Development Experience

Automated Configuration and Deployment Experience

Physical Data Modeling Experience

Data Analysis (Analytics) Experience (Optional)

Hadoop Experience (Optional)

Spark Experience (Optional)

SOLR Experience (Optional)

Conclusion

Hopefully, this reference guide has provided project, team, and technology leaders with strategies that will lead to the successful implementation of applications built on DataStax Enterprise.