Cassandra

Our comprehensive Apache Cassandra training program provides data modeling experience like no other

The large volume and wide variety of data required by today’s business processes call for a highly available, low-latency database. Apache Cassandra meets this need by enabling high-speed reads and writes across a replicated, distributed system. This Apache Cassandra training program provides the data modeling experience needed to take advantage of Cassandra’s linearly scalable peer-to-peer design.

Big Data is reshaping the landscape of big business. While raw data at this scale is difficult to harness, Apache Cassandra, the open source NoSQL distributed database management system, can handle large amounts of data across many commodity servers. MobiGnosis’ Cassandra training program teaches the fundamentals of Cassandra, from the basics through to the more advanced methodologies. You will learn Cassandra data models, Cassandra architecture, configuration, reading and writing data, and integration with Hadoop. The training also includes practice sessions to reinforce your understanding. Knowledge of this technology is valuable for a successful career, and our trainers will help you excel in it!

Key Learnings:

  • Architect Cassandra databases and implement commonly used design patterns
  • Model data in Cassandra based on query patterns
  • Access Cassandra databases using CQL and Java
  • Balance read/write speed against data consistency
  • Integrate Cassandra with Hadoop and use tools like Pig and Hive
  • Understand how and where to use Cassandra and the core concepts that drive this database
  • Use the fault-tolerance and high-availability features of Cassandra
  • Understand the Apache Cassandra architecture and its more complex inner workings, such as the gossip protocol, read repair and Merkle trees
  • Identify requirements properly and create a Cassandra data model by applying data modelling techniques

Topics Covered in the Classroom:

1. Basics

  • Revise CAP Theorem
  • Good fit use cases

2. Concepts

  • Who uses it?
  • Database or Datastore?
  • Masterless architecture
  • Seed node(s)
  • Gossip
  • Detecting a failed node
  • Replication
  • Partitioner
  • Snitch – summary
  • Snitch – property file, when to use? An example
  • Virtual node, ring architecture
  • Commodity vs Specialized hardware
  • Bootstrap process
  • Elastic linear scalability
  • Debate – heterogeneous machines, adding capacity
  • Deployment – 4 dimensions
  • Distributed workloads, Multi-DC setup
  • Regions and Zones (Cloud setup)
  • SEDA

3. Setup and installation

  • Acquiring and Installing C*
  • Understand the key components of cassandra.yaml
  • Configuration and installation structure
  • Directories – Data, Commit Log, Cache
  • System log configuration
  • nodetool & cqlsh

4. Concepts II

  • Keyspace
  • Admin/system keyspace
  • Column Family / Table
  • Primary key components
  • Visualizing PK based storage, on disk cells & row sizing
  • Fault tolerance via replication
  • Coordinator
  • Consistency Levels – read/write, immediate
  • Quorum
  • Applied consistency level – scenario game
  • Inconsistencies across nodes
  • Anti-entropy op & Read repair
  • Hinted handoff
  • Debate – RF change impact
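
To make the quorum and consistency-level topics above more concrete, here is a minimal sketch of the arithmetic involved. The function names are hypothetical and are not part of any Cassandra API; the sketch only shows how a quorum is derived from the replication factor (RF) and when a read is guaranteed to overlap with the replicas that acknowledged a write.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a quorum: a majority of RF."""
    return replication_factor // 2 + 1


def read_overlaps_write(read_replicas: int, write_replicas: int,
                        replication_factor: int) -> bool:
    """True when every read set must share at least one replica with every
    write set, i.e. R + W > RF."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)                          # 2 replicas out of 3
strong = read_overlaps_write(q, q, rf)  # QUORUM reads + QUORUM writes
weak = read_overlaps_write(1, 1, rf)    # ONE reads + ONE writes
```

With RF = 3, reading and writing at QUORUM contacts two replicas each way, so at least one replica is common to both; reading and writing at ONE gives no such guarantee.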

5. Write and Read path

  • Why does C* write fast?
  • Components of the write
  • Storage – a primer
  • A bit more about LSM
  • Write path flow
  • Data state
  • Memtable, SSTables, Commit log
  • When does the flush trigger?
  • Data file name structure
  • Overview of CDC
  • Row cache, Key cache, Chunk cache
  • Bloom filters
  • Key index sample / Partition index sample
  • Read path flow
  • Eager retry
  • Last write wins with tombstone example
  • Compaction
  • NTP and why it is important
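
The "last write wins" and tombstone topics above can be illustrated with a small sketch. This is a simplified model, not Cassandra's actual implementation: the `Cell` class and `reconcile` function are hypothetical names, and the tie-breaking rule (a tombstone wins when timestamps are equal) is an assumption made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (a recorded delete)
    timestamp: int        # write time, e.g. microseconds since the epoch

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning version of a cell: the higher timestamp wins."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Assumption for this sketch: on a timestamp tie the delete wins.
    if a.is_tombstone != b.is_tombstone:
        return a if a.is_tombstone else b
    return a


insert = Cell("alice@example.com", timestamp=10)
delete = Cell(None, timestamp=20)
reinsert = Cell("alice@new.example.com", timestamp=30)

after_delete = reconcile(insert, delete)      # tombstone hides the old value
after_reinsert = reconcile(delete, reinsert)  # newer write beats the tombstone
```

Because the winner is decided purely by timestamp, a node whose clock runs behind can stamp a genuinely later write with a smaller timestamp, and that write then loses. This is the reason clock synchronisation such as NTP matters.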

6. Modeling

  • Query-driven design (QDD)
  • De-normalization
  • Row key & data partitioning
  • Visualizing the components of the PK
  • Choice of row key
  • How to fix a wide row?
  • Partition key IN clause (Slicing)
  • Counters, Collections
  • TTL, UDT, UDF, UDA
  • Materialized views
  • Where clause & order by restrictions
  • Antipatterns
  • Relationship tables and dup tables
  • Bitmap type of index & example
  • CQL limits – a primer
  • Batch – analysis of pros and cons
  • Lightweight transactions
  • Triggers
  • Debate – Locking in/out of C*
  • Application design DAO
  • Nodetool – table/cfhistograms, table/cfstats, tpstats, netstats, proxyhistograms
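
As one illustration of the "How to fix a wide row?" topic above, a common approach is to add a time bucket to the partition key so that a single partition cannot grow without bound. The sketch below assumes a hypothetical sensor-readings table partitioned by sensor and day; the function name, the key layout and the daily bucket size are all assumptions chosen for illustration, not a prescription from the course.

```python
from datetime import datetime, timezone
from typing import Tuple


def bucketed_partition_key(sensor_id: str, event_time: datetime) -> Tuple[str, str]:
    """Return a (sensor_id, day) pair to use as a composite partition key.

    event_time must be timezone-aware; it is normalised to UTC so that
    every writer derives the same bucket for the same instant.
    """
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (sensor_id, day)


morning = bucketed_partition_key(
    "sensor-1", datetime(2024, 1, 15, 0, 0, tzinfo=timezone.utc))
evening = bucketed_partition_key(
    "sensor-1", datetime(2024, 1, 15, 23, 59, tzinfo=timezone.utc))
next_day = bucketed_partition_key(
    "sensor-1", datetime(2024, 1, 16, 0, 0, tzinfo=timezone.utc))
```

Readings from the same day share a partition, while the next day starts a new one. The trade-off is that a query spanning several days has to read several partitions.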

Training Duration & Pricing:

For Individuals

Duration: 1.5 months, plus 2 months of offline support

Mode: Classroom & Online

Course Fees: Call us at +91-9900001329

For Corporate Training

The Mobignosis Corporate Training Program is designed for organisations that need practical upskilling so their employees gain knowledge of current trending technologies.

Cassandra

08:00 AM – 10:00 AM

CERTIFICATION

Candidates receive a Mobignosis course completion certificate upon successful completion of the course.

FAQs

The course is an instructor-led classroom/online coaching session.

The instructors are industry experts (Big Data professionals) who consult with leaders in technological services like SAP, Capgemini, Cisco and many others.

As a team of practicing Big Data professionals, we use leading-edge methodologies in our consulting work and have used the same methodologies to develop the Cassandra course content for classroom coaching. As a result, you are exposed to up-to-date, quality course content.

The Cassandra Training program includes 2 months of free technical support after the training, and participants can repeat the session free of cost. For any additional assistance, we are just a phone call away.
