Interview Questions
Row vs Column Store
-
It was earlier assumption that storing data in Columnar based structure takes more memory size and not performance Optimized. As the SAP HANA Modeler Views can only be created on the top of Column based tables, storing data in Column tables is not a hurdle.
Column based Storage
Since the SAP HANA is evolved with the advent of performance, using column-based data storage in Information views and presented the real benefits of columnar tables over Row based tables. Typically, the column store table stores the data vertically. Hence, similar data types come together as shown in the example above. It provides faster memory read and writes operations with help of In-Memory Computing Engine.
In a conventional database, data is stored in Row based structure i.e. horizontally. SAP HANA stores data in both row and Column based structure. This provides Performance optimization, flexibility and data compression in HANA database.
What are the benefits of storing data in Column based storage?
- We can achieve excellent data Compression with the help of Column-based storage
- Read and write access to tables would be extremely faster when compared to conventional Row based storage
- Column-based storage unlocks features such as flexibility & parallel processing
- The calculation and aggregations on the Column-based storage can be carried out at higher speed
How to store data in column-based structure?
Various methods and algorithms are available for storing data in a Column based structure such as Dictionary Compressed, Run-Length Compressed and much more. Let us examine both these methods and algorithms,
Dictionary Compressed Algorithm
In this type of method, the data cells are stored in the form of the numerical format within the tables and numeral cells are always performance optimized as compared to characters.
Run length compressed Algorithm
In this type of method, multiplier with cell value is conserved in numerical format and multiplier shows repetitive value in the table.
What is the functional difference between the Row-based storage and Column-based Storage?
It is always advisable to use Column based storage if SQL statement has to perform aggregate functions and calculations. Column based tables always perform better when running aggregate functions like Sum, Count, Max, Min.
Row based storage is preferred when output has to return complete row. The example given below makes it easy to understand.
Row vs Column Store Functional
In the above example, while running an Aggregate function (Sum) in sales column with Where clause, it will only use Date and Sales column while running SQL query so if it is column based storage table then it will be performance optimized, faster as data is required only from two columns.
While running a simple Select query, the full row has to be printed in output so it is advisable to store the table as Row based in this scenario.
Get in touch with training experts Get Free Quotes