Application Performance Best Practices

Application performance is one of the most important factors in software. An application should be designed and implemented with its performance objectives in mind. While defining the architecture, working out the design, and implementing the code, a team can apply several best practices that lead to good performance. At the same time, some bad practices, or anti-patterns, should be avoided because they degrade performance.

Some of these practices are generic and apply in most cases, while others depend on the project context and may not apply everywhere. The project team and the architect should make a conscious decision to accept or reject each practice. Project teams often dismiss a best practice in the name of subjectivity and project context without examining the details, although in many cases the project context does not actually prevent the team from adopting it.

Long-running transactions

The throughput of a system (requests processed per second) is reduced by long-running transactions. Long-running transactions make other requests wait, waiting requests cause congestion, and congestion causes delays. Delays in turn mean that locks are held longer, and long-held locks increase the likelihood of deadlocks.

Strategies include splitting large transactions into smaller ones, keeping each transaction as short as possible, using asynchronous processing wherever possible, tuning long-running database queries, and increasing the thread pool size. Avoid chatty transactions, and use a cache to avoid round trips; both increase throughput and reduce delays and lock contention.
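Two of these strategies, short transactions and work done off the critical path, can be sketched as follows. This is a minimal illustration and not a definitive implementation: it assumes Python's standard sqlite3 module, a hypothetical orders table, and a hypothetical slow_price_lookup function standing in for a slow remote call. The slow work runs in parallel on a thread pool before the transaction opens, so the transaction covers only the fast inserts.

```python
import sqlite3
import time
from concurrent.futures import ThreadPoolExecutor


def slow_price_lookup(item):
    """Hypothetical stand-in for a slow remote call."""
    time.sleep(0.05)
    return item, len(item) * 10


def save_orders(conn, items):
    # Do the slow work first, outside any transaction and in parallel,
    # so no database locks are held while waiting.
    with ThreadPoolExecutor(max_workers=4) as pool:
        priced = list(pool.map(slow_price_lookup, items))

    # The transaction now covers only the fast inserts.
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(
            "INSERT INTO orders (item, price) VALUES (?, ?)", priced
        )
    return priced


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT, price INTEGER)")
saved = save_orders(conn, ["pen", "book", "lamp"])
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

If the lookups were performed inside the transaction instead, every lock would be held for the full duration of the slowest lookup.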

Define uniform architecture

Uniformity helps an application scale out horizontally. If the components of an architecture have complex dependencies on one another, it is difficult to add new nodes for scaling.

Use cache

A cache can avoid repeated execution of long-running queries. As a result, fewer threads are left waiting, which can increase throughput.
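As a rough illustration, the sketch below puts an in-process cache in front of a slow query using functools.lru_cache from the Python standard library. The run_slow_query function and its return value are hypothetical stand-ins for a real database call; the counter only exists to show how many calls actually reach the "database".

```python
from functools import lru_cache

query_calls = 0  # counts how often the "database" is actually hit


def run_slow_query(customer_id):
    """Hypothetical stand-in for a long-running database query."""
    global query_calls
    query_calls += 1
    return {"id": customer_id, "name": f"customer-{customer_id}"}


@lru_cache(maxsize=1024)
def get_customer(customer_id):
    # Only cache misses reach the slow query.
    return run_slow_query(customer_id)


for _ in range(100):
    get_customer(42)
get_customer(7)

info = get_customer.cache_info()  # hits and misses recorded by lru_cache
```

Here 101 lookups result in only two executions of the slow query. A real cache also needs an invalidation policy (for example calling get_customer.cache_clear() after an update), otherwise readers may see stale data.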

Partitioning

Partitioning is a general term that people apply in different contexts. For relational data it generally means decomposing tables either column-wise or row-wise. It is not the same as normalization as defined by the normal forms of relational databases: normalization is a conceptual-level optimization, whereas the partitioning referred to here is a physical-level optimization. Even if a table has no redundancy, you may move some of its columns into another table in another database to keep any single database from growing too large. In this way the data is divided across more than one database to overcome the data size limit of a single box (server). Partitioning does not have to split the data into another database; it can also be done within the same database. It improves the performance of reading and writing data, and it brings other benefits as well, such as better manageability, availability, and load balancing.

Vertical splitting (Partitioning)

Vertical splitting of a database means storing different tables and columns in separate databases. Partitions in this case are created based on domain: application data is split logically and stored in different databases. This kind of split is implemented at the application level, so that application code reads from and writes to a designated database. The term row splitting is sometimes used for this type of splitting, because each row is split by its columns.
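The application-level routing described above might look like the following sketch. The domain names (accounts, catalog), the tables, and the db_for helper are all hypothetical, and two in-memory SQLite databases stand in for separate database servers.

```python
import sqlite3

# Hypothetical domains, each with its own designated database.
databases = {
    "accounts": sqlite3.connect(":memory:"),
    "catalog": sqlite3.connect(":memory:"),
}
databases["accounts"].execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
)
databases["catalog"].execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT)"
)

# Application-level map from a table to the domain database that owns it.
TABLE_TO_DOMAIN = {"users": "accounts", "products": "catalog"}


def db_for(table):
    """Return the connection designated for a table."""
    return databases[TABLE_TO_DOMAIN[table]]


with db_for("users") as conn:  # commits on success
    conn.execute("INSERT INTO users (name) VALUES (?)", ("Asha",))
with db_for("products") as conn:
    conn.execute("INSERT INTO products (title) VALUES (?)", ("Lamp",))

user_names = [r[0] for r in db_for("users").execute("SELECT name FROM users")]
# The accounts database contains only the tables of its own domain.
accounts_tables = [
    r[0]
    for r in databases["accounts"].execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]
```

One consequence of this design is that a join between users and products can no longer be done by the database; the application has to combine the two result sets itself.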

Horizontal splitting (Sharding)

Horizontal splitting means storing the rows of a table across multiple database nodes.

In some databases, sharding is a first-class concept: these databases are capable of storing and retrieving data across a cluster, and many of today's databases support sharding natively. Cassandra, HBase, and MongoDB are examples of distributed databases; HDFS, often listed alongside them, is a distributed file system rather than a database. Sharding can be implemented at either the application level or the database level. As an example, a user table can be divided into multiple tables based on the users' locations: UsersNorth, UsersSouth, and so on.
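A minimal sketch of application-level sharding, following the UsersNorth and UsersSouth example above. The SHARDS map and the helper functions are hypothetical, and for brevity both shard tables live in one in-memory SQLite database; in a real deployment each shard would typically sit on its own node.

```python
import sqlite3

# Shard key (user location) -> shard table, following the example above.
SHARDS = {"north": "UsersNorth", "south": "UsersSouth"}

conn = sqlite3.connect(":memory:")
for table in SHARDS.values():
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")


def shard_for(location):
    """Pick the shard table for a location."""
    return SHARDS[location]  # raises KeyError for an unmapped location


def add_user(name, location):
    table = shard_for(location)  # table names come only from SHARDS
    with conn:
        conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,))


def users_in(location):
    table = shard_for(location)
    return [r[0] for r in conn.execute(f"SELECT name FROM {table} ORDER BY id")]


add_user("Asha", "north")
add_user("Ravi", "south")
add_user("Mina", "north")
north_users = users_in("north")
south_users = users_in("south")
```

Any query that includes the shard key (here the location) touches exactly one shard, which is what makes reads and writes cheaper than on a single large table.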

Distributed data brings certain challenges, and appropriate strategies are needed to deal with them. One challenge is that searches across partitions may be inefficient. Another is uneven distribution of data, which creates hot spots and may limit the benefits of sharding.
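Both challenges can be seen in a small simulation. The data below is hypothetical and deliberately skewed so that most users are in the north; hash-based placement is shown only as one commonly used alternative to a location key, not as a recommendation for every case.

```python
import hashlib
from collections import Counter

# Hypothetical, deliberately skewed data: 9 out of 10 users are in the north.
users = [(f"user{i}", "north" if i % 10 else "south") for i in range(1000)]


def location_shard(name, location):
    return location


def hash_shard(name, location, shard_count=2):
    # Stable hash (the built-in hash() of a str is salted per process).
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % shard_count


# Rows per shard under each placement strategy.
by_location = Counter(location_shard(n, loc) for n, loc in users)
by_hash = Counter(hash_shard(n, loc) for n, loc in users)


def find_user(shards, name):
    """Search without the shard key: every shard has to be queried."""
    hits, shards_queried = [], 0
    for rows in shards.values():
        shards_queried += 1
        hits.extend(r for r in rows if r[0] == name)
    return hits, shards_queried


shards = {}
for n, loc in users:
    shards.setdefault(location_shard(n, loc), []).append((n, loc))
hits, shards_queried = find_user(shards, "user7")
```

With the location key, the north shard holds 900 of the 1000 rows and becomes a hot spot, whereas hashing the user name usually spreads rows close to evenly. The trade-off is that a lookup by a field other than the shard key, as in find_user, must visit every shard.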