There are two ways to handle state in applications. One is to keep the state close to the business logic (in memory); this is called “Stateful”. The other is to persist state in a store away from the business logic, e.g. a SQL DB, a document DB or even a blob; this is called “Stateless”. Both variations have pros and cons. Here are the most striking ones:
Stateful, pros:
- Very fast access to state data
- Straightforward to implement
Stateful, cons:
- Hard to use in scenarios with concurrent access
- Persistence is not easy, especially if the persisted state needs to be up to date
- Difficult to synchronize with other systems (e.g. between Azure regions). This is especially worrisome in high availability / disaster recovery (HA/DR) scenarios.
- Does not scale well under concurrent access
- Adds a lot of state handling logic to the business logic if you want to support more complex scenarios such as session context, transactions or multi-tenancy
- Hard to debug
- Data is volatile and therefore difficult to re-use
- Memory use grows linearly with the amount of data, which can cause application problems under high load
- Re-use of stateful instances can become difficult or problematic
Stateless, pros:
- Persisted state can be accessed easily
- Great in concurrency situations if a suitable store (database) is used, because the store handles access synchronization
- Easy to debug. There is good tool support for many stores.
- Session or transactional capabilities are quite often built into stores
- Great data re-use options in other parts of the application
- Good data synchronization capabilities, which enable robust HA/DR scenarios
- Easy re-use of stateless components
Stateless, cons:
- Implementation is not as straightforward, because of store access
- Data access from business logic is not as fast as direct in-memory access
- Requires an additional PaaS store, such as Azure SQL or Cosmos DB, which brings additional infrastructure costs and component risks
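To make the contrast concrete, here is a minimal sketch of the same counter in both styles. It is an illustration only: `StatefulCounter`, `init_store` and `increment_stateless` are hypothetical names, and an in-memory SQLite database stands in for a real store such as Azure SQL or Cosmos DB.

```python
import sqlite3


class StatefulCounter:
    """Stateful: the value lives in the memory of this one instance."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> int:
        # Fast, but lost on restart and invisible to other instances.
        self.count += 1
        return self.count


def init_store(conn: sqlite3.Connection) -> None:
    """Create the table that holds the persisted state (stand-in store)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counters "
        "(name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.commit()


def increment_stateless(conn: sqlite3.Connection, name: str) -> int:
    """Stateless: nothing is kept between calls; the store owns the state."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO counters (name, value) VALUES (?, 0)",
            (name,),
        )
        conn.execute(
            "UPDATE counters SET value = value + 1 WHERE name = ?", (name,)
        )
        row = conn.execute(
            "SELECT value FROM counters WHERE name = ?", (name,)
        ).fetchone()
    return row[0]
```

Two `StatefulCounter` instances each count on their own, whereas any caller that reaches the same store sees the same value. The price is the extra store round trip on every call, which matches the cons listed above.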
No wrong or right
Real life is not “black or white”, so a recommendation to use only one of these approaches will certainly not fit every use case. In serverless applications, however, a stateless approach should be favored, because it enables flexibility, re-use and granularity without having to worry about state handling.
Stateful scenarios make sense especially when the in-memory state is treated as a “cache”, ideally one that is backed by persisted data from a store and kept up to date via events or cache expiration.
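The cache idea can be sketched as follows. This is a minimal example under stated assumptions: `StoreBackedCache` and its parameters are hypothetical names, `load` is whatever function reads a value from your store, and the injectable `clock` exists only to make expiration testable.

```python
import time
from typing import Callable, Dict, Tuple


class StoreBackedCache:
    """In-memory cache whose source of truth is a persistent store."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # reads a value from the backing store
        self._ttl = ttl_seconds    # lifetime of a cached entry
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]        # still fresh: serve from memory
        value = self._load(key)    # missing or expired: go to the store
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Call this from a change event to drop a stale entry early."""
        self._entries.pop(key, None)
```

Expiration bounds how stale an entry can get, while `invalidate` covers the event-driven case. In both cases losing the cache is harmless, because the store still holds the data.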
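Plain Azure Functions are designed to be used in a stateless fashion (the Durable Functions extension adds managed state on top, which is itself persisted in a store), and most available 3rd party connectors adhere to this paradigm as well.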
Choosing a stateless store
As you might guess, the choice of a data store has quite an impact on a solution. So far we have mentioned Azure SQL, Cosmos DB and Blobs; Azure Tables should also be mentioned in this context. In terms of functionality, databases should be preferred over relatively raw storage solutions such as Blobs or Azure Tables. Unless your application is very simple and will stay that way, those raw stores will not provide the functionality you might need over time.
Azure SQL and Cosmos DB do provide a lot of data handling functionality.
Porting a SQL-based application to Azure SQL might provide some good opportunities to re-use code from an existing system (e.g. stored procedures). Azure SQL has good data synchronization mechanisms with failover capabilities (single master) and scales well via partitioning or sharding of data.
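As a rough illustration of what sharding means for the application, the sketch below routes a key to one of several databases by hashing it. This is a generic technique, not the API of any Azure SQL tooling; `shard_for` and the shard names are hypothetical.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, shards: Sequence[str]) -> str:
    """Pick the shard that owns `key` via a stable hash of the key."""
    if not shards:
        raise ValueError("at least one shard is required")
    # sha256 is stable across processes, unlike Python's built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard names; the same tenant always maps to the same shard.
SHARDS = ["tenants-db-0", "tenants-db-1", "tenants-db-2"]
target = shard_for("tenant-42", SHARDS)
```

Note that with simple modulo routing most keys move to a different shard when the number of shards changes, so a real system usually keeps an explicit shard map or uses consistent hashing instead.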
If you are completely free to choose, I definitely recommend having a look at Cosmos DB. Its data access performance is excellent: I have seen read access durations of 1-2 milliseconds. Synchronizing instances around the globe is configured with a mouse click, and multi-master scenarios with different consistency levels are supported. Additionally, Cosmos DB can be enhanced with the powerful indexing and search capabilities offered by Azure Search, and it provides connectors into the “Big Data” world, e.g. for Azure Databricks.
There is one drawback with Cosmos DB: its higher price compared to Azure SQL or Azure Storage. In less sophisticated scenarios, where cross-region synchronization, high-speed reads and multi-master setups are not required, these higher costs may not be justified.
Nevertheless, if your requirements are more demanding, Cosmos DB will be your friend! You can weigh the higher Azure costs against the implementation and infrastructure effort you save. Keep in mind, too, that Microsoft needs to set up datacenters, networks and servers to provide the “Cosmos DB level of comfort” to developers.