Scalability is one of the areas where I most often see misconceptions when customers begin building on the platform.
But before we jump into that topic, let's first revisit the three cloud operating models or paradigms:
Infrastructure as a Service (IaaS) - Think of this as "Hosting" your application. The clue is in the term "Infrastructure". You are very much getting a Virtual Machine, so you are responsible for the Operating System, Network Connectivity, etc.
Platform as a Service (PaaS) - Think of this as "Building" your application. Rather than caring too much about the infrastructure (e.g. which version of the OS, or advanced configuration of the underlying infrastructure), you abstract away from those details and focus on building the application quickly. (Note: There is likely still some degree of configuration required here!)
Software as a Service (SaaS) - Think of this as "Consuming" an application. You don't want to build another version, so you consume software that has been produced by another organisation. Think of Office 365 as an example here.
There are a couple of other concepts to be aware of: Scaling Up vs. Scaling Out.
Imagine scaling up as throwing more power behind a particular server (e.g. increasing the number of CPUs or the amount of RAM it has).
Now, think of scaling out as building out a server farm: you replicate the machine that you have, and grow the number of instances of that machine to serve your volume of load.
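To make the difference concrete, here is a minimal sketch of the two calculations. The function names, the size labels and the requests-per-second figures are all made up for illustration; real capacity numbers have to come from your own measurements.

```python
import math
from typing import Dict, Optional


def instances_needed(total_rps: float, rps_per_instance: float) -> int:
    """Scale out: how many identical instances are needed to serve the load."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    return max(1, math.ceil(total_rps / rps_per_instance))


def smallest_size_for(total_rps: float, sizes: Dict[str, float]) -> Optional[str]:
    """Scale up: pick the smallest single machine size that can serve the load alone."""
    for name, capacity in sorted(sizes.items(), key=lambda item: item[1]):
        if capacity >= total_rps:
            return name
    # No single machine is big enough: scaling up has a ceiling.
    return None


if __name__ == "__main__":
    # Hypothetical capacities, in requests per second, for three machine sizes.
    sizes = {"small": 200, "medium": 500, "large": 1200}
    print(instances_needed(1000, 300))
    print(smallest_size_for(1000, sizes))
    print(smallest_size_for(5000, sizes))
```

The last call is the interesting one: once the load outgrows the largest size on offer, scaling up has nowhere left to go, whereas the scale-out calculation simply returns a bigger number.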
Now, time for some myth-busting!
An important one - The cloud does not necessarily scale automatically. Once you have chosen an Azure service, be sure that you are comfortable configuring its scalability. (For example, in App Services, you may need to set scalability rules. This approach differs from Azure Functions scalability, where you can choose either a Service Plan or a Dynamic Service Plan. Scaling Azure Service Fabric requires a different approach again.)
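As a rough illustration of what a "scalability rule" boils down to, the sketch below evaluates a simple CPU-based rule with a minimum and maximum instance count. This is not the Azure autoscale API; the ScaleRule class, its field names and the thresholds are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ScaleRule:
    """Hypothetical rule: thresholds are average CPU percentages."""
    scale_out_above: float  # add an instance when average CPU exceeds this
    scale_in_below: float   # remove an instance when average CPU drops below this
    min_instances: int
    max_instances: int


def next_instance_count(current: int, avg_cpu: float, rule: ScaleRule) -> int:
    """Return the instance count the rule asks for, clamped to its limits."""
    if avg_cpu > rule.scale_out_above:
        desired = current + 1
    elif avg_cpu < rule.scale_in_below:
        desired = current - 1
    else:
        desired = current
    return min(rule.max_instances, max(rule.min_instances, desired))


if __name__ == "__main__":
    rule = ScaleRule(scale_out_above=70, scale_in_below=30,
                     min_instances=2, max_instances=5)
    print(next_instance_count(2, 85, rule))  # busy: asks for one more instance
    print(next_instance_count(5, 95, rule))  # busy, but already at the maximum
```

The point to take away is that somebody has to choose those thresholds and limits. If nobody does, nothing scales.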
Understand how scaling can impact your application. For example, at the time of writing, when you change the performance level of SQL Azure, a replica is generated at the new performance level, and the connections are swapped across to the new replica once the requested changes are complete. Data will not be lost during the switchover, though connections to the database are disabled while it happens. As a result, some transactions in flight may be rolled back (more detailed documentation available here). Therefore, it is important to understand, on a service-by-service basis, how scaling up affects ongoing performance. There is an excellent talk from Microsoft Ignite, "Design for scalability and high availability on Microsoft Azure", which covers the cloud mindset of scaling out rather than scaling up.
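One common way for an application to cope with connections being dropped during a switchover is to retry the whole unit of work with a short backoff. The sketch below shows the idea only: TransientConnectionError is a hypothetical stand-in for whatever error your database driver raises, and the attempt count and delays are arbitrary.

```python
import time


class TransientConnectionError(Exception):
    """Hypothetical stand-in for a dropped-connection error from a database driver."""


def run_with_retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on transient errors with exponential backoff.

    The operation should be a whole unit of work (for example, one complete
    transaction), because a rolled-back transaction has to be replayed in full.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TransientConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the failure
            sleep(base_delay * (2 ** attempt))


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_transaction():
        # Simulates a connection that is unavailable for the first two attempts.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TransientConnectionError("connection dropped")
        return "committed"

    print(run_with_retry(flaky_transaction, base_delay=0.01))
```

Retrying is only safe if the operation can be repeated without side effects, so it is worth checking that before wrapping anything in a loop like this.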
Do not wait until production to test whether your solution scales. Ensure that you have tested in an environment that is representative of your production environment, at a level comparable to your expected volume of load. I am not trying to teach you to suck eggs, but this is something that I have seen come up many times!
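A load test does not need to be elaborate to be useful. Below is a minimal sketch of a concurrent test harness. The run_load_test function and its parameters are assumptions for this example, and the call argument is a placeholder for whatever issues one request against your test environment. A dedicated load testing tool will do a far better job for anything serious.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(call, total_requests, concurrency):
    """Invoke call() total_requests times across concurrency worker threads.

    Returns the request count, the number of failures, and the 95th percentile
    latency in seconds.
    """
    if total_requests < 1 or concurrency < 1:
        raise ValueError("total_requests and concurrency must be at least 1")

    def timed(_):
        start = time.perf_counter()
        try:
            call()
            ok = True
        except Exception:
            ok = False  # count the failure, but keep the test running
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, range(total_requests)))

    latencies = sorted(duration for _, duration in results)
    failures = sum(1 for ok, _ in results if not ok)
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return {"requests": total_requests, "failures": failures, "p95_seconds": p95}


if __name__ == "__main__":
    # Placeholder workload: replace with a real request to a test environment.
    print(run_load_test(lambda: time.sleep(0.01), total_requests=50, concurrency=10))
```

Run it at a request count and concurrency that resemble your expected production traffic, then watch how the failure count and latency move as you raise them.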
Start thinking about the bigger picture. It will be a huge management overhead to re-deploy different performance tiers/SKUs of services if you are not scaling out based on demand. Consider defining your Infrastructure as Code using Azure Resource Manager (ARM) templates. These templates make your infrastructure quick to spin up and, therefore, easier to re-deploy at different levels of scale. The "Patterns for designing Azure Resource Manager templates" guidance has some good recommendations around "T-Shirt Sizes" and capacity planning.
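The "T-Shirt Sizes" idea can be sketched as a small lookup that produces an ARM parameter file for a chosen size. The size names, the skuName and instanceCount parameter names, and the values behind them are assumptions for illustration. Your own template decides which parameters actually exist.

```python
import json

# Illustrative mapping only: the sizes, SKUs and instance counts are assumptions.
T_SHIRT_SIZES = {
    "small": {"skuName": "S1", "instanceCount": 1},
    "medium": {"skuName": "S2", "instanceCount": 2},
    "large": {"skuName": "S3", "instanceCount": 4},
}


def build_parameters(size: str) -> dict:
    """Build an ARM deployment parameter file (as a dict) for a T-shirt size."""
    try:
        chosen = T_SHIRT_SIZES[size]
    except KeyError:
        raise ValueError(
            f"unknown size {size!r}; expected one of {sorted(T_SHIRT_SIZES)}"
        )
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        # ARM parameter files wrap every value as {"value": ...}.
        "parameters": {name: {"value": value} for name, value in chosen.items()},
    }


if __name__ == "__main__":
    print(json.dumps(build_parameters("medium"), indent=2))
```

Keeping the sizes in one place means that moving an environment from "small" to "large" is a one-word change followed by a re-deployment, rather than a hunt through portal settings.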
Elasticity is one of the big selling points of the cloud and is something that we should embrace when building cloud applications. Be cautious, though: it is particularly important to ensure that you are familiar with how each service in your application scales.
Make yourself aware of the general principles before building your solution. Test the scalability at representative levels of user-load before you progress into production. Keep refining those load tests, ensuring that they remain representative of the volume of traffic that you are servicing.