Wayne, a few messages ago you said:
Because the services takes alot of CPU time the webserver is slow
Typically, what I have found is that processes like IIS and COM+ are heavy on the memory load of a given server. Depending on the types of calculations, instructions, threads, etc., CPU load will increase as the number of users does. On the other hand, OLTP systems cause large volumes of file I/O.
Scaling up refers to adding hardware to an existing server to improve its performance, e.g. adding secondary RAID arrays, isolating database files on separate physical partitions, and adding extra processors and extra RAM. If your server is not upgradable, you scale the application out. Scaling out refers to adding servers to balance the load and to support more concurrent users. You can use application servers and application load balancing to divide the load across servers.
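To make the load balancing idea a bit more concrete, here is a minimal sketch of round-robin distribution, which is only one of several ways a balancer can divide requests. The class name and the server names "server-a" and "server-b" are made up for illustration and are not from any particular product.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation (hypothetical helper)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        # cycle() repeats the server list endlessly, in order.
        self._servers = cycle(servers)

    def pick(self):
        # Return the next server in the rotation.
        return next(self._servers)


# Assumed server names, purely for illustration.
balancer = RoundRobinBalancer(["server-a", "server-b"])
assignments = [balancer.pick() for _ in range(4)]
print(assignments)
```

A real load balancer would also track server health and current load, so treat this as a picture of the idea and not a design.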
For example, let's say that your site averages 2000 concurrent users per hour, RAM usage averages 60%, and processor usage averages around 80%. That would be pretty much optimum performance (according to some enterprise architects). So now you need to roll out a new piece of the business model, and your stress tests show that if you increase the load to 3000 users per hour, the server starts to implode. You can roll out a new server in a clustered environment, and send 1500 users to server A and 1500 users to server B. You also get the extra security of having a redundant system. One server can go down and the application can keep functioning until IT fixes the bad server; it is just less responsive for a short period of time due to the increased workload.
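The arithmetic in that example can be sketched in a few lines. The function names are made up, and the CPU estimate assumes that CPU usage scales linearly with users, which is a simplification that real systems rarely follow exactly.

```python
def users_per_server(total_users, servers):
    """Even split of users across servers (hypothetical helper)."""
    if servers < 1:
        raise ValueError("need at least one server")
    return total_users / servers


def estimated_cpu(users, baseline_users=2000, baseline_cpu=80.0):
    """Rough CPU % estimate. ASSUMPTION: CPU grows linearly with users,
    using the 2000 users / 80% CPU baseline from the example above."""
    return baseline_cpu * users / baseline_users


total = 3000
both_up = users_per_server(total, 2)   # both servers healthy
one_down = users_per_server(total, 1)  # one server has failed

print(both_up, estimated_cpu(both_up))
print(one_down, estimated_cpu(one_down))
```

Under that linear assumption, each server in the pair sits well below the 80% mark, while a single surviving server carries the full 3000 users, which is why responsiveness drops until the failed box is back.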
Scaling up and out are not cheap, especially when you start to look at multiple RAID 5 disk arrays, database redundancy, and SAN solutions for database mirroring and failover. However, client XYZ might feel that the product that you're developing gives them the return on investment.
One thing that I am curious about: what type of processors is their current setup running, what's the normal concurrent user load, and what types of math are you doing that cause the CPU to be pegged?
If you want to drop me an email at the address in my profile, I would be happy to kick around some ideas with you.