Change scale-up and scale-down triggers to be based on capacity instead of queued requests
Instead, base the trigger on the capacity of the AppServers, so that they serve about 60-80% of the maximum sessions they can handle.
Calculations would look like:
Maximum sessions = S(max) = Num of AppServers * Maxconn (AppServer concurrency)
Current sessions = S(curr) = Num of current requests
Capacity = Current sessions / Maximum sessions = S(curr) / S(max), expressed as a percentage
The trigger to scale up would be capacity >= 80%, and the trigger to scale down would be capacity <= 60%.
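
The calculation and thresholds above could be sketched roughly as follows. This is only an illustration, not an existing implementation: the function names (`capacity`, `scaling_decision`), the "hold" result for the 60-80% band, and the example numbers are assumptions made for this sketch. Capacity is returned as a fraction (0.80 = 80%).

```python
def capacity(num_app_servers: int, maxconn: int, current_requests: int) -> float:
    """Return S(curr) / S(max) as a fraction (0.80 means 80%)."""
    # S(max) = Num of AppServers * Maxconn (AppServer concurrency)
    max_sessions = num_app_servers * maxconn
    if max_sessions <= 0:
        raise ValueError("need at least one AppServer with Maxconn > 0")
    # S(curr) = Num of current requests
    return current_requests / max_sessions


def scaling_decision(cap: float, up_at: float = 0.80, down_at: float = 0.60) -> str:
    """Map a capacity fraction to a scaling action.

    Thresholds default to the 80% / 60% values proposed above; the
    "hold" outcome for the band in between is an assumption of this sketch.
    """
    if cap >= up_at:
        return "scale_up"
    if cap <= down_at:
        return "scale_down"
    return "hold"


if __name__ == "__main__":
    # Hypothetical fleet: 4 AppServers with Maxconn = 50, so S(max) = 200.
    for requests in (100, 140, 170):
        cap = capacity(4, 50, requests)
        print(f"{requests} requests: {cap:.0%} -> {scaling_decision(cap)}")
```

With these hypothetical numbers, 170 current requests against S(max) = 200 is 85% and would scale up, 100 requests is 50% and would scale down, and 140 requests is 70% and would leave the fleet unchanged. Because S(max) changes every time the number of AppServers changes, capacity would need to be recomputed after each scaling action.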